
And what about binary files, like classic Forth blockfiles? Well, you could use 'REFILL' in that context too, but it would probably break up words since it can't find an end-of-line marker and its buffer is smaller than 1024 characters. Does that mean it can't be done? No! But 'REFILL' makes it easier for you, because it handles a few tasks automatically.
First, it has its own buffer (TIB). When you're not using 'REFILL' you have to define one yourself. Second, it terminates the string for you. You don't want 'WORD' to wander into new territory, do you? Third, it sets '>IN' for you every time it receives new input. You have to take care of that one too.
Never heard of '>IN'? Well, the only way for 'WORD' to know at what position the previous scan ended is to store that information in a variable. This variable is called '>IN'.
Not all internal 4tH variables are accessible, mostly because we can't imagine what use they could have to you. Some variables are just better left alone. But '>IN' is available for one very obvious reason: you can use it to point at your own input-buffer and make 'WORD' work for you.
The following program will read the first screen of a block-file for you and print out all the words. You will see that all spaces are eliminated and every word is printed on a new line, just the behaviour you would expect from 'WORD'.
1025 constant size                     \ screensize + terminator
size 1- value c/scr                    \ screensize
size string WorkSpace                  \ 1: our own buffer
64 string filename                     \ filename string

: openfile                             \ open the block file
  c" romans.blk" filename copy
  input open                           \ open block file
  if
    input file                         \ read from file
  else
    ." Cannot open file"
    cr quit                            \ message when error
  then
;

: readfile                             \ fill the buffer
  WorkSpace c/scr over over            \ address and count
  bl fill                              \ clear the buffer
  accept drop                          \ fill the buffer
  input close                          \ close the file
;

: initparse                            \ configures parsing
  0 WorkSpace c/scr + c!               \ 2: terminate screen
  WorkSpace >in !                      \ 3: set >IN to WorkSpace
;

: parseblock
  begin
    bl word                            \ get word
    count dup 0<>                      \ length non-zero?
  while
    type cr                            \ if so, print it
  repeat
  drop drop                            \ else drop addr/cnt
  ." End of block" cr                  \ signal "End of block"
;

: parsefile                            \ do it all
  openfile                             \ open the file
  readfile                             \ read it
  initparse                            \ set up parsing
  parseblock                           \ parse it
;

parsefile
Note there is no need to reset '>IN' when the program is done. The next time you use 'REFILL', it will be reset automatically. If you want to parse your own buffer again, or start from another area, you will have to set '>IN' manually.
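Setting '>IN' manually could look something like the sketch below. It is only an illustration built from the words of the program above: 'reparse' and 'fromline2' are made-up names, not part of 4tH, and the sketch assumes that 'WORD' leaves the contents of 'WorkSpace' untouched and that the screen is laid out as classic lines of 64 characters.

```forth
: reparse                              \ hypothetical: scan the screen again
  initparse                            \ terminate buffer, >IN back to start
  parseblock                           \ WORD starts over at the first word
;

: fromline2                            \ hypothetical: skip the first line
  WorkSpace 64 + >in !                 \ assumes 64 characters per line
  parseblock                           \ parse the rest of the screen
;
```

Call either word after 'parsefile', when the buffer has been filled. Note that 'fromline2' relies on the terminator that 'initparse' already stored at the end of the buffer.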
If you wonder where the 'C"' comes from, it is actually an alias for '"'. If you ever want to port your program to ANS-Forth, you'll have to use 'C"' inside colon-definitions and '"' outside. Note that 4tH doesn't care!