                           CHAPTER 11 - Files


            One of the most common operations when using a computer

        is to read from or write to a file.   You are already

        somewhat experienced in file handling from the last chapter, 

        because in computer terminology, the keyboard, terminal, and 

        printer are all classified as files.   A file is any  serial 

        input  or  output  device that the computer has  access  to.  

        Since  it  is  serial,  only one  piece  of  information  is 

        available to the computer at any instant of time. This is in 

        contrast to an array,  for example, in which all elements of 

        the array are stored internally and are all available at any 

        time.

            Several  years  ago computers were all large  cumbersome 

        machines with large peripheral devices such as magnetic tape 

        drives,  punch card readers,  paper tape readers or punches, 

        etc.  It was a simple task to assign the paper tape reader a 

        symbol and use that symbol whenever it was necessary to read 

        a  paper tape.   There was never more than one file  on  the 

        paper  tape being read,  so it was simply read sequentially, 

        and  hopefully  the data was the  desired  data.   With  the 

        advent  of  floppy  disks,  and hard disks  too,  it  became 

        practical to put several files of data on one disk,  none of 

        which  necessarily had anything to do with any of the  other 

        files on that disk.   This led to the problem of reading the 

        proper file from the disk, not just reading the disk.

            Pascal  was  originally released  in  1971,  before  the 

        introduction  of  the  compact floppy  disk.   The  original 

        release  of Pascal had no provision for selecting a  certain 

        file  from  among  the many  included  on  the  disk.   Each 

        compiler writer had to overcome this deficiency,  and did so

        by defining an extension to the standard Pascal system.

        Unfortunately,  the extensions were not all the same,  and

        there are now several ways to accomplish this operation.

        There   are  primarily  two  ways,   one  using  the  ASSIGN 

        statement, and the other using the OPEN statement.  They are 

        similar  to  each  other and they accomplish  the  same  end 

        result.

            All of this history matters because it leaves us with a

        problem in this chapter:  how do we cover all of the

        available implementations of Pascal?   The answer is that

        we can't.   Most of what is covered in this

        chapter will apply to all compilers, and all that is covered 

        will  apply to the TURBO Pascal compiler.   If your compiler 

        complains about some of the statements, it will be up to you 

        to dig out the details of how to do the intended operations.  

        If there is no way to do any of these operations, you should 

        seriously  consider getting another compiler because all  of 

        these operations are needed in a useful Pascal environment.

                       READING AND DISPLAYING A FILE

            Examine the file READFILE for an example of a program

        that can read a text file from the disk.   In fact,  it will

        read itself from the disk and display itself on the video

        monitor.   The first statement in the program is the ASSIGN

        statement.   This  is TURBO Pascal's way of selecting  which 

        file on the disk will be either read from or written to.  In 

        this case we will read from the disk.  The first argument in 

        the  ASSIGN  statement is the device  specifier  similar  to 

        "lst"  used  in the last chapter for the printer.   We  have 

        chosen  to  use  "turkey",  but could have  used  any  valid 

        identifier.   This  identifier  must  be defined  in  a  VAR 

        declaration as a TEXT type variable.   The next argument  is 

        the  filename  desired.   The filename can be defined  as  a 

        string constant, as it is here, or as a string variable.

            The TEXT type is a predefined type and is used to define

        a file identifier.   It behaves much like a "file of CHAR"

        that is divided into lines,  so it can only be used for a

        text file.   We will see later that there is another type

        of file,  a binary file.

            Now  that we have a file identified,  it is necessary to 

        prepare it for reading by executing a RESET statement.   The 

        reset statement positions the read pointer at the  beginning 

        of  the file ready to read the first piece of information in 

        the  file.   Once we have done that,  data is read from  the 

        file  in  the same manner as it was when  reading  from  the 

        keyboard.   In this program,  the input is controlled by the 

        WHILE  loop  which is executed until we exhaust the data  in 

        the file.

                 WHAT ARE THE "EOF" AND "EOLN" FUNCTIONS?

            The "eof" function is new and must be defined.   When we 

        read  data from the file,  we move closer and closer to  the 

        end,  until  finally we reach the end and there is  no  more 

        data  to  read.   This  is  called  "end  of  file"  and  is 

        abbreviated "eof".   Pascal provides a function of this name

        which is false as long as data remains to be read,  but when

        there is no more data in the file,  the function "eof"

        becomes true.   To use the function,  we merely give it our

        file identifier as an argument.   It should be clear that we 

        will  loop  until we read all of the data available  in  the 

        file.

            The "eoln" function is not used in this program but is a 

        very useful function.   If the input pointer is anywhere  in 

        the  text  file  except at the end of  a  line,  the  "eoln" 

        function  is  false,  but at the end of a line,  it  becomes 

        true.   This function can therefore be used to find the  end

        of  a  line  of text for variable length  text  input.   The

        "eoln"  function is not available,  and in fact  meaningless 

        when you are reading a binary file, to be defined later.
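
            As a minimal sketch of how "eoln" can be used,  the

        following program reads a text file one character at a

        time and reports the length of each line.   It is not one

        of the tutorial's example files;  the program name,  the

        variable names "ch" and "count",  and the choice of

        READFILE.PAS as the input file are assumptions made for

        illustration.

```pascal
program EolnSketch;   (* illustrative sketch, not a tutorial file *)

var
  turkey : text;      (* file identifier, named as in the text    *)
  ch     : char;      (* hypothetical: one character of input     *)
  count  : integer;   (* hypothetical: characters seen in a line  *)

begin
  assign(turkey, 'READFILE.PAS');   (* assumed name; any text file *)
  reset(turkey);
  while not eof(turkey) do
  begin
    count := 0;
    while not eoln(turkey) do       (* false until the line ends   *)
    begin
      read(turkey, ch);
      write(ch);
      count := count + 1;
    end;
    readln(turkey);                 (* step past the end of line   *)
    writeln('   [', count, ' characters]');
  end;
  close(turkey);
end.
```

        The READLN with no variable list steps past the end of

        line marker so that the next line can be read.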

            To actually read the data,  we use the READLN procedure, 

        giving  it  our  identifier "turkey" and  the  name  of  the 

        variable we want the data read into.   In this case, we read 

        up  to  80  characters  into  the string  and  if  more  are 

        available,  ignore  them.   Remember this from the  keyboard 

        input?  It  is  the same here.   Since we would like  to  do 

        something  with the data,  we simply output the line to  the 

        default device,  the video monitor.   It should be clear  to 

        you by now that the program will simply read the entire file 

        and display it on the monitor.

            Finally,  we CLOSE the file "turkey".   It is not really 

        necessary to close the file because the system will close it 

        for  you automatically at program termination,  but it is  a 

        good habit to get into.   Note carefully that you did not

        change the input file in any way;  you only read it and

        left it intact.   You could RESET it and read it again in

        this same program.   Compile and run this

        program to see if it does what you expect it to do.
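
            The steps described in this section can be sketched as

        follows.   This is a minimal sketch and not necessarily the

        same as the READFILE program on your disk;  the variable

        name "big_line" and the filename READFILE.PAS are

        assumptions made for illustration.

```pascal
program ReadFileSketch;   (* illustrative sketch only *)

var
  turkey   : text;         (* file identifier, a TEXT variable    *)
  big_line : string[80];   (* hypothetical name for a line buffer *)

begin
  assign(turkey, 'READFILE.PAS');   (* assumed name of the file   *)
  reset(turkey);                    (* prepare to read from start *)
  while not eof(turkey) do          (* loop until data runs out   *)
  begin
    readln(turkey, big_line);       (* read up to 80 characters   *)
    writeln(big_line);              (* display them on the monitor*)
  end;
  close(turkey);                    (* a good habit to get into   *)
end.
```

        If the sketch is saved under the assumed filename,  it

        will list itself,  as the text describes.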

                        A PROGRAM TO READ ANY FILE

            Examine  the next program READDISP for an improved  file 

        reading  program.   This is very similar except that it asks 

        you for the name of the file that you desire to display, and 

        enters   the   name  into  a  12  character   string   named 

        "name_of_file_to_input".   This  is then used in the  ASSIGN 

        statement  to select the file to be read,  and the  file  is 

        reset  as  before.   A  header is then  displayed,  and  the 

        program  is  identical  to  the last  one  with  some  small 

        additions.   In  order to demonstrate the use of a  function 

        within the WRITELN specification,  the program calls for the 

        length of the input string and displays it before each line.  

        The  lines are counted as they are read and  displayed,  and 

        the line count is then displayed at the end of the  listing.  

        You  should  be  able  to  see clearly  how  each  of  these 

        operations is accomplished.   Compile and run this  program, 

        entering  any  filename  we  have used so far  (be  sure  to 

        include  the  .PAS).    After  a  successful  run,  enter  a 

        nonexistent filename and see the I/O error.
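
            A program along the lines described above might look

        like the following minimal sketch.   It is not necessarily

        identical to READDISP;  only "name_of_file_to_input" comes

        from the text,  and the prompt,  the header,  and the names

        "big_line" and "line_count" are assumptions.

```pascal
program ReadDispSketch;   (* illustrative sketch only *)

var
  turkey                : text;
  name_of_file_to_input : string[12];
  big_line              : string[80];   (* hypothetical name *)
  line_count            : integer;      (* hypothetical name *)

begin
  write('Enter the name of the file to display ---> ');
  readln(name_of_file_to_input);
  assign(turkey, name_of_file_to_input);
  reset(turkey);          (* a nonexistent file fails here *)

  writeln('Listing of ', name_of_file_to_input);
  line_count := 0;
  while not eof(turkey) do
  begin
    readln(turkey, big_line);
    (* a function call used directly as a WRITELN field *)
    writeln(length(big_line):4, '  ', big_line);
    line_count := line_count + 1;
  end;
  close(turkey);
  writeln('Lines in the file: ', line_count);
end.
```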

                       HOW TO COPY A FILE (SORT OF)

            Examine the file READSTOR for an example of both reading 

        from a file and writing to another one.   In this program we 

        request  an operator input for the filename to  read,  after 

        which we ASSIGN the name to the file and RESET it.  Next, we

        request a different filename to write to,  which is assigned

        to a different identifier.  The next statement is new to us, 

        the REWRITE statement.   This name apparently comes from the 

        words  REset  for WRITEing because that is exactly  what  it 

        does.   It  clears  the entire file of any  prior  data  and 

        prepares to write into the very beginning of the file.  Each 

        time you write into it,  the file grows by the amount of the 

        data written.

            Once  the identifier has been defined,  and the  REWRITE 

        has  been  executed,  writing  to the file is  identical  to 

        writing  to the display with the addition of the  identifier 

        being specified before the first output field.  With that in 

        mind, you should have no trouble comprehending the operation 

        of the program.   It is similar to the last program,  except 

        that  it  numbers the lines as the file  is  copied.   After 

        running  the  program,  look on your default  disk  for  the 

        filename  you  input when it asked for the output  filename.  

        Examine that file to see if it is truly a copy of the  input 

        file with line numbers added.   One word of caution:  if you

        used an existing filename for the output file,  the file was 

        overwritten,  and the original destroyed.   In that case, it 

        was  good that you followed instructions at the beginning of 

        this  tutorial and made a working copy.   You did  do  that, 

        didn't you?
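
            The copy operation described above can be sketched as

        follows.   This is a minimal sketch,  not necessarily the

        READSTOR program itself;  all of the identifiers and the

        prompts are assumptions made for illustration.

```pascal
program ReadStorSketch;   (* illustrative sketch only *)

var
  in_file, out_file : text;         (* hypothetical names *)
  in_name, out_name : string[12];
  big_line          : string[80];
  line_number       : integer;

begin
  write('File to read  ---> ');
  readln(in_name);
  assign(in_file, in_name);
  reset(in_file);

  write('File to write ---> ');
  readln(out_name);
  assign(out_file, out_name);
  rewrite(out_file);     (* empties the file if it already exists *)

  line_number := 0;
  while not eof(in_file) do
  begin
    readln(in_file, big_line);
    line_number := line_number + 1;
    (* same as writing to the display, plus the file identifier *)
    writeln(out_file, line_number:5, '  ', big_line);
  end;

  close(in_file);
  close(out_file);       (* writes the last buffer to the disk *)
end.
```

        Use a new filename for the output when trying this,  since

        REWRITE destroys the prior contents of an existing file.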

                   HOW TO READ INTEGER DATA FROM A FILE

            It is well and good to be able to read text from a file, 

        but now it is time to read numeric data from a file.  First

        we will read data from a text file, then later from a binary 

        file.   Examine  the  program  READINTS for  an  example  of 

        reading data from a text file.  A text file is an ASCII file 

        that can be read by a text editor, printed, displayed, or in 

        some cases, compiled and executed.  It is simply a file made 

        up of a long string of CHAR type data,  and usually includes 

        linefeeds, carriage returns, and blanks for neat formatting.  

        Nearly  every  file on the Tutorial disk you  received  with 

        this  package is a text file.   The notable exception is the 

        file named LIST.COM, which is an executable program file.

            The example program has nothing new;  you have seen

        everything in it before.  We have an assignment, followed by 

        a reset of our file,  followed by four read and write loops.  

        Each  of the loops has a subtle difference to illustrate the 

        READ  and READLN statements.   Notice that the same file  is 

        read in four times with a RESET prior to each,  illustrating 

        the nondestructive read mentioned a few paragraphs ago.

            The file we will be using is named INTDATA.TXT and is on 

        your  disk.   You  could display it at this time  using  the

        program  READDISP  we covered recently.   Notice that it  is

        simply  composed  of  the integer values  from  101  to  148 

        arranged four to a line with a couple of spaces between each 

        for  separation and a neat appearance.   The important thing 

        to remember is that there are four data points per line.

                  READ AND READLN ARE SLIGHTLY DIFFERENT

            As  variables  are read in with  either  procedure,  the 

        input  file  is scanned for the variables  using  blanks  as 

        delimiters.  If there are not enough data points on one line 

        to satisfy the arguments in the input list, the next line is 

        searched also,  and the next,  etc.  Finally when all of the 

        arguments  in  the  input list are satisfied,  the  READ  is 

        complete, but the READLN is not.  If it is a READ procedure, 

        the input pointer is left at that point in the file,  but if 

        it is a READLN procedure,  the input pointer is advanced  to 

        the  beginning of the next line.   The next paragraph should 

        clear that up for you.

            The input data file INTDATA.TXT has four data points per 

        line but the first loop in the program READINTS.PAS requests 

        only  three  each time through the  loop.   The  first  time 

        through, it reads the values 101, 102, and 103, and displays 

        those  values,  leaving the input pointer just prior to  the 

        104, because it is a READ procedure.  The next time through, 

        it reads the value 104,  advances to the next line and reads 

        the values 105,  and 106,  leaving the pointer just prior to 

        the 107.  This continues until the 5 passes through the loop 

        are completed.

            The next loop contains a READLN procedure and also reads 

        the values 101,  102,  and 103, but when the input parameter 

        list is satisfied,  it moves the pointer to the beginning of 

        the next line,  leaving it just before the 105.   The values 

        are printed out and the next time we come to the READLN,  we 

        read the 105,  106, and 107, and the pointer is moved to the 

        beginning  of  the next line.   It would be good to run  the 

        program now to see the difference in output data for the two 

        loops.

            When you come back to the program again, notice the last 

        two loops, which operate much like the first two except that 

        there  are  now five requested integer  variables,  and  the 

        input  file  still  only has four  per  line.   This  is  no 

        problem.   Both  input procedures will simply read the first 

        four in the first line,  advance to the second line for  its 

        required  fifth  input,  and each will do its own  operation 

        next.   The READ procedure will leave the input pointer just 

        before  the second data point of the second  line,  and  the 

        READLN  will  advance the input pointer to the beginning  of

        the  third  line.   Run this program and  observe  the  four

        output fields to see an illustration of these principles.
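
            The difference between READ and READLN can also be

        seen in a small self-contained sketch.   It is not the

        READINTS program;  it writes its own scratch file,  here

        given the hypothetical name SKETCH.TXT,  with four values

        per line,  and then reads three values per pass with each

        procedure.   All identifiers are assumptions.

```pascal
program ReadVsReadln;   (* illustrative sketch only *)

var
  data    : text;
  a, b, c : integer;
  pass    : integer;

begin
  (* build a small data file with four values per line *)
  assign(data, 'SKETCH.TXT');    (* hypothetical scratch file *)
  rewrite(data);
  writeln(data, '101  102  103  104');
  writeln(data, '105  106  107  108');
  writeln(data, '109  110  111  112');
  close(data);

  reset(data);
  for pass := 1 to 3 do
  begin
    read(data, a, b, c);         (* pointer stays after third value *)
    writeln('READ   ', a:5, b:5, c:5);
  end;

  reset(data);                   (* nondestructive: read it again   *)
  for pass := 1 to 3 do
  begin
    readln(data, a, b, c);       (* pointer moves to the next line  *)
    writeln('READLN ', a:5, b:5, c:5);
  end;
  close(data);
end.
```

        With four values per line,  the READ passes should display

        101 102 103,  then 104 105 106,  then 107 108 109,  while

        the READLN passes should display 101 102 103,  then

        105 106 107,  then 109 110 111.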

                NOW TO READ SOME REAL VARIABLES FROM A FILE

            By  whatever method you desire,  take a look at the file 

        named  REALDATA.TXT supplied on your Pascal  Tutorial  disk.  

        You  will see 8 lines of what appears to be scrambled  data, 

        but it is good data that Pascal can read.  Notice especially 

        line  4  which has some data missing,  and line 6 which  has 

        some extra data.

            Examine the program file READDATA which will be used  to 

        illustrate  the  method of reading  REAL  data.   Everything 

        should be familiar to you,  since there is nothing new here.  

        Notice  the READLN statement.  It is requesting one  integer 

        variable,  and  three real variables,  which is what most of 

        the input file contained.   When we come to the fourth line, 

        there are not enough data points available, so the first two 

        data points of the next line are read to complete the fourth 

        pass.  Since the pointer is advanced to the beginning of the 

        next line,  we are automatically synchronized with the  data 

        again.   When  we come to the sixth line,  the last two data 

        points  are simply ignored.   Run the program to see if  the 

        results are as you would predict.

            If a READ were substituted for the READLN,  the  pointer 

        would not be advanced to the beginning of line 6,  after the 

        fourth  pass  through the loop.   The next attempt  to  read 

        would result in trying to read the .0006 as an INTEGER,  and 

        a  run time error would result.   Modify the program and see 

        if this is not true.
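
            The resynchronizing effect of READLN can be tried in a

        small self-contained sketch.   It is not the READDATA

        program and does not use REALDATA.TXT;  it writes its own

        hypothetical scratch file,  SKETCH2.TXT,  containing one

        short line and one line with extra data.   All identifiers

        and data values are assumptions.

```pascal
program ReadRealsSketch;   (* illustrative sketch only *)

var
  data    : text;
  number  : integer;
  x, y, z : real;

begin
  assign(data, 'SKETCH2.TXT');          (* hypothetical scratch file *)
  rewrite(data);
  writeln(data, '1  1.5  2.5  3.5');
  writeln(data, '2  4.5');              (* some data missing         *)
  writeln(data, '5.5  6.5  99');        (* finishes the short line   *)
  writeln(data, '3  7.5  8.5  9.5  77  88');   (* extra data         *)
  close(data);

  reset(data);
  while not eof(data) do
  begin
    (* one integer and three reals per pass; READLN then skips *)
    (* whatever is left on the line where the last value was   *)
    readln(data, number, x, y, z);
    writeln(number:4, x:10:3, y:10:3, z:10:3);
  end;
  close(data);
end.
```

        The second pass takes its last two values from the third

        line and discards the rest of that line,  so the third

        pass starts cleanly on the fourth line and ignores the

        extra values at its end.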

            That is all there is to reading and writing text  files.  

        If  you  learn the necessities,  you will not  be  stumbling 

        around   in   the  area  of  input/output  which   is   very 

        intimidating to many people.  Remember to ASSIGN, then RESET 

        before  reading,  REWRITE before writing,  and CLOSE  before 

        quitting.   It is of the utmost importance to close a file

        you have been writing to before quitting,  because closing

        it writes the last few buffers to the disk.   It is less

        important to close files you have only read unless you are

        using a lot of them,  as there is an implementation

        dependent limit on how many files can be

        open at once.  It is possible to read from a file, close it, 

        reopen it,  and write in it in one program.  You can reuse a 

        file  as often as you desire in a program,  but  you  cannot 

        read from and write into a file at the same time.

                      NOW FOR BINARY INPUT AND OUTPUT

            Examine  the file BINOUT for an example of writing  data 

        to a file in binary form.   First there is a record  defined 

        in  the  type declaration part composed of  three  different 

        variable types.   In the VAR part,  "output_file" is defined 

        as  a "FILE of dat_rec",  the record defined  earlier.   The 

        variable  "dog_food"  is  then defined as an  array  of  the 

        record, and a simple variable is defined.

            Any file assigned a type of TEXT,  which behaves much like

        a "FILE of CHAR" divided into lines,  is a text file.   A

        text file can be read and modified

        with a text editor,  printed out,  displayed on the monitor, 

        etc. If a file is defined with any other definition, it will 

        be  a  binary  file  and will be in an  internal  format  as 

        defined by the Pascal compiler.   Attempting to display such 

        a file will result in very strange looking gibberish on  the 

        monitor.

            When we get to the program,  the output file is assigned 

        a name,  and a REWRITE is performed on it to set the  output

        pointer to the beginning of the file,  empty the file,  and

        prepare  for  writing data into it.   The next  loop  simply 

        assigns  nonsense  data to all of the variables  in  the  20 

        records so we have something to work with.

            We  finally write a message to the display that  we  are 

        ready  to start outputting data,  and we output the data one 

        record at a time with the standard WRITE statement.   A  few 

        cautions are in order here.   The output file can be defined 

        as  any simple variable type,  INTEGER,  BYTE,  REAL,  or  a 

        record, but cannot be mixed.  The record however, can be any 

        combination of data including other records, if desired, but 

        any  file  can only have one type of record written  to  it.  

        Also,  a  WRITELN  statement  is illegal when writing  to  a 

        binary file because a binary file is not line  oriented.   A 

        WRITE   statement  is  limited  to  one  output  field   per 

        statement.  It is a simple matter to put one WRITE statement 

        in  the  program for each variable you wish to write out  to 

        the  file.   It is important to CLOSE the file when you  are 

        finished writing to it.
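
            The following is a minimal sketch of binary output as

        described above.   It is not necessarily the same as

        BINOUT:  the names "dat_rec",  "output_file",  and

        "dog_food" come from the text,  but the record fields,

        the data values,  the name "rec_num",  and the filename

        SKETCH.BIT are assumptions made for illustration.

```pascal
program BinOutSketch;   (* illustrative sketch only *)

type
  dat_rec = record                  (* three different types    *)
              count : integer;      (* field names hypothetical *)
              size  : real;
              title : string[30];
            end;

var
  output_file : file of dat_rec;
  dog_food    : array[1..20] of dat_rec;
  rec_num     : integer;            (* hypothetical loop index  *)

begin
  assign(output_file, 'SKETCH.BIT');   (* hypothetical filename *)
  rewrite(output_file);

  (* fill all 20 records with nonsense data *)
  for rec_num := 1 to 20 do
  begin
    dog_food[rec_num].count := rec_num;
    dog_food[rec_num].size  := 0.25 * rec_num;
    dog_food[rec_num].title := 'Nonsense data';
  end;

  writeln('Ready to start outputting data.');
  for rec_num := 1 to 20 do
    write(output_file, dog_food[rec_num]);   (* one record each *)

  close(output_file);                (* important after writing *)
end.
```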

                           WHY USE A BINARY FILE

            A binary file written by a Pascal program cannot be read 

        by a word processor,  a text editor,  or an application

        program such as a database or spreadsheet.   It may not

        even be readable by a Pascal program compiled with another

        company's compiler because the data is implementation

        dependent.   It can't even be read by a Pascal program using 

        the  correct compiler unless the data structure is identical

        to the one used to write the file.  With all these rules, it

        seems  like  a  silly way to  output  data,  but  there  are 

        advantages to using a binary output.

            A binary file usually uses less file space than a

        corresponding text file because the data is stored in its

        compact internal form.   Since all significant digits of

        REAL data are stored,  it is also more precise,  unless you

        are careful to output all significant digits to the

        corresponding TEXT file.   Finally,

        since the binary data does not require formatting into ASCII 

        characters,  it will be considerably faster than  outputting 

        it  in TEXT format.   When you run the example  program,  it 

        will create the file KIBBLES.BIT,  and put 20 records in it.  

        Return  to  DOS  and  look  for this  file  and  verify  its 

        existence.   If  you try to TYPE it,  you will have  a  real 

        mess, but that might be a good exercise.

                           READING A BINARY FILE

            BININ  is another example program that will read in  the 

        file  we just created.   Notice that the variables are named 

        differently,  but the types are all identical to those  used 

        to  write  the  file.   An additional line is found  in  the 

        program,  the IF statement.   We must check for the "end  of 

        file"  marker to stop reading when we find it or Pascal will 

        list  an  error and terminate operation.   Three  pieces  of 

        information  are written out to verify that we actually  did 

        read the data file in.

            Once  again,  a  few rules are in order.   A  READLN  is 

        illegal since there are no lines in a binary file,  and only 

        one variable or record can be read in with a READ statement.
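
            Reading records back with an "end of file" check can be

        sketched as follows.   This is not the BININ program;  to

        stay self-contained it first writes five records to a

        hypothetical file,  SKETCH.BIT,  and then asks for ten.

        The record fields and all identifiers are assumptions.

```pascal
program BinInSketch;   (* illustrative sketch only *)

type
  dat_rec = record                 (* must match the layout used *)
              count : integer;     (* when the file was written  *)
              size  : real;
              title : string[30];
            end;

var
  data_file : file of dat_rec;
  one_rec   : dat_rec;
  rec_num   : integer;

begin
  (* write five records so there is something to read *)
  assign(data_file, 'SKETCH.BIT');   (* hypothetical filename *)
  rewrite(data_file);
  for rec_num := 1 to 5 do
  begin
    one_rec.count := rec_num;
    one_rec.size  := 0.25 * rec_num;
    one_rec.title := 'Sketch record';
    write(data_file, one_rec);
  end;
  close(data_file);

  (* deliberately ask for more records than the file holds *)
  reset(data_file);
  for rec_num := 1 to 10 do
    if not eof(data_file) then       (* guard against reading past *)
    begin                            (* the end of the file        *)
      read(data_file, one_rec);      (* one record per READ        *)
      writeln(one_rec.count:4, one_rec.size:8:2, '  ', one_rec.title);
    end;
  close(data_file);
end.
```

        Only the five records actually in the file are displayed;

        without the IF test,  the sixth READ would attempt to read

        past the end of the file.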

            WHAT ABOUT FILE POINTERS, GET, AND PUT STATEMENTS?

            File pointers and the GET and PUT procedures are a  part 

        of standard Pascal,  but since they are redundant,  they are 

        not  a  part of TURBO Pascal.   The standard READ and  WRITE 

        procedures are more flexible,  more efficient, and easier to 

        use.   The  use  of GET and PUT will not be  illustrated  or 

        defined here.  If you ever have any need for them, they will 

        be covered in detail in your Pascal reference manual for the 

        particular implementation you are using.

            Pointers  will be covered in detail in the next  chapter 

        of this tutorial.

                           PROGRAMMING EXERCISES

        1.  Write a program to read in any text file, and display it 

            on  the  monitor  with line numbers and  the  number  of 

            characters  in each line.  Finally display the number of 

            lines  found  in  the file,  and  the  total  number  of 

            characters in the entire file.  Compare this number with 

            the filesize given by the DOS command DIR.

        2.  Write  a silly program that will read two text files and 

            display  them both on the monitor on alternating  lines. 

            This is the same as "shuffling" the two files  together. 

            Take  care  to  allow them to end  at  different  times, 

            inserting  blank  lines  for the  file  that  terminates 

            earlier.