UC3M

Telematic/Audiovisual Syst./Communication Syst. Engineering

Systems Architecture

September 2017 - January 2018

10.8.  Data storage in files

To store data persistently in files, you need a way to mark where each element starts and ends, where each data field starts and ends, and when there are no more elements in the file. For example, if you need to store a string of undefined length, one possible solution is to add an additional field that indicates the string length (in bytes) in the file.
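A minimal C sketch of this length-field idea follows. It assumes the length is written as a 4-byte unsigned integer in the machine's own byte order, so a file written on one machine may not be readable on a machine with a different byte order. The function names write_string and read_string are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write a string as a 4-byte length followed by its bytes (no '\0').
   Returns 0 on success, -1 on error. */
int write_string(FILE *f, const char *s)
{
    uint32_t len = (uint32_t)strlen(s);

    if (fwrite(&len, sizeof len, 1, f) != 1)
        return -1;
    if (len > 0 && fwrite(s, 1, len, f) != len)
        return -1;
    return 0;
}

/* Read a string stored by write_string.
   Returns a malloc'd, '\0'-terminated copy that the caller must free,
   or NULL on error or when there is nothing left to read. */
char *read_string(FILE *f)
{
    uint32_t len;
    char *s;

    if (fread(&len, sizeof len, 1, f) != 1)
        return NULL;
    s = malloc((size_t)len + 1);
    if (s == NULL)
        return NULL;
    if (len > 0 && fread(s, 1, len, f) != len) {
        free(s);
        return NULL;
    }
    s[len] = '\0';   /* the terminator is rebuilt in memory, not stored */
    return s;
}
```

Because the length is stored explicitly, the reader knows how many bytes to allocate before reading the string itself, and a NULL result at the end of the file signals that there are no more elements.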

The next figure shows one way to store the information about photos and authors in a single file. The first 4 bytes hold the total number of photos stored in the file, followed by the information for each photo, including its author information. Each integer takes 4 bytes in the file, and each string is preceded by an additional 4-byte field that gives the length of the string in bytes, so that it can be located.
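The photo-by-photo layout described above could be written from C roughly as follows. The record fields (a numeric id and a path for each photo, a numeric id and a name for each author) are assumptions made for illustration, as are all the type and function names. Integers are written in the machine's own byte order.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record types: the field names are assumptions. */
struct author {
    uint32_t id;
    const char *name;
};

struct photo {
    uint32_t id;
    const char *path;
    struct author author;   /* author data is repeated in every photo */
};

/* Write one 4-byte integer. Returns 0 on success, -1 on error. */
static int put_u32(FILE *f, uint32_t v)
{
    return fwrite(&v, sizeof v, 1, f) == 1 ? 0 : -1;
}

/* Write a string as a 4-byte length followed by its bytes. */
static int put_str(FILE *f, const char *s)
{
    uint32_t len = (uint32_t)strlen(s);

    if (put_u32(f, len) != 0)
        return -1;
    return fwrite(s, 1, len, f) == len ? 0 : -1;
}

/* File layout: [photo count], then for each photo:
   [photo id][path length][path][author id][name length][name]. */
int write_photo_file(FILE *f, const struct photo *photos, uint32_t count)
{
    uint32_t i;

    if (put_u32(f, count) != 0)
        return -1;
    for (i = 0; i < count; i++) {
        const struct photo *p = &photos[i];

        if (put_u32(f, p->id) != 0 || put_str(f, p->path) != 0 ||
            put_u32(f, p->author.id) != 0 || put_str(f, p->author.name) != 0)
            return -1;
    }
    return 0;
}
```

Note that two photos by the same author store that author's id and name twice.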

There are different ways to store the same information. The following figure shows another one, in which the information is grouped by author. The file starts with a 4-byte field that gives the total number of stored authors, followed by the information for each author. Each author's own data comes first, then a 4-byte number giving the total number of photos taken by that author, and finally the information for each of those photos.
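A comparable sketch for the author-grouped layout is shown below, under the same assumptions: the fields of each record (id and name for an author, id and path for a photo) and all names are hypothetical, and integers are written in the machine's own byte order.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record types: the field names are assumptions. */
struct photo_entry {
    uint32_t id;
    const char *path;
};

struct author_entry {
    uint32_t id;
    const char *name;
    uint32_t photo_count;               /* photos taken by this author */
    const struct photo_entry *photos;   /* array of photo_count entries */
};

/* Write one 4-byte integer. Returns 0 on success, -1 on error. */
static int emit_u32(FILE *f, uint32_t v)
{
    return fwrite(&v, sizeof v, 1, f) == 1 ? 0 : -1;
}

/* Write a string as a 4-byte length followed by its bytes. */
static int emit_str(FILE *f, const char *s)
{
    uint32_t len = (uint32_t)strlen(s);

    if (emit_u32(f, len) != 0)
        return -1;
    return fwrite(s, 1, len, f) == len ? 0 : -1;
}

/* File layout: [author count], then for each author:
   [author id][name length][name][photo count],
   followed by [photo id][path length][path] for each of the author's photos. */
int write_author_file(FILE *f, const struct author_entry *authors, uint32_t count)
{
    uint32_t i, j;

    if (emit_u32(f, count) != 0)
        return -1;
    for (i = 0; i < count; i++) {
        const struct author_entry *a = &authors[i];

        if (emit_u32(f, a->id) != 0 || emit_str(f, a->name) != 0 ||
            emit_u32(f, a->photo_count) != 0)
            return -1;
        for (j = 0; j < a->photo_count; j++) {
            if (emit_u32(f, a->photos[j].id) != 0 ||
                emit_str(f, a->photos[j].path) != 0)
                return -1;
        }
    }
    return 0;
}
```

Here each author's data is written once, and the per-author photo count tells a reader how many photo records to expect before the next author begins.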

Choosing among the different ways to store the information in a file requires weighing the advantages and disadvantages of each one; there is a trade-off. You must consider which solution occupies more space in the file (in the case presented, the first option occupies more space, because the same author information is repeated for every photo taken by that author) and which is more efficient in execution time for the operations the application requires. For example, if an application has to list the photos of a specific author many times, the second option is more efficient for that operation: the photos are already grouped by author, so the application does not have to search photo by photo, and once the author is located all the information is in consecutive positions of the file. However, for an application that only needs to list all the photos in the system with their information, without grouping them by author, the first option is more efficient: to add a new photo, the application can write it at the end of the file, without having to place it next to the other photos by the same author.
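The space side of this trade-off can be estimated with a little arithmetic. The sketch below is only an illustration: it assumes each photo stores an integer id and a path string, each author stores an integer id and a name string, every author has the same number of photos, and all names and all paths have the same length. The function names are hypothetical.

```c
#include <stddef.h>

enum { INT_BYTES = 4 };   /* every integer and every length field takes 4 bytes */

/* Bytes taken by a string field: a 4-byte length plus the characters. */
static size_t string_bytes(size_t len)
{
    return INT_BYTES + len;
}

/* First option: [photo count], then for each photo its own data
   followed by a full copy of its author's data. */
size_t size_by_photo(size_t authors, size_t photos_per_author,
                     size_t name_len, size_t path_len)
{
    size_t photos = authors * photos_per_author;
    size_t per_photo = INT_BYTES + string_bytes(path_len)    /* photo id, path */
                     + INT_BYTES + string_bytes(name_len);   /* author id, name */

    return INT_BYTES + photos * per_photo;
}

/* Second option: [author count], then for each author its data,
   a photo count, and the data of each of its photos. */
size_t size_by_author(size_t authors, size_t photos_per_author,
                      size_t name_len, size_t path_len)
{
    size_t per_author = INT_BYTES + string_bytes(name_len) + INT_BYTES;
    size_t per_photo = INT_BYTES + string_bytes(path_len);

    return INT_BYTES + authors * per_author
         + authors * photos_per_author * per_photo;
}
```

Under these assumptions, 10 authors with 100 photos each, 20-byte names and 30-byte paths take 66004 bytes in the first option and 38324 bytes in the second. The saving comes from not repeating the author data; if every author had a single photo, the second option would be slightly larger because of the extra per-author photo count.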

10.8.1.  Self-assessment questions

Answer the following questions

  1. What should you take into account before storing data in files?

    • The start and the end of the file

    • When a data field starts, when this data field finishes and when there are no more elements

  2. If you need to store a string in the file, you should also add an additional field that indicates the string length (in bytes), so that the string can be located.

    • It is not necessary because the string contains its size

    • You may use the \0 character

  3. To store an integer and a string of four characters in a file, the number of fields required is:

    • 3 fields

    • 4 bytes