The data structures in the C programming language are simpler than those offered in Java because there is no notion of “class” or “object”. C offers basic data types and two constructions to create more complex data. The access control to data present in Java (private, public and protected methods and fields) does not exist in C. Variables are either global, local to a file or local to a block of code.
C offers three basic data types:
Integers, defined with the keyword “int”.
Characters, defined with the keyword “char”.
Real or floating point numbers, defined with the keywords “float” or “double”.
Integers are defined with “int”, and two optional prefixes are allowed: “short” and “long”. These prefixes modify the size in bits of the integer. Thus, there exist three types of integers: “int”, “short int” (which can be abbreviated “short”) and “long int” (which can be abbreviated “long”).
The C programming language does not define a fixed size for the basic data types. The only guarantee is that a “short int” has a size less than or equal to that of an “int”, which in turn has a size less than or equal to that of a “long int”. This feature of the language has made the creation of programs that are compatible with multiple platforms quite complex.
An integer type may also carry the prefix “unsigned”, which restricts it to natural numbers (greater than or equal to zero).
In your development environment create a text file with the following structure (you may simply copy and paste the text in the following frame):
int main() { }
Insert integer definitions in the “main” function to test all possible combinations (up to ten). To check that the program is syntactically correct, open a window with a command interpreter and, in the folder where you created the file, type the command gcc -Wall -o program file.c replacing file.c with the name of the file you created. If the command does not print any message, your program is correct. The compiler generates an executable file named “program”, which you may delete.
Answer the following questions (check your answers also compiling the program):
Variables of type character are declared as “char”. To refer to a character, the symbol must be surrounded by single quotes: 'M'. Characters are internally represented as numbers and the C language allows arithmetic operations with them such as 'M' + 25.
Strings are represented as tables of “char”. The library functions that manipulate strings all assume that the last byte of the string has the value zero. Strings are written in the program surrounded by double quotes, and the compiler appends the value zero at the end. The following example contains two definitions:
#define SIZE 6
char a = 'A';
char b[SIZE] = "hello";
Why does the second definition have a size of six characters when the string has only five?
Reuse the program from the previous section and add char and string definitions. For the strings, use different table sizes (too small and too large for the string). Also write arithmetic expressions over the characters. Remember that if the compiler does not emit any message, the program is correct.
Consider the following declaration:
#define SIZE 6
char m[SIZE] = 'strag';
If you think the answers are wrong, write the declaration in a program and compile it.
Real numbers are defined with “float” or “double”. The difference is the precision used to represent the numbers internally. There are infinitely many real numbers, but the computer represents them with a finite number of bits. The more bits used, the better the precision. Real numbers defined with “double” typically occupy twice the size of those declared as “float”. As in the case of the integers, the size of these representations depends on the platform.
Some platforms offer an extra type of real number, with a size larger than “double”, that is defined as “long double”. Typical sizes for the “float”, “double” and “long double” data types are 4, 8 and 12 bytes respectively. Some examples of real number definitions follow.
float a = 3.5;
double b = -5.4e-12;
long double c = 3.54e320L; /* the L suffix makes the constant a long double */
Add to the program used in the previous sections floating point numbers. Try to define very large and very small numbers to see the representation capabilities of each of the three types. Compile to check if the definitions are correct.
C arrays resemble arrays in Java: the size, surrounded by square brackets, follows the array name. As in Java, table elements begin with index zero. Some examples of array definitions are the following:
#define SIZE_TABLE 100
#define SIZE_SHORT 5
#define SIZE_LONG 3
#define SIZE_NAME 10
int table[SIZE_TABLE];
short st[SIZE_SHORT] = { 1, 2, 3, 4, 5 };
long lt[SIZE_LONG] = { 20, 30, 40 };
char name[SIZE_NAME];
Table elements are accessed with the table name followed by the index in square brackets.
One of the differences between C and Java is that C performs no array bounds checking. If an array is accessed with an incorrect index in a Java program, an exception of type “ArrayIndexOutOfBoundsException” is thrown. This check is never done in C (unless it is explicitly written in the program). If an array is accessed with an incorrect index, the data in an incorrect memory area is manipulated, and the program proceeds with the execution.
After this incorrect access, two things may happen. The first one is that the accessed memory location is out of the limits of the program. In this case the execution terminates abruptly and the command interpreter shows the message “segmentation fault”. The second possibility is that a memory location still inside the program data area is accessed and the program keeps executing. This situation will likely produce an error with symptoms that are difficult to relate to the incorrect access.
C allows the definition of arrays with multiple dimensions by writing the sizes one after another, each surrounded by square brackets. The access is done by providing as many indexes as required, each of them surrounded by square brackets. As in the case of one-dimensional arrays, C performs no check on the indexes when accessing an element. Some examples of definitions of arrays with multiple dimensions follow.
#define MATRIX_A 100
#define MATRIX_B 30
#define COMMON_SIZE 10
int matrix[MATRIX_A][MATRIX_B];
long squarematrix[COMMON_SIZE][COMMON_SIZE];
char soup[COMMON_SIZE][COMMON_SIZE];
Add to the previous program definitions and manipulations of arrays of the different basic data types. Check that they are syntactically correct with the compiler.
The size of the basic data types in C may vary from one platform to another. This language feature has been highly criticized because it may translate into compatibility problems (an application behaves differently when executed on different platforms).
As an example, the following table shows the sizes of the data types on the Linux/Intel i686 platform.
Table 2.1. Size of C data types in the Linux/Intel i686 platform
Type | Size (bytes) |
---|---|
char, unsigned char | 1 |
short int, unsigned short int | 2 |
int, unsigned int, long int, unsigned long int | 4 |
float | 4 |
double | 8 |
long double | 12 |