Preprocessor directives
Preprocessor directives are lines included in the code of our programs that are not program statements but directives for the preprocessor. These lines are always preceded by a pound sign (#). The preprocessor is executed before the actual compilation of code begins, therefore the preprocessor digests all these directives before any code is generated by the statements.
These preprocessor directives
extend only across a single line of code. As soon as a newline character is
found, the preprocessor directive is considered to end. No semicolon (;) is
expected at the end of a preprocessor directive. The only way a preprocessor
directive can extend through more than one line is by preceding the newline
character at the end of the line by a backslash (\).
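For example, a single macro definition can be split across two lines this way (MAX_OF is just an illustrative name, not from the original text):

// The trailing backslash splices the next line onto the directive,
// so both lines form one #define as far as the preprocessor is concerned.
#define MAX_OF(a,b) \
        ((a) > (b) ? (a) : (b))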
Macro definitions (#define, #undef)
To
define preprocessor macros we can use #define. Its
format is:
#define identifier replacement
When the preprocessor
encounters this directive, it replaces any occurrence of identifier in the rest of the code by replacement. This
replacement can be an expression, a statement, a block or simply anything.
The preprocessor does not understand C++, it simply replaces any occurrence of identifier by replacement.
#define TABLE_SIZE 100
int table1[TABLE_SIZE];
int table2[TABLE_SIZE];
After the preprocessor has
replaced TABLE_SIZE, the code becomes equivalent to:
int table1[100];
int table2[100];
This use of #define to define constants is already familiar from previous sections, but #define can also work with parameters to define function macros:
#define getmax(a,b) a>b?a:b
This would replace any
occurrence of getmax followed by two arguments by the replacement expression,
also replacing each parameter by the corresponding argument, exactly as you
would expect if it were a function:
// function macro
#include <iostream>
using namespace std;

#define getmax(a,b) ((a)>(b)?(a):(b))

int main ()
{
  int x=5, y;
  y = getmax(x,2);
  cout << y << endl;
  cout << getmax(7,x) << endl;
  return 0;
}

Output:
5
7
Defined macros are not
affected by block structure. A macro lasts until it is undefined with the
#undef preprocessor directive:
#define TABLE_SIZE 100
int table1[TABLE_SIZE];

#undef TABLE_SIZE
#define TABLE_SIZE 200
int table2[TABLE_SIZE];
This would generate the same
code as:
int table1[100];
int table2[200];
Function macro definitions
accept two special operators (# and ##) in the replacement sequence:
If the operator # is used before a parameter is used in the replacement sequence, that parameter is replaced by a string literal (as if it were enclosed between double quotes)
#define str(x) #x

cout << str(test);
This would be translated
into:
cout << "test";
The operator ## concatenates two arguments leaving no blank spaces between
them:
#define glue(a,b) a ## b

glue(c,out) << "test";
This would also be translated
into:
cout << "test";
Because preprocessor
replacements happen before any C++ syntax check, macro definitions can be a
tricky feature. Be careful: code that relies heavily on complicated macros
may appear obscure to other programmers, since the syntax macros expect is
often different from the regular syntax programmers expect in C++.
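As a small sketch of why such macros deserve caution, the getmax macro from above evaluates one of its arguments twice, which a real function would not do:

// pitfall: macro arguments are substituted textually, not evaluated once
#include <iostream>
using namespace std;

#define getmax(a,b) ((a)>(b)?(a):(b))

int main ()
{
  int x = 5;
  // expands to ((x++)>(3)?(x++):(3)): x is incremented twice when it wins
  int y = getmax(x++, 3);
  cout << "y=" << y << " x=" << x << endl;   // prints y=6 x=7, not y=5 x=6
  return 0;
}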
Conditional
inclusions (#ifdef, #ifndef, #if, #endif, #else and #elif)
These directives allow part of the code of a program to be included or discarded if a certain condition is met.
#ifdef allows a section of a
program to be compiled only if the macro that is specified as the parameter has
been defined, regardless of its value. For example:
#ifdef TABLE_SIZE
int table[TABLE_SIZE];
#endif
In this case, the line of
code int
table[TABLE_SIZE]; is only compiled if TABLE_SIZE was previously defined with #define,
independently of its value. If it was not defined, that line will not be
included in the program compilation.
#ifndef serves for the exact
opposite: the code between #ifndef and #endif directives is only compiled if the specified identifier has not
been previously defined. For example:
#ifndef TABLE_SIZE
#define TABLE_SIZE 100
#endif
int table[TABLE_SIZE];
In this case, if the TABLE_SIZE macro
has not yet been defined when this piece of code is reached, it is defined to a
value of 100. If it already exists it keeps its previous value, since the #define
directive is not executed.
The #if, #else and #elif (i.e., "else if") directives serve to specify some condition
to be met in order for the portion of code they surround to be compiled. The
condition that follows #if or #elif can only evaluate constant expressions, including macro
expressions. For example:
#if TABLE_SIZE>200
#undef TABLE_SIZE
#define TABLE_SIZE 200

#elif TABLE_SIZE<50
#undef TABLE_SIZE
#define TABLE_SIZE 50

#else
#undef TABLE_SIZE
#define TABLE_SIZE 100
#endif

int table[TABLE_SIZE];
Notice how the whole
structure of #if, #elif and #else chained directives ends with #endif.
The behavior of #ifdef and #ifndef can also be achieved by
using the special operators defined and !defined respectively in any #if or #elif directive:
#if !defined TABLE_SIZE
#define TABLE_SIZE 100
#elif defined ARRAY_SIZE
#define TABLE_SIZE ARRAY_SIZE
#endif

int table[TABLE_SIZE];
Line control (#line)
When
we compile a program and an error happens during the compilation process, the
compiler shows an error message with references to the name of the file where
the error happened and a line number, so it is easier to find the code
generating the error.
The #line directive allows us to control both things: the line numbers
within the code files as well as the file name that we want to appear when
an error takes place. Its format is:
#line number "filename"
Where number is the new line number that will be assigned to the next code
line. The line numbers of successive lines will be increased one by one from
this point on.
"filename" is an
optional parameter that allows to redefine the file name that will be shown.
For example:
#line 20 "assigning variable" int a?; |
This code will generate an
error that will be shown as error in file "assigning variable", line 20.
Error directive
(#error)
This
directive aborts the compilation process when it is found, generating a
compilation error that can be specified as its parameter:
#ifndef __cplusplus
#error A C++ compiler is required!
#endif
This example aborts the
compilation process if the macro name __cplusplus is
not defined (this macro name is defined by default in all C++ compilers).
Source file inclusion
(#include)
This
directive has also been used assiduously in other sections of this tutorial.
When the preprocessor finds an #include directive it replaces
it by the entire content of the specified file. There are two ways to specify a
file to be included:
#include "file" #include <file> |
The only difference between
the two expressions is the places (directories) where the compiler is going to
look for the file. In the first case, where the file name is specified between
double quotes, the file is searched for first in the same directory as the file
containing the directive. If it is not found there, the compiler searches
the default directories where it is configured to look for the
standard header files. If the file name
is enclosed between angle brackets <>, the
file is searched for directly where the compiler is configured to look for the
standard header files. Therefore, standard header files are usually included in
angle brackets, while other specific header files are included using quotes.
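As an illustrative sketch (myheader.h is a hypothetical project header used only for this example), a typical source file mixes both forms:

// myheader.h -- hypothetical project header placed next to main.cpp:
//   #define GREETING "Hello from myheader.h"

// main.cpp
#include <iostream>      // standard header: searched in the compiler's include directories
#include "myheader.h"    // project header: searched first in the directory of this file
using namespace std;

int main ()
{
  cout << GREETING << endl;
  return 0;
}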
Pragma directive
(#pragma)
This
directive is used to specify diverse options to the compiler. These options are
specific for the platform and the compiler you use. Consult the manual or the
reference of your compiler for more information on the possible parameters that
you can define with #pragma.
If the compiler does not
support a specific argument for #pragma, it is ignored - no
error is generated.
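For example, one pragma that most mainstream compilers accept (although it is not part of the standard) is #pragma once, shown here as a hedged sketch:

// myclass.h -- a hypothetical header file
// Non-standard but widely supported: tells the compiler to include this
// header at most once per translation unit, much like an include guard.
#pragma once

class MyClass {
  // ...
};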
Predefined macro names
The following macro names are always defined:
- __LINE__: Integer value representing the current line in the source code file being compiled.
- __FILE__: A string literal containing the presumed name of the source file being compiled.
- __DATE__: A string literal in the form "Mmm dd yyyy" containing the date on which the compilation process began.
- __TIME__: A string literal in the form "hh:mm:ss" containing the time at which the compilation process began.
- __cplusplus: An integer value. All C++ compilers have this constant defined to some value. If the compiler is fully compliant with the C++ standard, its value is equal to or greater than 199711L, depending on the version of the standard it complies with.
For example:
// standard macro names
#include <iostream>
using namespace std;

int main ()
{
  cout << "This is the line number " << __LINE__;
  cout << " of file " << __FILE__ << ".\n";
  cout << "Its compilation began " << __DATE__;
  cout << " at " << __TIME__ << ".\n";
  cout << "The compiler gives a __cplusplus value of " << __cplusplus;
  return 0;
}

Output:
This is the line number 7 of file /home/jay/stdmacronames.cpp.
Its compilation began Nov 1 2005 at 10:12:29.
The compiler gives a __cplusplus value of 1
Input/output with files
C++ provides the following classes to perform output and input of characters
to/from files:
- ofstream:
Stream class to write on files
- ifstream:
Stream class to read from files
- fstream:
Stream class to both read and write from/to files.
These classes are derived
directly or indirectly from the classes istream and ostream. We have already used objects whose types were these classes: cin is an object of class istream and cout is an object of class ostream.
Therefore, we have already been using classes that are related to our file
streams. In fact, we can use our file streams the same way we are already
used to using cin and cout, with the only difference
that we have to associate these streams with physical files.
Let's see an example:
// basic file operations
#include <iostream>
#include <fstream>
using namespace std;

int main () {
  ofstream myfile;
  myfile.open ("example.txt");
  myfile << "Writing this to a file.\n";
  myfile.close();
  return 0;
}

[file example.txt]
Writing this to a file.
This code creates a file
called example.txt and inserts a sentence into it in the same way we are used
to doing with cout, but using the file stream myfile
instead.
But let's go step by step:
Open a file
The first operation generally
performed on an object of one of these classes is to associate it with a real
file. This procedure is known as opening a file. An open file is
represented within a program by a stream object (an instantiation of one of
these classes; in the previous example this was myfile), and
any input or output operation performed on this stream object will be applied
to the physical file associated with it.
In order to open a file with
a stream object we use its member function open():
open (filename, mode);
Where filename is a null-terminated character sequence of type const char * (the same type that string literals have) representing the name
of the file to be opened, and mode is an optional parameter
with a combination of the following flags:
- ios::in: Open for input operations.
- ios::out: Open for output operations.
- ios::binary: Open in binary mode.
- ios::ate: Set the initial position at the end of the file. If this flag is not set to any value, the initial position is the beginning of the file.
- ios::app: All output operations are performed at the end of the file, appending the content to the current content of the file. This flag can only be used in streams open for output-only operations.
- ios::trunc: If the file opened for output operations already existed before, its previous content is deleted and replaced by the new one.
All these flags can be
combined using the bitwise operator OR (|). For
example, if we want to open the file example.bin in
binary mode to add data we could do it by the following call to member function
open():
ofstream myfile;
myfile.open ("example.bin", ios::out | ios::app | ios::binary);
Each one of the open() member functions of the classes ofstream, ifstream and fstream has a default mode that is
used if the file is opened without a second argument:
- ofstream: default mode ios::out
- ifstream: default mode ios::in
- fstream: default mode ios::in | ios::out
For the ifstream and ofstream classes, ios::in and ios::out are automatically and
respectively assumed, even if a mode that does not include them is passed as
the second argument to the open() member function.
The default value is only
applied if the function is called without specifying any value for the mode
parameter. If the function is called with any value in that parameter the
default mode is overridden, not combined.
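For instance (a minimal sketch based on the table above), passing an explicit mode to an fstream replaces its default ios::in | ios::out entirely:

#include <fstream>
using namespace std;

int main () {
  // The explicit mode overrides the default: this stream is input-only,
  // because ios::out is not combined in automatically for fstream.
  fstream readonly ("example.txt", ios::in);

  // To read and write, both flags must be named explicitly.
  fstream readwrite ("example.txt", ios::in | ios::out);

  return 0;
}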
File streams opened in binary
mode perform input and output operations independently of any format considerations.
Non-binary files are known as text files, and some translations may
occur due to formatting of some special characters (like newline and carriage
return characters).
Since the first task that is
performed on a file stream object is generally to open a file, these three
classes include a constructor that automatically calls the open() member function and has the exact same parameters as this
member. Therefore, we could also have declared the myfile object and performed the same opening operation in the previous
example by writing:
ofstream myfile ("example.bin", ios::out | ios::app | ios::binary);
This combines object construction and stream opening in a single statement. Both
forms of opening a file are valid and equivalent.
To check whether a file stream succeeded in opening a file, call its member
function is_open() with no arguments. This member function returns a bool value of
true if the stream object is indeed associated with an open file,
or false otherwise:
if (myfile.is_open()) { /* ok, proceed with output */ }
Closing a file
When
we are finished with our input and output operations on a file we shall close
it so that its resources become available again. In order to do that we have to
call the stream's member function close(). This member function
takes no parameters, and what it does is to flush the associated buffers and
close the file:
myfile.close();
Once this member function is
called, the stream object can be used to open another file, and the file is
available again to be opened by other processes.
In case an object is
destroyed while still associated with an open file, the destructor
automatically calls the member function close().
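As a small sketch of this behavior, the explicit close() call can be omitted when the stream object's lifetime ends anyway:

#include <fstream>
using namespace std;

void write_message () {
  ofstream myfile ("example.txt");
  myfile << "Written without an explicit close().\n";
}   // myfile is destroyed here, and its destructor closes the file

int main () {
  write_message();
  return 0;
}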
Text files
Text
file streams are those where we do not include the ios::binary flag
in their opening mode. These files are designed to store text and thus all
values that we input or output from/to them can suffer some formatting
transformations, which do not necessarily correspond to their literal binary
value.
Data output operations on
text files are performed in the same way we operated with cout:
// writing on a text file
#include <iostream>
#include <fstream>
using namespace std;

int main () {
  ofstream myfile ("example.txt");
  if (myfile.is_open())
  {
    myfile << "This is a line.\n";
    myfile << "This is another line.\n";
    myfile.close();
  }
  else cout << "Unable to open file";
  return 0;
}

[file example.txt]
This is a line.
This is another line.
Data input from a file can
also be performed in the same way that we did with cin:
// reading a text file
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main () {
  string line;
  ifstream myfile ("example.txt");
  if (myfile.is_open())
  {
    while (! myfile.eof() )
    {
      getline (myfile,line);
      cout << line << endl;
    }
    myfile.close();
  }
  else cout << "Unable to open file";
  return 0;
}

Output:
This is a line.
This is another line.
This last example reads a
text file and prints out its content on the screen. Notice how we have used a
new member function, called eof(), that returns true when the end of the file
has been reached. We have created a while loop that finishes when myfile.eof()
indeed becomes true (i.e., the end of the file has been reached).
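Note that looping on eof() can produce one extra, empty iteration after the last line has been read. A common alternative, shown here as a hedged variation of the same example, is to test the result of getline itself:

// reading a text file, testing getline's result instead of eof()
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main () {
  string line;
  ifstream myfile ("example.txt");
  if (myfile.is_open())
  {
    // getline returns the stream, which converts to false once input fails
    while ( getline (myfile,line) )
    {
      cout << line << endl;
    }
    myfile.close();
  }
  else cout << "Unable to open file";
  return 0;
}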
Checking state flags
In
addition to eof(), which checks if the end of file has been reached, other member
functions exist to check the state of a stream (all of them return a bool
value):
bad()
Returns true if a reading or
writing operation fails. For example in the case that we try to write to a file
that is not open for writing or if the device where we try to write has no
space left.
fail()
Returns true in the same
cases as bad(), but also in the case that a format error happens, like when an
alphabetical character is extracted when we are trying to read an integer
number.
eof()
Returns true if a file open
for reading has reached the end.
good()
It is the most generic state
flag: it returns false in the same cases in which calling any of the previous
functions would return true.
In order to reset the state
flags checked by any of these member functions we have just seen we can use the
member function clear(), which takes no parameters.
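As a small sketch of how these flags might be checked and reset (the failed extraction of an integer is just an illustrative trigger):

// checking and resetting stream state flags
#include <iostream>
#include <fstream>
using namespace std;

int main () {
  ifstream myfile ("example.txt");
  int number;
  myfile >> number;            // fails if the next characters are not a number

  if (myfile.fail())
  {
    cout << "format or read error\n";
    myfile.clear();            // reset the state flags so the stream can be used again
  }
  if (myfile.eof()) cout << "end of file reached\n";
  return 0;
}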
get and put stream pointers
All
i/o stream objects have, at least, one internal stream pointer:
ifstream, like istream, has a pointer known as the get pointer that points to
the element to be read in the next input operation.
ofstream, like ostream, has a pointer known as the put pointer that points to
the location where the next element has to be written.
Finally, fstream inherits both the get and the put pointers from iostream (which is itself derived from both istream and ostream).
These internal stream
pointers that point to the reading or writing locations within a stream can be
manipulated using the following member functions:
tellg() and tellp()
These
two member functions have no parameters and return a value of the member type pos_type, which is an integer data type representing the current
position of the get stream pointer (in the case of tellg) or
the put stream pointer (in the case of tellp).
seekg() and seekp()
These
functions allow us to change the position of the get and put stream pointers.
Both functions are overloaded with two different prototypes. The first
prototype is:
seekg ( position );
seekp ( position );
Using this prototype the
stream pointer is changed to the absolute position position
(counting from the beginning of the file). The type for this parameter is the
same as the one returned by functions tellg and tellp: the member type pos_type, which is an integer
value.
The other prototype for these
functions is:
seekg ( offset, direction );
seekp ( offset, direction );
Using this prototype, the
position of the get or put pointer is set to an offset value relative to a
specific point determined by the parameter direction. offset is of the member type off_type,
which is also an integer type. And direction is of
type seekdir, which is an enumerated type (enum) that
determines the point from which offset is counted, and that can take any
of the following values:
- ios::beg: offset counted from the beginning of the stream
- ios::cur: offset counted from the current position of the stream pointer
- ios::end: offset counted from the end of the stream
The following example uses
the member functions we have just seen to obtain the size of a file:
// obtaining file size
#include <iostream>
#include <fstream>
using namespace std;

int main () {
  long begin, end;
  ifstream myfile ("example.txt");
  begin = myfile.tellg();
  myfile.seekg (0, ios::end);
  end = myfile.tellg();
  myfile.close();
  cout << "size is: " << (end-begin) << " bytes.\n";
  return 0;
}

Output:
size is: 40 bytes.
Binary files
In
binary files, inputting and outputting data with the extraction and insertion
operators (<< and >>) and functions like getline is not efficient, since we do not need to format any data, and
the data may not use the separation codes that text files use to separate elements
(like spaces and newlines).
File streams include two
member functions specifically designed to input and output binary data
sequentially: write and read. The first one (write) is a member function of ostream
inherited by ofstream. And read is a member function of istream that is inherited by ifstream.
Objects of class fstream have both members. Their
prototypes are:
write ( memory_block, size );
read ( memory_block, size );
Where memory_block is of type "pointer to char" (char*), and represents the address of an array of bytes where the
read data elements are stored or from where the data elements to be written are
taken. The size parameter is an integer value that specifies the number of
characters to be read or written from/to the memory block.
// reading a complete binary file
#include <iostream>
#include <fstream>
using namespace std;

ifstream::pos_type size;
char * memblock;

int main () {
  ifstream file ("example.txt", ios::in|ios::binary|ios::ate);
  if (file.is_open())
  {
    size = file.tellg();
    memblock = new char [size];
    file.seekg (0, ios::beg);
    file.read (memblock, size);
    file.close();

    cout << "the complete file content is in memory";

    delete[] memblock;
  }
  else cout << "Unable to open file";
  return 0;
}

Output:
the complete file content is in memory
In this example the entire
file is read and stored in a memory block. Let's examine how this is done:
First, the file is opened with
the ios::ate flag, which means that the get pointer will be positioned at
the end of the file. This way, when we call the member tellg(), we directly obtain the size of the file. Notice the type
we have used to declare the variable size:
ifstream::pos_type size;
ifstream::pos_type is a specific type
used for buffer and file positioning and is the type returned by file.tellg(). This type is defined as an integer type, therefore we can
conduct on it the same operations we conduct on any other integer value, and
it can safely be converted to another integer type large enough to contain the
size of the file. For a file with a size under 2GB we could use int:
int size;
size = (int) file.tellg();
Once we have obtained the
size of the file, we request the allocation of a memory block large enough to
hold the entire file:
memblock = new char[size];
Right after that, we proceed
to set the get pointer at the beginning of the file (remember that we opened
the file with this pointer at the end), then read the entire file, and finally
close it:
file.seekg (0, ios::beg);
file.read (memblock, size);
file.close();
At this point we could
operate with the data obtained from the file. Our program simply announces that
the content of the file is in memory and then terminates.
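For completeness, a complementary sketch (not part of the original example) that writes a memory block to a binary file with write():

// writing a memory block to a binary file
#include <fstream>
using namespace std;

int main () {
  char buffer[] = "sample payload";            // the bytes to be written
  ofstream file ("example.bin", ios::out | ios::binary);
  if (file.is_open())
  {
    file.write (buffer, sizeof(buffer));       // raw bytes, no formatting applied
    file.close();
  }
  return 0;
}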
Buffers and Synchronization
When we operate with file
streams, these are associated to an internal buffer of type streambuf. This buffer is a memory block that acts as an intermediary
between the stream and the physical file. For example, with an ofstream, each time the member function put
(which writes a single character) is called, the character is not written
directly to the physical file with which the stream is associated. Instead of
that, the character is inserted in that stream's intermediate buffer.
When the buffer is flushed,
all the data contained in it is written to the physical medium (if it is an
output stream) or simply freed (if it is an input stream). This process is
called synchronization and takes place under any of the following
circumstances:
- When
the file is closed: before closing a file all buffers
that have not yet been flushed are synchronized and all pending data is
written or read to the physical medium.
- When
the buffer is full: Buffers have a certain size. When the
buffer is full it is automatically synchronized.
- Explicitly,
with manipulators: When certain manipulators are used on
streams, an explicit synchronization takes place. These manipulators are: flush and endl.
- Explicitly, with member function sync(): Calling stream's member function sync(), which takes no parameters, causes an immediate synchronization. This function returns an int value equal to -1 if the stream has no associated buffer or in case of failure. Otherwise (if the stream buffer was successfully synchronized) it returns 0.
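As a minimal sketch of the explicit synchronization with manipulators mentioned above:

#include <iostream>
#include <fstream>
using namespace std;

int main () {
  ofstream myfile ("example.txt");
  myfile << "first line" << endl;    // endl inserts '\n' and flushes the buffer
  myfile << "second line" << flush;  // flush synchronizes without adding a newline
  myfile.close();                    // closing the file also flushes pending data
  return 0;
}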
ASCII Codes
It is a very well-known fact that computers can manage internally only 0s (zeros) and 1s (ones). This is true, and by means of sequences of 0s and 1s the computer can express any numerical value as its binary translation, which is a very simple mathematical operation (as explained in the paper numerical bases).
Nevertheless, there is no such evident way to represent letters and other non-numeric characters with 0s and 1s. Therefore, in order to do that, computers use ASCII tables, which are tables or lists that contain all the letters in the roman alphabet plus some additional characters. In these tables each character is always represented by the same order number. For example, the ASCII code for the capital letter "A" is always represented by the order number 65, which is easily representable using 0s and 1s in binary: 65 expressed as a binary number is 1000001.
The standard ASCII table defines 128 character codes (from 0 to 127), of which, the first 32 are control codes (non-printable), and the remaining 96 character codes are representable characters:
*    0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
0   NUL  SOH  STX  ETX  EOT  ENQ  ACK  BEL  BS   TAB  LF   VT   FF   CR   SO   SI
1   DLE  DC1  DC2  DC3  DC4  NAK  SYN  ETB  CAN  EM   SUB  ESC  FS   GS   RS   US
2         !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
3    0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
4    @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
5    P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
6    `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
7    p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~
* This panel is organized to be easily read in hexadecimal: row numbers represent the first digit and column numbers represent the second one. For example, the "A" character is located at row 4, column 1, so it would be represented in hexadecimal as 0x41 (65).
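A small sketch showing how a character and its ASCII code can be converted back and forth in C++:

#include <iostream>
using namespace std;

int main () {
  char c = 'A';
  cout << c << " has ASCII code " << int(c) << "\n";   // prints: A has ASCII code 65
  cout << "code 97 is '" << char(97) << "'\n";         // prints: code 97 is 'a'
  return 0;
}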
Because most systems nowadays work with 8-bit bytes, which can represent 256 different values, in addition to the 128 standard ASCII codes there are another 128 known as extended ASCII, which are platform- and locale-dependent. So there is more than one extended ASCII character set.
The two most used extended ASCII character sets are the one known as OEM, which comes from the default character set incorporated in the IBM-PC, and the ANSI extended ASCII, which is used by most recent operating systems.
The first of them, the OEM character set, is the one used by the hardware of the immense majority of PC-compatible machines, and was also used under the old DOS system. It includes some foreign-language signs, some accented characters and box-drawing pieces used to represent panels.
The ANSI character set is a standard that many systems incorporate, like Windows, some UNIX platforms and many standalone applications. It includes many more local symbols and accented letters so that it can be used in many more languages without needing to be redefined.