Lexical Issues
Now that you have seen several short Java programs, it is time to more formally describe the atomic elements of Java. Java programs are a collection of whitespace, identifiers, comments, literals, operators, separators, and keywords. The operators are described in the next chapter. The others are described next.
Whitespace
Java is a free-form language. This means that you do not need to follow any special indentation rules. For example, the Example program could have been written all on one line or in any other strange way you felt like typing it, as long as there was at least one whitespace character between each token that was not already delineated by an operator or separator. In Java, whitespace is a space, tab, or newline.
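As a small sketch of what free-form means in practice, the two helper classes below contain the same method, once in a conventional layout and once squeezed onto a single line. The class and method names (WhitespaceDemo, Spaced, Cramped, twice) are made up for this illustration. Note that the one-line version still needs whitespace between tokens such as static, int, and twice, because no operator or separator stands between them.

```java
class WhitespaceDemo {
  public static void main(String[] args) {
    // Both layouts compile to the same logic, so the results match.
    System.out.println(Spaced.twice(21) == Cramped.twice(21));
  }
}

// Conventional layout: one statement per line, indented.
class Spaced {
  static int twice(int n) {
    return n * 2;
  }
}

// Same method with only the whitespace that the compiler requires.
class Cramped{static int twice(int n){return n*2;}}
```

The compiler treats both classes alike; the layout matters only to human readers.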
Identifiers
Identifiers are used for class names, method names, and variable names. An identifier may be any descriptive sequence of uppercase and lowercase letters, digits, or the underscore and dollar-sign characters. It must not begin with a digit, lest it be confused with a numeric literal. Again, Java is case-sensitive, so VALUE is a different identifier than Value. Some examples of valid identifiers are:
AvgTemp count a4 $test this_is_ok
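Invalid identifiers include the following. The first begins with a digit, and the other two contain characters (- and /) that are not allowed: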
2count high-temp Not/ok
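The rules above can be checked mechanically. The following is a minimal sketch, not a complete validator: looksLikeIdentifier is a hypothetical helper written for this illustration, built on the standard Character.isJavaIdentifierStart and Character.isJavaIdentifierPart methods. It tests only the character rules, so it does not reject keywords, and it accepts non-ASCII letters that Java also allows.

```java
class IdentifierCheck {
  // Hypothetical helper: applies only the character rules for identifiers.
  // It does not reject keywords, and it accepts any character that the
  // Character class treats as legal, not just ASCII letters and digits.
  static boolean looksLikeIdentifier(String s) {
    if (s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
      return false;
    }
    for (int i = 1; i < s.length(); i++) {
      if (!Character.isJavaIdentifierPart(s.charAt(i))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    String[] samples = {
      "AvgTemp", "count", "a4", "$test", "this_is_ok",  // valid
      "2count", "high-temp", "Not/ok"                   // invalid
    };
    for (String s : samples) {
      System.out.println(s + " -> " + looksLikeIdentifier(s));
    }

    // Case matters: these are two distinct variables.
    int VALUE = 1;
    int Value = 2;
    System.out.println(VALUE + Value);
  }
}
```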
Literals
A constant value in Java is created by using a literal representation of it. For example, here are some literals:
100 98.6 'X' "This is a test"
Left to right, the first literal specifies an integer, the next is a floating-point value, the third is a character constant, and the last is a string. A literal can be used anywhere a value of its type is allowed.
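To show literals in use, the short sketch below assigns each of the four literals to a variable of a matching type. The class, method, and variable names are invented for this example. One detail worth noting: a floating-point literal such as 98.6 has type double unless it carries an f suffix.

```java
class LiteralDemo {
  // Builds one string from the four literals discussed in the text.
  static String describe() {
    int count = 100;                     // integer literal
    double temp = 98.6;                  // floating-point literal (a double by default)
    char letter = 'X';                   // character literal, in single quotes
    String message = "This is a test";   // string literal, in double quotes
    return count + " " + temp + " " + letter + " " + message;
  }

  public static void main(String[] args) {
    System.out.println(describe());
  }
}
```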
Comments
As mentioned, there are three types of comments defined by Java. You have already seen two: single-line and multiline. The third type is called a documentation comment. This type of comment is used to produce an HTML file that documents your program. The documentation comment begins with a /** and ends with a */.
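The three comment styles can appear side by side, as in the sketch below. The class and method are hypothetical. The documentation comments are the ones a documentation tool such as javadoc would pick up; the other two styles are ignored by it.

```java
/**
 * Documentation comment for the class. A documentation tool can
 * turn this text into an HTML page.
 */
class CommentDemo {
  /** Documentation comment for a method: returns the sum of a and b. */
  static int add(int a, int b) {
    // Single-line comment: runs to the end of the line.

    /* Multiline comment:
       everything between the opening and closing
       markers is ignored by the compiler. */
    return a + b;
  }

  public static void main(String[] args) {
    System.out.println(add(2, 3));
  }
}
```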
Separators
In Java, there are a few characters that are used as separators. The most commonly used separator in Java is the semicolon. As you have seen, it is used to terminate statements. The separators are shown in the following table:
Symbol - ( )
Name - Parentheses
Purpose - Used to contain lists of parameters in method definition and invocation. Also used for defining precedence in expressions, containing expressions in control statements, and surrounding cast types.
Symbol - { }
Name - Braces
Purpose - Used to contain the values of automatically initialized arrays. Also used to define a block of code, for classes, methods, and local scopes.
Symbol - [ ]
Name - Brackets
Purpose - Used to declare array types. Also used when dereferencing array values.
Symbol - ;
Name - Semicolon
Purpose - Terminates statements.
Symbol - ,
Name - Comma
Purpose - Separates consecutive identifiers in a variable declaration. Also used to chain statements together inside a for statement.
Symbol - .
Name - Period
Purpose - Used to separate package names from subpackages and classes. Also used to separate a variable or method from a reference variable.
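The short sketch below uses every separator from the table at least once. SeparatorDemo and its methods are made up for this illustration; the comments point out which separator plays which role.

```java
class SeparatorDemo {
  // Adds array elements pairwise from both ends toward the middle.
  static int sumEnds() {
    int[] values = { 1, 2, 3, 4 };  // brackets declare the array type; braces hold the initial values
    int total = 0, i, j;            // commas separate identifiers in one declaration

    // Parentheses contain the control expressions; commas chain
    // statements inside the for; the period reaches the length field.
    for (i = 0, j = values.length - 1; i < j; i++, j--) {
      total += values[i] + values[j];  // brackets select array elements
    }
    return total;                   // the semicolon terminates the statement
  }

  // Parentheses also surround a cast type.
  static double average() {
    return (double) sumEnds() / 4;
  }

  public static void main(String[] args) {
    // Periods separate the package (java.lang) from the class (System),
    // and the members (out, println) from their references.
    java.lang.System.out.println(sumEnds() + " " + average());
  }
}
```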
The Java Keywords
There are 49 reserved keywords currently defined in the Java language. These keywords, combined with the syntax of the operators and separators, form the definition of the Java language. These keywords cannot be used as names for a variable, class, or method.
Java Reserved Keywords
abstract, continue, goto, package, synchronized, assert, default, if, private, this, boolean, do, implements, protected, throw, break, double, import, public, throws, byte, else, instanceof, return, transient, case, extends, int, short, try, catch, final, interface, static, void, char, finally, long, strictfp, volatile, class, float, native, super, while, const, for, new, switch
The keywords const and goto are reserved but not used. In the early days of Java, several other keywords were reserved for possible future use. However, the current specification for Java only defines the keywords listed above.
In addition to the keywords, Java reserves the following: true, false, and null. These are values defined by Java. You may not use these words for the names of variables, classes, and so on.
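One way to test whether a word is reserved is to ask the standard library. The sketch below assumes the javax.lang.model.SourceVersion class that ships with the JDK; its isKeyword method reports true for keywords and also for true, false, and null. KeywordCheck and isReserved are hypothetical names. Because isKeyword answers for the latest language version the JDK supports, it may recognize keywords that are not in the list above.

```java
import javax.lang.model.SourceVersion;

class KeywordCheck {
  // Thin wrapper around the JDK's own check. The result reflects the
  // newest source version known to the running JDK, which is an
  // assumption to keep in mind when comparing against the list above.
  static boolean isReserved(String word) {
    return SourceVersion.isKeyword(word);
  }

  public static void main(String[] args) {
    String[] words = { "goto", "const", "true", "null", "Value" };
    for (String w : words) {
      System.out.println(w + " -> " + isReserved(w));
    }
  }
}
```

Note that goto and const are reported as reserved even though the language gives them no meaning.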
The Simple Types
Java defines eight simple (or elemental) types of data: byte, short, int, long, char, float, double, and boolean. These can be put in four groups:
■ Integers: This group includes byte, short, int, and long, which are for whole-valued signed numbers.
■ Floating-point numbers: This group includes float and double, which represent numbers with fractional precision.
■ Characters: This group includes char, which represents symbols in a character set, like letters and numbers.
■ Boolean: This group includes boolean, which is a special type for representing true/false values.
You can use these types as-is, or to construct arrays or your own class types. Thus, they form the basis for all other types of data that you can create.
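For orientation, the sketch below declares one variable of each of the eight simple types, grouped as in the list above, plus an array built from one of them. All names and values are arbitrary choices made for this example.

```java
class SimpleTypesDemo {
  static String sample() {
    // Integers
    byte b = 10;
    short s = 1000;
    int i = 100000;
    long l = 10000000000L;   // the L suffix marks a long literal

    // Floating-point numbers
    float f = 1.5f;          // the f suffix marks a float literal
    double d = 98.6;

    // Characters
    char c = 'X';

    // Boolean
    boolean flag = true;

    return b + " " + s + " " + i + " " + l + " " + f + " " + d + " " + c + " " + flag;
  }

  // A simple type used as-is to construct an array.
  static int[] scores() {
    return new int[] { 90, 85 };
  }

  public static void main(String[] args) {
    System.out.println(sample());
    System.out.println(scores().length);
  }
}
```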
The simple types represent single values—not complex objects. Although Java is otherwise completely object-oriented, the simple types are not. They are analogous to the simple types found in most other non–object-oriented languages. The reason for this is efficiency. Making the simple types into objects would have degraded performance too much.
The simple types are defined to have an explicit range and mathematical behavior. Languages such as C and C++ allow the size of an integer to vary based upon the dictates of the execution environment. However, Java is different. Because of Java’s portability requirement, all data types have a strictly defined range. For example, an int is always 32 bits, regardless of the particular platform. This allows programs to be written that are guaranteed to run without porting on any machine architecture. While strictly specifying the size of an integer may cause a small loss of performance in some environments, it is necessary in order to achieve portability.
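The fixed sizes can be read from constants in the standard wrapper classes, as the following sketch shows. TypeSizes and its methods are hypothetical names. The second method illustrates one piece of the defined mathematical behavior: int arithmetic wraps around on overflow on every platform.

```java
class TypeSizes {
  // Widths in bits of byte, short, int, and long, taken from the
  // SIZE constants of the standard wrapper classes.
  static int[] integerWidths() {
    return new int[] { Byte.SIZE, Short.SIZE, Integer.SIZE, Long.SIZE };
  }

  // Adding 1 to the largest int wraps around to the smallest int.
  static boolean wrapsAround() {
    int max = Integer.MAX_VALUE;
    return max + 1 == Integer.MIN_VALUE;
  }

  public static void main(String[] args) {
    for (int width : integerWidths()) {
      System.out.println(width + " bits");
    }
    System.out.println("int range: " + Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);
    System.out.println("overflow wraps: " + wrapsAround());
  }
}
```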
Let’s look at each type of data in turn.