2. Variables, types and expressions

2.1. More output

As we mentioned in the last chapter, you can put as many statements as you want in main. For example, to output more than one line:

#include <iostream>
using namespace std;

// output more than one line
int main()
{
    cout << "Hello world." << endl;     // output one line
    cout << "How are you?" << endl;     // output another 
    return 0;
}

As you can see, it is legal to put comments at the end of a line, as well as on a line by themselves.

The phrases that appear in quotation marks are called strings, because they are made up of a sequence (string) of letters. Each individual element of a string is a character. Strings can contain any combination of letters, numbers, punctuation marks, and other special characters.

Often it is useful to display the output from multiple output statements all on one line. You can do this by leaving out the first endl:

#include <iostream>
using namespace std;

int main()
{
    cout << "The Answer to the Ultimate Question of Life, the Universe,";
    cout << " and Everything is " << 42 << '.' << endl;
    return 0;
}

In this case the output appears on a single line as “The Answer to the Ultimate Question of Life, the Universe, and Everything is 42.” In this example the output stream, cout, is sent three types of literals: two strings, an integer, and a character (the '.'). They are all converted into a sequence of characters in a single line of output.

Note

There is actually another type of thing sent to the output stream here, the endl, but we’re not going to confuse things by talking about it yet.

Notice that there is a space before the word “and” and after the word “is”. These spaces appear inside the quotations, so they are characters in the string. They are part of the output, so they affect the behavior of the program.

Spaces that appear outside of quotation marks generally do not affect the behavior of the program. For example, we could write:

#include <iostream>
using namespace std;

int main()
{
cout<<"The Answer to the Ultimate Question of Life, the Universe,";
cout<<" and Everything is "<<42<<'.'<< endl;
return 0;
}

This program will compile and run just as well as the original.

The breaks at the end of the lines (newlines) do not affect the program’s behavior either (except for the #include line), so we could even write:

#include <iostream>
using namespace std;int main(){cout<<"The Answer to the Ultimate Question of Life, the Universe,";cout<<" and Everything is "<<42<<'.'<< endl;return 0;}

This one works too, although you have probably noticed that the program is getting harder and harder to read.

Only the newline at the end of the #include line and the single spaces required to separate things like int and main are included in this version, which contains only two lines (you’ll have to horizontally scroll to read the second line if you’re viewing this with a web browser).

Newlines and spaces are useful for organizing your program visually, making it easier to read the program and locate syntax errors.

2.2. Values

A value is one of the fundamental things - like a letter or a number - that a program manipulates. The only values we have manipulated so far are the string values we have been outputting, like “Hello world.”, and the number value 42. You (and the compiler) can identify string values because they are enclosed in quotation marks.

An integer is a whole number like 1 or 17. As we have seen, we can output integer values the same way we do strings, with statements like:

cout << 42 << endl;

A character value is a letter or digit or punctuation mark or whitespace enclosed in single quotes, like 'a', '5', or ' '. Character values can be output the same way as strings or integers:

cout << '}' << endl;

This example outputs a single close curly brace on a line by itself.

It is easy to confuse different types of values like "5", '5', and 5, but if you pay attention to the punctuation, it should be clear that the first is a string, the second a character, and the third an integer. The reason this distinction is important will become clear soon.

All values belong to one of several classes of values called a data type. A data type specifies the set of values that belong to it together with the operations that can be performed on them.

Higher level data types are an abstraction that makes values much easier for us to work with. It is important to understand, however, that on a more fundamental level all values are represented by a sequence of bits.

As mentioned in the previous chapter, C++ is a middle level language, which means it exposes some of the lower level details about how the computer works “under the hood” that higher level languages hide. The char data type is a case in point. While char is usually used to represent characters, it can just as easily be used for small integers.

#include <iostream>
using namespace std;

int main()
{
    char n = 33;
    n *= 2;
    cout << n << " is n printed as a character." << endl;
    cout << "But look at this: " << n + 1;   // separate statements, so n + 1
    cout << " and " << ++n;                  // is evaluated before ++n changes n
    cout << ". What's all that about?" << endl;
    cout << "Oh, the 1st one is evaluated as an int, and the 2nd a char.";
    cout << endl;
    return 0;
}

Compile and run this program and see what you get. Take time to think about what you are seeing. When char values are sent to the cout stream, they are rendered as characters using ASCII <https://en.wikipedia.org/wiki/ASCII> values. When a literal 1 is added to a char value, the result is an integer value, which cout renders accordingly.

The sizeof operator lets us display the number of 8-bit elements, called bytes, that a given value occupies.

The following program:

#include <iostream>
using namespace std;

int main()
{
    cout << "char: " << sizeof(char) << endl;
    cout << "a: " << sizeof 'a' << endl;
    cout << "int: " << sizeof(int) << endl;
    cout << "42: " << sizeof 42 << endl;
    cout << "42000968: " << sizeof 42000968 << endl;
    cout << "Hello!: " << sizeof "Hello!" << endl;

    return 0;
}

will produce the following output on a typical system where an int occupies 4 bytes:

char: 1
a: 1
int: 4
42: 4
42000968: 4
Hello!: 7

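Notice that sizeof can be used with either types or values, and that when used with types the type must be enclosed in parentheses. Notice also that the string "Hello!" occupies 7 bytes rather than 6, because a string literal is stored with a hidden terminating null character at its end.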

C++ has a set of fundamental types that you will be asked to investigate in the exercises.

2.3. Variables

One of the most powerful features of a programming language is the ability to manipulate variables. A variable is a named location that stores a value.

Just as there are different types of values (integer, character, etc.), there are different types of variables. When you create a new variable, you have to declare what type it is. For example, the character type in C++ is called char. The following statement creates a new variable named fred that has type char:

char fred;

This kind of statement is called a declaration.

The type of a variable determines what kind of values it can store. A char variable can contain characters, and it should come as no surprise that int variables can store integers.

There are several types in C++ that can store string values, but we are going to skip that for now (see Strings chapter).

To create an integer variable, the syntax is:

int alice;

where alice is the arbitrary name you made up for the variable. In general, you will want to make up variable names that indicate what you plan to do with the variable. For example, if you saw these variable declarations:

char first_letter;
char last_letter;
int hour, minute;

you could probably make a good guess at what values will be stored in them. This example also demonstrates the syntax for declaring multiple variables with the same type: hour and minute are both integers (int type).

Variable names in C++ have to follow the following rules:

  • They can contain any combination of letters (a - z, A - Z), digits (0 - 9), and underscores (_), but they can not begin with a digit.

  • They are case sensitive, so var, Var, and VAR are three different variable names.

  • They can not be a C++ keyword.

2.4. Expressions and operators

Operators are special symbols that are used to represent simple computations like addition and subtraction. Most operators are binary operators, which means they perform an operation on two operands. An expression is a combination of operators and operands. A single value results when an expression is evaluated.

Most of the operators in C++ do exactly what you would expect them to do, because they are common mathematical symbols. For example, the operator for adding two integers is +.

The following are all legal C++ expressions whose meaning is more or less obvious:

1+1    hour-1    hour*60+minute    minute/60

Expressions can contain both variable names and integer values. In each case the name of the variable is replaced with its value before the computation is performed.

Addition, subtraction and multiplication all do what you expect, but division gives us a surprise. When we run the following program:

#include <iostream>
using namespace std;

int main()
{
    int hour, minute;
    hour = 11;
    minute = 59;
    cout << "Number of minutes since midnight: ";
    cout << hour * 60 + minute << endl;
    cout << "Fraction of the hour that has passed: ";
    cout << minute / 60 << endl;
    return 0;
}

we get the following output:

Number of minutes since midnight: 719
Fraction of the hour that has passed: 0

The first line is what we expected, but the second line is not. The value of the variable minute is 59, and 59 divided by 60 is 0.98333, not 0. The reason for the discrepancy is that C++ is performing integer division.

When both of the operands are integers (operands are the things operators operate on), the result must also be an integer, and integer division truncates the resulting value at the decimal point. Running this program:

#include <iostream>
using namespace std;

int main()
{
    int i1, i2, i3;
    i1 = 5;
    i2 = 2;
    i3 = -5;
    cout << "With integer division 5 divided by 2 is " << i1 / i2 << ", ";
    cout << "and -5 divided by 2 is " << i3 / i2 << '.' << endl;
    return 0;
}

gives us:

With integer division 5 divided by 2 is 2, and -5 divided by 2 is -2.

Truncating here means cutting off the value at the decimal point. It is different from rounding down, since -5/2 rounded down would be -3.

A possible alternative in this case is to calculate the percentage rather than a fraction:

cout << "Percentage of the hour that has passed: " << minute*100/60 << endl;

which results in:

Percentage of the hour that has passed: 98

Again the result is truncated, but at least now the result is approximately correct. In order to get an even more accurate answer, we could use a different type of variable, called floating-point, that is capable of storing fractional values. We’ll get to that in the next chapter.

One very common operator in C++ that you might not immediately recognize is the modulo operator, which evaluates to the remainder of an integer division. The following statement:

cout << 5 % 3 << ' ' << 7 % 4 << ' ' << 3 % 5 << endl;

will print out 2 3 3. If you are not already familiar with modulo expressions, be sure to practice with them until you feel comfortable.

2.5. Increment and decrement operators

Incrementing and decrementing are such common operations that C++ provides special operators for them. The ++ operator adds one to the current value of an int, char, or double, and -- subtracts one. Neither operator works on strings, and neither should be used on bools.

Technically, it is legal to increment a variable and use it in an expression at the same time. For example, take a look at this:

int n = 1;
cout << n++ << ' ';
cout << ++n << endl;

This will print:

1 3

because the n++ expression increments n after it is evaluated, while ++n increments it before it is evaluated.

It is a common error to write something like

index = index++;    // WRONG!

Unfortunately, this is syntactically legal, and the compiler may not warn you. Since C++17 the effect of this statement is to leave the value of index unchanged; in earlier versions of the language its behavior is undefined. Either way, this is often a difficult bug to track down.

Remember, you can write index = index + 1;, or you can write index++;, but you shouldn’t mix them.

2.6. Compound Assignment Operators

A common pattern in programming is to modify a variable in some way and then store the result back in the same variable. C++ supports this pattern through compound assignment operators.

int n = 1;
n *= 2;
cout << n << ' ';
n += 14;
cout << n << ' ';
n /= 4;
cout << n << endl;

will print 2 16 4. Additional assignment operators exist for subtraction, modulo and the bitwise operators we will see later.

2.7. Order of Operations

When more than one operator appears in an expression, the order of evaluation depends on the rules of precedence. A complete explanation of precedence can get complicated, but just to get you started:

  • Multiplication and division happen before addition and subtraction. So 2*3-1 yields 5, not 4, and 2/3-1 yields -1, not 1 (remember that with integer division 2/3 is 0).

  • If the operators have the same precedence, they are evaluated from left to right. So in the expression minute*100/60, the multiplication happens first, yielding 5900/60, which after the division yields 98. If the operations had gone from right to left, the result would be 59*1 and then 59, which is wrong.

  • Any time you want to override the rules of precedence (or you are not sure what they are) you can use parentheses. Expressions in parentheses are evaluated first, so 2*(3-1) is 4. You can use parentheses to make an expression easier to read, as in (minute * 100) / 60, even though it doesn’t change the result.

2.8. Assignment

Now that we have created some variables, we would like to store values in them. We do that with an assignment statement.

first_letter = 'a';  // give first_letter the value 'a'
hour = 11;           // assign the value 11 to hour
minute = 59;         // set minute to 59

This example shows three assignments, and the comments show three different ways people talk about assignment statements. The vocabulary can be confusing here, but the idea is straightforward:

  • When you declare a variable, you create a named storage location.

  • When you make an assignment to a variable, you give it a value.

A common way to represent variables on paper is to draw a box with the name of the variable on the outside and the value of the variable on the inside. This kind of figure is called a state diagram because it shows what state each of the variables is in (you can think of it as the variable’s “state of mind”). This diagram shows the effect of the three assignment statements:

State diagram illustration

We sometimes use different shapes to indicate different variable types. These shapes should help remind you that one of the rules of C++ is that a variable has to have the same type as the value you assign it. For example, you cannot store a string in an int variable. The following statement generates a compiler error:

int hour;
hour = "Hello.";   // WRONG!

This rule is sometimes a source of confusion, because there are many ways that you can convert values from one type to another, and C++ sometimes converts things automatically. But for now you should remember that as a general rule variables and values have the same type. We’ll talk about the special cases later.

Another source of confusion is that some strings look like integers, but they are not. For example, the string "123", which is made up of the characters '1', '2', and '3' is not the same as the number 123. This assignment is illegal:

minute = "59";   // WRONG!

2.9. Outputting variables

You can output the value of a variable using the same commands we used to output simple values.

int hour, minute;
char colon;

hour = 11;
minute = 59;
colon = ':';

cout << "The current time is ";
cout << hour;
cout << colon;
cout << minute;
cout << endl;

This program creates two integer variables named hour and minute, and a character variable named colon. It assigns appropriate values to each of the variables and then uses a series of output statements to generate the following:

The current time is 11:59

When we talk about “outputting a variable”, we mean outputting the value of the variable. To output the name of a variable, you have to put it in quotes. For example: cout << "hour";

As we have seen, you can include more than one value in a single output statement, which can make the previous program more concise:

int hour, minute;
char colon;

hour = 11;
minute = 59;
colon = ':';

cout << "The current time is " << hour << colon << minute << endl;

On one line, this program outputs a string, two integers, a character, and the special value endl. Very impressive!

2.10. Keywords

In listing the rules for variable names, the last rule said they can not be C++ keywords. Keywords are certain words that are reserved in C++ because they are used by the compiler to parse the structure of your program, and if you use them as variable names, it will get confused. These include int, char, return, and many more.

The complete list of keywords is included in the C++ Standard, which is the official language definition adopted by the International Organization for Standardization (ISO). The official standard needs to be purchased from the ISO, but you can see the list of keywords in the online API Reference Document.

Rather than memorize the list, we suggest that you take advantage of a feature provided in many development environments: code highlighting. As you type, different parts of your program should appear in different colors. For example, keywords might be red, strings blue, and other code black. If you type a variable name and it turns red, watch out! You might get some strange behavior from the compiler.

2.11. Operators for characters

Interestingly, the same mathematical operations that work on integers also work on characters. For example,

char letter;
letter = 'a' + 1;
cout << letter << endl;

outputs the letter b. Although it is syntactically legal to multiply characters, it is almost never useful to do it.

Earlier we said you can only assign integer values to integer variables and character values to character variables, but that is not completely true. In some cases C++ converts automatically between types. For example, the following is legal C++:

int number;
number = 'a';
cout << number << endl;

The result is 97, which is the number that is used internally by C++ to represent the letter ‘a’. However, it is generally a good idea to treat characters as characters, and integers as integers, and only convert from one to the other if there is a good reason.

Automatic type conversion is an example of a common problem in designing a programming language, which is that there is a conflict between formalism, which is the requirement that formal languages should have simple rules with few exceptions, and convenience, which is the requirement that programming languages be easy to use in practice.

More often than not, convenience wins, which is usually good for expert programmers, who are spared from rigorous but unwieldy formalism, but bad for beginning programmers, who are often baffled by the complexity of the rules and the number of exceptions. In this book we have tried to simplify things by emphasizing the rules and omitting many of the exceptions.

2.12. Composition

So far we have looked at the elements of a programming language - variables, expressions, and statements - in isolation, without talking about how to combine them.

One of the most useful features of programming languages is their ability to take small building blocks and compose them. For example, we know how to multiply integers and we know how to output values; it turns out we can do both at the same time:

cout << 17 * 3;

Actually, we shouldn’t say “at the same time”, since in reality the multiplication has to happen before the output, but the point is that any expression, involving numbers, characters, and variables, can be used inside an output statement. We’ve already seen one example:

cout << hour * 60 + minute << endl;

You can also put arbitrary expressions on the right-hand side of an assignment statement:

int percentage;
percentage = (minute * 100) / 60;

This ability may not seem so impressive now, but we will see other examples where composition makes it possible to express complex computations neatly and concisely.

2.13. Glossary

assignment

A statement that assigns a value to a variable.

composition

The ability to combine simple expressions and statements into compound statements and expressions in order to represent complex computations concisely.

declaration

A statement that creates a new variable and defines its type.

expression

A combination of variables, operators and values that represents a single result value. Expressions also have types, as determined by their operators and operands.

operand

One of the values on which an operator operates.

operator

A special symbol that represents a simple computation like addition or multiplication.

precedence

The order in which operations are evaluated.

statement

A line of code that represents a command or action. So far, the statements we have seen are declarations, assignments, and output statements.

string

A sequence of characters. String literals in C++ are enclosed in double quotes ("").

data type

A set of values together with the operations that can be performed on them. The types we have seen are integers (int in C++), characters (char in C++), and strings.

value

A letter, or number, or other thing that can be stored in a variable.

2.14. Exercises