10. Vectors¶
To quote the online C++ reference, a “vector is a sequence container that encapsulates dynamic size arrays.” Like the Strings from a few chapters back, vectors provide us an opportunity to introduce object-oriented programming concepts by using a built-in C++ object.
In order to use vectors, you need to include the header file at the top of your program:
#include <vector>
Vector objects provide a template for creating
and manipulating sequences of a desired type. To create a vector you need to
specify the type of the elements by enclosing the type in angle brackets
(<
and >
):
vector<int> counts;
vector<double> doubles;
The first line creates a vector of integers named count
; the second
creates a vector of doubles named doubles
. Although these statements are
legal, they are not very useful because they create vectors that have no
elments (their size is zero). It is more common to specify the size of the
vector in parentheses:
vector<int> counts(256);
The syntax here is a little odd. It looks like a combination of a variable
declaration and a function call. In fact, that’s exactly what it is. The
function we are invoking is a vector
constructor. A constructor
is a special function that creates new objects and initializes their instance
variables. In this case, the constructor takes a single argument, which is the
size of the new vector.
The following figure shows how vectors are represented in state diagrams:
The large circles inside the boxes are the elements of the vector. Each element is accessed using the address of its box, beginning with 0. When you allocate a new vector, the elements are not initialized. They could contain any values.
There is another constructor for vectors
that takes two arguments. The
second is a fill value, the value that will be assigned to each of the
elements.
vector<int> count(256, 0);
This statement creates a vector of 256 elements and initializes all of them to zero.
10.1. Accessing elements¶
Elements in a vector are accessed with the same sytax used for arrays - square brackets enclosing the index of the element.
Here is a version of the square numbers program we saw in the previous chapter using vectors instead of arrays:
#include <iostream>
#include <vector>
using namespace std;
int main()
{
vector<int> square_numbers(10);
// Initialize square_numbers vector
for (int i = 0; i < square_numbers.size(); i++)
square_numbers[i] = i * i;
// Print the square numbers
for (int i = 1; i < square_numbers.size(); i++)
cout << square_numbers[i] << ' ';
cout << endl;
return 0;
}
The size member function, not surprisingly, it returns the size of the vector (the number of elements).
It is a good idea to use this value as the upper bound of a loop, rather than a constant. That way, if the size of the vector changes, you won’t have to go through the program changing all the loops; they will work correctly for any size vector.
Since you can use the resize member function changes the number of elements stored in the vector, it is common for a vector’s size to change.
The close correspondance between pointers and arrays we discussed last chapter
still holds for vectors. The vector member function data()
gives the
starting address of the array data encapsulated in the vector object. We
could add the following to the program listed above and it too would print out
all the square numbers in the vector:
int* ip = square_numbers.data();
for (int i = 0; i < square_numbers.size(); i++)
cout << *ip++ << ' ';
cout << endl;
10.2. Iterators for vectors¶
Vector objects are part of C++’s Standard Template Library (STL). We will continue to learm more about the STL throughout the rest of this book. For now we want to introduce the concepts that vectors are containers, and that all STL containers have iterators defined for them that can be used to iterate over the elements which they contain.
Here is another way to print our square numbers using an iterator:
for (auto item : square_numbers) {
cout << item << ' ';
}
cout << endl;
The auto
keyword used as the type for item
is a
type inference. It tells the
compiler to infer the type of item
from the type of square_numbers
.
The variable item
will assume each of the values contained in the
vector square_numbers
in the sequence in which they occur.
This pattern is useful if we want to access the elements of the vector, but not if we want to change them. For that, we can use an iterator that points to each element.
#include <iostream>
#include <vector>
using namespace std;
int main()
{
vector<int> numbers(10);
// Initialize numbers vector with even numbers
for (int i = 0; i < numbers.size(); i++)
numbers[i] = 2 * i;
// Print the even numbers with value iterator
for (auto number: numbers) {
cout << number << ' ';
}
cout << endl;
// Change the even numbers to odd numbers with pointer iterator
for (auto itr = numbers.begin(); itr < numbers.end(); itr++) {
*itr += 1;
}
// Now print the odd numbers with value iterator
for (auto number: numbers) {
cout << number << ' ';
}
cout << endl;
return 0;
}
The output of this program is:
0 2 4 6 8 10 12 14 16 18
1 3 5 7 9 11 13 15 17 19
10.3. Copying vectors¶
There is one more constructor for vectors, which is called a copy constructor because it takes one vector as an argument and creates a new vector that is the same size, with the same elements.
vector<int> copy(count);
Although this syntax is legal, it is almost never used for vectors because there is a better alternative:
vector<int> copy = count;
The =
operator works on vectors pretty much the way you would expect.
10.4. Other member functions¶
The following program uses a few other vector member functions we should get to know to sort a list of integers using an insertion sort.
#include <iostream>
#include <vector>
using namespace std;
void print_numbers(const vector<int> v) {
if (!v.empty()) {
auto itr = v.begin();
cout << *itr;
for (itr++; itr < v.end(); itr++)
cout << ' ' << *itr;
}
cout << endl;
}
int main() {
vector<int> numbers = {13, 21, 4, 16, 7, 22, 5, 19, 2, 11};
int pos;
cout << "Unsorted numbers: ";
print_numbers(numbers);
for (int num_sorted = 0; num_sorted < numbers.size(); num_sorted++) {
int num = numbers.back();
numbers.pop_back();
for (pos = 0; num > numbers[pos] && pos <= num_sorted; pos++);
numbers.insert(numbers.begin() + pos, num);
cout << "After iteration " << num_sorted + 1 << ": ";
print_numbers(numbers);
}
cout << "Sorted numbers: ";
print_numbers(numbers);
return 0;
}
push_back adds a
new element to the end of a vector, while pop_back removes the last
element. back returns
the last element of the vector, and front returns the first
element. Since pop_back
does not return the value of the element removed,
it is important to call back
to save it before removing when it is needed,
as is done here.
insert(pos, value)
inserts value
at the position given by pos
. Notice the inner for loop
used immediately before the insert
function is called. It has no loop
body. The is not an uncommon patter in C++ for loops. The purpose of the loop
is to find the position where the current element should be inserted. This
is accomplished with the initializer, condition, and incrementor
components of the loop, so no body is needed.
empty
is a boolean function that returns true when the vector has no
elements and false otherwise. It was used above in the print_numbers
function. Here is another use of it in a loop which reverses the order
of elements in a vector by popping them off one vector and pushing them
onto another:
vector<int> reversed;
while (!numbers.empty()) {
reversed.push_back(numbers.back());
numbers.pop_back();
}
cout << "Reversed numbers: ";
print_numbers(reversed);
Add this above the return
statement in the previous program and run it.
There are several other vector member functions that we won’t discuss here, but you are encouraged to explore them on your own.
10.5. Statistics¶
The numbers generated by rand
are supposed to be distributed uniformly.
That means that each value in the range should be equally likely. If we count
the number of times each value appears, it should be roughly the same for all
values, provided that we generate a large number of values. The larger the
number of values, the more true this should be.
In the next few sections, we will write a program that generates a sequence of
random numbers and check whether this property holds true. To generate random
numbers, we’ll need to include cstdlib
and ctime
. We are going to
want to print out a nice looking table, so we will need a new include,
iomanip
, so our program should begin like this:
#include <iostream>
#include <vector>
#include <ctime>
#include <cstdlib>
#include <iomanip>
using namespace std;
10.6. Vector of random numbers¶
The first step is to generate a large number of random values and store them in a vector. The following function takes a 3 arguments, the number of random values we want – the size of the vector – and the minimum and maximum values of the random numbers we want to fill it with.
vector<int> random_vector(int n, int min, int max)
{
vector<int> vec(n);
srand(time(NULL));
for (int i = 0; i < vec.size(); i++) {
vec[i] = min + (rand() % (max + 1));
}
return vec;
}
The return type is vector<int>
, which means that this function returns a
vector of integers. To test this function, it is convenient to have a function
that outputs the contents of a vector.
void print_vector(const vector<int>& vec)
{
for (int i = 0; i < vec.size() - 1; i++) {
cout << vec[i] << " ";
}
cout << vec[vec.size() - 1] << endl;
}
Notice that it is legal to pass vector
s by reference. In fact it is quite
common, since it makes it unnecessary to copy the vector. Since print_vector
does not modify the vector, we declare the parameter const
.
One more function to count the number of each of our random values in the resulting vector.
Now our main
function uses these other functions to generate a large
vector of random numbers and print out the results:
int num_values = 20;
int lower_bound = 1;
int upper_bound = 10;
vector<int> v = random_vector(num_values, lower_bound, upper_bound);
print_vector(v);
On one run on my machine I got this as output:
5 1 4 10 3 11 7 11 3 11 4 4 1 7 10 6 10 6 6 2
which is pretty random-looking, though there are three 4’s, 6’s and 11’s, and no 8’s or 9’s.
If these numbers are really random, we expect each digit to appear the same number of times - twice each.
Do these results mean the values are not really uniform? It’s hard to tell. With so few values, the chances are slim that we would get exactly what we expect. But as the number of values increases, the outcome should be more predictable.
To test this theory, we’ll add a function that counts the number of times each value appears, and then see what happens when we increase the number of values.
10.7. Counting¶
A good approach to problems like this is to think of simple functions that are easy to write, and that might turn out to be useful. Then you can combine them into a solution. This approach is sometimes called bottom-up design. Of course it is not easy to know ahead of time which functions are likely to be useful, but as you gain experience you will have a better idea.
Also, it is not always obvious what sort of things are easy to write, but a good approach is to look for subproblems that fit a pattern you have seen before.
Back in the Traversal section of the Strings chapter we looked a loop that traversed a string and counted the number of times a given letter appeared. You can think of this program as an example of a pattern called traverse and count. The elements of this pattern are:
A set or container that be traversed, like a string or a vector.
A test that you can apply to each element in the container.
A counter that keeps track of how many elements pass the test.
In this case, we have a function in mind called how_many
that counts the
number of elements in a vector that equal a given value. The parameters are the
vector and the integer value we are looking for. The return value is the number
of times the value appears.
int how_many(const vector<int>& vec, int value)
{
int count = 0;
for (int i = 0; i < vec.size(); i++) {
if (vec[i] == value) count++;
}
return count;
}
how_many
only counts the occurances of a particular value, and we are
interested in seeing how many times each value appears. We can solve this
problem with a loop that calls how_many
with each value in our
choosen range and then prints out the result. Here’s a main function that
does just that:
int main()
{
int num_values = 100000;
int lower_bound = 1;
int upper_bound = 10;
vector<int> v = random_vector(num_values, lower_bound, upper_bound);
cout << "value\thow_many" << endl;
for (int i = lower_bound; i <= upper_bound; i++)
cout << left << setw(6) << i << how_many(v, i) << endl;
return 0;
}
This code uses the loop variable as an argument to how_many
, in order to
check each value between 1 and 10, in order. Two new I/O manipulators are
used to align our output,
left, defined
in iostream
, and setw <https://cplusplus.com/reference/iomanip/setw/>,
which requires iomanip
and takes an integer argument to set the width of
the output.
The result is:
value how_many
1 9137
2 9213
3 9059
4 8997
5 9038
6 9171
7 9090
8 9130
9 9014
10 9066
In each case, the number of appearances is within about 1% of the expected value (10,000), so we conclude that the random numbers are probably uniform.
10.8. A histogram¶
It is often useful to take the data from the previous tables and store them for
later access, rather than just print them. What we need is a way to store 10
integers. We could create 10 integer variables with names like how_manyOnes
,
how_manyTwos
, etc. But that would require a lot of typing, and it would be a
real pain later if we decided to change the range of values.
A better solution is to use a vector with size 10. That way we can create all ten storage locations at once and we can access them using indices, rather than ten different names. Here’s how:
int num_values = 100000;
int upper_bound = 10;
vector<int> vector = random_vector(num_values, upper_bound);
vector<int> histogram(upper_bound);
for (int i = 0; i < upper_bound; i++) {
int count = how_many(vector, i);
histogram[i] = count;
}
We called the vector histogram
because that’s a statistical term for a
vector of numbers that counts the number of appearances of a range of values.
The trick thing here is that we are using the loop variable in two different
ways. First, it is an argument to how_many
, specifying which value we are
interested in. Second, it is an index into histogram
, specifying which
location the result should be stored in.
Although this code works, it is not as efficient as it could be. Every time it
calls how_many
, it traverses the entire vector. In this example we have to
traverse the vector ten times!
It would be better to make a single pass through the vector. For each value in the vector we could find the corresponding count and increment it. In other words, we could use the value from the vector as an index into the histogram. Here’s what that looks like:
vector<int> histogram(upper_bound, 0);
for (int i = 0; i < num_values; i++) {
int index = vector[i];
histogram[index]++;
}
The first line intializes the elements of the histogram to zeros. That way,
when we use the increment operator (++
) inside the loop, we know we are
starting from zero. Forgetting to initialize counters is a common error.
10.9. Glossary¶
- bottom-up design¶
A method of program development that starts by writing small, useful functions and then assembling them into larger solutions.
- constructor¶
A special function that creates a new object and initializes its instance variables.
- deterministic¶
A program that does the same thing every time it is run.
- element¶
One of the values in a vector. The
[]
operator selects elements of a vector.- histogram¶
A vector of integers where each integer counts the number of values that fall into a certain range.
- pseudorandom¶
A sequence of numbers that appear to be random, but which are actually the result of a deterministic computation.
- seed¶
A value used to initialize a pseudorandom number sequence. Using the same seed should yield the same sequence of values.
- vector¶
A named collection of values, where all the values have the same type, and each value is identified by an index.