If we are going to implement a graph as a data structure, we need some way to represent it in memory. There are a couple of options that we can use, but we must remember that we ultimately want whatever methods we employ (e.g., adding an edge, checking to see if there is an edge between two given vertices, iterating over vertices adjacent to a given vertex, etc.) to be both space and time efficient for the types of graphs that we are likely to encounter.
One key consideration for us will be that real-world graphs tend to be sparse. That is to say, even if they have a huge number of vertices, they tend to have a small average vertex degree. When the average vertex degree is large, we say the graph is dense. The below graphic gives examples of both types of graphs, along with their respective total number of edges. Again, in the real world, you are far more likely to see a graph similar to the one on the left than you are to see one like that shown on the right.
One option for representing a given graph would be to keep a list of edges in that graph. Each edge consists of two vertices, so we would need only create a linked list or array that stored pairs of vertices. The below shows how one graph (consisting of three connected components) could be represented in this way, with one edge highlighted for clarity:
A second option would be to represent a graph using an adjacency matrix. Essentially, one would maintain a two-dimensional $V \times V$ array boolean array
adj, where for each edge connecting vertices $v$ and $w$,
adj[v][w] == adj[w][v] == true and all other entries in the array would be false. This representation could be very naturally extended to represent digraphs, where
adj[v][w] == true if and only if $v$ and $w$ were connected by an edge directed from $v$ to $w$. The below shows an example of such an adjacency matrix.
As a final option to consider, we could use an adjacency list to represent a graph. In this case, one would maintain a vertex-indexed array of lists -- with each list consisting of all vertices adjacent to a particular vertex. The Bag class provides a convenient way to store these lists (A bag is essentially a stack without a
pop() method, or equivalently a queue without a
dequeue method. The below provides an example:
If one determines the cost of each of the above representations above (as shown in the table below), it becomes clear that for real-world graphs that tend to have a large number of vertices but are "sparse", having a small average vertex degree, the advantages of the adjacency list representation is clear. In practice, this will be the preferred way to represent such graphs.
|representation||space||add edge||edge between v and w?||iterate over vertices adjacent to v|
|list of edges||$E$||$1$||$E$||$E$|