Chapter 5. Graph Traversal
A graph G = (V, E)
consists of a set of vertices V
together with a set E
of vertex pairs or edges. Graphs are important because they can be used to represent essentially any relationship.
5.1 Flavors of Graphs
The first step in any graph problem is determining the flavors of graphs you are dealing with:
- Undirected vs. Directed — A graph G = (V, E) is undirected if edge (x, y) ∈ E implies that (y, x) is also in E. If not, we say that the graph is directed.
- Weighted vs. Unweighted — Each edge (or vertex) in a weighted graph G is as- signed a numerical value, or weight.
For unweighted graphs, the shortest path must have the fewest number of edges, and can be found using a breadth-first search. Shortest paths in weighted graphs requires more sophisticated algorithms.
- Simple vs. Non-simple — Certain types of edges complicate the task of working with graphs. A self-loop is an edge (x, x) involving only one vertex. An edge (x,y) is a multiedge if it occurs more than once in the graph. Any graph that avoids them is called simple.
- Sparse vs. Dense. There is no official boundary between what is called sparse and what is called dense, but typically dense graphs have a quadratic number of edges, while sparse graphs are linear in size.
Sparse graphs are sparsely connected (eg: Trees). Usually the number of edges is in O(n) where n is the number of vertices. Therefore adjacency lists are preferred since they require constant space for every edge.
Dense graphs are densely connected. Here number of edges is usually O(n²). Therefore adjacency matrix is preferred.
- Cyclic vs. Acyclic — An acyclic graph does not contain any cycles.
- Embedded vs. Topological — A graph is embedded if the vertices and edges are assigned geometric positions. “visit each node exactly once” defines the topology which is called “complete graph”.
- Implicit vs. Explicit — Certain graphs are not explicitly constructed and then traversed, but built as we use them.
- Labeled vs. Unlabeled — Each vertex is assigned a unique name or identifier in a labeled graph to distinguish it from all other vertices. In unlabeled graphs, no such distinctions have been made.
5.2 Data Structures for Graphs
Basic choices are adjacency matrices (M (n*n)) and adjacency lists.
Degree (represented with d in above table) of a vertex is the number of edges connecting with the vertex.
A typical implementation of graph using an array of linked lists:
Array of Pointers: https://www.tutorialspoint.com/cplusplus/cpp_array_of_pointers.htm.
E.g
int *ptr[MAX];
Each element of the array is a pointer.This declares ptr as an array of MAX integer pointers. Thus, each element in ptr, now holds a pointer to an int value. In the example above, each element in edges (edges[0], edges[1]…) holds a pointer to a linked list.
Pointer to array: https://www.geeksforgeeks.org/pointer-array-array-pointer/.
E.g:
int (*ptr)[5];
Here ptr is pointer that can point to an array of 5 integers.
int arr[5] = { 1, 2, 3, 4, 5 }; int *p = arr;
points to the 0th element of the array.
int(*ptr)[5]; ptr = &arr;
Points to the whole array arr.The pointer that points to the 0th element of array and the pointer that points to the whole array are totally different.
ptr++
indicates the jump of 5 x sizeof(int) = 20 bytes.p++
indicates the position of next 1 x sizeof(int) = 4 bytes.
5.3, 5.4 omitted.
5.5 Traversing a Graph
The key idea behind graph traversal is to mark each vertex when we first visit it and keep track of what we have not yet completely explored.
Each vertex will exist in one of three states:
- undiscovered — the vertex is in its initial, virgin state.
- discovered — the vertex has been found, but we have not yet checked out all its incident edges.
- processed — the vertex after we have visited all its incident edges.
Each undirected edge will be considered exactly twice, once when each of its endpoints is explored. Directed edges will be considered only once, when exploring the source vertex.
5.6 Breadth-First Search
Idea: Traverse nodes in layers. Problem: Since we have cycles, each node will be visited infinite times. Solution: use a boolean visited array. Below is an implementation using adjacency list.
void Graph::BFS(int s)
{
// Mark all the vertices as not visited
bool *visited = new bool[V];
for(int i = 0; i < V; i++)
visited[i] = false;
// Create a queue for BFS
list<int> queue;
// Mark the current node as visited and enqueue it
visited[s] = true;
queue.push_back(s);
// 'i' will be used to get all adjacent
// vertices of a vertex
list<int>::iterator i;
while(!queue.empty())
{
// Dequeue a vertex from queue and print it
s = queue.front();
cout << s << " ";
queue.pop_front();
// Get all adjacent vertices of the dequeued
// vertex s. If a adjacent has not been visited,
// then mark it visited and enqueue it
for (i = adj[s].begin(); i != adj[s].end(); ++i)
{
if (!visited[*i])
{
visited[*i] = true;
queue.push_back(*i);
}
}
}
}
Ref: https://www.geeksforgeeks.org/breadth-first-search-or-bfs-for-a-graph/