Suppose you want to generate a graph with V
vertices and E
edges, such that each graph (i.e. each configuration of edges) is equally likely to be generated. For sparse graphs, you can just pick two vertices, and connect them if they’re not connected already. If they are connected, pick another two vertices. (Repeat E
times.) For dense graphs one solution is to generate all edges, randomly permute them, and then pick the first E
. The time complexity of both of these algorithms is O(V+E)
) (for sparse and dense graphs respectively).
However, if you want to handle both sparse and dense graphs in a single function you end up having to mash the two algorithms together with an if statement checking if the graph is sparse or dense. Here is a cleaner way:
import random | |
import math | |
def random_pick(n, m): | |
"""Pick m integers from a bag of the integers in [0, n) without replacement""" | |
d = {i : i for i in range(m)} # For now, just pick the first m integers | |
res = [] | |
for i in range(m): # Pick the i-th number | |
j = random.randrange(i, n) | |
# Pick whatever is in the j-th slot. If there is nothing, then pick j. | |
if j not in d: | |
d[j] = j | |
d[i], d[j] = d[j], d[i] # Swap the contents of the i-th and j-th slot | |
res.append(d[i]) | |
return res | |
def gen_random_graph(V, E): | |
"""Generate an undirected graph in the form of an adjacency list with no duplicate edges or self loops""" | |
g = [[] for _ in range(V)] | |
edges = random_pick(math.comb(V, 2), E) # Pick E integers that represent the edges | |
for e in edges: # Decode the edges into their vertices | |
u = int((1 + math.sqrt(1 + 8 * e)) / 2) | |
v = e - math.comb(u, 2) | |
g[u].append(v) | |
g[v].append(u) | |
return g | |
# The complete graph on 4 vertices | |
print(gen_random_graph(4, 6)) # [[3, 2, 1], [3, 2, 0], [1, 0, 3], [0, 1, 2]] |
This algorithm is similar to the algorithm for dense graphs, but instead of generating all edges and permuting up front, you do it one edge at a time. The key is an “implicit” list of edges, stored as a dictionary of integers -> edges (we encode edges as integers). We can think of the list in two parts: indices [0,i)
where we have finalized what edges we picked, and indices [i,n)
which store the remaining edges that we can pick from. Thus at each step to pick an edge, we pick a random index from [i,n)
and move that edge into index i (while moving what was stored at index i into the selected index).
The edge -> integer encoding/decoding scheme is as follows:
How do we decode an integer into the vertices it represents? First, let’s figure out what the first vertex is (e.g. a
is the first vertex in the edge (a,b)
). In a graph with V
vertices and E
edges, given an edge represented as an integer e
(sorry), if the first vertex was u
then
Solving for the roots and throwing out the solution where u≤0 we get u≥12⋅(1+1+8e). Since we want the largest u that still satisfies this condition, we round it down to the nearest integer.
There are 1+2+…+(u−1)=(u^2)
edges with their first vertex being <u
. After that, the edges look like (u,0),(u,1),…,(u,u−1),(u+1,0),…)
, so it is sufficient to subtract (u2) from e to obtain the second vertex v.
Generating the base adjacency list is O(V)
, generating the encoded edges is O(E)
, and decoding each edge is O(1)
so the total time complexity is O(V+E)
.