Network Design: the Diameter-Degree Problem
The "Diameter-Degree Problem" is a graph theory problem that comes up in the design of computer networks, particularly peer-to-peer software networks or "virtual networks" (such as VPNs) and parallel processing architectures using node-to-node links for inter-CPU data exchange (Beowulf clusters are a prominent example).
Edge-lists for all the graphs discussed here are on this page.
Graphs discussed on this page
d=1 | d=2 | d=3 | d=4 | d=5 | |
k=1 | n=2: K_{2} | ||||
k=2 | n=3: K_{3} = C_{3} | n=5: C_{5} | n=7: C_{7} | n=9: C_{9} | n=11: C_{11} |
k=3 | n=4: K_{4} = T | n=6: C_{6(1,3)}
n=8: C_{8(1,4)} n=10: P | n=10: C_{10(1,5)}
n=12: C_{12(1,6)} n=14: PP_{7(1;3)} n=14: Heawood n=20: C_{5}×F_{4} | n=14: C_{14(1,7)}
n=16: C_{16(1,8)} n=22: PP_{11(1;3)} n=24: McGee n=30: Tutte-Coxeter (3,8)-cage n=38: von Conta; Alegre et al. | n=70: (3,10)-cage (?) |
k=4 | n=5: K_{5} | n=6: O
n=8: K_{4,4} n=9: G_{3} n=11: C_{11(1,3)} n=15: "K_{3} * C_{5}" | n=12: K_{3,3}×K_{2}
n=20: P×K_{2} n=20: "Moebius D" n=21: PP_{7(1;2;3)C3} n=26: (4,6)-cage n=28: PP_{7(1;2;3;2)C4} n=41: Allwright | n=16: Q4 = C×K_{2} (tesseract)
n=25: C_{5}×C_{5} n=28: PP_{7(1;3)}×K_{2} n=28: Heawood ×K_{2} n=36: PP_{9(1;2;3;4)C4} n=45: PP_{9(1;2;3;4;2)C5} n=55: PP_{11(1;2;4;3;5)C5} n=78: PP_{13(1;5;4;2;3;6)C6} n=98: Exoo-98 | |
k=5 | n=6: K_{6} | n=18: PP_{9(1,3;2,4)}
n=24: K_{3} * X_{8} | n=30: Robertson-Wegner
n=36: PP_{9(1;2;3;4)K4} n=72: Exoo_72 | n=56: PP_{7(1;2;3;2)C4} ×K_{2}
n=212: Exoo-212 |
The problem is stated by peer-to-peer software designers in two different ways. The first is:
If each node connects to no more than k other nodes, how many nodes can there be in a network such that no two nodes are separated by more than d steps?
and another statement of the problem is:
If each node connects to no more than k other nodes, and there are n nodes in the network, what is the maximum number of steps d between any two nodes, assuming the network has been wired so as to reduce this number as much as possible?
In networks, the number of steps is sometimes called "TTL" (time-to-live) or is involved in estimates of "propagation time" or "latency".
The same problem as stated by supercomputer or Beowulf designers sometimes takes the forms just given, but can also appear as follows:
If there are n nodes, and the maximum distance between any two nodes can be no more than d, how can you wire the network so as to minimize the number of connections k of the node with the greatest number of connections, and what is the resulting value of k?
All three are pretty much equivalent, and are stated in graph theory in roughly the following form:
What is the largest k-regular graph of diameter d?
Which means:
What is the maximum number of nodes in a graph such that all nodes have exactly k edges and no two nodes are more than d steps apart?
Terminology
We're using k for degree because it seems to be the most common letter used for that purpose in graph theory. At any rate it can't be d because we use that letter for "diameter". Degree is sometimes called "valency", but v makes me think of "vertices" and that won't do. I'm using n for number of vertices.
There are a lot of standard graph names that use uppercase letters, like T, K_{3,3}, and C_{5}×C_{5}. I'll explain each the first time I discuss it. I also define some of my own graph names, including P for the Petersen graph, PP with various subscripts for the generalized Petersen graphs, and C with a more complex subscript for the circulant graphs. Hopefully my names should make sense, and if anyone can supply better names I'll gladly update this page.
Basics
Since this is a finite-diameter problem, we are only considering connected graphs.
If either k or the number of nodes n is even, it is possible for all the nodes to have degree k. This is called a "regular graph of degree-k" or a "k-regular graph". If k and n are both odd, it's possible for all but one node to be of degree k, and that node can have degree k-1. It's also possible for a solution to have multiple nodes with less than degree k, but I have been paying the most attention to the k-regular graphs. It's harder to maximize n or to balance the network traffic on a graph that isn't k-regular.
When giving any answer to the problem one usually specifies a graph or describes how to construct a graph that demonstrates the answer. I list more than one graph for each combination of k and d, because they are all useful for practical network applications. If you're putting together a Beowulf cluster, it's good to have a large number of choices for the number of nodes in your network, and each should be symmetrical and balanced.
The Trivial Cases
k=1, Any d
If k=1, you can't get more than 2 nodes in a connected graph, so the answer is n=2 no matter what you specify for d. Two points connected by a single line is called K_{2}.
Any k, d=1
It's easy to see that for any value of k and d=1, the answer is n=k+1, and the graph is the complete graph K_{n}. K_{n} is what you get by arranging n points in a circle and connecting each point to each of the others. For example, for k=3 you get the graph K_{4}, which is a square with both diagonals drawn in. K_{4} is also topologically equivalent to the 4 vertices and 6 edges of a tetrahedron.
k=2, Any d
For k=2 and any value of d the best answer is a cyclic graph C_{2d+1}. This is a regular polygon of 2d+1 sides.
Small Nontrivial Cases
The real fun begins when d>1 and k>2. Since both d and k make the problem harder, it makes sense to start with the cases that have a smaller total d+k and work your way up. It is also important to look carefully at the solutions with lower d and/or k and how they can be used as part of a solution for a higher combination of d and k.
For any value of k and d there is an absolute maximum to n given by a tree with one node at the center and all the other nodes no more than d links away. We'll call this a maximum tree for diameter D and order V. For example, for k=4 and d=1, 2, 3, 4,... the maximum tree has 5, 17, 53, 161,... nodes.
k=3, d=2
Several good symmetrical graphs have valency 3 and diameter 2. For n=6, you can connect 6 nodes in a ring (a regular hexagon) and add three more links that go straight across the middle of the hexagon. The graph is called C_{6(1,3)}.
The same design also works with 8 nodes in a regular octagon, giving an n=8 solution. The graph is called C_{8(1,4)}. Along with C_{6(1,3)} it is an example of a circulant graph. A circulant graph is made from combinations of regular polygons and/or stars (like the pentagram and the star of David) drawn on top of each other so they share the same set of vertices.
For k=3, d=2 the theoretical maximum is 10 nodes. As it turns out, this maximum is attainable. It's actually a pretty good puzzle to try to figure out if you approach it right (hint: The solution must include the maximum tree, so start by drawing the maximum tree. Now you have 10 nodes, and only 6 more lines to add. Look for a symmetrical solution, and don't read the next paragraph till you're done.)
The solution for k=3, d=2 is called the Petersen graph P. As I have already implied, it can be drawn with 3-way rotational symmetry. However, it is normally drawn with 5-way rotational symmetry. To draw P this way, start with a 5-pointed star, then draw a pentagon around the star such that the pentagon does not touch the star, then connect each point of the star to the nearest point in the pentagon with a straight line. You now have 10 vertices, each with degree 3, and you can get from any of them to any of the others in at most 2 steps.
k=3, d=3
For k=3, d=3 the theoretical maximum is 22 nodes. A simple example is the cube graph C, with 8 nodes.
We already know we can get at least 10, because of the existing solution for k=2, d=2. But it's easy to find better solutions. Almost any graph with 10 nodes will do better than P if you use all three links per node.
Two slightly better solutions continue the idea we used for d=2 with the circulant graphs C_{10(1,5)} and C_{12(1,6)}. They have 10 and 12 nodes respectively. You create these networks by drawing a regular decagon (10-sided) or dodecagon (12-sided) polygon and then adding 5 (or 6) more lines connecting each vertex to the one directly across from it. The four graphs C_{6(1,3)}, C_{8(1,4)}, C_{10(1,5)} and C_{12(1,6)} are also called "Moebius ladders" because they are like a "ladder graph" with only one edge.
A better solution for k=3, d=3 has n=14 nodes. I call it PP_{7(1;3)}. It is drawn similarly to the Petersen graph but with a 7-sided polygon and the "modulo-3" type of 7-pointed star. The modulo-3 star is the one you get by connecting each vertex to the ones 3 to the right and left. Actually, if you use the other type of 7-pointed star you get PP_{7(1;2)} which also satisfies the requirements for k=3, d=3 and n=14.
There is another solution with n=14 that is better known in graph theory. It is called the Heawood graph, and is also called a (3,6)-cage. A (3,6)-cage is a 3-regular graph with girth 6 that has fewer nodes than any other 3-regular graph with girth 6. As mentioned before, "3-regular" means that all nodes have degree 3, and "girth" is the length of the shortest cycle (closed path) in the graph.
To make the Heawood graph, draw a circle and place 14 points equally-spaced around the circle, and number the points. Connect all the even-numbered points to a point 5 spaces further clockwise (which will be an odd-numbered point). The result is a graph which allows you to get from any point to any other in three steps or less.
One principle that shows up often in this problem, but is difficult to prove, has to do with small loops in the graph. If there is a way to travel on a path along the lines of the graph and end up where you started, the path is called a cycle. The length of the shortest cycle in a graph is called the graph's girth. Notice that there are cycles of length 4 in the C_{12(1,6)} solution, and with a little more effort you can see that there are no cycles in PP_{7(1;3)} shorter than 5. The Heawood graph has no cycles shorter than 6. It is typical that as you get better (more nodes) solutions of this problem for any given k and d, the number of small cycles decreases and the girth increases.
k=4, d=2
For k=4, d=2 there are several small symmetrical solutions. The first has n=6, and is the octahedron graph O, the vertices and edges of an octahedron. The only length-2 minimal paths are those between two opposite corners of the octahedron.
An n=8 solution is the complete bipartite graph K_{4,4}. To get this graph, put 4 points on a line and another 4 on a parallel line. Each point in line A gets connected to all the points in line B. This graph is topologically equivalent to the graph you get by taking the cube graph C and adding a link between each vertex and the vertex furthest from it.
For n=9 we have G_{3} = K_{3}×K_{3}, a 3x3 torus. G_{3} is what you get if you arrange 9 points in a 3x3 square, and then connect each point to each of the other two in its row, and then connect each point to each of the other two in its column.
Of course, the solution for k=3, d=2 can also be used for k=4, d=2, with n=10. It's doesn't perform as well as the k=4 graphs because the extra links in a k=4 graph would allow the load to be spread out over more links.
A solution with n=11 is the circulant graph C_{11(1,3)}. To make C_{11(1,3)}, draw a regular undecagon (11-sided polygon), then add 11 more lines connecting each point to the points 3 spaces away in either direction.
The best solution, with n=15, is a graph sometimes called "K_{3} * C_{5}" (but note it is not one of the standard graph products of K_{3} and C_{5}). To make it, start with 15 nodes in a circle (C_{15}) and number the nodes in order from 1 to 15. Then add links as follows:
- To nodes 1, 6, 11 add a link to the node 4 steps to the right.
- To nodes 2, 7, 12 add a link to the node 5 steps to the right.
- To nodes 3, 8, 13 add a link to the node 7 steps to the right.
- To nodes 4, 9, 14 add links to the nodes 4 and 7 steps to the right.
The optimal solution with n=15 falls a little short of the theoretical limit, which is 17=1+4+12.
k=3, d=4
As with the smaller values of d we get two simple, small solutions with the circulant graphs C_{14(1,7)} and C_{16(1,8)}. A significant load-balancing issue starts to show up as these networks get larger: the "diameter" links get used a lot less than the "perimeter" links. For example, if every node on a C_{16(1,8)} network sent one message to every other node, using shortest routes with ties resolved so as to balance the load, the "diameter" links would each carry 14 messages and the "perimeter" links would each carry 32. The ratio 32/14 is 2.28; the corresponding ratio for the smaller network C_{8(1,4)} is only 8/6=1.33.
For k=3, d=4, we can get n=22 with PP_{11(1;3)}. This is like PP_{7(1;3)} but with an undecagon (11-sided polygon) and an 11-pointed star C_{11(3)} inside.
We can improve this to n=24 with the McGee graph, which is a (3,7)-cage (the term cage is defined above).
To make the McGee graph, draw a circle and put 24 points on the circle at equal distances. Number the points from 1 to 24. For each point, add a line according to the following rules, where n is the number of the point:
For points 1, 4, 7, etc (n+2 is a multiple of 3), connect to point n+12 (subtracting 24 if n+12>24).
For points 2, 5, 8, etc (n+1 is a multiple of 3), connect to point n+7 (subtracting 24 if n+8>24).
For points 3, 6, 9, etc (n is a multiple of 3), connect to point n+17 (subtracting 24 if n+16>24).
The result is the McGee graph. You can get from any point to any other in at most 4 steps.
The Tutte-Coxeter graph, a (3,8)-cage, improves this to n=30. The rules are similar: Draw a circle and put 30 points on the circle at equal distances. Number the points from 1 to 30. For each point, add a line according to the following rules, where n is the number of the point:
For points 1, 7, 13, etc (n+2 is a multiple of 6), connect to point n+17 (subtracting 30 if n+17>30).
For points 2, 8, 14, etc (n+1 is a multiple of 6), connect to point n+21 (subtracting 30 if n+21>30).
For points 3, 9, 15, etc (n is a multiple of 6), connect to point n+7 (subtracting 30 if n+7>30).
I don't know if there is a graph with higher n for k=3, d=4. The theoretical limit is 46=1+3+6+12+24.
k=4, d=3
Whenever there is a solution for a given k and d, you can get a solution for k+1 and d+1 with twice as many nodes. You take two copies of the solution for the smaller k and d, and connect each node in one copy to the corresponding node in the other copy. Using this technique you can make a 12-node graph with k=4, d=3. Use two copies of K_{3,3} with each vertex in the first K_{3,3} connected to the corresponding vertex in the other. K_{3,3} is defined like K_{4,4} and is also called the "Thomsen graph" or "utility graph".
Using the same technique you can get n=20 for k=4, d=3, using two copies of the Petersen graph P. Connect each vertex in the first P to the corresponding vertex in the other P.
There is another n=20 solution with an interesting shape. I call it the "Moebius D". Take a dodecahedron (whose graph is called D) and connect each vertex to its antipodes (the vertex furthest from it). The dodecahedron graph normally has diameter 5 but with the antipodal lines added its diameter goes down to 3.
The graph I call PP_{7(1;2;3)C3} has n=21. This graph is made from the cyclic graphs C_{7} and two circulant graphs C_{7(2)} and C_{7(3)}. Each of these has 7 vertices; label them A through G then connect all sets of 3 same-labeled vertices by adding three more lines in a triangle. This connects the three C_{7x} graphs together in such a way that any point is accessible from any other in no more than 3 steps.
An even better solution is the (4,6)-cage, a graph with 26 nodes (n=26). To make it, draw a circle, place 26 points equally-spaced around the circle, and number the points. From all the even-numbered points, add a line to the point 5 spaces further clockwise, and another line to the point 9 spaces counter-clockwise (both of which will be odd-numbered points). The result is a graph with 26 nodes that allows you to get from any point to any other in 3 steps or less.
You can get to n=28 with four 7-node cyclic graphs linked by 7 copies of C_{4}. The result is PP_{7(1;2;3;2)C4}.
I don't know if there is a graph with higher n for k=4, d=3. The theoretical limit is 53=1+4+12+36.
k=5, d=2
The 18-node graph PP_{9(1,3;2,4)} has diameter 2 and valency 5. It is made from the two circulant graphs C_{9(1,3)} and C_{9(2,4)} with corresponding nodes linked to each other.
The theoretical limit for k=5, d=2 is 26=1+5+20.
k=3, d=5
The dodecahedron graph D has 20 vertices.
The theoretical limit for k=3, d=5 is 94=1+3+6+12+24+48.
k=4, d=4
A popular network for k=4, d=4 is the tesseract graph, which has n=16. It is the product of two simpler graphs, C×K_{2}. To make it, take two copies of the cube graph C and connect each vertex in the first cube to the corresponding vertex in the other.
Another popular network for k=4, d=4 is a 5x5 torus grid, with n=25. It's called C_{5}×C_{5}, the square product (also called "Cartesian product") of C_{5} with itself. C_{5} is the 5-node cyclic graph, which is just the vertices and edges of a pentagon. To construct C_{5}×C_{5}, arrange 25 points in a square, then connect each point to its four neighbors (up, down, left, right), then connect each of the 5 points on the right edge of the square to the corresponding point on the left edge, then do the same thing with the top and bottom edges. You can get from any point in this square to any other in at most 2 horizontal steps and 2 vertical steps.
The doubling trick mentioned earlier gives a solution with n=28, by connecting two copies of PP_{7(1;3)}. I call the resulting graph PP_{7(1;3)}×K_{2}. The same trick also works with two copies of the Heawood graph, producing a different n=28 solution.
You can get n=36 with a pseudo-Petersen graph made from four 9-cycles. The graphs C_{9}, C_{9(2)}, C_{9(3)} and C_{9(4)} are connected via 9 copies of C_{4}. My name for this graph is PP_{9(1;2;3;4)C4}. The rules for getting from one node to another are a little tricky. If your start and destination are both within the same C_{9x}, you can take get there in four steps by moving entirely within the C_{9x}, unless you're in C_{9(3)}. C_{9(3)} is a little different because it is not connected — it is three disjoint triangles. In the C_{9(3)} case, you sometimes have to use a C_{4} link to move to the adjacent C_{9(2)} or C_{9(4)}, take one step, than go back to the C_{9(3)}, possibly taking one more step there. If your start and destination are in different C_{9x}'s, you take A steps along one of the C_{4}'s, then one step on a C_{9x}, then B steps along another C_{4} to the destination. A and B can be zero and their sum never has to be more than 3.
A similar network, PP_{9(1;2;3;4;2)C5}, raises the number of nodes to 45. As its name suggests, it connects five C_{9x} graphs with 9 copies of C_{5}. The rules for getting from one node to another are even more complicated. There are more situations in which you have to travel along two different C_{9x}'s.
This method continues to yield results with C_{11x} graphs. The graph PP_{11(1;2;4;3;5)C5} has n=55 nodes. Its travel algorithm is actually a bit simpler than PP_{9(1;2;3;4;2)C5} because none of the C_{11x}'s is disconnected.
Another similar network raises n to 78: PP_{13(1;5;4;2;3;6)C6} contains six different 13-node circulant graphs linked by 13 copies of C_{6}. Its travel algorithm is sufficiently complex that I had to use a computer program to check all the cases.
I don't know if there is a graph with higher n for k=4, d=4. The theoretical limit is 161=1+4+12+36+108.
k=5, d=3
The Robertson-Wegner (5,5)-cage has 30 vertices. Start with a regular dodecahedron (20 vertices). There are 10 ways to choose 4 vertices on the dodecahedron such that the 4 vertices lie on the corners of an inscribed tetrahedron; each vertex of the dodecahedron lies in two such 4-point sets. Furthermore, the sets occur in "cube-pairs"; the two members of a cube-pair include the 8 vertices of an inscribed cube. Once you have identified the 10 subsets and which pair they belong to, add 10 new vertices to the graph corresponding to these 10 sets, and connect each of these 10 new vertices to the 4 dodecahedral vertices in its set (this adds 2 links to each of the 20 original points). Then finally add 5 more links connecting each of the new vertices to its cube-pair counterpart. The resulting graph has 30 points each at the junction of 5 edges. The really interesting part about it is that this graph is actually just as symmetrical as the original dodecahedron, that is, there's nothing particularly special about the 10 new points. If the graph was being displayed by a computer in 3-d, you could move the vertices around and rearrange them so that some of the new points appear to be part of the dodecahedron and some of the original dodecahedral vertices appear to be part of the "new 10", and the 75 edges would all be in the same places as before. Graph theorists have a more cryptic way of expressing this, which they refer to as the properties of the graph's "symmetry group".
For an n=36 solution, there is the pseudo-Petersen graph PP_{9(1;2;3;4)K4}. It has four 9-node cyclic and circulant graphs linked by 9 copies of K_{4}.
There is clearly a much better solution for k=5 and d=3; the theoretical limit is 106=1+5+20+80.
k=5, d=4
I haven't looked at the k=5, d=4 case yet. "Doubling" the best case for k=4, d=3 gives a 5-regular solution PP_{7(1;2;3;2)C4}×K_{2} with n=56.
The k=4, d=4 networks show that we can certainly do better than n=78. The theoretical limit for k=5 and d=4 is 426=1+5+20+80+320.
Sources and Related Pages
UPC Graph Theory Department's Degree-Diameter Table
http://maite71.upc.es/grup_de_grafs/table_g.html
Graph theory glossaries and introductions:
Chris Caldwell's glossary at UTM
and his tutorials
The first section of this page at Wisconsin
Oriented specifically at the computer application
Direct Interconnection Networks from CS838 at Wisconsin
s.11