Rambling on Graphs (blog by Hung Le, https://minorfree.github.io)

Amazing Recent Advances on Shortcut Sets and The Likes (2023-05-14)

<p><a href="http://www.weizmann.ac.il/math/parter/home">Merav Parter</a> gave a talk at the UMass theory seminar last week titled “Reachability Shortcuts: New Bounds and Algorithms”. In the talk, Merav summarized recent developments in shortcut sets and related concepts. It was amazing to see all the results, and I cannot resist the attempt to share my take here in this blog post.</p>
<h1 id="1-shortcut-set-what-is-it">1. Shortcut set: what is it?</h1>
<p>A \(d\)-shortcut set of a <em>directed graph</em> \(G=(V,E)\) is a set of directed edges \(E_H\) such that (i) the graph \(H = (V,E\cup E_H)\) has the same transitive closure as \(G\) and (ii) for every \(u\not=v\in V\), if \(v\) is reachable from \(u\), then there is a directed \(u\)-to-\(v\) path of at most \(d\) edges (a.k.a. hops).</p>
<p><img src="/assets/figs/shortcut-set.svg" alt="" /></p>
<p><em>Figure: Adding two shortcuts (the red dashed edges) reduces the hop distance from \(u\) to \(v\) to 4. For a shortcut set of linear size, DAG is the most difficult instance: one can shortcut any strongly connected graph to diameter \(2\) by \(n-1\) edges.</em></p>
<p>The shortcut set problem was introduced by <a href="https://link.springer.com/chapter/10.1007/3-540-56402-0_48">Thorup</a>. The main goal is to understand the trade-off between the hop diameter \(d\) and the size of the shortcut set \(E_H\). The most well-studied regime is fixing \(\lvert E_H\rvert = \tilde{O}(n)\), and then minimizing \(d\). (Here, \(n\) is the number of vertices.) We will call this special regime <em>the shortcut set problem</em>.</p>
<p>For the shortcut set problem, a folklore sampling gives \(d = O(\sqrt{n})\): sample each vertex with probability \(\tilde{O}(1/\sqrt{n})\) to get a set \(S\), and add all (permissible) directed edges between vertices in \(S\). The key insight for bounding the diameter is: For any shortest path \(\pi_G(u,v)\) from \(u\) to \(v\) with at least \(2\sqrt{n}\) edges, \(S\) will likely hit both the prefix and suffix of length \(\sqrt{n}\) of \(\pi_G(u,v)\). (This algorithm is attributed to <a href="https://dl.acm.org/doi/10.1145/97444.97686">Ullman and Yannakakis</a>.) A long-standing open problem is:</p>
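<p>To make the folklore sampling concrete, here is a minimal Python sketch; the sampling constant, the fixed seed, and the BFS-based reachability check are my own illustrative choices, not from the cited sources.</p>

```python
import math
import random
from collections import deque

def reachable_from(adj, s):
    """BFS: the set of vertices reachable from s in a directed graph."""
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def folklore_shortcuts(n, edges, c=2.0, rng=random):
    """Sample each vertex with probability ~ c*log(n)/sqrt(n); add every
    permissible (reachability-respecting) edge between sampled vertices."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
    p = min(1.0, c * math.log(n + 1) / math.sqrt(n))
    S = [v for v in range(n) if rng.random() < p]
    shortcuts = set()
    for u in S:
        reach = reachable_from(adj, u)
        for v in S:
            if v != u and v in reach:
                shortcuts.add((u, v))
    return shortcuts

# On the directed path 0 -> 1 -> ... -> 99, every added shortcut must go
# "forward", since the transitive closure only contains forward pairs.
n = 100
path = [(i, i + 1) for i in range(n - 1)]
H = folklore_shortcuts(n, path, rng=random.Random(0))
assert all(u < v for u, v in H)
```

<p>Since \(\lvert S\rvert = \tilde{O}(\sqrt{n})\) with high probability, the number of added edges is \(\lvert S\rvert^2 = \tilde{O}(n)\), matching the regime above.</p>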
<hr />
<p><strong>Question 1</strong>: Is diameter bound \(\sqrt{n}\) optimal for shortcut sets of nearly linear size?</p>
<hr />
<p>The answer is no, due to the <a href="https://arxiv.org/pdf/2111.13240.pdf">recent breakthrough</a> by Kogan and Parter: using \(\tilde{O}(n)\) shortcuts, they reduced the diameter down to \(n^{1/3}\). Their result deservedly won the best paper award at SODA 2022. A key conceptual contribution of their work is a shift in perspective: shortcutting by connecting vertices to paths of length roughly \(n^{1/3}\), instead of making only vertex-to-vertex connections as in the folklore sampling. Find this intriguing? Read <a href="https://arxiv.org/pdf/2111.13240.pdf">the paper</a>, watch <a href="https://www.youtube.com/watch?v=qaY14SnLdMM">Merav’s talk</a>, or both. The paper has a bunch of other interesting results.</p>
<p>At this point, you might wonder: how far can one go in reducing the diameter? Thorup <a href="https://link.springer.com/chapter/10.1007/3-540-56402-0_48">conjectured</a> that the diameter could be reduced all the way down to \(\mathrm{poly}(\log(n))\) with a nearly linear number of shortcuts. This conjecture was disproved by <a href="https://dl.acm.org/doi/pdf/10.5555/644108.644216">Hesse</a>, who at the time was a graduate student at UMass Amherst with our own <a href="https://www.mathgenealogy.org/id.php?id=30442">Neil Immerman</a> :). Hesse constructed a directed graph with \(m = \Theta(n^{19/17})\) edges such that one must use \(\Omega(mn^{1/17})\) shortcuts to reduce the diameter below \(\Theta(n^{1/17})\). This lower bound <a href="https://arxiv.org/abs/1802.06271">has been</a> <a href="https://arxiv.org/abs/2110.15809">improved</a> <a href="https://arxiv.org/pdf/2304.02193.pdf">further</a>. Notably, <a href="https://arxiv.org/pdf/2304.02193.pdf">Bodwin and Hoppenworth</a> recently obtained a lower bound of \(\Omega(n^{1/4})\) on the diameter, which is quite close to the upper bound of \(O(n^{1/3})\) by Kogan and Parter. This result leaves another fascinating open problem:</p>
<hr />
<p><strong>Open Problem</strong>: Closing the gap \(O(n^{1/3})\) vs. \(\Omega(n^{1/4})\) on the diameter for shortcut sets of nearly linear size.</p>
<hr />
<h1 id="2-approximate-hopsets">2. (Approximate) hopsets</h1>
<p>A hopset is very similar to a shortcut set, but applies to weighted graphs instead. More formally, a \(d\)-hopset of a <em>directed, weighted graph</em> \(G=(V,E)\) is a set of directed, <em>weighted</em> edges \(E_H\) such that (i) the graph \(H = (V,E\cup E_H)\) has the same distance metric as \(G\) and (ii) for every \(u\not=v\in V\), there is a shortest \(u\)-to-\(v\) path in \(H\) with at most \(d\) edges.</p>
<p>Clearly, the hopset problem is at least as hard as the shortcut set problem: any hopset is a shortcut set with the same diameter bound.</p>
<p>A related notion is <em>approximate hopset</em>, in which we are given an additional parameter \(\epsilon \in (0,1)\), and for every \(u,v\), we would like to have a \((1+\epsilon)\)-approximate \(u\)-to-\(v\) path of at most \(d\) edges. We allow \(d\) to depend on \(\epsilon\)—think of \(\epsilon\) as a small constant.</p>
<p>One can check that the folklore sampling also works for hopsets (and hence for approximate hopsets as well): by adding \(\tilde{O}(n)\) edges, one can reduce the diameter bound to \(O(\sqrt{n})\). The same question arises: Is the \(\sqrt{n}\) bound on the diameter optimal for hopsets and approximate hopsets of nearly linear size?</p>
<p>For approximate hopsets, the same paper by Kogan and Parter provided a negative answer: they constructed a nearly linear-size approximate hopset with a diameter bound of \(n^{2/5}\). In follow-up work, <a href="https://arxiv.org/abs/2207.04507">Bernstein and Wein</a> improved the diameter to \(n^{1/3}\), matching the known bound for shortcut sets; <a href="https://video.cs.utexas.edu/node/428">watch the talk here</a>.</p>
<p>How about the (exact) hopset? One might expect that the diameter bound should also be improved to \(n^{1/3}\). <a href="https://arxiv.org/pdf/2304.02193.pdf">Bodwin and Hoppenworth</a> recently showed that \(\sqrt{n}\) is the best one could do (up to some polylog). I personally find this result very surprising. How do they achieve their lower bound? Read their paper; the exposition of the ideas in the paper is so well written that any attempt to summarize better would be in vain.</p>
<p>Somehow, the exact hopset is strictly harder than the approximate hopset and the shortcut set. Could approximate hopset be strictly harder than shortcut set? Bernstein and Wein provided state-of-the-art bounds for approximate hopsets, matching those of shortcut sets. As far as I understand, their proof does not provide a black-box reduction. Is such a reduction possible? I am curious to know the answer.</p>
Halperin-Zwick Algorithm for Spanners (2023-02-26)

<p>This post describes a simple linear-time algorithm by Halperin-Zwick [6] for constructing a \((2k-1)\)-spanner of unweighted graphs for any given integer \(k\geq 1\). The spanner has \(O(n^{1+1/k})\) edges and hence is sparse. When \(k = \log n\), it has only \(O(n)\) edges. This sparsity makes spanners an important object in many applications.</p>
<p>This post was inspired by my past attempt to track down the details of the Halperin-Zwick algorithm. Halperin and Zwick never published their algorithm, and all papers I am aware of cite their unpublished manuscript [6]. The algorithm by Halperin-Zwick is a simple modification of an earlier algorithm by Peleg and Schäffer [8], which, according to Uri Zwick, is the reason why they did not publish their result. The spanner by Peleg and Schäffer [8] has stretch \(4k-3\) for the same sparsity. The idea of the Halperin-Zwick algorithm was given as Exercise 3 in Chapter 16 of the book by Peleg [7].</p>
<p>First, let’s define spanners. Graphs in this post are connected.</p>
<hr />
<p><strong>\(t\)-Spanner</strong>: Given a graph \(G\), a \(t\)-spanner is a subgraph of \(G\), denoted by \(H\), such that for every two vertices \(u,v\in V(G)\):</p>
\[d_H(u,v)\leq t\cdot d_G(u,v)\]
<hr />
<p>Here \(d_H\) and \(d_G\) denote the graph distances in \(H\) and in \(G\), respectively. Graph \(G\) could be weighted or unweighted; we only consider unweighted graphs in this post. The distance constraint on \(H\) implies that \(H\) is connected and spanning.</p>
<p>Parameter \(t\) is called the <em>stretch</em> of the spanner. We often construct a spanner with an odd stretch: \(t = 2k-1\) for some integer \(k\geq 1\). Why not even stretches? Short answer: there is no gain in terms of the worst-case bounds for even stretches [1].</p>
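<p>The definition can be checked mechanically. Below is a small Python helper (my own illustration, not part of any cited algorithm) that verifies the stretch of a candidate spanner of an unweighted graph; it uses the standard fact that, by the triangle inequality, it suffices to check the stretch on the edges of \(G\).</p>

```python
from collections import deque

def bfs_dist(adj, s, n):
    """Single-source hop distances in an unweighted graph (inf if unreachable)."""
    dist = [float('inf')] * n
    dist[s] = 0
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == float('inf'):
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def is_spanner(n, g_edges, h_edges, t):
    """Check d_H(u,v) <= t * d_G(u,v) for all pairs by checking it on every
    edge of G (the triangle inequality makes the edges sufficient)."""
    def to_adj(edges):
        adj = [[] for _ in range(n)]
        for u, v in edges:
            adj[u].append(v)
            adj[v].append(u)
        return adj
    adj_h = to_adj(h_edges)
    dist_h = [bfs_dist(adj_h, s, n) for s in range(n)]
    return all(dist_h[u][v] <= t for u, v in g_edges)

# A 4-cycle: removing one edge leaves a 3-spanner (a path of length 3),
# but not a 2-spanner.
n = 4
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
assert is_spanner(n, cycle, cycle[:-1], 3)
assert not is_spanner(n, cycle, cycle[:-1], 2)
```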
<hr />
<p><strong>Theorem</strong> (Halperin-Zwick): Let \(G\) be an unweighted graph with \(n\) vertices and \(m\) edges. Let \(k\geq 1\) be any given integer. There is an algorithm that runs in time \(O(m)\) and constructs a \((2k-1)\)-spanner of \(G\) with \(O(n^{1+1/k})\) edges.</p>
<hr />
<p>It is often instructive to think about \(k=2\), i.e., constructing a \(3\)-spanner. And this is where we start.</p>
<h1 id="stretch-3">Stretch 3</h1>
<p>Here we seek a \(3\)-spanner with \(O(n^{3/2})\) edges. There are two steps: clustering and connecting the clusters. Let’s focus on clustering first. The idea is to construct a set of radius-1 clusters (a set of stars) with at least \(\sqrt{n}\) vertices each. This implies that the number of such clusters is \(O(\sqrt{n})\), and hence we can afford to add one edge from each vertex to each cluster. The remaining vertices induce a graph with at most \(O(n^{3/2})\) edges; we can add all of these edges.</p>
<p>The clusters can be constructed greedily; the pseudocode of the algorithm is given below. We use \(N_G(v)\) to denote the set of neighbors of \(v\) in a graph \(G\).</p>
<hr />
<p><span style="font-variant: small-caps">Clustering</span>\((G)\)</p>
<blockquote>
<p>\(1.\) \({\mathcal C} \leftarrow \emptyset, \quad i\leftarrow 1, \quad G_1\leftarrow G\)<br />
\(2.\) while \(G_i \not= \emptyset\)<br />
\(3.\) \(x\leftarrow\) an arbitrary vertex in \(G_i\)<br />
\(4.\) \(C_x\leftarrow \{x\}\)<br />
\(5.\) if \(\lvert N_{G_i}(x)\rvert \geq \sqrt{n}\)<br />
\(6.\) \(C_x\leftarrow C_x\cup N_{G_i}(x)\)<br />
\(7.\) \({\mathcal C} \leftarrow {\mathcal C}\cup \{C_x\}\) <br />
\(8.\) \(G_{i+1}\leftarrow G_i\setminus C_x, \quad i\leftarrow i+1\)<br />
\(9.\) return \({\mathcal C}\)</p>
</blockquote>
<hr />
<p>We call the vertex \(x\) in line 4 the <em>center</em> of the cluster \(C_x\). We use \(E(C_x)\) to denote the edges of \(G\) connecting \(x\) to the other vertices in \(C_x\).</p>
<p>Observe that every cluster \(C\in {\mathcal C}\) has radius at most \(1\), and it contains either at least \(\sqrt{n}\) vertices or exactly one vertex. We call \(C\) a <em>heavy cluster</em> if \(\lvert C \rvert\geq \sqrt{n}\), and a <em>light cluster</em> otherwise.</p>
<hr />
<p><strong>Observation 1</strong>: The number of heavy clusters in \({\mathcal C}\) is at most \(\sqrt{n}\).</p>
<hr />
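<p>The clustering procedure is short enough to transcribe directly. Here is a Python sketch (the adjacency-set representation and the iteration order are my own choices), which also checks Observation 1 on a random graph:</p>

```python
import math
import random

def clustering(n, adj):
    """Greedy radius-1 clustering from the pseudocode above: pick a vertex
    of the residual graph; if it still has >= sqrt(n) residual neighbors,
    form a star cluster with them, otherwise it becomes a singleton."""
    alive = set(range(n))
    clusters = []
    while alive:
        x = next(iter(alive))
        cluster = {x}
        nbrs = {v for v in adj[x] if v in alive}
        if len(nbrs) >= math.sqrt(n):
            cluster |= nbrs
        clusters.append(cluster)
        alive -= cluster
    return clusters

random.seed(1)
n = 25
adj = [set() for _ in range(n)]
for u in range(n):
    for v in range(u + 1, n):
        if random.random() < 0.4:
            adj[u].add(v)
            adj[v].add(u)
clusters = clustering(n, adj)
# The clusters partition V ...
assert sum(len(c) for c in clusters) == n
assert set().union(*clusters) == set(range(n))
# ... and (Observation 1) there are at most sqrt(n) heavy clusters.
heavy = [c for c in clusters if len(c) >= math.sqrt(n)]
assert len(heavy) <= math.sqrt(n)
```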
<p>To get a 3-spanner of \(G\), we simply add an edge from every vertex to each heavy cluster of \({\mathcal C}\), and an edge between every pair of light clusters. (Light clusters are singletons.)</p>
<hr />
<p><span style="font-variant: small-caps">3Spanner</span>\((G)\)</p>
<blockquote>
<p>\(1.\) \({\mathcal C} \leftarrow\)<span style="font-variant: small-caps">Clustering</span>\((G)\)<br />
\(2.\) \(H\leftarrow (V,\emptyset)\)<br />
\(3.\) for each heavy cluster \(C\in {\mathcal C}\)<br />
\(4.\) add \(E(C)\) to \(H\)<br />
\(5.\) for each vertex \(v \in N_G(C)\)<br />
\(6.\) \((v,u)\leftarrow\) an arbitrary edge from \(v\) to \(C\)<br />
\(7.\) add \((v,u)\) to \(H\)<br />
\(8.\) add to \(H\) all edges between light clusters<br />
\(9.\) return \(H\)</p>
</blockquote>
<hr />
<p>In line 5, we use \(N_G(C)\) to denote the set of neighbors of \(C\): the vertices not in \(C\) that have at least one edge to \(C\). The running time is clearly \(O(m)\).</p>
<p><strong>Sparsity analysis.</strong> Note that \(\lvert E(C)\rvert\leq \lvert C \rvert-1\). Thus, the total number of edges added to \(H\) in line 4 over all iterations is at most \(n-1\). Furthermore, the number of edges added to \(H\) in the loop in line 5 is at most \(n\) per heavy cluster, and hence by Observation 1, the total number of edges added in lines 3-7 is \(O(n\sqrt{n}) = O(n^{3/2})\).</p>
<p>To bound the number of edges added in line 8, observe that if we order the light clusters by the time they are added to \({\mathcal C}\) in line 7 of algorithm <span style="font-variant: small-caps">Clustering</span>, then each light cluster is incident to at most \(\sqrt{n}\) light clusters following it in this order: when a vertex becomes a light cluster, it has fewer than \(\sqrt{n}\) neighbors left in the residual graph, and all later clusters are formed within the residual graph. It follows that the total number of edges added in line 8 is \(O(n\sqrt{n}) = O(n^{3/2})\).</p>
<p><strong>Stretch analysis.</strong> We show that \(d_H(u,v)\leq 3 d_G(u,v)\). By the triangle inequality, it suffices to show the inequality for every edge \((u,v)\) of \(G\); that is, we have to show that \(d_H(u,v)\leq 3\). This inequality clearly holds if \((u,v)\in E(H)\); since all edges between light clusters are in \(H\), we only need to consider the case where at least one of \(u\) and \(v\) is in a heavy cluster.</p>
<p><img src="/assets/figs/clusters.svg" alt="" /></p>
<p><em>Figure 1: (a) stretch-3 path for edge \((u,v)\) and (b) stretch-\((2k-1)\) path for edge \((u,v)\)</em></p>
<p>If \(u\) and \(v\) are in the same heavy cluster \(C\), then \(d_H(u,v)\leq 2\) and the stretch guarantee holds. Otherwise, let \(C_x\) be the heavy cluster centered at \(x\) containing \(v\), say. As \((u,v)\not\in H\), there must be another vertex \(w\in C_x\) such that \((u,w)\in H\) by the construction in line 6. Thus, the path \(u\rightarrow w\rightarrow x\rightarrow v\) is a path of length 3 in \(H\) between \(u\) and \(v\), as desired. See Figure 1(a).</p>
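<p>Putting the two steps together, here is an end-to-end Python sketch of the 3-spanner construction (written for clarity, not for the \(O(m)\) running time of the theorem), with the stretch verified by brute force on a small random graph:</p>

```python
import math
import random
from collections import deque

def three_spanner(n, adj):
    """Sketch of the 3-spanner construction above: greedy clustering, one
    edge from each outside neighbor into each heavy cluster, and all
    edges between light clusters."""
    alive = set(range(n))
    H = set()
    light = set()
    def keep(u, v):
        H.add((min(u, v), max(u, v)))
    while alive:
        x = next(iter(alive))
        nbrs = {v for v in adj[x] if v in alive}
        if len(nbrs) >= math.sqrt(n):
            cluster = {x} | nbrs
            for v in nbrs:                  # E(C): the star edges
                keep(x, v)
            for w in range(n):              # one edge from each outside
                if w not in cluster:        # neighbor into the cluster
                    for v in cluster:
                        if v in adj[w]:
                            keep(w, v)
                            break
            alive -= cluster
        else:
            light.add(x)
            alive.remove(x)
    for u in light:                         # all light-light edges
        for v in adj[u]:
            if v in light:
                keep(u, v)
    return H

def all_dists(n, edge_set):
    """All-pairs hop distances via one BFS per source."""
    adj2 = [[] for _ in range(n)]
    for u, v in edge_set:
        adj2[u].append(v)
        adj2[v].append(u)
    table = []
    for s in range(n):
        d = [float('inf')] * n
        d[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj2[u]:
                if d[v] == float('inf'):
                    d[v] = d[u] + 1
                    q.append(v)
        table.append(d)
    return table

random.seed(3)
n = 20
adj = [set() for _ in range(n)]
edges = set()
for u in range(n):
    for v in range(u + 1, n):
        if random.random() < 0.3:
            adj[u].add(v); adj[v].add(u); edges.add((u, v))
H = three_spanner(n, adj)
dg, dh = all_dists(n, edges), all_dists(n, H)
assert H <= edges                           # H is a subgraph of G
assert all(dh[u][v] <= 3 * dg[u][v]
           for u in range(n) for v in range(n) if dg[u][v] < float('inf'))
```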
<h1 id="larger-stretches">Larger Stretches</h1>
<p>The algorithm for constructing a \((2k-1)\)-spanner with \(O(n^{1+1/k})\) edges is somewhat similar to the stretch-3 case, but we will need a finer analysis. A key observation, which we also used in the 3-spanner construction, is that if we have a cluster, say \(C\), of bounded radius, and a vertex \(v\in N_G(C)\), it suffices to keep only one edge from \(v\) to \(C\). Thus, as long as \(N_G(C)\) has fewer than \(n^{1/k}\lvert C \rvert\) vertices, we can add an edge from \(v\) to \(C\) for each \(v\in N_G(C)\); the <em>average number of edges added per vertex</em> of \(C\) is at most \(n^{1/k}\).</p>
<p>What if \(\lvert N_G(C) \rvert \geq n^{1/k}\lvert C \rvert\)? In this case, we simply grow \(C\) by adding all of its neighbors. How many times will it grow? At most \(k-1\) times, as every time \(C\) grows, its size increases by a factor of strictly larger than \(n^{1/k}\), and there are only \(n\) vertices in the graph.</p>
<p>The pseudocode of the algorithm is given below. The set \(A\) holds the edges between \(C\) and its neighbors described above. The rest is essentially the same as the clustering for stretch 3.</p>
<hr />
<p><span style="font-variant: small-caps">Clustering</span>\((G,k)\)</p>
<blockquote>
<p>\(1.\) \({\mathcal C} \leftarrow \emptyset, \quad A\leftarrow \emptyset, \quad i\leftarrow 1, \quad G_1\leftarrow G\)<br />
\(2.\) while \(G_i \not= \emptyset\)<br />
\(3.\) \(x\leftarrow\) an arbitrary vertex in \(G_i\)<br />
\(4.\) \(C_x\leftarrow \{x\}\)<br />
\(5.\) while \(\lvert N_{G_i}(C_x)\rvert \geq n^{1/k} \lvert C_{x} \rvert\)<br />
\(6.\) \(C_x\leftarrow C_x\cup N_{G_i}(C_x)\)<br />
\(7.\) for each \(v \in N_{G_i}(C_x)\)<br />
\(8.\) \((v,u)\leftarrow\) an arbitrary edge from \(v\) to \(C_x\)<br />
\(9.\) add \((v,u)\) to \(A\)<br />
\(10.\) \({\mathcal C} \leftarrow {\mathcal C}\cup \{C_x\}\) <br />
\(11.\) \(G_{i+1}\leftarrow G_i\setminus C_x, \quad i\leftarrow i+1\)<br />
\(12.\) return \(({\mathcal C},A)\)</p>
</blockquote>
<hr />
<p>Once we perform clustering, we only need to add the set \(A\) and the edges inside each cluster to the spanner.</p>
<hr />
<p><span style="font-variant: small-caps">Spanner</span>\((G,k)\)</p>
<blockquote>
<p>\(1.\) \(H\leftarrow (V,\emptyset)\)<br />
\(2.\) \(({\mathcal C},A) \leftarrow\)<span style="font-variant: small-caps">Clustering</span>\((G,k)\)<br />
\(3.\) add \(A\) to \(H\)<br />
\(4.\) for each cluster \(C\in {\mathcal C}\)<br />
\(5.\) add \(E(C)\) to \(H\)<br />
\(6.\) return \(H\)</p>
</blockquote>
<hr />
<p><strong>Sparsity analysis.</strong> The number of edges added in the loop in line 4 is at most \(n-1\). Observe that for each cluster \(C_x\) added to \({\mathcal C}\) in line 10 of <span style="font-variant: small-caps">Clustering</span>, the number of edges added to \(A\) in the loop in line 7 is less than \(n^{1/k}\lvert C_{x} \rvert\) by the exit condition of the while loop. Thus, \(\lvert A \rvert\leq n^{1/k}\sum_{C}\lvert C \rvert \leq n^{1+1/k}\). This implies that \(\lvert E(H) \rvert = O(n^{1+1/k})\).</p>
<p><strong>Stretch analysis.</strong> Let \((u,v)\) be any edge of \(G\) such that \((u,v)\not\in H\). We need to show that \(d_H(u,v)\leq 2k-1\). Observe that:</p>
<hr />
<p><strong>Observation 2</strong>: Every cluster \(C_x\in {\mathcal C}\) has radius at most \(k-1\).</p>
<hr />
<p>Proof: Every time the radius of \(C_x\) increases by \(1\), the size of \(C_x\) increases by a factor of strictly larger than \(n^{1/k}\) by the construction. Thus, after \(t\) rounds, \(n\geq \lvert C_{x} \rvert > n^{t/k}\), which gives \(t\leq k-1\).</p>
<hr />
<p>Let \(C_x\) be the cluster containing \(v\), formed in iteration \(i\) of <span style="font-variant: small-caps">Clustering</span>. If \(u\in C_x\), then \(d_H(u,v)\leq 2 (k-1)\) by Observation 2, since the edges \(E(C_x)\) are in \(H\). Otherwise, suppose w.l.o.g. that \(v\) is clustered before \(u\). Observe that \(u\in N_{G_{i}}(C_x)\), and hence an edge \((u,w)\) with \(w\in C_x\) is added to \(A\), which is eventually added to \(H\). See Figure 1(b). Thus, the path consisting of the edge \((u,w)\), a shortest path from \(w\) to \(x\), and a shortest path from \(x\) to \(v\), is a path of length at most \(2(k-1)+1 = 2k-1\) between \(u\) and \(v\) in \(H\), as desired.</p>
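<p>The clustering for general \(k\) can also be transcribed and sanity-checked. The sketch below (my own unoptimized rendering of the pseudocode) verifies Observation 2 and the bound \(\lvert A \rvert \leq n^{1+1/k}\) on a random graph:</p>

```python
import math
import random
from collections import deque

def clustering_k(n, adj, k):
    """Sketch of Clustering(G, k): grow a cluster while its residual
    neighborhood has >= n^(1/k) * |C_x| vertices, then record one edge
    from each remaining residual neighbor into A."""
    growth = n ** (1.0 / k)
    alive = set(range(n))
    clusters, A = [], []
    while alive:
        x = next(iter(alive))
        cluster = {x}
        def residual_nbrs():
            return {v for u in cluster for v in adj[u]
                    if v in alive and v not in cluster}
        while len(residual_nbrs()) >= growth * len(cluster):
            cluster |= residual_nbrs()
        for v in residual_nbrs():
            u = next(u for u in cluster if u in adj[v])
            A.append((v, u))
        clusters.append((x, cluster))
        alive -= cluster
    return clusters, A

def radius_from(x, cluster, adj):
    """Eccentricity of x in the subgraph induced by its cluster."""
    dist = {x: 0}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in cluster and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

random.seed(7)
n, k = 30, 3
adj = [set() for _ in range(n)]
for u in range(n):
    for v in range(u + 1, n):
        if random.random() < 0.3:
            adj[u].add(v)
            adj[v].add(u)
clusters, A = clustering_k(n, adj, k)
# Observation 2: every cluster has radius at most k - 1.
assert all(radius_from(x, c, adj) <= k - 1 for x, c in clusters)
# Sparsity: |A| < n^(1/k) * sum |C| <= n^(1 + 1/k).
assert len(A) <= n ** (1 + 1.0 / k)
```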
<h1 id="optimality-the-girth-conjecture">Optimality: The Girth Conjecture</h1>
<p>It is not hard to construct a class of graphs such that for any graph \(G\) of size \(n\) in the class, the Halperin-Zwick algorithm produces a \((2k-1)\)-spanner for \(G\) that has \(\Omega(n^{1+1/k})\) edges. Could we go below the bound \(\Theta(n^{1+1/k})\) on the number of edges (by a different algorithm, say)? The consensus seems to be no, though currently we do not have a definite answer.</p>
<p>Spanners have a tight connection to the girth of graphs; a graph has girth \(g\) if the shortest simple cycle in the graph has length \(g\).</p>
<hr />
<p><strong>Observation 3</strong>: Let \(H\) be a graph of girth \(2k+1\). Then any \((2k-1)\)-spanner of \(H\) must contain every edge of \(H\).</p>
<hr />
<p>Observation 3 essentially says that any \((2k-1)\)-spanner of \(H\) must be \(H\) itself. (Indeed, if an edge \((u,v)\) were omitted, any surviving \(u\)-to-\(v\) path would close a cycle with \((u,v)\) of length at least \(2k+1\), so the path has at least \(2k > 2k-1\) edges.) Thus, the question of the optimality of spanners reduces to: does every graph of girth \((2k+1)\) have \(o(n^{1+1/k})\) edges? The Erdős’ Girth Conjecture implies that the answer is no.</p>
<hr />
<p><strong>Erdős’ Girth Conjecture [5]</strong>: For any \(n \geq 1\) and \(k\geq 1\), there exists a graph with \(n\) vertices of girth \((2k+1)\) that has \(\Omega(n^{1+1/k})\) edges.</p>
<hr />
<p>Erdős stated a lower bound \(c_k\cdot n^{1+1/k}\) on the number of edges in the conjecture [5]; that is, the constant is allowed to degrade as \(k\) increases. The spanner literature often cites the stronger version above, where the constant remains the same for every \(k\). The Erdős’ Girth Conjecture is known to hold for a few small values of \(k\).</p>
<p>While Erdős’ Girth Conjecture remains wide open, we could ask: is it possible to construct a \((2k-1)\)-spanner that has girth at least \(2k+1\)? If yes, then the output spanner is (existentially) optimal regardless of the truth of Erdős’ Girth Conjecture.</p>
<p>It turns out that the following simple greedy algorithm, formally described in [2] and attributed to Marshall Bern, does the job: consider edges in increasing weight order and add an edge \(e\) to the current spanner if the distance between its endpoints in the spanner is larger than \((2k-1)w(e)\). The algorithm works for weighted graphs as well. It is an instructive exercise to show that the output graph is a \((2k-1)\)-spanner and has girth at least \(2k+1\).</p>
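<p>Here is a Python sketch of the greedy algorithm, with a naive Dijkstra per edge (nowhere near the best known running time, but enough to see it in action):</p>

```python
import heapq

def greedy_spanner(n, weighted_edges, k):
    """Greedy (2k-1)-spanner: scan edges by nondecreasing weight and keep
    an edge iff the current spanner distance between its endpoints
    exceeds (2k-1) * w(e)."""
    adj = [[] for _ in range(n)]

    def dist(s, t, cap):
        # Dijkstra from s in the current spanner; any value > cap behaves
        # the same in the comparison below, so we can stop early.
        d = [float('inf')] * n
        d[s] = 0
        pq = [(0, s)]
        while pq:
            du, u = heapq.heappop(pq)
            if du > d[u] or du > cap:
                continue
            if u == t:
                return du
            for v, w in adj[u]:
                if du + w < d[v]:
                    d[v] = du + w
                    heapq.heappush(pq, (d[v], v))
        return d[t]

    spanner = []
    for w, u, v in sorted((w, u, v) for u, v, w in weighted_edges):
        if dist(u, v, (2 * k - 1) * w) > (2 * k - 1) * w:
            adj[u].append((v, w))
            adj[v].append((u, w))
            spanner.append((u, v, w))
    return spanner

# Unit-weight cycles with k = 2 (stretch 3): C_4 loses one edge (keeping
# it would create a 4-cycle, violating girth >= 5), while C_5 is kept whole.
c4 = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
s4 = greedy_spanner(4, c4, 2)
assert len(s4) == 3
c5 = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 0, 1)]
assert len(greedy_spanner(5, c5, 2)) == 5
```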
<p>The major downside of the greedy algorithm is its running time: the current best known implementation takes \(O(mn^{1+1/k})\) time. Even for unweighted graphs, to the best of my knowledge, the following problem remains open:</p>
<hr />
<p><strong>Open Problem</strong>: Construct a maximal subgraph of girth at least \((2k+1)\) in nearly linear time.</p>
<hr />
<p>For an unweighted graph, a maximal subgraph of girth at least \((2k+1)\) is a \((2k-1)\)-spanner.</p>
<h1 id="conclusion">Conclusion</h1>
<p>We have mentioned two algorithms for constructing a spanner. Another beautiful algorithm that I hope to cover in a future post is the randomized construction by Baswana and Sen [4]. A notable feature of the Baswana-Sen algorithm is that it can be implemented efficiently in both parallel and distributed models. The recent survey paper [1] contains almost all known algorithms for spanners and its sibling problems.</p>
<h1 id="bibliographical-notes">Bibliographical Notes</h1>
<p>The concept of a spanner was formally introduced by Peleg and Schäffer [8], though its conception came much earlier. Peleg and Schäffer constructed a \((4k-3)\)-spanner with \(O(n^{1+1/k})\) edges by connecting every pair of clusters in the output of <span style="font-variant: small-caps">Clustering</span>\((G,k)\) by an edge. This clustering procedure was due to Awerbuch [3].</p>
<h1 id="references">References</h1>
<p>[1] Ahmed, R., Bodwin, G., Sahneh, F. D., Hamm, K., Jebelli, M. J. L., Kobourov, S., and Spence, R. (2020). Graph spanners: A tutorial review. Computer Science Review, 37, 100253.</p>
<p>[2] Althöfer, I., Das, G., Dobkin, D., Joseph, D., and Soares, J. (1993). On sparse spanners of weighted graphs. Discrete \& Computational Geometry, 9(1), 81-100.</p>
<p>[3] Awerbuch, B. (1985). Complexity of network synchronization. Journal of the ACM (JACM), 32(4), 804-823.</p>
<p>[4] Baswana, S., and Sen, S. (2007). A simple and linear time randomized algorithm for computing sparse spanners in weighted graphs. Random Structures & Algorithms, 30(4), 532-563.</p>
<p>[5] Erdős, P. (1965). Extremal problems in graph theory. In Proceedings of the Symposium on Theory of Graphs and its Applications (1963), pages 29-36.</p>
<p>[6] Halperin, S., and Zwick, U. (1996). Unpublished manuscript.</p>
<p>[7] Peleg, D. (2000). Distributed computing: a locality-sensitive approach. Society for Industrial and Applied Mathematics.</p>
<p>[8] Peleg, D., and Schäffer, A. A. (1989). Graph spanners. Journal of Graph Theory, 13(1), 99-116.</p>
Shortcutting Trees (2023-01-08)

<p>Over the past two years or so, I have been thinking about a cute problem, which turned out to be much more useful to my research than I initially thought. Here it is:</p>
<hr />
<p><strong>Tree Shortcutting Problem</strong>: Given an edge-weighted tree \(T\), add (weighted) edges to \(T\), called <em>shortcuts</em>, to get a graph \(K\) such that:</p>
<ol>
<li>\(d_K(u,v) = d_T(u,v) \quad \forall u,v\in V(T)\). That is, \(K\) preserves distances in \(T\).</li>
<li>For every \(u,v\in V(T)\), there exists a shortest path from \(u\) to \(v\) in \(K\) containing at most \(k\) edges for some \(k\geq 2\).</li>
</ol>
<p>The goal is to minimize the product \(k \cdot \mathrm{tw}(K)\), where \(\mathrm{tw}(K)\) is the treewidth of \(K\).</p>
<hr />
<p><img src="/assets/figs/shortcut.svg" alt="" /></p>
<p><em>Figure 1: An emulator \(K\) obtained by adding one edge to the tree \(T\) has hop bound \(3\) and treewidth \(2\). Compared to \(T\), the hop bound decreases by 1 while the treewidth increases by 1.</em></p>
<p>Such a graph \(K\) is called a <em>low-hop and low-treewidth emulator</em> of the tree \(T\). The parameter \(k\) is called the <em>hop bound</em> of \(K\). It is expected that there will be some trade-off between the hop bound and the treewidth. We are interested in minimizing \(k \cdot \mathrm{tw}(K)\). This product directly affects parameters in our application; see the conclusion section for more details.</p>
<p>For readers who are not familiar with treewidth, see <a href="https://en.wikipedia.org/wiki/Treewidth">here</a> and <a href="https://www.win.tue.nl/~nikhil/courses/2015/2WO08/treewidth-erickson.pdf">here</a> for an excellent introduction, and for why treewidth is an interesting graph parameter.</p>
<blockquote>
<p><strong>Remark 1:</strong> The tree shortcutting problem is already non-trivial for unweighted trees; the shortcuts must be weighted though. Furthermore, for each edge \((u,v)\) added to \(K\), the weight of the edge will be \(d_T(u,v)\). Thus, in the construction below, we do not explicitly assign weights to the added edges.</p>
</blockquote>
<p>A version of a tree shortcutting problem where one seeks to minimize <em>the number of edges of \(K\)</em>, given a hop bound \(k\), was studied extensively (see <a href="http://www.math.tau.ac.il/~haimk/adv-ds-2008/Alon-Schieber.ps">1</a>,<a href="https://dl.acm.org/doi/abs/10.5555/640186.640193">2</a>,<a href="https://arxiv.org/abs/1005.4155">3</a>,<a href="https://dl.acm.org/doi/10.1145/800070.802185">4</a>, including my own work <a href="https://arxiv.org/abs/2107.14221">with</a> <a href="https://arxiv.org/abs/2112.09124">others</a>), arising in the context of spanners and minimum spanning tree problem. What is interesting there IMO is that we see all kinds of crazy slowly growing functions in computer science: for \(k = 2\), the number of edges of \(K\) is \(\Theta(n \log n)\); for \(k = 3\), the number of edges of \(K\) is \(\Theta(n \log\log n)\); for \(k = 4\), the number of edges is \(\Theta(n \log^* n)\); \(\ldots\) [too difficult to describe]; and for \(k = \alpha(n)\), the number of edges is \(\Theta(n)\). Here, as you might guess, \(\alpha(\cdot)\) is the notorious (one parameter) inverse Ackermann function. (The \(\Theta\) notation in the number of edges means there exist matching lower bounds.) I hope to cover this problem in a future blog post.</p>
<p>Now back to our tree shortcutting problem. Let \(n = \lvert V(T) \rvert\). There are two extreme regimes that I am aware of:</p>
<ol>
<li>Hop bound \(k=2\) and treewidth \(\mathrm{tw}(K) = O(\log n)\). This is a relatively simple exercise.</li>
<li>Hop bound \(k=O(\log n)\) and treewidth \(\mathrm{tw}(K) = O(1)\). This regime is harder to prove; trying to show this for a path graph will be an insightful exercise. It follows from a well-known fact [1] that any tree decomposition of width \(t\) can be turned into a tree decomposition of width \(O(t)\) and depth \(O(\log n)\).</li>
</ol>
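<p>For regime 1 on the path graph, the exercise can be sketched as follows (my own illustration, not taken from the paper): connect every vertex to the midpoint of each interval in a balanced recursive partition of the path. The first midpoint that separates \(u\) and \(v\) lies between them, giving an exact 2-hop path, and the intervals of the recursion can be turned into a tree decomposition of width \(O(\log n)\).</p>

```python
import math

def two_hop_path_emulator(n):
    """2-hop distance-preserving emulator for the path 0..n-1: connect
    every vertex to the midpoint of each recursive interval containing
    it, with edge weight equal to the path distance."""
    edges = {}
    def recurse(lo, hi):
        if lo >= hi:
            return
        mid = (lo + hi) // 2
        for v in range(lo, hi + 1):
            if v != mid:
                edges[(min(v, mid), max(v, mid))] = abs(v - mid)
        recurse(lo, mid - 1)
        recurse(mid + 1, hi)
    recurse(0, n - 1)
    return edges

n = 64
E = two_hop_path_emulator(n)
adj = {v: {} for v in range(n)}
for (u, v), w in E.items():
    adj[u][v] = w
    adj[v][u] = w
# Every pair u < v attains its path distance with at most 2 hops.
for u in range(n):
    for v in range(u + 1, n):
        best = adj[u].get(v, float('inf'))
        for m in adj[u]:
            if v in adj[m]:
                best = min(best, adj[u][m] + adj[m][v])
        assert best == v - u
# Each vertex joins O(log n) intervals, so O(n log n) edges overall.
assert len(E) <= n * math.ceil(math.log2(n))
```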
<p>The two regimes might suggest that a lower bound \(k\cdot \mathrm{tw}(K) = \Omega(\log n)\) holds for every \(k\). In our recent paper [2], we show that this is not the case:</p>
<hr />
<p><strong>Theorem 1</strong>: There exists an emulator \(K\) for any \(n\)-vertex tree \(T\) such that \(h(K) = O(\log \log n)\) and \(\mathrm{tw}(K) = O(\log \log n)\).</p>
<hr />
<p>Theorem 1 implies that one can get an emulator with \(k\cdot \mathrm{tw}(K) = O((\log \log n)^2)\), which is exponentially smaller than \(O(\log(n))\). The goal of this post is to discuss the proof of Theorem 1. See the conclusion section for a more thorough discussion on other aspects of Theorem 1, in particular, the construction time and application.</p>
<p>For the tree shortcutting problem, it is often insightful to look into the <strong>path graph with \(n\) vertices</strong>, which is a special case. Once we solve the path graph, extending the ideas to trees is not that difficult.</p>
<h1 id="1-the-path-graph">1. The Path Graph</h1>
<p>The (unweighted) path graph \(P_n\) is a path of \(n\) vertices. To simplify the presentation, assume that \(\sqrt{n}\) is an integer. The construction is recursive and described in the pseudocode below. First, we divide \(P_n\) into \(\sqrt{n}\) subpaths of \(\sqrt{n}\) vertices each. Denote the endpoints of these subpaths by \(b_i = i\sqrt{n}\) for \(0\leq i\leq\sqrt{n}\). We call \(\{b_i\}_{i}\) boundary vertices. We have two types of recursions: (1) the top-level recursion (lines 2 and 3) and (2) the subpath recursion (lines 5 to 9). The intuition of the top-level recursion is a bit tricky, and we will get to it later. See Figure 2.</p>
<p><img src="/assets/figs/recursion.svg" alt="" /></p>
<p><em>Figure 2: (a) The recursive construction applied to the path \(P_n\). (b) Gluing \(\mathcal T_B\) and all \(\{\mathcal T_i\}\) to obtain a tree decomposition \(\mathcal{T}\) of \(K\) via red edges.</em></p>
<p>The subpath recursion is natural: recursively shortcut each subpath \(P[b_i,b_{i+1}]\) (line 6). The subpath recursion returns the shortcut graph \(K_i\) and its tree decomposition \(\mathcal T_i\). Next, we add edges from each boundary vertex \(b_i, b_{i+1}\) of the subpath to all other vertices on the subpath (lines 7 and 8). This step guarantees that each vertex of the subpath can “jump” to the boundary vertices using only <strong>one edge</strong>. In terms of tree decomposition, it means that we add both \(b_i\) and \(b_{i+1}\) to every bag of \(\mathcal T_i\) (line 9).</p>
<p>The top level recursion serves two purposes: (i) creating a low-hop emulator for the boundary vertices (recall that each vertex can jump to a boundary vertex in the same subpath using one edge) and (ii) gluing \(\{\mathcal T_i\}\) together. More precisely, let \(P_{\sqrt{n}}\) be the path on the boundary vertices, i.e., \(b_i\) is adjacent to \(b_{i+1}\) in \(P_{\sqrt{n}}\). We shortcut \(P_{\sqrt{n}}\) recursively, getting the shortcut graph \(K_B\) and its tree decomposition \(\mathcal T_B\) (line 3). Since \((b_i,b_{i+1})\) is an edge in \(P_{\sqrt{n}}\), there must be a bag in \(\mathcal T_B\) containing both \(b_i,b_{i+1}\); that is, the bag \(X\) in line 11 exists. See Figure 2(b). Recall that in line 9, every bag in \(\mathcal T_i\) contains both \(b_i,b_{i+1}\), and that \(K_B\) and \(K_i\) share only the two boundary vertices \(b_i,b_{i+1}\). Thus, we can connect \(X\) to an arbitrary bag of \(\mathcal T_i\) as done in line 12. This completes the shortcutting algorithm.</p>
<hr />
<p><span style="font-variant: small-caps">PathShortcutting</span>\((P_n)\)</p>
<blockquote>
<p>\(1.\) \(B \leftarrow {0,\sqrt{n}, 2\sqrt{n}, \ldots, n}\) and \(b_i \leftarrow i\sqrt{n}\) for every \(0\leq i \leq \sqrt{n}\)<br />
\(2.\) \(P_{\sqrt{n}} \leftarrow\) unweighted path graph with vertex set \(B\).<br />
\(3.\) \((K_B,\mathcal T_B) \leftarrow\)<span style="font-variant: small-caps">PathShortcutting</span>\((P_{\sqrt{n}})\)<br />
\(4.\) \(K\leftarrow K_B,\quad \mathcal{T}\leftarrow \mathcal T_B\)<br />
\(5.\) for \(i\leftarrow 0\) to \(\sqrt{n}-1\)<br />
\(6.\) \((K_i,\mathcal T_i) \leftarrow\)<span style="font-variant: small-caps">PathShortcutting</span>\((P_{n}[b_i, b_{i+1}])\)<br />
\(7.\) for each \(v\in P_{n}[b_i, b_{i+1}]\)<br />
\(8.\) \(E(K_i)\leftarrow E(K_i)\cup\{(v,b_i), (v,b_{i+1})\}\) <br />
\(9.\) add both \(b_i,b_{i+1}\) to every bag of \(\mathcal T_i\)<br />
\(10.\) \(K\leftarrow K \cup K_i\) <br />
\(11.\) Let \(X\) be a bag in \(\mathcal{T}\) containing both \(b_i,b_{i+1}\)<br />
\(12.\) Add \(\mathcal T_i\) to \(\mathcal{T}\) by connecting \(X\) to an arbitrary bag of \(\mathcal T_i\) <br />
\(13.\) return \((K,\mathcal{T})\)</p>
</blockquote>
<hr />
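<p>To make the recursion concrete, here is a minimal Python sketch (my own illustration, not code from [2]) that computes only the shortcut edge set and ignores the tree decomposition. Since every shortcut edge joins two path vertices at weight equal to their path distance, a shortest \(u\)-to-\(v\) path in \(K\) is exactly a monotone path, so the hop distance can be measured by a BFS that only follows increasing edges.</p>

```python
from collections import deque

def shortcut(vs):
    """Shortcut edges for the path on the increasing vertex list vs."""
    n = len(vs)
    if n <= 3:                       # base case: clique on at most 3 vertices
        return {(vs[i], vs[j]) for i in range(n) for j in range(i + 1, n)}
    s = int(round(n ** 0.5))
    idx = list(range(0, n, s))       # boundary positions b_0, b_1, ...
    if idx[-1] != n - 1:
        idx.append(n - 1)
    E = shortcut([vs[i] for i in idx])     # top level recursion (line 3)
    for a, b in zip(idx, idx[1:]):         # subpath recursion (lines 5-9)
        sub = vs[a:b + 1]
        E |= shortcut(sub)
        for v in sub:                      # jump edges to the two boundaries (line 8)
            for u in (vs[a], vs[b]):
                if u != v:
                    E.add((min(u, v), max(u, v)))
    return E

def max_hops(n):
    """Largest hop distance over all pairs; shortest paths here are monotone."""
    adj = [[] for _ in range(n)]
    for u, v in shortcut(list(range(n))):
        adj[u].append(v)                   # keep the increasing direction only
    worst = 0
    for u in range(n):
        dist = [-1] * n
        dist[u] = 0
        q = deque([u])
        while q:
            x = q.popleft()
            for y in adj[x]:
                if dist[y] == -1:
                    dist[y] = dist[x] + 1
                    q.append(y)
        worst = max(worst, max(dist))
    return worst

print(max_hops(256))   # stays below 8, far smaller than the path's 255 hops
```

<p>For \(n=256\) the worst-case hop distance is a small constant, matching the recurrence \(h(n)\leq h(\sqrt{n})+2\) analyzed next.</p>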
<p>It is not difficult to show that \(\mathcal{T}\) is indeed a tree decomposition of \(K\). Thus, we focus on analyzing the hop bound and the treewidth.</p>
<blockquote>
<p><strong>Remark 2:</strong> For notational convenience, we include \(0\) in the set \(B\) though \(0\not\in P_n\). When calling the recursion, one could simply drop 0.</p>
</blockquote>
<p><img src="/assets/figs/hopbound.svg" alt="" /></p>
<p><em>Figure 3: A low-hop path from \(u\) to \(v\).</em></p>
<p><strong>Analyzing the hop bound \(h(K)\).</strong> Let \(u\) and \(v\) be any two vertices of \(P_{n}\); w.l.o.g., assume that \(u \leq v\). Let \(h(n)\) denote the hop bound. Let \(b_{u}\) and \(b_v\) be boundary vertices of the subpaths containing \(u\) and \(v\), respectively, such that \(b_u,b_v \in P[u,v]\). See Figure 3. As mentioned above, line 8 of the algorithm guarantees that the edges \((u,b_u)\) and \((b_v,v)\) are in \(K\), and the top level recursion (line 3) guarantees that there is a shortest path of hop length at most \(h(\sqrt{n})\) between \(b_u\) and \(b_v\) in \(K_B\). Thus we have:</p>
\[h(n) \leq h(\sqrt{n}) + 2\]
<p>which solves to \(h(n)= O(\log\log n)\).</p>
<p><strong>Analyzing the treewidth \(\mathrm{tw}(K)\).</strong> Note that the widths of \(\mathcal T_B\) and of all \(\{\mathcal{T}_i\}\) (before adding boundary vertices in line 9) are bounded by \(\mathrm{tw}(\sqrt{n})\). Line 9 increases the width of each \(\mathcal{T}_i\) by at most \(2\). Since the treewidth of \(K\) is the maximum of the widths of \(\mathcal T_B\) and all \(\{\mathcal{T}_i\}\), we have:</p>
\[\mathrm{tw}(n) \leq \mathrm{tw}(\sqrt{n}) + 2\]
<p>which solves to \(\mathrm{tw}(n)= O(\log\log n)\).</p>
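<p>Both recurrences have the same shape, and a quick numerical check confirms the \(O(\log\log n)\) solution (an illustration only; the base value \(f(n)=1\) for \(n\leq 3\) is an arbitrary choice): each step replaces \(n\) by \(\sqrt{n}\), so the recursion bottoms out after \(\Theta(\log\log n)\) squarings.</p>

```python
import math

def solve(n):
    """Unfold f(n) = f(sqrt(n)) + 2 with base f(n) = 1 for n <= 3."""
    return 1 if n <= 3 else solve(int(n ** 0.5)) + 2

for j in range(2, 6):
    n = 2 ** (2 ** j)              # n = 16, 256, 65536, 2^32
    print(n, solve(n), 2 * math.log2(math.log2(n)) + 1)
```

<p>For \(n = 2^{2^j}\) the unfolded value is exactly \(2j+1 = 2\log\log n + 1\).</p>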
<p>This completes the proof of Theorem 1 for the path graph \(P_n\).</p>
<h1 id="2-trees">2. Trees</h1>
<p>What is needed to extend the construction from the path graph to a general tree? We exploited two properties in the construction for \(P_n\):</p>
<ol>
<li>\(P_n\) can be decomposed into \(\sqrt{n}\) (connected) subpaths.</li>
<li>Each subpath in the decomposition has at most two boundary vertices.</li>
</ol>
<p>Property 2 implies that the total number of boundary vertices is about \(\sqrt{n}\), which plays a key role in analyzing the top level recursion.</p>
<p>It is well known that we can obtain a somewhat similar but weaker decomposition for trees: one can decompose a tree of \(n\) vertices into roughly \(\sqrt{n}\) connected subtrees such that the total number of boundary vertices is \(O(\sqrt{n})\). (A vertex is a boundary vertex of a subtree if it is incident to an edge not in the subtree.) This decomposition is weaker in the sense that a single subtree could have more than 2, and indeed up to \(\Omega(\sqrt{n})\), boundary vertices. Is this enough?</p>
<p>Not quite. To glue the tree decomposition \(\mathcal T_i\) to \(\mathcal{T}\) (and effectively to \(\mathcal T_B\)), we rely on the fact that there is a bag \(X\in \mathcal T_B\) containing both boundary vertices in line 11. The analogy for trees would be: there exists a bag \(X\) containing all boundary vertices of each subtree. This is problematic if a subtree has \(\Omega(\sqrt{n})\) boundary vertices.</p>
<p>Then how about guaranteeing that each subtree has \(O(1)\) boundary vertices, say 3? Will this be enough? The answer, unfortunately, is no. To guarantee that 3 boundary vertices appear in the same bag, one has to add a clique of size 3 on the boundary vertices in the top level recursion. This means that the graph on the boundary vertices, on which we recursively call the shortcutting procedure, is <em>no longer a tree</em>. Thus, we really need a decomposition where every subtree has <strong>at most 2 boundary vertices</strong>.</p>
<hr />
<p><strong>Lemma 1:</strong> Let \(T\) be any tree of \(n\) vertices. One can decompose \(T\) into a collection \(\mathcal{D}\) of \(O(\sqrt{n})\) subtrees such that every tree \(T'\in \mathcal{D}\) has \(\lvert V(T')\rvert \leq \sqrt{n}\) and at most 2 boundary vertices.</p>
<hr />
<p>Lemma 1 is all we need to prove Theorem 1, following exactly the same construction for the path graph \(P_n\); the details are left to readers.</p>
<blockquote>
<p><strong>Remark 3:</strong> In developing our shortcutting tree result, we were unaware that Lemma 1 was already known in the literature. A reviewer later pointed out that one can get Lemma 1 from a weaker decomposition using <em>least common ancestor closure</em> [3], which I reproduce below.</p>
</blockquote>
<p><strong>Proof of Lemma 1:</strong> First, decompose \(T\) into a collection \(\mathcal{D}'\) of \(O(\sqrt{n})\) subtrees such that each tree in \(\mathcal{D}'\) has size at most \(\sqrt{n}\) and the total number of boundary vertices is \(O(\sqrt{n})\). As mentioned above, this decomposition is well known; see Claim 1 in our paper [2] for a proof.</p>
<p>Let \(A_1\) be the set of boundary vertices; \(\lvert A_1\rvert = O(\sqrt{n})\). Root \(T\) at an arbitrary vertex. Let \(A_2\) be the set containing the least common ancestor of every pair of vertices in \(A_1\). Let \(B = A_1\cup A_2\).</p>
<p>It is not hard to see that \(\lvert A_2\rvert \leq \lvert A_1\rvert - 1 = O(\sqrt{n})\). Thus, \(\lvert B\rvert = O(\sqrt{n})\). Furthermore, every connected component of \(T\setminus B\) has at most \(\sqrt{n}\) vertices and has edges to at most 2 vertices in \(B\). The set \(B\) induces a decomposition \(\mathcal{D}\) of \(T\) claimed in Lemma 1.</p>
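<p>The least common ancestor closure step is easy to verify experimentally. The following sketch (my own illustration; the random tree and the random set \(A_1\) merely stand in for the boundary vertices of the weaker decomposition) checks the two facts used above: \(\lvert B\rvert \leq 2\lvert A_1\rvert - 1\), and every component of \(T\setminus B\) is adjacent to at most 2 vertices of \(B\).</p>

```python
import random

random.seed(7)
n = 500
parent = [None] + [random.randrange(v) for v in range(1, n)]   # random rooted tree
depth = [0] * n
for v in range(1, n):
    depth[v] = depth[parent[v]] + 1

def lca(u, v):
    """Least common ancestor by repeatedly walking the deeper vertex up."""
    while u != v:
        if depth[u] < depth[v]:
            u, v = v, u
        u = parent[u]
    return u

A1 = set(random.sample(range(n), 20))        # stand-in for the boundary vertices
A2 = {lca(u, v) for u in A1 for v in A1 if u != v}
B = A1 | A2                                  # the LCA closure of A1
assert len(B) <= 2 * len(A1) - 1             # |A2 \ A1| <= |A1| - 1

children = [[] for _ in range(n)]
for v in range(1, n):
    children[parent[v]].append(v)

seen = set(B)
for r in range(n):                           # explore each component of T \ B
    if r in seen:
        continue
    stack, touched = [r], set()
    seen.add(r)
    while stack:
        x = stack.pop()
        nbrs = children[x] + ([parent[x]] if parent[x] is not None else [])
        for y in nbrs:
            if y in B:
                touched.add(y)               # vertex of B adjacent to this component
            elif y not in seen:
                seen.add(y)
                stack.append(y)
    assert len(touched) <= 2                 # at most 2 boundary vertices per component
print("every component of T \\ B touches at most 2 vertices of B")
```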
<h1 id="3-conclusion">3. Conclusion</h1>
<p>The emulator in Theorem 1 can be constructed in time \(O(n \log \log n)\); see Theorem 10 of our paper [2] for more details. The major open problem is:</p>
<p><strong>Open problem:</strong> Is \(O((\log \log n)^2)\) the best possible bound for the product \(\mathrm{tw}(K)\cdot h(K)\)?</p>
<p>This open problem is intimately connected to another problem: embedding planar graphs of diameter \(D\) into low-treewidth graphs with additive distortion at most \(+\epsilon D\) for any \(\epsilon \in (0,1)\). More precisely, though not explicitly stated in [2], one of our main results is:</p>
<hr />
<p><strong>Theorem 2:</strong> If one can construct an emulator \(K\) of treewidth \(\mathrm{tw}(n)\) and hop bound \(h(n)\) for any tree of \(n\) vertices, then one can embed any planar graph with \(n\) vertices and diameter \(D\) into a graph of treewidth \(O(h(n)\cdot \mathrm{tw}(n)/\epsilon)\) with additive distortion \(+\epsilon D\) for any given \(\epsilon \in (0,1)\).</p>
<hr />
<p>That is, the product of the treewidth and the hop bound directly bounds the treewidth of the embedding. Theorem 1 gives us an embedding with treewidth \(O((\log\log n)^2/\epsilon)\), which has various algorithmic applications [2]. My belief is that the bound \(O((\log \log n)^2)\) is the best possible.</p>
<h1 id="4-references">4. References</h1>
<p>[1] Bodlaender, H.L. and Hagerup, T. (1995). Parallel algorithms with optimal speedup for bounded treewidth. In ICALP ‘95, 268-279.</p>
<p>[2] Filtser, A. and Le, H. (2022). Low Treewidth Embeddings of Planar and Minor-Free Metrics. ArXiv preprint <a href="https://arxiv.org/abs/2203.15627">arXiv:2203.15627</a>.</p>
<p>[3] Fomin, F.V., Lokshtanov, D., Saurabh, S. and Zehavi, M. (2019). Kernelization: Theory of Parameterized Preprocessing. Cambridge University Press.</p>Hung LeSparsity of minor-free graphs2022-11-30T00:00:00+00:00https://minorfree.github.io/minor-sparsity<p>Planar graphs are sparse: any planar graph with \(n\) vertices has at most \(3n-6\) edges. A simple corollary of this sparsity is that planar graphs are \(6\)-colorable. There is a simple and beautiful proof based on the Euler formula, which easily extends to the more general case of bounded-genus graphs: any graph embeddable in an orientable surface of genus \(g\) with \(n\) vertices has at most \(3n + 6g-6\) edges.</p>
<p>How about the number of edges of \(K_r\)-minor-free graphs? This is a very challenging question. A reasonable guess is \(O(r)\cdot n\): a disjoint union of \(n/(r-1)\) copies of \(K_{r-1}\) excludes a \(K_r\) minor and has \(\Theta(r)\cdot n\) edges. But this isn’t the case: surprisingly, the correct bound is \(O(r\sqrt{\log r})\cdot n\), which will be the topic of this post.</p>
<hr />
<p><strong>Theorem 1:</strong> Any \(K_r\)-minor-free graph with \(n\) vertices has at most \(O(r\sqrt{\log r})\cdot n\) edges.</p>
<hr />
<p>The bound in Theorem 1 is tight; see the lower bound in Section 4. The sparsity bound \(O(r\sqrt{\log r})\), where sparsity is the ratio of the number of edges to the number of vertices, was first discovered by Kostochka [4]; the proof is quite non-trivial, as are the other follow-up proofs. A short proof was found recently by Alon, Krivelevich, and Sudakov (AKS) [1], which I will present here in Section 3. See the bibliographical notes section for a detailed discussion of other proofs.</p>
<p>The goal of this post isn’t just to present the proof of Theorem 1. At various points in the past, I was interested in a more intuitive proof that gives a good enough sparsity bound, say \(O(\mathrm{poly}(r))\), or even an \(O(f(r))\) bound for any function \(f\) that depends on \(r\) only. This post describes different proofs, of increasing complexity (or ingenuity), giving different bounds. In particular, the sparsity bound \(2^{r}\) follows by a simple induction. The sparsity bound \(O(r^2)\) relies crucially on the fact that a graph of sparsity \(d\) has a highly connected minor of small size; the high connectivity allows us to show that any vertex subset of size \(k \approx \sqrt{d}\) is connected by \({k \choose 2}\) internally vertex-disjoint paths, thereby giving us a \(K_{\sqrt{d}}\) minor. The optimal bound \(O(r\sqrt{\log r})\) also relies on a highly connected minor of small size, but employs a clever probabilistic argument to construct a \(K_{r}\) minor.</p>
<h1 id="1-exponential-sparsity-2r">1. Exponential Sparsity: \(2^{r}\)</h1>
<p>I learned a beautiful proof of the following theorem from the graph theory book of Reinhard Diestel (Proposition 7.2.1 [3]).</p>
<hr />
<p><strong>Theorem 2:</strong> Any \(K_r\)-minor-free graph with \(n\) vertices has at most \(2^{r-1}\cdot n\) edges.</p>
<hr />
<p>Proof: Let \(G\) be a \(K_r\)-minor-free graph with \(n\) vertices. The proof is by induction on \( \vert V(G) \vert + r\). If there is a vertex \(v\) of degree at most \(2^{r-1}\), then by removing \(v\) and applying the induction hypothesis, we are done. Now consider the case where every vertex has degree more than \(2^{r-1}\); let \(v\) be such a vertex. The key idea is to find an edge \((u,v)\) such that \(u\) and \(v\) share only a few neighbors. We then contract \((u,v)\) and apply induction.</p>
<blockquote>
<p>Claim 1: There is a neighbor \(u\) of \(v\) such that \( \vert N_G(v)\cap N_G(u) \vert \leq 2^{r-1}-1\).</p>
</blockquote>
<p>Suppose that the claim holds, and let \(G'\) be the graph obtained from \(G\) by contracting \((u,v)\), i.e., \(G'= G/(u,v)\). Then \( \vert V(G') \vert = n-1\) and \( \vert E(G') \vert \geq \vert E(G) \vert - 2^{r-1}\). By induction, \( \vert E(G') \vert \leq 2^{r-1}(n-1)\), which implies \( \vert E(G) \vert \leq 2^{r-1}(n-1) + 2^{r-1} = 2^{r-1}\cdot n\) as desired.</p>
<p>We now turn to Claim 1. Let \(H = G[N_G(v)]\) be the subgraph induced by \(N_G(v)\). Then \(H\) is \(K_{r-1}\)-minor-free. By induction, \( \vert E(H) \vert \leq 2^{r-2}\cdot \vert V(H) \vert \). Thus, there exists a vertex \(u\in H\) such that \(d_H(u) \leq 2 \vert E(H) \vert / \vert V(H) \vert \leq 2^{r-1}\). This gives \( \vert N_G(v)\cap N_G(u) \vert \leq d_H(u)-1 = 2^{r-1}-1\).</p>
<hr />
<p>The exponential term \(2^{r-1}\) in the above theorem is due to a loss of a factor of \(2\) in each step of the induction by using \(d_H(v) \leq 2( \vert E(H) \vert / \vert V(H) \vert )\).</p>
<h1 id="2-polynomial-sparsity-or2">2. Polynomial Sparsity: \(O(r^2)\)</h1>
<p>The goal of this section is to improve the exponential bound in Theorem 2 to a polynomial bound.</p>
<hr />
<p><strong>Theorem 3:</strong> Any \(K_r\)-minor-free graph with \(n\) vertices has at most \(O(r^2)\cdot n\) edges.</p>
<hr />
<p>Proving Theorem 3 requires a deeper understanding of the structure of minor-free graphs. The key insight is that any graph with at least \(d\cdot n\) edges has a minor, say \(K\), of size \(O(d)\) that is highly connected (Lemmas 1 and 2 below). We then show that, given any set \(R\) of about \(\Theta(\sqrt{d})\) vertices in \(K\), we can find a collection of pairwise disjoint paths of \(K\) connecting every pair of vertices in \(R\) (due to high connectivity), which gives us a clique minor of size \(\Theta(\sqrt{d})\) (Lemma 3). Taking the contrapositive gives Theorem 3.</p>
<p>Let \(\varepsilon(G) = \vert E(G) \vert / \vert V(G) \vert \) and \(\delta(G)\) be the minimum degree of \(G\). First, we show that \(G\) contains a minor with at most \(2d\) vertices and minimum degree at least \(d\).</p>
<hr />
<p><strong>Lemma 1:</strong> Let \(G\) be any graph of \(n\) vertices such that \( \vert E(G) \vert \geq d\cdot n\). Then \(G\) has a minor \(H\) such that \(\delta(H)\geq d \geq \vert V(H) \vert /2\).</p>
<hr />
<p>Proof: Let \(K\) be a <em>minimal</em> minor of \(G\) such that \(\varepsilon(K)\geq d\). The minimality implies two properties:</p>
<ol>
<li>\( \vert N_K(u)\cap N_K(v) \vert \geq d\) for every edge \((u,v)\); otherwise, we can contract edge \((u,v)\).</li>
<li>There exists a vertex \(x\) such that \(d_K(x)\leq 2d\); otherwise, we can delete an edge from \(K\).</li>
</ol>
<p>Then \(H = K[N_K(x)]\) satisfies the lemma: \( \vert V(H) \vert = d_K(x) \leq 2d\) by property 2, and every vertex of \(H\) has at least \(d\) neighbors in \(H\) by property 1.</p>
<hr />
<p>Next, we show that \(H\) has a highly connected subgraph.</p>
<hr />
<p><strong>Lemma 2:</strong> Let \(H\) be such that \(\delta(H)\geq d \geq \vert V(H) \vert /2\). Then \(H\) has a subgraph \(K\) that has (i) \( \vert V(K) \vert \leq 2d\) and (ii) \(K\) is \(d/3\)-vertex connected.</p>
<hr />
<p>Proof: If \(H\) is \(d/3\)-vertex connected, then \(K = H\). Otherwise, there is a set \(S\subseteq V(H)\) of size at most \(d/3\) such that \(H\setminus S\) has at least two connected components. Let \(K\) be the smallest connected component of \(H\setminus S\). Then \( \vert V(K) \vert \leq \vert V(H) \vert /2\leq d\) and \(\delta(K)\geq \delta(H) - \vert S \vert \geq 2d/3\). Thus, every \(u\not= v\in K\) must share at least \( \vert N_K(u) \vert + \vert N_K(v) \vert - \vert V(K) \vert \geq d/3\) neighbors in \(K\). That is, \(K\) is \(d/3\)-vertex connected.</p>
<hr />
<p><strong>A detour: diameter of highly connected graphs.</strong> Graph \(K\) in Lemma 2 has another nice property: its diameter is at most \(7\). To see this, suppose that there is a shortest path \(\{v_0,v_1,\ldots,v_8, \ldots\}\) of length at least \(8\). Consider the three vertices \(v_1,v_4, v_7\). Their closed neighborhoods are pairwise disjoint. As \(K\) is \(d/3\)-connected, \(\delta(K)\geq d/3\). Thus, the three closed neighborhoods cover at least \(3(d/3+1) > d\geq \vert V(K) \vert \) vertices, a contradiction. We will use the same kind of argument in several proofs below.</p>
<p>We now construct a clique minor of size \(\Theta(\sqrt{d})\) from the graph \(K\) in Lemma 2. We do so by showing that for any \(p \leq d/40\) distinct pairs of vertices \(\{(s_1,t_1), \ldots, (s_p,t_p)\}\) in \(K\) (two pairs might share the same vertex), there are \(p\) internally vertex-disjoint paths connecting them (Lemma 3). (Two paths are internally vertex-disjoint if they only share endpoints.) One can then construct a clique minor of size \(\sqrt{d/40}\) by picking an arbitrary set \(R\) of \(\sqrt{d/40}\) vertices and connecting all pairs of vertices in \(R\) using the disjoint paths from Lemma 3, which implies Theorem 3.</p>
<p><img src="/assets/figs/paths-large.svg" alt="" /></p>
<p><em>Figure 1: (a) \(\mathcal{P}\) includes two paths of black edges. (b) deleting \(\mathcal{P}\) except \(s_1,t_1\). (c) \(v\) could not have more than 3 neighbors on the path from \(s_i\) to \(t_i\)</em></p>
<hr />
<p><strong>Lemma 3:</strong> Let \(\mathcal{T} = \{(s_1,t_1), \ldots, (s_p,t_p)\}\) be any \(p \leq d/40\) distinct pairs of vertices in \(K\) (from Lemma 2). Then there are \(p\) internally vertex-disjoint paths connecting all the pairs in \(\mathcal{T}\).</p>
<hr />
<p>Proof: Let \(\mathcal{P}\) be a set of internally vertex-disjoint paths, each of length at most \(10\), that connects a maximal number of pairs in \(\mathcal{T}\). Subject to the pairs connected by \(\mathcal{P}\), we choose \(\mathcal{P}\) such that the total number of edges of paths in \(\mathcal{P}\) is minimal. If \(\mathcal{P}\) connects every pair, we are done. Otherwise, w.l.o.g., we assume that \(s_1\) and \(t_1\) are not connected by \(\mathcal{P}\). See Figure 1(a).</p>
<p>Let \(K^-\) be obtained from \(K\) by removing all vertices in \(\mathcal{P}\), except \(s_1\) and \(t_1\), from \(K\). See Figure 1(b). Observe that the total number of vertices in \(\mathcal{P}\) is at most \(11\cdot p = 11d/40 < d/3\). Since \(K\) is \(d/3\)-vertex-connected, \(K^-\) is connected.</p>
<blockquote>
<p>Claim 2: for every \(v\in K^-\) and \(P\in \mathcal{P}\), \( \vert N_K(v)\cap V(P) \vert \leq 3\).</p>
</blockquote>
<p>Suppose the claim is not true, then we can shortcut \(P\) via \(v\) to get a shorter path connecting the endpoints of \(P\), contradicting the minimality of \(\mathcal{P}\). See Figure 1(c).</p>
<p>Note that \(\delta(K)\geq d/3\) as \(K\) is \(d/3\)-connected. By Claim 2, \(\delta(K^-)\geq \delta(K) -3\cdot p \geq d/3 - 3d/40 \geq d/4\). The same argument as in the detour above implies that \(K^-\) has diameter at most \(10\). Thus, there is a path of length at most \(10\) from \(s_1\) to \(t_1\) in \(K^-\), contradicting the maximality of \(\mathcal{P}\).</p>
<hr />
<h1 id="3-optimal-sparsity-orsqrtlogr">3. Optimal Sparsity: \(O(r\sqrt{\log{r}})\)</h1>
<p>We assume that the graph has at least \(d\cdot n\) edges. Our goal is to construct a minor of size \(\Omega(d/\sqrt{\log d})\). By taking contrapositive, we obtain Theorem 1.</p>
<h2 id="31-proof-ideas">3.1. Proof Ideas</h2>
<p>By Lemma 2, it suffices to construct a clique minor of size \(\Omega(d/\sqrt{\log d})\) for a \(d/3\)-vertex-connected graph \(K\) with at most \(2d\) vertices. In particular, we will construct a collection of \(h = d/(c_1 \sqrt{\log d})\) vertex-disjoint connected subgraphs \(C_1,C_2,\ldots, C_h\) such that (a) there is an edge between any two subgraphs \(C_i,C_j\) for \(1\leq i\not=j \leq h\) and (b) each \(C_i\) has \(O(c_0 \sqrt{\log d})\) vertices, for some constants \(1\ll c_0 \ll c_1\). These subgraphs will realize a \(K_h\)-minor of \(K\).</p>
<p>The choices of the constants \(c_0\) and \(c_1\) imply that \( \vert V(C_1)\cup \ldots \cup V(C_h) \vert \leq d/12\). Thus, if we let \(H_i = K\setminus (C_1\cup \ldots \cup C_i)\) for any \(i\leq h\), then \(H_i\) is \(d/3 - d/12 = d/4\) vertex-connected. This, in particular, implies that \(H_i\) has diameter at most \(22 = O(1)\).</p>
<p><img src="/assets/figs/minor-large.png" alt="" /></p>
<p><em>Figure 2: (a) \(C_1\) is formed from \(S_1\), the set of black vertices, and its bad set \(B_1\). (b) \(C_2\) is constructed from \(S_2\) (black vertices), which avoids \(B_1\), and its bad set \(B_2\). (c) \(S_i\) has edges to all graphs \(C_1,C_2,\ldots,C_{i-1}\).</em></p>
<p>We will construct each \(C_i\) by random sampling. To gain intuition, let’s look at the first step: (1) sample a set \(S_1\) of \(s = c_0\sqrt{\log d}\) vertices and (2) make \(S_1\) connected by adding a shortest path from one vertex to every other vertex in \(S_1\). See Figure 2(a). There are two good reasons for doing this:</p>
<ol>
<li>As the graph has diameter \(O(1)\), \( \vert V(C_1) \vert = O(c_0 \sqrt{\log d})\).</li>
<li>About \(e^{-O(c_0)\sqrt{\log d}}\cdot d\) vertices are <strong>not dominated</strong> by \(S_1\). This is because each vertex has at least \(d/3\) neighbors (as \(K\) is \(d/3\)-connected), and hence the probability that a given vertex is not dominated by \(S_1\) is at most \((1-d/(3\cdot 2d))^{s} = e^{-O(c_0)\sqrt{\log d}}\). Let’s call these vertices bad vertices (for a reason explained later), and denote the set of bad vertices by \(B_1\). (A vertex is dominated by \(S_1\) if it is in \(S_1\) or adjacent to a vertex in \(S_1\).)</li>
</ol>
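<p>The domination step above is easy to simulate. In the sketch below (an illustration only; a dense Erdős–Rényi random graph plays the role of \(K\), since it has minimum degree about \(n/2\)), the number of non-dominated vertices drops exponentially with the sample size, as claimed:</p>

```python
import random

random.seed(1)
n = 400
adj = [set() for _ in range(n)]              # G(n, 1/2): min degree ~ n/2
for u in range(n):
    for v in range(u + 1, n):
        if random.random() < 0.5:
            adj[u].add(v)
            adj[v].add(u)

bad_count = {}
for s in (2, 5, 10, 20):                     # sample sizes, in the role of c_0*sqrt(log d)
    S = set(random.sample(range(n), s))
    bad = [v for v in range(n) if v not in S and not (adj[v] & S)]
    bad_count[s] = len(bad)                  # roughly n * 2^(-s) vertices stay bad
    print(s, len(bad))
```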
<p>In the second step, we sample \(S_2\) in exactly the same way and form \(C_2\) by adding paths between vertices of \(S_2\). The graph from which we construct \(C_2\) is now \(H_1 = K\setminus V(C_1)\), and as argued above, \(H_1\) has roughly the same properties as \(K\): it is \(d/4\)-vertex-connected and has diameter \(O(1)\). We want \(S_2\) to contain a vertex adjacent to \(S_1\) (in \(K\)) because we would like \(C_2\) to be adjacent to \(C_1\). That is, we want \(S_2 \not\subseteq B_1\): we say that \(S_2\) <strong>avoids</strong> the bad set \(B_1\). See Figure 2(b). Reason 2 above implies that \(\mathrm{Pr}[S_2\subseteq B_1] \leq (e^{-O(c_0)\sqrt{\log d}})^{c_0\sqrt{\log d}} \leq 1/d^2\) for some choice of \(c_0\gg 1\). Thus, w.h.p., \(S_2\) avoids \(B_1\).</p>
<p>In general, at step \(i \in [1,h]\), we have already constructed \(i-1\) vertex-disjoint connected subgraphs \(C_1,C_2,\ldots, C_{i-1}\), each associated with a bad set (a set of non-neighbors). See Figure 2(c). We construct \(C_i\) by sampling a set \(S_i\) and adding paths between vertices of \(S_i\). By the same reasoning as for \(S_2\) and the union bound, the probability that \(S_i\) is connected to all \(i-1\) subgraphs, i.e., \(S_i\) avoids all \(i-1\) bad sets, is at least \(1 - d/d^2 = 1-1/d > 0\). When \(i = h\), we obtain a \(K_h\) minor as desired.</p>
<h2 id="32-the-formal-proof">3.2. The formal proof</h2>
<p>Notation: for a set \(S\subseteq V\) in a graph \(G = (V,E)\), denote by \(B_G(S)\) the set of vertices not dominated by \(S\). That is, \(B_G(S) = V\setminus (S\cup(\cup_{v\in S}N_G(v)))\).</p>
<p>We construct a set of subgraphs \(C_1,C_2,\ldots, C_h\) realizing a \(K_h\)-minor of \(K\), for \(h = d/(c_1 \sqrt{\log d})\), in \(h\) iterations as follows.</p>
<hr />
<p>Initially, \(H_0 = K, B_0 = \emptyset\).</p>
<blockquote>
<p>In the \(i\)-th iteration, \(i\geq 1\), we find a set \(S_i\) of at most \(c_0\sqrt{\log d}\) vertices such that (a) \( \vert B_{H_{i-1}}(S_i) \vert \leq 2de^{-c_0\sqrt{\log d}/8}\) and (b) \(S_i\) is connected to each of \(C_1,C_2,\ldots,C_{i-1}\) by an edge. Next, we obtain \(C_i\) by adding shortest paths from an arbitrary vertex \(v\in S_i\) to every other vertex in \(S_i\setminus \{v\}\), and set \(B_i= B_{H_{i-1}}(S_i)\). Then, we define \(H_i = H_{i-1}\setminus V(C_i)\) for the next iteration.</p>
</blockquote>
<p>Finally, output \(C_1,C_2,\ldots, C_h\).</p>
<hr />
<p>To show the correctness of the algorithm, we only have to show that the set \(S_i\) at iteration \(i\) exists, for some choices of \(1\ll c_0 \ll c_1\). If so, \(C_1,C_2,\ldots, C_h\) form a \(K_h\)-minor of \(K\), and hence, of \(G\).</p>
<p>First, we show that \(H_i\) has high connectivity and \(C_i\) has size \(O(c_0\sqrt{\log d})\).</p>
<hr />
<p><strong>Lemma 4:</strong> For every \(i\geq 1\), \( \vert V(C_i) \vert \leq 22 c_0\sqrt{\log d}\) and \(H_i\) is \(d/4\)-vertex connected when \(c_1 = 264c_0\).</p>
<hr />
<p>Proof: We prove by induction. Since \(H_{i-1}\) is \(d/4\)-connected and \( \vert V(H_{i-1}) \vert \leq 2d\), the diameter of \(H_{i-1}\) is at most \(22\). As we add at most \(c_0\sqrt{\log d}\) shortest paths to \(S_i\), \( \vert V(C_i) \vert \leq 22 c_0\sqrt{\log d}\).</p>
<p>Observe that \(\sum_{j=1}^{i} \vert V(C_j) \vert \leq 22c_0\sqrt{\log d}\cdot h= 22c_0\sqrt{\log d} \cdot d/(c_1\sqrt{\log d}) = d/12\). Since \(K\) is \(d/3\)-vertex connected, \(H_i\) is \(d/3-d/12 = d/4\) vertex connected.</p>
<hr />
<p>Now we show the existence of \(S_i\). Note that condition (b) is equivalent to saying that \(S_i\) avoids all the bad sets \(B_0,B_1,\ldots, B_{i-1}\). Let \(S_i\) be obtained by choosing each vertex of \(H_{i-1}\) with probability \(c_0\sqrt{\log d}/(2d)\); the expected size of \(S_i\) is at most \(c_0\sqrt{\log d}\). By Lemma 4, every vertex \(v\in H_{i-1}\) has degree at least \(d/4\). Thus, \(\mathrm{Pr}[v\in B_{i}]\leq (1-c_0\sqrt{\log d}/(2d))^{d/4}\sim e^{-c_0\sqrt{\log d}/8}\). In particular, \(\mathbb{E}[ \vert B_i \vert ] \leq 2d\cdot e^{-c_0\sqrt{\log d}/8}\).</p>
<p>It remains to show that with non-zero probability, \(S_i\) avoids all \(B_0,\ldots, B_{i-1}\). Note that \( \vert V(H_{i-1}) \vert \geq d/4\). For a fixed \(j\in [0,i-1]\):</p>
<p>\(\mathrm{Pr}[S_i\subseteq B_j]\leq ( \vert B_j \vert / \vert V(H_{i-1}) \vert )^{ \vert S_i \vert }\leq 8(e^{-c_0\sqrt{\log d}/8})^{c_0\sqrt{\log d}}= 8e^{-c_0^2 \log(d)/8} \leq 1/d^2\)</p>
<p>for a sufficiently large \(c_0\geq 1\).</p>
<p>By the union bound, the probability that \(S_i\subseteq B_j\) for some \(j\in [0,i-1]\) is at most \(h/d^2\leq 1/d\). Thus, the probability that \(S_i\) avoids all \(B_j\) is at least \(1-1/d\). This concludes the proof.</p>
<h1 id="4-a-lower-bound">4. A Lower Bound</h1>
<p>In this section, we show that for any \(n\) and \(r\) such that \(n \gg r\sqrt{\log r}\), there exists a graph \(G\) with \(n\) vertices and \(\Theta(n\cdot r\sqrt{\log r})\) edges such that \(G\) has no \(K_{r}\) minor. The key idea of the construction is Theorem 4 below.</p>
<hr />
<p><strong>Theorem 4:</strong> There exists a graph \(H\) with \(k\) vertices and \(\Theta(k^2)\) edges such that \(H\) has no \(K_s\)-minor where \(s = k/(\epsilon\sqrt{\log k})\) for some constant \(\epsilon\in (0,1)\).</p>
<hr />
<p>Theorem 4 implies a sparsity lower bound \(\Omega(r\sqrt{\log r})\) as follows. Let \(G\) be the disjoint union of \(\Theta(n/(r\sqrt{\log r}))\) copies of the same graph in Theorem 4 with \(k = \Theta(r\sqrt{\log r})\) vertices. Then \( \vert E(G) \vert = \Theta(n/(r\sqrt{\log r}))k^2 = \Theta(n\cdot r\sqrt{\log r})\). As \(H\) excludes a clique minor of size \(k/(\epsilon\sqrt{\log k}) \leq r\) (by choosing the constant in the definition of \(k\) appropriately), \(G\) excludes \(K_r\) as a minor.</p>
<p>Theorem 4 can be proven by the probabilistic method. To gain some intuition for the proof, consider any fixed partition of \(V(H)\) into vertex-disjoint subsets \(\{V_1,V_2,\ldots, V_{s}\}\) of size \(\epsilon \sqrt{\log k}\) each. For this partition to realize a \(K_s\) minor, there must be an edge between every two vertex sets \(V_i,V_j\) for \(i\not=j\). The probability of this is:</p>
\[\left(1-2^{-|V_i||V_j|}\right)^{s \choose 2} = \left(1-2^{-\epsilon ^2 \log k}\right)^{s \choose 2} \approx e^{-k^{2 - \epsilon^2}/\log k}\]
<p>By the union bound over at most \(k^k\) such partitions, the probability of having a \(K_s\) minor is at most \(k^k e^{-k^{2 - \epsilon^2}/\log(k)} \rightarrow 0 \) when \(k \rightarrow +\infty\). In other words, the probability of not having a \(K_{s}\) minor is close to \(1\).</p>
<p>In the formal proof, one has to work with the fact that the subsets in the partition might not have the same size; this can be resolved by simple algebraic manipulation.</p>
<p><strong>Proof of Theorem 4.</strong> Let \(H = G(k,1/2)\) where \(G(k,1/2)\) is the Erdős–Rényi random graph with probability \(p = 1/2\). We now show that the probability that \(H\) contains a \(K_s\) minor tends to \(0\) when \(k\rightarrow \infty\).</p>
<p>Recall that \(K_s\) is a minor of \(H\) if there is a set of non-empty, connected, and vertex-disjoint subgraphs \(\mathcal{C} = \{C_1,C_2,\ldots, C_s\}\) such that there is an edge in \(H\) connecting every two graphs \(C_i,C_j\) for \(1\leq i\not=j \leq s\).</p>
<p>We will bound the probability that such a \(\mathcal{C}\) exists. Observe that the number of (ordered) partitions of \(V(H)\) into \(s\) non-empty subsets is at most:</p>
\[\frac{k!}{s!}{k-1\choose s-1} <k^k\]
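<p>This crude count is easy to sanity-check numerically (an illustration only; the sample values of \(k\) and \(s\) are arbitrary):</p>

```python
import math

def ordered_partitions_bound(k, s):
    """The bound k!/s! * C(k-1, s-1) on ordered partitions into s non-empty sets."""
    return math.factorial(k) // math.factorial(s) * math.comb(k - 1, s - 1)

for k, s in ((10, 3), (20, 6), (40, 12)):
    assert ordered_partitions_bound(k, s) < k ** k
print("the bound is always below k^k")
```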
<p>Fix such a partition of \(V(H)\), denoted by \(\mathcal{P}\). Let \(n_i\) be the number of vertices in the \(i\)-th set. The probability that there is an edge between two given sets \(i\) and \(j\) is \(1-2^{-n_i\cdot n_j}\). Thus, the probability of having an edge between every two different sets of \(\mathcal{P}\) is:</p>
\[\prod_{(i,j)}(1-2^{-n_i\cdot n_j}) \leq \prod_{(i,j)}e^{-2^{-n_i\cdot n_j}} = e^{-\sum_{(i,j)}2^{-n_i\cdot n_j}}\]
<p>where the product and sum are over all unordered pairs \((i,j)\). This implies that:</p>
\[\mathrm{Pr}[\mathcal{C} \text{ exists}] \leq k^k \cdot e^{-\sum_{(i,j)}2^{-n_i\cdot n_j}}\]
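The step from the product to the exponential above relies on the elementary inequality \(1-x\le e^{-x}\); here is a quick, throwaway Python check of that product-versus-exponential bound:

```python
import math
import random

random.seed(0)

def product_vs_exp(xs):
    """Compare prod(1 - x_i) with exp(-sum x_i); since 1 - x <= e^{-x}
    for all x, the product never exceeds the exponential."""
    prod = 1.0
    for x in xs:
        prod *= 1.0 - x
    return prod, math.exp(-sum(xs))

trials = [[random.random() for _ in range(20)] for _ in range(200)]
holds = all(p <= e for p, e in map(product_vs_exp, trials))
```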
<p>We now estimate \(\sum_{(i,j)}2^{-n_i\cdot n_j}\). By the arithmetic–geometric mean inequality,
\(\sum_{(i,j)}2^{-n_i\cdot n_j}\geq {s \choose 2}\left(\prod_{(i,j)}2^{-n_i\cdot n_j}\right)^{1/{s\choose 2}} = {s \choose 2} \left(2^{-\sum_{(i,j)}n_i\cdot n_j}\right)^{1/{s\choose 2}}\)</p>
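The AM–GM step is easy to verify numerically; the sketch below (illustrative only) compares \(\sum_{(i,j)}2^{-n_i n_j}\) with the lower bound \({s\choose 2}\,2^{-\sum_{(i,j)} n_i n_j/{s\choose 2}}\) for random part sizes:

```python
import math
import random

random.seed(1)

def am_gm_gap(sizes):
    """For part sizes n_1..n_s, return (sum of 2^(-n_i*n_j) over pairs,
    the AM-GM lower bound binom(s,2) * 2^(-sum n_i*n_j / binom(s,2)))."""
    s = len(sizes)
    pairs = [(i, j) for i in range(s) for j in range(i + 1, s)]
    m = len(pairs)
    total = sum(2.0 ** (-sizes[i] * sizes[j]) for i, j in pairs)
    cross = sum(sizes[i] * sizes[j] for i, j in pairs)
    return total, m * 2.0 ** (-cross / m)

trials = [[random.randint(1, 8) for _ in range(6)] for _ in range(200)]
holds = all(t >= b for t, b in (am_gm_gap(sz) for sz in trials))
```

Equality holds exactly when all the products \(n_i n_j\) coincide, e.g. for equal part sizes.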
<p>Observe that \(\sum_{(i,j)}n_i\cdot n_j\) is the number of edges in a complete \(s\)-partite graph with \(k\) vertices, so \(\sum_{(i,j)}n_i\cdot n_j\leq k^2/2\). Since \({s\choose 2} = s(s-1)/2\), it follows that:</p>
\[\sum_{(i,j)}2^{-n_i\cdot n_j} \geq {s \choose 2} 2^{-k^2/(s(s-1))}\]
<p>Thus,</p>
\[\mathrm{Pr}[\mathcal{C} \text{ exists}] \leq k^k \cdot e^{- {s \choose 2} 2^{-k^2/(s(s-1))}}\]
<p>By choosing \(s = ck/\sqrt{\log k}\) for some big enough constant \(c\), the exponent above grows polynomially in \(k\) and therefore dominates the \(k^k = e^{k\ln k}\) factor; hence \(\mathrm{Pr}[\mathcal{C} \text{ exists}] \rightarrow 0\) when \(k\rightarrow \infty\).</p>
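As a final sanity check, one can plug a concrete choice into the bound and watch the logarithm of the failure probability dive to \(-\infty\). This rough numeric sketch is mine: I take \(c=2\), interpret \(\log\) as \(\log_2\), and use \(s(s-1)\) in the exponent as produced by the AM–GM step.

```python
import math

def log_pr_bound(k, c=2.0):
    # ln of k^k * exp(-C(s,2) * 2^(-k^2/(s*(s-1)))) with s = c*k/sqrt(log2 k)
    s = round(c * k / math.sqrt(math.log2(k)))
    pairs = s * (s - 1) / 2
    return k * math.log(k) - pairs * 2.0 ** (-k * k / (s * (s - 1)))

vals = [log_pr_bound(k) for k in (100, 1000, 10000)]
```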
<hr />
<h1 id="bibliographical-notes">Bibliographical Notes</h1>
<p>The exponential sparsity bound in Section 1 is due to Reinhard Diestel (Proposition 7.2.1 in [3]). The \(O(r^2)\) sparsity bound in Section 2 is obtained by combining several ideas: Lemma 1 is due to Exercise 21, Chapter 7, in [3], Lemma 2 is from the proof of Theorem 1 in [1], and Lemma 3 is a modification of Lemma 3.5.4 in [3].</p>
<p>Mader [5] proved a sparsity bound of \(O(r\log r)\) for \(K_r\)-minor-free graphs. Kostochka [4] was the first to show that the sparsity is \(\Theta(r\sqrt{\log r})\). Thomason [6] provided a more refined range for the sparsity bound: \([0.265r\sqrt{\log_2 r}(1+o(1)), 0.268r\sqrt{\log_2 r}(1+o(1))]\). The bound was then tightened <strong>exactly</strong> to \((\alpha +o(1))r\sqrt{\ln(r)}\), where \(\alpha = 0.319…\) is an explicit constant, also by Thomason [7]. The simpler proof presented in Section 3 is due to Alon, Krivelevich, and Sudakov [1].</p>
<p>The lower bound in Section 4 is due to Bollobás, Catlin, and Erdös [2].</p>
<h1 id="references">References</h1>
<p>[1] Alon, N., Krivelevich, M., and Sudakov, B. (2022). <em>Complete minors and average degree–a short proof</em>. ArXiv preprint <a href="https://arxiv.org/abs/2202.08530">arXiv:2202.08530</a>.</p>
<p>[2] Bollobás, B., Catlin, P. A., and Erdös, P. (1980). <em>Hadwiger’s conjecture is true for almost every graph</em>. Eur. J. Comb., 1(3), 195-199.</p>
<p>[3] Diestel, R. (2017). Graph theory. Springer.</p>
<p>[4] Kostochka, A. V. (1982). <em>A lower bound for the Hadwiger number of a graph as a function of the average degree of its vertices</em>. Discret. Analyz. Novosibirsk, 38, 37-58.</p>
<p>[5] Mader, W. (1968). <em>Homomorphiesätze für graphen</em>. Mathematische Annalen, 178(2), 154-168.</p>
<p>[6] Thomason, A. (1984). <em>An extremal function for contractions of graphs</em>. In Mathematical Proceedings of the Cambridge Philosophical Society (Vol. 95, No. 2, pp. 261-265). Cambridge University Press.</p>
<p>[7] Thomason, A. (2001). <em>The extremal function for complete minors</em>. Journal of Combinatorial Theory, Series B, 81(2), 318-338.</p>