罗教授对于2019年RMM第三题的解析(英文原版)
作者 | 罗教授 编辑 | 罗数君
文 4570字 阅读时间约 9分钟
本文是罗教授对2019年RMM第三题的独家解析,因时间仓促目前只能提供英文版,在明后天的推送中会为大家带来翻译版本,敬请期待。
Problem 3. Given any positive real number ε, prove that for all but finitely many positive integers N, any simple graph on N vertices with at least (1+ε)N edges has two different simple cycles of equal length.
(A simple graph is a set V of vertices, together with a set E of edges, where each edge in E is a set of two vertices of V. A simple cycle of length K is a set C of K ≥ 3 distinct edges in E, such that there is a sequence of distinct vertices v1, v2, …, vK such that for each 1 ≤ i < K, {vi, vi+1} is in C, and {vk, v1} is in C.)
The official reference shows the original statement of this problem:
http://rmms.lbi.ro/rmm2019/index.php?id=problems_math
Here, I will explain the “sketch of solution 2” on the final page of that document which is attributed to me. Indeed, I thought this was the most interesting question on the entire paper, because it had research flavor. It is closely related to a field of active research, called extremal combinatorics, which is also my area of specialization. During the time that the national coaches were thinking about the exam problems, I focused on this one, and discovered an intuitive alternative solution, which is close to the best possible dependency between ε and the first N that works for that ε. This solution may look long, but that is because I am pointing out the intuition behind each step, instead of simply saying how it works.
Start by considering an arbitrary graph in which all simple cycles have different lengths, but which has at least (1+ε)N edges. We seek a contradiction for all sufficiently large N. Find the shortest cycle, and delete one of its edges. In this new graph, the shortest cycle is not the same as the previous shortest cycle, because one edge has been deleted. Since all cycle lengths are different, the shortest cycle is now strictly longer. Find it, and delete one of its edges. The length of the shortest cycle in the new graph is even larger. Therefore, after repeating this process εN/2 times, we will reach a graph which still has at least (1+(ε/2))N edges, but there are no cycles of length εN/2 or shorter. The technical term for this is the girth of a graph, which is the length of the shortest cycle in the graph. Therefore, it suffices to prove the following lemma (where we rescaled ε to produce a more elegant statement).
Lemma. Given any positive real number ε, prove that for all but finitely many positive integers N, any simple graph on N vertices with at least (1+ε)N edges has a cycle of length shorter than εN.
When I realized this, I knew I was not far from the solution, because cycles start to appear as soon as the number of edges rises to reach the number of vertices, and once there are many cycles, they start to intersect (creating more cycles), making it impossible to have only cycles of length at least linear in N. The rest of the solution may look very long, but it is actually a natural consequence of several intuitive steps:
(1) define a function to track the relationship between vertices, girth, and edges, (2) proceed by contradiction, using the minimum cycle length condition to find enormous trees in the graph, (3) deduce that the trees must have long paths without branching, or else they would use far too many vertices, (4) delete the long paths one by one, significantly reducing the total number of vertices at each stage while only increasing the difference between the numbers of edges and vertices by +1, and (5) show that the number of such stages is too small, contradicting the size of the assumed difference between numbers of edges and vertices.
We now proceed to prove the lemma in detail. The key idea is to use the following concept which is common in research combinatorics.
Definition. The excess of a graph is defined to be E − V + 1, where E is the number of edges and V is the number of vertices.
This is a useful definition because of the basic and well-known fact that it is possible for a graph with V vertices and E = V − 1 edges to have no cycles, but as soon as the number of edges reaches V, there is at least one cycle. In other words, there are graphs with excess 0 which have no cycles, but every graph with excess greater than zero definitely contains at least one cycle. When studying cycles in graphs, it is often more convenient to reason in terms of the excess rather than to reason in terms of the numbers of edges and vertices. Formulas are often much simpler. This motivates the definition of a very natural function, which is a reasonable function to study from a research perspective. (Indeed, when I reached this point, I felt that surely this precise function must already have been studied before by researchers, and that piqued my interest.)
Definition. Let F(N, g) denote the maximum excess of a graph which has N vertices and girth at least g (having no cycles of length less than g).
In terms of this definition, we can prove the Lemma by showing that for all sufficiently large N, we have F(N, εN) ≤ εN. The intuition is to upper-bound this evaluation of F in terms of evaluations of F with smaller parameters (recursive thinking).
So, consider an arbitrary graph G with N vertices, which has no cycles of length less than εN, but maximizes the excess subject to these conditions. Then G must have excess exactly F(N, εN). We consider several cases.
Case 1. Graph G has at least one vertex U of degree 1.
Delete vertex U, to obtain a new graph G’ with N − 1 vertices. It still has no cycles of length less than εN, because deleting an edge cannot create any cycles. Since U has degree 1, the excess of G’ is exactly the same as F(N, εN) (the excess of G). Yet the excess of G’ is less than or equal to F(N−1, εN), so we conclude that in this case: F(N, εN) ≤ F(N−1, εN).
Case 2. Graph G has no vertices of degree 1.
We now use the fact that there are no cycles of length less than εN, via the standard argument of considering an arbitrary vertex U of degree greater than 0, and considering the subgraph H of G induced by all vertices which are at distance up to (εN/2)−1 from U. Crucially, H is a tree, because if any two vertices Y and Z in H were adjacent, then it would be possible to walk from Y to U to Z and back to Y in less than or equal to ((εN/2)−1) + ((εN/2)−1) + 1 < εN steps, contradicting the fact that G has no cycles of length less than εN.
However, H is a very strange tree. It cannot have more than N vertices, because G has N vertices. Yet at the same time, it has a depth of about εN/2 from its root U, which is quite long. (For the remainder of this solution, my calculations will be off by lower order terms to simplify the presentation. For example, I will make calculations in terms of εN/2 instead of (εN/2)−1. This does not substantially change the nature of the asymptotic bounds.) Visualize the tree H exploding from its root U. There are no vertices of degree 1, and we will never reach a vertex of degree 0 because we are following along paths from U. Vertices of degree 3 and higher cause a chain reaction creating more and more branches, which will cause trouble because we cannot surpass N vertices. This suggests that as we travel away from U, much time is spent on long paths of degree 2, which do not create any further expansion. We want to show that there is a fairly long such path, so that we can use it to finish the problem another major step later.
So, let M be the maximum number of consecutive degree-2 vertices in a path in G. For each T = 0, 1, 2, 3, …, (εN/2)−1, think of the set ST of vertices which are exactly T steps away from U along the tree H. This looks like a wave expanding away from U. Since there are no vertices of degree 1, every vertex in ST either corresponds to exactly one vertex in ST+1 (if it had degree 2), or creates multiple vertices in ST+1 (if it had degree 3 or higher). This is similar to a bacterial multiplication process, in which you start at T=0 with one bacterium, and when moving to each new time step, each existing bacterium either remains or multiplies. Crucially, the time period between multiplications is less than or equal to (about) M. So, after T reaches (about) εN/2, we must finish with at least (about)
2^( (εN)/(2M) )
vertices in the final ST. This lower bound comes from considering the worst case scenario, in which every multiplication is simply from a degree-3 vertex (one edge coming from the direction of U, and two edges going in the direction of the next S), in which case every bacterium multiplies by exactly 2.
However, there are only N vertices altogether, and therefore we obtain the inequality:
N ≥ 2^( (εN)/(2M) )
Let lg() denote the base-2 logarithm. Then, we have:
lg(N) ≥ (εN)/(2M)
which implies:
M ≥ (εN)/(2 lg(N)),
i.e., that there is a path in G of at least (εN)/(2 lg(N)) many degree-2 vertices. Intuitively, this is rather large, because lg(N) grows very slowly compared to N.
Now consider the graph G’ obtained by deleting those M vertices and the M+1 edges incident to them. This new graph still has no cycles of length less than εN, and so its excess must be less than or equal to F(N−M, εN). At the same time, since exactly one more edge than vertex was deleted, the excess of G’ is exactly F(N, εN) − 1 (recall F(N, εN) was the excess of G). Therefore,
F(N, εN) − 1 ≤ F(N−M, εN)
which implies that
F(N, εN) ≤ F(N−M, εN) + 1,
with a particular value of M that is at least (εN)/(2 lg(N)). This completes Case 2.
To finish this proof, recall that the conclusion of Case 1 was that F(N, εN) ≤ F(N−1, εN). Observe that in both cases, we obtain an upper bound for F(N, εN) in terms of an evaluation of F with a smaller first parameter (number of vertices), possibly with a +1 at the end. We can repeat this argument enough times until the number of vertices shrinks down to a convenient value.
In particular, it is easy to see that for any K < L, we have F(K, L) = 0, because in any graph which has K < L vertices, if there are no cycles of length less than L, that means there are actually no cycles of any length at all, which means that the maximum excess is 0 (achieved by a tree).
So, F(N, εN) is actually less than or equal to the number of times we do the +1 (Case 2) step, as we reduce the first parameter (number of vertices) until that first parameter falls below εN, because that final F(N’, εN) with N’ < εN is actually 0. Each Case 2 step corresponds to reducing the first parameter by at least about (εN)/(2 lg(N)). This bound holds even in later iterations because the number N’ of vertices will shrink below N, but the girth condition will remain at εN, and conveniently the only change to the bound on M when N’ is used is M ≥ (εN)/(2 lg(N’)), which is still at least (εN)/(2 lg(N)). Therefore, within about (2 lg(N))/ε Case 2 steps, the first parameter would hit 0, implying that it would definitely fall below εN within (2 lg(N))/ε Case 2 steps. Each Case 2 step contributes a +1, so F(N, εN) ≤ (2 lg(N))/ε, which for sufficiently large N is much better than the required bound of F(N, εN) ≤ εN.
If this argument is run more carefully, it can even be used to prove that there is a function f(N) which has asymptotic order of magnitude sqrt(N log(N)), such that any simple graph on N vertices with at least N + f(N) edges has two different simple cycles of equal length.