For many random Constraint Satisfaction Problems (CSP), such as random graph coloring, random k-SAT, random Max k -sat, and hypergraph 2-coloring, by now there exist asymptotically tight estimates for the largest constraint density for which typical instances have solutions (see [5]). At the same time, all known efficient algorithms for each of these problems fare very poorly, i.e., they stop finding solutions at constraint densities much lower than those for which we can prove that solutions exist. Adding insult to injury, for each problem the best known algorithm asymptotically fares no better than some extremely naive algorithm.
For example, it has been known for nearly twenty years [10] that the following very simple algorithm will find a satisfying assignment of a random k-CNF formula with m=rn clauses for r=O(2k/k): if there is a unit clause satisfy it; otherwise assign a random value to a random unassigned variable. While it is known that random k CNF remain satisfiable for r=Θ(2k), no polynomial-time algorithm is known to find satisfying assignments for r=(2k/k)⋅ω(k) for some function ω(k)→∞.
Similarly, for all k≥3, the following algorithm [18][2] will k -color a random graph with average degree d≤klnk: select a random vertex with fewest available colors left and assign it a random available color. While it is known that random graphs remain k-colorable for d∼2klnk, no polynomial-time algorithm is known to k-color a random graph of average degree (1+ϵ)klnk for some fixed ϵ>0 and arbitrarily large k. Equivalently, it is trivial to color a random graph using twice as many colors as its chromatic number, but no polynomial-time algorithm is known that gets by with (2−ϵ)χ colors, for some fixed ϵ>0.
Random k -SAT and random graph coloring are not alone. In fact, for nearly every random CSP of interest, the known results establish an analogous state of the art:
There is a trivial upper bound on the largest constraint density for which solutions exist.
There is a non-constructive proof, usually via the second moment method, that the bound from (1) is essentially tight, i.e., that solutions do exist for densities nearly as high as the trivial upper bound.
A simple algorithm finds solutions up to a constraint density much below the one from (2).
No polynomial-time algorithm is known to succeed for a density asymptotically greater than that in (3).
In this paper we prove that this is not a coincidence. Namely, for random graph coloring, random k-SAT, and random hypergraph 2-coloring, we prove that the point where all known algorithms stop is the point where the geometry of the space of solutions undergoes a dramatic change. This is known as a “dynamical” phase transition in statistical physics and our results establish rigorously for random CSPs a large part of the “I-step Replica Symmetry Breaking” hypothesis [20]. Roughly speaking, this hypothesis asserts that while the set of solutions for low densities looks like a giant ball, at some critical point this ball shatters into exponentially many pieces that are far apart from one another and separated by huge “energy barriers”, like an error-correcting code. Algorithms (even extremely simple ones) have no problem finding solutions in the “ball” regime, but no algorithm is known to find solutions in the “error-correcting code” regime.
We believe that the presence of dynamical phase transitions in random CSPs is a very general phenomenon whose qualitative characteristics should be problem-independent, i.e., universal. The fact that we can establish the exact same qualitative picture for a problem with binary constraints over k-ary variables (random graph k-coloring) and a problem with k-ary constraints over binary variables (hypergraph 2-colorability) lends support to this notion.
Perhaps the following is an intuitive model of how a dynamical phase transition comes about. In random graph coloring, rather than thinking of the number of available colors as fixed and the constraint density (number of edges) as in-creasing, imagine that we keep the constraint density fixed, but we keep decreasing the number of available colors. If we start with q available colors where q≫χ, it is reasonable to imagine that the set of valid (q -colorings, viewed as a subset of {1,2,…,q}n, has a nice “round” shape, the rounder the greater q is relative to χ. By the same token, when we restrict our attention to the set of those q-colorings that only use colors {1,2,…,q−1}, we are taking a “slice’ of the set of q colorings. With each slicing the connectivity of the set at hands deteriorates, until at some point the set shatters. As an analogy, slicing the 2-dimensional unit sphere through the origin yields a circle, but slicing the circle, yields a pair of points.
Having said the above, we wish to emphasize that determining the location of the dynamical phase transition of a given CSP requires non-trivial, problem-specific ideas and computations. In this paper we do this for the three problems mentioned above, allowing us to demonstrate that the transition coincides with the demise of algorithms.
We conclude the introduction with a few words about the technical foundation for our work. To prove the existence (and determine the location) of a dynamical phase transition one needs to access the uniform measure over the solutions of random CSP instances. A geometric way of thinking about this is as follows. Given a CSP instance, say a random k CNF formula with m clauses over n variables, consider the function H on {0,1}n that assigns to each truth assignment the number of clauses it violates. In this manner, H defines a “landscape” in which satisfying assignments correspond to (valleys at) sea-level. Understanding statistical properties of the uniform measure over solutions entails understanding “the view” one enjoys from the bottom of a random such valley, a probabilistically formidable task.
As we discuss in Section 4, we establish the following: if the number of solutions of a random CSP is sufficiently concentrated around its exponentially large expectation, then the view from a random sea-level valley is “the same” as the view from an “artificial” sea-level valley. That is, in terms of our random k: -CNF formula example, from the valley created by first selecting a random σ∈{0,1}n and then forming a random k CNF formula with m clauses chosen uniformly among the clauses satisfied by σ, i.e., the view from the planted satisfying assignment. This is a much easier view to understand and we believe that the “transfer” theorems we establish in this paper will significantly aid in the analysis of random CSPs in general.
SECTION 2
Statement of Results
To present our results in a uniform manner we need to introduce some common notions. Let V be a set of n variables, all with the same domain D, and let C be an arbitrary set of constraints over the variables in V. A CSP instance is a subset of C. We let dist (σ,τ) denote the Hamming distance between σ,τ∈Dn and we turn Dn into a graph by saying that σ,τ are adjacent if dist (σ,τ)=1. For a given instance I, we let H=HI: Dn→N be the function counting the number of constraints of I violated by each σ∈Dn.
Definition 1
We say that σ∈Dn is a solution of an instance I) if H(σ)=0, We will denote by S(I) the set of all solutions of an instance I. The clusters of an instance I are the connected components of S(I) A region is a non-empty union of clusters. The height of a path σ0,σ1,…,σt∈Dn is maxiH(σi).
We will be interested in distributions of CSP instances as the number of variables n grows. The set C=Cn will typically consist of all possible constraints of a certain type, e.g., the set of all (nk) possible hyperedges in the problem of 2-coloring random k-uniform hypegraphs. We let In,m denote the set of all CSP instances k with precisely m distinct constraints from Cn and we let Tn,m denote the uniform distribution on the set of all instances In,m. We will say that a sequence of events En holds with high probability (w.h.p.) if limn→∞Pr[En]=1 and with uniformly positive probability (w.u.p.p.) if liminfn→∞Pr[En]>0. As per standard practice in the study of random structures, we will take the liberty of writing In,m to also denote the underlying random variable and, thus, write things like “The probability that S(In,m)…” ‘
2.1 Shattering
Definition 2
We sav that the set of solutions of In,m shat- ters if there exist constants β,γ,ζ,θ>0 such that w.h.p. S(In,m) can be partitioned into regions so that:
The number of regions is at least eβn.
Each region contains at most an e−γn fraction of all solutions.
The Hamming distance between any two regions is at least ζn.
Every path between vertices in distinct regions has height at least θn.
Our first main result asserts that the space of solutions for random graph coloring, random k-SAT, and random hypergraph 2-colorability shatters and that this shattering occurs just above the largest density for which any polynomial-time algorithm is known to find solutions for the corresponding problem. Moreover, we prove that the space remains shattered until, essentially, the CSP's satisfiability threshold. More precisely:
A random graph with average degree d, i.e., m=dn/2, is w.h p, k-colorable for d≤(2−γk)klnk, where γk→0. The best rigorously analyzed poly-time k-coloring algorithm w.h. p. fails for d≥(1+δk)klnk, where δk→0.
Theorem 1
There exists a sequence ϵk→0, such that the space of k-colorings of a random graph with average degree d shatters for all
(1+ϵk)klnk≤d≤(2−ϵk)klnk.(1)
View Source
(1+\epsilon_{k})k\ln k\leq d\leq(2-\epsilon_{k})k\ln k.
\eqno{\hbox{(1)}}
A random k -CNF formula with n variables and rn clauses is w.h.p. satisfiable for r≤2kln2−k. The best rigorously analyzed poly-time satisfiability algorithm w.h.p. fails for r>2k+1/k. In [23], non-rigorous, but mathematically sophisticated evidence is given that a different algorithm succeeds for r=Θ((2k/k)lnk) but not higher.
Theorem 2
There exists a sequence ϵk→0 such that the space of satisfying assignments of a random k-CNF formula with rn clauses shatters for all
(1+ϵk)2kklnk≤r≤(1−ϵk)2kln2.(2)
View Source
(1+\epsilon_{k}){2^{k}\over k}\ln k\leq r\leq(1-\epsilon_{k})2^{k}\ln 2.
\eqno{\hbox{(2)}}
A random k-uniform hypergraph with n variables and rn edges is w.h.p. 2-colorable for r≤2k−1ln2−32. The best rigorously analyzed poly-time 2-coloring algorithm w.h.p. fails for r>2k/k. In [23], non-rigorous, but mathematically sophisticated evidence is given that a different algorithm succeeds for r=Θ((2k/k)lnk), but not higher.
Theorem 3
There exists a sequence ϵk→0 such that the space of 2-colorings of a random k-uniform hypergraph with rn edges shatters for all
(1+ϵk)2k−1klnk≤r≤(1−ϵk)2k−1ln2.(3)
View Source
(1+\epsilon_{k}){2^{k-1}\over k}\ln k\leq r\leq(1-\epsilon_{k})2^{k-1}\ln 2.
\eqno{\hbox{(3)}}
2.2 Rigidity
The regions mentioned in Theorems 1, 2, 3 can be thought of as forming an error-correcting code in the solution-space of each problem. To make this precise we need to introduce the following definition and formalize the notion of “a random solution of a random instance”.
Definition 3
Given an instance I, a solution σ∈S(I), and a variable v∈V., we say that v in (I,σ):
Is f(n) -rigid, if every τ∈S(I) such that τ(v)≠σ(v) has dist (σ,τ)≥f(n).
Is f(n) -loose, iffor every j∈D, there exists τ∈S(I) such that τ(v)=j and dist (σ,τ)≤f(n).
We will prove that in a typical solution, while before the phase transition every variable is loose, after the phase transition nearly every variable is rigid. To formalize the notion of a random/typical solution, recall that In,m denotes the set of all instances with m constraints over n variables and let Λ=Λn,m denote the set of all instance-solution pairs, i.e., Λn,m={(I,σ):I∈In,m,σ∈S(I)}. We let U=Un,m be the probability distribution induced on Λn,m by the following experiment:
Choose an instance I∈In,m uniformly at random. If S(I)≠∅, select σ∈S(I) uniformly at random.
We will refer to instance-solution pairs generated according to Un,m as uniform instance-solution pairs. We note that although the definition of uniform pairs allows for S(I) to be typically empty, i.e., to be in the typically unsatisfiable regime, we will employ the definition for constraint densities such that w.h.p. S(I) contains exponentially many solutions. Hence, our liberty in using the term a “typical” solution. At the same time, we emphasize that Un,m is in general not the uniform distribution over Λn,m.
Theorem 4
Let (I,σ) be a uniform instance-solution pair where:
I is a graph with dn/2 edges, where d is as in (1), and σ is a k-coloring of v I or
I is a k-CNF formula with rn clauses, where r is as in (2), and σ is a satisfying assignment o I, or,
I 's a k-uniform hypergraph with rn edges, where r is as in (3), and σ is a 2-coloring of>
W.h.p. the number of Ω(n)-rigid variables in (I,σ) is at least γkn, for some sequence γk→1.
The picture drawn by Theorem 4, whereby nearly all variables are rigid in typical solutions above the dynamical phase transition, is in sharp contrast with our results for densities below the transition for graph coloring and hypergraph 2-colorability. While we believe that an analogous picture holds for k -SAT, see Conjecture 1, for technical reasons we cannot establish this presently. (We discuss the general additional difficulties imposed by random k-SAT in Section 4.)
Theorem 5
Let (I,σ) be a uniform instance-solution pair where:
I is a graph with dn/2 edges, where d≤(1−ϵk)klnk, and σ is a k-coloring of v I or
I is a k-uniform hypergraph with rn edges, where r≤(1−ϵk)(2k−1/k)lnk, and σ is a 2-coloring o I.
There exists a sequence ϵk→0 such that w.h.p. every variable in (I,σ) is o(n)-loose.
We note that in fact, for all d and r as in Theorem 5, w.u.p.p. (I,σ) is such that setting the color of any vertex to any color only requires changing the color of O(logn) other vertices.
Conjecture 1
Let (I,σ) be a uniform instance-solution pair where I is a k-CNF formula with rn clauses, where r≤(1−ϵk)(2k/k)lnk, and σ is a satisfying assignment of I. There exists a sequence ϵk→0 such that w.h.p. every variable in (I,σ) is o(n)-loose.
SECTION 3
Background and Related Work
3.1 Algorithms
Attempts for a “quick improvement” upon either of the naive algorithms mentioned in the introduction for satisfi-ability/graph coloring stumble upon the following general fact. Given a CSP instance, consider the bipartite graph in which every variable is adjacent to precisely those constraints in which it appears, known as the factor graph of the instance. For random formulas/graphs, factor graphs are locally tree-like, i.e., for any arbitrarily large constant D, the depth D neighborhood of a random vertex is a tree w.h p, In other words, locally, random CSPs are trivial, e.g., random graphs of any finite average degree are locally 2-colorable. Moreover, as the constraint density is increased, the factor graphs of random CSPs get closer and closer to being biregular, so that degree information is not useful either. Combined, these two facts render all known algorithms impotent, i.e., as the density is increased, their asymptotic performance matches that of trivial algorithms.
In [22], Mezard, Parisi, and Zecchina proposed a new satisfiability algorithm called Survey Propagation (SP) which performs extremely well experimentally on instances of random 3-SAT. This was very surprising at the time and allowed for optimism that, perhaps, random k -SAT instances might not be so hard. Later, SP was extended to other problems, e.g., k-coloring [9] and Max k -SAT [8]. An experimental evaluation of SP for values of k even as small as 5 or 6 is already somewhat problematic, but to the extent it is reliable it strongly suggests that SP does not find solutions for densities as high as those for which solutions are known to exist. The problem seems to lie with the “dec-imation” aspect of the algorithm, i.e., the process of repeatedly selecting (and setting permanently) the most “biased” variables in the formula (we comment on this further be-low). Perhaps more importantly, it can be shown that for densities at least as high as 2kln2−k, if SP can succeed at its main task (approximating the marginal probability distribution of the variables with respect to the uniform measure over an approximation of the cluster projections), then so can a much simpler algorithm, namely Belief Propagation (BP), i.e., dynamic programming on trees (for approximating the marginal probability distribution of the variables with respect to the uniform measure over satisfying assign-ments).
The trouble is that to use either BP or SP to find satisfying assignments one sets variables iteratively. So, even if it is possible to compute approximately correct marginals at the beginning of the execution (for the entire formula), this can stop being the case after some variables are set. Concretely, in [23], Montanari et al. showed that (even within the relatively generous assumptions of statistical physics computations) the following Gibbs-sampling algorithm fails above density e(2k/k), i.e., step 2 below fails to converge after only a small fraction of all variables have been assigned a value:
Select a variable v at random.
Compute the marginal distribution of v using Belief Propagation.
Set v to {0,1} according to the computed marginal distribution; simplify the formula; go to step 1.
3.2 Relating the Uniform and the Planted Model
The idea of deterministically embedding a property inside a random structure is very old and, in general, the process of doing this is referred to as “planting” the property. In our case, we plant a solution σ in a random CSP by only including constraints compatible with σ. Juels and Peinado [19] were perhaps the first to explore the relationship between the planted and the uniform model and they did so for the clique problem in dense random graphs Gn,1/2, i.e., where each edge appears independently with probability 1/2. They showed that the distribution resulting from first choosing G=Gn,1/2 and then planting a clique of size (1+ϵ)log2n is very close to Gn,1/2 and suggested this as a scheme to obtain a one-way-function. Since the planted clique has size only (1 +ε)log2n, the basic argument in [19] is closely related to subgraph counting. In contrast, the objects under consideration in our work (k-colorings, satisfying assignments, etc.) have an immediate impact on the global structure of the combinatorial object being considered, rather than just being local features, such as a clique on O(logn) vertices.
Coja-Oghlan, Krivelevich, and Vilenchik [12][13] proved that for constraint densities well above the threshold for the existence of solutions, the planted model for k-coloring and k SAT is equivalent to the uniform distribution conditional on the (exponentially unlikely) existence of at least one solution. In this conditional distribution as well as in the high-density planted model, the geometry of the solution space is very simple, as there is precisely one cluster of solutions, in stark contrast with the regime we analyze.
3.3 Solution-space Geometry
In [7][21] the first steps were made towards understanding the solution-space geometry of random k: -CNF formulas by proving the existence of shattering and the presence of rigid variables for r=Θ(2k). This was a far cry from the true r∼(2k/k)lnk threshold for the onset of both phenomena, as we establish here. Besides the quantitative aspect, there is also a fundamentally important difference in the methods employed in [7][21] vs. those employed here. In those works, properties such as the existence of frozen variables were shown to hold for all satisfying assignments and were correspondingly established by taking a union bound over all satisfying assignments. It is not hard to show that the derived results are best possible using those methods and, in fact, there is good reason to believe that the results are genuinely tight, i.e., that for densities o(2k) the derived properties simply do not hold for all satisfying assignments. Here, we instead establish a systematic connection between the planted model and random solutions of random instances. This argument allows us to analyze “typical” solutions while allowing for the possibility that a (relatively small, though exponential) number of “atypical” solutions exist. Therefore, we are for the first time in a position to analyze the extremely complex energy landscape of below-threshold instances of random CSPs, and to estimate quantities that appeared completely out of reach prior to this work.
SECTION 4
Our Point of Departure: Symmetry, Randomness and Inversion
As mentioned, the results in this paper are enabled by a set of technical lemmas that allow one to reduce the study of “random solutions of random CSP instances” to the study of “planted CSP solutions”. The conceptual origin of these lemmas can be traced to the following humble observation.
Let M be an arbitrary 0–1 matrix with the property that all its rows have the same number of 1 s and all its columns have the same the number of Is. A moment's reflection makes it clear that for such a matrix, both of the following methods select a uniformly random 1 from the entire matrix (and are therefore equivalent):
Select a uniformly random column and then a uniformly random 1 in that column.
Select a uniformly random row and then a uniformly random 1 in that row.
An example of how we employ this fact for random CSPs is as follows. Let F be the set of all k CNF formulas with n variables and m distinct clauses (chosen among all 2k(nk) possible k clauses). Say that σ∈{0,1}n NAE-satisfies a formula F∈F if under σ, every clause of F has at least one satisfied and at least one falsified literal. Let M be the 2n×|F| matrix where Mσ,F=1 iff σ∈{0,1}n NAE-satisfies F. By the symmetry o F, it is clear that all rows of M have the same number of 1s. Imagine, for a moment, that the same was true for all columns. Then, a uniformly random solution of a uniformly random instance would be distributed exactly as a “planted” instance-solution pair: first select σ∈{0,1}n uniformly at random; then select m distinct clauses uniformly at random among all 2k−1(nk) clauses NAE-satisfied by σ.
Our contribution begins with the realization that exact rowand column-balance is not necessary. Rather, it is enough for the 1 s in M to be “well-spread”. More precisely, it is enough that the marginal distribution induced on the rows of M by selecting a uniformly random 1 from the entire matrix to be “reasonably close to” uniform, and that exactly the same holds for the columns of M. For example, assume we can prove that Ω(|F|) columns of M have Θ(f(n)) 1s, where f(n) is the average number of Is per column in the entire matrix. Indeed, this is precisely the kind of property implied by the success of the second moment method for random NAE-k-SAT [3]. Under this assumption, proving that a property holds w.u.p.p. for a uniformly random solution of a uniformly random instance, reduces to proving that it holds w.h.p. for the planted solution of a planted instance, a dramatically simpler task.
There is a geometric intuition behind our transfer theorems and it is more conveniently described when every constraint is included independently with the same probability p=m/(2k(nk)). For all k≥3 and m=rn, it was shown in [3] that the resulting NAE-k-SAT instances W.U. p.p. have exponentially many solutions for r≤2k−1ln2−3/2. Consider now the following way of generating planted NAE k-SAT instances. First, select a formula F by including each clause with probability p, exactly as above. Then, select σ∈{0,1}n uniformly at random and remove from F all constraints violated by σ. Call the resulting instance F′. Our results say that as long as q≡r(1−2−k+1)≤2k−1ln2−3/2, the instance F′ is “nearly indistinguish-able” from a uniform instance created by including each clause with probability q. (We will make this statement precise shortly.)
To see how this happens, recall the function H:σ→N counting the number of violated constraints under each assignment. Clearly, selecting F specifies such a function HF, while selecting σ∈{0,1}n and removing all constraints violated by σ amounts to modifying HF so that HF(σ)=0. One can imagine that such a modification creates a gradient in the vicinity of σ, a “crater” with σ at its bottom. What we prove is that as long as HF already had an exponential number of craters and the number of craters is concentrated, adding one more crater does not make a big difference. Of course, if the density is increased further, the opened crater becomes increasingly obvious, as it takes a larger and larger cone to get from the typical values of HF down to O. This observation also relates to the ease with which algorithms solve planted instances of high density.
To prove our transfer theorems we instantiate the above idea for random graph k-coloring, random k-uniform hypergraph 2-coloring, and random k-SAT. A crucial step for this is deriving a lower bound on the number of solutions of a random instance. For example, in the case of random graph k-coloring, we prove that the number of k-colorings, |S(In,m)|, for a random graph with n vertices and m edges is “concentrated” around its expectation in the sense that w.h.p.n−1|ln|S(In,m)|−lnE|S(In,m)∥=o(1).(4)
View Source
n^{-1}\vert \ln\vert {\cal S}(I_{n, m})\vert -\ln {\bf E}\vert {\cal
S}(I_{n, m})\Vert =o(1).
\eqno{\hbox{(4)}}
To prove this, we use the upper bound on the second moment E[|S(In,m)|2] from [4] to show that w.u.p.p. |S(In,m)|=Ω(E|S(In,m)|). Then, we perform a sharp threshold analysis using theorems of Friedgut [15], to prove that (4) holds, in fact, with high probability. A similar approach applies to hypegraph 2-coloring.
The situation for random k SAT is more involved. In-deed, we can prove that the number of satisfying assignments is not concentrated around its expectation in the sense of (4). This problem is mirrored by the fact that the second moment of the number of satisfying assignments exceeds the square of the first moment by an exponential factor (for any constraint density). Nonetheless, letting Fk(n,m) denote a uniformly random k-CNF formula with n variables and m clauses, combining techniques from [6] with a sharp threshold analysis, we can derive a lower bound on the number of satisfying assignments that holds w.h.p., namely n−1ln|S(Fk(n,m))|≥n−1lnE|S(Fk(n,m))|−ϕ(k), where ϕ(k)→0 exponentially with k. This estimates allows us to approximate the uniform model by the planted model sufficiently well in order to establish Theorems 2 and 4.
In this section we give proof sketches of our results for k-coloring to offer a feel of the transfer theorems and of the style of the arguments one can employ given those the-orems. The proofs for hypergraph 2-coloring are relatively similar, as it is also a “symmetric” CSP and the second moment methods works on its number of solutions. For k-SAT, though, a significant amount of additional work is needed, as properties must be established with exponentially small error probability to overcome the large deviations in the number of satisfying assignments.
5.1 The Transfer for Random Graph Coloring
We consider a fixed number ε>0 and assume that k≥k0 for some sufficiently large k0=k0(ε). We denote { 1, …,k} as [k]. We are interested in the probability distribution Un,m on Λn,m resulting from first choosing a random graph G=G(n,m) and then a random k-coloring of G (if one exists). To analyze this distribution, we consider the distribution Pn,m on Λn,m induced by following expermient.
Generate a uniformly random k-partition σ∈[k]n.
Generate a graph G with m edges chosen uniformly at random among the edges bicolored under σ.
Output the pair (G,σ).
The distribution Pn,m is known as the planted model.
Theorem 6
Suppose that d=2m/n≤(2−ε)klnk. There exists a function f(n)=o(n) such that the following is true. Let D be any graph property such that G(n,m) has D with probability 1−o(1), and let E be any property of pairs (G,σ)∈Λn,m. If for all sufficiently large n, PrPnm[(G,σ) has E|G has D]≥1−exp(−f(n)),(5)
View Source
{\Pr}_{P_{nm}}[(G, \sigma)\ { has}\ {\cal E}\vert G\ { has}\ {\cal D}]\geq 1-\exp(-f(n)), \eqno{\hbox{(5)}}
then PrUn,m[(G,σ) has E]=1−o(1).
5.2 Loose Variables Below the Transition
Suppose that d≤(1−ε)klnk. Recall that a graph with vertex set V is said to be ζ -choosable if for any assignment of color lists of length at least ζ to the elements of V, there is a proper coloring in which every vertex receives a color from its list. To prove Theorem 5, we consider the property E that all vertices are o(n) -loose and the following condition D:
For any set S⊂V of size |S|≤g(n) the subgraph induced on S is 3-choosable. Here g(n) is some function such that f(n)≪g(n)=o(n), where f(n) is the function from Theorem 6. A standard argument shows that a random graph G(n,m) with m=O(n) satisfies D w.h.p.
By Theorem 6, we are thus left to establish (5). Let σ∈[k]n be a uniformly random k-partition, and let G be a random graph with m edges such that σ is a k-coloring of G. Since σ is uniformly random, we may assume that the color classes Vi=σ−1(i) satisfy |Vi|∼n/k. Let v0∈V be any vertex, and let l≠σ(v0) be the “target color” for v0. Our goal is to find a coloring τ such that τ(v0)=l and dist (σ,τ)≤g(n).
If v0 has no neighbor in V1, then we can just assign color l to v0. Otherwise, we run the following process. In the course of the process, every vertex is either awake, dead, or asleep. Initially, all the neighbors of v0 in Vl are awake, v0 is dead, and all other vertices are asleep. In each step of the process, pick an awake vertex w arbitrarily and declare it dead (if there is no awake vertex, the process terminates). If there are at least four colors c1(w),…,c4(w) available such that w has no asleep neighbor in Vci(w), then we do nothing. Otherwise, we pick four colors c1(w),…,c4(w) randomly and declare all asleep neighbors of w in Vcj(w) awake for 1≤j≤4.
Lemma I
With probability at least 1−exp(−f(n)) there are at most g(n) dead vertices when the process terminates.
Proof sketch. We relate the above process to a subcritical branching process. To this end, we bound the expected “off-spring” (i.e., number of new awake vertices) generated in any step of the process. Suppose that in some step of the process an awake vertex w is chosen. To bound the expected offspring, we basically observe that when d<(1−ε)klnk it is very likely that a vertex w has four immediately available colors, and thus no offspring will be generated at all. More precisely, the number of neighbors of w in any class Vi with i≠σ(w) is asymptotically Poisson with mean 2m(k−1)n≤(1−ε)klnkk−1.
View Source
{2m\over (k-1)n}\leq(1-\varepsilon){k\ln k\over k-1}.
Hence, the probability that w does not have a neighbor in Vi is asymptotically equal to πk=exp(−2m(k−1)n)≥kε/2−1
View Source
\pi_{k}=\exp\left(-{2m\over (k-1)n}\right)\geq k^{\varepsilon/2-1}
(for sufficiently large k). As there are k colors in total, the expected number of colors i≠σ(w) such that w has no neighbor in Vi is asymptotically Poisson with mean (k−1)πk≥kε/3. Consequently, the probability that w has at least than four available colors is at least 1−Pr[Poisson(kε/3)<4]≥1−exp(−kε/4),(6)
View Source
1-\Pr\Big[{\rm Poisson}(k^{\varepsilon/3}) < 4\Big]\geq 1-\exp(-k^{\varepsilon/4}), \eqno{\hbox{(6)}}
and in this case w does not produce any offspring at all. If w has fewer than four available colors, then the number of neighbors of w in a randomly chosen color class is stochastically dominated by a Poisson distribution with mean klnk conditioned on being at least one. Hence, in this case we can bound the expected number of newly awake vertices by 4⋅2klnk. Thus, (6) entails that the expected number of offspring vertices is at most 8klnk⋅exp(−kε/4)<1.
As a consequence, the total number of dead vertices at the end of the process is dominated by the total number of offspring generated by a branching process with rate at most 8klnk⋅exp(−kε/4)<1. Therefore, standard tail bounds for branching processes imply the assertion. □
To obtain a new coloring τ in which v0 takes color l we consider the set D of all dead vertices. We let τ(u)=σ(u) for all u∈V∖D. Moreover, conditioning on the event D, we can assign color l to v0 and a color from the list {c1(w),…,c5(w)}∖{l} to each w∈D∖{v0}. Thus, the new coloring τ differs from σ on at most |D|≤g(n)=o(n) vertices.
5.3 Rigid Variables Above the Transition
Suppose that d≥(1+ε)klnk. To prove Theorem 4 for coloring we apply Theorem 6 as follows. We let α,β>0 be sufficiently small numbers and denote by E the following property of a pair (G,σ)∈Λn,m:There is a subgraph G∗⊂ G of size |V(G∗)|≥(1 −α)n such that for every vertex v of G∗ andeach color i≠σ(v) there are at least βlnk verticesw in G∗ that are adjacent → v such that σ(w)=i.(7)
View Source
\eqalignno{&{\rm There\ is\ a\ subgraph}\ G_{\ast}\subset\ G\ {\rm
of\ size}\ \vert V(G_{\ast})\vert \geq \cr
&(1\ -\alpha)n\ {\rm such\ that\ for\ every\ vertex}\ v\ {\rm of}\
G_{\ast}\ {\rm and}\cr
&{\rm each\ color}\ i\neq\sigma(v)\ {\rm there\ are\ at\ least}\
\beta\ln k\ {\rm vertices}&\hbox{(7)}\cr
&w\ {\rm in}\ G_{\ast}\ {\rm that\ are\ adjacent\ \to}\ v\ {\rm such\ that}\ \sigma(w)=i.}
Also, we let D be the property that the maximum degree is at most (lnn)2.
Lemma 2
Condition (5) holds for D and E as above.
Proof sketch. Let (G,σ)∈Λn,m be a random pair chosen from the distribution Pn,m. We may assume that |σ−1(i)|∼n/k for all i. To obtain the graph G∗, we perform a “stripping process”. As a first step, we obtain a subgraph H by removing from G all vertices that have fewer than γlnk neighbors in any color class other than their own. If γ=γ(ε) is sufficiently small, then the expected number of vertices removed in this way is less than nk−δ for some fixed δ>0, because for each vertex w the expected number of neighbors in another color class is asymptotically Poisson with mean at least (1+ε)lnk. Then, we keep removing vertices from H that have “a lot” of neighbors outside of H. Given the event D, we can show that with probabiltiy 1−exp(−Ω(n)) the final result of this process is a subgraph G∗ that satisfies (7). □
Furthermore, the following lemma follows from a standard “first moment” argument.
Lemma 3
W.h.p. the random graph G=Gn,m _. has thetol-
There is no set S of vertices of size |S|≤n/(klnk)such that S spans at least 12|S|βlnk edges.(8)
View Source
\eqalignno{&{ There\ is\ no\ set}\ S\ { of\ vertices\ of\ size}\ \vert S\vert \leq n/(k\ln k)\cr
&{ such\ that}\ S\ { spans\ at\ least}\ {1\over 2}\vert S\vert \beta\ln k\ { edges}.
&\hbox{(8)}}
To complete the proof of Theorem 4 for coloring, consider a pair (G,σ) chosen from Un,m. By Lemma 2 and
Theorem 6 there is a subgraph G∗ as in (7) w.h’ p ‘Moreover, by Lemma 3 we may assume that G satisfies (8). Now, assume for contradiction that G has a k-coloring τ≠σ such that the set U={v∈G∗:σ(v)≠τ(v)} has size |U|≤n/(klnk). Let U+i={v∈G∗:τ(v)=i≠σ(v)},U−i={v∈G∗:σ(v)=i≠τ(v)}
View Source
\eqalignno{&U_{i}^{+} = \{v\in G_{\ast}:\tau(v)=i\neq\sigma(v)\},\cr
&U_{i}^{-} = \{v\in G_{\ast}:\sigma(v)=i\neq\tau(v)\}}
for 1≤i≤k. Then |U|=∑i=1k|U+i|=∑i=1k|U−i|.(9)
View Source
\vert U\vert =\displaystyle\sum_{i=1}^{k}\vert U_{i}^{+}\vert =\sum_{i=1}^{k}\vert U_{i}^{-}\vert.
\eqno{\hbox{(9)}}
Every vertex v∈G∗∖σ−1(i) has at least βlnk neighbors in G∗∩σ−1(i). Hence, if v∈U+i, then all of these neighbors lie inside of U−i. We claim that this implies that |U+i|<|U−i|; for assume that U+i≥U−i. Set S=U+i∪U−i. Then |S|≤|U|≤n/(klnk), and S spans at least |S|β2lnk edges, in contradiction to (8). Thus, we conclude that |U+i|<|U−i| for all i, in contradiction to (9). Hence, all the vertices in G∗ are nklnk -rigid.
5.4 Proof of Theorem 1
Theorem 1 concerns the “view” from a random coloring σ Of G(n,m). Basically, our goal is to show that only a tiny fraction of all possible colorings are “visible” from σ, i.e., σ lives in a small, isolated valley. To establish the theorem, we need a way to measure how “close” two colorings σ,τ are. The Hamming distance is inappropriate here because two colorings σ,τ can be at Hamming distance n, although τ simply results from permuting the color classes of σ, i.e., although σ and τ are essentially identical. Instead, we shall use the following concept. Given two coloring σ,τ, we let Mσ,τ=(Mijσ,τ)1≤i,j≤k be the matrix with entries Mijσ,τ=n−1|σ−1(i)∩τ−1(j)|.
View Source
M_{\sigma,\tau}^{ij}=n^{-1}\vert \sigma^{-1}(i)\cap\tau^{-1}(j)\vert.
To measure how close τ is to σ we let fσ(τ)=∥Mσ,τ∥2F=∑i,j=1k(Mijσ,τ)2
View Source
f_{\sigma}(\tau)=\Vert
M_{\sigma,\tau}\Vert_{F}^{2}=\displaystyle\sum_{i, j=1}^{k}(M_{\sigma,\tau}^{ij})
^{2}\,
be the squared Frobenius norm of Mσ,τ. Observe that this quantity reflects the probability that a single random edge is monochromatic under both σ and τ, i.e., the correlation of σ and τ, precisely as desired. Hence, fσ is a map from the set [k]n of k-partitions to the interval [k−2,fσ(σ)], where fσ(σ)≥k−1. Thus, the larger fσ(τ), the more τ resembles σ. Furthermore, for a fixed σ∈S(G) and a number λ>0 we let gσ,G,λ(x)=|{τ∈[k]n:fσ(τ)=x∧H(τ)≤λn}|.
View Source
g_{\sigma, G,\lambda}(x)=\vert \{\tau\in[k]^{n}:f_{\sigma}(\tau)=x\wedge H(\tau)\leq\lambda n\}\vert.
lowing property.
In order to show that S(Gn,m) with m=dn/2 decomposes into exponentially many regions, we employ the following lemma.
Lemma 4
Suppose that d>(1+εk)klnk. There are num-bers k−2<y1<y2<k−1 and λ,γ>0 such that with high probability a pair (G,σ)∈Λn,m chosen from the distributoin Un,m has the following two properties.
For all x∈[y1,y2] we have gσ,G,λ(x)=0.
The number of colorings τ∈S(G) such that fσ(τ)>y2 is at most exp(−γn)⋅|S(G)|.
Let G=Gn,m be a random graph and call σ∈S(G) good if both (1) and (2) hold. Then Lemma 4 states that w.h. p • a 1 −o(1) fraction of all σ∈S(G) are good. Hence, to decompose S(G) into regions, we proceed as follows. For each a∈S(G) we let Cσ={τ∈S(G):fσ(τ)>y2}. Then starting with the set S=S(G) and removing iteratively some Cσ for a good σ∈S yields an exponential number of regions. Furthermore, each such region Cσ is separated by a linear Hamming distance from the set S(G)∖Cσ. This is because fσ is “continuous” with respect to n−1× Hamming distance: for any ξ>0 there is η>0 such that for any two colorings τ,τ′ with dist (τ,τ′)<ηn we have |fσ(τ)−fσ(τ′)|<ξ. Thus, Theorem 1 follows from Lemma 4.
Finally, by Theorem 6, to prove Lemma 4 it is sufficient to show the following.
Lemma5
Suppose that d>(1+εk)klnh. There are k−2<y1<y2<k−1 and λ,γ>0 such that with probability at least 1−exp(−Ω(n)) a pair (G,σ)∈Λn,m chosen from the distributoin Pn,m has the two properties stated in Lemma 4.
The proof of Lemma 5 is based on the “first moment method”. That is, for any k−2<y<k−1 we compute the expected number of assignments τ∈[k]n such that fσ(τ)=y and H(τ)≤λn. This computation is feasible in the planted model and yields similar expressions as encountered in [4] in the course of computing the second moment of the number of k-colorings. Therefore, we can show that the expected number of such assignments τ is exponentially small for a regime y1<y<y2, whence Lemma 5 follows from Markov's inequality.