Hardness from Densest Subgraph
Conjectures
Yonatan Naamad
A Dissertation
Presented to the Faculty
of Princeton University
in Candidacy for the Degree
of Doctor of Philosophy
Recommended for Acceptance
by the Department of
Computer Science
Adviser: Moses Samson Charikar
November 2017
© Copyright by Yonatan Naamad, 2017.
All rights reserved.
Abstract
Karp’s seminal paper on NP-Completeness provided computer scientists with a
toolkit for showing computational hardness, conditioned on a complexity theoretic
conjecture, for a wide variety of optimization problems. However, the techniques
used in his paper say little about the hardness of approximating the optimal objec-
tive values of these problems. While researchers have since managed to find tight
bounds on the approximability of some of these NP-hard problems, many others
still have embarrassingly large gaps between their known upper and lower limits of
approximation.
To fill in these gaps, researchers have posed myriad new complexity theoretic
conjectures that entail stronger lower bounds at the expense of more demanding hy-
potheticals. One class of conjectures focuses on the approximability of the Densest
k-Subgraph (DkS) problem. Namely, the conjectured hardness of both adversarial
and average-case variants of DkS has been shown to imply stronger hardness for
many problems for which good approximation algorithms have proven elusive.
In this dissertation, we study a variety of problems, each of whose hardness can
be tied back to that of Densest k-Subgraph. We describe techniques for
proving stronger hardness results conditioned on both the known and the conjectured
hardness of DkS. We show how previous results on DkS imply stronger lower bounds
on “Min-Rep-hard” problems, as well as a simple technique for converting many
proofs of Min-Rep-hardness into proofs of DkS-hardness. Using this and other
techniques, we show DkS-hardness for problems such as Target Set Selection,
Minimum Monotone Satisfying Assignment, Min-Cost Colored Cut,
Network Flow Interdiction, and Densest k-Common Subgraph, among others.
In the case of Target Set Selection and Minimum Monotone Satisfying
Assignment, we show how to obtain substantially stronger lower bounds by exploit-
ing the self-similar structure of instances conjectured to be average-case-hard. We
introduce the Middlebox Routing class of problems, and show exact, approximation,
and hardness results for its member problems. We provide an O(√n)-approximation
algorithm for MkU, and discuss both approximation and hardness results for
Densest Common Subgraph variants.
Acknowledgements
I am very grateful for all the people who have helped and inspired me during my time
as a student. First and foremost, I would like to thank my advisor, Moses Charikar,
for his guidance and support both in and out of the research context. Moses’s unique
ability to identify, simplify, and clearly explain the core components of algorithms,
theorems, and even entire research areas alike has had a profound impact on the way
I approach new problems.
In addition to Moses, a number of other researchers have taken on an unofficial
advising role throughout my academic career. In particular, I’d like to thank Sanmay
Das for inspiring me to go into research, Huy Nguyen for being an endless source of
sage advice, and Tony Wirth for taking the term “academic older brother” to heart,
as exemplified through his constant readiness to provide tips and guidance during and
between our many unusually-houred Skype calls. Additionally, I would like to thank
the rest of my PhD committee: Sanjeev Arora, Mark Braverman, and Matt Weinberg,
for their time and advice which proved critical in getting this thesis completed.
I’d like to thank everyone who made my time at Princeton, Stanford, and, for
a couple of weeks, at the University of Melbourne, enjoyable and pain-free. Mitra
Kelly, Melissa Lawson, Nicki Gotsis, Megan Harris, and Ruth Harris all helped take
care of so much of the red tape associated with the numerous transitions I’ve had
throughout graduate school. On a more personal level, I was fortunate to have the
friendship and moral support of a number of exceptionally gifted researchers from both
Princeton and Stanford (many now at MIT), without which the completion of this
thesis would have been impossible. I would also like to thank everyone who helped
me succeed in my internship at Amazon, primarily my former and current manager,
Nina Mishra.
I’m deeply indebted to my parents for their endless affection and support. This
dissertation would not have been possible without those late-night math riddles they
gave me as a kid, their encouragement of my middle-school obsession with programming,
and their tireless work at instilling in me an unshakable love of mathematics. Finally,
I would like to thank Julia for being a source of strength over this last year, as well
as for dealing with me as I spent countless hours hunched away writing this thesis.
To my parents.
Contents
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
List of Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
1 Introduction 1
1.1 Core Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Densest k-Subgraph . . . . . . . . . . . . . . . . . . . . . 3
1.1.2 Smallest k-Edge Subgraph . . . . . . . . . . . . . . . . . 7
1.1.3 LabelCover-Min and Min-Rep . . . . . . . . . . . . . . . 7
1.2 Outline of Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Common notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2 Min-Rep and DkS 13
2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 DkS_C-hardness of Min-Rep . . . . . . . . . . . . . . . . . . . . . 14
2.3 From Min-Rep-hardness to DkS-hardness . . . . . . . . . . . . . . 16
3 Target Set Selection 19
3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2 TSS and Min-Rep . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2.1 Min–k–Rep hardness of TSS . . . . . . . . . . . . . . . . . . 23
3.3 Hardness from Planted DkS Instances . . . . . . . . . . . . . . . . . 24
3.3.1 Hardness of TSS_4 . . . . . . . . . . . . . . . . . . . . . . . . 24
3.3.2 Recursive Planting . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3.3 Hardness of TSS_r . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3.4 Hardness of unbounded-round TSS . . . . . . . . . . . . . . . 31
3.4 Approximating TSS_{O(1)} . . . . . . . . . . . . . . . . . . . . . . 34
3.5 Dynamic Target Set Selection . . . . . . . . . . . . . . . . . . 37
4 Densest Common Subgraph 44
4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.2 Approximating DCS-MA . . . . . . . . . . . . . . . . . . . . . . . . 46
4.3 Lower bounds for DCS-MA . . . . . . . . . . . . . . . . . . . . . . . 49
4.3.1 Integrality Gap . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.3.2 Min-Rep hardness . . . . . . . . . . . . . . . . . . . . . . . . 51
4.3.3 Planted DkS hardness . . . . . . . . . . . . . . . . . . . . . . 53
4.4 Hardness of DCS-AM . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.5 Algorithms for DCS-AM with Small T . . . . . . . . . . . . . . . . . 58
5 Middlebox Routing and Placement 60
5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.2 Solving Middlebox Routing . . . . . . . . . . . . . . . . . . . . . 65
5.2.1 A Multiplicative Weights Approach . . . . . . . . . . . . . . . 68
5.3 Network Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.3.1 Algorithms for Middlebox Placement . . . . . . . . . . . . . . 75
5.3.2 Hardness of Middlebox Placement . . . . . . . . . . . . . . . 80
6 Other Problems 85
6.1 MMSA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.1.1 MMSA and Min-Rep . . . . . . . . . . . . . . . . . . . . . . 86
6.1.2 Hardness from Planted DkS Instances . . . . . . . . . . . . . 87
6.2 Min-Query Q&A . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.3 Min-Cost Colored Cut . . . . . . . . . . . . . . . . . . . . . . . 92
6.3.1 CMMSA hardness of Min-Cost Colored Cut . . . . . . . 93
6.4 Min-k-Union . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.4.1 An O(√m)-approximation for Min-k-Union . . . . . . . . . . 96
6.5 Network Flow Interdiction . . . . . . . . . . . . . . . . . . . . 99
6.5.1 SkES hardness of Network Flow Interdiction . . . . . 100
6.5.2 Complete Inapproximability of MDNFI . . . . . . . . . . . . 100
Bibliography 104
List of Figures
1.1 Diagram of Planted DkS. . . . . . . . . . . . . . . . . . . . . . . . 4
3.1 A directed edge gadget for TSS. . . . . . . . . . . . . . . . . . . . . . 21
3.2 Sample of the first few layers of the TSS_r to TSS reduction. . . . . 32
3.3 Brute forcing Set Cover with DTSS. . . . . . . . . . . . . . . . . . 41
4.1 Integrality gap instance for DCS LP . . . . . . . . . . . . . . . . . . . 51
5.1 Sample instance of Middlebox Routing . . . . . . . . . . . . . . . 63
5.2 Hard instances of directed Min-MP and Max-MP . . . . . . . . . . 80
6.1 NIH table for computing heart health . . . . . . . . . . . . . . . . . . 89
6.2 An example DAG for computing heart health. . . . . . . . . . . . . . 89
List of problems studied
1.1 Densest k-Subgraph (DkS) . . . . . . . . . . . . . . . . . . . . . 3
1.2 Densest k-Subgraph with Perfect Completeness (DkS_C) . . 6
1.3 Smallest k-Edge Subgraph (SkES) . . . . . . . . . . . . . . . . 7
1.4 Min-Rep . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1 Min–k–Rep . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.1 Target Set Selection (TSS) . . . . . . . . . . . . . . . . . . . . 19
3.2 r-round Target Set Selection (TSS_r) . . . . . . . . . . . . . . 20
3.3 Dynamic Target Set Selection (DTSS) . . . . . . . . . . . . . 38
3.4 Dynamic Target Set Selection-Reachability (DTSS-R) . . 40
4.1 Densest Common Subgraph-MA (DCS-MA) . . . . . . . . . . . 45
4.2 Densest Common Subgraph-AM (DCS-AM) . . . . . . . . . . . 45
4.3 Densest k-Common Subgraph-MA (DkCS-MA) . . . . . . . . . 46
5.1 Middlebox Routing (MR) . . . . . . . . . . . . . . . . . . . . . . 62
5.2 Dependency Middlebox Routing (DMR) . . . . . . . . . . . . . 63
5.3 Min-Middlebox Placement (Min-MP) . . . . . . . . . . . . . . . 64
5.4 Max-Middlebox Placement (Max-MP) . . . . . . . . . . . . . . 64
5.5 Dependency Min-Middlebox Placement (Min-MP) . . . . . . 65
6.1 Circuit MMSA (CMMSA) . . . . . . . . . . . . . . . . . . . . . . 86
6.2 Formula MMSA (FMMSA) . . . . . . . . . . . . . . . . . . . . . 86
6.3 Min-Query Q&A (MQQA) . . . . . . . . . . . . . . . . . . . . . . 91
6.4 Min-Cost Colored Cut (MCC) . . . . . . . . . . . . . . . . . . . 93
6.5 Min-Certificate Circuit Monotone Satisfying Assignment
(MCCMSA) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.6 Min-k-Union (MkU) . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.7 Network Flow Interdiction (NFI) . . . . . . . . . . . . . . . . 99
6.8 Max-Difference (MDNFI) . . . . . . . . . . . . . . . . . . . . . . 99
6.9 R-Interdiction Covering (RIC) . . . . . . . . . . . . . . . . . . 100
Chapter 1
Introduction
When studying optimization problems that are hard to solve exactly, it is natu-
ral to ask how well we can approximate the value of the optimal solution. While
some problems (such as Knapsack) admit efficient algorithms with arbitrarily good
approximation ratios [67], many others have been shown to have limits on their ap-
proximability. In 1996, Arora and Lund classified the known hardness results into
four groups [16]:
1. Class I problems - those showing constant-factor hardness, as in Vertex
Cover

2. Class II problems - those showing O(log n)-factor hardness, as in Set Cover

3. Class III problems - those showing O(2^{log^{1−ε} n})-factor hardness, as in Label
Cover

4. Class IV problems - those showing n^{ε}-factor hardness, as in Max Clique
Although later discoveries (such as the polylogarithmic hardness of Group
Steiner Tree [62] and the tight log* n hardness of Asymmetric k–Center [42])
have since refined our understanding of hardness results, the majority of results over
the last few years still classified problems into one of the above four classes.
In 2006, Khot showed Class I-type hardness for the Densest k-Subgraph
(DkS) problem assuming that NP does not have algorithms that run in randomized
subexponential time [72]. In 2016, first Braverman et al. [25] and then Manurangsi [82]
expanded on this result by showing new lower bounds on the problem assuming NP
does not have “fast” deterministic algorithms, culminating in a hardness factor of
n^{1/log log^{O(1)} n} for DkS with “perfect completeness”, a factor that lies
somewhere between Class III and Class IV.
However, despite much attention, the best approximation factor known for DkS
is roughly O(n^{1/4}) [23]. In fact, this is the best factor known even for a simple
average-case version of the problem, in which we are asked to detect dense random
components planted within sparse random graphs. The difficulty of improving upon
the known approximation algorithms has led to a number of conjectures on the hardness
of DkS [11, 13, 17, 61], each of which assumes at the very least that DkS should
have Class IV-like hardness. Consequently, DkS has turned into a central problem
through which conditional hardness is often shown [39, 40, 43, 75].
In this dissertation, we study a variety of problems, each of which can be related
back to Densest k-Subgraph. Along the way, we describe techniques for showing
DkS-based hardness for various problems, either by direct reduction or otherwise.
1.1 Core Problems
We now proceed to introduce the key problems discussed in this thesis. For simplicity,
most problems will only be introduced as function problems. Their decision versions
follow naturally.
1.1.1 Densest k–Subgraph
The Densest k-Subgraph (DkS) problem can be seen as a natural optimization
variant of the NP-hard k-Clique problem. First formally defined by Kortsarz and
Peleg [76], DkS asks us to find the set of k vertices in a given graph G whose induced
subgraph contains the largest number of edges. Formally, it can be stated as follows.
Problem 1.1 (Densest k-Subgraph (DkS)): Given a graph G = (V, E) and an
integer k, find a subset V′ ⊆ V of size k such that |E(V′)|, the number of edges in
E induced by V′, is maximized.
Although DkS can be directly applied to problems such as community detection
(see, e.g., [61]), much of the interest in DkS stems from its relationship to other
combinatorial graph problems. In particular, much of the difficulty in approximating
a wide variety of optimization problems can be tied back to that of approximating
DkS. Because DkS has thus far evaded all attempts at an approximation algorithm
with a sub-polynomial factor, a number of papers have shown hardness for problems
conditioned on the conjecture that DkS is hard to approximate to within some n^{ε}
factor (e.g. [7, 34]).

Conjecture 1: For some ε > 0, there exists no probabilistic polynomial-time n^{ε}-
approximation algorithm for the Densest k-Subgraph problem.
Planted DkS
All known algorithms for DkS fail for one particularly simple family of instances,
called the Random-in-Random or Random Planted Model. This family, first
introduced in [23], is defined as follows. A “base graph” G_1 is an Erdős–Rényi random
graph with edge probability 1/√n. Thus, all vertices in G_1 have degree roughly √n.
Now, among an arbitrarily-selected √n-vertex subset (which will have an average
induced degree of roughly 1), we “plant” an Erdős–Rényi random graph G_2 with edge
probability n^{−1/4}. In particular, we construct the random graph G_2 and replace the
subgraph induced by our chosen vertices with G_2. Thus, the average degree induced
on this subgraph is increased from 1 to n^{1/4}.

Figure 1.1: In the Planted DkS problem, two G(n, 1/√n) graphs are sampled
independently. Inside one of these, a subgraph on √n vertices is chosen arbitrarily
and its average degree is boosted from O(1) to n^{1/4}, at random. Conjecture 2 claims
that it is hard to identify which of the two graphs was modified and which was left
untouched.
As discussed in Bhaskara et al. [23], detecting this planted dense subgraph appears
out of reach of current methods. A number of papers have since used variants of this
observation as a basis for conjecturing hardness for a number of different flavors of
DkS [11, 13, 17]. One version of this conjecture can be stated as follows.
Conjecture 2: Let ε > 0 and let G_1 and G_2 be two graphs, each sampled as follows.
(i) G_1 is an Erdős–Rényi random graph G(n, n^{−1/2}). (ii) G_2 is an Erdős–Rényi random
graph G(n, n^{−1/2}) inside which a G(√n, n^{−1/4}) subgraph is planted, as described
above. For every ε_0 > 0, no probabilistic polynomial-time algorithm can succeed with
probability 1/2 + ε_0 in determining which of the two graphs is G_1 and which is G_2.

Roughly speaking, Conjecture 2 says that it is hard to detect a G(√n, n^{−1/4})
random subgraph planted within a G(n, n^{−1/2}) random graph.
Connection to Planted Clique
The definition of this Planted Dense Subgraph problem has its origins in the
notorious Planted Clique problem. Formally, Planted Clique asks us to dis-
tinguish between graphs chosen uniformly from one of two graph ensembles:
The collection of all graphs (i.e. sampled from the Erd˝os-R´enyi distribution
G(n, 1/2)).
The collection of all graphs with a k-Clique (i.e. a G(n, 1/2) graph within which
a k-clique is planted arbitrarily).
When k > c√(n log n) for an appropriate constant c, clique vertices will (with
overwhelming probability) reveal themselves simply by virtue of having unusually high
degrees [77]. For k = Ω(√n), spectral, semidefinite programming, and combinatorial
approaches all manage to detect the planted component in polynomial time [4, 52, 53].
For k = o(√n), the problem remains open, and lies at the center of myriad conditional
hardness results (e.g. [17, 22, 26, 63]).
Despite the difficulty of finding a polynomial-time algorithm for k = √n, the
problem with k as small as (2 + ε) log_2 n can be solved in n^{O(log n)} time simply by
enumerating all subgraphs of size (2 + ε) log_2 n and checking if one of them is a clique.
As G(n, 1/2) graphs are extremely unlikely to contain cliques of size (2 + ε) log n [5],
the existence of a clique at least as large certifies that the graph is overwhelmingly
likely to contain a planted component. For the search problem, a simple greedy
algorithm can uncover the rest of the planted clique. As the fastest known algorithms
for Planted DkS run in time 2^{n^{Θ(ε)}}, reductions from Planted DkS have the added
advantage that, under the appropriate set of assumptions, they still show hardness
even for quasipolynomial-time algorithms.
Densest k–Subgraph with Perfect Completeness
Of particular importance is the special case of (worst-case) DkS in which Yes in-
stances are promised to contain a k-Clique. Such perfectly complete instances of DkS
form the basis of almost all hardness results for the more general DkS problem, and
will receive special attention in Section 2.2. The problem can be formalized as follows.
Problem 1.2 (Densest k-Subgraph with Perfect Completeness (DkS_C)):
Given a graph G = (V, E) and an integer k, distinguish between (i) the Yes case in
which G contains a k-Clique and (ii) the No case in which G contains no sufficiently
dense k-subgraph. In particular, instances to the DkS_C[1, β] problem are promised to
either (i) contain a k-Clique or (ii) contain no k-subgraph with at least β · (k choose 2)
edges.
In [81], it is shown that DkS_C is hard to approximate to within an n^{1/polylog n} factor
assuming the Exponential Time Hypothesis (ETH). Roughly, ETH (resp. randomized
ETH) is the conjecture that SAT has no deterministic (resp. randomized)
2^{o(n)}-time algorithms.[1] Additionally, [81] provides an n^{o(1)}-factor hardness assuming
Gap-ETH, the hypothesis that even gap instances of SAT (such as those generated
by the PCP theorem) don’t have 2^{o(n)}-time algorithms [50]. Note that the latter
result is in some sense tight: implicit in [23] is an algorithm solving DkS_C[1, n^{−ε}] in
time n^{O(1/ε)}.

We use Conjectures 0a and 0b to denote that randomized ETH and randomized
Gap-ETH each hold, respectively. Although appeals to these conjectures will always
appear in pairs, the hardness factors attained will correspond to which of Conjecture
0a or 0b is assumed.

Conjecture 0a: No randomized algorithm can exactly solve SAT in time 2^{o(n)} (i.e.
randomized ETH holds).

Conjecture 0b: For some ε > 0, no randomized algorithm can solve Gap-SAT[1 − ε, 1]
in time 2^{o(n)} (i.e. randomized Gap-ETH holds).

[1] In its more precise form, ETH also rules out the possibility of a sequence of algorithms, each
with running time 2^{ε_i n}, with inf_i ε_i = 0.
1.1.2 Smallest k–Edge Subgraph
The Smallest k-Edge Subgraph (SkES) problem can be seen as the natural
minimization complement of DkS. While the problem statement of DkS asks us to
maximize the number of edges induced by a prescribed number of vertices, SkES
asks us to minimize the number of vertices needed to induce a prescribed number of
edges. This problem, first formalized by [39], can be stated as follows.

Problem 1.3 (Smallest k-Edge Subgraph (SkES)): Given a graph G = (V, E)
and an integer k, find the smallest subset V′ ⊆ V such that |E(V′)|, the number
of edges in E induced by V′, is at least k.

As pointed out in [39], the proof used in [84] to show the hardness of approximating
Node-Weighted Steiner Network also implies that if DkS is α(n)-hard to
approximate, then SkES is O(√(α(n)))-hard.
1.1.3 LabelCover-Min and Min-Rep
LabelCover was first introduced by [12] to show hardness for Nearest
Lattice Vector and related problems. The problem is formalized as follows. As
input, we are provided a regular, bipartite graph G = ((V_1, V_2), E), a set Σ of labels,
and, for each edge e, a partial function Π_e : Σ → Σ. A labeling is an assignment
of one or more labels to each vertex, and an edge (v_1, v_2) (with v_1 ∈ V_1 and v_2 ∈ V_2) is
covered if for every label a_2 assigned to v_2, v_1 contains a label a_1 such that Π_e(a_1) = a_2.
LabelCover-Min then asks for the labeling with the smallest number of distinct
labels that covers every edge in the bipartite graph. LabelCover-Max, on the
other hand, asks us to assign exactly one label to each vertex so as to maximize the
total number of covered edges.
The Min-Rep problem, first introduced in [74], can be viewed as a variant of
LabelCover-Min that has shown itself to be a simpler starting point for many
hardness results, e.g. [24, 33, 37, 47]. One formalization of the problem is as follows.
The first part of the input consists of a bipartite graph with vertex set V and edge
set E. Within each bipartition, vertices are further partitioned into equal-sized subsets,
which are called supervertices. The set of all supervertices is denoted V̂. We
then establish an equivalence relation between edges based on whether or not their
endpoints fall in exactly the same pair of supervertices. The non-empty equivalence
classes are called the graph’s superedges, the set whereof is denoted Ê. Thus, an
instance of Min-Rep can be defined as a 4-tuple (V, E, V̂, Ê), defining the vertices,
edges, supervertices, and superedges of the instance.[2]
For a subset V′ ⊆ V of vertices, we say that V′ covers a superedge ê ∈ Ê if there
exists an edge e ∈ ê such that both endpoints of e are in V′. Finally, the set of
feasible solutions for a given Min-Rep instance are exactly those vertex subsets that
cover every superedge in Ê. As the name Min-Rep suggests, the objective is to find
a feasible solution V′ ⊆ V of minimal cardinality.
Problem 1.4 (Min-Rep): Given a 4-tuple (V, E, V̂, Ê) representing a bipartite
graph G = (V, E) with supervertex set V̂ and superedge set Ê, find a minimal-
cardinality subset V′ ⊆ V such that for each superedge ê ∈ Ê, both endpoints of
some e ∈ ê are included in V′.
As shown in [74], Min-Rep has no 2^{log^{1−ε} n}-approximation algorithm unless NP ⊆
DTIME[n^{polylog n}]. LabelCover-Max has since been shown to admit a (slightly
stronger) hardness of 2^{log^{1−1/log log^{1/2−ε} n} n} under similar complexity conjectures [49]. To
see how these factors relate to Manurangsi’s result on the hardness of Densest
k-Subgraph (assuming ETH), note that

    log^{c_1} n = 2^{c_1 log log n}
                ≪ 2^{log^{1−ε} n}                          (1.1)
                ≪ 2^{log^{1−1/log log^{1/2−ε} n} n}        (1.2)
                = 2^{log n / log^{1/log log^{1/2−ε} n} n}
                = n^{1/log^{1/log log^{1/2−ε} n} n}
                ≥ n^{1/(2^{log log n / log log^{1/2−ε} n})}
                = n^{1/2^{log log^{1/2+ε} n}}
                ≪ n^{1/2^{c_2 log log log n}}
                = n^{1/log log^{c_2} n}                    (1.3)
                ≪ n^{ε},                                   (1.4)

where c_1, c_2, and ε are arbitrary positive constants, and f(n) ≪ g(n) iff f = o(g). In
particular, Equation (1.1) establishes super-polylogarithmic hardness for Min-Rep,
(1.2) establishes that the aforementioned hardness of LabelCover-Max exceeds
that of Min-Rep, (1.3) establishes that the known hardness of DkS exceeds that
of LabelCover-Max, and (1.4) establishes that the known hardness of DkS is
sub-polynomial.

[2] Although the set of superedges is already implied by V, E, and V̂, in this thesis we treat it as
part of the input for convenience.
1.2 Outline of Results
Chapter 2 studies the interaction between Min-Rep and DkS_C. We show that
Min-Rep is at least as hard to approximate as DkS_C. This establishes stronger lower
bounds for Min-Rep assuming either the Exponential Time Hypothesis (ETH) or its
strengthening to Gap-ETH. This material is based on work with Moses Charikar and
Anthony Wirth [31].
Chapter 3 centers on the Target Set Selection (TSS) problem, in which we
are tasked with finding a small set of nodes that can kick-start an influence spreading
process that eventually reaches the whole graph. We show that while both TSS and
its finite-round variant are hard to approximate, the finite-round version admits a
nontrivial approximation algorithm. Along the way, we introduce a recursive version
of the random-in-random model of the Densest k-Subgraph problem (see
Section 1.1.1), and show that it remains as hard as the un-recursed version of the
problem. We then study a dynamic version of TSS, in which previously influenced
vertices can become de-influenced if a sufficient number of their in-neighbors themselves
have become de-influenced. As we show, the dynamics of this problem allow
for exponential-period orbits, and deducing even simple properties of the long-term
behavior is PSPACE-complete. This material is based on work with Moses Charikar
and Anthony Wirth [30].
Chapter 4 centers on the Densest Common Subgraph problem, in which we
search a sequence of graphs for subgraphs that consistently remain dense. We study
two versions of this problem: the “max-min-average” variant and the “max-average-
min” variant. We show hardness of approximation results for both problems, and
provide two algorithms: an Õ(√n)-factor poly-time approximation algorithm for the
former and a (1 + ε)-approximate FPT-time algorithm for the latter (where the
parameter is the number of graphs). This material is based on joint work with Moses
Charikar and Jimmy Wu [32].
In Chapter 5, we study the Middlebox Routing problem, which is a practice-
inspired variant of Multi-Commodity Flow in which packets also need to be
processed by the vertices in the graph. While we show that the base problem can
be solved using either a polynomial-sized linear program or with a more efficient
multiplicative weights algorithm, we also prove that the underlying network design
problem is NP-hard. We study five variants of the network design problem, and
provide both approximation and hardness results. This material is based on joint
work with Moses Charikar, Jennifer Rexford, and X. Kelvin Zou [29].
In Chapter 6, we study a number of problems, each of which can be shown to be
DkS-hard. In Section 6.1, we show hardness for Minimum Monotone Satisfying
Assignment from two different conjectures on the hardness of DkS_C. In Sections 6.2
and 6.3, we show DkS-based hardness for both Min-Query Q&A and Min-Cost
Colored Cut. In Section 6.4, we show that the Min-k-Union problem, which
generalizes SkES and is thus DkS-hard, admits an O(√m)-approximation algorithm.
Finally, in Section 6.5, we show DkS-hardness for Network Flow Interdiction
as well as much stronger hardness for a natural variant of the problem.
The material in Sections 6.1 and 6.3 is based on joint work with Moses Charikar
and Anthony Wirth [30, 31], that in Sections 6.4 and 6.5 is based on joint work with
Moses Charikar and Nicole Wein, and that in Section 6.2 is based on joint work with
Nina Mishra [83].
1.3 Common notation
For a graph G, we use G[V′] to denote the subgraph of G induced by V′, and E[G]
to denote the set of edges within G. We will use n to denote the number of vertices
in a graph, and m to denote the number of edges.
Most problems in this paper will be introduced as function problems; their corre-
sponding decision version will follow naturally. Depending on context, we use OPT
either to refer to an optimal feasible solution or to its objective value. Similarly, we
use ALG to denote either the solution found by a candidate algorithm or the objective
score of said solution. When we wish to explicitly refer to the appropriate objective
value, we use the notation cost(OPT) and cost(ALG). When describing approxima-
tion and hardness ratios, we use values both greater-than and less-than 1. The correct
interpretation of these values will be obvious from context. For two problems A and
B, we say that A ≼_p B iff every poly-time f(n)-approximation algorithm for B can
be used in a black-box manner to attain a polynomial-time poly(f(n))-approximation
algorithm for A. In such cases, we say that A is polynomially related to B.
When the parameter might not be clear from context, we use the notation o_r(1) to
denote a value that tends to 0 as r tends to ∞. The variable ε will always be used to
denote a small positive constant. We use N_0 to denote the set of nonnegative integers
and Z^+ to denote the set of strictly positive integers.
Chapter 2
Min-Rep and DkS
2.1 Overview
In this section, we study how Densest k-Subgraph can be used to show stronger
hardness for Min-Rep-hard problems. We describe this connection in two parts.
First, we show that current results on the hardness of DkS translate into hardness
for Min-Rep. This exploits the fact that the current best hardness results for DkS
follow from the hardness of DkS_C. Next, though we fall short of showing Min-Rep-
hardness for (non-complete) DkS, we show that many Min-Rep-hardness proofs can
be converted into proofs of (effectively) DkS-hardness. As a key intermediate step in
our technique, we introduce the Min–k–Rep problem, a natural generalization of
Min-Rep that also happens to be DkS-hard. Thus, by showing that many Min-Rep-
hardness proofs can be translated into proofs of Min–k–Rep-hardness, we establish
that a large number of Min-Rep-hard problems are also DkS-hard.
2.2 DkS_C-hardness of Min-Rep
Before showing that DkS_C ≼_p Min-Rep, we first present a simple lemma proving
that graphs without dense k-subgraphs cannot have dense r-subgraphs when r is not
much larger than k.
Lemma 2.1: Let G be a graph in which every k-vertex subgraph has at most s
induced edges. Every subgraph of G containing at least t s induced edges must
have more than (k 1)
p
t/s vertices.
Proof. Let R be an r-vertex subgraph of G containing at least t induced edges. Clearly, r ≥ k. Now let K be a k-vertex induced subgraph of R chosen by selecting a uniformly random sample of k vertices from R. For every edge (u, v) of R, the probability that it also appears in K is the probability that both of its endpoints survived the sampling process. Namely, Pr[(u, v) ∈ K] = (k/r) · ((k − 1)/(r − 1)) > (k − 1)²/r². By linearity of expectation, E[# edges in K] > t(k − 1)²/r². However, since a k-vertex subgraph of R has at most s induced edges, s ≥ E[# edges in K]. Combining the two inequalities, s > t(k − 1)²/r², and, solving for r, we get r > (k − 1)√(t/s).
With this lemma in hand, we can proceed to prove that Min-Rep's hardness is polynomially related to that of DkS_C.

Theorem 2.2: If there is an α-approximation algorithm for Min-Rep (with α > 1), then there exists a randomized algorithm for DkS_C[1, β] whenever β < 1/(9α²).¹ In particular, DkS_C ≤_p Min-Rep.

Proof. To simplify the presentation of the proof, we make the benign assumption that the number of vertices is evenly divisible by 6 log n; similar analysis carries through when this is not the case. Given an instance (G(V, E), k) of DkS_C[1, β], we partition V uniformly at random into k′ = k/(3 log n) groups so that each has (3n/k) log n vertices.

¹ To keep the analysis simple, no attempt is made to lower the factor of 9 in the bound on β.
We now construct a Min-Rep instance as follows. The vertex set is exactly V. The first k′/2 groups in the above partitioning form the supervertices of the left partition of our Min-Rep instance (Â). The remaining k′/2 groups form the supervertices of the right partition (B̂). The edge set consists of exactly those edges that cross the bipartition; all other edges are discarded. The set of superedges is implied by our choice of the other three sets.
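The partitioning step above can be sketched as follows; the function name, the dict-based output format, and the use of the natural logarithm are our own assumptions, not the dissertation's.

```python
import math
import random

def dks_to_minrep(V, E, k, seed=None):
    """Sketch of the Theorem 2.2 reduction: randomly partition V into
    k' = k/(3 log n) groups, use the first k'/2 groups as the left
    supervertices and the rest as the right supervertices, and keep
    only the edges crossing the bipartition."""
    rng = random.Random(seed)
    n = len(V)
    k_prime = max(2, round(k / (3 * math.log(n))))
    k_prime -= k_prime % 2                 # keep an even number of groups
    order = list(V)
    rng.shuffle(order)
    groups = [order[i::k_prime] for i in range(k_prime)]
    A_hat, B_hat = groups[: k_prime // 2], groups[k_prime // 2:]
    left = {v for g in A_hat for v in g}
    crossing = [(u, v) for (u, v) in E if (u in left) != (v in left)]
    return {"A_hat": A_hat, "B_hat": B_hat, "edges": crossing}
```

Every returned edge crosses the Â/B̂ bipartition, so each one lies inside exactly one (A_i, B_j) superedge slot.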
If G is a Yes instance of DkS_C[1, β], then it contains a k-clique. The partitioning scheme above ensures that, with high probability, each supervertex contains at least one vertex of this clique. Thus, with high probability, every (A_i, B_j) pair induces at least one superedge. Consequently, the optimal objective value is exactly k′ with high probability: covering all k′²/4 superedges requires at least one vertex from each supervertex, and picking one clique vertex from each supervertex suffices.
However, if G came from a No instance, then one of two things can happen. Either (i) the constructed Min-Rep instance is missing some superedge (that is, some (A_i, B_j) supervertex pair does not induce a superedge); or (ii) the constructed Min-Rep instance has all k′²/4 potential superedges. In the first case, we can immediately identify that with high probability the input graph came from a No instance of DkS_C and return accordingly. Thus, we focus our attention on the optimal objective values of No instances in which all superedges are present.

Such an instance can be identified as a No instance when its optimal objective value is greater than αk′: applying the factor-α approximation algorithm to Min-Rep would reveal that the optimal objective value exceeds k′, and hence, with high probability, that it exceeds the optimal objective value of a Yes instance.
To figure out which values of β necessitate this αk′ lower bound on OPT, we appeal to Lemma 2.1. A feasible solution to the problem must cover all k′²/4 superedges, and thus the subgraph induced by the solution must contain at least k′²/4 edges. However, the promise problem we are solving, DkS_C[1, β], guarantees that no k-vertex subgraph, where k = 3k′ log n, contains more than β·(k choose 2) ≤ βk² = 9βk′² log² n edges. Applying Lemma 2.1 with t = k′²/4, s = 9βk′² log² n, and k = 3k′ log n, we see that a subgraph inducing k′²/4 edges (and, in particular, the smallest one) has at least

(3k′ log n − 1) · √((k′²/4) / (9βk′² log² n)) = (3k′ log n − 1) / (6√β log n) > k′/(3√β)

vertices. Thus, we have the sought αk′ lower bound on the optimal objective value whenever α < 1/(3√β) or, equivalently, whenever β < 1/(9α²).
Combining Theorem 2.2 with the result of [81], we get that, for some constant c, Conjecture 0a implies that Min-Rep is hard to approximate to within a factor of O(n^{1/log log^c n}). Additionally, Conjecture 0b implies that Min-Rep is hard to approximate to within a factor of n^{o(1)}.

Theorem 2.3: For some constant c, Min-Rep has no O(n^{1/log log^c n})-approximation algorithm (resp. no n^{o(1)}-approximation algorithm) unless randomized ETH (resp. randomized Gap-ETH) is false.
2.3 From Min-Rep-hardness to DkS-hardness
To show DkS hardness for Min-Rep-hard problems, we introduce a natural generalization of Min-Rep, which we call Min–k–Rep. This problem is just like Min-Rep, with the twist that the set of feasible solutions consists of those that cover at least k superedges (instead of all of them).

Problem 2.1 (Min–k–Rep): Given an integer k and a bipartite graph with vertex set V, edge set E, supervertex set V̂, and superedge set Ê, find the smallest subset of V that covers at least k superedges.
Thus, the Min-Rep problem is simply Min–k–Rep with k equal to the number of superedges in G; Min–k–Rep is hence Min-Rep-hard. As it turns out, the hardness of Min–k–Rep is also polynomially related to that of DkS.
Theorem 2.4: DkS ≤_p Min–k–Rep.

Proof. As discussed in Section 1.1.2, DkS ≤_p SkES. It thus suffices to show that SkES ≤_p Min–k–Rep. We show this via a direct reduction: given an instance (G(V, E), k) of SkES, we choose a random bipartition of V into A and B by placing each vertex in A independently with probability 1/2, and in B otherwise. The Min–k–Rep instance returned has vertex set V, an edge set containing all edges crossing the bipartition, and one singleton supervertex for every vertex in V (thus, we have exactly |V| supervertices).
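This reduction can be sketched directly; the function name and the dict-based output format are our own choices, not the dissertation's.

```python
import random

def skes_to_minkrep(V, E, k, seed=None):
    """Sketch of the Theorem 2.4 reduction: place each vertex in side A
    independently with probability 1/2, keep only the crossing edges, and
    make every vertex its own (singleton) supervertex."""
    rng = random.Random(seed)
    A = {v for v in V if rng.random() < 0.5}
    crossing = [(u, v) for (u, v) in E if (u in A) != (v in A)]
    # With singleton supervertices, the superedge between u's and v's
    # supervertices exists exactly when the edge (u, v) crosses the cut,
    # so covering a superedge means picking both endpoints of such an edge.
    supervertices = {v: {v} for v in V}
    return {"A": A, "edges": crossing, "supervertices": supervertices, "k": k}
```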
For a subset S ⊆ V, let v_1(S) be the objective value of S for the given SkES instance, and v_2(S) be the objective value of S under the constructed Min–k–Rep instance. Clearly, v_2(S) ≤ v_1(S), as choosing both endpoints of an edge coincides with covering a unique superedge. Additionally, by the random construction, E[v_2(S)] = v_1(S)/2, and, by a standard application of Chernoff bounds, with high probability, v_2(S) ≥ v_1(S)/3. Thus, the two problems differ only by a constant factor in their objective scores for every set S, meaning that the two problems are polynomially related.
With this theorem in hand, the goal is now to transform proofs of Min-Rep hardness into proofs of Min–k–Rep hardness. In some cases, Min–k–Rep-hardness follows immediately: in [9], the proof of Min-Rep-hardness for k-Steiner Forest actually shows Min–k–Rep-hardness for the problem. In other instances, the conversion requires more work. In the remainder of this thesis (and particularly in Chapter 6), we show how to establish Min–k–Rep-hardness for a number of Min-Rep-hard problems.
Chapter 3
Target Set Selection
3.1 Overview
In the early 2000s, the growing interest in direct marketing inspired Domingos and
Richardson to study the network value of customers, defined as the sum of the profit
directly generated by marketing to the customer and any additional profit that the
customer may generate through word-of-mouth marketing [51, 88]. While their work
focuses on the Bayesian setting, subsequent work by Kempe, Kleinberg, and Tardos
introduced the following deterministic multi-stage influence problem [71].
As input, we are given an (un)directed graph G = (V, E) and a nonnegative threshold function τ : V → N_0. A vertex can be either active (infected) or inactive (uninfected). Given an initial seed set of active vertices, influence proceeds in a sequence of rounds. A previously inactive vertex v becomes active in a particular round if, in the previous round, at least τ(v) (in-)neighbors of v were active. Once a vertex is active, it remains active in all subsequent rounds. Since the process (essentially) stops if no new vertex becomes active in some round, there are at most n rounds. A seed set is said to be contagious if it induces the entire graph to eventually become active. The TSS problem asks for the smallest contagious seed set of the graph.
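The spreading process just described can be written as a short simulation; this is our own illustrative sketch, with hypothetical names.

```python
def activate(neighbors, tau, seed_set, rounds=None):
    """Simulate the threshold spreading process: a vertex v becomes active
    once at least tau[v] of its (in-)neighbors are active, and active
    vertices stay active.  `neighbors[v]` is the set of vertices whose
    activation counts toward v's threshold.  Returns the final active set
    (after `rounds` rounds if given, else once the process stabilizes)."""
    active = set(seed_set)
    r = 0
    while rounds is None or r < rounds:
        newly = {v for v in neighbors
                 if v not in active
                 and sum(1 for u in neighbors[v] if u in active) >= tau[v]}
        if not newly:          # no new activation: the process has stopped
            break
        active |= newly
        r += 1
    return active
```

A seed set S is then contagious exactly when `activate(neighbors, tau, S)` returns all of V, and r-contagious when the `rounds=r` call does.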
Problem 3.1 (Target Set Selection (TSS)): Given a graph G = (V, E) and a nonnegative threshold function τ : V → N_0, find the smallest subset S ⊆ V such that S is contagious.
In 2009, Chen [33] showed that TSS is Min-Rep-hard to approximate, and that the problem is maximally hard even when τ(v) = 2 for every vertex v. While algorithms have been found for special instances of TSS (such as when the graph has low treewidth [20] or is an expander [46]), no approximation algorithm better than the trivial O(n)-approximation has yet been found for worst-case instances.
Although the activation procedure underlying TSS was originally defined to proceed for an infinite number of rounds, Cicalese et al. initiated the study of a latency-bounded variant of Target Set Selection [44, 45]. In addition to a (di)graph and threshold function, latency-bounded instances are also parameterized by a positive integer r. The r-round Target Set Selection problem, denoted TSS_r, is then to find the smallest r-contagious set of vertices; i.e., the smallest set that would activate the entire graph within r rounds. We will discuss this variant in more detail in Sections 3.3 and 3.4.
Problem 3.2 (r-round Target Set Selection (TSS_r)): Given a graph G = (V, E), a nonnegative threshold function τ : V → N_0, and a positive integer r, find the smallest subset S ⊆ V such that S is r-contagious.
The vast majority of this chapter is based on joint work with Moses Charikar and
Anthony Wirth.
3.2 TSS and Min-Rep
We first give a simplified version of Chen’s Min-Rep-hardness proof for Target Set
Selection.
Figure 3.1: A directed edge gadget from u to v. The threshold of each vertex is indicated by the number inside its depiction.
Theorem 3.1 (from [33]): There exists an approximation-preserving reduction (up to constant factors) from Min-Rep to TSS, even when all edges are bidirected.

A key tool in this reduction is the directed edge gadget (Figure 3.1), which is implicit in Chen's work and explicitly introduced in [19, 30]. This gadget is meant to simulate a directed edge, in that u being active would (via the gadget) give v one additional active neighbor (albeit after 3 rounds), while v being active does not (by means of the gadget) directly affect u. Additionally, these gadgets do not allow algorithms to benefit by "cheating" and including a gadget vertex in their choice of contagious set: any solution containing one of the gadget vertices is (weakly) dominated by a solution activating its in-vertex u instead. We can therefore assume without loss of generality that solutions of interest contain no gadget vertices, and so, for the remainder of the document, we allow TSS instances to contain directed edges.¹
With such a gadget in hand, the reduction is fairly direct. Given a Min-Rep instance (V, E, V̂, Ê), we create a TSS instance with |V| + |E| + 2|Ê| vertices.
Vertices. There are four layers of vertices, called the vertex layer, edge layer, superedge layer, and completeness layer. The vertex layer A has one threshold-|Ê| vertex a_i for each v_i ∈ V, the edge layer B has one threshold-2 vertex b_j for each e_j ∈ E, the superedge layer C has one threshold-1 vertex c_k for each ê_k ∈ Ê, and the completeness layer D has |Ê| threshold-|Ê| vertices that behave indistinguishably.

¹ Although this gadget reduction may alter the number of rounds, the round-based lower bounds we derive in the remainder of this chapter stem from entirely bidirected instances and are unaltered.
Edges. The four layers are connected in a cyclic manner. Between A and B, we add a directed edge gadget from vertex a_i to vertex b_j whenever v_i is an endpoint of e_j. Between B and C, we add a directed edge gadget from b_j to c_k whenever e_j ∈ ê_k. The remaining two sets of edges, connecting C to D and D to A, are both complete (i.e., all possible connections exist, and the induced graphs are both directed bicliques).
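The layered construction just described can be sketched as follows. For brevity, the sketch wires plain directed edges where the full reduction would insert directed edge gadgets; all names and the in-neighbor-dict format are our own assumptions.

```python
def minrep_to_tss(n_vertices, endpoints, superedges):
    """Sketch of the four-layer construction of Theorem 3.1.
    `endpoints[j]` gives the two vertex indices of edge e_j, and
    `superedges[s]` is the set of edge indices j with e_j in superedge
    ê_s.  Returns thresholds and in-neighbor sets over tuple-named
    vertices like ("A", i)."""
    m_hat = len(superedges)
    A = [("A", i) for i in range(n_vertices)]
    B = [("B", j) for j in range(len(endpoints))]
    C = [("C", s) for s in range(m_hat)]
    D = [("D", d) for d in range(m_hat)]

    tau = {}
    tau.update({a: m_hat for a in A})   # vertex layer: threshold |Ê|
    tau.update({b: 2 for b in B})       # edge layer: threshold 2
    tau.update({c: 1 for c in C})       # superedge layer: threshold 1
    tau.update({d: m_hat for d in D})   # completeness layer: threshold |Ê|

    in_nbrs = {v: set() for v in A + B + C + D}
    for j, (u, w) in enumerate(endpoints):        # A -> B, per incidence
        in_nbrs[("B", j)] = {("A", u), ("A", w)}
    for s, edge_ids in enumerate(superedges):     # B -> C, per containment
        in_nbrs[("C", s)] = {("B", j) for j in edge_ids}
    for d in D:                                   # C -> D, complete biclique
        in_nbrs[d] = set(C)
    for a in A:                                   # D -> A, complete biclique
        in_nbrs[a] = set(D)
    return tau, in_nbrs
```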
This completes the construction. Before we check the correctness of this construction, we first prove a helpful lemma about the structure of optimal solutions.
Lemma 3.2 ([33]): For instances formed by the above construction, any contagious set S can be transformed (in polynomial time) into a contagious set S′ of size at most 2|S| contained entirely within the vertex layer.

Proof of Lemma 3.2. Let S be a contagious set. If D ⊆ S, then the set S′ = (S ∪ C) \ D is also contagious (as it activates D in one round) and is no larger than S. Otherwise, if S contains only a proper subset D′ of the vertices of D, the activation of these vertices contributes nothing to the spreading process (as all of D needs to be active to activate anything in A, and the events leading to the activation of D \ D′ would have activated D′ anyway), and so S \ D′ is also contagious. Thus, solutions can be made to contain only vertices from A ∪ B ∪ C with no increase in objective value.

Now suppose the solution S contains only vertices from A ∪ B ∪ C. Because all vertices in C have a threshold of 1, the inclusion of any vertex from C in S is weakly dominated by the choice of any of its in-neighbors in B. Thus, solutions can be made to contain only vertices from A ∪ B. Similarly, because vertices in B have threshold only 2, removing each of them from S and instead including their two in-neighbors in A can only increase the size of S by a factor of 2. Thus, we can transform any solution into one containing only vertices from A, up to a multiplicative blowup of at most 2 in total cost.
The proof of the main theorem is now rather simple.

Proof of Theorem 3.1. Given an instance M of Min-Rep, we construct a TSS instance T according to the above construction. A feasible solution to the Min-Rep instance can be converted into a solution to the constructed TSS instance by selecting the corresponding vertices from the vertex layer. This forms a contagious set because the activation of these vertices initially activates exactly the vertices corresponding to the edges they cover. These, in turn, activate the vertices corresponding to the superedges covered by the solution, which (by feasibility of the Min-Rep solution) include all of the superedge layer. The activation of the entire superedge layer then activates the entire completeness layer, which subsequently activates the vertex and edge layers. Thus, cost(OPT_T) ≤ cost(OPT_M).

On the other hand, any contagious subset of the vertex layer in T must correspond to a feasible solution in M: the spreading behavior corresponding to an infeasible solution to M must fail to activate at least one vertex from the superedge layer within the first 2 rounds, at which point the spreading process comes to an abrupt halt. Because Lemma 3.2 guarantees that any feasible solution to T can be made to contain only vertices from the vertex layer up to a factor of 2 in objective cost, we conclude that cost(OPT_M) ≤ 2 · cost(OPT_T). The result follows.
3.2.1 Min–k–Rep-hardness of TSS

The above reduction falls neatly into the framework outlined in Section 2.3. To get Min–k–Rep (and thus DkS) hardness for TSS, we simply reduce the threshold of every vertex in the completeness layer down to k. Thus, only k superedges need to become active for the completeness layer to activate (and in turn activate all other vertices). The remainder of the proof of Theorem 3.1 carries through as before, and we can conclude the following.

Theorem 3.3: DkS ≤_p Min–k–Rep ≤_p TSS.
3.3 Hardness from Planted DkS Instances

Notation

In this section, we will refer to a graph-theoretic operation called planting. For two graphs G and H (the latter of order n_H), one can plant H in G by replacing some n_H-vertex subset of G with a copy of H. For our purposes, the exact choice of which subset of G to replace will not matter. Thus, we can use the (otherwise ambiguous) notation G ◁ H to denote the planting of H in G. When we chain this notation (e.g., G_1 ◁ G_2 ◁ G_3), the "◁" operator is to be read as right-associative. In the previous example, this means that G_3 is an induced subgraph of G_2 ◁ G_3, which is in turn an induced subgraph of G_1 ◁ G_2 ◁ G_3.
3.3.1 Hardness of TSS_4

Recall that in Section 1.1.1 we introduced the Planted Dense Subgraph Conjecture (Conjecture 2), which states that no probabilistic polynomial-time algorithm can, with probability at least 1/2 + ε, distinguish between a vanilla G(n, 1/√n) graph and a G(n, 1/√n) graph within which a G(√n, n^{−1/4}) is planted. To convey the intuition underlying Section 3.3, we start out by showing how such instances can be used directly to prove hardness for TSS_4.

Theorem 3.4 (warmup): Assuming Conjecture 2, it is hard to approximate TSS_4 to within a factor of n^{1/4}.
The approach, roughly, will be as follows. Suppose that we are given a graph from one of the two distributions described above. For the duration of this subsection and Section 3.3.3, we set the threshold function τ to identically equal some large constant τ_0. We show that, for some ε′ = o_{τ_0}(1), G(n, 1/√n) graphs have (with high probability) no 4-contagious set of size n^{1/2−ε′}. On the other hand, we show that (w.h.p.) every set of √n vertices is 2-contagious. Thus, when there is a dense planted component, it suffices to activate the component in the first 2 rounds in order to activate the whole graph in 4 rounds. We then take advantage of the extra density in the planted component to show that it (w.h.p.) contains a seed set of size n^{1/4}, establishing the sought gap. Formally, we show the following results.
Lemma 3.5 (Lower bound on OPT): For all positive integers τ_0 and r both at least 2, the TSS_r instance with graph G(n, 1/√n) and threshold function identically equal to τ_0 has |OPT| = Ω̃(n^β), where β = (τ_0 − 2)/(2τ_0 − 2).

Lemma 3.6 (Upper bound on OPT): For all positive integers τ_0, and for any ε ∈ (0, 1/(3(τ_0 + 1)²)), the TSS_4 instance with graph G(n, 1/√n) ◁ G(√n, n^{−1/4}) and threshold function identically equal to τ_0 has |OPT| = O(n^{1/4}).

Both of these lemmas can be proven by an iterated application of the Chernoff bound. These two lemmas are special cases of Lemmas 3.8 and 3.9, respectively, both of which will be proven in full generality in Section 3.3.3.
3.3.2 Recursive Planting

While the approach used in Section 3.3.1 can be used to show n^{1/4} hardness for TSS_4 (and, indeed, for TSS_r with r ≥ 4), the bounds given by Lemmas 3.5 and 3.6 are tight enough to show that a direct application of Conjecture 2 cannot attain any substantially stronger hardness result. To get around this barrier, we introduce the concept of recursive planting.
The key idea here is as follows. Conjecture 2 says that it is hard to detect a G(n^{1/2}, n^{−1/4}) graph planted within G(n, n^{−1/2}). To assist the exposition of the intuition, we sacrifice a bit of accuracy and pretend that ε can be set to 0. Thus, for n′ = √n, the planted graph is a G(n′, 1/√n′) graph – effectively a scaled-down version of the G(n, 1/√n) graph in which we are