Hardness from Densest Subgraph
Conjectures
Yonatan Naamad
A Dissertation
Presented to the Faculty
of Princeton University
in Candidacy for the Degree
of Doctor of Philosophy
Recommended for Acceptance
by the Department of
Computer Science
Adviser: Moses Samson Charikar
November 2017
© Copyright by Yonatan Naamad, 2017.
All rights reserved.
Abstract
Karp’s seminal paper on NP-Completeness provided computer scientists with a
toolkit for showing computational hardness, conditioned on a complexity theoretic
conjecture, for a wide variety of optimization problems. However, the techniques
used in his paper say little about the hardness of approximating the optimal objec-
tive values of these problems. While researchers have since managed to find tight
bounds on the approximability of some of these NP-hard problems, many others
still have embarrassingly large gaps between their known upper and lower limits of
approximation.
To fill in these gaps, researchers have posed myriad new complexity theoretic
conjectures that entail stronger lower bounds at the expense of more demanding hy-
potheticals. One class of conjectures focuses on the approximability of the Densest
kSubgraph (DkS) problem. Namely, the conjectured hardness of both adversarial
and average-case variants of DkS has been shown to imply stronger hardness for
many problems for which good approximation algorithms have proven elusive.
In this dissertation, we study a variety of problems, each of which can have their
hardness tied back to that of Densest kSubgraph. We describe techniques for
proving stronger hardness results conditioned on both the known and the conjectured
hardness of DkS. We show how previous results on DkS imply stronger lower bounds
on Min-Rep-hard” problems, as well as a simple technique for converting many
proofs of Min-Rep-hardness into proofs of DkS-hardness. Using this and other tech-
niques, we show DkS hardness for problems such as Target Set Selection, Min-
imum Monotone Satisfying Assignment, Min-Cost Colored Cut, Net-
work Flow Interdiction, and Densest k-Common Subgraph, among others.
In the case of Target Set Selection and Minimum Monotone Satisfying
Assignment, we show how to obtain substantially stronger lower bounds by exploit-
ing the self-similar structure of instances conjectured to be average-case-hard. We
iii
introduce the Middlebox Routing-class of problems, and show show exact, ap-
proximation, and hardness results for its member problems. We provide an O(
n)
approximation algorithm for MkU, and discuss both approximation and hardness
results for Densest Common Subgraph variants.
iv
Acknowledgements
I am very grateful for all the people who have helped and inspired me during my time
as a student. First and foremost, I would like to thank my advisor, Moses Charikar,
for his guidance and support both in and out of the research context. Moses’s unique
ability to identify, simplify, and clearly explain the core components of algorithms,
theorems, and even entire research areas alike has had a profound impact on the way
I approach new problems.
In addition to Moses, a number of other researchers have taken on an unofficial
advising role throughout my academic career. In particular, I’d like to thank Sanmay
Das for inspiring me to go into research, Huy Nguyen for being an endless source of
sage advice, and Tony Wirth for taking the term “academic older brother” to heart,
as exemplified through his constant readiness to provide tips and guidance during and
between our many unusually-houred Skype calls. Additionally, I would like to thank
the rest of my PhD committee: Sanjeev Arora, Mark Braverman, and Matt Weinberg,
for their time and advice which proved critical in getting this thesis completed.
I’d like to thank everyone who made my time at Princeton, Stanford, and, for
a couple of weeks, at the University of Melbourne, enjoyable and pain-free. Mitra
Kelly, Melissa Lawson, Nicki Gotsis, Megan Harris, and Ruth Harris all helped take
care of so much of the red tape associated with the numerous transitions I’ve had
throughout graduate school. On a more personal level, I was fortunate to have the
friendship and moral support a number of exceptionally gifted researchers from both
Princeton and Stanford (many now at MIT), without which the completion of this
thesis would have been impossible. I would also like to thank everyone who helped
me succeed in my internship at Amazon, primarily my former and current manager,
Nina Mishra.
I’m deeply indebted to my parents for their endless affection and support. This
dissertation would not have been possible without those late night math riddles they
v
gave me a kid, their encouragement of my middle-school obsession with programming,
and their tireless work at instilling in me an unshakable love of mathematics. Finally,
I would like to thank Julia for being a source of strength over this last year, as well
as for dealing with me as I spent countless hours hunched away writing this thesis.
vi
To my parents.
vii
Contents
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
List of Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
1 Introduction 1
1.1 Core Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Densest kSubgraph . . . . . . . . . . . . . . . . . . . . . 3
1.1.2 Smallest kEdge Subgraph . . . . . . . . . . . . . . . . . 7
1.1.3 LabelCover-Min and Min-Rep . . . . . . . . . . . . . . . 7
1.2 Outline of Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Common notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2 Min-Rep and DkS 13
2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 DkS
C
-hardness of Min-Rep . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 From Min-Rep-hardness to DkS-hardness . . . . . . . . . . . . . . 16
3 Target Set Selection 19
3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2 TSS and Min-Rep . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2.1 Min–k–Rep hardness of TSS . . . . . . . . . . . . . . . . . . 23
viii
3.3 Hardness from Planted DkS Instances . . . . . . . . . . . . . . . . . 24
3.3.1 Hardness of TSS
4
. . . . . . . . . . . . . . . . . . . . . . . . 24
3.3.2 Recursive Planting . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3.3 Hardness of TSS
r
. . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3.4 Hardness of unbounded-round TSS . . . . . . . . . . . . . . . 31
3.4 Approximating TSS
O(1)
. . . . . . . . . . . . . . . . . . . . . . . . . 34
3.5 Dynamic Target Set Selection . . . . . . . . . . . . . . . . . . 37
4 Densest Common Subgraph 44
4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.2 Approximating DCS-MA . . . . . . . . . . . . . . . . . . . . . . . . 46
4.3 Lower bounds for DCS-MA . . . . . . . . . . . . . . . . . . . . . . . 49
4.3.1 Integrality Gap . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.3.2 Min-Rep hardness . . . . . . . . . . . . . . . . . . . . . . . . 51
4.3.3 Planted DkS hardness . . . . . . . . . . . . . . . . . . . . . . 53
4.4 Hardness of DCS-AM . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.5 Algorithms for DCS-AM with Small T . . . . . . . . . . . . . . . . . 58
5 Middlebox Routing and Placement 60
5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.2 Solving Middlebox Routing . . . . . . . . . . . . . . . . . . . . . 65
5.2.1 A Multiplicative Weights Approach . . . . . . . . . . . . . . . 68
5.3 Network Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.3.1 Algorithms for Middlebox Placement . . . . . . . . . . . . . . 75
5.3.2 Hardness of Middlebox Placement . . . . . . . . . . . . . . . 80
6 Other Problems 85
6.1 MMSA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.1.1 MMSA and Min-Rep . . . . . . . . . . . . . . . . . . . . . . 86
ix
6.1.2 Hardness from Planted DkS Instances . . . . . . . . . . . . . 87
6.2 Min-Query Q&A . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.3 Min-Cost Colored Cut . . . . . . . . . . . . . . . . . . . . . . . 92
6.3.1 CMMSA hardness of Min-Cost Colored Cut . . . . . . . 93
6.4 Min-k-Union . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.4.1 An O(
m)-approximation for Min-k-Union . . . . . . . . . . 96
6.5 Network Flow Interdiction . . . . . . . . . . . . . . . . . . . . 99
6.5.1 SkES hardness of Network Flow Interdiction . . . . . 100
6.5.2 Complete Inapproximability of MDNFI . . . . . . . . . . . . 100
Bibliography 104
x
List of Figures
1.1 Diagram of Planted DkS. . . . . . . . . . . . . . . . . . . . . . . . 4
3.1 A directed edge gadget for TSS. . . . . . . . . . . . . . . . . . . . . . 21
3.2 Sample of the first few layers of the TSS
r
to TSS reduction. . . . . . 32
3.3 Brute forcing Set Cover with DTSS. . . . . . . . . . . . . . . . . . 41
4.1 Integrality gap instance for DCS LP . . . . . . . . . . . . . . . . . . . 51
5.1 Sample instance of Middlebox Routing . . . . . . . . . . . . . . . 63
5.2 Hard instances of directed Min-MP and Max-MP . . . . . . . . . . 80
6.1 NIH table for computing heart health . . . . . . . . . . . . . . . . . . 89
6.2 An example DAG for computing heart health. . . . . . . . . . . . . . 89
xi
List of problems studied
1.1 Densest kSubgraph (DkS) . . . . . . . . . . . . . . . . . . . . . 3
1.2 Densest kSubgraph with Perfect Completeness (DkS
C
) . . 6
1.3 Smallest kEdge Subgraph (SkES) . . . . . . . . . . . . . . . . 7
1.4 Min-Rep . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1 Min–k–Rep . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.1 Target Set Selection (TSS) . . . . . . . . . . . . . . . . . . . . 19
3.2 r-round Target Set Selection (TSS
r
) . . . . . . . . . . . . . . 20
3.3 Dynamic Target Set Selection (DTSS) . . . . . . . . . . . . . 38
3.4 Dynamic Target Set Selection-Reachability (DTSS-R) . . 40
4.1 Densest Common Subgraph-MA (DCS-MA) . . . . . . . . . . . 45
4.2 Densest Common Subgraph-AM (DCS-AM) . . . . . . . . . . . 45
4.3 Densest k-Common Subgraph-MA (DkCS-MA) . . . . . . . . . 46
5.1 Middlebox Routing (MR) . . . . . . . . . . . . . . . . . . . . . . 62
5.2 Dependency Middlebox Routing (DMR) . . . . . . . . . . . . . 63
5.3 Min-Middlebox Placement (Min-MP) . . . . . . . . . . . . . . . 64
5.4 Max-Middlebox Placement (Max-MP) . . . . . . . . . . . . . . 64
5.5 Dependency Min-Middlebox Placement (Min-MP) . . . . . . 65
6.1 Circuit MMSA (CMMSA) . . . . . . . . . . . . . . . . . . . . . . 86
xii
6.2 Formula MMSA (FMMSA) . . . . . . . . . . . . . . . . . . . . . 86
6.3 Min-Query Q&A (MQQA) . . . . . . . . . . . . . . . . . . . . . . 91
6.4 Min-Cost Colored Cut (MCC) . . . . . . . . . . . . . . . . . . . 93
6.5 Min-Certificate Circuit Monotone Satisfying Assignment
(MCCMSA) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.6 Min-k-Union (MkU) . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.7 Network Flow Interdiction (NFI) . . . . . . . . . . . . . . . . 99
6.8 Max-Difference (MDNFI) . . . . . . . . . . . . . . . . . . . . . . 99
6.9 R-Interdiction Covering (RIC) . . . . . . . . . . . . . . . . . . 100
xiii
Chapter 1
Introduction
When studying optimization problems that are hard to solve exactly, it is natu-
ral to ask how well we can approximate the value of the optimal solution. While
some problems (such as Knapsack) admit efficient algorithms with arbitrarily good
approximation ratios [67], many others have been shown to have limits on their ap-
proximability. In 1996, Arora and Lund classified the known hardness results into
four groups [16]:
1. Class I problems - those showing constant-factor hardness, as in Vertex
Cover
2. Class II problems - those showing O(log n)-factor hardness, as in Set Cover
3. Class III problems - those showing O(2
log
1
n
)-factor hardness, as in Label
Cover
4. Class IV problems - those showing n
-factor hardness, as in Max Clique
Although later discoveries (such as the polylogarithmic hardness of Group
Steiner Tree [62] and the tight log
n hardness of Asymmetric k–Center [42])
have since refined our understanding of hardness results, the majority of results over
the last few years still classified problems into one of the above four classes.
1
In 2006, Khot showed Class I-type hardness for the Densest kSubgraph
(DkS) problem assuming that NP does not have algorithms that run in random-
ized subexponential time [72]. In 2016, first Braverman et. al. [25] and then Ma-
nurangsi [82] expanded on this result by showing new lower bounds on the problem
assuming NP does not have “fast” deterministic algorithms, culminating in a hard-
ness factor of n
1/ log log
O(1)
n
for DkS with “perfect completeness”, a factor that lies
somewhere between Class III and Class IV.
However, despite much attention, the best approximation factor known for DkS
is roughly O(n
1/4
) [23]. In fact, this is the best factor known even for a simple
average-case version of the problem, in which we are asked to detect dense random
components planted within sparse random graphs. The difficulty of improving upon
the known approximation algorithms has led to a number of conjectures of the hard-
ness of DkS [11, 13, 17, 61], each of which assumes at the very least that DkS should
have Class IV-like hardness. Consequently, DkS has turned into a central problem
through which conditional hardness is often shown [39, 40, 43, 75].
In this dissertation, we study a variety of problems, each of which can be related
back to Densest kSubgraph. Along the way, we describe techniques for showing
DkS-based hardness for various problems, either by direct reduction or otherwise.
1.1 Core Problems
We now proceed to introduce the key problems discussed in this thesis. For simplicity,
most problems will only be introduced as function problems. Their decision versions
follow naturally.
2
1.1.1 Densest k–Subgraph
The Densest kSubgraph (DkS) problem can be seen as a natural optimization
variant of the NP-hard k-Clique problem. First formally defined by Kortsarz and
Peleg [76], DkS asks us to find the set of k vertices in a given graph G whose induced
subgraph contains the largest number of edges. Formally, it can be states as follows.
Problem 1.1 (Densest kSubgraph (DkS)): Given a graph G = (V, E) and an
integer k, find a subset V
0
V of size k such that |E(V
0
)| the number of edges in
E induced by V
0
is maximized.
Although DkS can be directly applied to problems such as community detection
(see, e.g., [61]), much of the interest in DkS stems from its relationship to other
combinatorial graph problems. In particular, much of the difficulty in approximating
a wide variety of optimization problems can be tied back to that of approximating
DkS. Because DkS has thus far evaded all attempts at an approximation algorithm
with a sub-polynomial factor, a number of papers have shown hardness for problems
conditioned on the conjecture that DkS is hard to approximate to within some n
factor (e.g. [7, 34]).
Conjecture 1: For some > 0, there exists no probabilistic polynomial-time n
-
approximation algorithm for the Densest kSubgraph problem.
Planted DkS
All known algorithms for DkS fail for one particularly simple family of instances,
called the Random-in-Random or Random Planted Model. This family, first
introduced in [23], is defined as follows. A “base graph” G
1
is an Erd˝os-R´enyi random
graph with edge probability 1/
n. Thus, all vertices in G
1
have degree roughly
n.
Now, among an arbitrarily-selected
n-vertex subset (which will have an average
induced degree of roughly 1), we “plant” an Erd˝os-R´enyi random graph G
2
with edge
3
Figure 1.1: In the Planted DkS problem, two G(n, 1/
n) graphs are sampled
independently. Inside one of these, a subgraph on
n vertices is chosen arbitrarily
and its average degree is boosted from O(1) to n
1/4
, at random. Conjecture 2 claims
that it is hard to identify which of the two graphs was modified and which was left
untouched.
probability n
1/4
. In particular, we construct the random graph G
2
, and replace the
subgraph induced by our chosen vertices with G
2
. Thus, the average degree induced
on this subgraph is increased from 1 to n
1/4
.
As discussed in Bhaskara. et. al. [23], detecting this planted dense subgraph ap-
pears out of reach of current methods. Subsequently, a number of papers have since
used variants of this observation as a basis for conjecturing hardness for a number of
different flavors of DkS [11, 13, 17]. One version of this conjecture can be stated as
follows.
Conjecture 2: Let > 0 and let G
1
and G
2
be two graphs, each sampled as follows.
(i) G
1
is an Erd˝os-R´enyi random graph G(n, n
1/2
). (ii) G
2
is an Erd˝os-R`enyi random
graph G(n, n
1/2
) inside which an G(
n, n
1/4
) subgraph is planted, as described
above. For every
0
> 0, no probabilistic polynomial-time algorithm can succeed with
probability 1/2 +
0
in determining which of the two graphs is G
1
and which is G
2
.
Roughly speaking, Conjecture 2 says that it is hard to detect a G(
n, n
1/4
)
random subgraph planted with a G(n, n
1/2
) random graph.
4
Connection to Planted Clique
The definition of this Planted Dense Subgraph problem has its origins in the
notorious Planted Clique problem. Formally, Planted Clique asks us to dis-
tinguish between graphs chosen uniformly from one of two graph ensembles:
The collection of all graphs (i.e. sampled from the Erd˝os-R´enyi distribution
G(n, 1/2)).
The collection of all graphs with a k-Clique (i.e. a G(n, 1/2) graph within which
a k-clique is planted arbitrarily).
When k > c
n log n for an appropriate constant c, clique vertices will (with
overwhelming probability) reveal themselves simply by virtue of having unusually high
degrees [77]. For k = Ω(
n), spectral, semidefinite programming, and combinatorial
approaches all manage to detect the planted component in polynomial time [4, 52, 53].
For k = o(
n), the problem remains open, and lies at the center of myriad conditional
hardness results (e.g. [17, 22, 26, 63]).
Despite the difficulty of finding a polynomial-time algorithm for k =
n, the
problem with k as small as (2 + ) log
2
n can be solved in n
O(log n)
time simply by
enumerating all subgraphs of size (2 + ) log
2
n and checking if one of them is a clique.
As G(n, 1/2) graphs are extremely unlikely to contain cliques of size (2 + ) log n [5],
the existence of a clique at least as large certifies that the graph is overwhelmingly
likely to contain a planted component. For the search problem, a simple greedy
algorithm can uncover the rest of the planted clique. As the fastest known algorithms
for Planted DkS run in time 2
n
Θ()
, reductions from Planted DkS have the added
advantage that, under the appropriate set of assumptions, they still show hardness
even for quasipolynomial-time algorithms.
5
Densest k–Subgraph with Perfect Completeness
Of particular importance is the special case of (worst-case) DkS in which Yes in-
stances are promised to contain a k-Clique. Such perfectly complete instances of DkS
form the basis of almost all hardness results for the more general DkS problem, and
will receive special attention in Section 2.2. The problem can be formalized as follows.
Problem 1.2 (Densest kSubgraph with Perfect Completeness (DkS
C
)):
Given a graph G = (V, E) and an integer k, distinguish between (i) the Yes case in
which G contains a k-Clique and (ii) the No case in which G contains no sufficiently
dense k-subgraph. In particular, instances to the DkS
C
[1, β] problem are promised to
either (i) contain a k-Clique or (ii) contain no k-subgraph with at least β ·
n
2
edges.
In [81], it is shown that DkS
C
is hard to approximate to within an n
1/polylog n
factor
assuming the Exponential Time Hypothesis (ETH). Roughly, ETH (resp. Ran-
domized ETH) is the conjecture that SAT has no deterministic (resp. randomized)
2
o(n)
-time algorithms.
1
Additionally, [81] provides an n
o(1)
-factor hardness assuming
Gap-ETH, the hypothesis that even gap instances of SAT (such as those generated
by the PCP theorem) don’t have 2
o(n)
-time algorithms [50]. Note that the latter
result is in some sense tight: implicit in [23] is an algorithm solving DkS
C
[1, n
] in
time n
O(1/)
.
We use Conjectures 0a and 0b to denote that randomized ETH and randomized
Gap-ETH each hold, respectively. Although appeals to these conjectures will always
appear in pairs, the hardness factors attained will correspond to which of Conjec-
ture 0a or 0b is assumed.
Conjecture 0a: No randomized algorithm can exactly solve SAT in time 2
o(n)
(i.e.
randomized ETH holds).
1
In its more precise form, ETH also rules out the possibility of a sequence of algorithms each
with running time 2
i
n
with inf
i
i
= 0.
6
Conjecture 0b: For some > 0, no randomized algorithm can solve Gap-SAT[1 , 1]
in time 2
o(n)
(i.e. randomized Gap-ETH holds).
1.1.2 Smallest k–Edge Subgraph
The Smallest kEdge Subgraph (SkES) problem can be seen as the natural
minimization complement of DkS. While the problem statement of DkS asks us to
maximize the number of edges induced by a prescribed number of vertices, SkES
asks us to minimize the number of vertices needed to induce a prescribed number of
edges. This problem, first formalized by [39], can be stated as follows.
Problem 1.3 (Smallest kEdge Subgraph (SkES)): Given a graph G = (V, E)
and an integer k, find the smallest subset V
0
V such that |E(V
0
)| the number
of edges in E induced by V
0
is at least k.
As pointed out in [39], the proof used in [84] to show the hardness of approximating
Node-Weighted Steiner Network also implies that if DkS is α(n)-hard to
approximate, then SkES is O(
p
α(n))-hard.
1.1.3 LabelCover-Min and Min-Rep
LabelCover was first introduced by [12] to show hardness for the various Nearest
Lattice Vector and related problems. The problem is formalized as follows. As
input, we are provided a regular, bipartite graph G = ((V
1
, V
2
), E), a set Σ of labels,
and, for each edge e, and a partial function Π
e
: Σ Σ. A labeling is an assignment
of one or more labels to each vertex, and an edge (v
1
, v
2
) (with v
1
V
1
and v
2
V
2
) is
covered if for every label a
2
assigned to v
2
, v
1
contains a label a
1
such that Π
e
(a
1
) = a
2
.
LabelCover-Min then asks for the labeling with the smallest number of distinct
labels that covers every edge in the bipartite graph. LabelCover-Max, on the
7
other hand, asks us to assign exactly one label to each vertex so to maximize the
total number of covered edges.
The Min-Rep problem, first introduced in [74], can be viewed as a variant of
LabelCover-Min that has shown itself to be a simpler starting point for many
hardness results, e.g. [24, 33, 37, 47]. One formalization of the problem is as follows.
The first part of the input consists of a bipartite graph with vertex set V and edge
set E. Within each bipartition, vertices are further partitioned into equal-sized sub-
sets, which are called supervertices. The set of all supervertices is denoted
ˆ
V . We
then establish an equivalence relation between edges based on whether or not their
endpoints fall in exactly the same pair of supervertices. The non-empty equivalence
classes are called the graph’s superedges, the set whereof is denoted
ˆ
E. Thus, an
instance of Min-Rep can be defined as a 4-tuple (V, E,
ˆ
V ,
ˆ
E), defining the vertices,
edges, supervertices, and superedges of the instance.
2
For a subset V
0
V of vertices, we say that V
0
covers a superedge ˆe
ˆ
E if there
exists an edge e ˆe such that both endpoints of e are in V
0
. Finally, the set of
feasible solutions for a given Min-Rep instance are exactly those vertex subsets that
cover every superedge in
ˆ
E. As the name Min-Rep suggests, the objective is to find
a feasible solution V
0
V of minimal cardinality.
Problem 1.4 (Min-Rep): Given a 4-tuple (V, E,
ˆ
V ,
ˆ
E) representing a bipartite
graph G = (V, E) with supervertex set
ˆ
V and superedge set
ˆ
E, find a minimal-
cardinality subset V
0
V such that for each superedge ˆe
ˆ
E, both endpoints of
some e
ˆ
E are included in V
0
.
As shown in [74], Min-Rep has no 2
log
1
n
approximation algorithm unless NP
DTIME[n
polylogn
]. LabelCover-Max has since been shown to admit a (slightly
stronger) hardness of 2
log
11/ log log
1/2
n
under similar complexity conjectures [49]. To
2
Although the set of superedges is already implied by V , E, and
ˆ
V , in this thesis we treat it as
part of the input for convenience.
8
see how these factors relate to Manurangsi’s result on the hardness of Densest
kSubgraph (assuming ETH), note that
log
c
1
n = 2
c
1
log log n
2
log
1
n
(1.1)
2
log
11/ log log
1/2
n
n
(1.2)
= 2
log n/ log
1/ log log
1/2
n
n
= n
1/ log
1/ log log
1/2
n
n
n
1/(2
log log n/ log log
1/2
n
)
= n
1/2
log log
1/2+
n
n
1/2
c
2
log log log n
= n
1/log log
c
2
n
(1.3)
n
, (1.4)
where c
1
, c
2
, and are arbitrary positive constants, and f(n) g(n) iff f = o(g). In
particular, Equation (1.1) establishes super-polylogarithmic hardness for Min-Rep,
(1.2) establishes that the aforementioned hardness of LabelCover-Max exceeds
that of Min-Rep, (1.3) establishes that the known hardness of DkS exceeds that
of LabelCover-Max, and (1.4) establishes that the known hardness of DkS is
sub-polynomial.
1.2 Outline of Results
Chapter 2 studies the interaction between Min-Rep and DkS
C
. We show that Min-
Rep is at least as hard to approximate as DkS
C
. This establishes stronger lower
bounds for Min-Rep assuming either the Exponential Time Hypothesis (ETH) or its
strengthening to Gap-ETH. This material is based on work with Moses Charikar and
Anthony Wirth [31].
Chapter 3 centers on the Target Set Selection (TSS) problem, in which we
are tasked with finding a small set of nodes that can kick-start an influence spreading
9
process that eventually reaches the whole graph. We show that while both TSS and
its finite-round variant are hard to approximate, the finite-round version admits a
nontrivial approximation algorithm. Along the way, we introduce a recursive ver-
sion of the random-in-random model of the Densest kSubgraph problem (see
Section 1.1.1), and show that it remains as hard as the un-recursed version of the
problem. We then study a dynamic version of TSS, in which previously influenced
vertices can become de-influenced if a sufficient number of their in-neighbors them-
selves have become de-influenced. As we show, the dynamics of this problem allow
for exponential-period orbits, and deducing even simple properties of the long-term
behavior is PSPACE-complete. This material is based on work with Moses Charikar
and Anthony Wirth [30].
Chapter 4 centers on the Densest Common Subgraph problem, in which we
search a sequence of graphs for subgraphs that consistently remain dense. We study
two versions of this problem: the “max-min-average” variant and the “max-average-
min” variant. We show hardness of approximation results for both problems, and
provide two algorithms: an
˜
O(
n)-factor poly-time approximation algorithm for the
former and a (1 + )-approximate FPT-time algorithm for the latter (where the pa-
rameter is the number of graphs). This material is based on joint work with Moses
Charikar and Jimmy Wu [32].
In Chapter 5, we study the Middlebox Routing problem, which is a practice-
inspired variant of Multi-Commodity Flow in which packets also need to be
processed by the vertices in the graph. While we show that the base problem can
be solved using either a polynomial-sized linear program or with a more efficient
multiplicative weights algorithm, we also prove that the underlying network design
problem is NP-hard. We study five variants of the network design problem, and
provide both approximation and hardness results. This material is based on joint
work with Moses Charikar, Jennifer Rexford, and X. Kelvin Zou [29].
10
In Chapter 6, we study a number of problems, each of which can be shown to be
DkS-hard. In Section 6.1, we show hardness for Minimum Monotone Satisfying
Assignment from two different conjectures on the hardness of DkS
C
. In Sections 6.2
and 6.3, we show DkS-based hardness for both Min-Cost Colored Cut and
Min-Query Q&A. In Section 6.4, we show that the Min-k-Union problem, which
generalizes SkES and is thus DkS-hard, admits an O(
m)-approximation algorithm.
Finally, in Section 6.5, we show DkS-hardness for Network Flow Interdiction
as well as much stronger hardness for a natural variant of the problem.
The material in Sections 6.1 and 6.3 is based on joint work with Moses Charikar
and Anthony Wirth [30, 31], that in Sections 6.4 and 6.5 is based on joint work with
Moses Charikar and Nicole Wein, and that in Section 6.2 is based on joint work with
Nina Mishra [83].
1.3 Common notation
For a graph G, we use G[V
0
] to denote the subgraph of G induced by V
0
, and E[G]
to denote the set of edges within G. We will use n to denote the number of vertices
in a graph, and m to denote the number of edges.
Most problems in this paper will be introduced as function problems; their corre-
sponding decision version will follow naturally. Depending on context, we use OPT
either to refer to an optimal feasible solution or to its objective value. Similarly, we
use ALG to denote either the solution found by a candidate algorithm or the objective
score of said solution. When we wish to explicitly refer to the appropriate objective
value, we use the notation cost(OPT) and cost(ALG). When describing approxima-
tion and hardness ratios, we use values both greater-than and less-than 1. The correct
interpretation of these values will be obvious from context. For two problems A and
B, we say that A
p
B iff every poly-time f(n)-approximation algorithm for B can
11
be used in a black-box manner to attain a polynomial-time poly(f(n))-approximation
algorithm for A. In such cases, we say that A is polynomially-related to B.
When the parameter might not be clear from context, use the notation o
r
(1) to
denote a value that tends to 0 as r tends to . The variable will always be used to
denote a small positive constant. We use N
0
to denote the set of nonnegative integers
and Z
+
to denote the set of strictly positive integers.
12
Chapter 2
Min-Rep and DkS
2.1 Overview
In this section, we study how Densest kSubgraph can be used to show stronger
hardness for Min-Rep-hard problems. We describe this connection in two parts.
First, we show that current results on the hardness of DkS translate into hardness
for Min-Rep. This exploits the fact that the current best hardness results for DkS
follow from the hardness of DkS
C
. Next, though we fall short of showing Min-Rep
hardness for (non-complete) DkS, we show that many Min-Rep-hardness proofs can
be converted into proofs of (effectively) DkS-hardness. As a key intermediate step in
our technique, we introduce the Min–k–Rep problem a natural generalization of
Min-Rep that also happens to be DkS-hard. Thus, by showing that many Min-Rep-
hardness proofs can be translated into proofs of Min–k–Rep-hardness, we establish
that a large number of Min-Rep-hard problems are also DkS-hard.
13
2.2 DkS
C
-hardness of Min-Rep
Before showing that DkS
C
p
Min-Rep, we first present a simple lemma proving
that graphs without dense k-subgraphs cannot have dense r-subgraphs when r is not
much larger than k.
Lemma 2.1: Let G be a graph in which every k-vertex subgraph has at most s
induced edges. Every subgraph of G containing at least t s induced edges must
have more than (k 1)
p
t/s vertices.
Proof. Let R be an r-vertex subgraph of G containing at least t induced edges.
Clearly, r k. Now let K be a k-vertex induced subgraph of R chosen by selecting
a uniformly random sample of k vertices from R. For every edge (u, v) R, the
probability that it also appears in K is the probability that both of its endpoints
survived the sampling process. Namely, Pr[(u, v) K] =
k
r
·
k1
r1
>
(k1)
2
r
2
. By
linearity of expectations, E[# edges in K] > t(k 1)
2
/r
2
. However, since a k-vertex
subgraph of R has at most s induced edges, s E[# edges in K]. Combining the
two inequalities, s > t(k 1)
2
/r
2
, and, solving for r, we get r > (k 1)
p
t/s.
With this lemma in hand, we can proceed to prove that Min-Rep’s hardness is
polynomially related to that of DkS
C
.
Theorem 2.2: If there is an α-approximation algorithm for Min-Rep (with α > 1),
then there exists a randomized algorithm for DkS
C
[1, β] whenever β < 1/(9α
2
). In
particular, DkS
C
p
Min-Rep.
1
Proof. To simplify the presentation of the proof, we make the benign assumption that
the number of vertices is evenly divisible by 6 log n. Similar analysis carries through
when this is not the case. Given an instance (G(V, E), k) of DkS
C
[1, β], we partition
1
To keep the analysis simple, no attempt is made to lower the factor of 9 in the bound on β.
14
V uniformly at random into k
0
= k/(3 log n) groups so that each has (3n/k) log n
vertices.
We now construct a Min-Rep instance as follows. The vertex set is exactly
V . The first k
0
/2 groups in the above partitioning form the supervertices of the
left-partition of our Min-Rep instance (
ˆ
A). The remaining k
0
/2 groups form the
supervertices of the right partition (
ˆ
B). The edge set is exactly those edges which
cross the bipartition, all other edges are discarded. The set of superedges is implied
by our choice of the other three sets.
If G is a Yes instance of DkS
C
[1, β], then it contains a k-clique. The partitioning
scheme above ensures that, with high probability, each supervertex contains at least 1
vertex of this clique. Thus, with high probability, every (A
i
, B
j
) pair induces at least
one superedge. Consequently, the optimal objective value is exactly k
0
with high
probability, as picking one vertex in each partition is both necessary and sufficient to
cover all superedges
k
0
2
superedges.
However, if G came from a No instance, then one of two things can happen.
Either (i) the constructed Min-Rep instance is missing some superedge (that is, some
(A
i
, B
j
) supervertex pair does not induce a superedge); or (ii) the constructed Min-
Rep instance has all k
02
/4 potential superedges. In the first case, we can immediately
identify that with high probability the input graph came from a No instance of DkS
C
and return accordingly. Thus, we focus our attention on the optimal objective values
of No instances in which all superedges are present.
Such an instance can be identified as a No instance when its optimal objective
value is greater than αk
0
: applying the factor-α approximation algorithm to Min-Rep
would reveal that the optimal objective value exceeds k
0
, hence, with high probability,
that of a Yes instance.
15
To figure out which values of β necessitate this αk
0
lower bound on OPT, we appeal
to Lemma 2.1. A feasible solution to the problem must cover all k
02
/4 superedges,
and thus the subgraph induced by the solution must contain at least k
02
/4 edges.
However, the promise problem we are solving, DkS
C
[1, β], guarantees that no k-
vertex subgraph, where k = 3k
0
log n, contains more than β
k
2
βk
2
= 9βk
02
log
2
n
edges. Applying Lemma 2.1 with t = k
02
/4, s = 9βk
02
log
2
n, and k = 3k
0
log n, we
see that a subgraph inducing k
02
/4 edges (and, in particular, the smallest one) has at
least
(3k
0
log n 1)
s
k
02
/4
9βk
02
log
2
n
=
3k
0
log n 1
6
β log n
>
k
0
3
β
vertices. Thus, we have the sought αk
0
lower bound on the optimal objective value
whenever α < 1/(3
β) or, equivalently, whenever β < 1/(9α
2
).
Combining Theorem 2.2 with the result of [81], we get that for some con-
stant c, Conjecture 0a implies that it is hard to approximate to within a factor
of O(n
1/ log log
c
n
). Additionally, Conjecture 0b implies that Min-Rep is hard to
approximate to within a factor of n
o(1)
.
Theorem 2.3: For some constant c, Min-Rep has no O(n
1/ log log
c
n
)-approximation
algorithm (resp. no n
o(1)
-approximation algorithm) unless randomized ETH (resp.
randomized Gap-ETH) is false.
2.3 From Min-Rep-hardness to DkS-hardness
To show DkS hardness for Min-Rep-hard problems, we introduce a natural general-
ization of Min-Rep, which we call Min–k–Rep. This problem is just like Min-Rep,
with the twist that the set of feasible solutions are those that cover at least k su-
peredges (instead of all of them).
16
Problem 2.1 (Min–k–Rep): Given and integer k and a bipartite graph with vertex
set V , edge set E, supervertex set
ˆ
V , and superedge set
ˆ
E, find the smallest subset
of V that covers at least k superedges.
Thus, the Min-Rep problem is simply Min–k–Rep with k equal the number of
superedges in G; Min–k–Rep is hence Min-Rep-hard. As it turns out, the hardness
of Min–k–Rep is also polynomially-related to that of DkS.
Theorem 2.4: DkS
p
Min–k–Rep
Proof. As discussed in Section 1.1.2, DkS
p
SkES. It thus suffices to show that
SkES
p
Min–k–Rep. We show this via a direct reduction: given an instance
(G(V, E); k) of SkES, we choose a random bipartition of V into A and B by plac-
ing each vertex in A independently with probability 1/2, and in B otherwise. The
Min–k–Rep instance returned has vertex set V , an edge set containing all edges
crossing the bipartition, and one singleton supervertex for every vertex in V (thus,
we have exactly |V | supervertices).
For a subset S V , let v
1
(S) be the objective value of S for the given SkES
instance, and v
2
(S) be the objective value of S under the constructed Min–k–Rep
instance. Clearly, v
2
(S) v
1
(S), as choosing both endpoints of an edge coincides with
covering a unique superedge. Additionally, by the random construction, E[v
2
(S)] =
v
1
(S)/2, and, by a standard application of Chernoff bounds, with high probability,
v
2
(S) v
1
(S)/3. Thus, the two problems differ only by a constant factor in their
objective scores for every set S, meaning that the two problems are polynomially
related.
With this theorem in hand, the goal is now to transform proofs of Min-Rep
hardness into proofs of Min–k–Rep hardness. In some cases, Min–k–Rep-hardness
follows immediately: in [9], the proof of Min-Rep-hardness for k-Steiner For-
est actually shows Min–k–Rep-hardness for the problem. In other instances, the
17
conversion requires more work. In the remainder of this thesis (and particularly in
Chapter 6), we show how to establish Min–k–Rep-hardness for a number of Min-
Rep-hard problems.
18
Chapter 3
Target Set Selection
3.1 Overview
In the early 2000s, the growing interest in direct marketing inspired Domingos and
Richardson to study the network value of customers, defined as the sum of the profit
directly generated by marketing to the customer and any additional profit that the
customer may generate through word-of-mouth marketing [51, 88]. While their work
focuses on the Bayesian setting, subsequent work by Kempe, Kleinberg, and Tardos
introduced the following deterministic multi-stage influence problem [71].
As input, we are given an (un)directed graph G = (V, E), and a non-negative
threshold function τ : V N
0
. A vertex can be either active (infected) or inactive
(uninfected). For an initial seed set of active vertices, influence proceeds in a sequence
of rounds. A previously inactive vertex v becomes active in a particular round if in
the previous round at least τ (v) (in-)neighbors of v were active. Once a vertex is
active, it remains active in all subsequent rounds. Since the process (essentially)
stops if there is no new active vertex in some round, there are at most n rounds. A
seed set is said to be contagious if it induces the entire graph to eventually become
active. The TSS problem asks for the smallest contagious seed set of the graph.
19
Problem 3.1 (Target Set Selection (TSS)): Given a graph G = (V, E) and a
nonnegative threshold function τ : V N
0
, find the smallest subset S V such that
S is contagious.
In 2009, Chen [33] showed that TSS is Min-Rep-hard to approximate, and that
the problem is maximally hard even with τ(·) 2 identically. While algorithms
have been found for special instances of TSS (such as when the graph has low-
treewidth [20] or is an expander [46]), no approximation algorithm better than the
trivial O(n) approximation has yet been found for worst-case instances.
Although the activation procedure underlying TSS was originally defined to pro-
ceed for an infinite number of rounds, Cicalese et. al. initiated the study of a latency-
bounded variant of Target Set Selection [44, 45]. In addition to a (di)graph and
threshold function, latency-bounded instances are also parameterized by a positive
integer r. The r-round Target Set Selection problem, denoted TSS
r
, is then
to find the smallest r-contagious set of vertices; i.e. the smallest set that would acti-
vate the entire graph within r rounds. We will discuss this variant in more detail in
Sections 3.3 and 3.4.
Problem 3.2 (r-round Target Set Selection (TSS
r
)): Given a graph G =
(V, E), a nonnegative threshold function τ : V N
0
, and a positive integer r, find
the smallest subset S V such that S is r-contagious.
The vast majority of this chapter is based on joint work with Moses Charikar and
Anthony Wirth.
3.2 TSS and Min-Rep
We first give a simplified version of Chen’s Min-Rep-hardness proof for Target Set
Selection.
20
1
1
1
2
u
v
Figure 3.1: A directed edge gadget from u to v. The threshold of each vertex is
indicated by the number inside its depiction.
Theorem 3.1 (from [33]): There exists an approximation preserving reduction (up
to constant factors) from Min-Rep to TSS, even when all edges are bidirected.
A key tool in this reduction is the directed edge gadget (Figure 3.1), which is
implicit in Chen’s work and explicitly introduced in [19, 30]. This gadget is meant
to simulate a directed edge, in that u being active would (via the gadget) give v one
additional active neighbor (albeit after 3 rounds), while v being active does not (by
means of the gadget) directly affect u. Additionally, these gadgets have the property
that they do not allow algorithms to benefit by “cheating” and including a gadget
vertex in their choice of contagious set: any solution containing one of the gadget
vertices is (weakly) dominated by a solution activating its in-vertex u instead. Thus,
we can assume without loss of generality that solutions of interest do not contain
gadget vertices. Thus, for the remainder of the document, we can allow w.l.o.g. for
TSS instances to contain directed edges
1
.
With such a gadget in hand, the reduction is pretty direct. Given a Min-Rep
instance (V, E,
ˆ
V ,
ˆ
E), we create a TSS instance with |V | + |E| + 2|
ˆ
E| vertices.
Vertices There are four layers of vertices, called the vertex layer, edge layer,
superedge layer, and completeness layer. The vertex layer A has one threshold-|
ˆ
E|
vertex a
i
for each v
i
V , the edge layer B has one threshold-2 vertex b
j
for each
1
Although this gadget reduction may alter the number of rounds, the round-based lower bounds
we derive in the remainder of this chapter stem from entirely bi-directed instances and are unaltered.
21
e
i
E, the superedge layer C has one threshold-1 vertex c
k
for each ˆe
k
ˆ
E, and the
completeness layer D has |
ˆ
E| threshold-|
ˆ
E| vertices that behave indistinguishably.
Edges The four layers are connected in a cyclical manner. Between A and B,
we add a directed edge gadget from vertex a
i
to vertex b
j
whenever v
i
is an endpoint
of e
j
. Between the B and C, we add a directed edge gadget from b
j
to c
k
whenever
e
j
ˆe
k
. The following two sets of edges, connecting C to D and D to A, are both
complete (i.e. all possible connections exist, and the induced graphs are both directed
bicliques).
This completes the construction. Before we check the correctness of this construc-
tion, we first prove a helpful lemma about the structure of optimal solutions.
Lemma 3.2 ([33]): For instances formed by the above construction, any contagious
set S can be transformed (in polynomial time) into a contagious set S
0
of size at most
2|S| contained entirely within the vertex layer.
Proof of Lemma 3.2. Let S be a contagious set. If D S, then the set S
0
= S C \D
is also contagious (as it activates D in one round), and is no larger than S. Otherwise,
if S contains only a proper subset D
0
of vertices from D, the activation of these vertices
contributes nothing to the spreading process (as all of D needs to be active to activate
anything in A, and the events leading to the activation of D\D
0
would have activated
D
0
anyway), and so S \D
0
is also contagious. Thus, solutions can be made to contain
only vertices from A B C with zero increase in objective value.
Now suppose the solution S contains only vertices from A B C. Because all
vertices in C have a threshold of 1, the inclusion of any vertex from C in S is weakly
dominated by the choice of any of its in-neighbors from B. Thus, solutions can be
made to contain only vertices from A B. Similarly, because vertices in B only
have threshold 2, removing each of them from S and instead including their two in-
neighbors in A can only increase the size of S by a factor of 2. Thus, we can transform
22
any solution into one containing only vertices from A, up to a multiplicative blowup
of at most 2 in total cost.
The proof of the main theorem is now rather simple.
Proof of Theorem 3.1. Given an instance M of Min-Rep, we construct a TSS in-
stance T according to the above construction. A feasible solution to the Min-Rep
instance can be converted into a solution to the constructed TSS instance by select-
ing the corresponding vertices from the vertex layer. This forms a contagious set
because the activation of these vertices would initially activate exactly the vertices
corresponding to the edges they cover. These, in turn, activate vertices correspond-
ing to the superedges covered by the solution, which (by feasibility of the Min-Rep
solution) includes all of the superedge layer. The activation of the entire layer then
activates the entire completeness layer, which subsequently activates the vertex and
edge layers. Thus, cost(OPT
T
) cost(OPT
M
).
On the other hand, any contagious subset of the vertex layer in T must correspond
to a feasible solution in M: the spreading behavior corresponding to an infeasible
solution to M must fail to activate at least one vertex from the superedge layer
within the first 2 rounds, at which point the spreading process comes to an abrupt
halt. Because Lemma 3.2 guarantees that any feasible solution to T can be made to
contain only vertices from the vertex layer up to a factor of 2 in objective cost, we
conclude that cost(OPT
M
) 2 · cost(OPT
T
). The result follows.
3.2.1 Min–k–Rep hardness of TSS
The above reduction neatly falls in the framework outlined in Section 2.3. To get
Min–k–Rep (and thus DkS) hardness for TSS, we simply reduce the threshold of
every vertex in the completeness layer down to k. Thus, only k superedges need to
become active for the completeness layer to activate (and in turn activate all other
23
vertices). The remainder of Theorem 3.1 carries through as before. Thus, we can
conclude the following. Min–k–Rep
Theorem 3.3: DkS
p
Min–k–Rep
p
TSS.
3.3 Hardness from Planted DkS Instances
Notation
In this section, we will refer to a graph-theoretic operation called planting. For two
graphs G and H (the latter of order n
H
), one can plant H in G by replacing some
n
H
-vertex subset of G with a copy of H. For our purposes, the exact choice of
which subset of G to replace does will not matter. Thus, we can use the (otherwise-
ambiguous) notation G C H to denote the planting of H in G. When we chain this
notation (e.g. G
1
C G
2
C G
3
), the C
00
operator is to be read as a right-associative
operator. In the previous example, this means that G
3
is an induced subgraph of
G
2
C G
3
, which is in turn an induced subgraph of G
1
C G
2
C G
3
.
3.3.1 Hardness of TSS
4
Recall that in Section 1.1.1, we introduced the Planted Dense Subgraph Con-
jecture (Conjecture 2), which states that no probabilistic polynomial time algorithm
can, with probability at least
1
/2 + , distinguish between a vanilla G(n, 1/
n) graph
and a G(n, 1/
n) graph within which a G(
n, n
1/4
) is planted. To convey the
intuition underlying Section 3.3, we start out by showing how such instances can be
used directly to prove hardness for TSS
4
.
Theorem 3.4 (warmup): Assuming Conjecture 2, it is hard to approximate TSS
4
to within a factor of n
1/4
.
24
The approach, roughly, will be as follows. Suppose that we are given a graph from
one of the two distributions described above. For the duration of this subsection and
Section 3.3.3, we set the threshold function τ to identically equal some large constant
τ
0
. We show that for some
0
= o
τ
0
(1), G(n, 1/
n) graphs have (with high probability)
no 4-contagious set of size n
1/2
0
. On the other hand, we show that (w.h.p.) every
set of
n vertices is 2-contagious. Thus, when there is a dense planted component,
it suffices to activate the component in the first 2 rounds in order to activate the
whole graph in 4 rounds. We then take advantage of the extra density in the planted
component to show that it (w.h.p.) contains a seed set of size n
1/4
, establishing the
sought gap. Formally, we show the following results
Lemma 3.5 (Lower bound on OPT): For all positive integers τ
0
and r both at least
2, the TSS
r
instance with graph G(n, 1/
n) and threshold function identically equal
to τ
0
has |OPT| =
˜
Ω(n
β
), where β =
τ
0
2
2τ
0
2
.
Lemma 3.6 (Upper bound on OPT): For all positive integers τ
0
, and for any
(0, 1/(3(τ
0
+ 1)
2
)), the TSS
4
instance with graph G(n, 1/
n) C G(
n, n
1/4
) and
threshold function identically equal to τ
0
has |OPT| = O(
4
n).
Both of these lemmas can be proven by an iterated application of the Chernoff
bound. These two lemmas are special cases of Lemmas 3.8 and 3.9, respectively, both
of which will be proven in full generality in Section 3.3.3.
3.3.2 Recursive Planting
While the approach used in Section 3.3.1 can be used to show n
1/4
hardness for
TSS
4
(and, indeed, for TSS
r
with r 4), the bounds given by Lemmas 3.5 and 3.6
are tight enough to show that a direct application of Conjecture 2 cannot attain any
substantially stronger hardness result. To get around this barrier, we introduce the
concept of recursive planting.
25
The key idea here is as follows. Conjecture 2 says that it is hard to detect a
G(n
1/2
, n
1/4
) graph planted within G(n, n
1/2
). To assist the exposition of the
intuition, we sacrifice a bit of accuracy and pretend that can be set to 0. Thus, for
n
0
=
n, the planted graph is a G(n
0
, 1/
n
0
) graph – effectively a scaled down version
of the G(n, 1/
n) graph in which we are planting it. Thus, it’s natural to wonder if we
can set n
00
=
n
0
=
4
n and plant a G(n
00
, 1/
n
00
) within the planted G(n
0
, 1/
n
0
)
component, all without making the detection problem any easier. Fortunately, it
turns out that we can.
At a high level, the proof starts out by noticing that Conjecture 2 says (up to
the we are presently ignoring) that a graph chosen from G(n
0
, 1/
n
0
) is indis-
tinguishable from one chosen from G(n
0
, 1/
n
0
) C G(n
00
, 1/
n
00
) when n
00
=
n
0
.
This implies that the two distributions must remain indistinguishable after any
polynomial time operation, including that of planting the sampled graph in
G(n, 1/
n) where n = n
02
. Thus, G(n, 1/
n) C G(n
0
, 1/
n
0
) is indistinguish-
able from G(n, 1/
n) C G(n
0
, 1/
n
0
) C G(n
00
, 1/
n
00
). Because the first distribution
is itself indistinguishable from just G(n, 1/
n), we conclude that the twice-planted
distribution is indistinguishable from the entirely-unplanted distribution. Recursing
this argument, we can get indistinguishability for any constant number of rounds of
planting.
Before we prove this result formally, we introduce some helpful notation. We
will be indexing tuples in two different ways. All tuples in this chapter are written
thus ~a, and are indexed in two ways. If
~
ξ = (ξ
1
, ξ
2
, ··· , ξ
m
) is a tuple and 1
i < j m, we let
~
ξ
j
i
denote the contiguous sub-tuple (ξ
i
, ξ
i+1
, ··· , ξ
j
). Sometimes,
we are interested in the suffix of the tuple, so we can index the final elements thus:
~
ξ
(1)
(i)
= (x
m(i1)
, x
m(i2)
, . . . , x
m
). We define an instance of the PDS(n, k, α, β)
problem to be a graph that is with probability 1/2 sampled from G(n, n
α1
) and
with probability 1/2 sampled from G(n, n
α1
) C G(k, k
β1
). An algorithm is said to
26
solve the problem if it can guess from which distribution the graph was sampled with
probability at least 1/2 + . Conjecture 2 is exactly the conjecture no PPT algorithm
solve PDS(n, k, α, β) when k =
n and β < α.
The recursively planted version of PDS is then defined as follows. For every pair of
length-m tuples ~n = (n
1
, n
2
, . . . , n
m
) (the subgraph orders), and ~α = (α
1
, α
2
, . . . , α
m
)
(the subgraph log-densities), with each n
i
Z
+
and n
i
> n
i+1
, and each α
i
(0, 1),
we define the PDS
m
(~n, ~α) distribution via the recurrence
PDS
m
(~n, ~α) =
G(n
1
, n
1
α
1
1
) C PDS
m1
(~n
(1)
(m+1)
, ~α
(1)
(m+1)
) if m > 1;
G(n
1
, n
1
α
1
1
), otherwise.
We also define the (eponymous) PDS
m
(~n, ~α) problem: distinguish with ε advan-
tage graphs drawn from the PDS
m
(~n, ~α) distribution from those drawn simply from
PDS
1
(n
1
, α
1
) = G(n
1
, n
α
1
1
).
We now show that for the appropriate setting of parameters, PDS
m
(~n, ~α) is at
least as hard as PDS
2
(~n
2
1
, ~α
2
1
) (i.e., the version of the problem with just one round of
planting).
Lemma 3.7 (Recursive planting): Assuming Conjecture 2, if for every i < m,
α
i
> α
i+1
and n
i+1
=
n
i
, then for every m 2, no PPT algorithm can solve
the PDS
m
(~n, ~α) problem with ε advantage.
Proof (by contradiction). Consider the minimum m for which some algorithm A
solves PDS
m
(~n, ~α), with ε advantage. We show how to construct an algorithm that
attains an ε
0
> 0 advantage for the problem PDS
m1
(~n
(1)
(m+1)
, ~α
(1)
(m+1)
), contradicting
the minimality of m.
For a distribution H, let F(H) be the distribution G(n
1
, n
α
1
1
) C H. Hence
F
PDS
m1
(~n
(1)
(m+1)
, ~α
(1)
(m+1)
)
27
is PDS
m
(~n, ~α) and F (G(n
2
, n
2
α
2
1
)) is PDS
2
((n
1
, n
2
), (α
1
, α
2
)). But, of course,
this F operator represents a (randomized) polynomial-time-computable operation
on a graph, once drawn from distribution H. Assuming the existence of algorithm A,
we propose algorithm A
F
for PDS
m1
(~n
(1)
(m+1)
, ~α
(1)
(m+1)
):
Given graph H, apply algorithm A to F(H) and return A’s answer.
For the following three distributions, let p
i
be the probability that A reports True
when the graph is drawn from distribution i.
1. G(n
1
, n
1
α
1
1
);
2. F (G(n
2
, n
2
α
2
1
)) = PDS
2
((n
1
, n
2
), (α
1
, α
2
));
3. F
PDS
m1
(~n
(1)
(m+1)
, ~α
(1)
(m+1)
)
= PDS
m
(~n, ~α)
Since we assumed Conjecture 2, the probability of returning the correct answer
when distinguishing between distributions 1 and 2 is at most 1/2 + o(1). That is,
(1 p
1
)/2 + p
2
/2 1/2 +o(1). On the other hand, since A solves PDS
m
(~n, ~α), with ε
advantage, (1 p
1
)/2 + p
3
/2 1/2 + ε.
Consider algorithm A
F
, it applies F to its input graph and sends it to A. Al-
though A was promised a graph that was with probability 1/2 from the first distri-
bution and with probability 1/2 from the third,
2
its input is merely a graph with
nonzero mass in each of the two distributions, and thus it outputs some value in
probabilistic polynomial time
3
. The probability of algorithm A
F
correctly reporting
True is (by definition) (1p
2
)/2+p
3
/2, which, from the two previous inequalities, is
easily seen to be at least 1/2 + ε o(1). Hence, for sufficiently large n, Algorithm A
F
2
This application of a promise oracle to non-promise-satisfying inputs is an example of a non-
smart reduction [58].
3
Although the positivity of the mass on some order-n graph might seem like a technicality,
it is not intrinsic to the analysis. For other, similar, distributions of graphs where F(H) may
have 0 probability in either distribution, we can run A up to its polynomial-time upper-bound for
promise-satisfying instances and return False if it has failed to return by then. The rest of the
analysis proceeds identically, up to the necessary modifications needed to handle the new ensemble
of distributions.
28
is a PPT algorithm with advantage ε/2 for the PDS
m1
(~n
(1)
(m+1)
, ~α
(1)
(m+1)
) problem,
contradicting the minimality of m.
3.3.3 Hardness of TSS
r
With Lemma 3.7 in hand, we can now try recursing the ideas first outlined in Sec-
tion 3.3.1. We show that while the size of the optimal solution remains roughly
n
1/2o
τ
(1)
in the unplanted instances, appropriately-chosen recursively-planted in-
stances have (for a sufficiently large number of rounds) an optimal solution of size
roughly
n
0
, where n
0
is the order of the inner-most planted subgraph. As these
inner-most planted components are of size roughly n
.5
1+t
= n
o
t
(1)
where t is the num-
ber of levels of planting, our construction produces a gap approaching n
1/2
as t
increases.
Lemma 3.8 (Lower bound on OPT): For some α (0, 1/2) and all positive integers
r, τ
0
both at least 2, the TSS
r
instance with graph (G(n, n
α1
) and all thresholds set
to τ
0
has |OPT| =
˜
Ω(n
β
) with high probability, where β = (τ
0
(1 α) 1)/(τ
0
1).
Proof. For a seed set S of size
˜
O(n
β
), chosen uniformly at random (u.a.r.), and a
vertex v / S, chosen u.a.r., the probability that v has k neighbors in S is, by the
union bound, at most
|S|
k
(n
α1
)
k
=
˜
O
(n
β
n
α1
)
k
(for constant k), which decreases
geometrically with k. Thus, the number of vertices newly activated after one round,
|A
1
(S) \ S|, follows a binomial distribution with both mean and variance µ = σ
2
=
˜
O(n(n
β
n
α1
)
τ
0
) =
˜
O(n
1+(α+β1)τ
0
). Because (α + β 1)τ
0
= β 1, this expression
simplifies to
˜
O(n
β
). Chernoff bounds tell us that the probability that |A
1
(S) \ S|
exceeds its expectation by a log
2
n factor is bounded by
exp[µ(log
2
n 1 2 log
2
n · log log n)] < exp[µ(log
2
n)] = n
µ log n
.
Given it has size
˜
O(n
β
), there are
˜
O

n
n
β

choices of seed set S. By the union bound,
the probability that at least one of them activates more than
˜
O(n
β
) vertices within
29
one round is 1/n
Ω(1)
. By induction, after a constant number of activation rounds, for
all ε > 0, the number of active nodes is with high probability at most o(n
β+ε
). Hence,
with high probability, no seed set of size
˜
O(n
β
) is an r-round contagious set.
To show an upper bound on OPT for these recursively-planted instances, we reuse
both the indexing notation and the definition of the PDS distribution introduced in
Section 3.3.2.
Lemma 3.9 (Upper bound on OPT): For all positive integers τ
0
and r both at least
2, with ρ br/2c, and ε
0
(0, 1/(3(τ
0
+ 1)
2
)), ~n = (n,
n,
p
n, . . . ,
2
ρ
n) and ~α =
(1/2, 1/2 ε
0
/(1 + ρ), 1/2 2ε
0
/(1 + ρ) . . . , 1/2 ρε
0
/(1 + ρ)), and G PDS
r
0
(~n, ~α),
the TSS
r
instance with graph G and all thresholds equal to τ
0
has |OPT| = O(
2
ρ
n)
with high probability.
Proof. For i 1, 2, ··· , 1 + ρ, let G
i
be the depth-i planted component of G, i.e.,
the graph of order k
i
=
2
i1
n and average degree k
i
1/2(i1)ε
0
/(1+ρ)
planted in G.
Further, for every i, define a
i
= k
i
1/2ε
0
, making a
i
a lower bound on the edge
probability in G
i
. For i {1, 2, ··· , ρ}, choosing a subset of size
k
i
= k
i+1
of G
i
’s
vertices u.a.r. to be the seed set S, will lead to random variable |A
1
(S)\S| following a
binomial distribution with µ = σ
2
= Ω(k
i
·(
k
i
a
i
)
τ
0
). Each indicator of membership in
A
1
(S)\S is a Bernoulli random variable with activation probability at least
k
i
τ
0
a
i
τ
0
.
The expression for the mean and variance is thus Ω(k
i
(
k
i
· k
i
1/2ε
0
)
τ
0
), which is
Ω(k
i
1τ
0
ε
0
).
The probability that, for a randomly chosen seed set of size
k
i
, fewer than
µ/ log
2
n vertices become active is, again by the Chernoff bounds, less than n
µ log n
=
n
Ω(k
i
1τ
0
ε
0
)
< n
k
2/3
i
n
k
i
2
.
Every remaining inactive vertex in G
i
would now have
ˆ
d = µa
i
=
˜
Ω(k
i
1/2ε
0
(τ
0
+1)
) =
˜
Ω(k
i
1/21/(3(τ
0
+1))
) activated neighbors, in expectation, with variance O(
ˆ
d). Thus,
with high probability, no vertex has fewer than
˜
Ω(k
i
1/3
) active neighbors. Since each
30
vertex has so many active neighbors, with high probability, all of G
i
becomes active
within two rounds of G
i+1
becoming active. By induction, activating G
r
0
+1
will with
high probability activate G
1
after 2ρ r rounds. Since |V (G
ρ+1
)| = n
0.5
ρ
, the lemma
follows.
Finally, combining Lemmas 3.8 and 3.9 gets hardness for TSS
r
when r = O(1).
Theorem 3.10: Assuming Conjecture 2, for every ε > 0, no PPT algorithm can
approximate TSS
r
to within a factor of O(n
1/21/2
br/2c
ε
).
Proof. Given an instance G of the PDS
m
(~n, ~α) problem satisfying the conditions of
Lemma 3.7, we choose a threshold τ
0
satisfying ((1 α
1
)τ
0
1)/(τ
0
1) > 1/2 ε/2
and generate the TSS
r
instance with graph G and all thresholds set to τ
0
. If G
comes from the unplanted distribution, an application of Lemma 3.8 provides a lower
bound of
˜
Ω(n
1/2ε/2
) on the size of some optimal seed set, OPT. On the other hand,
if G comes from the planted distribution, Lemma 3.9 provides an upper bound of
O(n
0.5
br/2c
) on the size of OPT. Thus an
˜
O(n
0.5ε/2
/n
0.5
r/2
) =
˜
O(n
1/21/2
br/2c
ε/2
) =
O(n
1/21/2
br/2c
ε
)-approximation algorithm for TSS
r
can thus distinguish the two
cases, contradicting Conjecture 2.
3.3.4 Hardness of unbounded-round TSS
In the previous section, we showed that, assuming Conjecture 2, TSS
r
is hard to
approximate to within a factor of O(n
1/2o
r
(1)
). However, this does not a priori imply
anything about Target Set Selection instances where the number of rounds is
not bounded by some constant.
To show the hardness of unbounded-round versions of TSS, we use a direct re-
duction from TSS
r
. In particular, we show that the approximating TSS is (up to
constant factors) no easier than approximating TSS
r
for any r = O(1). Thus, the
31
a
1
b
1
c
1
d
1
a
2
b
2
c
2
d
2
a
0
b
0
c
0
d
0
+
+
+
+
+
+
+
+
a
1
b
1
c
1
d
1
a
r
b
r
c
r
d
r
a
0
b
0
c
0
d
0
Figure 3.2: Sample of the first few layers of the TSS
r
to TSS reduction. The i-th
stage vertices are white and the memory vertices are grey. All drawn arcs are oriented
rightwards. Not shown: arcs from each vertex in S
r
to all those in S
0
.
hardness factor achieved of TSS is at least the supremum of the hardness known for
each constant-round TSS problem, and is thus at least O(n
1/2
).
Theorem 3.11: An O(f(n))-approximation algorithm for TSS can be transformed
into an O(f(n))-approximation algorithm for TSS
r
.
Proof. A sketch of the construction is given in Figure 3.2
Given an instance TSS
r
with graph G = (V, E) and threshold function τ, we
construct an instance of TSS, whose graph we call G
0
and whose threshold function
we call τ’. The intent is that the (unbounded-round) activation process in G
0
simulates
the (r-round) activation in G. The instance is constructed as follows.
Vertices For i = 0, 1, . . . , r, and for each v V , there is a vertex v
i
. Collectively,
each set of vertices {v
i
}
vV
constitutes S
i
, the i-th stage” vertices of G
0
. These
vertices are meant to simulate the activation state of each vertex v over the r rounds
(as well as its initial round-0 activation). Additionally, for j = 0, 1, . . . , r 1, and
for each v V , there is a vertex v
+
j
. Collectively, each set of vertices {v
j
}
vV
constitutes M
j
, the j-th stage memory vertices” of G
0
. These vertices help ensure
that activated vertices remain active in all future rounds in our simulation.
32
Edges For i = 0, 1, . . . , r 1, and for j = i, . . . , r 1, there is a directed arc
from v
i
to v
+
j
. These arcs work to ensure that memory vertices are aware of any
activation the vertex attains in an earlier round. For i = 0, 1, . . . , r 1, and for
each pair (u, v) in E, there is a directed edge from u
+
i
to v
i+1
. These simulate the
activation process in the initial TSS
r
instance. Finally, for each vertex x S
r
and
for each y S
0
there is a directed-edge (gadget) from x to y. This bi-clique between
S
r
and S
0
is meant to ensure that activating all of S
r
(corresponding to the rth stage
of activation) in turn activates all vertices in G
0
.
Thresholds The threshold of each vertex in S
0
is |V |, the threshold of each
vertex v S
i
for i 1 is τ (v), and the threshold of each memory vertex is 1.
The goal is that active vertices in S
0
represent the seed set S in G, and active
vertices in S
i
represent the set of vertices in G active after i rounds. If S
r
is entirely
active— meaning that S is an r-contagious set—this set in turn activates all of S
0
,
causing all vertices in G
0
to become active. With the exception of the arcs between S
r
and S
0
, the directedness of the arcs ensure that vertices in some stage cannot activate
those in an “earlier” stage. Additionally, each memory vertex v
+
j
, in M
j
remains
active whenever there is some v
i
, with i < j, that is active, in turn ensuring that all
v
k
with k > i are also active.
Thus, an r-contagious set for the initial TSS
r
instance exactly corresponds to
a contagious set in S
0
. To show that G
0
does not contain smaller contagious sets,
observe that activating v
0
weakly dominates activating each of v
i
, for i > 0, and v
+
i
,
for i 0. Vertex v
i
will become active within i rounds anyway, so there is no benefit
in activating it earlier; a similar argument applies to v
+
i
. Thus, a contagious set
containing vertices outside of S
0
can be transformed into a no-larger contagious set
33
entirely inside S
0
. Hence the optimal value of both the initial TSS
r
instance and our
constructed TSS instance are equal.
Combining this with Theorem 3.10, we get n
1/2
hardness for Target Set
Selection with an unbounded number of rounds.
Theorem 3.12: Assuming the Planted Dense Subgraph Conjecture, no PPT
algorithm can approximate Target Set Selection to within a factor of O(n
1/2
)
for any > 0.
3.4 Approximating TSS
O(1)
Theorem 3.10 establishes that TSS
r
is hard to approximate to within a factor of
n
1/2−br/2c−
even when all thresholds are equal. In this section, we show that such
instances do admit a nontrivial approximation algorithm.
Theorem 3.13: For every r Z
+
, there is a polynomial-time algorithm approxi-
mating TSS
r
to within a factor of O((τ
max
min
)
11/r
n
11/r
log
1/r
n), where τ
max
and
τ
min
are the maximum and minimum thresholds, respectively.
In particular, the algorithm provides an O(n
11/r
log
1/r
)-approximate solution to
instances with uniform thresholds.
First, we define some notation. For a set S, let A
i
(S) denote the i
th
active set of
S, i.e. the set of vertices that would be active after i rounds if S is the seed set. We
also define the hunger of a vertex, which will serve as a useful potential function that
our algorithm attempts to minimize. Roughly, the hunger of a vertex is zero if the
seed set S would activate it in r rounds, otherwise, it is the number of v’s neighbors
that would need to be added to S so that v is active in one round. That is, for a
vertex v and seed set S, v’s hunger is
34
h
S,r
(v) =
0, if v A
r
(S)
τ(v) |{u : u δ(v), u A
r1
(S)}|, otherwise.
Likewise, we define the total (S, r) hunger in the graph, h
r
(S) =
P
vV
h
S,r
(v). Ver-
tex v is called (S, r)-hungry if h
S,r
(v) > 0. Finally, the r esidual of a set S is the TSS
r
instance on the graph induced by V \S, where all thresholds are set to h
S,1
(v). Such
instances may have the peculiar property in which some thresholds are set to zero. In
such cases, the threshold-0 vertices automatically becomes active in the first round.
We’re now ready to present the algorithm, which follows a two-step process. The
first step is the degree-reduction step—where we greedily add to the seed set S ver-
tices whose neighborhoods contain many (S
1
, 1)-hungry vertices—and a greedy step,
where we greedily select vertices that most reduce the total hunger of the graph. We
then argue that each vertex chosen in the degree-reduction step necessarily reduces
the total hunger by a large amount. After these degree-reducing vertices have been
added to the seed set, the residual problem—which is itself an instance of TSS
r
—has
somewhat low out-degree, which means that the greedy algorithm is effective.
Algorithm 1, including a parameter β to be chosen later, details the construction
of the two components of the seed set, S
1
and S
2
.
Algorithm 1 Algorithm for computing a target set: parameter β specified below.
1: S
1
, S
2
{Initialization}
2: while There exists some vertex u with more than β neighbors that are (S
1
, 1)-
hungry do
3: Add u to S
1
. {Degree-reduction step}
4: while There exists an (S
1
S
2
, r)-hungry vertex do
5: Add to S
2
the vertex that most reduces the total hunger of (S
1
S
2
, r)-hungry
vertices. {Greedy step}
6: Return S
1
S
2
.
Since Step 5 only terminates when there are no more (S
1
S
2
, r)-hungry vertices, the
algorithm returns a valid contagious set. We now bound the sizes of sets S
1
and S
2
.
35
Lemma 3.14: |S
1
|
max
Proof of Lemma 3.14. When S
1
= , the total 1-round hunger is bounded by
max
.
Each element added to S
1
reduces this quantity by at least β, and since the final total
1-round hunger cannot be negative, |S
1
|β
max
.
To analyze the size of S
2
, we focus on the sub-problem induced by including S
1
in the seed set, i.e. Residual(S
1
). The degree-reduction step ensures that no vertex
in Residual(S
1
) has more than β out-neighbors in V \ S
1
. This, plus the fact that
we activation flowing from within Residual(S
1
) to S
1
can be ignored (as vertices in
S
1
will be active within 1 round anyway), leads to the following lemma.
Lemma 3.15: |S
2
| O(|OPT|β
r1
log n)
Proof. During each iteration of the greedy step, let O be a minimum-cardinality set for
which S
2
O is (r-round) contagious for Residual(S
1
). We let
~
O = o
1
, o
2
, o
3
, . . . , o
k
be an arbitrary, but fixed, ordering over O. Were we to add the elements of
~
O, in
order, to some initially empty set S
3
, we would reduce the total hunger h
r
(S
2
S
3
)
from h
r
(S
2
) down to 0. Therefore, for some o
in the sequence
~
O, adding o
to S
3
causes h
r
(S
2
S
3
) to decrease by at least
h
r
(S
2
)/k.
We now argue that adding o
directly to S
2
must also significantly reduce the total
amount of hunger amongst (S
2
, r)-hungry vertices. Denote the magnitude of this
change in total hunger by δ
, that is, h
r
(S
2
) h
r
(S
2
{o
}). In a t-round activation
process, the amount of hunger that can be removed by adding some vertex u to the
seed set is bounded above by the number of up-to-length-t simple paths coming out
of u. The residual graph has out-degree bounded by β, and thus the number of such
paths is bounded by 1 + β + (β 1)
2
+ ··· + (β 1)
t
2β
t
.
The addition of o
to S
3
activates at most δ
neighbors, each of which may be
seen as initiating an (r 1)-round activation process. Therefore, by adding o
to S
3
,
36
h
r
(S
2
S
3
) drops by at most δ
·2β
r1
. Hence,
2δ
β
r1
and δ
/(2β
r1
) =
h
r
(S
2
)/(2kβ
r1
).
We cannot identify o
: it depends both on the unknown optimal solution and
some arbitrary ordering. However, we know that there exists some o
for which
h
r
(S
2
{o
}) = h
r
(S
2
) δ
1 1/(2β
r1
k)
h
r
(S
2
) . (3.1)
By iterating through each vertex, we can choose (in polynomial time) the vertex o
+
that minimizes h
r
(S
2
{o
+
}): this latter expression is clearly also bounded by expres-
sion (3.1). After Θ(kβ
r1
log n) iterations of greedily choosing such o
+
and adding it
to S
2
, the total hunger h
r
(S
2
) drops below 1 and so Algorithm 1 terminates.
Proof of Theorem 3.13. Given Lemmas 3.14 and 3.15, all that remains is to find
the optimal choice of β for Algorithm 1. Had we known the value of |OPT|,
we could let β be Θ([
max
/(|OPT|log n)]
1/r
), and we would have |S
1
| + |S
2
| =
O((
max
)
11/r
|OPT|
1/r
log
1/r
n), so the ratio of |S
1
| + |S
2
| to |OPT| would be
O((
max
)
11/r
log
1/r
n/|OPT|
11/r
) = O((τ
max
min
)
11/r
n
11/r
log
1/r
n) , (3.2)
where this “inequality” follows from the fact that |OPT| τ
min
. Although we do
not know |OPT|, we could run Algorithm 1 with each β in 1, 2, . . . , n, and return
the best solution; our approximation ratio would be at most the right-hand side of
“inequality” (3.2).
3.5 Dynamic Target Set Selection
In Section 3.1, we defined Target Set Selection so that once a vertex becomes
active, it remains active for all time. It is natural to ask what happens if vertices can
also lose activation if they do not have a sufficient number of active in-neighbors. This
can happen when vertices in the seed set have thresholds higher than their number
of in-neighbors also in the seed set. We call seed sets that cause the entire network
37
to activate dynamically contagious, and the corresponding problem the Dynamic
Target Set Selection problem.
Problem 3.3 (Dynamic Target Set Selection (DTSS)): Given a graph G =
(V, E) and a nonnegative threshold function τ : V N
0
, find the smallest subset
S V such that S is dynamically contagious.
This dynamical system is interesting because it can be realized by an appropriate
setting of parameters in many standard models of neuronal networks, such as the
leaky integrate-and-fire model introduced in [79]. In [20], Ben-Zwi et. al. show that
DTSS is #P hard when edges can also have negative influence, i.e. when vertices
become active only if the number of active neighbors made adjacent by positive in-
edges exceeds the number of active neighbors made adjacent by negative in-edges
(the activation behavior in this model is closely related to that in synchronous Hop-
field networks [65]). In this subsection, we show that determining even very simple
properties of the original dynamical system (with no negative influences) can be com-
putationally intractable. In particular, we show that it is hard, given a seed set, to
determine simple facts of the long-term behavior of the system.
First, we discuss why direct simulation cannot work. For TSS, we were able to
argue that the number of inactive vertices is a (strictly) monotonically-decreasing
non-negative function up until the system converges to a steady state. Thus, since
there are at most n inactive vertices, the system converges within n rounds. On the
other hand, the number of active vertices in the DTSS dynamical system may be
not strictly monotone (or even not monotone at all). Thus, it’s natural to wonder if
one can simulate the system by proving a polynomial upper bound on the number
of steps that it takes before convergence, or at least until the system first repeats a
state (the set of active vertices). Unfortunately, as we show, some seed sets may be
part of an orbit of exponential length.
38
Theorem 3.16: There exist instances of the DTSS dynamical system in which some
states are part of an orbit of period 2
˜
Ω(
n)
.
Proof. We construct such an instance as follows. For a given n, let k be the largest
positive integer such that the sum of the first k primes p
1
+p
2
+···+p
k
does not exceed
n. By the prime number theorem, k = Ω
p
n/ log n
. We create k disjoint, directed
cycles of sizes p
1
, p
2
, ··· , p
k
, with each vertex having threshold 1. The remaining o(n)
vertices remain disconnected from everything else. The o(n) bound on the number of
remaining vertices comes from bounds on prime gaps, which are currently known to
be O(n
.525
) [18].
Ignoring the irrelevant degree-zero vertices, what we have is Ω
p
n/ log n
cycles
of pairwise co-prime lengths. Starting out with one vertex activated in each cycle,
it’s easy to see that any given cycle of length x returns to its original configuration
(with just that vertex activated) only after some number of steps that’s a multiple of
x. The number of steps needed for every cycle (which have lengths p
1
, p
2
, ··· , p
k
) to
simultaneously get back to its starting configuration, then, is
LCM(p
1
, p
2
, ··· , p
k
) =
k
Y
i=1
p
i
> 2
k
= 2
Ω(
n/ log n)
= 2
˜
Ω(
n)
This 2
˜
Ω(
n)
bound has the same order of growth as Landau’s function [78] the
maximum order of an element in the symmetric group S
n
. This is no coincidence: the
adjacency matrix of our digraph is just a permutation matrix with the same period
as the process.
That said, an exponential bound on the period length does not a priori prohibit the
ability to simulate the dynamical system using clever computational tricks. However,
the above construction plays a crucial role in our proof that this dynamical system
39
is hard to simulate, to the point where it is PSPACE-hard to determine whether or
not a given vertex will ever activate.
Problem 3.4 (Dynamic Target Set Selection-Reachability (DTSS-R)):
Given an instance of the DTSS dynamical system and a seed set and a node v,
determine if v will ever activate.
Theorem 3.17: DTSS-R is PSPACE-complete.
For simplicity, we first present a proof of NP-hardness of the result, which al-
ready includes virtually all of the interesting ideas. We then quickly describe how
to transform this into PSPACE-hardness without going into the unexciting detail.
We note that, in parallel, Goles et. al. [60] derived a very similar proof showing
hardness for the same problem under the undirected block-sequential. In that model,
edges are undirected, but rather than updating in a synchronized manner (as in our
model), the graph cycles through a partitioning of vertices and each partition gets
updated in parallel. By subdividing each edge into a path of length 3, one can show
that the undirected block-sequential model can simulate the directed model.
4
By a
simple reduction to majority-threshold instances, the undirected synchronized model
can be shown to enter an orbit in a polynomial number of steps using the technique
from [86].
Proof of NP-hardness. We reduce from the decision version of Set Cover. A depic-
tion of the construction is provided in Figure 3.3. Given a Set Cover instance with
set system S of size m, element universe U of size n, and a budget k, we construct
our DTSS-R instance as follows. For each i 1, 2, ··· , m, we create a directed cycle
of length p
i
. By the prime number theorem, the total number of vertices among all
cycles is O(m
2
log m). Additionally, we use O(n + m) vertices to simulate the mono-
4
In Surely You’re Joking, Mr. Feynman!, Feynman [54] describes a similar technique by which
he conjectures members of a certain species of Brazilian leaf-cutting ants determine the direction of
their pheromone trails.
40
x
y
x
y
x
y
u
u'
v
Figure 3.3: A cartoon depiction of the construction showing hardness of evaluating
the long-term behavior of DTSS. The 4 represents a simulated monotone circuit for
the given set cover instance.
tone circuit accepting all (size-unconstrained) solutions to the set cover instance, i.e.
{x : x is the indicator vector for a set cover of U}
Such a circuit can be made using O(n + m) gates because feasible Set Cover
solutions can be expressed as
V
i∈U
W
S∈S:i3S
x
S
, where x
S
is an input gate for set S.
To convert this into a valid DTSS instance, gates can be simulated by threshold-1
vertices and gates can be simulated by vertices with threshold equal to their in-
degree. The vertices corresponding to the inputs of this circuit will have threshold
1.
We will also want one other small gadget. One vertex, which we call u, will have
m inputs and threshold m k. The inputs all have threshold 1. u also has a directed
arc into vertex u
0
, which has threshold 1 and no other in-vertices. Both u
0
and the
output vertex of the set cover circuit feed into the same threshold-2 vertex v.
The idea is as follows. At any time-step, let x be the bit vector representing
which inputs to the (simulated) set cover circuit are active, and let y be the bit
41
vector representing which inputs to u are active. Suppose that, at any point in time,
we can guarantee that the sum of the Hamming weights of x and y is at most m.
Then v, after a two round delay, becomes active only if x represents a Set Cover
solution of size at most k, i.e. only if x represents a satisfying assignment to the given
Set Cover instance. If the sum of the Hamming weights is exactly m, then this
becomes an if-and-only-if. Thus, the plan is to hook up the prime-length cycles to the
vertices of x and y so that some initial configuration brute forces all possible strings
of Hamming weight between 1 and m. Doing so ensures that v only ever becomes
active if the set cover instance is satisfiable.
To do this, we designate one vertex from each prime cycle as its “x” vertex and
another vertex from each cycle as its “y” vertex. The “x” vertex from the length-p
i
cycle then has a directed arc to the vertex corresponding to the ith bit of x. Similarly,
the “y” vertex from the cycle of length p
i
has a directed arc to the vertex corresponding
to the ith bit of y. By activating exactly one vertex from every cycle, we ensure that
each string of Hamming weight m appears exactly once (those of Hamming weight
less than m may appear multiple times). Consequently, v activates if and only if the
set cover instance has a satisfying assignment.
Converting this into a proof of PSPACE-completeness involves two steps. In the
first step, we double each node in the prime-length cycles from the original construc-
tion, and for each node v dub one of its copies the “original” (v
+
) vertex and the other
its “complement” (v
) vertex. We will ensure that at any point in time, for every v
exactly one of v
+
and v
is active. The intent is that the v
+
vertices simulate the
dynamics of the original network, and the v
vertices simply hold the complement ac-
tivation. Thus, we then hook up the network so that if an initially-active set S in the
original network would activate v in round r, the set {s
+
: s S}{s
: s V \S} in
the new network would activate v
+
in round r. This construction allows us to extract
activation from each bit and its complement, which is enough to allow us to simulate
42
a simple working memory. The second step involves using this memory (along with a
liberal use of the same type of +/- idea) to brute-force the solution to a given TQBF
instance.
Given the above results, it is not hard to show that other simple properties of the
dynamical system, such as whether or not all vertices ever simultaneously (de)activate,
whether it enters a period of length 1, and so on are also PSPACE-hard. Con-
sequently, it’s PSPACE-hard to distinguish DTSS instances with a dynamically-
contagious set of size 1 from those where all dynamically-contagious sets are of size
Ω(n) (i.e. the problem has no o(n) approximation algorithm). As these corollaries
are not hard to prove and are somewhat tangential to the rest of this thesis, details
are omitted.
43
Chapter 4
Densest Common Subgraph
4.1 Overview
The Densest Subgraph problem and its variants are often used to model the
task of finding dense communities in complex networks with interacting agents. In
some contexts, though, such as when modeling the web graph or many types of social
networks, these graphs may evolve over time: two users may suddenly become friends,
webmasters might remove links to outdated websites, and so on. Thus, it’s natural
to wonder how one can model communities that remain dense over a long period of
time.
In this chapter, we study two flavors of the recently-introduced Densest Com-
mon Subgraph (DCS) problem: given a sequence of T graphs (sometimes called
snapshots or frames) that share the same vertex set V , the goal is to find a subset
of S vertices that maximizes some aggregate measure of the density of the subgraphs
induced by S in each of the given graphs. The variants we study differ in their choices
for aggregation function. Namely, we study (1) DCS-MA, where the goal is to maxi-
mize the Minimum (over the frames) of the Average degree in the induced subgraph,
44
and (2) DCS-AM, where the goal is to maximize the Average (over the frames) of
the Minimum degree in the induced subgraph.
Problem 4.1 (Densest Common Subgraph-MA (DCS-MA)): Given a sequence
(G
1
, G
2
, ··· , G
T
) of graphs on the same vertex V , find the subset V
0
of V maximizing
min
i
|E[G
i
[V
0
]]|/|V
0
| (i.e. the density of the sparsest induced subgraph).
Problem 4.2 (Densest Common Subgraph-AM (DCS-AM)): Given a sequence
(G
1
, G
2
, ··· , G
T
) of graphs on the same vertex V , find the subset V
0
of V maximizing
1
T
·
P
i
d
min
G
i
[V
0
] (i.e. the average min-degree of the induced subgraph).
Because the
1
T
factor in the definition of Problem 4.2 is a constant (for any fixed
instance), we lose nothing by pretending that the objective value is just
P
i
d
min
G
i
[V
0
].
We will make this simplification throughout the rest of this section. When T = 1,
DCS-AM (with the modified objective function) asks for the degeneracy of the graph,
whereas DCS-MA becomes the Densest Subgraph problem. These two problems can
be solved exactly using well known polynomial-time algorithms. Thus, the problems
we study are natural generalizations of these two problems to sequences of graphs.
Previously, Jethava and Beerenwinkel introduced the DCS-MA problem, for
which they proposed both an LP and a greedy algorithm [69]. While they showed
that neither of these algorithms solve the problem exactly, they gave no bounds on
the approximation factor of either approach. In followup work, Andersson et. al.
ran experiments with the greedy algorithm and described a Lagrangian relaxation of
the LP [8]. In very recent work, Semertzidis et al. explored four different variants
of the DCS problem corresponding to different choices of the aggregation function
over the frames [90]. For the DCS-MM (Min Min) and DCS-AA (Average Aver-
age) problems, they gave simple exact algorithms. They also proposed algorithms for
DCS-MA and DCS-AM, but only proved lower bounds on their approximation ra-
tio. They conjectured that DCS-AM and DCS-MA are both NP-hard, and showed
45
NP-hardness for a variant of DCS-MA, DkCS-MA, in which we can freely discard
the worst T k graphs when computing the objective value of a candidate solution.
Problem 4.3 (Densest k-Common Subgraph-MA (DkCS-MA)): Given a se-
quence (G
1
, G
2
, ··· , G
T
) of graphs on the same vertex V and a positive integer k,
find the subset V
0
of V maximizing
1
|V
0
|
times the size of the kth largest set in
{E[G
1
[V
0
]], E[G
2
[V
0
]], ··· , E[G
T
[V
0
]]}.
In this chapter, we resolve the conjectures of [90] by showing that in addition to
being NP-hard, these problems are also strongly inapproximable. In Section 4.2, we
also provide a nontrivial approximation algorithm for Problem 4.1. This is all based
on joint work with Moses Charikar and Jimmy Wu.
4.2 Approximating DCS-MA
We first present an O(
n log T)-approximation algorithm for DCS-MA. Notably,
this simplifies to O(
n log n) when T = poly(n). We then show how to augment the
algorithm so that the approximation ratio never surpasses O(n
2/3
log
1/3
n) even when
T is allowed to be super-polynomial in n.
Theorem 4.1: There exists an O(
n log T)-approximation algorithm for DCS-MA.
The algorithm returns the better of two feasible solutions. The first feasible so-
lution it considers is that containing all vertices, i.e. S = V . The second feasible
solution is constructed greedily. For a set S
0
V of vertices, we say that a frame G
i
is covered if G
i
[S
0
] contains at least one edge. Now initialize S
0
. As long as some
graphs remain uncovered, we add to S
0
the two vertices u, v V such that S
0
{u, v}
covers as many graphs as possible (that is, u and v induce at least as many edges
46
among the uncovered graphs as does any other pair). When all graphs are covered,
S = S
0
forms our second feasible solution.
Let k =
:
|OP T |. Although this value is unknown to us, we can analyze, in terms
of k, the quality of the two feasible solutions:
Lemma 4.2: If S = V , then score(S) is an n/k approximation to score(OPT).
Proof.
score(V ) =
min
i
|E
i
[V ]|
|V |
min
i
|E
i
[OP T ]|
|V |
=
min
i
|E
i
[OP T ]|
|OP T |
·
|OP T |
|V |
= score(OP T ) ·
k
n
.
Lemma 4.3: If S is constructed as in the above greedy algorithm, then score(S) is
a 2k ln T approximation to score(OPT).
Proof of Lemma 4.3. Consider an intermediate state i of our algorithm, with partial
solution S
0
i
covering all but t
i
of the frames. Let T
i
be the set of t
i
subgraphs that OPT
induces on each of the uncovered frames. Since the average degree in each subgraph
in T
i
is at least score(OPT), there are at least
k·score(OPT)
2
induced edges in each frame
of T
i
, and, summing over all these frames,
t
i
·k·score(OPT)
2
edges total induced by OPT.
As there are
k
2
vertex pairs in OPT, at least one such pair induces an edge in at
least
t
i
k·score(OPT)
2·
(
k
2
)
of them. Thus, the next vertex pair chosen by the greedy algorithm
covers at least a
k·score(OPT)
2·
(
k
2
)
score(OPT)
k
fraction of the previously-uncovered frames.
Thus, the number of uncovered frames satisfies t
i
(1
score(OPT)
k
)t
i1
with t
0
= T .
Therefore, t
i
T (1
score(OPT)
k
)
i
, and in particular t
i
< 1 when i
k ln T
score(OPT)
. Thus,
the algorithm halts after at most
k ln T
score(OPT)
iterations.
As each iteration introduces at most 2 new vertices to S
0
, the ultimate size of S
0
is at most
2k ln T
score(OPT)
. Combining this with the fact that the returned solution covers
at least one vertex in each frame, we have
47
score(S) =
min
i
E
i
[S]
|S|
1
2k ln T
score(OPT)
=
score(OPT)
2k ln T
With these lemmas in place, Theorem 4.1 is a simple corollary.
Proof of Theorem 4.1. Let α be the score of V , and let β be the score of the solution
generated by the aforementioned greedy algorithm. By lemmas 4.2 and 4.3, α
score(OPT) ·
k
n
and β
score(OPT)
2k ln T
. Thus, the better of the two solutions has score at
least
max(α, β) =
p
max(α, β)
2
p
α · β =
q
score(OPT) ·
k
n
·
score(OPT)
2k ln T
=
score(OPT)
2n ln T
.
While the above algorithm provides an O(
n log n) approximation when T =
poly(n), it can be substantially worse when T is allowed to be super-polynomial in
n. In such cases, however, the size of the input must also be super-polynomial in
n, which we can exploit to get reasonable approximations in terms of n. This idea
underlies the following theorem:
Theorem 4.4: There exists an O(n
2/3
log
1/3
n) approximation algorithm for DCS-
MA that runs in poly(n, T ) time.
Proof. The algorithm returns the better of (1) the output of the algorithm underlying
Theorem 4.1 and (2) the best subset of V of size at most log
n
T .
As there are O(n
log
n
T
) = O(T ) vertex subsets of size at most log
n
T , this algorithm
runs in time at most T · poly(n). When k log
n
T , the algorithm will return an
optimal solution to DCS-MA. Hence, it suffices to consider the case that k log
n
T .
However, in this case, Lemma 4.2 guarantees that the returned subset provides at
48
worst an
n
k
n
log
n
T
=
n log n
log T
approximation to OPT. Hence, if
ˆ
S is the returned
solution,
score(
ˆ
S) score(OPT) · max
log T
n log n
,
1
2k log T
,
k
n
score(OPT) ·
3
s
log T
n log n
·
1
2k log T
·
k
n
= score(OPT) ·
3
r
1
2n
2
log n
= score(OPT) ·
1
n
2/3
log
1/3
n
4.3 Lower bounds for DCS-MA
We now show two lower bounds on DCS-MA. In Section 4.3.1, we show that the
LP introduced in [69] has an integrality gap of Ω(n/ log n), which is within a log
factor of the guarantee given by the trivial solution V
0
= V . Next, in Section 4.3.2,
we show that the problem is Min-Rep-hard to approximate. Finally, we show in
Section 4.3.2 that the DkCS-MA problem is even Min–k–Rep hard to approximate.
The implications of these two hardness results is discussed in Chapter 2.
4.3.1 Integrality Gap
In [69], Jethava et. al. introduce the following linear program (DCS LP) for DCS-
MA.
49
maximize z
subject to
P
i
y
i
= 1
x
(u,v)
y
u
for all u, v V
x
(u,v)
y
v
for all u, v V
z
P
eE
t
x
e,t
for all t T
x, y 0
When T = 1, this simplifies to the LP shown by [28] to solve the Densest
Subgraph problem exactly. When T is allowed to be larger than 1, we show that
the integrality gap of this LP can be near-linear.
1
Theorem 4.5: DCS LP has an integrality gap of Ω(
n
log n
).
Proof. Label n vertices 1 through n, and consider the instance composed of the fol-
lowing sequence of T = n 1 graphs, G
1
through G
n1
. G
1
contains a single edge
from v
1
to v
2
. G
2
contains two edges: one from v
1
to v
3
and the other from v
2
to v
3
.
In general, G
k
contains k total edges, each with one endpoint at v
k+1
and the other
at v
i
for i [k]. A depiction of this construction is given in Figure 4.1.
We first consider the optimal integral solution to this graph sequence. Because
each vertex is the center of a star in at least one of the graphs (G
1
has both v
1
and v
2
as centers), not picking even one of the vertices ensures that the corresponding graph
attains average degree 0, and thus the minimum average degree for the sequence is also
0. Thus, to get any positive objective value, we must pick every vertex. Consequently,
because G
1
contains a single edge, the optimal objective value is 1/n.
1
We use the term “integrality gap” loosely here. As the LP is normalized, the intended solution
has y
v
set to 1/|S| when v is in the optimal solution S and to 0 otherwise. What we measure is
really the ratio of LP-OPT to the score of true optimal feasible solution for the given instance.
50
1 2 3 4 5
G
1
1 2 3 4 5
G
2
1 2 3 4 5
G
3
Figure 4.1: A depiction of the first three frames of the integrality gap instance for
DCS LP.
Now let’s consider what the LP can achieve. Set h =
1
1+H
n1
(where H
k
=
P
k
i=1
1
i
is the kth harmonic number), and consider the fractional solution that assigns y
1
= h
and y
i
=
h
i1
for i 2. The sum of these is h ·
1 +
P
n
i=2
1
i1
= h ·
1
h
= 1. Note that
these values are monotonically nonincreasing in i, and thus if i j, then we can set
x
(v
i
,v
j
)
= min(y
v
i
, y
v
j
) = y
v
i
=
h
i1
. For each k, graph G
k
contains k edges, each with
one endpoint at v
k+1
and the other at a distinct vertex with a smaller index. Thus,
our assignment induces exactly k ·
h
k+11
= h fractional edges in each G
k
, meaning
that LP-OPT is at least h =
1
H
n1
+1
= (1/log n). Thus, the integrality gap is
Ω(1/ log n)
1/n
= Ω
n
log n
.
4.3.2 Min-Rep hardness
We now show that DCS-MA is Min-Rep-hard to approximate.
Theorem 4.6: There is an approximation-preserving reduction (up to constant fac-
tor) from Min-Rep to DCS-MA.
Proof. Suppose that we are given a Min-Rep instance (V, E,
ˆ
V ,
ˆ
E). Call an optimal
solution to this instance V
. The vertex set in our DCS-MA construction will be
V
0
= V {u, v}, where u and v are not vertices in V .
51
The frame G
1
has exactly one edge, between vertices u and v. This forces u and
v to each be in OPT, and ensures that score(OPT) is exactly
1
|OPT|
(all other graphs
will contain at least one edge, so score(OPT) is nonzero). Additionally, it means that
we never benefit from picking up more than 1 edge in any other frame in the sequence.
We now construct one graph for each superedge. For each superedge
ˆ
E
i
, E[G
i+1
]
is exactly
ˆ
E
i
(i.e. if (a
i
, b
j
)
ˆ
E
i
, then (a
i
, b
j
) E[G
i+1
]). Thus, to get a positive
score on G
i+1
, we must pick both endpoints of at least one edge in
ˆ
E
i
.
The only way to get a positive score is to get a positive score on each frame G
i
.
Thus, for each superedge
ˆ
E
i
, we must pick both endpoints of one of the edges in
ˆ
E
i
.
Thus, |OPT| |V
| + 2 (where the +2 comes from the requirement that we also
pick u and v). Conversely, picking the vertices in V
(as well as u and v) gives us a
positive score in each graph, and we have already established that OPT is precisely
the smallest such set. Thus, |OPT| |V
| + 2, meaning that |OPT| = |V
| + 2, and
thus score(OPT) =
1
|V
|+2
.
Now suppose that it is hard to distinguish between a Min-Rep instance with
objective value x and objective value x · f(n). This implies that it is hard to distin-
guish between DCS-MA instances with value
1
x+2
and those with value
1
x·f(n)+2
. The
inapproximability ratio is thus
1
x+2
1
x·f(n)+2
=
x·f(n)+2
x+2
= O(f(n)).
Min–k–Rep Hardness for DkCS-MA
Clearly, Theorem 4.6 also shows Min-Rep hardness for DCS-MA, as DCS-MA is
simply the special case where k = T . However, the proof of this fact falls squarely
within the framework outlined in Section 2.3. In particular, applying the above
reduction to a Min–(k 1)–Rep instance yields a DkCS-MA instance with at most
a constant-factor difference in the optimal objective value. Thus, basically for free,
we establish the following theorem
Theorem 4.7: DkS Min–k–Rep DkCS MA.
52
4.3.3 Planted DkS hardness
We now show that, assuming Conjecture 2, DCS-MA has no n
1/4
approximation
algorithm.
Briefly, the intuition behind the construction is as follows. Consider a graph
G = G(n, 1/
n) C G(
n, n
1/4
), as in the statement of Conjecture 2. Because the
conjecture says that the dense planted component is hard to detect, one might naively
imagine that this difficulty immediately implies hardness for the densest subgraph
problem (with T = 1). However, this reasoning is flawed, as the densest subgraph
in such instances is (with extremely high probability) simply all of G. However,
using a second graph G
0
, we can aim to restrict the set of good solutions to those of
size at most some O(k). In particular, by setting G
0
to contain only some k-clique,
we can ensure that, up to constant factors, the optimum solution contains at most
O(k) vertices. Additionally, good solutions to this new problem with T = 2 directly
correspond to good solutions for DkS on G. For an appropriate choice of k, n
1/8
-
hardness follows. To amplify this hardness up to n
1/4
, we both shrink the relative
size of the graph’s densest component using results from Chapter 3 and pad instances
with a large number of random graphs to ensure that small solutions perform poorly.
For simplicity, we first establish the n
1/8
hardness of approximation for T = 2.
Theorem 4.8: Assuming Conjecture 2, for no > 0 is there a probabilistic
polynomial-time algorithm approximating DCS-MA to within a factor of n
1/8
,
even for graph sequences of length 2.
Proof. Let G be an input to the Planted Dense Subgraph problem, with vertex
and edge sets V and E. We construct the graph sequence (G
1
, G
2
) of our DCS-MA
instance as follows.
Vertices Let U be a set of n
1/4
vertices not in V . The vertex set of both G
1
and
G
2
is V
0
= V U.
53
Edges E
1
contains an edge between every pair of vertices in U, and no edges
outside of U (thus, G
1
is an n
1/4
-clique plus n isolated vertices). E
2
, on the other
hand, is just E (and thus G
2
has every vertex in U isolated).
We now proceed to prove bounds on score(OPT).
Claim 1: If G is a PLANTED instance, then score(OPT) = Ω(n
1/8
).
Proof of Claim 1. Let S be the union of U and n
3/8
vertices from the planted com-
ponent of G. Then
|E
1
[S]| =
n
1/4
2
= Ω(n
1/2
)
and, by a standard application of the Chernoff bounds
|E
2
[S]| = Ω
n
1/4
·
n
3/8
2

= Ω(n
1/2
)
with high probability. Thus,
score(S) = min
|E
1
[S]|
|S|
,
|E
2
[S]|
|S|
=
min(Ω(n
1/2
), Ω(n
1/2
))
n
1/4
+ n
3/8
= Ω(n
1/8
)
Claim 2: If G is an UNPLANTED instance, then score(OPT) = n
o(1)
.
Proof of Claim 2. Let S be an optimal solution, and let s = |V S| be the number of
vertices it contains from V . Since OPT score
G
2
< s, we can assume that s = Ω(n
0
)
for some
0
> 0 (otherwise the proof is trivial). We now bound the quality of S in
each of G
1
and G
2
. In G
1
,
score
G
1
(S) =
|E
1
[S]|
|S|
|E
1
[S]|
s
|E
1
|
s
n
s
Meanwhile, in G
2
, a standard application of the Chernoff bounds implies
score
G
2
(S) =
|E
2
[S]|
|S|
|E
2
[S]|
s
O(s
2
/
n)
s
= O
s
n
54
Therefore,
score(S) = min(score
G
1
(S), score
G
2
(S))
min
n/s), O(s/
n)
q
(
n/s) · O(s/
n)
= O(1) = n
o(1)
By Claims 1 and 2, the gap between the Planted and Unplanted cases is at
least
Ω(n
1/8
)
O(n
o(1)
)
Ω(n
1/8
0
).
This concludes the proof of Theorem 4.8.
We now proceed to amplify the hardness obtained in Theorem 4.8 up to n
1/4
.
We first exhibit a straightforward but fallacious approach, and then show how to
correct its flaw.
Theorem 4.9: Assuming Conjecture 2, for no > 0 is there a probabilistic
polynomial-time algorithm approximating DCS-MA to within a factor of n
1/4
.
Fallacious proof of Theorem 4.9. Let G = (V, E) be an input to the recursed PDS
problem with r rounds of planting in which all log densities lie within the interval
(1/2 , 1/2] (see Section 3.3.2). The construction of this “proof begins much like
that of Theorem 4.8.
Vertices Let U be a set of n
1/2
r+1
vertices not in V . The vertex set of both G
1
and G
2
is V
0
= V U.
55
Edges E
1
contains an edge between every pair of vertices in U, and no edges
outside of U (and thus, G
1
is an n
1/2
r+1
-clique plus n isolated vertices). E
2
, on the
other hand, is just E (and thus G
2
has every vertex in U isolated).
One can now try to use the same arguments as in Claim 1 and Claim 2 to show that
(a) score(OPT) = Ω(n
0
) for some
0
= Θ() and (b) score(OPT) = O(n
1/4+1/2
r+1
).
Unfortunately, the second of these is false, as is exhibited by the trivial solution
selecting the endpoints of one edge from each of the two frames, which gets a score of
Ω(1). In particular, we do not have a good lower bound on s, so we cannot usefully
apply the Chernoff bounds. Additionally, this shows that the Ω(n
0
) lower bound
is both trivial and unhelpful. Since an Ω(1/T ) algorithm is trivial, we need our Yes
instances to have a score of at least Ω(n
1/4
0
/T ). As we now show, one way to do
this is by increasing T .
Our construction will be exactly as above, except we pad the graph sequence with
an additional n
2
different i.i.d. G(n, n
3
0
) random graphs. By repeated application
of the union bound, it can be shown that any set of size O(n
0
) induces an empty
subgraph in at least one graph in the sequence, and thus has score 0. Therefore,
we know that OPT > n
0
, which allows us to use Chernoff bounds. Additionally,
solutions of size Ω(n
4
0
) have their objective scores simply scaled down by a factor of
n
2
0
(up to subconstant factors), so the relative value of all large solutions remains
unchanged.
Thus, with this additional change, both arguments (a) and (b) above hold (up
factors of n
Θ(
0
)
), and we establish a gap of n
1/41/2
r+1
Θ(
0
)
. Rewriting the exponent,
our gap can be set to equal n
1/4
.
56
4.4 Hardness of DCS-AM
In this section, we prove the following theorem.
Theorem 4.10: DCS-AM has no n
1
-approximation algorithm for any > 0 unless
P = NP.
Proof. We show this by presenting a direct reduction from Max Independent Set
(MIS), which is well known to have the aforementioned hardness factor [66]. The
reduction is as follows. Suppose we’re given a Max Independent Set instance
with a graph G = (V, E) that is not complete (the problem is trivial when G is
complete). Our reduction will construct a DCS-AM instance consisting of one frame
G
v
for each vertex v V . In each such frame G
v
, all of v’s neighbors are singletons,
while the remaining vertices form a star centered at v. We now show that the size
of the maximum independent set in G is equal to the maximum feasible objective in
the constructed DCS-AM instance.
Suppose I V is an independent set of size k 2 in G. Then consider the
solution I to the constructed DCS-AM instance. For each v I, we score 1 point
in frame G
v
, since v has a neighbor in G
v
(some other vertex in I) and none of the
singletons of G
v
(the neighbors of v in G) are in I. Thus, in the DCS-AM instance,
score(I) k, and thus OPT(DCS-AM) OPT(MIS).
In the reverse direction, suppose that S V is a solution to the constructed
DCS-AM instance achieving score(S) = k. It is easy to check that we can only score
at most one point per frame; thus there must be k frames in which S induces nonzero
min degree. Consider any such frame G
v
, corresponding to v V . Since our score in
G
v
is 1, we must have v S, as it is the center of the star in G
v
. And we cannot have
chosen any of v’s neighbors in G, for otherwise there would be a singleton in G
v
[S].
We conclude that S contains at least k vertices, no pairs of which are neighbors in
57
G; i.e. S contains an independent set of size k (precisely those vertices at the centers
of frames in which we scored). Thus, OPT(MIS) OPT(DCS-AM).
Therefore, the optimal objective value of the constructed DCS-AM instance ex-
actly equals that of the given MIS instance, so DCS-AM has a hardness factor at
least as large as that of MIS.
4.5 Algorithms for DCS-AM with Small T
In this section, we show how to solve DCS-AM for small T by generalizing classical
algorithms for finding k-cores in a graph. Concretely, we provide an O(n
T
·poly(n, T ))-
time algorithm for the exact version of DCS-MA, as well as an O(f(T )·poly(n))-time
(i.e. FPT-time) (1 + )-approximation algorithm (for some computable function f).
Given a graph sequence G = (G
1
, G
2
, ··· , G
T
) over vertices V , we say that a set
S V is a (k
1
, ··· , k
T
)-core if it induces minimum degree at least k
i
in each frame
G
i
. In other words, for all i [T ], S satisfies min-deg(G
i
[S]) k
i
.
It is easy to find a (k
1
, ··· , k
T
)-core in poly(n, T) time if one exists. Starting from
a set S containing all of V , we repeatedly remove from S any vertex whose degree in
G
i
is less than k
i
. When no such vertices remain, we return S. The returned set is
either empty or a k-core. This works because if a vertex v is deleted, it cannot (by
definition) be part of any (k
1
, ··· , k
T
)-core of a graph induced by a subset of S.
We can now use this procedure as a black box to derive the following two results.
Theorem 4.11: There is an exact algorithm for AM with running time n
T
·
poly(n, T ).
Theorem 4.12: For some computable function f and every > 0, there is a (1 + )-
approximation algorithm for AM with running time f(T ) ·poly(n) (i.e, the algorithm
is fixed parameter tractable in T ).
58
Proof of Theorem 4.11. This algorithm simply return the largest integer k such that
G has a (k
1
, ··· , k
T
)-core with
P
T
i=1
k
i
= k. Since there are at most n
T
tuples of the
form (k
1
, ··· , k
T
), this runs in n
T
· poly(n, T ) time.
Proof of Theorem 4.12. As before, we intend to return the largest integer k such that
G has a (k
1
, ··· , k
T
)-core for some
P
T
i=1
k
i
= k. However this time, we only consider
those tuples (k
1
, ··· , k
T
) such that k
i
= (1 + )
`
i
for some integers `
1
, ··· , `
T
. As
one of the solutions considered contains the optimal solution with the corresponding
vector rounded down to the nearest power of 1+ (and thus each entry of the vector is
within a (1+) factor of those in OPT), the sum of the entries in some rounded-down
vector is within a (1 + ) factor of the value of the optimal solution.
The upshot is that we have an 1 + approximation algorithm that only considers
O
(log
1+
n)
T
= O((log n)
T
/) different k-cores. Thus, the total running time is
(log n)
T
· poly(n, T ). By AM-GM,
(log n)
T
= 2
T log log n
2
T
2
+log log
2
n
2
= 2
T
2
/2
· 2
(log log
2
n)/2
= 2
T
2
/2
· n
o(1)
Therefore, (log n)
T
· poly(n, T ) = f(T ) · poly(n, T) = f(T ) · poly(n).
59
Chapter 5
Middlebox Routing and Placement
In addition to routing packets between endpoints, modern networks run “middle-
boxes” that offer services ranging from network address translation and server load
balancing to firewalls, encryption, and compression. In an industry trend known as
Network Functions Virtualization (NFV), these middleboxes run as virtual machines
commodity servers, allowing them to each process a wide variety of tasks. In this
chapter, we study a new kind of multi-commodity flow problem, where the traffic
flows consume bandwidth on the links as well as processing resources on the nodes.
We first show that allocating resources to maximize the processed flow can be op-
timized exactly via an linear program formulation, and to arbitrary accuracy via a
multiplicative weights algorithm. We then study a class of network design problems
that decide where to provide node capacity to best process and route a given set of
demands, and demonstrate both approximation algorithms and hardness results for
these problems.
60
5.1 Overview
Context
In addition to delivering data efficiently, modern networks often perform services on
the traffic in flight to enhance security, privacy, or performance, or provide new fea-
tures. Network administrators often install “middleboxes” such as firewalls, network
address translators, server load balancers, Web caches, video transcoders, and devices
that compress or encrypt the traffic. In fact, many networks have as many middle-
boxes as underlying routers or switches [91]. Often a single connection must traverse
multiple middleboxes, and different connections may go through different sequences
of middleboxes. For example, while Web traffic may go through a firewall followed
by a server load balancer, video traffic may simply go through a transcoder. To keep
up with the traffic demands, an organization may run multiple instances of the same
middlebox. Deciding how many middleboxes to run, where to place them, and how
to direct traffic through them is a major challenge facing network administrators.
Until recently, each middlebox was a dedicated appliance, consisting of both soft-
ware and hardware. A network could easily have a long chain of these appliances at
one location, forcing all connections to traverse every appliance—whether they need
all of the services or not. Over the last few years, middleboxes have become increas-
ingly virtualized, with the software service separate from the physical hardware—an
industry trend called Network Functions Virtualization (NFV) [36, 85]. This has led
to a growing interest in good algorithms for optimizing the (i) allocation of middle-
boxes over a pool of server resources, (ii) steering of traffic through a suitable sequence
of middleboxes based on a high-level policy, and (iii) routing of the traffic between
the servers over efficient network paths [10, 64, 70, 80, 87].
61
Core Problem
To this end, we introduced the Middlebox Routing class of problems, in which we
discuss how to optimize processed flow routed in a fixed network [29]. This problem
can be modeled mathematically as follows. We are given a directed graph G = (V, E)
along with edge capacities B : E R
+
, vertex capacities C : V [0, ), and a
collection of demand pairs D = {(s
1
, t
1
, k
1
), (s
2
, t
2
, k
2
), ···} V × V × R
+
. While
the edge capacities are used in the same way as in a standard multi-commodity flow
problem, we also require that each unit of flow undergo a total of one unit of processing
at intermediate vertices. In particular, while edge capacities limit the total amount of
flow that may pass through an edge, vertex capacities only bottleneck the amount of
processing that may be done at a given vertex, regardless of the total amount of flow
that uses the vertex as an intermediate node. The goal is then either to route as much
flow as possible, or to satisfy all flow demand subject to a congestion-minimization
objective function. A concrete example of an Middlebox Routing instance can be
seen in Figure 5.1.
Problem 5.1 (Middlebox Routing (MR)): Given a directed graph representing
the network, edge capacities, vertex capacities, and a collection of demand pairs,
find the maximum feasible solution to the multi-commodity flow problem that also
satisfies vertex capacities.
In networks where not all of the middleboxes have become virtualized, or in cases
where some nodes simply cannot perform the same tasks that others can, we may also
have dependencies among the tasks performed. For example, if middlebox x does a
video transcoding operation while middlebox y runs an encryption algorithm, it may
be necessary for x to perform its task on a given flow before y does. To model this,
each flow is given a sequence of processing tasks that must be completed en route
from its origin to its destination. Additionally, each node may have zero or more
62
s
2
0
2
0
t
2
5
5
5
5
5
5
5
5
Figure 5.1: The edge capacity is 10 for all edges and the node capacities are denoted
in each node. Here, we can send up to 5 units of flow from s to t, by routing it along
the red arcs, have it processed at the nodes at the top, and then sent to t along the
blue arcs. The capacity of the bottom middle edge forms the bottleneck here, as all
flow must pass through it twice before reaching t.
units of processing capacity available for each task. The problem then proceeds as
before, with each flow requiring, in sequence, one unit of processing for each task.
Thus, the problem may be stated as follows.
Problem 5.2 (Dependency Middlebox Routing (DMR)): Given a directed
graph representing the network, edge capacities, vertex capacities, and a collection
of demand pairs each with a sequence of task types, find the maximum feasible solu-
tion to the multi-commodity flow problem that also satisfies vertex capacities while
completing each flow task in order.
In Section 5.2, we outline two linear programming-based algorithms for Prob-
lem 5.1. Like standard multi-commodity flow, the program can be written in two
different equivalent ways: either with an exponentially-sized walk-based LP or with a
polynomially-sized edge-based LP. The proof of equivalence of equivalence of these two
LPs requires a more careful argument than that for standard Multi-Commodity
Flow. We also outline an efficient multiplicative weight update algorithm that finds
approximately optimal solutions to our walk-based linear program far more quickly
than one could with the edge-based program paired with an off-the-shelf LP-solver.
63
In the original paper, we also show how polynomial-sized LPs for Problem 5.1 can
be generalized into LPs for Problem 5.2. We refer the interested reader to section 2.2
of the corresponding tech report.
Network Design
In some sense, the routing problem is the second part of the problem. First, net-
work administrators must decide how to distribute the processing capacity among
the graph’s many nodes. Thus, we also consider the Middlebox Placement class
of problems, in which the goal is to optimally purchase processing power at various
middleboxes. Prices for placing processing at the various nodes is given as part of
the input, and may differ substantially from one location to the next. This class of
problems comes has two natural variants:
Problem 5.3 (Min-Middlebox Placement (Min-MP)): Given a set of flow de-
mands, minimize cost while purchasing enough middlebox processing capacity so that
all flow demands are simultaneously satisfiable (that is, jointly routable and steer-
able).
Problem 5.4 (Max-Middlebox Placement (Max-MP)): Given a set of flow de-
mands and a budget of k dollars, spend at most $k on purchasing middlebox process-
ing capacity while maximizing the fraction of the given demand that is simultaneously
satisfiable.
For Min-MP, we show an O(log(n)/
2
) approximation algorithm that satisfies
a (1 ) fraction of the demand while obeying all node and edge capacities. We
show that in the directed case, the problem is hard to approximate better than a
logarithmic factor, even if the demand requirements are relaxed. Additionally, we
show that the undirected case is at least as hard to approximate as Vertex Cover.
64
We also prove approximation and hardness results for Max-MP. Although it’s
tempting to conjecture that the problem is an instance of Budgeted Submodular
Maximization, one can construct instances on both directed and undirected graphs
where the amount of routable processed flow is not submodular in the set of purchased
nodes, so black-box submodular maximization techniques cannot be used here. We
show an Ω(1/ log(n)) approximation for both problems, as well as a constant factor
approximation algorithm for undirected instances with a single source-sink pair. For
the directed case, we show approximation hardness of 1 1/e, and constant factor
hardness for the undirected problem.
Problems 5.3 and 5.4 can also be generalized to capture the dependency routing
problem. For the minimization version, we show Min–k–Rep-hardness.
Problem 5.5 (Dependency Min-Middlebox Placement (Min-MP)): Given
a set of flow demands with dependencies, minimize cost while purchasing enough
middlebox processing capacity for each process type so that all flow demands are
simultaneously satisfiable.
5.2 Solving Middlebox Routing
We outline how Problem 5.1 can be solved using linear programs. While Figure 5.1
rules out the possibility of directly using the standard path-based multi-commodity
flow LP, recasting the LP as a walk-based program suffices. When, as in Problem 5.1,
we’re unconcerned with differing processing requirements between tasks, it suffices to
only consider 2-walks, or walks that may repeat an edge at most twice. Thus, for the
duration of this section, the term “walk” will be used as a synonym for “2-walk”.
To express the walk-based linear program, we introduce one variable p
v
i,π
for each
walk-vertex-demand triplet, representing the total amount of flow from s
i
, t
i
exactly
utilizing walk π and processed at v. Note here the set P of walks is an enumeration
65
of all possible walks in the graph, which can be exponential in size. The LP is then
formulated as follows:
maximize Σ
|D|
i=1
P
πP
p
i,π
subject to
p
i,π
=
X
vπ
p
v
i,π
i [|D|], π P
|D|
X
i=1
X
πP
π3e
p
i,π
B(e) e E
|D|
X
i=1
X
πP
p
v
i,π
C(v) v V
p
v
i,π
0 i [|D|], π P, v V
While the first constraint enforces that all flows are fully processed, the second and
third constraints ensure that no edge or vertex is over-saturated.
To show how such a program can be optimized in polynomial-time, we establish
a corresponding polynomial-sized edge-based LP
maximize
|D|
X
i=1
X
eδ
+
(s
i
)
f
i
(e)
Subject to
X
eδ
(v)
f
i
(e) =
X
eδ
+
(v)
f
i
(e) i [|D|], v V \ {s
i
, t
i
} (5.2a)
p
i
(v) =
X
eδ
(v)
w
i
(e)
X
eδ
+
(v)
w
i
(e) i [|D|], v V \ {s
i
} (5.2b)
[D]
X
i=1
f
i
(e) B(e) e E (5.2c)
|D|
X
i=1
p
i
(v) C(v) v V (5.2d)
w
i
(e) f
i
(e) i [D], e E (5.2e)
w
i
(e) = f
i
(e) i [D], e δ
+
(s
i
) (5.2f)
w
i
(e) = 0 i [D], e δ
(t
i
) (5.2g)
w
i
(e), p
i
(v) 0 i [D], e E
66
The LP constraints can be interpreted as follows. Constraint (5.2a) is a flow
conservation constraint: at any non-terminal node of flow i, the amount of flow i
that enters the node should equal the amount that leaves it. Constraint (5.2b) is a
processing conservation constraint, ensuring that the total amount of flow (processed
or unprocessed) going through a node remains unchanged, although the quantity of
each might change if the node processes any of the flow. Constraints (5.2c) and (5.2d)
ensure that we don’t exceed the edge and node capacity. Constraint (5.2e) ensures
that the amount of work yet to be done on a flow does not exceed the size of the flow
itself, while (5.2f) and (5.2g) ensure that all flows leave the sources unprocessed and
arrive to the destinations fully processed.
While the construction of the edge-based LP is not particularly difficult, it is not
obvious that the edge-based solution actually solves the problem in question. We need
to prove the correctness of the edge-based LP. A priori, solutions to the edge-based
LP here may not be decomposable to a valid routing pattern at all. In this subsection,
we briefly describe a polynomial-time algorithm converting feasible solutions to the
edge-based LP into corresponding solutions to the walk-based program, proving both
that the edge-based LP is correct and that the actual flow paths can be recovered in
polynomial time as well. Because much of the involved work is menial, we summarize
this result in the following theorem, and provide only a proof outline. Full details are
left to the original paper.
Theorem 5.1: The edge-based formulation provides a polynomial-sized linear pro-
gram solving the Middlebox Routing problem. Further, the full routing pattern
can be extracted from the LP solution by decomposing it into its composing s
i
, t
i
walks in O(|V | · |E| · |D| · log |V |) time.
Proof sketch. The first part of the proof involves showing how to construct a solution
to the flow-based LP when there is exactly one s
i
, t
i
pair. Extracting the correspond-
67
ing flow paths and iterating this procedure for each demand pair eventually extracts
all s
i
, t
i
flows, giving a solution to the multi-commodity problem.
The flow extraction argument proceeds in two steps. First, we simplify the solution
by removing extraneous loops that do not affect the optimal solution. This argument
requires a bit of caution, but for the most part proceeds similarly to that for standard
max-flow problems. Next, we show that the existence of any residual flow in the graph
(i.e., the existence of some strictly positive f
i
(e)) implies that there exists at least one
valid walk we can efficiently extract while maintaining feasibility of all constraints for
the updated residual graph. Finally, we argue that a linear number of extractions
suffices to remove all flow from the solution, meaning that we attain an optimal
solution to the walk-based LP.
5.2.1 A Multiplicative Weights Approach
As the standard Multi-Commodity Flow problem admits a faster (1 )-
approximation algorithm using the Multiplicative Weights Update (MWU) Frame-
work [57, 14], it’s natural to wonder if we can use Multiplicative Weights to speed
up Middlebox Routing as well. In this section, we show that the MWU approach
does in fact work for our problem, giving us a (1 )-approximation algorithm with
running time O(
2
·|D|·|E| · (|E|+ |V |log |V |) ·log
2
|V |). Before we show how this
is done, we start out with a (very) high-level overview of the of the MWU method.
Next, we describe how to apply MWU to our problem that includes processing
vertices. Finally, we provide a proof of correctness.
Multiplicative Weight Update for Traditional Multi-Commodity Flow
In the traditional multiplicative weights algorithm for multi-commodity flow, there
is an “expert” is assigned to each edge, each of which is initially assigned a suffi-
ciently small weight. The algorithm then iteratively finds s
i
, t
i
walks minimizing the
68
sum of weighted utilization of their edges and adds together scaled down versions of
these paths to eventually construct a solution. When a path is chosen, all experts
corresponding to edges along the path have their weight increased by a multiplicative
factor, making it less likely that we repeat our selection of the edges. This process
is repeated until some expert’s weight surpasses the value 1, corresponding to a fully
utilized edge. When this happens, all paths are scaled down by the weight of the
largest expert to ensure that no capacities are exceeded. One then shows that the
final result is within a (1 ) factor of the maximum multi-commodity flow.
Formulation and Analysis
Although we derive the same (1) approximation factor for our problem, the analysis
of our multiplicative weights algorithm is quite different from that of traditional multi-
commodity flow. Intuitively, this is because vertex capacities are inherently very
different from edge capacities: while a flow walk reduces the remaining capacity on
all edges it traverses, it only reduces the capacity for one of its vertices. Thus, we
set up a different update condition, as well as a different method for picking the best
flow walks for each round.
Setup For each edge e, we have a constraint
P
π
p
π
B(e), where p
π
is the
amount of flow sent on walk π. For each vertex, the corresponding constraint is
P
π
p
v
π
C(v), where p
v
π
is the amount of flow on walk π that is processed at v. For
each of these two sets of constraints, we associate one expert, (which we call ˆe and
ˆv), whose weights are denoted by w
ˆe
and q
ˆv
, respectively.
Consider a feasible solution to the walk-based LP. The feasible solution consists
of variables of the form p
π
and p
v
π
. In this section, we abuse notation and let the
variable p denote a feasible solution to the LP, at which point p
π
and p
v
π
become bound
variables for each π and v (that is, p can be thought of as a dictionary containing the
69
aforementioned set of variables). Further, define A(p) as the objective function value
of p, i.e. A(p) =
P
vV
P
πP
p
v
π
.
For an expert ˆe and feasible solution p, define the gain M(ˆe, p) by M(ˆe, p) =
1
B(e)
P
π3e
p
π
., This can be thought of as the fraction of e’s capacity actually utilized
by the feasible solution. For each expert ˆv, we define the gain M(ˆv, p) by M(ˆv, p) =
1
C(v)
P
π3v
p
v
π
, which corresponds to the fractional utilization of v’s processing capacity.
Let D be the probability distribution over experts in which the probability of
choosing a given expert is proportional to its weight. For a fixed p, the expected gain
of a random variable sampled from D is
M(D, p) =
P
e
w
ˆe
M(ˆe, p) +
P
v
q
ˆv
M(ˆv, p)
P
e
w
ˆe
+
P
v
q
ˆv
We first make two observations:
Observation 1: For any feasible solution p, 0 M(D, p) 1. This is because
M(ˆe, p) 1 and M(ˆv, p) 1 for all e and v.
Observation 2: For any feasible solution p and weights w, q, if π
=
argmin
π
P
eπ
w
ˆe
/B(e) + min
ˆvπ
q
ˆv
/C(v)
, then
M(D, p)
A(p)
P
ˆeπ
w
ˆe
/B(e) + min
ˆvπ
q
ˆv
/C(v)
P
e
w
ˆe
+
P
v
q
ˆv
This is due to the fact that:
70
M(D, p) =
P
e
w
ˆe
M(ˆe, p) +
P
v
q
ˆv
M(ˆv, p)
P
e
w
ˆe
+
P
v
q
ˆv
=
P
π
p
π
(
P
e
w
ˆe
/B(e)) +
P
vπ
p
v
π
q
ˆv
/C(v)
P
e
w
ˆe
+
P
v
q
ˆv
P
π
(p
π
(
P
e
w
ˆe
/B(e) + min
vπ
q
ˆv
/C(v))
P
e
w
ˆe
+
P
v
q
ˆv
P
π
p
π
min
π
(
P
eπ
w
ˆe
/B(e) + min
vπ
q
ˆv
/C(v))
P
e
w
ˆe
+
P
v
q
ˆv
A(p)(
P
eπ
w
ˆe
/B(e) + min
vπ
q
ˆv
/C(v))
P
e
w
ˆe
+
P
v
q
ˆv
Where π
is the path minimizing the argmin in the statement of the observation.
Thus, in each round, we aim to find the π
minimizing this value. Conditioned on us
being able to do so, the rest of the MWU algorithm proceeds as follows:
1. We initialize all expert weights {w
ˆe
} and {q
ˆv
} to 1, where δ = (1 + )((1 +
) · |E|)
1/
. This choice of δ will be justified in the analysis of Section 5.2.1.
2. At each step t, given weights w
t
e
and q
t
v
on the experts, we pick the flow-walk
p
t
minimizing the quantity
P
eπ
w
ˆe
B(e)
+ min
vπ
q
ˆv
C(v)
. An efficient algorithm for
finding such a walk is given in Section 5.2.1.
3. Given the walk p
t
chosen in the previous step, we treat this as a feasible solution
to the instance, giving expert
ˆ
j a gain of M(
ˆ
j, p
t
). Consequently, the weight w
ˆe
or q
ˆv
of each expert j is increased by a multiplicative factor of M(
ˆ
j, p
t
).
4. The algorithm stops when one of the weights w
ˆe
or q
ˆv
is larger than 1. Once
the algorithm terminates, we scale down the flow p
t
computed at each round
by a factor of log
1+
1+
δ
= 1
ln δ
ln 1+
, and return the set of all flow-walks p
t
.
Computing the Optimal Path To compute the walk π
t
with minimum cost,
we use a dynamic programming algorithm reminiscent of Dijkstra’s shortest path
71
algorithm. Given a graph G(V, E), with weights w(e) on edges, weights n(v) on
nodes, and some source-sink pair s, t, we are interested in computing the following
quantity
opt(s, t) := argmin
π=(s,···,t),vπ
cost(π, v)
where cost(π, v) is defined as
cost(π, v) :=
X
eπ
w(e) + n(v)
!
We compute opt(s, t) in two stages. First, for every v, we upper bound the value of
opt(s, v) by n(v) plus the shortest distance from s to v. Afterwards, we use dynamic
programming to iteratively decrease these upper bounds. Full details are given in
Algorithm 2.
Algorithm 2 Optimal Walk Algorithm
Require: Graph G = (V, E) with edge weights w(e), node weights n(v), and a
designated source s.
return r(v) = opt(s, v) for every v V .
Use Dijsktra’s algorithm to compute the shortest path d(v) between s and v.
Initialize r(v) d(v) + n(v) for all v V . S {s}.
while S 6= V do
Let u
= argmin
vV \S
r(v). Add u
to S.
For all neighbors z of u
that are not already in S, let r(z) min{r(u
) +
w(u, z), r(z)}
return r
Update Suppose now that the walk π with smallest cost has been computed.
One of two things may bottleneck the amount of processed flow that can be sent
along π: either the edge capacity of some edge e, or the processing capacity of some
vertex v. We consider the two cases separately
72
If the bottleneck is edge-based, i.e.
P
vπ
t
C(v) min
eπ
t
B(e), then let e
t
=
argmin
eπ
t
B(e), and let the chosen flow walk p
t
be the one satisfying
p
t,v
π
=
C(v)
P
vπ
t
C(v)
· B
e
t
if π = π
t
, v π
t
0 otherwise
On the other hand, if
P
vπ
t
C(v) < min
eπ
t
B(e), select p
t
to satisfy
p
t,v
π
=
C(v) if π = π
t
, v π
t
0 otherwise
Proof of the (1 ) Approximation
Let T be the number of rounds taken until we hit the stopping criterion, and let
¯p =
P
T
t=1
p
t
be the total amount of flow selected after T rounds. By the guarantee of
the multiplicative update method (Theorem 2.5 in [15]), we have that for any e and
any v
T
X
t=1
M(D
t
, p
t
)
ln(1 + )
M(ˆe, ¯p)
ln m
T
X
t=1
M(D
t
, p
t
)
ln(1 + )
M(ˆv, ¯p)
ln m
Since at time T , w
T
ˆe
= w
0
ˆe
(1 + )
M(ˆe,¯p)
, and q
T
ˆv
= w
0
v
(1 + )
M(ˆv,¯p)
, and the stopping
rule ensures that at there exists e or v such that w
T
ˆe
1 or q
T
ˆv
1, we have that either
there exists an e such that M(ˆe, ¯p)
ln 1
ln(1+)
or there exists v such that M(ˆv, ¯p)
ln 1
ln(1+)
. Therefore, by the guarantee of the MWU method, we have that
T
X
t=1
M(D
t
, p
t
)
ln 1
ln m
73
We now attempt to bound the left-hand-side of the preceding inequality. Note
that
M(D
t
, p
t
) =
P
e
w
t
ˆe
M(ˆe
t
, p
t
) +
P
v
q
t
ˆv
M(ˆv
t
, p
t
)
P
e
w
t
ˆe
+
P
v
q
t
ˆv
=
A(p
t
) · (
P
eπ
t
w
t
ˆe
/B(e) + min
vπ
t
q
t
ˆv
/C(v))
P
e
w
t
ˆe
+
P
v
q
t
ˆv
By the definition of π
t
and Observation 2, we have
M(D
t
, p
t
) =
A(p
t
)(
P
eπ
t
w
t
ˆe
/B(e) + min
vπ
t
q
t
ˆv
/C(v))
P
e
w
t
ˆe
+
P
v
q
t
ˆv
A(p
t
)/A(p
opt
)
Combining these inequalities, we get that
A(¯p)/A(p
opt
)
T
X
t=1
M(D
t
, p
t
)
ln 1/(|E| · δ)
Fixing any edge e, its expert’s initial weight is 1 and its expert’s final weight is
at most 1+. Thus, ¯p passes at most B(e) log
1+
((1+)) flow through it. Similarly,
for each v, at most C(v) log
1+
((1 + )) units of processing are assigned to it. In
other words, scaling down all p
t
flows by log
1+
(1 + ) will result in a feasible flow.
Letting p
0
= ¯p/ log
1+
1+
δ
, we get
A(p
0
)/A(p
opt
) A(¯p)/
A(p
opt
) log
1+
1 +
δ
ln(1/(|E| · δ))
/ log
1+
1 +
δ
Taking δ = (1 + )((1 + )m)
1/
, we have that
A(p
0
)
A(p
opt
)
(1 ),
giving the promised approximation factor.
74
Note that in each iteration, we either increase the weight of one w
ˆe
by a factor of
(1 + ), or increase all of the q
ˆv
’s on a path π
t
by a factor of (1 + ). Since each w
ˆe
and each q
ˆv
can only be increased by such a factor at most
ln 1/(|Eδ)
times before its
weight exceeds 1, the total running time T is bounded by (|V |+ |E|)
ln 1/(|Eδ)
·T
sp
=
O(|E|log |V |/
2
· T
sp
), where T
sp
is the time it takes Algorithm 2 to compute the
optimal path for each the |D| flows. As a single flow takes O(|E| + |V |log V ) time,
we can compute the walks for each of the flows in time O(|D| · (|E| + |V |log |V |)).
Thus, the total running time is O(
2
· |D| · |E| · (|E| + |V |log |V |) · log
2
|V |).
5.3 Network Design
We now describe algorithms for both directed and undirected variants of the Min-MP
and Max-MP problems. We derive the following results
Table 5.1: Network Design Results
Directed Undirected
Maximization
Approximation Ω(1/ log n) .078
()
or Ω(1/ log n)
()
Hardness 1 1/e .999
Minimization
Approximation O(log n)
()
O(log n)
()
Hardness O(log n) 2
All demands are satisfied only up to an (1 ) fraction.
For 1 source-sink pair.
For multiple source-sink pairs.
5.3.1 Algorithms for Middlebox Placement
Bicriterion Approximation Algorithm for Min-MP
We first outline an algorithm for both directed and undirected variants of Min-MP
that satisfies all flow requirements up to a factor of 1 with expected cost bounded
by O(log n/
2
) times the optimum. Full details are given in the paper.
75
We first focus on the directed version of the problem. We modify the walk-based
LP formulation from Section 5.2 into an LP for Min-Rep by rewriting it as a mini-
mization problem and adding additional variables x
v
corresponding to whether or not
processing capacity at vertex v has been purchased. We also give a polynomial sized
edge-based LP formulation with flow variables f
1,v
i
(e) and f
2,v
i
(e) for each commodity
i, each vertex v V and each edge e E. The variables f
1,v
i
(e) correspond to the
(processed) commodity i flow that has been processed by vertex v: these variables
describe a flow from v to t
i
. The variables f
2,v
i
(e) correspond to the (unprocessed)
commodity i flow that will be processed by vertex v: these variables describe a flow
from s
i
to v.
Given an optimal solution to this LP, we pick vertices to install processing capacity
on by randomized rounding: each vertex v is picked independently with probability
x
v
. if x
v
is picked, then all flows processed by v are rounded up in the following way:
ˆ
F
j,v
i
(e) = f
j,v
i
(e)/x
v
for all i [|D|], j {1, 2}, e E. If v is not picked, then all
flows processed by v are set to zero, i.e.
ˆ
F
j,v
i
(e) = 0.
By design, all constraints are satisfied in expectation. Now, we repeat this ran-
domized rounding process t = O(log(n)/
2
) times, and consider the (likely infeasible)
solution that is the superposition of all solutions produced this way (i.e., the union
of all vertices picked between the t rounds plus the sum of all flows). By a standard
application of the Chernoff-Hoeffding and union bounds, scaling down this superpo-
sition by a factor of t(1 ) satisfies all constraints with probability at least 1/n.
Hence we get the following result:
Theorem 5.2: For directed Min-MP, there is a polynomial time randomized algo-
rithm that satisfies all flow requirements up to factor 1 and produces a solution
that respects all capacities, with expected cost bounded by O(log(n)/
2
) times the
optimal cost.
76
We can modify the LP to simulate the inclusion of an undirected edge with capac-
ity B(e) by adding the constraints for two arcs between its endpoints with capacity
B(e) each, as well as an additional constraint requiring that the sum of flows over
these two arcs is bounded by B(e). The analysis done above carries through line-by-
line, giving the following result.
Theorem 5.3: For undirected Min-MP, there is a polynomial time randomized
algorithm that satisfies all flow requirements up to factor 1 and produces a solution
that respects all capacities, with expected cost bounded by O(log(n)/
2
) times the
optimal cost.
Approximation Algorithm for Directed and Undirected Max-MP
The algorithm here proceeds similarly to that underlying Theorem 5.3. The LP we
use is the natural maximization variant of that used for the minimization problem,
with the added restriction that we only use a 1/2 fraction of the budget (this later
helps bound the probability that we exceed the budget, without messing up the
asymptotics).
Given a solution to the LP, we commit to purchasing any vertex that singlehand-
edly allows us to route a 1/(2 ln n) fraction of its flow. If no such vertex exists,
committing to not buying any vertex whose cost is more than a 1/ ln n fraction of the
budget can (by an argument reminiscent of Markov’s inequality) only reduce OPT by
a factor of 2. This ensures that costs are small enough that it’s easy to avoid going
over-budget in a randomized rounding scheme.
We now produce a solution as follows. Each vertex is picked to contain processing
with probability proportional to its value given by the LP. If a vertex is picked, our
solution will route all flows that are processed by it scaled down by a factor of 4 ln n.
Otherwise, none of the flows processed by the vertex are routed. By a combination
of the Chernoff and union bounds, all non-budget inequalities in the LP are satisfied
77
with high probability. Further, due to the scaling of the budget, Markov’s inequality
promises that the budget inequality is also satisfied with probability at least 1/2.
Thus, repeating the sampling process t = O(log n) times and taking the best of the t
solutions gives the sought Ω(1/ log n) approximation w.h.p, where the 1/ log n comes
from the scaling we applied to the flows.
Theorem 5.4: For directed Max-MP, there is a polynomial-time randomized algo-
rithm producing an Ω(1/ log(n)) approximation.
As in the proof of Theorem 5.3, we can also apply this algorithm to undirected
instances by adding an additional constraint, with the analysis carrying through as
before. Thus, we attain the following:
Theorem 5.5: For undirected Max-MP, there is a polynomial-time randomized
algorithm producing an Ω(1/ log(n)) approximation.
Approximation Algorithm for Undirected Max-MP with a single source
In the special case where there is just a single source s (or, by symmetry, just a single
sink t), we can derive a constant-factor approximation algorithm for undirected Max-
MP.
We construct the algorithm so that it routes its flows as follows. First, it re-
serves 25% of the capacity of each edge for the pre-processing step, 25% for the
post-processing step, and 50% for the routing step. Intuitively, the goal in the pre-
processing step is to simply route as much flow as possible to processing vertices,
irrespective of their location in the graph. In post-processing, we route all of this flow
back to the source. Finally, in the routing step, the algorithm routes as much flow as
it can from the source to the sinks.
The key idea behind the algorithm is as follows. Suppose that we can purchase
processing power so to maximize the amount of total power that can be utilized (i.e.,
78
optimizing the objective value of the pre-processing step). Then in the post-processing
step, we should be able to use those exact flow-paths to route all of the now-processed
flow back to the source. With all of the flow now processed and back at the source, we
can use the remaining capacity to route it from the source over to the sinks. Although
the remaining capacities may be lower than they were originally, the fact that at least
half of each edge was initially reserved implies that the amount routable is at least
half of that which was routable to the sinks before the pre- and post-processing steps.
In particular, we still have the capacity to route at least OPT/2 units of flow.
To actually purchase the vertices, we appeal to a lemma of [27], where it is shown
that the amount of flow routable from a source s to a set T of sinks forms a monotone,
submodular function in the sink set. Although the problem they study is defined
in the context of sinks that can accept an arbitrary amount of flow (should the
network support it), we can bottleneck each sink t
i
into accepting at most some c
i
units of flow by replacing it with a pair of vertices connected by a capacity c
i
edge,
without changing the submodularity of the routable flow function. Thus, purchasing
the vertices is exactly an instance of submodular maximization subject to knapsack
constraints, which is a problem well-known to be approximable to within a factor of
11/e [92]. Combining this with the approach outlined above, we derive the following
theorem.
Theorem 5.6: For undirected Max-MP with a single source, there is a deterministic
polynomial time algorithm that produces a solution that can route at least (1
1/e)/8 .078 times the optimal objective solution.
79
s
v
1
v
2
v
3
w
1
w
2
w
3
t
Figure 5.2: Approximation-preserving reduction from Set Cover and Max k-
Coverage to directed Min-MP and directed Max-MP. Solid edges have infinite
capacity, dashed edges have capacity 1. v
i
vertices have infinite processing potential,
at a cost of 1 each.
5.3.2 Hardness of Middlebox Placement
Approximation Hardness for Directed Min-MP and Directed Max-MP
Hardness both for directed Min-MP and for directed Max-MP can be shown using
effectively the same technique. For Min-MP, we reduce from Set Cover. Given
a Set Cover instance with set system S and universe U, we create a Min-MP
instance with one vertex v
S
for each S S and one vertex w
u
for each u U.
Further, we create one source vertex s and one sink vertex t, where t demands |U|
units of processed flow from s. We add one capacity-n arc from s to each v
S
, and one
capacity-1 arc from each w
u
to t. We then add a capacity-1 arc from each v
S
to w
u
whenever S 3 u. Finally, we give each v
S
vertex n units of processing capacity at a
cost of 1 each. A simple example of this construction is depicted in Figure 5.2.
The idea is simple. The edge capacities ensure that each w
i
(corresponding to an
element u
i
) can route at most 1 unit of flow to t. The other constraints ensure that
w
i
gets flow (under an optimal routing) if and only if a vertex corresponding to a set
containing u
i
was picked. Hence, given a set V
0
of vertices on which to put processing
capacity, the |U| units of demand can be routed if and only if V
0
corresponds to a Set
Cover solution. Since each vertex costs 1, the optimal objective value is therefore
exactly the same as that for Set Cover, and is therefore (by [50]) NP-hard to
approximate to within a (1 ) ln n factor for any > 0. In summary:
80
Theorem 5.7: For every > 0, it is NP-hard to approximate directed Min-MP to
within a factor of (1 ) ln n.
For Max-MP, replacing Set Cover with Min k-Cover in the above construc-
tion gives Min k-Cover hardness. Combining this with the hardness result of [50],
we get the following result.
Theorem 5.8: It is NP-hard to approximate directed Min-MP to within a factor
of 1 1/e.
Hardness of Undirected Min-MP
We show an approximation-preserving reduction from Min Vertex Cover to undi-
rected Min-MP. This implies that it’s UGC-hard to approximate to within a factor
of 2 , and NP-hard to approximate within a factor of 1.36 [48, 73]. Given a Min
Vertex Cover instance with graph G = (V, E), we create an identical graph with
each vertex v demanding one unit of processed flow from each of its neighbors, and
each edge’s capacity is 2. Further, each vertex has n units of processing potential, at
a cost of 1. Because the total demand equals the sum of all edge capacities, each unit
of flow sent must use exactly one unit of edge capacity, i.e. all flow paths have length
exactly one. Thus, the set of solutions exactly corresponds to vertex covers, with one
unit of flow going each way across each edge, from source to sink and either to or
from its point of processing. The unit costs ensure that the objective value equals
the number of vertices picked, and thus that the optimal solution to this undirected
Min-MP instance equals that of the original Min Vertex Cover. The conclusion,
summarized below, follows.
Theorem 5.9: There is an approximation-preserving reduction from Min Vertex
Cover to Min-MP.
81
Approximation Hardness for Undirected Max-MP
We show that for some fixed
0
> 0, the undirected version of Max-MP is NP-hard
to approximate within a factor of 1
0
, implying that the the problem does not
admit a PTAS unless P = NP.
We show this hardness by reducing from Max Bisection on degree-3 graphs,
shown to be hard to approximate within a factor of .997 in [21].
1
. Let G = (V, E)
be the input to the degree-3 Max Bisection instance. For each v
i
V , create two
vertices, u
i
and w
i
, joined by an edge with capacity 3. We also add a capacity-1 edge
between u
i
and u
j
whenever v
i
and v
j
are adjacent in G. Each w
i
vertex demands 3
units of flow from every u
j
(including when i = j). Further, every u
i
vertex can be
given 3|V | units of processing capacity (or, equivalently, units) at a cost of 1, and
the instance’s budget is set to |V |/2.
The intuition behind the construction is as follows. With a budget of |V |/2, we
can purchase exactly half of the u
i
vertices (and all budget is used up without loss of
generality); our bisection will be between the purchased u
i
s and the unpurchased ones.
Let b be the number of edges in any such bisection. Each w
i
adjacent to a purchased
u
i
can have 3 units of its demand satisfied by flow originating from and processed
by u
i
, and the only edge connecting w
i
to the rest of the graph ensures w
i
can never
receive more than 3 units of flow regardless. Thus, such w
i
s are maximally satisfied,
and contribute 3|V |/2 units to our objective value. The remaining w
i
s must have
their processed flow routed to them via the b capacity-1 edges in the bisection (and,
indeed, every edge in the bisection will carry 1 unit of flow when routed optimally,
as witnessed by the solution where each unprocessed u
i
receives flow on each cut-
edge and routes it directly to w
i
), so the total amount of demand satisfied by the w
i
adjacent to unpurchased vertices is exactly b.
1
To be precise, this paper shows the aforementioned hardness for Max Cut A simple approx-
imation preserving reduction from Max Cut to Max Bisection can be derived by looking at
maximum cuts of the graph formed by 2 disjoint copies of the Max Cut instance graph.
82
Hence, when there are b edges in the bisection, the objective value is exactly
3|V |/2 + b. Because b
OPT
|E|/2 = 3|V |/4, the constant-factor hardness of approx-
imating b
OPT
translates into constant-factor hardness for undirected Max-MP. In
particular, we can show the following result.
Theorem 5.10: It is NP-hard to approximate undirected Max-MP to within a
factor better than .999.
Min–k–Rep Hardness for Directed Dependency Min-MP
We start by showing Min-Rep-hardness for Dependency Min-MP. The reduction
from a given Min-Rep instance (V, E,
ˆ
V ,
ˆ
E) proceeds as follows. Let A and B be the
two partitions of V , and
ˆ
A and
ˆ
B be the two partitions of
ˆ
V . For each supervertex
in
ˆ
A and in
ˆ
B, the graph has one corresponding node (a
i
or b
i
, respectively) as well
as one node v
j
for each of the supervertex’s (constituent) vertices. Capacity-1 arcs
are added from each a
i
(supervertex) node to the {v
k
}s representing nodes in
ˆ
A
i
, as
well as from each {v
k
}
ˆ
B to the b
i
corresponding to its containing supervertex.
Purchasing (infinite) processing capacity is free at the {a
i
}s and {b
k
}s, but comes at
a cost of 1 at the {v
k
}s.
Finally, we add two vertices s and t the first with an infinite-capacity arc to
each a
i
and the second with an infinite-capacity arc from every b
k
. For each superedge
(A
i
, B
k
), we add one unit of demand from s to t specifying that the flow must be
processed at both a
i
and b
k
. It is not hard to verify that the problem of placing
processing at the v
i
nodes exactly corresponds to selecting vertices in the source
Min-Rep instance, thus establishing the hardness.
To convert this into hardness for Min–k–Rep, we add two additional nodes, q
1
and q
2
. There is an unbounded-capacity arc from s to q
1
and an unbounded-capacity
arc from q
2
to t, and q
1
is linked to q
2
with an arc of capacity (|
ˆ
V | k). Unlimited
processing capacity can be purchased at either q
1
or q
2
for free. We now modify the
83
demands so that each unit of flow can be processed either by its original required pair
(a
i
and b
j
) or by q
1
and q
2
.
This construction allows for |
ˆ
V |k units of the demand to get processed for free
via the path s q
1
q
2
t, meaning that only k units of the demand must be
processed as the original construction intended. Thus, the problem is now equivalent
to that of solving Min–k–Rep, giving the desired hardness.
Theorem 5.11: Min–k–Rep
p
Directed Dependency Min MP .
84
Chapter 6
Other Problems
In this chapter, we study a collection of problems each of which can be connected
back to the hardness of DkS. Although some sections, such as Sections 6.1 and 6.5,
only contain hardness results, other sections also include algorithmic results for their
respective problem.
6.1 MMSA
Roughly speaking, the Minimum Monotone Satisfying Assignment (MMSA)
class of problems asks for the smallest minterm of a monotone boolean function f,
i.e., the Boolean vector ~x with f(~x) = 1 that minimizes |~x|
1
. However, there’s
some subtlety involved in the exact definition of how the monotone boolean for-
mula is provided. The input f has varyingly been given as a monotone formula
(Formula MMSA/ FMMSA) [35, 49, 59], a monotone circuit (Circuit MMSA/
CMMSA) [2], and as a circuit that evaluates a monotone function but may use ¬
gates internally (Nonmonotone Circuit MMSA) [93]. The quality of an approx-
imate solution (the n in the approximation factor) has also been measured either
in terms of the formula/circuit size or in terms of the length of the input vector (in
85
which case the circuit size is assumed to be poly(n)). Because circuits generalize
formulas, CMMSA is at least as hard as FMMSA.
While each of these models has been shown to give LabelCover-hardness
(e.g. [59]) only [93] manages to establish a stronger hardness for the nonmono-
tone circuit form from standard complexity theory conjectures. Namely, Umans
shows n
1
-hardness for the problem (under either definition of n”) assuming
NP 6⊆ Quasi-P. Despite the sub-polynomial lower bounds on the other flavors of
MMSA, none of the flavors mentioned above are known to admit any nontrivial
(o(n)) approximation algorithm. In this section, we study the hardness of both
CMMSA and FMMSA.
Problem 6.1 (Circuit MMSA (CMMSA)): Given a monotone circuit, find its
smallest minterm.
Problem 6.2 (Formula MMSA (FMMSA)): Given a monotone formula, find its
smallest minterm.
6.1.1 MMSA and Min-Rep
We begin by describing the construction showing Min-Rep hardness for FMMSA.
Although this (to my knowledge) does not explicitly previously exist in the literature,
it’s a trivial modification of the known hardness from LabelCover-Min.
Theorem 6.1: Min-Rep
p
FMMSA (and thus Min-Rep
p
CMMSA)
Proof. Given a Min-Rep instance (V, E,
ˆ
V ,
ˆ
E), we can express the set of feasible
solutions to the problem as the monotone formula
^
ˆe
ˆ
E
_
(u,v)ˆe
(u v) (6.1)
Intuitively, this can be read as follows:
86
A feasible solution is one in which for all () superedges, there exists ()
an edge such that all () of its endpoints are picked.
To fit this into our Min–k–Rep framework, we want to replace the “for all su-
peredges” bit with a “for at least k superedges”. This exactly corresponds to replac-
ing the outermost with a monotone threshold formula, which can be constructed
in (deterministic) polynomial time using the AKS sorting network [1]. Thus, we
immediately attain Min–k–Rep-hardness for both problems. Formally,
Theorem 6.2: Min–k–Rep
p
FMMSA (and thus Min–k–Rep
p
CMMSA)
6.1.2 Hardness from Planted DkS Instances
While the results of Section 6.1.1 imply n
hardness for FMMSA and CMMSA
assuming Conjecture 1, we can improve the lower bound if we start from the stronger
Conjecture 2. In particular, we get the following theorem. We show that
Theorem 6.3: Conjecture 1 implies both n
1
1
/2
and n
2
1
/3
hardness for CMMSA,
where n
1
and n
2
measure the number of inputs and circuit gates in the given instance,
respectively.
Proof. Our starting point is the TSS
r
instance constructed in Theorem 3.10, where
we showed n
1
/2o
r
(1)
hardness for TSS
r
. In addition to exhibiting the aforemen-
tioned hardness for TSS
r
, this construction has two additional properties that we
will exploit: (i) all thresholds are the same constant τ = f(r), and (ii) the maximum
degree is O(
n).
The reduction is thus as follows. For each round t = 1, 2, . . . , r we have n gates,
one for each vertex. The gate corresponding to vertex v is the output of a monotone
circuit evaluating the threshold function Th
d
τ
(N
t1
(v)), where τ is the threshold of v, d
is its degree, and N
t1
(v) is the gates corresponding to the previous timestep’s gates
that are in-neighbors of v (for the first timestep, these are just the input gates).
87
Finally, the gates corresponding to the last round all feed into a gate, forming our
output.
The correctness of this circuit is easy to verify. It simulates running the activa-
tion process for all r rounds, and checks whether all vertices are active by the end
of round r. As the in-degree of each vertex in the TSS
r
instance is O(
n), the
monotone circuit construction of Friedman [56] requires at most O(τ
12.6
n log
n) =
O(
n log n) gates. Since we have rn of these threshold circuits, the total number of
gates is O(rn
n log n) = O(n
1.5
log n).
The number of inputs, n
1
, is still just n. Thus, the reduction is approximation
preserving, and so from a TSS
r
instance we attain n
1
1
/2o
1
(r)
hardness. By picking
a sufficiently large r, this gives a hardness factor of n
1
1
/2
. As the number of gates
is n
2
=
˜
O(n
1.5
), the hardness factor becomes n
1
/2
=
˜
O((n
2
2
/3
)
1
/2
) = n
2
1
/3
.
6.2 Min-Query Q&A
In the Min-Query Q&A problem, we take on the role of an automated system trying
to answer a user’s query in as few simple questions as possible. Before we formally
define the problem, we motivate it using the following scenario.
Suppose that a user wants to determine their risk of heart disease. One source
for answering this question may be Figure 6.1, provided by the NIH [55]. This table
returns an obesity class and risk score for a user as a function of their (i) age, (ii) sex,
(iii) waistline, and (iv) Body Mass Index (BMI). BMI, in turn, is itself a function of
the user’s height and weight. Thus, one depiction of the needed computations can be
expressed as in Figure 6.2.
In total, we need to ask the user up to 5 questions. While we could ask these in
an arbitrary order, some orderings might be more efficient than others. For example,
while asking a user first for their sex will not immediately provide much in the way
88
Figure 6.1: A table used for classifying a patient’s risk of heart disease. Source:
National Institute of Health [55].
Figure 6.2: An example DAG for computing heart health.
89
of determining a unique answer, learning that the user weighs 250 kg may allow
us to immediately determine that the user is almost certainly in Obesity Class III
(regardless of their other answers) and is therefore at Extremely High Risk of heart
disease.
Thus, there is some significance in the questioner’s order and selection of queries.
To formally describe the problem we’re studying, we introduce the concept of an
function composition DAG (FC). Formally, an FC k is a directed acyclic graph with
two kinds of nodes:
Attribute nodes - nodes that correspond to attributes a user may have (e.g.
height, weight, estimated risk of diabetes, etc.).
Function nodes - nodes that correspond to functions from its in-vertices (e.g.
height and weight) to its lone out-vertex (e.g. body mass index).
These two node sets are denoted A(k) and F (k), respectively. The sets A(k) and
F (k) form a bipartition of V (k) (so all out-edges of attribute nodes point to function
nodes and vice-versa), and all sources and sinks are in A(k). Source nodes, which
we call base attributes, are denoted B(k). Non-base attributes are called higher-order
attributes and are denoted H(k).
The FC is to be interpreted as follows. Every node corresponds to a real-number
value which may either be known or unknown. If the values of all in-attributes of a
function node u are known, then the value of u becomes u’s function evaluated on
precisely those values. Meanwhile, the value of an attribute can be deduced to be any
of the values of its in-functions. Base attributes, which do not have any in-functions,
form the set of questions we may ask the user.
Thus, the questions we want to study can be (roughly) stated as follows. Given
an FC k and target attribute a H(k), repeatedly query the user for values in B(k)
90
until the value of a is satisfactorily deducible according to the above rules (and then
output that value).
Of course, the term “satisfactorily deducible” is itself (intentionally) somewhat
vague. Various definitions of “satisfactorily deducible” may require additional meta-
data to be attached to each node (or, more generally, on subsets of nodes). For
example, in Min-Query Q&A (MQQA), we require a prior distribution on the val-
ues of nodes in B(k), while in another variant that we call Max-Credibility Q&A
(MCQA) requires credibility scores on nodes in both B(k) and F (k).
In this section, we focus exclusively on MQQA (details of MCQA are available
in the corresponding manuscript). In this problem, we’re given an FC as well as a
distribution over the values of the base attributes (this may or may not be a product
distribution). We then try to find an adaptive query strategy that minimizes the
number of queries we try to make.
Problem 6.3 (Min-Query Q&A (MQQA)): Given an FC and a prior distribution
on the base attributes, find the adaptive query strategy minimizing the number of
questions asked, in expectation.
Not surprisingly, this problem is very hard to solve. Even when the distribution
is a product distribution, it’s not hard to show (by a reduction from 3-SAT) that
it’s NP-hard to determine if even a single question needs to be asked. However, such
instances appear artificial and unlikely to occur in a medical setting. However, as
we show, even adding in the assumption that every function node is monotonically
increasing in each variable does not help us approximate the expected number of
queries we need to ask, even if the distribution is still promised to be a product
distribution.
Theorem 6.4: Computing the expected minimum number of questions to ask for
MQQA is CMMSA-hard even when the input distribution is a product distribution
91
and (iii) every function node computes a monotonically increasing function of its
inputs.
Proof sketch. Let C be the circuit given in an instance of CMMSA. We construct
the FC as follows. For every input node x
i
of C, we create one base attribute node b
i
.
The domain of b
i
is just {0, 1}, with each b
i
having a sufficiently small probability p
i
of
equaling 0. For each gate g
j
in C, create a function node f
j
and higher-order attribute
node h
j
. Function node f
i
takes as input the attribute nodes corresponding to the
inputs of g
i
, and outputs its result to h
i
. Finally, the function computed by f
i
is either
an or of its inputs, depending on whether g
i
was an or gate, respectively.
The target attribute is the higher-order attribute corresponding to the circuit’s output
gate. Thus, for a given set of user query responses, the target attribute’s value mimics
the (Boolean) value that the circuit will output on the analogous (Boolean) input.
The analysis then proceeds as follows. Following the construction and proof of [3],
p
i
is chosen so that the correct output of C is very likely 1, and querying any minterm
of C (a minimal subset of C’s inputs that force its evaluation to 1) will with high
probability provide a proof thereof. Conversely, it is impossible to deduce C’s output
without querying the contents of at least one of C’s minterms. Thus, with high
probability, an optimal query strategy is to query all terms of the smallest minterm
of C, which is exactly the objective value of the given MMSA instance.
6.3 Min-Cost Colored Cut
The Min-Cost Colored Cut problem, originally introduced by Rohloff et. al. in
the context of supervisory sensor selection [89], studies a natural generalization of the
classical Min-Cost s t Cut Problem. As in the classical problem, we are given
a directed graph G with a designated source s and designated sink t. Additionally,
the edge set E is partitioned into a collection of color classes; i.e. edge e is assigned
92
one color c(e) from a set C. Each color in turn, has an associated price p(c) R
+
.
The problem asks us to find the cheapest set of color classes (in terms of total price)
whose union forms an s t cut in the graph.
Problem 6.4 (Min-Cost Colored Cut (MCC)): Given a directed graph G with
a designated source s, designated sink t, and edges partitioned into color classes of
positive real cost, find the cheapest set of color classes whose union forms an s t
cut.
In this section, we establish that the problem is also CMMSA-hard, and thus, by
Theorem 6.2, also Min–k–Rep-hard.
6.3.1 CMMSA hardness of Min-Cost Colored Cut
Before we can show CMMSA hardness for MCC, we first show CMMSA-hardness
for a variant of CMMSA (see Section 6.1), which we call Min-Certificate Circuit
Monotone Satisfying Assignment (MCCMSA). The problem setup here is as
follows. Just like in CMMSA, we’re given a monotone circuit for which we want
to find an input that evaluates to True. However, while the objective function
in CMMSA is to minimize the number of inputs set to true, the one we have in
MCCMSA is a bit more confusing. In addition to specifying the set of true input
nodes as part of a candidate solution, we also need to specify a collection of gates in
the circuit itself. A solution is then feasible if (i) it includes the output gate, (ii) an
gate is part of the solution only if every one of its inputs is as well, and (iii) a
gate is part of the solution only if at least one of its inputs is as well.
This can be thought of as finding a sub-circuit of the original circuit that can
be used to directly certify that the circuit evaluates to True conditioned on the
selected input nodes (this is different from the number of gates that evaluate to
True, because we can exclude unneeded gates). The cost of a feasible solution is the
93
total number of nodes (inputs + gates) that it contains. Thus, for the same circuit,
OPT(CMMSA) < OPT(MCCMSA). Although this is closely related to a number
of the problems studied in [2], this exact variant has not (too my knowledge) been
previously described.
Problem 6.5 (Min-Certificate Circuit Monotone Satisfying Assignment
(MCCMSA)): Given a monotone circuit C, find a set of inputs y and sub-circuit C
0
proving that C outputs True on y, while minimizing |y|
1
plus the number of gates
in C
0
.
The MMSA-hardness of this problem follows from a simple padding argument.
Theorem 6.5: CMMSA
p
MCCMSA
Proof sketch. Let n be the size of the input circuit. We show that a p(n)-
approximation algorithm ALG
m
for MCCMSA (for some polynomial p(n)) implies
the existence of an O(p(n
2
))-approximation algorithm ALG
c
for CMMSA. Let C
be an instance of CMMSA on n nodes. We construct an instance C
0
of MCCMSA
by replacing every input gate x
i
of C with an gate
x
i
on a new set of n inputs
z
i,1
, z
i,2
, ··· , z
i,n
. The size of this new circuit is N = O(n
2
). We now run ALG
m
on
C
0
to get an p(N) = p(O(n)
2
) = O(p(n)
2
)-approximation algorithm for the optimal
value of MCCMSA(C
0
). Call this solution B
C
. We construct our solution ALG
c
(C)
to the original CMMSA problem by selecting exactly those x
i
which correspond to
a
x
i
gate in B
C
.
Effectively, what this construction accomplishes is a re-weighting of the input
nodes of C, with each input receiving the same (very large) weight n. While this does
not affect CMMSA solutions on C (as it just scales all costs by n), it makes sure
that the cost difference of the optimal CMMSA solution and the optimal MCCMSA
solution (which is what we’re looking for) only differ by a constant factor. Thus, we
can treat the input as an CMMSA instance without losing much in the way of our
94
approximation factor. Because the number of nodes blows up by a quadratic factor,
we lose that factor in our approximation ratio.
We now show that Min-Cost Colored Cut is MCCMSA-hard, thereby es-
tablishing CMMSA-hardness for the problem.
Theorem 6.6: MCCMSA
p
MCC
Proof sketch. Consider a candidate construction where we create a MCC instance
graph G as follows. First, we create a color class that has (effectively) infinite cost;
we slightly abuse notation and call elements of this class colorless. We create two
vertices, s and t, between which we want to find the cut. For each input gate x
i
in C,
we create two associated nodes v
in
i
and v
out
i
in G. Additionally, we create two nodes
u
in
i
and u
out
i
(resp. w
in
i
and w
out
i
) in G for every
i
(resp.
i
) gate in C. Call the three
sets of nodes X, U, and W , respectively. For each node i in X U W , we create
one color c
i
.
We then connect the nodes as follows. Each v
in
i
is connected to v
out
i
with an edge
of color c
v
i
. Each u
in
i
(resp. v
in
i
) is connected to u
out
i
(resp. v
out
i
) with a
V
(resp.
W
)
gadget, where these gadgets allow flow through them if and only if all (resp. one) of
u
in
i
(resp. v
in
i
)’s in-neighbors get flow to them. Next, if gate g
i
takes a
j
as input in C,
then g
i
’s corresponding in-vertex takes in a colorless arc from a
j
’s out-vertex. If y is
the output gate of C, then y is connected to t with a colorless arc. Finally, for each
gate z (whether it’s an input vertex or a circuit vertex), we connect s to the gate’s
corresponding out vertex with an arc of color c(z).
The upshot is that picking a (finite-cost) color to cut corresponds to picking a
single gate to delete, and a (finite-cost) MCCMSA solution (i.e. an entire certificate)
certifies (exactly) an s t path. However, this is the complement of what we wanted:
we want MCCMSA solutions to correspond to cuts, not flows. To fix this issue, we
flip all gates to and all gates to before carrying through with the reduction.
95
Temporarily interpreting inputs as their negation (i.e. “on” becomes “off and vice-
versa), this trick makes the circuit compute the complement function, and any would-
be path under the original reduction now becomes a cut, as desired.
6.4 Min-k-Union
The Min-k-Union problem can be viewed as a natural minimization complement of
the Max-k-Cover. It can be states as follows.
Problem 6.6 (Min-k-Union (MkU)): Given an element universe U of size n, a set
system S of size m, and a budget k, find the collection of exactly k sets in S whose
union is minimized.
Replacing the word “minimum” with “maximum” exactly recovers the definition
of Max-k-Cover. Although natural, this problem has (too our knowledge) received
surprisingly little attention until late 2015, when we and another group ([38]) inde-
pendently derived the same O(
m) approximation algorithm.
Subsequeny work of Chlamtac et. al. utilized the log-density framework pioneered
by [23] to improve the approximation ratio to O(m
1/4+
), which they conjecture is
(effectively) optimal [40].
Conjecture 3 (The Min-k-Union Conjecture): There is no m
1/4
polynomial-time
approximation algorithm for Min-k-Union.
6.4.1 An O(
m)-approximation for Min-k-Union
In this section, we describe a simple O(
m) approximation algorithm for MkU.
Theorem 6.7: There exists a polynomial-time 2
m approximation algorithm for
MkU.
The first step in the algorithm is deriving the trivial k-approximation algorithm.
96
Lemma 6.8: Greedily picking the set that introduces the fewest number of new
elements results in a solution of size at most k · |OPT|.
Proof of Lemma 6.8. Each of the k sets in the optimal solution contains at most OPT
elements. At each step of the greedy algorithm, at least one of these sets is available,
and can increase the number of introduced elements by no more than |OPT|.
While we’d like to combine this with an O(m/k) approximation algorithm, we
were unable to find an algorithm that provably bounds the approximation ratio that
well when k is large. However, as we’ll see shortly, we can get pretty close.
For a set S
0
S, let the expansion of S
0
be defined as e(S
0
) = |δ(S
0
)|/|S
0
|,
where δ(S
0
) is the union of all sets in S
0
. The Min-k-Union problem can be taken as
finding the S
0
of size k with minimum expansion. While we can’t solve the constrained
problem, the unconstrained one is just an instance of Submodular Minimization,
which can be solved in strongly polynomial time [68].
1
This inspires the following alternate greedy algorithm: greedily keep picking sub-
sets of S of minimum expansion until we pick k sets. While this approach doesn’t
quite give us what we want, we can still prove useful bounds on how it performs.
Lemma 6.9: Suppose that the alternate greedy algorithm first exceeds k picked sets
after t iterations, and let S
1
, S
2
, ··· , S
t
be of sequence of sets picked up by the end of
each iteration (so S
i
S
i+1
). Then the total number of elements covered is at most
|OPT| ·
m
k−|S
t1
|
.
Proof of Lemma 6.9. Let T
i
= S
i
\ S
i1
, where S
0
= , and let S
denote the (un-
known) optimal solution to the Min-k-Union instance. At round i, selecting the sets
of S
\S
i1
adds |S
\S
i1
| k |S
i1
| sets to our partial solution while introducing
1
Both we and [38] also derive flow-based and LP-based algorithms to solve this expansion problem
faster than by black-box reduction to submodular minimization. For simplicity, details are omitted.
97
at most |OPT| new elements. Thus, δ(T
i
)/|T
i
|
|OPT|
k−|S
i1
|
. Consequently,
δ(T
i
)
|T
i
| · |OPT|
k |S
i1
|
|OPT| ·
|T
i
|
k |S
t1
|
.
Since at most m sets can be picked in total, the total number of elements intro-
duced is at most |OPT| ·
m
k−|S
t1
|
.
Thus, the approximation factor of this alternate greedy algorithm can only suffer
when, in the last iteration, we were only a small number of sets away from having
gathered the requisite k sets. Fortunately, this is exactly when the original greedy
algorithm shines: the proof of Lemma 6.8 actually shows that when there are at most
k
0
sets left to pick, picking them greedily costs at most k
0
·|OPT|. With these lemmas
in mind, we’re ready to prove Theorem 6.7
Proof of Theorem 6.7. Our algorithm is as follows. For some number β < k, we keep
picking sets using the “alternate greedy algorithm” until we have picked up at least β
sets, at which point we switch to the original greedy algorithm to pick up sets until we
have k total. If we accidentally pick up more than k sets in the first step, we discard
sets arbitrarily until we get down to k. We then return the sets that we selected.
By (the proofs of) Lemmas 6.8 and 6.9, the total number of elements picked up
by this approach is at most
|OPT| · (k β) + |OPT| ·
m
kβ
= |OPT| ·
(k β) +
m
kβ
Picking β = k
m, this equals
|OPT| ·
m
k(k
m)
+ (k (k
m))
= |OPT| ·
m
m
+
m
= |OPT| · 2
m,
giving a 2
m approximation to the Min-k-Union problem.
98
6.5 Network Flow Interdiction
In the Network Flow Interdiction problem, we play the role of an adversary
trying to minimize the maximum flow routable between a single source/sink pair.
Problem 6.7 (Network Flow Interdiction (NFI)): Given a graph G with
both costs and capacities on its edges, as well as two designated vertices s and t and
a budget k, find a collection I of edges to remove from the graph so that (i) the sum
of the edge costs in I does not exceed k and (ii) the maximum flow from s to t is
minimized.
This problem can also be viewed as a generalization of the Minimum Cut problem. By the Max-Flow-Min-Cut theorem, this problem is equivalent to that of cutting edges so as to minimize the capacity of the smallest (s, t) cut. When the budget is no less than the cost of the minimum cut, this problem can be solved using standard Minimum Weighted Cut algorithms. However, when the budget can be smaller, this problem has long been known to be NP-hard [94]. In this section, we show that the problem is actually Min-k-Union-hard to approximate. Consequently, Conjecture 1 implies that the problem has no n^ε approximation algorithm, and both Conjecture 2 and Conjecture 3 imply n^{1/8} and n^{1/4} hardness, respectively.
Next, we discuss a variant of NFI where the quality of a solution is measured
in terms of the amount by which the maximum flow is reduced (as opposed to the
amount of flow that remains). A simple reduction shows that it is NP-hard to
determine if the maximum flow can be reduced by any amount, implying hardness of
approximation to any multiplicative factor.
Problem 6.8 (Max-Difference (MDNFI)): Given a graph G with both costs
and capacities on its edges, as well as two designated vertices s and t and a budget
k, find a collection I of edges to remove from the graph so that (i) the sum of the
edge costs in I does not exceed k and (ii) the maximum flow from s to t is reduced
by the maximum amount.
Both of these hardness results were discovered in parallel by Chestnut and Zenklusen [34].
6.5.1 SkES hardness of Network Flow Interdiction
As noted in [6], much of the difficulty in understanding NFI stems from our lack of
understanding of the R-Interdiction Covering (RIC) problem. RIC, originally
introduced in [41], is defined as follows:
Problem 6.9 (R-Interdiction Covering (RIC)): Given a bipartite graph G = ((V_1, V_2), E) and a budget k, find a subset V′ ⊆ V_1 of cardinality k so as to maximize the number of vertices in V_2 whose neighborhood is a subset of V′.
Despite the above, [6] actually shows an approximation-preserving reduction from a version of RIC with a complemented objective function, i.e., one in which we want to minimize the number of vertices with any neighbor in V′. However, viewing vertices in V_1 as sets and those in V_2 as elements, this complemented problem is exactly the Min-k-Union (MkU) problem previously discussed in Section 6.4. Consequently, we get the following theorem.
Theorem 6.10: There is an approximation-preserving reduction from Min-k-Union
to Network Flow Interdiction.
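
As a quick illustration of the correspondence used in this theorem, here is a minimal sketch (names are ours) translating the complemented-RIC view of a bipartite graph into a Min-k-Union set system:

    def complemented_ric_to_mku(V1, V2, edges):
        # Each vertex u in V1 becomes the set of its neighbors in V2; choosing
        # V' of cardinality k to minimize the number of V2-vertices with a
        # neighbor in V' is then exactly Min-k-Union on this set system.
        sets = {u: set() for u in V1}
        for u, w in edges:  # each edge (u, w) has u in V1, w in V2
            sets[u].add(w)
        return sets

    print(complemented_ric_to_mku({'a', 'b'}, {1, 2, 3},
                                  [('a', 1), ('a', 2), ('b', 2), ('b', 3)]))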
6.5.2 Complete Inapproximability of MDNFI
In this section, we show that MDNFI is hard to approximate to any multiplicative
factor.
Theorem 6.11: It is strongly NP-hard to determine if an MDNFI instance has a so-
lution with a strictly positive objective value. Therefore, MDNFI is inapproximable
to any multiplicative factor.
Proof. We reduce from the decision version of NFI. Consider an instance of NFI, in which we're given a graph G and are asked if a budget k is sufficient to reduce the maximum s-t flow to below some value r < MaxFlow(G, s, t). We construct the graph G′ for our MDNFI instance by copying G and adding a new vertex t′ adjacent only to t via a new edge ê with capacity r and interdiction cost k + 1. The budget, as well as all capacities and interdiction costs of the copied edges, remains the same.

Because the new edge ê cannot be removed without exceeding the budget, the maximum s-t′ flow in G′ is min(MaxFlow(G′, s, t), r), which by our construction equals min(MaxFlow(G, s, t), r) = r. Thus, determining whether or not the maximum s-t′ flow can be reduced at all by removing a feasible selection of edges from the graph is equivalent to asking if there exists some feasible set of edges whose removal causes the maximum s-t flow in G to fall below r. As this resultant MDNFI problem is equivalent to the given instance of NFI, we conclude that approximating MDNFI to any factor is at least as hard as the decision version of NFI, and thus that it is NP-hard as well.
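
For concreteness, here is a minimal sketch of this construction using networkx; the attribute names 'capacity' and 'cost' and the vertex label 't_prime' are our assumptions for illustration, not notation from the thesis.

    import networkx as nx

    def mdnfi_from_nfi(G, s, t, k, r):
        # Build the MDNFI instance (G', s, t', budget k) of Theorem 6.11 from
        # an NFI decision instance (G, s, t, budget k, threshold r).  Edges of
        # G are assumed to carry 'capacity' and 'cost' attributes.
        Gp = G.copy()
        t_prime = "t_prime"  # fresh vertex, assumed not already present in G
        # The appended edge has capacity r and interdiction cost k + 1, so no
        # budget-feasible interdiction set can afford to remove it.
        Gp.add_edge(t, t_prime, capacity=r, cost=k + 1)
        return Gp, s, t_prime, k

    # The maximum s-t' flow in G' equals min(MaxFlow(G, s, t), r) = r, so it
    # can be decreased at all iff budget k suffices to push the s-t max flow
    # of G below r.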
Open Problems
Below, we collect some problems left open by work in the thesis.
DkS and Min-Rep In Chapter 2, we showed that DkS ≤_p^C Min-Rep under randomized reductions. Can we remove the randomness from this reduction? Even allowing ourselves 2^{n^{1−ε}} time in this reduction would imply a new ETH-based hardness of approximation for Min-Rep. Are either of DkS ≤_p Min-Rep or Min-Rep ≤_p DkS true? Can we show n^ε hardness from either via standard complexity-theoretic conjectures?
Binary Search with Monotone Thresholds In Section 4.5, we enumerate over all vectors in [n]^T to find the single vector with largest sum for which some oracle outputs Yes. One aspect we did not exploit is that the set of vectors this oracle accepts is (downwards) monotone: if a vector v is accepted, then every vector u pointwise no greater than v is also accepted. If T were equal to 1, this monotonicity could be exploited to reduce the search space to log n queries via binary search, as in the sketch below. How the optimal querying strategy scales with T remains, to my knowledge, still open.
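
For the T = 1 case just mentioned, the binary search is the standard one; a minimal sketch (the oracle callable is hypothetical):

    def largest_accepted(n, oracle):
        # The accepted values form a downward-closed subset of {0, ..., n}, so
        # the largest accepted value is found with O(log n) oracle queries.
        lo, hi = 0, n
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if oracle(mid):  # monotone: if mid is accepted, so is everything below
                lo = mid
            else:
                hi = mid - 1
        return lo

    assert largest_accepted(100, lambda v: v <= 17) == 17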
Recursive planted instances In Section 3.3, we showed that the Planted
Dense Subgraph Conjecture remains just as hard under a certain recursive
construction. Outside of those already described in the thesis, are there any other
consequences of this result? The recursive construction works in discrete steps, but
one can also imagine “smooth” versions of this problem in which subgraphs become
continuously more dense as we focus in on specific regions of the graph. Can we show
that such a continuous model is also as hard as the original conjecture? Would such
a result imply anything interesting?
Hardness of Target Set Selection The key result of Chapter 3 is that it is hard (assuming Conjecture 2) to approximate Target Set Selection to within a factor of n^{1/2}. However, no o(n)-approximation algorithm is known for the problem. One limit to significantly improving the known hardness is that the Planted Dense Subgraph Conjecture explicitly assumes that the dense region is of size n^{1/2}. However, natural modifications to this conjecture could allow for planted sets as large as n^{1−ε}, at the cost of a smaller (but still polynomial) gap in the relative density of the planted region. Can we get conditional n^{1−ε} hardness for TSS using this approach?
Middlebox Routing In Chapter 5, we introduced a number of network design
problems focused on Middlebox Routing and provided simple approximation and
hardness results for these problems. However, many of these results were only “proof
of concept”-type bounds, such as the .999 hardness of undirected Max-MP. Can
we get tighter results for these problems? For the steering and routing problem, do things change much when the processing requirement of a flow need not vary linearly with its size?
Min-Query Q&A In light of Theorem 6.4, can we derive any interesting algo-
rithms for MQQA? What structure do realistic instances of the problem have that
may make this problem manageable? In practice, is minimizing the length of the di-
alog even the right way to look at the problem? We describe an alternate credibility-
maximization objective function in [83] and show its linear-time solvability. Are there
any natural hybrids of these objectives that admit helpful algorithms?
Bibliography
[1] Miklós Ajtai, János Komlós, and Endre Szemerédi. Sorting in c log n parallel steps. Combinatorica, 3(1):1–19, 1983.
[2] Michael Alekhnovich, Sam Buss, Shlomo Moran, and Toniann Pitassi. Minimum
propositional proof length is NP-hard to linearly approximate. Mathematical
Foundations of Computer Science 1998, pages 176–184, 1998.
[3] Sarah R Allen, Lisa Hellerstein, Devorah Kletenik, and Tonguç Ünlüyurt. Evaluation of monotone DNF formulas. Algorithmica, pages 1–25, 2015.
[4] Noga Alon, Michael Krivelevich, and Benny Sudakov. Finding a large hidden
clique in a random graph. Random Structures & Algorithms, 13(3-4):457–466,
1998.
[5] Noga Alon and Joel H Spencer. The Probabilistic Method. John Wiley & Sons,
2004.
[6] Douglas S Altner, Özlem Ergun, and Nelson A. Uhan. The maximum flow network interdiction problem: valid inequalities, integrality gaps, and approximability. Operations Research Letters, 38(1):33–38, 2010.
[7] Reid Andersen and Kumar Chellapilla. Finding dense subgraphs with size
bounds. In International Workshop on Algorithms and Models for the Web-
Graph, pages 25–37. Springer, 2009.
[8] Alexander Reinthal, Anton Törnqvist, Arvid Andersson, Erik Norlander, Philip Stålhammar, and Sebastian Norlin. Finding the densest common subgraph with linear programming. Unpublished manuscript, 2016.
[9] S. Antonakopoulos and G. Kortsarz. Lower bounds for non-preemptive k-dial-a-ride and for the k-Steiner forest problem, August 2013.
[10] Bilal Anwer, Theophilus Benson, Nick Feamster, and Dave Levin. Programming
Slick network functions. In Proceedings of Symposium on SDN Research, June
2015.
[11] Benny Applebaum, Boaz Barak, and Avi Wigderson. Public-key cryptography
from different assumptions. In Proceedings of the forty-second ACM symposium
on Theory of computing, pages 171–180, 2010.
[12] Sanjeev Arora, László Babai, Jacques Stern, and Z Sweedyk. The hardness of approximate optima in lattices, codes, and systems of linear equations. In Proceedings of the 34th IEEE Symposium on Foundations of Computer Science, pages 724–733. IEEE, 1993.
[13] Sanjeev Arora, Boaz Barak, Markus Brunnermeier, and Rong Ge. Computational
complexity and information asymmetry in financial products. In Proceedings of
The First Symposium on Innovations in Computer Science, pages 49–65, 2010.
[14] Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update
method: a meta-algorithm and applications. Theory of Computing, 8(1):121–164,
2012.
[15] Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update
method: A meta-algorithm and applications. Theory of Computing, 8(1):121–
164, 2012.
[16] Sanjeev Arora and Carsten Lund. Hardness of approximations. In Dorit S.
Hochbaum, editor, Approximation Algorithms for NP-hard Problems, chapter 10.
PWS, 1997.
[17] Pranjal Awasthi, Moses Charikar, Kevin A. Lai, and Andrej Risteski. Label opti-
mal regret bounds for online local learning. In Proceedings of the 28th Conference
on Learning Theory (COLT), pages 150–166, 2015.
[18] Roger C Baker, Glyn Harman, and János Pintz. The difference between consecutive primes, II. Proceedings of the London Mathematical Society, 83(3):532–562, 2001.
[19] Cristina Bazgan, Morgan Chopin, Andr´e Nichterlein, and Florian Sikora. Pa-
rameterized approximability of maximizing the spread of influence in networks.
CoRR, abs/1303.6907, 2013.
[20] Oren Ben-Zwi, Danny Hermelin, Daniel Lokshtanov, and Ilan Newman.
Treewidth governs the complexity of target set selection. Discrete Optimization,
8(1):87–96, 2011.
[21] Piotr Berman and Marek Karpinski. On Some Tighter Inapproximability Results.
Springer, 1999.
[22] Quentin Berthet and Philippe Rigollet. Complexity theoretic lower bounds for
sparse principal component detection. In Conference on Learning Theory, pages
1046–1066, 2013.
[23] Aditya Bhaskara, Moses Charikar, Eden Chlamtac, Uriel Feige, and Aravindan Vijayaraghavan. Detecting high log-densities: an O(n^{1/4}) approximation for densest k-subgraph. In STOC, pages 201–210, 2010.
[24] Arnab Bhattacharyya, Elena Grigorescu, Kyomin Jung, Sofya Raskhodnikova,
and David P Woodruff. Transitive-closure spanners. SIAM Journal on Comput-
ing, 41(6):1380–1425, 2012.
[25] Mark Braverman, Young Kun Ko, Aviad Rubinstein, and Omri Weinstein. ETH hardness for densest k-subgraph with perfect completeness. In SODA, pages 1326–1341, 2017.
[26] S Charles Brubaker and Santosh Vempala. Random tensors and planted cliques.
In APPROX-RANDOM, pages 406–419. Springer, 2009.
[27] Deeparnab Chakrabarty, Ravishankar Krishnaswamy, Shi Li, and Srivatsan
Narayanan. Capacitated network design on undirected graphs. In Approxi-
mation, Randomization, and Combinatorial Optimization. Algorithms and Tech-
niques, pages 71–80. Springer, 2013.
[28] Moses Charikar. Greedy approximation algorithms for finding dense components
in a graph. In International Workshop on Approximation Algorithms for Com-
binatorial Optimization, pages 84–95. Springer, 2000.
[29] Moses Charikar, Yonatan Naamad, Jennifer Rexford, and X. Kelvin Zou. Multi-
commodity flow with in-network processing, submitted.
[30] Moses Charikar, Yonatan Naamad, and Anthony Wirth. On approximating tar-
get set selection. In APPROX, pages 4:1–4:16, 2016.
[31] Moses Charikar, Yonatan Naamad, and Anthony Wirth. On DkS hardness for minrep-hard problems. Unpublished manuscript, 2017.
[32] Moses Charikar, Yonatan Naamad, and Jimmy Wu. On finding dense common subgraphs. Unpublished manuscript, 2017.
[33] Ning Chen. On the approximability of influence in social networks. SIAM Journal
on Discrete Mathematics, 23(3):1400–1415, 2009.
[34] Stephen R Chestnut and Rico Zenklusen. Hardness and approximation for net-
work flow interdiction. arXiv preprint arXiv:1511.02486, 2015.
[35] Ramkumar Chinchani, Duc Ha, Anusha Iyer, Hung Q. Ngo, and Shambhu Upad-
hyaya. On the hardness of approximating the Min-Hack problem. Journal of
Combinatorial Optimization, 9(3):295–311, 2005.
[36] M Chiosi et al. Network Functions Virtualisation: Introductory white paper. In
SDN and OpenFlow World Congress, Oct 2012.
[37] Eden Chlamtac and Michael Dinitz. Lowest degree k-spanner: Approximation and hardness. In LIPIcs-Leibniz International Proceedings in Informatics, volume 28. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2014.
[38] Eden Chlamtac, Michael Dinitz, Christian Konrad, Guy Kortsarz, and George
Rabanca. The densest k–subhypergraph problem. In LIPIcs-Leibniz Interna-
tional Proceedings in Informatics, volume 60. Schloss Dagstuhl-Leibniz-Zentrum
fuer Informatik, 2016.
[39] Eden Chlamtac, Michael Dinitz, and Robert Krauthgamer. Everywhere-sparse
spanners via dense subgraphs. In FOCS, pages 758–767, 2012.
[40] Eden Chlamtac, Michael Dinitz, and Yury Makarychev. Minimizing the union: Tight approximations for small set bipartite vertex expansion. In SODA, pages 881–899, 2017.
[41] Richard L Church, Maria P Scaparra, and Richard S Middleton. Identifying
critical infrastructure: the median and covering facility interdiction problems.
Annals of the Association of American Geographers, 94(3):491–502, 2004.
[42] Julia Chuzhoy, Sudipto Guha, Eran Halperin, Sanjeev Khanna, Guy Kortsarz, Robert Krauthgamer, and Joseph Seffi Naor. Asymmetric k-center is log* n-hard to approximate. Journal of the ACM (JACM), 52(4):538–551, 2005.
[43] Julia Chuzhoy, Yury Makarychev, Aravindan Vijayaraghavan, and Yuan Zhou.
Approximation algorithms and hardness of the k–route cut problem. ACM Trans-
actions on Algorithms (TALG), 12(1):2, 2016.
[44] Ferdinando Cicalese, Gennaro Cordasco, Luisa Gargano, Martin Milanič, Joseph Peters, and Ugo Vaccaro. Spread of influence in weighted networks under time and budget constraints. Theoretical Computer Science, 586:40–58, 2015.
[45] Ferdinando Cicalese, Gennaro Cordasco, Luisa Gargano, Martin Milanič, and Ugo Vaccaro. Latency-bounded target set selection in social networks. Theoretical Computer Science, 535:1–15, 2014.
[46] Amin Coja-Oghlan, Uriel Feige, Michael Krivelevich, and Daniel Reichman. Con-
tagious sets in expanders. In Proceedings of the Twenty-Sixth Annual ACM-SIAM
Symposium on Discrete Algorithms, pages 1953–1987. SIAM, 2015.
[47] Michael Dinitz and Gordon T Wilfong. iBGP and constrained connectivity. In
APPROX-RANDOM, pages 122–133. Springer, 2012.
[48] Irit Dinur and Samuel Safra. On the hardness of approximating minimum vertex
cover. Annals of Mathematics, pages 439–485, 2005.
[49] Irit Dinur and Shmuel Safra. On the hardness of approximating label-cover.
Information Processing Letters, 89(5):247–254, 2004.
[50] Irit Dinur and David Steurer. Analytical approach to parallel repetition. In
Proceedings of the 46th Annual ACM Symposium on Theory of Computing, pages
624–633. ACM, 2014.
[51] Pedro Domingos and Matt Richardson. Mining the network value of customers.
In Proceedings of the seventh ACM SIGKDD international conference on Knowl-
edge discovery and data mining, pages 57–66. ACM, 2001.
[52] Uriel Feige and Robert Krauthgamer. Finding and certifying a large hidden
clique in a semirandom graph. Random Structures & Algorithms, 16(2):195–208,
2000.
[53] Uriel Feige and Dorit Ron. Finding hidden cliques in linear time. In 21st In-
ternational Meeting on Probabilistic, Combinatorial, and Asymptotic Methods in
the Analysis of Algorithms (AofA’10), pages 189–204. Discrete Mathematics and
Theoretical Computer Science, 2010.
[54] Richard Feynman. “Surely You’re Joking, Mr. Feynman!”: Adventures of a
Curious Character. W.W. Norton, New York, 1985.
[55] North American Association for the Study of Obesity, Lung National Heart,
Blood Institute, et al. The practical guide: identification, evaluation, and treat-
ment of overweight and obesity in adults. NIH Publication No. 00-4084, 2006.
[56] Joel Friedman. Constructing O(n log n) size monotone formulae for the kth
threshold function of n Boolean variables. SIAM Journal on Computing,
15(3):641–654, 1986.
[57] Naveen Garg and Jochen Koenemann. Faster and simpler algorithms for multi-
commodity flow and other fractional packing problems. SIAM Journal on Com-
puting, 37(2):630–652, 2007.
[58] Oded Goldreich and Shafi Goldwasser. On the possibility of basing cryptography on the assumption that P ≠ NP. In IACR Cryptology ePrint Archive, number 05 in 98, 1998.
[59] Michael Goldwasser and Rajeev Motwani. Intractability of assembly sequencing:
Unit disks in the plane. In WADS, pages 307–320, 1997.
[60] Eric Goles, Pedro Montealegre, Ville Salo, and Ilkka Törmä. PSPACE-completeness of majority automata networks. Theoretical Computer Science, 609:118–128, 2016.
[61] Bruce Hajek, Yihong Wu, and Jiaming Xu. Computational lower bounds for
community detection on random graphs. In Conference on Learning Theory,
pages 899–928, 2015.
[62] Eran Halperin and Robert Krauthgamer. Polylogarithmic inapproximability. In
Proceedings of the Thirty-fifth Annual ACM Symposium on Theory of Computing,
STOC ’03, pages 585–594, New York, NY, USA, 2003. ACM.
[63] Elad Hazan and Robert Krauthgamer. How hard is it to approximate the best Nash equilibrium? SIAM Journal on Computing, 40(1):79–91, 2011.
[64] Victor Heorhiadi, Michael K. Reiter, and Vyas Sekar. Simplifying software-
defined network optimization using SOL. In 13th USENIX Symposium on Net-
worked Systems Design and Implementation (NSDI 16), pages 223–237, Santa
Clara, CA, March 2016. USENIX Association.
[65] John J Hopfield. Neural networks and physical systems with emergent collec-
tive computational abilities. Proceedings of the national academy of sciences,
79(8):2554–2558, 1982.
[66] Johan Håstad. Clique is hard to approximate within n^{1−ε}. Acta Math., 182(1):105–142, 1999.
[67] Oscar H Ibarra and Chul E Kim. Fast approximation algorithms for the knapsack
and sum of subset problems. Journal of the ACM (JACM), 22(4):463–468, 1975.
[68] Satoru Iwata, Lisa Fleischer, and Satoru Fujishige. A combinatorial strongly
polynomial algorithm for minimizing submodular functions. Journal of the ACM
(JACM), 48(4):761–777, 2001.
[69] Vinay Jethava and Niko Beerenwinkel. Finding dense subgraphs in relational
graphs. In Joint European Conference on Machine Learning and Knowledge
Discovery in Databases, pages 641–654. Springer, 2015.
[70] Y. Jin, Y. Wen, and C. Westphal. Towards joint resource allocation and routing
to optimize video distribution over future internet. In IFIP Networking Confer-
ence (IFIP Networking), 2015, pages 1–9, May 2015.
[71] David Kempe, Jon Kleinberg, and Éva Tardos. Maximizing the spread of influence through a social network. In KDD, pages 137–146, 2003.
[72] Subhash Khot. Ruling out PTAS for graph min-bisection, dense k–subgraph,
and bipartite clique. SIAM Journal on Computing, 36(4):1025–1071, 2006.
[73] Subhash Khot and Oded Regev. Vertex cover might be hard to approximate to within 2 − ε. Journal of Computer and System Sciences, 74(3):335–349, 2008.
[74] Guy Kortsarz. On the hardness of approximating spanners. Algorithmica,
30(3):432–450, 2001.
[75] Guy Kortsarz, Vahab S Mirrokni, Zeev Nutov, and Elena Tsanko. Approxi-
mating minimum-power degree and connectivity problems. In Latin American
Symposium on Theoretical Informatics, pages 423–435. Springer, 2008.
[76] Guy Kortsarz and David Peleg. On choosing a dense subgraph. In Proceedings of
the 34th IEEE Symposium on Foundations of Computer Science, pages 692–701.
IEEE, 1993.
[77] Luděk Kučera. Expected complexity of graph partitioning problems. Discrete Applied Mathematics, 57(2-3):193–212, 1995.
[78] Edmund Landau. Über die Maximalordnung der Permutationen gegebenen Grades. Archiv der Math. und Phys., 3:92–103, 1903.
[79] L Lapicque. Recherches quantitatives sur l'excitation électrique des nerfs traitée comme une polarisation. J Physiol Pathol Gen, 9:620–635, 1907.
[80] Xin Li and Chen Qian. A survey of network function placement. In 2016
13th IEEE Annual Consumer Communications Networking Conference (CCNC),
pages 948–953, Jan 2016.
[81] Pasin Manurangsi. Almost-polynomial ratio ETH-hardness of approximating densest k-subgraph. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 954–961. ACM, 2017.
[82] Pasin Manurangsi and Dana Moshkovitz. Improved approximation algorithms
for projection games. Algorithmica, 77(2):555–594, 2017.
[83] Yonatan Naamad and Nina Mishra. Answering partially-specified questions with dialog. Unpublished manuscript, 2017.
[84] Zeev Nutov. Approximating steiner networks with node-weights. SIAM Journal
on Computing, 39(7):3001–3022, 2010.
[85] OPNFV. OPNFV: An open platform to accelerate NFV. Linux Foundation, https://www.opnfv.org/.
[86] Svatopluk Poljak and Miroslav Sura. On periodical behaviour in societies with
symmetric influences. Combinatorica, 3(1):119–121, 1983.
[87] Zafar Ayyub Qazi, Cheng-Chun Tu, Luis Chiang, Rui Miao, Vyas Sekar, and
Minlan Yu. SIMPLE-fying Middlebox Policy Enforcement Using SDN. In Pro-
ceedings of ACM SIGCOMM, pages 27–38. ACM, 2013.
[88] Matthew Richardson and Pedro Domingos. Mining knowledge-sharing sites for
viral marketing. In KDD, pages 61–70, 2002.
[89] Kurt R Rohloff, Samir Khuller, and Guy Kortsarz. Approximating the mini-
mal sensor selection for supervisory control. Discrete event dynamic systems,
16(1):143–170, 2006.
[90] Konstantinos Semertzidis, Evaggelia Pitoura, Evimaria Terzi, and Panayiotis
Tsaparas. Best friends forever (BFF): Finding lasting dense subgraphs. arXiv
preprint arXiv:1612.05440, 2016.
[91] Justine Sherry, Shaddi Hasan, Colin Scott, Arvind Krishnamurthy, Sylvia Rat-
nasamy, and Vyas Sekar. Making middleboxes someone else’s problem: Network
processing as a cloud service. In Proceedings of the ACM Conference on Applica-
tions, Technologies, Architectures, and Protocols for Computer Communication,
SIGCOMM ’12, pages 13–24. ACM, 2012.
[92] Maxim Sviridenko. A note on maximizing a submodular set function subject to
a knapsack constraint. Operations Research Letters, 32(1):41–43, 2004.
[93] Christopher Umans. Hardness of approximating Σ_2^p minimization problems. In FOCS, pages 465–474, 1999.
[94] R Kevin Wood. Deterministic network interdiction. Mathematical and Computer
Modelling, 17(2):1–18, 1993.