Much recent research effort has been directed to the development of efficient algorithms for solving minimax problems with theoretical convergence guarantees due to the relevance of these problems to a few emergent applications. In this paper, we propose a unified single-loop alternating gradient projection (AGP) algorithm for solving smooth nonconvex-(strongly) concave and (strongly) convex-nonconcave minimax problems. AGP employs simple gradient projection steps for updating the primal and dual variables alternatively at each iteration. We show that it can find an $\varepsilon$-stationary point of the objective function in $\mathcal{O}\left( \varepsilon ^{-2} \right)$ (resp. $\mathcal{O}\left( \varepsilon ^{-4} \right)$) iterations under nonconvex-strongly concave (resp. nonconvex-concave) setting. Moreover, its gradient complexity to obtain an $\varepsilon$-stationary point of the objective function is bounded by $\mathcal{O}\left( \varepsilon ^{-2} \right)$ (resp., $\mathcal{O}\left( \varepsilon ^{-4} \right)$) under the strongly convex-nonconcave (resp., convex-nonconcave) setting. To the best of our knowledge, this is the first time that a simple and unified single-loop algorithm is developed for solving both nonconvex-(strongly) concave and (strongly) convex-nonconcave minimax problems. Moreover, the complexity results for solving the latter (strongly) convex-nonconcave minimax problems have never been obtained before in the literature. Numerical results show the efficiency of the proposed AGP algorithm. Furthermore, we extend the AGP algorithm by presenting a block alternating proximal gradient (BAPG) algorithm for solving more general multi-block nonsmooth nonconvex-(strongly) concave and (strongly) convex-nonconcave minimax problems. We can similarly establish the gradient complexity of the proposed algorithm under these four different settings.
Continuous DR-submodular functions are a class of generally non-convex/non-concave functions that satisfy the Diminishing Returns (DR) property, which implies that they are concave along non-negative directions. Existing work has studied monotone continuous DR-submodular maximization subject to a convex constraint and provided efficient algorithms with approximation guarantees. In many applications, such as computing the stability number of a graph, the monotone DR-submodular objective function has the additional property of being strongly concave along non-negative directions (i.e., strongly DR-submodular). In this paper, we consider a subclass of $L$-smooth monotone DR-submodular functions that are strongly DR-submodular and have a bounded curvature, and we show how to exploit such additional structure to obtain faster algorithms with stronger guarantees for the maximization problem. We propose a new algorithm that matches the provably optimal $1-\frac{c}{e}$ approximation ratio after only $\lceil\frac{L}{\mu}\rceil$ iterations, where $c\in[0,1]$ and $\mu\geq 0$ are the curvature and the strong DR-submodularity parameter. Furthermore, we study the Projected Gradient Ascent (PGA) method for this problem, and provide a refined analysis of the algorithm with an improved $\frac{1}{1+c}$ approximation ratio (compared to $\frac{1}{2}$ in prior works) and a linear convergence rate. Experimental results illustrate and validate the efficiency and effectiveness of our proposed algorithms.
Majority-SAT is the problem of determining whether an input $n$-variable formula in conjunctive normal form (CNF) has at least $2^{n-1}$ satisfying assignments. Majority-SAT and related problems have been studied extensively in various AI communities interested in the complexity of probabilistic planning and inference. Although Majority-SAT has been known to be PP-complete for over 40 years, the complexity of a natural variant has remained open: Majority-$k$SAT, where the input CNF formula is restricted to have clause width at most $k$. We prove that for every $k$, Majority-$k$SAT is in P. In fact, for any positive integer $k$ and rational $\rho \in (0,1)$ with bounded denominator, we give an algorithm that can determine whether a given $k$-CNF has at least $\rho \cdot 2^n$ satisfying assignments, in deterministic linear time (whereas the previous best-known algorithm ran in exponential time). Our algorithms have interesting positive implications for counting complexity and the complexity of inference, significantly reducing the known complexities of related problems such as E-MAJ-$k$SAT and MAJ-MAJ-$k$SAT. At the heart of our approach is an efficient method for solving threshold counting problems by extracting sunflowers found in the corresponding set system of a $k$-CNF. We also show that the tractability of Majority-$k$SAT is somewhat fragile. For the closely related GtMajority-SAT problem (where we ask whether a given formula has greater than $2^{n-1}$ satisfying assignments) which is known to be PP-complete, we show that GtMajority-$k$SAT is in P for $k\le 3$, but becomes NP-complete for $k\geq 4$. These results are counterintuitive, because the ``natural'' classifications of these problems would have been PP-completeness, and because there is a stark difference in the complexity of GtMajority-$k$SAT and Majority-$k$SAT for all $k\ge 4$.
Let $G$ be an $n$-vertex graph, and $s,t$ vertices of $G$. We present an efficient algorithm which enumerates the set of minimal $st$-separators of $G$ in ascending order of cardinality, with a delay of $O(n^{3.5})$ per separator. In particular, we present an algorithm that lists, in ascending order of cardinality, all minimal separators with at most $k$ vertices. In that case, we show that the delay of the enumeration algorithm is $O(kn^{2.5})$ per separator. Our process is based on a new method that can decide, in polynomial time, whether the set of minimal separators under certain inclusion, exclusion, and cardinality constraints is empty.
We investigate the equilibrium behavior for the decentralized cheap talk problem for real random variables and quadratic cost criteria in which an encoder and a decoder have misaligned objective functions. In prior work, it has been shown that the number of bins in any equilibrium has to be countable, generalizing a classical result due to Crawford and Sobel who considered sources with density supported on $[0,1]$. In this paper, we first refine this result in the context of log-concave sources. For sources with two-sided unbounded support, we prove that, for any finite number of bins, there exists a unique equilibrium. In contrast, for sources with semi-unbounded support, there may be a finite upper bound on the number of bins in equilibrium depending on certain conditions stated explicitly. Moreover, we prove that for log-concave sources, the expected costs of the encoder and the decoder in equilibrium decrease as the number of bins increases. Furthermore, for strictly log-concave sources with two-sided unbounded support, we prove convergence to the unique equilibrium under best response dynamics which starts with a given number of bins, making a connection with the classical theory of optimal quantization and convergence results of Lloyd's method. In addition, we consider more general sources which satisfy certain assumptions on the tail(s) of the distribution and we show that there exist equilibria with infinitely many bins for sources with two-sided unbounded support. Further explicit characterizations are provided for sources with exponential, Gaussian, and compactly-supported probability distributions.
We consider the Vector Scheduling problem on identical machines: we have m machines, and a set J of n jobs, where each job j has a processing-time vector $p_j\in \mathbb{R}^d_{\geq 0}$. The goal is to find an assignment $\sigma:J\to [m]$ of jobs to machines so as to minimize the makespan $\max_{i\in [m]}\max_{r\in [d]}( \sum_{j:\sigma(j)=i}p_{j,r})$. A natural lower bound on the optimal makespan is lb $:=\max\{\max_{j\in J,r\in [d]}p_{j,r},\max_{r\in [d]}(\sum_{j\in J}p_{j,r}/m)\}$. Our main result is a very simple O(log d)-approximation algorithm for vector scheduling with respect to the lower bound lb: we devise an algorithm that returns an assignment whose makespan is at most O(log d)*lb. As an application, we show that the above guarantee leads to an O(log log m)-approximation for Stochastic Minimum-Norm Load Balancing (StochNormLB). In StochNormLB, we have m identical machines, a set J of n independent stochastic jobs whose processing times are nonnegative random variables, and a monotone, symmetric norm $f:\mathbb{R}^m \to \mathbb{R}_{\geq 0}$. The goal is to find an assignment $\sigma:J\to [m]$ that minimizes the expected $f$-norm of the induced machine-load vector, where the load on machine i is the (random) total processing time assigned to it. Our O(log log m)-approximation guarantee is in fact much stronger: we obtain an assignment that is simultaneously an O(log log m)-approximation for StochNormLB with all monotone, symmetric norms. Next, this approximation factor significantly improves upon the O(log m/log log m)-approximation in (Ibrahimpur and Swamy, FOCS 2020) for StochNormLB, and is a consequence of a more-general black-box reduction that we present, showing that a $\gamma(d)$-approximation for d-dimensional vector scheduling with respect to the lower bound lb yields a simultaneous $\gamma(\log m)$-approximation for StochNormLB with all monotone, symmetric norms.
We revisit the (block-angular) min-max resource sharing problem, which is a well-known generalization of fractional packing and the maximum concurrent flow problem. It consists of finding an $\ell_{\infty}$-minimal element in a Minkowski sum $\mathcal{X}= \sum_{C \in \mathcal{C}} X_C$ of non-empty closed convex sets $X_C \subseteq \mathbb{R}^{\mathcal{R}}_{\geq 0}$, where $\mathcal{C}$ and $\mathcal{R}$ are finite sets. We assume that an oracle for approximate linear minimization over $X_C$ is given. In this setting, the currently fastest known FPTAS is due to M\"uller, Radke, and Vygen. For $\delta \in (0,1]$, it computes a $\sigma(1+\delta)$-approximately optimal solution using $\mathcal{O}((|\mathcal{C}|+|\mathcal{R}|)\log |\mathcal{R}| (\delta^{-2} + \log \log |\mathcal{R}|))$ oracle calls, where $\sigma$ is the approximation ratio of the oracle. We describe an extension of their algorithm and improve on previous results in various ways. Our FPTAS, which, as previous approaches, is based on the multiplicative weight update method, computes close to optimal primal and dual solutions using $\mathcal{O}\left(\frac{|\mathcal{C}|+ |\mathcal{R}|}{\delta^2} \log |\mathcal{R}|\right)$ oracle calls. We prove that our running time is optimal under certain assumptions, implying that no warm-start analysis of the algorithm is possible. A major novelty of our analysis is the concept of local weak duality, which illustrates that the algorithm optimizes (close to) independent parts of the instance separately. Interestingly, this implies that the computed solution is not only approximately $\ell_{\infty}$-minimal, but among such solutions, also its second-highest entry is approximately minimal. We prove that this statement cannot be extended to the third-highest entry.
We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset. Specifically, we are given a set $T$ of $n$ points in $\mathbb{R}^d$ and a parameter $0< \alpha <\frac 1 2$ such that an $\alpha$-fraction of the points in $T$ are i.i.d. samples from a well-behaved distribution $\mathcal{D}$ and the remaining $(1-\alpha)$-fraction are arbitrary. The goal is to output a small list of vectors, at least one of which is close to the mean of $\mathcal{D}$. We develop new algorithms for list-decodable mean estimation, achieving nearly-optimal statistical guarantees, with running time $O(n^{1 + \epsilon_0} d)$, for any fixed $\epsilon_0 > 0$. All prior algorithms for this problem had additional polynomial factors in $\frac 1 \alpha$. We leverage this result, together with additional techniques, to obtain the first almost-linear time algorithms for clustering mixtures of $k$ separated well-behaved distributions, nearly-matching the statistical guarantees of spectral methods. Prior clustering algorithms inherently relied on an application of $k$-PCA, thereby incurring runtimes of $\Omega(n d k)$. This marks the first runtime improvement for this basic statistical problem in nearly two decades. The starting point of our approach is a novel and simpler near-linear time robust mean estimation algorithm in the $\alpha \to 1$ regime, based on a one-shot matrix multiplicative weights-inspired potential decrease. We crucially leverage this new algorithmic framework in the context of the iterative multi-filtering technique of Diakonikolas et al. '18, '20, providing a method to simultaneously cluster and downsample points using one-dimensional projections -- thus, bypassing the $k$-PCA subroutines required by prior algorithms.
In this paper we use the theory of computing to study fractal dimensions of projections in Euclidean spaces. A fundamental result in fractal geometry is Marstrand's projection theorem, which shows that for every analytic set E, for almost every line L, the Hausdorff dimension of the orthogonal projection of E onto L is maximal. We use Kolmogorov complexity to give two new results on the Hausdorff and packing dimensions of orthogonal projections onto lines. The first shows that the conclusion of Marstrand's theorem holds whenever the Hausdorff and packing dimensions agree on the set E, even if E is not analytic. Our second result gives a lower bound on the packing dimension of projections of arbitrary sets. Finally, we give a new proof of Marstrand's theorem using the theory of computing.
In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
Many resource allocation problems in the cloud can be described as a basic Virtual Network Embedding Problem (VNEP): finding mappings of request graphs (describing the workloads) onto a substrate graph (describing the physical infrastructure). In the offline setting, the two natural objectives are profit maximization, i.e., embedding a maximal number of request graphs subject to the resource constraints, and cost minimization, i.e., embedding all requests at minimal overall cost. The VNEP can be seen as a generalization of classic routing and call admission problems, in which requests are arbitrary graphs whose communication endpoints are not fixed. Due to its applications, the problem has been studied intensively in the networking community. However, the underlying algorithmic problem is hardly understood. This paper presents the first fixed-parameter tractable approximation algorithms for the VNEP. Our algorithms are based on randomized rounding. Due to the flexible mapping options and the arbitrary request graph topologies, we show that a novel linear program formulation is required. Only using this novel formulation the computation of convex combinations of valid mappings is enabled, as the formulation needs to account for the structure of the request graphs. Accordingly, to capture the structure of request graphs, we introduce the graph-theoretic notion of extraction orders and extraction width and show that our algorithms have exponential runtime in the request graphs' maximal width. Hence, for request graphs of fixed extraction width, we obtain the first polynomial-time approximations. Studying the new notion of extraction orders we show that (i) computing extraction orders of minimal width is NP-hard and (ii) that computing decomposable LP solutions is in general NP-hard, even when restricting request graphs to planar ones.