We consider a problem introduced by Feige, Gamarnik, Neeman, R\'acz and Tetali [2020], that of finding a large clique in a random graph $G\sim G(n,\frac{1}{2})$, where the graph $G$ is accessible by queries to entries of its adjacency matrix. The query model allows some limited adaptivity, with a constant number of rounds of queries and $n^\delta$ queries in each round. With high probability, the maximum clique in $G$ has size roughly $2\log n$, and the goal is to find cliques of size $\alpha\log n$ for $\alpha$ as large as possible. We prove that no two-round algorithm is likely to find a clique larger than $\frac{4}{3}\delta\log n$, which is a tight upper bound when $1\leq\delta\leq \frac{6}{5}$. For other ranges of parameters, namely two rounds with $\frac{6}{5}<\delta<2$, and three rounds with $1\leq\delta<2$, we improve over the previously known upper bounds on $\alpha$, but our upper bounds are not tight. If early rounds are restricted to have fewer queries than the last round, then for some such restrictions we do prove tight upper bounds.
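As a back-of-the-envelope illustration of the scale involved (ours, not taken from the paper's proofs): a single round of $n^\delta$ edge queries can cover all pairs among roughly $N\approx\sqrt{2}\,n^{\delta/2}$ vertices, and the subgraph induced on those vertices is itself distributed as $G(N,\frac{1}{2})$, so with high probability it contains a clique of size about
\[
2\log_2 N \;=\; \delta\log_2 n + O(1),
\]
which is why guarantees in this query model are naturally measured in multiples of $\delta\log n$.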
Stochastic and adversarial data are two widely studied settings in online learning. However, many optimization tasks are neither i.i.d. nor fully adversarial, which makes it of fundamental interest to obtain a better theoretical understanding of the world between these extremes. In this work we establish novel regret bounds for online convex optimization in a setting that interpolates between stochastic i.i.d. and fully adversarial losses. By exploiting smoothness of the expected losses, these bounds replace a dependence on the maximum gradient length by the variance of the gradients, which was previously known only for linear losses. In addition, they weaken the i.i.d. assumption by allowing adversarially poisoned rounds or shifts in the data distribution. To accomplish this goal, we introduce two key quantities associated with the loss sequence, which we call the cumulative stochastic variance and the adversarial variation. Our upper bounds are attained by instances of optimistic Follow-the-Regularized-Leader (FTRL), and we design adaptive learning rates that automatically adapt to the cumulative stochastic variance and adversarial variation. In the fully i.i.d. case, our bounds match the rates one would expect from results in stochastic acceleration, and in the fully adversarial case they gracefully deteriorate to match the minimax regret. We further provide lower bounds showing that our regret upper bounds are tight for all intermediate regimes of the cumulative stochastic variance and the adversarial variation.
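For concreteness, a standard generic form of the optimistic FTRL update (the abstract does not spell out the exact instantiation, so this is only the template) chooses the next point as
\[
w_{t+1} \;=\; \arg\min_{w\in\mathcal{W}} \;\Big\langle m_{t+1} + \sum_{s=1}^{t} g_s,\; w\Big\rangle \;+\; \frac{1}{\eta_{t+1}}\,R(w),
\]
where $g_s$ is the gradient observed at round $s$, $m_{t+1}$ is an optimistic guess of the next gradient, $R$ is the regularizer, and the learning rates $\eta_t$ are the quantities that the paper tunes adaptively in terms of the cumulative stochastic variance and the adversarial variation.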
Consider a set $P$ of $n$ points in $\mathbb{R}^d$. In the discrete median line segment problem, the objective is to find a line segment bounded by a pair of points in $P$ such that the sum of the Euclidean distances from $P$ to the line segment is minimized. In the continuous median line segment problem, a real number $\ell>0$ is given, and the goal is to locate a line segment of length $\ell$ in $\mathbb{R}^d$ such that the sum of the Euclidean distances between $P$ and the line segment is minimized. We show how to compute $(1+\epsilon\Delta)$- and $(1+\epsilon)$-approximations to a discrete median line segment in time $O(n\epsilon^{-2d}\log n)$ and $O(n^2\epsilon^{-d})$, respectively, where $\Delta$ is the spread of the line segments spanned by pairs of points. While developing our algorithms, using the principle of pair decomposition, we derive new data structures that allow us to quickly approximate the sum of the distances from a set of points to a given line segment or point. To our knowledge, our use of pair decompositions for solving minsum facility location problems is the first of its kind; it is versatile and easily implementable. We prove that it is impossible to construct a continuous median line segment for $n\geq3$ non-collinear points in the plane using only a ruler and compass. In view of this, we present an $O(n^d\epsilon^{-d})$-time algorithm for approximating a continuous median line segment in $\mathbb{R}^d$ within a factor of $1+\epsilon$. The algorithm is based upon generalizing the point-segment pair decomposition from the discrete to the continuous domain. Last but not least, we give a $(1+\epsilon)$-approximation algorithm, whose time complexity is sub-quadratic in $n$, for solving the constrained median line segment problem in $\mathbb{R}^2$, where an endpoint or the slope of the median line segment is given as input.
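To make the objective concrete, the following small Python sketch (our own, not the paper's data structures) evaluates the minsum cost of a candidate segment exactly by brute force; the pair-decomposition machinery of the paper exists precisely to approximate this sum without examining every point:

    import numpy as np

    def dist_point_segment(p, a, b):
        # Euclidean distance from point p to the segment with endpoints a and b in R^d.
        p, a, b = (np.asarray(v, dtype=float) for v in (p, a, b))
        ab = b - a
        denom = float(np.dot(ab, ab))
        if denom == 0.0:                      # degenerate segment: distance to the point a
            return float(np.linalg.norm(p - a))
        t = np.clip(np.dot(p - a, ab) / denom, 0.0, 1.0)
        return float(np.linalg.norm(p - (a + t * ab)))

    def segment_cost(P, a, b):
        # Objective of the median line segment problems: sum of distances from P to segment [a, b].
        return sum(dist_point_segment(p, a, b) for p in P)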
Let $N$ be the number of triangles in an Erd\H{o}s-R\'enyi graph $\mathcal{G}(n,p)$ on $n$ vertices with edge density $p=d/n$, where $d>0$ is a fixed constant. It is well known that $N$ weakly converges to the Poisson distribution with mean ${d^3}/{6}$ as $n\rightarrow \infty$. We address the upper tail problem for $N$, namely, we investigate how fast $k$ must grow so that the probability of $\{N\ge k\}$ is no longer well approximated by the tail of the corresponding Poisson variable. Proving that the tail exhibits a sharp phase transition, we essentially show that the upper tail is governed by Poisson behavior only when $k^{1/3} \log k< (\frac{3}{\sqrt{2}})^{2/3} \log n$ (sub-critical regime), and we pin down the tail behavior when $k^{1/3} \log k> (\frac{3}{\sqrt{2}})^{2/3} \log n$ (super-critical regime). We further prove a structure theorem, showing that the sub-critical upper tail behavior is dictated by the appearance of almost $k$ vertex-disjoint triangles, whereas in the super-critical regime the excess triangles arise from a clique-like structure of size approximately $(6k)^{1/3}$. This settles the long-standing upper-tail problem in this case, answering a question of Aldous and complementing a long sequence of works, spanning multiple decades, culminating in (Harel, Mousset, Samotij '19), which analyzed the problem only in the regime $p\gg \frac{1}{n}$. The proofs rely on several novel graph-theoretic results which could have other applications.
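The clique size in the super-critical regime can be read off from a one-line count: a clique on $m$ vertices contains $\binom{m}{3}\approx m^3/6$ triangles, so producing roughly $k$ excess triangles inside a single clique requires
\[
\frac{m^3}{6}\approx k \quad\Longleftrightarrow\quad m\approx (6k)^{1/3}.
\]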
We present a quite curious generalization of multi-step Fibonacci numbers. For any positive rational $q$, we enumerate binary words of length $n$ whose maximal factors of the form $0^a1^b$ satisfy $a = 0$ or $aq > b$. When $q$ is an integer we rediscover the classical multi-step Fibonacci numbers (Fibonacci, Tribonacci, Tetranacci, etc.). When $q$ is not an integer, the resulting recurrence relations are connected to certain restricted integer compositions. We also discuss Gray codes for these words, and a possibly novel generalization of the golden ratio.
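The counting problem is easy to experiment with; the following brute-force Python sketch (ours, practical only for small $n$) counts the words in question and can be used to check the claimed connection with multi-step Fibonacci numbers for integer $q$:

    from fractions import Fraction
    from itertools import product
    import re

    def count_words(n, q):
        # Count binary words of length n whose maximal factors 0^a 1^b satisfy a == 0 or a*q > b.
        q = Fraction(q)
        count = 0
        for bits in product("01", repeat=n):
            word = "".join(bits)
            ok = True
            # maximal factors of the form 0^a 1^b: the word splits after every '1' that precedes a '0'
            for block in re.findall(r"0*1*", word):
                if not block:
                    continue
                a, b = block.count("0"), block.count("1")
                if a > 0 and not (a * q > b):
                    ok = False
                    break
            count += ok
        return count

    # Per the abstract, integer q should rediscover multi-step Fibonacci counts.
    print([count_words(n, 1) for n in range(1, 8)])               # q = 1
    print([count_words(n, Fraction(3, 2)) for n in range(1, 8)])  # a non-integer q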
We study the algorithmic complexity of computing persistent homology of a randomly chosen filtration. Specifically, we prove upper bounds for the average fill-up (number of non-zero entries) of the boundary matrix on Erd\H{o}s-R\'enyi and Vietoris-Rips filtrations after matrix reduction. Our bounds show that, in both cases, the reduced matrix is expected to be significantly sparser than what the general worst case predicts. Our method is based on previous results on the expected first Betti numbers of the corresponding complexes, which we link to the fill-up of the boundary matrix. Our bound for Vietoris-Rips complexes is asymptotically tight up to logarithmic factors. We also provide an Erd\H{o}s-R\'enyi filtration realising the worst case.
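For reference, the fill-up in question is that of the matrix produced by the standard persistence column reduction; a minimal Python sketch of that reduction (with our own sparse representation of columns as sets of row indices) is:

    def reduce_boundary_matrix(columns):
        # columns[j] is the set of row indices of the non-zero entries of column j
        # of the boundary matrix, with columns ordered by the filtration.
        cols = [set(c) for c in columns]
        pivot_of = {}                                   # pivot row -> index of the column owning it
        for j, col in enumerate(cols):
            while col and max(col) in pivot_of:
                col ^= cols[pivot_of[max(col)]]         # column addition over GF(2)
            if col:
                pivot_of[max(col)] = j
        return cols                                     # reduced columns; their total size is the fill-up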
A triangle in a hypergraph $\mathcal{H}$ is a set of three distinct edges $e, f, g\in\mathcal{H}$ and three distinct vertices $u, v, w\in V(\mathcal{H})$ such that $\{u, v\}\subseteq e$, $\{v, w\}\subseteq f$, $\{w, u\}\subseteq g$ and $\{u, v, w\}\cap e\cap f\cap g=\emptyset$. Johansson proved in 1996 that $\chi(G)=\mathcal{O}(\Delta/\log\Delta)$ for any triangle-free graph $G$ with maximum degree $\Delta$. Cooper and Mubayi later generalized Johansson's theorem to all rank $3$ hypergraphs. In this paper we provide a common generalization of both these results for all hypergraphs, showing that if $\mathcal{H}$ is a rank $k$, triangle-free hypergraph, then the list chromatic number \[ \chi_{\ell}(\mathcal{H})\leq \mathcal{O}\left(\max_{2\leq \ell \leq k} \left\{\left( \frac{\Delta_{\ell}}{\log \Delta_{\ell}} \right)^{\frac{1}{\ell-1}} \right\}\right), \] where $\Delta_{\ell}$ is the maximum $\ell$-degree of $\mathcal{H}$. The result is sharp apart from the constant. Moreover, our result implies, generalizes and improves several earlier results on the chromatic number and also the independence number of hypergraphs, while its proof is based on a different approach from prior works on hypergraphs (and therefore provides alternative proofs of their results). In particular, as an application, we establish a bound on the chromatic number of sparse hypergraphs in which each vertex is contained in few triangles, and thus extend results of Alon, Krivelevich and Sudakov, and of Cooper and Mubayi, from hypergraphs of rank 2 and 3, respectively, to all hypergraphs.
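As a sanity check that this is indeed a common generalization, specializing to rank $k=2$ (graphs) leaves only the term $\ell=2$, and the bound reads
\[
\chi_{\ell}(\mathcal{H})\leq \mathcal{O}\!\left(\left(\frac{\Delta_2}{\log\Delta_2}\right)^{\frac{1}{2-1}}\right)=\mathcal{O}\!\left(\frac{\Delta}{\log\Delta}\right),
\]
recovering Johansson's bound for triangle-free graphs.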
We consider the problem of partitioning a line segment into two subsets, so that $n$ finite measures all have the same ratio of values for the two subsets. Letting $\alpha\in[0,1]$ denote the desired ratio, this generalises the PPA-complete consensus-halving problem, in which $\alpha=\frac{1}{2}$. It is known that for any $\alpha$, there exists a solution using $2n$ cuts of the segment. Here we show that if $\alpha$ is irrational, then that upper bound is almost optimal. We also obtain bounds that are nearly exact for a large subset of rational values $\alpha$. On the computational side, we explore how the complexity of the problem depends on the number of cuts available. More specifically, 1. when the minimal number of cuts for each instance is required, the problem is NP-hard for any $\alpha$; 2. for a large subset of rational $\alpha = \frac{\ell}{k}$, when $\frac{k-1}{k} \cdot 2n$ cuts are available, the problem is in the Turing closure of PPA-$k$; 3. when $2n$ cuts are allowed, the problem belongs to PPA for any $\alpha$; furthermore, the problem belongs to PPA-$p$ for any prime $p$ if $2(p-1)\cdot \frac{\lceil p/2 \rceil}{\lfloor p/2 \rfloor} \cdot n$ cuts are available.
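As a quick consistency check of the formula in item 3 (our own arithmetic, plugging in small primes): for $p=2$ it asks for
\[
2(p-1)\cdot\frac{\lceil p/2\rceil}{\lfloor p/2\rfloor}\cdot n \;=\; 2\cdot 1\cdot\frac{1}{1}\cdot n \;=\; 2n
\]
cuts, matching the $2n$-cut membership in PPA (which coincides with PPA-$2$), while for $p=3$ it asks for $2\cdot 2\cdot\frac{2}{1}\cdot n=8n$ cuts.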
In this paper we study temporal design problems of undirected temporally connected graphs. The basic setting of these optimization problems is as follows: given an undirected graph $G$, what is the smallest number $|\lambda|$ of time-labels that we need to add to the edges of $G$ such that the resulting temporal graph $(G,\lambda)$ is temporally connected? As we prove, this basic problem, called MINIMUM LABELING, can be optimally solved in polynomial time, thus resolving an open question. The situation becomes more complicated, however, if we strengthen, or even slightly relax, the requirement of temporal connectivity of $(G,\lambda)$. One way to strengthen the temporal connectivity requirement is to upper-bound the allowed age (i.e., maximum label) of the obtained temporal graph $(G,\lambda)$. On the other hand, we can relax temporal connectivity by only requiring that there exists a temporal path between any pair of ``important'' vertices which lie in a subset $R\subseteq V$, which we call the terminals. This relaxed problem, called MINIMUM STEINER LABELING, resembles the problem STEINER TREE in static (i.e., non-temporal) graphs; however, as it turns out, STEINER TREE is not a special case of MINIMUM STEINER LABELING. We prove that MINIMUM STEINER LABELING is NP-hard and in FPT with respect to the number $|R|$ of terminals. Moreover, we prove that adding the age restriction makes the above problems strictly harder (unless P=NP or W[1]=FPT). More specifically, we prove that the age-restricted version of MINIMUM LABELING becomes NP-complete on undirected graphs, while the age-restricted version of MINIMUM STEINER LABELING becomes W[1]-hard with respect to the number $|R|$ of terminals.
We show that for the problem of testing whether a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of its entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works over any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model, in which query access comes in the form of $\langle X_i, A\rangle:=\mathrm{tr}(X_i^\top A)$; perhaps surprisingly, these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.
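To fix the entry-query model concretely, here is a naive baseline in Python (explicitly not the paper's algorithm): it reads a random submatrix, which is exactly the kind of access pattern whose $\Omega(d^2/\epsilon^2)$ lower bound the new non-adaptive query pattern bypasses; the submatrix size below is a heuristic choice of ours.

    import numpy as np

    rng = np.random.default_rng(0)

    def naive_submatrix_tester(A, d, eps, c=2.0):
        # Baseline only: read a random k x k submatrix (k*k entry queries)
        # and accept iff its rank is at most d.  The constant c is heuristic.
        n = A.shape[0]
        k = min(n, int(np.ceil(c * d / eps)))
        rows = rng.choice(n, size=k, replace=False)
        cols = rng.choice(n, size=k, replace=False)
        S = A[np.ix_(rows, cols)]
        return np.linalg.matrix_rank(S) <= d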
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(x) \triangleq \sum_{i=1}^{m}f_i(x)$ is strongly convex and smooth, either strongly convex or smooth, or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
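For reference, the centralized building block is Nesterov's accelerated gradient method; in one of its standard forms for an $L$-smooth convex objective $g$ (with $y_0=x_0$) it iterates
\[
x_{k} = y_{k-1} - \frac{1}{L}\nabla g(y_{k-1}), \qquad y_{k} = x_{k} + \frac{k-1}{k+2}\,\big(x_{k} - x_{k-1}\big),
\]
and the paper runs a suitable variant of this scheme on the dual problem, where the gradient computations decompose along the network subject to the affine communication constraints.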