For an input graph $G$, an additive spanner is a sparse subgraph $H$ whose shortest paths match those of $G$ up to small additive error. We prove two new lower bounds in the area of additive spanners: 1) We construct $n$-node graphs $G$ for which any spanner on $O(n)$ edges must increase a pairwise distance by $+\Omega(n^{1/7})$. This improves on a recent lower bound of $+\Omega(n^{1/10.5})$ by Lu, Wein, Vassilevska Williams, and Xu [SODA '22]. 2) A classic result by Coppersmith and Elkin [SODA '05] proves that for any $n$-node graph $G$ and set of $p = O(n^{1/2})$ demand pairs, one can exactly preserve all pairwise distances among demand pairs using a spanner on $O(n)$ edges. They also provided a lower bound construction, establishing that this range $p = O(n^{1/2})$ cannot be improved. We strengthen this lower bound by proving that, for any constant $k$, this range of $p$ is still unimprovable even if the spanner is allowed $+k$ additive error among the demand pairs. This negatively resolves an open question asked by Coppersmith and Elkin [SODA '05] and again by Cygan, Grandoni, and Kavitha [STACS '13] and Abboud and Bodwin [SODA '16]. At a technical level, our lower bounds are obtained by an improvement to the entire obstacle product framework used to compose ``inner'' and ``outer'' graphs into lower bound instances. In particular, we develop a new strategy for analysis that allows certain non-layered graphs to be used in the product, and we use this freedom to design better inner and outer graphs that lead to our new lower bounds.
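For concreteness, a standard way to state the additive spanner guarantee (background notation consistent with the literature, not necessarily this paper's exact conventions) is the following:

```latex
% Background definition (standard in the literature; stated here for reference only):
% a subgraph H of G is a +\beta additive spanner if
\[
  d_H(u,v) \;\le\; d_G(u,v) + \beta \qquad \text{for all vertices } u, v \text{ of } G.
\]
% Result (1) above says that there are n-node graphs G for which every subgraph H
% on O(n) edges violates this guarantee for some pair unless \beta = \Omega(n^{1/7}).
```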
In a completely randomized experiment, the variances of treatment effect estimators in the finite population are usually not identifiable and hence not estimable. Although some estimable bounds of the variances have been established in the literature, few of them are derived in the presence of covariates. In this paper, the difference-in-means estimator and the Wald estimator are considered in completely randomized experiments with perfect compliance and with noncompliance, respectively. Sharp bounds for the variances of these two estimators are established when covariates are available. Furthermore, consistent estimators for such bounds are obtained, which can be used to shorten the confidence intervals and improve the power of tests. Confidence intervals are constructed based on the consistent estimators of the upper bounds, and their coverage rates are uniformly asymptotically guaranteed. Simulations were conducted to evaluate the proposed methods, which are also illustrated with two real data analyses.
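As a rough illustration of the two point estimators whose variances are bounded above (a minimal sketch using the standard textbook formulas, not the paper's code), the difference-in-means estimator compares outcomes by assignment, and the Wald estimator rescales it by the effect of assignment on treatment uptake:

```python
# Illustrative sketch of the two estimators discussed above (not the paper's code).
# Z = random assignment, D = treatment received, Y = outcome (numpy arrays of equal length).
import numpy as np

def difference_in_means(Y, Z):
    """Estimate the average treatment effect under perfect compliance."""
    return Y[Z == 1].mean() - Y[Z == 0].mean()

def wald_estimator(Y, D, Z):
    """Estimate the complier average causal effect under noncompliance,
    using assignment Z as an instrument for treatment received D."""
    itt_outcome = Y[Z == 1].mean() - Y[Z == 0].mean()   # intention-to-treat effect on Y
    itt_uptake  = D[Z == 1].mean() - D[Z == 0].mean()   # effect of assignment on uptake
    return itt_outcome / itt_uptake

# Synthetic example with imperfect compliance:
rng = np.random.default_rng(0)
Z = rng.integers(0, 2, size=1000)
D = (rng.random(1000) < 0.2 + 0.6 * Z).astype(float)
Y = 2.0 * D + rng.normal(size=1000)
print(difference_in_means(Y, Z), wald_estimator(Y, D, Z))
```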
In this paper, we propose a new covering technique localized to the trajectories of SGD. This localization provides an algorithm-specific complexity measured by the covering number, which can have dimension-independent cardinality, in contrast to standard uniform covering arguments that result in an exponential dependence on the dimension. Based on this localized construction, we show that if the objective function is a finite perturbation of a piecewise strongly convex and smooth function with $P$ pieces, i.e. non-convex and non-smooth in general, the generalization error can be upper bounded by $O(\sqrt{(\log n\log(nP))/n})$, where $n$ is the number of data samples. In particular, this rate is independent of the dimension and requires neither early stopping nor a decaying step size. Finally, we employ these results in various contexts and derive generalization bounds for multi-index linear models, multi-class support vector machines, and $K$-means clustering for both hard and soft label setups, improving the known state-of-the-art rates.
A coreset for a set of points is a small subset of weighted points that approximately preserves important properties of the original set. Specifically, if $P$ is a set of points, $Q$ is a set of queries, and $f:P\times Q\to\mathbb{R}$ is a cost function, then a set $S\subseteq P$ with weights $w:P\to[0,\infty)$ is an $\epsilon$-coreset for some parameter $\epsilon>0$ if $\sum_{s\in S}w(s)f(s,q)$ is a $(1+\epsilon)$ multiplicative approximation to $\sum_{p\in P}f(p,q)$ for all $q\in Q$. Coresets are used to solve fundamental problems in machine learning under various big data models of computation. Many of the coresets suggested in the recent decade used, or could have used, a general framework for constructing coresets whose size depends quadratically on what is known as the total sensitivity $t$. In this paper we improve this bound from $O(t^2)$ to $O(t\log t)$. Thus our results imply more space-efficient solutions to a number of problems, including projective clustering, $k$-line clustering, and subspace approximation. Moreover, we generalize the notion of sensitivity sampling to sup-sampling, which supports non-multiplicative approximations, negative cost functions, and more. The main technical result is a generic reduction to the sample complexity of learning a class of functions with bounded VC dimension. We show that obtaining a $(\nu,\alpha)$-sample for this class of functions with appropriate parameters $\nu$ and $\alpha$ suffices to achieve space-efficient $\epsilon$-coresets. Our result implies more efficient coreset constructions for a number of interesting problems in machine learning; we show applications to $k$-median/$k$-means, $k$-line clustering, $j$-subspace approximation, and the integer $(j,k)$-projective clustering problem.
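As a rough illustration of the sensitivity-sampling paradigm that such frameworks build on (a sketch under simplifying assumptions, not the paper's construction; the sensitivity bounds `sensitivities` are assumed to be supplied by a problem-specific argument), one samples each point with probability proportional to its sensitivity bound and reweights by the inverse probability:

```python
# Minimal sketch of sensitivity sampling (illustrative only; assumes sensitivity
# upper bounds are given, e.g. from a problem-specific analysis).
import numpy as np

def sensitivity_sample(points, sensitivities, m, rng=np.random.default_rng(0)):
    """Return (sampled points, weights): each point is drawn with probability
    proportional to its sensitivity bound; weights make the estimate unbiased."""
    pts = np.asarray(points)
    s = np.asarray(sensitivities, dtype=float)
    t = s.sum()                        # the "total sensitivity"
    probs = s / t
    idx = rng.choice(len(pts), size=m, replace=True, p=probs)
    weights = 1.0 / (m * probs[idx])   # inverse-probability weights
    return pts[idx], weights

# For any query q, sum(w_i * f(sample_i, q)) is an unbiased estimator of
# sum_p f(p, q); the question addressed above is how large m must be.
```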
The Strong Exponential Time Hypothesis (SETH) asserts that for every $\varepsilon>0$ there exists $k$ such that $k$-SAT requires time $(2-\varepsilon)^n$. The field of fine-grained complexity has leveraged SETH to prove quite tight conditional lower bounds for dozens of problems in various domains and complexity classes, including Edit Distance, Graph Diameter, Hitting Set, Independent Set, and Orthogonal Vectors. Yet, it has been repeatedly asked in the literature whether SETH-hardness results can be proven for other fundamental problems such as Hamiltonian Path, Independent Set, Chromatic Number, MAX-$k$-SAT, and Set Cover. In this paper, we show that fine-grained reductions implying even $\lambda^n$-hardness of these problems from SETH, for any $\lambda>1$, would imply new circuit lower bounds: super-linear lower bounds for Boolean series-parallel circuits or polynomial lower bounds for arithmetic circuits (each of which is a four-decade open question). We also extend this barrier result to the class of parameterized problems. Namely, for every $\lambda>1$ we conditionally rule out fine-grained reductions implying SETH-based lower bounds of $\lambda^k$ for a number of problems parameterized by the solution size $k$. Our main technical tool is a new concept called polynomial formulations. In particular, we show that many problems can be represented by relatively succinct low-degree polynomials, and that any problem with such a representation cannot be proven SETH-hard (without proving new circuit lower bounds).
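As a hint of what a succinct low-degree polynomial representation can look like (a textbook-style illustration, not necessarily the paper's precise notion of a polynomial formulation), the number of orthogonal pairs between sets $A,B\subseteq\{0,1\}^d$ is computed exactly by a polynomial of degree at most $2d$ in the input bits:

```latex
% Illustrative example only; the paper's "polynomial formulations" may differ in detail.
\[
  \#\bigl\{(a,b)\in A\times B : \langle a,b\rangle = 0\bigr\}
  \;=\;
  \sum_{a\in A}\sum_{b\in B}\;\prod_{i=1}^{d}\bigl(1 - a_i b_i\bigr),
\]
% since each product equals 1 exactly when a and b share no common 1-coordinate.
```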
In the $d$-dimensional cow-path problem, a cow living in $\mathbb{R}^d$ must locate a $(d - 1)$-dimensional hyperplane $H$ whose location is unknown. The only way that the cow can find $H$ is to roam $\mathbb{R}^d$ until it intersects $H$. If the cow travels a total distance $s$ to locate a hyperplane $H$ whose distance from the origin is $r \ge 1$, then the cow is said to achieve competitive ratio $s / r$. It is a classic result that, in $\mathbb{R}^2$, the optimal (deterministic) competitive ratio is $9$. In $\mathbb{R}^3$, the optimal competitive ratio is known to be at most $\approx 13.811$. But in higher dimensions, the asymptotic relationship between $d$ and the optimal competitive ratio remains an open question. The best known upper and lower bounds, due to Antoniadis et al., are $O(d^{3/2})$ and $\Omega(d)$, leaving a gap of roughly $\sqrt{d}$. In this note, we achieve a stronger lower bound of $\tilde{\Omega}(d^{3/2})$.
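For intuition, competitive ratios of this kind come from doubling (zig-zag) strategies; the sketch below shows the classic one for the simplest variant, searching for an unknown point on a line, which achieves competitive ratio $9$ (background illustration only, not the strategy analyzed for higher dimensions):

```python
# Doubling strategy for the 1-D "lost cow" problem: the target sits at an unknown
# position x with |x| >= 1; alternate directions, doubling the turning radius.
def doubling_search_distance(x):
    """Total distance walked before reaching the target at position x (|x| >= 1)."""
    traveled, pos, radius, direction = 0.0, 0.0, 1.0, +1
    while True:
        goal = direction * radius
        # walk from the current position toward `goal`, stopping if we pass x
        if (pos <= x <= goal) or (goal <= x <= pos):
            return traveled + abs(x - pos)
        traveled += abs(goal - pos)
        pos = goal
        direction, radius = -direction, 2.0 * radius

# Targets sitting just past a turning point approach the worst-case ratio of 9.
print(max(doubling_search_distance(x) / abs(x) for x in [64.01, -128.01, 3.0, -2.0]))
```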
The asymptotic study of the partition function $p(n)$ began with the work of Hardy and Ramanujan. Later, Rademacher obtained a convergent series for $p(n)$, and an error bound was given by Lehmer. Despite this, a full asymptotic expansion for $p(n)$ with an explicit error bound is not known. Recently, O'Sullivan studied the asymptotic expansion of $p^{k}(n)$, the number of partitions into $k$th powers, a line of work initiated by Wright, and consequently obtained an asymptotic expansion for $p(n)$ together with a concise description of the coefficients involved, but without any estimate of the error term. Here we carry out a detailed and comprehensive analysis of the error term obtained by truncating the asymptotic expansion for $p(n)$ at any positive integer $n$. This gives rise to an infinite family of inequalities for $p(n)$, which finally answers a question posed by Chen. Our error estimates rely predominantly on algorithmic methods from symbolic summation.
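For context (a standard background fact, not a result of the paper), the Hardy–Ramanujan asymptotic formula that opens this line of work reads:

```latex
% Background: the Hardy--Ramanujan asymptotic formula for the partition function.
\[
  p(n) \;\sim\; \frac{1}{4n\sqrt{3}}\, e^{\pi\sqrt{2n/3}}
  \qquad (n\to\infty).
\]
% Rademacher's convergent series and the expansions discussed above refine this
% leading term with lower-order corrections and explicit error control.
```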
In type-and-coeffect systems, contexts are enriched by coeffects modeling how they are actually used, typically through annotations on single variables. Coeffects are computed bottom-up, combining, for each term, the coeffects of its subterms, through a fixed set of algebraic operators. We show that this principled approach can be adopted to track sharing in the imperative paradigm, that is, links among variables possibly introduced by the execution. This provides a significant example of non-structural coeffects, which cannot be computed by-variable, since the way a given variable is used can affect the coeffects of other variables. To illustrate the effectiveness of the approach, we enhance the type system tracking sharing to model a sophisticated set of features related to uniqueness and immutability. Thanks to the coeffect-based approach, we can express such features in a simple way and prove related properties with standard techniques.
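To make the notion of sharing concrete (an illustrative example in a mainstream imperative language, not the calculus of the paper), an assignment can silently link two variables so that a mutation through one is observable through the other; this is exactly the kind of information a sharing coeffect is meant to track, and it shows why coeffects here cannot be computed per variable:

```python
# Sharing introduced by execution: after `y = x`, the two variables are linked,
# so how x is used cannot be determined by inspecting x alone.
x = {"balance": 100}
y = x                     # y now shares (aliases) the same object as x
y["balance"] -= 30        # a mutation through y ...
print(x["balance"])       # ... is visible through x: prints 70
```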
This paper considers a natural fault-tolerant shortest paths problem: for some constant integer $f$, given a directed weighted graph with no negative cycles and two fixed vertices $s$ and $t$, compute (either explicitly or implicitly), for every tuple of $f$ edges, the distance from $s$ to $t$ when these edges fail. We call this problem $f$-Fault Replacement Paths ($f$FRP). We first present an $\tilde{O}(n^3)$ time algorithm for $2$FRP in $n$-vertex directed graphs with arbitrary edge weights and no negative cycles. As $2$FRP is a generalization of the well-studied Replacement Paths problem (RP), which asks for the distance between $s$ and $t$ under every single edge failure, $2$FRP is at least as hard as RP. Since RP in graphs with arbitrary weights is equivalent in a fine-grained sense to All-Pairs Shortest Paths (APSP) [Vassilevska Williams and Williams FOCS'10, J.~ACM'18], $2$FRP is at least as hard as APSP, and thus a substantially subcubic time algorithm in the number of vertices for $2$FRP would be a breakthrough. Therefore, our $\tilde{O}(n^3)$ time algorithm is conditionally nearly optimal. Our algorithm implies an $\tilde{O}(n^{f+1})$ time algorithm for the $f$FRP problem, giving the first improvement over the straightforward $O(n^{f+2})$ time algorithm. Then we focus on the restriction of $2$FRP to graphs with small integer weights bounded by $M$ in absolute value. Using fast rectangular matrix multiplication, we obtain a randomized algorithm that runs in $\tilde{O}(M^{2/3}n^{2.9153})$ time. This implies an improvement over our $\tilde{O}(n^{f+1})$ time arbitrary-weight algorithm for all $f>1$. We also present a data structure variant of the algorithm that can trade off pre-processing and query time. In addition to the algebraic algorithms, we also give an $n^{8/3-o(1)}$ conditional lower bound for combinatorial $2$FRP algorithms in directed unweighted graphs.
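As background (a naive sketch of the single-failure Replacement Paths problem under nonnegative weights, not the algorithms of the paper), one can simply delete each edge of a chosen $s$-$t$ shortest path and recompute; the contributions above are about doing far better, especially for two or more simultaneous failures:

```python
# Naive replacement-paths sketch: for each edge e on one s-t shortest path,
# delete e and rerun a shortest-path computation. Assumes nonnegative weights.
import networkx as nx

def naive_replacement_paths(G, s, t):
    """Return {edge: s-t distance when that edge fails} for edges on one s-t shortest path."""
    base_path = nx.shortest_path(G, s, t, weight="weight")
    answers = {}
    for u, v in zip(base_path, base_path[1:]):
        w = G[u][v]["weight"]
        G.remove_edge(u, v)
        try:
            answers[(u, v)] = nx.shortest_path_length(G, s, t, weight="weight")
        except nx.NetworkXNoPath:
            answers[(u, v)] = float("inf")
        G.add_edge(u, v, weight=w)
    return answers
```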
When is heterogeneity in the composition of an autonomous robotic team beneficial and when is it detrimental? We investigate and answer this question in the context of a minimally viable model that examines the role of heterogeneous speeds in perimeter defense problems, where defenders share a total allocated speed budget. We consider two distinct problem settings and develop strategies based on dynamic programming and on local interaction rules. We present a theoretical analysis of both approaches, and our results are extensively validated using simulations. Interestingly, our results demonstrate that the viability of heterogeneous teams depends on the amount of information available to the defenders. Moreover, our results suggest a universality property: across a wide range of problem parameters, the optimal ratio of the defenders' speeds remains nearly constant.
Unsupervised domain adaptation has recently emerged as an effective paradigm for generalizing deep neural networks to new target domains. However, there is still enormous potential to be tapped before reaching fully supervised performance. In this paper, we present a novel active learning strategy to assist knowledge transfer in the target domain, dubbed active domain adaptation. We start from the observation that energy-based models exhibit free-energy biases when training (source) and test (target) data come from different distributions. Inspired by this inherent mechanism, we empirically show that a simple yet efficient energy-based sampling strategy selects more valuable target samples than existing approaches, which require particular architectures or distance computations. Our algorithm, Energy-based Active Domain Adaptation (EADA), queries groups of target data that incorporate both domain characteristics and instance uncertainty into every selection round. Meanwhile, by compacting the free energy of target data around the source domain via a regularization term, the domain gap can be implicitly diminished. Through extensive experiments, we show that EADA surpasses state-of-the-art methods on well-known challenging benchmarks with substantial improvements, making it a useful option in the open world. Code is available at //github.com/BIT-DA/EADA.
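As a rough illustration of the energy-based selection idea (a sketch under common conventions for the energy-based view of classifiers, not the authors' released implementation; see the repository above for that), the free energy of a sample can be taken as the negative log-sum-exp of the classifier logits, and the target samples with the highest free energy are queried first:

```python
# Illustrative free-energy scoring for active selection (not the official EADA code).
import numpy as np
from scipy.special import logsumexp

def free_energy(logits, temperature=1.0):
    """Free energy F(x) = -T * log sum_c exp(logit_c / T); higher values indicate
    samples that the source-trained model is less familiar with."""
    return -temperature * logsumexp(np.asarray(logits) / temperature, axis=1)

def select_queries(target_logits, budget):
    """Pick the `budget` target samples with the highest free energy."""
    scores = free_energy(target_logits)
    return np.argsort(-scores)[:budget]

# Example: logits of 5 unlabeled target samples from a 3-way classifier.
logits = np.array([[4.0, 0.1, 0.2], [0.3, 0.2, 0.1], [2.0, 1.9, 2.1],
                   [0.0, 0.0, 0.0], [5.0, 4.8, 5.1]])
print(select_queries(logits, budget=2))
```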