Consider the regression problem where the response $Y\in\mathbb{R}$ and the covariate $X\in\mathbb{R}^d$ for $d\geq 1$ are \textit{unmatched}. Under this scenario, we do not have access to pairs of observations from the distribution of $(X, Y)$; instead, we have separate datasets $\{Y_i\}_{i=1}^n$ and $\{X_j\}_{j=1}^m$, possibly collected from different sources. We study this problem assuming that the regression function is linear and the noise distribution is known or can be estimated. We introduce an estimator of the regression vector based on deconvolution and demonstrate its consistency and asymptotic normality under an identifiability assumption. In the general case, we show that our estimator (DLSE: Deconvolution Least Squares Estimator) is consistent in terms of an extended $\ell_2$ norm. Building on this result, we devise a method for semi-supervised learning, i.e., for the setting where a small sample of matched pairs $(X_k, Y_k)$ is available. Several applications with synthetic and real datasets are considered to illustrate the theory.
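As a rough illustration of the unmatched setting, here is a toy moment-matching sketch (not the paper's deconvolution-based DLSE): the model $Y = \beta X + \varepsilon$ with $d=1$, a known noise level $\sigma$, and $\mathbb{E}[X]\neq 0$ are all simplifying assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unmatched toy data: {Y_i} and {X_j} drawn independently, with Y = beta*X + eps in law.
beta_true, sigma = 2.0, 0.5
X = rng.normal(1.0, 1.0, size=5000)                                     # covariate sample
Y = beta_true * rng.normal(1.0, 1.0, size=5000) + sigma * rng.standard_normal(5000)

# With sigma known, "deconvolving" the second moment of Y identifies |beta|:
#   Var(Y) = beta^2 Var(X) + sigma^2,   E[Y] = beta E[X]  (the mean equation fixes the sign).
beta_abs = np.sqrt(max(Y.var() - sigma**2, 0.0) / X.var())
beta_hat = np.sign(Y.mean() / X.mean()) * beta_abs
print(beta_hat)   # approximately 2.0
```

Only the marginal distributions of $X$ and $Y$ are observable here, which is why one can only match distributional functionals: low-order moments in this toy sketch, the full distribution via deconvolution in the paper.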
Austrin showed that the approximation ratio $\beta\approx 0.94016567$ obtained by the MAX 2-SAT approximation algorithm of Lewin, Livnat and Zwick (LLZ) is optimal modulo the Unique Games Conjecture (UGC) and modulo a Simplicity Conjecture stating that the worst performance of the algorithm is obtained on so-called simple configurations. We prove Austrin's Simplicity Conjecture, thereby showing the optimality of the LLZ approximation algorithm, relying only on the Unique Games Conjecture. Our proof uses a combination of analytic and computational tools. We also present new approximation algorithms for two restrictions of the MAX 2-SAT problem. For MAX HORN-$\{1,2\}$-SAT, i.e., MAX CSP$(\{x\lor y,\bar{x}\lor y,x,\bar{x}\})$, in which clauses are not allowed to contain two negated literals, we obtain an approximation ratio of $0.94615981$. For MAX CSP$(\{x\lor y,x,\bar{x}\})$, i.e., when 2-clauses are not allowed to contain negated literals, we obtain an approximation ratio of $0.95397990$. By adapting Austrin's and our arguments for the MAX 2-SAT problem, we show that these two approximation ratios are also tight, modulo only the UGC. This completes a full characterization of the approximability of the MAX 2-SAT problem and its restrictions.
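For orientation, a minimal sketch of the random-hyperplane rounding step that LLZ-style algorithms build on; the vectors are assumed to come from an already-solved semidefinite relaxation, and LLZ's pre-rotation of the vectors toward $v_0$, which is where the ratio $\beta$ actually comes from, is deliberately omitted.

```python
import numpy as np

def hyperplane_round(v0, V, rng):
    """One Goemans-Williamson-style hyperplane rounding pass.
    v0: unit vector for the constant TRUE; V[i]: unit vector for variable x_i,
    both assumed to come from a solved MAX 2-SAT semidefinite relaxation."""
    g = rng.standard_normal(v0.shape)      # random hyperplane normal
    return (V @ g) * (v0 @ g) > 0          # x_i := True iff v_i and v0 fall on the same side

def satisfied(assignment, clauses):
    """clauses: iterable of ((i, neg_i), (j, neg_j)); neg=True means a negated literal."""
    lit = lambda i, neg: bool(assignment[i]) != neg
    return sum(lit(*a) or lit(*b) for a, b in clauses)
```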
We explore a notion of bent sequence attached to the data consisting of a Hadamard matrix of order $n$ defined over the complex $q$-th roots of unity, an eigenvalue of that matrix, and a Galois automorphism from the cyclotomic field of order $q$. In particular, we construct self-dual bent sequences for various $q\le 60$ and lengths $n\le 21$. Computational construction methods include solving polynomial systems by Gröbner bases and eigenspace computations. Infinite families can be constructed from regular Hadamard matrices, Bush-type Hadamard matrices, and generalized Boolean bent functions. As an application, we estimate the covering radius of the code attached to that matrix over $\mathbb{Z}_q$. We derive a lower bound on that quantity for the Chinese Euclidean metric when bent sequences exist. We give the Euclidean distance spectrum and derive an upper bound on the covering radius of an attached spherical code, depending on its strength as a spherical design.
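A small sketch of the defining check in the self-dual case; the interface and tolerance are our own, and in the general (non-self-dual) case the right-hand side involves a Galois conjugate of the sequence rather than the sequence itself.

```python
import numpy as np

def is_self_dual_bent(H, s, q, tol=1e-9):
    """Check the defining relation H s = lam * s for a candidate self-dual bent
    sequence s attached to a (complex) Hadamard matrix H of order n, with the
    entries of s among the q-th roots of unity and |lam| = sqrt(n)."""
    H, s = np.asarray(H, complex), np.asarray(s, complex)
    n = len(s)
    roots = np.exp(2j * np.pi * np.arange(q) / q)
    if any(np.min(np.abs(roots - z)) > tol for z in s):
        return False                              # entries must be q-th roots of unity
    lam = np.vdot(s, H @ s) / np.vdot(s, s)       # Rayleigh quotient: candidate eigenvalue
    return np.allclose(H @ s, lam * s, atol=tol) and np.isclose(abs(lam), np.sqrt(n))
```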
The polynomial identity lemma (also called the "Schwartz-Zippel lemma") states that any nonzero polynomial $f(x_1,\ldots, x_n)$ of degree at most $s$ will evaluate to a nonzero value at some point on a grid $S^n \subseteq \mathbb{F}^n$ with $|S| > s$. Thus, there is an explicit hitting set of size $(s+1)^n$ for all $n$-variate, degree-$s$ algebraic circuits of size $s$. In this paper, we prove the following results:
- Let $\varepsilon > 0$ be a constant. For a sufficiently large constant $n$ and all $s > n$, if we have an explicit hitting set of size $(s+1)^{n-\varepsilon}$ for the class of $n$-variate degree-$s$ polynomials that are computable by algebraic circuits of size $s$, then for all $s$, we have an explicit hitting set of size $s^{\exp \circ \exp (O(\log^\ast s))}$ for $s$-variate circuits of degree $s$ and size $s$. That is, if we can obtain a barely non-trivial exponent compared to the trivial hitting set of size $(s+1)^{n}$, even for constant-variate circuits, then we can get an almost complete derandomization of PIT.
- The above result holds when "circuits" are replaced by "formulas" or "algebraic branching programs".
This extends a recent surprising result of Agrawal, Ghosh and Saxena (STOC 2018, PNAS 2019), who proved the same conclusion for the class of algebraic circuits if the hypothesis provided a hitting set of size at most $s^{n^{0.5 - \delta}}$ (where $\delta>0$ is any constant). Hence, our work significantly weakens the hypothesis of Agrawal, Ghosh and Saxena to require only a slightly non-trivial saving over the trivial hitting set, and also presents the first such result for algebraic branching programs and formulas.
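The lemma immediately gives the classical randomized identity test whose derandomization is at stake here; a minimal black-box sketch (the interface is illustrative):

```python
import random

def probably_identically_zero(f, n, s, S, trials=20):
    """Randomized PIT via the polynomial identity (Schwartz-Zippel) lemma.
    f: black-box evaluation of an n-variate polynomial of total degree <= s;
    S: a finite set of field elements with |S| > s. For nonzero f, a uniform
    grid point is a nonzero witness with probability >= 1 - s/|S|, so each
    trial fails with probability <= s/|S|. (Exhausting all of S^n is exactly
    the trivial explicit hitting set of size (s+1)^n discussed above.)"""
    assert len(S) > s
    for _ in range(trials):
        point = [random.choice(S) for _ in range(n)]
        if f(*point) != 0:
            return False          # witness found: f is not identically zero
    return True                   # wrong with probability <= (s/|S|)^trials
```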
G{\"o}del's second incompleteness theorem forbids to prove, in a given theory U, the consistency of many theories-in particular, of the theory U itself-as well as it forbids to prove the normalization property for these theories, since this property implies their consistency. When we cannot prove in a theory U the consistency of a theory T , we can try to prove a relative consistency theorem, that is, a theorem of the form: If U is consistent then T is consistent. Following the same spirit, we show in this paper how to prove relative normalization theorems, that is, theorems of the form: If U is 1-consistent, then T has the normalization property.
We give a simple and computationally efficient algorithm that, for any constant $\varepsilon>0$, obtains $\varepsilon T$-swap regret within only $T = \mathsf{polylog}(n)$ rounds; this is an exponential improvement compared to the super-linear number of rounds required by the state-of-the-art algorithm, and resolves the main open problem of [Blum and Mansour 2007]. Our algorithm has an exponential dependence on $\varepsilon$, but we prove a new, matching lower bound. Our algorithm for swap regret implies faster convergence to $\varepsilon$-Correlated Equilibrium ($\varepsilon$-CE) in several regimes: for normal-form two-player games with $n$ actions, it implies the first uncoupled dynamics that converge to the set of $\varepsilon$-CE in polylogarithmic rounds; a $\mathsf{polylog}(n)$-bit communication protocol for $\varepsilon$-CE in two-player games (resolving an open problem mentioned by [Babichenko-Rubinstein'2017, Goos-Rubinstein'2018, Ganor-CS'2018]); and an $\tilde{O}(n)$-query algorithm for $\varepsilon$-CE (resolving an open problem of [Babichenko'2020] and obtaining the first separation between $\varepsilon$-CE and $\varepsilon$-Nash equilibrium in the query complexity model). For extensive-form games, our algorithm implies a PTAS for \emph{normal form correlated equilibria}, a solution concept often conjectured to be computationally intractable (e.g., [Stengel-Forges'08, Fujii'23]).
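For context, a sketch of the classical reduction of [Blum and Mansour 2007] whose round complexity the new algorithm improves exponentially; this is the baseline, not the new algorithm, and the multiplicative-weights learner and step size are illustrative choices.

```python
import numpy as np

def blum_mansour_play(losses, eta=0.1):
    """Classical swap-regret reduction of Blum and Mansour: run n copies of a
    multiplicative-weights learner, one per action; each round, stack their
    advice into a row-stochastic matrix Q, play its stationary distribution
    p = pQ, then charge copy i the loss vector scaled by p[i].
    losses: array of shape (T, n)."""
    T, n = losses.shape
    W = np.ones((n, n))                       # W[i]: weights of copy i over the n actions
    plays = []
    for t in range(T):
        Q = W / W.sum(axis=1, keepdims=True)  # row i = advice distribution of copy i
        vals, vecs = np.linalg.eig(Q.T)       # stationary dist: left eigenvector for eigenvalue 1
        p = np.abs(np.real(vecs[:, np.argmax(np.real(vals))]))
        p /= p.sum()
        plays.append(p)
        W *= np.exp(-eta * p[:, None] * losses[t][None, :])   # copy i sees p[i] * loss
    return np.array(plays)
```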
We construct a graph with $n$ vertices where the smoothed runtime of the 3-FLIP algorithm for the 3-Opt Local Max-Cut problem can be as large as $2^{\Omega(\sqrt{n})}$. This provides the first example where a local search algorithm for the Max-Cut problem can fail to be efficient in the framework of smoothed analysis. We also give a new construction of graphs where the runtime of the FLIP algorithm for the Local Max-Cut problem is $2^{\Omega(n)}$ for any pivot rule. This graph is much smaller and has a simpler structure than previous constructions.
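For reference, a sketch of the FLIP local search that these lower-bound constructions target; the graph representation and pivot order are illustrative, and this is the single-vertex rule, whereas 3-FLIP also considers moves of up to three vertices at once.

```python
def flip_local_max_cut(adj, side):
    """FLIP local search for Local Max-Cut: repeatedly move any vertex whose
    switch strictly increases the cut weight, until no such vertex exists.
    adj: dict vertex -> list of (neighbor, weight); side: dict vertex -> 0/1.
    The constructions above force this loop to run for exponentially many
    iterations (2^Omega(n), and 2^Omega(sqrt(n)) under smoothing for 3-FLIP)."""
    improved = True
    while improved:
        improved = False
        for v, nbrs in adj.items():
            # flipping v turns same-side edges into cut edges (+w)
            # and currently cut edges into same-side edges (-w)
            gain = sum(w if side[u] == side[v] else -w for u, w in nbrs)
            if gain > 0:
                side[v] ^= 1
                improved = True
    return side
```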
Although the empirical version of least trimmed squares (LTS) in regression (Rousseeuw \cite{R84}) has been studied repeatedly in the literature, the population version of LTS has never been introduced. This article establishes, for the first time, novel properties of the LTS objective function in both the empirical and population settings. These primary properties of the objective function facilitate the establishment of other original results, including the influence function and Fisher consistency. Strong consistency is established with the help of a generalized Glivenko-Cantelli theorem over a class of functions. Differentiability and stochastic equicontinuity then lead to asymptotic normality via a concise and novel approach.
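A minimal sketch of the empirical LTS objective studied here; the interface and the choice of trimming parameter $h$ are illustrative.

```python
import numpy as np

def lts_objective(beta, X, y, h):
    """Empirical LTS objective: the sum of the h smallest squared residuals,
    so the n - h most outlying points are trimmed away. Taking h around n/2
    gives the estimator its high breakdown point."""
    r2 = (y - X @ beta) ** 2
    return np.sort(r2)[:h].sum()

# The LTS estimate minimizes this objective over beta; e.g., for a 1-D toy
# problem one could scan a grid of slopes b and keep the minimizer of
# lts_objective(np.array([b]), X, y, h).
```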
We show an area law with logarithmic correction for the maximally mixed state $\Omega$ in the (degenerate) ground space of a 1D gapped local Hamiltonian $H$, which is independent of the underlying ground space degeneracy. Formally, for $\varepsilon>0$ and a bi-partition $L\cup L^c$ of the 1D lattice, we show that $$\mathrm{I}^{\varepsilon}_{\max}(L:L^c)_{\Omega} \leq O(\log(|L|)+\log(1/\varepsilon)),$$ where $|L|$ represents the number of qudits in $L$ and $\mathrm{I}^{\varepsilon}_{\max}(L:L^c)_{\Omega}$ represents the $\varepsilon$-smoothed maximum mutual information with respect to the $L:L^c$ partition in $\Omega$. As a corollary, we get an area law for the mutual information of the form $\mathrm{I}(L:L^c)_\Omega \leq O(\log |L|)$. In addition, we show that $\Omega$ can be approximated to within $\varepsilon$ in trace norm by a state of Schmidt rank at most $\mathrm{poly}(|L|/\varepsilon)$.
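As a toy numerical check of the (non-smoothed) quantity appearing in the corollary, one can compute the mutual information of a small bipartite density matrix directly; the interface is illustrative, and natural-log entropies are an arbitrary convention here.

```python
import numpy as np

def mutual_information(rho, dL, dR):
    """I(L:L^c) = S(rho_L) + S(rho_{L^c}) - S(rho) for a density matrix rho
    on a bipartite system of dimensions dL x dR (von Neumann entropies)."""
    def entropy(r):
        w = np.linalg.eigvalsh(r)
        w = w[w > 1e-12]
        return -np.sum(w * np.log(w))
    rho4 = rho.reshape(dL, dR, dL, dR)           # indices (i, j, i', j')
    rho_L = np.trace(rho4, axis1=1, axis2=3)     # partial trace over L^c
    rho_R = np.trace(rho4, axis1=0, axis2=2)     # partial trace over L
    return entropy(rho_L) + entropy(rho_R) - entropy(rho)
```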
We construct and analyze finite element approximations of the Einstein tensor in dimension $N \ge 3$. We focus on the setting where a smooth Riemannian metric tensor $g$ on a polyhedral domain $\Omega \subset \mathbb{R}^N$ has been approximated by a piecewise polynomial metric $g_h$ on a simplicial triangulation $\mathcal{T}$ of $\Omega$ having maximum element diameter $h$. We assume that $g_h$ possesses single-valued tangential-tangential components on every codimension-1 simplex in $\mathcal{T}$. Such a metric is not classically differentiable in general, but it turns out that one can still attribute meaning to its Einstein curvature in a distributional sense. We study the convergence of the distributional Einstein curvature of $g_h$ to the Einstein curvature of $g$ under refinement of the triangulation. We show that in the $H^{-2}(\Omega)$-norm, this convergence takes place at a rate of $O(h^{r+1})$ when $g_h$ is an optimal-order interpolant of $g$ that is piecewise polynomial of degree $r \ge 1$. We provide numerical evidence to support this claim.
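Numerical evidence for such a rate is typically gathered by estimating the observed order of convergence from errors on successively refined meshes; a generic sketch (not the authors' code):

```python
import numpy as np

def observed_order(h, err):
    """Estimate the convergence rate from errors on a sequence of meshes:
    if err ~ C * h^p, then successive pairs give p ~ log(e1/e2) / log(h1/h2)."""
    h, err = np.asarray(h, float), np.asarray(err, float)
    return np.log(err[:-1] / err[1:]) / np.log(h[:-1] / h[1:])

# e.g., observed_order([0.2, 0.1, 0.05], errors) should approach r + 1 for a
# degree-r interpolant if the claimed O(h^{r+1}) rate in H^{-2} holds.
```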
For a set of points in $\mathbb{R}^d$, the Euclidean $k$-means problem consists of finding $k$ centers such that the sum of squared distances from each data point to its closest center is minimized. Coresets are one of the main tools recently developed to solve this problem in a big-data context. They make it possible to compress the initial dataset while preserving its structure: running any algorithm on the coreset provides a guarantee almost equivalent to running it on the full data. In this work, we study coresets in a fully dynamic setting: points are added and deleted with the goal of efficiently maintaining a coreset from which a $k$-means solution can be computed. Based on an algorithm of Henzinger and Kale [ESA'20], we present an efficient and practical implementation of a fully dynamic coreset algorithm that improves the running time by up to a factor of 20 compared to our non-optimized implementation of the algorithm of Henzinger and Kale, without sacrificing more than 7% of the quality of the $k$-means solution.
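The guarantee a coreset provides concerns the weighted cost functional below; a coreset $(S, w)$ must preserve it up to a $1\pm\varepsilon$ factor for every candidate set of $k$ centers. A minimal sketch (interface illustrative):

```python
import numpy as np

def kmeans_cost(points, centers, weights=None):
    """Weighted k-means cost: the (weighted) sum of squared distances from
    each point to its nearest center. Evaluating this on a coreset instead
    of the full data is what makes the compression useful."""
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1).min(axis=1)
    if weights is None:
        weights = np.ones(len(points))
    return float(weights @ d2)
```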