In this paper, we investigate local permutation tests for testing conditional independence between two random vectors $X$ and $Y$ given $Z$. The local permutation test determines the significance of a test statistic by locally shuffling samples that share similar values of the conditioning variables $Z$, and it forms a natural extension of the usual permutation approach for unconditional independence testing. Despite its simplicity and empirical support, the theoretical underpinnings of the local permutation test remain unclear. Motivated by this gap, this paper aims to establish theoretical foundations of local permutation tests with a particular focus on binning-based statistics. We start by revisiting the hardness of conditional independence testing and provide an upper bound for the power of any valid conditional independence test, which holds when the probability of observing collisions in $Z$ is small. This negative result naturally motivates us to impose additional restrictions on the possible distributions under the null and the alternative. To this end, we focus our attention on certain classes of smooth distributions and identify provably tight conditions under which the local permutation method is universally valid, i.e., it is valid when applied to any (binning-based) test statistic. To complement this result on type I error control, we also show that in some cases a binning-based statistic calibrated via the local permutation method can achieve minimax optimal power. We also introduce a double-binning permutation strategy, which yields a valid test over less smooth null distributions than the typical single-binning method without compromising much power. Finally, we present simulation results to support our theoretical findings.
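To make the local shuffling step concrete, here is a minimal sketch of a single-binning local permutation test. All names (`local_permutation_pvalue`, the equal-width binning of $Z$, the user-supplied statistic) are our own illustration, not the paper's implementation; it assumes a scalar conditioning variable for simplicity.

```python
import numpy as np

def local_permutation_pvalue(X, Y, Z, statistic, n_bins=10, n_perms=500, seed=0):
    """Calibrate `statistic(X, Y)` by permuting Y only within bins of Z.

    X, Y: (n,) or (n, d) arrays; Z: (n,) array (scalar conditioning variable).
    `statistic` is any function of (X, Y), e.g. a binned correlation measure.
    """
    rng = np.random.default_rng(seed)
    # Single binning: discretize Z into equal-width bins.
    edges = np.linspace(Z.min(), Z.max(), n_bins + 1)[1:-1]
    bins = np.digitize(Z, edges)
    observed = statistic(X, Y)
    null_stats = np.empty(n_perms)
    for b in range(n_perms):
        Y_perm = Y.copy()
        for j in np.unique(bins):
            idx = np.where(bins == j)[0]
            Y_perm[idx] = Y[rng.permutation(idx)]  # shuffle Y within each bin
        null_stats[b] = statistic(X, Y_perm)
    # Permutation p-value with the standard add-one correction.
    return (1 + np.sum(null_stats >= observed)) / (1 + n_perms)
```

The add-one correction in the final line is the standard device that makes a permutation p-value exactly super-uniform; the question the paper addresses is when this within-bin shuffling actually preserves the null distribution well enough for validity.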
The segment number of a planar graph $G$ is the smallest number of line segments needed for a planar straight-line drawing of $G$. Dujmovi\'c, Eppstein, Suderman, and Wood [CGTA'07] introduced this measure for the visual complexity of graphs. There are optimal algorithms for trees and worst-case optimal algorithms for outerplanar graphs, 2-trees, and planar 3-trees. It is known that every cubic triconnected planar $n$-vertex graph (except $K_4$) has segment number $n/2+3$, which is the only known universal lower bound for a meaningful class of planar graphs. We show that every triconnected planar 4-regular graph can be drawn using at most $n+3$ segments. This bound is tight up to an additive constant, improves a previous upper bound of $7n/4+2$ implied by a more general result of Dujmovi\'c et al., and supplements the result for cubic graphs. We also give a simple optimal algorithm for cactus graphs, generalizing the above-mentioned result for trees. We prove the first linear universal lower bounds for outerpaths, maximal outerplanar graphs, 2-trees, and planar 3-trees. This shows that the existing algorithms for these graph classes are constant-factor approximations. For maximal outerpaths, our bound is best possible and can be generalized to circular arcs.
In the \emph{Exact Matching Problem} (EM), we are given a graph equipped with a fixed coloring of its edges with two colors (red and blue), as well as a positive integer $k$. The task is to decide whether the given graph contains a perfect matching exactly $k$ of whose edges are red. EM generalizes several important algorithmic problems such as \emph{perfect matching} and restricted minimum-weight spanning tree problems. When introducing the problem in 1982, Papadimitriou and Yannakakis conjectured EM to be \textbf{NP}-complete. Later, however, Mulmuley et al.~presented a randomized polynomial-time algorithm for EM, which places EM in \textbf{RP}. Since deciding whether or not \textbf{RP}$=$\textbf{P} is a major open challenge in complexity theory, this makes it unlikely for EM to be \textbf{NP}-complete, and in fact suggests the possibility of a \emph{deterministic} polynomial-time algorithm. EM remains one of the few natural combinatorial problems in \textbf{RP} that are not known to be contained in \textbf{P}, making it an interesting instance for testing the hypothesis \textbf{RP}$=$\textbf{P}. Despite EM being quite well known, a deterministic polynomial-time algorithm has remained elusive during the last 40 years, and progress has been lacking even for very restrictive classes of input graphs. In this paper we finally push the frontier of positive results forward by proving that EM can be solved in deterministic polynomial time for input graphs of bounded independence number, and for bipartite input graphs of bounded bipartite independence number. This generalizes the previous positive results for complete (bipartite) graphs, which were the only known results for EM on dense graphs.
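For intuition on why EM lies in \textbf{RP}, here is a sketch of the classical algebraic approach for the bipartite case, in the spirit of Mulmuley et al.: assign every edge a random field element, multiply red edges by an indeterminate $y$, and test whether the coefficient of $y^k$ in the determinant of the resulting Edmonds matrix is nonzero. Each perfect matching contributes a distinct monomial whose $y$-degree equals its number of red edges, so by Schwartz-Zippel the coefficient survives the random substitution with high probability exactly when such a matching exists. The code below is our own illustration, not from the paper.

```python
import random

P = (1 << 61) - 1  # large prime; Schwartz-Zippel makes errors unlikely

def det_mod_p(M, p=P):
    """Determinant over GF(p) via Gaussian elimination with pivoting."""
    M = [row[:] for row in M]
    n, det = len(M), 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] % p != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = (-det) % p               # row swap flips the sign
        det = det * M[c][c] % p
        inv = pow(M[c][c], p - 2, p)
        for r in range(c + 1, n):
            f = M[r][c] * inv % p
            for t in range(c, n):
                M[r][t] = (M[r][t] - f * M[c][t]) % p
    return det

def coeff_of_degree(xs, vals, k, p=P):
    """k-th coefficient of the poly of degree < len(xs) through (xs, vals)."""
    n, coeffs = len(xs), [0] * len(xs)
    for i in range(n):
        num, den = [1], 1                  # Lagrange basis polynomial for xs[i]
        for j in range(n):
            if j == i:
                continue
            num = [(a - xs[j] * b) % p for a, b in zip([0] + num, num + [0])]
            den = den * (xs[i] - xs[j]) % p
        scale = vals[i] * pow(den, p - 2, p) % p
        coeffs = [(c + scale * m) % p for c, m in zip(coeffs, num)]
    return coeffs[k]

def exact_matching_bipartite(n, red_edges, blue_edges, k):
    """True (with high probability) iff the bipartite graph on n+n vertices
    has a perfect matching with exactly k red edges.  Edges are (i, j) pairs
    between the two sides, each indexed 0..n-1."""
    r = {e: random.randrange(1, P) for e in red_edges + blue_edges}
    # det of the randomized Edmonds matrix is a degree-<=n polynomial in y;
    # recover its y^k coefficient from n+1 evaluations and interpolation.
    xs, vals = [], []
    for y in range(1, n + 2):
        M = [[0] * n for _ in range(n)]
        for (i, j) in blue_edges:
            M[i][j] = r[(i, j)]
        for (i, j) in red_edges:
            M[i][j] = r[(i, j)] * y % P
        xs.append(y)
        vals.append(det_mod_p(M))
    return coeff_of_degree(xs, vals, k) != 0
```

This runs in randomized polynomial time but offers no obvious derandomization, which is precisely the 40-year-old obstacle the paper attacks for graphs of bounded (bipartite) independence number.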
In this paper, we provide the first deterministic algorithm that achieves the tight $1-1/e$ approximation guarantee for submodular maximization under a cardinality (size) constraint while making a number of queries that scales only linearly with the size of the ground set $n$. To complement our result, we also prove strong information-theoretic lower bounds. More specifically, we show that when the maximum cardinality allowed for a solution is constant, no algorithm making a sub-linear number of function evaluations can guarantee any constant approximation ratio. Furthermore, when the constraint allows the selection of a constant fraction of the ground set, we show that any algorithm making fewer than $\Omega(n/\log(n))$ function evaluations cannot perform better than an algorithm that simply outputs a uniformly random subset of the ground set of the right size. We then provide a variant of our deterministic algorithm for the more general knapsack constraint, which is the first linear-time algorithm that achieves a $1/2$-approximation guarantee for this constraint. Finally, we extend our results to the general case of maximizing a monotone submodular function subject to the intersection of a $p$-set system and multiple knapsack constraints. We extensively evaluate the performance of our algorithms on multiple real-life machine learning applications, including movie recommendation, location summarization, Twitter text summarization, and video summarization.
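The paper's own deterministic linear-query algorithm is not reproduced here; as a point of reference, the following sketch implements the classical descending-thresholds greedy of Badanidiyuru and Vondr\'ak, a deterministic $(1-1/e-\epsilon)$-approximation for a cardinality constraint using $O((n/\epsilon)\log(n/\epsilon))$ value queries, i.e., the near-linear baseline that the new result improves to truly linear. Function and variable names are ours.

```python
def threshold_greedy(f, ground_set, cardinality, eps=0.1):
    """Descending-thresholds greedy: a deterministic (1 - 1/e - eps)-approx
    for monotone submodular f under |S| <= cardinality, with
    O((n/eps) * log(n/eps)) value queries.  `f` maps a frozenset to a float."""
    S = frozenset()
    d = max(f(frozenset([e])) for e in ground_set)  # best singleton value
    if d <= 0:
        return S
    n, tau = len(ground_set), d
    while tau >= eps * d / n:
        for e in ground_set:
            if e in S or len(S) >= cardinality:
                continue
            if f(S | {e}) - f(S) >= tau:   # marginal gain clears the threshold
                S = S | {e}
        tau *= (1 - eps)                   # geometrically lower the threshold
    return S

# Toy coverage example: pick 2 sets covering the most ground elements.
cover = {0: {1, 2}, 1: {2, 3}, 2: {4}}
f = lambda S: len(set().union(*(cover[i] for i in S))) if S else 0
print(sorted(threshold_greedy(f, list(cover), cardinality=2)))
```

The lower bounds in the abstract say that the query count of such methods cannot be pushed below (near-)linear in $n$ without losing all constant-factor guarantees.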
In the computational sciences, one must often estimate model parameters from data subject to noise and uncertainty, leading to inaccurate results. In order to improve the accuracy of models with noisy parameters, we consider the problem of reducing error in a linear system whose operator is corrupted by noise. Our contribution in this paper is to extend the elliptic operator shifting framework of Etter and Ying '20 to the general nonsymmetric matrix case. Roughly, the operator shifting technique is a matrix analogue of the James-Stein estimator. The key insight is that a shift of the matrix inverse estimate in an appropriately chosen direction will reduce average error. In our extension, we investigate several questions, chiefly whether shifting towards the origin for general matrix inverses always reduces error as it does in the elliptic case. We show that this is usually the case, but that there are three key features of general nonsingular matrices that allow for adversarial examples not possible in the symmetric case. We prove that when these adversarial possibilities are eliminated by the assumption of noise symmetry and the use of the residual norm as the error metric, the optimal shift is always towards the origin, mirroring results from Etter and Ying '20. We also investigate behavior in the small-noise regime and other scenarios. We conclude by presenting numerical experiments (with accompanying source code) inspired by reinforcement learning to demonstrate that operator shifting can yield substantial reductions in error.
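As a toy illustration of the shrinkage idea (our own construction, not the paper's experiments), the following Monte Carlo script compares the average squared residual of the plain estimate $\hat{A}^{-1}b$ against versions shrunk towards the origin by a factor $1-\lambda$, using symmetric noise as in the theory's favorable regime.

```python
import numpy as np

n, sigma, trials = 20, 0.05, 2000
master = np.random.default_rng(0)
A = master.standard_normal((n, n)) + 3 * np.eye(n)  # well-conditioned truth
b = master.standard_normal(n)

def mean_sq_residual(lam):
    """Average ||b - A x_hat||^2 for the shrunk estimate
    x_hat = (1 - lam) * (A + E)^{-1} b, over random symmetric noise E."""
    rng = np.random.default_rng(1)       # common random numbers across lam
    err = 0.0
    for _ in range(trials):
        E = sigma * rng.standard_normal((n, n))
        E = (E + E.T) / 2                # symmetric noise, as in the theory
        x_hat = (1 - lam) * np.linalg.solve(A + E, b)
        r = b - A @ x_hat
        err += r @ r
    return err / trials

for lam in [0.0, 0.01, 0.02, 0.05]:
    print(f"lambda={lam:.2f}: mean residual^2 = {mean_sq_residual(lam):.5f}")
```

Reusing the same noise draws across the candidate values of $\lambda$ (common random numbers) makes the comparison of the residuals a like-for-like one.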
We propose inferential tools for functional linear quantile regression, where the conditional quantile of a scalar response is assumed to be a linear functional of a functional covariate. In contrast to conventional approaches, we employ kernel convolution to smooth the original loss function. The coefficient function is estimated under a reproducing kernel Hilbert space framework. A gradient descent algorithm is designed to minimize the smoothed loss function with a roughness penalty. With the aid of the Banach fixed-point theorem, we show the existence and uniqueness of our proposed estimator as the minimizer of the regularized loss function in an appropriate Hilbert space. Furthermore, we establish the convergence rate as well as the weak convergence of our estimator. To the best of our knowledge, this is the first weak convergence result for a functional quantile regression model. Pointwise confidence intervals and a simultaneous confidence band for the true coefficient function are then developed based on these theoretical properties. Numerical studies, including both simulations and a data application, are conducted to investigate the performance of our estimator and inference tools in finite samples.
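To illustrate the smoothing device, here is a sketch using a Gaussian kernel, for which the convolution-smoothed check loss $\rho_{\tau,h}(u) = \mathbb{E}\,\rho_\tau(u - hZ)$, $Z \sim N(0,1)$, and its derivative have closed forms. We apply it to a finite-dimensional linear quantile regression with a ridge penalty as a stand-in for the paper's RKHS estimator; all names and tuning values are our own.

```python
import numpy as np
from scipy.stats import norm

def smoothed_check_loss(u, tau, h):
    """Gaussian-convolution smoothing of the check loss rho_tau(u)."""
    return u * (tau - 1 + norm.cdf(u / h)) + h * norm.pdf(u / h)

def smoothed_check_grad(u, tau, h):
    """Derivative of the smoothed loss in u: Phi(u/h) - (1 - tau)."""
    return norm.cdf(u / h) - (1 - tau)

def fit_smoothed_qr(X, y, tau=0.5, h=0.3, lam=1e-3, lr=0.1, iters=2000):
    """Gradient descent on the smoothed, ridge-penalized quantile loss;
    a finite-dimensional stand-in for the RKHS estimator in the paper."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(iters):
        u = y - X @ beta                              # residuals
        g = -X.T @ smoothed_check_grad(u, tau, h) / n + lam * beta
        beta -= lr * g
    return beta

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.standard_normal(500)
print(fit_smoothed_qr(X, y, tau=0.5))   # roughly recovers (1, -2, 0.5)
```

Unlike the non-differentiable check loss, the smoothed loss has a Lipschitz gradient, which is what makes plain gradient descent (and the fixed-point analysis) available.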
We show that the sample complexity of the robust interpolation problem can be exponential in the input dimensionality, and we discover a phase-transition phenomenon when the data lie in a unit ball. Robust interpolation refers to the problem of interpolating $n$ noisy training data points in $\mathbb{R}^d$ by a Lipschitz function. Although this problem has been well understood when the covariates are drawn from a distribution satisfying isoperimetry, much remains unknown concerning its behavior under generic or even worst-case distributions. Our results are twofold: 1) Too much data hurts robustness: we provide a tight and universal Lipschitzness lower bound of $\Omega(n^{1/d})$ on the interpolating function for arbitrary data distributions. Our result disproves the potential existence of an $\mathcal{O}(1)$-Lipschitz function in the overparametrization scenario when $n=\exp(\omega(d))$. 2) Small data hurts robustness: $n=\exp(\Omega(d))$ is necessary for obtaining a good population error under certain distributions by any $\mathcal{O}(1)$-Lipschitz learning algorithm. Perhaps surprisingly, our results shed light on the curse of big data and the blessing of dimensionality for robustness, and reveal an intriguing phase transition at $n=\exp(\Theta(d))$.
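The $\Omega(n^{1/d})$ rate can be previewed with a one-step packing heuristic (an informal sketch of the pigeonhole idea, not the paper's argument): covering the unit ball of $\mathbb{R}^d$ requires only $\Theta(n)$ cells of diameter $O(n^{-1/d})$, so some two of the $n$ covariates $x_i, x_j$ land in one cell, and if their noisy labels differ by a constant $c>0$, any interpolant $f$ must satisfy
\[
  \mathrm{Lip}(f) \;\ge\; \frac{|f(x_i)-f(x_j)|}{\|x_i-x_j\|}
  \;\ge\; \frac{c}{O(n^{-1/d})} \;=\; \Omega\bigl(n^{1/d}\bigr).
\]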
We study the robust maximum flow problem and the robust maximum flow over time problem, where a given number $\Gamma$ of arcs may fail or may be delayed. Two prominent models have been introduced for these problems: either one assigns flow to arcs, fulfilling weak flow conservation in any scenario, or one assigns flow to paths, where an arc failure or delay affects a whole path. We provide a unifying framework by presenting novel general models in which we assign flow to subpaths. These models contain the known models as special cases and unify their advantages in order to obtain less conservative robust solutions. We give a thorough complexity analysis of the general models. In particular, we show that the general models are essentially NP-hard, whereas, e.g., in the static case with $\Gamma = 1$, an optimal solution can be computed in polynomial time. Further, we answer the open question about the complexity of the dynamic path model for $\Gamma = 1$. We also compare the solution quality of the different models. Specifically, we show that the general models have better robust optimal values than the known models, and we prove bounds on these gaps.
Group synchronization asks to recover group elements from their pairwise measurements and has found numerous applications across various scientific disciplines. In this work, we focus on orthogonal and permutation group synchronization, which are widely used in computer vision applications such as object matching and structure from motion. Among the many available approaches, spectral methods have enjoyed great popularity due to their efficiency and convenience. We study the performance guarantees of the spectral methods in solving these two synchronization problems by investigating how well the computed eigenvectors approximate each group element individually. We establish our theory by applying the recently popular~\emph{leave-one-out} technique and derive a~\emph{block-wise} performance bound for the recovery of each group element via eigenvectors. In particular, for orthogonal group synchronization, we obtain a near-optimal performance bound for group recovery in the presence of additive Gaussian noise. For permutation group synchronization under random corruption, we show that the widely used two-step procedure (spectral method plus rounding) can recover all the group elements exactly if the SNR (signal-to-noise ratio) is close to the information-theoretic limit. Our numerical experiments confirm our theory and indicate a sharp phase transition for exact group recovery.
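A minimal sketch of the two-step procedure for permutation synchronization follows; it is our own illustration, and the alignment against the first block and the assignment-based rounding are standard choices rather than necessarily the paper's exact setup.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def spectral_permutation_sync(C, n, d):
    """Spectral method plus rounding for permutation synchronization.

    C: symmetric (n*d, n*d) block matrix whose (i, j) block is the noisy
    pairwise measurement of P_i P_j^T.  Returns estimates of P_1, ..., P_n
    (each d x d), gauged so that the first estimate is the identity."""
    # Step 1: top-d eigenvectors of the measurement matrix.
    vals, vecs = np.linalg.eigh(C)
    U = vecs[:, -d:]                                  # (n*d, d)
    blocks = [U[i * d:(i + 1) * d, :] for i in range(n)]
    # Step 2: round each block to a permutation via linear assignment,
    # after aligning against the first block to remove the eigenbasis gauge.
    ref = blocks[0]
    P_hat = []
    for B in blocks:
        M = B @ ref.T                                 # close to P_i P_1^T
        row, col = linear_sum_assignment(-M)          # maximize entry mass
        P = np.zeros((d, d))
        P[row, col] = 1.0
        P_hat.append(P)
    return P_hat
```

The block-wise bounds in the abstract control exactly the quantity this code relies on: how close each $d \times d$ eigenvector block is to the corresponding group element, which determines whether the rounding step succeeds simultaneously for all $n$ blocks.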
We introduce a notion of \emph{generic local algorithm} that strictly generalizes existing frameworks of local algorithms, such as \emph{factors of i.i.d.}, by capturing local \emph{quantum} algorithms such as the Quantum Approximate Optimization Algorithm (QAOA). Motivated by a question of Farhi et al. [arXiv:1910.08187, 2019], we then show limitations of generic local algorithms, including QAOA, on random instances of constraint satisfaction problems (CSPs). Specifically, we show that any generic local algorithm whose assignment to a vertex depends only on a local neighborhood containing $o(n)$ other vertices (such as the QAOA at depth less than $\epsilon\log(n)$) cannot approximate Boolean CSPs arbitrarily well if the problem satisfies a geometric property from statistical physics called the coupled overlap-gap property (OGP) [Chen et al., Annals of Probability, 47(3), 2019]. We show that the random MAX-$k$-XOR problem has this property when $k\geq 4$ is even, by extending the corresponding result for diluted $k$-spin glasses. Our concentration lemmas confirm a conjecture of Brandao et al. [arXiv:1812.04170, 2018] asserting that the landscape independence of QAOA extends to logarithmic depth: in other words, for every fixed choice of QAOA angle parameters, the algorithm at logarithmic depth performs almost equally well on almost all instances. One of these concentration lemmas is a strengthening of McDiarmid's inequality, applicable when the random variables have a highly biased distribution, and may be of independent interest.
Influence maximization is the task of selecting a small number of seed nodes in a social network so as to maximize the spread of influence from these seeds, and it has been widely investigated over the past two decades. In the canonical setting, the whole social network as well as its diffusion parameters is given as input. In this paper, we consider the more realistic sampling setting, where the network is unknown and we only have a set of passively observed cascades that record the set of activated nodes at each diffusion step. We study the task of influence maximization from these cascade samples (IMS) and present constant-approximation algorithms for this task under mild conditions on the seed set distribution. To achieve the optimization goal, we also provide a novel solution to the network inference problem, that is, learning diffusion parameters and the network structure from cascade data. Compared with prior solutions, our network inference algorithm requires weaker assumptions and does not rely on maximum-likelihood estimation or convex programming. Our IMS algorithms enhance the learning-and-then-optimization approach by allowing a constant approximation ratio even when the diffusion parameters are hard to learn, and we do not need any assumption related to the network structure or diffusion parameters.
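For context, the following sketch shows the optimization half of the baseline learning-and-then-optimization pipeline under the independent cascade model: Monte Carlo spread estimation plus greedy seed selection, which gives a $(1-1/e)$-approximation for the monotone submodular spread objective once the diffusion parameters are known. It is our own illustration of the standard approach, not the paper's algorithm, which is designed to succeed even when those parameters cannot be learned accurately.

```python
import random

def simulate_spread(graph, seeds, trials=200, seed=0):
    """Monte Carlo estimate of the expected influence spread of `seeds`
    under the independent cascade (IC) model.
    graph: dict u -> list of (v, p) out-edges with activation probability p."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            u = frontier.pop()
            for v, p in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)         # v activates with probability p
                    frontier.append(v)
        total += len(active)
    return total / trials

def greedy_influence_maximization(graph, k, trials=200):
    """Greedy seed selection: a (1 - 1/e)-approximation for the monotone
    submodular expected-spread objective, given known diffusion parameters."""
    nodes = set(graph) | {v for edges in graph.values() for v, _ in edges}
    seeds = []
    for _ in range(k):
        best = max((u for u in nodes if u not in seeds),
                   key=lambda u: simulate_spread(graph, seeds + [u], trials))
        seeds.append(best)
    return seeds
```

In the sampling setting of the paper, the `graph` dictionary itself (both edges and probabilities) must first be inferred from the observed cascades, and the abstract's point is that the end-to-end guarantee can be preserved even when that inference step is unreliable.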