We consider the goodness-of-fit testing problem for H\"older smooth densities over $\mathbb{R}^d$: given $n$ iid observations with unknown density $p$ and given a known density $p_0$, we investigate how large the separation $\rho$ must be so that we can distinguish, with high probability, the case $p=p_0$ from the composite alternative of all H\"older-smooth densities $p$ such that $\|p-p_0\|_t \geq \rho$, where $t \in [1,2]$. The densities are assumed to be defined over $\mathbb{R}^d$ and to have H\"older smoothness parameter $\alpha>0$. In the present work, we solve the case $\alpha \leq 1$ and handle the case $\alpha>1$ under an additional technical restriction on the densities. We identify matching upper and lower bounds on the local minimax rates of testing, given explicitly in terms of $p_0$. We propose novel test statistics which we believe could be of independent interest. We also provide the first explicit cutoff $u_B$ that splits $\mathbb{R}^d$ into a bulk part (the subset of $\mathbb{R}^d$ on which $p_0$ takes values greater than or equal to $u_B$) and a tail part (the complement of the bulk), each part making a fundamentally different contribution to the local minimax rates of testing.
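Schematically, writing $\mathcal{H}(\alpha)$ for the H\"older class (notation introduced here only for illustration; the exact class, constants, and restrictions are those stated above), the testing problem and the bulk/tail split read
\[
H_0:\; p = p_0 \qquad \text{versus} \qquad H_1:\; p \in \mathcal{H}(\alpha),\ \|p-p_0\|_t \geq \rho,\quad t\in[1,2],
\]
\[
\text{bulk} = \{x \in \mathbb{R}^d : p_0(x) \geq u_B\},
\qquad
\text{tail} = \mathbb{R}^d \setminus \text{bulk}.
\]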
We provide universally-optimal distributed graph algorithms for $(1+\varepsilon)$-approximate shortest path problems including shortest-path-tree and transshipment. The universal optimality of our algorithms guarantees that, on any $n$-node network $G$, our algorithm completes in $T \cdot n^{o(1)}$ rounds whenever a $T$-round algorithm exists for $G$. This includes $D \cdot n^{o(1)}$-round algorithms for any planar or excluded-minor network. Our algorithms never require more than $(\sqrt{n} + D) \cdot n^{o(1)}$ rounds, resulting in the first sub-linear-round distributed algorithm for transshipment. The key technical contribution leading to these results is the first efficient $n^{o(1)}$-competitive linear $\ell_1$-oblivious routing operator that does not require the use of $\ell_1$-embeddings. Our construction is simple, solely based on low-diameter decompositions, and -- in contrast to all known constructions -- directly produces an oblivious flow instead of just an approximation of the optimal flow cost. This also has the benefit of simplifying the interaction with Sherman's multiplicative weight framework [SODA'17] in the distributed setting and its subsequent rounding procedures.
In this paper, an abstract framework for the error analysis of discontinuous finite element methods is developed for distributed and Neumann boundary control problems governed by the stationary Stokes equation with control constraints. {\it A~priori} error estimates of optimal order are derived for the velocity and the pressure in the energy norm and the $L^2$-norm, respectively. Moreover, a reliable and efficient {\it a~posteriori} error estimator is derived. The results apply to a variety of problems under only the minimal regularity guaranteed by the well-posedness of the problem. In particular, we instantiate the abstract results with suitable stable pairs of velocity and pressure spaces, such as the lowest-order Crouzeix-Raviart finite element paired with piecewise constants, and piecewise linear paired with piecewise constant finite element spaces. The theoretical results are illustrated by numerical experiments.
An $s{\operatorname{-}}t$ minimum cut in a graph corresponds to a minimum weight subset of edges whose removal disconnects vertices $s$ and $t$. Finding such a cut is a classic problem that is dual to that of finding a maximum flow from $s$ to $t$. In this work we describe a quantum algorithm for the minimum $s{\operatorname{-}}t$ cut problem on undirected graphs. For an undirected graph $G$ with $n$ vertices, $m$ edges, and integral edge weights bounded by $W$, the algorithm computes with high probability the weight of a minimum $s{\operatorname{-}}t$ cut in time $\widetilde O(\sqrt{m} n^{5/6} W^{1/3} + n^{5/3} W^{2/3})$, given adjacency list access to $G$. For simple graphs this bound is always $\widetilde O(n^{11/6})$, even in the dense case when $m = \Omega(n^2)$. In contrast, a classical randomized algorithm must make $\Omega(m)$ queries to the adjacency list of a simple graph $G$ even to decide whether $s$ and $t$ are connected.
We consider the problem of approximating the arboricity of a graph $G= (V,E)$, which we denote by $\mathsf{arb}(G)$, in sublinear time, where the arboricity of a graph is the minimal number of forests required to cover its edges. An algorithm for this problem may perform degree and neighbor queries, and is allowed a small error probability. We design an algorithm that outputs an estimate $\hat{\alpha}$ such that, with probability $1-1/\textrm{poly}(n)$, $\mathsf{arb}(G)/(c\log^2 n) \leq \hat{\alpha} \leq \mathsf{arb}(G)$, where $n=|V|$ and $c$ is a constant. The expected query complexity and running time of the algorithm are $O(n/\mathsf{arb}(G))\cdot \textrm{poly}(\log n)$, and this upper bound also holds with high probability. This bound is optimal for such an approximation up to a $\textrm{poly}(\log n)$ factor.
We consider the problem of testing for long-range dependence in time-varying coefficient regression models. The covariates and errors are assumed to be locally stationary, which allows for complex temporal dynamics and heteroscedasticity. We develop KPSS-, R/S-, V/S-, and K/S-type statistics based on the nonparametric residuals, and propose bootstrap approaches equipped with a difference-based long-run covariance matrix estimator for practical implementation. Under the null hypothesis, local alternatives, and fixed alternatives, we derive the limiting distributions of the test statistics, establish the uniform consistency of the difference-based long-run covariance estimator, and justify the bootstrap algorithms theoretically. In particular, the exact local asymptotic power of our testing procedure is of order $O( \log^{-1} n)$, the same as that of the classical KPSS test for long memory in strictly stationary series without covariates. We demonstrate the effectiveness of our tests through extensive simulation studies. The proposed tests are applied to a COVID-19 dataset, providing evidence of long-range dependence in the cumulative confirmed case series of COVID-19 in several countries, and to a Hong Kong circulatory and respiratory dataset, identifying a new type of `spurious long memory'.
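For orientation only, and not as the locally stationary, bootstrap-calibrated procedure developed above, a classical KPSS-type statistic computed from regression residuals takes the following form; the residual vector and the long-run variance estimate (here a plain user-supplied scalar rather than the difference-based matrix estimator of the paper) are assumed inputs of the sketch.
\begin{verbatim}
import numpy as np

def kpss_statistic(residuals, long_run_var):
    # Classical KPSS-type statistic: eta = n^{-2} * sum_t S_t^2 / sigma_hat^2,
    # where S_t are the partial sums of the residuals and sigma_hat^2 is a
    # (user-supplied) long-run variance estimate.
    e = np.asarray(residuals, dtype=float)
    n = e.size
    s = np.cumsum(e)                      # partial sums S_1, ..., S_n
    return np.sum(s ** 2) / (n ** 2 * long_run_var)
\end{verbatim}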
We consider \emph{Gibbs distributions}, which are families of probability distributions over a discrete space $\Omega$ with probability mass function of the form $\mu^\Omega_\beta(\omega) \propto e^{\beta H(\omega)}$ for $\beta$ in an interval $[\beta_{\min}, \beta_{\max}]$ and $H( \omega ) \in \{0 \} \cup [1, n]$. The \emph{partition function} is the normalization factor $Z(\beta)=\sum_{\omega \in\Omega}e^{\beta H(\omega)}$. Two important parameters of these distributions are the log partition ratio $q = \log \tfrac{Z(\beta_{\max})}{Z(\beta_{\min})}$ and the counts $c_x = |H^{-1}(x)|$. These are correlated with system parameters in a number of physical applications and sampling algorithms. Our first main result is an algorithm to estimate the counts $c_x$ using roughly $\tilde O( \frac{q}{\varepsilon^2})$ samples for general Gibbs distributions and $\tilde O( \frac{n^2}{\varepsilon^2} )$ samples for integer-valued distributions (ignoring some second-order terms and parameters), and we show that this is optimal up to logarithmic factors. We illustrate this with improved algorithms for counting connected subgraphs and perfect matchings in a graph. As a key subroutine, we develop an algorithm to estimate the partition function $Z$. Specifically, it generates a data structure capable of estimating $Z(\beta)$ for \emph{all} values of $\beta$, without requiring further samples. Constructing the data structure requires $O(\frac{q \log n}{\varepsilon^2})$ samples for general Gibbs distributions and $O(\frac{n^2 \log n}{\varepsilon^2} + n \log q)$ samples for integer-valued distributions. This improves over a prior algorithm of Huber (2015), which computes a single point estimate $Z(\beta_{\max})$ using $O( q \log n( \log q + \log \log n + \varepsilon^{-2}))$ samples. We show matching lower bounds, demonstrating that this complexity is optimal as a function of $n$ and $q$ up to logarithmic terms.
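As a concrete (and deliberately naive) illustration of how samples at intermediate temperatures relate to partition-ratio estimation -- this is a textbook telescoping identity, not the data structure or the sample-efficient schedule described above -- one can estimate $q$ as follows; the sampler interface is an assumption of the sketch.
\begin{verbatim}
import numpy as np

def log_partition_ratio(betas, sample_H, num_samples=1000):
    # Telescoping estimate of log Z(beta_max) - log Z(beta_min), using the
    # identity E_{mu_beta}[exp((beta' - beta) * H)] = Z(beta') / Z(beta).
    # `sample_H(beta, k)` is assumed to return k iid draws of H(omega) with
    # omega ~ mu_beta; `betas` is an increasing grid from beta_min to beta_max.
    log_ratio = 0.0
    for beta, beta_next in zip(betas[:-1], betas[1:]):
        h = sample_H(beta, num_samples)
        log_ratio += np.log(np.mean(np.exp((beta_next - beta) * h)))
    return log_ratio
\end{verbatim}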
We study a variation of Bayesian $M$-ary hypothesis testing in which, upon processing the observation, the test outputs a list of $L$ candidates out of the $M$ possible hypotheses. We study the minimum error probability of list hypothesis testing, where an error is defined as the event that the true hypothesis is not in the list output by the test. We derive two exact expressions for the minimum probability of error. The first is expressed as the error probability of a certain non-Bayesian binary hypothesis test, and is reminiscent of the meta-converse bound. The second is expressed as the tail probability of the likelihood ratio between the two distributions involved in the aforementioned non-Bayesian binary hypothesis test.
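In symbols, with $W$ denoting the true hypothesis, $Y$ the observation, and $\mathcal{L}$ a list-valued test (notation assumed here purely for illustration), the quantity of interest is
\[
\epsilon_L \;=\; \min_{\mathcal{L}\,:\, |\mathcal{L}(y)| \leq L \ \forall y} \ \Pr\bigl[ W \notin \mathcal{L}(Y) \bigr].
\]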
We give lower bounds on the performance of two of the most popular sampling methods in practice, the Metropolis-adjusted Langevin algorithm (MALA) and multi-step Hamiltonian Monte Carlo (HMC) with a leapfrog integrator, when applied to well-conditioned distributions. Our main result is a nearly-tight lower bound of $\widetilde{\Omega}(\kappa d)$ on the mixing time of MALA from an exponentially warm start, matching a line of algorithmic results up to logarithmic factors and answering an open question of Chewi et al. We also show that a polynomial dependence on dimension is necessary for the relaxation time of HMC under any number of leapfrog steps, and bound the gains achievable by changing the step count. Our HMC analysis draws upon a novel connection between leapfrog integration and Chebyshev polynomials, which may be of independent interest.
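For reference, the leapfrog integrator mentioned above is the standard symplectic discretization of Hamiltonian dynamics for $H(x,p) = -\log \pi(x) + \|p\|^2/2$; the sketch below is generic and not specific to the lower-bound constructions of the paper.
\begin{verbatim}
import numpy as np

def leapfrog(x, p, grad_log_pi, step_size, num_steps):
    # num_steps leapfrog steps for the Hamiltonian -log pi(x) + |p|^2 / 2.
    # grad_log_pi(x) is assumed to return the gradient of log pi at x.
    p = p + 0.5 * step_size * grad_log_pi(x)      # initial half step (momentum)
    for _ in range(num_steps - 1):
        x = x + step_size * p                     # full step (position)
        p = p + step_size * grad_log_pi(x)        # full step (momentum)
    x = x + step_size * p                         # last full position step
    p = p + 0.5 * step_size * grad_log_pi(x)      # final half step (momentum)
    return x, p
\end{verbatim}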
In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
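To make the smoothing idea behind DRS concrete, the following single-machine sketch estimates a gradient of the Gaussian smoothing $f_\gamma(x) = \mathbb{E}_{u \sim \mathcal{N}(0,I)}[f(x+\gamma u)]$ of a non-smooth $f$; it illustrates only the local randomized smoothing step, not the distributed algorithm or its acceleration, and all names and parameters are placeholders.
\begin{verbatim}
import numpy as np

def smoothed_gradient(f, x, gamma, num_samples=10, rng=None):
    # Zeroth-order estimate of grad f_gamma(x), where
    # f_gamma(x) = E_u[f(x + gamma * u)], u ~ N(0, I), using
    # grad f_gamma(x) = E_u[(f(x + gamma * u) - f(x)) * u] / gamma.
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros_like(x, dtype=float)
    for _ in range(num_samples):
        u = rng.standard_normal(x.shape)
        g += (f(x + gamma * u) - f(x)) * u
    return g / (gamma * num_samples)
\end{verbatim}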
In this paper, we study the optimal convergence rates for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m}f_i(\mathbf{x})$ is: strongly convex and smooth, strongly convex, smooth, or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
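For completeness, a generic single-machine form of Nesterov's accelerated gradient descent for a smooth convex function is sketched below; the scheme discussed above applies this type of iteration to the dual of the affinely constrained problem, which is not reproduced here, and the step-size and momentum choices shown are the standard ones rather than those of the paper.
\begin{verbatim}
import numpy as np

def nesterov_agd(grad, x0, lipschitz, num_iters):
    # Standard accelerated gradient method for a convex function with
    # L-Lipschitz gradient: gradient step from an extrapolated point,
    # followed by a momentum update with the usual t_k schedule.
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(num_iters):
        x_next = y - grad(y) / lipschitz
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t ** 2)) / 2.0
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)
        x, t = x_next, t_next
    return x
\end{verbatim}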