Motivated by applications to topological data analysis, we give an efficient algorithm for computing a (minimal) presentation of a bigraded $K[x,y]$-module $M$, where $K$ is a field. The algorithm takes as input a short chain complex of free modules $X\xrightarrow{f} Y \xrightarrow{g} Z$ such that $M\cong \ker{g}/\mathrm{im}{f}$. It runs in time $O(|X|^3+|Y|^3+|Z|^3)$ and requires $O(|X|^2+|Y|^2+|Z|^2)$ memory, where $|\cdot |$ denotes the rank. Given the presentation computed by our algorithm, the bigraded Betti numbers of $M$ are readily computed. Our approach is based on a simple matrix reduction algorithm, slight variants of which compute kernels of morphisms between free modules, minimal generating sets, and Gr\"obner bases. Our algorithm for computing minimal presentations has been implemented in RIVET, a software tool for the visualization and analysis of two-parameter persistent homology. In experiments on topological data analysis problems, our implementation outperforms the standard computational commutative algebra packages Singular and Macaulay2 by a wide margin.
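A minimal sketch of the ungraded core of such a column reduction, over GF(2) for simplicity: columns are reduced against previously found pivots while the column operations are recorded, and every column that reduces to zero yields a kernel element. This ignores the bigrading and all minimality bookkeeping, so it is only an illustration of the style of matrix reduction, not the algorithm of the paper; the function name and coefficient field are illustrative choices.

```python
import numpy as np

def kernel_basis_gf2(A):
    """Column-reduce a binary matrix over GF(2) and return a basis of its kernel.

    Toy, ungraded analogue of matrix reduction for maps of free modules:
    column operations are tracked in an identity block, and each column of A
    that reduces to zero contributes a kernel element.
    """
    A = A.copy() % 2
    m, n = A.shape
    V = np.eye(n, dtype=int)          # records the column operations
    pivot_of_row = {}                 # row index -> column holding a pivot there
    kernel = []
    for j in range(n):
        while True:
            rows = np.nonzero(A[:, j])[0]
            if len(rows) == 0:        # column reduced to zero: kernel element found
                kernel.append(V[:, j].copy())
                break
            p = rows[-1]              # lowest nonzero entry of the column
            if p not in pivot_of_row:
                pivot_of_row[p] = j
                break
            k = pivot_of_row[p]
            A[:, j] = (A[:, j] + A[:, k]) % 2
            V[:, j] = (V[:, j] + V[:, k]) % 2
    return kernel

# Example: the kernel of this 2x3 matrix over GF(2) is spanned by (1, 1, 1).
A = np.array([[1, 0, 1],
              [0, 1, 1]])
print(kernel_basis_gf2(A))   # [array([1, 1, 1])]
```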
The randomized singular value decomposition (SVD) is a popular and effective algorithm for computing a near-best rank $k$ approximation of a matrix $A$ using matrix-vector products with standard Gaussian vectors. Here, we generalize the randomized SVD to multivariate Gaussian vectors, allowing one to incorporate prior knowledge of $A$ into the algorithm. This enables us to explore the continuous analogue of the randomized SVD for Hilbert--Schmidt (HS) operators using operator-function products with functions drawn from a Gaussian process (GP). We then construct a new covariance kernel for GPs, based on weighted Jacobi polynomials, which allows us to rapidly sample the GP and control the smoothness of the randomly generated functions. Numerical examples on matrices and HS operators demonstrate the applicability of the algorithm.
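A minimal NumPy sketch of this generalization, assuming explicit access to $A$: the test vectors are drawn from $N(0, C)$ via a Cholesky factor of a user-supplied covariance $C$ encoding prior knowledge (with $C = I$ recovering the standard randomized SVD). The function name, the jitter, and the example covariance are illustrative, not the authors' implementation.

```python
import numpy as np

def randomized_svd_with_prior(A, k, C, oversample=10, rng=None):
    """Rank-k randomized SVD that sketches A with correlated Gaussian test vectors.

    C is an n x n covariance matrix encoding prior knowledge about A;
    C = I recovers the standard randomized SVD with i.i.d. Gaussian sketches.
    """
    rng = np.random.default_rng(rng)
    m, n = A.shape
    L = np.linalg.cholesky(C + 1e-12 * np.eye(n))   # small jitter for stability
    Omega = L @ rng.standard_normal((n, k + oversample))   # columns ~ N(0, C)
    Y = A @ Omega                          # matrix-vector products with A
    Q, _ = np.linalg.qr(Y)                 # orthonormal basis for the range sketch
    B = Q.T @ A                            # small (k + oversample) x n matrix
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :k], s[:k], Vt[:k, :]

# Example: a smooth 200 x 200 matrix, sketched with a covariance that
# correlates nearby columns (an illustrative smoothness prior).
n = 200
x = np.linspace(0, 1, n)
A = np.exp(-np.subtract.outer(x, x) ** 2 / 0.1)
C = np.exp(-np.abs(np.subtract.outer(x, x)) / 0.2)
U, s, Vt = randomized_svd_with_prior(A, k=10, C=C)
print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))
```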
For a map (function) $F:\ftwo^n\rightarrow\ftwo^n$ and a given $y$ in the image of $F$, the problem of \emph{local inversion} of $F$ is to find all inverse images $x$ in $\ftwo^n$ such that $y=F(x)$. In cryptology, this problem arises in the cryptanalysis of one-way functions (OWFs). The well-known TMTO attack is a probabilistic algorithm that computes one solution of the local inversion problem using computation of order $O(\sqrt N)$, both offline and online, for $N=2^n$. This paper proposes a complete algorithm for solving the local inversion problem that uses the linear complexity of an associated sequence to find the unique solution lying in a periodic orbit. The algorithm requires an offline computation to solve a hard problem (possibly requiring exponential computation) and an online computation, dependent on $y$, consisting of repeated forward evaluations $F(x)$ at points $x$ in $\ff_{2^n}$, each evaluation taking polynomial time. The forward evaluation is repeated at most as many times as the linear complexity of the sequence $\{y,F(y),\ldots\}$ to obtain one possible solution when this sequence is periodic. All other solutions are obtained in chains $\{e,F(e),\ldots\}$ for points $e$ in the Garden of Eden (GOE) of the map $F$. Hence a solution $x$ exists iff either the former sequence is periodic or a solution occurs in a chain starting from a point in the GOE. The online computation is thus polynomial time, $O(L^k)$ in the linear complexity $L$ of the sequence, to compute one possible solution in a periodic orbit, or $O(l)$ in the chain length $l$ for fixed $n$. Hence this is a complete algorithm for finding all rational solutions $x$ of the equation $F(x)=y$ for a given $y$ and a map $F$ on $\ff_{2^n}$.
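A toy sketch of the periodic-orbit idea, with $F$ acting on $\{0,\dots,2^n-1\}$: if $y$ lies on a periodic orbit, iterating $F$ from $y$ eventually returns to $y$, and the point visited just before the return is one preimage of $y$. The sketch iterates up to the full period rather than stopping at the linear-complexity bound of the paper, and it omits the GOE chains that yield the remaining preimages.

```python
def invert_on_periodic_orbit(F, y, max_steps):
    """If y lies on a periodic orbit of F, return the predecessor of y on that
    orbit, which is one preimage x with F(x) = y; otherwise return None."""
    prev, cur = y, F(y)
    for _ in range(max_steps):
        if cur == y:          # the orbit closed up, so F(prev) = y
            return prev
        prev, cur = cur, F(cur)
    return None

# Example on F_2^3 encoded as the integers 0..7: F(x) = (x + 3) mod 8 is a
# permutation, so every point lies on a periodic orbit and every y has a preimage.
F = lambda x: (x + 3) % 8
y = 5
x = invert_on_periodic_orbit(F, y, max_steps=8)
print(x, F(x) == y)   # 2 True
```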
We study the Bahadur efficiency of several weighted L2-type goodness-of-fit tests based on the empirical characteristic function. The methods considered are for normality and exponentiality testing, and for testing goodness-of-fit to the logistic distribution. Our results are helpful in deciding which specific test a potential practitioner should apply. For the celebrated BHEP and energy tests for normality we obtain novel efficiency results, some of them in the multivariate case, while in the case of the logistic distribution this is the first time that efficiencies are computed for any composite goodness-of-fit test.
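For concreteness, a univariate sketch of a weighted L2-type statistic of this kind: it measures the distance between the empirical characteristic function of standardized data and the standard normal characteristic function, against a Gaussian weight, evaluated by simple quadrature. The weight and constants are illustrative and do not reproduce the exact BHEP scaling.

```python
import numpy as np

def weighted_l2_cf_statistic(x, beta=1.0, t_max=10.0, n_grid=2001):
    """Weighted L2 distance between the empirical characteristic function of
    standardized data and the standard normal CF, with weight exp(-beta * t^2)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    y = (x - x.mean()) / x.std(ddof=1)                   # standardize the sample
    t = np.linspace(-t_max, t_max, n_grid)
    phi_n = np.exp(1j * np.outer(t, y)).mean(axis=1)     # empirical CF
    phi_0 = np.exp(-t ** 2 / 2)                          # standard normal CF
    integrand = np.abs(phi_n - phi_0) ** 2 * np.exp(-beta * t ** 2)
    dt = t[1] - t[0]
    return n * integrand.sum() * dt                      # simple quadrature

rng = np.random.default_rng(0)
print(weighted_l2_cf_statistic(rng.standard_normal(500)))   # small under normality
print(weighted_l2_cf_statistic(rng.exponential(size=500)))  # larger under a skewed alternative
```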
We study population protocols, a model of distributed computing appropriate for modeling well-mixed chemical reaction networks and other physical systems where agents exchange information in pairwise interactions, but have no control over their schedule of interaction partners. The well-studied *majority* problem is that of determining, in an initial population of $n$ agents each with one of two opinions $\mathsf{A}$ or $\mathsf{B}$, whether there are more $\mathsf{A}$, more $\mathsf{B}$, or a tie. A *stable* protocol solves this problem with probability 1 by eventually entering a configuration in which all agents agree on a correct consensus decision of $\mathsf{A}$, $\mathsf{B}$, or $\mathsf{T}$, from which the consensus cannot change. We describe a protocol that solves this problem using $O(\log n)$ states ($\log \log n + O(1)$ bits of memory) and optimal expected time $O(\log n)$. The number of states $O(\log n)$ is known to be optimal for the class of polylogarithmic-time stable protocols that are "output dominant" and "monotone". These are two natural constraints satisfied by our protocol, making it simultaneously time- and state-optimal for that class. We introduce a key technique called a "fixed resolution clock" to achieve partial synchronization. Our protocol is *nonuniform*: the transition function has the value $\left\lceil \log n \right\rceil$ encoded in it. We show that the protocol can be modified to be uniform, while increasing the state complexity to $\Theta(\log n \log \log n)$.
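For contrast, the following is a simulation sketch of the classical 4-state exact-majority protocol (strong opinions cancel, surviving strong agents convert weak ones). It decides the majority when the counts differ but cannot report ties, and it is *not* the $O(\log n)$-state, $O(\log n)$-time protocol described above; it only illustrates the problem setting.

```python
import random

def majority_4_state(opinions, max_interactions=200_000, rng=None):
    """Simulate the classical 4-state exact-majority protocol on states A, B, a, b.
    A and B cancel to a and b; surviving strong agents convert weak agents.
    Cannot detect ties, and is not the O(log n)-state protocol of the paper."""
    rng = rng or random.Random(0)
    state = list(opinions)
    n = len(state)
    for _ in range(max_interactions):
        i, j = rng.sample(range(n), 2)
        u, v = state[i], state[j]
        if u == "A" and v == "B":
            state[i], state[j] = "a", "b"      # cancellation
        elif u == "B" and v == "A":
            state[i], state[j] = "b", "a"
        elif u == "A" and v == "b":
            state[j] = "a"                     # strong A converts weak b
        elif u == "B" and v == "a":
            state[j] = "b"                     # strong B converts weak a
        elif u == "b" and v == "A":
            state[i] = "a"
        elif u == "a" and v == "B":
            state[i] = "b"
    return {"A" if s in ("A", "a") else "B" for s in state}

# 60 A's vs 40 B's: with overwhelming probability all agents output the A side.
print(majority_4_state(["A"] * 60 + ["B"] * 40))   # {'A'}
```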
We study sparse linear regression over a network of agents, modeled as an undirected graph with no server node. The estimation of the $s$-sparse parameter is formulated as a constrained LASSO problem wherein each agent owns a subset of the $N$ total observations. We analyze the convergence rate and statistical guarantees of a distributed projected gradient tracking-based algorithm under high-dimensional scaling, allowing the ambient dimension $d$ to grow with (and possibly exceed) the sample size $N$. Our theory shows that, under standard notions of restricted strong convexity and smoothness of the loss functions and suitable conditions on the network connectivity and algorithm tuning, the distributed algorithm converges globally at a {\it linear} rate to an estimate that is within the centralized {\it statistical precision} of the model, $O(s\log d/N)$. When $s\log d/N=o(1)$, a condition necessary for statistical consistency, an $\varepsilon$-optimal solution is attained after $\mathcal{O}(\kappa \log (1/\varepsilon))$ gradient computations and $\mathcal{O}(\kappa/(1-\rho) \log (1/\varepsilon))$ communication rounds, where $\kappa$ is the restricted condition number of the loss function and $\rho$ measures the network connectivity. The computation cost matches that of the centralized projected gradient algorithm despite the data being distributed, whereas the number of communication rounds decreases as the network connectivity improves. Overall, our study reveals interesting connections between statistical efficiency, network connectivity and topology, and convergence rate in high dimensions.
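A schematic NumPy sketch of projected gradient tracking on a plain local least-squares loss with an $\ell_1$-ball constraint, assuming a doubly stochastic mixing matrix $W$ supported on the graph; the loss, constraint radius, and tuning below are illustrative stand-ins for the constrained LASSO formulation and conditions analyzed in the paper.

```python
import numpy as np

def project_l1_ball(v, r):
    """Euclidean projection of v onto the l1-ball of radius r (sort-and-threshold)."""
    if np.abs(v).sum() <= r:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    k = np.nonzero(u * np.arange(1, len(v) + 1) > css - r)[0][-1]
    tau = (css[k] - r) / (k + 1)
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def distributed_projected_gradient_tracking(A_list, b_list, W, radius, step, iters):
    """Sketch: agent i holds (A_i, b_i), minimizes the average least-squares loss
    subject to ||x||_1 <= radius. W is a doubly stochastic mixing matrix matching
    the graph. Illustrative only; not the exact algorithm or tuning of the paper."""
    m = len(A_list)
    d = A_list[0].shape[1]
    grad = lambda i, x: A_list[i].T @ (A_list[i] @ x - b_list[i]) / len(b_list[i])
    X = np.zeros((m, d))                                   # local estimates
    Y = np.array([grad(i, X[i]) for i in range(m)])        # gradient trackers
    G_old = Y.copy()
    for _ in range(iters):
        # consensus step, gradient step, then projection onto the l1-ball
        X_new = np.array([project_l1_ball(row, radius) for row in (W @ X - step * Y)])
        G_new = np.array([grad(i, X_new[i]) for i in range(m)])
        Y = W @ Y + G_new - G_old                          # track the average gradient
        X, G_old = X_new, G_new
    return X.mean(axis=0)
```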
A determinantal point process (DPP) on a collection of $M$ items is a model, parameterized by a symmetric kernel matrix, that assigns a probability to every subset of those items. Recent work shows that removing the kernel symmetry constraint, yielding nonsymmetric DPPs (NDPPs), can lead to significant predictive performance gains for machine learning applications. However, existing work leaves open the question of scalable NDPP sampling. There is only one known DPP sampling algorithm, based on Cholesky decomposition, that can be applied directly to NDPPs as well. Unfortunately, its runtime is cubic in $M$, and thus it does not scale to large item collections. In this work, we first note that this algorithm can be transformed into a linear-time one for kernels with low-rank structure. Furthermore, we develop a scalable sublinear-time rejection sampling algorithm by constructing a novel proposal distribution. Additionally, we show that imposing certain structural constraints on the NDPP kernel enables us to bound the rejection rate in a way that depends only on the kernel rank. In our experiments we compare the speed of all of these samplers for a variety of real-world tasks.
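For reference, a plain $O(M^3)$ sketch of the sequential, Cholesky-style sampler given a marginal kernel $K$ with $P(A\subseteq Y)=\det(K_A)$; this elimination scheme is the cubic-time baseline that also works without symmetry, and it is not the low-rank linear-time or rejection samplers developed in the paper. The example builds $K$ from a symmetric L-ensemble purely for simplicity.

```python
import numpy as np

def cholesky_style_dpp_sample(K, rng=None):
    """Sequential sampler for a DPP given its marginal kernel K.

    Item by item, include item i with probability K[i, i], then condition the
    remaining items on that decision via a Schur-complement update. Cubic in M.
    """
    rng = np.random.default_rng(rng)
    K = np.array(K, dtype=float, copy=True)
    M = K.shape[0]
    sample = []
    for i in range(M):
        if rng.random() < K[i, i]:
            sample.append(i)
        else:
            K[i, i] -= 1.0            # condition on excluding item i
        # update the kernel of the remaining items given the decision on i
        K[i + 1:, i + 1:] -= np.outer(K[i + 1:, i], K[i, i + 1:]) / K[i, i]
    return sample

# Example: marginal kernel K = L (L + I)^{-1} from a small symmetric L-ensemble.
rng = np.random.default_rng(1)
B = rng.standard_normal((6, 3))
L = B @ B.T
K = L @ np.linalg.inv(L + np.eye(6))
print(cholesky_style_dpp_sample(K, rng=0))
```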
We prove upper and lower bounds on the minimal spherical dispersion, improving upon previous estimates obtained by Rote and Tichy [Spherical dispersion with an application to polygonal approximation of curves, Anz. \"Osterreich. Akad. Wiss. Math.-Natur. Kl. 132 (1995), 3--10]. In particular, we see that the inverse $N(\varepsilon,d)$ of the minimal spherical dispersion is, for fixed $\varepsilon>0$, linear in the dimension $d$ of the ambient space. We also derive upper and lower bounds on the expected dispersion for points chosen independently and uniformly at random from the Euclidean unit sphere. In terms of the corresponding inverse $\widetilde{N}(\varepsilon,d)$, our bounds are optimal with respect to the dependence on $\varepsilon$.
Study of the interaction between computation and society often focuses on how researchers model social and physical systems in order to specify problems and propose solutions. However, the social effects of computing can depend just as much on obscure and opaque technical caveats, choices, and qualifiers. These artifacts are products of the particular algorithmic techniques and theory applied to solve a problem once it has been modeled, and their nature can imperil thorough sociotechnical scrutiny of the often discretionary decisions made to manage them. We describe three classes of objects used to encode these choices and qualifiers: heuristic models, assumptions, and parameters, and we discuss the selection of parameters in differential privacy as an illustrative example. We raise six reasons these objects may be hazardous to comprehensive analysis of computing and argue that they deserve deliberate consideration as researchers explain scientific work.
We study the problem of learning in the stochastic shortest path (SSP) setting, where an agent seeks to minimize the expected cost accumulated before reaching a goal state. We design a novel model-based algorithm EB-SSP that carefully skews the empirical transitions and perturbs the empirical costs with an exploration bonus to guarantee both optimism and convergence of the associated value iteration scheme. We prove that EB-SSP achieves the minimax regret rate $\widetilde{O}(B_{\star} \sqrt{S A K})$, where $K$ is the number of episodes, $S$ is the number of states, $A$ is the number of actions and $B_{\star}$ bounds the expected cumulative cost of the optimal policy from any state, thus closing the gap with the lower bound. Interestingly, EB-SSP obtains this result while being parameter-free, i.e., it does not require any prior knowledge of $B_{\star}$, nor of $T_{\star}$ which bounds the expected time-to-goal of the optimal policy from any state. Furthermore, we illustrate various cases (e.g., positive costs, or general costs when an order-accurate estimate of $T_{\star}$ is available) where the regret only contains a logarithmic dependence on $T_{\star}$, thus yielding the first horizon-free regret bound beyond the finite-horizon MDP setting.
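A toy sketch of the overall shape of such a computation: build an empirical SSP model from counts, skew each empirical transition slightly toward the goal, subtract a simple $1/\sqrt{n}$ bonus from the empirical costs, and run value iteration. The specific skewing and bonus used by EB-SSP differ (they are chosen to guarantee optimism and convergence of the value iteration scheme); everything below, including the phantom-visit skewing and the bonus scale, is illustrative.

```python
import numpy as np

def optimistic_ssp_value_iteration(counts, cost_sum, goal, bonus_scale=1.0,
                                   iters=2000, tol=1e-8):
    """Toy optimistic value iteration for an empirical stochastic shortest path model.

    counts[s, a, s'] holds transition counts and cost_sum[s, a] accumulated costs.
    Transitions are skewed toward the goal by one phantom visit, and a 1/sqrt(n)
    bonus is subtracted from the empirical costs (clipped at zero).
    """
    S, A, _ = counts.shape
    n = counts.sum(axis=2)                       # visits to each (s, a)
    n_safe = np.maximum(n, 1)
    e_goal = np.zeros(S)
    e_goal[goal] = 1.0
    # skewed empirical transitions: one phantom transition to the goal state
    P_tilde = (counts + e_goal[None, None, :]) / (n[:, :, None] + 1.0)
    # optimistic costs: empirical mean minus an exploration bonus
    c_tilde = np.maximum(cost_sum / n_safe - bonus_scale / np.sqrt(n_safe), 0.0)
    V = np.zeros(S)
    for _ in range(iters):
        Q = c_tilde + (P_tilde * V[None, None, :]).sum(axis=2)
        V_new = Q.min(axis=1)
        V_new[goal] = 0.0                        # the goal is cost-free and terminal
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    return V, Q.argmin(axis=1)                   # optimistic values and greedy policy
```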
We propose a new method of estimation in topic models that is not a variation on the existing simplex-finding algorithms and that estimates the number of topics $K$ from the observed data. We derive new finite-sample minimax lower bounds for the estimation of $A$, as well as new upper bounds for our proposed estimator. We describe the scenarios in which our estimator is minimax adaptive. Our finite-sample analysis is valid for any number of documents ($n$), individual document length ($N_i$), dictionary size ($p$) and number of topics ($K$), and both $p$ and $K$ are allowed to increase with $n$, a situation not handled well by previous analyses. We complement our theoretical results with a detailed simulation study. We illustrate that the new algorithm is faster and more accurate than the current ones, even though it starts with the computational and theoretical disadvantage of not knowing the correct number of topics $K$, while the competing methods are provided with the correct value in our simulations.