{mayi_des}
The main theme of this paper is using $k$-dimensional generalizations of the combinatorial Boolean Matrix Multiplication (BMM) hypothesis and the closely-related Online Matrix Vector Multiplication (OMv) hypothesis to prove new tight conditional lower bounds for dynamic problems. The combinatorial $k$-Clique hypothesis, which is a standard hypothesis in the literature, naturally generalizes the combinatorial BMM hypothesis. In this paper, we prove tight lower bounds for several dynamic problems under the combinatorial $k$-Clique hypothesis. For instance, we show that: * The Dynamic Range Mode problem has no combinatorial algorithms with $\mathrm{poly}(n)$ pre-processing time, $O(n^{2/3-\epsilon})$ update time and $O(n^{2/3-\epsilon})$ query time for any $\epsilon > 0$, matching the known upper bounds for this problem. Previous lower bounds only ruled out algorithms with $O(n^{1/2-\epsilon})$ update and query time under the OMv hypothesis. Other examples include tight combinatorial lower bounds for Dynamic Subgraph Connectivity, Dynamic 2D Orthogonal Range Color Counting, Dynamic 2-Pattern Document Retrieval, and Dynamic Range Mode in higher dimensions. Furthermore, we propose the OuMv$_k$ hypothesis as a natural generalization of the OMv hypothesis. Under this hypothesis, we prove tight lower bounds for various dynamic problems. For instance, we show that: * The Dynamic Skyline Points Counting problem in $(2k-1)$-dimensional space has no algorithm with $\mathrm{poly}(n)$ pre-processing time and $O(n^{1-1/k-\epsilon})$ update and query time for $\epsilon > 0$, even if the updates are semi-online. Other examples include tight conditional lower bounds for (semi-online) Dynamic Klee's measure for unit cubes, and high-dimensional generalizations of Erickson's problem and Langerman's problem.
We consider the question of adaptive data analysis within the framework of convex optimization. We ask how many samples are needed in order to compute $\epsilon$-accurate estimates of $O(1/\epsilon^2)$ gradients queried by gradient descent, and we provide two intermediate answers to this question. First, we show that for a general analyst (not necessarily gradient descent) $\Omega(1/\epsilon^3)$ samples are required. This rules out the possibility of a foolproof mechanism. Our construction builds upon a new lower bound (that may be of interest of its own right) for an analyst that may ask several non adaptive questions in a batch of fixed and known $T$ rounds of adaptivity and requires a fraction of true discoveries. We show that for such an analyst $\Omega (\sqrt{T}/\epsilon^2)$ samples are necessary. Second, we show that, under certain assumptions on the oracle, in an interaction with gradient descent $\tilde \Omega(1/\epsilon^{2.5})$ samples are necessary. Our assumptions are that the oracle has only \emph{first order access} and is \emph{post-hoc generalizing}. First order access means that it can only compute the gradients of the sampled function at points queried by the algorithm. Our assumption of \emph{post-hoc generalization} follows from existing lower bounds for statistical queries. More generally then, we provide a generic reduction from the standard setting of statistical queries to the problem of estimating gradients queried by gradient descent. These results are in contrast with classical bounds that show that with $O(1/\epsilon^2)$ samples one can optimize the population risk to accuracy of $O(\epsilon)$ but, as it turns out, with spurious gradients.
In this paper we get error bounds for fully discrete approximations of infinite horizon problems via the dynamic programming approach. It is well known that considering a time discretization with a positive step size $h$ an error bound of size $h$ can be proved for the difference between the value function (viscosity solution of the Hamilton-Jacobi-Bellman equation corresponding to the infinite horizon) and the value function of the discrete time problem. However, including also a spatial discretization based on elements of size $k$ an error bound of size $O(k/h)$ can be found in the literature for the error between the value functions of the continuous problem and the fully discrete problem. In this paper we revise the error bound of the fully discrete method and prove, under similar assumptions to those of the time discrete case, that the error of the fully discrete case is in fact $O(h+k)$ which gives first order in time and space for the method. This error bound matches the numerical experiments of many papers in the literature in which the behaviour $1/h$ from the bound $O(k/h)$ have not been observed.
Given a set $P$ of $n$ points in the plane, the $k$-center problem is to find $k$ congruent disks of minimum possible radius such that their union covers all the points in $P$. The $2$-center problem is a special case of the $k$-center problem that has been extensively studied in the recent past \cite{CAHN,HT,SH}. In this paper, we consider a generalized version of the $2$-center problem called \textit{proximity connected} $2$-center (PCTC) problem. In this problem, we are also given a parameter $\delta\geq 0$ and we have the additional constraint that the distance between the centers of the disks should be at most $\delta$. Note that when $\delta=0$, the PCTC problem is reduced to the $1$-center(minimum enclosing disk) problem and when $\delta$ tends to infinity, it is reduced to the $2$-center problem. The PCTC problem first appeared in the context of wireless networks in 1992 \cite{ACN0}, but obtaining a nontrivial deterministic algorithm for the problem remained open. In this paper, we resolve this open problem by providing a deterministic $O(n^2\log n)$ time algorithm for the problem.
We study the problem of testing whether a function $f: \mathbb{R}^n \to \mathbb{R}$ is a polynomial of degree at most $d$ in the \emph{distribution-free} testing model. Here, the distance between functions is measured with respect to an unknown distribution $\mathcal{D}$ over $\mathbb{R}^n$ from which we can draw samples. In contrast to previous work, we do not assume that $\mathcal{D}$ has finite support. We design a tester that given query access to $f$, and sample access to $\mathcal{D}$, makes $(d/\varepsilon)^{O(1)}$ many queries to $f$, accepts with probability $1$ if $f$ is a polynomial of degree $d$, and rejects with probability at least $2/3$ if every degree-$d$ polynomial $P$ disagrees with $f$ on a set of mass at least $\varepsilon$ with respect to $\mathcal{D}$. Our result also holds under mild assumptions when we receive only a polynomial number of bits of precision for each query to $f$, or when $f$ can only be queried on rational points representable using a logarithmic number of bits. Along the way, we prove a new stability theorem for multivariate polynomials that may be of independent interest.
We study the distributed minimum spanning tree (MST) problem, a fundamental problem in distributed computing. It is well-known that distributed MST can be solved in $\tilde{O}(D+\sqrt{n})$ rounds in the standard CONGEST model (where $n$ is the network size and $D$ is the network diameter) and this is essentially the best possible round complexity (up to logarithmic factors). However, in resource-constrained networks such as ad hoc wireless and sensor networks, nodes spending so much time can lead to significant spending of resources such as energy. Motivated by the above consideration, we study distributed algorithms for MST under the \emph{sleeping model} [Chatterjee et al., PODC 2020], a model for design and analysis of resource-efficient distributed algorithms. In the sleeping model, a node can be in one of two modes in any round -- \emph{sleeping} or \emph{awake} (unlike the traditional model where nodes are always awake). Only the rounds in which a node is \emph{awake} are counted, while \emph{sleeping} rounds are ignored. A node spends resources only in the awake rounds and hence the main goal is to minimize the \emph{awake complexity} of a distributed algorithm, the worst-case number of rounds any node is awake. We present deterministic and randomized distributed MST algorithms that have an \emph{optimal} awake complexity of $O(\log n)$ time with a matching lower bound. We also show that our randomized awake-optimal algorithm has essentially the best possible round complexity by presenting a lower bound of $\tilde{\Omega}(n)$ on the product of the awake and round complexity of any distributed algorithm (including randomized) that outputs an MST, where $\tilde{\Omega}$ hides a $1/(\text{polylog } n)$ factor.
Computing a dense subgraph is a fundamental problem in graph mining, with a diverse set of applications ranging from electronic commerce to community detection in social networks. In many of these applications, the underlying context is better modelled as a weighted hypergraph that keeps evolving with time. This motivates the problem of maintaining the densest subhypergraph of a weighted hypergraph in a {\em dynamic setting}, where the input keeps changing via a sequence of updates (hyperedge insertions/deletions). Previously, the only known algorithm for this problem was due to Hu et al. [HWC17]. This algorithm worked only on unweighted hypergraphs, and had an approximation ratio of $(1+\epsilon)r^2$ and an update time of $O(\text{poly} (r, \log n))$, where $r$ denotes the maximum rank of the input across all the updates. We obtain a new algorithm for this problem, which works even when the input hypergraph is weighted. Our algorithm has a significantly improved (near-optimal) approximation ratio of $(1+\epsilon)$ that is independent of $r$, and a similar update time of $O(\text{poly} (r, \log n))$. It is the first $(1+\epsilon)$-approximation algorithm even for the special case of weighted simple graphs. To complement our theoretical analysis, we perform experiments with our dynamic algorithm on large-scale, real-world data-sets. Our algorithm significantly outperforms the state of the art [HWC17] both in terms of accuracy and efficiency.
Let $X^{(n)}$ be an observation sampled from a distribution $P_{\theta}^{(n)}$ with an unknown parameter $\theta,$ $\theta$ being a vector in a Banach space $E$ (most often, a high-dimensional space of dimension $d$). We study the problem of estimation of $f(\theta)$ for a functional $f:E\mapsto {\mathbb R}$ of some smoothness $s>0$ based on an observation $X^{(n)}\sim P_{\theta}^{(n)}.$ Assuming that there exists an estimator $\hat \theta_n=\hat \theta_n(X^{(n)})$ of parameter $\theta$ such that $\sqrt{n}(\hat \theta_n-\theta)$ is sufficiently close in distribution to a mean zero Gaussian random vector in $E,$ we construct a functional $g:E\mapsto {\mathbb R}$ such that $g(\hat \theta_n)$ is an asymptotically normal estimator of $f(\theta)$ with $\sqrt{n}$ rate provided that $s>\frac{1}{1-\alpha}$ and $d\leq n^{\alpha}$ for some $\alpha\in (0,1).$ We also derive general upper bounds on Orlicz norm error rates for estimator $g(\hat \theta)$ depending on smoothness $s,$ dimension $d,$ sample size $n$ and the accuracy of normal approximation of $\sqrt{n}(\hat \theta_n-\theta).$ In particular, this approach yields asymptotically efficient estimators in some high-dimensional exponential models.
Generalized pair weights of linear codes are generalizations of minimum symbol-pair weights, which were introduced by Liu and Pan \cite{LP} recently. Generalized pair weights can be used to characterize the ability of protecting information in the symbol-pair read wire-tap channels of type II. In this paper, we introduce the notion of generalized $b$-symbol weights of linear codes over finite fields, which is a generalization of generalized Hamming weights and generalized pair weights. We obtain some basic properties and bounds of generalized $b$-symbol weights which are called Singleton-like bounds for generalized $b$-symbol weights. As examples, we calculate generalized weight matrices for simplex codes and Hamming codes. We provide a necessary and sufficient condition for a linear code to be a $b$-symbol MDS code by using the generator matrix and the parity check matrix of this linear code. Finally, a necessary and sufficient condition of a linear isomorphism preserving $b$-symbol weights between two linear codes is obtained. As a corollary, we get the classical MacWilliams extension theorem when $b=1$.
Universal coding of integers~(UCI) is a class of variable-length code, such that the ratio of the expected codeword length to $\max\{1,H(P)\}$ is within a constant factor, where $H(P)$ is the Shannon entropy of the decreasing probability distribution $P$. However, if we consider the ratio of the expected codeword length to $H(P)$, the ratio tends to infinity by using UCI, when $H(P)$ tends to zero. To solve this issue, this paper introduces a class of codes, termed generalized universal coding of integers~(GUCI), such that the ratio of the expected codeword length to $H(P)$ is within a constant factor $K$. First, the definition of GUCI is proposed and the coding structure of GUCI is introduced. Next, we propose a class of GUCI $\mathcal{C}$ to achieve the expansion factor $K_{\mathcal{C}}=2$ and show that the optimal GUCI is in the range $1\leq K_{\mathcal{C}}^{*}\leq 2$. Then, by comparing UCI and GUCI, we show that when the entropy is very large or $P(0)$ is not large, there are also cases where the average codeword length of GUCI is shorter. Finally, the asymptotically optimal GUCI is presented.
Binding operation is fundamental to many cognitive processes, such as cognitive map formation, relational reasoning, and language comprehension. In these processes, two different modalities, such as location and objects, events and their contextual cues, and words and their roles, need to be bound together, but little is known about the underlying neural mechanisms. Previous works introduced a binding model based on quadratic functions of bound pairs, followed by vector summation of multiple pairs. Based on this framework, we address following questions: Which classes of quadratic matrices are optimal for decoding relational structures? And what is the resultant accuracy? We introduce a new class of binding matrices based on a matrix representation of octonion algebra, an eight-dimensional extension of complex numbers. We show that these matrices enable a more accurate unbinding than previously known methods when a small number of pairs are present. Moreover, numerical optimization of a binding operator converges to this octonion binding. We also show that when there are a large number of bound pairs, however, a random quadratic binding performs as well as the octonion and previously-proposed binding methods. This study thus provides new insight into potential neural mechanisms of binding operations in the brain.