Consider the communication-constrained estimation of discrete distributions under $\ell^p$ losses, where each distributed terminal holds multiple independent samples and uses a limited number of bits to describe them. We obtain the minimax-optimal rates of the problem in most parameter regimes, and clearly identify an elbow effect of the optimal rates at $p=2$. To establish the optimal rates, we first design estimation protocols that achieve them. The key ingredient of these protocols is an adaptive refinement mechanism: a rough estimate is first generated from partial information, and refined estimates are then established in subsequent steps guided by the rough one. The protocols leverage successive refinement, sample compression, thresholding, and random hashing to achieve the optimal rates in different parameter regimes. The optimality of the protocols is shown by deriving matching minimax lower bounds.
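As a toy illustration of the random-hashing ingredient (a didactic sketch of one simple one-bit scheme, not the paper's multi-sample, multi-regime protocols; the one-sample-per-terminal setup and the subset matrix \texttt{H} are illustrative assumptions), each terminal can report whether its sample lands in a terminal-specific random subset of the alphabet, and the server can then recover the distribution by least squares:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
k, T = 8, 200_000                     # alphabet size, number of terminals
p = rng.dirichlet(np.ones(k))         # unknown distribution on [k]
X = rng.choice(k, size=T, p=p)        # one sample per terminal (toy setup)
H = rng.integers(0, 2, size=(T, k))   # row i encodes terminal i's random subset
bits = H[np.arange(T), X]             # the one transmitted bit: 1{X_i in S_i}
# E[bit_i] = (H p)_i, so p can be recovered by linear least squares.
p_hat, *_ = np.linalg.lstsq(H.astype(float), bits.astype(float), rcond=None)
print(np.round(p_hat, 3))             # close to p, though not exactly normalized
\end{verbatim}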
Let $p$ be an odd prime. In this paper, we determine the Hamming distances of constacyclic codes of length $2p^s$ over the finite commutative non-chain ring $\mathcal{R}=\frac{\mathbb{F}_{p^m}[u, v]}{\langle u^2, v^2, uv-vu\rangle}$. We also completely determine their symbol-pair distances.
In this paper, we develop an iterative method, based on the Bartels-Stewart algorithm, to solve $N$-dimensional matrix equations; it relies on the Schur decomposition of the matrices involved. We remark that, unlike other possible implementations of that algorithm, ours avoids recursion and makes efficient use of the available computational resources, which enables considering arbitrarily large problems, up to the size of the available memory; in this respect, we have successfully solved matrix equations in up to $N = 29$ dimensions. We explain all the steps carefully and calculate the required computational cost accurately. Furthermore, to ease understanding, we offer both pseudocodes and full Matlab codes, and special emphasis is put on making the implementation of the method completely independent of the number of dimensions. As an important application, the method allows computing the solution of linear $N$-dimensional systems of ODEs with constant coefficients at any time $t$, and hence of evolutionary PDEs, after discretizing the spatial derivatives by means of matrices. In this regard, we are able to compute with great accuracy the solution of an advection-diffusion equation on $\mathbb R^N$.
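For orientation, the 2-D core of the Bartels-Stewart idea (Schur-reduce, solve a triangular system column by column, transform back) can be sketched in a few lines of Python for the Sylvester equation $AX + XB = C$; this is an illustrative sketch of the classical algorithm, not the authors' $N$-dimensional Matlab implementation:

\begin{verbatim}
import numpy as np
from scipy.linalg import schur, solve_triangular

def sylvester_bs(A, B, C):
    """Solve A X + X B = C via complex Schur forms (2-D Bartels-Stewart)."""
    TA, U = schur(A, output='complex')   # A = U TA U^H, TA upper triangular
    TB, V = schur(B, output='complex')   # B = V TB V^H
    F = U.conj().T @ C @ V               # transformed right-hand side
    Y = np.zeros(F.shape, dtype=complex)
    I = np.eye(TA.shape[0])
    for j in range(TB.shape[0]):         # forward sweep over columns of Y
        rhs = F[:, j] - Y[:, :j] @ TB[:j, j]
        Y[:, j] = solve_triangular(TA + TB[j, j] * I, rhs)
    return U @ Y @ V.conj().T            # back to the original coordinates
\end{verbatim}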
For a prime $p$, a restricted arithmetic progression in $\mathbb{F}_p^n$ is a triple of vectors $x, x+a, x+2a$ in which the common difference $a$ is a non-zero element of $\{0,1,2\}^n$. What is the size of the largest $A\subseteq \mathbb{F}_p^n$ that is free of restricted arithmetic progressions? We show that the density of any such set is at most $\frac{C}{(\log\log\log n)^c}$, where $c,C>0$ depend only on $p$, giving the first reasonable bounds for the density of such sets. Previously, the best known bound was $O(1/\log^{*} n)$, which follows from the density Hales-Jewett theorem.
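To make the definition concrete, here is a brute-force membership check for tiny parameters (a didactic sketch only; the paper proves density bounds and has no algorithmic component):

\begin{verbatim}
import itertools

def has_restricted_ap(A, p, n):
    """Does A (a collection of length-n tuples over F_p) contain some
    x, x+a, x+2a with a a non-zero element of {0,1,2}^n?"""
    S = set(A)
    for x in A:
        for a in itertools.product((0, 1, 2), repeat=n):
            if any(a) \
               and tuple((xi + ai) % p for xi, ai in zip(x, a)) in S \
               and tuple((xi + 2 * ai) % p for xi, ai in zip(x, a)) in S:
                return True
    return False
\end{verbatim}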
The number of independent sets in regular bipartite expander graphs can be efficiently approximated by expressing it as the partition function of a suitable polymer model and truncating its cluster expansion. While this approach has been extensively used for graphs, surprisingly little is known about analogous questions in the context of hypergraphs. In this work, we apply this method to asymptotically determine the number of independent sets in regular $k$-partite $k$-uniform hypergraphs which satisfy natural expansion properties. The resulting formula depends only on the local structure of the hypergraph, making it computationally efficient. In particular, we provide a simple closed-form expression for linear hypergraphs.
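For context, in the standard abstract polymer framework (stated here in its textbook form; the paper's specific polymer model is tailored to hypergraph independent sets) the partition function and its cluster expansion read
\[
\Xi(\mathcal{P}) \;=\; \sum_{\substack{\Gamma \subseteq \mathcal{P} \\ \text{pairwise compatible}}} \prod_{\gamma \in \Gamma} w_\gamma,
\qquad
\log \Xi(\mathcal{P}) \;=\; \sum_{\Gamma \in \mathcal{C}(\mathcal{P})} \phi(\Gamma) \prod_{\gamma \in \Gamma} w_\gamma,
\]
where $\mathcal{C}(\mathcal{P})$ denotes the clusters (ordered tuples of polymers whose incompatibility graph is connected) and $\phi$ is the Ursell function; truncating the second sum at bounded cluster size is what yields an efficient approximation.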
We characterise the behaviour of the maximum Diaconis-Ylvisaker prior penalized likelihood estimator in high-dimensional logistic regression, where the number of covariates is a fraction $\kappa \in (0,1)$ of the number of observations $n$, as $n \to \infty$. We derive the estimator's aggregate asymptotic behaviour under this proportional asymptotic regime, when covariates are independent normal random variables with mean zero and the linear predictor has asymptotic variance $\gamma^2$. From this foundation, we devise adjusted $Z$-statistics, penalized likelihood ratio statistics, and aggregate asymptotic results with arbitrary covariate covariance. While the maximum likelihood estimate asymptotically exists only for a narrow range of $(\kappa, \gamma)$ values, the maximum Diaconis-Ylvisaker prior penalized likelihood estimate not only always exists but is also directly computable using maximum likelihood routines. Thus, our asymptotic results also hold for $(\kappa, \gamma)$ values where results for maximum likelihood are not attainable, with no overhead in implementation or computation. We study the estimator's shrinkage properties, compare it to alternative estimation methods that can operate under proportional asymptotics, and present procedures for the estimation of unknown constants that describe the asymptotic behaviour of our estimator. We also provide a conjecture about the behaviour of our estimator when an intercept parameter is present in the model. We present results from extensive numerical studies to demonstrate the theoretical advances and strong evidence to support the conjecture, and illustrate the methodology we put forward through the analysis of a real-world data set on digit recognition.
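To caricature why a maximum likelihood routine suffices (an assumption-laden sketch: we take the DY prior to be centred at response value $1/2$ with prior weight $\alpha$, contributing $\alpha(\eta_i/2 - \log(1+e^{\eta_i}))$ per observation to the objective; the paper's exact prior specification and tuning may differ), the penalized log-likelihood is then proportional to an ordinary logistic log-likelihood with responses shrunk toward $1/2$:

\begin{verbatim}
import numpy as np
import statsmodels.api as sm

def dy_penalized_logit(X, y, alpha):
    # Under the assumed prior form, the penalized objective equals
    # (1 + alpha) times the log-likelihood at shrunken pseudo-responses,
    # so a standard ML fit on y_tilde returns the penalized estimate.
    y_tilde = (np.asarray(y) + alpha / 2.0) / (1.0 + alpha)
    return sm.GLM(y_tilde, X, family=sm.families.Binomial()).fit()
\end{verbatim}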
Clustering is one of the staples of data analysis and unsupervised learning. As such, clustering algorithms are often used on massive data sets, and they need to be extremely fast. We focus on the Euclidean $k$-median and $k$-means problems, two of the standard ways to model the task of clustering. For these, the go-to algorithm is $k$-means++, which yields an $O(\log k)$-approximation in time $\tilde O(nkd)$. While it is possible to improve either the approximation factor [Lattanzi and Sohler, ICML19] or the running time [Cohen-Addad et al., NeurIPS 20], it is unknown how precise a linear-time algorithm can be. In this paper, we almost answer this question by presenting an almost linear-time algorithm to compute a constant-factor approximation.
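For reference, the $k$-means++ baseline mentioned above is the classic $D^2$-sampling seeding, sketched below in Python (the textbook routine, not the new almost linear-time algorithm):

\begin{verbatim}
import numpy as np

def kmeans_pp_seeding(X, k, seed=None):
    """Pick k centers; each new one is drawn with probability proportional
    to its squared distance from the nearest center chosen so far."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(X.shape[0])]]
    d2 = np.sum((X - centers[0]) ** 2, axis=1)
    for _ in range(k - 1):
        centers.append(X[rng.choice(X.shape[0], p=d2 / d2.sum())])
        d2 = np.minimum(d2, np.sum((X - centers[-1]) ** 2, axis=1))
    return np.array(centers)
\end{verbatim}

The seeding alone already yields the $O(\log k)$-approximation in expectation; subsequent Lloyd iterations can only improve the cost.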
Intelligent robots need to interact with diverse objects across various environments. The appearance and state of objects frequently undergo complex transformations depending on the object properties, e.g., phase transitions. However, in the vision community, segmenting dynamic objects with phase transitions has been overlooked. In light of this, we introduce the concept of phase in segmentation, which categorizes real-world objects based on their visual characteristics and potential morphological and appearance changes. Then, we present a new benchmark, Multi-Phase, Multi-Transition, and Multi-Scenery Video Object Segmentation (M3-VOS), to verify the ability of models to understand object phases; it consists of 479 high-resolution videos spanning over 10 distinct everyday scenarios. It provides dense instance mask annotations that capture both object phases and their transitions. We evaluate state-of-the-art methods on M3-VOS, yielding several key insights. Notably, current appearance-based approaches show significant room for improvement when handling objects with phase transitions. The inherent changes in disorder suggest that the predictive performance on the forward, entropy-increasing process can be improved through a reverse, entropy-reducing process. These findings lead us to propose ReVOS, a new plug-and-play model that improves its performance through reversal refinement. Our data and code will be publicly available.
Tensor data are multi-dimensional arrays. Low-rank decomposition-based regression methods with tensor predictors exploit the structural information in tensor predictors while significantly reducing the number of parameters in tensor regression. We propose a method named NA$_0$CT$^2$ (Noise Augmentation for $\ell_0$ regularization on Core Tensor in Tucker decomposition) to regularize the parameters in tensor regression (TR), coupled with Tucker decomposition. We establish theoretically that NA$_0$CT$^2$ achieves exact $\ell_0$ regularization on the core tensor from the Tucker decomposition in linear TR and generalized linear TR. To our knowledge, NA$_0$CT$^2$ is the first Tucker-decomposition-based regularization method in TR to achieve $\ell_0$ regularization on core tensors. NA$_0$CT$^2$ is implemented through an iterative procedure that involves two straightforward steps in each iteration: generating noisy data based on the core tensor from the Tucker decomposition of the updated parameter estimate, and running a regular GLM on the noise-augmented data with vectorized predictors. We demonstrate the implementation of NA$_0$CT$^2$ and its $\ell_0$ regularization effect in both simulation studies and real-data applications. The results suggest that NA$_0$CT$^2$ improves predictions compared to other decomposition-based TR approaches, with or without regularization, and that it identifies important predictors even though it is not designed for that purpose.
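For readers unfamiliar with the decomposition NA$_0$CT$^2$ builds on, the snippet below shows a plain Tucker factorization with the \texttt{tensorly} library (illustrative only: it covers the decomposition step, not the paper's noise-augmentation scheme):

\begin{verbatim}
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

# B is approximated by a small core G multiplied by a factor matrix
# along each mode; NA0CT2's l0 regularization targets the entries of G.
B = tl.tensor(np.random.default_rng(0).normal(size=(6, 5, 4)))
core, factors = tucker(B, rank=[3, 3, 2])
B_approx = tl.tucker_to_tensor((core, factors))
print(core.shape, [U.shape for U in factors])   # core (3, 3, 2) plus factors
\end{verbatim}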
We introduce and study a purely syntactic notion of lax cones and $(\infty,\infty)$-limits on finite computads in \texttt{CaTT}, a type theory for $(\infty,\infty)$-categories due to Finster and Mimram. Conveniently, finite computads are precisely the contexts in \texttt{CaTT}. We define a cone over a context to be a context obtained by induction over the list of variables of the underlying context. In the case where the underlying context is globular, we give an explicit description of the cone and conjecture that an analogous description holds for general contexts. We use the cone to control the types of the term constructors for the universal cone. The implementation of the universal property follows a similar line of ideas: starting with a cone as a context, a set of context extension rules produces a context with the shape of a transfor between cones, i.e.~a higher morphism between cones. As in the case of cones, we use this context as a template to control the types of the term constructors required for the universal property.
On general $n$-dimensional simplicial meshes, we propose a family of interior penalty nonconforming finite element methods for $2m$-th order partial differential equations, where $m \geq 0$ and $n \geq 1$. For this family of nonconforming finite elements, the shape function space consists of polynomials of degree not greater than $m$, making it minimal. This family of finite element spaces exhibits natural inclusion properties, analogous to those of the corresponding Sobolev spaces in the continuous case. By applying interior penalties to the bilinear form, we establish quasi-optimal error estimates in the energy norm. Owing to the weak continuity of the nonconforming finite element spaces, the interior penalty terms in the bilinear form take a simple form, and an interesting property is that the penalty parameter need only be a positive $\mathcal{O}(1)$ constant. These theoretical results are further validated by numerical tests.
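For orientation, in the lowest-order case $m=1$ an interior-penalty bilinear form of this type reads (a generic sketch; the precise form used for general $m$ may differ)
\[
a_h(u_h,v_h) \;=\; \sum_{T\in\mathcal{T}_h}\int_T \nabla u_h\cdot\nabla v_h\,dx \;+\; \eta\sum_{F\in\mathcal{F}_h} h_F^{-1}\int_F [\![u_h]\!]\,[\![v_h]\!]\,ds,
\]
where $[\![\cdot]\!]$ denotes the jump across a face $F$ of the mesh and, as stated above, the penalty parameter $\eta$ need only be a positive $\mathcal{O}(1)$ constant.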