The classical $k$-means clustering requires a complete data matrix without missing entries. As a natural extension of $k$-means clustering to missing data, $k$-POD clustering has been proposed, which ignores the missing entries in $k$-means clustering. This paper shows the inconsistency of $k$-POD clustering even under the missing completely at random mechanism. More specifically, the expected loss of $k$-POD clustering can be represented as a weighted sum of expected $k$-means losses, each computed on a subset of the variables. Thus, as the sample size goes to infinity, $k$-POD clustering converges to a clustering different from that of $k$-means clustering. This result indicates that even when $k$-means clustering works well, $k$-POD clustering may fail to capture the hidden cluster structure. On the other hand, for high-dimensional data, $k$-POD clustering could be a suitable choice when the missing rate in each variable is low.
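As an illustration of "ignoring the missing entries", the following is a minimal sketch of a $k$-POD-style empirical loss. It assumes missing entries are encoded as `None` and that the loss is averaged over rows; the function name `kpod_loss`, the encoding, and the normalization are illustrative choices, not taken from the paper.

```python
import math

def kpod_loss(X, centers):
    """Average, over rows, of the smallest squared distance to a center,
    where only the observed (non-None) coordinates of the row contribute."""
    total = 0.0
    for row in X:
        best = math.inf
        for c in centers:
            # Missing coordinates are skipped, so they add nothing to the distance.
            d = sum((x - m) ** 2 for x, m in zip(row, c) if x is not None)
            best = min(best, d)
        total += best
    return total / len(X)

# Toy data: two rows near (0, 0), two rows near (4, 4), some entries missing.
X = [
    [0.0, 0.0],
    [0.0, None],
    [4.0, 4.0],
    [None, 4.0],
]
centers = [[0.0, 0.0], [4.0, 3.0]]
loss = kpod_loss(X, centers)
print(loss)
```

Each row is compared to the centers only on the coordinates it actually has, so rows with different missing patterns are effectively scored by $k$-means losses on different subsets of variables.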
We investigate shift-invariant vectorial Boolean functions on $n$ bits that are induced from Boolean functions on $k$ bits, for $k\leq n$. We consider such functions that are not necessarily permutations but are, in some sense, almost bijective, and we study their cryptographic properties. In this context, we define an almost lifting as a Boolean function for which there is an upper bound on the number of collisions of its induced functions that does not depend on $n$. We show that if a Boolean function with diameter $k$ is an almost lifting, then the maximum number of collisions of its induced functions is $2^{k-1}$ for any $n$. Moreover, we search for functions in the class of almost liftings that have good cryptographic properties and for which the non-bijectivity does not cause major security weaknesses. These functions generalize the well-known map $\chi$ used in the Keccak hash function.
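To make the notion of an induced function concrete, here is a small sketch that applies a $k$-bit local rule cyclically to every position of an $n$-bit word and records the largest preimage size. The local rule used is the standard one for $\chi$, $x_i \oplus (\lnot x_{i+1} \wedge x_{i+2})$; the helper names and the use of "largest preimage size" as a proxy for collisions are assumptions made for illustration, not the paper's definitions.

```python
from collections import Counter

def chi_local(a, b, c):
    # Local rule of chi on three consecutive bits: a XOR ((NOT b) AND c).
    return a ^ ((1 - b) & c)

def induced(f, k, n, x):
    """Apply the k-bit rule f at every cyclic position of the n-bit tuple x."""
    return tuple(f(*(x[(i + j) % n] for j in range(k))) for i in range(n))

def max_preimage_size(f, k, n):
    """Largest number of n-bit inputs mapped to the same output (brute force)."""
    counts = Counter()
    for v in range(2 ** n):
        x = tuple((v >> i) & 1 for i in range(n))
        counts[induced(f, k, n, x)] += 1
    return max(counts.values())

# Exhaustive check for small n; a value of 1 means the induced map is a bijection.
sizes = {n: max_preimage_size(chi_local, 3, n) for n in range(3, 9)}
print(sizes)
```

For odd $n$ the induced map of $\chi$ is a permutation, while for even $n$ it is not: for $n=4$, the inputs $0000$, $1010$, and $0101$ all map to $0000$.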
We improve bounds on the degree and sparsity of Boolean functions representing the Legendre symbol as well as on the $N$th linear complexity of the Legendre sequence. We also prove similar results for both the Liouville function for integers and its analog for polynomials over $\mathbb{F}_2$, or, more generally, for any (binary) arithmetic function $f$ that satisfies $f(2n)=-f(n)$ for $n=1,2,\ldots$.
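The property $f(2n)=-f(n)$ can be checked numerically for the Liouville function $\lambda(n)=(-1)^{\Omega(n)}$, where $\Omega(n)$ counts prime factors with multiplicity. The sketch below uses plain trial division; the function name `liouville` is an illustrative choice.

```python
def liouville(n):
    """Liouville function: (-1) raised to the number of prime factors of n,
    counted with multiplicity (computed by trial division)."""
    sign = 1
    p = 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            sign = -sign  # each prime factor flips the sign
        p += 1
    if n > 1:
        sign = -sign  # one remaining prime factor
    return sign

values = [liouville(n) for n in range(1, 11)]
print(values)

# Multiplying by 2 adds exactly one prime factor, so the sign flips.
flips = all(liouville(2 * n) == -liouville(n) for n in range(1, 200))
print(flips)
```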
We explore the theoretical possibility of learning $d$-dimensional targets with $W$-parameter models by gradient flow (GF) when $W<d$. Our main result shows that if the targets are described by a particular $d$-dimensional probability distribution, then there exist models with as few as two parameters that can learn the targets with arbitrarily high success probability. On the other hand, we show that for $W<d$ there is necessarily a large subset of GF-non-learnable targets. In particular, the set of learnable targets is not dense in $\mathbb R^d$, and any subset of $\mathbb R^d$ homeomorphic to the $W$-dimensional sphere contains non-learnable targets. Finally, we observe that the model in our main theorem on almost guaranteed two-parameter learning is constructed using a hierarchical procedure and as a result is not expressible by a single elementary function. We show that this limitation is essential in the sense that most models written in terms of elementary functions cannot achieve the learnability demonstrated in this theorem.
QAC$^0$ is the class of constant-depth quantum circuits with polynomially many ancillary qubits, where Toffoli gates on arbitrarily many qubits are allowed. In this work, we show that the parity function cannot be computed in QAC$^0$, resolving a long-standing open problem in quantum circuit complexity more than twenty years old. As a result, this proves ${\rm QAC}^0 \subsetneqq {\rm QAC}_{\rm wf}^0$. We also show that any QAC circuit of depth $d$ that approximately computes parity on $n$ bits requires $2^{\widetilde{\Omega}(n^{1/d})}$ ancillary qubits, which is close to tight. This implies a similar lower bound on approximately preparing cat states using QAC circuits. Finally, we prove a quantum analog of the Linial-Mansour-Nisan theorem for QAC$^0$. This implies that, for any QAC$^0$ circuit $U$ with $a={\rm poly}(n)$ ancillary qubits, and for any $x\in\{0,1\}^n$, the correlation between $Q(x)$ and the parity function is bounded by ${1}/{2} + 2^{-\widetilde{\Omega}(n^{1/d})}$, where $Q(x)$ denotes the output of measuring the output qubit of $U|x,0^a\rangle$. All the above consequences rely on the following technical result. If $U$ is a QAC$^0$ circuit with $a={\rm poly}(n)$ ancillary qubits, then there is a distribution $\mathcal{D}$ of bounded polynomials of degree polylog$(n)$ such that with high probability, a random polynomial from $\mathcal{D}$ approximates the function $\langle x,0^a| U^\dag Z_{n+1} U |x,0^a\rangle$ for a large fraction of $x\in \{0,1\}^n$. This result is analogous to the Razborov-Smolensky result on the approximation of AC$^0$ circuits by random low-degree polynomials.
In decision-making, maxitive functions are used for worst-case and best-case evaluations. Maxitivity gives rise to a rich structure that is well-studied in the context of the pointwise order. In this article, we investigate maxitivity with respect to general preorders and provide a representation theorem for such functionals. The results are illustrated for different stochastic orders in the literature, including the usual stochastic order, the increasing convex/concave order, and the dispersive order.
We consider linear models with scalar responses and covariates from a separable Hilbert space. The aim is to detect change points in the error distribution, based on sequential residual empirical distribution functions. Expansions for those estimated functions are more challenging in models with infinite-dimensional covariates than in regression models with scalar or vector-valued covariates due to a slower rate of convergence of the parameter estimators. Yet the suggested change point test is asymptotically distribution-free and consistent for one-change-point alternatives. In the latter case we also show consistency of a change point estimator.
We present a new $hp$-version space-time discontinuous Galerkin (dG) finite element method for the numerical approximation of parabolic evolution equations on general spatial meshes consisting of polygonal/polyhedral (polytopic) elements, giving rise to prismatic space-time elements. A key feature of the proposed method is the use of space-time elemental polynomial bases of \emph{total} degree, say $p$, defined in the physical coordinate system, as opposed to standard dG-time-stepping methods whereby spatial elemental bases are tensorized with temporal basis functions. This approach leads to a fully discrete $hp$-dG scheme using fewer degrees of freedom for each time step, compared to standard dG time-stepping schemes employing tensorized space-time bases, with acceptable deterioration of the approximation properties. A second key feature of the new space-time dG method is the incorporation of very general spatial meshes consisting of possibly polygonal/polyhedral elements with an \emph{arbitrary} number of faces. A priori error bounds are shown for the proposed method in various norms. An extensive comparison among the new space-time dG method, the (standard) tensorized space-time dG methods, the classical dG-time-stepping, and conforming finite element method in space, is presented in a series of numerical experiments.
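The saving in degrees of freedom can be illustrated by counting basis functions per space-time element. The sketch below assumes $d$ spatial dimensions, a spatial basis of total degree $p$, and, for the tensorized case, a temporal basis of the same degree $p$; these choices and the function names are assumptions for illustration, and the paper's exact setting may differ.

```python
from math import comb

def total_degree_dofs(p, d):
    # Polynomials of total degree <= p in d spatial variables plus time.
    return comb(p + d + 1, d + 1)

def tensorized_dofs(p, d):
    # Spatial polynomials of total degree <= p, tensorized with
    # temporal polynomials of degree <= p.
    return comb(p + d, d) * (p + 1)

# Per-element counts for two spatial dimensions and degrees 1 to 4.
for p in range(1, 5):
    print(p, total_degree_dofs(p, 2), tensorized_dofs(p, 2))
```

Under these assumptions, for $d=2$ and $p=3$ the total-degree basis has 20 functions per element against 40 for the tensorized basis, and the ratio tends to $d+1$ as $p$ grows.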
The problem of identifying the satisfiability threshold of random $3$-SAT formulas has received a lot of attention during the last decades and has inspired the study of other threshold phenomena in random combinatorial structures. The classical assumption in this line of research is that, for a given set of $n$ Boolean variables, each clause is drawn uniformly at random among all sets of three literals from these variables, independently from other clauses. Here, we keep the uniform distribution of each clause, but deviate significantly from the independence assumption and consider richer families of probability distributions. For integer parameters $n$, $m$, and $k$, we denote by $\DistFamily_k(n,m)$ the family of probability distributions that produce formulas with $m$ clauses, each selected uniformly at random from all sets of three literals from the $n$ variables, so that the clauses are $k$-wise independent. Our aim is to make general statements about the satisfiability or unsatisfiability of formulas produced by distributions in $\DistFamily_k(n,m)$ for different values of the parameters $n$, $m$, and $k$.
We propose a novel, highly efficient, second-order accurate, long-time unconditionally stable numerical scheme for a class of finite-dimensional nonlinear models that are of importance in geophysical fluid dynamics. The scheme is highly efficient in the sense that only a (fixed) symmetric positive definite linear problem (with varying right-hand sides) is involved at each time step. The solutions to the scheme are uniformly bounded for all time. We show that the scheme is able to capture the long-time dynamics of the underlying geophysical model, with the global attractors as well as the invariant measures of the scheme converging to those of the original model as the step size approaches zero. In our numerical experiments, we take an indirect approach, using long-term statistics to approximate the invariant measures. Our results suggest that the convergence rate of the long-term statistics, as a function of terminal time, is approximately first order using the Jensen-Shannon metric and half-order using the L1 metric. This implies that very long-time simulations are needed in order to capture a few significant digits of long-time statistics (climate) correctly. Nevertheless, the second-order scheme's performance remains superior to that of the first-order one, requiring significantly less time to reach a small neighborhood of statistical equilibrium for a given step size.
We develop two novel couplings between general pure-jump L\'evy processes in $\R^d$ and apply them to obtain upper bounds on the rate of convergence in an appropriate Wasserstein distance on the path space for a wide class of L\'evy processes attracted to a multidimensional stable process in the small-time regime. We also establish general lower bounds based on certain universal properties of slowly varying functions and the relationship between the Wasserstein and Toscani--Fourier distances of the marginals. Our upper and lower bounds typically have matching rates. In particular, the rate of convergence is polynomial for the domain of normal attraction and slower than a slowly varying function for the domain of non-normal attraction.