We propose two novel unbiased estimators of the integral $\int_{[0,1]^{s}}f(u) du$ for a function $f$, which depend on a smoothness parameter $r\in\mathbb{N}$. The first estimator integrates exactly the polynomials of degrees $p<r$ and achieves the optimal error $n^{-1/2-r/s}$ (where $n$ is the number of evaluations of $f$) when $f$ is $r$ times continuously differentiable. The second estimator is computationally cheaper but is restricted to functions that vanish on the boundary of $[0,1]^s$. The construction of the two estimators relies on a combination of cubic stratification and control variates based on numerical derivatives. We provide numerical evidence that they show good performance even for moderate values of $n$.
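To make the cubic stratification ingredient concrete, the following is a minimal sketch of a plain stratified estimator: $[0,1]^s$ is split into $k^s$ congruent sub-cubes and $f$ is evaluated at one uniform point in each, so that $n = k^s$. This is not the estimator proposed above, since it omits the control variates based on numerical derivatives; the function name `stratified_estimate` and its parameters are assumptions introduced for illustration only.

```python
import random
from itertools import product


def stratified_estimate(f, s, k, rng=None):
    """Unbiased estimate of the integral of f over [0,1]^s.

    Illustrative sketch only: one uniform point is drawn in each of the
    k**s sub-cubes of side 1/k, and the evaluations are averaged.
    """
    rng = rng or random.Random()
    total = 0.0
    for cell in product(range(k), repeat=s):
        # Uniform point inside the sub-cube whose lower corner is cell / k.
        point = [(index + rng.random()) / k for index in cell]
        total += f(point)
    return total / k ** s


if __name__ == "__main__":
    # The exact value of the integral of u_1 + u_2 over [0,1]^2 is 1.
    print(stratified_estimate(lambda u: u[0] + u[1], s=2, k=10))
```

For the linear integrand in the usage example, each evaluation lies within $1/k$ of the value at the centre of its sub-cube on average over coordinates, so the estimate is guaranteed to be within $1/k$ of the exact value $1$.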
For a sequence of Boolean functions $f_n : \{-1,1\}^{V_n} \longrightarrow \{-1,1\}$, defined on increasing configuration spaces of random inputs, we say that there is sparse reconstruction if there is a sequence of subsets $U_n \subseteq V_n$ of the coordinates satisfying $|U_n| = o(|V_n|)$ such that knowing the coordinates in $U_n$ gives us a non-vanishing amount of information about the value of $f_n$. We first show that, if the underlying measure is a product measure, then no sparse reconstruction is possible for any sequence of transitive functions. We discuss the question in different frameworks, measuring information content in $L^2$ and with entropy. We also highlight some interesting connections with cooperative game theory. Beyond transitive functions, we show that the left-right crossing event for critical planar percolation on the square lattice does not admit sparse reconstruction either. Some of these results answer questions posed by Itai Benjamini.
In the \emph{graph matching} problem we observe two graphs $G,H$ and the goal is to find an assignment (or matching) between their vertices such that some measure of edge agreement is maximized. We assume in this work that the observed pair $G,H$ has been drawn from the correlated Wigner model -- a popular model for correlated weighted graphs -- where the entries of the adjacency matrices of $G$ and $H$ are independent Gaussians and each edge of $G$ is correlated with one edge of $H$ (determined by the unknown matching) with the edge correlation described by a parameter $\sigma\in [0,1)$. In this paper, we analyse the performance of the \emph{projected power method} (PPM) as a \emph{seeded} graph matching algorithm where we are given an initial partially correct matching (called the seed) as side information. We prove that if the seed is close enough to the ground-truth matching, then with high probability, PPM iteratively improves the seed and recovers the ground-truth matching (either partially or exactly) in $\mathcal{O}(\log n)$ iterations. Our results prove that PPM works even in regimes of constant $\sigma$, thus extending the analysis in \citep{MaoRud} for the sparse Erd\H{o}s-R\'enyi model to the (dense) Wigner model. As a byproduct of our analysis, we see that the PPM framework generalizes some of the state-of-the-art algorithms for seeded graph matching. We support and complement our theoretical findings with numerical experiments on synthetic data.
Given samples from two non-negative random variables, we propose a new class of nonparametric tests for the null hypothesis that one random variable dominates the other with respect to second-order stochastic dominance. These tests are based on the Lorenz P-P plot (LPP), which is the composition between the inverse unscaled Lorenz curve of one distribution and the unscaled Lorenz curve of the other. The LPP exceeds the identity function if and only if the dominance condition is violated, providing a rather simple method to construct test statistics, given by functionals defined over the difference between the identity and the LPP. We determine a stochastic upper bound for such test statistics under the null hypothesis, and derive its limit distribution, to be approximated via bootstrap procedures. We also establish the asymptotic validity of the tests under relatively mild conditions, allowing for both dependent and independent samples. Finally, finite sample properties are investigated through simulation studies.
The proper conflict-free chromatic number, $\chi_{pcf}(G)$, of a graph $G$ is the least $k$ such that $G$ has a proper $k$-coloring in which for each non-isolated vertex there is a color appearing exactly once among its neighbors. The proper odd chromatic number, $\chi_{o}(G)$, of $G$ is the least $k$ such that $G$ has a proper coloring in which for every non-isolated vertex there is a color appearing an odd number of times among its neighbors. We say that a graph class $\mathcal{G}$ is $\chi_{pcf}$-bounded ($\chi_{o}$-bounded) if there is a function $f$ such that $\chi_{pcf}(G) \leq f(\chi(G))$ ($\chi_{o}(G) \leq f(\chi(G))$) for every $G \in \mathcal{G}$. Caro et al. (2022) asked for classes that are linearly $\chi_{pcf}$-bounded ($\chi_{pcf}$-bounded), and as a starting point, they showed that every claw-free graph $G$ satisfies $\chi_{pcf}(G) \le 2\Delta(G)+1$, which implies $\chi_{pcf}(G) \le 4\chi(G)+1$. They also conjectured that any graph $G$ with $\Delta(G) \ge 3$ satisfies $\chi_{pcf}(G) \le \Delta(G)+1$. In this paper, we improve the bound for claw-free graphs to a nearly tight bound by showing that such a graph $G$ satisfies $\chi_{pcf}(G) \le \Delta(G)+6$, and even $\chi_{pcf}(G) \le \Delta(G)+4$ if it is a quasi-line graph. Moreover, we show that convex-round graphs and permutation graphs are linearly $\chi_{pcf}$-bounded. For these last two results, we prove a lemma that reduces the problem of deciding if a hereditary class is linearly $\chi_{pcf}$-bounded to deciding if the bipartite graphs in the class are $\chi_{pcf}$-bounded by an absolute constant. This lemma complements a theorem of Liu (2022) and motivates us to further study boundedness in bipartite graphs. So among other results, we show that convex bipartite graphs are not $\chi_{o}$-bounded, and a class of bipartite circle graphs that is linearly $\chi_{o}$-bounded but not $\chi_{pcf}$-bounded.
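The definition of $\chi_{pcf}$ can be checked by brute force on very small graphs. The sketch below is illustrative only and is not a method from the abstract above: the adjacency-dictionary representation and the names `is_proper`, `is_conflict_free` and `chi_pcf` are assumptions, and the exhaustive search is exponential in the number of vertices.

```python
from collections import Counter
from itertools import product


def is_proper(adj, coloring):
    """True if adjacent vertices always receive different colors."""
    return all(coloring[u] != coloring[v] for u in adj for v in adj[u])


def is_conflict_free(adj, coloring):
    """True if every non-isolated vertex sees some color exactly once."""
    for neighbors in adj.values():
        if not neighbors:
            continue  # isolated vertices impose no condition
        counts = Counter(coloring[v] for v in neighbors)
        if 1 not in counts.values():
            return False
    return True


def chi_pcf(adj):
    """Least k admitting a proper conflict-free k-coloring (brute force).

    Assumes a graph with at least one vertex, given as a dictionary that
    maps each vertex to the list of its neighbors.
    """
    vertices = sorted(adj)
    for k in range(1, len(vertices) + 1):
        for colors in product(range(k), repeat=len(vertices)):
            coloring = dict(zip(vertices, colors))
            if is_proper(adj, coloring) and is_conflict_free(adj, coloring):
                return k
    return None


if __name__ == "__main__":
    cycle5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
    print(chi_pcf(cycle5))  # prints 5
```

The 5-cycle needs five colors because the two neighbors of each vertex must receive different colors, so every pair of vertices, being at distance one or two, is colored differently. This is consistent with the conjecture above being stated only for $\Delta(G) \ge 3$.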
Motivated by general probability theory, we say that the set $X$ in $\mathbb{R}^d$ is \emph{antipodal of rank $k$}, if for any $k+1$ elements $q_1,\ldots, q_{k+1}\in X$, there is an affine map from $\mathrm{conv} X$ to the $k$-dimensional simplex $\Delta_k$ that maps $q_1,\ldots, q_{k+1}$ onto the $k+1$ vertices of $\Delta_k$. For $k=1$, it coincides with the well-studied notion of (pairwise) antipodality introduced by Klee. We consider the following natural generalization of Klee's problem on antipodal sets: What is the maximum size of an antipodal set of rank $k$ in $\mathbb{R}^d$? We present a geometric characterization of antipodal sets of rank $k$ and, adapting the argument of Danzer and Gr\"unbaum originally developed for the $k=1$ case, we prove an upper bound which is exponential in the dimension. We point out that this problem can be connected to a classical question in computer science on finding perfect hashes, and it provides a lower bound on the maximum size, which is also exponential in the dimension.
The theory of Koopman operators allows one to deploy non-parametric machine learning algorithms to predict and analyze complex dynamical systems. Estimators such as principal component regression (PCR) or reduced rank regression (RRR) in kernel spaces can be shown to provably learn Koopman operators from finite empirical observations of the system's time evolution. Scaling these approaches to very long trajectories is a challenge and requires introducing suitable approximations to make computations feasible. In this paper, we boost the efficiency of different kernel-based Koopman operator estimators using random projections (sketching). We derive, implement and test the new "sketched" estimators with extensive experiments on synthetic and large-scale molecular dynamics datasets. Further, we establish non-asymptotic error bounds giving a sharp characterization of the trade-offs between statistical learning rates and computational efficiency. Our empirical and theoretical analysis shows that the proposed estimators provide a sound and efficient way to learn large-scale dynamical systems. In particular, our experiments indicate that the proposed estimators retain the same accuracy as PCR or RRR, while being much faster.
We propose a new concept of codivergence, which quantifies the similarity between two probability measures $P_1, P_2$ relative to a reference probability measure $P_0$. In the neighborhood of the reference measure $P_0$, a codivergence behaves like an inner product between the measures $P_1 - P_0$ and $P_2 - P_0$. Codivergences of covariance-type and correlation-type are introduced and studied with a focus on two specific correlation-type codivergences, the $\chi^2$-codivergence and the Hellinger codivergence. We derive explicit expressions for several common parametric families of probability distributions. For a codivergence, we introduce moreover the divergence matrix as an analogue of the Gram matrix. It is shown that the $\chi^2$-divergence matrix satisfies a data-processing inequality.
$\operatorname{Holant}^*(f)$ denotes a class of counting problems specified by a constraint function $f$. We prove complexity dichotomy theorems for $\operatorname{Holant}^*(f)$ in two settings: (1) $f$ is any arity-3 real-valued function on input of domain size 3. (2) $f$ is any arity-3 $\{0,1\}$-valued function on input of domain size 4.
We investigate the approximation of functions $f$ on a bounded domain $\Omega\subset \mathbb{R}^d$ by the outputs of single-hidden-layer ReLU neural networks of width $n$. This form of nonlinear $n$-term dictionary approximation has been intensely studied since it is the simplest case of neural network approximation (NNA). There are several celebrated approximation results for this form of NNA that introduce novel model classes of functions on $\Omega$ whose approximation rates avoid the curse of dimensionality. These novel classes include Barron classes, and classes based on sparsity or variation such as the Radon-domain BV classes. The present paper is concerned with the definition of these novel model classes on domains $\Omega$. The current definition of these model classes does not depend on the domain $\Omega$. A new and more proper definition of model classes on domains is given by introducing the concept of weighted variation spaces. These new model classes are intrinsic to the domain itself. The importance of these new model classes is that they are strictly larger than the classical (domain-independent) classes. Yet, it is shown that they maintain the same NNA rates.
Let $F$ be a finite field, let $f$ be a function from $F$ to $F$, and let $a$ be a nonzero element of $F$. The discrete derivative of $f$ in direction $a$ is $\Delta_a f \colon F \to F$ with $(\Delta_a f)(x)=f(x+a)-f(x)$. The differential spectrum of $f$ is the multiset of cardinalities of all the fibers of all the derivatives $\Delta_a f$ as $a$ runs through $F^*$. The function $f$ is almost perfect nonlinear (APN) if the largest cardinality in the differential spectrum is $2$. Almost perfect nonlinear functions are of interest as cryptographic primitives. If $d$ is a positive integer, the power function over $F$ with exponent $d$ is the function $f \colon F \to F$ with $f(x)=x^d$ for every $x \in F$. There is a small number of known infinite families of APN power functions. In this paper, we re-express the exponents for one such family in a more convenient form. This enables us to give the differential spectrum and, even more, to determine the sizes of individual fibers of derivatives.
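The definitions of the discrete derivative and the differential spectrum can be illustrated on a small field. The sketch below is an assumption-laden example rather than a construction from the abstract above: it takes $F = \mathrm{GF}(2^3)$ represented by bitmasks modulo the irreducible polynomial $x^3+x+1$, uses the exponent $d=3$, and introduces the hypothetical helper names `gf_mul`, `gf_pow` and `differential_spectrum`.

```python
from collections import Counter


def gf_mul(a, b, n, modulus):
    """Multiply a and b in GF(2^n); modulus is the irreducible polynomial
    as a bitmask with bit n set (e.g. 0b1011 for x^3 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= modulus  # reduce as soon as the degree reaches n
    return result


def gf_pow(x, d, n, modulus):
    """Compute x^d in GF(2^n) by repeated multiplication."""
    result = 1
    for _ in range(d):
        result = gf_mul(result, x, n, modulus)
    return result


def differential_spectrum(d, n, modulus):
    """Map each fiber cardinality to how often it occurs over all a != 0."""
    size = 1 << n
    table = [gf_pow(x, d, n, modulus) for x in range(size)]
    spectrum = Counter()
    for a in range(1, size):
        # Addition and subtraction in characteristic 2 are both XOR.
        fibers = Counter(table[x ^ a] ^ table[x] for x in range(size))
        for b in range(size):
            spectrum[fibers[b]] += 1  # empty fibers count with size 0
    return spectrum


if __name__ == "__main__":
    spectrum = differential_spectrum(3, 3, 0b1011)
    print(dict(spectrum), "APN:", max(spectrum) == 2)
```

For $f(x)=x^3$ in characteristic $2$, each derivative $\Delta_a f(x) = ax^2 + a^2x + a^3$ is an affine map whose linear part has kernel $\{0,a\}$, so every nonempty fiber has exactly two elements. Over $\mathrm{GF}(2^3)$ this gives $28$ fibers of size $2$ and $28$ empty fibers across the seven nonzero directions.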