We consider the goodness-of-fit testing problem for H\"older smooth densities over $\mathbb{R}^d$: given $n$ iid observations with unknown density $p$ and a known density $p_0$, we investigate how large $\rho$ must be so that we can distinguish, with high probability, the case $p=p_0$ from the composite alternative of all H\"older-smooth densities $p$ such that $\|p-p_0\|_t \geq \rho$, where $t \in [1,2]$. The densities are defined over $\mathbb{R}^d$ and have H\"older smoothness parameter $\alpha>0$. In the present work, we solve the case $\alpha \leq 1$ and handle the case $\alpha>1$ under an additional technical restriction on the densities. We identify matching upper and lower bounds on the local minimax rates of testing, given explicitly in terms of $p_0$, and we propose novel test statistics that we believe to be of independent interest. We also give the first explicit definition of a cutoff $u_B$ that splits $\mathbb{R}^d$ into a bulk part (the subset of $\mathbb{R}^d$ where $p_0$ takes values greater than or equal to $u_B$) and a tail part (the complement of the bulk), each of which makes a fundamentally different contribution to the local minimax rates of testing.
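For concreteness, the testing problem can be stated in its standard form (the H\"older ball notation $\mathcal{H}(\alpha,L)$ and the error level $\eta$ are ours, not the paper's): test
$$ H_0: p = p_0 \qquad\text{versus}\qquad H_1: p\in\mathcal{H}(\alpha,L),\ \|p-p_0\|_t \ge \rho, $$
and the local minimax separation radius is the smallest $\rho$ for which some test $\psi$ satisfies
$$ \mathbb{P}_{p_0}(\psi=1) \;+\; \sup_{p\in H_1}\mathbb{P}_{p}(\psi=0) \;\le\; \eta. $$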
A code of length $n$ is said to be (combinatorially) $(\rho,L)$-list decodable if the Hamming ball of radius $\rho n$ around any vector in the ambient space contains at most $L$ codewords. We study a recently introduced class of higher order MDS codes, which are closely related (via duality) to codes that achieve a generalized Singleton bound for list decodability. For some $\ell\geq 1$, higher order MDS codes of length $n$, dimension $k$, and order $\ell$ are denoted as $(n,k)$-MDS($\ell$) codes. We present a number of results on the structure of these codes, identifying the extendability of their parameters in various scenarios. Specifically, for some parameter regimes, we identify conditions under which $(n_1,k_1)$-MDS($\ell_1$) codes can be obtained from $(n_2,k_2)$-MDS($\ell_2$) codes via various techniques. We believe that these results will aid in efficient constructions of higher order MDS codes. We also obtain a new upper bound on the field size required for the existence of such codes, which improves on the best known bound in some parameter regimes.
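For orientation, one common form of the generalized Singleton bound mentioned above (stated here in our notation; the paper's exact formulation may differ) says that any $(\rho,L)$-list decodable code of length $n$ and dimension $k$ satisfies
$$ \rho\, n \;\le\; \frac{L}{L+1}\,(n-k), $$
which recovers the classical Singleton-type unique-decoding radius when $L=1$.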
Auctions are modeled as Bayesian games with continuous type and action spaces. Determining equilibria in auction games is computationally hard in general, and no exact solution theory is known. We introduce an algorithmic framework in which we discretize the type and action spaces and then learn distributional strategies via online optimization algorithms. One advantage of distributional strategies is that we do not have to make any assumptions on the shape of the bid function; moreover, the expected utility of agents is linear in the strategies. It follows that if our optimization algorithms converge to a pure strategy, then they converge to an approximate equilibrium of the discretized game with high precision. Importantly, we show that an equilibrium of the discretized game approximates an equilibrium of the continuous game. In a wide variety of auction games, we provide empirical evidence that the approach closely approximates the analytical (pure) Bayes-Nash equilibrium. This speed and precision are remarkable, because in many finite games learning dynamics do not converge or are even chaotic. In standard models where agents are symmetric, we find equilibria in seconds. While we focus on dual averaging, we show that the overall approach converges independently of the regularizer, and alternative online convex optimization methods achieve similar results, even though the discretized game satisfies neither monotonicity nor variational stability globally. The method allows for interdependent valuations and different types of utility functions, and it provides a foundation for broadly applicable equilibrium solvers that can push the boundaries of equilibrium analysis in auction markets and beyond.
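As an illustration of the learning dynamics, the following is a minimal self-play sketch of dual averaging with an entropy regularizer on a discretized symmetric two-bidder first-price auction with uniform types; the grid sizes, step size, and regularizer are our illustrative choices, not the paper's configuration.

    import numpy as np

    V, B, T, eta = 16, 16, 2000, 50.0         # illustrative grid sizes and step size
    types = np.linspace(0.0, 1.0, V)          # discretized type space
    bids = np.linspace(0.0, 1.0, B)           # discretized action space
    grad_sum = np.zeros((V, B))               # cumulative utility gradients

    def softmax_rows(z):
        z = np.exp(z - z.max(axis=1, keepdims=True))
        return z / z.sum(axis=1, keepdims=True)

    for t in range(1, T + 1):
        # Dual-averaging iterate: mirror image (softmax) of the averaged gradients.
        s = softmax_rows(eta * grad_sum / np.sqrt(t))
        q = s.mean(axis=0)                    # opponent's marginal bid distribution
        win = np.cumsum(q) - 0.5 * q          # P(win) with ties split evenly
        # First-price utility (v - b) * P(win); utility is linear in s, so this is the gradient.
        grad_sum += (types[:, None] - bids[None, :]) * win[None, :]

    # With uniform types, the analytical Bayes-Nash equilibrium bid is v / 2.
    print(np.c_[types, bids[s.argmax(axis=1)]][:5])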
A class of stochastic Besov spaces $B^p L^2(\Omega;\dot H^\alpha(\mathcal{O}))$, with $1\le p\le\infty$ and $\alpha\in[-2,2]$, is introduced to characterize the regularity of the noise in the semilinear stochastic heat equation \begin{equation*} {\rm d} u -\Delta u \,{\rm d} t =f(u)\, {\rm d} t + {\rm d} W(t) , \end{equation*} under the following conditions for some $\alpha\in(0,1]$: $$ \Big\| \int_0^te^{-(t-s)A}{\rm d} W(s) \Big\|_{L^2(\Omega;L^2(\mathcal{O}))} \le C t^{\frac{\alpha}{2}} \quad\mbox{and}\quad \Big\| \int_0^te^{-(t-s)A}{\rm d} W(s) \Big\|_{B^\infty L^2(\Omega;\dot H^\alpha(\mathcal{O}))}\le C. $$ These conditions are shown to be satisfied by both trace-class noises (with $\alpha=1$) and one-dimensional space-time white noises (with $\alpha=\frac12$). The latter would fail to satisfy the conditions with $\alpha=\frac12$ if the stochastic Besov norm $\|\cdot\|_{B^\infty L^2(\Omega;\dot H^\alpha(\mathcal{O}))}$ were replaced by the classical Sobolev norm $\|\cdot\|_{L^2(\Omega;\dot H^\alpha(\mathcal{O}))}$, and this often causes a reduction of the convergence order in the numerical analysis of the semilinear stochastic heat equation. In this article, the convergence of a modified exponential Euler method, with a spectral method for spatial discretization, is proved to have order $\alpha$ in both time and space for possibly nonsmooth initial data in $L^4(\Omega;\dot{H}^{\beta}(\mathcal{O}))$ with $\beta>-1$, by utilizing the real interpolation properties of the stochastic Besov spaces and a class of locally refined stepsizes that resolve the singularity of the solution at $t=0$.
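For reference, the classical exponential Euler step for this equation (written with $A=-\Delta$ and stepsize $\tau_n = t_{n+1}-t_n$; the paper's modified variant and its locally refined stepsizes differ in details not reproduced here) reads
$$ u_{n+1} \;=\; e^{-\tau_n A} u_n \;+\; A^{-1}\big(I-e^{-\tau_n A}\big) f(u_n) \;+\; \int_{t_n}^{t_{n+1}} e^{-(t_{n+1}-s)A}\,{\rm d} W(s). $$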
We constructed a modular, biomimetic red panda paw with which to experimentally investigate the evolutionary reason for the existence of the false thumbs of red pandas. These thumbs were once believed to share a common origin with the similar false thumbs of giant pandas; however, the discovery of a carnivorous fossil ancestor of the red panda that had false thumbs implies that the red panda did not evolve its thumbs to assist in eating bamboo, as the giant panda did, but rather for some other purpose. The leading proposal is that the thumbs developed to aid arboreal locomotion. To test this hypothesis, we conducted grasp tests on rods 5-15 mm in diameter using a biomimetic paw with interchangeable thumbs 0-16 mm in length. The results demonstrated an optimal thumb length of 7 mm, just above the red panda's actual false-thumb length of 5.5 mm. Because trends in the data suggest that shorter thumbs are better suited to grasping larger-diameter rods, the fact that the red panda's thumb is sized below the optimum suggests an adaptation toward grasping branches rather than relatively thinner food items, supporting the proposal that the red panda's thumbs are an adaptation primarily for climbing rather than for food manipulation.
The trade algorithm, which includes the curveball and fastball implementations, is the state of the art for uniformly sampling $r \times c$ binary matrices with fixed row and column sums. The mixing time of the trade algorithm is currently unknown, although $5r$ is currently used as a heuristic. We propose a distribution-based approach that estimates the mixing time and can also return a sample of matrices that are nearly guaranteed to be uniformly sampled. In numerical experiments on matrices that vary in size, fill, and row and column sum distributions, we find that the upper bound on the mixing time is at least $10r$, and that it increases as a function of both $c$ and the fraction of cells containing a 1.
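For intuition, here is a minimal sketch of a single curveball trade (the set-based data layout and the $5r$ burn-in below are illustrative, not the implementations studied in the paper):

    import random

    def curveball_trade(rows):
        """One curveball trade on a binary matrix stored as a list of
        column-index sets (one set per row). Preserves all row and
        column sums; repeated trades sample from the fixed-margin space."""
        i, j = random.sample(range(len(rows)), 2)
        shared = rows[i] & rows[j]
        pool = list((rows[i] | rows[j]) - shared)   # tradeable columns
        random.shuffle(pool)
        k = len(rows[i]) - len(shared)              # how many ones row i must keep
        rows[i] = shared | set(pool[:k])
        rows[j] = shared | set(pool[k:])

    # Example: a 3 x 4 matrix with the heuristic burn-in of 5r trades
    # (the paper estimates the mixing time is at least 10r).
    rows = [{0, 1}, {1, 2}, {2, 3}]
    for _ in range(5 * len(rows)):
        curveball_trade(rows)
    print(rows)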
We study the causal bandit problem, which entails identifying a near-optimal intervention from a specified set $A$ of (possibly non-atomic) interventions over a given causal graph. Here, an optimal intervention in $A$ is one that maximizes the expected value of a designated reward variable in the graph, and we use the standard notion of simple regret to quantify near optimality. Considering Bernoulli random variables and causal graphs on $N$ vertices with constant in-degree, prior work has achieved a worst-case guarantee of $\widetilde{O}(N/\sqrt{T})$ for simple regret. The current work utilizes the idea of covering interventions (which are not necessarily contained within $A$) and establishes a simple regret guarantee of $\widetilde{O}(\sqrt{N/T})$. Notably, and in contrast to prior work, our simple regret bound depends only on explicit parameters of the problem instance. We also go beyond prior work and achieve a simple regret guarantee for causal graphs with unobserved variables. Further, we perform experiments that show improvements over baselines in this setting.
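Concretely, with $\mu_a = \mathbb{E}[Y \mid \mathrm{do}(a)]$ denoting the expected reward under intervention $a$ and $\hat a_T$ the intervention returned after $T$ rounds (notation ours), the standard simple regret referred to above is
$$ r_T \;=\; \max_{a\in A}\, \mu_a \;-\; \mathbb{E}\big[\mu_{\hat a_T}\big]. $$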
Let a polytope $P$ be defined by a system $A x \leq b$. We consider the problem of counting the number of integer points inside $P$, assuming that $P$ is $\Delta$-modular; the polytope $P$ is called $\Delta$-modular if all determinants of rank-order submatrices of $A$ are bounded by $\Delta$ in absolute value. We present a new FPT-algorithm, parameterized by $\Delta$ and by the maximal number of vertices of $P$, where the maximum is taken over all right-hand-side vectors $b$. We show that our algorithm is more efficient for $\Delta$-modular problems than the approach of A. Barvinok et al. Unlike that approach, we do not directly compute the short rational generating function for $P \cap \mathbb{Z}^n$, which is commonly used for this problem. Instead, we use the dynamic programming principle to compute a particular representation of it in the form of an exponential series that depends on a single variable, and we do not rely on Barvinok's unimodular sign-decomposition technique at all. Using our new complexity bound, we consider different special cases that may be of independent interest. For example, we give FPT-algorithms for counting integer points in $\Delta$-modular simplices and in similar polytopes with $n + O(1)$ facets. As a special case, for any fixed $m$, we give an FPT-algorithm to count the solutions of the unbounded $m$-dimensional $\Delta$-modular subset-sum problem.
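To make the series representation concrete: the standard multivariate generating function is $g_P(\mathbf{x}) = \sum_{z \in P\cap\mathbb{Z}^n} \mathbf{x}^z$, and specializing $x_i = e^{\tau c_i}$ for a suitable vector $c \in \mathbb{Z}^n$ yields a series in the single variable $\tau$,
$$ h(\tau) \;=\; \sum_{z\in P\cap\mathbb{Z}^n} e^{\tau \langle c, z\rangle}, \qquad |P\cap\mathbb{Z}^n| \;=\; h(0), $$
though this is only a sketch of the standard specialization; the exact representation computed by the paper's dynamic program may differ.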
Determining the causal effects of interventions on outcomes from real-world, observational (non-randomized) data, e.g., for treatment repurposing using electronic health records, is challenging due to underlying bias. Causal deep learning has improved over traditional techniques for estimating individualized treatment effects (ITEs). We present Doubly Robust Variational Information-theoretic Deep Adversarial Learning (DR-VIDAL), a novel generative framework that combines two joint models of treatment and outcome, ensuring unbiased ITE estimation even when one of the two is misspecified. DR-VIDAL integrates: (i) a variational autoencoder (VAE) that factorizes confounders into latent variables according to causal assumptions; (ii) an information-theoretic generative adversarial network (Info-GAN) that generates counterfactuals; and (iii) a doubly robust block that incorporates treatment propensities into outcome predictions. On synthetic and real-world datasets (Infant Health and Development Program, Twin Birth Registry, and National Supported Work Program), DR-VIDAL achieves better performance than other non-generative and generative methods. In conclusion, DR-VIDAL uniquely fuses causal assumptions, a VAE, an Info-GAN, and double robustness into a comprehensive, performant framework. Code is available at https://github.com/Shantanu48114860/DR-VIDAL-AMIA-22 under the MIT license.
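For context, the classical doubly robust (AIPW) construction that such blocks build on combines outcome models $\hat\mu_0,\hat\mu_1$ with a propensity model $\hat e$ (a textbook formula, not DR-VIDAL's exact block):
$$ \hat\tau(x_i) \;=\; \hat\mu_1(x_i)-\hat\mu_0(x_i) \;+\; \frac{t_i\,\big(y_i-\hat\mu_1(x_i)\big)}{\hat e(x_i)} \;-\; \frac{(1-t_i)\,\big(y_i-\hat\mu_0(x_i)\big)}{1-\hat e(x_i)}, $$
which remains consistent if either the propensity model or the outcome models are correctly specified.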
Many recent developments in causal inference, and in functional estimation problems more generally, have been motivated by the fact that classical one-step (first-order) debiasing methods, or their more recent sample-split double machine-learning avatars, can outperform plugin estimators under surprisingly weak conditions. These first-order corrections improve on plugin estimators in a black-box fashion and consequently are often used in conjunction with powerful off-the-shelf estimation methods. However, first-order methods are provably suboptimal in a minimax sense for functional estimation when the nuisance functions live in H\"older-type function spaces, and this suboptimality has motivated the development of "higher-order" debiasing methods. The resulting estimators are, in some cases, provably optimal over H\"older-type spaces, but both the minimax-optimal estimators and their analyses are crucially tied to properties of the underlying function space. In this paper we investigate the fundamental limits of structure-agnostic functional estimation, where only relatively weak conditions are placed on the underlying nuisance functions. We show that there is a strong sense in which existing first-order methods are optimal. We achieve this goal by formalizing the problem of functional estimation with black-box nuisance-function estimates and deriving minimax lower bounds for this problem. Our results highlight a clear tradeoff in functional estimation: if we wish to remain agnostic to the underlying nuisance function spaces, impose only high-level rate conditions, and maintain compatibility with black-box nuisance estimators, then first-order methods are optimal; when we instead have an understanding of the structure of the underlying nuisance functions, carefully constructed higher-order estimators can outperform first-order ones.
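Throughout, a first-order method can be pictured in its generic form (our notation for the standard construction): given a functional $\psi(p)$ with first-order influence function $\phi(\cdot\,;p)$ and a black-box nuisance estimate $\hat p$ fit on a separate sample split, the one-step estimator is
$$ \hat\psi \;=\; \psi(\hat p) \;+\; \frac{1}{n}\sum_{i=1}^{n} \phi(Z_i;\hat p). $$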
We study the query versions of the approximate heavy hitter and quantile problems. In the former problem, the input is a parameter $\varepsilon$ and a set $P$ of $n$ points in $\mathbb{R}^d$, where each point is assigned a color from a set $C$, and we want to build a structure such that, given any geometric range $\gamma$, we can efficiently find a list of approximate heavy hitters in $\gamma\cap P$, i.e., colors that appear at least $\varepsilon |\gamma \cap P|$ times in $\gamma \cap P$, as well as their frequencies up to an additive error of $\varepsilon |\gamma \cap P|$. In the latter problem, each point is assigned a weight from a totally ordered universe, and the query must output a sequence $S$ of $1+1/\varepsilon$ weights such that the $i$-th weight in $S$ has approximate rank $i\varepsilon|\gamma\cap P|$, meaning rank $i\varepsilon|\gamma\cap P|$ up to an additive error of $\varepsilon|\gamma\cap P|$. Previously, optimal results were known only in 1D [WY11], and only a few sub-optimal methods were available in higher dimensions [AW17, ACH+12]. We study the problems for 3D halfspace and dominance queries in the real RAM model with integer registers of size $w=\Theta(\log n)$ bits. For dominance queries, we show optimal solutions to both the heavy hitter and quantile problems: using linear space, we can answer both queries in time $O(\log n + 1/\varepsilon)$. Note that, as the output size is $\frac{1}{\varepsilon}$, after investing the initial $O(\log n)$ search time, our structure takes on average $O(1)$ time to find each heavy hitter or quantile. For the more general halfspace heavy hitter queries, the same optimal query time can be achieved by increasing the space by an extra $\log_w\frac{1}{\varepsilon}$ (resp. $\log\log_w\frac{1}{\varepsilon}$) factor in 3D (resp. 2D). By spending extra $\log^{O(1)}\frac{1}{\varepsilon}$ factors in both time and space, we can also support quantile queries.
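For intuition about the guarantee itself, the classical (non-geometric) Misra-Gries summary achieves the same kind of additive-error frequency approximation that each query must provide, using about $1/\varepsilon$ counters; this streaming sketch is only an analogy, not the paper's data structure:

    def misra_gries(stream, eps):
        """Approximates every color's frequency within an additive
        eps * len(stream), keeping at most ~1/eps counters. Colors with
        true frequency >= eps * len(stream) are guaranteed to survive."""
        k = int(1 / eps)
        counters = {}
        for color in stream:
            if color in counters:
                counters[color] += 1
            elif len(counters) < k:
                counters[color] = 1
            else:
                # Decrement all counters; drop any that reach zero.
                for c in list(counters):
                    counters[c] -= 1
                    if counters[c] == 0:
                        del counters[c]
        return counters

    print(misra_gries("aabacaadaa", eps=0.25))  # 'a' (frequency 7/10) survives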