We consider bootstrap inference for estimators which are (asymptotically) biased. We show that, even when the bias term cannot be consistently estimated, valid inference can be obtained by proper implementations of the bootstrap. Specifically, we show that the prepivoting approach of Beran (1987, 1988), originally proposed to deliver higher-order refinements, restores bootstrap validity by transforming the original bootstrap p-value into an asymptotically uniform random variable. We propose two different implementations of prepivoting (plug-in and double bootstrap), and provide general high-level conditions that imply validity of bootstrap inference. To illustrate the practical relevance and implementation of our results, we discuss five examples: (i) inference on a target parameter based on model averaging; (ii) ridge-type regularized estimators; (iii) nonparametric regression; (iv) a location model for infinite variance data; and (v) dynamic panel data models.
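As a rough illustration of the double-bootstrap form of prepivoting mentioned in the abstract above, the sketch below computes an ordinary bootstrap p-value and then recalibrates it against second-level bootstrap p-values. It is a minimal sketch under stated assumptions, not the authors' implementation: the i.i.d. resampling scheme, the root $\hat\theta - \theta_0$, the helper names (`resample`, `boot_pvalue`, `prepivoted_pvalue`) and the ridge-type `shrunk_mean` estimator are all hypothetical choices made for illustration.

```python
import random


def shrunk_mean(xs, penalty=5.0):
    """Hypothetical ridge-type (biased) estimator of a mean: sum(x) / (n + penalty)."""
    return sum(xs) / (len(xs) + penalty)


def resample(data, rng):
    """Draw an i.i.d. bootstrap sample of the same size (an assumed resampling scheme)."""
    return [rng.choice(data) for _ in data]


def boot_pvalue(data, estimator, root, n_boot, rng):
    """Estimate P*(T* <= root), where T* = estimator(resample) - estimator(data)."""
    theta_hat = estimator(data)
    hits = 0
    for _ in range(n_boot):
        if estimator(resample(data, rng)) - theta_hat <= root:
            hits += 1
    return hits / n_boot


def prepivoted_pvalue(data, estimator, theta0, n_outer=199, n_inner=99, seed=0):
    """Return (p_hat, p_tilde): the plain and the double-bootstrap prepivoted p-value."""
    rng = random.Random(seed)
    theta_hat = estimator(data)
    # First level: bootstrap p-value of the observed root.
    p_hat = boot_pvalue(data, estimator, theta_hat - theta0, n_outer, rng)
    # Second level: how often is a bootstrap-world p-value no larger than p_hat?
    count = 0
    for _ in range(n_outer):
        star = resample(data, rng)
        root_star = estimator(star) - theta_hat
        p_star = boot_pvalue(star, estimator, root_star, n_inner, rng)
        if p_star <= p_hat:
            count += 1
    return p_hat, count / n_outer


if __name__ == "__main__":
    gen = random.Random(1)
    sample = [gen.gauss(0.3, 1.0) for _ in range(30)]
    print(prepivoted_pvalue(sample, shrunk_mean, theta0=0.3, n_outer=99, n_inner=49))
```

The second-level loop is what makes the approach expensive: the cost grows with the product of `n_outer` and `n_inner`, which is one practical reason to consider a plug-in implementation when one is available.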
In the realm of cost-sharing mechanisms, the vulnerability to Sybil strategies, where agents can create fake identities to manipulate outcomes, has not yet been studied. In this paper, we examine several cost-sharing mechanisms proposed in the literature and show that they are not Sybil-resistant. Furthermore, we prove that under mild conditions, a Sybil-proof cost-sharing mechanism for public excludable goods is at least $(n/2+1)$-approximate. This finding reveals an exponential increase in the worst-case social cost relative to environments where agents are restricted from using Sybil strategies. We introduce the concept of \textit{Sybil Welfare Invariant} mechanisms, where a mechanism maintains its worst-case welfare under Sybil strategies for every set of prior beliefs with full support, even when the mechanism is not Sybil-proof. Finally, we prove that the Shapley value mechanism for public excludable goods has this property, and so deduce that the worst-case social cost of this mechanism is the $n$th harmonic number $\mathcal H_n$ even under equilibrium of the game with Sybil strategies, matching the worst-case social cost bound for cost-sharing mechanisms. This finding carries important implications for decentralized autonomous organizations (DAOs), indicating that they are capable of funding public excludable goods efficiently, even when the total number of agents in the DAO is unknown.
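For readers unfamiliar with the Shapley value mechanism for public excludable goods, the following sketch shows its usual textbook form without Sybil strategies: the cost is split equally among the agents still being served, and any agent whose bid is below the current share is dropped until the set stabilizes. The function names and the `social_cost` helper (provision cost plus the valuations of excluded agents) are illustrative assumptions and do not come from the paper.

```python
def shapley_value_mechanism(bids, cost):
    """Return (set of served agent indices, price each served agent pays)."""
    served = set(range(len(bids)))
    while served:
        share = cost / len(served)
        stay = {i for i in served if bids[i] >= share}
        if stay == served:
            return served, share
        served = stay  # drop agents who cannot afford the current share
    return set(), 0.0


def social_cost(valuations, cost, served):
    """Provision cost (if anyone is served) plus valuations of excluded agents."""
    excluded = sum(v for i, v in enumerate(valuations) if i not in served)
    return (cost if served else 0.0) + excluded


if __name__ == "__main__":
    # Classic bad instance: agent i values the good just below cost / i,
    # so agents drop out one at a time and nobody ends up served.
    n, cost, eps = 5, 1.0, 1e-6
    vals = [cost / i - eps for i in range(1, n + 1)]
    served, price = shapley_value_mechanism(vals, cost)
    print(served, price, social_cost(vals, cost, served))
```

In the instance above, serving everyone would cost `cost`, while the mechanism's social cost is close to `cost` times the $n$th harmonic number, which is the worst-case ratio the abstract refers to.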
We study operator, or noncommutative, variants of constraint satisfaction problems (CSPs). These higher-dimensional variants are a core topic of investigation in quantum information, where they arise as nonlocal games and entangled multiprover interactive proof systems (MIP*). The idea of higher-dimensional relaxations of CSPs is also important in the classical literature. For example, since the celebrated work of Goemans and Williamson on Max-Cut, higher-dimensional vector relaxations have been central in the design of approximation algorithms for classical CSPs. We introduce a framework for designing approximation algorithms for noncommutative CSPs. Prior to this work, Max-$2$-Lin$(k)$ was the only family of noncommutative CSPs known to be efficiently solvable. This work is the first to establish approximation ratios for a broader class of noncommutative CSPs. In the study of classical CSPs, $k$-ary decision variables are often represented by $k$-th roots of unity, which generalise to the noncommutative setting as order-$k$ unitary operators. In our framework, using representation theory, we develop a way of constructing unitary solutions from SDP relaxations, extending the pioneering work of Tsirelson on XOR games. Then, we introduce a novel rounding scheme to transform these solutions to order-$k$ unitaries. Our main technical innovation here is a theorem guaranteeing that, for any set of unitary operators, there exists a set of order-$k$ unitaries that closely mimics it. As an integral part of the rounding scheme, we prove a random matrix theory result that characterises the distribution of the relative angles between eigenvalues of random unitaries using tools from free probability.
It has been classically conjectured that the brain assigns probabilistic models to sequences of stimuli. An important issue associated with this conjecture is the identification of the classes of models used by the brain to perform this task. We address this issue by using a new clustering procedure for sets of electroencephalographic (EEG) data recorded from participants exposed to a sequence of auditory stimuli generated by a stochastic chain. This clustering procedure indicates that the brain uses renewal points in the stochastic sequence of auditory stimuli in order to build a model.
There has recently been a lot of interest in the analysis of the Stein gradient descent method, a deterministic sampling algorithm. It is based on a particle system moving along the gradient flow of the Kullback-Leibler divergence towards the asymptotic state corresponding to the desired distribution. Mathematically, the method can be formulated as a joint limit of time $t$ and number of particles $N$ going to infinity. We first observe that the recent work of Lu, Lu and Nolen (2019) implies that if $t \approx \log \log N$, then the joint limit can be rigorously justified in the Wasserstein distance. Not satisfied with this time scale, we explore what happens for larger times by investigating the stability of the method: if the particles are initially close to the asymptotic state (with distance $\approx 1/N$), how long will they remain close? We prove that they remain close on algebraic time scales $t \approx \sqrt{N}$, which is significantly better. The method we exploit, developed by Caglioti and Rousset for the Vlasov equation, is based on finding a functional invariant for the linearized equation. This allows us to eliminate linear terms and arrive at an improved Gronwall-type estimate.
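To make the particle system behind the abstract above concrete, here is a minimal one-dimensional sketch of a Stein variational gradient descent update with a Gaussian (RBF) kernel. It is only an illustration under assumptions: the kernel, the fixed `bandwidth`, the explicit Euler `step`, and the standard normal target in the usage example are choices made here, not details taken from the paper.

```python
import math
import random


def svgd_step(xs, grad_log_p, step=0.1, bandwidth=1.0):
    """One explicit Euler step of 1-D SVGD with kernel k(x, y) = exp(-(x-y)^2 / (2 h^2))."""
    n = len(xs)
    h2 = bandwidth ** 2
    updated = []
    for xi in xs:
        phi = 0.0
        for xj in xs:
            k = math.exp(-((xi - xj) ** 2) / (2.0 * h2))
            # Attraction: kernel-weighted score of the target at xj.
            # Repulsion: derivative of k(xj, xi) with respect to xj, i.e. k * (xi - xj) / h^2.
            phi += k * grad_log_p(xj) + k * (xi - xj) / h2
        updated.append(xi + step * phi / n)
    return updated


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [rng.gauss(5.0, 0.5) for _ in range(30)]  # start far from the target
    for _ in range(500):
        particles = svgd_step(particles, lambda x: -x)  # score of a standard normal
    print(sum(particles) / len(particles))
```

The iteration count plays the role of the time $t$ and `len(xs)` the role of $N$; the question studied in the abstract is how these two may grow together while the particle system still tracks its limit.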
Nominal terms extend first-order terms with binding. They lack some properties of first- and higher-order terms: Terms must be reasoned about in a context of 'freshness assumptions'; it is not always possible to 'choose a fresh variable symbol' for a nominal term; it is not always possible to 'alpha-convert a bound variable symbol' or to 'quotient by alpha-equivalence'; the notion of unifier is not based just on substitution. Permissive nominal terms closely resemble nominal terms but they recover these properties, and in particular the 'always fresh' and 'always rename' properties. In the permissive world, freshness contexts are elided, equality is fixed, and the notion of unifier is based on substitution alone rather than on nominal terms' notion of unification based on substitution plus extra freshness conditions. We prove that expressivity is not lost moving to the permissive case and provide an injection of nominal terms unification problems and their solutions into permissive nominal terms problems and their solutions. We investigate the relation between permissive nominal unification and higher-order pattern unification. We show how to translate permissive nominal unification problems and solutions in a sound, complete, and optimal manner, in suitable senses which we make formal.
How easy is it to uniquely identify a person based on their web browsing behavior? Here we show that when people navigate the Web, their online traces produce fingerprints that identify them. By merely knowing their most visited web domains, four data points are enough to identify 95% of the individuals. These digital fingerprints are stable and render high re-identifiability. We demonstrate that we can re-identify 90% of the individuals in separate time slices of data. Such a privacy threat persists even with limited information about individuals' browsing behavior, reinforcing existing concerns around online privacy.
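The kind of uniqueness measurement described in the abstract above can be sketched as follows: reduce each person's browsing history to the set of their `k` most visited domains and count how many people have a set shared by nobody else. This is a toy sketch only; the fingerprint definition, the tie-breaking rule, the function names and the example data are assumptions made here and need not match the study's methodology.

```python
from collections import Counter


def top_k_fingerprint(visits, k):
    """Set of the k most visited domains (ties broken alphabetically, an assumption)."""
    counts = Counter(visits)
    ranked = sorted(counts, key=lambda d: (-counts[d], d))
    return frozenset(ranked[:k])


def unique_fraction(histories, k):
    """Share of users whose top-k fingerprint is not shared with any other user."""
    prints = {user: top_k_fingerprint(v, k) for user, v in histories.items()}
    freq = Counter(prints.values())
    unique = sum(1 for p in prints.values() if freq[p] == 1)
    return unique / len(prints)


if __name__ == "__main__":
    # Hypothetical toy histories, one list of visited domains per user.
    histories = {
        "a": ["x.com"] * 3 + ["y.org"] * 2 + ["z.net"],
        "b": ["x.com"] * 3 + ["y.org"] * 2 + ["w.io"],
        "c": ["q.dev"] * 2 + ["x.com"],
    }
    for k in (1, 2, 3):
        print(k, unique_fraction(histories, k))
```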
We propose a simple multivariate normality test based on Kac-Bernstein's characterization, which can be conducted by utilising existing statistical independence tests for sums and differences of data samples. We also investigate it empirically and find that, for high-dimensional data, the proposed approach may be more efficient than the alternatives. The accompanying code repository is provided at \url{//shorturl.at/rtuy5}.
Scientific claims gain credibility by replicability, especially if replication under different circumstances and varying designs yields equivalent results. Aggregating results over multiple studies is, however, not straightforward, and when the heterogeneity between studies increases, conventional methods such as (Bayesian) meta-analysis and Bayesian sequential updating become infeasible. *Bayesian Evidence Synthesis*, built upon the foundations of the Bayes factor, allows support for conceptually similar hypotheses to be aggregated over studies, regardless of methodological differences. We assess the performance of Bayesian Evidence Synthesis over multiple effect and sample sizes, with a broad set of (inequality-constrained) hypotheses using Monte Carlo simulations, focusing explicitly on the complexity of the hypotheses under consideration. The simulations show that this method can evaluate complex (informative) hypotheses regardless of methodological differences between studies, and performs adequately if the set of studies considered has sufficient statistical power. Additionally, we pinpoint challenging conditions that can lead to unsatisfactory results, and provide suggestions on handling these situations. Ultimately, we show that Bayesian Evidence Synthesis is a promising tool that can be used when traditional research synthesis methods are not applicable due to insurmountable between-study heterogeneity.
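One common way to aggregate Bayes factors across studies, sketched below, is to multiply each hypothesis's per-study Bayes factors (all taken against a common reference hypothesis) and normalize the products into posterior model probabilities. This is a minimal sketch assuming independent studies and equal prior model probabilities by default; the function name `synthesize` and the numbers in the usage example are hypothetical and are not results from the paper.

```python
from math import prod


def synthesize(bayes_factors, priors=None):
    """Aggregate per-study Bayes factors into posterior model probabilities.

    bayes_factors[t][i] is the Bayes factor of hypothesis i against a common
    reference hypothesis in study t (studies are assumed independent).
    """
    n_hyp = len(bayes_factors[0])
    if priors is None:
        priors = [1.0 / n_hyp] * n_hyp  # equal prior model probabilities
    weights = [
        priors[i] * prod(study[i] for study in bayes_factors)
        for i in range(n_hyp)
    ]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Three hypothetical studies, two hypotheses; the second is the reference.
    studies = [[3.0, 1.0], [2.0, 1.0], [0.5, 1.0]]
    print(synthesize(studies))
```

Because only the Bayes factors enter the product, the studies may use different designs, measures and models, which is the feature the abstract emphasizes.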
Many complex tasks and environments can be decomposed into simpler, independent parts. Discovering such underlying compositional structure has the potential to expedite adaptation and enable compositional generalization. Despite progress, our most powerful systems struggle to compose flexibly. While most of these systems are monolithic, modularity promises to allow capturing the compositional nature of many tasks. However, it is unclear under which circumstances modular systems discover this hidden compositional structure. To shed light on this question, we study a teacher-student setting with a modular teacher where we have full control over the composition of ground truth modules. This allows us to relate the problem of compositional generalization to that of identification of the underlying modules. We show theoretically that identification up to linear transformation purely from demonstrations is possible in hypernetworks without having to learn an exponential number of module combinations. While our theory assumes the infinite data limit, in an extensive empirical study we demonstrate how meta-learning from finite data can discover modular solutions that generalize compositionally in modular but not monolithic architectures. We further show that our insights translate outside the teacher-student setting and demonstrate that in tasks with compositional preferences and tasks with compositional goals hypernetworks can discover modular policies that compositionally generalize.
This paper does not describe a working system. Instead, it presents a single idea about representation which allows advances made by several different groups to be combined into an imaginary system called GLOM. The advances include transformers, neural fields, contrastive representation learning, distillation and capsules. GLOM answers the question: How can a neural network with a fixed architecture parse an image into a part-whole hierarchy which has a different structure for each image? The idea is simply to use islands of identical vectors to represent the nodes in the parse tree. If GLOM can be made to work, it should significantly improve the interpretability of the representations produced by transformer-like systems when applied to vision or language.