Model selection aims to identify a sufficiently well-performing model that is possibly simpler than the most complex model among a pool of candidates. However, the decision-making process itself can inadvertently introduce non-negligible bias when the cross-validation estimates of predictive performance are marred by excessive noise. In finite data regimes, cross-validated estimates can encourage the statistician to select one model over another when it is not actually better for future data. While this bias remains negligible in the case of few models, when the pool of candidates grows, and model selection decisions are compounded (as in forward search), the expected magnitude of selection-induced bias is likely to grow too. This paper introduces an efficient approach to estimate and correct selection-induced bias based on order statistics. Numerical experiments demonstrate the reliability of our approach in both estimating selection-induced bias and quantifying the degree of over-fitting along compounded model selection decisions, with specific application to forward search. This work represents a lightweight alternative to more computationally expensive approaches to correcting selection-induced bias, such as nested cross-validation and the bootstrap. Our approach rests on several theoretical assumptions, and we provide a diagnostic to help understand when these may not be valid, and when to fall back on safer, albeit more computationally expensive approaches. The accompanying code facilitates its practical implementation and fosters further exploration in this area.
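The growth of selection-induced bias with the number of candidates can be illustrated with a small simulation. The sketch below is not the estimator proposed in the abstract above. It assumes a deliberately simple setting in which all K candidate models have identical true performance and their estimates carry independent Gaussian noise, so that the reported score of the selected model is the maximum of K noisy draws. The function names and the use of Blom's approximation to the expected largest normal order statistic are illustrative choices.

```python
from statistics import NormalDist

import numpy as np


def selection_bias_simulation(n_models=20, noise_sd=1.0, n_reps=20000, seed=0):
    """Monte Carlo estimate of the optimism from picking the best of K noisy estimates.

    Assumption: every candidate has the same true performance (zero), so any
    apparent advantage of the selected model is pure selection-induced bias.
    """
    rng = np.random.default_rng(seed)
    # Each row is one replication: K independent noisy performance estimates.
    estimates = rng.normal(0.0, noise_sd, size=(n_reps, n_models))
    # Select the apparent best model in each replication and average its score.
    return float(estimates.max(axis=1).mean())


def blom_expected_max(n_models, noise_sd=1.0):
    """Blom's approximation to the expected maximum of K iid normal draws."""
    alpha = 0.375
    p = (n_models - alpha) / (n_models - 2 * alpha + 1)
    return noise_sd * NormalDist().inv_cdf(p)


if __name__ == "__main__":
    for k in (1, 2, 5, 20, 100):
        sim = selection_bias_simulation(n_models=k)
        print(f"K={k:3d}  simulated bias={sim:.3f}  Blom approx={blom_expected_max(k):.3f}")
```

In this toy setting the bias is negligible for one or two candidates and grows steadily with K, which is the qualitative behaviour the abstract describes for large candidate pools.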
Most of the literature on causality considers the structural framework of Pearl and the potential-outcome framework of Neyman and Rubin to be formally equivalent, and therefore interchangeably uses the do-notation and the potential-outcome subscript notation to write counterfactual outcomes. In this paper, we superimpose the two causal frameworks to prove that structural counterfactual outcomes and potential outcomes do not coincide in general -- not even in law. More precisely, we express the law of the potential outcomes in terms of the latent structural causal model under the fundamental assumptions of causal inference. This enables us to precisely identify when counterfactual inference is or is not equivalent between approaches, and to clarify the meaning of each kind of counterfactuals.
We develop a provably efficient importance sampling scheme that estimates exit probabilities of solutions to small-noise stochastic reaction-diffusion equations from scaled neighborhoods of a stable equilibrium. The moderate deviation scaling allows for a local approximation of the nonlinear dynamics by their linearized version. In addition, we identify a finite-dimensional subspace where exits take place with high probability. Using stochastic control and variational methods we show that our scheme performs well both in the zero noise limit and pre-asymptotically. Simulation studies for stochastically perturbed bistable dynamics illustrate the theoretical results.
Nonparametric maximum likelihood estimators (MLEs) in inverse problems often have non-normal limit distributions, like Chernoff's distribution. However, if one considers smooth functionals of the model, with corresponding functionals of the MLE, one gets normal limit distributions and faster rates of convergence. We demonstrate this for a model for the incubation time of a disease. The usual approach in the latter models is to use parametric distributions, like Weibull and gamma distributions, which leads to inconsistent estimators. Smoothed bootstrap methods are discussed for constructing confidence intervals. The classical bootstrap, based on the nonparametric MLE itself, has been proved to be inconsistent in this situation.
Recalling the most relevant visual memories for localisation or understanding a priori the likely outcome of localisation effort against a particular visual memory is useful for efficient and robust visual navigation. Solutions to this problem should be divorced from performance appraisal against ground truth - as this is not available at run-time - and should ideally be based on generalisable environmental observations. For this, we propose applying the recently developed Visual DNA as a highly scalable tool for comparing datasets of images - in this work, sequences of map and live experiences. In the case of localisation, important dataset differences impacting performance are modes of appearance change, including weather, lighting, and season. Specifically, for any deep architecture which is used for place recognition by matching feature volumes at a particular layer, we use distribution measures to compare neuron-wise activation statistics between live images and multiple previously recorded past experiences, with a potentially large seasonal (winter/summer) or time of day (day/night) shift. We find that differences in these statistics correlate with performance when localising using a past experience with the same appearance gap. We validate our approach over the Nordland cross-season dataset as well as data from Oxford's University Parks with lighting and mild seasonal change, showing excellent ability of our system to rank actual localisation performance across candidate experiences.
We formulate and test a technique to use Emergent Communication (EC) with a pre-trained multilingual model to improve on modern Unsupervised NMT systems, especially for low-resource languages. It has been argued that the current dominant paradigm in NLP of pre-training on text-only corpora will not yield robust natural language understanding systems, and the need for grounded, goal-oriented, and interactive language learning has been highlighted. In our approach, we embed a multilingual model (mBART, Liu et al., 2020) into an EC image-reference game, in which the model is incentivized to use multilingual generations to accomplish a vision-grounded task. The hypothesis is that this will align multiple languages to a shared task space. We present two variants of EC Fine-Tuning (Steinert-Threlkeld et al., 2022), one of which outperforms a backtranslation-only baseline in all four languages investigated, including the low-resource language Nepali.
Many attempts have been made at estimating discrete emotions (calmness, anxiety, boredom, surprise, anger) and continuous emotional measures commonly used in psychology, namely `valence' (the pleasantness of the emotion being displayed) and `arousal' (the intensity of the emotion being displayed). Existing methods to estimate arousal and valence rely on learning from data sets, where an expert annotator labels every image frame. Access to an expert annotator is not always possible, and the annotation can also be tedious. Hence it is more practical to obtain self-reported arousal and valence values directly from the human in a real-time human-robot collaborative setting. This paper therefore provides an emotion data set (HRI-AVC) obtained while conducting a human-robot interaction (HRI) task. The self-reported pair of labels in this data set is associated with a set of image frames. This paper also proposes a spatial and temporal attention-based network to estimate arousal and valence from this set of image frames. The results show that an attention-based network can estimate valence and arousal on the HRI-AVC data set even when arousal and valence values are unavailable per frame.
Insurers usually turn to generalized linear models for modelling claim frequency and severity data. Due to their success in other fields, machine learning techniques are gaining popularity within the actuarial toolbox. Our paper contributes to the literature on frequency-severity insurance pricing with machine learning via deep learning structures. We present a benchmark study on four insurance data sets with frequency and severity targets in the presence of multiple types of input features. We compare in detail the performance of: a generalized linear model on binned input data, a gradient-boosted tree model, a feed-forward neural network (FFNN), and the combined actuarial neural network (CANN). Our CANNs combine a baseline prediction established with a GLM and GBM, respectively, with a neural network correction. We explain the data preprocessing steps with specific focus on the multiple types of input features typically present in tabular insurance data sets, such as postal codes, numeric and categorical covariates. Autoencoders are used to embed the categorical variables into the neural network and we explore their potential advantages in a frequency-severity setting. Finally, we construct global surrogate models for the neural nets' frequency and severity models. These surrogates enable the translation of the essential insights captured by the FFNNs or CANNs to GLMs. As such, a technical tariff table results that can easily be deployed in practice.
In this paper, the interpolating rational functions introduced by Floater and Hormann are generalized, leading to a whole new family of rational functions depending on $\gamma$, an additional positive integer parameter. For $\gamma = 1$, the original Floater--Hormann interpolants are obtained. When $\gamma>1$ we prove that the new rational functions share many of the nice properties of the original Floater--Hormann functions. Indeed, for any configuration of nodes in a compact interval, they have no real poles, interpolate the given data, preserve the polynomials up to a certain fixed degree, and have a barycentric-type representation. Moreover, we estimate the associated Lebesgue constants in terms of the minimum ($h^*$) and maximum ($h$) distance between two consecutive nodes. It turns out that, in contrast to the original Floater--Hormann interpolants, for all $\gamma > 1$ we get uniformly bounded Lebesgue constants in the case of equidistant and quasi-equidistant node configurations (i.e., when $h\sim h^*$). For such configurations, as the number of nodes tends to infinity, we prove that the new interpolants ($\gamma>1$) uniformly converge to the interpolated function $f$, for any continuous function $f$ and all $\gamma>1$. The same is not ensured by the original FH interpolants ($\gamma=1$). Moreover, we provide uniform and pointwise estimates of the approximation error for functions having different degrees of smoothness. Numerical experiments illustrate the theoretical results and show a better error profile for less smooth functions compared to the original Floater--Hormann interpolants.
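For orientation, the original case $\gamma = 1$ can be sketched in a few lines. The code below implements only the classical Floater--Hormann interpolant in barycentric form, using the standard weight formula $w_k = \sum_{i \in J_k} (-1)^i \prod_{j=i,\, j\neq k}^{i+d} (x_k - x_j)^{-1}$ with blending degree $d$. It does not implement the generalized family with $\gamma > 1$, whose formulas are not given in the abstract. The function names `fh_weights` and `fh_eval` are illustrative.

```python
import numpy as np


def fh_weights(x, d):
    """Barycentric weights of the classical (gamma = 1) Floater-Hormann interpolant.

    x : strictly increasing nodes x_0 < ... < x_n, d : blending degree, 0 <= d <= n.
    """
    n = len(x) - 1
    w = np.zeros(n + 1)
    for k in range(n + 1):
        # J_k: indices i of the local degree-d polynomials whose stencil contains x_k.
        for i in range(max(0, k - d), min(k, n - d) + 1):
            prod = 1.0
            for j in range(i, i + d + 1):
                if j != k:
                    prod /= x[k] - x[j]
            w[k] += (-1) ** i * prod
    return w


def fh_eval(x, f, w, t):
    """Evaluate the barycentric rational interpolant at the points t."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    out = np.empty_like(t)
    for m, tm in enumerate(t):
        diff = tm - x
        hit = np.flatnonzero(diff == 0.0)
        if hit.size:
            # Evaluation point coincides with a node: return the data value.
            out[m] = f[hit[0]]
        else:
            c = w / diff
            out[m] = np.dot(c, f) / np.sum(c)
    return out


if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 11)
    f = 1.0 / (1.0 + 25.0 * x**2)  # Runge function sampled at equidistant nodes
    w = fh_weights(x, d=3)
    t = np.linspace(-1.0, 1.0, 201)
    err = np.max(np.abs(fh_eval(x, f, w, t) - 1.0 / (1.0 + 25.0 * t**2)))
    print(f"max error on [-1, 1]: {err:.3e}")
```

With $d = 0$ and equidistant nodes the weights reduce to $(-1)^k$, and for any $d$ the interpolant reproduces polynomials of degree at most $d$, which gives two quick sanity checks on the implementation.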
Indirect reciprocity is a mechanism that explains large-scale cooperation in human societies. In indirect reciprocity, an individual chooses whether or not to cooperate with another based on reputation information, and others evaluate the action as good or bad. Under what evaluation rule (called ``social norm'') cooperation evolves has long been of central interest in the literature. It has been reported that if individuals can share their evaluations (i.e., public reputation), social norms called ``leading eight'' can be evolutionarily stable. On the other hand, when they cannot share their evaluations (i.e., private assessment), the evolutionary stability of cooperation is still in question. To tackle this problem, we create a novel method to analyze the reputation structure in the population under private assessment. Specifically, we characterize each individual by two variables, ``goodness'' (what proportion of the population considers the individual as good) and ``self-reputation'' (whether an individual thinks of him/herself as good or bad), and analyze the stochastic process of how these two variables change over time. We discuss the evolutionary stability of each of the leading eight social norms by studying the robustness against invasions of unconditional cooperators and defectors. We identify key pivots in those social norms for establishing a high level of cooperation or stable cooperation against mutants. Our findings give insight into how human cooperation is established in a real-world society.
We develop randomized matrix-free algorithms for estimating partial traces. Our algorithm improves on the typicality-based approach used in [T. Chen and Y-C. Cheng, Numerical computation of the equilibrium-reduced density matrix for strongly coupled open quantum systems, J. Chem. Phys. 157, 064106 (2022)] by deflating important subspaces (e.g. corresponding to the low-energy eigenstates) explicitly. This results in a significant variance reduction for matrices with quickly decaying singular values. We then apply our algorithm to study the thermodynamics of several Heisenberg spin systems, particularly the entanglement spectrum and ergotropy.
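The basic idea of a randomized, matrix-free partial trace estimator can be sketched as follows. This is a plain Hutchinson-style estimator and not the deflated algorithm of the abstract above: it assumes a real matrix $A$ acting on a bipartite space of dimension $d_a d_b$, accessed only through products with blocks of vectors, and it uses Rademacher probes $v$ on the second subsystem so that $(I \otimes v)^T A (I \otimes v)$ is an unbiased estimate of $\mathrm{tr}_b(A)$. The function names and the index ordering (that of `np.kron`) are illustrative assumptions.

```python
import numpy as np


def partial_trace_exact(A, da, db):
    """Reference partial trace over the second subsystem (dense A, for checking only)."""
    return np.einsum("ikjk->ij", A.reshape(da, db, da, db))


def partial_trace_estimate(matmat, da, db, n_samples, rng):
    """Hutchinson-style estimate of tr_b(A) using only products V -> A @ V.

    matmat : callable applying A to a (da*db) x da block of vectors.
    Assumes real A; probes are Rademacher vectors on the traced-out subsystem.
    """
    est = np.zeros((da, da))
    eye_a = np.eye(da)
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=db)
        # Columns of V are e_j (x) v, i.e. V = I_a (x) v.
        V = np.kron(eye_a, v[:, None])
        est += V.T @ matmat(V)
    return est / n_samples


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    da, db = 2, 4
    M = rng.normal(size=(da * db, da * db))
    A = (M + M.T) / 2.0
    approx = partial_trace_estimate(lambda V: A @ V, da, db, 20000, rng)
    exact = partial_trace_exact(A, da, db)
    print("max abs error:", np.max(np.abs(approx - exact)))
```

The variance of this plain estimator is driven by the off-diagonal blocks of $A$ with respect to the traced-out subsystem, which is the part a deflation step of the kind described in the abstract would aim to reduce.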