A theoretical expression is derived for the mean squared error of a nonparametric estimator of the tail dependence coefficient, depending on a threshold that defines which rank delimits the tails of a distribution. We propose a new method to optimally select this threshold. It combines the theoretical mean squared error of the estimator with a parametric estimation of the copula linking observations in the tails. Using simulations, we compare this semiparametric method with other approaches proposed in the literature, including the plateau-finding algorithm.
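For concreteness, a common rank-based form of such a nonparametric estimator can be sketched as follows. This is a minimal illustration (the function name and rank convention are ours, not taken from the abstract); the threshold rank `k` it takes as input is exactly the quantity whose selection the abstract addresses.

```python
import numpy as np

def upper_tail_dependence(x, y, k):
    """Empirical upper tail dependence coefficient at threshold rank k:
    the fraction of the k largest x-observations whose paired y-value
    is also among the k largest y-observations."""
    n = len(x)
    rx = np.argsort(np.argsort(x))  # ranks 0 .. n-1
    ry = np.argsort(np.argsort(y))
    joint = np.sum((rx >= n - k) & (ry >= n - k))
    return joint / k
```

For comonotone data the estimate is 1, for countermonotone data 0; for intermediate dependence the estimate varies with `k`, which is why the choice of threshold matters.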
Iterative refinement (IR) is a popular scheme for solving a linear system of equations based on gradually improving the accuracy of an initial approximation. Originally developed to improve upon the accuracy of Gaussian elimination, interest in IR has been revived because of its suitability for execution on fast low-precision hardware such as analog devices and graphics processing units. IR generally converges when the error associated with the solution method is small, but is known to diverge when this error is large. We propose and analyze a novel enhancement to the IR algorithm by adding a line search optimization step that guarantees the algorithm will not diverge. Numerical experiments verify our theoretical results and illustrate the effectiveness of our proposed scheme.
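The safeguard described above can be illustrated in a few lines. This is a generic sketch of iterative refinement with an exact residual line search, not the authors' precise algorithm; `solve_approx` stands in for any low-accuracy solver (e.g., a low-precision factorization).

```python
import numpy as np

def iterative_refinement_ls(A, b, solve_approx, tol=1e-12, max_iter=50):
    """Iterative refinement with a line search safeguard (sketch).
    solve_approx(r) returns a possibly low-accuracy solution of A d = r."""
    x = np.zeros_like(b)
    for _ in range(max_iter):
        r = b - A @ x                    # residual in working precision
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        d = solve_approx(r)              # inexact correction step
        Ad = A @ d
        denom = Ad @ Ad
        # Exact line search minimizing ||b - A(x + alpha*d)||_2.
        # With this alpha the residual norm can never increase,
        # so the iteration cannot diverge even if d is a poor correction.
        alpha = (r @ Ad) / denom if denom > 0 else 0.0
        x = x + alpha * d
    return x
```

Plain IR corresponds to fixing `alpha = 1`; the optimal step reduces the squared residual by `(r @ Ad)**2 / (Ad @ Ad)`, which is nonnegative by construction.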
In real life, success is often contingent upon multiple critical steps that are distant in time from each other and from the final reward. These critical steps are challenging to identify with traditional reinforcement learning (RL) methods that rely on the Bellman equation for credit assignment. Here, we present a new RL algorithm that uses offline contrastive learning to home in on critical steps. This algorithm, which we call contrastive introspection (ConSpec), can be added to any existing RL algorithm. ConSpec learns a set of prototypes for the critical steps in a task via a novel contrastive loss and delivers an intrinsic reward when the current state matches one of these prototypes. The prototypes in ConSpec provide two key benefits for credit assignment: (1) They enable rapid identification of all the critical steps. (2) They do so in a readily interpretable manner, enabling out-of-distribution generalization when sensory features are altered. Distinct from other contemporary RL approaches to credit assignment, ConSpec takes advantage of the fact that it is easier to retrospectively identify the small set of steps that success is contingent upon than it is to prospectively predict reward at every step taken in the environment. Altogether, ConSpec improves learning in a diverse set of RL tasks, including both those with explicit, discrete critical steps and those with complex, continuous critical steps.
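The prototype-matching intrinsic reward can be sketched schematically. This is an illustrative simplification (cosine similarity against learned prototypes, with a hypothetical threshold), not the paper's exact formulation of the contrastive loss or reward.

```python
import numpy as np

def intrinsic_reward(state_embedding, prototypes, threshold=0.6):
    """Deliver an intrinsic reward when the current state's embedding
    is sufficiently similar to any learned critical-step prototype.
    Sketch only: similarity measure and threshold are illustrative."""
    s = state_embedding / np.linalg.norm(state_embedding)
    P = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = P @ s                     # cosine similarity to each prototype
    return 1.0 if np.max(sims) >= threshold else 0.0
```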
In machine learning models, error estimation is often complicated by distribution bias, particularly for spatial data such as those found in environmental studies. We introduce an approach based on the ideas of importance sampling to obtain an unbiased estimate of the target error. By taking into account the difference between the target distribution and the available data, our method reweights the errors at each sample point and neutralizes the shift. The importance sampling weights are computed via kernel density estimation. We validate the effectiveness of our approach using artificial data that resemble real-world spatial datasets. Our findings demonstrate the advantages of the proposed approach for estimating the target error, offering a solution to the distribution shift problem. The overall prediction error dropped from 7% to just 2%, and it decreases further for larger samples.
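The reweighting idea can be sketched in one dimension. This is a minimal illustration under simplifying assumptions (1-D locations, a fixed Gaussian kernel bandwidth, and illustrative function names), not the paper's full method.

```python
import numpy as np

def gaussian_kde_1d(points, bandwidth):
    """Return a Gaussian kernel density estimate built from 1-D sample `points`."""
    def f(t):
        t = np.atleast_1d(t)
        z = (t[:, None] - points[None, :]) / bandwidth
        return np.exp(-0.5 * z**2).mean(axis=1) / (bandwidth * np.sqrt(2 * np.pi))
    return f

def reweighted_error(errors, train_locs, target_locs, bandwidth=0.5):
    """Importance-sampling estimate of the error under the target
    distribution: each per-sample error is weighted by the density
    ratio f_target / f_train at that sample's location."""
    f_train = gaussian_kde_1d(train_locs, bandwidth)
    f_target = gaussian_kde_1d(target_locs, bandwidth)
    w = f_target(train_locs) / f_train(train_locs)
    w = w / w.sum()                  # self-normalized importance weights
    return float(np.sum(w * errors))
```

When the training and target distributions coincide, the weights are uniform and the estimate reduces to the ordinary mean error; when the target concentrates where errors are large, the estimate rises accordingly.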
We are interested in numerical algorithms for computing the electrical field generated by a charge distribution localized on scale $l$ in an infinite heterogeneous correlated random medium, in a situation where the medium is only known in a box of diameter $L\gg l$ around the support of the charge. We show that the algorithm of Lu, Otto and Wang, suggesting optimal Dirichlet boundary conditions motivated by the multipole expansion of Bella, Giunti and Otto, still performs well in correlated media. With overwhelming probability, we obtain a convergence rate in terms of $l$, $L$ and the size of the correlations for which optimality is supported by numerical simulations. These estimates are provided for ensembles which satisfy a multi-scale logarithmic Sobolev inequality, where our main tool is an extension of the semi-group estimates established by the first author. As part of our strategy, we construct sub-linear second-order correctors in this correlated setting, a construction of independent interest.
This paper presents a numerical method for the simulation of elastic solid materials coupled to fluid inclusions. The application is motivated by the modeling of vascularized tissues and by problems in medical imaging which target the estimation of effective (i.e., macroscale) material properties, taking into account the influence of microscale dynamics, such as fluid flow in the microvasculature. The method is based on the recently proposed Reduced Lagrange Multipliers framework. In particular, the interface between solid and fluid domains is not resolved within the computational mesh for the elastic material but discretized independently, imposing the coupling condition via non-matching Lagrange multipliers. Exploiting the multiscale properties of the problem, the resulting Lagrange multipliers space is reduced to a lower-dimensional characteristic set. We present the details of the stability analysis of the resulting method considering a non-standard boundary condition that enforces a local deformation on the solid-fluid boundary. The method is validated with several numerical examples.
Robustness to distribution shift and fairness have independently emerged as two important desiderata required of modern machine learning models. While these two desiderata seem related, the connection between them is often unclear in practice. Here, we discuss these connections through a causal lens, focusing on anti-causal prediction tasks, where the input to a classifier (e.g., an image) is assumed to be generated as a function of the target label and the protected attribute. By taking this perspective, we draw explicit connections between a common fairness criterion - separation - and a common notion of robustness - risk invariance. These connections provide new motivation for applying the separation criterion in anti-causal settings, and inform old discussions regarding fairness-performance tradeoffs. In addition, our findings suggest that robustness-motivated approaches can be used to enforce separation, and that they often work better in practice than methods designed to directly enforce separation. Using a medical dataset, we empirically validate our findings on the task of detecting pneumonia from X-rays, in a setting where differences in prevalence across sex groups motivate a fairness mitigation. Our findings highlight the importance of considering causal structure when choosing and enforcing fairness criteria.
Confounder selection, namely choosing a set of covariates to control for confounding between a treatment and an outcome, is arguably the most important step in the design of observational studies. Previous methods, such as Pearl's celebrated back-door criterion, typically require pre-specifying a causal graph, which can often be difficult in practice. We propose an interactive procedure for confounder selection that does not require pre-specifying the graph or the set of observed variables. This procedure iteratively expands the causal graph by finding what we call "primary adjustment sets" for a pair of possibly confounded variables. This can be viewed as inverting a sequence of latent projections of the underlying causal graph. Structural information in the form of primary adjustment sets is elicited from the user, bit by bit, until either a set of covariates is found to control for confounding or it can be determined that no such set exists. We show that if the user correctly specifies the primary adjustment sets in every step, our procedure is both sound and complete.
We consider a one-dimensional nonlocal hyperbolic model introduced to describe the formation and movement of self-organizing collectives of animals in homogeneous 1D environments. Previous research has shown that this model exhibits a large number of complex spatial and spatiotemporal aggregation patterns, as evidenced by numerical simulations and weakly nonlinear analysis. In this study, we focus on a particular type of localised patterns with odd/even/no symmetries (which are usually part of snaking solution branches with different symmetries that form complex bifurcation structures called snake-and-ladder bifurcations). To numerically investigate the bifurcating solution branches (to eventually construct the full bifurcating structures), we first need to understand the numerical issues that could appear when using different numerical schemes. To this end, in this study, we consider ten different numerical schemes (the upwind scheme, the MacCormack scheme, the Fractional-Step method, and the Quasi-Steady Wave-Propagation algorithm, combined with high-resolution methods), while paying attention to the preservation of the solution symmetries with all these schemes. We highlight several numerical issues: first, we observe the presence of two distinct types of numerical solutions (with different symmetries) that exhibit very small errors; second, in some cases, none of the investigated numerical schemes converge, posing a challenge for the development of numerical continuation algorithms for nonlocal hyperbolic systems; lastly, the choice of the numerical schemes, as well as their corresponding parameters such as time-space steps, exerts a significant influence on the type and symmetry of bifurcating solutions.
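As a point of reference, the simplest of the base schemes listed above, the first-order upwind scheme, can be written for the linear advection equation $u_t + a u_x = 0$ with periodic boundaries. This is a generic textbook sketch, not the scheme as applied to the nonlocal model of the abstract.

```python
import numpy as np

def upwind_step(u, a, dt, dx):
    """One first-order upwind step for u_t + a*u_x = 0, periodic boundaries.
    Differences are taken from the side the information comes from,
    which is what makes the scheme stable under a CFL condition."""
    c = a * dt / dx                  # Courant number; stability needs |c| <= 1
    if a >= 0:
        return u - c * (u - np.roll(u, 1))    # backward difference
    return u - c * (np.roll(u, -1) - u)       # forward difference
```

With periodic boundaries the scheme is conservative (the sum of `u` is preserved exactly), a property worth checking when comparing schemes, alongside the symmetry preservation the abstract emphasizes.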
Neural networks are high-dimensional nonlinear dynamical systems that process information through the coordinated activity of many connected units. Understanding how biological and machine-learning networks function and learn requires knowledge of the structure of this coordinated activity, information contained, for example, in cross covariances between units. Self-consistent dynamical mean field theory (DMFT) has elucidated several features of random neural networks -- in particular, that they can generate chaotic activity -- however, a calculation of cross covariances using this approach has not been provided. Here, we calculate cross covariances self-consistently via a two-site cavity DMFT. We use this theory to probe spatiotemporal features of activity coordination in a classic random-network model with independent and identically distributed (i.i.d.) couplings, showing an extensive but fractionally low effective dimension of activity and a long population-level timescale. Our formulae apply to a wide range of single-unit dynamics and generalize to non-i.i.d. couplings. As an example of the latter, we analyze the case of partially symmetric couplings.
We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain in accuracy when the model has access to it in addition to another modality. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
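The definition above can be stated in a few lines. This is a minimal sketch assuming accuracies have already been measured for the single-modality and joint models; the function and argument names are illustrative.

```python
def conditional_utilization_rates(acc_m1, acc_m2, acc_both):
    """Conditional utilization rates for two modalities:
    u(m1 | m2) is the accuracy gained by giving the model modality m1
    in addition to m2, and vice versa. A large gap between the two
    rates signals that the model relies mostly on one modality."""
    u1_given_2 = acc_both - acc_m2   # gain from adding modality 1
    u2_given_1 = acc_both - acc_m1   # gain from adding modality 2
    return u1_given_2, u2_given_1
```

For example, accuracies of 0.70 (modality 1 alone), 0.90 (modality 2 alone), and 0.92 (both) give rates of 0.02 and 0.22: the model extracts little extra from modality 1, the imbalance the abstract describes.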