Mediation analysis aims to identify and estimate the effect of an exposure on an outcome that is mediated through one or more intermediate variables. In the presence of multiple intermediate variables, two pertinent methodological questions arise: estimating mediated effects when mediators are correlated, and performing high-dimensional mediation analysis when the number of mediators exceeds the sample size. This paper presents a two-step procedure for high-dimensional mediation analysis. The first step selects a reduced number of candidate mediators using an ad-hoc lasso penalty. The second step applies a procedure we previously developed to estimate the mediated and direct effects, accounting for the correlation structure among the retained candidate mediators. We compare the performance of the proposed two-step procedure with state-of-the-art methods using simulated data. Additionally, we demonstrate its practical application by estimating the causal role of DNA methylation in the pathway between smoking and rheumatoid arthritis using real data.
Sequential neural posterior estimation (SNPE) techniques have been recently proposed for dealing with simulation-based models with intractable likelihoods. Unlike approximate Bayesian computation, SNPE techniques learn the posterior from sequential simulation using neural network-based conditional density estimators by minimizing a specific loss function. The SNPE method proposed by Lueckmann et al. (2017) used a calibration kernel to boost the sample weights around the observed data, resulting in a concentrated loss function. However, the use of calibration kernels may increase the variances of both the empirical loss and its gradient, making the training inefficient. To improve the stability of SNPE, this paper proposes to use an adaptive calibration kernel and several variance reduction techniques. The proposed method greatly speeds up the process of training and provides a better approximation of the posterior than the original SNPE method and some existing competitors as confirmed by numerical experiments. We also manage to demonstrate the superiority of the proposed method for a high-dimensional model with real-world dataset.
Statistical learning under distribution shift is challenging when neither prior knowledge nor fully accessible data from the target distribution is available. Distributionally robust learning (DRL) aims to control the worst-case statistical performance within an uncertainty set of candidate distributions, but how to properly specify the set remains challenging. To enable distributional robustness without being overly conservative, in this paper, we propose a shape-constrained approach to DRL, which incorporates prior information about the way in which the unknown target distribution differs from its estimate. More specifically, we assume the unknown density ratio between the target distribution and its estimate is isotonic with respect to some partial order. At the population level, we provide a solution to the shape-constrained optimization problem that does not involve the isotonic constraint. At the sample level, we provide consistency results for an empirical estimator of the target in a range of different settings. Empirical studies on both synthetic and real data examples demonstrate the improved accuracy of the proposed shape-constrained approach.
This study presents a scalable Bayesian estimation algorithm for sparse estimation in exploratory item factor analysis based on a classical Bayesian estimation method, namely Bayesian joint modal estimation (BJME). BJME estimates the model parameters and factor scores that maximize the complete-data joint posterior density. Simulation studies show that the proposed algorithm has high computational efficiency and accuracy in variable selection over latent factors and the recovery of the model parameters. Moreover, we conducted a real data analysis using large-scale data from a psychological assessment that targeted the Big Five personality traits. This result indicates that the proposed algorithm achieves computationally efficient parameter estimation and extracts the interpretable factor loading structure.
We propose a novel, highly efficient, second-order accurate, long-time unconditionally stable numerical scheme for a class of finite-dimensional nonlinear models that are of importance in geophysical fluid dynamics. The scheme is highly efficient in the sense that only a (fixed) symmetric positive definite linear problem (with varying right hand sides) is involved at each time-step. The solutions to the scheme are uniformly bounded for all time. We show that the scheme is able to capture the long-time dynamics of the underlying geophysical model, with the global attractors as well as the invariant measures of the scheme converge to those of the original model as the step size approaches zero. In our numerical experiments, we take an indirect approach, using long-term statistics to approximate the invariant measures. Our results suggest that the convergence rate of the long-term statistics, as a function of terminal time, is approximately first order using the Jensen-Shannon metric and half-order using the L1 metric. This implies that very long time simulation is needed in order to capture a few significant digits of long time statistics (climate) correct. Nevertheless, the second order scheme's performance remains superior to that of the first order one, requiring significantly less time to reach a small neighborhood of statistical equilibrium for a given step size.
Gate-defined quantum dots are a promising candidate system for realizing scalable, coupled qubit systems and serving as a fundamental building block for quantum computers. However, present-day quantum dot devices suffer from imperfections that must be accounted for, which hinders the characterization, tuning, and operation process. Moreover, with an increasing number of quantum dot qubits, the relevant parameter space grows sufficiently to make heuristic control infeasible. Thus, it is imperative that reliable and scalable autonomous tuning approaches are developed. This meeting report outlines current challenges in automating quantum dot device tuning and operation with a particular focus on datasets, benchmarking, and standardization. We also present insights and ideas put forward by the quantum dot community on how to overcome them. We aim to provide guidance and inspiration to researchers invested in automation efforts.
Dealing with uncertainty in optimization parameters is an important and longstanding challenge. Typically, uncertain parameters are predicted accurately, and then a deterministic optimization problem is solved. However, the decisions produced by this so-called \emph{predict-then-optimize} procedure can be highly sensitive to uncertain parameters. In this work, we contribute to recent efforts in producing \emph{decision-focused} predictions, i.e., to build predictive models that are constructed with the goal of minimizing a \emph{regret} measure on the decisions taken with them. We begin by formulating the exact expected regret minimization as a pessimistic bilevel optimization model. Then, we establish NP-completeness of this problem, even in a heavily restricted case. Using duality arguments, we reformulate it as a non-convex quadratic optimization problem. Finally, we show various computational techniques to achieve tractability. We report extensive computational results on shortest-path instances with uncertain cost vectors. Our results indicate that our approach can improve training performance over the approach of Elmachtoub and Grigas (2022), a state-of-the-art method for decision-focused learning.
In this paper we build a joint model which can accommodate for binary, ordinal and continuous responses, by assuming that the errors of the continuous variables and the errors underlying the ordinal and binary outcomes follow a multivariate normal distribution. We employ composite likelihood methods to estimate the model parameters and use composite likelihood inference for model comparison and uncertainty quantification. The complimentary R package mvordnorm implements estimation of this model using composite likelihood methods and is available for download from Github. We present two use-cases in the area of risk management to illustrate our approach.
We consider random matrix ensembles on the set of Hermitian matrices that are heavy tailed, in particular not all moments exist, and that are invariant under the conjugate action of the unitary group. The latter property entails that the eigenvectors are Haar distributed and, therefore, factorise from the eigenvalue statistics. We prove a classification for stable matrix ensembles of this kind of matrices represented in terms of matrices, their eigenvalues and their diagonal entries with the help of the classification of the multivariate stable distributions and the harmonic analysis on symmetric matrix spaces. Moreover, we identify sufficient and necessary conditions for their domains of attraction. To illustrate our findings we discuss for instance elliptical invariant random matrix ensembles and P\'olya ensembles, the latter playing a particular role in matrix convolutions. As a byproduct we generalise the derivative principle on the Hermitian matrices to general tempered distributions. This principle relates the joint probability density of the eigenvalues and the diagonal entries of the random matrix.
Coboundary expansion is a high dimensional generalization of the Cheeger constant to simplicial complexes. Originally, this notion was motivated by the fact that it implies topological expansion, but nowadays a significant part of the motivation stems from its deep connection to problems in theoretical computer science such as agreement expansion in the low soundness regime. In this paper, we prove coboundary expansion with non-Abelian coefficients for the coset complex construction of Kaufman and Oppenheim. Our proof uses a novel global argument, as opposed to the local-to-global arguments that are used to prove cosystolic expansion.
Transformers can efficiently learn in-context from example demonstrations. Most existing theoretical analyses studied the in-context learning (ICL) ability of transformers for linear function classes, where it is typically shown that the minimizer of the pretraining loss implements one gradient descent step on the least squares objective. However, this simplified linear setting arguably does not demonstrate the statistical efficiency of ICL, since the pretrained transformer does not outperform directly solving linear regression on the test prompt. In this paper, we study ICL of a nonlinear function class via transformer with nonlinear MLP layer: given a class of \textit{single-index} target functions $f_*(\boldsymbol{x}) = \sigma_*(\langle\boldsymbol{x},\boldsymbol{\beta}\rangle)$, where the index features $\boldsymbol{\beta}\in\mathbb{R}^d$ are drawn from a $r$-dimensional subspace, we show that a nonlinear transformer optimized by gradient descent (with a pretraining sample complexity that depends on the \textit{information exponent} of the link functions $\sigma_*$) learns $f_*$ in-context with a prompt length that only depends on the dimension of the distribution of target functions $r$; in contrast, any algorithm that directly learns $f_*$ on test prompt yields a statistical complexity that scales with the ambient dimension $d$. Our result highlights the adaptivity of the pretrained transformer to low-dimensional structures of the function class, which enables sample-efficient ICL that outperforms estimators that only have access to the in-context data.