General nonlinear sieve learning refers to classes of nonlinear sieves that can approximate nonlinear functions of high-dimensional variables much more flexibly than various linear sieves (or series). This paper considers general nonlinear sieve quasi-likelihood ratio (GN-QLR) based inference on expectation functionals of time series data, where the functionals of interest are based on some nonparametric function that satisfies conditional moment restrictions and is learned using multilayer neural networks. While the asymptotic normality of the estimated functionals depends on some unknown Riesz representer of the functional space, we show that the optimally weighted GN-QLR statistic is asymptotically chi-square distributed, regardless of whether the expectation functional is regular (root-$n$ estimable) or not. This holds when the data are weakly dependent and satisfy a beta-mixing condition. We apply our method to off-policy evaluation in reinforcement learning by formulating the Bellman equation within the conditional moment restriction framework, so that we can make inference about the state-specific value functional using the proposed GN-QLR method with time series data. In addition, estimation of averaged partial means and averaged partial derivatives in nonparametric instrumental variables and quantile IV models is also presented as a leading example. Finally, a Monte Carlo study shows the finite sample performance of the procedure.
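To make the conditional moment formulation concrete, the following minimal sketch casts the Bellman equation as the restriction $E[R + \gamma V(S') - V(S) \mid S] = 0$ and solves it with a simple polynomial sieve. The polynomial basis, the simulated transitions, and the LSTD-style normal equations are illustrative assumptions standing in for the multilayer neural network sieve and the GN-QLR machinery of the paper.

```python
# A minimal sketch of casting the Bellman equation as a conditional moment
# restriction E[R + gamma * V(S') - V(S) | S] = 0 and solving it with a simple
# polynomial sieve V(s) = phi(s)' theta.  The basis, the toy transitions, and
# the LSTD-style normal equations are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)
n, gamma = 2000, 0.9
S = rng.uniform(-1, 1, n)                                   # current states
S_next = np.clip(0.8 * S + 0.1 * rng.standard_normal(n), -1, 1)
R = np.cos(S) + 0.1 * rng.standard_normal(n)                # rewards

def phi(s, degree=4):
    """Polynomial sieve basis phi(s) = (1, s, ..., s^degree)."""
    return np.vander(s, degree + 1, increasing=True)

Phi, Phi_next = phi(S), phi(S_next)
# Empirical moments: (1/n) sum_t phi(S_t) * (R_t + gamma*V(S'_t) - V(S_t)) = 0
A = Phi.T @ (Phi - gamma * Phi_next) / n
b = Phi.T @ R / n
theta = np.linalg.solve(A, b)
print("estimated value at s = 0:", phi(np.array([0.0])) @ theta)
```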
Incomplete covariate vectors are known to be problematic for estimation and inferences on model parameters, but their impact on prediction performance is less understood. We develop an imputation-free method that builds on a random partition model admitting variable-dimension covariates. Cluster-specific response models further incorporate covariates via linear predictors, facilitating estimation of smooth prediction surfaces with relatively few clusters. We exploit marginalization techniques of Gaussian kernels to analytically project response distributions according to any pattern of missing covariates, yielding a local regression with internally consistent uncertainty propagation that utilizes only one set of coefficients per cluster. Aggressive shrinkage of these coefficients regulates uncertainty due to missing covariates. The method allows in- and out-of-sample prediction for any missingness pattern, even if the pattern in a new subject's incomplete covariate vector was not seen in the training data. We develop an MCMC algorithm for posterior sampling that improves a computationally expensive update for latent cluster allocation. Finally, we demonstrate the model's effectiveness for nonlinear point and density prediction under various circumstances by comparing it with other recent methods for regression with variable-dimension covariates on synthetic and real data.
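The Gaussian-marginalization step can be illustrated with a small sketch: within one cluster, a joint Gaussian over (covariates, response) is conditioned analytically on whichever covariates happen to be observed. The joint mean and covariance below are illustrative placeholders, not fitted cluster parameters from the random partition model.

```python
# A minimal sketch (not the full model) of how a joint Gaussian kernel over
# (x1, x2, x3, y) within one cluster can be marginalized analytically over any
# pattern of missing covariates, yielding a local prediction that uses only the
# observed entries.  The mean and covariance are illustrative placeholders.
import numpy as np

mu = np.array([0.0, 1.0, -0.5, 2.0])          # joint mean of (x1, x2, x3, y)
Sigma = np.array([[1.0, 0.3, 0.2, 0.5],
                  [0.3, 1.0, 0.1, 0.4],
                  [0.2, 0.1, 1.0, 0.2],
                  [0.5, 0.4, 0.2, 1.5]])

def predict_y(x_obs, obs_idx, mu=mu, Sigma=Sigma, y_idx=3):
    """Conditional mean/variance of y given only the observed covariates."""
    o = np.asarray(obs_idx)
    S_oo = Sigma[np.ix_(o, o)]
    S_yo = Sigma[y_idx, o]
    mean = mu[y_idx] + S_yo @ np.linalg.solve(S_oo, x_obs - mu[o])
    var = Sigma[y_idx, y_idx] - S_yo @ np.linalg.solve(S_oo, S_yo)
    return mean, var

# The same cluster handles any missingness pattern, e.g. only x2 observed:
print(predict_y(np.array([1.2]), obs_idx=[1]))
# or x1 and x3 observed:
print(predict_y(np.array([0.5, -0.1]), obs_idx=[0, 2]))
```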
The Markov Decision Process (MDP) provides a mathematical framework for formulating the learning process of agents in reinforcement learning. The MDP is limited by the Markovian assumption that a reward depends only on the immediate state and action. However, a reward sometimes depends on the history of states and actions, which places the decision process in a non-Markovian environment. In such environments, agents receive rewards only sparsely, for temporally extended behaviors, and the learned policies tend to be similar. Agents trained to similar policies generally overfit to the given task and cannot quickly adapt to perturbations of the environment. To resolve this problem, this paper learns diverse policies from the history of state-action pairs in a non-Markovian environment, using a policy dispersion scheme designed to seek diverse policy representations. Specifically, we first adopt a transformer-based method to learn policy embeddings. Then, we stack the policy embeddings to construct a dispersion matrix that induces a set of diverse policies. Finally, we prove that if the dispersion matrix is positive definite, the dispersed embeddings can effectively enlarge the disagreement across policies, yielding a diverse expression of the original policy embedding distribution. Experimental results show that this dispersion scheme obtains more expressive, diverse policies, which in turn deliver more robust performance than recent learning baselines in various environments.
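A minimal sketch of the dispersion step, under illustrative assumptions: policy embeddings (here random placeholders rather than transformer outputs) are stacked into a matrix, a Gram-type dispersion matrix is formed, and positive definiteness is checked via its smallest eigenvalue.

```python
# A minimal sketch of the dispersion idea: stack policy embeddings, form a
# Gram-type dispersion matrix, and check positive definiteness through the
# smallest eigenvalue.  The random embeddings stand in for transformer outputs;
# the exact dispersion objective of the paper is not reproduced here.
import numpy as np

rng = np.random.default_rng(1)
K, d = 8, 16                            # number of policies, embedding dimension
E = rng.standard_normal((K, d))         # stacked policy embeddings (one row each)

D = E @ E.T / d                         # K x K dispersion matrix

min_eig = np.linalg.eigvalsh(D).min()
print("smallest eigenvalue:", min_eig)
# A strictly positive smallest eigenvalue indicates the embeddings are not
# linearly degenerate, i.e. the policies disagree in embedding space.
```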
We consider random sample splitting for estimation and inference in high-dimensional generalized linear models, where we first apply the lasso to select a submodel using one subsample and then apply the debiased lasso to fit the selected model using the remaining subsample. We show that, regardless of whether a prespecified subset of regression coefficients is included, the debiased lasso estimator of the selected submodel after a single split is asymptotically normal. Furthermore, for a set of prespecified regression coefficients, we show that a multiple splitting procedure based on the debiased lasso can address the loss of efficiency associated with sample splitting and produce asymptotically normal estimates under mild conditions. Our simulation results indicate that using the debiased lasso instead of the standard maximum likelihood estimator in the estimation stage can vastly reduce the bias and variance of the resulting estimates. We illustrate the proposed multiple splitting debiased lasso method with an analysis of the smoking data from the Mid-South Tobacco Case-Control Study.
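The splitting scheme can be sketched as follows, with an essentially unpenalized logistic refit standing in for the debiased lasso and a simulated design as an illustrative assumption: the lasso selects a submodel on one half of the data, and the selected model is refitted on the other half.

```python
# A minimal sketch of the splitting scheme: the lasso selects a submodel on one
# half of the data and the selected model is refitted on the other half.  The
# weakly penalized logistic refit stands in for the debiased lasso used in the
# paper; the simulated design is an illustrative assumption.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegressionCV, LogisticRegression

rng = np.random.default_rng(2)
n, p = 400, 200
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:5] = 1.0
y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta)))

X1, X2, y1, y2 = train_test_split(X, y, test_size=0.5, random_state=0)

# Stage 1: lasso-penalized logistic regression on the first subsample selects variables.
sel = LogisticRegressionCV(penalty="l1", solver="liblinear", Cs=10).fit(X1, y1)
support = np.flatnonzero(sel.coef_.ravel() != 0)

# Stage 2: refit the selected submodel on the held-out subsample
# (large C = essentially unpenalized, a stand-in for the debiased lasso).
refit = LogisticRegression(C=1e6).fit(X2[:, support], y2)
print("selected variables:", support)
print("refitted coefficients:", refit.coef_.ravel())
```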
We study policy evaluation of offline contextual bandits subject to unobserved confounders. Sensitivity analysis methods are commonly used to estimate the policy value under the worst-case confounding over a given uncertainty set. However, existing work often resorts to some coarse relaxation of the uncertainty set for the sake of tractability, leading to overly conservative estimation of the policy value. In this paper, we propose a general estimator that provides a sharp lower bound of the policy value. It can be shown that our estimator contains the recently proposed sharp estimator by Dorn and Guo (2022) as a special case, and our method enables a novel extension of the classical marginal sensitivity model using f-divergence. To construct our estimator, we leverage the kernel method to obtain a tractable approximation to the conditional moment constraints, which traditional non-sharp estimators fail to take into account. In the theoretical analysis, we provide a condition on the choice of the kernel that guarantees no specification error biases the lower-bound estimate. Furthermore, we provide consistency guarantees for policy evaluation and learning. In experiments with synthetic and real-world data, we demonstrate the effectiveness of the proposed method.
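The kernel device can be illustrated generically: a conditional moment constraint $E[\rho \mid Z] = 0$ is approximated by a finite collection of unconditional moments obtained from kernel test functions centered at the data points. The residual, the RBF kernel, and the toy data below are illustrative assumptions, not the sharp-bound estimator of the paper.

```python
# A generic sketch of the kernel device: a conditional moment constraint
# E[rho | Z] = 0 is approximated by the unconditional moments
# (1/n) sum_i k(z_j, Z_i) * rho_i = 0, one per kernel test function k(., z_j).
# The residual rho, the RBF kernel, and the toy data are illustrative
# assumptions, not the paper's estimator of the sharp lower bound.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(3)
n = 300
Z = rng.uniform(0, 1, (n, 1))
rho = np.sin(6 * Z[:, 0]) + 0.2 * rng.standard_normal(n)   # candidate residuals

K = rbf_kernel(Z, Z, gamma=10.0)        # k(z_j, Z_i), one test function per z_j
violations = K @ rho / n                # kernel-weighted moment conditions

print("max violation of the kernelized constraints:", np.abs(violations).max())
# In the paper, such kernelized constraints enter the optimization that yields
# a sharp (non-conservative) lower bound on the policy value.
```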
Model calibration consists of using experimental or field data to estimate the unknown parameters of a mathematical model. The presence of model discrepancy and measurement bias in the data complicates this task. Satellite interferograms, for instance, are widely used for calibrating geophysical models in geological hazard quantification. In this work, we use satellite interferograms to relate ground deformation observations to the properties of the magma chamber at K\={\i}lauea Volcano in Hawai`i. We derive closed-form marginal likelihoods and implement posterior sampling procedures that simultaneously estimate the model discrepancy of physical models and the measurement bias from the atmospheric error in satellite interferograms. We find that model calibration by aggregating multiple interferograms and downsampling the pixels in the interferograms can reduce the computational complexity compared to calibration approaches based on multiple data sets. The conditions under which data aggregation and downsampling incur no loss of information are studied. Simulations illustrate that both discrepancy and measurement bias can be estimated, and real applications demonstrate that modeling both effects helps to obtain reliable estimates of a physical model's unobserved parameters and enhances its predictive accuracy. We implement the computational tools in the RobustCalibration package available on CRAN.
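A toy illustration of the two computational devices mentioned above, aggregation of interferograms and pixel downsampling, using purely synthetic arrays; the actual calibration with model discrepancy and measurement bias is implemented in the RobustCalibration package.

```python
# A toy illustration of aggregating a stack of interferograms and downsampling
# pixels before calibration.  The synthetic "interferograms" and the
# downsampling stride are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(4)
n_ifg, ny, nx = 5, 200, 200
interferograms = rng.standard_normal((n_ifg, ny, nx))   # stack of interferograms

aggregated = interferograms.mean(axis=0)                 # aggregate the stack
downsampled = aggregated[::4, ::4]                        # keep every 4th pixel

print("pixels per interferogram:", ny * nx)
print("pixels after aggregation + downsampling:", downsampled.size)
```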
Strict stationarity is a common assumption used in the time series literature in order to derive asymptotic distributional results for second-order statistics, like sample autocovariances and sample autocorrelations. Focusing on weak stationarity, this paper derives the asymptotic distribution of the maximum of sample autocovariances and sample autocorrelations under weak conditions by using Gaussian approximation techniques. The asymptotic theory for parameter estimation obtained by fitting a (linear) autoregressive model to a general weakly stationary time series is revisited and a Gaussian approximation theorem for the maximum of the estimators of the autoregressive coefficients is derived. To perform statistical inference for the second-order parameters considered, a bootstrap algorithm, the so-called second-order wild bootstrap, is applied. Consistency of this bootstrap procedure is proven. In contrast to existing bootstrap alternatives, validity of the second-order wild bootstrap does not require the imposition of strict stationarity conditions or structural process assumptions, like linearity. The good finite sample performance of the second-order wild bootstrap is demonstrated by means of simulations.
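The flavor of the procedure can be conveyed by a schematic multiplier (wild) bootstrap for sample autocovariances, in which centered lag products are perturbed by i.i.d. Gaussian multipliers; this simplified sketch is not the exact second-order wild bootstrap algorithm analyzed in the paper.

```python
# A schematic wild-bootstrap sketch for second-order statistics: centered lag-h
# products are perturbed by i.i.d. Gaussian multipliers to generate bootstrap
# replicates of the sample autocovariances.  This is a simplified illustration,
# not the exact second-order wild bootstrap of the paper.
import numpy as np

rng = np.random.default_rng(5)
n, H, B = 500, 10, 1000
x = rng.standard_normal(n + 1)
x = x[1:] + 0.5 * x[:-1]                 # a weakly stationary MA(1) series

xc = x - x.mean()
gamma_hat = np.array([np.sum(xc[:n - h] * xc[h:]) / n for h in range(1, H + 1)])

boot = np.empty((B, H))
for b in range(B):
    for h in range(1, H + 1):
        prod = xc[:n - h] * xc[h:]                    # lag-h products
        w = rng.standard_normal(n - h)                # wild multipliers
        boot[b, h - 1] = gamma_hat[h - 1] + np.sum((prod - gamma_hat[h - 1]) * w) / n

# Bootstrap distribution of the maximum autocovariance deviation:
max_dev = np.abs(boot - gamma_hat).max(axis=1)
print("bootstrap 95% quantile of the max deviation:", np.quantile(max_dev, 0.95))
```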
Discrete data are abundant and often arise as counts or rounded data. These data commonly exhibit complex distributional features such as zero-inflation, over-/under-dispersion, boundedness, and heaping, which render many parametric models inadequate. Yet even for parametric regression models, approximations such as MCMC typically are needed for posterior inference. This paper introduces a Bayesian modeling and algorithmic framework that enables semiparametric regression analysis for discrete data with Monte Carlo (not MCMC) sampling. The proposed approach pairs a nonparametric marginal model with a latent linear regression model to encourage both flexibility and interpretability, and delivers posterior consistency even under model misspecification. For a parametric or large-sample approximation of this model, we identify a class of conjugate priors with (pseudo) closed-form posteriors. All posterior and predictive distributions are available analytically or via direct Monte Carlo sampling. These tools are broadly useful for linear regression, nonlinear models via basis expansions, and variable selection with discrete data. Simulation studies demonstrate significant advantages in computing, prediction, estimation, and selection relative to existing alternatives. This novel approach is applied successfully to self-reported mental health data that exhibit zero-inflation, overdispersion, boundedness, and heaping.
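A schematic sketch of the "latent linear regression with direct Monte Carlo" idea: a conjugate Gaussian linear model on a latent scale yields a closed-form posterior, and predictive draws are rounded to produce a discrete predictive distribution without MCMC. The conjugate prior, the known variance, the identity latent transformation, and the simulated counts are illustrative simplifications of the semiparametric model described above.

```python
# A schematic sketch: conjugate Gaussian linear regression on a latent scale,
# with predictive draws rounded to integers -- posterior and predictive
# quantities come from direct Monte Carlo, not MCMC.  The prior, known
# variance, identity latent link, and simulated counts are simplifications.
import numpy as np

rng = np.random.default_rng(6)
n, p, sigma2, tau2 = 200, 3, 1.0, 10.0
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
beta_true = np.array([2.0, 1.0, -0.5])
y = np.maximum(np.round(X @ beta_true + rng.standard_normal(n)), 0)   # rounded counts

# Conjugate Gaussian posterior for beta (known sigma2, N(0, tau2 I) prior):
V = np.linalg.inv(X.T @ X / sigma2 + np.eye(p) / tau2)
m = V @ X.T @ y / sigma2

# Direct Monte Carlo: draw beta, then latent y*, then round -- no MCMC needed.
x_new = np.array([1.0, 0.5, -1.0])
betas = rng.multivariate_normal(m, V, size=5000)
y_star = betas @ x_new + np.sqrt(sigma2) * rng.standard_normal(5000)
y_pred = np.maximum(np.round(y_star), 0)              # discrete predictive draws
print("predictive probabilities for y = 0..4:", np.bincount(y_pred.astype(int))[:5] / 5000)
```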
Conditional local independence is an asymmetric independence relation among continuous-time stochastic processes. It describes whether the evolution of one process is directly influenced by another process given the histories of additional processes, and it is important for the description and learning of causal relations among processes. We develop a model-free framework for testing the hypothesis that a counting process is conditionally locally independent of another process. To this end, we introduce a new functional parameter called the Local Covariance Measure (LCM), which quantifies deviations from the hypothesis. Following the principles of double machine learning, we propose an estimator of the LCM and a test of the hypothesis using nonparametric estimators and sample splitting or cross-fitting. We call this test the (cross-fitted) Local Covariance Test ((X)-LCT), and we show that its level and power can be controlled uniformly, provided that the nonparametric estimators are consistent with modest rates. We illustrate the theory by an example based on a marginalized Cox model with time-dependent covariates, and we show in simulations that, when double machine learning is used in combination with cross-fitting, the test works well without restrictive parametric assumptions.
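A generic cross-fitting skeleton of the kind underlying the X-LCT is sketched below: nuisance regressions are fitted on held-out folds and a studentized statistic is built from out-of-fold residual products. The random-forest nuisances, the i.i.d. simulated data, and the simple product-of-residuals moment are placeholders, not the Local Covariance Measure itself, which is defined for counting processes.

```python
# A generic cross-fitting skeleton in the spirit of double machine learning:
# nuisance functions are estimated on held-out folds and a studentized test
# statistic is built from out-of-fold residual products.  The nuisances,
# simulated data, and moment function are placeholders, not the LCM itself.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
n = 600
Z = rng.standard_normal((n, 3))
A = Z[:, 0] + rng.standard_normal(n)          # summary of the "other" process
Y = Z[:, 1] + rng.standard_normal(n)          # summary of the target process

scores = np.zeros(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(Z):
    gA = RandomForestRegressor(n_estimators=100, random_state=0).fit(Z[train], A[train])
    gY = RandomForestRegressor(n_estimators=100, random_state=0).fit(Z[train], Y[train])
    # Out-of-fold product of residuals plays the role of the moment function.
    scores[test] = (A[test] - gA.predict(Z[test])) * (Y[test] - gY.predict(Z[test]))

stat = np.sqrt(n) * scores.mean() / scores.std(ddof=1)
print("cross-fitted studentized statistic:", stat)   # approx N(0,1) under the null here
```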
Pricing based on individual customer characteristics is widely used to maximize sellers' revenues. This work studies offline personalized pricing under endogeneity using an instrumental variable approach. Standard instrumental variable methods in causal inference/econometrics either focus on a discrete treatment space or require the exclusion restriction that instruments have no direct effect on the outcome, which limits their applicability in personalized pricing. In this paper, we propose a new policy learning method for Personalized pRicing using Invalid iNsTrumental variables (PRINT) for continuous treatments, which allows instruments to have direct effects on the outcome. Specifically, relying on the structural models of revenue and price, we establish the identifiability condition for an optimal pricing strategy under endogeneity with the help of invalid instrumental variables. Based on this new identification result, which leads to solving conditional moment restrictions with generalized residual functions, we construct an adversarial min-max estimator and learn an optimal pricing strategy. Furthermore, we establish an asymptotic regret bound for the learned pricing strategy. Finally, we demonstrate the effectiveness of the proposed method via extensive simulation studies as well as a real data application from a US online auto loan company.
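The min-max construction can be sketched in a stylized setting: when the adversary ranges over a finite linear sieve of test functions of the instrument, the inner maximization has a closed form and the outer problem reduces to a GMM-type criterion. The generalized residual, the sieve, and the simulated pricing data below are illustrative assumptions only.

```python
# A stylized sketch of an adversarial (min-max) estimator for a conditional
# moment restriction E[rho(O; theta) | Z] = 0: with a finite linear sieve of
# test functions of Z, the problem reduces to a GMM-type criterion.  The
# residual, sieve, and simulated pricing data are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(8)
n = 1000
Z = rng.uniform(-1, 1, n)                            # instrument
P = 0.5 * Z + 0.3 * rng.standard_normal(n)           # endogenous price
R = 2.0 - 1.5 * P + 0.2 * rng.standard_normal(n)     # revenue, true slope -1.5

def rho(theta):
    """Generalized residual of the structural revenue model."""
    return R - (theta[0] + theta[1] * P)

F = np.column_stack([np.ones(n), Z, Z ** 2])         # sieve of test functions of Z

def criterion(theta):
    m = F.T @ rho(theta) / n                         # sieve moment conditions
    return m @ m

theta_hat = minimize(criterion, x0=np.zeros(2), method="BFGS").x
print("estimated structural coefficients:", theta_hat)
```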
We consider the problem of supervised dimension reduction with a particular focus on extreme values of the target $Y\in\mathbb{R}$ to be explained by a covariate vector $X \in \mathbb{R}^p$. The general purpose is to define and estimate a projection on a lower-dimensional subspace of the covariate space which is sufficient for predicting exceedances of the target above high thresholds. We propose an original definition of Tail Conditional Independence that matches this purpose. Inspired by Sliced Inverse Regression (SIR) methods, we develop a novel framework (TIREX, Tail Inverse Regression for EXtreme response) in order to estimate an extreme sufficient dimension reduction (SDR) space of potentially smaller dimension than that of a classical SDR space. We prove the weak convergence of tail empirical processes involved in the estimation procedure and we illustrate the relevance of the proposed approach on simulated and real-world data.
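A minimal sketch in the spirit of TIREX: sliced inverse regression restricted to exceedances of the response above a high threshold recovers a direction that drives the tail behaviour. The single-index simulated model, the threshold, and the number of slices are illustrative assumptions.

```python
# A minimal SIR-style sketch restricted to tail observations: standardize X,
# keep exceedances Y > u, slice them by Y, average X within slices, and take
# the leading eigenvector of the between-slice covariance as a tail direction.
# The single-index model, threshold, and number of slices are assumptions.
import numpy as np

rng = np.random.default_rng(9)
n, p = 5000, 6
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[0] = 1.0
Y = np.exp(X @ beta) * rng.exponential(1.0, n)     # heavy upper tail driven by X[:, 0]

u = np.quantile(Y, 0.95)                            # high threshold
idx = Y > u
Xt, Yt = X[idx], Y[idx]

# Standardize with the full-sample mean and covariance, then slice the exceedances.
L = np.linalg.cholesky(np.cov(X.T))
Xs = (Xt - X.mean(axis=0)) @ np.linalg.inv(L.T)
slices = np.array_split(np.argsort(Yt), 5)
means = np.array([Xs[s].mean(axis=0) for s in slices])

# Leading eigenvector of the between-slice covariance estimates the tail direction.
M = means.T @ means / len(slices)
direction = np.linalg.eigh(M)[1][:, -1]
print("estimated tail direction (up to sign):", np.round(direction, 2))
```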