Cluster-randomized experiments are increasingly used to evaluate interventions in routine practice conditions, and researchers often adopt model-based methods with covariate adjustment in the statistical analyses. However, the validity of model-based covariate adjustment is unclear when the working models are misspecified, leading to ambiguity of estimands and risk of bias. In this article, we first adapt two conventional model-based methods, generalized estimating equations and linear mixed models, with weighted g-computation to achieve robust inference for cluster-average and individual-average treatment effects. To further overcome the limitations of model-based covariate adjustment methods, we propose an efficient estimator for each estimand that allows for flexible covariate adjustment and additionally addresses cluster size variation dependent on treatment assignment and other cluster characteristics. Such cluster size variations often occur post-randomization and, if ignored, can lead to bias of model-based estimators. For our proposed efficient covariate-adjusted estimator, we prove that when the nuisance functions are consistently estimated by machine learning algorithms, the estimator is consistent, asymptotically normal, and efficient. When the nuisance functions are estimated via parametric working models, the estimator is triply-robust. Simulation studies and analyses of three real-world cluster-randomized experiments demonstrate that the proposed methods are superior to existing alternatives.
With the increasing availability of large scale datasets, computational power and tools like automatic differentiation and expressive neural network architectures, sequential data are now often treated in a data-driven way, with a dynamical model trained from the observation data. While neural networks are often seen as uninterpretable black-box architectures, they can still benefit from physical priors on the data and from mathematical knowledge. In this paper, we use a neural network architecture which leverages the long-known Koopman operator theory to embed dynamical systems in latent spaces where their dynamics can be described linearly, enabling a number of appealing features. We introduce methods that enable to train such a model for long-term continuous reconstruction, even in difficult contexts where the data comes in irregularly-sampled time series. The potential for self-supervised learning is also demonstrated, as we show the promising use of trained dynamical models as priors for variational data assimilation techniques, with applications to e.g. time series interpolation and forecasting.
Marginal structural models (MSMs) are often used to estimate causal effects of treatments on survival time outcomes from observational data when time-dependent confounding may be present. They can be fitted using, e.g., inverse probability of treatment weighting (IPTW). It is important to evaluate the performance of statistical methods in different scenarios, and simulation studies are a key tool for such evaluations. In such simulation studies, it is common to generate data in such a way that the model of interest is correctly specified, but this is not always straightforward when the model of interest is for potential outcomes, as is an MSM. Methods have been proposed for simulating from MSMs for a survival outcome, but these methods impose restrictions on the data-generating mechanism. Here we propose a method that overcomes these restrictions. The MSM can be a marginal structural logistic model for a discrete survival time or a Cox or additive hazards MSM for a continuous survival time. The hazard of the potential survival time can be conditional on baseline covariates, and the treatment variable can be discrete or continuous. We illustrate the use of the proposed simulation algorithm by carrying out a brief simulation study. This study compares the coverage of confidence intervals calculated in two different ways for causal effect estimates obtained by fitting an MSM via IPTW.
We study a subspace constrained version of the randomized Kaczmarz algorithm for solving large linear systems in which the iterates are confined to the space of solutions of a selected subsystem. We show that the subspace constraint leads to an accelerated convergence rate, especially when the system has structure such as having coherent rows or being approximately low-rank. On Gaussian-like random data, it results in a form of dimension reduction that effectively improves the aspect ratio of the system. Furthermore, this method serves as a building block for a second, quantile-based algorithm for the problem of solving linear systems with arbitrary sparse corruptions, which is able to efficiently exploit partial external knowledge about uncorrupted equations and achieve convergence in difficult settings such as in almost-square systems. Numerical experiments on synthetic and real-world data support our theoretical results and demonstrate the validity of the proposed methods for even more general data models than guaranteed by the theory.
This paper introduces non-linear dimension reduction in factor-augmented vector autoregressions to analyze the effects of different economic shocks. I argue that controlling for non-linearities between a large-dimensional dataset and the latent factors is particularly useful during turbulent times of the business cycle. In simulations, I show that non-linear dimension reduction techniques yield good forecasting performance, especially when data is highly volatile. In an empirical application, I identify a monetary policy as well as an uncertainty shock excluding and including observations of the COVID-19 pandemic. Those two applications suggest that the non-linear FAVAR approaches are capable of dealing with the large outliers caused by the COVID-19 pandemic and yield reliable results in both scenarios.
We present a constant-factor approximation algorithm for the Nash social welfare maximization problem with subadditive valuations accessible via demand queries. More generally, we propose a template for NSW optimization by solving a configuration-type LP and using a rounding procedure for (utilitarian) social welfare as a blackbox, which could be applicable to other variants of the problem.
Besov priors are nonparametric priors that can model spatially inhomogeneous functions. They are routinely used in inverse problems and imaging, where they exhibit attractive sparsity-promoting and edge-preserving features. A recent line of work has initiated the study of their asymptotic frequentist convergence properties. In the present paper, we consider the theoretical recovery performance of the posterior distributions associated to Besov-Laplace priors in the density estimation model, under the assumption that the observations are generated by a possibly spatially inhomogeneous true density belonging to a Besov space. We improve on existing results and show that carefully tuned Besov-Laplace priors attain optimal posterior contraction rates. Furthermore, we show that hierarchical procedures involving a hyper-prior on the regularity parameter lead to adaptation to any smoothness level.
In this research work, we propose a high-order time adapted scheme for pricing a coupled system of fixed-free boundary constant elasticity of variance (CEV) model on both equidistant and locally refined space-grid. The performance of our method is substantially enhanced to improve irregularities in the model which are both inherent and induced. Furthermore, the system of coupled PDEs is strongly nonlinear and involves several time-dependent coefficients that include the first-order derivative of the early exercise boundary. These coefficients are approximated from a fourth-order analytical approximation which is derived using a regularized square-root function. The semi-discrete equation for the option value and delta sensitivity is obtained from a non-uniform fourth-order compact finite difference scheme. Fifth-order 5(4) Dormand-Prince time integration method is used to solve the coupled system of discrete equations. Enhancing the performance of our proposed method with local mesh refinement and adaptive strategies enables us to obtain highly accurate solution with very coarse space grids, hence reducing computational runtime substantially. We further verify the performance of our methodology as compared with some of the well-known and better-performing existing methods.
Multiscale stochastic dynamical systems have been widely adopted to scientific and engineering problems due to their capability of depicting complex phenomena in many real world applications. This work is devoted to investigating the effective reduced dynamics for a slow-fast stochastic dynamical system. Given observation data on a short-term period satisfying some unknown slow-fast stochastic system, we propose a novel algorithm including a neural network called Auto-SDE to learn invariant slow manifold. Our approach captures the evolutionary nature of a series of time-dependent autoencoder neural networks with the loss constructed from a discretized stochastic differential equation. Our algorithm is also proved to be accurate, stable and effective through numerical experiments under various evaluation metrics.
A novel and fully distributed optimization method is proposed for the distributed robust convex program (DRCP) over a time-varying unbalanced directed network without imposing any differentiability assumptions. Firstly, a tractable approximated DRCP (ADRCP) is introduced by discretizing the semi-infinite constraints into a finite number of inequality constraints and restricting the right-hand side of the constraints with a proper positive parameter, which will be iteratively solved by a random-fixed projection algorithm. Secondly, a cutting-surface consensus approach is proposed for locating an approximately optimal consensus solution of the DRCP with guaranteed feasibility. This approach is based on iteratively approximating the DRCP by successively reducing the restriction parameter of the right-hand constraints and populating the cutting-surfaces into the existing finite set of constraints. Thirdly, to ensure finite-time convergence of the distributed optimization, a distributed termination algorithm is developed based on uniformly local consensus and zeroth-order optimality under uniformly strongly connected graphs. Fourthly, it is proved that the cutting-surface consensus approach converges within a finite number of iterations. Finally, the effectiveness of the approach is illustrated through a numerical example.
In this paper, two novel classes of implicit exponential Runge-Kutta (ERK) methods are studied for solving highly oscillatory systems. Firstly, we analyze the symplectic conditions for two kinds of exponential integrators and obtain the symplectic method. In order to effectively solve highly oscillatory problems, we try to design the highly accurate implicit ERK integrators. By comparing the Taylor series expansion of numerical solution with exact solution, it can be verified that the order conditions of two new kinds of exponential methods are identical to classical Runge-Kutta (RK) methods, which implies that using the coefficients of RK methods, some highly accurate numerical methods are directly formulated. Furthermore, we also investigate the linear stability properties for these exponential methods. Finally, numerical results not only display the long time energy preservation of the symplectic method, but also present the accuracy and efficiency of these formulated methods in comparison with standard ERK methods.