Propensity score matching (PSM) and augmented inverse propensity weighting (AIPW) are widely used in observational studies to estimate causal effects. The two approaches present complementary features. The AIPW estimator is doubly robust and locally efficient but can be unstable when the propensity scores are close to zero or one due to weighting by the inverse of the propensity score. On the other hand, PSM circumvents the instability of propensity score weighting, but it hinges on the correctness of the propensity score model and cannot attain the semiparametric efficiency bound. Moreover, the fixed number of matches, K, renders PSM nonsmooth and thus invalidates standard nonparametric bootstrap inference. This article presents novel augmented match weighted (AMW) estimators that combine the advantages of matching and weighting estimators. AMW adheres to the form of AIPW for its double robustness and local efficiency, but it mitigates the instability due to weighting. We replace inverse propensity weights with matching weights resulting from PSM with unfixed K. We also propose a new cross-validation procedure to select K that minimizes the mean squared error anchored around an unbiased estimator of the causal estimand. We further derive the limiting distribution for the AMW estimators, showing that they enjoy the double robustness property and can achieve the semiparametric efficiency bound if both nuisance models are correct. As a byproduct of unfixed K, which smooths the AMW estimators, nonparametric bootstrap can be adopted for variance estimation and inference. Furthermore, simulation studies and real data applications indicate that the AMW estimators are stable with extreme propensity scores and that their variances can be obtained by naive bootstrap.
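As an illustration of the estimator form discussed above, the following is a minimal sketch of the standard AIPW estimate of the average treatment effect, not the AMW estimator itself. The function name `aipw_ate` and its arguments are hypothetical, and the nuisance estimates (propensity scores and outcome-model predictions) are assumed to have been fitted elsewhere.

```python
import numpy as np

def aipw_ate(y, a, e_hat, mu1_hat, mu0_hat):
    """Standard AIPW estimate of the average treatment effect.

    y: observed outcomes; a: binary treatment indicator (1 = treated);
    e_hat: estimated propensity scores P(A=1 | X);
    mu1_hat, mu0_hat: outcome-model predictions under treatment and control.
    All names are illustrative assumptions, not taken from the paper.
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    e_hat = np.asarray(e_hat, dtype=float)
    mu1_hat = np.asarray(mu1_hat, dtype=float)
    mu0_hat = np.asarray(mu0_hat, dtype=float)
    # Outcome-model prediction plus an inverse-propensity-weighted residual.
    treated_term = mu1_hat + a * (y - mu1_hat) / e_hat
    control_term = mu0_hat + (1.0 - a) * (y - mu0_hat) / (1.0 - e_hat)
    return float(np.mean(treated_term - control_term))
```

The division by `e_hat` and `1 - e_hat` is where the instability arises: a treated unit with an estimated propensity score of 0.001 has its residual multiplied by 1000. According to the abstract, AMW replaces these inverse propensity weights with matching weights; that replacement is not implemented in this sketch.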
Image-based precision medicine aims to personalize treatment decisions based on an individual's unique imaging features so as to improve their clinical outcome. Machine learning frameworks that integrate uncertainty estimation as part of their treatment recommendations would be safer and more reliable. However, little work has been done in adapting uncertainty estimation techniques and validation metrics for precision medicine. In this paper, we use Bayesian deep learning for estimating the posterior distribution over factual and counterfactual outcomes on several treatments. This allows for estimating the uncertainty for each treatment option and for the individual treatment effects (ITE) between any two treatments. We train and evaluate this model to predict future new and enlarging T2 lesion counts on a large, multi-center dataset of MR brain images of patients with multiple sclerosis, exposed to several treatments during randomized controlled trials. We evaluate the correlation of the uncertainty estimate with the factual error, and, given the lack of ground truth counterfactual outcomes, demonstrate how uncertainty for the ITE prediction relates to bounds on the ITE error. Lastly, we demonstrate how knowledge of uncertainty could modify clinical decision-making to improve individual patient and clinical trial outcomes.
The abundance of observed data in recent years has increased the number of statistical augmentations to complex models across science and engineering. By augmentation we mean coherent statistical methods that incorporate measurements upon arrival and adjust the model accordingly. However, in this research area methodological developments tend to be central, with important assessments of model fidelity often taking second place. Recently, the statistical finite element method (statFEM) has been posited as a potential solution to the problem of model misspecification when the data are believed to be generated from an underlying partial differential equation system. Bayes nonlinear filtering permits data driven finite element discretised solutions that are updated to give a posterior distribution which quantifies the uncertainty over model solutions. The statFEM has shown great promise in systems subject to mild misspecification but its ability to handle scenarios of severe model misspecification has not yet been presented. In this paper we fill this gap, studying statFEM in the context of shallow water equations chosen for their oceanographic relevance. By deliberately misspecifying the governing equations, via linearisation, viscosity, and bathymetry, we systematically analyse misspecification through studying how the resultant approximate posterior distribution is affected, under additional regimes of decreasing spatiotemporal observational frequency. Results show that statFEM performs well with reasonable accuracy, as measured by theoretically sound proper scoring rules.
We propose a simple and efficient approach to generating prediction intervals (PIs) for approximated and forecasted trends. Our method leverages a weighted asymmetric loss function to estimate the lower and upper bounds of the PI, with the weights determined by its coverage probability. We provide a concise mathematical proof of the method, show how it can be extended to derive PIs for parametrised functions, and argue why the method works for predicting PIs of dependent variables. Tests of the method on a real-world forecasting task using a neural network-based model show that it can produce reliable PIs in complex machine learning scenarios.
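To make the idea of a weighted asymmetric loss concrete, here is a minimal sketch using the well-known quantile (pinball) loss. This is an assumption: the abstract does not state the exact form of the authors' loss. The function names and the grid search over a constant prediction are purely illustrative.

```python
import numpy as np

def pinball_loss(y, pred, q):
    """Asymmetric loss: under-prediction is weighted by q, over-prediction by 1 - q."""
    diff = np.asarray(y, dtype=float) - pred
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))

def constant_interval(y, coverage=0.9, grid_size=2001):
    """Find constant lower/upper bounds by minimising the pinball loss on a grid.

    The quantile levels are derived from the target coverage probability,
    e.g. coverage 0.8 gives levels 0.1 and 0.9 (an illustrative choice).
    """
    y = np.asarray(y, dtype=float)
    grid = np.linspace(y.min(), y.max(), grid_size)
    q_lo, q_hi = (1.0 - coverage) / 2.0, (1.0 + coverage) / 2.0
    lo = grid[np.argmin([pinball_loss(y, g, q_lo) for g in grid])]
    hi = grid[np.argmin([pinball_loss(y, g, q_hi) for g in grid])]
    return float(lo), float(hi)
```

In a forecasting model, the constant bounds would be replaced by two model outputs trained with the same pair of losses; the grid search here only serves to keep the example dependency-free.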
The state of the art related to parameter correlation in two-parameter models is reviewed in this paper. The apparent contradictions among authors regarding the ability of D-optimality to simultaneously reduce the correlation and the area of the confidence ellipse in two-parameter models are analyzed. Two main approaches were found: 1) those who consider that the optimality criteria simultaneously control the precision and correlation of the parameter estimators; and 2) those who use a combination of criteria to achieve the same objective. An analytical criterion is provided whose structure combines the optimality of the precision of the parameter estimators with the reduction of the correlation between them. The criterion was tested both in a simple linear regression model, considering all possible design spaces, and in a non-linear model with strong correlation between the parameter estimators (Michaelis-Menten) to show its performance. The criterion outperformed all the other strategies and criteria in controlling the precision and the correlation at the same time.
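The two quantities at stake in the simple linear regression case can be illustrated with a short sketch. It computes the D-criterion (the determinant of the information matrix) and the correlation between the intercept and slope estimators for a given design. It does not implement the combined criterion proposed in the paper; the function name `design_summary` is hypothetical.

```python
import numpy as np

def design_summary(x):
    """Return (det of information matrix, correlation of the two estimators)
    for the model y = b0 + b1 * x with design points x and unit error variance."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones_like(x), x])  # design matrix
    info = X.T @ X                             # information matrix
    cov = np.linalg.inv(info)                  # covariance of (b0, b1), up to sigma^2
    corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
    return float(np.linalg.det(info)), float(corr)
```

For example, the two-point design at -1 and 1 gives uncorrelated estimators, whereas the two-point design at 0 and 1 gives a smaller determinant and a clearly negative correlation, which shows how the design space affects both quantities.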
Randomized experiments (REs) are the cornerstone for treatment effect evaluation. However, due to practical considerations, REs may encounter difficulty recruiting sufficient patients. External controls (ECs) can supplement REs to boost estimation efficiency. Yet, there may be incomparability between ECs and concurrent controls (CCs), resulting in misleading treatment effect evaluation. We introduce a novel bias function to measure the difference in the outcome mean functions between ECs and CCs. We show that the ANCOVA model augmented by the bias function for ECs renders a consistent estimator of the average treatment effect, regardless of whether or not the ANCOVA model is correct. To accommodate possibly different structures of the ANCOVA model and the bias function, we propose a double penalty integration estimator (DPIE) with different penalization terms for the two functions. With an appropriate choice of penalty parameters, our DPIE ensures consistency, the oracle property, and asymptotic normality even in the presence of model misspecification. Theoretical and experimental results confirm that DPIE is more efficient than the estimator derived from REs alone.
The central limit theorem (CLT) is one of the most fundamental results in probability, and establishing its rate of convergence has been a key question since the 1940s. For independent random variables, a series of recent works established optimal error bounds under the Wasserstein-p distance (with p>=1). In this paper, we extend those results to locally dependent random variables, which include m-dependent random fields and U-statistics. Under conditions on the moments and the dependency neighborhoods, we derive optimal rates in the CLT for the Wasserstein-p distance. Our proofs rely on approximating the empirical average of dependent observations by the empirical average of i.i.d. random variables. To do so, we expand the Stein equation to arbitrary orders by adapting Stein's dependency neighborhood method. Finally, we illustrate the applicability of our results by obtaining efficient tail bounds.
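For readers unfamiliar with the metric, the Wasserstein-p distance between two one-dimensional empirical distributions with the same number of equally weighted points can be computed by pairing sorted samples. The sketch below illustrates only this definition, not the paper's bounds; the function name is hypothetical.

```python
import numpy as np

def wasserstein_p_1d(x, y, p=1.0):
    """Wasserstein-p distance between two equal-size 1-D empirical distributions.

    In one dimension the optimal coupling pairs the i-th smallest point of x
    with the i-th smallest point of y.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes samples of equal size")
    if p < 1:
        raise ValueError("p must be at least 1")
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))
```

Larger values of p penalise large discrepancies more heavily, so the distance is non-decreasing in p for fixed samples.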
Classical physical modelling with associated numerical simulation (model-based) and prognostic methods based on the analysis of large amounts of data (data-driven) are the two most common methods used for the mapping of complex physical processes. In recent years, the efficient combination of these approaches has become increasingly important. At its core, continuum mechanics consists of conservation equations that, in addition to the always necessary specification of the process conditions, can be supplemented by phenomenological material models. The latter are an idealized image of the specific material behavior that can be determined experimentally, empirically, and based on a wealth of expert knowledge. The more complex the material, the more difficult the calibration is. This situation forms the starting point for this work's hybrid data-driven and model-based approach for mapping a complex physical process in continuum mechanics. Specifically, we use data generated from a classical physical model by the MESHFREE software to train a Principal Component Analysis-based neural network (PCA-NN) for the task of identifying the material model parameters. The obtained results highlight the potential of deep-learning-based hybrid models for determining parameters, which are the key to characterizing naturally occurring materials and to their use in industrial applications (e.g. the interaction of vehicles with sand).
Like conventional software projects, projects in model-driven software engineering require adequate management of multiple versions of development artifacts, which importantly includes tolerating temporary inconsistencies. In previous work, multi-version models for model-driven software engineering have been introduced, which allow checking well-formedness and finding merge conflicts for multiple versions of a model at once. However, multi-version models must also handle situations where different artifacts, that is, different models, are linked via automatic model transformations. In this paper, we propose a technique for jointly handling the transformation of multiple versions of a source model into corresponding versions of a target model, which enables the use of a more compact representation that may afford improved execution time of both the transformation and further analysis operations. Our approach is based on the well-known formalism of triple graph grammars and the aforementioned encoding of model version histories called multi-version models. In addition to batch transformation of an entire model version history, the technique also covers incremental synchronization of changes in the framework of multi-version models. We show the correctness of our approach with respect to the standard semantics of triple graph grammars and conduct an empirical evaluation to investigate the performance of our technique regarding execution time and memory consumption. Our results indicate that the proposed technique affords lower memory consumption and may improve execution time for batch transformation of large version histories, but can also come with computational overhead in unfavorable cases.
Most algorithms for representation learning and link prediction in relational data have been designed for static data. However, the data they are applied to usually evolves with time, such as friend graphs in social networks or user interactions with items in recommender systems. This is also the case for knowledge bases, which contain facts such as (US, has president, B. Obama, [2009-2017]) that are valid only at certain points in time. For the problem of link prediction under temporal constraints, i.e., answering queries such as (US, has president, ?, 2012), we propose a solution inspired by the canonical decomposition of tensors of order 4. We introduce new regularization schemes and present an extension of ComplEx (Trouillon et al., 2016) that achieves state-of-the-art performance. Additionally, we propose a new dataset for knowledge base completion constructed from Wikidata, larger than previous benchmarks by an order of magnitude, as a new reference for evaluating temporal and non-temporal link prediction methods.
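The scoring idea behind a canonical (CP) decomposition of an order-4 tensor can be sketched as follows. The sketch uses real-valued embeddings and random, untrained factors purely for illustration; the extension of ComplEx described in the abstract uses complex embeddings, and neither the regularization schemes nor the training procedure are shown. All variable and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, n_timestamps, rank = 5, 3, 4, 8

# One factor matrix per tensor mode (random here; learned in practice).
subj = rng.normal(size=(n_entities, rank))
rel = rng.normal(size=(n_relations, rank))
obj = rng.normal(size=(n_entities, rank))
time = rng.normal(size=(n_timestamps, rank))

def score(s, r, o, t):
    """CP score of the fact (subject s, relation r, object o, timestamp t)."""
    return float(np.sum(subj[s] * rel[r] * obj[o] * time[t]))

def rank_objects(s, r, t):
    """Answer a query (s, r, ?, t): candidate objects sorted by decreasing score."""
    scores = obj @ (subj[s] * rel[r] * time[t])
    return np.argsort(-scores)

# The full order-4 tensor implied by the factors, for checking only.
full_tensor = np.einsum("sk,rk,ok,tk->srot", subj, rel, obj, time)
```

A query such as (US, has president, ?, 2012) then amounts to calling `rank_objects` with the indices of the subject, relation, and timestamp, without ever materialising the full tensor.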
The U-Net was presented in 2015. With its straightforward and successful architecture it quickly evolved into a commonly used benchmark in medical image segmentation. The adaptation of the U-Net to novel problems, however, comprises several degrees of freedom regarding the exact architecture, preprocessing, training and inference. These choices are not independent of each other and substantially impact the overall performance. The present paper introduces the nnU-Net ('no-new-Net'), which refers to a robust and self-adapting framework on the basis of 2D and 3D vanilla U-Nets. We make a strong case for taking away the superfluous bells and whistles of many proposed network designs and instead focusing on the remaining aspects that determine the performance and generalizability of a method. We evaluate the nnU-Net in the context of the Medical Segmentation Decathlon challenge, which measures segmentation performance in ten disciplines comprising distinct entities, image modalities, image geometries and dataset sizes, with no manual adjustments between datasets allowed. At the time of manuscript submission, nnU-Net achieves the highest mean Dice scores across all classes and seven phase 1 tasks (except class 1 in BrainTumour) in the online leaderboard of the challenge.
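The evaluation metric mentioned above, the mean Dice score across classes, can be sketched as follows. This is a generic illustration and not the challenge's official evaluation code. The function names and the convention of returning 1.0 when a class is absent from both masks are assumptions.

```python
import numpy as np

def dice_score(pred_mask, true_mask):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    true_mask = np.asarray(true_mask, dtype=bool)
    denom = pred_mask.sum() + true_mask.sum()
    if denom == 0:
        return 1.0  # assumed convention: both masks empty counts as perfect agreement
    return float(2.0 * np.logical_and(pred_mask, true_mask).sum() / denom)

def mean_dice(pred_labels, true_labels, classes):
    """Average the per-class Dice score over the given foreground class labels."""
    pred_labels = np.asarray(pred_labels)
    true_labels = np.asarray(true_labels)
    return float(np.mean([dice_score(pred_labels == c, true_labels == c)
                          for c in classes]))
```

The same functions apply unchanged to 2D or 3D label arrays, since the masks are compared element-wise.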