We describe a new design-based framework for drawing causal inferences in randomized experiments. Causal effects in the framework are defined as linear functionals evaluated at potential outcome functions. Knowledge and assumptions about the potential outcome functions are encoded as function spaces. This makes the framework expressive, allowing experimenters to formulate and investigate a wide range of causal questions. We describe a class of estimators for estimands defined using the framework and investigate their properties. The construction of the estimators is based on the Riesz representation theorem. We provide necessary and sufficient conditions for unbiasedness and consistency. Finally, we provide conditions under which the estimators are asymptotically normal, and describe a conservative variance estimator to facilitate the construction of confidence intervals for the estimands.
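To make the Riesz construction concrete, here is the textbook argument in a simple special case (the notation and the Horvitz-Thompson example are illustrative, not the paper's general setting):

```latex
Let $\mathcal{F}$ be a Hilbert space of potential outcome functions with inner
product $\langle \cdot, \cdot \rangle$, and let the estimand be $\theta = L(f)$
for a bounded linear functional $L$. By the Riesz representation theorem there
is a unique representer $\phi \in \mathcal{F}$ with
\[
  L(f) = \langle f, \phi \rangle \quad \text{for all } f \in \mathcal{F}.
\]
For instance, with the average treatment effect
$\tau = \tfrac{1}{n}\sum_{i=1}^{n}\bigl(y_i(1) - y_i(0)\bigr)$
and independent assignment probabilities $\pi_i$, weighting the observed
outcomes by the representer recovers the familiar Horvitz--Thompson estimator
\[
  \hat{\tau} = \frac{1}{n}\sum_{i=1}^{n}
  \left( \frac{z_i\, y_i}{\pi_i} - \frac{(1 - z_i)\, y_i}{1 - \pi_i} \right).
\]
```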
Failure is common in clinical trials, and the negative results it produces are valuable: they indicate which approaches should not be pursued. In this paper, we propose an automated approach to extracting positive and negative clinical research results by introducing a PICOE (Population, Intervention, Comparison, Outcome, and Effect) framework to represent randomized controlled trial (RCT) reports, where E denotes the effect between a specific I and O. We develop a pipeline to extract the statistical effects reported in natural-language RCT reports and assign each to a specific I-O pair. The extraction models achieve a high degree of accuracy in extracting ICO and E descriptive words through two rounds of training. By defining a p-value threshold, we find that among all COVID-19-related intervention-outcome pairs with statistical tests, negative results account for nearly 40%. We believe this observation is noteworthy because the results are extracted from the published literature, which carries an inherent risk of reporting bias favoring positive results over negative ones. We provide a tool to systematically assess the current level of clinical evidence by distinguishing negative results from positive ones.
Relational verification encompasses information flow security, regression verification, translation validation for compilers, and more. Effective alignment of the programs and computations to be related facilitates use of simpler relational invariants and relational procedure specs, which in turn enables automation and modular reasoning. Alignment has been explored in terms of trace pairs, deductive rules of relational Hoare logics (RHL), and several forms of product automata. This article shows how a simple extension of Kleene Algebra with Tests (KAT), called BiKAT, subsumes prior formulations, including alignment witnesses for forall-exists properties, which brings to light new RHL-style rules for such properties. Alignments can be discovered algorithmically or devised manually but, in either case, their adequacy with respect to the original programs must be proved; an explicit algebra enables constructive proof by equational reasoning. Furthermore, our approach inherits algorithmic benefits from existing KAT-based techniques and tools, which are applicable to a range of semantic models.
Positive-Unlabeled (PU) learning aims to learn a binary classifier from a few labeled positive examples and many unlabeled ones. Compared with ordinary semi-supervised learning, this task is much more challenging due to the absence of any known negative labels. While existing cost-sensitive methods have achieved state-of-the-art performance, they explicitly minimize the risk of classifying unlabeled data as negative samples, which can induce a negative-prediction preference in the classifier. To alleviate this issue, we approach PU learning from a label-distribution perspective. Since the label distribution of unlabeled data is fixed when the class prior is known, it can naturally serve as learning supervision for the model. Motivated by this, we propose to pursue consistency between the predicted and ground-truth label distributions, formulated by aligning their expectations. Moreover, we adopt entropy minimization and Mixup regularization to avoid the trivial solution of label-distribution consistency on unlabeled data and to mitigate the consequent confirmation bias. Experiments on three benchmark datasets validate the effectiveness of the proposed method. Code is available at: //github.com/Ray-rui/Dist-PU-Positive-Unlabeled-Learning-from-a-Label-Distribution-Perspective.
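As a deliberately simplified illustration of the label-distribution idea, the sketch below aligns the mean predicted positive probability on unlabeled data with the known class prior and adds an entropy term; the function names, the absolute-difference alignment, and the omission of Mixup are our simplifications, not the paper's exact objective.

```python
import numpy as np

def dist_pu_loss(p_unlabeled, p_positive, prior, lam_ent=0.1):
    """Sketch of a label-distribution-consistency objective for PU learning.

    p_unlabeled: predicted P(y=1|x) on unlabeled data
    p_positive:  predicted P(y=1|x) on labeled positives
    prior:       known class prior P(y=1)
    """
    # supervised term: labeled positives should be predicted positive
    sup = -np.mean(np.log(p_positive + 1e-12))
    # consistency: expected predicted label distribution matches the prior
    consistency = abs(np.mean(p_unlabeled) - prior)
    # entropy minimization discourages hedging all predictions at the prior
    ent = -np.mean(p_unlabeled * np.log(p_unlabeled + 1e-12)
                   + (1 - p_unlabeled) * np.log(1 - p_unlabeled + 1e-12))
    return sup + consistency + lam_ent * ent
```

A classifier whose predicted positive fraction drifts far above the prior pays a larger consistency penalty than one whose predictions match it.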
Sampling-based methods have become a cornerstone of contemporary approaches to Model Predictive Control (MPC), as they place no restrictions on the differentiability of the dynamics or cost function and are straightforward to parallelize. However, their efficacy is highly dependent on the quality of the sampling distribution itself, which is often assumed to be simple, like a Gaussian. This restriction can result in samples that are far from optimal, leading to poor performance. Recent work has explored improving the performance of MPC by sampling in a learned latent space of controls. However, these methods ultimately perform all MPC parameter updates and warm-starting between time steps in the control space. This requires reliance on a number of heuristics for generating samples and updating the distribution, and may lead to sub-optimal performance. Instead, we propose to carry out all operations in the latent space, allowing us to take full advantage of the learned distribution. Specifically, we frame the learning problem as bi-level optimization and show how to train the controller with backpropagation-through-time. By using a normalizing flow parameterization of the distribution, we can leverage its tractable density to avoid requiring differentiability of the dynamics and cost function. Finally, we evaluate the proposed approach on simulated robotics tasks and demonstrate its ability to surpass prior methods and to scale better as the number of samples is reduced.
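For context, a minimal control-space sampling MPC baseline of the kind the paper improves upon can be sketched as follows; the double-integrator dynamics, quadratic cost, and elite-refitting scheme are illustrative assumptions, not the paper's method.

```python
import numpy as np

def rollout(x0, controls, dt=0.1):
    """Simulate 1D double-integrator dynamics x = (position, velocity)
    under a control sequence and accumulate a quadratic cost."""
    x = np.array(x0, dtype=float)
    cost = 0.0
    for u in controls:
        x[0] += x[1] * dt
        x[1] += u * dt
        cost += x[0] ** 2 + 0.01 * u ** 2
    return cost

def sampling_mpc(x0, horizon=15, n_samples=256, n_iters=5, seed=0):
    """Minimal Gaussian random-shooting MPC: sample control sequences,
    roll them out, and refit the sampling distribution to the elites."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(horizon), np.ones(horizon)
    for _ in range(n_iters):
        samples = mean + std * rng.standard_normal((n_samples, horizon))
        costs = np.array([rollout(x0, s) for s in samples])
        elites = samples[np.argsort(costs)[:n_samples // 10]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-3
    return mean  # first entry is the control applied at this time step
```

Note that everything here happens in the raw control space; the paper's point is to move these sampling and update steps into a learned latent space instead.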
In empirical studies, the data usually do not include all the variables of interest in an economic model. This paper shows how unobserved variables in individual observations can be identified at the population level. When the observables are distinct across observations, there exists a function mapping the observables to the unobservables. Such a function guarantees the uniqueness of the latent value in each observation. The key lies in identifying the joint distribution of observables and unobservables from the distribution of observables alone. The joint distribution then reveals the latent value in each observation. Three examples of this result are discussed.
We propose simple estimators for mediation analysis and dynamic treatment effects over short horizons based on kernel ridge regression. We study both nonparametric response curves and semiparametric treatment effects, allowing treatments, mediators, and covariates to be continuous or discrete in general spaces. Our key innovation is a new RKHS technique called sequential mean embedding, which facilitates the construction of simple estimators for complex causal estimands, including new estimands without existing alternatives. In particular, we propose machine learning estimators of dynamic dose response curves and dynamic counterfactual distributions without restrictive linearity, Markov, or no-effect-modification assumptions. Our simple estimators preserve the generality of classic identification while also achieving nonasymptotic uniform rates for causal functions and semiparametric efficiency for causal scalars. In nonlinear simulations with many covariates, we demonstrate state-of-the-art performance. We estimate mediated and dynamic response curves of the US Job Corps program for disadvantaged youth, and share a data set that may serve as a benchmark in future work.
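The RKHS building block underlying such estimators is ordinary kernel ridge regression, whose closed form can be sketched as follows; the RBF kernel and hyperparameters are illustrative, and this is not the paper's sequential mean embedding itself.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between row-stacked inputs A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_ridge_fit(X, y, lam=1e-3, gamma=1.0):
    """Closed-form kernel ridge regression: alpha = (K + n*lam*I)^{-1} y."""
    n = len(X)
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def kernel_ridge_predict(X_train, alpha, X_test, gamma=1.0):
    """Evaluate the fitted function f(x) = sum_i alpha_i k(x, x_i)."""
    return rbf_kernel(X_test, X_train, gamma) @ alpha
```

Sequential mean embedding composes conditional-expectation operators estimated by regressions of exactly this form, which is why the resulting causal estimators remain simple.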
In this paper we propose the adaptive lasso for predictive quantile regression (ALQR). Reflecting empirical findings, we allow predictors to have various degrees of persistence and to exhibit different signal strengths. The number of predictors is allowed to grow with the sample size. We establish regularity conditions under which stationary, local-unit-root, and cointegrated predictors may be present simultaneously. We then derive the convergence rates, model selection consistency, and asymptotic distributions of ALQR. We apply the proposed method to out-of-sample quantile prediction of stock returns and find that it outperforms existing alternatives. We also provide numerical evidence from additional Monte Carlo experiments supporting the theoretical results.
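A minimal sketch of adaptive-lasso quantile regression, cast as a linear program; the pilot-OLS weights, the absence of an intercept, and the hyperparameters are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from scipy.optimize import linprog

def adaptive_lasso_qr(X, y, tau=0.5, lam=0.5, gamma=1.0):
    """Adaptive-lasso quantile regression via LP.

    Pilot OLS coefficients give the adaptive weights w_j = 1/|b_j|^gamma;
    we then minimize the check loss plus the weighted L1 penalty
    sum_i rho_tau(y_i - x_i'beta) + lam * sum_j w_j |beta_j|.
    """
    n, p = X.shape
    b_init, *_ = np.linalg.lstsq(X, y, rcond=None)
    w = 1.0 / (np.abs(b_init) ** gamma + 1e-6)
    # LP variables: [beta+ (p), beta- (p), u+ (n), u- (n)], all >= 0,
    # where the residual is r = u+ - u- and rho_tau(r) = tau*u+ + (1-tau)*u-.
    c = np.concatenate([lam * w, lam * w,
                        tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    z = res.x
    return z[:p] - z[p:2 * p]
```

Small pilot coefficients receive large weights and are driven exactly to zero, which is the mechanism behind the model selection consistency studied in the paper.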
Multivariate outcomes are common in pragmatic cluster randomized trials. While sample size calculation procedures for multivariate outcomes exist under parallel assignment, none have been developed for a stepped wedge design. In this article, we present computationally efficient power and sample size procedures for stepped wedge cluster randomized trials (SW-CRTs) with multivariate outcomes that differentiate the within-period and between-period intracluster correlation coefficients (ICCs). Under a multivariate linear mixed model, we derive the joint distribution of the intervention test statistics, which can be used for determining power under different hypotheses, and provide an example using the commonly utilized intersection-union test for co-primary outcomes. Simplifications under a common treatment effect and common ICCs across endpoints, as well as an extension to closed-cohort designs, are also provided. Finally, under the assumption of a common ICC across endpoints, we formally prove that the multivariate linear mixed model leads to a more efficient treatment effect estimator than the univariate linear mixed model, providing a rigorous justification for the use of the former with multivariate outcomes. We illustrate application of the proposed methods using data from an existing SW-CRT and present extensive simulations to validate the methods.
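The univariate ingredient of such calculations can be sketched by computing the GLS variance of the treatment effect under a Hussey-Hughes-type mixed model and converting it to power. This is a single-outcome, cross-sectional sketch with illustrative parameters, not the paper's multivariate procedure.

```python
import numpy as np
from scipy.stats import norm

def sw_crt_power(I=12, T=4, m=20, theta=0.3, sigma2=1.0, icc=0.05, alpha=0.05):
    """Power for a cross-sectional stepped wedge design with one outcome,
    from the GLS variance of the treatment effect at the cluster-period
    level. icc = tau2 / (tau2 + sigma2) with random cluster intercepts."""
    tau2 = icc * sigma2 / (1 - icc)
    # standard stepped wedge: clusters cross over to treatment in equal waves
    waves = np.array_split(np.arange(I), T - 1)
    Xtrt = np.zeros((I, T))
    for w, clusters in enumerate(waves):
        Xtrt[np.ix_(clusters, np.arange(w + 1, T))] = 1.0
    ZtVZ = np.zeros((T + 1, T + 1))
    for i in range(I):
        # columns: intercept, period dummies (periods 2..T), treatment
        Zi = np.zeros((T, T + 1))
        Zi[:, 0] = 1.0
        for j in range(1, T):
            Zi[j, j] = 1.0
        Zi[:, -1] = Xtrt[i]
        Vi = np.full((T, T), tau2) + np.eye(T) * sigma2 / m
        ZtVZ += Zi.T @ np.linalg.solve(Vi, Zi)
    var_theta = np.linalg.inv(ZtVZ)[-1, -1]
    z = abs(theta) / np.sqrt(var_theta)
    return norm.cdf(z - norm.ppf(1 - alpha / 2))
```

The paper's contribution extends this kind of calculation to multivariate outcomes with distinct within-period and between-period ICCs and to joint tests across endpoints.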
Existing knowledge graph (KG) embedding models have primarily focused on static KGs. However, real-world KGs do not remain static, but rather evolve and grow in tandem with the development of KG applications. Consequently, new facts and previously unseen entities and relations continually emerge, necessitating an embedding model that can quickly learn and transfer new knowledge as the KG grows. Motivated by this, we delve into an expanding field of KG embedding in this paper, i.e., lifelong KG embedding. We consider knowledge transfer and retention when learning on growing snapshots of a KG, without having to relearn embeddings from scratch. The proposed model includes a masked KG autoencoder for embedding learning and update, an embedding transfer strategy to inject learned knowledge into new entity and relation embeddings, and an embedding regularization method to avoid catastrophic forgetting. To investigate the impacts of different aspects of KG growth, we construct four datasets to evaluate the performance of lifelong KG embedding. Experimental results show that the proposed model outperforms the state-of-the-art inductive and lifelong embedding baselines.
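Two of the continual-learning ingredients can be sketched in a few lines. This is a deliberately simplified stand-in for the paper's transfer and regularization components; the function names, the neighbor-mean initialization, and the plain L2 penalty are our assumptions.

```python
import numpy as np

def transfer_init(new_entity_neighbors, old_emb):
    """Initialize a new entity's embedding from its already-embedded
    neighbors (a simple stand-in for an embedding transfer strategy)."""
    return old_emb[new_entity_neighbors].mean(axis=0)

def regularized_update(emb, grad, old_emb, seen_mask, lr=0.1, lam=1.0):
    """One SGD step on the KG loss plus an L2 pull toward the previous
    snapshot's embeddings for already-seen entities (anti-forgetting)."""
    reg_grad = np.zeros_like(emb)
    reg_grad[seen_mask] = 2 * lam * (emb[seen_mask] - old_emb[seen_mask])
    return emb - lr * (grad + reg_grad)
```

The regularizer only constrains entities present in earlier snapshots, so new entities remain free to move while old knowledge is retained.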
When is heterogeneity in the composition of an autonomous robotic team beneficial, and when is it detrimental? We investigate and answer this question in the context of a minimally viable model that examines the role of heterogeneous speeds in perimeter defense problems, where defenders share a total allocated speed budget. We consider two distinct problem settings and develop strategies based on dynamic programming and on local interaction rules. We present a theoretical analysis of both approaches, and our results are extensively validated using simulations. Interestingly, our results demonstrate that the viability of heterogeneous teams depends on the amount of information available to the defenders. Moreover, our results suggest a universality property: across a wide range of problem parameters, the optimal ratio of the speeds of the defenders remains nearly constant.