Exploratory factor analysis (EFA) has been widely used to learn the latent structure underlying multivariate data. Rotation and regularised estimation are two classes of methods in EFA commonly used to find interpretable loading matrices. This paper proposes a new family of oblique rotations based on component-wise $L^p$ loss functions $(0 < p \leq 1)$ that is closely related to an $L^p$ regularised estimator. Model selection and post-selection inference procedures are developed based on the proposed rotation. When the true loading matrix is sparse, the proposed method tends to outperform traditional rotation and regularised estimation methods in terms of statistical accuracy and computational cost. Since the proposed loss functions are non-smooth, an iteratively reweighted gradient projection algorithm is developed to solve the optimisation problem. Theoretical results are developed that establish the statistical consistency of the estimation, model selection, and post-selection inference. The proposed method is evaluated and compared with regularised estimation and traditional rotation methods via simulation studies, and is further illustrated by an application to Big Five personality assessment.
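Since the rotation criterion is non-smooth at zero loadings, the iteratively reweighted gradient projection idea can be sketched as follows: replace $|x|^p$ by the smooth surrogate $(x^2+\epsilon)^{p/2}$, take gradient steps on the rotation matrix, project back onto the oblique manifold (unit-length columns), and shrink $\epsilon$ between sweeps. The sketch below is a minimal, simplified illustration of this scheme, not the authors' implementation; the function name, step size, and smoothing schedule are hypothetical.

```python
import numpy as np

def lp_oblique_rotation(A, p=0.5, eps0=1e-1, n_outer=20, n_inner=200, alpha=0.01):
    """Minimal IRGP sketch: minimise sum_ij (L_ij^2 + eps)^(p/2) over oblique
    rotations, where L = A @ inv(T).T and T has unit-length columns."""
    k = A.shape[1]
    T = np.eye(k)
    eps = eps0
    for _ in range(n_outer):                   # outer loop: shrink smoothing eps
        for _ in range(n_inner):               # inner loop: gradient projection
            L = A @ np.linalg.inv(T).T         # rotated loadings
            Gq = p * L * (L**2 + eps)**((p - 2) / 2)  # grad of smoothed loss wrt L
            G = -np.linalg.inv(T).T @ Gq.T @ L # chain rule: gradient wrt T
            T = T - alpha * G                  # descent step on T
            T /= np.linalg.norm(T, axis=0)     # project: renormalise columns
        eps *= 0.5                             # reweighting via smaller smoothing
    return A @ np.linalg.inv(T).T, T
```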
In this work we are interested in general linear inverse problems where the corresponding forward problem is solved iteratively by fixed-point methods. In this setting, one-shot methods, which iterate simultaneously on the forward problem solution and on the inverse problem unknown, can be applied. We analyze two variants of so-called multi-step one-shot methods and establish sufficient conditions on the descent step for their convergence by studying the eigenvalues of the block matrix of the coupled iterations. Several numerical experiments illustrate the convergence of these methods in comparison with the classical usual and shifted gradient descent methods. In particular, we observe that very few inner iterations on the forward problem are enough to guarantee good convergence of the inversion algorithm.
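A toy picture of the multi-step variant, under the assumption of a linear forward model $Au = F\sigma$ solved by Richardson fixed-point iterations: each outer step performs only $s$ inner iterations on the forward and adjoint states before updating the unknown, rather than solving the forward problem to convergence. All names and the specific fixed-point map below are illustrative, not the paper's notation.

```python
import numpy as np

def multi_step_one_shot(A, F, H, d, s=3, tau=0.1, omega=0.5, n_outer=500):
    """Toy multi-step one-shot inversion for the linear model A u = F sigma
    with data d = H u.  Each outer iteration performs s Richardson steps on
    the forward and adjoint states, then one descent step on sigma."""
    n, m = A.shape[0], F.shape[1]
    u = np.zeros(n); lam = np.zeros(n); sigma = np.zeros(m)
    for _ in range(n_outer):
        for _ in range(s):                      # s inner iterations, not full solves
            u += omega * (F @ sigma - A @ u)    # forward fixed point
            lam += omega * (H.T @ (H @ u - d) - A.T @ lam)  # adjoint fixed point
        sigma -= tau * (F.T @ lam)              # descent step on the unknown
    return sigma, u
```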
Heterogeneity is a dominant factor in the behaviour of many biological processes. Despite this, it is common for mathematical and statistical analyses to ignore biological heterogeneity as a source of variability in experimental data. Methods for exploring the identifiability of models that explicitly incorporate heterogeneity through variability in model parameters are therefore relatively underdeveloped. We develop a new likelihood-based framework, based on moment matching, for inference and identifiability analysis of differential equation models that capture biological heterogeneity through parameters that vary according to probability distributions. Because our novel method is based on an approximate likelihood function, it is highly flexible; we demonstrate identifiability analysis using both a frequentist approach based on profile likelihood and a Bayesian approach based on Markov chain Monte Carlo. Through three case studies, we provide a didactic guide to inference and identifiability analysis of hyperparameters that relate to the statistical moments of model parameters, using independently observed data. Our approach has a computational cost comparable to analysis of models that neglect heterogeneity, a significant improvement over many existing alternatives. We demonstrate how analysis of random parameter models can aid better understanding of the sources of heterogeneity in biological data.
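As a concrete (hypothetical) illustration of the moment-matching idea, suppose individuals follow exponential decay $y(t) = e^{-kt}$ with a random rate $k \sim \mathrm{Normal}(\mu, s^2)$, and we observe per-time sample means across individuals. The sketch below builds an approximate Gaussian likelihood for the hyperparameters $(\mu, s)$ by Monte Carlo matching of the model-implied moments; the paper's construction is more general than this toy model.

```python
import numpy as np
from scipy.optimize import minimize

def model(t, k):
    """Toy model y(t) = exp(-k t) for a vector of individual rates k."""
    return np.exp(-np.outer(k, t))

def approx_loglik(theta, t, ybar, n_per_time, n_mc=10_000, seed=0):
    """Moment-matching approximate log-likelihood for theta = (mu, s) of
    k ~ Normal(mu, s^2): the observed per-time sample mean ybar is treated
    as Gaussian around the model-implied mean, with variance given by the
    model-implied variance over n (central limit argument)."""
    mu, s = theta
    k = np.random.default_rng(seed).normal(mu, s, n_mc)
    y = model(t, k)
    m1 = y.mean(axis=0)                 # model-implied first moment at each t
    se2 = y.var(axis=0) / n_per_time    # sampling variance of the observed mean
    return -0.5 * np.sum((ybar - m1) ** 2 / se2 + np.log(2 * np.pi * se2))

# maximum (approximate) likelihood over the hyperparameters:
# fit = minimize(lambda th: -approx_loglik(th, t, ybar, n), x0=[1.0, 0.2])
```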
Granular-ball computing is an efficient, robust, and scalable learning method for granular computing, and its basis is the granular-ball generation method. This paper proposes a method for accelerating granular-ball generation by using division in place of $k$-means, which greatly improves the efficiency of granular-ball generation while ensuring accuracy similar to that of the existing method. In addition, a new adaptive method for granular-ball generation is proposed that accounts for the elimination of overlap between granular balls, among other factors, making the generation process parameter-free and fully adaptive in the true sense. Furthermore, this paper is the first to provide mathematical models for granular-ball covering. Experimental results on several real data sets demonstrate that the two proposed generation methods achieve accuracy similar to the existing method while realizing adaptiveness or acceleration.
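To make the division idea concrete, a minimal sketch is given below: instead of running 2-means inside each ball, points are assigned once to the two class centroids, and balls are split recursively until a purity threshold is met. The function names and the threshold are illustrative; this is not the authors' code.

```python
import numpy as np

def purity(y):
    """Fraction of points in the majority class."""
    _, counts = np.unique(y, return_counts=True)
    return counts.max() / counts.sum()

def split_once(X, y):
    """Division step used in place of 2-means: a single-pass assignment of
    points to the two class centroids, with no iterative refinement."""
    labels = np.unique(y)
    c0 = X[y == labels[0]].mean(axis=0)
    c1 = X[y == labels[1]].mean(axis=0) if len(labels) > 1 else X.mean(axis=0)
    return np.linalg.norm(X - c0, axis=1) <= np.linalg.norm(X - c1, axis=1)

def granular_balls(X, y, pure_thresh=0.95):
    """Recursively divide the data into balls until each is pure enough;
    a ball is summarised by its centre and radius.  Illustrative sketch."""
    mask = split_once(X, y) if len(X) > 1 else None
    if (purity(y) >= pure_thresh or len(X) < 2
            or mask.all() or (~mask).all()):   # pure enough or degenerate split
        c = X.mean(axis=0)
        return [(c, np.linalg.norm(X - c, axis=1).max())]
    return granular_balls(X[mask], y[mask], pure_thresh) + \
           granular_balls(X[~mask], y[~mask], pure_thresh)
```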
Compressed Stochastic Gradient Descent (SGD) algorithms have been recently proposed to address the communication bottleneck in distributed and decentralized optimization problems, such as those that arise in federated machine learning. Existing compressed SGD algorithms assume the use of non-adaptive step-sizes (constant or diminishing) to provide theoretical convergence guarantees. In practice, the step-sizes are typically fine-tuned to the dataset and the learning algorithm to obtain good empirical performance. Such fine-tuning might be impractical in many learning scenarios, and it is therefore of interest to study compressed SGD with adaptive step-sizes. Motivated by prior work on adaptive step-size methods for SGD to train neural networks efficiently in the uncompressed setting, we develop an adaptive step-size method for compressed SGD. In particular, we introduce a scaling technique for the descent step in compressed SGD, which we use to establish order-optimal convergence rates for convex-smooth and strongly convex-smooth objectives under an interpolation condition, and for non-convex objectives under a strong growth condition. We also show through simulation examples that without this scaling the algorithm can fail to converge. Finally, we present experimental results on deep neural networks for real-world datasets, compare the performance of our algorithm with previously proposed compressed SGD methods in the literature, and demonstrate improved performance on ResNet-18, ResNet-34, and DenseNet architectures for the CIFAR-100 and CIFAR-10 datasets at various levels of compression.
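The overall structure can be sketched as follows, assuming a top-$k$ sparsifier and an AdaGrad-norm style adaptive step-size, with the descent step scaled by the compression ratio $k/d$; the paper's exact scaling and step-size rule may differ. Here `grad` is assumed to be a stochastic gradient oracle.

```python
import numpy as np

def topk(g, k):
    """Top-k sparsifier: keep the k largest-magnitude coordinates."""
    out = np.zeros_like(g)
    idx = np.argpartition(np.abs(g), -k)[-k:]
    out[idx] = g[idx]
    return out

def compressed_sgd_adaptive(grad, x0, k, n_steps=1000, b=1e-8):
    """Illustrative compressed SGD with an AdaGrad-norm adaptive step-size;
    the descent step is additionally scaled by the compression ratio k/d.
    `grad(x)` is assumed to return a stochastic gradient at x."""
    x = x0.copy()
    delta = k / x0.size                   # compression ratio used for scaling
    acc = b                               # accumulates squared gradient norms
    for _ in range(n_steps):
        g = topk(grad(x), k)              # compressed stochastic gradient
        acc += np.dot(g, g)
        eta = 1.0 / np.sqrt(acc)          # adaptive (AdaGrad-norm) step-size
        x -= delta * eta * g              # scaled descent step
    return x
```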
We propose a general model that jointly characterizes degree heterogeneity and homophily in weighted, undirected networks. We present a moment estimation method using node degrees and homophily statistics, and establish consistency and asymptotic normality of our estimator using a novel analysis. We apply our general framework to three applications, including both exponential family and non-exponential family models. Comprehensive numerical studies and a data example further demonstrate the usefulness of our method.
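A toy version of the moment approach, under the (hypothetical) assumption of Poisson edge weights with mean $\exp(\beta_i + \beta_j + \gamma z_{ij})$: the estimating equations match the observed node degrees and the homophily statistic to their expectations. The sketch is illustrative only and not the paper's general framework.

```python
import numpy as np
from scipy.optimize import fsolve

def moment_estimator(A, Z):
    """Toy moment estimation for degree heterogeneity (beta) and homophily
    (gamma) with Poisson weights of mean exp(beta_i + beta_j + gamma*z_ij):
    match degrees and the homophily statistic to their expectations."""
    n = A.shape[0]
    deg = A.sum(axis=1)                   # observed node degrees
    hstat = (Z * A).sum() / 2             # observed homophily statistic

    def equations(theta):
        beta, gamma = theta[:n], theta[n]
        mu = np.exp(beta[:, None] + beta[None, :] + gamma * Z)
        np.fill_diagonal(mu, 0.0)         # no self-loops
        return np.append(mu.sum(axis=1) - deg, (Z * mu).sum() / 2 - hstat)

    theta = fsolve(equations, np.zeros(n + 1))
    return theta[:n], theta[n]
```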
The Extended Randomized Kaczmarz method is a well-known iterative scheme that can find the Moore-Penrose inverse solution of a possibly inconsistent linear system and requires, in each iteration, only one additional column of the system matrix compared with the standard randomized Kaczmarz method. The Sparse Randomized Kaczmarz method, in turn, has been shown to converge linearly to a sparse solution of a consistent linear system. Here, we combine both ideas and propose an Extended Sparse Randomized Kaczmarz method. We show linear expected convergence to a sparse least squares solution, in the sense that an extended variant of the regularized basis pursuit problem is solved. Moreover, we generalize the additional step in the method and prove convergence to a more abstract optimization problem. We demonstrate numerically that our method can find sparse least squares solutions of real and complex systems when the noise is concentrated in the complement of the range of the system matrix, and that our generalization can handle impulsive noise.
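The combined iteration can be sketched as follows: a random-column step drives an auxiliary vector $z$ towards the projection of $b$ onto the orthogonal complement of the range of $A$ (the extended part), while a sparse Kaczmarz step with soft thresholding is applied to the corrected right-hand side $b - z$ (the sparse part). This is a minimal illustration under squared-norm row/column sampling; the step details in the paper may differ.

```python
import numpy as np

def soft(v, lam):
    """Soft-thresholding (shrinkage) operator."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def ex_sparse_rk(A, b, lam=1.0, n_iter=10**5, seed=0):
    """Sketch of an Extended Sparse Randomized Kaczmarz iteration combining
    the z-step of the Extended RK method with the soft-thresholded primal
    update of the Sparse RK method."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_p = np.linalg.norm(A, axis=1)**2; row_p /= row_p.sum()
    col_p = np.linalg.norm(A, axis=0)**2; col_p /= col_p.sum()
    x = np.zeros(n); xdual = np.zeros(n); z = b.copy()
    for _ in range(n_iter):
        j = rng.choice(n, p=col_p)        # column step: z -> P_{R(A)^perp} b
        z -= (A[:, j] @ z) / (A[:, j] @ A[:, j]) * A[:, j]
        i = rng.choice(m, p=row_p)        # row step on the corrected system
        t = (A[i] @ x - (b[i] - z[i])) / (A[i] @ A[i])
        xdual -= t * A[i]                 # dual (auxiliary) update
        x = soft(xdual, lam)              # primal update: soft thresholding
    return x
```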
This study demonstrates the existence of a testable condition for the identification of the causal effect of a treatment on an outcome in observational data, which relies on two sets of variables: observed covariates to be controlled for and a suspected instrument. Under a causal structure commonly found in empirical applications, the testable conditional independence of the suspected instrument and the outcome given the treatment and the covariates has two implications. First, the instrument is valid, i.e., it does not directly affect the outcome (other than through the treatment) and is unconfounded conditional on the covariates. Second, the treatment is unconfounded conditional on the covariates, so that the treatment effect is identified. We suggest tests of this conditional independence based on machine learning methods that account for covariates in a data-driven way, and investigate their asymptotic behavior and finite sample performance in a simulation study. We also apply our testing approach to evaluating the impact of fertility on female labor supply, using the sibling sex ratio of the first two children as the supposed instrument; the results by and large point to a violation of our testable implication for the moderate set of socio-economic covariates considered.
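One simple instance of such a test, assuming a continuous outcome and instrument: fit machine-learning regressions of the outcome $Y$ and the suspected instrument $Z$ on the treatment and covariates $(D, X)$ on one half of the sample, and t-test the covariance of the out-of-sample residuals, which is mean zero under the conditional independence null. The sketch below uses random forests for concreteness; the paper's test statistics may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from scipy import stats

def ci_test(y, z, DX, seed=0):
    """Residual-covariance test of Z independent of Y given (D, X):
    sample splitting avoids overfitting bias in the ML regressions."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    a, b = idx[: n // 2], idx[n // 2:]
    ry = y[b] - RandomForestRegressor(random_state=0).fit(DX[a], y[a]).predict(DX[b])
    rz = z[b] - RandomForestRegressor(random_state=0).fit(DX[a], z[a]).predict(DX[b])
    prod = ry * rz                        # mean zero under the null
    tstat = prod.mean() / (prod.std(ddof=1) / np.sqrt(len(prod)))
    return tstat, 2 * (1 - stats.norm.cdf(abs(tstat)))   # two-sided p-value
```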
Propensity score weighting is widely used to improve representativeness and correct selection bias in voluntary samples. The propensity score is often developed using a model for the sampling probability, which can be subject to model misspecification. In this paper, we consider an alternative approach that estimates the inverse of the propensity scores using a density ratio function satisfying the self-efficiency condition. The smoothed density ratio function is obtained as the solution to the information projection onto the space satisfying the moment conditions on the balancing scores. By including only the covariates for the outcome regression models in the density ratio model, we can achieve efficient propensity score estimation. Penalized regression is used to identify important covariates. We further extend the proposed approach to the multivariate missing-data case. Limited simulation studies are presented to compare the proposed method with existing ones.
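The information-projection step admits a compact sketch: among all weights on the voluntary sample satisfying the balancing (moment) conditions, the Kullback-Leibler projection has an exponential-tilting form, and the tilting parameter solves a convex dual problem. The code below is a minimal illustration for calibrating sample covariate means to known population means; the variable names are hypothetical and the paper's formulation is more general.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

def info_projection_weights(Xs, xbar_pop):
    """Weights on the voluntary sample from the KL (information) projection
    onto the moment conditions sum_i w_i x_i = population mean of x.
    The solution has exponential-tilting form w_i proportional to exp(x_i'lam)."""
    def dual(lam):
        # convex dual of the KL projection under the moment constraints
        return logsumexp(Xs @ lam) - xbar_pop @ lam
    lam = minimize(dual, np.zeros(Xs.shape[1]), method="BFGS").x
    return np.exp(Xs @ lam - logsumexp(Xs @ lam))   # normalised weight estimates
```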
The existing literature on constructing optimal treatment regimes often focuses on intention-to-treat analyses that completely ignore individuals' compliance behavior. Instrumental variable-based methods have been developed to learn optimal regimes under endogeneity. However, when there are two active treatment arms, the average causal effects of the treatments cannot be identified using a binary instrument, so these existing methods are not applicable. To fill this gap, we provide a procedure that identifies an optimal regime and the corresponding value function as a function of a vector of sensitivity parameters. We also derive the canonical gradient of the target parameter and propose a multiply robust classification-based estimator of the optimal regime. Our simulations highlight the need for, and the usefulness of, the proposed method in practice. We apply our method to data from the Adaptive Treatment for Alcohol and Cocaine Dependence randomized trial.
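For intuition, the classification perspective can be sketched as follows: given an estimated treatment contrast $\hat{\Delta}(x)$ (in the paper, a multiply robust estimate indexed by the sensitivity parameters), the optimal regime corresponds to the sign of the contrast, and its magnitude acts as a misclassification weight, so any weighted classifier can be plugged in. The sketch below is illustrative only, not the paper's estimator.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def classification_regime(X, delta_hat):
    """Weighted-classification view of optimal regime estimation: labels are
    sign(delta_hat) and |delta_hat| is the misclassification weight."""
    labels = (delta_hat > 0).astype(int)          # treat iff contrast positive
    clf = DecisionTreeClassifier(max_depth=3)
    clf.fit(X, labels, sample_weight=np.abs(delta_hat))
    return clf                                    # clf.predict(x) is the regime
```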
Causality can be described in terms of a structural causal model (SCM) that carries information on the variables of interest and their mechanistic relations. For most processes of interest the underlying SCM will only be partially observable, so causal inference tries to leverage any exposed information. Graph neural networks (GNN), as universal approximators on structured input, are a viable candidate for causal learning, suggesting a tighter integration with SCM. To this end we present a theoretical analysis from first principles that establishes a novel connection between GNN and SCM while providing an extended view on general neural-causal models. We then establish a new model class for GNN-based causal inference that is necessary and sufficient for causal effect identification. Our empirical illustrations on simulations and standard benchmarks validate our theoretical results.
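For readers unfamiliar with the ingredients, a single message-passing layer restricted to a given (causal) graph looks as follows; constraining aggregation to the parents in an SCM's graph is the kind of structural tie between GNN and SCM that the analysis formalises. The layer below is a generic sketch, not the model class proposed in the paper.

```python
import numpy as np

def gnn_layer(H, A, W_self, W_nbr):
    """One message-passing layer over a directed graph: node j aggregates
    the embeddings of its parents {i : A[i, j] = 1}, mirroring how an SCM
    computes each variable from its causal parents.  ReLU nonlinearity."""
    msgs = A.T @ H                                    # sum of parent embeddings
    return np.maximum(H @ W_self + msgs @ W_nbr, 0.0)

# example: a 3-node chain X -> Y -> Z
# A = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
```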