We develop a methodology for proving central limit theorems in network models with strategic interactions and homophilous agents. Since data often consists of observations on a single large network, we consider an asymptotic framework in which the network size tends to infinity. In the presence of strategic interactions, network moments are generally complex functions of components, where a node's component consists of all alters to which it is directly or indirectly connected. We find that a modification of "exponential stabilization" conditions from the stochastic geometry literature provides a useful formulation of weak dependence for moments of this type. We establish a CLT for a network moments satisfying stabilization and provide a methodology for deriving primitive sufficient conditions for stabilization using results in branching process theory. We apply the methodology to static and dynamic models of network formation.
Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks. In this manuscript, we characterise the learning of a mixture of $K$ Gaussians with generic means and covariances via empirical risk minimisation (ERM) with any convex loss and regularisation. In particular, we prove exact asymptotics characterising the ERM estimator in high-dimensions, extending several previous results about Gaussian mixture classification in the literature. We exemplify our result in two tasks of interest in statistical learning: a) classification for a mixture with sparse means, where we study the efficiency of $\ell_1$ penalty with respect to $\ell_2$; b) max-margin multi-class classification, where we characterise the phase transition on the existence of the multi-class logistic maximum likelihood estimator for $K>2$. Finally, we discuss how our theory can be applied beyond the scope of synthetic data, showing that in different cases Gaussian mixtures capture closely the learning curve of classification tasks in real data sets.
This work provides a theoretical framework for the pose estimation problem using total least squares for vector observations from landmark features. First, the optimization framework is formulated with observation vectors extracted from point cloud features. Then, error-covariance expressions are derived. The attitude and position solutions obtained via the derived optimization framework are proven to reach the bounds defined by the Cram\'er-Rao lower bound under the small-angle approximation of attitude errors. The measurement data for the simulation of this problem is provided through a series of vector observation scans, and a fully populated observation noise-covariance matrix is assumed as the weight in the cost function to cover the most general case of the sensor uncertainty. Here, previous derivations are expanded for the pose estimation problem to include more generic correlations in the errors than previous cases involving an isotropic noise assumption. The proposed solution is simulated in a Monte-Carlo framework to validate the error-covariance analysis.
Imitation Learning (IL) is an effective learning paradigm exploiting the interactions between agents and environments. It does not require explicit reward signals and instead tries to recover desired policies using expert demonstrations. In general, IL methods can be categorized into Behavioral Cloning (BC) and Inverse Reinforcement Learning (IRL). In this work, a novel reward function based on probability density estimation is proposed for IRL, which can significantly reduce the complexity of existing IRL methods. Furthermore, we prove that the theoretically optimal policy derived from our reward function is identical to the expert policy as long as it is deterministic. Consequently, an IRL problem can be gracefully transformed into a probability density estimation problem. Based on the proposed reward function, we present a "watch-try-learn" style framework named Probability Density Estimation based Imitation Learning (PDEIL), which can work in both discrete and continuous action spaces. Finally, comprehensive experiments in the Gym environment show that PDEIL is much more efficient than existing algorithms in recovering rewards close to the ground truth.
We explore various Bayesian approaches to estimate partial Gaussian graphical models. Our hierarchical structures enable to deal with single-output as well as multiple-output linear regressions, in small or high dimension, enforcing either no sparsity, sparsity, group sparsity or even sparse-group sparsity for a bi-level selection through partial correlations (direct links) between predictors and responses, thanks to spike-and-slab priors corresponding to each setting. Adaptative and global shrinkages are also incorporated in the Bayesian modeling of the direct links. An existing result for model selection consistency is reformulated to stick to our sparse and group-sparse settings, providing a theoretical guarantee under some technical assumptions. Gibbs samplers are developed and a simulation study shows the efficiency of our models which give very competitive results, especially in terms of support recovery. To conclude, a real dataset is investigated.
This paper addresses the problem of full model estimation for non-parametric finite mixture models. It presents an approach for selecting the number of components and the subset of discriminative variables (i.e., the subset of variables having different distributions among the mixture components). The proposed approach considers a discretization of each variable into B bins and a penalization of the resulting log-likelihood. Considering that the number of bins tends to infinity as the sample size tends to infinity, we prove that our estimator of the model (number of components and subset of relevant variables for clustering) is consistent under a suitable choice of the penalty term. Interest of our proposal is illustrated on simulated and benchmark data.
We study optimal transport for stationary stochastic processes taking values in finite spaces. In order to reflect the stationarity of the underlying processes, we restrict attention to stationary couplings, also known as joinings. The resulting optimal joining problem captures differences in the long run average behavior of the processes of interest. We introduce estimators of both optimal joinings and the optimal joining cost, and we establish consistency of the estimators under mild conditions. Furthermore, under stronger mixing assumptions we establish finite-sample error rates for the estimated optimal joining cost that extend the best known results in the iid case. Finally, we extend the consistency and rate analysis to an entropy-penalized version of the optimal joining problem.
Influence maximization is the task of selecting a small number of seed nodes in a social network to maximize the spread of the influence from these seeds, and it has been widely investigated in the past two decades. In the canonical setting, the whole social network as well as its diffusion parameters is given as input. In this paper, we consider the more realistic sampling setting where the network is unknown and we only have a set of passively observed cascades that record the set of activated nodes at each diffusion step. We study the task of influence maximization from these cascade samples (IMS), and present constant approximation algorithms for this task under mild conditions on the seed set distribution. To achieve the optimization goal, we also provide a novel solution to the network inference problem, that is, learning diffusion parameters and the network structure from the cascade data. Comparing with prior solutions, our network inference algorithm requires weaker assumptions and does not rely on maximum-likelihood estimation and convex programming. Our IMS algorithms enhance the learning-and-then-optimization approach by allowing a constant approximation ratio even when the diffusion parameters are hard to learn, and we do not need any assumption related to the network structure or diffusion parameters.
This paper addresses the problem of formally verifying desirable properties of neural networks, i.e., obtaining provable guarantees that neural networks satisfy specifications relating their inputs and outputs (robustness to bounded norm adversarial perturbations, for example). Most previous work on this topic was limited in its applicability by the size of the network, network architecture and the complexity of properties to be verified. In contrast, our framework applies to a general class of activation functions and specifications on neural network inputs and outputs. We formulate verification as an optimization problem (seeking to find the largest violation of the specification) and solve a Lagrangian relaxation of the optimization problem to obtain an upper bound on the worst case violation of the specification being verified. Our approach is anytime i.e. it can be stopped at any time and a valid bound on the maximum violation can be obtained. We develop specialized verification algorithms with provable tightness guarantees under special assumptions and demonstrate the practical significance of our general verification approach on a variety of verification tasks.
Dynamic topic models (DTMs) model the evolution of prevalent themes in literature, online media, and other forms of text over time. DTMs assume that word co-occurrence statistics change continuously and therefore impose continuous stochastic process priors on their model parameters. These dynamical priors make inference much harder than in regular topic models, and also limit scalability. In this paper, we present several new results around DTMs. First, we extend the class of tractable priors from Wiener processes to the generic class of Gaussian processes (GPs). This allows us to explore topics that develop smoothly over time, that have a long-term memory or are temporally concentrated (for event detection). Second, we show how to perform scalable approximate inference in these models based on ideas around stochastic variational inference and sparse Gaussian processes. This way we can train a rich family of DTMs to massive data. Our experiments on several large-scale datasets show that our generalized model allows us to find interesting patterns that were not accessible by previous approaches.
Many resource allocation problems in the cloud can be described as a basic Virtual Network Embedding Problem (VNEP): finding mappings of request graphs (describing the workloads) onto a substrate graph (describing the physical infrastructure). In the offline setting, the two natural objectives are profit maximization, i.e., embedding a maximal number of request graphs subject to the resource constraints, and cost minimization, i.e., embedding all requests at minimal overall cost. The VNEP can be seen as a generalization of classic routing and call admission problems, in which requests are arbitrary graphs whose communication endpoints are not fixed. Due to its applications, the problem has been studied intensively in the networking community. However, the underlying algorithmic problem is hardly understood. This paper presents the first fixed-parameter tractable approximation algorithms for the VNEP. Our algorithms are based on randomized rounding. Due to the flexible mapping options and the arbitrary request graph topologies, we show that a novel linear program formulation is required. Only using this novel formulation the computation of convex combinations of valid mappings is enabled, as the formulation needs to account for the structure of the request graphs. Accordingly, to capture the structure of request graphs, we introduce the graph-theoretic notion of extraction orders and extraction width and show that our algorithms have exponential runtime in the request graphs' maximal width. Hence, for request graphs of fixed extraction width, we obtain the first polynomial-time approximations. Studying the new notion of extraction orders we show that (i) computing extraction orders of minimal width is NP-hard and (ii) that computing decomposable LP solutions is in general NP-hard, even when restricting request graphs to planar ones.