Performing (variance-based) global sensitivity analysis (GSA) with dependent inputs has recently benefited from cooperative game theory concepts. By using this theory, despite the potential correlation between the inputs, meaningful sensitivity indices can be defined via allocation shares of the model output's variance to each input. The ``Shapley effects'', i.e., the Shapley values transposed to variance-based GSA problems, allowed for this suitable solution. However, these indices exhibit a particular behavior that can be undesirable: an exogenous input (i.e., one that is not explicitly included in the structural equations of the model) can be associated with a strictly positive index when it is correlated with endogenous inputs. In the present work, the use of a different allocation, called the ``proportional values'', is investigated. A first contribution is to propose an extension of this allocation, suitable for variance-based GSA. Novel GSA indices are then proposed, called the ``proportional marginal effects'' (PME). The notion of exogeneity is formally defined in the context of variance-based GSA, and it is shown that the PME allow the distinction of exogenous variables, even when they are correlated with endogenous inputs. Moreover, their behavior is compared to that of the Shapley effects on analytical toy cases and more realistic use cases.
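The behavior of the Shapley effects described above can be reproduced on a small example. The following is a minimal sketch, not the authors' implementation. It assumes a linear model $Y = X_1 + X_2$ with centered Gaussian inputs, an exogenous input $X_3$ correlated with $X_2$ (correlation `rho`), and the value function $v(A) = \mathrm{Var}(E[Y \mid X_A])$. All function names are hypothetical. The shares are left unnormalized (they sum to $\mathrm{Var}(Y)$), and the PME are not computed here.

```python
from itertools import combinations
from math import factorial

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def value(subset, beta, sigma):
    """v(A) = Var(E[Y | X_A]) for Y = beta . X with X ~ N(0, sigma)."""
    if not subset:
        return 0.0
    d = len(beta)
    # Covariance between each X_i (i in A) and Y
    cov_xy = [sum(sigma[i][j] * beta[j] for j in range(d)) for i in subset]
    sub = [[sigma[i][j] for j in subset] for i in subset]
    w = solve(sub, cov_xy)
    return sum(c * wi for c, wi in zip(cov_xy, w))

def shapley_effects(beta, sigma):
    """Unnormalized Shapley shares of Var(Y) for each input."""
    d = len(beta)
    effects = []
    for i in range(d):
        others = [j for j in range(d) if j != i]
        total = 0.0
        for size in range(d):
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for s in combinations(others, size):
                with_i = value(sorted(s + (i,)), beta, sigma)
                total += weight * (with_i - value(list(s), beta, sigma))
        effects.append(total)
    return effects

rho = 0.5
beta = [1.0, 1.0, 0.0]  # X3 has a zero coefficient: it is exogenous
sigma = [[1.0, 0.0, 0.0],
         [0.0, 1.0, rho],
         [0.0, rho, 1.0]]
effects = shapley_effects(beta, sigma)
variance = sum(beta[i] * sigma[i][j] * beta[j] for i in range(3) for j in range(3))
```

In this toy case the shares sum to the output variance, and the exogenous input $X_3$ still receives a strictly positive share because it is correlated with $X_2$.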
In a mixed generalized linear model, the objective is to learn multiple signals from unlabeled observations: each sample comes from exactly one signal, but it is not known which one. We consider the prototypical problem of estimating two statistically independent signals in a mixed generalized linear model with Gaussian covariates. Spectral methods are a popular class of estimators which output the top two eigenvectors of a suitable data-dependent matrix. However, despite the wide applicability, their design is still obtained via heuristic considerations, and the number of samples $n$ needed to guarantee recovery is super-linear in the signal dimension $d$. In this paper, we develop exact asymptotics on spectral methods in the challenging proportional regime in which $n, d$ grow large and their ratio converges to a finite constant. By doing so, we are able to optimize the design of the spectral method, and combine it with a simple linear estimator, in order to minimize the estimation error. Our characterization exploits a mix of tools from random matrices, free probability and the theory of approximate message passing algorithms. Numerical simulations for mixed linear regression and phase retrieval display the advantage enabled by our analysis over existing designs of spectral methods.
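For orientation, spectral estimators of this kind are commonly written as follows. This is only a sketch of the generic construction: the symbols $T$, $D_n$, and $\delta$ are assumptions introduced here and need not match the paper's notation. Given covariates $x_i \in \mathbb{R}^d$, observations $y_i$, and a preprocessing function $T$, one forms the data-dependent matrix below and outputs its top two eigenvectors. The proportional regime corresponds to the limit on the second line.

```latex
\begin{align*}
D_n &= \frac{1}{n} \sum_{i=1}^{n} T(y_i)\, x_i x_i^{\top}, \\
n, d &\to \infty \quad \text{with} \quad \frac{n}{d} \to \delta \in (0, \infty).
\end{align*}
```

Under this reading, optimizing the design of the spectral method amounts to choosing the function $T$.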
Cellular-connected unmanned aerial vehicles (UAVs) have attracted a surge of research interest in both academia and industry. To support aerial user equipment (UEs) in the existing cellular networks, one promising approach is to assign a portion of the system bandwidth exclusively to the UAV-UEs. This is especially favorable for use cases where a large number of UAV-UEs are deployed, e.g., for package delivery close to a warehouse. Although the nearly line-of-sight (LoS) channels can result in higher received power, UAVs can in turn cause severe interference to each other in the same frequency band. In this contribution, we focus on the uplink communications of massive cellular-connected UAVs. Different power allocation algorithms, based on the successive convex approximation (SCA) principle, are proposed to either maximize the minimal spectrum efficiency (SE) or maximize the overall SE in order to cope with severe interference. One of the challenges is that a UAV can affect a large area, meaning that many more UAV-UEs must be considered in the optimization problem, which makes it essentially different from the problem for terrestrial UEs. The necessity of single-carrier uplink transmission further complicates the problem. Nevertheless, we find that the large coherence bandwidths and coherence times of the propagation channels can be leveraged. The performance of the proposed algorithms is evaluated via extensive simulations in the full-buffer transmission mode and the bursty-traffic mode. Results show that the proposed algorithms can effectively enhance the uplink SEs. This work can be considered the first attempt to deal with the interference among massive cellular-connected UAV-UEs with optimized power allocations.
The mathematical approaches for modeling dynamic traffic can roughly be divided into two categories: discrete packet routing models and continuous flow over time models. Despite very active research on models in both categories, the connection between these approaches has so far been poorly understood. In this work we build this connection by specifying a (competitive) packet routing model, which is discrete in terms of flow and time, and by proving its convergence to the intensively studied model of flows over time with deterministic queuing. More precisely, we prove that the limit of the convergence process, when decreasing the packet size and time step length in the packet routing model, constitutes a flow over time with multiple commodities. In addition, we show that the convergence result implies the existence of approximate equilibria in the competitive version of the packet routing model. This is of significant interest as the existence of exact pure Nash equilibria, as in almost all other competitive models, cannot be guaranteed in the multi-commodity setting. Moreover, the introduced packet routing model with deterministic queuing is very application-oriented as it is based on the network loading module of the agent-based transport simulation MATSim. As the present work is the first mathematical formalization of this simulation, it provides a theoretical foundation and an environment for provable mathematical statements for MATSim.
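To make the idea of a packet model that is discrete in flow and time more concrete, the following is a minimal sketch of a single edge with a first-in-first-out queue. It is an illustration under simplifying assumptions, not the paper's model and not MATSim's network loading code. The function name and the parameters `capacity` (packets released per time step) and `transit_time` (time steps to traverse the edge) are hypothetical.

```python
from collections import deque

def traverse_edge(arrival_times, capacity, transit_time):
    """Return the exit time of each packet on one edge with a FIFO queue.

    arrival_times: non-decreasing integer time steps at which packets enter.
    capacity: maximum number of packets that may leave the queue per time step.
    transit_time: time steps needed to traverse the edge after leaving the queue.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    queue = deque(range(len(arrival_times)))  # packet indices in arrival order
    exit_times = [None] * len(arrival_times)
    if not queue:
        return exit_times
    t = arrival_times[0]
    while queue:
        released = 0
        # Release packets that have already arrived, up to the edge capacity
        while queue and released < capacity and arrival_times[queue[0]] <= t:
            packet = queue.popleft()
            exit_times[packet] = t + transit_time
            released += 1
        t += 1
    return exit_times

exits = traverse_edge([0, 0, 0, 1, 5], capacity=1, transit_time=2)
```

With three packets arriving at time 0 and a capacity of one packet per step, the later packets are delayed by the queue. Shrinking the packet size and the time step together is the refinement under which the abstract's convergence result is stated.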
End-to-end generative methods are considered a more promising solution for image restoration in physics-based vision compared with the traditional deconstructive methods based on handcrafted composition models. However, existing generative methods still have plenty of room for improvement in quantitative performance. More crucially, these methods are considered black boxes due to weak interpretability, and there is rarely a theory that tries to explain their mechanism and learning process. In this study, we try to re-interpret these generative methods for image restoration tasks using information theory. Departing from the conventional understanding, we analyzed the information flow of these methods and identified three sources of information (extracted high-level information, retained low-level information, and external information that is absent from the source inputs) that are involved and separately optimized in generating the restoration results. We further derived their learning behaviors, optimization objectives, and the corresponding information boundaries by extending the information bottleneck principle. Based on this theoretical framework, we found that many existing generative methods tend to be direct applications of the general models designed for conventional generation tasks, which may suffer from problems including over-invested abstraction processes, inherent loss of details, and vanishing gradients or imbalance in training. We analyzed these issues with both intuitive and theoretical explanations and supported each of them with empirical evidence. Ultimately, we proposed general solutions or ideas to address these issues and validated these approaches with performance boosts on six datasets of three different image restoration tasks.
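As background, the standard information bottleneck principle that the abstract builds on can be stated as follows. This is the classical formulation, not the extension derived in the paper. The symbols $X$ (input), $Y$ (target), $Z$ (learned representation), and $\beta$ (trade-off weight) are notation assumed here.

```latex
\begin{equation*}
\min_{p(z \mid x)} \; I(X; Z) - \beta\, I(Z; Y), \qquad \beta > 0 .
\end{equation*}
```

The first term penalizes information retained about the input, while the second rewards information that is relevant for predicting the target.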
This paper develops a clustering method that takes advantage of the sturdiness of model-based clustering, while attempting to mitigate some of its pitfalls. First, we note that standard model-based clustering likely leads to the same number of clusters per margin, which seems a rather artificial assumption for a variety of datasets. We tackle this issue by specifying a finite mixture model per margin that allows each margin to have a different number of clusters, and then cluster the multivariate data using a strategy game-inspired algorithm that we call Reign-and-Conquer. Second, since the proposed clustering approach only specifies a model for the margins -- but leaves the joint unspecified -- it has the advantage of being partially parallelizable; hence, the proposed approach is computationally appealing as well as more tractable for moderate to high dimensions than a `full' (joint) model-based clustering approach. A battery of numerical experiments on artificial data indicates an overall good performance of the proposed methods in a variety of scenarios, and real datasets are used to showcase their application in practice.
Sparse linear regression methods generally have a free hyperparameter which controls the amount of sparsity, and is subject to a bias-variance tradeoff. This article considers the use of aggregated hold-out to aggregate over values of this hyperparameter, in the context of linear regression with the Huber loss function. Aggregated hold-out (Agghoo) is a procedure which averages estimators selected by hold-out (cross-validation with a single split). In the theoretical part of the article, it is proved that Agghoo satisfies a non-asymptotic oracle inequality when it is applied to sparse estimators which are parametrized by their zero-norm. In particular, this includes a variant of the Lasso introduced by Zou, Hastie and Tibshirani. Simulations are used to compare Agghoo with cross-validation (CV). They show that Agghoo performs better than CV when the intrinsic dimension is high and when there are confounders correlated with the predictive covariates.
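The Agghoo procedure described above (select an estimator by hold-out on each of several splits, then average the selected estimators) can be sketched as follows. This is a simplified illustration, not the article's code. The candidate family is a hypothetical one-dimensional ridge rule standing in for the sparse estimators of the article, and the names `agghoo`, `make_ridge_rule`, `n_splits`, and `train_fraction` are assumptions.

```python
import random

def huber(residual, delta=1.0):
    """Huber loss: quadratic near zero, linear in the tails."""
    a = abs(residual)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)

def make_ridge_rule(penalty):
    """Hypothetical candidate rule: 1-D ridge slope without intercept."""
    def fit(sample):
        sxy = sum(x * y for x, y in sample)
        sxx = sum(x * x for x, _ in sample)
        slope = sxy / (sxx + penalty)
        return lambda x: slope * x
    return fit

def agghoo(sample, rules, n_splits=5, train_fraction=0.8, seed=0):
    """Average the predictors selected by hold-out over several random splits."""
    rng = random.Random(seed)
    n_train = int(train_fraction * len(sample))
    selected = []
    for _ in range(n_splits):
        shuffled = sample[:]
        rng.shuffle(shuffled)
        train, valid = shuffled[:n_train], shuffled[n_train:]
        fitted = [rule(train) for rule in rules]
        # Hold-out risk of each candidate, measured with the Huber loss
        risks = [sum(huber(y - f(x)) for x, y in valid) / len(valid) for f in fitted]
        selected.append(fitted[risks.index(min(risks))])
    return lambda x: sum(f(x) for f in selected) / len(selected)

# Noiseless toy data y = 2x, so the unpenalized rule is selected on every split
sample = [(k / 10.0, 2.0 * k / 10.0) for k in range(1, 41)]
rules = [make_ridge_rule(p) for p in (0.0, 1.0, 10.0)]
predictor = agghoo(sample, rules)
```

Unlike plain hold-out, which keeps a single selected estimator, the final predictor here is an average over splits, and each split may select a different hyperparameter value.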
This paper presents an algorithm to solve the Soft k-Means problem globally. Unlike Fuzzy c-Means, Soft k-Means (SkM) has a matrix factorization-type objective and has been shown to have a close relation with the popular probability decomposition-type clustering methods, e.g., Left Stochastic Clustering (LSC). Though some work has been done on solving the Soft k-Means problem, existing approaches usually use an alternating minimization scheme or the projected gradient descent method, which cannot guarantee global optimality because of the non-convexity of SkM. In this paper, we present a sufficient condition for a feasible solution of the Soft k-Means problem to be globally optimal and show that the output of the proposed algorithm satisfies it. Moreover, for the Soft k-Means problem, we provide discussions on stability, solution non-uniqueness, and the connection with LSC. Then, a new model, named Minimal Volume Soft k-Means (MVSkM), is proposed to address the solution non-uniqueness issue. Finally, experimental results support our theoretical results.
Firms and statistical agencies must protect the privacy of the individuals whose data they collect, analyze, and publish. Increasingly, these organizations do so by using publication mechanisms that satisfy differential privacy. We consider the problem of choosing such a mechanism so as to maximize the value of its output to end users. We show that this is a constrained information design problem, and characterize its solution. When the underlying database is drawn from a symmetric distribution -- for instance, if individuals' data are i.i.d. -- we show that the problem's dimensionality can be reduced, and that its solution belongs to a simpler class of mechanisms. When, in addition, data users have supermodular payoffs, we show that the simple geometric mechanism is always optimal by using a novel comparative static that ranks information structures according to their usefulness in supermodular decision problems.
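For readers unfamiliar with the geometric mechanism, the following is a minimal sketch of its usual textbook form for a counting query with sensitivity one: the true count is released with two-sided geometric noise $Z$, where $P(Z = z) \propto e^{-\varepsilon |z|}$. This form is an assumption made for illustration and the paper's exact definition may differ. The function names are hypothetical.

```python
import math
import random

def geometric_pmf(z, epsilon):
    """P(Z = z) for two-sided geometric noise with parameter alpha = exp(-epsilon)."""
    alpha = math.exp(-epsilon)
    return (1 - alpha) / (1 + alpha) * alpha ** abs(z)

def geometric_mechanism(true_count, epsilon, rng):
    """Release true_count + Z, sampling Z as a difference of two geometric draws."""
    alpha = math.exp(-epsilon)

    def geom():
        # Number of failures before the first success, success probability 1 - alpha
        k = 0
        while rng.random() < alpha:
            k += 1
        return k

    return true_count + geom() - geom()

epsilon = 0.5
released = geometric_mechanism(3, epsilon, random.Random(0))
```

For two neighboring databases whose counts differ by one, the ratio of output probabilities is bounded by $e^{\varepsilon}$, which is the differential privacy constraint in this simple setting.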
Large knowledge graphs often grow to store temporal facts that model the dynamic relations or interactions of entities along the timeline. Since such temporal knowledge graphs often suffer from incompleteness, it is important to develop time-aware representation learning models that help to infer the missing temporal facts. While the temporal facts are typically evolving, it is observed that many facts often show a repeated pattern along the timeline, such as economic crises and diplomatic activities. This observation indicates that a model could potentially learn much from the known facts that appeared in history. To this end, we propose a new representation learning model for temporal knowledge graphs, namely CyGNet, based on a novel time-aware copy-generation mechanism. CyGNet is not only able to predict future facts from the whole entity vocabulary, but is also capable of identifying facts with repetition and accordingly predicting such future facts with reference to the known facts in the past. We evaluate the proposed method on the knowledge graph completion task using five benchmark datasets. Extensive experiments demonstrate the effectiveness of CyGNet for predicting future facts with repetition as well as de novo fact prediction.
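The idea of combining a copy mode (restricted to entities seen in history) with a generation mode (over the whole entity vocabulary) can be sketched as a mixture of two distributions. This is a simplified illustration, not the CyGNet implementation. The scores, the mask `seen_before`, and the mixing weight `alpha` are hypothetical inputs.

```python
import math

def softmax(scores):
    """Numerically stable softmax; entries equal to -inf receive probability 0."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def copy_generation(gen_scores, copy_scores, seen_before, alpha):
    """Mix a copy distribution over historical entities with a generation distribution."""
    p_gen = softmax(gen_scores)
    if not any(seen_before):
        # No historical candidates: fall back to the generation mode only
        return p_gen
    masked = [s if seen else float("-inf") for s, seen in zip(copy_scores, seen_before)]
    p_copy = softmax(masked)
    return [alpha * c + (1 - alpha) * g for c, g in zip(p_copy, p_gen)]

probs = copy_generation(
    gen_scores=[0.0, 0.0, 0.0, 0.0],
    copy_scores=[0.0, 0.0, 0.0, 0.0],
    seen_before=[True, False, False, True],
    alpha=0.5,
)
```

Entities that already appeared in history receive extra probability mass from the copy mode, while unseen entities can still be predicted through the generation mode.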
Recent advances in maximizing mutual information (MI) between the source and target have demonstrated its effectiveness in text generation. However, previous works paid little attention to modeling the backward network of MI (i.e., the dependency from the target to the source), which is crucial to the tightness of the variational information maximization lower bound. In this paper, we propose Adversarial Mutual Information (AMI): a text generation framework which is formulated as a novel saddle point (min-max) optimization aiming to identify joint interactions between the source and target. Within this framework, the forward and backward networks are able to iteratively promote or demote each other's generated instances by comparing the real and synthetic data distributions. We also develop a latent noise sampling strategy that leverages random variations in the high-level semantic space to enhance the long-term dependency in the generation process. Extensive experiments based on different text generation tasks demonstrate that the proposed AMI framework can significantly outperform several strong baselines, and we also show that AMI has the potential to lead to a tighter lower bound of maximum mutual information for the variational information maximization problem.
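The variational information maximization lower bound mentioned above is usually written as follows. This is the standard form of the bound, with notation assumed here: $S$ is the source, $T$ the target, $p_\theta(t \mid s)$ the forward network, and $q_\phi(s \mid t)$ the backward network.

```latex
\begin{align*}
I(S; T) &= H(S) - H(S \mid T) \\
        &\geq H(S) + \mathbb{E}_{p(s)}\, \mathbb{E}_{p_\theta(t \mid s)} \big[ \log q_\phi(s \mid t) \big].
\end{align*}
```

The gap between the two sides equals $\mathbb{E}_{p(t)}\big[\mathrm{KL}\big(p(s \mid t) \,\|\, q_\phi(s \mid t)\big)\big]$, so the bound is tight exactly when the backward network matches the true posterior of the source given the target. This is why the quality of the backward model matters for the tightness of the bound.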