Centrality measures are widely used to assign importance to graph-structured data. Recently, understanding the principles of such measures has attracted a lot of attention. Given that measures are diverse, this research has usually focused on classes of centrality measures. In this work, we provide a different approach: we focus on classes of graphs instead of classes of measures to understand the principles underlying various measures. More precisely, we study the class of trees. We observe that even in the case of trees, there is no consensus on which node should be selected as the most central. To analyze the behavior of centrality measures on trees, we introduce a property of \emph{tree rooting}, which states that a measure selects one or two adjacent nodes as the most important and that importance decreases from them in all directions. This property is satisfied by closeness centrality but violated by PageRank. We show that, for several centrality measures that root trees, the comparison of adjacent nodes can be inferred via \emph{potential functions} that assess the quality of trees. We use these functions to give fundamental insights into rooting and derive a characterization explaining why some measures root trees. Moreover, we provide an almost linear-time algorithm to compute the root of a graph using potential functions. Finally, using a family of potential functions, we show that many ways of tree rooting exist with desirable properties.
The Lov\'{a}sz Local Lemma (LLL) is a keystone principle in probability theory, guaranteeing the existence of configurations that avoid a collection $\mathcal B$ of "bad" events which are mostly independent and have low probability. In its simplest "symmetric" form, it asserts that whenever a bad-event has probability $p$ and affects at most $d$ bad-events, and $e p d < 1$, then a configuration avoiding all of $\mathcal B$ exists. A seminal algorithm of Moser & Tardos (2010) gives nearly-automatic randomized algorithms for most constructions based on the LLL. However, deterministic algorithms have lagged behind. We address three specific shortcomings of the prior deterministic algorithms. First, our algorithm applies to the LLL criterion of Shearer (1985); this is more powerful than alternate LLL criteria, removes a number of nuisance parameters, and leads to cleaner and more legible bounds. Second, we provide parallel algorithms with much greater flexibility in the functional form of the bad-events. Third, we provide a derandomized version of the MT-distribution, that is, the distribution of the variables at the termination of the MT algorithm. We show applications to non-repetitive vertex coloring, independent transversals, strong coloring, and other problems. These give deterministic algorithms which essentially match the best previous randomized sequential and parallel algorithms.
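To make the constructive side of the LLL concrete, here is a minimal sketch of the randomized Moser–Tardos resampling loop for CNF-SAT, the canonical LLL application. This is our own illustrative toy (the function name, DIMACS-style literal encoding, and the "resample the first violated clause" rule are all assumptions for the sketch), not the derandomized algorithm described in the abstract:

```python
import random

def moser_tardos_sat(clauses, n_vars, seed=0, max_rounds=10**6):
    """Moser-Tardos resampling for CNF-SAT (illustrative sketch).

    `clauses` is a list of clauses; each clause is a list of non-zero
    integers, where literal v means "variable |v| is True" and -v means
    "variable |v| is False" (DIMACS-style). Each clause being violated
    is a bad-event; resampling a violated clause redraws its variables
    uniformly at random.
    """
    rng = random.Random(seed)
    assign = [rng.random() < 0.5 for _ in range(n_vars + 1)]  # 1-indexed

    def violated(clause):
        # A clause is violated iff every one of its literals is false.
        return all(assign[abs(lit)] != (lit > 0) for lit in clause)

    for _ in range(max_rounds):
        bad = [c for c in clauses if violated(c)]
        if not bad:
            return assign  # all bad-events avoided
        for lit in bad[0]:  # resample the variables of one violated clause
            assign[abs(lit)] = rng.random() < 0.5
    raise RuntimeError("did not converge within max_rounds")
```

Under the symmetric criterion $e p d < 1$ (with $p = 2^{-k}$ for $k$-clauses and $d$ the number of clauses sharing a variable), Moser & Tardos showed this loop terminates after an expected linear number of resamplings.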
Predictive models -- such as those produced by machine learning -- can underpin causal inference, to estimate the effects of an intervention at the population or individual level. This opens the door to a plethora of models, useful for matching the increasing complexity of health data, but also to the Pandora's box of model selection: which of these models yields the most valid causal estimates? Classic machine-learning cross-validation procedures are not directly applicable. Indeed, an appropriate selection procedure for causal inference should weight equally the outcome errors for each individual, treated or not treated, whereas one outcome may be seldom observed in a sub-population. We study how more elaborate risks benefit causal model selection. We show theoretically that simple risks are brittle to weak overlap between treated and non-treated individuals, as well as to heterogeneous errors between populations. Rather, a more elaborate metric, the R-risk, appears as a proxy of the oracle error on causal estimates, observable at the cost of an overlap re-weighting. As the R-risk is defined not only from model predictions but also using the conditional mean outcome and the treatment probability, using it for model selection requires adapting cross-validation. Extensive experiments show that the resulting procedure yields the best causal model selection.
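For intuition, one standard empirical form of the R-risk (as in the R-learner literature) plugs estimates of the conditional mean outcome and the treatment probability into a residual-on-residual squared error. The function below is a sketch under that assumption; the names `m_hat`, `e_hat`, `tau_hat` are our illustrative placeholders for the nuisance models and candidate causal model:

```python
def r_risk(tau_hat, x, y, a, m_hat, e_hat):
    """Empirical R-risk of a candidate treatment-effect model tau_hat.

    m_hat(x): estimate of the conditional mean outcome E[Y | X=x]
    e_hat(x): estimate of the treatment probability P(A=1 | X=x)
    The R-risk scores tau_hat through the decomposition
    (Y - m(X)) = (A - e(X)) * tau(X) + noise.
    """
    n = len(y)
    return sum(
        ((y[i] - m_hat(x[i])) - (a[i] - e_hat(x[i])) * tau_hat(x[i])) ** 2
        for i in range(n)
    ) / n
```

On noiseless data generated from the same decomposition, the true effect function attains R-risk zero while any other candidate scores strictly worse, which is the proxy-for-the-oracle behavior the abstract refers to.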
The linear combination of Student's $t$ random variables (RVs) appears in many statistical applications. Unfortunately, the Student's $t$ distribution is not closed under convolution; thus, deriving an exact and general distribution for the linear combination of $K$ Student's $t$ RVs is infeasible, which motivates a fitting/approximation approach. Here, we focus on the scenario where the only constraint is that the number of degrees of freedom of each $t$-RV is greater than two. Notice that since the odd moments/cumulants of the Student's $t$ distribution are zero, and the even moments/cumulants do not exist when their order is greater than the number of degrees of freedom, it becomes impossible to use conventional approaches based on moments/cumulants of order one or higher than two. To circumvent this issue, herein we propose fitting such a distribution to that of a scaled Student's $t$ RV by exploiting the second moment together with either the first absolute moment or the characteristic function (CF). For the fitting based on the absolute moment, we start from the case of the linear combination of $K=2$ Student's $t$ RVs and then generalize to $K\ge 2$ through a simple iterative procedure. Meanwhile, the CF-based fitting is direct, but its accuracy (measured in terms of the Bhattacharyya distance metric) depends on the CF parameter configuration, for which we propose a simple but accurate approach. We numerically show that the CF-based fitting usually outperforms the absolute-moment-based fitting and that both the scale and number of degrees of freedom of the fitting distribution increase almost linearly with $K$.
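The core of the absolute-moment approach can be sketched as a two-equation fit: for a scaled $t$ RV $X = \sigma T_\nu$ with $\nu > 2$, the closed forms $\mathbb{E}[X^2] = \sigma^2 \nu/(\nu-2)$ and $\mathbb{E}|X| = \sigma \cdot 2\sqrt{\nu}\,\Gamma((\nu+1)/2)/(\sqrt{\pi}(\nu-1)\Gamma(\nu/2))$ let one recover $(\sigma, \nu)$ from target moments. The code below is a minimal sketch assuming only these textbook moment formulas; it is not the paper's iterative $K \ge 2$ procedure, and the bisection bounds are our own choices:

```python
import math

def abs_moment_t(nu):
    """E|T| for a standard Student's t RV with nu > 1 degrees of freedom."""
    return (2.0 * math.sqrt(nu) * math.gamma((nu + 1) / 2)
            / (math.sqrt(math.pi) * (nu - 1) * math.gamma(nu / 2)))

def fit_scaled_t(m2, a1, lo=2.0 + 1e-9, hi=200.0):
    """Fit (sigma, nu) of X = sigma * T_nu so that E[X^2] = m2 and
    E|X| = a1; the targets would come from the linear combination.

    Uses the scale-free ratio E|X| / sqrt(E[X^2]), which increases in
    nu, and plain bisection (raise `hi` for nearly-Gaussian targets).
    """
    def ratio(nu):
        return abs_moment_t(nu) / math.sqrt(nu / (nu - 2))

    target = a1 / math.sqrt(m2)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    nu = 0.5 * (lo + hi)
    sigma = math.sqrt(m2 * (nu - 2) / nu)
    return sigma, nu
```

As a sanity check, feeding in the exact moments of $\sigma = 1.5$, $\nu = 6$ recovers both parameters to high precision.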
Web services commonly employ Content Distribution Networks (CDNs) for performance and security. As web traffic is becoming 100% HTTPS, more and more websites allow CDNs to terminate their HTTPS connections. This practice may expose a website's users' sensitive information, such as login passwords, to a third-party CDN. In this paper, we measure and quantify the extent of user password exposure to third-party CDNs. We find that among the Alexa top 50K websites, at least 12,451 use CDNs and contain user login entrances. Among those websites, 33% expose users' passwords to the CDNs, and a popular CDN may observe passwords from more than 40% of its customers. This result suggests that if a CDN infrastructure has a vulnerability or suffers an insider attack, many users' accounts will be at risk. If we assume the attacker is a passive eavesdropper, a website can avoid this vulnerability by encrypting users' passwords within HTTPS connections. Our measurement shows that less than 17% of the websites adopt this countermeasure.
Various methods for Multi-Agent Reinforcement Learning (MARL) have been developed with the assumption that agents' policies are based on accurate state information. However, policies learned through Deep Reinforcement Learning (DRL) are susceptible to adversarial state perturbation attacks. In this work, we propose a State-Adversarial Markov Game (SAMG) and make the first attempt to investigate the fundamental properties of MARL under state uncertainties. Our analysis shows that the commonly used solution concepts of optimal agent policy and robust Nash equilibrium do not always exist in SAMGs. To circumvent this difficulty, we consider a new solution concept called robust agent policy, where agents aim to maximize the worst-case expected state value. We prove the existence of a robust agent policy for finite-state and finite-action SAMGs. Additionally, we propose a Robust Multi-Agent Adversarial Actor-Critic (RMA3C) algorithm to learn robust policies for MARL agents under state uncertainties. Our experiments demonstrate that our algorithm outperforms existing methods when faced with state perturbations and greatly improves the robustness of MARL policies. Our code is publicly available at //songyanghan.github.io/what_is_solution/.
A multitude of explainability methods and associated fidelity performance metrics have been proposed to help better understand how modern AI systems make decisions. However, much of the current work has remained theoretical -- without much consideration for the human end-user. In particular, it is not yet known (1) how useful current explainability methods are in practice for more real-world scenarios and (2) how well associated performance metrics accurately predict how much knowledge individual explanations contribute to a human end-user trying to understand the inner workings of the system. To fill this gap, we conducted psychophysics experiments at scale to evaluate the ability of human participants to leverage representative attribution methods for understanding the behavior of different image classifiers representing three real-world scenarios: identifying bias in an AI system, characterizing the visual strategy it uses for tasks that are too difficult for an untrained non-expert human observer, and understanding its failure cases. Our results demonstrate that the degree to which individual attribution methods help human participants better understand an AI system varies widely across these scenarios. This suggests a critical need for the field to move past quantitative improvements of current attribution methods towards the development of complementary approaches that provide qualitatively different sources of information to human end-users.
In this article, we define extensions of copula-based dependence measures for data with arbitrary distributions, both in the non-serial case, i.e., for independent and identically distributed random vectors, and in the serial case, i.e., for time series. These dependence measures are covariances with respect to a multilinear copula associated with the data. We also consider multivariate extensions based on M\"obius transforms. We find the asymptotic distributions of the statistics under the hypothesis of independence or randomness and under contiguous alternatives. This enables us to derive locally most powerful test statistics for some alternatives, regardless of the margins. Numerical experiments are performed for combinations of these statistics to assess the finite-sample performance.
If $A$ and $B$ are sets such that $A \subset B$, generalisation may be understood as the inference from $A$ of a hypothesis sufficient to construct $B$. One might infer any number of hypotheses from $A$, yet only some of those may generalise to $B$. How can one know which are likely to generalise? One strategy is to choose the shortest, equating the ability to compress information with the ability to generalise (a proxy for intelligence). We examine this in the context of a mathematical formalism of enactive cognition. We show that compression is neither necessary nor sufficient to maximise performance (measured in terms of the probability of a hypothesis generalising). We formulate a proxy unrelated to length or simplicity, called weakness. We show that if tasks are uniformly distributed, then there is no choice of proxy that performs at least as well as weakness maximisation in all tasks while performing strictly better in at least one. In other words, weakness is the Pareto-optimal choice of proxy. In experiments comparing maximum weakness and minimum description length in the context of binary arithmetic, the former generalised at between $1.1$ and $5$ times the rate of the latter. We argue this demonstrates that weakness is a far better proxy, and explains why DeepMind's Apperception Engine is able to generalise effectively.
Contention resolution schemes (or CR schemes), introduced by Chekuri, Vondr\'ak, and Zenklusen, are a class of randomized rounding algorithms for converting a fractional solution to a relaxation for a down-closed constraint family into an integer solution. A CR scheme takes a fractional point $x$ in a relaxation polytope, rounds each coordinate $x_i$ independently to get a possibly non-feasible set, and then drops some elements in order to satisfy the constraints. Intuitively, a contention resolution scheme is $c$-balanced if every element $i$ is selected with probability at least $c \cdot x_i$. It is known that general matroids admit a $(1-1/e)$-balanced CR scheme, and that this is (asymptotically) optimal. In particular, this is true for the special case of uniform matroids of rank one. In this work, we provide a simple and explicit monotone CR scheme for uniform matroids of rank $k$ on $n$ elements with a balancedness of $1 - \binom{n}{k}\:\left(1-\frac{k}{n}\right)^{n+1-k}\:\left(\frac{k}{n}\right)^k$, and show that this is optimal. As $n$ grows, this expression converges from above to $1 - e^{-k}k^k/k!$. While this asymptotic bound can be obtained by combining previously known results, those require defining an exponential-sized linear program, as well as using random sampling and the ellipsoid algorithm. Our procedure, on the other hand, has the advantage of being simple and explicit. This scheme extends naturally into an optimal CR scheme for partition matroids.
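The balancedness formula and its large-$n$ limit quoted above are easy to evaluate numerically; the short sketch below does exactly that (function names are our own). For $k = 1$ the limit reduces to the familiar $1 - 1/e$:

```python
import math

def balancedness(n, k):
    """Balancedness of the CR scheme for a rank-k uniform matroid
    on n elements: 1 - C(n,k) (1 - k/n)^(n+1-k) (k/n)^k."""
    return 1.0 - math.comb(n, k) * (1 - k / n) ** (n + 1 - k) * (k / n) ** k

def balancedness_limit(k):
    """Large-n limit of the balancedness: 1 - e^{-k} k^k / k!."""
    return 1.0 - math.exp(-k) * k ** k / math.factorial(k)
```

Evaluating `balancedness(n, 2)` for growing `n` shows the convergence from above to `balancedness_limit(2)`, matching the statement in the abstract.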
Offline reinforcement learning aims to train a policy on a pre-recorded and fixed dataset without any additional environment interactions. There are two major challenges in this setting: (1) extrapolation error caused by approximating the value of state-action pairs not well-covered by the training data and (2) distributional shift between behavior and inference policies. One way to tackle these problems is to induce conservatism - i.e., keeping the learned policies closer to the behavioral ones. To achieve this, we build upon recent works on learning policies in latent action spaces and use a special form of Normalizing Flows for constructing a generative model, which we use as a conservative action encoder. This Normalizing Flows action encoder is pre-trained in a supervised manner on the offline dataset, and then an additional policy model - a controller in the latent space - is trained via reinforcement learning. This approach avoids querying actions outside of the training dataset and therefore does not require additional regularization for out-of-dataset actions. We evaluate our method on various locomotion and navigation tasks, demonstrating that our approach outperforms recently proposed algorithms with generative action models on a large portion of datasets.