We consider the hypothesis testing problem that two vertices $i$ and $j$ of a generalized random dot product graph have the same latent positions, possibly up to scaling. Special cases of this hypothesis test include testing whether two vertices in a stochastic block model or degree-corrected stochastic block model graph have the same block membership vectors. We propose several test statistics based on the empirical Mahalanobis distances between the $i$th and $j$th rows of either the adjacency or the normalized Laplacian spectral embedding of the graph. We show that, under mild conditions, these test statistics have limiting chi-square distributions under both the null and local alternative hypotheses, and we derive explicit expressions for the non-centrality parameters under the local alternative. Using these limit results, we address the model selection problem of choosing between the standard stochastic block model and its degree-corrected variant. The effectiveness of our proposed tests is illustrated via both simulation studies and real data applications.
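As a rough illustration of the test construction, the sketch below embeds the graph via the top-$d$ scaled eigenvectors of the adjacency matrix and compares two rows by an empirical Mahalanobis distance against a chi-square quantile. The plug-in covariance estimate here is a hypothetical simplification; the paper derives specific estimators that guarantee the stated limits.

```python
import numpy as np
from scipy.stats import chi2

def ase(A, d):
    """Adjacency spectral embedding into R^d (scaled top eigenvectors)."""
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

def vertex_equality_test(A, i, j, d, alpha=0.05):
    X = ase(A, d)
    diff = X[i] - X[j]
    # Hypothetical plug-in covariance for the difference of two rows;
    # the paper's estimator is tailored to yield the chi-square limit.
    S = 2.0 * np.cov(X, rowvar=False) / A.shape[0]
    stat = diff @ np.linalg.solve(S, diff)   # empirical Mahalanobis distance
    pval = chi2.sf(stat, df=d)               # limiting chi-square null law
    return stat, pval, pval < alpha
```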
The paper provides three results for SVARs under the assumption that the primitive shocks are mutually independent. First, a framework is proposed to study the dynamic effects of disaster-type shocks with infinite variance. We show that the least squares estimates of the VAR are consistent but have non-standard properties. Second, it is shown that the restrictions imposed on a SVAR can be validated by testing independence of the identified shocks. The test can be applied whether the data have fat or thin tails, and to over-identified as well as exactly identified models. Third, the disaster shock is identified as the component with the largest kurtosis, where the mutually independent components are estimated using an estimator that is valid even in the presence of an infinite-variance shock. Two applications are considered. In the first, the independence test is used to shed light on the conflicting evidence regarding the role of uncertainty in economic fluctuations. In the second, disaster shocks are shown to have short-term economic impact arising mostly from feedback dynamics.
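The third result suggests a simple pipeline, sketched below under stated assumptions: fit the VAR by least squares, recover independent components from the residuals, and label the component with the largest kurtosis as the disaster shock. FastICA is used here only as a stand-in for the heavy-tail-robust independent-components estimator used in the paper.

```python
import numpy as np
from sklearn.decomposition import FastICA
from scipy.stats import kurtosis

def var_residuals(Y, p):
    """OLS residuals of a VAR(p); Y has shape (T, n)."""
    T, n = Y.shape
    X = np.hstack([Y[p - k - 1:T - k - 1] for k in range(p)])
    X = np.hstack([np.ones((T - p, 1)), X])
    B, *_ = np.linalg.lstsq(X, Y[p:], rcond=None)
    return Y[p:] - X @ B

def disaster_shock(Y, p=4):
    U = var_residuals(Y, p)
    S = FastICA(n_components=U.shape[1], whiten="unit-variance",
                random_state=0).fit_transform(U)
    k = kurtosis(S, axis=0)        # excess kurtosis per component
    return S[:, np.argmax(k)]      # largest-kurtosis component as the disaster shock
```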
In this work, we introduce statistical testing under distributional shifts. We are interested in the hypothesis $P^* \in H_0$ for a target distribution $P^*$, but observe data from a different distribution $Q^*$. We assume that $P^*$ is related to $Q^*$ through a known shift $\tau$ and formally introduce hypothesis testing in this setting. We propose a general testing procedure that first resamples from the observed data to construct an auxiliary data set and then applies an existing test in the target domain. We prove that if the size of the resample is of order $o(\sqrt{n})$ and the resampling weights are well-behaved, this procedure inherits the pointwise asymptotic level and power from the target test. If the map $\tau$ is estimated from data, we can maintain the above guarantees under mild conditions if the estimation works sufficiently well. We further extend our results to uniform asymptotic level and a different resampling scheme. Testing under distributional shifts allows us to tackle a diverse set of problems. We argue that it may prove useful in reinforcement learning and covariate shift, we show how it reduces conditional independence testing to unconditional independence testing, and we provide example applications in causal inference.
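A minimal sketch of the resample-then-test procedure, assuming the shift $\tau$ is available as a density ratio $r(z) = \mathrm{d}P^*/\mathrm{d}Q^*(z)$; the names `density_ratio` and `target_test` are illustrative.

```python
import numpy as np

def shifted_test(data, density_ratio, target_test, seed=None):
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    m = int(n ** 0.49)                 # resample size m = o(sqrt(n))
    w = density_ratio(data)
    w = w / w.sum()                    # normalized resampling weights
    idx = rng.choice(n, size=m, replace=True, p=w)
    return target_test(data[idx])      # run the existing target-domain test
```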
Node Kayles is a well-known two-player impartial game on graphs: given an undirected graph, the players alternately choose a vertex not adjacent to any previously chosen vertex, and the player who cannot choose a new vertex loses the game. Deciding whether the first player has a winning strategy in this game is known to be PSPACE-complete, and only a few studies have addressed its algorithmic aspects. In this paper, we consider the problem from the viewpoint of fixed-parameter tractability. We show that the problem is fixed-parameter tractable when parameterized by the size of a minimum vertex cover or by the modular-width of the given graph. Moreover, we give a polynomial kernelization with respect to neighborhood diversity.
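For concreteness, a naive exponential-time search that decides the winner on small instances is sketched below; the paper's contribution is the parameterized algorithms, not this brute force.

```python
from functools import lru_cache

def first_player_wins(n, edges):
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    @lru_cache(maxsize=None)
    def win(avail):
        # The player to move wins iff some choice leaves a losing position.
        return any(not win(avail - {v} - adj[v]) for v in avail)

    return win(frozenset(range(n)))

# On the path 0-1-2 the first player wins by taking the middle vertex.
assert first_player_wins(3, [(0, 1), (1, 2)])
```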
The problem of predicting links in large networks is an important task in a variety of practical applications, including social sciences, biology and computer security. In this paper, statistical techniques for link prediction based on the popular random dot product graph model are carefully presented, analysed and extended to dynamic settings. Motivated by a practical application in cyber-security, this paper demonstrates that random dot product graphs not only represent a powerful tool for inferring differences between multiple networks, but are also efficient for prediction purposes and for understanding the temporal evolution of the network. The probabilities of links are obtained by fusing information at two stages: spectral methods provide estimates of latent positions for each node, and time series models are used to capture temporal dynamics. In this way, traditional link prediction methods, usually based on decompositions of the entire network adjacency matrix, are extended using temporal information. The methods presented in this article are applied to a number of simulated and real-world graphs, showing promising results.
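The two-stage fusion can be caricatured as follows: embed each snapshot spectrally, forecast the latent positions with a simple pooled AR(1) (a stand-in for the paper's time series models, ignoring the orthogonal alignment of embeddings across snapshots), and read off link probabilities from inner products.

```python
import numpy as np

def embed(A, d):
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

def predict_links(As, d):
    Xs = np.stack([embed(A, d) for A in As])     # (T, n, d) latent positions
    flat = Xs.reshape(len(As), -1)
    x, y = flat[:-1].ravel(), flat[1:].ravel()
    phi = (x @ y) / (x @ x)                      # pooled AR(1) coefficient
    X_next = phi * Xs[-1]                        # one-step-ahead positions
    return np.clip(X_next @ X_next.T, 0.0, 1.0)  # predicted link probabilities
```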
We propose a novel methodology for drawing iid realizations from any target distribution on a Euclidean space of arbitrary dimension. No assumption of compact support is necessary for the validity of our theory and method. Our idea is to construct an appropriate infinite sequence of concentric closed ellipsoids, represent the target distribution as an infinite mixture on the central ellipsoid and the ellipsoidal annuli, and construct efficient perfect samplers for the mixture components. In contrast with most existing work on perfect sampling, ours is not only a theoretically valid method but also practically applicable to all target distributions on Euclidean spaces of any dimension, and it is very amenable to parallel computation. We validate the practicality and usefulness of our methodology by generating 10,000 iid realizations from standard distributions such as the normal, Student's $t$ with 5 degrees of freedom and Cauchy, for dimensions $d = 1, 5, 10, 50, 100$, as well as from a 50-dimensional normal mixture distribution. The implementation time is very reasonable in all cases, often less than a minute in our parallel implementation, and the results turned out to be highly accurate. We also apply our method to draw 10,000 iid realizations from the posterior distributions associated with the well-known Challenger data, a Salmonella dataset, and the challenging 160-dimensional spatial example of the radionuclide count data on Rongelap Island. Again, we obtain quite encouraging results with very reasonable computing time.
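A one-dimensional illustration of the shell-mixture idea, not the authors' general ellipsoid construction: write the target as a mixture over unit intervals, choose an interval by its exact probability mass, and draw perfectly within it by rejection against a uniform envelope. The truncation at $K$ intervals is a simplification; the paper's infinite mixture avoids it.

```python
import numpy as np
from scipy.stats import cauchy

def iid_cauchy(size, K=1000, seed=None):
    rng = np.random.default_rng(seed)
    edges = np.arange(-K, K + 1)                # unit-interval endpoints
    mass = np.diff(cauchy.cdf(edges))
    mass = mass / mass.sum()                    # renormalize the truncated mixture
    cells = rng.choice(len(mass), size=size, p=mass)
    out = np.empty(size)
    for i, c in enumerate(cells):
        a, b = edges[c], edges[c + 1]
        m = max(cauchy.pdf(a), cauchy.pdf(b))   # valid envelope: density is monotone off 0
        while True:                             # rejection sampling: perfect within the cell
            x = rng.uniform(a, b)
            if rng.uniform(0, m) <= cauchy.pdf(x):
                out[i] = x
                break
    return out
```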
We show a connection between sampling and optimization on discrete domains. For a family of distributions $\mu$ defined on size-$k$ subsets of a ground set of elements that is closed under external fields, we show that rapid mixing of natural local random walks implies the existence of simple approximation algorithms to find $\max \mu(\cdot)$. More precisely, we show that if (multi-step) down-up random walks have spectral gap at least inverse polynomially large in $k$, then (multi-step) local search can find $\max \mu(\cdot)$ within a factor of $k^{O(k)}$. As the main application of our result, we give a simple, nearly optimal $k^{O(k)}$-factor approximation algorithm for MAP inference on nonsymmetric DPPs. This is the first nontrivial multiplicative approximation for finding the largest size-$k$ principal minor of a square (not necessarily symmetric) matrix $L$ with $L+L^\intercal\succeq 0$. We establish the connection between sampling and optimization by showing that an exchange inequality, a concept rooted in discrete convex analysis, can be derived from fast mixing of local random walks. We further connect exchange inequalities with composable core-sets for optimization, generalizing recent results on composable core-sets for DPP maximization to arbitrary distributions that either satisfy the strongly Rayleigh property or have a log-concave generating polynomial.
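A one-exchange version of the local search can be sketched as follows; the paper analyzes multi-step exchanges and proves the $k^{O(k)}$ guarantee, which this simplified sketch does not claim.

```python
import numpy as np
from itertools import product

def local_search_map(L, k, seed=None):
    """Local search for a large k x k principal minor of L (MAP for a DPP)."""
    rng = np.random.default_rng(seed)
    n = L.shape[0]
    S = list(rng.choice(n, size=k, replace=False))
    best = np.linalg.det(L[np.ix_(S, S)])
    improved = True
    while improved:
        improved = False
        for pos, j in product(range(k), range(n)):
            if j in S:
                continue
            T = S.copy()
            T[pos] = j                            # single-element exchange
            val = np.linalg.det(L[np.ix_(T, T)])
            if val > best:
                S, best, improved = T, val, True
                break
    return sorted(S), best
```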
Based on canonical correlation analysis, we derive series representations of the probability density function (PDF) and the cumulative distribution function (CDF) of the information density of arbitrary Gaussian random vectors. Using these series representations, we give closed-form expressions of the PDF and CDF for important special cases and derive tight approximations for the general case. Furthermore, we derive recurrence formulas that allow very efficient numerical calculations with arbitrarily high accuracy, as demonstrated with an implementation in Python publicly available on GitLab. Finally, we discuss the (in)validity of Gaussian approximations of the information density.
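As a numerical companion (not the paper's series method), one can compute the canonical correlations from the joint covariance and approximate the distribution of the information density by Monte Carlo:

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

def inv_sqrt(S):
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def canonical_correlations(Sxx, Syy, Sxy):
    W = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(W, compute_uv=False)

def info_density_samples(Sxx, Syy, Sxy, n=100_000, seed=0):
    p = Sxx.shape[0]
    S = np.block([[Sxx, Sxy], [Sxy.T, Syy]])
    Z = np.random.default_rng(seed).multivariate_normal(np.zeros(S.shape[0]), S, n)
    joint = mvn(cov=S).logpdf(Z)
    marg = mvn(cov=Sxx).logpdf(Z[:, :p]) + mvn(cov=Syy).logpdf(Z[:, p:])
    return joint - marg   # Monte Carlo draws of i(X;Y) in nats

# The sample mean approximates I(X;Y) = -0.5 * sum(log(1 - rho_k^2)),
# with rho_k the canonical correlations.
```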
Let $\theta_0,\theta_1 \in \mathbb{R}^d$ be the population risk minimizers associated with some loss $\ell:\mathbb{R}^d\times \mathcal{Z}\to\mathbb{R}$ and two distributions $\mathbb{P}_0,\mathbb{P}_1$ on $\mathcal{Z}$. The models $\theta_0,\theta_1$ are unknown, and $\mathbb{P}_0,\mathbb{P}_1$ can be accessed by drawing i.i.d. samples from them. Our work is motivated by the following model discrimination question: "What sample sizes from $\mathbb{P}_0$ and $\mathbb{P}_1$ allow one to distinguish between the two hypotheses $\theta^*=\theta_0$ and $\theta^*=\theta_1$ for a given $\theta^*\in\{\theta_0,\theta_1\}$?" Making the first steps towards answering this question in full generality, we first consider the case of a well-specified linear model with squared loss. Here we provide matching upper and lower bounds on the sample complexity, namely $\min\{1/\Delta^2,\sqrt{r}/\Delta\}$ up to a constant factor, where $\Delta$ is a measure of separation between $\mathbb{P}_0$ and $\mathbb{P}_1$ and $r$ is the rank of the design covariance matrix. We then extend this result in two directions: (i) to general parametric models in the asymptotic regime; and (ii) to generalized linear models in small samples ($n\le r$) under weak moment assumptions. In both cases we derive sample complexity bounds of a similar form while allowing for model misspecification. In fact, our testing procedures only access $\theta^*$ through a certain functional of the empirical risk. Moreover, the number of observations that allows us to reach statistical confidence does not suffice to "resolve" the two models, that is, to recover $\theta_0,\theta_1$ up to $O(\Delta)$ prediction accuracy. These two properties allow our framework to be used in applied tasks where one would like to $\textit{identify}$ a prediction model, which can be proprietary, while guaranteeing that the model cannot actually be $\textit{inferred}$ by the identifying agent.
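In the well-specified linear setup, a toy version of a discrimination rule that accesses $\theta^*$ only through empirical risks might look as follows; the decision rule and names are illustrative, not the paper's exact procedure.

```python
import numpy as np

def emp_risk(theta, X, y):
    return np.mean((X @ theta - y) ** 2)

def discriminate(theta_star, data0, data1):
    """Return 0 if theta* looks like the risk minimizer under P0, else 1."""
    gaps = []
    for X, y in (data0, data1):
        theta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
        # Excess empirical risk of theta* relative to the in-sample fit.
        gaps.append(emp_risk(theta_star, X, y) - emp_risk(theta_hat, X, y))
    return int(np.argmin(gaps))
```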
We present a general approach to constructing permutation tests that are both exact for the null hypothesis of equality of distributions and asymptotically correct for testing equality of parameters of distributions while allowing the distributions themselves to differ. These robust permutation tests transform a given test statistic by a consistent estimator of its limiting distribution function before enumerating its permutation distribution. This transformation, known as prepivoting, aligns the unconditional limiting distribution for the test statistic with the probability limit of its permutation distribution. Through prepivoting, the tests permute one minus an asymptotically valid $p$-value for testing the null of equality of parameters. We describe two approaches for prepivoting within permutation tests, one directly using asymptotic normality and the other using the bootstrap. We further illustrate that permutation tests using bootstrap prepivoting can provide improvements to the order of the error in rejection probability relative to competing transformations when testing equality of parameters, while maintaining exactness under equality of distributions. Simulation studies highlight the versatility of the proposal, illustrating the restoration of asymptotic validity to a wide range of permutation tests conducted when only the parameters of distributions are equal.
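A minimal sketch of prepivoting via asymptotic normality for equality of means: the permutation test permutes one minus an asymptotically valid $p$-value based on the studentized difference of means.

```python
import numpy as np
from scipy.stats import norm

def prepivoted_stat(x, y):
    se = np.sqrt(x.var(ddof=1) / len(x) + y.var(ddof=1) / len(y))
    z = (x.mean() - y.mean()) / se
    return 1.0 - 2.0 * norm.sf(abs(z))      # one minus the asymptotic p-value

def robust_perm_test(x, y, B=2000, seed=0):
    rng = np.random.default_rng(seed)
    pooled, n = np.concatenate([x, y]), len(x)
    t_obs = prepivoted_stat(x, y)
    t_perm = np.empty(B)
    for b in range(B):
        z = rng.permutation(pooled)
        t_perm[b] = prepivoted_stat(z[:n], z[n:])
    return (1 + np.sum(t_perm >= t_obs)) / (B + 1)   # permutation p-value
```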
In order to overcome the expressive limitations of graph neural networks (GNNs), we propose the first method that exploits vector flows over graphs to develop globally consistent directional and asymmetric aggregation functions. We show that our directional graph networks (DGNs) generalize convolutional neural networks (CNNs) when applied on a grid. Whereas recent theoretical works focus on understanding local neighbourhoods, local structures and local isomorphism with no global information flow, our novel theoretical framework allows directional convolutional kernels in any graph. First, by defining a vector field in the graph, we develop a method of applying directional derivatives and smoothing by projecting node-specific messages into the field. Then, we propose using the Laplacian eigenvectors as such a vector field, and we show that the method generalizes CNNs on an $n$-dimensional grid and is provably more discriminative than standard GNNs with respect to the 1-Weisfeiler-Lehman (1-WL) test. Finally, we bring the power of CNN data augmentation to graphs by providing a means of applying reflections, rotations and distortions to the underlying directional field. We evaluate our method on different standard benchmarks and see a relative error reduction of 8% on the CIFAR10 graph dataset and of 11% to 32% on the molecular ZINC dataset. An important outcome of this work is that it enables the translation of any physical or biological problem with intrinsic directional axes into a graph-network formalism with an embedded directional field.
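A stripped-down version of the core construction, taking the first nontrivial Laplacian eigenvector as the field; the normalizations here are simplified relative to the paper's aggregation matrices.

```python
import numpy as np

def directional_matrices(A, eps=1e-12):
    L = np.diag(A.sum(1)) - A                  # combinatorial graph Laplacian
    _, vecs = np.linalg.eigh(L)
    phi = vecs[:, 1]                           # first nontrivial eigenvector as the field
    F = A * (phi[None, :] - phi[:, None])      # signed field component along each edge
    norm = np.maximum(np.abs(F).sum(1, keepdims=True), eps)
    B_av = np.abs(F) / norm                    # directional smoothing weights
    B_dx = F / norm                            # directional derivative weights
    return B_av, B_dx                          # aggregate node features H via B @ H
```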