In this work, we discuss several properties of the eigenvalues of certain classes of signed complete graphs. We also obtain the form of the characteristic polynomial for these graphs.
We construct a bipartite generalization of Alon and Szegedy's nearly orthogonal vectors, thereby obtaining strong bounds for several extremal problems involving the Lov\'asz theta function, vector chromatic number, minimum semidefinite rank, nonnegative rank, and extension complexity of polytopes. In particular, we derive some general lower bounds for the vector chromatic number which may be of independent interest.
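For readers unfamiliar with the parameter, the standard Karger-Motwani-Sudan formulation of the vector chromatic number is recalled below as background; the notation is ours and is not taken from the paper:
\[
\chi_{\mathrm{vec}}(G) \;=\; \inf\Big\{ k \ge 2 \;:\; \exists\ \text{unit vectors } u_1,\dots,u_n \text{ with } \langle u_i, u_j\rangle \le -\tfrac{1}{k-1} \ \text{for all } ij \in E(G) \Big\}.
\]
The strict variant, in which the inner products on edges equal exactly $-1/(k-1)$, coincides with the Lov\'asz theta function of the complement graph, which is one reason lower bounds on the vector chromatic number translate into bounds on theta-type quantities.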
Generative diffusion models have achieved spectacular performance in many areas of generative modeling. While the fundamental ideas behind these models come from non-equilibrium physics, in this paper we show that many aspects of these models can be understood using the tools of equilibrium statistical mechanics. Using this reformulation, we show that generative diffusion models undergo second-order phase transitions corresponding to symmetry-breaking phenomena. We argue that this leads to a form of instability that lies at the heart of their generative capabilities and that can be described by a set of mean-field critical exponents. We conclude by analyzing recent work connecting diffusion models and associative memory networks in view of this thermodynamic formulation.
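As a purely illustrative reminder of what mean-field critical behaviour looks like (a textbook Landau picture, not a result taken from this paper), a second-order symmetry-breaking transition can be described by a free energy of the form
\[
f(m) \;=\; \tfrac{a}{2}\,(t - t_c)\,m^2 \;+\; \tfrac{b}{4}\,m^4, \qquad a, b > 0,
\]
whose minimiser is $m^\ast = 0$ for $t > t_c$ and $m^\ast = \pm\sqrt{a(t_c - t)/b} \sim (t_c - t)^{1/2}$ for $t < t_c$, giving the mean-field order-parameter exponent $\beta = 1/2$.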
Bayesian optimal design of experiments is a well-established approach to planning experiments. Briefly, a probability distribution for the responses, known as the statistical model, is assumed; it depends on a vector of unknown parameters. A utility function is then specified which gives the gain in information from estimating the true value of the parameters using the Bayesian posterior distribution. A Bayesian optimal design maximises the expectation of the utility with respect to the joint distribution of responses and parameters implied by the statistical model and the prior distribution for the true parameter values. The approach accounts for the experimental aim via the specification of the utility, and for all assumed sources of uncertainty via the expected utility. However, it is predicated on the specification of the statistical model. Recently, a new type of statistical inference, known as Gibbs (or general Bayesian) inference, has been advanced. This is Bayesian-like, in that uncertainty about unknown quantities is represented by a posterior distribution, but it does not necessarily rely on the specification of a statistical model. Thus the resulting inference should be less sensitive to misspecification of the statistical model. The purpose of this paper is to propose Gibbs optimal design: a framework for optimal design of experiments for Gibbs inference. The concept behind the framework is introduced, along with a computational approach for finding Gibbs optimal designs in practice. The framework is demonstrated on exemplars including linear models and experiments with count and time-to-event responses.
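In standard notation (ours, not necessarily the paper's), a Bayesian optimal design maximises the expected utility
\[
d^{*} \;=\; \arg\max_{d \in \mathcal{D}} \; \int\!\!\int u(d,\theta,y)\,\pi(y \mid \theta, d)\,\pi(\theta)\,\mathrm{d}y\,\mathrm{d}\theta,
\]
whereas in the Gibbs setting the posterior entering the utility is a general Bayesian posterior of the form $\pi_{G}(\theta \mid y, d) \propto \pi(\theta)\exp\{-w\,\ell(\theta; y, d)\}$, built from a user-specified loss $\ell$ and weight $w > 0$ rather than from a likelihood.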
Importance sampling is a popular technique in Bayesian inference: by reweighting samples drawn from a proposal distribution we are able to obtain samples and moment estimates from a Bayesian posterior over some $n$ latent variables. Recent work, however, indicates that importance sampling scales poorly: to accurately approximate the true posterior, the required number of importance samples grows exponentially in the number of latent variables [Chatterjee and Diaconis, 2018]. Massively parallel importance sampling works around this issue by drawing $K$ samples for each of the $n$ latent variables and reasoning about all $K^n$ combinations of latent samples. In principle, we can reason efficiently over $K^n$ combinations of samples by exploiting conditional independencies in the generative model. In practice, however, this requires complex algorithms that traverse backwards through the graphical model, and separate backward traversals are needed for each computation (posterior expectations, marginals and samples). Our contribution is to exploit the source term trick from physics to entirely avoid the need to hand-write backward traversals. Instead, we demonstrate how to simply and easily compute all the required quantities -- posterior expectations, marginals and samples -- by differentiating through a slightly modified marginal likelihood estimator.
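The identity underlying the source term trick can be illustrated in a few lines. The sketch below is a minimal single-variable toy (ordinary, not massively parallel, importance sampling; the toy model and all names are ours): differentiating the log of a source-augmented marginal-likelihood estimator at zero source recovers the self-normalised importance-sampling estimate of a posterior expectation.

```python
# Toy illustration of the source term identity (not the paper's massively
# parallel estimator):
#   Zhat(J) = (1/K) * sum_k w_k * exp(J * f(z_k)),
#   d/dJ log Zhat(J) |_{J=0} = sum_k w_k f(z_k) / sum_k w_k,
# i.e. differentiating the source-augmented marginal-likelihood estimator
# recovers the self-normalised importance-sampling posterior expectation.
import numpy as np

rng = np.random.default_rng(0)
K = 100_000

# Proposal q = N(0, 1); unnormalised target proportional to N(1, 0.5^2).
z = rng.normal(size=K)
log_w = -0.5 * ((z - 1.0) / 0.5) ** 2 + 0.5 * z ** 2  # log p(z) - log q(z), up to a constant
w = np.exp(log_w - log_w.max())                       # stabilised importance weights

f = z  # test function: the posterior mean of z

def log_Zhat(J):
    """Log of the source-augmented marginal-likelihood estimator."""
    return np.log(np.mean(w * np.exp(J * f)))

# In practice one would use automatic differentiation; a central finite
# difference is sufficient for this illustration.
eps = 1e-4
grad_estimate = (log_Zhat(eps) - log_Zhat(-eps)) / (2 * eps)
snis_estimate = np.sum(w * f) / np.sum(w)             # standard self-normalised IS

print(grad_estimate, snis_estimate)                   # the two agree (both close to 1)
```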
In this paper we consider the numerical approximation of infinite horizon problems via the dynamic programming approach. The value function of the problem solves a Hamilton-Jacobi-Bellman (HJB) equation that is approximated by a fully discrete method. It is known that the numerical problem is difficult to handle due to the so-called curse of dimensionality. To mitigate this issue we apply a reduction of the order by means of a new proper orthogonal decomposition (POD) method based on time derivatives. We carry out the error analysis of the method using recently proved optimal bounds for the fully discrete approximations. Moreover, the use of snapshots based on time derivatives allows us to bound some terms of the error that could not be bounded in a standard POD approach. Numerical experiments show the good performance of the method in practice.
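As a rough sketch of the kind of reduced-order construction involved (the function below is our own illustration under simplifying assumptions, not the paper's algorithm), a POD basis can incorporate time-derivative information by appending finite-difference derivatives of the snapshots to the snapshot matrix before taking the SVD.

```python
# Minimal POD sketch using snapshots together with their (approximate) time
# derivatives; an illustration under our own assumptions, not the paper's
# exact method or scaling of the derivative snapshots.
import numpy as np

def pod_basis_with_time_derivatives(Y, dt, r):
    """Y: (n_dof, n_snap) array of solution snapshots taken every dt time units.
    Returns the first r POD modes computed from the snapshots and their
    finite-difference time derivatives, plus the singular values."""
    dY = np.gradient(Y, dt, axis=1)        # approximate time derivatives
    S = np.hstack([Y, dY])                 # enlarged snapshot matrix
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    return U[:, :r], s

# The reduced problem (here, the fully discrete HJB approximation) would then
# be solved in the span of the returned modes instead of the full state space.
```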
Whittle-Mat\'ern fields are a recently introduced class of Gaussian processes on metric graphs, which are specified as solutions to a fractional-order stochastic differential equation. Unlike earlier covariance-based approaches for specifying Gaussian fields on metric graphs, the Whittle-Mat\'ern fields are well-defined for any compact metric graph and can provide Gaussian processes with differentiable sample paths. We derive the main statistical properties of the model class, in particular the consistency and asymptotic normality of maximum likelihood estimators of the model parameters, as well as necessary and sufficient conditions for asymptotic optimality of linear prediction based on the model with misspecified parameters. The covariance function of the Whittle-Mat\'ern fields is generally unavailable in closed form, which has made these fields challenging to use for statistical inference. However, we show that for specific values of the fractional exponent, when the fields have Markov properties, likelihood-based inference and spatial prediction can be performed exactly and computationally efficiently. This facilitates using the Whittle-Mat\'ern fields in statistical applications involving big datasets without the need for any approximations. The methods are illustrated via an application to the modeling of traffic data, where allowing for differentiable processes dramatically improves the results.
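For background (stated in the standard notation of the SPDE approach, which may differ slightly from the paper's), a Whittle-Mat\'ern field $u$ on a compact metric graph $\Gamma$ is specified by
\[
(\kappa^2 - \Delta_{\Gamma})^{\alpha/2}\,(\tau u) \;=\; \mathcal{W} \quad \text{on } \Gamma,
\]
where $\Delta_{\Gamma}$ is the Laplacian on the graph (with Kirchhoff vertex conditions), $\mathcal{W}$ is Gaussian white noise, and $\kappa, \tau > 0$ and the fractional exponent $\alpha$ control the practical correlation range, the variance, and the sample-path smoothness; the Markov cases referred to above correspond to integer values of $\alpha$.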
Confounder selection, namely choosing a set of covariates to control for confounding between a treatment and an outcome, is arguably the most important step in the design of observational studies. Previous methods, such as Pearl's celebrated back-door criterion, typically require pre-specifying a causal graph, which can often be difficult in practice. We propose an interactive procedure for confounder selection that does not require pre-specifying the graph or the set of observed variables. This procedure iteratively expands the causal graph by finding what we call "primary adjustment sets" for a pair of possibly confounded variables. This can be viewed as inverting a sequence of latent projections of the underlying causal graph. Structural information in the form of primary adjustment sets is elicited from the user, bit by bit, until either a set of covariates is found to control for confounding or it can be determined that no such set exists. Other information, such as the causal relations between confounders, is not required by the procedure. We show that if the user correctly specifies the primary adjustment sets in every step, our procedure is both sound and complete.
In this paper, we study two graph convexity parameters: the iteration time and the general position number. The iteration time was defined in 1981 in the geodesic convexity, but its computational complexity has remained open until now. The general position number was defined in the geodesic convexity and proved NP-hard in 2018. We extend these parameters to any graph convexity and prove that the iteration time is NP-hard in the P3 convexity. We use this result to prove that the iteration time is also NP-hard in the geodesic convexity, even in graphs with diameter two, settling a long-standing open question. These results are also important since they are the last two missing NP-hardness results regarding the ten most studied graph convexity parameters in the geodesic and P3 convexities. We also prove that the general position number of the monophonic convexity is W[1]-hard (parameterized by the size of the solution) and $n^{1-\varepsilon}$-inapproximable in polynomial time for any $\varepsilon>0$ unless P=NP, even in graphs with diameter two. Finally, we obtain FPT results on the general position number in the P3 convexity and prove that it is W[1]-hard (parameterized by the size of the solution).
Estimating the structure of directed acyclic graphs (DAGs) from observational data remains a significant challenge in machine learning. Most research in this area concentrates on learning a single DAG for the entire population. This paper considers an alternative setting where the graph structure varies across individuals based on available "contextual" features. We tackle this contextual DAG problem via a neural network that maps the contextual features to a DAG, represented as a weighted adjacency matrix. The neural network is equipped with a novel projection layer that ensures the output matrices are sparse and satisfy a recently developed characterization of acyclicity. We devise a scalable computational framework for learning contextual DAGs and provide a convergence guarantee and an analytical gradient for backpropagating through the projection layer. Our experiments suggest that the new approach can recover the true context-specific graph where existing approaches fail.
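The sketch below is only meant to make the setup concrete and is not the paper's architecture or projection layer: a small network maps contextual features to a weighted adjacency matrix, and a log-determinant function of the kind used in recent acyclicity characterizations vanishes exactly when that matrix encodes a DAG. All names and choices here are our own.

```python
# Illustrative only: map a context vector to a weighted adjacency matrix and
# evaluate a log-determinant acyclicity measure,
#   h(A) = -log det(s*I - A*A) + d*log(s),   with A*A the elementwise square,
# which is zero exactly when A encodes a DAG (valid for s above the spectral
# radius of A*A). This is NOT the paper's projection layer.
import numpy as np

def context_to_adjacency(x, W1, W2, d):
    """Map context features x to a d x d weighted adjacency matrix."""
    h = np.tanh(W1 @ x)                  # hidden representation
    A = (W2 @ h).reshape(d, d)
    np.fill_diagonal(A, 0.0)             # no self-loops
    return A

def acyclicity(A, s=1.5):
    """Log-det acyclicity measure; requires s > spectral radius of A*A."""
    d = A.shape[0]
    M = s * np.eye(d) - A * A            # elementwise square of the weights
    sign, logdet = np.linalg.slogdet(M)
    return -logdet + d * np.log(s)

# During training, one would drive acyclicity(context_to_adjacency(x, ...))
# towards zero for every context x while encouraging sparsity in A, so that
# each context yields a sparse DAG.
```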
Suppose a finite, unweighted, combinatorial graph $G = (V,E)$ is the union of several (degree-)regular graphs which are then connected by a few additional edges. $G$ will then have only a small number of vertices $v \in V$ with the property that some neighbor $w$, $(v,w) \in E$, has a higher degree $\mbox{deg}(w) > \mbox{deg}(v)$. We prove the converse statement: if a graph has few vertices having a neighbor with higher degree and satisfies a mild regularity condition, then, by adding and removing a few edges, the graph can be turned into a disjoint union of (distance-)regular graphs. The number of edge operations depends on the maximum degree and on the number of vertices with a higher-degree neighbor, but is independent of $|V|$.