We present a new algorithm for amortized inference in sparse probabilistic graphical models (PGMs), which we call $\Delta$-amortized inference ($\Delta$-AI). Our approach is based on the observation that when the sampling of variables in a PGM is seen as a sequence of actions taken by an agent, the sparsity of the PGM enables local credit assignment in the agent's policy learning objective. This yields a local constraint that can be turned into a local loss in the style of generative flow networks (GFlowNets), which enables off-policy training but avoids the need to instantiate all the random variables for each parameter update, thus speeding up training considerably. The $\Delta$-AI objective matches the conditional distribution of a variable given its Markov blanket under a tractable learned sampler, which has the structure of a Bayesian network, to the same conditional distribution under the target PGM. As such, the trained sampler recovers marginals and conditional distributions of interest and enables inference over partial subsets of variables. We illustrate $\Delta$-AI's effectiveness for sampling from synthetic PGMs and for training latent variable models with sparse factor structure.
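To make the local credit assignment concrete, here is a minimal sketch (our illustration for a pairwise binary MRF, not the authors' code): the target conditional of $x_i$ given its Markov blanket involves only the factors touching $x_i$, so a $\Delta$-AI-style squared log-ratio loss never instantiates the full joint. The sampler side is collapsed into a single hypothetical logit here; in the paper it aggregates the learned Bayesian network's conditionals for $x_i$ and its children.

\begin{verbatim}
import numpy as np

def target_logit(i, x, J, h):
    """log p(x_i=+1 | MB) - log p(x_i=-1 | MB) for an Ising-type MRF
    p(x) ~ exp(sum_{i<j} J_ij x_i x_j + sum_i h_i x_i), x_i in {-1,+1}.
    Uses only the neighbors of i (J has zero diagonal)."""
    return 2.0 * (h[i] + J[i] @ x)

def delta_ai_style_loss(i, x, J, h, sampler_logit):
    """Squared mismatch between the sampler's local conditional logit for
    x_i and the target's: a per-variable loss, no full-joint evaluation."""
    return (sampler_logit - target_logit(i, x, J, h)) ** 2

rng = np.random.default_rng(0)
n = 6
J = np.triu(0.5 * rng.normal(size=(n, n)), k=1); J = J + J.T
h = rng.normal(size=n)
x = rng.choice([-1.0, 1.0], size=n)
print(delta_ai_style_loss(2, x, J, h, sampler_logit=0.0))
\end{verbatim}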
This paper introduces $\textbf{gemact}$, a $\textbf{Python}$ package for actuarial modelling based on the collective risk model. The library supports applications to risk costing and risk transfer, loss aggregation, and loss reserving. We add new probability distributions to those available in $\textbf{scipy}$, including the $(a, b, 0)$ and $(a, b, 1)$ discrete distributions, copulas of the Archimedean family, and the Gaussian, Student t, and Fundamental copulas. We provide an implementation of the AEP algorithm for calculating the cumulative distribution function of the sum of dependent, non-negative random variables, given their dependency structure specified with a copula. The theoretical framework is introduced at the beginning of each section to give the reader a sufficient understanding of the underlying actuarial models.
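As context for the collective risk model (a sketch of the underlying theory written against plain $\textbf{scipy}$, not $\textbf{gemact}$'s actual API): the aggregate loss is $S = X_1 + \dots + X_N$ with a random claim count $N$ and i.i.d. claim severities $X_i$, so moments and tail quantiles of $S$ can be estimated by simulation.

\begin{verbatim}
import numpy as np
from scipy import stats

# Collective risk model S = X_1 + ... + X_N, estimated by Monte Carlo:
# Poisson claim frequency, gamma claim severity (illustrative choices).
rng = np.random.default_rng(0)
n_sim, lam = 20_000, 3.0
freq = stats.poisson(mu=lam)            # claim count N
sev = stats.gamma(a=2.0, scale=500.0)   # claim size X_i, mean 1000

counts = freq.rvs(size=n_sim, random_state=rng)
agg = np.array([sev.rvs(size=n, random_state=rng).sum() for n in counts])

print("E[S] estimate:", agg.mean(), "| theory:", lam * sev.mean())
print("99% quantile of S:", np.quantile(agg, 0.99))
\end{verbatim}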
Accurate simulation of deformable linear object (DLO) dynamics is challenging if the task at hand requires a human-interpretable model that also yields fast predictions. To arrive at such a model, we draw inspiration from the rigid finite element method (R-FEM) and model a DLO as a serial chain of rigid bodies whose internal state is unrolled through time by a dynamics network. As this state is not observed directly, the dynamics network is trained jointly with a physics-informed encoder that maps observed motion variables to the DLO's hidden state. To encourage the state to acquire a physically meaningful representation, we leverage the forward kinematics of the underlying R-FEM model as a decoder. Through robot experiments, we demonstrate that the proposed architecture provides an easy-to-handle yet capable DLO dynamics model that yields physically interpretable predictions from partial observations. The project code is available at: \url{https://tinyurl.com/fei-networks}
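A minimal sketch of the decoder idea (our illustration, assuming a planar chain with uniform segment length, not the authors' code): forward kinematics maps the hidden joint angles of the rigid-body chain to observable node positions, which is what ties the latent state to physical quantities.

\begin{verbatim}
import numpy as np

def forward_kinematics(joint_angles, seg_len=0.1):
    """Planar serial chain of rigid segments: cumulative joint angles give
    each segment's absolute orientation; cumulative steps give node positions."""
    phi = np.cumsum(joint_angles)
    steps = seg_len * np.stack([np.cos(phi), np.sin(phi)], axis=1)
    return np.vstack([np.zeros((1, 2)), np.cumsum(steps, axis=0)])

# Five segments; returns the (n_segments + 1, 2) node positions of the DLO.
print(forward_kinematics(np.deg2rad([10.0, 5.0, -5.0, 0.0, 15.0])))
\end{verbatim}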
We prove and collect numerous explicit and computable results for the fractional Laplacian $(-\Delta)^s f(x)$ with $s>0$ as well as its whole-space inverse, the Riesz potential, $(-\Delta)^{-s}f(x)$ with $s\in\left(0,\frac{1}{2}\right)$. Choices of $f(x)$ include weighted classical orthogonal polynomials such as the Legendre, Chebyshev, Jacobi, Laguerre and Hermite polynomials, as well as Bessel functions of the first and second kind with or without sinusoidal weights. Some higher-dimensional fractional Laplacians and Riesz potentials of generalized Zernike polynomials on the unit ball and its complement, as well as of whole-space generalized Laguerre polynomials, are also discussed. The aim of this paper is to aid the continued development of numerical methods for problems involving the fractional Laplacian or the Riesz potential in bounded and unbounded domains -- both directly, by providing useful basis or frame functions for spectral method approaches, and indirectly, by providing accessible ways to construct computable toy problems on which to test new numerical methods.
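For orientation, two classical identities of the kind such a collection builds on (quoted from the literature, not results of this paper): the Fourier-symbol relation $(-\Delta)^s e^{i\xi x} = |\xi|^{2s} e^{i\xi x}$, which immediately gives $(-\Delta)^s \sin(\xi x) = |\xi|^{2s}\sin(\xi x)$, and the Getoor--Dyda formula
\[
(-\Delta)^s \left(1-|x|^2\right)_+^{s} \;=\; \frac{2^{2s}\,\Gamma\!\left(\tfrac{d}{2}+s\right)\Gamma(s+1)}{\Gamma\!\left(\tfrac{d}{2}\right)}, \qquad x \in B_1 \subset \mathbb{R}^d,
\]
which makes $(1-|x|^2)_+^{s}$ a convenient closed-form test case for solvers posed on the unit ball.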
We offer an alternative proof, using the Stein-Chen method, of Bollob\'{a}s' theorem concerning the distribution of the extreme degrees of a random graph. Our proof also provides a rate of convergence of the extreme degrees to their asymptotic distribution. The same method applies in a more general setting where the probability that any given pair of vertices is connected by an edge depends on the number of vertices.
Learning distance functions between complex objects, such as the Wasserstein distance between point sets, is a common goal in machine learning applications. However, functions on such complex objects (e.g., point sets and graphs) are often required to be invariant to a wide variety of group actions, e.g., permutations or rigid transformations. Therefore, continuous and symmetric product functions (such as distance functions) on such complex objects must also be invariant to the product of such group actions. We call these functions symmetric and factor-wise group invariant (SFGI functions for short). In this paper, we first present a general neural network architecture for approximating SFGI functions. Our main contribution combines this general architecture with a sketching idea to develop a specific and efficient neural network that can approximate the $p$-th Wasserstein distance between point sets. Importantly, the required model complexity is independent of the sizes of the input point sets. On the theoretical front, to the best of our knowledge, this is the first result showing that there exists a neural network with the capacity to approximate the Wasserstein distance with bounded model complexity. Our work provides an interesting integration of sketching ideas for geometric problems with the universal approximation of symmetric functions. On the empirical front, we present a range of results showing that our newly proposed neural network architecture performs comparably to or better than other models (including a state-of-the-art (SOTA) Siamese-autoencoder-based approach). In particular, our neural network generalizes significantly better and trains much faster than the SOTA Siamese AE. Finally, this line of investigation could be useful for exploring effective neural network designs for a broad range of geometric optimization problems (e.g., $k$-means in a metric space).
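To make the construction concrete, here is a minimal sketch (our illustration with arbitrary widths, not the paper's exact architecture): each point set is "sketched" into a fixed-size, permutation-invariant embedding, and a head that is symmetric in the two embeddings produces the distance estimate, so the model size is independent of the input set sizes.

\begin{verbatim}
import torch
import torch.nn as nn

class SetSketch(nn.Module):
    """Permutation-invariant (DeepSets-style) embedding of a point set."""
    def __init__(self, dim=2, width=64, out=32):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(dim, width), nn.ReLU(),
                                 nn.Linear(width, out))
    def forward(self, pts):          # pts: (n_points, dim), any n_points
        return self.phi(pts).mean(0) # mean pooling => order-independent

class SymmetricDistance(nn.Module):
    """Symmetric head: swapping the two sets leaves the features unchanged."""
    def __init__(self, emb=32, width=64):
        super().__init__()
        self.enc = SetSketch(out=emb)
        self.rho = nn.Sequential(nn.Linear(2 * emb, width), nn.ReLU(),
                                 nn.Linear(width, 1))
    def forward(self, A, B):
        u, v = self.enc(A), self.enc(B)
        return self.rho(torch.cat([u + v, (u - v).abs()])).squeeze()

model = SymmetricDistance()
print(model(torch.randn(10, 2), torch.randn(7, 2)))  # untrained output
\end{verbatim}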
Diffusion models have become a main paradigm for synthetic data generation in many subfields of modern machine learning, including computer vision, language modeling, and speech synthesis. In this paper, we leverage the power of diffusion models to generate synthetic tabular data. The heterogeneous features of tabular data have been a main obstacle to tabular data synthesis, and we tackle this problem by employing an auto-encoder architecture. Compared with state-of-the-art tabular synthesizers, the synthetic tables produced by our model show high statistical fidelity to the real data and perform well in downstream machine learning tasks. We conduct experiments on $15$ publicly available datasets. Notably, our model adeptly captures the correlations among features, which has been a long-standing challenge in tabular data synthesis. Our code is available at https://github.com/UCLA-Trustworthy-AI-Lab/AutoDiffusion.
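Schematically, the pipeline looks as follows (a toy sketch under our own simplifications, not the AutoDiffusion code: a single linear layer stands in for the learned auto-encoder, and random vectors stand in for preprocessed rows): heterogeneous columns are mapped to a continuous latent representation, and a denoising network is trained there with the standard DDPM noise-prediction loss.

\begin{verbatim}
import torch
import torch.nn as nn

T = 100
betas = torch.linspace(1e-4, 0.02, T)
alpha_bar = torch.cumprod(1 - betas, dim=0)     # DDPM noise schedule

enc = nn.Linear(8, 4)                           # stand-in auto-encoder
denoiser = nn.Sequential(nn.Linear(4 + 1, 64), nn.ReLU(), nn.Linear(64, 4))
opt = torch.optim.Adam(list(enc.parameters()) +
                       list(denoiser.parameters()), lr=1e-3)

x_raw = torch.randn(256, 8)                     # stand-in for encoded rows
for step in range(200):
    z0 = enc(x_raw)
    t = torch.randint(0, T, (z0.size(0),))
    eps = torch.randn_like(z0)
    a = alpha_bar[t].unsqueeze(1)
    zt = a.sqrt() * z0 + (1 - a).sqrt() * eps   # forward noising of latents
    pred = denoiser(torch.cat([zt, t.float().unsqueeze(1) / T], dim=1))
    loss = ((pred - eps) ** 2).mean()           # predict the added noise
    opt.zero_grad(); loss.backward(); opt.step()
\end{verbatim}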
We derive novel and sharp high-dimensional Berry--Esseen bounds for sums of $m$-dependent random vectors over the class of hyper-rectangles, exhibiting only a poly-logarithmic dependence on the dimension. Our results hold under minimal assumptions, such as non-degenerate covariances and finite third moments, and yield a sample complexity of order $\sqrt{m/n}$, up to logarithmic terms, matching the optimal rates established in the univariate case. When specialized to sums of independent non-degenerate random vectors, we obtain sharp rates under the weakest possible conditions. On the technical side, we develop an inductive relationship between anti-concentration inequalities and Berry--Esseen bounds, inspired by the classical Lindeberg swapping method and the concentration-inequality approach for dependent data, which may be of independent interest.
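Schematically (our paraphrase, with constants and the precise logarithmic factors suppressed), the bounds take the form
\[
\sup_{R \in \mathcal{R}} \left| \mathbb{P}\!\left( \frac{1}{\sqrt{n}} \sum_{i=1}^{n} X_i \in R \right) - \mathbb{P}\!\left( G \in R \right) \right| \;\lesssim\; \sqrt{\frac{m}{n}}\; \mathrm{polylog}(d, n),
\]
where $\mathcal{R}$ is the class of hyper-rectangles in $\mathbb{R}^d$ and $G$ is a centered Gaussian vector whose covariance matches that of the normalized sum.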
Compartmental models provide simple and efficient tools to analyze the relevant transmission processes during an outbreak, to produce short-term forecasts or transmission scenarios, and to assess the impact of vaccination campaigns. However, their calibration is not straightforward, since many factors contribute to rapid changes in the transmission dynamics during an epidemic, for example changes in individual awareness, the imposition of non-pharmaceutical interventions, and the emergence of new variants. As a consequence, model parameters such as the transmission rate are bound to change in time, making their estimation more challenging. Here, we propose to use Physics-Informed Neural Networks (PINNs) to track the temporal changes in the model parameters and to provide an estimate of the model state variables. PINNs have recently gained attention in many engineering applications thanks to their ability to incorporate both the information from (typically uncertain) data and the governing equations of the system. Their ability to identify unknown model parameters makes PINNs particularly suitable for solving ill-posed inverse problems, such as those arising in the application of epidemiological models. Here, we develop a reduced-split approach for the implementation of PINNs that estimates the temporal changes in the state variables and transmission rate of an epidemic from the SIR model equations and data on infectious individuals. The main idea is to split the training: first on the epidemiological data, and then on the residual of the system equations. The proposed method is applied to five synthetic test cases and two real scenarios reproducing the first months of the COVID-19 pandemic in Italy. Our results show that the split implementation of PINNs outperforms the standard approach in terms of both accuracy (by up to one order of magnitude) and computational time (a speed-up of 20%).
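A schematic of the two-stage split (our sketch with placeholder data and a fixed recovery rate, not the authors' implementation): stage one fits a network for $I(t)$ to the infectious data alone; stage two holds $I(t)$ fixed and fits networks for $S(t)$ and $\beta(t)$ by penalizing the SIR residuals, with time derivatives obtained by automatic differentiation.

\begin{verbatim}
import torch
import torch.nn as nn

I_net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
S_net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
b_net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
gamma, N = 0.1, 1.0e6                 # assumed recovery rate, population

# Stage 1: fit I(t) to (placeholder) infectious data only.
t_obs = torch.linspace(0, 1, 50).unsqueeze(1)
I_obs = torch.rand(50, 1)
opt1 = torch.optim.Adam(I_net.parameters(), lr=1e-3)
for _ in range(500):
    loss = ((I_net(t_obs) - I_obs) ** 2).mean()
    opt1.zero_grad(); loss.backward(); opt1.step()

# Stage 2: fit S(t) and beta(t) to the SIR residuals, I(t) held fixed.
t = torch.linspace(0, 1, 100).unsqueeze(1).requires_grad_(True)
opt2 = torch.optim.Adam(list(S_net.parameters()) +
                        list(b_net.parameters()), lr=1e-3)
for _ in range(500):
    S, I = S_net(t), I_net(t)
    beta = torch.nn.functional.softplus(b_net(t))  # beta(t) > 0
    dS = torch.autograd.grad(S.sum(), t, create_graph=True)[0]
    dI = torch.autograd.grad(I.sum(), t, create_graph=True)[0]
    res_S = dS + beta * S * I / N                  # dS/dt = -beta*S*I/N
    res_I = dI - beta * S * I / N + gamma * I      # dI/dt = beta*S*I/N - gamma*I
    loss = (res_S ** 2).mean() + (res_I ** 2).mean()
    opt2.zero_grad(); loss.backward(); opt2.step()
\end{verbatim}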
A class of graphs admits an adjacency labeling scheme of size $b(n)$ if the vertices in each of its $n$-vertex graphs can be assigned binary strings (called labels) of length $b(n)$ so that the adjacency of two vertices can be determined solely from their labels. We give tight bounds on the size of adjacency labels for every monotone (i.e., subgraph-closed) class with a well-behaved growth function between $2^{O(n \log n)}$ and $2^{O(n^{2-\delta})}$ for any $\delta > 0$. Specifically, we show that for any fixed $\delta > 0$ and any function $f: \mathbb N \to \mathbb R$ satisfying $\log n \leqslant f(n) \leqslant n^{1-\delta}$ and a certain sub-multiplicativity condition, there are monotone graph classes with growth $2^{O(nf(n))}$ that do not admit adjacency labels of size at most $f(n) \log n$. On the other hand, any such class does admit adjacency labels of size $O(f(n)\log n)$. Surprisingly, this tight bound is a $\Theta(\log n)$ factor away from the information-theoretic bound of $O(f(n))$. The special case $f = \log$ implies that the recently refuted Implicit Graph Conjecture [Hatami and Hatami, FOCS 2022] also fails within monotone classes.
A common way to approximate $F(A)b$ -- the action of a matrix function on a vector -- is the Arnoldi approximation. Since a new vector needs to be generated and stored in every iteration, one is often forced to rely on restart algorithms, which are either inefficient, unstable, or applicable only to restricted classes of functions. We present a new representation of the error of the Arnoldi iterates when the function $F$ is given as a Laplace transform. Based on this representation, we build an efficient and stable restart algorithm. In doing so, we extend earlier work for the class of Stieltjes functions, which are special Laplace transforms. We report several numerical experiments, including comparisons with the restart method for Stieltjes functions.
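For context, a baseline (unrestarted) Arnoldi approximation of $F(A)b$ looks as follows; this sketch (ours, with $F = \exp$ and no breakdown handling) shows the growing Krylov basis that motivates restarting, which the paper's algorithm addresses but this snippet does not implement.

\begin{verbatim}
import numpy as np
from scipy.linalg import expm

def arnoldi_fA_b(A, b, k):
    """k-step Arnoldi approximation of exp(A) b: project A onto the Krylov
    space K_k(A, b), apply exp to the small Hessenberg matrix H."""
    n = len(b)
    V = np.zeros((n, k + 1)); H = np.zeros((k + 1, k))
    beta = np.linalg.norm(b); V[:, 0] = b / beta
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):               # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)      # no breakdown handling
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(k); e1[0] = 1.0
    return beta * V[:, :k] @ (expm(H[:k, :k]) @ e1)

rng = np.random.default_rng(0)
A = -rng.random((50, 50)); A = (A + A.T) / 2  # symmetric test matrix
b = rng.random(50)
print(np.linalg.norm(arnoldi_fA_b(A, b, 20) - expm(A) @ b))
\end{verbatim}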