We consider the problem of evaluating dynamic consistency in discrete-time probabilistic filters that approximate stochastic system state densities with Gaussian mixtures. Dynamic consistency means that the estimated probability distributions correctly describe the actual uncertainties. As such, the problem of consistency testing naturally arises in estimator tuning and validation. However, due to the general complexity of the density functions involved, straightforward approaches to consistency testing of mixture-based estimators have remained challenging to define and implement. This paper derives a new exact result for Gaussian mixture consistency testing within the framework of normalized deviation squared (NDS) statistics. It is shown that NDS test statistics for generic multivariate Gaussian mixture models exactly follow mixtures of generalized chi-square distributions, for which efficient computational tools are available. The accuracy and utility of the resulting consistency tests are numerically demonstrated on static and dynamic mixture estimation examples.
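As a rough illustration of the NDS statistic mentioned above, the following minimal sketch assumes that the statistic is formed from the overall mean and covariance of the mixture; the abstract does not state this, and the function names `mixture_moments` and `nds` are hypothetical. It does not implement the generalized chi-square result itself.

```python
import numpy as np

def mixture_moments(weights, means, covs):
    """Overall mean and covariance of a Gaussian mixture (moment matching)."""
    w = np.asarray(weights, dtype=float)   # shape (K,)
    m = np.asarray(means, dtype=float)     # shape (K, n)
    c = np.asarray(covs, dtype=float)      # shape (K, n, n)
    mean = np.einsum("k,kn->n", w, m)
    d = m - mean                           # component offsets from the overall mean
    # Weighted within-component covariance plus spread of the component means.
    cov = np.einsum("k,kij->ij", w, c) + np.einsum("k,ki,kj->ij", w, d, d)
    return mean, cov

def nds(x, mean, cov):
    """Normalized deviation squared: (x - mean)^T cov^{-1} (x - mean)."""
    d = np.asarray(x, dtype=float) - mean
    return float(d @ np.linalg.solve(cov, d))

# Assumed toy example: two equally weighted 1-D components at -1 and +1.
mean, cov = mixture_moments([0.5, 0.5], [[-1.0], [1.0]], [[[1.0]], [[1.0]]])
value = nds([2.0], mean, cov)
```

Under this assumed definition, the statistic has expectation equal to the state dimension when the samples are drawn from the mixture itself, although its full distribution is not chi-square in general.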
We develop a conformal inference method to construct joint confidence regions for structured groups of missing entries within a sparsely observed matrix. This method is useful to provide reliable uncertainty estimation for group-level collaborative filtering; for example, it can be applied to help suggest a movie for a group of friends to watch together. Unlike standard conformal techniques, which make inferences for one individual at a time, our method achieves stronger group-level guarantees by carefully assembling a structured calibration data set mimicking the patterns expected among the test group of interest. We propose a generalized weighted conformalization framework to deal with the lack of exchangeability arising from such structured calibration, and in this process we introduce several innovations to overcome computational challenges. The practicality and effectiveness of our method are demonstrated through extensive numerical experiments and an analysis of the MovieLens 100K data set.
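For context, the standard one-at-a-time baseline that the abstract contrasts against can be sketched as a split conformal quantile. This is a generic illustration, not the structured group-level method proposed here; the function name `conformal_quantile` and the use of absolute residuals as scores are assumptions.

```python
import math

def conformal_quantile(scores, alpha):
    """Split conformal threshold from calibration scores.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest score, or infinity
    when that rank exceeds the number of calibration points.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]

# Assumed calibration scores, e.g. absolute residuals |rating - prediction|.
calibration_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
threshold = conformal_quantile(calibration_scores, alpha=0.5)
# A prediction interval for one new entry would be prediction +/- threshold.
```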
This work addresses the problem of simulating Gaussian random fields that are continuously indexed over a class of metric graphs, termed graphs with Euclidean edges, which are more general and flexible than linear networks. We introduce three general algorithms that allow the reconstruction of a wide spectrum of random fields whose covariance function depends on a specific metric, called the resistance metric, proposed in recent literature. The algorithms are applied to a synthetic case study consisting of a street network. They prove to be fast and accurate in that they reproduce the target covariance function and provide random fields whose finite-dimensional distributions are approximately Gaussian.
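To give a feel for the resistance metric, the sketch below computes the classical effective resistance between the vertices of a connected weighted graph from the pseudo-inverse of its Laplacian. It covers vertices only, whereas the metric used in this work is defined continuously along the edges; the function name `resistance_distances` is hypothetical.

```python
import numpy as np

def resistance_distances(adjacency):
    """Pairwise effective resistance between vertices of a connected graph.

    adjacency: symmetric matrix of non-negative edge conductances.
    """
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A          # graph Laplacian
    Lp = np.linalg.pinv(L)                  # Moore-Penrose pseudo-inverse
    diag = np.diag(Lp)
    # R_ij = Lp_ii + Lp_jj - 2 * Lp_ij
    return diag[:, None] + diag[None, :] - 2.0 * Lp

# Assumed toy example: a path graph 0 - 1 - 2 with unit conductances.
R = resistance_distances([[0, 1, 0],
                          [1, 0, 1],
                          [0, 1, 0]])
```

On a tree such as this path graph, the effective resistance coincides with the shortest-path distance, so the two end vertices are at distance 2.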
Second Moment Methods (SMMs) are developed that are consistent with the Discontinuous Galerkin (DG) spatial discretization of the discrete ordinates (or $S_N$) transport equations. The low-order (LO) diffusion system of equations is discretized with fully consistent $P_1$, Local Discontinuous Galerkin (LDG), and Interior Penalty (IP) methods. A discrete residual approach is used to derive SMM correction terms that make each of the LO systems consistent with the high-order (HO) discretization. We show that the consistent methods are more accurate and have better solution quality than independently discretized LO systems, that they preserve the diffusion limit, and that the LDG and IP consistent SMMs can be scalably solved in parallel on a challenging, multi-material benchmark problem.
Given the ever-increasing size of modern neural networks, the significance of sparse architectures has surged due to their accelerated inference speeds and minimal memory demands. Among global pruning techniques, Iterative Magnitude Pruning (IMP) still stands as a state-of-the-art algorithm despite its simple nature, particularly in extremely sparse regimes. In light of the recent finding that two successive matching IMP solutions are linearly connected without a loss barrier, we propose Sparse Weight Averaging with Multiple Particles (SWAMP), a straightforward modification of IMP that achieves performance comparable to an ensemble of two IMP solutions. In every iteration, we concurrently train multiple sparse models, referred to as particles, using different batch orders yet the same matching ticket, and then average their weights to produce a single mask. We demonstrate that our method consistently outperforms existing baselines across different sparsities through extensive experiments on various data and neural network structures.
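The averaging-then-pruning step described above can be sketched on a single flat weight array as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: training is omitted, and the function names `average_particles` and `magnitude_prune` as well as the pruning fraction are hypothetical.

```python
import numpy as np

def average_particles(particles, mask):
    """Average the weights of particles that share the same sparsity mask."""
    return np.mean(np.stack(particles), axis=0) * mask

def magnitude_prune(weights, mask, fraction):
    """Zero out the given fraction of surviving weights with smallest magnitude."""
    alive = np.flatnonzero(mask)                 # indices of unpruned weights
    k = int(round(fraction * alive.size))        # number of weights to remove
    new_mask = mask.copy().ravel()
    if k > 0:
        order = np.argsort(np.abs(weights.ravel()[alive]), kind="stable")
        new_mask[alive[order[:k]]] = 0
    return new_mask.reshape(mask.shape)

# Assumed toy example: two particles trained from the same ticket and mask.
mask = np.array([1, 1, 0, 1])
particles = [np.array([1.0, 4.0, 0.0, 2.0]), np.array([3.0, 2.0, 0.0, -6.0])]
averaged = average_particles(particles, mask)    # [2., 3., 0., -2.]
next_mask = magnitude_prune(averaged, mask, fraction=1 / 3)
```

Averaging before pruning means the next mask is chosen from a single set of weights rather than from any individual particle.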
With the rapid increase in machine learning workloads performed on HPC systems, it is beneficial to regularly run machine-learning-specific benchmarks to monitor performance and identify issues. Furthermore, as part of the Edinburgh International Data Facility, EPCC currently hosts a wide range of machine learning accelerators including Nvidia GPUs, the Graphcore Bow Pod64 and Cerebras CS-2, which are managed via Kubernetes and Slurm. We extended the Reframe framework to support the Kubernetes scheduler backend, and utilise Reframe to perform machine learning benchmarks. We discuss the preliminary results collected and the challenges involved in integrating Reframe across multiple platforms and architectures.
This report lists 13 functional conditions, cashed out in computational terms, that have been argued to be constitutive of conscious valenced experience. These are extracted from existing empirical and theoretical literature on, among others, animal sentience, medical disorders, anaesthetics, philosophy, evolution, neuroscience, and artificial intelligence.
Recently, the performance of monocular depth estimation (MDE) has been significantly boosted with the integration of transformer models. However, transformer models are usually computationally expensive, and their effectiveness in lightweight models is limited compared to convolutions. This limitation hinders their deployment on resource-limited devices. In this paper, we propose a cross-architecture knowledge distillation method for MDE, dubbed DisDepth, to enhance efficient CNN models with the supervision of state-of-the-art transformer models. Concretely, we first build a simple framework of convolution-based MDE, which is then enhanced with a novel local-global convolution module to capture both local and global information in the image. To effectively distill valuable information from the transformer teacher and bridge the gap between convolution and transformer features, we introduce a method to acclimate the teacher with a ghost decoder. The ghost decoder is a copy of the student's decoder, and adapting the teacher with the ghost decoder aligns the features to be student-friendly while preserving the teacher's original performance. Furthermore, we propose an attentive knowledge distillation loss that adaptively identifies features valuable for depth estimation. This loss guides the student to focus more on attentive regions, improving its performance. Extensive experiments on the KITTI and NYU Depth V2 datasets demonstrate the effectiveness of DisDepth. Our method achieves significant improvements on various efficient backbones, showcasing its potential for efficient monocular depth estimation.
Recent studies indicate that the noise characteristics of phasor measurement units (PMUs) can be more accurately described by non-Gaussian distributions. Consequently, estimation techniques based on Gaussian noise assumptions may produce poor results with PMU data. This paper considers the PMU-based line parameter estimation (LPE) problem, and investigates the performance of four state-of-the-art techniques in solving this problem in the presence of non-Gaussian measurement noise. The rigorous comparative analysis highlights the merits and demerits of each technique with respect to the LPE problem, and identifies conditions under which they are expected to give good results.
Hypergeometric sequences are rational-valued sequences that satisfy first-order linear recurrence relations with polynomial coefficients; that is, $\langle u_n \rangle_{n=0}^\infty$ is hypergeometric if it satisfies a first-order linear recurrence of the form $p(n)u_{n+1} = q(n)u_{n}$ with polynomial coefficients $p,q\in\mathbb{Z}[x]$ and $u_0\in\mathbb{Q}$. In this paper, we consider the Threshold Problem for hypergeometric sequences: given a hypergeometric sequence $\langle u_n\rangle_{n=0}^\infty$ and a threshold $t\in\mathbb{Q}$, determine whether $u_n \ge t$ for each $n\in\mathbb{N}_0$. We establish decidability for the Threshold Problem under the assumption that the coefficients $p$ and $q$ are monic polynomials whose roots lie in an imaginary quadratic extension of $\mathbb{Q}$. We also establish conditional decidability results; for example, under the assumption that the coefficients $p$ and $q$ are monic polynomials whose roots lie in any number of quadratic extensions of $\mathbb{Q}$, the Threshold Problem is decidable subject to the truth of Schanuel's conjecture. Finally, we show how our approach both recovers and extends some of the recent decidability results on the Membership Problem for hypergeometric sequences with quadratic parameters.
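To make the recurrence and the Threshold Problem concrete, the sketch below generates exact terms of a hypergeometric sequence and checks the threshold on a finite prefix only. It is not a decision procedure and does not reflect the techniques of the paper; the function names and the example sequence $u_n = 1/n!$ are illustrative assumptions, and $p(n) \neq 0$ is assumed for every index visited.

```python
from fractions import Fraction

def hypergeometric_terms(p, q, u0, count):
    """First `count` terms of the sequence with p(n) * u_{n+1} = q(n) * u_n."""
    u = Fraction(u0)
    terms = []
    for n in range(count):
        terms.append(u)
        pn = p(n)
        if pn == 0:
            raise ValueError(f"p({n}) = 0, so u_{n + 1} is not determined")
        u = u * Fraction(q(n), pn)           # exact rational arithmetic
    return terms

def first_violation(p, q, u0, t, count):
    """Smallest n < count with u_n < t, or None if the prefix stays above t."""
    for n, u in enumerate(hypergeometric_terms(p, q, u0, count)):
        if u < t:
            return n
    return None

# Assumed example: p(n) = n + 1, q(n) = 1, u_0 = 1, which gives u_n = 1 / n!.
p = lambda n: n + 1
q = lambda n: 1
terms = hypergeometric_terms(p, q, 1, 4)     # 1, 1, 1/2, 1/6
index = first_violation(p, q, 1, Fraction(1, 4), 10)
```

A prefix check like this can only refute the threshold property by finding a violating index; establishing that $u_n \ge t$ holds for every $n$ is the hard part that the decidability results address.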
Incompleteness is a common problem for existing knowledge graphs (KGs), and KG completion, which aims to predict links between entities, is challenging. Most existing KG completion methods only consider the direct relations between nodes and ignore the relation paths, which contain useful information for link prediction. Recently, a few methods have taken relation paths into consideration but pay less attention to the order of relations in paths, which is important for reasoning. In addition, these path-based models ignore nonlinear contributions of path features to link prediction. To solve these problems, we propose a novel KG completion method named OPTransE. Instead of embedding both entities of a relation into the same latent space as in previous methods, we project the head entity and the tail entity of each relation into different spaces to preserve the order of relations in the path. Meanwhile, we adopt a pooling strategy to extract nonlinear and complex features of different paths to further improve the performance of link prediction. Experimental results on two benchmark datasets show that the proposed model OPTransE performs better than state-of-the-art methods.