Inferential models (IMs) represent a novel possibilistic approach for achieving provably valid statistical inference. This paper introduces a general framework for fusing independent IMs in a "black-box" manner, requiring no knowledge of how the original IMs were constructed. The underlying logic of this framework mirrors that of the IM construction itself. First, a fusing function for the initial IMs' possibility contours is selected. Since there is no guarantee that the fused function is calibrated for valid inference, a "validification" step is then performed. Finally, a straightforward normalization step ensures that the output is a genuine possibility contour.
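Schematically, if $\pi_1,\dots,\pi_K$ denote the possibility contours of the $K$ independent IMs, the three steps amount to forming $f(\pi_1(\theta),\dots,\pi_K(\theta))$ for a chosen fusing function $f$, adjusting it ("validification") so that the resulting contour $\pi_Y$ meets the usual IM validity requirement
\[
\sup_{\theta\in\Theta} \mathsf{P}_{Y\mid\theta}\bigl\{\pi_Y(\theta)\le \alpha\bigr\} \;\le\; \alpha \qquad \text{for all } \alpha\in[0,1],
\]
and finally rescaling so that $\sup_{\theta}\pi_Y(\theta)=1$, as a possibility contour requires. The precise validification construction is the paper's; this display only spells out the steps named in the abstract.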
Deep equilibrium (DEQ) models are widely recognized as a memory-efficient alternative to standard neural networks, achieving state-of-the-art performance in language modeling and computer vision tasks. These models solve a fixed point equation instead of explicitly computing the output, which sets them apart from standard neural networks. However, existing DEQ models often lack formal guarantees of the existence and uniqueness of the fixed point, and the convergence of the numerical scheme used for computing the fixed point is not formally established. As a result, DEQ models are potentially unstable in practice. To address these drawbacks, we introduce a novel class of DEQ models called positive concave deep equilibrium (pcDEQ) models. Our approach, which is based on nonlinear Perron-Frobenius theory, enforces nonnegative weights and activation functions that are concave on the positive orthant. By imposing these constraints, we can easily ensure the existence and uniqueness of the fixed point without relying on additional complex assumptions commonly found in the DEQ literature, such as those based on monotone operator theory in convex analysis. Furthermore, the fixed point can be computed with the standard fixed point algorithm, and we provide theoretical guarantees of its geometric convergence, which, in particular, simplifies the training process. Experiments demonstrate the competitiveness of our pcDEQ models against other implicit models.
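As a minimal illustration of the fixed-point computation (not the authors' implementation), the NumPy sketch below iterates $z \mapsto \phi(Wz + Ux + b)$ with entrywise nonnegative $W$, $U$, $b$ and an activation that is nonnegative, increasing and concave on the positive orthant; $\tanh$ restricted to $[0,\infty)$ is one such choice. Under these assumptions the iterates stay in the positive orthant and, per the abstract, the iteration converges geometrically.
\begin{verbatim}
import numpy as np

def pcdeq_fixed_point(x, W, U, b, tol=1e-8, max_iter=200):
    """Standard fixed-point iteration z <- phi(W z + U x + b) for a positive
    concave DEQ layer. W, U, b are assumed entrywise nonnegative and x
    nonnegative, so every iterate remains in the positive orthant."""
    phi = np.tanh                       # nonnegative, increasing, concave on [0, inf)
    z = np.zeros_like(b, dtype=float)
    for _ in range(max_iter):
        z_new = phi(W @ z + U @ x + b)
        if np.max(np.abs(z_new - z)) < tol:
            break
        z = z_new
    return z_new
\end{verbatim}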
The goal of multi-objective optimisation is to identify the Pareto front surface, which is the set obtained by connecting the best trade-off points. Typically, this surface is computed by evaluating the objectives at different points and then interpolating between the best of the evaluated trade-off points. In this work, we propose to parameterise the Pareto front surface using polar coordinates. More precisely, we show that any Pareto front surface can be equivalently represented using a scalar-valued length function which returns the projected length along any positive radial direction. We then use this representation in order to rigorously develop the theory and applications of stochastic Pareto front surfaces. In particular, we derive many Pareto front surface statistics of interest such as the expectation, covariance and quantiles. We then discuss how these can be used in practice within a design of experiments setting, where the goal is to both infer and use the Pareto front surface distribution in order to make effective decisions. Our framework allows for clear uncertainty quantification and we also develop advanced visualisation techniques for this purpose. Finally, we discuss the applicability of our ideas within multivariate extreme value theory and illustrate our methodology in a variety of numerical examples, including a case study with a real-world air pollution data set.
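To make the length-function representation concrete, the following NumPy sketch (not code from the paper) computes the projected length of the region dominated by a finite Pareto set approximation along a positive radial direction, assuming maximisation, nonnegative objective values and the origin as the reference point.
\begin{verbatim}
import numpy as np

def length_function(direction, pareto_points):
    """Projected length of the dominated region along a positive radial direction.
    direction     : shape (m,), strictly positive components
    pareto_points : shape (k, m), a finite approximation of the Pareto front"""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)                      # unit radial direction
    Y = np.asarray(pareto_points, dtype=float)
    # largest t such that t * d is weakly dominated by some Pareto point
    return float(np.max(np.min(Y / d, axis=1)))
\end{verbatim}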
Artificial intelligence systems, particularly large language models (LLMs), are increasingly being employed in high-stakes decisions that impact both individuals and society at large, often without adequate safeguards to ensure safety, quality, and equity. Yet LLMs hallucinate, lack common sense, and are biased - shortcomings that may reflect LLMs' inherent limitations and thus may not be remedied by more sophisticated architectures, more data, or more human feedback. Relying solely on LLMs for complex, high-stakes decisions is therefore problematic. Here we present a hybrid collective intelligence system that mitigates these risks by leveraging the complementary strengths of human experience and the vast information processed by LLMs. We apply our method to open-ended medical diagnostics, combining 40,762 differential diagnoses made by physicians with the diagnoses of five state-of-the-art LLMs across 2,133 medical cases. We show that hybrid collectives of physicians and LLMs outperform both single physicians and physician collectives, as well as single LLMs and LLM ensembles. This result holds across a range of medical specialties and levels of professional experience, and can be attributed to humans' and LLMs' complementary contributions that lead to different kinds of errors. Our approach highlights the potential for collective human and machine intelligence to improve accuracy in complex, open-ended domains like medical diagnostics.
Noninformative priors constructed for estimation purposes are usually not appropriate for model selection and testing. The methodology of integral priors was developed to obtain prior distributions for Bayesian model selection when comparing two models, by modifying initial improper reference priors. We propose a generalization of this methodology to more than two models. Our approach adds an artificial copy of each model under comparison by compactifying the parametric space, and creates an ergodic Markov chain across all models whose stationary distribution has the integral priors as marginals. Besides guaranteeing their existence and avoiding the paradoxes attached to estimation reference priors, an additional advantage of this methodology is that the simulation of this Markov chain is straightforward, as it only requires simulating imaginary training samples for all models and drawing from the corresponding posterior distributions. This renders the implementation automatic and generic, in both the nested and nonnested cases.
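To make the "straightforward" simulation concrete, the sketch below shows one sweep of such a chain in the two-model case; sample_training_data_1/2 and sample_posterior_1/2 are hypothetical user-supplied routines (simulate an imaginary minimal training sample from a model at a given parameter value, and draw from that model's posterior given such a sample). The multi-model chain of the paper extends this pattern across all models and their artificial copies.
\begin{verbatim}
def integral_prior_sweep(theta1, theta2, rng,
                         sample_training_data_1, sample_training_data_2,
                         sample_posterior_1, sample_posterior_2):
    """One sweep of the (two-model) Markov chain whose stationary marginals
    are the integral priors; all four routines are placeholders."""
    z1 = sample_training_data_1(theta1, rng)   # imaginary training sample from model 1
    theta2 = sample_posterior_2(z1, rng)       # posterior draw for model 2 given z1
    z2 = sample_training_data_2(theta2, rng)   # imaginary training sample from model 2
    theta1 = sample_posterior_1(z2, rng)       # posterior draw for model 1 given z2
    return theta1, theta2
\end{verbatim}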
We study the problem of parametric estimation for a continuously observed stochastic differential equation driven by fractional Brownian motion. Under some assumptions on the drift and diffusion coefficients, we construct the maximum likelihood estimator of the drift parameter and establish its asymptotic normality and moment convergence as the small dispersion coefficient vanishes.
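For concreteness, a representative small-dispersion setting consistent with the abstract (the exact model, conditions and rate are the paper's) is
\[
\mathrm{d}X_t \;=\; b(X_t,\theta)\,\mathrm{d}t \;+\; \varepsilon\,\sigma(X_t)\,\mathrm{d}B^{H}_t, \qquad t\in[0,T],
\]
with $B^{H}$ a fractional Brownian motion with Hurst index $H$, where the drift parameter $\theta$ is estimated by maximum likelihood from the continuous record $(X_t)_{0\le t\le T}$ and the asymptotics are taken as the dispersion coefficient $\varepsilon$ tends to zero.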
The gait generator, which produces rhythmic signals for coordinating multiple joints, is an essential component of the quadruped robot locomotion control framework. Its biological counterpart is the Central Pattern Generator (CPG), a small neural network consisting of interacting neurons. Inspired by this architecture, researchers have designed artificial neural networks composed of simulated neurons or oscillator equations. Despite the widespread application of these designed CPGs in various robot locomotion controls, some issues remain unaddressed: (1) simplistic network designs often overlook the symmetry between signal and network structure, resulting in fewer gait patterns than those found in nature; (2) owing to minimal architectural consideration, quadruped-control CPGs typically consist of only four neurons, which restricts the network to directly controlling leg phases rather than coordinating joints; (3) gait changes are achieved by varying the neuron couplings or the assignment between neurons and legs, rather than through external stimulation. We apply symmetry theory to design an eight-neuron network, composed of Stein neuronal models, capable of producing five gaits and coordinated control of the hip and knee joints. We validate the signal stability of this network as a gait generator through numerical simulations, which also reveal the patterns that arise during gait transitions induced by neuronal stimulation. Based on these findings, we develop several successful gait transition strategies through neuronal stimulation. Using a commercial quadruped robot model, we demonstrate the usability and feasibility of this network by implementing motion control and gait transitions.
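The following NumPy sketch is a generic coupled phase-oscillator (Kuramoto-type) toy model, not the eight-neuron Stein network designed in the paper; it only illustrates the basic CPG idea that prescribed phase offsets between oscillators assigned to legs or joints encode a gait pattern.
\begin{verbatim}
import numpy as np

def simulate_cpg(phase_offsets, omega=2 * np.pi, k=8.0, dt=1e-3, steps=5000):
    """phase_offsets: desired phase of each oscillator relative to oscillator 0 (rad)."""
    n = len(phase_offsets)
    theta = np.random.uniform(0.0, 2 * np.pi, n)      # random initial phases
    traj = np.empty((steps, n))
    for t in range(steps):
        # each oscillator is pulled toward its prescribed offset from oscillator 0
        coupling = k * np.sin(theta[0] + phase_offsets - theta)
        theta = theta + dt * (omega + coupling)
        traj[t] = np.sin(theta)                        # rhythmic output sent to the legs
    return traj

# e.g. a trot-like pattern: diagonal leg pairs in phase, half a cycle apart
trot = simulate_cpg(phase_offsets=np.array([0.0, np.pi, np.pi, 0.0]))
\end{verbatim}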
We propose a scalable variational Bayes method for statistical inference for a single coordinate, or a low-dimensional subset of coordinates, of a high-dimensional parameter in sparse linear regression. Our approach relies on assigning a mean-field approximation to the nuisance coordinates and carefully modelling the conditional distribution of the target given the nuisance. This requires only a preprocessing step and preserves the computational advantages of mean-field variational Bayes, while ensuring accurate and reliable inference for the target parameter, including for uncertainty quantification. We investigate the numerical performance of our algorithm, showing that it performs competitively with existing methods. We further establish accompanying theoretical guarantees for estimation and uncertainty quantification in the form of a Bernstein--von Mises theorem.
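Schematically, writing $\theta=(\theta_1,\theta_{-1})$ for the target coordinate and the nuisance coordinates, the abstract describes variational families of the form
\[
q(\theta) \;=\; q_1(\theta_1 \mid \theta_{-1})\,\prod_{j\ge 2} q_j(\theta_j),
\]
a mean-field product over the nuisance combined with an explicitly modelled conditional for the target; the specific parametrisation of $q_1$ and the preprocessing step are the paper's, and this display is only a reading of the abstract.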
We describe a probabilistic methodology, based on random walk estimates, to obtain exponential upper bounds for the probability of observing unusually small maximal components in two classical (near-)critical random graph models. More specifically, we analyse the near-critical Erd\H{o}s-R\'enyi model $\mathbb{G}(n,p)$ and the random graph $\mathbb{G}(n,d,p)$ obtained by performing near-critical $p$-bond percolation on a simple random $d$-regular graph and show that, for each one of these models, the probability that the size of a largest component is smaller than $n^{2/3}/A$ is at most of order $\exp(-A^{3/2})$. The exponent $3/2$ is known to be optimal for the near-critical $\mathbb{G}(n,p)$ random graph, whereas for the near-critical $\mathbb{G}(n,d,p)$ model the best known upper bound for the above probability was of order $A^{-3/5}$. As a secondary result we show, by means of an optimized version of the martingale method of Nachmias and Peres, that the above probability of observing an unusually small maximal component is at most of order $\exp(-A^{3/5})$ in two other critical models, namely a random intersection graph and the quantum random graph; these stretched-exponential bounds also improve upon the known (polynomial) bounds available for these two other models.
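In symbols, writing $\mathcal{C}_{\max}$ for a largest connected component, the main estimate for both near-critical models reads
\[
\mathbb{P}\bigl(|\mathcal{C}_{\max}| < n^{2/3}/A\bigr) \;\le\; C\,e^{-c A^{3/2}}
\]
for constants $C,c>0$ and all sufficiently large $A$ (the abstract states the bound up to such constants), while the secondary result gives the same inequality with exponent $3/5$ in place of $3/2$ for the random intersection graph and the quantum random graph.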
The statistical modeling of discrete extremes has received less attention than that of continuous extremes in the Extreme Value Theory (EVT) literature. One approach to the transition from continuous to discrete extremes is the modeling of threshold exceedances of integer random variables by the discrete version of the generalized Pareto distribution. However, the optimal choice of thresholds defining exceedances remains a problematic issue. Moreover, in a regression framework, the treatment of the majority of non-extreme data below the selected threshold is either ignored or separated from the extremes. To tackle these issues, we expand on the concept of employing a smooth transition between the bulk and the upper tail of the distribution. In the case of zero inflation, we also develop models with an additional parameter. To incorporate possible predictors, we relate the parameters to additive smooth predictors via an appropriate link, as in the generalized additive model (GAM) framework. A penalized maximum likelihood estimation procedure is implemented. We illustrate our modeling proposal with a real dataset of avalanche activity in the French Alps. With the advantage of bypassing the threshold selection step, our results indicate that the proposed models are more flexible and robust than competing models such as the negative binomial distribution.
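For reference, the discrete generalized Pareto distribution mentioned above is typically obtained by discretising the continuous generalized Pareto survival function: with scale $\sigma>0$ and shape $\xi$,
\[
\Pr(Y = k) \;=\; \Bigl(1+\tfrac{\xi k}{\sigma}\Bigr)_{+}^{-1/\xi} \;-\; \Bigl(1+\tfrac{\xi (k+1)}{\sigma}\Bigr)_{+}^{-1/\xi}, \qquad k = 0,1,2,\dots
\]
This display follows the standard EVT convention rather than the paper's notation; in the GAM formulation, $\sigma$ and $\xi$ (and the zero-inflation parameter, where present) are linked to the additive smooth predictors.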
We propose a novel and simple spectral method based on the semi-discrete Fourier transforms to discretize the fractional Laplacian $(-\Delta)^\frac{\alpha}{2}$. Numerical analysis and experiments are provided to study its performance. Our method has the same symbol $|\boldsymbol\xi|^\alpha$ as the fractional Laplacian $(-\Delta)^\frac{\alpha}{2}$ at the discrete level, and thus it can be viewed as the exact discrete analogue of the fractional Laplacian. This {\it unique feature} distinguishes our method from other existing methods for the fractional Laplacian. Note that our method is different from the Fourier pseudospectral methods in the literature, which are usually limited to periodic boundary conditions (see Remark \ref{remark0}). Numerical analysis shows that our method achieves spectral accuracy. The stability and convergence of our method for solving the fractional Poisson equation are analyzed. Our scheme yields a multilevel Toeplitz stiffness matrix, and thus fast algorithms can be developed for efficient matrix-vector multiplications. The computational complexity is ${\mathcal O}(2N\log(2N))$, and the memory storage is ${\mathcal O}(N)$ with $N$ the total number of points. Extensive numerical experiments verify our analytical results and demonstrate the effectiveness of our method in solving various problems.
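The ${\mathcal O}(2N\log(2N))$ matrix-vector cost comes from the standard circulant-embedding trick for Toeplitz matrices; the one-dimensional NumPy sketch below (not the paper's code) shows the idea, which extends dimension by dimension to the multilevel Toeplitz stiffness matrix.
\begin{verbatim}
import numpy as np

def toeplitz_matvec(c, r, x):
    """Multiply the N x N Toeplitz matrix with first column c and first row r
    (r[0] == c[0]) by the vector x using FFTs of length 2N."""
    N = len(x)
    # embed the Toeplitz matrix into a 2N x 2N circulant and diagonalise it by the FFT
    circ_col = np.concatenate([c, [0.0], r[:0:-1]])
    eig = np.fft.fft(circ_col)
    y = np.fft.ifft(eig * np.fft.fft(np.concatenate([x, np.zeros(N)])))
    return y[:N].real
\end{verbatim}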