Random circuits giving rise to unitary designs are key tools in quantum information science and many-body physics. In this work, we investigate a class of random quantum circuits with a specific gate structure. Within this framework, we prove that one-dimensional structured random circuits with non-Haar random local gates can exhibit substantially more global randomness compared to Haar random circuits with the same underlying circuit architecture. In particular, we derive all the exact eigenvalues and eigenvectors of the second-moment operators for these structured random circuits under a solvable condition, by establishing a link to the Kitaev chain, and show that their spectral gaps can exceed those of Haar random circuits. Our findings have applications in improving circuit depth bounds for randomized benchmarking and the generation of approximate unitary 2-designs from shallow random circuits.
Gradient descent is one of the most widely used iterative algorithms in modern statistical learning. However, its precise algorithmic dynamics in high-dimensional settings remain only partially understood, which has therefore limited its broader potential for statistical inference applications. This paper provides a precise, non-asymptotic distributional characterization of gradient descent iterates in a broad class of empirical risk minimization problems, in the so-called mean-field regime where the sample size is proportional to the signal dimension. Our non-asymptotic state evolution theory holds for both general non-convex loss functions and non-Gaussian data, and reveals the central role of two Onsager correction matrices that precisely characterize the non-trivial dependence among all gradient descent iterates in the mean-field regime. Although the Onsager correction matrices are typically analytically intractable, our state evolution theory facilitates a generic gradient descent inference algorithm that consistently estimates these matrices across a broad class of models. Leveraging this algorithm, we show that the state evolution can be inverted to construct (i) data-driven estimators for the generalization error of gradient descent iterates and (ii) debiased gradient descent iterates for inference of the unknown signal. Detailed applications to two canonical models--linear regression and (generalized) logistic regression--are worked out to illustrate model-specific features of our general theory and inference methods.
Background: The standard regulatory approach to assess replication success is the two-trials rule, requiring both the original and the replication study to be significant with effect estimates in the same direction. The sceptical p-value was recently presented as an alternative method for the statistical assessment of the replicability of study results. Methods: We compare the statistical properties of the sceptical p-value and the two-trials rule. We illustrate the performance of the different methods using real-world evidence emulations of randomized, controlled trials (RCTs) conducted within the RCT DUPLICATE initiative. Results: The sceptical p-value depends not only on the two p-values, but also on sample size and effect size of the two studies. It can be calibrated to have the same Type-I error rate as the two-trials rule, but has larger power to detect an existing effect. In the application to the results from the RCT DUPLICATE initiative, the sceptical p-value leads to qualitatively similar results than the two-trials rule, but tends to show more evidence for treatment effects compared to the two-trials rule. Conclusion: The sceptical p-value represents a valid statistical measure to assess the replicability of study results and is especially useful in the context of real-world evidence emulations.
Biological neural networks seem qualitatively superior (e.g. in learning, flexibility, robustness) to current artificial like Multi-Layer Perceptron (MLP) or Kolmogorov-Arnold Network (KAN). Simultaneously, in contrast to them: biological have fundamentally multidirectional signal propagation \cite{axon}, also of probability distributions e.g. for uncertainty estimation, and are believed not being able to use standard backpropagation training \cite{backprop}. There are proposed novel artificial neurons based on HCR (Hierarchical Correlation Reconstruction) allowing to remove the above low level differences: with neurons containing local joint distribution model (of its connections), representing joint density on normalized variables as just linear combination of $(f_\mathbf{j})$ orthonormal polynomials: $\rho(\mathbf{x})=\sum_{\mathbf{j}\in B} a_\mathbf{j} f_\mathbf{j}(\mathbf{x})$ for $\mathbf{x} \in [0,1]^d$ and $B\subset \mathbb{N}^d$ some chosen basis. By various index summations of such $(a_\mathbf{j})_{\mathbf{j}\in B}$ tensor as neuron parameters, we get simple formulas for e.g. conditional expected values for propagation in any direction, like $E[x|y,z]$, $E[y|x]$, which degenerate to KAN-like parametrization if restricting to pairwise dependencies. Such HCR network can also propagate probability distributions (also joint) like $\rho(y,z|x)$. It also allows for additional training approaches, like direct $(a_\mathbf{j})$ estimation, through tensor decomposition, or more biologically plausible information bottleneck training: layers directly influencing only neighbors, optimizing content to maximize information about the next layer, and minimizing about the previous to remove noise, extract crucial information.
Generative adversarial networks (GANs) can synthesize high-quality (HQ) images, and GAN inversion is a technique that discovers how to invert given images back to latent space. While existing methods perform on StyleGAN inversion, they have limited performance and are not generalized to different GANs. To address these issues, we proposed a self-supervised method to pre-train and fine-tune GAN encoders. First, we designed an adaptive block to fit different encoder architectures for inverting diverse GANs. Then we pre-train GAN encoders using synthesized images and emphasize local regions through cropping images. Finally, we fine-tune the pre-trained GAN encoder for inverting real images. Compared with state-of-the-art methods, our method achieved better results that reconstructed high-quality images on mainstream GANs. Our code and pre-trained models are available at: //github.com/disanda/Deep-GAN-Encoders.
The need for statistical models of orientations arises in many applications in engineering and computer science. Orientational data appear as sets of angles, unit vectors, rotation matrices or quaternions. In the field of directional statistics, a lot of advances have been made in modelling such types of data. However, only a few of these tools are used in engineering and computer science applications. Hence, this paper aims to serve as a cheat sheet for those probability distributions of orientations. Models for 1-DOF, 2-DOF and 3-DOF orientations are discussed. For each of them, expressions for the density function, fitting to data, and sampling are presented. The paper is written with a compromise between engineering and statistics in terms of notation and terminology. A Python library with functions for some of these models is provided. Using this library, two examples of applications to real data are presented.
This manuscript describes the notions of blocker and interdiction applied to well-known optimization problems. The main interest of these two concepts is the capability to analyze the existence of a combinatorial structure after some modifications. We focus on graph modification, like removing vertices or links in a network. In the interdiction version, we have a budget for modification to reduce as much as possible the size of a given combinatorial structure. Whereas, for the blocker version, we minimize the number of modifications such that the network does not contain a given combinatorial structure. Blocker and interdiction problems have some similarities and can be applied to well-known optimization problems. We consider matching, connectivity, shortest path, max flow, and clique problems. For these problems, we analyze either the blocker version or the interdiction one. Applying the concept of blocker or interdiction to well-known optimization problems can change their complexities. Some optimization problems become harder when one of these two notions is applied. For this reason, we propose some complexity analysis to show when an optimization problem, or the associated decision problem, becomes harder. Another fundamental aspect developed in the manuscript is the use of exact methods to tackle these optimization problems. The main way to solve these problems is to use integer linear programming to model them. An interesting aspect of integer linear programming is the possibility to analyze theoretically the strength of these models, using cutting planes. For most of the problems studied in this manuscript, a polyhedral analysis is performed to prove the strength of inequalities or describe new families of inequalities. The exact algorithms proposed are based on Branch-and-Cut or Branch-and-Price algorithm, where dedicated separation and pricing algorithms are proposed.
Nested integration problems arise in various scientific and engineering applications, including Bayesian experimental design, financial risk assessment, and uncertainty quantification. These nested integrals take the form $\int f\left(\int g(\bs{y},\bs{x})\di{}\bs{x}\right)\di{}\bs{y}$, for nonlinear $f$, making them computationally challenging, particularly in high-dimensional settings. Although widely used for single integrals, traditional Monte Carlo (MC) methods can be inefficient when encountering complexities of nested integration. This work introduces a novel multilevel estimator, combining deterministic and randomized quasi-MC (rQMC) methods to handle nested integration problems efficiently. In this context, the inner number of samples and the discretization accuracy of the inner integrand evaluation constitute the level. We provide a comprehensive theoretical analysis of the estimator, deriving error bounds demonstrating significant reductions in bias and variance compared with standard methods. The proposed estimator is particularly effective in scenarios where the integrand is evaluated approximately, as it adapts to different levels of resolution without compromising precision. We verify the performance of our method via numerical experiments, focusing on estimating the expected information gain of experiments. We further introduce a truncation scheme to address the eventual unboundedness of the experimental noise. When applied to Gaussian noise in the estimator, this truncation scheme renders the same computational complexity as in the bounded noise case up to multiplicative logarithmic terms. The results reveal that the proposed multilevel rQMC estimator outperforms existing MC and rQMC approaches, offering a substantial reduction in computational costs and offering a powerful tool for practitioners dealing with complex, nested integration problems across various domains.
Deep neural networks (DNNs) achieve state-of-the-art performance on many tasks, but this often requires increasingly larger model sizes, which in turn leads to more complex internal representations. Explainability techniques (XAI) have made remarkable progress in the interpretability of ML models. However, the non-relational nature of Graph neural networks (GNNs) make it difficult to reuse already existing XAI methods. While other works have focused on instance-based explanation methods for GNNs, very few have investigated model-based methods and, to our knowledge, none have tried to probe the embedding of the GNNs for well-known structural graph properties. In this paper we present a model agnostic explainability pipeline for GNNs employing diagnostic classifiers. This pipeline aims to probe and interpret the learned representations in GNNs across various architectures and datasets, refining our understanding and trust in these models.
Brains perform timely reliable decision-making by Bayes theorem. Bayes theorem quantifies events as probabilities and, through probability rules, renders the decisions. Learning from this, applying Bayes theorem in practical problems can visualize the potential risks and decision confidence, thereby enabling efficient user-scene interactions. However, given the probabilistic nature, implementing Bayes theorem with the conventional deterministic computing can inevitably induce excessive computational cost and decision latency. Herein, we propose a probabilistic computing approach using memristors to implement Bayes theorem. We integrate volatile memristors with Boolean logics and, by exploiting the volatile stochastic switching of the memristors, realize Boolean operations with statistical probabilities and correlations, key for enabling Bayes theorem. To practically demonstrate the effectiveness of our memristor-enabled Bayes theorem approach in user-scene interactions, we design lightweight Bayesian inference and fusion operators using our probabilistic logics and apply the operators in road scene parsing for self-driving, including route planning and obstacle detection. The results show that our operators can achieve reliable decisions at a rate over 2,500 frames per second, outperforming human decision-making and the existing driving assistance systems.
In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints such as on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communications and synchronisation, and is difficult to scale. In this paper we present four algorithms to solve these problems. The combination of these algorithms enable each agent to improve their task allocation strategy through reinforcement learning, while changing how much they explore the system in response to how optimal they believe their current strategy is, given their past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, limiting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, to then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to 6.7% of the theoretical optimal within the system configurations considered. It provides 5x better performance recovery over no-knowledge retention approaches when system connectivity is impacted, and is tested against systems up to 100 agents with less than a 9% impact on the algorithms' performance.