Popular artificial neural networks (ANNs) optimize parameters for unidirectional value propagation, assuming some arbitrary parametrization type such as the Multi-Layer Perceptron (MLP) or Kolmogorov-Arnold Network (KAN). In contrast, for biological neurons, "it is not uncommon for axonal propagation of action potentials to happen in both directions"~\cite{axon}, suggesting they are optimized to continuously operate in a multidirectional way. Additionally, the statistical dependencies a single neuron could model are not just (expected) value dependencies, but entire joint distributions, including higher moments. Such a more agnostic, joint-distribution neuron would allow for multidirectional propagation (of distributions or values), e.g. of $\rho(x|y,z)$ or $\rho(y,z|x)$, by substituting into $\rho(x,y,z)$ and normalizing. We discuss Hierarchical Correlation Reconstruction (HCR) as such a neuron model: assuming a parametrization of the joint distribution of the type $\rho(x,y,z)=\sum_{ijk} a_{ijk} f_i(x) f_j(y) f_k(z)$ in a polynomial basis $f_i$, which allows for flexible, inexpensive processing including nonlinearities, direct model estimation and updating, and training through standard backpropagation or through novel methods suited to this structure, up to tensor decomposition or an information bottleneck approach. Using only pairwise (input-output) dependencies, its expected-value prediction becomes KAN-like, with trained activation functions as polynomials; it can be extended by adding higher-order dependencies through the included products, in a conscious, interpretable way, allowing for multidirectional propagation of both values and probability densities.
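As a concrete illustration, here is a minimal sketch of direct HCR model estimation and density evaluation, assuming data normalized to $[0,1]$ and an orthonormal (rescaled Legendre) polynomial basis $f_i$; the basis choice and truncation order are illustrative assumptions, not fixed by the abstract:

```python
import numpy as np

# Orthonormal polynomial basis on [0, 1] (rescaled Legendre), a common HCR choice.
def f(i, x):
    if i == 0: return np.ones_like(x)
    if i == 1: return np.sqrt(3.0) * (2.0 * x - 1.0)
    if i == 2: return np.sqrt(5.0) * (6.0 * x**2 - 6.0 * x + 1.0)
    raise ValueError("basis order not implemented in this sketch")

def estimate_hcr(X, Y, Z, order=3):
    """Direct estimation: a_ijk = mean_n f_i(x_n) f_j(y_n) f_k(z_n)."""
    a = np.empty((order, order, order))
    for i in range(order):
        for j in range(order):
            for k in range(order):
                a[i, j, k] = np.mean(f(i, X) * f(j, Y) * f(k, Z))
    return a

def rho(a, x, y, z):
    """Evaluate the parametrized joint density rho(x, y, z)."""
    order = a.shape[0]
    return sum(a[i, j, k] * f(i, x) * f(j, y) * f(k, z)
               for i in range(order) for j in range(order) for k in range(order))
```

Conditional densities such as $\rho(x|y,z)$ then follow by fixing $y,z$ in `rho` and normalizing over $x$, which is the substitution-and-normalization step described above.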
We characterize the convergence properties of traditional best-response (BR) algorithms for computing solutions to mixed-integer Nash equilibrium problems (MI-NEPs) that turn into a class of monotone Nash equilibrium problems (NEPs) once the integer restrictions are relaxed. We show that the sequence produced by a Jacobi/Gauss-Seidel BR method always approaches a bounded region containing the entire solution set of the MI-NEP, whose tightness depends on the problem data and is related to the degree of strong monotonicity of the relaxed NEP. When the underlying algorithm is applied to the relaxed NEP, we establish data-dependent complexity results characterizing its convergence to the unique solution of the NEP. In addition, we derive one of the very few sufficient conditions for the existence of solutions to MI-NEPs. The theoretical results developed bring important practical benefits, illustrated on a numerical instance of a smart building control application.
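For reference, a minimal sketch of the Gauss-Seidel best-response iteration for an $N$-player game; the `best_response` oracle, scalar strategies, and the stopping rule are illustrative assumptions rather than details from the paper:

```python
def gauss_seidel_br(best_response, x0, max_iter=100, tol=1e-8):
    """Cycle through players, letting each best-respond to the others' strategies.

    best_response(i, x): player i's best response given the full profile x
    (scalar strategies assumed here for simplicity).
    """
    x = list(x0)
    for _ in range(max_iter):
        shift = 0.0
        for i in range(len(x)):
            x_new = best_response(i, x)  # sees already-updated strategies of players < i
            shift = max(shift, abs(x_new - x[i]))
            x[i] = x_new
        if shift < tol:  # approximate fixed point of the BR map
            break
    return x
```

The Jacobi variant differs only in that all players respond to the profile from the previous sweep rather than to already-updated strategies.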
When applying multivariate extreme value statistics to analyze tail risk in compound events defined by a multivariate random vector, one often assumes that all dimensions share the same extreme value index. While such an assumption can be tested using a Wald-type test, the performance of such a test deteriorates as the dimensionality increases. This paper introduces a novel test for the equality of extreme value indices in a high-dimensional setting. We establish the asymptotic behavior of the test statistic and conduct simulation studies to evaluate its finite-sample performance. The proposed test significantly outperforms existing methods in high-dimensional settings. We apply this test to examine two datasets previously assumed to have identical extreme value indices across all dimensions.
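For context, a minimal sketch of the per-dimension Hill estimator that equality tests of this kind are typically built on; the estimator itself is standard, but its role here is our assumption, since the abstract does not specify the test statistic:

```python
import numpy as np

def hill_estimator(x, k):
    """Hill estimator of the extreme value index from the k largest observations."""
    xs = np.sort(x)[::-1]  # descending order statistics
    return np.mean(np.log(xs[:k])) - np.log(xs[k])

def hill_per_dimension(X, k):
    """One index estimate per column of an (n, p) data matrix; equality of
    these p estimates is the hypothesis under test."""
    return np.array([hill_estimator(X[:, j], k) for j in range(X.shape[1])])
```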
We propose and implement a protocol for a scalable, cost-effective, information-theoretically secure key distribution and management system. The system, called Distributed Symmetric Key Establishment (DSKE), relies on pre-shared random numbers between DSKE clients and a group of Security Hubs. Any group of DSKE clients can use the DSKE protocol to distill a secret key from the pre-shared numbers. The clients are protected from Security Hub compromise via a secret sharing scheme that allows the creation of the final key without the need to trust individual Security Hubs. Precisely, if the number of compromised Security Hubs does not exceed a certain threshold, both confidentiality and robustness against denial-of-service (DoS) attacks are guaranteed to DSKE clients. The DSKE system can be used for quantum-secure communication, can be easily integrated into existing network infrastructures, and can support arbitrary groups of communication parties that have access to a key. We discuss the high-level protocol and analyze its security, including its robustness against disruption. A proof-of-principle demonstration of secure communication between two distant clients with a DSKE-based VPN using Security Hubs on Amazon Web Services (AWS) nodes thousands of kilometres away from them was performed, demonstrating the feasibility of DSKE-enabled secret-sharing one-time-pad encryption with a data rate above 50 Mbit/s and a latency below 70 ms.
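To illustrate the kind of threshold secret sharing such a protocol relies on, here is a minimal sketch of Shamir's $(t, n)$ scheme over a prime field; the field size and interface are illustrative, and DSKE's concrete scheme may differ:

```python
import random

P = 2**127 - 1  # a Mersenne prime; illustrative field size

def split(secret, n, t):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, e, P) for e, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def combine(shares):
    """Lagrange interpolation at 0 recovers the secret from any t shares."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = num * (-xm) % P
                den = den * (xj - xm) % P
        secret = (secret + yj * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return secret
```

Fewer than $t$ colluding hubs learn nothing about the key, while any $t$ honest, responsive hubs suffice for reconstruction, which mirrors the confidentiality and DoS-robustness threshold described above.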
We augment a thermodynamically consistent diffuse interface model for the description of line tension phenomena by multiplicative stochastic noise to capture the effects of thermal fluctuations, and establish the existence of pathwise unique (stochastically) strong solutions. By starting from a fully discrete linear finite element scheme, we not only prove the well-posedness of the model, but also provide a practicable and convergent scheme for its numerical treatment. Conceptually, our discrete scheme relies on a recently developed augmentation of the scalar auxiliary variable approach, which reduces the requirements on the time regularity of the solution. By showing that fully discrete solutions to this scheme satisfy an energy estimate, we obtain first uniform regularity results. Establishing Nikolskii estimates with respect to time, we show convergence towards pathwise unique martingale solutions by applying Jakubowski's generalization of Skorokhod's theorem. Finally, a generalization of the Gy\"ongy--Krylov characterization of convergence in probability provides convergence towards strong solutions and thereby completes the proof.
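For orientation, the textbook scalar auxiliary variable (SAV) reformulation reads as follows; the concrete energy splitting and the recent augmentation the scheme relies on are not specified in the abstract, so this is only the standard form:
\begin{align*}
  r(t) &= \sqrt{E_1(\phi(t)) + C}, &
  \frac{\mathrm{d}r}{\mathrm{d}t}
    &= \frac{1}{2\sqrt{E_1(\phi) + C}} \int_\Omega F'(\phi)\, \partial_t \phi \,\mathrm{d}x,
\end{align*}
with $C > 0$ ensuring the square root is well defined. The nonlinearity $F'(\phi)$ then enters the fully discrete scheme only through the auxiliary scalar $r$, which linearizes the time stepping and lowers the time-regularity requirements mentioned above.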
The estimation of functional networks through functional covariance and graphical models has recently attracted increasing attention in settings with high-dimensional functional data, where the number of functional variables p is comparable to, and possibly larger than, the number of subjects. In this paper, we first reframe functional covariance model estimation as a tuning-free problem of simultaneously testing p(p-1)/2 hypotheses for cross-covariance functions. Our procedure begins by constructing a Hilbert-Schmidt-norm-based test statistic for each pair, and employs normal quantile transformations for all test statistics, upon which a multiple testing step is proposed. We then study this multiple testing procedure under a general error-contamination framework and establish that it controls false discoveries asymptotically. Additionally, we demonstrate that our proposed methods for two concrete examples, the functional covariance model with partial observations and, importantly, the more challenging functional graphical model, can be seamlessly integrated into the general error-contamination framework and, under verifiable conditions, achieve theoretical guarantees on effective false discovery control. Finally, we showcase the superiority of our proposals through extensive simulations and functional connectivity analysis of two neuroimaging datasets.
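A minimal sketch of the generic pipeline the abstract describes: one statistic per pair, a normal quantile transformation, then a multiple testing step. Benjamini-Hochberg is used here purely for concreteness; the paper's actual procedure, and the pairwise Hilbert-Schmidt statistics fed into it, are assumptions of this sketch:

```python
import numpy as np
from scipy.stats import norm, rankdata

def normal_quantile_transform(stats):
    """Map the p(p-1)/2 pairwise statistics to standard normal scores via ranks."""
    n = len(stats)
    return norm.ppf(rankdata(stats) / (n + 1))

def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean mask of rejected hypotheses at FDR level alpha (step-up rule)."""
    m = len(pvals)
    order = np.argsort(pvals)
    below = pvals[order] <= alpha * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected

# e.g., one-sided p-values from the transformed scores: 1 - norm.cdf(z_scores)
```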
We propose a hybridized domain decomposition formulation of the discrete fracture network model, allowing for independent discretization of the individual fractures. A natural-norm stabilization, obtained by penalizing the residual measured in the norm of the space where it naturally lives, is added to the local problem on each individual fracture, so that no compatibility condition of inf-sup type is required between the Lagrange multiplier and the primal unknown, which can then be discretized independently of each other. Optimal stability and error estimates are proven and confirmed by numerical tests.
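Schematically, the stabilized local problem has a saddle-point structure of the following form; the bilinear forms, the norm realized by $s(\cdot,\cdot)$, and the parameter $\delta$ are illustrative placeholders, since the abstract does not spell them out:
\begin{align*}
  a(u_h, v_h) + b(v_h, \lambda_h) &= F(v_h) && \text{for all } v_h,\\
  b(u_h, \mu_h) - \delta\, s(\lambda_h, \mu_h) &= G(\mu_h) && \text{for all } \mu_h,
\end{align*}
where the stabilization term $s(\cdot,\cdot)$ penalizes the residual in its natural norm, so that discrete stability no longer hinges on an inf-sup condition linking the spaces chosen for $u_h$ and $\lambda_h$.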
Uncertainty reduction is vital for improving system reliability and reducing risks. To identify the best target for uncertainty reduction, uncertainty importance measures are commonly used to prioritize the significance of input variable uncertainties. Designers then take steps to reduce the uncertainties of variables with high importance. However, for variables with minimal uncertainty, the cost of controlling their uncertainties can be unacceptable. Therefore, uncertainty magnitude should also be considered when developing uncertainty reduction strategies. Although variance-based methods have been developed for this purpose, they depend on statistical moments and have limitations when dealing with the highly skewed distributions commonly encountered in practical applications. Motivated by this problem, we propose a new uncertainty importance measure based on cumulative residual entropy. The proposed measure is moment-independent, being based on the cumulative distribution function, and can therefore handle highly skewed distributions properly. Numerical implementations for estimating the proposed measure are devised and verified. A real-world engineering case involving highly skewed distributions is introduced to show the procedure of developing uncertainty reduction strategies that account for both uncertainty magnitude and the corresponding cost. The results demonstrate that, owing to its moment-independent character, the proposed measure can yield uncertainty reduction recommendations different from those of the variance-based approach.
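For reference, the cumulative residual entropy (CRE) of a nonnegative random variable $X$ with survival function $\bar F$ is $\mathcal{E}(X) = -\int_0^\infty \bar F(x)\log \bar F(x)\,\mathrm{d}x$. A minimal sketch of a plug-in estimate from samples follows; the paper's own numerical implementation may differ:

```python
import numpy as np

def cre(samples):
    """Plug-in cumulative residual entropy: -integral of Fbar * log(Fbar)."""
    x = np.sort(samples)
    n = len(x)
    fbar = 1.0 - np.arange(1, n) / n       # empirical survival between order statistics
    integrand = -fbar * np.log(fbar)       # fbar in (0, 1), so finite; 0*log(0) term omitted
    return np.sum(integrand * np.diff(x))  # piecewise-constant integration
```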
We introduce the neural information field filter, a Bayesian state and parameter estimation method for high-dimensional nonlinear dynamical systems given large measurement datasets. Solving such a problem using traditional methods, such as Kalman and particle filters, is computationally expensive. Information field theory is a Bayesian approach that can efficiently reconstruct dynamical model state paths and calibrate model parameters from noisy measurement data. To apply the method, we parameterize the time evolution of the state path using the span of a finite linear basis. The existing method has to reparameterize the state path by the initial state to satisfy the initial condition. Designing an expressive yet simple linear basis before knowing the true state path is crucial for inference accuracy but challenging. Moreover, reparameterizing the state path using the initial state is easy for a linear basis, but nontrivial for more complex and expressive function parameterizations, such as neural networks. The objective of this paper is to simplify and enrich the class of state path parameterizations using neural networks for the information field theory approach. To this end, we propose a generalized physics-informed conditional prior using an auxiliary initial state, and show that the existing reparameterization is a special case. We parameterize the state path using a residual neural network that consists of a linear basis function and a Fourier-encoding fully connected residual function, where the residual function aims to correct the error of the linear basis function. To sample from the intractable posterior distribution, we develop an optimization algorithm, nested stochastic variational inference, and a sampling algorithm, nested preconditioned stochastic gradient Langevin dynamics. A series of numerical and experimental examples verify and validate the proposed method.
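A minimal sketch of the described state-path parameterization: a linear basis expansion plus a Fourier-encoded, fully connected residual. The layer sizes, encoding frequencies, choice of polynomial basis, and the `mlp` callable are illustrative assumptions:

```python
import numpy as np

def fourier_encode(t, n_freq=8):
    """Encode time t with sine/cosine features at dyadic frequencies."""
    freqs = 2.0 ** np.arange(n_freq) * np.pi
    return np.concatenate([np.sin(freqs * t), np.cos(freqs * t)])

def state_path(t, basis_coeffs, mlp):
    """x(t) = linear basis part + neural residual correcting its error.

    mlp: a small fully connected network taking the Fourier features of t.
    """
    basis = np.array([t**k for k in range(len(basis_coeffs))])  # e.g. monomial basis
    linear_part = basis_coeffs @ basis
    residual = mlp(fourier_encode(t))
    return linear_part + residual
```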
To obtain an optimal first-order convergence guarantee for stochastic optimization, it is necessary to use a recurrent data sampling algorithm that samples every data point with sufficient frequency. Most commonly used data sampling algorithms (e.g., i.i.d., MCMC, random reshuffling) are indeed recurrent under mild assumptions. In this work, we show that for a particular class of stochastic optimization algorithms, no property other than recurrence (e.g., independence, exponential mixing, or reshuffling) is needed in the data sampling algorithm to guarantee the optimal rate of first-order convergence. Namely, using regularized versions of Minimization by Incremental Surrogate Optimization (MISO), we show that for non-convex and possibly non-smooth objective functions, the expected optimality gap converges at an optimal rate $O(n^{-1/2})$ under general recurrent sampling schemes. Furthermore, the implied constant depends explicitly on the `speed of recurrence', measured by the expected time to visit a given data point, either averaged (`target time') or supremized (`hitting time') over the current location. We demonstrate theoretically and empirically that convergence can be accelerated by selecting sampling algorithms that cover the data set most effectively. We discuss applications of our general framework to decentralized optimization and distributed non-negative matrix factorization.
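A minimal sketch of a MISO-style incremental surrogate update under a generic sampling scheme; the quadratic (proximal-gradient) surrogate shown here is one standard choice, and the paper's regularized variants may differ:

```python
import numpy as np

def miso(grad_fi, n_data, sampler, x0, L, n_iter=1000):
    """Minimization by Incremental Surrogate Optimization.

    Each f_i keeps a majorizing quadratic surrogate anchored at the iterate
    where it was last refreshed; minimizing the average surrogate gives
    x = mean_i(anchor_i - grad_i / L).
    """
    x = np.array(x0, dtype=float)
    anchors = np.tile(x, (n_data, 1))
    grads = np.array([grad_fi(i, x) for i in range(n_data)])
    for _ in range(n_iter):
        i = sampler()  # recurrent sampling scheme, e.g. i.i.d., MCMC, reshuffling
        anchors[i], grads[i] = x.copy(), grad_fi(i, x)  # refresh surrogate i at x
        x = np.mean(anchors - grads / L, axis=0)        # minimize average surrogate
    return x
```

The `speed of recurrence' enters through how quickly `sampler` revisits each index i, which controls how stale the surrogates can become.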
Bayesian neural networks (BNNs) promise to combine the predictive performance of neural networks with principled uncertainty modeling, which is important for safety-critical systems and decision making. However, posterior uncertainty estimates depend on the choice of prior, and finding informative priors in weight space has proven difficult. This has motivated variational inference (VI) methods that pose priors directly on the function generated by the BNN rather than on its weights. In this paper, we address a fundamental issue with such function-space VI approaches pointed out by Burt et al. (2020), who showed that the objective function (ELBO) is negative infinity for most priors of interest. Our solution builds on generalized VI (Knoblauch et al., 2019) with the regularized KL divergence (Quang, 2019) and is, to the best of our knowledge, the first well-defined variational objective for function-space inference in BNNs with Gaussian process (GP) priors. Experiments show that our method incorporates the properties specified by the GP prior on synthetic and small real-world datasets, and provides competitive uncertainty estimates for regression, classification, and out-of-distribution detection compared to BNN baselines with both function- and weight-space priors.
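To convey the idea in finite dimensions: regularizing the KL divergence between two Gaussians by adding $\gamma I$ to both covariances keeps it finite even when a covariance is degenerate. This finite-dimensional sketch is our illustration of the concept; Quang (2019) defines the regularized KL for Gaussian measures on infinite-dimensional spaces, where the unregularized KL can be infinite:

```python
import numpy as np

def regularized_gaussian_kl(m1, C1, m2, C2, gamma=1e-3):
    """KL(N(m1, C1 + gamma I) || N(m2, C2 + gamma I)); gamma > 0 keeps both
    covariances invertible, so the divergence is always finite."""
    d = len(m1)
    A = C1 + gamma * np.eye(d)
    B = C2 + gamma * np.eye(d)
    B_inv = np.linalg.inv(B)
    diff = m2 - m1
    _, logdet_A = np.linalg.slogdet(A)
    _, logdet_B = np.linalg.slogdet(B)
    return 0.5 * (np.trace(B_inv @ A) + diff @ B_inv @ diff
                  - d + logdet_B - logdet_A)
```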