Transfer Entropy (TE), the primary method for determining directed information flow within a network system, can exhibit bias - either in deficiency or excess - during both pairwise and conditioned calculations, owing to high-order dependencies among the dynamic processes under consideration and the remaining processes in the system used for conditioning. Here, we propose a novel approach. Instead of conditioning TE on all network processes except the driver and target, as in its fully conditioned version, or not conditioning at all, as in the pairwise approach, our method searches for both the multiplets of variables that maximize information flow and those that minimize it. This provides a decomposition of TE into unique, redundant, and synergistic atoms. Our approach enables the quantification of the relative importance of high-order effects compared to pure two-body effects in information transfer between two processes, while also highlighting the processes that contribute to building these high-order effects alongside the driver. We demonstrate the application of our approach in climatology by analyzing data from El Ni\~{n}o and the Southern Oscillation.
Moment models with suitable closure can lead to accurate and computationally efficient solvers for particle transport. Hence, we propose a new asymptotic preserving scheme for the M1 model of linear transport that works uniformly for any Knudsen number. Our idea is to apply the M1 closure at the numerical level to an existing asymptotic preserving scheme for the corresponding kinetic equation, namely the Unified Gas Kinetic scheme (UGKS) originally proposed in [27] and extended to linear transport in [24]. In order to ensure the moments realizability in this new scheme, the UGKS positivity needs to be maintained. We propose a new density reconstruction in time to obtain this property. A second order extension is also suggested and validated. Several test cases show the performances of this new scheme.
We present a physics-informed neural network (PINN) approach for the discovery of slow invariant manifolds (SIMs), for the most general class of fast/slow dynamical systems of ODEs. In contrast to other machine learning (ML) approaches that construct reduced order black box surrogate models using simple regression, and/or require a priori knowledge of the fast and slow variables, our approach, simultaneously decomposes the vector field into fast and slow components and provides a functional of the underlying SIM in a closed form. The decomposition is achieved by finding a transformation of the state variables to the fast and slow ones, which enables the derivation of an explicit, in terms of fast variables, SIM functional. The latter is obtained by solving a PDE corresponding to the invariance equation within the Geometric Singular Perturbation Theory (GSPT) using a single-layer feedforward neural network with symbolic differentiation. The performance of the proposed physics-informed ML framework is assessed via three benchmark problems: the Michaelis-Menten, the target mediated drug disposition (TMDD) reaction model and a fully competitive substrate-inhibitor(fCSI) mechanism. We also provide a comparison with other GPST methods, namely the quasi steady state approximation (QSSA), the partial equilibrium approximation (PEA) and CSP with one and two iterations. We show that the proposed PINN scheme provides SIM approximations, of equivalent or even higher accuracy, than those provided by QSSA, PEA and CSP, especially close to the boundaries of the underlying SIMs.
Adaptiveness is a key principle in information processing including statistics and machine learning. We investigate the usefulness of adaptive methods in the framework of asymptotic binary hypothesis testing, when each hypothesis represents asymptotically many independent instances of a quantum channel, and the tests are based on using the unknown channel and observing outputs. Unlike the familiar setting of quantum states as hypotheses, there is a fundamental distinction between adaptive and non-adaptive strategies with respect to the channel uses, and we introduce a number of further variants of the discrimination tasks by imposing different restrictions on the test strategies. The following results are obtained: (1) We prove that for classical-quantum channels, adaptive and non-adaptive strategies lead to the same error exponents both in the symmetric (Chernoff) and asymmetric (Hoeffding, Stein) settings. (2) The first separation between adaptive and non-adaptive symmetric hypothesis testing exponents for quantum channels, which we derive from a general lower bound on the error probability for non-adaptive strategies; the concrete example we analyze is a pair of entanglement-breaking channels. (3)We prove, in some sense generalizing the previous statement, that for general channels adaptive strategies restricted to classical feed-forward and product state channel inputs are not superior in the asymptotic limit to non-adaptive product state strategies. (4) As an application of our findings, we address the discrimination power of an arbitrary quantum channel and show that adaptive strategies with classical feedback and no quantum memory at the input do not increase the discrimination power of the channel beyond non-adaptive tensor product input strategies.
Image information is restricted by the dynamic range of the detector, which can be addressed using multi-exposure image fusion (MEF). The conventional MEF approach employed in ptychography is often inadequate under conditions of low signal-to-noise ratio (SNR) or variations in illumination intensity. To address this, we developed a Bayesian approach for MEF based on a modified Poisson noise model that considers the background and saturation. Our method outperforms conventional MEF under challenging experimental conditions, as demonstrated by the synthetic and experimental data. Furthermore, this method is versatile and applicable to any imaging scheme requiring high dynamic range (HDR).
When complex Bayesian models exhibit implausible behaviour, one solution is to assemble available information into an informative prior. Challenges arise as prior information is often only available for the observable quantity, or some model-derived marginal quantity, rather than directly pertaining to the natural parameters in our model. We propose a method for translating available prior information, in the form of an elicited distribution for the observable or model-derived marginal quantity, into an informative joint prior. Our approach proceeds given a parametric class of prior distributions with as yet undetermined hyperparameters, and minimises the difference between the supplied elicited distribution and corresponding prior predictive distribution. We employ a global, multi-stage Bayesian optimisation procedure to locate optimal values for the hyperparameters. Three examples illustrate our approach: a cure-fraction survival model, where censoring implies that the observable quantity is a priori a mixed discrete/continuous quantity; a setting in which prior information pertains to $R^{2}$ -- a model-derived quantity; and a nonlinear regression model.
As language models (LMs) become more capable, it is increasingly important to align them with human preferences. However, the dominant paradigm for training Preference Models (PMs) for that purpose suffers from fundamental limitations, such as lack of transparency and scalability, along with susceptibility to overfitting the preference dataset. We propose Compositional Preference Models (CPMs), a novel PM framework that decomposes one global preference assessment into several interpretable features, obtains scalar scores for these features from a prompted LM, and aggregates these scores using a logistic regression classifier. Through these simple steps, CPMs allow to control which properties of the preference data are used to train the preference model and to build it based on features that are believed to underlie the human preference judgment. Our experiments show that CPMs not only improve generalization and are more robust to overoptimization than standard PMs, but also that best-of-n samples obtained using CPMs tend to be preferred over samples obtained using conventional PMs. Overall, our approach demonstrates the benefits of endowing PMs with priors about which features determine human preferences while relying on LM capabilities to extract those features in a scalable and robust way.
The ParaDiag family of algorithms solves differential equations by using preconditioners that can be inverted in parallel through diagonalization. In the context of optimal control of linear parabolic PDEs, the state-of-the-art ParaDiag method is limited to solving self-adjoint problems with a tracking objective. We propose three improvements to the ParaDiag method: the use of alpha-circulant matrices to construct an alternative preconditioner, a generalization of the algorithm for solving non-self-adjoint equations, and the formulation of an algorithm for terminal-cost objectives. We present novel analytic results about the eigenvalues of the preconditioned systems for all discussed ParaDiag algorithms in the case of self-adjoint equations, which proves the favorable properties the alpha-circulant preconditioner. We use these results to perform a theoretical parallel-scaling analysis of ParaDiag for self-adjoint problems. Numerical tests confirm our findings and suggest that the self-adjoint behavior, which is backed by theory, generalizes to the non-self-adjoint case. We provide a sequential, open-source reference solver in Matlab for all discussed algorithms.
In many communication contexts, the capabilities of the involved actors cannot be known beforehand, whether it is a cell, a plant, an insect, or even a life form unknown to Earth. Regardless of the recipient, the message space and time scale could be too fast, too slow, too large, or too small and may never be decoded. Therefore, it pays to devise a way to encode messages agnostic of space and time scales. We propose the use of fractal functions as self-executable infinite-frequency carriers for sending messages, given their properties of structural self-similarity and scale invariance. We call it `fractal messaging'. Starting from a spatial embedding, we introduce a framework for a space-time scale-free messaging approach to this challenge. When considering a space and time-agnostic framework for message transmission, it would be interesting to encode a message such that it could be decoded at several spatio-temporal scales. Hence, the core idea of the framework proposed herein is to encode a binary message as waves along infinitely many frequencies (in power-like distributions) and amplitudes, transmit such a message, and then decode and reproduce it. To do so, the components of the Weierstrass function, a known fractal, are used as carriers of the message. Each component will have its amplitude modulated to embed the binary stream, allowing for a space-time-agnostic approach to messaging.
The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI generated explanations. The lack of theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of explainee inference conditioned on explanation. Our theory posits that absent explanation humans expect the AI to make similar decisions to themselves, and that they interpret an explanation by comparison to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency map explanations demonstrate that our theory quantitatively matches participants' predictions of the AI.
Hashing has been widely used in approximate nearest search for large-scale database retrieval for its computation and storage efficiency. Deep hashing, which devises convolutional neural network architecture to exploit and extract the semantic information or feature of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated and I conclude three main different directions for deep supervised hashing methods. Several comments are made at the end. Moreover, to break through the bottleneck of the existing hashing methods, I propose a Shadow Recurrent Hashing(SRH) method as a try. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function to encourage similar images projected close. To this end, I propose a concept: shadow of the CNN output. During optimization process, the CNN output and its shadow are guiding each other so as to achieve the optimal solution as much as possible. Several experiments on dataset CIFAR-10 show the satisfying performance of SRH.