A random walk-based method is proposed to efficiently compute the solution of a large class of fractional in time linear systems of differential equations (linear F-ODE systems), along with the derivatives with respect to the system parameters. Such a method is unbiased and unconditionally stable, and can therefore be used to provide an unbiased estimation of individual entries of the solution, or the full solution. By using stochastic differentiation techniques, it can be used as well to provide unbiased estimators of the sensitivities of the solution with respect to the problem parameters without any additional computational cost. The time complexity of the algorithm is discussed here, along with suitable variance bounds, which prove in practice the convergence of the algorithm. Finally, several test cases were run to assess the validity of the algorithm.
The Kennedy and O'Hagan (KOH) calibration framework uses coupled Gaussian processes (GPs) to meta-model an expensive simulator (first GP), tune its ``knobs" (calibration inputs) to best match observations from a real physical/field experiment and correct for any modeling bias (second GP) when predicting under new field conditions (design inputs). There are well-established methods for placement of design inputs for data-efficient planning of a simulation campaign in isolation, i.e., without field data: space-filling, or via criterion like minimum integrated mean-squared prediction error (IMSPE). Analogues within the coupled GP KOH framework are mostly absent from the literature. Here we derive a closed form IMSPE criterion for sequentially acquiring new simulator data for KOH. We illustrate how acquisitions space-fill in design space, but concentrate in calibration space. Closed form IMSPE precipitates a closed-form gradient for efficient numerical optimization. We demonstrate that our KOH-IMSPE strategy leads to a more efficient simulation campaign on benchmark problems, and conclude with a showcase on an application to equilibrium concentrations of rare earth elements for a liquid-liquid extraction reaction.
The parametrization of wireless channels by so-called "beyond-diagonal reconfigurable intelligent surfaces" (BD-RIS) is mathematically characterized by a matrix whose off-diagonal entries are partially or fully populated. Physically, this corresponds to tunable coupling mechanisms between the RIS elements that originate from the RIS control circuit. Here, we derive a physics-compliant diagonal representation for BD-RIS-parametrized channels. Recognizing that the RIS control circuit, irrespective of its detailed architecture, can always be represented as a multi-port network with auxiliary ports terminated by tunable individual loads, we physics-compliantly express the BD-RIS-parametrized channel as a multi-port chain cascade of i) radio environment, ii) static parts of the control circuit, and iii) individually tunable loads. Thus, the cascade of the former two systems is terminated by a system that is mathematically always characterized by a diagonal matrix. This physics-compliant diagonal representation implies that existing algorithms for channel estimation and optimization for conventional ("diagonal") RIS can be readily applied to BD-RIS scenarios. We demonstrate this in an experimentally grounded case study. Importantly, we highlight that, operationally, an ambiguous characterization of the cascade of radio environment and the static parts of the control circuit is required, but not the breakdown into the characteristics of its two constituent systems nor the lifting of the ambiguities. Nonetheless, we demonstrate how to derive or estimate the characteristics of the static parts of the control circuit for pedagogical purposes. The diagonal representation of BD-RIS-parametrized channels also enables their treatment with coupled-dipole-based models. We furthermore derive the assumptions under which the physics-compliant BD-RIS model simplifies to the widespread linear cascaded model.
We propose a numerical algorithm for the computation of multi-marginal optimal transport (MMOT) problems involving general probability measures that are not necessarily discrete. By developing a relaxation scheme in which marginal constraints are replaced by finitely many linear constraints and by proving a specifically tailored duality result for this setting, we approximate the MMOT problem by a linear semi-infinite optimization problem. Moreover, we are able to recover a feasible and approximately optimal solution of the MMOT problem, and its sub-optimality can be controlled to be arbitrarily close to 0 under mild conditions. The developed relaxation scheme leads to a numerical algorithm which can compute a feasible approximate optimizer of the MMOT problem whose theoretical sub-optimality can be chosen to be arbitrarily small. Besides the approximate optimizer, the algorithm is also able to compute both an upper bound and a lower bound for the optimal value of the MMOT problem. The difference between the computed bounds provides an explicit sub-optimality bound for the computed approximate optimizer. We demonstrate the proposed algorithm in three numerical experiments involving an MMOT problem that stems from fluid dynamics, the Wasserstein barycenter problem, and a large-scale MMOT problem with 100 marginals. We observe that our algorithm is capable of computing high-quality solutions of these MMOT problems and the computed sub-optimality bounds are much less conservative than their theoretical upper bounds in all the experiments.
We propose a CPU-GPU heterogeneous computing method for solving time-evolution partial differential equation problems many times with guaranteed accuracy, in short time-to-solution and low energy-to-solution. On a single-GH200 node, the proposed method improved the computation speed by 86.4 and 8.67 times compared to the conventional method run only on CPU and only on GPU, respectively. Furthermore, the energy-to-solution was reduced by 32.2-fold (from 9944 J to 309 J) and 7.01-fold (from 2163 J to 309 J) when compared to using only the CPU and GPU, respectively. Using the proposed method on the Alps supercomputer, a 51.6-fold and 6.98-fold speedup was attained when compared to using only the CPU and GPU, respectively, and a high weak scaling efficiency of 94.3% was obtained up to 1,920 compute nodes. These implementations were realized using directive-based parallel programming models while enabling portability, indicating that directives are highly effective in analyses in heterogeneous computing environments.
We formulate a uniform tail bound for empirical processes indexed by a class of functions, in terms of the individual deviations of the functions rather than the worst-case deviation in the considered class. The tail bound is established by introducing an initial ``deflation'' step to the standard generic chaining argument. The resulting tail bound is the sum of the complexity of the ``deflated function class'' in terms of a generalization of Talagrand's $\gamma$ functional, and the deviation of the function instance, both of which are formulated based on the natural seminorm induced by the corresponding Cram\'{e}r functions. Leveraging another less demanding natural seminorm, we also show similar bounds, though with implicit dependence on the sample size, in the more general case where finite exponential moments cannot be assumed. We also provide approximations of the tail bounds in terms of the more prevalent Orlicz norms or their ``incomplete'' versions under suitable moment conditions.
Neural simulation-based inference (SBI) describes an emerging family of methods for Bayesian inference with intractable likelihood functions that use neural networks as surrogate models. Here we introduce sbijax, a Python package that implements a wide variety of state-of-the-art methods in neural simulation-based inference using a user-friendly programming interface. sbijax offers high-level functionality to quickly construct SBI estimators, and compute and visualize posterior distributions with only a few lines of code. In addition, the package provides functionality for conventional approximate Bayesian computation, to compute model diagnostics, and to automatically estimate summary statistics. By virtue of being entirely written in JAX, sbijax is extremely computationally efficient, allowing rapid training of neural networks and executing code automatically in parallel on both CPU and GPU.
Quantum learning models hold the potential to bring computational advantages over the classical realm. As powerful quantum servers become available on the cloud, ensuring the protection of clients' private data becomes crucial. By incorporating quantum homomorphic encryption schemes, we present a general framework that enables quantum delegated and federated learning with a computation-theoretical data privacy guarantee. We show that learning and inference under this framework feature substantially lower communication complexity compared with schemes based on blind quantum computing. In addition, in the proposed quantum federated learning scenario, there is less computational burden on local quantum devices from the client side, since the server can operate on encrypted quantum data without extracting any information. We further prove that certain quantum speedups in supervised learning carry over to private delegated learning scenarios employing quantum kernel methods. Our results provide a valuable guide toward privacy-guaranteed quantum learning on the cloud, which may benefit future studies and security-related applications.
The consensus problem in distributed computing involves a network of agents aiming to compute the average of their initial vectors through local communication, represented by an undirected graph. This paper focuses on the studying of this problem using an average-case analysis approach, particularly over regular graphs. Traditional algorithms for solving the consensus problem often rely on worst-case performance evaluation scenarios, which may not reflect typical performance in real-world applications. Instead, we apply average-case analysis, focusing on the expected spectral distribution of eigenvalues to obtain a more realistic view of performance. Key contributions include deriving the optimal method for consensus on regular graphs, showing its relation to the Heavy Ball method, analyzing its asymptotic convergence rate, and comparing it to various first-order methods through numerical experiments.
We solve constrained optimal transport problems in which the marginal laws are given by the laws of solutions of stochastic differential equations (SDEs). We consider SDEs with irregular coefficients, making only minimal regularity assumptions. We show that the so-called synchronous coupling is optimal among bicausal couplings, that is couplings that respect the flow of information encoded in the stochastic processes. Our results provide a method to numerically compute the adapted Wasserstein distance between laws of SDEs with irregular coefficients. We show that this can be applied to quantifying model uncertainty in stochastic optimisation problems. Moreover, we introduce a transformation-based semi-implicit numerical scheme and establish the first strong convergence result for SDEs with exponentially growing and discontinuous drift.
We present a new Krylov subspace recycling method for solving a linear system of equations, or a sequence of slowly changing linear systems. Our approach is to reduce the computational overhead of recycling techniques while still benefiting from the acceleration afforded by such techniques. As such, this method augments an unprojected Krylov subspace. Furthermore, it combines randomized sketching and deflated restarting in a way that avoids orthogononalizing a full Krylov basis. We call this new method GMRES-SDR (sketched deflated restarting). With this new method, we provide new theory, which initially characterizes unaugmented sketched GMRES as a projection method for which the projectors involve the sketching operator. We demonstrate that sketched GMRES and its sibling method sketched FOM are an MR/OR pairing, just like GMRES and FOM. We furthermore obtain residual convergence estimates. Building on this, we characterize GMRES-SDR also in terms of sketching-based projectors. Compression of the augmented Krylov subspace for recycling is performed using a sketched version of harmonic Ritz vectors. We present results of numerical experiments demonstrating the effectiveness of GMRES-SDR over competitor methods such as GMRES-DR and GCRO-DR.