Powerful hardware services and software libraries are vital tools for quickly and affordably designing, testing, and executing quantum algorithms. A robust large-scale study of how the performance of these platforms scales with the number of qubits is key to providing quantum solutions to challenging industry problems. This work benchmarks the runtime and accuracy for a representative sample of specialized high-performance simulated and physical quantum processing units. Results show the QMware simulator can reduce the runtime for executing a quantum circuit by up to 78% compared to the next fastest option for algorithms with fewer than 27 qubits. The AWS SV1 simulator offers a runtime advantage for larger circuits, up to the maximum 34 qubits available with SV1. Beyond this limit, QMware can execute circuits as large as 40 qubits. Physical quantum devices, such as Rigetti's Aspen-M2, can provide an exponential runtime advantage for circuits with more than 30 qubits. However, the high financial cost of physical quantum processing units presents a serious barrier to practical use. Moreover, only IonQ's Harmony quantum device achieves high fidelity with more than four qubits. This study paves the way to understanding the optimal combination of available software and hardware for executing practical quantum algorithms.
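As a minimal illustration of the kind of runtime benchmark reported here (a sketch only, not the study's actual harness; it assumes Qiskit with the Aer state-vector simulator as a stand-in local backend, whereas QMware, AWS SV1 and the physical QPUs would be accessed through their own SDKs):

    import time
    from qiskit import transpile
    from qiskit.circuit.random import random_circuit
    from qiskit_aer import AerSimulator

    backend = AerSimulator(method="statevector")
    for n_qubits in range(10, 28, 4):
        # Random benchmark circuit of fixed depth, measured on all qubits.
        circ = random_circuit(n_qubits, depth=10, measure=True, seed=42)
        tcirc = transpile(circ, backend)
        start = time.perf_counter()
        backend.run(tcirc, shots=1000).result()
        print(f"{n_qubits} qubits: {time.perf_counter() - start:.2f} s")

Sweeping the circuit width in this way is what exposes the crossover points between simulators and hardware discussed above.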
Stochastic inversion problems typically arise when one wishes to quantify the uncertainty affecting the inputs of computer models. They consist of estimating input distributions from noisy, observable outputs, and such problems are increasingly examined in Bayesian contexts where the targeted inputs are affected by stochastic uncertainties. In this regard, a stochastic input can be qualified as meaningful if it explains most of the output uncertainty. While such inverse problems are characterized by identifiability conditions, "signal-to-noise" constraints that formalize this meaningfulness should be accounted for within the definition of the model, prior to inference. This article investigates the possibility of forcing a solution to be meaningful in the context of parametric uncertainty quantification, using the tools of global sensitivity analysis and information theory (variance, entropy, Fisher information). Such forcings mainly take the form of constraints placed on the input covariance and can be made explicit for linear or linearizable models. Simulated experiments indicate that, when injected into the modeling process, these constraints can limit the influence of measurement or process noise on the estimation of the input distribution, and they suggest future extensions to a fully non-linear framework, for example through the use of linear Gaussian mixtures.
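As an illustration of such a covariance constraint (not the article's exact formulation), consider a linear Gaussian model $Y = HX + \varepsilon$ with input $X \sim \mathcal{N}(\mu, C)$ and independent noise $\varepsilon \sim \mathcal{N}(0, R)$. Requiring the input to explain at least a fraction $1-\delta$ of the output variance amounts to a constraint on the input covariance $C$:
\[
\operatorname{tr}\!\big(H C H^{\top}\big) \;\ge\; (1-\delta)\, \operatorname{tr}\!\big(H C H^{\top} + R\big).
\]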
Multiphysics processes in fractured porous media constitute a research field of importance for several subsurface applications and have received considerable attention over the last decade. The dynamics are characterised by strong couplings between processes as well as interaction between the processes and the structure of the fractured medium itself. The rich range of behaviour calls for explorative mathematical modelling, such as experimentation with constitutive laws and novel coupling concepts between physical processes. Moreover, efficient simulations of the strong couplings between multiphysics processes and geological structures require the development of tailored numerical methods. We present a modelling framework and its implementation in the open-source simulation toolbox PorePy, which is designed for rapid prototyping of multiphysics processes in fractured porous media. PorePy uses a mixed-dimensional representation of the fracture geometry and generally applies fully implicit couplings between processes. The code design follows the paradigms of modularity and differentiable programming, which together allow for extreme flexibility in experimentation with governing equations, with minimal changes to the code base. The code integrity is supported by a multilevel testing framework ensuring the reliability of the code. We present our modelling framework within a context of thermo-poroelasticity in deformable fractured porous media, illustrating the close relation between the governing equations and the source code. We furthermore discuss the design of the testing framework and present simulations showcasing the extendibility of PorePy, as well as the type of results that can be produced by mixed-dimensional simulation tools.
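To illustrate the differentiable-programming paradigm mentioned above, the following minimal sketch (plain Python/NumPy, not PorePy's actual Ad classes) shows how writing a residual once can yield both its value and the derivative needed by an implicit, Newton-type solver:

    import numpy as np

    class Ad:
        """Tiny forward-mode AD value carrying a derivative alongside the value."""
        def __init__(self, val, jac):
            self.val, self.jac = val, jac
        def _wrap(self, other):
            return other if isinstance(other, Ad) else Ad(other, np.zeros_like(self.jac))
        def __add__(self, other):
            o = self._wrap(other)
            return Ad(self.val + o.val, self.jac + o.jac)
        def __sub__(self, other):
            o = self._wrap(other)
            return Ad(self.val - o.val, self.jac - o.jac)
        def __mul__(self, other):
            o = self._wrap(other)
            return Ad(self.val * o.val, self.val * o.jac + o.val * self.jac)

    # Unknown p, seeded with derivative 1, and a nonlinear residual r(p) = p^2 + 2p - 3.
    p = Ad(np.array([1.5]), np.array([1.0]))
    r = p * p + p * 2.0 - 3.0
    print(r.val, r.jac)  # residual and its Jacobian entry, assembled in one pass

Because residuals are written once and differentiated automatically, swapping a constitutive law only changes the expression for the residual, not the linearisation code.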
The application of deep learning to non-stationary temporal datasets can lead to overfitted models that underperform under regime changes. In this work, we propose a modular machine learning pipeline for ranking predictions on temporal panel datasets that is robust under regime changes. The modularity of the pipeline allows the use of different models, including Gradient Boosting Decision Trees (GBDTs) and Neural Networks, with and without feature engineering. We evaluate our framework on financial data for stock portfolio prediction, and find that GBDT models with dropout display high performance, robustness and generalisability with reduced complexity and computational cost. We then demonstrate how online learning techniques, which require no retraining of models, can be used post-prediction to enhance the results. First, we show that dynamic feature projection improves robustness by reducing drawdown during regime changes. Second, we demonstrate that dynamic model ensembling, based on selecting models with good recent performance, leads to improved Sharpe and Calmar ratios of out-of-sample predictions. Finally, we evaluate the robustness of our pipeline across different data splits and random seeds, finding good reproducibility.
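As a concrete illustration of the second online technique (a sketch under simplified assumptions, not the paper's exact procedure), the snippet below selects, at each era, the k models with the best trailing rank correlation between their past predictions and realised targets, and averages their current predictions:

    import numpy as np
    from scipy.stats import spearmanr

    def dynamic_ensemble(preds, targets, window=20, k=3):
        """preds: dict model name -> (T, N) array of per-era predictions;
        targets: (T, N) array of realised targets. Returns ensembled predictions."""
        names = list(preds)
        T = targets.shape[0]
        out = np.full(targets.shape, np.nan)
        for t in range(window, T):
            # Trailing performance of each model over the last `window` eras.
            scores = {m: np.mean([spearmanr(preds[m][s], targets[s])[0]
                                  for s in range(t - window, t)])
                      for m in names}
            best = sorted(names, key=scores.get, reverse=True)[:k]
            out[t] = np.mean([preds[m][t] for m in best], axis=0)
        return out

No model is retrained here; only the weighting of already-computed predictions changes over time.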
Black holes are believed to be fast scramblers of information, as they rapidly destroy local correlations and spread information throughout the system. Unscrambling this information is in principle possible, given perfect knowledge of the black hole's internal dynamics [arXiv:1710.03363]. This work shows that, even without knowing the internal dynamics of the black hole, information can be efficiently decoded from an unknown black hole by observing the outgoing Hawking radiation. We show that, surprisingly, black holes with unknown internal dynamics that are rapidly scrambling but not fully chaotic admit Clifford decoders: the salient properties of the scrambling unitary can be efficiently recovered even if it is exponentially complex. This recovery is possible because all the redundant complexity can be described as an entropy, the stabilizer entropy. We show how, for non-chaotic black holes, the stabilizer entropy can be efficiently pumped away, just as in a refrigerator.
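For reference, a standard definition of the stabilizer (Rényi) entropy used in this line of work, for an $n$-qubit pure state $|\psi\rangle$ and the $n$-qubit Pauli group $\mathcal{P}_{n}$, is (notation may differ from the paper's):
\[
M_{\alpha}(|\psi\rangle) \;=\; \frac{1}{1-\alpha}\,\log_{2} \sum_{P \in \mathcal{P}_{n}} \frac{\langle \psi | P | \psi \rangle^{2\alpha}}{2^{n}},
\]
which vanishes exactly on stabilizer states and thus quantifies the non-Clifford resources that obstruct efficient decoding.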
The use of air traffic management (ATM) simulators for planning and operations can be challenging due to their modelling complexity. This paper presents XALM (eXplainable Active Learning Metamodel), a three-step framework integrating active learning and SHAP (SHapley Additive exPlanations) values into simulation metamodels to support ATM decision-making. XALM efficiently uncovers hidden relationships among input and output variables in ATM simulators, which are usually those of interest in policy analysis. Our experiments show that XALM achieves predictive performance comparable to the XGBoost metamodel while requiring fewer simulations. Additionally, XALM exhibits superior explanatory capabilities compared to non-active-learning metamodels. Using the `Mercury' (flight and passenger) ATM simulator, XALM is applied to a real-world scenario at Paris Charles de Gaulle airport, extending an arrival manager's range and scope by analysing six variables. This case study illustrates XALM's effectiveness in enhancing simulation interpretability and understanding variable interactions. By addressing computational challenges and improving explainability, XALM complements traditional simulation-based analyses. Lastly, we discuss two practical approaches for further reducing the computational burden of the metamodelling: we introduce a stopping criterion for active learning based on the inherent uncertainty of the metamodel, and we show how the simulations used for the metamodel can be reused across key performance indicators, thus decreasing the overall number of simulations needed.
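As a generic illustration of an active-learning metamodel loop with an uncertainty-based stopping criterion (a sketch using a Gaussian-process surrogate from scikit-learn, not XALM's actual implementation):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    def active_learning(simulator, candidates, n_init=10, budget=200, tol=0.05, seed=0):
        """simulator: callable mapping one input row to a scalar KPI;
        candidates: (n, d) array of candidate simulator inputs."""
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(candidates), size=n_init, replace=False)
        X, y = candidates[idx], np.array([simulator(x) for x in candidates[idx]])
        gp = GaussianProcessRegressor(normalize_y=True)
        for _ in range(budget - n_init):
            gp.fit(X, y)
            _, std = gp.predict(candidates, return_std=True)
            if std.max() < tol:                  # uncertainty-based stopping criterion
                break
            x_new = candidates[np.argmax(std)]   # query where the surrogate is least certain
            X = np.vstack([X, x_new])
            y = np.append(y, simulator(x_new))
        return gp

The expensive simulator is only called where the surrogate is uncertain, which is what keeps the number of simulations low.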
This paper addresses the problem of providing robust estimators under a functional logistic regression model. Logistic regression is a popular tool in classification problems with two populations. As in functional linear regression, regularization tools are needed to compute estimators of the functional slope. The traditional methods are based on dimension reduction or penalization combined with maximum likelihood or quasi--likelihood techniques and, for that reason, they may be affected by misclassified points, especially those associated with functional covariates with atypical behaviour. The proposal given in this paper adapts some of the best practices used when the covariates are finite--dimensional to provide reliable estimates. Under regularity conditions, consistency of the resulting estimators and rates of convergence for the predictions are derived. A numerical study illustrates the finite-sample performance of the proposed method and reveals its stability under different contamination scenarios. A real data example is also presented.
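For reference, the functional logistic regression model for a binary response $y \in \{0,1\}$ and a functional covariate $X$ observed on an interval $\mathcal{I}$ is typically written as (standard formulation; notation may differ from the paper's):
\[
\mathbb{P}\big(y = 1 \mid X\big) \;=\; F\!\left(\alpha_{0} + \int_{\mathcal{I}} X(t)\,\beta_{0}(t)\,\mathrm{d}t\right), \qquad F(u) = \frac{e^{u}}{1+e^{u}},
\]
where $\beta_{0}$ is the functional slope whose estimation requires the regularization discussed above.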
Penalized $M-$estimators for logistic regression models have previously been studied for fixed dimension in order to obtain sparse statistical models and automatic variable selection. In this paper, we derive asymptotic results for penalized $M-$estimators when the dimension $p$ grows to infinity with the sample size $n$. Specifically, we obtain consistency and rates of convergence results for some choices of the penalty function. Moreover, we prove that these estimators select variables consistently, with probability tending to one, and we derive their asymptotic distribution.
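In generic form (notation illustrative rather than the authors' exact one), such penalized $M-$estimators solve
\[
\widehat{\boldsymbol{\beta}}_{n} \;=\; \operatorname*{argmin}_{\boldsymbol{\beta} \in \mathbb{R}^{p}} \; \frac{1}{n}\sum_{i=1}^{n} \rho\!\left(y_{i}, \mathbf{x}_{i}^{\top}\boldsymbol{\beta}\right) \;+\; \sum_{j=1}^{p} J_{\lambda_{n}}\!\left(|\beta_{j}|\right),
\]
where $\rho$ is a (possibly bounded) loss adapted to the logistic model, $J_{\lambda_{n}}$ is a penalty such as LASSO, SCAD or MCP, and $p = p_{n} \to \infty$ as $n \to \infty$.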
Gaussian processes (GPs) are widely-used tools in spatial statistics and machine learning and the formulae for the mean function and covariance kernel of a GP $v$ that is the image of another GP $u$ under a linear transformation $T$ acting on the sample paths of $u$ are well known, almost to the point of being folklore. However, these formulae are often used without rigorous attention to technical details, particularly when $T$ is an unbounded operator such as a differential operator, which is common in many modern applications. This note provides a self-contained proof of the claimed formulae for the case of a closed, densely-defined operator $T$ acting on the sample paths of a square-integrable (not necessarily Gaussian) stochastic process. Our proof technique relies upon Hille's theorem for the Bochner integral of a Banach-valued random variable.
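Informally, the folklore formulae in question state that if $u$ has mean function $m_{u}$ and covariance kernel $k_{u}$, and $v = Tu$, then
\[
m_{v} \;=\; T\, m_{u}, \qquad k_{v}(s,t) \;=\; T^{(1)} T^{(2)} k_{u}(s,t),
\]
where $T^{(1)}$ and $T^{(2)}$ denote $T$ applied to $k_{u}$ as a function of its first and second argument, respectively; equivalently, the covariance operators satisfy $C_{v} = T C_{u} T^{*}$. The note's contribution is a rigorous justification of these identities when $T$ is closed and densely defined.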
The developed computational approach is capable of initiating and propagating cracks inside materials and along material interfaces of general multi-domain structures under quasi-static conditions. Special attention is paid to the particular situation of a solid with inhomogeneities. The description of the fracture processes is based on the theory of material damage, which introduces two independent damage parameters to distinguish between interface and internal cracks. The parameter responsible for interface cracks is defined in a thin adhesive layer of the interface and provides a relation between stress and strain quantities in the fashion of cohesive zone models. The second parameter is defined inside the material domains and is founded on the theory of phase-field fracture, which guarantees that material damage occurs in a thin material strip, providing a regularised model of internal cracks. An additional property of both the interface and phase-field damage is their capability to distinguish between fracture modes, which is useful if the structure is subjected to combined loading. The solution methodology is based on a variational approach which allows the implementation of non-linear programming optimisation into standard finite-element discretisation and time-stepping methods. The computational implementation is prepared in MATLAB, and the numerical results validate the developed formulation for the analysis of fracture problems in multi-domain structural elements.
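For orientation, a standard Ambrosio--Tortorelli-type phase-field energy of the kind underlying such regularised internal-crack models reads (illustrative notation; mode splitting and the interface cohesive terms are omitted):
\[
E(\mathbf{u}, \alpha) \;=\; \int_{\Omega} (1-\alpha)^{2}\, \psi\big(\varepsilon(\mathbf{u})\big)\, \mathrm{d}\Omega \;+\; \frac{G_{c}}{2} \int_{\Omega} \left( \frac{\alpha^{2}}{\ell} + \ell\, |\nabla \alpha|^{2} \right) \mathrm{d}\Omega,
\]
where $\mathbf{u}$ is the displacement, $\alpha \in [0,1]$ the phase-field damage variable, $\psi$ the elastic strain energy density, $G_{c}$ the fracture toughness and $\ell$ the regularisation length controlling the width of the damaged strip.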
We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.
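For context, a classical bound in this family, which the prediction-based bounds refine, controls the expected generalization gap of a training algorithm with output $W$ on an $n$-sample training set $S$ via their mutual information (assuming a $\sigma$-subgaussian loss):
\[
\Big|\, \mathbb{E}\big[\mathrm{gen}(W, S)\big] \,\Big| \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(W; S)}.
\]
Bounds based on predictions replace $I(W;S)$ with information measured in the predictions themselves, which remains finite for deterministic algorithms and is easier to estimate.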