As quantum theory allows for information processing and computing tasks that are otherwise impossible with classical systems, there is a need for a quantum Internet beyond existing network systems. At the same time, the realization of a fully functional quantum Internet is hindered by fundamental and practical challenges such as high loss during transmission of quantum systems, decoherence due to interaction with the environment, and the fragility of quantum states. We study the implications of these constraints by analyzing the limitations on the scaling and robustness of the quantum Internet. Considering quantum networks, we present practical bottlenecks for secure communication, delegated computing, and resource distribution among end nodes. Motivated by the power of abstraction in graph theory (in association with quantum information theory), we consider graph-theoretic quantifiers to assess network robustness and provide critical values of communication lines for viable communication over the quantum Internet. In particular, we begin by discussing limitations on the usefulness of isotropic states as device-independent quantum key repeaters, which otherwise could be useful for device-independent quantum key distribution. We consider some quantum networks of practical interest, ranging from satellite-based networks connecting far-off spatial locations to currently available quantum processor architectures within computers, and analyze their robustness to perform quantum information processing tasks. Some of these tasks form primitives for delegated quantum computing, e.g., entanglement distribution and quantum teleportation. For some examples of quantum networks, we present algorithms to perform different quantum network tasks of interest, such as constructing the network structure, finding the shortest path between a pair of end nodes, and optimizing the flow of resources at a node.
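To make the shortest-path task concrete, here is a minimal Python sketch under assumptions not taken from the abstract: each communication line carries a transmissivity eta in (0, 1], and the best path for distributing entanglement between two end nodes maximizes the end-to-end transmissivity, i.e., minimizes the sum of -log(eta) along the path. All node names and edge values below are hypothetical.

    # Hypothetical network: edges carry a transmissivity eta in (0, 1].
    # Best path = max product of eta = min sum of -log(eta), a standard
    # shortest-path (Dijkstra) problem on the reweighted graph.
    import math
    import networkx as nx

    G = nx.Graph()
    edges = [("Alice", "r1", 0.90), ("r1", "r2", 0.80), ("r2", "Bob", 0.85),
             ("Alice", "r3", 0.60), ("r3", "Bob", 0.95)]
    for u, v, eta in edges:
        G.add_edge(u, v, eta=eta, cost=-math.log(eta))

    path = nx.shortest_path(G, "Alice", "Bob", weight="cost")
    success = math.prod(G.edges[u, v]["eta"] for u, v in zip(path, path[1:]))
    print(path, f"end-to-end transmissivity ~ {success:.3f}")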
Advances in large language models (LLMs) have driven an explosion of interest in their societal impacts. Much of the discourse around how they will impact social equity has been cautionary or negative, focusing on questions like "how might LLMs be biased and how would we mitigate those biases?" This is a vital discussion: the ways in which AI generally, and LLMs specifically, can entrench biases have been well-documented. But equally vital, and much less discussed, is the more opportunity-focused counterpoint: "what promising applications do LLMs enable that could promote equity?" If LLMs are to enable a more equitable world, it is not enough just to play defense against their biases and failure modes. We must also go on offense, applying them positively to equity-enhancing use cases to increase opportunities for underserved groups and reduce societal discrimination. Many choices determine the impact of AI, and a fundamental choice made very early in the pipeline is which problems we choose to apply it to. If we focus only later in the pipeline -- making LLMs marginally more fair as they facilitate use cases that intrinsically entrench power -- we will miss an important opportunity to guide them toward equitable impacts. Here, we highlight the emerging potential of LLMs to promote equity by presenting four newly possible, promising research directions, while keeping risks and cautionary points in clear view.
A posteriori reduced-order models, e.g. proper orthogonal decomposition, are essential to affordably tackle realistic parametric problems. They rely on a trustworthy training set, that is, a family of full-order solutions (snapshots) representative of all possible outcomes of the parametric problem. Having such a rich collection of snapshots is not, in many cases, computationally viable. A strategy for data augmentation, designed for parametric laminar incompressible flows, is proposed to enrich poorly populated training sets. The goal is to include in the new, artificial snapshots emerging features, not present in the original basis, that enhance the quality of the reduced-order solution. The proposed methodologies exploit basic physical principles, such as mass and momentum conservation, to construct physically relevant, artificial snapshots at a fraction of the cost of additional full-order solutions. Interestingly, the numerical results show that the ideas exploiting only mass conservation (i.e., incompressibility) do not produce significant added value with respect to standard linear combinations of snapshots. Conversely, accounting for the linearized momentum balance via the Oseen equation does improve the quality of the resulting approximation and is therefore an effective data augmentation strategy in the framework of viscous incompressible laminar flows.
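For orientation, here is a minimal sketch of the POD step that an augmented training set feeds into; the snapshot matrices below are random stand-ins, and the Oseen-based construction of the artificial columns (which requires a linearized full-order solve) is not shown.

    import numpy as np

    def pod_basis(S, tol=1e-6):
        # POD modes of the snapshot matrix S (snapshots as columns), truncated
        # so that the discarded modes carry at most `tol` of the total energy.
        U, sigma, _ = np.linalg.svd(S, full_matrices=False)
        energy = np.cumsum(sigma**2) / np.sum(sigma**2)
        r = int(np.searchsorted(energy, 1.0 - tol)) + 1
        return U[:, :r], sigma[:r]

    S = np.random.rand(1000, 12)      # stand-in for 12 full-order snapshots
    S_aug = np.random.rand(1000, 24)  # stand-in for physics-based artificial snapshots
    basis, sigma = pod_basis(np.hstack([S, S_aug]))  # enriched training set
    print(basis.shape)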
Many interesting physical problems described by systems of hyperbolic conservation laws are stiff, and thus impose a very small time-step because of the restrictive CFL stability condition. In this case, one can exploit the superior stability properties of implicit time integration, which allows one to choose the time-step based on accuracy requirements alone, thus avoiding the use of small time-steps. We discuss an efficient framework to devise high order implicit schemes for stiff hyperbolic systems without tailoring it to a specific problem. The nonlinearity of high order schemes, due to the space- and time-limiting procedures which control nonphysical oscillations, makes the implicit time integration difficult, e.g.~because the discrete system is nonlinear even for linear problems. This nonlinearity of the scheme is circumvented as proposed in (Puppo et al., Comm.~Appl.~Math.~\& Comput., 2023) for scalar conservation laws, where a first order implicit predictor is computed to freeze the nonlinear coefficients of the essentially non-oscillatory space reconstruction, and also to assist limiting in time. In addition, we propose a novel conservative flux-centered a-posteriori time-limiting procedure using numerical entropy indicators to detect troubled cells. The numerical tests involve classical and artificially devised stiff problems using the Euler system of gas dynamics.
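Schematically, for a scalar conservation law $u_t + f(u)_x = 0$, the predictor-based linearization takes the following form (notation ours, not the paper's):
\[
\frac{u_i^{*} - u_i^{n}}{\Delta t} + \frac{F(u_i^{*}, u_{i+1}^{*}) - F(u_{i-1}^{*}, u_i^{*})}{\Delta x} = 0,
\qquad
\frac{u_i^{n+1} - u_i^{n}}{\Delta t} + \frac{\hat{F}_{i+1/2}\big(\omega(u^{*}); u^{n+1}\big) - \hat{F}_{i-1/2}\big(\omega(u^{*}); u^{n+1}\big)}{\Delta x} = 0,
\]
where $F$ is a first order monotone flux, $\hat{F}_{i\pm 1/2}$ is the high order numerical flux, and the essentially non-oscillatory reconstruction weights $\omega$ are evaluated on the first order implicit predictor $u^{*}$, so that the corrector system is linear whenever $f$ is.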
A wide variety of model explanation approaches have been proposed in recent years, all guided by very different rationales and heuristics. In this paper, we take a new route and cast interpretability as a statistical inference problem. We propose a general deep probabilistic model designed to produce interpretable predictions. The model parameters can be learned via maximum likelihood, and the method can be adapted to any predictor network architecture and any type of prediction problem. Our method is an instance of amortized interpretability, in which a neural network is used as a selector to allow for fast interpretation at inference time. Several popular interpretability methods are shown to be particular cases of regularised maximum likelihood for our general model. We propose new datasets with ground-truth selections, which allow for the evaluation of feature importance maps. Using these datasets, we show experimentally that using multiple imputation provides more reasonable interpretations.
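A minimal sketch of the amortized selector idea, assuming a simple MLP architecture (the paper's exact model may differ); the hard Bernoulli sampling shown here would need a continuous relaxation during training.

    import torch
    import torch.nn as nn

    class AmortizedExplainer(nn.Module):
        def __init__(self, d_in, d_out, hidden=64):
            super().__init__()
            self.selector = nn.Sequential(nn.Linear(d_in, hidden), nn.ReLU(),
                                          nn.Linear(hidden, d_in))
            self.predictor = nn.Sequential(nn.Linear(d_in, hidden), nn.ReLU(),
                                           nn.Linear(hidden, d_out))

        def forward(self, x, n_imputations=8):
            probs = torch.sigmoid(self.selector(x))  # amortized feature-importance map
            preds = []
            for _ in range(n_imputations):     # multiple imputation: average predictions
                mask = torch.bernoulli(probs)  # over several masked-and-filled inputs
                x_fill = mask * x + (1 - mask) * torch.randn_like(x)  # crude imputer
                preds.append(self.predictor(x_fill))
            return torch.stack(preds).mean(0), probs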
Transformers play a central role in the inner workings of large language models. We develop a mathematical framework for analyzing Transformers based on their interpretation as interacting particle systems, which reveals that clusters emerge over long times. Our study explores the underlying theory and offers new perspectives for mathematicians as well as computer scientists.
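In one common idealization of this particle-system view (our rendering; the paper considers several parameter regimes), tokens $x_1, \dots, x_n$ on the unit sphere evolve under self-attention dynamics of the form
\[
\dot{x}_i(t) = \mathrm{P}^{\perp}_{x_i(t)}\!\left(\frac{1}{Z_i(t)} \sum_{j=1}^{n} e^{\langle Q x_i(t),\, K x_j(t)\rangle}\, V x_j(t)\right),
\qquad
Z_i(t) = \sum_{j=1}^{n} e^{\langle Q x_i(t),\, K x_j(t)\rangle},
\]
where $\mathrm{P}^{\perp}_{x}$ projects onto the tangent space at $x$ (playing the role of layer normalization); clustering means the particles $x_i(t)$ coalesce as $t \to \infty$.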
Like many fields of medical research, survival analysis has witnessed growing interest in the application of deep learning techniques to model complex, high-dimensional, heterogeneous, incomplete, and censored medical data. Current methods often make assumptions about the relations within the data that may not be valid in practice. In response, we introduce SAVAE (Survival Analysis Variational Autoencoder), a novel approach based on Variational Autoencoders. SAVAE contributes to the field by introducing a tailored ELBO formulation for survival analysis, supporting various parametric distributions for covariates and survival time (as long as the log-likelihood is differentiable). It offers a general method that consistently performs well on various metrics, demonstrating robustness and stability across different experiments. Our proposal effectively estimates time-to-event, accounting for censoring, covariate interactions, and time-varying risk associations. We validate our model on diverse datasets, including genomic, clinical, and demographic data, with varying levels of censoring. This approach demonstrates competitive performance compared to state-of-the-art techniques, as assessed by the Concordance Index and the Integrated Brier Score. SAVAE also offers an interpretable model that parametrically models covariates and time. Moreover, its generative architecture facilitates further applications such as clustering, data imputation, and the generation of synthetic patient data through latent space inference from survival data.
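As an illustration of the censoring-aware likelihood term such an ELBO must contain (a Weibull model is chosen here for concreteness; the abstract only requires a differentiable log-likelihood):

    import torch

    def weibull_censored_loglik(t, event, shape, scale):
        # event = 1: observed time, contributes log f(t) = log h(t) + log S(t);
        # event = 0: right-censored, contributes log S(t) only.
        log_hazard = torch.log(shape / scale) + (shape - 1.0) * torch.log(t / scale)
        log_survival = -((t / scale) ** shape)
        return event * log_hazard + log_survival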
Most categorical models for dependent types have traditionally been heavily set-based: contexts form a category, for each context we have a set of types in said context, and for each type a set of terms of said type. This is the case for categories with families, categories with attributes, and natural models; in particular, all of them can be traced back to certain discrete Grothendieck fibrations. We extend this intuition to the case of general, not necessarily discrete, fibrations, so that over a given context one has not only a set but a category of types. We argue that the added structure can be attributed to a notion of subtyping that shares many features with that of coercive subtyping, in the sense that it is the product of thinking about subtyping as an abbreviation mechanism: we say that a given type $A'$ is a subtype of $A$ if there is a unique coercion from $A'$ to $A$. Whenever we need a term of type $A$, then, it suffices to have a term of type $A'$, which we can `plug into' $A$. For this version of subtyping we provide rules, coherences, and explicit models, and we compare and contrast it to coercive subtyping as introduced by Z. Luo and others. We conclude by suggesting how the tools we present can be employed in finding appropriate rules relating subtyping and certain type constructors.
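Schematically, the abbreviation reading of subtyping can be captured by a coercion rule of the following shape (our rendering, not the paper's exact rules):
\[
\frac{\Gamma \vdash a : A' \qquad \Gamma \vdash A' \leq_{c} A}{\Gamma \vdash c(a) : A}
\]
together with a coherence requirement making the abbreviation unambiguous: if $A' \leq_{c} A$ and $A' \leq_{c'} A$, then $c = c'$.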
As a crossover frontier of physics and mechanics, quantum computing is showing great potential in computational mechanics. However, quantum hardware noise remains a critical barrier to achieving accurate simulation results, given the limitations of current hardware. In this paper, we integrate error-mitigated quantum computing into data-driven computational mechanics, where the zero-noise extrapolation (ZNE) technique is employed to improve the accuracy of quantum computing. Numerical examples, including a multiscale simulation of a composite L-shaped beam, are conducted with the quantum computer simulator Qpanda, and the results validate the effectiveness of the proposed method. We believe this work presents a promising step towards harnessing the power of quantum computing in computational mechanics.
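Zero-noise extrapolation itself is a small classical post-processing step: the circuit is executed at several artificially amplified noise levels and the results are extrapolated back to the zero-noise limit. A minimal sketch, where run_circuit is a hypothetical stand-in for an execution whose noise strength is amplified by scale:

    import numpy as np

    def zne_estimate(run_circuit, scales=(1.0, 2.0, 3.0)):
        # Richardson-style extrapolation: fit the measured expectation values
        # at the given noise scale factors, then evaluate the fit at scale 0.
        values = np.array([run_circuit(scale=s) for s in scales])
        coeffs = np.polyfit(scales, values, deg=len(scales) - 1)
        return np.polyval(coeffs, 0.0)  # extrapolated zero-noise expectation value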
The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI-generated explanations. The lack of theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of explainee inference conditioned on explanation. Our theory posits that, absent an explanation, humans expect the AI to make decisions similar to their own, and that they interpret an explanation by comparing it to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency map explanations demonstrates that our theory quantitatively matches participants' predictions of the AI.
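Shepard's law states that generalization decays exponentially with distance in the underlying psychological similarity space,
\[
s(x, y) = e^{-\kappa\, d(x, y)},
\]
so an AI explanation is scored by its similarity, in this sense, to the explanation the human would have given; the free decay constant $\kappa$ and the choice of metric $d$ in this rendering are schematic rather than the paper's fitted quantities.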
We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve on existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.
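For flavor, prediction-based bounds of this kind take a schematic form such as (notation and constants ours; the paper's precise statements differ)
\[
\big|\mathbb{E}[\mathrm{gen}]\big| \;\lesssim\; \frac{1}{n} \sum_{i=1}^{n} \sqrt{2\, I\big(f(\tilde{x}_i);\, U_i\big)},
\]
where $f(\tilde{x}_i)$ denotes the model's predictions on a held-out/training pair of examples and $U_i$ is the random bit selecting which element of the pair enters the training set, so the bound depends on the algorithm only through its predictions.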