Scientific applications are starting to explore the viability of quantum computing. This exploration typically begins with quantum simulations that can run on existing classical platforms, albeit without the performance advantages of real quantum resources. In the context of high-performance computing (HPC), the incorporation of simulation software can often take advantage of the powerful resources to help scale-up the simulation size. The configuration, installation and operation of these quantum simulation packages on HPC resources can often be rather daunting and increases friction for experimentation by scientific application developers. We describe a framework to help streamline access to quantum simulation software running on HPC resources. This includes an interface for circuit-based quantum computing tasks, as well as the necessary resource management infrastructure to make effective use of the underlying HPC resources. The primary contributions of this work include a classification of different usage models for quantum simulation in an HPC context, a review of the software architecture for our approach and a detailed description of the prototype implementation to experiment with these ideas using two different simulators (TNQVM \& NWQ-Sim). We include initial experimental results running on the Frontier supercomputer at the Oak Ridge Leadership Computing Facility (OLCF) using a synthetic workload generated via the SupermarQ quantum benchmarking framework.
Regression on function spaces is typically limited to models with Gaussian process priors. We introduce the notion of universal functional regression, in which we aim to learn a prior distribution over non-Gaussian function spaces that remains mathematically tractable for functional regression. To do this, we develop Neural Operator Flows (OpFlow), an infinite-dimensional extension of normalizing flows. OpFlow is an invertible operator that maps the (potentially unknown) data function space into a Gaussian process, allowing for exact likelihood estimation of functional point evaluations. OpFlow enables robust and accurate uncertainty quantification via drawing posterior samples of the Gaussian process and subsequently mapping them into the data function space. We empirically study the performance of OpFlow on regression and generation tasks with data generated from Gaussian processes with known posterior forms and non-Gaussian processes, as well as real-world earthquake seismograms with an unknown closed-form distribution.
Diffusion models have recently been successfully applied to a wide range of robotics applications for learning complex multi-modal behaviors from data. However, prior works have mostly been confined to single-robot and small-scale environments due to the high sample complexity of learning multi-robot diffusion models. In this paper, we propose a method for generating collision-free multi-robot trajectories that conform to underlying data distributions while using only single-robot data. Our algorithm, Multi-robot Multi-model planning Diffusion (MMD), does so by combining learned diffusion models with classical search-based techniques -- generating data-driven motions under collision constraints. Scaling further, we show how to compose multiple diffusion models to plan in large environments where a single diffusion model fails to generalize well. We demonstrate the effectiveness of our approach in planning for dozens of robots in a variety of simulated scenarios motivated by logistics environments. View video demonstrations in our supplementary material, and our code at: //github.com/yoraish/mmd.
Designing sample-efficient and computationally feasible reinforcement learning (RL) algorithms is particularly challenging in environments with large or infinite state and action spaces. In this paper, we advance this effort by presenting an efficient algorithm for Markov Decision Processes (MDPs) where the state-action value function of any policy is linear in a given feature map. This challenging setting can model environments with infinite states and actions, strictly generalizes classic linear MDPs, and currently lacks a computationally efficient algorithm under online access to the MDP. Specifically, we introduce a new RL algorithm that efficiently finds a near-optimal policy in this setting, using a number of episodes and calls to a cost-sensitive classification (CSC) oracle that are both polynomial in the problem parameters. Notably, our CSC oracle can be efficiently implemented when the feature dimension is constant, representing a clear improvement over state-of-the-art methods, which require solving non-convex problems with horizon-many variables and can incur computational costs that are exponential in the horizon.
Nutrient load simulators are large, deterministic, models that simulate the hydrodynamics and biogeochemical processes in aquatic ecosystems. They are central tools for planning cost efficient actions to fight eutrophication since they allow scenario predictions on impacts of nutrient load reductions to, e.g., harmful algal biomass growth. Due to being computationally heavy, the uncertainties related to these predictions are typically not rigorously assessed though. In this work, we developed a novel Bayesian computational approach for estimating the uncertainties in predictions of the Finnish coastal nutrient load model FICOS. First, we constructed a likelihood function for the multivariate spatiotemporal outputs of the FICOS model. Then, we used Bayes optimization to locate the posterior mode for the model parameters conditional on long term monitoring data. After that, we constructed a space filling design for FICOS model runs around the posterior mode and used it to train a Gaussian process emulator for the (log) posterior density of the model parameters. We then integrated over this (approximate) parameter posterior to produce probabilistic predictions for algal biomass and chlorophyll a concentration under alternative nutrient load reduction scenarios. Our computational algorithm allowed for fast posterior inference and the Gaussian process emulator had good predictive accuracy within the highest posterior probability mass region. The posterior predictive scenarios showed that the probability to reach the EUs Water Framework Directive objectives in the Finnish Archipelago Sea is generally low even under large load reductions.
The claw problem is central in the fields of theoretical computer science as well as cryptography. The optimal quantum query complexity of the problem is known to be $\Omega\left(\sqrt{G}+(FG)^{1/3} \right)$ for input functions $f\colon [F]\to Z$ and $g\colon [G]\to Z$. However, the lower bound was proved when the range $Z$ is sufficiently large (i.e., $|{Z}|=\Omega(FG)$). The current paper proves the lower bound holds even for every smaller range $Z$ with $|{Z}|\ge F+G$. This implies that $\Omega\left(\sqrt{G}+(FG)^{1/3} \right)$ is tight for every such range. In addition, the lower bound $\Omega\left(\sqrt{G}+F^{1/3}G^{1/6}M^{1/6}\right)$ is provided for even smaller range $Z=[M]$ with every $M\in [2,F+G]$ by reducing the claw problem for $|{Z}|= F+G$. The proof technique is general enough to apply to any $k$-symmetric property (e.g., the $k$-claw problem), i.e., the Boolean function $\Phi$ on the set of $k$ functions with different-size domains and a common range such that $\Phi$ is invariant under the permutations over each domain and the permutations over the range. More concretely, it generalizes Ambainis's argument [Theory of Computing, 1(1):37-46] to the multiple-function case by using the notion of multisymmetric polynomials.
Causal models seek to unravel the cause-effect relationships among variables from observed data, as opposed to mere mappings among them, as traditional regression models do. This paper introduces a novel causal discovery algorithm designed for settings in which variables exhibit linearly sparse relationships. In such scenarios, the causal links represented by directed acyclic graphs (DAGs) can be encapsulated in a structural matrix. The proposed approach leverages the structural matrix's ability to reconstruct data and the statistical properties it imposes on the data to identify the correct structural matrix. This method does not rely on independence tests or graph fitting procedures, making it suitable for scenarios with limited training data. Simulation results demonstrate that the proposed method outperforms the well-known PC, GES, BIC exact search, and LINGAM-based methods in recovering linearly sparse causal structures.
The success of AI models relies on the availability of large, diverse, and high-quality datasets, which can be challenging to obtain due to data scarcity, privacy concerns, and high costs. Synthetic data has emerged as a promising solution by generating artificial data that mimics real-world patterns. This paper provides an overview of synthetic data research, discussing its applications, challenges, and future directions. We present empirical evidence from prior art to demonstrate its effectiveness and highlight the importance of ensuring its factuality, fidelity, and unbiasedness. We emphasize the need for responsible use of synthetic data to build more powerful, inclusive, and trustworthy language models.
Knowledge graph completion aims to predict missing relations between entities in a knowledge graph. While many different methods have been proposed, there is a lack of a unifying framework that would lead to state-of-the-art results. Here we develop PathCon, a knowledge graph completion method that harnesses four novel insights to outperform existing methods. PathCon predicts relations between a pair of entities by: (1) Considering the Relational Context of each entity by capturing the relation types adjacent to the entity and modeled through a novel edge-based message passing scheme; (2) Considering the Relational Paths capturing all paths between the two entities; And, (3) adaptively integrating the Relational Context and Relational Path through a learnable attention mechanism. Importantly, (4) in contrast to conventional node-based representations, PathCon represents context and path only using the relation types, which makes it applicable in an inductive setting. Experimental results on knowledge graph benchmarks as well as our newly proposed dataset show that PathCon outperforms state-of-the-art knowledge graph completion methods by a large margin. Finally, PathCon is able to provide interpretable explanations by identifying relations that provide the context and paths that are important for a given predicted relation.
It is important to detect anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments on natural language processing and small- and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance. We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance.
While existing machine learning models have achieved great success for sentiment classification, they typically do not explicitly capture sentiment-oriented word interaction, which can lead to poor results for fine-grained analysis at the snippet level (a phrase or sentence). Factorization Machine provides a possible approach to learning element-wise interaction for recommender systems, but they are not directly applicable to our task due to the inability to model contexts and word sequences. In this work, we develop two Position-aware Factorization Machines which consider word interaction, context and position information. Such information is jointly encoded in a set of sentiment-oriented word interaction vectors. Compared to traditional word embeddings, SWI vectors explicitly capture sentiment-oriented word interaction and simplify the parameter learning. Experimental results show that while they have comparable performance with state-of-the-art methods for document-level classification, they benefit the snippet/sentence-level sentiment analysis.