We construct quantum algorithms to compute the solution and/or physical observables of nonlinear ordinary differential equations (ODEs) and nonlinear Hamilton-Jacobi equations (HJE) via linear representations or exact mappings between nonlinear ODEs/HJE and linear partial differential equations (the Liouville equation and the Koopman-von Neumann equation). The connection between the linear representations and the original nonlinear system is established through the Dirac delta function or the level set mechanism. We compare methods based on quantum linear systems algorithms with quantum simulation methods arising from different numerical approximations, including finite difference and Fourier spectral discretisations, for the two linear representations. The results show that the quantum simulation methods usually give the best time complexity. We also propose a Schr\"odinger framework to solve the Liouville equation for the HJE, since the Liouville equation can be recast as the semiclassical limit of the Wigner transform of the Schr\"odinger equation. We also compare the Schr\"odinger and Liouville frameworks.
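To make the linear representations concrete, the following is a sketch of their standard textbook form for an ODE system; the symbols $F$, $x_0$, $\rho$, $\psi$ and $G$ are notation chosen here for illustration and are not taken from the abstract.

```latex
% Nonlinear ODE in R^d (assumed notation):
%   dx/dt = F(x),  x(0) = x_0
\begin{align}
  % Liouville representation: linear transport of a density rho(t, x)
  \partial_t \rho + \nabla_x \cdot \bigl( F(x)\, \rho \bigr) &= 0,
  & \rho(0, x) &= \delta(x - x_0), \\
  % The delta function follows the nonlinear trajectory
  \rho(t, x) &= \delta\bigl(x - x(t)\bigr),
  & \int G(x)\, \rho(t, x)\, \mathrm{d}x &= G\bigl(x(t)\bigr), \\
  % Koopman-von Neumann representation: |psi|^2 = rho, anti-Hermitian generator
  \partial_t \psi + F(x) \cdot \nabla_x \psi
    + \tfrac{1}{2} \bigl( \nabla_x \cdot F(x) \bigr) \psi &= 0,
  & |\psi(t, x)|^2 &= \rho(t, x).
\end{align}
```

Both equations are linear in the unknown, which is what makes them candidates for quantum linear systems or Hamiltonian simulation methods; multiplying the Koopman-von Neumann equation by $\bar{\psi}$ and taking twice the real part recovers the Liouville equation for $|\psi|^2$.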

Related content

Analysis · Learning · Federated learning · state-of-the-art · CASES

October 27, 2022

A Unified Analysis of Federated Learning with Arbitrary Client Participation

Shiqiang Wang,Mingyue Ji

from arxiv, Accepted to NeurIPS 2022

Federated learning (FL) faces challenges of intermittent client availability and computation/communication efficiency. As a result, only a small subset of clients can participate in FL at a given time. It is important to understand how partial client participation affects convergence, but most existing works have either considered idealized participation patterns or obtained results with non-zero optimality error for generic patterns. In this paper, we provide a unified convergence analysis for FL with arbitrary client participation. We first introduce a generalized version of federated averaging (FedAvg) that amplifies parameter updates at an interval of multiple FL rounds. Then, we present a novel analysis that captures the effect of client participation in a single term. By analyzing this term, we obtain convergence upper bounds for a wide range of participation patterns, including both non-stochastic and stochastic cases, which match either the lower bound of stochastic gradient descent (SGD) or the state-of-the-art results in specific settings. We also discuss various insights, recommendations, and experimental results.
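The idea of amplifying parameter updates at an interval of several rounds can be illustrated with a toy sketch. This is not the authors' algorithm as published: the quadratic client losses, the function names, and the parameters `interval` and `amplify` are all assumptions made for illustration.

```python
import numpy as np

def local_update(x, center, lr, steps):
    """Run `steps` gradient steps on the toy loss 0.5 * ||x - center||^2."""
    y = x.copy()
    for _ in range(steps):
        y -= lr * (y - center)
    return y - x  # the client's accumulated update

def amplified_fedavg(centers, rounds, interval, amplify, lr=0.1,
                     local_steps=5, participation=None):
    """FedAvg on toy quadratics, amplifying the accumulated update every `interval` rounds.

    `participation(t)` is a hypothetical callback returning the client
    indices available in round t; None means all clients participate.
    """
    n, dim = centers.shape
    x = np.zeros(dim)
    x_anchor = x.copy()  # model at the start of the current interval
    for t in range(rounds):
        active = range(n) if participation is None else participation(t)
        updates = [local_update(x, centers[i], lr, local_steps) for i in active]
        if updates:
            x = x + np.mean(updates, axis=0)  # plain FedAvg aggregation
        if (t + 1) % interval == 0:
            # Amplify the change accumulated since the anchor, then reset it.
            x = x_anchor + amplify * (x - x_anchor)
            x_anchor = x.copy()
    return x
```

With full participation and `amplify=1.0` this reduces to ordinary FedAvg; passing a `participation` callback lets one experiment with cyclic or random availability patterns.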

Language modeling · MoDELS · Performer · NLU · Processing (programming language)

October 26, 2022

EW-Tune: A Framework for Privately Fine-Tuning Large Language Models with Differential Privacy

Rouzbeh Behnia,Mohammadreza Ebrahimi,Jason Pacheco,Balaji Padmanabhan

from arxiv, Accepted at IEEE ICDM Workshop on Machine Learning for Cybersecurity (MLC) 2022

Pre-trained Large Language Models (LLMs) are an integral part of modern AI that have led to breakthrough performances in complex AI tasks. Major AI companies with expensive infrastructures are able to develop and train these large models, with millions or billions of parameters, from scratch. Third parties, researchers, and practitioners are increasingly adopting these pre-trained models and fine-tuning them on their private data to accomplish their downstream AI tasks. However, it has been shown that an adversary can extract/reconstruct the exact training samples from these LLMs, which can lead to revealing personally identifiable information. The issue has raised deep concerns about the privacy of LLMs. Differential privacy (DP) provides a rigorous framework that allows adding noise in the process of training or fine-tuning LLMs such that extracting the training data becomes infeasible (i.e., with a cryptographically small success probability). While the theoretical privacy guarantees offered in most extant studies assume learning models from scratch through many training iterations in an asymptotic setting, this assumption does not hold in fine-tuning scenarios, in which the number of training iterations is significantly smaller. To address the gap, we present EW-Tune, a DP framework for fine-tuning LLMs based on the Edgeworth accountant with finite-sample privacy guarantees. Our results across four well-established natural language understanding (NLU) tasks show that while EW-Tune adds privacy guarantees to the LLM fine-tuning process, it directly contributes to decreasing the induced noise to up to 5.6% and improves the state-of-the-art LLM performance by up to 1.1% across all NLU tasks. We have open-sourced our implementations for wide adoption and public testing purposes.
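The noise-adding mechanism that DP training relies on can be sketched as a generic DP-SGD step: clip each per-sample gradient, sum, and add Gaussian noise. This is a minimal illustration only; it does not implement the Edgeworth accountant or EW-Tune itself, and the function name and parameters are assumptions.

```python
import numpy as np

def dp_sgd_step(params, per_sample_grads, lr, clip_norm, noise_multiplier, rng):
    """One noisy gradient step: clip each sample's gradient, sum, add Gaussian noise."""
    # L2 norm of every per-sample gradient (one row per sample).
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    # Scale rows down so that no gradient has norm above clip_norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale
    # Gaussian noise calibrated to the clipping bound (the sensitivity).
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(per_sample_grads)
    return params - lr * noisy_mean
```

A privacy accountant then turns `noise_multiplier`, the sampling rate, and the number of steps into an overall privacy guarantee; a tighter accountant permits less noise for the same guarantee, which is the lever the abstract describes.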

Processing (programming language) · Learning · ML · Output · Machine learning modeling

October 26, 2022

Learning to predict arbitrary quantum processes

Hsin-Yuan Huang,Sitan Chen,John Preskill

from arxiv, 10 pages + 37-page appendix

We present an efficient machine learning (ML) algorithm for predicting any unknown quantum process $\mathcal{E}$ over $n$ qubits. For a wide range of distributions $\mathcal{D}$ on arbitrary $n$-qubit states, we show that this ML algorithm can learn to predict any local property of the output from the unknown process $\mathcal{E}$, with a small average error over input states drawn from $\mathcal{D}$. The ML algorithm is computationally efficient even when the unknown process is a quantum circuit with exponentially many gates. Our algorithm combines efficient procedures for learning properties of an unknown state and for learning a low-degree approximation to an unknown observable. The analysis hinges on proving new norm inequalities, including a quantum analogue of the classical Bohnenblust-Hille inequality, which we derive by giving an improved algorithm for optimizing local Hamiltonians. Overall, our results highlight the potential for ML models to predict the output of complex quantum dynamics much faster than the time needed to run the process itself.

Discretization · Extensibility · Saddle point · Analysis · UniFormer

October 26, 2022

Quantum simulation of real-space dynamics

Andrew M. Childs,Jiaqi Leng,Tongyang Li,Jin-Peng Liu,Chenyi Zhang

Quantum simulation is a prominent application of quantum computers. While there is extensive previous work on simulating finite-dimensional systems, less is known about quantum algorithms for real-space dynamics. We conduct a systematic study of such algorithms. In particular, we show that the dynamics of a $d$-dimensional Schr\"{o}dinger equation with $\eta$ particles can be simulated with gate complexity $\tilde{O}\bigl(\eta d F \text{poly}(\log(g'/\epsilon))\bigr)$, where $\epsilon$ is the discretization error, $g'$ controls the higher-order derivatives of the wave function, and $F$ measures the time-integrated strength of the potential. Compared to the best previous results, this exponentially improves the dependence on $\epsilon$ and $g'$ from $\text{poly}(g'/\epsilon)$ to $\text{poly}(\log(g'/\epsilon))$ and polynomially improves the dependence on $T$ and $d$, while maintaining best known performance with respect to $\eta$. For the case of Coulomb interactions, we give an algorithm using $\eta^{3}(d+\eta)T\text{poly}(\log(\eta dTg'/(\Delta\epsilon)))/\Delta$ one- and two-qubit gates, and another using $\eta^{3}(4d)^{d/2}T\text{poly}(\log(\eta dTg'/(\Delta\epsilon)))/\Delta$ one- and two-qubit gates and QRAM operations, where $T$ is the evolution time and the parameter $\Delta$ regulates the unbounded Coulomb interaction. We give applications to several computational problems, including faster real-space simulation of quantum chemistry, rigorous analysis of discretization error for simulation of a uniform electron gas, and a quadratic improvement to a quantum algorithm for escaping saddle points in nonconvex optimization.

MoDELS · Language modeling · Performer · Range · Scaling

October 26, 2022

Emergent Abilities of Large Language Models

Jason Wei,Yi Tay,Rishi Bommasani,Colin Raffel,Barret Zoph,Sebastian Borgeaud,Dani Yogatama,Maarten Bosma,Denny Zhou,Donald Metzler,Ed H. Chi,Tatsunori Hashimoto,Oriol Vinyals,Percy Liang,Jeff Dean,William Fedus

from arxiv, Transactions on Machine Learning Research (TMLR), 2022

Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in smaller models but is present in larger models. Thus, emergent abilities cannot be predicted simply by extrapolating the performance of smaller models. The existence of such emergence implies that additional scaling could further expand the range of capabilities of language models.

Discretization · Networking · Neural Networks · state-of-the-art · Bootstrapping

October 25, 2022

Neuro-symbolic partial differential equation solver

Pouria Mistani,Samira Pakravan,Rajesh Ilango,Sanjay Choudhry,Frederic Gibou

from arxiv, Accepted for publication at NeurIPS 2022 (ML4PS workshop). arXiv admin note: substantial text overlap with arXiv:2210.14312

We present a highly scalable strategy for developing mesh-free neuro-symbolic partial differential equation solvers from existing numerical discretizations found in scientific computing. This strategy is unique in that it can be used to efficiently train neural network surrogate models for the solution functions and the differential operators, while retaining the accuracy and convergence properties of state-of-the-art numerical solvers. This neural bootstrapping method is based on minimizing residuals of discretized differential systems on a set of random collocation points with respect to the trainable parameters of the neural network, achieving unprecedented resolution and optimal scaling for solving physical and biological systems.

Contraction · MoDELS · Scenario · Frequentist · Maximum a posteriori

October 25, 2022

Laplace priors and spatial inhomogeneity in Bayesian inverse problems

Sergios Agapiou,Sven Wang

Spatially inhomogeneous functions, which may be smooth in some regions and rough in other regions, are modelled naturally in a Bayesian manner using so-called Besov priors which are given by random wavelet expansions with Laplace-distributed coefficients. This paper studies theoretical guarantees for such prior measures - specifically, we examine their frequentist posterior contraction rates in the setting of non-linear inverse problems with Gaussian white noise. Our results are first derived under a general local Lipschitz assumption on the forward map. We then verify the assumption for two non-linear inverse problems arising from elliptic partial differential equations, the Darcy flow model from geophysics as well as a model for the Schr\"odinger equation appearing in tomography. In the course of the proofs, we also obtain novel concentration inequalities for penalized least squares estimators with $\ell^1$ wavelet penalty, which have a natural interpretation as maximum a posteriori (MAP) estimators. The true parameter is assumed to belong to some spatially inhomogeneous Besov class $B^{\alpha}_{11}$, $\alpha>0$. In a setting with direct observations, we complement these upper bounds with a lower bound on the rate of contraction for arbitrary Gaussian priors. An immediate consequence of our results is that while Laplace priors can achieve minimax-optimal rates over $B^{\alpha}_{11}$-classes, Gaussian priors are limited to a (by a polynomial factor) slower contraction rate. This gives information-theoretical justification for the intuition that Laplace priors are more compatible with $\ell^1$ regularity structure in the underlying parameter.
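The link between Laplace priors and $\ell^1$ penalties can be seen in the simplest possible setting. The sketch below assumes direct observations $y_k = \theta_k + \varepsilon_k$ with independent Gaussian noise of variance `noise_var`, not the non-linear inverse problems studied in the paper; the function names and parameters are illustrative assumptions.

```python
import numpy as np

def soft_threshold(y, threshold):
    """Shrink each entry towards zero by `threshold`, zeroing small entries."""
    return np.sign(y) * np.maximum(np.abs(y) - threshold, 0.0)

def map_laplace(y, noise_var, rate):
    """MAP under independent Laplace priors with density proportional to exp(-rate * |theta|).

    Minimises (y - theta)^2 / (2 * noise_var) + rate * |theta| coordinate-wise,
    whose solution is soft thresholding at noise_var * rate.
    """
    return soft_threshold(np.asarray(y, dtype=float), noise_var * rate)

def map_gaussian(y, noise_var, prior_var):
    """MAP under independent N(0, prior_var) priors: linear shrinkage, never exactly zero."""
    return np.asarray(y, dtype=float) * prior_var / (prior_var + noise_var)
```

The Laplace MAP estimate sets small coefficients exactly to zero and leaves large ones nearly untouched, whereas the Gaussian MAP estimate shrinks every coefficient by the same factor. This is one informal way to read the abstract's remark about compatibility with $\ell^1$ regularity structure.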

MoDELS · Linear · Inference · Extensibility · CASES

October 25, 2022

Intuitive Joint Priors for Bayesian Linear Multilevel Models: The R2D2M2 prior

Javier Enrique Aguilar,Paul-Christian Bürkner

from arxiv, 60 pages, 21 figures, 9 tables

The training of high-dimensional regression models on comparably sparse data is an important yet complicated topic, especially when there are many more model parameters than observations in the data. From a Bayesian perspective, inference in such cases can be achieved with the help of shrinkage prior distributions, at least for generalized linear models. However, real-world data usually possess multilevel structures, such as repeated measurements or natural groupings of individuals, which existing shrinkage priors are not built to deal with. We generalize and extend one of these priors, the R2D2 prior by Zhang et al. (2020), to linear multilevel models leading to what we call the R2D2M2 prior. The proposed prior enables both local and global shrinkage of the model parameters. It comes with interpretable hyperparameters, which we show to be intrinsically related to vital properties of the prior, such as rates of concentration around the origin, tail behavior, and amount of shrinkage the prior exerts. We offer guidelines on how to select the prior's hyperparameters by deriving shrinkage factors and measuring the effective number of non-zero model coefficients. Hence, the user can readily evaluate and interpret the amount of shrinkage implied by a specific choice of hyperparameters. Finally, we perform extensive experiments on simulated and real data, showing that our inference procedure for the prior is well calibrated, has desirable global and local regularization properties and enables the reliable and interpretable estimation of much more complex Bayesian multilevel models than was previously possible.

INFORMS · Statistic · Functional · Mutually independent · Score function

October 24, 2022

Rates of Fisher information convergence in the central limit theorem for nonlinear statistics

Nguyen Tien Dung

from arxiv, 39 pages. Added Theorems 5.1-5.2

We develop a general method to study the Fisher information distance in the central limit theorem for nonlinear statistics. We first construct completely new representations for the score function. We then use these representations to derive quantitative estimates for the Fisher information distance. To illustrate the applicability of our approach, we provide explicit rates of Fisher information convergence for quadratic forms and functions of sample means. For sums of independent random variables, we obtain Fisher information bounds without requiring the finiteness of the Poincar\'e constant. Our method can also be used to bound the Fisher information distance in non-central limit theorems.

Estimation/estimator · Estimation error · MoDELS · Learned · Unbiased

December 17, 2020

The Causal Learning of Retail Delinquency

Yiyan Huang,Cheuk Hang Leung,Xing Yan,Qi Wu,Nanbo Peng,Dongdong Wang,Zhixiang Huang

from arxiv, This paper was accepted and will be published in the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects, and hence the estimation error can be very large. We therefore propose another approach to constructing the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of the classical estimators and the proposed estimators in estimating the causal quantities. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction in estimation error is substantial if the causal effects are accounted for correctly.