人人干人人摸人人操,亚洲国产一区二区精品91,日本欧美一区二区免费不卡,国产一中文一无码一日韩

2023 年 6 月 8 日

Learning the random variables in Monte Carlo simulations with stochastic gradient descent: Machine learning for parametric PDEs and financial derivative pricing

Sebastian Becker,Arnulf Jentzen,Marvin S. Müller,Philippe von Wurstemberger

from arxiv, 71 pages, 4 Figures, 14 Tables; to appear in Math. Finance

In financial engineering, prices of financial products are computed approximately many times each trading day with (slightly) different parameters in each calculation. In many financial models such prices can be approximated by means of Monte Carlo (MC) simulations. To obtain a good approximation the MC sample size usually needs to be considerably large resulting in a long computing time to obtain a single approximation. In this paper we introduce a new approximation strategy for parametric approximation problems including the parametric financial pricing problems described above. A central aspect of the approximation strategy proposed in this article is to combine MC algorithms with machine learning techniques to, roughly speaking, learn the random variables (LRV) in MC simulations. In other words, we employ stochastic gradient descent (SGD) optimization methods not to train parameters of standard artificial neural networks (ANNs) but to learn random variables appearing in MC approximations. We numerically test the LRV strategy on various parametric problems with convincing results when compared with standard MC simulations, Quasi-Monte Carlo simulations, SGD-trained shallow ANNs, and SGD-trained deep ANNs. Our numerical simulations strongly indicate that the LRV strategy might be capable to overcome the curse of dimensionality in the $L^\infty$-norm in several cases where the standard deep learning approach has been proven not to be able to do so. This is not a contradiction to lower bounds established in the scientific literature because this new LRV strategy is outside of the class of algorithms for which lower bounds have been established in the scientific literature. The proposed LRV strategy is of general nature and not only restricted to the parametric financial pricing problems described above, but applicable to a large class of approximation problems.

相關內容

近似

關注 0

泛函 · 優化器 · 樣本 · 相互獨立的 · 維數災難 ·

2023 年 8 月 1 日

Convergence Rates for Non-Log-Concave Sampling and Log-Partition Estimation

David Holzmüller,Francis Bach

from arxiv, Changes in v3: Minor corrections and improvements. Plots can be reproduced using the code at //github.com/dholzmueller/sampling_experiments

Sampling from Gibbs distributions $p(x) \propto \exp(-V(x)/\varepsilon)$ and computing their log-partition function are fundamental tasks in statistics, machine learning, and statistical physics. However, while efficient algorithms are known for convex potentials $V$, the situation is much more difficult in the non-convex case, where algorithms necessarily suffer from the curse of dimensionality in the worst case. For optimization, which can be seen as a low-temperature limit of sampling, it is known that smooth functions $V$ allow faster convergence rates. Specifically, for $m$-times differentiable functions in $d$ dimensions, the optimal rate for algorithms with $n$ function evaluations is known to be $O(n^{-m/d})$, where the constant can potentially depend on $m, d$ and the function to be optimized. Hence, the curse of dimensionality can be alleviated for smooth functions at least in terms of the convergence rate. Recently, it has been shown that similarly fast rates can also be achieved with polynomial runtime $O(n^{3.5})$, where the exponent $3.5$ is independent of $m$ or $d$. Hence, it is natural to ask whether similar rates for sampling and log-partition computation are possible, and whether they can be realized in polynomial time with an exponent independent of $m$ and $d$. We show that the optimal rates for sampling and log-partition computation are sometimes equal and sometimes faster than for optimization. We then analyze various polynomial-time sampling algorithms, including an extension of a recent promising optimization approach, and find that they sometimes exhibit interesting behavior but no near-optimal rates. Our results also give further insights on the relation between sampling, log-partition, and optimization problems.

蒙特卡羅 · 蒙特卡羅方法 · Processing（編程語言） · Lipschitz · 泛函 ·

2023 年 7 月 31 日

A multilevel Monte Carlo algorithm for SDEs driven by countably dimensional Wiener process and Poisson random measure

Micha? Sobieraj

from arxiv, 22 pages

In this paper, we investigate the properties of standard and multilevel Monte Carlo methods for weak approximation of solutions of stochastic differential equations (SDEs) driven by the infinite-dimensional Wiener process and Poisson random measure with the Lipschitz payoff function. The error of the truncated dimension randomized numerical scheme, which is determined by two parameters, i.e grid density $n \in \mathbb{N}_{+}$ and truncation dimension parameter $M \in \mathbb{N}_{+},$ is of the order $n^{-1/2}+\delta(M)$ such that $\delta(\cdot)$ is positive and decreasing to $0$. The paper introduces the complexity model and provides proof for the upper complexity bound of the multilevel Monte Carlo method which depends on two increasing sequences of parameters for both $n$ and $M.$ The complexity is measured in terms of upper bound for mean-squared error and compared with the complexity of the standard Monte Carlo algorithm. The results from numerical experiments as well as Python and CUDA C implementation are also reported.

推斷 · 統計量 · 求逆 · Networking · INFORMS ·

2023 年 7 月 31 日

Variational Inverting Network for Statistical Inverse Problems of Partial Differential Equations

Junxiong Jia,Yanni Wu,Peijun Li,Deyu Meng

from arxiv, 61 pages

To quantify uncertainties in inverse problems of partial differential equations (PDEs), we formulate them into statistical inference problems using Bayes' formula. Recently, well-justified infinite-dimensional Bayesian analysis methods have been developed to construct dimension-independent algorithms. However, there are three challenges for these infinite-dimensional Bayesian methods: prior measures usually act as regularizers and are not able to incorporate prior information efficiently; complex noises, such as more practical non-i.i.d. distributed noises, are rarely considered; and time-consuming forward PDE solvers are needed to estimate posterior statistical quantities. To address these issues, an infinite-dimensional inference framework has been proposed based on the infinite-dimensional variational inference method and deep generative models. Specifically, by introducing some measure equivalence assumptions, we derive the evidence lower bound in the infinite-dimensional setting and provide possible parametric strategies that yield a general inference framework called the Variational Inverting Network (VINet). This inference framework can encode prior and noise information from learning examples. In addition, relying on the power of deep neural networks, the posterior mean and variance can be efficiently and explicitly generated in the inference stage. In numerical experiments, we design specific network structures that yield a computable VINet from the general inference framework. Numerical examples of linear inverse problems of an elliptic equation and the Helmholtz equation are presented to illustrate the effectiveness of the proposed inference framework.

奇異的 · 向量空間 · 統計量 · 估計/估計量 · SimPLe ·

2023 年 7 月 31 日

Non-centered parametric variational Bayes' approach for hierarchical inverse problems of partial differential equations

Jiaming Sui,Junxiong Jia

from arxiv, 43 pages

This paper proposes a non-centered parameterization based infinite-dimensional mean-field variational inference (NCP-iMFVI) approach for solving the hierarchical Bayesian inverse problems. This method can generate available estimates from the approximated posterior distribution efficiently. To avoid the mutually singular obstacle that occurred in the infinite-dimensional hierarchical approach, we propose a rigorous theory of the non-centered variational Bayesian approach. Since the non-centered parameterization weakens the connection between the parameter and the hyper-parameter, we can introduce the hyper-parameter to all terms of the eigendecomposition of the prior covariance operator. We also show the relationships between the NCP-iMFVI and infinite-dimensional hierarchical approaches with centered parameterization. The proposed algorithm is applied to three inverse problems governed by the simple smooth equation, the Helmholtz equation, and the steady-state Darcy flow equation. Numerical results confirm our theoretical findings, illustrate the efficiency of solving the iMFVI problem formulated by large-scale linear and nonlinear statistical inverse problems, and verify the mesh-independent property.

Networking · Neural Networks · 泛函 · 近似 · 可約的 ·

2023 年 7 月 30 日

On Neural Network approximation of ideal adversarial attack and convergence of adversarial training

Rajdeep Haldar,Qifan Song

Adversarial attacks are usually expressed in terms of a gradient-based operation on the input data and model, this results in heavy computations every time an attack is generated. In this work, we solidify the idea of representing adversarial attacks as a trainable function, without further gradient computation. We first motivate that the theoretical best attacks, under proper conditions, can be represented as smooth piece-wise functions (piece-wise H\"older functions). Then we obtain an approximation result of such functions by a neural network. Subsequently, we emulate the ideal attack process by a neural network and reduce the adversarial training to a mathematical game between an attack network and a training model (a defense network). We also obtain convergence rates of adversarial loss in terms of the sample size $n$ for adversarial training in such a setting.

隨機動力系統 · 線性的 · 動力系統 · 離散化 · 縮放 ·

2023 年 7 月 29 日

Efficient Quantum Algorithms for Nonlinear Stochastic Dynamical Systems

Abeynaya Gnanasekaran,Amit Surana,Tuhin Sahai

from arxiv, IEEE International Conference on Quantum Computing and Engineering (QCE23)

In this paper, we propose efficient quantum algorithms for solving nonlinear stochastic differential equations (SDE) via the associated Fokker-Planck equation (FPE). We discretize the FPE in space and time using two well-known numerical schemes, namely Chang-Cooper and implicit finite difference. We then compute the solution of the resulting system of linear equations using the quantum linear systems algorithm. We present detailed error and complexity analyses for both these schemes and demonstrate that our proposed algorithms, under certain conditions, provably compute the solution to the FPE within prescribed $\epsilon$ error bounds with polynomial dependence on state dimension $d$. Classical numerical methods scale exponentially with dimension, thus, our approach, under the aforementioned conditions, provides an \emph{exponential speed-up} over traditional approaches.

分段 · 非凸 · INFORMS · 有向 · 經驗風險最小化 ·

2023 年 7 月 29 日

Data-driven Piecewise Affine Decision Rules for Stochastic Programming with Covariate Information

Yiyang Zhang,Junyi Liu,Xiaobo Zhao

Focusing on stochastic programming (SP) with covariate information, this paper proposes an empirical risk minimization (ERM) method embedded within a nonconvex piecewise affine decision rule (PADR), which aims to learn the direct mapping from features to optimal decisions. We establish the nonasymptotic consistency result of our PADR-based ERM model for unconstrained problems and asymptotic consistency result for constrained ones. To solve the nonconvex and nondifferentiable ERM problem, we develop an enhanced stochastic majorization-minimization algorithm and establish the asymptotic convergence to (composite strong) directional stationarity along with complexity analysis. We show that the proposed PADR-based ERM method applies to a broad class of nonconvex SP problems with theoretical consistency guarantees and computational tractability. Our numerical study demonstrates the superior performance of PADR-based ERM methods compared to state-of-the-art approaches under various settings, with significantly lower costs, less computation time, and robustness to feature dimensions and nonlinearity of the underlying dependency.

估計/估計量 · 隨機場 · 統計量 · binary · 估計誤差 ·

2023 年 7 月 28 日

On the perimeter estimation of pixelated excursion sets of 2D anisotropic random fields

Ryan Cotsakis,Elena Di Bernardino,Thomas Opitz

from arxiv, 35 pages, 17 figures

We are interested in creating statistical methods to provide informative summaries of random fields through the geometry of their excursion sets. To this end, we introduce an estimator for the length of the perimeter of excursion sets of random fields on $\mathbb{R}^2$ observed over regular square tilings. The proposed estimator acts on the empirically accessible binary digital images of the excursion regions and computes the length of a piecewise linear approximation of the excursion boundary. The estimator is shown to be consistent as the pixel size decreases, without the need of any normalization constant, and with neither assumption of Gaussianity nor isotropy imposed on the underlying random field. In this general framework, even when the domain grows to cover $\mathbb{R}^2$, the estimation error is shown to be of smaller order than the side length of the domain. For affine, strongly mixing random fields, this translates to a multivariate Central Limit Theorem for our estimator when multiple levels are considered simultaneously. Finally, we conduct several numerical studies to investigate statistical properties of the proposed estimator in the finite-sample data setting.

蒙特卡羅 · Extensibility · 泰勒級數 · Processing（編程語言） · 泰勒 ·

2023 年 7 月 28 日

Stochastic automatic differentiation for Monte Carlo processes

Guilherme Catumba,Alberto Ramos,Bryan Zaldivar

from arxiv, 21 pages, 5 images, 2 tables

Monte Carlo methods represent a cornerstone of computer science. They allow to sample high dimensional distribution functions in an efficient way. In this paper we consider the extension of Automatic Differentiation (AD) techniques to Monte Carlo process, addressing the problem of obtaining derivatives (and in general, the Taylor series) of expectation values. Borrowing ideas from the lattice field theory community, we examine two approaches. One is based on reweighting while the other represents an extension of the Hamiltonian approach typically used by the Hybrid Monte Carlo (HMC) and similar algorithms. We show that the Hamiltonian approach can be understood as a change of variables of the reweighting approach, resulting in much reduced variances of the coefficients of the Taylor series. This work opens the door to find other variance reduction techniques for derivatives of expectation values.

Networking · 學成 · Principle · MoDELS · Networks ·

2021 年 6 月 18 日

The Principles of Deep Learning Theory

Daniel A. Roberts,Sho Yaida,Boris Hanin

from arxiv, 451 pages, to be published by Cambridge University Press

This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.