销魂美女一区二区三区AV-国产一区二区三区日本韩国

ReLU · 近似 · 泄漏修正線性單元/泄漏整流線性單元 · Softplus · Lipschitz ·

2023 年 9 月 24 日

Deep neural networks with ReLU, leaky ReLU, and softplus activation provably overcome the curse of dimensionality for Kolmogorov partial differential equations with Lipschitz nonlinearities in the $L^p$-sense

Julia Ackermann,Arnulf Jentzen,Thomas Kruse,Benno Kuckuck,Joshua Lee Padgett

from arxiv, 52 pages

Recently, several deep learning (DL) methods for approximating high-dimensional partial differential equations (PDEs) have been proposed. The interest that these methods have generated in the literature is in large part due to simulations which appear to demonstrate that such DL methods have the capacity to overcome the curse of dimensionality (COD) for PDEs in the sense that the number of computational operations they require to achieve a certain approximation accuracy $\varepsilon\in(0,\infty)$ grows at most polynomially in the PDE dimension $d\in\mathbb N$ and the reciprocal of $\varepsilon$. While there is thus far no mathematical result that proves that one of such methods is indeed capable of overcoming the COD, there are now a number of rigorous results in the literature that show that deep neural networks (DNNs) have the expressive power to approximate PDE solutions without the COD in the sense that the number of parameters used to describe the approximating DNN grows at most polynomially in both the PDE dimension $d\in\mathbb N$ and the reciprocal of the approximation accuracy $\varepsilon>0$. Roughly speaking, in the literature it is has been proved for every $T>0$ that solutions $u_d\colon [0,T]\times\mathbb R^d\to \mathbb R$, $d\in\mathbb N$, of semilinear heat PDEs with Lipschitz continuous nonlinearities can be approximated by DNNs with ReLU activation at the terminal time in the $L^2$-sense without the COD provided that the initial value functions $\mathbb R^d\ni x\mapsto u_d(0,x)\in\mathbb R$, $d\in\mathbb N$, can be approximated by ReLU DNNs without the COD. It is the key contribution of this work to generalize this result by establishing this statement in the $L^p$-sense with $p\in(0,\infty)$ and by allowing the activation function to be more general covering the ReLU, the leaky ReLU, and the softplus activation functions as special cases.

相關內容

ReLU

關注 0

可約的 · 平滑 · MoDELS · INFORMS · Networking ·

2023 年 11 月 8 日

Physics informed machine learning with Smoothed Particle Hydrodynamics: Hierarchy of reduced Lagrangian models of turbulence

Michael Woodward,Yifeng Tian,Criston Hyett,Chris Fryer,Daniel Livescu,Mikhail Stepanov,Michael Chertkov

from arxiv, Phys. Rev. Fluids 8, 054602. Published 5 May 2023

Building efficient, accurate and generalizable reduced order models of developed turbulence remains a major challenge. This manuscript approaches this problem by developing a hierarchy of parameterized reduced Lagrangian models for turbulent flows, and investigates the effects of enforcing physical structure through Smoothed Particle Hydrodynamics (SPH) versus relying on neural networks (NN)s as universal function approximators. Starting from Neural Network (NN) parameterizations of a Lagrangian acceleration operator, this hierarchy of models gradually incorporates a weakly compressible and parameterized SPH framework, which enforces physical symmetries, such as Galilean, rotational and translational invariances. Within this hierarchy, two new parameterized smoothing kernels are developed in order to increase the flexibility of the learn-able SPH simulators. For each model we experiment with different loss functions which are minimized using gradient based optimization, where efficient computations of gradients are obtained by using Automatic Differentiation (AD) and Sensitivity Analysis (SA). Each model within the hierarchy is trained on two data sets associated with weekly compressible Homogeneous Isotropic Turbulence (HIT): (1) a validation set using weakly compressible SPH; and (2) a high fidelity set from Direct Numerical Simulations (DNS). Numerical evidence shows that encoding more SPH structure improves generalizability to different turbulent Mach numbers and time shifts, and that including the novel parameterized smoothing kernels improves the accuracy of SPH at the resolved scales.

Processing（編程語言） · NVM · 表示 · 蒙特卡羅 · 推斷 ·

2023 年 11 月 7 日

Generalised shot noise representations of stochastic systems driven by non-Gaussian Lévy processes

Marcos Tapia Costa,Ioannis Kontoyiannis,Simon Godsill

from arxiv, 34 pages, 14 figures

We consider the problem of obtaining effective representations for the solutions of linear, vector-valued stochastic differential equations (SDEs) driven by non-Gaussian pure-jump L\'evy processes, and we show how such representations lead to efficient simulation methods. The processes considered constitute a broad class of models that find application across the physical and biological sciences, mathematics, finance and engineering. Motivated by important relevant problems in statistical inference, we derive new, generalised shot-noise simulation methods whenever a normal variance-mean (NVM) mixture representation exists for the driving L\'evy process, including the generalised hyperbolic, normal-Gamma, and normal tempered stable cases. Simple, explicit conditions are identified for the convergence of the residual of a truncated shot-noise representation to a Brownian motion in the case of the pure L\'evy process, and to a Brownian-driven SDE in the case of the L\'evy-driven SDE. These results provide Gaussian approximations to the small jumps of the process under the NVM representation. The resulting representations are of particular importance in state inference and parameter estimation for L\'evy-driven SDE models, since the resulting conditionally Gaussian structures can be readily incorporated into latent variable inference methods such as Markov chain Monte Carlo (MCMC), Expectation-Maximisation (EM), and sequential Monte Carlo.

Networking · Neural Networks · 近似 · 縮放 · 模型評估 ·

2023 年 11 月 7 日

Enhanced physics-informed neural networks with domain scaling and residual correction methods for multi-frequency elliptic problems

Deok-Kyu Jang,Hyea Hyun Kim,Kyungsoo Kim

In this paper, neural network approximation methods are developed for elliptic partial differential equations with multi-frequency solutions. Neural network work approximation methods have advantages over classical approaches in that they can be applied without much concerns on the form of the differential equations or the shape or dimension of the problem domain. When applied to problems with multi-frequency solutions, the performance and accuracy of neural network approximation methods are strongly affected by the contrast of the high- and low-frequency parts in the solutions. To address this issue, domain scaling and residual correction methods are proposed. The efficiency and accuracy of the proposed methods are demonstrated for multi-frequency model problems.

塊 · 非凸 · 優化器 · 正則化項 · Projection ·

2023 年 11 月 7 日

Block majorization-minimization with diminishing radius for constrained nonconvex optimization

Hanbaek Lyu,Yuchen Li

from arxiv, 31 pages, 6 figures. The hyperlinks to equations in the previous version were not compiled correctly. Fixed in the new version

Block majorization-minimization (BMM) is a simple iterative algorithm for nonconvex constrained optimization that sequentially minimizes majorizing surrogates of the objective function in each block coordinate while the other coordinates are held fixed. BMM entails a large class of optimization algorithms such as block coordinate descent and its proximal-point variant, expectation-minimization, and block projected gradient descent. We establish that for general constrained nonconvex optimization, BMM with strongly convex surrogates can produce an $\epsilon$-stationary point within $O(\epsilon^{-2}(\log \epsilon^{-1})^{2})$ iterations and asymptotically converges to the set of stationary points. Furthermore, we propose a trust-region variant of BMM that can handle surrogates that are only convex and still obtain the same iteration complexity and asymptotic stationarity. These results hold robustly even when the convex sub-problems are inexactly solved as long as the optimality gaps are summable. As an application, we show that a regularized version of the celebrated multiplicative update algorithm for nonnegative matrix factorization by Lee and Seung has iteration complexity of $O(\epsilon^{-2}(\log \epsilon^{-1})^{2})$. The same result holds for a wide class of regularized nonnegative tensor decomposition algorithms as well as the classical block projected gradient descent algorithm. These theoretical results are validated through various numerical experiments.

學習器 · Performer · 集成 · CASES · Guidance ·

2023 年 11 月 6 日

Practical considerations for variable screening in the Super Learner

Brian D. Williamson,Drew King,Ying Huang

from arxiv, 14 pages, 4 figures, 1 table

Estimating a prediction function is a fundamental component of many data analyses. The Super Learner ensemble, a particular implementation of stacking, has desirable theoretical properties and has been used successfully in many applications. Dimension reduction can be accomplished by using variable screening algorithms, including the lasso, within the ensemble prior to fitting other prediction algorithms. However, the performance of a Super Learner using the lasso for dimension reduction has not been fully explored in cases where the lasso is known to perform poorly. We provide empirical results that suggest that a diverse set of candidate screening algorithms should be used to protect against poor performance of any one screen, similar to the guidance for choosing a library of prediction algorithms for the Super Learner.

多峰值 · Learning · 無監督 · 可約的 · 泛化理論 ·

2023 年 11 月 6 日

Imaging through multimode fibres with physical prior

Chuncheng Zhang,Yingjie Shi,Zheyi Yao,Xiubao Sui,Qian Cheng

Imaging through perturbed multimode fibres based on deep learning has been widely researched. However, existing methods mainly use target-speckle pairs in different configurations. It is challenging to reconstruct targets without trained networks. In this paper, we propose a physics-assisted, unsupervised, learning-based fibre imaging scheme. The role of the physical prior is to simplify the mapping relationship between the speckle pattern and the target image, thereby reducing the computational complexity. The unsupervised network learns target features according to the optimized direction provided by the physical prior. Therefore, the reconstruction process of the online learning only requires a few speckle patterns and unpaired targets. The proposed scheme also increases the generalization ability of the learning-based method in perturbed multimode fibres. Our scheme has the potential to extend the application of multimode fibre imaging.

Networking · 代價 · Markov · 分解的 · 相關系數 ·

2023 年 11 月 3 日

Dynamic Redundancy-aware Blockchain-based Partial Computation Offloading for the Metaverse in In-network Computing

Ibrahim Aliyu,Cho-Rong Yu,Tai-Won Um,Jinsul Kim

The computing in the network (COIN) paradigm has emerged as a potential solution for computation-intensive applications like the metaverse by utilizing unused network resources. The blockchain (BC) guarantees task-offloading privacy, but cost reduction, queueing delays, and redundancy elimination remain open problems. This paper presents a redundancy-aware BC-based approach for the metaverse's partial computation offloading (PCO). Specifically, we formulate a joint BC redundancy factor (BRF) and PCO problem to minimize computation costs, maximize incentives, and meet delay and BC offloading constraints. We proved this problem is NP-hard and transformed it into two subproblems based on their temporal correlation: real-time PCO and Markov decision process-based BRF. We formulated the PCO problem as a multiuser game, proposed a decentralized algorithm for Nash equilibrium under any BC redundancy state, and designed a double deep Q-network-based algorithm for the optimal BRF policy. The BRF strategy is updated periodically based on user computation demand and network status to assist the PCO algorithm. The experimental results suggest that the proposed approach outperforms existing schemes, resulting in a remarkable 47% reduction in cost overhead, delivering approximately 64% higher rewards, and achieving convergence in just a few training episodes.

估計/估計量 · 泛函 · 在線 · Performer · 統計量 ·

2023 年 11 月 3 日

Online non-parametric likelihood-ratio estimation by Pearson-divergence functional minimization

Alejandro de la Concha,Nicolas Vayatis,Argyris Kalogeratos

Quantifying the difference between two probability density functions, $p$ and $q$, using available data, is a fundamental problem in Statistics and Machine Learning. A usual approach for addressing this problem is the likelihood-ratio estimation (LRE) between $p$ and $q$, which -- to our best knowledge -- has been investigated mainly for the offline case. This paper contributes by introducing a new framework for online non-parametric LRE (OLRE) for the setting where pairs of iid observations $(x_t \sim p, x'_t \sim q)$ are observed over time. The non-parametric nature of our approach has the advantage of being agnostic to the forms of $p$ and $q$. Moreover, we capitalize on the recent advances in Kernel Methods and functional minimization to develop an estimator that can be efficiently updated online. We provide theoretical guarantees for the performance of the OLRE method along with empirical validation in synthetic experiments.

Networking · 磁流變材料 · 泛函 · MoDELS · 值域 ·

2023 年 11 月 3 日

Simulation of acquisition shifts in T2 Flair MR images to stress test AI segmentation networks

Christiane Posselt,Mehmet Yigit Avci,Mehmet Yigitsoy,Patrick Schünke,Christoph Kolbitsch,Tobias Sch?ffter,Stefanie Remmele

from arxiv, 33 pages, 10 figures The paper was submitted to SPIE Journal of Medical Imaging

Purpose: To provide a simulation framework for routine neuroimaging test data, which allows for "stress testing" of deep segmentation networks against acquisition shifts that commonly occur in clinical practice for T2 weighted (T2w) fluid attenuated inversion recovery (FLAIR) Magnetic Resonance Imaging (MRI) protocols. Approach: The approach simulates "acquisition shift derivatives" of MR images based on MR signal equations. Experiments comprise the validation of the simulated images by real MR scans and example stress tests on state-of-the-art MS lesion segmentation networks to explore a generic model function to describe the F1 score in dependence of the contrast-affecting sequence parameters echo time (TE) and inversion time (TI). Results: The differences between real and simulated images range up to 19 % in gray and white matter for extreme parameter settings. For the segmentation networks under test the F1 score dependency on TE and TI can be well described by quadratic model functions (R^2 > 0.9). The coefficients of the model functions indicate that changes of TE have more influence on the model performance than TI. Conclusions: We show that these deviations are in the range of values as may be caused by erroneous or individual differences of relaxation times as described by literature. The coefficients of the F1 model function allow for quantitative comparison of the influences of TE and TI. Limitations arise mainly from tissues with the low baseline signal (like CSF) and when the protocol contains contrast-affecting measures that cannot be modelled due to missing information in the DICOM header.

語音增強 · 估計/估計量 · MoDELS · 損失函數（機器學習） · Performer ·

2019 年 3 月 7 日

Phase-aware Speech Enhancement with Deep Complex U-Net

Hyeong-Seok Choi,Jang-Hyun Kim,Jaesung Huh,Adrian Kim,Jung-Woo Ha,Kyogu Lee

from arxiv, Accepted paper at International Conference on Learning Representations (ICLR) 2019

Most deep learning-based models for speech enhancement have mainly focused on estimating the magnitude of spectrogram while reusing the phase from noisy speech for reconstruction. This is due to the difficulty of estimating the phase of clean speech. To improve speech enhancement performance, we tackle the phase estimation problem in three ways. First, we propose Deep Complex U-Net, an advanced U-Net structured model incorporating well-defined complex-valued building blocks to deal with complex-valued spectrograms. Second, we propose a polar coordinate-wise complex-valued masking method to reflect the distribution of complex ideal ratio masks. Third, we define a novel loss function, weighted source-to-distortion ratio (wSDR) loss, which is designed to directly correlate with a quantitative evaluation measure. Our model was evaluated on a mixture of the Voice Bank corpus and DEMAND database, which has been widely used by many deep learning models for speech enhancement. Ablation experiments were conducted on the mixed dataset showing that all three proposed approaches are empirically valid. Experimental results show that the proposed method achieves state-of-the-art performance in all metrics, outperforming previous approaches by a large margin.