一级a视频免费一区二区,久久人人爽人人爽人人片69AV,久久伊人精品影院一本到综合,亚洲婷婷久久播AV一区二区三区

控制器 · 散度 · 路徑 · Neural Networks · Networks ·

2023 年 1 月 29 日

Solving high-dimensional Hamilton-Jacobi-Bellman PDEs using neural networks: perspectives from the theory of controlled diffusions and measures on path space

Nikolas Nüsken,Lorenz Richter

Optimal control of diffusion processes is intimately connected to the problem of solving certain Hamilton-Jacobi-Bellman equations. Building on recent machine learning inspired approaches towards high-dimensional PDEs, we investigate the potential of $\textit{iterative diffusion optimisation}$ techniques, in particular considering applications in importance sampling and rare event simulation, and focusing on problems without diffusion control, with linearly controlled drift and running costs that depend quadratically on the control. More generally, our methods apply to nonlinear parabolic PDEs with a certain shift invariance. The choice of an appropriate loss function being a central element in the algorithmic design, we develop a principled framework based on divergences between path measures, encompassing various existing methods. Motivated by connections to forward-backward SDEs, we propose and study the novel $\textit{log-variance}$ divergence, showing favourable properties of corresponding Monte Carlo estimators. The promise of the developed approach is exemplified by a range of high-dimensional and metastable numerical examples.

相關內容

控制器

關注 5

Extensibility · 可約的 · Principle · state-of-the-art · Performer ·

2023 年 3 月 21 日

Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective

Yuexiao Ma,Huixia Li,Xiawu Zheng,Xuefeng Xiao,Rui Wang,Shilei Wen,Xin Pan,Fei Chao,Rongrong Ji

from arxiv, Accepted by CVPR 2023

Post-training quantization (PTQ) is widely regarded as one of the most efficient compression methods practically, benefitting from its data privacy and low computation costs. We argue that an overlooked problem of oscillation is in the PTQ methods. In this paper, we take the initiative to explore and present a theoretical proof to explain why such a problem is essential in PTQ. And then, we try to solve this problem by introducing a principled and generalized framework theoretically. In particular, we first formulate the oscillation in PTQ and prove the problem is caused by the difference in module capacity. To this end, we define the module capacity (ModCap) under data-dependent and data-free scenarios, where the differentials between adjacent modules are used to measure the degree of oscillation. The problem is then solved by selecting top-k differentials, in which the corresponding modules are jointly optimized and quantized. Extensive experiments demonstrate that our method successfully reduces the performance drop and is generalized to different neural networks and PTQ methods. For example, with 2/4 bit ResNet-50 quantization, our method surpasses the previous state-of-the-art method by 1.9%. It becomes more significant on small model quantization, e.g. surpasses BRECQ method by 6.61% on MobileNetV2*0.5.

Integration · Networking · 平滑 · 泛函 · Neural Networks ·

2023 年 3 月 21 日

Adaptive quadratures for nonlinear approximation of low-dimensional PDEs using smooth neural networks

Alexandre Magueresse,Santiago Badia

Physics-informed neural networks (PINNs) and their variants have recently emerged as alternatives to traditional partial differential equation (PDE) solvers, but little literature has focused on devising accurate numerical integration methods for neural networks (NNs), which is essential for getting accurate solutions. In this work, we propose adaptive quadratures for the accurate integration of neural networks and apply them to loss functions appearing in low-dimensional PDE discretisations. We show that at opposite ends of the spectrum, continuous piecewise linear (CPWL) activation functions enable one to bound the integration error, while smooth activations ease the convergence of the optimisation problem. We strike a balance by considering a CPWL approximation of a smooth activation function. The CPWL activation is used to obtain an adaptive decomposition of the domain into regions where the network is almost linear, and we derive an adaptive global quadrature from this mesh. The loss function is then obtained by evaluating the smooth network (together with other quantities, e.g., the forcing term) at the quadrature points. We propose a method to approximate a class of smooth activations by CPWL functions and show that it has a quadratic convergence rate. We then derive an upper bound for the overall integration error of our proposed adaptive quadrature. The benefits of our quadrature are evaluated on a strong and weak formulation of the Poisson equation in dimensions one and two. Our numerical experiments suggest that compared to Monte-Carlo integration, our adaptive quadrature makes the convergence of NNs quicker and more robust to parameter initialisation while needing significantly fewer integration points and keeping similar training times.

PDE · Principle · Performer · 離散化 · FAST ·

2023 年 3 月 20 日

Message Passing Neural PDE Solvers

Johannes Brandstetter,Daniel Worrall,Max Welling

from arxiv, Published at ICLR 2022 (Spotlight paper), Github: //github.com/brandstetter-johannes/MP-Neural-PDE-Solvers

The numerical solution of partial differential equations (PDEs) is difficult, having led to a century of research so far. Recently, there have been pushes to build neural--numerical hybrid solvers, which piggy-backs the modern trend towards fully end-to-end learned systems. Most works so far can only generalize over a subset of properties to which a generic solver would be faced, including: resolution, topology, geometry, boundary conditions, domain discretization regularity, dimensionality, etc. In this work, we build a solver, satisfying these properties, where all the components are based on neural message passing, replacing all heuristically designed components in the computation graph with backprop-optimized neural function approximators. We show that neural message passing solvers representationally contain some classical methods, such as finite differences, finite volumes, and WENO schemes. In order to encourage stability in training autoregressive models, we put forward a method that is based on the principle of zero-stability, posing stability as a domain adaptation problem. We validate our method on various fluid-like flow problems, demonstrating fast, stable, and accurate performance across different domain topologies, equation parameters, discretizations, etc., in 1D and 2D.

Networking · Neural Networks · 損失函數（機器學習） · 泛函 · ForCES ·

2023 年 3 月 20 日

Bi-orthogonal fPINN: A physics-informed neural network method for solving time-dependent stochastic fractional PDEs

Lei Ma,Rong xin Li,Fanhai Zeng,Ling Guo,George Em Karniadakis

Fractional partial differential equations (FPDEs) can effectively represent anomalous transport and nonlocal interactions. However, inherent uncertainties arise naturally in real applications due to random forcing or unknown material properties. Mathematical models considering nonlocal interactions with uncertainty quantification can be formulated as stochastic fractional partial differential equations (SFPDEs). There are many challenges in solving SFPDEs numerically, especially for long-time integration since such problems are high-dimensional and nonlocal. Here, we combine the bi-orthogonal (BO) method for representing stochastic processes with physics-informed neural networks (PINNs) for solving partial differential equations to formulate the bi-orthogonal PINN method (BO-fPINN) for solving time-dependent SFPDEs. Specifically, we introduce a deep neural network for the stochastic solution of the time-dependent SFPDEs, and include the BO constraints in the loss function following a weak formulation. Since automatic differentiation is not currently applicable to fractional derivatives, we employ discretization on a grid to to compute the fractional derivatives of the neural network output. The weak formulation loss function of the BO-fPINN method can overcome some drawbacks of the BO methods and thus can be used to solve SFPDEs with eigenvalue crossings. Moreover, the BO-fPINN method can be used for inverse SFPDEs with the same framework and same computational complexity as for forward problems. We demonstrate the effectiveness of the BO-fPINN method for different benchmark problems. The results demonstrate the flexibility and efficiency of the proposed method, especially for inverse problems.

穩健性 · Networking · Neural Networks · Extensibility · INFORMS ·

2023 年 3 月 17 日

Towards Reliable Neural Specifications

Chuqin Geng,Nham Le,Xiaojie Xu,Zhaoyue Wang,Arie Gurfinkel,Xujie Si

from arxiv, 19 pages, 16 figures

Having reliable specifications is an unavoidable challenge in achieving verifiable correctness, robustness, and interpretability of AI systems. Existing specifications for neural networks are in the paradigm of data as specification. That is, the local neighborhood centering around a reference input is considered to be correct (or robust). While existing specifications contribute to verifying adversarial robustness, a significant problem in many research domains, our empirical study shows that those verified regions are somewhat tight, and thus fail to allow verification of test set inputs, making them impractical for some real-world applications. To this end, we propose a new family of specifications called neural representation as specification, which uses the intrinsic information of neural networks - neural activation patterns (NAPs), rather than input data to specify the correctness and/or robustness of neural network predictions. We present a simple statistical approach to mining neural activation patterns. To show the effectiveness of discovered NAPs, we formally verify several important properties, such as various types of misclassifications will never happen for a given NAP, and there is no ambiguity between different NAPs. We show that by using NAP, we can verify a significant region of the input space, while still recalling 84% of the data on MNIST. Moreover, we can push the verifiable bound to 10 times larger on the CIFAR10 benchmark. Thus, we argue that NAPs can potentially be used as a more reliable and extensible specification for neural network verification.

分解的 · 因子分析 · MoDELS · Analysis · 查準率/準確率 ·

2023 年 3 月 17 日

What Can We Learn from a Semiparametric Factor Analysis of Item Responses and Response Time? An Illustration with the PISA 2015 Data

Yang Liu,Weimeng Wang

It is widely believed that a joint factor analysis of item responses and response time (RT) may yield more precise ability scores that are conventionally predicted from responses only. For this purpose, a simple-structure factor model is often preferred as it only requires specifying an additional measurement model for item-level RT while leaving the original item response theory (IRT) model for responses intact. The added speed factor indicated by item-level RT correlates with the ability factor in the IRT model, allowing RT data to carry additional information about respondents' ability. However, parametric simple-structure factor models are often restrictive and fit poorly to empirical data, which prompts under-confidence in the suitablity of a simple factor structure. In the present paper, we analyze the 2015 Programme for International Student Assessment (PISA) mathematics data using a semiparametric simple-structure model. We conclude that a simple factor structure attains a decent fit after further parametric assumptions in the measurement model are sufficiently relaxed. Furthermore, our semiparametric model implies that the association between latent ability and speed/slowness is strong in the population, but the form of association is nonlinear. It follows that scoring based on the fitted model can substantially improve the precision of ability scores.

估計/估計量 · MoDELS · 似然 · Extensibility · 可辨認的 ·

2023 年 3 月 17 日

High-dimensional Censored Regression via the Penalized Tobit Likelihood

Tate Jacobson,Hui Zou

High-dimensional regression and regression with a left-censored response are each well-studied topics. In spite of this, few methods have been proposed which deal with both of these complications simultaneously. The Tobit model -- long the standard method for censored regression in economics -- has not been adapted for high-dimensional regression at all. To fill this gap and bring up-to-date techniques from high-dimensional statistics to the field of high-dimensional left-censored regression, we propose several penalized Tobit models. We develop a fast algorithm which combines quadratic minimization with coordinate descent to compute the penalized Tobit solution path. Theoretically, we analyze the Tobit lasso and Tobit with a folded concave penalty, bounding the $\ell_2$ estimation loss for the former and proving that a local linear approximation estimator for the latter possesses the strong oracle property. Through an extensive simulation study, we find that our penalized Tobit models provide more accurate predictions and parameter estimates than other methods. We use a penalized Tobit model to analyze high-dimensional left-censored HIV viral load data from the AIDS Clinical Trials Group and identify potential drug resistance mutations in the HIV genome. Appendices contain intermediate theoretical results and technical proofs.

Learning · Networking · Neural Networks · 經驗風險 · MoDELS ·

2023 年 3 月 16 日

Learning time-scales in two-layers neural networks

Rapha?l Berthier,Andrea Montanari,Kangjie Zhou

from arxiv, 54 pages, 9 figures

Gradient-based learning in multi-layer neural networks displays a number of striking features. In particular, the decrease rate of empirical risk is non-monotone even after averaging over large batches. Long plateaus in which one observes barely any progress alternate with intervals of rapid decrease. These successive phases of learning often take place on very different time scales. Finally, models learnt in an early phase are typically `simpler' or `easier to learn' although in a way that is difficult to formalize. Although theoretical explanations of these phenomena have been put forward, each of them captures at best certain specific regimes. In this paper, we study the gradient flow dynamics of a wide two-layer neural network in high-dimension, when data are distributed according to a single-index model (i.e., the target function depends on a one-dimensional projection of the covariates). Based on a mixture of new rigorous results, non-rigorous mathematical derivations, and numerical simulations, we propose a scenario for the learning dynamics in this setting. In particular, the proposed evolution exhibits separation of timescales and intermittency. These behaviors arise naturally because the population gradient flow can be recast as a singularly perturbed dynamical system.

估計/估計量 · 估計誤差 · MoDELS · 學成 · 無偏 ·

2020 年 12 月 17 日

The Causal Learning of Retail Delinquency

Yiyan Huang,Cheuk Hang Leung,Xing Yan,Qi Wu,Nanbo Peng,Dongdong Wang,Zhixiang Huang

from arxiv, This paper was accepted and will be published in the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

This paper focuses on the expected difference in borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects and hence the estimation error can be magnificent. As such, we propose another approach to construct the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is strikingly substantial if the causal effects are accounted for correctly.

邊 · AIM · Processing（編程語言） · 可辨認的 · Taxonomy ·

2020 年 3 月 26 日

A Survey on Edge Intelligence

Dianlei Xu,Tong Li,Yong Li,Xiang Su,Sasu Tarkoma,Pan Hui

Edge intelligence refers to a set of connected systems and devices for data collection, caching, processing, and analysis in locations close to where data is captured based on artificial intelligence. The aim of edge intelligence is to enhance the quality and speed of data processing and protect the privacy and security of the data. Although recently emerged, spanning the period from 2011 to now, this field of research has shown explosive growth over the past five years. In this paper, we present a thorough and comprehensive survey on the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, namely edge caching, edge training, edge inference, and edge offloading, based on theoretical and practical results pertaining to proposed and deployed systems. We then aim for a systematic classification of the state of the solutions by examining research results and observations for each of the four components and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate, compare and analyse the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, etc. This survey article provides a comprehensive introduction to edge intelligence and its application areas. In addition, we summarise the development of the emerging research field and the current state-of-the-art and discuss the important open issues and possible theoretical and technical solutions.