
Normalizing · Base · Estimation/Estimator · Likelihood · Optimizer

March 9, 2023

Building Normalizing Flows with Stochastic Interpolants

Michael S. Albergo, Eric Vanden-Eijnden

from arxiv, ICLR 2023

A generative model based on a continuous-time normalizing flow between any pair of base and target probability densities is proposed. The velocity field of this flow is inferred from the probability current of a time-dependent density that interpolates between the base and the target in finite time. Unlike conventional normalizing flow inference methods based on the maximum likelihood principle, which require costly backpropagation through ODE solvers, our interpolant approach leads to a simple quadratic loss for the velocity itself which is expressed in terms of expectations that are readily amenable to empirical estimation. The flow can be used to generate samples from either the base or target, and to estimate the likelihood at any time along the interpolant. In addition, the flow can be optimized to minimize the path length of the interpolant density, thereby paving the way for building optimal transport maps. In situations where the base is a Gaussian density, we also show that the velocity of our normalizing flow can be used to construct a diffusion model to sample the target as well as estimate its score. However, our approach shows that we can bypass this diffusion completely and work at the level of the probability flow with greater simplicity, opening an avenue for methods based solely on ordinary differential equations as an alternative to those based on stochastic differential equations. Benchmarking on density estimation tasks illustrates that the learned flow can match and surpass conventional continuous flows at a fraction of the cost, and compares well with diffusions on image generation on CIFAR-10 and ImageNet $32\times32$. The method scales ab-initio ODE flows to previously unreachable image resolutions, demonstrated up to $128\times128$.
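The "simple quadratic loss for the velocity" can be illustrated with a minimal sketch. The abstract does not give the interpolant or the exact loss, so everything here is an assumption: a linear interpolant $x_t = (1-t)x_0 + t x_1$, a loss of the form $\mathbb{E}\big[|v(x_t,t)|^2 - 2\,\partial_t x_t \cdot v(x_t,t)\big]$, and the hypothetical names `linear_interpolant` and `velocity_loss`. The point is only that such a loss is an expectation over sample pairs and times, so it can be estimated without running an ODE solver.

```python
import random

def linear_interpolant(x0, x1, t):
    """Assumed interpolant x_t = (1 - t) * x0 + t * x1 and its time derivative."""
    xt = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    dxt = [b - a for a, b in zip(x0, x1)]
    return xt, dxt

def velocity_loss(velocity, base_samples, target_samples, rng):
    """Monte Carlo estimate of E[|v(x_t, t)|^2 - 2 * (dx_t/dt) . v(x_t, t)]."""
    total = 0.0
    for x0, x1 in zip(base_samples, target_samples):
        t = rng.random()  # time drawn uniformly from [0, 1)
        xt, dxt = linear_interpolant(x0, x1, t)
        v = velocity(xt, t)
        total += sum(vi * vi for vi in v)
        total -= 2.0 * sum(d * vi for d, vi in zip(dxt, v))
    return total / len(base_samples)

rng = random.Random(0)
base = [[rng.gauss(0.0, 1.0)] for _ in range(1000)]    # toy base samples
target = [[rng.gauss(3.0, 0.5)] for _ in range(1000)]  # toy target samples

# Two hand-written candidate velocities; a real model would be a neural network.
zero_loss = velocity_loss(lambda x, t: [0.0], base, target, rng)
shift_loss = velocity_loss(lambda x, t: [3.0], base, target, rng)
```

In this toy setup the constant velocity that moves mass toward the target mean scores lower than the zero velocity, which is the behaviour one would expect from a loss of this form.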

Related content

Normalizing

Linear transformation · DAG · Latent · Linear · Transformation

May 1, 2023

Score-based Causal Representation Learning with Interventions

Burak Varici, Emre Acarturk, Karthikeyan Shanmugam, Abhishek Kumar, Ali Tajer

from arxiv, This version outlines large classes of non-linear causal models in the latent space for which our assumptions hold. It also discusses the latest updates of related literature

This paper studies the causal representation learning problem when the latent causal variables are observed indirectly through an unknown linear transformation. The objectives are: (i) recovering the unknown linear transformation (up to scaling) and (ii) determining the directed acyclic graph (DAG) underlying the latent variables. Sufficient conditions for DAG recovery are established, and it is shown that a large class of non-linear models in the latent space (e.g., causal mechanisms parameterized by two-layer neural networks) satisfy these conditions. These sufficient conditions ensure that the effect of an intervention can be detected correctly from changes in the score. Capitalizing on this property, recovering a valid transformation is facilitated by the following key property: under any valid transformation, the score function of the latent variables necessarily has minimal variations across different interventional environments. This property is leveraged for perfect recovery of the latent DAG structure using only \emph{soft} interventions. For the special case of stochastic \emph{hard} interventions, with an additional hypothesis testing step, one can also uniquely recover the linear transformation up to scaling and a valid causal ordering.

Linear · Block · GROUP · Mutually independent · Estimation/Estimator

April 30, 2023

Representing the Special Linear Group with Block Unitriangular Matrices

We prove that every element of the special linear group can be represented as the product of at most six block unitriangular matrices, and that there exist matrices for which six products are necessary, independent of indexing. We present an analogous result for the general linear group. These results serve as general statements regarding the representational power of alternating linear updates. The factorizations and lower bounds of this work immediately imply tight estimates on the expressive power of linear affine coupling blocks in machine learning.

Minimizer · Estimation/Estimator · Minimax · Principle · Inference

April 29, 2023

Estimation and Inference for Minimizer and Minimum of Convex Functions: Optimality, Adaptivity, and Uncertainty Principles

T. Tony Cai, Ran Chen, Yuancheng Zhu

Optimal estimation and inference for both the minimizer and minimum of a convex regression function under the white noise and nonparametric regression models are studied in a non-asymptotic local minimax framework, where the performance of a procedure is evaluated at individual functions. Fully adaptive and computationally efficient algorithms are proposed and sharp minimax lower bounds are given for both the estimation accuracy and expected length of confidence intervals for the minimizer and minimum. The non-asymptotic local minimax framework brings out new phenomena in simultaneous estimation and inference for the minimizer and minimum. We establish a novel Uncertainty Principle that provides a fundamental limit on how well the minimizer and minimum can be estimated simultaneously for any convex regression function. A similar result holds for the expected length of the confidence intervals for the minimizer and minimum.

Normalizing · Statistic · Sample · Full conditional probability · Conditional probability distribution

April 28, 2023

Flow Away your Differences: Conditional Normalizing Flows as an Improvement to Reweighting

Malte Algren, Tobias Golling, Manuel Guth, Chris Pollard, John Andrew Raine

from arxiv, 21 pages, 9 figures

We present an alternative to reweighting techniques for modifying distributions to account for a desired change in an underlying conditional distribution, as is often needed to correct for mis-modelling in a simulated sample. We employ conditional normalizing flows to learn the full conditional probability distribution from which we sample new events for conditional values drawn from the target distribution to produce the desired, altered distribution. In contrast to common reweighting techniques, this procedure is independent of binning choice and does not rely on an estimate of the density ratio between two distributions. In several toy examples we show that normalizing flows outperform reweighting approaches to match the distribution of the target. We demonstrate that the corrected distribution closes well with the ground truth, and a statistical uncertainty on the training dataset can be ascertained with bootstrapping. In our examples, this leads to a statistical precision up to three times greater than using reweighting techniques with identical sample sizes for the source and target distributions. We also explore an application in the context of high energy particle physics.

Learning · Relative entropy · Optimizer · Divergence · Manifold

April 28, 2023

A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models

Yang Wu, Pengxu Wei, Liang Lin

from arxiv, 42 pages

In this paper, we propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs). From a perspective of physical simulation, we redefine the problem of approximating the gradient flow utilizing optimal transport (i.e. Wasserstein) metric. In EBMs, the learning process of stepwise sampling and estimating data distribution performs the functional gradient of minimizing the global relative entropy between the current and target real distribution, which can be treated as dynamic particles moving from disorder to target manifold. Previous learning schemes mainly minimize the entropy concerning the consecutive time KL divergence in each learning step. However, they are prone to being stuck in the local KL divergence by projecting non-smooth information within smooth manifold, which is against the optimal transport principle. To solve this problem, we derive a second-order Wasserstein gradient flow of the global relative entropy from Fokker-Planck equation. Compared with existing schemes, Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities. We also derive this near-proximal scheme and provide its numerical computation equations. Our extensive experiments demonstrate the practical superiority and potentials of our proposed scheme on fitting complex distributions and generating high-quality, high-dimensional data with neural EBMs.

Approximation · Mixture distribution · Functional · HAT · Gaussian mixture (model)

April 28, 2023

Quasi-Monte Carlo methods for mixture distributions and approximated distributions via piecewise linear interpolation

Tiangang Cui, Josef Dick, Friedrich Pillichshammer

We study numerical integration over bounded regions in $\mathbb{R}^s, s\ge1$ with respect to some probability measure. We replace random sampling with quasi-Monte Carlo methods, where the underlying point set is derived from deterministic constructions that aim to fill the space more evenly than random points. Such quasi-Monte Carlo point sets are ordinarily designed for the uniform measure, and the theory only works for product measures when a coordinate-wise transformation is applied. Going beyond this setting, we first consider the case where the target density is a mixture distribution where each term in the mixture comes from a product distribution. Next we consider target densities which can be approximated with such mixture distributions. We require the approximation to be a sum of coordinate-wise products and the approximation to be positive everywhere (so that they can be re-scaled to probability density functions). We use tensor product hat function approximations for this purpose here, since a hat function approximation of a positive function is itself positive. We also study more complex algorithms, where we first approximate the target density with a general Gaussian mixture distribution and approximate the mixtures with an adaptive hat function approximation on rotated intervals. The Gaussian mixture approximation allows us to locate the essential parts of the target density, whereas the adaptive hat function approximation allows us to approximate the finer structure of the target density. We prove convergence rates for each of the integration techniques based on quasi-Monte Carlo sampling for integrands with bounded partial mixed derivatives. The employed algorithms are based on digital $(t,s)$-sequences over the finite field $\mathbb{F}_2$ and an inversion method. Numerical examples illustrate the performance of the algorithms for some target densities and integrands.
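The remark that a hat function approximation of a positive function is itself positive, and can therefore be rescaled to a probability density, can be checked in one dimension. The sketch below is only an illustration under stated assumptions: a uniform grid on $[0,1]$, a toy integrand `bump`, and the hypothetical helper `hat_approximation`. It is not the paper's tensor-product or adaptive construction. Between two nodes the interpolant is a convex combination of two positive node values, so it stays positive, and its integral is given exactly by the trapezoidal sum of the node values.

```python
import math

def hat_approximation(f, n):
    """Piecewise linear (hat function) interpolant of f on [0, 1] with n + 1
    uniform nodes, rescaled so that it integrates to one."""
    nodes = [f(i / n) for i in range(n + 1)]
    # The integral of a piecewise linear function is the trapezoidal sum.
    mass = sum((nodes[i] + nodes[i + 1]) / (2.0 * n) for i in range(n))

    def density(x):
        s = min(max(x, 0.0), 1.0) * n   # position in grid units
        i = min(int(s), n - 1)          # index of the left node
        w = s - i                       # weight of the right node
        return ((1.0 - w) * nodes[i] + w * nodes[i + 1]) / mass

    return density, mass

def bump(x):
    """Toy positive target (unnormalized), chosen only for illustration."""
    return math.exp(-10.0 * (x - 0.5) ** 2)

density, mass = hat_approximation(bump, 8)
values = [density(k / 100) for k in range(101)]
```

Every entry of `values` is positive because each is a convex combination of positive node values divided by a positive normalizing constant.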

Sample · Noise · Minibatch · Reducible · Noise distribution

April 28, 2023

Multisample Flow Matching: Straightening Flows with Minibatch Couplings

Aram-Alexandre Pooladian, Heli Ben-Hamu, Carles Domingo-Enrich, Brandon Amos, Yaron Lipman, Ricky Chen

Simulation-free methods for training continuous-time generative models construct probability paths that go between noise distributions and individual data samples. Recent works, such as Flow Matching, derived paths that are optimal for each data sample. However, these algorithms rely on independent data and noise samples, and do not exploit underlying structure in the data distribution for constructing probability paths. We propose Multisample Flow Matching, a more general framework that uses non-trivial couplings between data and noise samples while satisfying the correct marginal constraints. At very small overhead costs, this generalization allows us to (i) reduce gradient variance during training, (ii) obtain straighter flows for the learned vector field, which allows us to generate high-quality samples using fewer function evaluations, and (iii) obtain transport maps with lower cost in high dimensions, which has applications beyond generative modeling. Importantly, we do so in a completely simulation-free manner with a simple minimization objective. We show that our proposed methods improve sample consistency on downsampled ImageNet data sets, and lead to better low-cost sample generation.
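A small sketch can show what a non-trivial coupling within a minibatch might look like. The abstract does not say which couplings the authors use, so this is an assumption: noise and data samples in a batch are paired by the permutation that minimizes the total squared distance, found here by brute force with the hypothetical helper `minibatch_coupling`. Because the pairing is a permutation, every noise sample and every data sample is used exactly once, which is one simple way to leave the batch marginals unchanged.

```python
from itertools import permutations

def squared_distance(a, b):
    """Squared Euclidean distance between two points given as lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def minibatch_coupling(noise, data):
    """Return perm such that noise[i] is paired with data[perm[i]] and the
    total squared distance is minimal. Brute force: tiny batches only."""
    n = len(noise)

    def total_cost(perm):
        return sum(squared_distance(noise[i], data[perm[i]]) for i in range(n))

    return list(min(permutations(range(n)), key=total_cost))

# Toy one-dimensional minibatch.
noise = [[0.0], [1.0], [2.0]]
data = [[2.1], [0.2], [0.9]]

perm = minibatch_coupling(noise, data)
pairs = [(noise[i], data[perm[i]]) for i in range(len(noise))]
# perm == [1, 2, 0]: each noise point is matched to its nearest data point.
```

Brute force costs factorial time in the batch size; it is used here only to keep the example free of dependencies, and a practical version would use an assignment or optimal transport solver.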

INFORMS · Knowledge · Identifiable · Integration · Learning

April 27, 2023

Some of the variables, some of the parameters, some of the times, with some physics known: Identification with partial information

Saurabh Malani, Tom S. Bertalan, Tianqi Cui, Jose L. Avalos, Michael Betenbaugh, Ioannis G. Kevrekidis

from arxiv, 25 pages, 15 figures

Experimental data is often comprised of variables measured independently, at different sampling rates (non-uniform ${\Delta}$t between successive measurements); and at a specific time point only a subset of all variables may be sampled. Approaches to identifying dynamical systems from such data typically use interpolation, imputation or subsampling to reorganize or modify the training data $\textit{prior}$ to learning. Partial physical knowledge may also be available $\textit{a priori}$ (accurately or approximately), and data-driven techniques can complement this knowledge. Here we exploit neural network architectures based on numerical integration methods and $\textit{a priori}$ physical knowledge to identify the right-hand side of the underlying governing differential equations. Iterates of such neural-network models allow for learning from data sampled at arbitrary time points $\textit{without}$ data modification. Importantly, we integrate the network with available partial physical knowledge in "physics informed gray-boxes"; this enables learning unknown kinetic rates or microbial growth functions while simultaneously estimating experimental parameters.

Estimation/Estimator · MoDELS · Maximum likelihood · Maximum likelihood estimation

April 26, 2023

Consistency of MLE for partially observed diffusions, with application in market microstructure modeling

Sergey Nadtochiy, Yuan Yin

This paper presents a tractable sufficient condition for the consistency of maximum likelihood estimators (MLEs) in partially observed diffusion models, stated in terms of stationary distribution of the associated fully observed diffusion, under the assumption that the set of unknown parameter values is finite. This sufficient condition is then verified in the context of a latent price model of market microstructure, yielding consistency of maximum likelihood estimators of the unknown parameters in this model. Finally, we compute the latter estimators using historical financial data taken from the NASDAQ exchange.

Sampling method · Variance · Graphics processing unit (GPU) · INFORMS · Generalization theory

June 24, 2020

Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks

Weilin Cong, Rana Forsati, Mahmut Kandemir, Mehrdad Mahdavi

Sampling methods (e.g., node-wise, layer-wise, or subgraph) have become an indispensable strategy to speed up training large-scale Graph Neural Networks (GNNs). However, existing sampling methods are mostly based on the graph structural information and ignore the dynamicity of optimization, which leads to high variance in estimating the stochastic gradients. The high variance issue can be very pronounced in extremely large graphs, where it results in slow convergence and poor generalization. In this paper, we theoretically analyze the variance of sampling methods and show that, due to the composite structure of empirical risk, the variance of any sampling method can be decomposed into \textit{embedding approximation variance} in the forward stage and \textit{stochastic gradient variance} in the backward stage, which necessitates mitigating both types of variance to obtain a faster convergence rate. We propose a decoupled variance reduction strategy that employs (approximate) gradient information to adaptively sample nodes with minimal variance, and explicitly reduces the variance introduced by embedding approximation. We show theoretically and empirically that the proposed method, even with smaller mini-batch sizes, enjoys a faster convergence rate and entails better generalization compared to the existing methods.
