Conditional Neural Processes (CNPs) formulate distributions over functions and generate function observations with exact conditional likelihoods. CNPs, however, have limited expressivity for high-dimensional observations, since their predictive distribution is factorized into a product of unconstrained (typically) Gaussian outputs. Previously, this could be handled using latent variables or an autoregressive likelihood, but at the expense of intractable training and quadratically increased complexity. Instead, we propose calibrating CNPs with an adversarial training scheme in addition to regular maximum likelihood estimation. Specifically, we train an energy-based model (EBM) with noise contrastive estimation, which forces the EBM to distinguish true observations from the generations of the CNP. In this way, the CNP must generate predictions closer to the ground truth to fool the EBM, instead of merely optimizing with respect to the fixed-form likelihood. From generative function reconstruction to downstream regression and classification tasks, we demonstrate that our method fits mainstream CNP members, showing effectiveness when an unconstrained Gaussian likelihood is defined, requiring minimal computation overhead while preserving the foundational properties of CNPs.
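The contrastive objective described above can be illustrated with a minimal sketch of a generic binary noise contrastive estimation loss. It assumes a scalar observation, a Gaussian CNP prediction used as the noise distribution, and a classifier logit of the form -E(y) - log q(y); the function names and the two toy energies are hypothetical and are not taken from the paper.

```python
import math

def gaussian_logpdf(y, mean, std):
    """Log-density of a univariate Gaussian (stands in for the CNP prediction)."""
    z = (y - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def nce_loss(energy, real_ys, fake_ys, mean, std):
    """Binary NCE loss: real observations are positives, CNP samples negatives.

    Assumption: the classifier logit is -E(y) - log q(y), where q is the
    Gaussian predictive density of the CNP.
    """
    def logit(y):
        return -energy(y) - gaussian_logpdf(y, mean, std)

    pos = sum(log_sigmoid(logit(y)) for y in real_ys) / len(real_ys)
    neg = sum(log_sigmoid(-logit(y)) for y in fake_ys) / len(fake_ys)
    return -(pos + neg)

# Toy data: the CNP predicts N(0, 1) but the true observations sit near 2.
real_ys = [2.0, 2.1]
fake_ys = [0.0, -0.1]

# An energy that merely copies the CNP density cannot separate the two sets.
loss_uninformative = nce_loss(lambda y: -gaussian_logpdf(y, 0.0, 1.0),
                              real_ys, fake_ys, 0.0, 1.0)
# An energy centred on the true observations separates them and lowers the loss.
loss_informative = nce_loss(lambda y: -gaussian_logpdf(y, 2.0, 1.0),
                            real_ys, fake_ys, 0.0, 1.0)
```

When the energy carries no information beyond the CNP density, every logit is zero and the loss equals 2 log 2; the loss only drops when the EBM can tell real observations from CNP generations, which is the signal that pushes the CNP toward the ground truth.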

Related content

Estimation/Estimator · Optimizer · Vectorization · Networking · MoDELS

May 15, 2023

Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks

Minyoung Huh,Brian Cheung,Pulkit Agrawal,Phillip Isola

This work examines the challenges of training neural networks with vector quantization using straight-through estimation. We find that a primary cause of training instability is the discrepancy between the model embedding and the code-vector distribution. We identify the factors that contribute to this issue, including the codebook gradient sparsity and the asymmetric nature of the commitment loss, which leads to misaligned code-vector assignments. We propose to address this issue via affine re-parameterization of the code vectors. Additionally, we introduce an alternating optimization to reduce the gradient error introduced by the straight-through estimation. Moreover, we propose an improvement to the commitment loss to ensure better alignment between the codebook representation and the model embedding. These optimization methods improve the mathematical approximation of the straight-through estimation and, ultimately, the model performance. We demonstrate the effectiveness of our methods on several common model architectures, such as AlexNet, ResNet, and ViT, across various tasks, including image classification and generative modeling.
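As a rough illustration of the ingredients named in this abstract, the sketch below shows a nearest-code-vector lookup over an affinely re-parameterized codebook and the identity backward pass of a straight-through estimator. The shared scalar scale and shift, and all function names, are assumptions made for illustration; the abstract does not specify the exact form of the re-parameterization.

```python
def affine_codebook(codebook, scale, shift):
    """Apply one shared affine map to every code vector (assumed form)."""
    return [[scale * c + shift for c in code] for code in codebook]

def quantize(z, codes):
    """Return the index and value of the code vector nearest to embedding z."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    idx = min(range(len(codes)), key=lambda i: sqdist(z, codes[i]))
    return idx, codes[idx]

def straight_through_backward(grad_wrt_quantized):
    """Straight-through estimator: treat quantization as the identity in the
    backward pass, so the gradient reaching the encoder is passed unchanged."""
    return list(grad_wrt_quantized)

# Because scale and shift are shared, updating them moves every code vector,
# including codes that were not selected in the current batch.
codes = affine_codebook([[0.0, 0.0], [1.0, 1.0]], scale=2.0, shift=1.0)
idx, q = quantize([2.6, 2.9], codes)
grad_to_encoder = straight_through_backward([0.5, -0.25])
```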

Sensor · Round · INFORMS · Processing (programming language) · Convolution

May 15, 2023

Environmental Sensor Placement with Convolutional Gaussian Neural Processes

Tom R. Andersson,Wessel P. Bruinsma,Stratis Markou,James Requeima,Alejandro Coca-Castro,Anna Vaughan,Anna-Louise Ellis,Matthew A. Lazzara,Dani Jones,J. Scott Hosking,Richard E. Turner

from arxiv, Accepted in Environmental Data Science (Climate Informatics 2023 Special Issue)

Environmental sensors are crucial for monitoring weather conditions and the impacts of climate change. However, it is challenging to place sensors in a way that maximises the informativeness of their measurements, particularly in remote regions like Antarctica. Probabilistic machine learning models can suggest informative sensor placements by finding sites that maximally reduce prediction uncertainty. Gaussian process (GP) models are widely used for this purpose, but they struggle with capturing complex non-stationary behaviour and scaling to large datasets. This paper proposes using a convolutional Gaussian neural process (ConvGNP) to address these issues. A ConvGNP uses neural networks to parameterise a joint Gaussian distribution at arbitrary target locations, enabling flexibility and scalability. Using simulated surface air temperature anomaly over Antarctica as training data, the ConvGNP learns spatial and seasonal non-stationarities, outperforming a non-stationary GP baseline. In a simulated sensor placement experiment, the ConvGNP better predicts the performance boost obtained from new observations than GP baselines, leading to more informative sensor placements. We contrast our approach with physics-based sensor placement methods and propose future steps towards an operational sensor placement recommendation system. Our work could help to realise environmental digital twins that actively direct measurement sampling to improve the digital representation of reality.
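The idea of choosing sites that maximally reduce prediction uncertainty can be sketched for any model that outputs a joint Gaussian over locations. The example below uses the standard Gaussian conditioning identity: observing site j with noise variance s2 lowers the variance at site i by K[i][j]^2 / (K[j][j] + s2). The covariance matrix, the noise level, and the choice of total variance reduction as the criterion are illustrative assumptions, not the acquisition function used in the paper.

```python
def variance_reduction(cov, j, noise_var):
    """Total drop in predictive variance over all sites after observing site j."""
    denom = cov[j][j] + noise_var
    return sum(cov[i][j] ** 2 for i in range(len(cov))) / denom

def best_site(cov, noise_var):
    """Pick the candidate site whose observation removes the most variance."""
    reductions = [variance_reduction(cov, j, noise_var) for j in range(len(cov))]
    best = max(range(len(cov)), key=lambda j: reductions[j])
    return best, reductions

# Hypothetical joint covariance over three candidate sites.
cov = [
    [1.0, 0.8, 0.3],
    [0.8, 1.0, 0.1],
    [0.3, 0.1, 1.0],
]
site, reductions = best_site(cov, noise_var=0.1)
```

In this toy case the first site is selected because it is the most strongly correlated with the other two, so one measurement there informs the most locations.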

Random walk · Estimation/Estimator · Approximation · Directed · Controller

May 15, 2023

Random walks and moving boundaries: Estimating the penetration of diffusants into dense rubbers

Surendra Nepal,Magnus Ogren,Yosief Wondmagegne,Adrian Muntean

from arxiv, 15 pages, 10 figures, 2 tables

For certain materials science scenarios arising in rubber technology, one-dimensional moving boundary problems (MBPs) with kinetic boundary conditions are capable of unveiling the large-time behavior of the diffusant penetration front, giving a direct estimate of the service life of the material. In this paper, we propose a random walk algorithm that yields good numerical approximations of both the concentration profile and the location of the sharp front. Essentially, the proposed scheme decouples the target evolution system into two steps: (i) the ordinary differential equation corresponding to the evaluation of the speed of the moving boundary is solved via an explicit Euler method, and (ii) the associated diffusion problem is solved by a random walk method. To verify the correctness of our random walk algorithm, we compare the resulting approximations to results based on a finite element approach with a controlled convergence rate. Our numerical experiments closely recover penetration depth measurements from an experimental setup targeting dense rubbers.
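The two-step decoupling can be sketched in a few lines. This is only a schematic under strong assumptions: the front speed is supplied as a hypothetical callable, the walkers are reflected at both ends of the domain, and the step length sqrt(2 D dt) is the usual choice for an unbiased one-dimensional walk. The kinetic boundary condition of the actual model is not reproduced here.

```python
import math
import random

def euler_front_step(front, dt, speed):
    """Step (i): explicit Euler update of the moving boundary position."""
    return front + dt * speed(front)

def random_walk_step(positions, diffusivity, dt, front, rng):
    """Step (ii): move each walker by +/- sqrt(2 D dt) inside [0, front].

    Assumption: walkers are simply reflected at x = 0 and at the front.
    """
    h = math.sqrt(2.0 * diffusivity * dt)
    moved = []
    for x in positions:
        x += h if rng.random() < 0.5 else -h
        if x < 0.0:
            x = -x
        if x > front:
            x = 2.0 * front - x
        moved.append(x)
    return moved

rng = random.Random(0)
front = 1.0
walkers = [0.5] * 100
for _ in range(50):
    # Hypothetical constant front speed; in the model it depends on the solution.
    front = euler_front_step(front, 0.01, lambda s: 0.2)
    walkers = random_walk_step(walkers, 0.01, 0.01, front, rng)
```

A histogram of the walker positions then plays the role of the concentration profile, while the scalar front tracks the penetration depth.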

Functional · Continuity · Representation · Black box · Scalar

May 14, 2023

A new iterative method for construction of the Kolmogorov-Arnold representation

Michael Poluektov,Andrew Polar

The Kolmogorov-Arnold representation of a continuous multivariate function is a decomposition of the function into a structure of inner and outer functions of a single variable. It can be a convenient tool for tasks where it is required to obtain a predictive model that maps some vector input of a black box system into a scalar output. However, the construction of such a representation based on the recorded input-output data is a challenging task. In the present paper, it is suggested to decompose the underlying functions of the representation into continuous basis functions and parameters. A novel lightweight algorithm for parameter identification is then proposed. The algorithm is based on the Newton-Kaczmarz method for solving non-linear systems of equations and is locally convergent. Numerical examples show that it is more robust with respect to the selection of the initial guess for the parameters than the straightforward application of the Gauss-Newton method for parameter identification.
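For reference, the classical statement of the representation for a continuous function $f$ of $n$ variables on the unit cube is usually written as below. The index conventions follow the standard form of the theorem and may differ from the notation used in the paper itself.

```latex
f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

Here the $\phi_{q,p}$ are the inner functions and the $\Phi_q$ are the outer functions, each a continuous function of a single variable.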

Regularization term · Smoothing · Learning · CASES · Kernelization

May 12, 2023

Random Smoothing Regularization in Kernel Gradient Descent Learning

Liang Ding,Tianyang Hu,Jiahang Jiang,Donghao Li,Wenjia Wang,Yuan Yao

Random smoothing data augmentation is a unique form of regularization that can prevent overfitting by introducing noise to the input data, encouraging the model to learn more generalized features. Despite its success in various applications, there has been a lack of systematic study on the regularization ability of random smoothing. In this paper, we aim to bridge this gap by presenting a framework for random smoothing regularization that can adaptively and effectively learn a wide range of ground truth functions belonging to the classical Sobolev spaces. Specifically, we investigate two underlying function spaces: the Sobolev space of low intrinsic dimension, which includes the Sobolev space in $D$-dimensional Euclidean space or low-dimensional sub-manifolds as special cases, and the mixed smooth Sobolev space with a tensor structure. By using random smoothing regularization as novel convolution-based smoothing kernels, we can attain optimal convergence rates in these cases using a kernel gradient descent algorithm, either with early stopping or weight decay. It is noteworthy that our estimator can adapt to the structural assumptions of the underlying data and avoid the curse of dimensionality. This is achieved through various choices of injected noise distributions such as Gaussian, Laplace, or general polynomial noises, allowing for broad adaptation to the aforementioned structural assumptions of the underlying data. The convergence rate depends only on the effective dimension, which may be significantly smaller than the actual data dimension. We conduct numerical experiments on simulated data to validate our theoretical results.

Estimation/Estimator · Functional · Linear · 3D · MoDELS

May 11, 2023

3D fictitious wave domain CSEM inversion by adjoint source estimation

Pengliang Yang

The marine controlled-source electromagnetic (CSEM) method has proved its potential in detecting highly resistive hydrocarbon-bearing formations. A novel frequency domain CSEM inversion approach using fictitious wave domain time stepping modelling is presented. Using a Lagrangian-based adjoint state method, the inversion gradient with respect to resistivity can be computed by the product between the forward and adjoint fields. Simulation of the adjoint field using the same modelling engine is challenging, as it requires time domain adjoint source time functions while only a few discrete frequencies of the data residual are available for the inversion. A regularized linear inverse problem is formulated in order to estimate a long time series from very few frequency samples. It can then be solved using a linear optimization technique, yielding a matrix-free implementation. Instead of computing the adjoint source time function one by one at each receiver location, a basis function implementation has been developed such that the inverse problem can be solved only once and reused every time to construct all time-domain adjoint sources. The method allows computing all frequencies of the EM fields in one go without heavy memory and computational overhead, making efficient 3D CSEM inversion feasible. Numerical examples are employed to demonstrate the application of our method.

Estimation/Estimator · MoDELS · Ensemble · Processing (programming language) · Similarity

May 10, 2023

A Neural Emulator for Uncertainty Estimation of Fire Propagation

Andrew Bolt,Conrad Sanderson,Joel Janek Dabrowski,Carolyn Huston,Petra Kuhnert

Wildfire propagation is a highly stochastic process where small changes in environmental conditions (such as wind speed and direction) can lead to large changes in observed behaviour. A traditional approach to quantify uncertainty in fire-front progression is to generate probability maps via ensembles of simulations. However, use of ensembles is typically computationally expensive, which can limit the scope of uncertainty analysis. To address this, we explore the use of a spatio-temporal neural-based modelling approach to directly estimate the likelihood of fire propagation given uncertainty in input parameters. The uncertainty is represented by deliberately perturbing the input weather forecast during model training. The computational load is concentrated in the model training process, which allows larger probability spaces to be explored during deployment. Empirical evaluations indicate that the proposed model achieves comparable fire boundaries to those produced by the traditional SPARK simulation platform, with an overall Jaccard index (similarity score) of 67.4% on a set of 35 simulated fires. When compared to a related neural model (emulator) which was employed to generate probability maps via ensembles of emulated fires, the proposed approach produces competitive Jaccard similarity scores while being approximately an order of magnitude faster.
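The Jaccard index used as the similarity score is the size of the intersection of two burned-area sets divided by the size of their union. A minimal sketch on binary grids follows; the two small masks are made-up examples, and returning 1.0 for two empty masks is a convention chosen here, not something stated in the abstract.

```python
def jaccard_index(pred, truth):
    """Jaccard index of two equally sized binary grids (1 = burned cell)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Convention (assumption): two empty masks count as a perfect match.
    return intersection / union if union else 1.0

predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
score = jaccard_index(predicted, reference)
```

Two cells are burned in both grids and four cells are burned in at least one, so the score for this toy pair is 0.5.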

Processing (programming language) · NVM · Representation · Monte Carlo · Inference

May 10, 2023

Generalised shot noise representations of stochastic systems driven by non-Gaussian Lévy processes

Marcos Tapia Costa,Ioannis Kontoyiannis,Simon Godsill

from arxiv, 34 pages, 14 figures

We consider the problem of obtaining effective representations for the solutions of linear, vector-valued stochastic differential equations (SDEs) driven by non-Gaussian pure-jump Lévy processes, and we show how such representations lead to efficient simulation methods. The processes considered constitute a broad class of models that find application across the physical and biological sciences, mathematics, finance and engineering. Motivated by important relevant problems in statistical inference, we derive new, generalised shot-noise simulation methods whenever a normal variance-mean (NVM) mixture representation exists for the driving Lévy process, including the generalised hyperbolic, normal-Gamma, and normal tempered stable cases. Simple, explicit conditions are identified for the convergence of the residual of a truncated shot-noise representation to a Brownian motion in the case of the pure Lévy process, and to a Brownian-driven SDE in the case of the Lévy-driven SDE. These results provide Gaussian approximations to the small jumps of the process under the NVM representation. The resulting representations are of particular importance in state inference and parameter estimation for Lévy-driven SDE models, since the resulting conditionally Gaussian structures can be readily incorporated into latent variable inference methods such as Markov chain Monte Carlo (MCMC), Expectation-Maximisation (EM), and sequential Monte Carlo.

INFORMS · Mutual information · Extensibility · Backward · Identifiable

June 30, 2020

Adversarial Mutual Information for Text Generation

Boyuan Pan,Yazheng Yang,Kaizhao Liang,Bhavya Kailkhura,Zhongming Jin,Xian-Sheng Hua,Deng Cai,Bo Li

from arxiv, Published at ICML 2020

Recent advances in maximizing mutual information (MI) between the source and target have demonstrated its effectiveness in text generation. However, previous works paid little attention to modeling the backward network of MI (i.e., dependency from the target to the source), which is crucial to the tightness of the variational information maximization lower bound. In this paper, we propose Adversarial Mutual Information (AMI): a text generation framework which is formed as a novel saddle point (min-max) optimization aiming to identify joint interactions between the source and target. Within this framework, the forward and backward networks are able to iteratively promote or demote each other's generated instances by comparing the real and synthetic data distributions. We also develop a latent noise sampling strategy that leverages random variations at the high-level semantic space to enhance the long term dependency in the generation process. Extensive experiments based on different text generation tasks demonstrate that the proposed AMI framework can significantly outperform several strong baselines, and we also show that AMI has potential to lead to a tighter lower bound of maximum mutual information for the variational information maximization problem.

Sampling method · Variance · Graphics processing unit · INFORMS · Generalization theory

June 24, 2020

Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks

Weilin Cong,Rana Forsati,Mahmut Kandemir,Mehrdad Mahdavi

Sampling methods (e.g., node-wise, layer-wise, or subgraph) have become an indispensable strategy to speed up the training of large-scale Graph Neural Networks (GNNs). However, existing sampling methods are mostly based on the graph structural information and ignore the dynamicity of optimization, which leads to high variance in estimating the stochastic gradients. The high variance issue can be very pronounced in extremely large graphs, where it results in slow convergence and poor generalization. In this paper, we theoretically analyze the variance of sampling methods and show that, due to the composite structure of empirical risk, the variance of any sampling method can be decomposed into embedding approximation variance in the forward stage and stochastic gradient variance in the backward stage, which necessitates mitigating both types of variance to obtain a faster convergence rate. We propose a decoupled variance reduction strategy that employs (approximate) gradient information to adaptively sample nodes with minimal variance, and explicitly reduces the variance introduced by embedding approximation. We show theoretically and empirically that the proposed method, even with smaller mini-batch sizes, enjoys a faster convergence rate and entails better generalization compared to the existing methods.
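To see why gradient information can reduce sampling variance, consider a minimal sketch of generic importance sampling with scalar per-node gradients. Sampling node i with probability proportional to its gradient magnitude and reweighting by 1 / (n p_i) keeps the estimate of the mean gradient unbiased while shrinking its variance. This illustrates only the general principle; it is not the decoupled strategy of the paper, and the gradient values are made up.

```python
def sampling_probs(grad_norms):
    """Probabilities proportional to gradient magnitude (uniform if all zero)."""
    total = sum(grad_norms)
    n = len(grad_norms)
    if total == 0:
        return [1.0 / n] * n
    return [g / total for g in grad_norms]

def estimator_mean_and_variance(grads, probs):
    """Exact mean and variance of the one-sample estimator g_i / (n * p_i)."""
    n = len(grads)
    pairs = [(p, g / (n * p)) for g, p in zip(grads, probs) if p > 0]
    mean = sum(p * v for p, v in pairs)
    variance = sum(p * (v - mean) ** 2 for p, v in pairs)
    return mean, variance

# Hypothetical per-node gradients: one node dominates the mini-batch gradient.
grads = [0.1, 0.2, 4.0, 0.1]
uniform = [1.0 / len(grads)] * len(grads)
adaptive = sampling_probs([abs(g) for g in grads])

mean_uniform, var_uniform = estimator_mean_and_variance(grads, uniform)
mean_adaptive, var_adaptive = estimator_mean_and_variance(grads, adaptive)
```

Both estimators have the same mean, but for these non-negative scalar gradients the adaptive scheme makes every reweighted sample equal to that mean, so its variance collapses to essentially zero while uniform sampling keeps a large variance.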