Ill-posed linear inverse problems that combine knowledge of the forward measurement model with prior models arise frequently in various applications, from computational photography to medical imaging. Recent research has focused on solving these problems with score-based generative models (SGMs) that produce perceptually plausible images, especially in inpainting problems. In this study, we exploit the particular structure of the prior defined in the SGM to formulate recovery in a Bayesian framework as a Feynman--Kac model adapted from the forward diffusion model used to construct score-based diffusion. To solve this Feynman--Kac problem, we propose the use of Sequential Monte Carlo methods. The proposed algorithm, MCGdiff, is shown to be theoretically grounded and we provide numerical simulations showing that it outperforms competing baselines when dealing with ill-posed inverse problems.
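
As a rough illustration of the Sequential Monte Carlo idea mentioned above, the following is a minimal sketch of one generic SMC step (weight, resample, propagate). It is not the MCGdiff algorithm: the names `smc_step`, `log_potential`, and `propagate` are hypothetical, and multinomial resampling is assumed purely for simplicity.

```python
import math
import random


def normalise_log_weights(log_w):
    """Turn unnormalised log-weights into probabilities, stably."""
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def smc_step(particles, log_potential, propagate, rng):
    """One generic SMC step (illustrative only, not MCGdiff).

    log_potential(x): log of the Feynman-Kac potential at particle x (assumed).
    propagate(x, rng): draws the next state given x (assumed Markov kernel).
    """
    # 1. Weight each particle by its potential.
    weights = normalise_log_weights([log_potential(x) for x in particles])
    # 2. Multinomial resampling: duplicate likely particles, drop unlikely ones.
    resampled = rng.choices(particles, weights=weights, k=len(particles))
    # 3. Move every surviving particle forward with the transition kernel.
    return [propagate(x, rng) for x in resampled]
```

In a diffusion setting, `propagate` would play the role of one reverse-diffusion transition and `log_potential` would encode agreement with the observation; both are left abstract here.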

Related content

Monte Carlo

MoDELS · Analysis · Sparse · Identifiable · Extensibility

October 5, 2023

Scalable Bayesian computation for crossed and nested hierarchical models

Omiros Papaspiliopoulos,Timothée Stumpf-Fétizon,Giacomo Zanella

We develop sampling algorithms to fit Bayesian hierarchical models, the computational complexity of which scales linearly with the number of observations and the number of parameters in the model. We focus on crossed random effect and nested multilevel models, which are used ubiquitously in applied sciences. The posterior dependence in both classes is sparse: in crossed random effects models it resembles a random graph, whereas in nested multilevel models it is tree-structured. For each class we identify a framework for scalable computation, building on previous work. Methods for crossed models are based on extensions of appropriately designed collapsed Gibbs samplers, where we introduce the idea of local centering; while methods for nested models are based on sparse linear algebra and data augmentation. We provide a theoretical analysis of the proposed algorithms in some simplified settings, including a comparison with previously proposed methodologies and an average-case analysis based on random graph theory. Numerical experiments, including two challenging real data analyses on predicting electoral results and real estate prices, compare with off-the-shelf Hamiltonian Monte Carlo, displaying drastic improvement in performance.

Transform · MoDELS · Transformer models · Reducible · INFORMS

October 4, 2023

A Study of Quantisation-aware Training on Time Series Transformer Models for Resource-constrained FPGAs

Tianheng Ling,Chao Qian,Lukas Einhaus,Gregor Schiele

from arxiv, 12 pages, 1 figure

This study explores the quantisation-aware training (QAT) on time series Transformer models. We propose a novel adaptive quantisation scheme that dynamically selects between symmetric and asymmetric schemes during the QAT phase. Our approach demonstrates that matching the quantisation scheme to the real data distribution can reduce computational overhead while maintaining acceptable precision. Moreover, our approach is robust when applied to real-world data and mixed-precision quantisation, where most objects are quantised to 4 bits. Our findings inform model quantisation and deployment decisions while providing a foundation for advancing quantisation techniques.
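
To make the distinction between the two schemes concrete, here is a minimal sketch of symmetric and asymmetric uniform quantisation at 4 bits. The function names are hypothetical, and the sketch does not reproduce the paper's adaptive selection rule; it only shows why the better choice depends on the data distribution.

```python
def quantise_symmetric(values, bits=4):
    """Symmetric scheme: zero-point fixed at 0, range [-max|x|, +max|x|]."""
    qmax = 2 ** (bits - 1) - 1                      # 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return [qi * scale for qi in q]                 # dequantised values


def quantise_asymmetric(values, bits=4):
    """Asymmetric scheme: the integer grid covers [min, max] via a zero-point."""
    qmin, qmax = 0, 2 ** bits - 1                   # 0..15 for 4 bits
    lo = min(min(values), 0.0)                      # keep 0 representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = round(-lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return [(qi - zero_point) * scale for qi in q]  # dequantised values


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
```

For one-sided data (for example, non-negative activations), the symmetric scheme wastes half of its integer levels on values that never occur, so the asymmetric scheme gives a smaller reconstruction error. For data centred on zero the two are close, and the symmetric scheme avoids the zero-point arithmetic.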

MoDELS · Gated Recurrent Unit · Recurrent Neural Network · Gating · Model evaluation

October 3, 2023

Parallelizing non-linear sequential models over the sequence length

Yi Heng Lim,Qi Zhu,Joshua Selfridge,Muhammad Firmansyah Kasim

Sequential models, such as Recurrent Neural Networks and Neural Ordinary Differential Equations, have long suffered from slow training due to their inherent sequential nature. For many years this bottleneck has persisted, as many thought sequential models could not be parallelized. We challenge this long-held belief with our parallel algorithm that accelerates GPU evaluation of sequential models by up to 3 orders of magnitude without compromising output accuracy. The algorithm does not need any special structure in the sequential models' architecture, making it applicable to a wide range of architectures. Using our method, training sequential models can be more than 10 times faster than the common sequential method without any meaningful difference in the training results. Leveraging this accelerated training, we discovered the efficacy of the Gated Recurrent Unit in a long time series classification problem with 17k time samples. By overcoming the training bottleneck, our work serves as the first step to unlock the potential of non-linear sequential models for long sequence problems.

Experience replay · Agent · Learning · HER · Guidance

October 3, 2023

Learning and reusing primitive behaviours to improve Hindsight Experience Replay sample efficiency

Francisco Roldan Sanchez,Qiang Wang,David Cordova Bulens,Kevin McGuinness,Stephen Redmond,Noel O'Connor

from arxiv, 5 pages, 2 figures, 1 algorithm. Submitted to ICARA 2024

Hindsight Experience Replay (HER) is a technique used in reinforcement learning (RL) that has proven to be very efficient for training off-policy RL-based agents to solve goal-based robotic manipulation tasks using sparse rewards. Even though HER improves the sample efficiency of RL-based agents by learning from mistakes made in past experiences, it does not provide any guidance while exploring the environment. This leads to very large training times due to the volume of experience required to train an agent using this replay strategy. In this paper, we propose a method that uses primitive behaviours that have been previously learned to solve simple tasks in order to guide the agent toward more rewarding actions during exploration while learning other more complex tasks. This guidance, however, is not executed by a manually designed curriculum, but rather using a critic network to decide at each timestep whether or not to use the actions proposed by the previously-learned primitive policies. We evaluate our method by comparing its performance against HER and other more efficient variations of this algorithm in several block manipulation tasks. We demonstrate the agents can learn a successful policy faster when using our proposed method, both in terms of sample efficiency and computation time. Code is available at //github.com/franroldans/qmp-her.

YOLOv5 · MoDELS · Optimizer · NWD · Loss function (machine learning)

October 3, 2023

Improvement and Enhancement of YOLOv5 Small Target Recognition Based on Multi-module Optimization

Qingyang Li,Yuchen Li,Hongyi Duan,JiaLiang Kang,Jianan Zhang,Xueqian Gan,Ruotong Xu

from arxiv, 8 pages, 10 figures

In this paper, we study the limitations of the YOLOv5s model on small target detection and propose improvements. The model's performance is enhanced by introducing a GhostNet-based convolutional module, RepGFPN-based Neck module optimization, CA and Transformer attention mechanisms, and an improved loss function using NWD. The experimental results validate the positive impact of these strategies on model precision, recall and mAP. In particular, the improved model performs markedly better on complex backgrounds and tiny targets in real-world application tests. This study provides an effective optimization strategy for the YOLOv5s model on small target detection and lays a foundation for future related research and applications.

Marginal likelihood function · Likelihood · Marginalization · Approximation · GROUP

October 2, 2023

Approximate Marginal Likelihood Inference in Mixed Models for Grouped Data

Alex Stringer

from arxiv, Appendix sections will be "supplementary material" upon publication

A method is introduced for approximate marginal likelihood inference via adaptive Gaussian quadrature in mixed models with a single grouping factor. The core technical contribution is an algorithm for computing the exact gradient of the approximate log marginal likelihood. This leads to efficient maximum likelihood via quasi-Newton optimization that is demonstrated to be faster than existing approaches based on finite-differenced gradients or derivative-free optimization. The method is specialized to Bernoulli mixed models with multivariate, correlated Gaussian random effects; here computations are performed using an inverse log-Cholesky parameterization of the Gaussian density that involves no matrix decomposition during model fitting, while Wald confidence intervals are provided for variance parameters on the original scale. Simulations give evidence of these intervals attaining nominal coverage if enough quadrature points are used, for data comprising a large number of very small groups exhibiting large between-group heterogeneity. In contrast, the Laplace approximation is shown to give especially poor coverage and high bias for data comprising a large number of small groups. Adaptive quadrature mitigates this, and the methods in this paper improve the computational feasibility of this more accurate method. All results may be reproduced using code available at //github.com/awstringer1/aghmm-paper-code.
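
For readers unfamiliar with the quadrature step, the following is a minimal sketch of how Gauss-Hermite quadrature approximates the marginal likelihood of one group in a Bernoulli model with a scalar Gaussian random intercept. It is deliberately simplified and is not the paper's algorithm: the rule is fixed at 3 nodes and is non-adaptive (an adaptive rule would recentre and rescale the nodes around the mode of the integrand), and the names `gaussian_expectation`, `group_marginal_likelihood`, `beta0`, and `sigma` are assumptions made for illustration.

```python
import math

# 3-point Gauss-Hermite rule for integrals of the form  ∫ exp(-x^2) f(x) dx.
_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
_WEIGHTS = (
    math.sqrt(math.pi) / 6,
    2 * math.sqrt(math.pi) / 3,
    math.sqrt(math.pi) / 6,
)


def gaussian_expectation(g, mu, sigma):
    """Approximate E[g(U)] for U ~ N(mu, sigma^2) via the change of
    variables u = mu + sqrt(2) * sigma * x."""
    total = sum(
        w * g(mu + math.sqrt(2.0) * sigma * x)
        for x, w in zip(_NODES, _WEIGHTS)
    )
    return total / math.sqrt(math.pi)


def group_marginal_likelihood(ys, beta0, sigma):
    """Marginal likelihood of binary responses ys in one group, with
    logit P(y = 1 | u) = beta0 + u and u ~ N(0, sigma^2)."""
    def conditional_likelihood(u):
        p = 1.0 / (1.0 + math.exp(-(beta0 + u)))
        out = 1.0
        for y in ys:
            out *= p if y == 1 else 1.0 - p
        return out

    return gaussian_expectation(conditional_likelihood, 0.0, sigma)
```

The 3-point rule is exact when the integrand is a polynomial of degree at most 5 in the random effect, which is why more nodes (and adaptive centring) are needed for sharply peaked conditional likelihoods.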

Estimation/Estimator · Ensemble · Square matrix · Training error · Analysis

October 2, 2023

Corrected generalized cross-validation for finite ensembles of penalized estimators

Pierre Bellec,Jin-Hong Du,Takuya Koriyama,Pratik Patil,Kai Tan

from arxiv, 82 pages, 25 figures

Generalized cross-validation (GCV) is a widely used method for estimating the squared out-of-sample prediction risk that employs a scalar degrees of freedom adjustment (in a multiplicative sense) to the squared training error. In this paper, we examine the consistency of GCV for estimating the prediction risk of arbitrary ensembles of penalized least squares estimators. We show that GCV is inconsistent for any finite ensemble of size greater than one. Towards repairing this shortcoming, we identify a correction that involves an additional scalar correction (in an additive sense) based on degrees of freedom adjusted training errors from each ensemble component. The proposed estimator (termed CGCV) maintains the computational advantages of GCV and requires no sample splitting, model refitting, or out-of-bag risk estimation. The estimator stems from a finer inspection of ensemble risk decomposition and two intermediate risk estimators for the components in this decomposition. We provide a non-asymptotic analysis of the CGCV and the two intermediate risk estimators for ensembles of convex penalized estimators under Gaussian features and a linear response model. In the special case of ridge regression, we extend the analysis to general feature and response distributions using random matrix theory, which establishes model-free uniform consistency of CGCV.
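
The multiplicative degrees-of-freedom adjustment in ordinary GCV can be sketched for the simplest case, a single ridge estimator with one feature and no intercept. This illustrates plain GCV only, not the CGCV correction proposed in the paper; the function name `ridge_gcv` and the penalty convention (minimising the residual sum of squares plus `lam * beta**2`) are assumptions.

```python
def ridge_gcv(xs, ys, lam):
    """GCV risk estimate for one-feature ridge regression without intercept.

    GCV = (training error) / (1 - df / n)^2, where df is the trace of the
    smoother matrix that maps ys to the fitted values.
    """
    n = len(xs)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    beta = sxy / (sxx + lam)                       # ridge coefficient
    train_err = sum((y - beta * x) ** 2 for x, y in zip(xs, ys)) / n
    df = sxx / (sxx + lam)                         # trace of x (x'x + lam)^-1 x'
    return train_err / (1.0 - df / n) ** 2
```

Because `df / n` lies strictly between 0 and 1 here, the denominator is below 1 and the GCV estimate is never smaller than the training error, which is the sense in which the adjustment inflates the optimistic in-sample error.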

Markovian · Markov · INTERACT · Statistic · Estimation/Estimator

October 2, 2023

On the Ziv-Merhav theorem beyond Markovianity

Nicholas Barnfield,Raphaël Grondin,Gaia Pozzoli,Renaud Raquépas

We generalize to a broader class of decoupled measures a result of Ziv and Merhav on universal estimation of the specific cross (or relative) entropy for a pair of multi-level Markov measures. The result covers pairs of suitably regular g-measures and pairs of equilibrium measures arising from the small space of interactions in mathematical statistical mechanics.

Statistic · Mean · Continuity · SimPLe · MoDELS

September 29, 2023

Simulations for Meta-analysis of Magnitude Measures

Elena Kulinskaya,David C. Hoaglin

from arxiv, 56 pages, 33 figures

Meta-analysis aims to combine effect measures from several studies. For continuous outcomes, the most popular effect measures use simple or standardized differences in sample means. However, a number of applications focus on the absolute values of these effect measures (i.e., unsigned magnitude effects). We provide statistical methods for meta-analysis of magnitude effects based on standardized mean differences. We propose a suitable statistical model for random-effects meta-analysis of absolute standardized mean differences (ASMD), investigate a number of statistical methods for point and interval estimation, and provide practical recommendations for choosing among them.

Smoothing · Attention mechanism · Backpropagation · Viterbi algorithm · Regularization term

February 20, 2018

Differentiable Dynamic Programming for Structured Prediction and Attention

Arthur Mensch,Mathieu Blondel

Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
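
The idea of smoothing the max operator inside a DP recursion can be sketched as follows, assuming one common choice of strongly convex regularizer, the negative entropy, for which the smoothed max is a scaled log-sum-exp and its gradient is a softmax. The function names and the `gamma` temperature parameter are hypothetical, and the chain scoring convention (`unary`, `transition`) is an assumption for illustration rather than the paper's exact formulation.

```python
import math


def smoothed_max(xs, gamma=1.0):
    """Entropy-regularised max: gamma * log(sum(exp(x / gamma)))."""
    m = max(xs)  # subtract the max for numerical stability
    return m + gamma * math.log(sum(math.exp((x - m) / gamma) for x in xs))


def smoothed_argmax(xs, gamma=1.0):
    """Gradient of smoothed_max with respect to xs (a softmax)."""
    m = max(xs)
    exps = [math.exp((x - m) / gamma) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def smoothed_viterbi_value(unary, transition, gamma=1.0):
    """Smoothed best-path score through a chain.

    unary[t][j]: score of state j at step t.
    transition[i][j]: score of moving from state i to state j.
    """
    n_states = len(unary[0])
    value = list(unary[0])
    for t in range(1, len(unary)):
        # Same recursion as Viterbi, with max replaced by smoothed_max.
        value = [
            unary[t][j]
            + smoothed_max(
                [value[i] + transition[i][j] for i in range(n_states)], gamma
            )
            for j in range(n_states)
        ]
    return smoothed_max(value, gamma)
```

As `gamma` shrinks towards 0 the result approaches the ordinary Viterbi score, while larger values trade exactness for a smooth, everywhere-differentiable objective.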