苹果电影在线观看免费高清,中文字幕一二三区乱码不卡,波野结女一区二区三区高清AV

Stochastic variational inference algorithms are derived for fitting various heteroskedastic time series models. We examine Gaussian, t, and skew-t response GARCH models and fit these using Gaussian variational approximating densities. We implement efficient stochastic gradient ascent procedures based on the use of control variates or the reparameterization trick and demonstrate that the proposed implementations provide a fast and accurate alternative to Markov chain Monte Carlo sampling. Additionally, we present sequential updating versions of our variational algorithms, which are suitable for efficient portfolio construction and dynamic asset allocation.

相關內容

向量空間

關注 0

模型評估 · Learning · 可約的 · MoDELS · 深度學習 ·

2023 年 10 月 16 日

Microscaling Data Formats for Deep Learning

Bita Darvish Rouhani,Ritchie Zhao,Ankit More,Mathew Hall,Alireza Khodamoradi,Summer Deng,Dhruv Choudhary,Marius Cornea,Eric Dellinger,Kristof Denolf,Stosic Dusan,Venmugil Elango,Maximilian Golub,Alexander Heinecke,Phil James-Roxby,Dharmesh Jani,Gaurav Kolhe,Martin Langhammer,Ada Li,Levi Melnick,Maral Mesmakhosroshahi,Andres Rodriguez,Michael Schulte,Rasoul Shafipour,Lei Shao,Michael Siu,Pradeep Dubey,Paulius Micikevicius,Maxim Naumov,Colin Verilli,Ralph Wittig,Eric Chung

Narrow bit-width data formats are key to reducing the computational and storage costs of modern deep learning applications. This paper evaluates Microscaling (MX) data formats that combine a per-block scaling factor with narrow floating-point and integer types for individual elements.MX formats balance the competing needs of hardware efficiency, model accuracy, and user friction. Empirical results on over two dozen benchmarks demonstrate practicality of MX data formats as a drop-in replacement for baseline FP32 for AI inference and training with low user friction. We also show the first instance of training generative language models at sub-8-bit weights, activations, and gradients with minimal accuracy loss and no modifications to the training recipe.

Analysis · binary · MoDELS · 估計/估計量 · SimPLe ·

2023 年 10 月 16 日

Sharp Bounds for Generalized Causal Sensitivity Analysis

Dennis Frauen,Valentyn Melnychuk,Stefan Feuerriegel

from arxiv, Accepted at NeurIPS 2023

Causal inference from observational data is crucial for many disciplines such as medicine and economics. However, sharp bounds for causal effects under relaxations of the unconfoundedness assumption (causal sensitivity analysis) are subject to ongoing research. So far, works with sharp bounds are restricted to fairly simple settings (e.g., a single binary treatment). In this paper, we propose a unified framework for causal sensitivity analysis under unobserved confounding in various settings. For this, we propose a flexible generalization of the marginal sensitivity model (MSM) and then derive sharp bounds for a large class of causal effects. This includes (conditional) average treatment effects, effects for mediation analysis and path analysis, and distributional effects. Furthermore, our sensitivity model is applicable to discrete, continuous, and time-varying treatments. It allows us to interpret the partial identification problem under unobserved confounding as a distribution shift in the latent confounders while evaluating the causal effect of interest. In the special case of a single binary treatment, our bounds for (conditional) average treatment effects coincide with recent optimality results for causal sensitivity analysis. Finally, we propose a scalable algorithm to estimate our sharp bounds from observational data.

MoDELS · state-of-the-art · HTTPS · 多峰值 · 樣本 ·

2023 年 10 月 15 日

A Recipe for Watermarking Diffusion Models

Yunqing Zhao,Tianyu Pang,Chao Du,Xiao Yang,Ngai-Man Cheung,Min Lin

Diffusion models (DMs) have demonstrated advantageous potential on generative tasks. Widespread interest exists in incorporating DMs into downstream applications, such as producing or editing photorealistic images. However, practical deployment and unprecedented power of DMs raise legal issues, including copyright protection and monitoring of generated content. In this regard, watermarking has been a proven solution for copyright protection and content monitoring, but it is underexplored in the DMs literature. Specifically, DMs generate samples from longer tracks and may have newly designed multimodal structures, necessitating the modification of conventional watermarking pipelines. To this end, we conduct comprehensive analyses and derive a recipe for efficiently watermarking state-of-the-art DMs (e.g., Stable Diffusion), via training from scratch or finetuning. Our recipe is straightforward but involves empirically ablated implementation details, providing a foundation for future research on watermarking DMs. The code is available at //github.com/yunqing-me/WatermarkDM.

Analysis · 離散化 · 講稿 · 鞍點 · Weight ·

2023 年 10 月 13 日

A Local Fourier Analysis for Additive Schwarz Smoothers

álvaro Pé de la Riva,Carmen Rodrigo,Francisco J. Gaspar,James H. Adler,Xiaozhe Hu,Ludmil Zikatanov

In this work, a local Fourier analysis is presented to study the convergence of multigrid methods based on additive Schwarz smoothers. This analysis is presented as a general framework which allows us to study these smoothers for any type of discretization and problem. The presented framework is crucial in practice since it allows one to know a priori the answer to questions such as what is the size of the patch to use within these relaxations, the size of the overlapping, or even the optimal values for the weights involved in the smoother. Results are shown for a class of additive and restricted additive Schwarz relaxations used within a multigrid framework applied to high-order finite-element discretizations and saddle point problems, which are two of the contexts in which these type of relaxations are widely used.

正則化項 · 正則表達式 · 推斷 · 講稿 ·

2023 年 10 月 12 日

A Completeness Theorem for Probabilistic Regular Expressions

Wojciech Ró?owski,Alexandra Silva

We introduce Probabilistic Regular Expressions (PRE), a probabilistic analogue of regular expressions denoting probabilistic languages in which every word is assigned a probability of being generated. We present and prove the completeness of an inference system for reasoning about probabilistic language equivalence of PRE based on Salomaa's axiomatisation of Kleene Algebra.

蒙特卡羅 · 可約的 · INFORMS · 成比例 · Next ·

2023 年 10 月 12 日

Pure Monte Carlo Counterfactual Regret Minimization

Ju Qi,Ting Feng,Falun Hei,Zhemei Fang,Yunfeng Luo

Counterfactual Regret Minimization (CFR) and its variants are the best algorithms so far for solving large-scale incomplete information games. However, we believe that there are two problems with CFR: First, matrix multiplication is required in CFR iteration, and the time complexity of one iteration is too high; Secondly, the game characteristics in the real world are different. Just using one CFR algorithm will not be perfectly suitable for all game problems. For these two problems, this paper proposes a new algorithm called Pure CFR (PCFR) based on CFR. PCFR can be seen as a combination of CFR and Fictitious Play (FP), inheriting the concept of counterfactual regret (value) from CFR, and using the best response strategy instead of the regret matching strategy for the next iteration. This algorithm has three advantages. First, PCFR can be combined with any CFR variant. The resulting Pure MCCFR (PMCCFR) can significantly reduce the time and space complexity of one iteration. Secondly, our experiments show that the convergence speed of the PMCCFR is 2$\sim$3 times that of the MCCFR. Finally, there is a type of game that is very suitable for PCFR, we call this type of game clear-game, which is characterized by a high proportion of dominated strategies. Experiments show that in clear-game, the convergence rate of PMCCFR is two orders of magnitude higher than that of MCCFR.

Subspace · 求逆 · 張成子空間 · 秩 · 有向 ·

2023 年 10 月 11 日

The Algebra for Stabilizer Codes

Cole Comfort

from arxiv, Fixed errors. Made presentation much cleaner. Made results more precise

There is a bijection between odd prime dimensional qudit pure stabilizer states modulo invertible scalars and affine Lagrangian subspaces of finite dimensional symplectic $\mathbb{F}_p$-vector spaces. In the language of the stabilizer formalism, full rank stabilizer tableaux are exactly the bases for affine Lagrangian subspaces. This correspondence extends to an isomorphism of props: the composition of stabilizer circuits corresponds to the relational composition of affine subspaces spanned by the tableaux, the tensor product corresponds to the direct sum. In this paper, we extend this correspondence between stabilizer circuits and tableaux to the mixed setting; regarding stabilizer codes as affine coisotropic subspaces (again only in odd prime qudit dimension/for qubit CSS codes). We show that by splitting the projector for a stabilizer code we recover the error detection protocol and the error correction protocol with affine classical processing power.

多峰值 · 模態 · INFORMS · MoDELS · 可約的 ·

2021 年 6 月 30 日

Attention Bottlenecks for Multimodal Fusion

Arsha Nagrani,Shan Yang,Anurag Arnab,Aren Jansen,Cordelia Schmid,Chen Sun

Humans perceive the world by concurrently processing and fusing high-dimensional inputs from multiple modalities such as vision and audio. Machine perception models, in stark contrast, are typically modality-specific and optimised for unimodal benchmarks, and hence late-stage fusion of final representations or predictions from each modality (`late-fusion') is still a dominant paradigm for multimodal video classification. Instead, we introduce a novel transformer based architecture that uses `fusion bottlenecks' for modality fusion at multiple layers. Compared to traditional pairwise self-attention, our model forces information between different modalities to pass through a small number of bottleneck latents, requiring the model to collate and condense the most relevant information in each modality and only share what is necessary. We find that such a strategy improves fusion performance, at the same time reducing computational cost. We conduct thorough ablation studies, and achieve state-of-the-art results on multiple audio-visual classification benchmarks including Audioset, Epic-Kitchens and VGGSound. All code and models will be released.

估計/估計量 · contrastive · INFORMS · 互信息 · 表示學習 ·

2021 年 6 月 25 日

Decomposed Mutual Information Estimation for Contrastive Representation Learning

Alessandro Sordoni,Nouha Dziri,Hannes Schulz,Geoff Gordon,Phil Bachman,Remi Tachet

from arxiv, ICML 2021

Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.

圖 · MoDELS · Continuity · 圖形處理器 · 隱藏層 ·

2020 年 6 月 7 日

Principal Neighbourhood Aggregation for Graph Nets

Gabriele Corso,Luca Cavalleri,Dominique Beaini,Pietro Liò,Petar Veli?kovi?

Graph Neural Networks (GNNs) have been shown to be effective models for different predictive tasks on graph-structured data. Recent work on their expressive power has focused on isomorphism tasks and countable feature spaces. We extend this theoretical framework to include continuous features - which occur regularly in real-world input domains and within the hidden layers of GNNs - and we demonstrate the requirement for multiple aggregation functions in this context. Accordingly, we propose Principal Neighbourhood Aggregation (PNA), a novel architecture combining multiple aggregators with degree-scalers (which generalize the sum aggregator). Finally, we compare the capacity of different models to capture and exploit the graph structure via a novel benchmark containing multiple tasks taken from classical graph theory, alongside existing benchmarks from real-world domains, all of which demonstrate the strength of our model. With this work, we hope to steer some of the GNN research towards new aggregation methods which we believe are essential in the search for powerful and robust models.