from arxiv, submitted to ICASSP 2024. Code on GitHub: //github.com/nii-yamagishilab/project-NN-Pytorch-scripts/tree/master/project/10-asvspoof-vocoded-trn-ssl

A speech spoofing countermeasure (CM) that discriminates between unseen spoofed and bona fide data requires diverse training data. While many datasets use spoofed data generated by speech synthesis systems, it was recently found that data vocoded by neural vocoders were also effective as the spoofed training data. Since many neural vocoders are fast in building and generation, this study used multiple neural vocoders and created more than 9,000 hours of vocoded data on the basis of the VoxCeleb2 corpus. This study investigates how this large-scale vocoded data can improve spoofing countermeasures that use data-hungry self-supervised learning (SSL) models. Experiments demonstrated that the overall CM performance on multiple test sets improved when using features extracted by an SSL model continually trained on the vocoded data. Further improvement was observed when using a new SSL distilled from the two SSLs before and after the continual training. The CM with the distilled SSL outperformed the previous best model on challenging unseen test sets, including the ASVspoof 2019 logical access, WaveFake, and In-the-Wild.

Related content

SSL

Importance sampling · Marginalization · Samples · Estimation/Estimator · Backward

October 26, 2023

Using autodiff to estimate posterior moments, marginals and samples

Sam Bowyer, Thomas Heap, Laurence Aitchison

Importance sampling is a popular technique in Bayesian inference: by reweighting samples drawn from a proposal distribution we are able to obtain samples and moment estimates from a Bayesian posterior over some $n$ latent variables. Recent work, however, indicates that importance sampling scales poorly -- in order to accurately approximate the true posterior, the required number of importance samples grows exponentially in the number of latent variables [Chatterjee and Diaconis, 2018]. Massively parallel importance sampling works around this issue by drawing $K$ samples for each of the $n$ latent variables and reasoning about all $K^n$ combinations of latent samples. In principle, we can reason efficiently over $K^n$ combinations of samples by exploiting conditional independencies in the generative model. However, in practice this requires complex algorithms that traverse backwards through the graphical model, and we need separate backward traversals for each computation (posterior expectations, marginals and samples). Our contribution is to exploit the source term trick from physics to entirely avoid the need to hand-write backward traversals. Instead, we demonstrate how to simply and easily compute all the required quantities -- posterior expectations, marginals and samples -- by differentiating through a slightly modified marginal likelihood estimator.
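The idea of reading a posterior moment off the derivative of a modified marginal likelihood estimator can be illustrated on ordinary importance sampling with a single latent variable. The sketch below is not the authors' massively parallel method. It assumes a toy model (z ~ N(0, 1), x | z ~ N(z, 1), proposal equal to the prior), and it uses a central finite difference as a stand-in for autodiff so that it runs on the Python standard library alone. All function names are hypothetical.

```python
import math
import random


def log_marginal_with_source(log_w, f_vals, J):
    """log( (1/K) * sum_k w_k * exp(J * f(z_k)) ), computed with the log-sum-exp trick."""
    terms = [lw + J * f for lw, f in zip(log_w, f_vals)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms) / len(terms))


def moment_by_differentiation(log_w, f_vals, eps=1e-5):
    """d/dJ of the modified log marginal likelihood at J = 0.

    A central finite difference replaces autodiff here (an assumption made
    only to keep the example dependency-free).
    """
    up = log_marginal_with_source(log_w, f_vals, +eps)
    down = log_marginal_with_source(log_w, f_vals, -eps)
    return (up - down) / (2.0 * eps)


def moment_by_reweighting(log_w, f_vals):
    """Standard self-normalized importance sampling estimate of E[f(z) | x]."""
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    return sum(wi * fi for wi, fi in zip(w, f_vals)) / sum(w)


def toy_problem(K=20000, x_obs=1.0, seed=0):
    """Toy model: z ~ N(0, 1), x | z ~ N(z, 1); the proposal is the prior,
    so the log importance weight is log p(x_obs | z_k)."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(K)]
    log_w = [-0.5 * (x_obs - zk) ** 2 - 0.5 * math.log(2.0 * math.pi) for zk in z]
    return z, log_w


z, log_w = toy_problem()
print(moment_by_differentiation(log_w, z))  # posterior mean via the source term
print(moment_by_reweighting(log_w, z))      # same quantity via explicit reweighting
```

In this toy model the exact posterior is N(0.5, 0.5), so both estimates should land near 0.5, and they should agree with each other up to finite-difference error.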

Smoothing · Robustness · Denoising · MoDELS · Model evaluation

October 25, 2023

Multi-scale Diffusion Denoised Smoothing

Jongheon Jeong, Jinwoo Shin

from arxiv, 24 pages; NeurIPS 2023; Code is available at //github.com/jh-jeong/smoothing-multiscale

Along with recent diffusion models, randomized smoothing has become one of a few tangible approaches that offer adversarial robustness to models at scale, e.g., large pre-trained models. Specifically, one can perform randomized smoothing on any classifier via a simple "denoise-and-classify" pipeline, so-called denoised smoothing, given that an accurate denoiser, such as a diffusion model, is available. In this paper, we investigate the trade-off between accuracy and certified robustness of denoised smoothing: for example, we ask which representation of a diffusion model would maximize the certified robustness of denoised smoothing. We consider a new objective that aims at collective robustness of smoothed classifiers across multiple noise levels with a shared diffusion model, which also suggests a new way to compensate for the accuracy cost that randomized smoothing pays for its certified robustness. This objective motivates us to fine-tune the diffusion model (a) to perform consistent denoising whenever the original image is recoverable, but (b) to generate rather diverse outputs otherwise. Our experiments show that this fine-tuning scheme for diffusion models, combined with multi-scale smoothing, makes strong certified robustness possible at the highest noise level while keeping accuracy closer to that of non-smoothed classifiers.
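The "denoise-and-classify" pipeline can be sketched as a Monte Carlo majority vote over Gaussian-perturbed copies of the input. This is a minimal illustration and not the paper's code. The denoiser (a clipping function) and the classifier (sign of the sum) are hypothetical toy stand-ins for a diffusion model and a pre-trained classifier, and no certified radius is computed.

```python
import random
from collections import Counter


def smoothed_predict(x, denoise, classify, sigma, n=500, seed=0):
    """Majority vote of classify(denoise(x + noise)) over n Gaussian noise draws.

    Returns the winning label and the fraction of votes it received.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[classify(denoise(noisy))] += 1
    label, count = votes.most_common(1)[0]
    return label, count / n


def toy_denoise(x):
    """Hypothetical denoiser: clip every coordinate to [-1, 1]."""
    return [max(-1.0, min(1.0, xi)) for xi in x]


def toy_classify(x):
    """Hypothetical base classifier: class 1 if the coordinates sum to a positive value."""
    return int(sum(x) > 0)


label, vote_share = smoothed_predict([0.8, 0.9, 0.7, 1.0], toy_denoise, toy_classify, sigma=0.25)
print(label, vote_share)
```

A multi-scale variant would repeat this vote at several values of `sigma` while sharing the same denoiser, which is the setting the abstract describes.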

Reservoir computing · Performer · Channel · state-of-the-art · Optimizer

October 25, 2023

Multi-parallel-task Time-delay Reservoir Computing combining a Silicon Microring with WDM

Bernard J. Giron Castro, Christophe Peucheret, Darko Zibar, Francesco Da Ros

from arxiv, 3 pages, 2 figures, Submitted to Optical Fiber Communication Conference (OFC) 2024

We numerically demonstrate a microring-based time-delay reservoir computing scheme that simultaneously solves three tasks involving time-series prediction, classification, and wireless channel equalization. Each task performed on a wavelength-multiplexed channel achieves state-of-the-art performance with optimized power and frequency detuning.

MoDELS · Extensibility · Deterministic model · Linear · Operation

October 25, 2023

A coherent differential PCF

Thomas Ehrhard

The categorical models of the differential lambda-calculus are additive categories because of the Leibniz rule which requires the summation of two expressions. This means that, as far as the differential lambda-calculus and differential linear logic are concerned, these models feature finite non-determinism and indeed these languages are essentially non-deterministic. In a previous paper we introduced a categorical framework for differentiation which does not require additivity and is compatible with deterministic models such as coherence spaces and probabilistic models such as probabilistic coherence spaces. Based on this semantics we develop a syntax of a deterministic version of the differential lambda-calculus. One nice feature of this new approach to differentiation is that it is compatible with general fixpoints of terms, so our language is actually a differential extension of PCF for which we provide a fully deterministic operational semantics.

Scaling · Machine Learning · Learning · Variance · Variance scaling

October 24, 2023

Can bin-wise scaling improve consistency and adaptivity of prediction uncertainty for machine learning regression?

Pascal Pernot

from arxiv, This version corrects an error in the estimation of the Sx scores for the test set, affecting Fig. 2 and Tables I-III of the initial version. The main points of the discussion and the conclusions are unchanged

Binwise Variance Scaling (BVS) has recently been proposed as a post hoc recalibration method for prediction uncertainties of machine learning regression problems that is capable of more efficient corrections than uniform variance (or temperature) scaling. The original version of BVS uses uncertainty-based binning, which aims to improve calibration conditionally on uncertainty, i.e. consistency. I explore here several adaptations of BVS, in particular with alternative loss functions and a binning scheme based on an input feature (X) in order to improve adaptivity, i.e. calibration conditional on X. The performances of BVS and its proposed variants are tested on a benchmark dataset for the prediction of atomization energies and compared to the results of isotonic regression.
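One plausible reading of uncertainty-based bin-wise scaling is sketched below. It sorts the calibration points by predicted uncertainty, splits them into equal-count bins, and picks one scale factor per bin so that the mean squared z-score (error divided by uncertainty) in that bin becomes 1. The equal-count binning, the zero-mean-error assumption behind using the mean of z² rather than its variance, and all function names are assumptions of this sketch, not details taken from the paper.

```python
import math
import random


def fit_bvs(uncertainties, errors, n_bins=5):
    """Fit one scale factor per uncertainty bin.

    Returns (upper_edges, scales): the largest uncertainty in each bin and the
    factor s_b = sqrt(mean(z^2)) with z = error / uncertainty in that bin.
    """
    order = sorted(range(len(uncertainties)), key=lambda i: uncertainties[i])
    size = len(order) // n_bins
    upper_edges, scales = [], []
    for b in range(n_bins):
        start = b * size
        stop = (b + 1) * size if b < n_bins - 1 else len(order)
        idx = order[start:stop]
        mean_z2 = sum((errors[i] / uncertainties[i]) ** 2 for i in idx) / len(idx)
        scales.append(math.sqrt(mean_z2))
        upper_edges.append(uncertainties[idx[-1]])
    return upper_edges, scales


def apply_bvs(u, upper_edges, scales):
    """Rescale one uncertainty with the factor of the first bin whose upper edge covers it."""
    for edge, s in zip(upper_edges, scales):
        if u <= edge:
            return s * u
    return scales[-1] * u  # above every edge: reuse the last bin's factor


def synthetic_data(n=5000, seed=0):
    """Hypothetical test data: reported uncertainties are too small by a factor of 2."""
    rng = random.Random(seed)
    u = [rng.uniform(0.5, 2.0) for _ in range(n)]
    e = [rng.gauss(0.0, 2.0 * ui) for ui in u]
    return u, e


u, e = synthetic_data()
edges, scales = fit_bvs(u, e)
print(scales)  # each factor should be close to 2 for this synthetic data
```

Binning on an input feature X instead of on the uncertainty would only change the sort key in `fit_bvs` and the lookup in `apply_bvs`.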

Comprehensibility · SimPLe · Analysis · Principle · CASES

October 24, 2023

The Quantum Tortoise and the Classical Hare: A simple framework for understanding which problems quantum computing will accelerate (and which it will not)

Sukwoong Choi, William S. Moses, Neil Thompson

Quantum computing promises transformational gains for solving some problems, but little to none for others. For anyone hoping to use quantum computers now or in the future, it is important to know which problems will benefit. In this paper, we introduce a framework for answering this question both intuitively and quantitatively. The underlying structure of the framework is a race between quantum and classical computers, where their relative strengths determine when each wins. While classical computers operate faster, quantum computers can sometimes run more efficient algorithms. Whether the speed advantage or the algorithmic advantage dominates determines whether a problem will benefit from quantum computing or not. Our analysis reveals that many problems, particularly those of small to moderate size that can be important for typical businesses, will not benefit from quantum computing. Conversely, larger problems or those with particularly big algorithmic gains will benefit from near-term quantum computing. Since very large algorithmic gains are rare in practice and theorized to be rare even in principle, our analysis suggests that the benefits from quantum computing will flow either to users of these rare cases, or practitioners processing very large data.
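The race described above can be made concrete by searching for the problem size at which a slower quantum machine running a better algorithm overtakes a faster classical one. The sketch below is illustrative only. The step-count functions, the slowdown factor of 10^6, and the function name are hypothetical choices, and the search assumes that once the quantum machine is ahead it stays ahead.

```python
import math


def crossover_size(classical_steps, quantum_steps, slowdown, n_max=10**18):
    """Smallest problem size n at which slowdown * quantum_steps(n) <= classical_steps(n).

    slowdown is how many times slower one quantum step is than one classical
    step. Returns None if no crossover is found up to n_max.
    """
    def quantum_wins(n):
        return slowdown * quantum_steps(n) <= classical_steps(n)

    if quantum_wins(1):
        return 1
    lo, hi = 1, 2
    while not quantum_wins(hi):      # double until the quantum machine is ahead
        lo, hi = hi, hi * 2
        if hi > n_max:
            return None
    while hi - lo > 1:               # bisect: quantum loses at lo, wins at hi
        mid = (lo + hi) // 2
        if quantum_wins(mid):
            hi = mid
        else:
            lo = mid
    return hi


# Unstructured search: about n classical steps versus about sqrt(n) quantum steps.
search = crossover_size(lambda n: n, lambda n: math.sqrt(n), slowdown=10**6)
# No algorithmic gain: the same step count on both machines.
no_gain = crossover_size(lambda n: n, lambda n: n, slowdown=10**6)
print(search, no_gain)
```

With a quadratic algorithmic gain the crossover sits at the square of the slowdown, so a large speed gap pushes the benefit out to very large problems. With no algorithmic gain there is no crossover at all.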

MoDELS · Analysis · University of Pennsylvania · Inference · Learning

October 23, 2023

Uncovering patterns for adverse pregnancy outcomes with a Bayesian spatial model: Evidence from Philadelphia

Cecilia Balocchi, Ray Bai, Jessica Liu, Silvia P. Canelón, Edward I. George, Yong Chen, Mary R. Boland

from arxiv, 29 pages, 7 figures, 9 tables

We introduce a Bayesian conditional autoregressive model for analyzing patient-specific and neighborhood risks of stillbirth and preterm birth within a city. Our fully Bayesian approach automatically learns the amount of spatial heterogeneity and spatial dependence between neighborhoods. Our model provides meaningful inferences and uncertainty quantification for both covariate effects and neighborhood risk probabilities through their posterior distributions. We apply our methodology to data from the city of Philadelphia. Using electronic health records (45,919 deliveries at hospitals within the University of Pennsylvania Health System) and United States Census Bureau data from 363 census tracts in Philadelphia, we find that both patient-level characteristics (e.g. self-identified race/ethnicity) and neighborhood-level characteristics (e.g. violent crime) are highly associated with patients' odds of stillbirth or preterm birth. Our neighborhood risk analysis further reveals that census tracts in West Philadelphia and North Philadelphia are at highest risk of these outcomes. Specifically, neighborhoods with higher rates of women in poverty or on public assistance have greater neighborhood risk for these outcomes, while neighborhoods with higher rates of college-educated women or women in the labor force have lower risk. Our findings could be useful for targeted individual and neighborhood interventions.

Mean · Same · Decomposable · Statistical theory

October 22, 2023

The total variation distance between high-dimensional Gaussians with the same mean

Luc Devroye, Abbas Mehrabian, Tommy Reddad

from arxiv, In an earlier version, tight bounds were claimed for the total-variation distance between two general Gaussians. But the proof of the upper bound was incorrect, and we removed the flawed bound from the paper. Later, Arbas, Ashtiani, and Liaw (Theorem 1.8 in arxiv.org/abs/2303.04288v2) proved tight bounds for the total-variation distance between two general Gaussians, solving the original problem

Given two high-dimensional Gaussians with the same mean, we prove a lower and an upper bound for their total variation distance, which are within a constant factor of one another.
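To make the quantity being bounded concrete, the sketch below computes the total variation distance in the one-dimensional special case only, where two zero-mean Gaussians with different standard deviations admit a closed form (their densities cross at ±c). It then checks that value against a midpoint-rule integral of half the absolute density difference. It does not reproduce the paper's high-dimensional bounds, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def tv_same_mean_1d(sigma1, sigma2):
    """Closed-form TV distance between N(0, sigma1^2) and N(0, sigma2^2)."""
    if sigma1 == sigma2:
        return 0.0
    s1, s2 = sorted((sigma1, sigma2))
    # The two densities are equal at x = +/- c.
    c = math.sqrt(s1**2 * s2**2 * math.log(s2**2 / s1**2) / (s2**2 - s1**2))
    phi = NormalDist().cdf
    # The narrower Gaussian puts more mass on [-c, c]; TV is that excess mass.
    return 2.0 * (phi(c / s1) - phi(c / s2))


def tv_numeric_1d(sigma1, sigma2, steps=20000, half_width=10.0):
    """0.5 * integral of |p - q| by the midpoint rule on a wide interval."""
    p, q = NormalDist(0.0, sigma1), NormalDist(0.0, sigma2)
    limit = half_width * max(sigma1, sigma2)
    h = 2.0 * limit / steps
    total = 0.0
    for i in range(steps):
        x = -limit + (i + 0.5) * h
        total += abs(p.pdf(x) - q.pdf(x))
    return 0.5 * total * h


print(tv_same_mean_1d(1.0, 2.0), tv_numeric_1d(1.0, 2.0))
```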

Language modeling · MoDELS · IR · Likelihood · Masked language modeling

October 20, 2020

PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval

Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Xiang Ji, Xueqi Cheng

from arxiv, Accepted by WSDM2021

Recently pre-trained language representation models such as BERT have shown great success when fine-tuned on downstream tasks including information retrieval (IR). However, pre-training objectives tailored for ad-hoc retrieval have not been well explored. In this paper, we propose Pre-training with Representative wOrds Prediction (PROP) for ad-hoc retrieval. PROP is inspired by the classical statistical language model for IR, specifically the query likelihood model, which assumes that the query is generated as the piece of text representative of the "ideal" document. Based on this idea, we construct the representative words prediction (ROP) task for pre-training. Given an input document, we sample a pair of word sets according to the document language model, where the set with higher likelihood is deemed as more representative of the document. We then pre-train the Transformer model to predict the pairwise preference between the two word sets, jointly with the Masked Language Model (MLM) objective. By further fine-tuning on a variety of representative downstream ad-hoc retrieval tasks, PROP achieves significant improvements over baselines without pre-training or with other pre-training methods. We also show that PROP can achieve exciting performance under both the zero- and low-resource IR settings. The code and pre-trained models are available at //github.com/Albert-Ma/PROP.
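The construction of one ROP training pair can be sketched as follows: build a document language model, sample two word sets from it, and label the one with the higher likelihood as more representative. This is a simplified illustration rather than the released PROP code. The add-alpha smoothed unigram model, the fixed set size, sampling with replacement, and the hinge loss used for the pairwise preference are assumptions of this sketch, and all names are hypothetical.

```python
import math
import random
from collections import Counter


def unigram_lm(doc_tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram language model of one document over a fixed vocabulary."""
    counts = Counter(doc_tokens)
    total = len(doc_tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in sorted(vocab)}


def sample_word_set(lm, size, rng):
    """Draw `size` words from the document language model (with replacement)."""
    words = list(lm)
    return rng.choices(words, weights=[lm[w] for w in words], k=size)


def log_likelihood(word_set, lm):
    """Log-probability of the word set under the document language model."""
    return sum(math.log(lm[w]) for w in word_set)


def make_rop_pair(doc_tokens, vocab, size=3, seed=0):
    """Return (more_representative, less_representative) word sets for one document."""
    rng = random.Random(seed)
    lm = unigram_lm(doc_tokens, vocab)
    a = sample_word_set(lm, size, rng)
    b = sample_word_set(lm, size, rng)
    if log_likelihood(a, lm) >= log_likelihood(b, lm):
        return a, b
    return b, a


def pairwise_hinge(score_pos, score_neg, margin=1.0):
    """Pairwise preference loss on model scores (hinge form is an assumption here)."""
    return max(0.0, margin - score_pos + score_neg)


doc = "neural ranking models rank documents for a query using neural networks".split()
vocab = set(doc) | {"weather", "football"}
pos, neg = make_rop_pair(doc, vocab)
print(pos, neg)
```

During pre-training, the Transformer would score each word set against the document, and a loss such as `pairwise_hinge` would push the score of `pos` above that of `neg`.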

Text classification · Language modeling · BERT · state-of-the-art · MoDELS

May 14, 2019

How to Fine-Tune BERT for Text Classification?

Chi Sun, Xipeng Qiu, Yige Xu, Xuanjing Huang

Language model pre-training has proven to be useful in learning universal language representations. As a state-of-the-art language model pre-training model, BERT (Bidirectional Encoder Representations from Transformers) has achieved amazing results in many language understanding tasks. In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on the text classification task and provide a general solution for BERT fine-tuning. Finally, the proposed solution obtains new state-of-the-art results on eight widely studied text classification datasets.