We introduce a soft-detection variant of Guessing Random Additive Noise Decoding (GRAND) called Quantized GRAND (QGRAND) that can efficiently decode any moderate redundancy block-code of any length in an algorithm that is suitable for highly parallelized implementation in hardware. QGRAND can avail of any level of quantized soft information, is established to be almost capacity achieving, and is shown to provide near maximum likelihood decoding performance when provided with five or more bits of soft information per received bit.
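The guessing idea behind GRAND can be illustrated with a minimal sketch. This is plain hard-detection GRAND, not the quantized variant described above: it guesses noise patterns in order of increasing Hamming weight and stops at the first guess that leaves a valid codeword. The (7,4) Hamming parity-check matrix, the function names, and the `max_weight` cutoff are illustrative assumptions, not taken from the paper; a soft-detection variant would instead order its guesses using reliability information.

```python
from itertools import combinations

def syndrome(H, word):
    """Return H * word^T over GF(2) as a tuple of bits."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)

def grand_decode(H, received, max_weight=3):
    """Guess noise patterns in order of increasing Hamming weight.

    Returns (codeword, noise) for the first guess whose removal yields a
    valid codeword, or None if no guess up to max_weight succeeds.
    """
    n = len(received)
    for weight in range(max_weight + 1):
        for flips in combinations(range(n), weight):
            noise = [0] * n
            for i in flips:
                noise[i] = 1
            # Subtract (XOR) the guessed noise from the received word.
            candidate = [r ^ e for r, e in zip(received, noise)]
            if not any(syndrome(H, candidate)):
                return candidate, noise
    return None

# Parity-check matrix of the (7,4) Hamming code (illustrative choice).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
sent = [0, 0, 0, 0, 0, 0, 0]
received = [0, 0, 1, 0, 0, 0, 0]  # one bit flipped by the channel
decoded, noise = grand_decode(H, received)
```

Because the decoder only needs a codebook membership test, the same loop works for any block code given its parity-check matrix, and the guesses at a given weight are independent, so they can be checked in parallel.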

Related content

Decoding

Learning · MoDELS · Generalization theory · Video captioning (Video Caption) · Sparsity

June 9, 2022

Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs

Jinguo Zhu,Xizhou Zhu,Wenhai Wang,Xiaohua Wang,Hongsheng Li,Xiaogang Wang,Jifeng Dai

To build an artificial neural network like the biological intelligence system, recent works have unified numerous tasks into a generalist model, which can process various tasks with shared parameters and does not have any task-specific modules. While generalist models achieve promising results on various benchmarks, they suffer performance degradation on some tasks compared with task-specialized models. In this work, we find that interference among different tasks and modalities is the main cause of this phenomenon. To mitigate such interference, we introduce Conditional Mixture-of-Experts (Conditional MoEs) to generalist models. Routing strategies under different levels of conditions are proposed to take both the training/inference cost and generalization ability into account. By incorporating the proposed Conditional MoEs, the recently proposed generalist model Uni-Perceiver can effectively mitigate the interference across tasks and modalities, and achieves state-of-the-art results on a series of downstream tasks via prompt tuning on 1% of downstream data. Moreover, the introduction of Conditional MoEs preserves the ability of generalist models to conduct zero-shot inference on new tasks, e.g., video-text retrieval and video captioning. Code and pre-trained generalist models will be released.

Linearly separable · Separable · Neural Networks · Networking · Linear

June 9, 2022

Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks

Huishuai Zhang,Da Yu,Yiping Lu,Di He

from arxiv, 13 pages

Adversarial examples, which are usually generated for specific inputs with a specific model, are ubiquitous for neural networks. In this paper we unveil a surprising property of adversarial noises when they are put together, i.e., adversarial noises crafted by one-step gradient methods are linearly separable if equipped with the corresponding labels. We theoretically prove this property for a two-layer network with randomly initialized entries and the neural tangent kernel setup where the parameters are not far from initialization. The proof idea is to show the label information can be efficiently backpropagated to the input while keeping the linear separability. Our theory and experimental evidence further show that the linear classifier trained with the adversarial noises of the training data can well classify the adversarial noises of the test data, indicating that adversarial noises actually inject a distributional perturbation to the original data distribution. Furthermore, we empirically demonstrate that the adversarial noises may become less linearly separable when the above conditions are compromised while they are still much easier to classify than original features.
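As a rough illustration of what a one-step gradient adversarial noise looks like, the sketch below computes a sign-gradient (FGSM-style) perturbation for a plain logistic model. This is a deliberate simplification: the paper analyses two-layer networks, whereas the linear model, the function names, and the toy numbers here are assumptions chosen so the arithmetic is easy to follow. For this model the loss is $\log(1 + e^{-y\, w \cdot x})$ and its gradient with respect to the input is $-y\, w / (1 + e^{y\, w \cdot x})$.

```python
import math

def sign(v):
    """Return -1, 0 or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

def fgsm_noise(w, x, y, eps):
    """One-step sign-gradient noise for a logistic model with label y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    # Gradient of log(1 + exp(-margin)) with respect to the input x.
    scale = -y / (1.0 + math.exp(margin))
    grad = [scale * wi for wi in w]
    return [eps * sign(g) for g in grad]

w = [1.0, -2.0, 0.5]           # toy weights (assumption)
x = [0.3, 0.1, -0.2]           # toy input (assumption)
noise_pos = fgsm_noise(w, x, +1, 0.1)
noise_neg = fgsm_noise(w, x, -1, 0.1)
```

In this toy case the noise for label +1 is the exact negative of the noise for label -1, so the two classes of noise vectors are trivially separated by a hyperplane; the abstract's claim is that a similar separability holds in the much less obvious two-layer setting.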

Robustness · Estimation/estimator · Noise · Estimation error · Analysis

June 9, 2022

Robust Matrix Completion with Heavy-tailed Noise

Bingyan Wang,Jianqing Fan

This paper studies low-rank matrix completion in the presence of heavy-tailed and possibly asymmetric noise, where we aim to estimate an underlying low-rank matrix given a set of highly incomplete noisy entries. Though the matrix completion problem has attracted much attention in the past decade, there is still a lack of theoretical understanding when the observations are contaminated by heavy-tailed noise. Prior theory falls short of explaining the empirical results and is unable to capture the optimal dependence of the estimation error on the noise level. In this paper, we adopt an adaptive Huber loss to accommodate heavy-tailed noise, which is robust against large and possibly asymmetric errors when the parameter in the loss function is carefully designed to balance the Huberization biases and robustness to outliers. Then, we propose an efficient nonconvex algorithm via a balanced low-rank Burer-Monteiro matrix factorization and gradient descent with robust spectral initialization. We prove that under merely a bounded second-moment condition on the error distributions, rather than the sub-Gaussian assumption, the Euclidean error of the iterates generated by the proposed algorithm decreases geometrically fast until achieving a minimax-optimal statistical estimation error, which has the same order as that in the sub-Gaussian case. The key technique behind this significant advancement is a powerful leave-one-out analysis framework. The theoretical results are corroborated by our simulation studies.
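For readers unfamiliar with the Huber loss, a minimal sketch follows. It shows the standard definition with a fixed threshold `tau`: quadratic for small residuals and linear beyond the threshold, so that large errors contribute a bounded gradient. How the paper adapts the threshold is not reproduced here; the function names and the sample values are assumptions for illustration only.

```python
def huber_loss(residual, tau):
    """Standard Huber loss: quadratic within [-tau, tau], linear outside."""
    a = abs(residual)
    if a <= tau:
        return 0.5 * residual * residual
    return tau * a - 0.5 * tau * tau

def huber_grad(residual, tau):
    """Derivative of the Huber loss: the residual clipped to [-tau, tau]."""
    if residual > tau:
        return tau
    if residual < -tau:
        return -tau
    return residual

small = huber_loss(1.0, 2.0)    # inside the quadratic region
large = huber_loss(10.0, 2.0)   # inside the linear region
```

The clipped gradient is what limits the influence of outliers: a residual of 10 and a residual of 1000 pull on the estimate with the same force `tau`. A larger `tau` reduces the bias introduced by clipping but weakens that protection, which is the trade-off the abstract refers to.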

Learning · Bayesian network · Identifiable · Networking · Functional

June 9, 2022

Learning Multitask Gaussian Bayesian Networks

Shuai Liu,Yixuan Qiu,Baojuan Li,Huaning Wang,Xiangyu Chang

Major depressive disorder (MDD) requires study of brain functional connectivity alterations for patients, which can be uncovered by resting-state functional magnetic resonance imaging (rs-fMRI) data. We consider the problem of identifying alterations of brain functional connectivity for a single MDD patient. This is particularly difficult since the amount of data collected during an fMRI scan is too limited to provide sufficient information for individual analysis. Additionally, rs-fMRI data usually has the characteristics of incompleteness, sparsity, variability, high dimensionality and high noise. To address these problems, we propose a multitask Gaussian Bayesian network (MTGBN) framework capable of identifying individual disease-induced alterations for MDD patients. We assume that such disease-induced alterations show some degree of similarity across patients, so that the network structures can be learned jointly from related tasks. First, we treat each patient in a class of observations as a task and then learn the Gaussian Bayesian networks (GBNs) of this data class by learning from all tasks that share a default covariance matrix that encodes prior knowledge. This setting can help us to learn more information from limited data. Next, we derive a closed-form formula of the complete likelihood function and use the Monte-Carlo Expectation-Maximization (MCEM) algorithm to search for the approximately best Bayesian network structures efficiently. Finally, we assess the performance of our methods with simulated and real-world rs-fMRI data.

Robustness · Noise · VQ-VAE · Mask · Performer

June 8, 2022

Robust Semantic Communications with Masked VQ-VAE Enabled Codebook

Qiyu Hu,Guangyi Zhang,Zhijin Qin,Yunlong Cai,Guanding Yu,Geoffrey Ye Li

from arxiv, 30 pages, 11 figures. arXiv admin note: text overlap with arXiv:2202.03338

Although semantic communications have exhibited satisfactory performance for a large number of tasks, the impact of semantic noise and the robustness of the systems have not been well investigated. Semantic noise refers to the mismatch between the intended semantic symbols and the received ones, which causes tasks to fail. In this paper, we first propose a framework for robust end-to-end semantic communication systems to combat the semantic noise. In particular, we analyze sample-dependent and sample-independent semantic noise. To combat the semantic noise, adversarial training with weight perturbation is developed to incorporate the samples with semantic noise in the training dataset. Then, we propose to mask a portion of the input, where the semantic noise appears frequently, and design the masked vector quantized-variational autoencoder (VQ-VAE) with the noise-related masking strategy. We use a discrete codebook shared by the transmitter and the receiver for encoded feature representation. To further improve the system robustness, we develop a feature importance module (FIM) to suppress the noise-related and task-unrelated features. Thus, the transmitter simply needs to transmit the indices of these important task-related features in the codebook. Simulation results show that the proposed method can be applied in many downstream tasks and significantly improves the robustness against semantic noise with a remarkable reduction in the transmission overhead.

Learning · Reducible · Noise · Scaling · Performer

June 8, 2022

Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance

Jakob Hollenstein,Sayantan Auddy,Matteo Saveriano,Erwan Renaudo,Justus Piater

Many deep reinforcement learning algorithms rely on simple forms of exploration, such as the additive action noise often used in continuous control domains. Typically, the scaling factor of this action noise is chosen as a hyper-parameter and kept constant during training. In this paper, we analyze how the learned policy is impacted by the noise type, the noise scale, and the reduction of the scaling factor over time. We consider the two most prominent types of action noise, Gaussian and Ornstein-Uhlenbeck noise, and perform a vast experimental campaign by systematically varying the noise type and scale parameter, and by measuring variables of interest like the expected return of the policy and the state space coverage during exploration. For the latter, we propose a novel state-space coverage measure $\operatorname{X}_{\mathcal{U}\text{rel}}$ that is more robust to boundary artifacts than previously proposed measures. Larger noise scales generally increase state space coverage. However, we found that increasing the space coverage using a larger noise scale is often not beneficial. On the contrary, reducing the noise scale over the training process reduces the variance and generally improves the learning performance. We conclude that the best noise type and scale are environment dependent, and based on our observations, derive heuristic rules for guiding the choice of the action noise as a starting point for further optimization.
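The two noise types named above, and the idea of shrinking the noise scale during training, can be sketched as follows. This is a minimal illustration under stated assumptions: the Ornstein-Uhlenbeck process is discretised with a simple Euler-Maruyama step, the defaults `theta=0.15` and `dt=0.01` are common illustrative values rather than the paper's settings, and the linear schedule in `linear_decay` is a hypothetical choice, since the abstract does not say which schedule was used.

```python
import random

def gaussian_noise(steps, sigma, rng):
    """Independent Gaussian action noise: each step is a fresh draw."""
    return [rng.gauss(0.0, sigma) for _ in range(steps)]

def ou_noise(steps, sigma, rng, theta=0.15, dt=0.01, x0=0.0):
    """Temporally correlated noise from dx = -theta * x dt + sigma dW."""
    x = x0
    out = []
    for _ in range(steps):
        x += -theta * x * dt + sigma * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def linear_decay(step, total_steps, start=1.0, end=0.1):
    """Hypothetical schedule: scale factor moves linearly from start to end."""
    frac = min(step / total_steps, 1.0)
    return start + frac * (end - start)

rng = random.Random(0)
white = gaussian_noise(5, 0.2, rng)
correlated = ou_noise(5, 0.2, rng)
scales = [linear_decay(t, 100) for t in (0, 50, 100)]
```

The practical difference is that consecutive Gaussian samples are independent, while each Ornstein-Uhlenbeck sample stays close to the previous one, which pushes the agent in a consistent direction for several steps.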

Learning · Gaussian mixture (model) · Estimation/estimator · Continuity · STOC

June 7, 2022

Continuous LWE is as Hard as LWE & Applications to Learning Gaussian Mixtures

Aparna Gupte,Neekon Vafa,Vinod Vaikuntanathan

from arxiv, Fixed bugs in Lemma 9 and Section 6

We show direct and conceptually simple reductions between the classical learning with errors (LWE) problem and its continuous analog, CLWE (Bruna, Regev, Song and Tang, STOC 2021). This allows us to bring to bear the powerful machinery of LWE-based cryptography to the applications of CLWE. For example, we obtain the hardness of CLWE under the classical worst-case hardness of the gap shortest vector problem. Previously, this was known only under quantum worst-case hardness of lattice problems. More broadly, with our reductions between the two problems, any future developments to LWE will also apply to CLWE and its downstream applications. As a concrete application, we show an improved hardness result for density estimation for mixtures of Gaussians. In this computational problem, given sample access to a mixture of Gaussians, the goal is to output a function that estimates the density function of the mixture. Under the (plausible and widely believed) exponential hardness of the classical LWE problem, we show that Gaussian mixture density estimation in $\mathbb{R}^n$ with roughly $\log n$ Gaussian components given $\mathsf{poly}(n)$ samples requires time quasi-polynomial in $n$. Under the (conservative) polynomial hardness of LWE, we show hardness of density estimation for $n^{\epsilon}$ Gaussians for any constant $\epsilon > 0$, which improves on Bruna, Regev, Song and Tang (STOC 2021), who show hardness for at least $\sqrt{n}$ Gaussians under polynomial (quantum) hardness assumptions. Our key technical tool is a reduction from classical LWE to LWE with $k$-sparse secrets where the multiplicative increase in the noise is only $O(\sqrt{k})$, independent of the ambient dimension $n$.
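As background for the classical LWE problem mentioned above, the sketch below generates toy LWE samples $(a, b)$ with $b = \langle a, s \rangle + e \bmod q$ for a secret $s$. It only illustrates the shape of the problem and does not implement any of the reductions: the parameters are far too small to be secure, the rounded Gaussian is a stand-in for a proper discrete Gaussian, and the function name and the sparse example secret are assumptions.

```python
import random

def lwe_samples(secret, q, num_samples, noise_width, rng):
    """Return a list of (a, b) with b = <a, secret> + e mod q."""
    n = len(secret)
    samples = []
    for _ in range(num_samples):
        a = [rng.randrange(q) for _ in range(n)]
        # Rounded Gaussian error (illustrative stand-in for a discrete Gaussian).
        e = round(rng.gauss(0.0, noise_width))
        b = (sum(ai * si for ai, si in zip(a, secret)) + e) % q
        samples.append((a, b))
    return samples

rng = random.Random(0)
secret = [1, 0, 0, 1, 0, 0, 0, 0]   # toy secret with only 2 nonzero entries
noisy = lwe_samples(secret, q=97, num_samples=4, noise_width=1.0, rng=rng)
clean = lwe_samples(secret, q=97, num_samples=4, noise_width=0.0, rng=rng)
```

With `noise_width=0.0` the secret can be recovered by Gaussian elimination from enough samples; the hardness of LWE comes entirely from the error term `e`.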

Unbiased estimation · Estimation/estimator · Input distribution · Unbiased · Square matrix

June 7, 2022

Unbiased estimators for random design regression

Michał Dereziński,Manfred K. Warmuth,Daniel Hsu

In linear regression we wish to estimate the optimum linear least squares predictor for a distribution over $d$-dimensional input points and real-valued responses, based on a small sample. Under standard random design analysis, where the sample is drawn i.i.d. from the input distribution, the least squares solution for that sample can be viewed as the natural estimator of the optimum. Unfortunately, this estimator almost always incurs an undesirable bias coming from the randomness of the input points, which is a significant bottleneck in model averaging. In this paper we show that it is possible to draw a non-i.i.d. sample of input points such that, regardless of the response model, the least squares solution is an unbiased estimator of the optimum. Moreover, this sample can be produced efficiently by augmenting a previously drawn i.i.d. sample with an additional set of $d$ points, drawn jointly according to a certain determinantal point process constructed from the input distribution rescaled by the squared volume spanned by the points. Motivated by this, we develop a theoretical framework for studying volume-rescaled sampling, and in the process prove a number of new matrix expectation identities. We use them to show that for any input distribution and $\epsilon>0$ there is a random design consisting of $O(d\log d+ d/\epsilon)$ points from which an unbiased estimator can be constructed whose expected square loss over the entire distribution is bounded by $1+\epsilon$ times the loss of the optimum. We provide efficient algorithms for generating such unbiased estimators in a number of practical settings and support our claims experimentally.

Analysis · Decoding · GROUP · Ensemble · Performer

June 7, 2022

Group Properties of Polar Codes for Automorphism Ensemble Decoding

Valerio Bioglio,Ingmar Land,Charles Pillet

from arxiv, submitted to IEEE for possible publication

In this paper, we propose an analysis of the automorphism group of polar codes, with the aim of designing codes tailored for automorphism ensemble (AE) decoding. We prove the equivalence between the notion of decreasing monomial codes and the universal partial order (UPO) framework for the description of polar codes. Then, we analyze the algebraic properties of the affine automorphism group of polar codes, providing a novel description of its structure and proposing a classification of automorphisms providing the same results under permutation decoding. Finally, we propose a method to list all the automorphisms that may lead to different candidates under AE decoding; by introducing the concept of redundant automorphisms, we find the maximum number of permutations providing possibly different codeword candidates under AE-SC, proposing a method to list all of them. A numerical analysis of the error correction performance of the AE algorithm for the decoding of polar codes concludes the paper.

Model selection · Learning · cancer · Analysis · Robustness

June 7, 2022

Model selection for robust learning of mutational signatures using Negative Binomial non-negative matrix factorization

Marta Pelizzola,Ragnhild Laursen,Asger Hobolth

The spectrum of mutations in a collection of cancer genomes can be described by a mixture of a few mutational signatures. The mutational signatures can be found using non-negative matrix factorization (NMF). To extract the mutational signatures, we have to assume a distribution for the observed mutational counts and a number of mutational signatures. In most applications, the mutational counts are assumed to be Poisson distributed, but they are often overdispersed, and thus the Negative Binomial distribution is more appropriate. We demonstrate using a simulation study that Negative Binomial NMF requires fewer signatures than Poisson NMF to fit the data, and we propose a Negative Binomial NMF with a patient-specific overdispersion parameter to capture the variation across patients. We also introduce a robust model selection procedure inspired by cross-validation to determine the number of signatures. Furthermore, we study the influence of the distributional assumption in relation to two classical model selection procedures: the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). In the presence of overdispersion, we show that our model selection procedure is more robust at determining the correct number of signatures than state-of-the-art methods, which overestimate the number of signatures. We apply our proposed analysis to a wide range of simulated data and to a data set from breast cancer patients. The code for our algorithms and analysis is available in the R package SigMoS and can be found at //github.com/MartaPelizzola/SigMoS.