Diffusion models trained with mean squared error loss tend to generate unrealistic samples. Current state-of-the-art models rely on classifier-free guidance to improve sample quality, yet its surprising effectiveness is not fully understood. In this paper, we show that the effectiveness of classifier-free guidance partly originates from it being a form of implicit perceptual guidance. As a result, we can directly incorporate perceptual loss in diffusion training to improve sample quality. Since the score matching objective used in diffusion training strongly resembles the denoising autoencoder objective used in the unsupervised training of perceptual networks, the diffusion model itself is a perceptual network and can be used to generate a meaningful perceptual loss. We propose a novel self-perceptual objective that results in diffusion models capable of generating more realistic samples. For conditional generation, our method improves only sample quality, without entanglement with the conditional input, and therefore does not sacrifice sample diversity. Our method can also improve sample quality for unconditional generation, which was not previously possible with classifier-free guidance.
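
To make the idea concrete, here is a minimal PyTorch-style sketch of a self-perceptual training loss, assuming a denoiser `model` exposing a hypothetical `encode` method that returns intermediate feature maps; the noising schedule and loss are toy placeholders, not the paper's exact objective.

    import copy
    import torch
    import torch.nn.functional as F

    def make_self_perceptual_loss(model):
        # Freeze a copy of the denoiser; it acts as the perceptual network.
        frozen = copy.deepcopy(model).eval()
        for p in frozen.parameters():
            p.requires_grad_(False)

        def loss_fn(x0, t, noise):
            xt = x0 + t.view(-1, 1, 1, 1) * noise      # toy forward noising
            x0_hat = model(xt, t)                      # denoiser's clean estimate
            with torch.no_grad():
                feats_ref = frozen.encode(x0, t)       # hypothetical feature hook
            feats_pred = frozen.encode(x0_hat, t)      # gradients flow to `model`
            return F.mse_loss(feats_pred, feats_ref)

        return loss_fn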

Related content

Existing trajectory prediction studies intensively leverage generative models. Normalizing flows are one such genre, with the advantage of being invertible, which allows the probability density of predicted trajectories to be derived. However, mapping from a standard Gaussian with a flow-based model hurts the capacity to capture complicated trajectory patterns, ignoring the under-represented motion intentions in the training data. To solve this problem, we propose a flow-based model that transforms a mixed Gaussian prior into the future trajectory manifold. The model shows a better capacity for generating diverse trajectory patterns. Also, by associating each sub-Gaussian with a certain subspace of trajectories, we can generate future trajectories with controllable motion intentions. In this fashion, the flow-based model is no longer encouraged simply to seek the highest-likelihood region of the intended manifold, but rather a family of controlled manifolds with explicit interpretability. Our proposed method achieves state-of-the-art performance in quantitative evaluations of sampling well-aligned trajectories among the top-M generated candidates. We also demonstrate that it can generate diverse, controllable, and out-of-distribution trajectories. Code is available at https://github.com/mulplue/MGF.
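
As an illustration of the mixed-Gaussian prior, the sketch below draws flow latents from a Gaussian mixture and fixes the mixture component to control the intended mode; the component means and scales are illustrative placeholders, and the latents would then be pushed through the invertible flow to obtain trajectories.

    import numpy as np

    rng = np.random.default_rng(0)
    means = np.array([[-2.0, 0.0], [2.0, 0.0], [0.0, 2.0]])  # one mode per intention
    std = 0.5

    def sample_prior(n, component=None):
        """Sample n latents; fix `component` to choose the motion intention."""
        ks = (rng.integers(len(means), size=n) if component is None
              else np.full(n, component))
        return means[ks] + std * rng.standard_normal((n, means.shape[1])), ks

    z, ks = sample_prior(8, component=1)   # latents from the second sub-Gaussian
    # z would then be mapped through the invertible flow into trajectories.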

Denoising diffusions are state-of-the-art generative models exhibiting remarkable empirical performance. They work by diffusing the data distribution into a Gaussian distribution and then learning to reverse this noising process to obtain synthetic datapoints. The denoising diffusion relies on approximations of the logarithmic derivatives of the noised data densities using score matching. Such models can also be used to perform approximate posterior simulation when one can only sample from the prior and likelihood. We propose a unifying framework generalising this approach to a wide class of spaces and leading to an original extension of score matching. We illustrate the resulting models on various applications.
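
For reference, the standard denoising score matching objective behind such models (generic notation, not the paper's extension) is

$$\mathcal{L}(\theta) = \mathbb{E}_{t,\, x_0,\, \varepsilon \sim \mathcal{N}(0, I)}\Big[\lambda(t)\, \big\| s_\theta(x_0 + \sigma_t \varepsilon,\, t) + \varepsilon/\sigma_t \big\|^2\Big],$$

so that the learned $s_\theta(x_t, t)$ approximates $\nabla_{x_t} \log p_t(x_t)$, the logarithmic derivative of the noised data density referred to above.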

Network pruning can reduce the computation cost of deep neural network (DNN) models. However, sparse models often end up with randomly distributed non-zero weights in order to maintain accuracy, leading to irregular computations. Consequently, unstructured sparse models cannot achieve meaningful speedup on commodity hardware built for dense matrix computations. Accelerators are usually modified or designed with structured-sparsity-optimized architectures for exploiting sparsity. For example, the Ampere architecture introduces a sparse tensor core, which adopts the 2:4 sparsity pattern. We propose a pruning method that builds upon the insight that matrix multiplication generally breaks the large matrix into multiple smaller tiles for parallel execution. We present the tile-wise sparsity pattern, which maintains a structured sparsity pattern at the tile level for efficient execution but allows for irregular pruning at the global scale to maintain high accuracy. In addition, the tile-wise sparsity is implemented at the global memory level, while the 2:4 sparsity executes at the register level inside the sparse tensor core. We combine these two patterns into a tile-vector-wise (TVW) sparsity pattern to exploit more fine-grained sparsity and further accelerate sparse DNN models. We evaluate TVW on the GPU, achieving average speedups of $1.85\times$, $2.75\times$, and $22.18\times$ over the dense model, block sparsity, and unstructured sparsity, respectively.
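
As a concrete illustration of the 2:4 pattern mentioned above, the sketch below performs magnitude-based 2:4 pruning in NumPy: in every contiguous group of four weights, the two smallest in magnitude are zeroed. This shows only the pattern itself, not the paper's tile-wise or TVW method.

    import numpy as np

    def prune_2_4(w):
        """w: 2-D weight matrix whose column count is divisible by 4."""
        groups = w.reshape(-1, 4)                       # groups of four weights
        order = np.argsort(np.abs(groups), axis=1)      # ascending by magnitude
        mask = np.ones_like(groups, dtype=bool)
        np.put_along_axis(mask, order[:, :2], False, axis=1)  # drop two smallest
        return (groups * mask).reshape(w.shape)

    w = np.random.default_rng(0).standard_normal((4, 8))
    print(prune_2_4(w))   # every group of 4 now has exactly 2 non-zeros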

Recent sequential recommendation models have combined pre-trained text embeddings of items with item ID embeddings to achieve superior recommendation performance. Despite their effectiveness, the expressive power of text features in these models remains largely unexplored. While most existing models emphasize the importance of ID embeddings in recommendations, our study takes a step further by studying sequential recommendation models that rely only on text features and do not require ID embeddings. Upon examining pre-trained text embeddings experimentally, we discover that they reside in an anisotropic semantic space, with an average cosine similarity of over 0.8 between items. We also demonstrate that this anisotropic nature hinders recommendation models from effectively differentiating between item representations and leads to degraded performance. To address this issue, we propose to employ a pre-processing step known as the whitening transformation, which transforms the anisotropic text feature distribution into an isotropic Gaussian distribution. Our experiments show that whitening pre-trained text embeddings in the sequential model can significantly improve recommendation performance. However, the full whitening operation might break the potential manifold of items with similar text semantics. To preserve the original semantics while benefiting from the isotropy of the whitened text features, we introduce WhitenRec+, an ensemble approach that leverages both fully whitened and relaxed whitened item representations for effective recommendations. We further discuss and analyze the benefits of our design through experiments and proofs. Experimental results on three public benchmark datasets demonstrate that WhitenRec+ outperforms state-of-the-art methods for sequential recommendation.
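
For concreteness, a minimal NumPy sketch of the full whitening step follows (PCA whitening of centered embeddings); the relaxed variant used in WhitenRec+ differs, and the random anisotropic embeddings here are stand-ins for real pre-trained text features.

    import numpy as np

    def whiten(x, eps=1e-6):
        """x: (n_items, dim) embedding matrix -> whitened embeddings."""
        xc = x - x.mean(axis=0, keepdims=True)
        cov = xc.T @ xc / len(xc)
        u, s, _ = np.linalg.svd(cov)
        w = u @ np.diag(1.0 / np.sqrt(s + eps))   # PCA-whitening transform
        return xc @ w

    rng = np.random.default_rng(0)
    emb = rng.standard_normal((1000, 64)) * np.array([5.0] * 8 + [0.1] * 56)
    white = whiten(emb)                           # anisotropic -> isotropic
    print(np.round(np.cov(white.T)[:3, :3], 2))   # approximately the identity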

We explore Cluster Editing and its generalization Correlation Clustering with a new operation called permissive vertex splitting which addresses finding overlapping clusters in the face of uncertain information. We determine that both problems are NP-hard, yet they exhibit significant differences in parameterized complexity and approximability. For Cluster Editing with Permissive Vertex Splitting, we show a polynomial kernel when parameterized by the solution size and develop a polynomial-time algorithm with approximation factor 7. In the case of Correlation Clustering, we establish para-NP-hardness when parameterized by solution size and demonstrate that computing an $n^{1-\epsilon}$-approximation is NP-hard for any constant $\epsilon > 0$. Additionally, we extend the established link between Correlation Clustering and Multicut to the setting with permissive vertex splitting.

Diffusion models have emerged as state-of-the-art generative models for image generation. However, sampling from diffusion models is usually time-consuming due to the inherent autoregressive nature of their sampling process. In this work, we propose a novel approach that accelerates the sampling of diffusion models by parallelizing the autoregressive process. Specifically, we reformulate the sampling process as solving a system of triangular nonlinear equations through fixed-point iteration. With this formulation, we explore several systematic techniques to further reduce the number of iteration steps required by the solving process. Applying these techniques, we introduce ParaTAA, a universal and training-free parallel sampling algorithm that can leverage extra computational and memory resources to increase the sampling speed. Our experiments demonstrate that ParaTAA can decrease the number of inference steps required by common sequential sampling algorithms such as DDIM and DDPM by a factor of 4 to 14. Notably, when applying ParaTAA with 100-step DDIM to Stable Diffusion, a widely used text-to-image diffusion model, it can produce the same images as sequential sampling in only 7 inference steps.
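
The reformulation can be illustrated with a toy example: a sequential chain $x_{k+1} = f_k(x_k)$ is viewed as a triangular system solved by Jacobi-style fixed-point iteration, where every step of a sweep can run in parallel and at most $T$ sweeps are needed (fewer when iterates converge early). The linear maps below are stand-ins for denoising steps, not ParaTAA itself.

    import numpy as np

    T, dim = 16, 4
    A = [0.9 * np.eye(dim) for _ in range(T)]   # toy per-step update maps
    x0 = np.ones(dim)

    def f(k, x):                                # stand-in for one denoising step
        return A[k] @ x

    # Sequential reference: T dependent steps.
    x = x0
    for k in range(T):
        x = f(k, x)

    # Fixed-point view: solve x_{k+1} - f_k(x_k) = 0 for all k at once.
    xs = np.zeros((T + 1, dim)); xs[0] = x0
    for sweep in range(T):
        new = xs.copy()
        for k in range(T):                      # parallelizable across k
            new[k + 1] = f(k, xs[k])
        if np.allclose(new, xs):                # converged early
            break
        xs = new

    print(np.allclose(xs[-1], x))               # True: matches sequential result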

Properties of stable matchings in the popular random-matching-market model have been studied for over 50 years. In a random matching market, each agent has complete preferences drawn uniformly and independently at random. Wilson (1972), Knuth (1976), and Pittel (1989) proved that in balanced random matching markets, the proposers are matched to their $\ln n$th choice on average. In this paper, we consider markets where agents have partial (truncated) preferences, that is, the proposers rank only their top $d$ partners. Despite the long history of the problem, the following fundamental question remained unanswered: \emph{what is the smallest value of $d$ that results in a perfect stable matching with high probability?} We answer this question exactly: we prove that a degree of $\ln^2 n$ is necessary and sufficient. That is, we show that if $d < (1-\epsilon) \ln^2 n$ then with high probability no stable matching is perfect, and if $d > (1+\epsilon) \ln^2 n$ then with high probability every stable matching is perfect. This settles a recent conjecture by Kanoria, Min and Qian (2021). We generalize this threshold to unbalanced markets: we consider a matching market with $n$ agents on the shorter side and $n(\alpha+1)$ agents on the longer side. We show that for markets with $\alpha = o(1)$, the sharp threshold characterizing the existence of a perfect stable matching occurs at $d = \ln n \cdot \ln \left(\frac{1 + \alpha}{\alpha + 1/(n(\alpha+1))} \right)$. Finally, we extend the line of work studying the effect of imbalance on the expected rank of the proposers (termed the ``stark effect of competition''), establishing the regime in unbalanced markets that forces this stark effect to take shape under partial preferences.
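
The threshold is easy to probe empirically. Below is a small simulation sketch (not from the paper): proposer-side deferred acceptance in a balanced market where each proposer ranks a uniform random set of $d$ partners, returning whether the resulting stable matching is perfect.

    import math
    import random

    def has_perfect_matching(n, d, rng):
        prefs = [rng.sample(range(n), d) for _ in range(n)]   # truncated lists
        rank = [{p: r for r, p in enumerate(rng.sample(range(n), n))}
                for _ in range(n)]                            # receivers rank all
        match = [None] * n          # receiver -> current proposer
        nxt = [0] * n               # next list position per proposer
        free = list(range(n))
        while free:
            p = free.pop()
            while nxt[p] < d:
                r = prefs[p][nxt[p]]; nxt[p] += 1
                if match[r] is None:
                    match[r] = p; break
                if rank[r][p] < rank[r][match[r]]:            # r prefers p
                    free.append(match[r]); match[r] = p; break
        return all(m is not None for m in match)

    rng = random.Random(0)
    n = 200
    d = int(1.2 * math.log(n) ** 2)   # just above the ~ln^2 n threshold
    print(has_perfect_matching(n, d, rng))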

In recent years, larger and deeper models have been springing up, continuously pushing state-of-the-art (SOTA) results across various fields such as natural language processing (NLP) and computer vision (CV). However, despite these promising results, it must be noted that the computation required by SOTA models has been increasing at an exponential rate. Massive computation not only carries a surprisingly large carbon footprint but also negatively affects research inclusiveness and deployment in real-world applications. Green deep learning is an increasingly active research field that calls on researchers to pay attention to energy usage and carbon emissions during model training and inference, with the goal of yielding novel results using lightweight and efficient technologies. Many techniques can be used to achieve this goal, such as model compression and knowledge distillation. This paper presents a systematic review of the development of Green deep learning technologies. We classify these approaches into four categories: (1) compact networks, (2) energy-efficient training strategies, (3) energy-efficient inference approaches, and (4) efficient data usage. For each category, we discuss the progress that has been achieved and the unresolved challenges.

Adversarial attacks are techniques for deceiving Machine Learning (ML) models and provide a way to evaluate adversarial robustness. In practice, attack algorithms are manually selected and tuned by human experts to break an ML system. However, manual selection of attackers tends to be sub-optimal, leading to a mistaken assessment of model security. In this paper, a new procedure called Composite Adversarial Attack (CAA) is proposed for automatically searching for the best combination of attack algorithms and their hyper-parameters from a candidate pool of \textbf{32 base attackers}. We design a search space in which an attack policy is represented as an attacking sequence, i.e., the output of the previous attacker is used as the initialization input for its successor. The multi-objective NSGA-II genetic algorithm is adopted to find the strongest attack policy with minimum complexity. Experimental results show that CAA beats 10 top attackers on 11 diverse defenses with less elapsed time (\textbf{6$\times$ faster than AutoAttack}) and achieves a new state of the art on $l_{\infty}$, $l_{2}$, and unrestricted adversarial attacks.
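
The attacking-sequence idea can be sketched as follows: each attacker maps (model, inputs, labels, warm start) to an adversarial example, and attackers are chained so that each successor starts from its predecessor's output. The two placeholder attackers below are illustrative only, not the paper's 32-attacker pool or any real library's API.

    import torch
    import torch.nn.functional as F

    def chain_attacks(model, x, y, attackers, eps=8 / 255):
        x_adv = x
        for attack in attackers:                      # each successor warm-starts
            x_adv = attack(model, x, y, x_adv, eps)   # from its predecessor
        return x_adv

    def random_start(model, x, y, x_init, eps):
        return (x + eps * (2 * torch.rand_like(x) - 1)).detach()

    def fgsm(model, x, y, x_init, eps):
        x_init = x_init.clone().requires_grad_(True)
        loss = F.cross_entropy(model(x_init), y)
        grad, = torch.autograd.grad(loss, x_init)
        delta = (x_init + eps * grad.sign() - x).clamp(-eps, eps)  # eps-ball
        return (x + delta).detach()

    # Usage (hypothetical model and inputs):
    # x_adv = chain_attacks(model, x, y, [random_start, fgsm])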

Relying entirely on an attention mechanism, the Transformer introduced by Vaswani et al. (2017) achieves state-of-the-art results for machine translation. In contrast to recurrent and convolutional neural networks, it does not explicitly model relative or absolute position information in its structure. Instead, it requires adding representations of absolute positions to its inputs. In this work we present an alternative approach, extending the self-attention mechanism to efficiently consider representations of the relative positions, or distances between sequence elements. On the WMT 2014 English-to-German and English-to-French translation tasks, this approach yields improvements of 1.3 BLEU and 0.3 BLEU over absolute position representations, respectively. Notably, we observe that combining relative and absolute position representations yields no further improvement in translation quality. We describe an efficient implementation of our method and cast it as an instance of relation-aware self-attention mechanisms that can generalize to arbitrary graph-labeled inputs.
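
A minimal NumPy sketch of the key-side mechanism follows: attention logits receive an extra term from a learned embedding of the clipped relative offset $j - i$ (the weights here are random stand-ins, and the value-side relative embeddings of the full method are omitted).

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, k = 6, 8, 3                      # sequence length, head dim, max offset
    x = rng.standard_normal((n, d))
    Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
    rel_k = rng.standard_normal((2 * k + 1, d))   # one embedding per clipped offset

    q, key, v = x @ Wq, x @ Wk, x @ Wv
    offsets = np.clip(np.arange(n)[None, :] - np.arange(n)[:, None], -k, k) + k
    logits = (q @ key.T + np.einsum('id,ijd->ij', q, rel_k[offsets])) / np.sqrt(d)
    attn = np.exp(logits - logits.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)       # row-wise softmax
    out = attn @ v
    print(out.shape)                              # (6, 8)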
