Transfer learning aims to improve performance on target tasks by transferring knowledge acquired on source tasks. The standard approach is pre-training followed by fine-tuning or linear probing. In particular, selecting a proper source domain for a specific target domain under predefined tasks is crucial for improving efficiency and effectiveness. This problem is conventionally solved by estimating transferability. However, existing methods cannot reach a trade-off between performance and cost. To comprehensively evaluate estimation methods, we summarize three properties: stability, reliability, and efficiency. Building upon them, we propose Principal Gradient Expectation (PGE), a simple yet effective method for assessing transferability. Specifically, we calculate the gradient over each weight unit multiple times with a restart scheme, and then we compute the expectation of all gradients. Finally, the transferability between the source and target is estimated by computing the gap of normalized principal gradients. Extensive experiments show that the proposed metric is superior to state-of-the-art methods on all properties.
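The three steps in the abstract (gradients under restarts, their expectation, and the gap between normalized gradients) can be illustrated with a toy example. This is only a sketch under strong assumptions, not the authors' implementation: the model is a linear regressor with a mean-squared-error loss, a "restart" is taken to mean a fresh random initialization, and the gap is measured as a Euclidean distance. The names `principal_gradient` and `pge_gap` are hypothetical.

```python
import numpy as np

def principal_gradient(X, y, restarts=8, seed=0):
    """Average the MSE gradient of a linear model over several random restarts."""
    rng = np.random.default_rng(seed)
    grads = []
    for _ in range(restarts):
        # Assumption: a restart is a fresh random initialization of the weights.
        w = rng.normal(scale=0.1, size=X.shape[1])
        residual = X @ w - y
        grads.append(2.0 * X.T @ residual / len(y))  # gradient of the mean squared error
    return np.mean(grads, axis=0)  # empirical expectation over restarts

def pge_gap(source, target, restarts=8, seed=0):
    """Distance between unit-normalized expected gradients of two (X, y) datasets."""
    g_s = principal_gradient(*source, restarts=restarts, seed=seed)
    g_t = principal_gradient(*target, restarts=restarts, seed=seed)
    g_s = g_s / (np.linalg.norm(g_s) + 1e-12)
    g_t = g_t / (np.linalg.norm(g_t) + 1e-12)
    # Assumption: the "gap" is a Euclidean distance; smaller means more similar tasks.
    return float(np.linalg.norm(g_s - g_t))
```

Because both gradients are normalized to unit length, the gap in this sketch always lies between 0 (same gradient direction) and 2 (opposite directions).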

Related content

Singular · Logits · Decomposition · Singular value decomposition · Boosting

May 5, 2023

Boosting Adversarial Transferability via Fusing Logits of Top-1 Decomposed Feature

Juanjuan Weng,Zhiming Luo,Dazhen Lin,Shaozi Li,Zhun Zhong

Recent research has shown that Deep Neural Networks (DNNs) are highly vulnerable to adversarial samples, which are highly transferable and can be used to attack other unknown black-box models. To improve the transferability of adversarial samples, several feature-based adversarial attack methods have been proposed to disrupt neuron activation in the middle layers. However, current state-of-the-art feature-based attack methods typically require additional computation costs for estimating the importance of neurons. To address this challenge, we propose a Singular Value Decomposition (SVD)-based feature-level attack method. Our approach is inspired by the discovery that eigenvectors associated with the larger singular values decomposed from the middle layer features exhibit superior generalization and attention properties. Specifically, we conduct the attack by retaining the decomposed Top-1 singular value-associated feature for computing the output logits, which are then combined with the original logits to optimize adversarial examples. Our extensive experimental results verify the effectiveness of our proposed method, which can be easily integrated into various baselines to significantly enhance the transferability of adversarial samples for disturbing normally trained CNNs and advanced defense strategies. The source code of this study is available at //anonymous.4open.science/r/SVD-SSA-13BF/README.md.
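Retaining the Top-1 singular-value component of a feature map can be sketched with NumPy as follows. The abstract does not say how the feature tensor is reshaped or how the two sets of logits are combined, so the channel-by-spatial reshape, the weighted sum, and the names `top1_feature` and `fuse_logits` are assumptions made for illustration.

```python
import numpy as np

def top1_feature(feature):
    """Rank-1 reconstruction of a (C, H, W) feature map from its largest singular value."""
    c, h, w = feature.shape
    mat = feature.reshape(c, h * w)  # assumption: channels as rows, spatial positions as columns
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    rank1 = s[0] * np.outer(u[:, 0], vt[0])  # keep only the Top-1 singular component
    return rank1.reshape(c, h, w)

def fuse_logits(logits_orig, logits_top1, weight=0.5):
    """Assumption: fusion is a weighted sum; the paper's exact rule may differ."""
    return logits_orig + weight * logits_top1
```

In a full attack, the rank-1 feature would be passed through the remaining layers of the network to obtain `logits_top1`, and the adversarial loss would be computed on the fused logits.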

Optimizer · Estimation/Estimator · Line search · Ensemble · Performer

May 4, 2023

Adjoint-Free 4D-Var Methods Via Line Search Optimization For Non-Linear Data Assimilation

Elias Nin-Ruiz,Jairo Diaz-Rodriguez

This paper proposes two practical implementations of Four-Dimensional Variational (4D-Var) Ensemble Kalman Filter (4D-EnKF) methods for non-linear data assimilation. The main idea of our formulations is to avoid the intrinsic need for adjoint models in 4D-Var optimization and, moreover, to handle non-linear observation operators during the assimilation of observations. The proposed methods work as follows: snapshots of an ensemble of model realizations are taken at observation times, and these snapshots are employed to build control spaces onto which analysis increments can be estimated. Via the linearization of observation operators at observation times, a line-search based optimization method is proposed to estimate optimal analysis increments. The convergence of this method is theoretically proven as long as the dimension of the control spaces equals the model dimension. In the first formulation, control spaces are given by full-rank square root approximations of background error covariance matrices via the Bickel and Levina precision matrix estimator. In this context, we propose an iterative Woodbury matrix formula to perform the optimization steps efficiently. The second formulation can be considered an extension of the Maximum Likelihood Ensemble Filter to the 4D-Var context. It employs pseudo-square root approximations of prior error covariance matrices to build control spaces. Experimental tests are performed using the Lorenz 96 model. The results reveal that, in terms of Root-Mean-Square-Error values, both methods can obtain reasonable estimates of posterior error modes in the 4D-Var optimization problem. Moreover, the accuracy of the proposed filter implementations improves as the ensemble size increases.

Specialization · Threshold · Smoothing · Estimation/Estimator · Reducible

May 4, 2023

Sparsity Domain Smoothing Based Thresholding Recovery Method for OFDM Sparse Channel Estimation

Mohammad Hossein Bahonar,Reza Ghaderi Zefreh,Rouhollah Amiri

Due to the ever-increasing data rate demand of beyond-5G networks, and considering the wide use of the Orthogonal Frequency Division Multiplexing (OFDM) technique in cellular systems, it is critical to reduce the pilot overhead of OFDM systems in order to increase their data rate. Because multipath channels are sparse, sparse recovery methods can be exploited to reduce pilot overhead. OFDM pilots are utilized as random samples for channel impulse response estimation. We propose a three-step sparsity recovery algorithm which is based on sparsity domain smoothing. Time domain residue computation, sparsity domain smoothing, and adaptive thresholding sparsifying are the three steps of the proposed scheme. To the best of our knowledge, the proposed sparsity domain smoothing based thresholding recovery method, known as SDS-IMAT, has not been used for OFDM sparse channel estimation in the literature. Pilot locations are also derived based on the minimization of the measurement matrix coherence. Numerical results verify that the proposed scheme outperforms other existing thresholding and greedy recovery methods and has near-optimal performance. The effectiveness of the proposed scheme is shown in terms of mean square error and bit error rate.

Motivation · Agent · Knowledge · Black box · Learning

May 4, 2023

IMAP: Intrinsically Motivated Adversarial Policy

Xiang Zheng,Xingjun Ma,Shengjie Wang,Xinyu Wang,Chao Shen,Cong Wang

Reinforcement learning (RL) agents are known to be vulnerable to evasion attacks during deployment. In single-agent environments, attackers can inject imperceptible perturbations on the policy or value network's inputs or outputs; in multi-agent environments, attackers can control an adversarial opponent to indirectly influence the victim's observation. Adversarial policies offer a promising solution to craft such attacks. Still, current approaches either require perfect or partial knowledge of the victim policy or suffer from sample inefficiency due to the sparsity of task-related rewards. To overcome these limitations, we propose the Intrinsically Motivated Adversarial Policy (IMAP) for efficient black-box evasion attacks in single- and multi-agent environments without any knowledge of the victim policy. IMAP uses four intrinsic objectives based on state coverage, policy coverage, risk, and policy divergence to encourage exploration and discover stronger attacking skills. We also design a novel Bias-Reduction (BR) method to boost IMAP further. Our experiments demonstrate the effectiveness of these intrinsic objectives and BR in improving adversarial policy learning in the black-box setting against multiple types of victim agents in various single- and multi-agent MuJoCo environments. Notably, our IMAP reduces the performance of the state-of-the-art robust WocaR-PPO agents by 34%-54% and achieves a SOTA attacking success rate of 83.91% in the two-player zero-sum game YouShallNotPass.

Annotation · Text classification · MoDELS · Performer · Language modeling

May 3, 2023

The Benefits of Label-Description Training for Zero-Shot Text Classification

Lingyu Gao,Debanjan Ghosh,Kevin Gimpel

Large language models have improved zero-shot text classification by allowing the transfer of semantic knowledge from the training data in order to classify among specific label sets in downstream tasks. We propose a simple way to further improve zero-shot accuracies with minimal effort. We curate small finetuning datasets intended to describe the labels for a task. Unlike typical finetuning data, which has texts annotated with labels, our data simply describes the labels in language, e.g., using a few related terms, dictionary/encyclopedia entries, and short templates. Across a range of topic and sentiment datasets, our method is more accurate than zero-shot by 15-17% absolute. It is also more robust to choices required for zero-shot classification, such as patterns for prompting the model to classify and mappings from labels to tokens in the model's vocabulary. Furthermore, since our data merely describes the labels but does not use input texts, finetuning on it yields a model that performs strongly on multiple text domains for a given label set, even improving over few-shot out-of-domain classification in multiple settings.

Networking · Precision/Accuracy · Exchangeable · Graph attention network · Attention

May 3, 2023

Attention Based Feature Fusion For Multi-Agent Collaborative Perception

Ahmed N. Ahmed,Siegfried Mercelis,Ali Anwar

In the domain of intelligent transportation systems (ITS), collaborative perception has emerged as a promising approach to overcome the limitations of individual perception by enabling multiple agents to exchange information, thus enhancing their situational awareness. Collaborative perception overcomes the limitations of individual sensors, allowing connected agents to perceive environments beyond their line-of-sight and field of view. However, the reliability of collaborative perception heavily depends on the data aggregation strategy and communication bandwidth, which must overcome the challenges posed by limited network resources. To improve the precision of object detection and alleviate limited network resources, we propose an intermediate collaborative perception solution in the form of a graph attention network (GAT). The proposed approach develops an attention-based aggregation strategy to fuse intermediate representations exchanged among multiple connected agents. This approach adaptively highlights important regions in the intermediate feature maps at both the channel and spatial levels, resulting in improved object detection precision. We propose a feature fusion scheme using attention-based architectures and evaluate the results quantitatively in comparison to other state-of-the-art collaborative perception approaches. Our proposed approach is validated using the V2XSim dataset. The results of this work demonstrate the efficacy of the proposed approach for intermediate collaborative perception in improving object detection average precision while reducing network resource usage.

Performer · Model evaluation · MoDELS · Performance · Robustness

May 2, 2023

Out-of-distribution detection algorithms for robust insect classification

Mojdeh Saadati,Aditya Balu,Shivani Chiranjeevi,Talukder Zaki Jubery,Asheesh K Singh,Soumik Sarkar,Arti Singh,Baskar Ganapathysubramanian

Deep learning-based approaches have produced models with good insect classification accuracy. Most of these models are suited to application under controlled environmental conditions. One of the primary emphases of researchers is to implement identification and classification models in real agricultural fields, which is challenging because input images that are wildly out of the distribution (e.g., images of vehicles, animals, humans, or a blurred image of an insect or of an insect class that the model has not been trained on) can produce an incorrect insect classification. Out-of-distribution (OOD) detection algorithms provide an exciting avenue to overcome these challenges, as they ensure that a model abstains from making incorrect classification predictions for non-insect and/or untrained insect class images. We generate and evaluate the performance of state-of-the-art OOD algorithms on insect detection classifiers. These algorithms represent a diversity of methods for addressing an OOD problem. Specifically, we focus on extrusive algorithms, i.e., algorithms that wrap around a well-trained classifier without the need for additional co-training. We compared three OOD detection algorithms: (i) Maximum Softmax Probability, which uses the softmax value as a confidence score; (ii) a Mahalanobis distance-based algorithm, which uses a generative classification approach; and (iii) an Energy-Based algorithm that maps the input data to a scalar value, called energy. We performed an extensive series of evaluations of these OOD algorithms across three performance axes: (a) Base model accuracy: How does the accuracy of the classifier impact OOD performance? (b) How does the level of dissimilarity to the domain impact OOD performance? and (c) Data imbalance: How sensitive is OOD performance to the imbalance in per-class sample size?
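Two of the three scores named above, Maximum Softmax Probability and the energy score, can be computed directly from a classifier's logits. The sketch below uses the common definition of energy as the negative temperature-scaled log-sum-exp of the logits; the abstract does not give the paper's exact configuration, so the temperature default and the function names are assumptions. The Mahalanobis score is omitted because it needs class-conditional feature statistics.

```python
import numpy as np

def max_softmax_probability(logits):
    """Confidence score: the largest softmax probability per row of logits."""
    z = logits - logits.max(axis=-1, keepdims=True)  # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return probs.max(axis=-1)

def energy_score(logits, temperature=1.0):
    """Energy E(x) = -T * logsumexp(logits / T); lower energy suggests in-distribution."""
    z = logits / temperature
    m = z.max(axis=-1, keepdims=True)
    log_sum_exp = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return -temperature * log_sum_exp
```

A detector would flag an input as OOD when its maximum softmax probability falls below a threshold, or when its energy rises above one. The thresholds themselves would be chosen on held-out data.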

Variational autoencoding · Contrastive · Autoencoder · MoDELS · Performer

March 19, 2021

Adversarial and Contrastive Variational Autoencoder for Sequential Recommendation

Zhe Xie,Chengxuan Liu,Yichi Zhang,Hongtao Lu,Dong Wang,Yue Ding

from arxiv, 11 pages, WWW 2021

Sequential recommendation, as an emerging topic, has attracted increasing attention due to its practical significance. Models based on deep learning and attention mechanisms have achieved good performance in sequential recommendation. Recently, generative models based on the Variational Autoencoder (VAE) have shown a unique advantage in collaborative filtering. In particular, the sequential VAE model, as a recurrent version of the VAE, can effectively capture temporal dependencies among items in a user sequence and perform sequential recommendation. However, VAE-based models suffer from a common limitation: the representational ability of the obtained approximate posterior distribution is limited, resulting in lower quality of generated samples. This is especially true for generating sequences. To solve this problem, we propose a novel method called Adversarial and Contrastive Variational Autoencoder (ACVAE) for sequential recommendation. Specifically, we first introduce adversarial training for sequence generation under the Adversarial Variational Bayes (AVB) framework, which enables our model to generate high-quality latent variables. Then, we employ a contrastive loss. By minimizing the contrastive loss, the latent variables are able to learn more personalized and salient characteristics. In addition, when encoding the sequence, we apply a recurrent and convolutional structure to capture global and local relationships in the sequence. Finally, we conduct extensive experiments on four real-world datasets. The experimental results show that our proposed ACVAE model outperforms other state-of-the-art methods.

Sample · Class · Loss · Performer · SimPLe

January 16, 2019

Class-Balanced Loss Based on Effective Number of Samples

Yin Cui,Menglin Jia,Tsung-Yi Lin,Yang Song,Serge Belongie

from arxiv, Code is available at: //github.com/richardaecn/class-balanced-loss

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
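The formula for the effective number of samples translates directly into per-class loss weights. In the sketch below, weighting each class by the inverse of its effective number follows the abstract. Rescaling the weights so that they sum to the number of classes, the default value of `beta`, and the function names are assumptions made for illustration.

```python
import numpy as np

def effective_number(n, beta):
    """Effective number of samples, (1 - beta^n) / (1 - beta), for 0 <= beta < 1."""
    return (1.0 - np.power(beta, n)) / (1.0 - beta)

def class_balanced_weights(counts, beta=0.999):
    """Per-class weights inversely proportional to the effective number of samples."""
    counts = np.asarray(counts, dtype=float)
    weights = 1.0 / effective_number(counts, beta)
    # Assumption: rescale so the weights sum to the number of classes.
    return weights * len(counts) / weights.sum()
```

With `beta = 0` every class has an effective number of 1, so no re-weighting happens. As `beta` approaches 1 the effective number approaches the raw count `n`, and the weights approach inverse class frequency.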

Graph · Learned · Extensibility · Knowledge graph · Smoothing

May 31, 2018

Rethinking Knowledge Graph Propagation for Zero-Shot Learning

Michael Kampffmeyer,Yinbo Chen,Xiaodan Liang,Hao Wang,Yujia Zhang,Eric P. Xing

from arxiv, The first two authors contributed equally. Code at //github.com/cyvius96/adgpm

The potential of graph convolutional neural networks for the task of zero-shot learning has been demonstrated recently. These models are highly sample efficient as related concepts in the graph structure share statistical strength allowing generalization to new classes when faced with a lack of data. However, knowledge from distant nodes can get diluted when propagating through intermediate nodes, because current approaches to zero-shot learning use graph propagation schemes that perform Laplacian smoothing at each layer. We show that extensive smoothing does not help the task of regressing classifier weights in zero-shot learning. In order to still incorporate information from distant nodes and utilize the graph structure, we propose an Attentive Dense Graph Propagation Module (ADGPM). ADGPM allows us to exploit the hierarchical graph structure of the knowledge graph through additional connections. These connections are added based on a node's relationship to its ancestors and descendants and an attention scheme is further used to weigh their contribution depending on the distance to the node. Finally, we illustrate that finetuning of the feature representation after training the ADGPM leads to considerable improvements. Our method achieves competitive results, outperforming previous zero-shot learning approaches.