
Although lane detection methods have shown impressive performance in real-world scenarios, most of them require post-processing that is not robust enough. End-to-end detectors such as the DEtection TRansformer (DETR) have therefore been introduced to lane detection. However, one-to-one label assignment in DETR can degrade training efficiency because of label semantic conflicts. In addition, the positional query in DETR cannot provide an explicit positional prior, which makes it difficult to optimize. In this paper, we present the One-to-Several Transformer (O2SFormer). We first propose one-to-several label assignment, which combines one-to-one and one-to-many label assignment to improve training efficiency while keeping detection end-to-end. To overcome the difficulty of optimizing one-to-one assignment, we further propose the layer-wise soft label, which adjusts the positive weight of positive lane anchors across different decoder layers. Finally, we design a dynamic anchor-based positional query that explores the positional prior by incorporating lane anchors into the positional query. Experimental results show that O2SFormer significantly speeds up the convergence of DETR and outperforms Transformer-based and CNN-based detectors on the CULane dataset. Code will be available at //github.com/zkyseu/O2SFormer.
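As a rough illustration of the one-to-several idea in the abstract above, the sketch below gives each ground-truth lane its few lowest-cost anchors as positives, which sits between one-to-one matching (k = 1) and unrestricted one-to-many matching. The abstract does not state the matching rule, so the top-k rule, the cost matrix, and the names `one_to_several_assign` and `k` are assumptions made for illustration, not the O2SFormer implementation.

```python
from typing import Dict, List


def one_to_several_assign(cost: List[List[float]], k: int) -> Dict[int, int]:
    """Map anchor index -> ground-truth index.

    cost[g][a] is a hypothetical matching cost between ground truth g and
    anchor a (lower is better). Each ground truth claims its k cheapest
    anchors; an anchor claimed by several ground truths goes to the one
    with the lowest cost.
    """
    best_cost: Dict[int, float] = {}
    assignment: Dict[int, int] = {}
    for g, row in enumerate(cost):
        ranked = sorted(range(len(row)), key=lambda a: row[a])
        for a in ranked[:k]:
            if a not in best_cost or row[a] < best_cost[a]:
                best_cost[a] = row[a]
                assignment[a] = g
    return assignment


# Two ground-truth lanes, four candidate anchors (made-up costs).
cost = [
    [0.1, 0.2, 0.9, 0.8],
    [0.7, 0.3, 0.2, 0.6],
]
one_to_one = one_to_several_assign(cost, k=1)
one_to_several = one_to_several_assign(cost, k=2)
```

With k = 1 every lane keeps a single positive anchor, as in DETR-style matching; raising k adds more positives per lane, which is the kind of denser supervision the abstract credits with faster training.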

Related content

Image restoration · Optimizer · MoDELS · Learning · Neural Networks

June 21, 2023

Accelerating Multiframe Blind Deconvolution via Deep Learning

A. Asensio Ramos, S. Esteban Pozuelo, C. Kuckein

from arXiv, 26 pages, 9 figures, accepted for publication in Solar Physics

Ground-based solar image restoration is a computationally expensive procedure that involves nonlinear optimization techniques. Atmospheric turbulence perturbs individual images, which makes it necessary to apply blind deconvolution techniques. These techniques rely on the observation of many short-exposure frames that are used to simultaneously infer the instantaneous state of the atmosphere and the unperturbed object. We have recently explored the use of machine learning to accelerate this process, with promising results. We build upon this previous work to propose several improvements that lead to better models. We also propose a new method to accelerate the restoration based on algorithm unrolling. In this method, the image restoration problem is solved with a gradient descent method that is unrolled and accelerated with the aid of a few small neural networks. The role of the neural networks is to correct the estimate of the solution at each iterative step. The model is trained with a curated dataset to perform the optimization in a small, fixed number of steps. Our findings demonstrate that both methods significantly reduce the restoration time compared to the standard optimization procedure. Furthermore, we show that these models can be trained in an unsupervised manner using observed images from three different instruments. They also generalize robustly when applied to new datasets. To foster further research and collaboration, we openly provide the trained models, along with the corresponding training and evaluation code, as well as the training dataset, to the scientific community.
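To make the algorithm-unrolling idea concrete, here is a toy sketch: a fixed number of gradient descent steps on a one-dimensional least-squares problem, with a per-step correction hook standing in for the small neural networks. The objective, the step size, and the names `unrolled_descent` and `correction` are assumptions made for illustration; the restoration problem and networks in the paper are far richer.

```python
from typing import Callable, Optional


def unrolled_descent(
    observed: float,
    gain: float,
    steps: int,
    step_size: float,
    correction: Optional[Callable[[int, float, float], float]] = None,
) -> float:
    """Minimize 0.5 * (gain * x - observed)**2 in a fixed number of steps."""
    x = 0.0
    for k in range(steps):
        grad = gain * (gain * x - observed)
        x = x - step_size * grad
        if correction is not None:
            # Hypothetical learned correction of the estimate at step k.
            x = x + correction(k, x, grad)
    return x


# Plain unrolled descent: three steps are not enough to reach x = 3.
plain = unrolled_descent(observed=6.0, gain=2.0, steps=3, step_size=0.1)

# A hand-picked correction stands in for a trained network here.
corrected = unrolled_descent(
    observed=6.0,
    gain=2.0,
    steps=3,
    step_size=0.1,
    correction=lambda k, x, grad: -0.15 * grad,
)
```

In this toy case the correction simply enlarges the step so the fixed budget of three iterations reaches the minimizer; a trained network would instead learn its correction from data.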

Hyperparameter · tuning · End-to-end · Anomaly detection · Test data

June 21, 2023

End-to-End Augmentation Hyperparameter Tuning for Self-Supervised Anomaly Detection

Jaemin Yoo, Lingxiao Zhao, Leman Akoglu

Self-supervised learning (SSL) has emerged as a promising paradigm that presents self-generated supervisory signals to real-world problems, bypassing the extensive manual labeling burden. SSL is especially attractive for unsupervised tasks such as anomaly detection, where labeled anomalies are often nonexistent and costly to obtain. While self-supervised anomaly detection (SSAD) has seen a recent surge of interest, the literature has failed to treat data augmentation as a hyperparameter. Meanwhile, recent works have reported that the choice of augmentation has a significant impact on detection performance. In this paper, we introduce ST-SSAD (Self-Tuning Self-Supervised Anomaly Detection), the first systematic approach to SSAD that rigorously tunes augmentation. To this end, our work presents two key contributions. The first is a new unsupervised validation loss that quantifies the alignment between the augmented training data and the (unlabeled) test data. In principle, we adopt transduction, quantifying the extent to which augmentation mimics the true anomaly-generating mechanism, in contrast to augmenting data with arbitrary pseudo-anomalies without regard to the test data. Second, we present new differentiable augmentation functions, allowing data augmentation hyperparameter(s) to be tuned end-to-end via our proposed validation loss. Experiments on two testbeds with semantic class anomalies and subtle industrial defects show that systematically tuning augmentation offers significant performance gains over current practices.

Speech recognition · MoDELS · Performer · Learning · Conversational agent

June 21, 2023

Federated Self-Learning with Weak Supervision for Speech Recognition

Milind Rao, Gopinath Chennupati, Gautam Tiwari, Anit Kumar Sahu, Anirudh Raju, Ariya Rastrow, Jasha Droppo

from arXiv, Proceedings of ICASSP 2023

Low-footprint automatic speech recognition (ASR) models are increasingly being deployed on edge devices for conversational agents, which enhances privacy. We study the problem of federated continual incremental learning for recurrent neural network-transducer (RNN-T) ASR models in the privacy-enhancing scheme of learning on-device, without access to ground-truth human transcripts or machine transcriptions from a stronger ASR model. In particular, we study the performance of a self-learning-based scheme, with a paired teacher model updated through an exponential moving average of ASR models. Further, we propose using possibly noisy weak-supervision signals such as feedback scores and natural language understanding semantics determined from user behavior across multiple turns in a session of interactions with the conversational agent. These signals are leveraged in a multi-task policy-gradient training approach to improve the performance of self-learning for ASR. Finally, we show how catastrophic forgetting can be mitigated by combining on-device learning with a memory-replay approach using selected historical datasets. These innovations allow for a 10% relative improvement in WER on new use cases with minimal degradation on other test sets in the absence of strong-supervision signals such as ground-truth transcriptions.
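The teacher update mentioned in the abstract, an exponential moving average of ASR model weights, can be sketched as follows. This is a minimal sketch under stated assumptions: plain floats stand in for weight tensors, and the function name `ema_update` and the decay value are hypothetical choices, not taken from the paper.

```python
from typing import Dict


def ema_update(
    teacher: Dict[str, float], student: Dict[str, float], decay: float
) -> Dict[str, float]:
    """Return teacher weights moved toward the student by a factor (1 - decay)."""
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Scalars stand in for weight tensors; the names "w" and "b" are made up.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 2.0, "b": 1.0}
teacher = ema_update(teacher, student, decay=0.9)
```

A decay close to 1 makes the teacher change slowly, so its pseudo-labels stay more stable than the student that is being trained on them.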

EDTER · Transform · Edge · Extensibility · INFORMS

March 16, 2022

EDTER: Edge Detection with Transformer

Mengyang Pu, Yaping Huang, Yuming Liu, Qingji Guan, Haibin Ling

from arXiv, accepted by CVPR 2022

Convolutional neural networks have made significant progress in edge detection by progressively exploring context and semantic features. However, local details are gradually suppressed as receptive fields enlarge. Recently, the vision transformer has shown excellent capability in capturing long-range dependencies. Inspired by this, we propose a novel transformer-based edge detector, Edge Detection TransformER (EDTER), to extract clear and crisp object boundaries and meaningful edges by exploiting the full image context information and detailed local cues simultaneously. EDTER works in two stages. In Stage I, a global transformer encoder is used to capture long-range global context on coarse-grained image patches. Then, in Stage II, a local transformer encoder works on fine-grained patches to excavate the short-range local cues. Each transformer encoder is followed by an elaborately designed Bi-directional Multi-Level Aggregation decoder to achieve high-resolution features. Finally, the global context and local cues are combined by a Feature Fusion Module and fed into a decision head for edge prediction. Extensive experiments on BSDS500, NYUDv2, and Multicue demonstrate the superiority of EDTER over state-of-the-art methods.

Performer · Transform · Object detection · Unsupervised · Faster R-CNN

November 18, 2020

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Zhigang Dai, Bolun Cai, Yugeng Lin, Junying Chen

Object detection with transformers (DETR) reaches competitive performance with Faster R-CNN via a transformer encoder-decoder architecture. Inspired by the great success of pre-training transformers in natural language processing, we propose a pretext task named random query patch detection to pre-train DETR without supervision (UP-DETR) for object detection. Specifically, we randomly crop patches from the given image and then feed them as queries to the decoder. The model is pre-trained to detect these query patches in the original image. During pre-training, we address two critical issues: multi-task learning and multi-query localization. (1) To trade off classification and localization in the multi-task learning of the pretext task, we freeze the CNN backbone and propose a patch feature reconstruction branch that is jointly optimized with patch detection. (2) To perform multi-query localization, we introduce UP-DETR with a single query patch and extend it to multiple query patches with object query shuffle and an attention mask. In our experiments, UP-DETR significantly boosts the performance of DETR, with faster convergence and higher precision on the PASCAL VOC and COCO datasets. The code will be available soon.

Few-shot learning · Threshold · Learned · Better · Estimation/Estimator

October 11, 2020

Few-shot Learning for Multi-label Intent Detection

Yutai Hou, Yongkui Lai, Yushan Wu, Wanxiang Che, Ting Liu

In this paper, we study few-shot multi-label classification for user intent detection. For multi-label intent detection, state-of-the-art work estimates label-instance relevance scores and uses a threshold to select multiple associated intent labels. To determine appropriate thresholds with only a few examples, we first learn universal thresholding experience on data-rich domains, and then adapt the thresholds to specific few-shot domains with a calibration based on nonparametric learning. To better calculate the label-instance relevance score, we introduce label name embeddings as anchor points in the representation space, which refine the representations of different classes so that they are well separated from each other. Experiments on two datasets show that the proposed model significantly outperforms strong baselines in both one-shot and five-shot settings.
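The threshold-based selection step described in the abstract can be illustrated with a short sketch: every intent whose relevance score exceeds the threshold is returned, so an utterance can receive zero, one, or several labels. The intent names, the scores, and the threshold value below are made up for illustration, and the learned threshold calibration itself is not shown.

```python
from typing import Dict, List


def select_intents(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return every intent whose relevance score exceeds the threshold."""
    return sorted(label for label, score in scores.items() if score > threshold)


# Hypothetical label-instance relevance scores for one user utterance.
scores = {"book_flight": 0.82, "rent_car": 0.64, "play_music": 0.05}
selected = select_intents(scores, threshold=0.5)
```

The result depends heavily on the threshold: at 0.7 only one intent would survive, which is why the abstract focuses on estimating a good threshold from few examples.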

Graph convolutional neural network/graph convolutional network · Graph · Cluster · Graph convolution · Attention mechanism

March 13, 2020

Adaptive Graph Convolutional Network with Attention Graph Clustering for Co-saliency Detection

Kaihua Zhang, Tengpeng Li, Shiwen Shen, Bo Liu, Jin Chen, Qingshan Liu

from arXiv, CVPR 2020

Co-saliency detection aims to discover the common and salient foregrounds from a group of relevant images. For this task, we present a novel adaptive graph convolutional network with attention graph clustering (GCAGC). Three major contributions have been made, and are experimentally shown to have substantial practical merits. First, we propose a graph convolutional network design to extract information cues to characterize the intra- and inter-image correspondence. Second, we develop an attention graph clustering algorithm to discriminate the common objects from all the salient foreground objects in an unsupervised fashion. Third, we present a unified framework with an encoder-decoder structure to jointly train and optimize the graph convolutional network, attention graph cluster, and co-saliency detection decoder in an end-to-end manner. We evaluate our proposed GCAGC method on three co-saliency detection benchmark datasets (iCoseg, Cosal2015 and COCO-SEG). Our GCAGC method obtains significant improvements over the state of the art on most of them.

Object detection · Model evaluation · Learned · Attention mechanism · Networking

April 15, 2019

Reverse Attention for Salient Object Detection

Shuhan Chen, Xiuli Tan, Ben Wang, Xuelong Hu

from arXiv, ECCV 2018

Thanks to the rapid development of deep learning techniques, salient object detection has made remarkable progress recently. However, two major challenges still hinder its application in embedded devices: low-resolution output and heavy model weight. To this end, this paper presents an accurate yet compact deep network for efficient salient object detection. More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while maintaining accuracy. Second, we further propose reverse attention to guide such side-output residual learning in a top-down manner. By erasing the currently predicted salient regions from side-output features, the network can eventually explore the missing object parts and details, which results in high resolution and accuracy. Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, with advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).
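The erasing step described in the abstract can be sketched on a tiny single-channel feature map. This sketch assumes that erasing is realized by multiplying each feature by one minus the sigmoid of the coarse prediction, so that locations already predicted as salient are suppressed; the function name `reverse_attention` and the toy values are hypothetical.

```python
import math
from typing import List


def reverse_attention(
    features: List[List[float]], coarse_logits: List[List[float]]
) -> List[List[float]]:
    """Down-weight feature locations that the coarse prediction already marks salient."""
    out = []
    for feat_row, logit_row in zip(features, coarse_logits):
        out_row = []
        for feat, logit in zip(feat_row, logit_row):
            saliency = 1.0 / (1.0 + math.exp(-logit))  # sigmoid of the coarse logit
            out_row.append(feat * (1.0 - saliency))
        out.append(out_row)
    return out


# 2x2 toy maps: a confident salient pixel, a confident background pixel,
# and two uncertain pixels.
features = [[1.0, 1.0], [1.0, 1.0]]
coarse_logits = [[10.0, 0.0], [-10.0, 0.0]]
weighted = reverse_attention(features, coarse_logits)
```

The confidently salient pixel is almost zeroed out, while background and uncertain pixels keep most of their feature response, which steers the residual branch toward regions the coarse prediction missed.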

Sample · Performer · Attention mechanism · Object detection · MINE

April 9, 2019

Prime Sample Attention in Object Detection

Yuhang Cao, Kai Chen, Chen Change Loy, Dahua Lin

It is a common paradigm in object detection frameworks to treat all samples equally and aim to maximize the average performance. In this work, we revisit this paradigm through a careful study of how different samples contribute to the overall performance measured in terms of mAP. Our study suggests that the samples in each mini-batch are neither independent nor equally important, and therefore a better classifier on average does not necessarily mean higher mAP. Motivated by this study, we propose the notion of Prime Samples, those that play a key role in driving the detection performance. We further develop a simple yet effective sampling and learning strategy called PrIme Sample Attention (PISA) that directs the focus of the training process towards such samples. Our experiments demonstrate that it is often more effective to focus on prime samples than hard samples when training a detector. In particular, on the MSCOCO dataset, PISA outperforms the random sampling baseline and hard mining schemes, e.g., OHEM and Focal Loss, consistently by more than 1% on both single-stage and two-stage detectors, with a strong backbone ResNeXt-101.

Outlier · Anomaly detection · CIFAR-10 · Extensibility · Performance

December 21, 2018

Deep Anomaly Detection with Outlier Exposure

Dan Hendrycks, Mantas Mazeika, Thomas G. Dietterich

from arXiv, ICLR 2019; PyTorch code available at //github.com/hendrycks/outlier-exposure

It is important to detect anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments on natural language processing and small- and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance. We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance.
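One way to picture training against an auxiliary outlier dataset is a two-term loss: the usual cross-entropy on in-distribution examples plus a penalty that pushes predictions on outliers toward the uniform distribution. The abstract does not give the loss, so this classification-only form, the function names, and the `weight` parameter are assumptions made for illustration rather than the authors' exact objective.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over a list of logits."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return [z - log_norm for z in logits]


def cross_entropy(logits: List[float], label: int) -> float:
    """Standard cross-entropy for one in-distribution example."""
    return -log_softmax(logits)[label]


def uniform_cross_entropy(logits: List[float]) -> float:
    """Cross-entropy between the uniform distribution and the softmax output."""
    log_probs = log_softmax(logits)
    return -sum(log_probs) / len(log_probs)


def oe_loss(
    in_logits: List[float], in_label: int, out_logits: List[float], weight: float
) -> float:
    """In-distribution loss plus a weighted outlier term (hypothetical weighting)."""
    return cross_entropy(in_logits, in_label) + weight * uniform_cross_entropy(out_logits)


# The outlier term is smallest when the prediction on an outlier is flat.
flat = uniform_cross_entropy([0.0, 0.0, 0.0, 0.0])
peaked = uniform_cross_entropy([5.0, 0.0, 0.0, 0.0])
total = oe_loss([2.0, 0.0, 0.0, 0.0], 0, [5.0, 0.0, 0.0, 0.0], weight=0.5)
```

A confident prediction on an outlier raises the second term, so minimizing the total discourages the model from assigning high confidence to inputs unlike its training data.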