良辰好景知几何电视剧免费观看_亚洲日本午夜一区二区三区_天天爱天天做天天操天天舔_欧美色综合精品视频在线_无遮挡看着又刺激又爽很黄的视频_亚洲日韩国产精品无码专区_视频一区二区三区欧美国产剧情

Crowd management relies on inspection of surveillance video either by operators or by object detection models. These models are large, making it difficult to deploy them on resource constrained edge hardware. Instead, the computations are often offloaded to a (third party) cloud platform. While crowd management may be a legitimate application, transferring video from the camera to remote infrastructure may open the door for extracting additional information that are infringements of privacy, like person tracking or face recognition. In this paper, we use adversarial training to obtain a lightweight obfuscator that transforms video frames to only retain the necessary information for person detection. Importantly, the obfuscated data can be processed by publicly available object detectors without retraining and without significant loss of accuracy.

相關內容

INFORMS

關注 10

《計算機信息》雜志發表高質量的論文，擴大了運籌學和計算的范圍，尋求有關理論、方法、實驗、系統和應用方面的原創研究論文、新穎的調查和教程論文，以及描述新的和有用的軟件工具的論文。官網鏈接： · Processing（編程語言） · INFORMS · Performer · 目標領域 ·

2021 年 12 月 28 日

TAGPerson: A Target-Aware Generation Pipeline for Person Re-identification

Kai Chen,Weihua Chen,Tao He,Rong Du,Fan Wang,Xiuyu Sun,Yuchen Guo,Guiguang Ding

Nowadays, real data in person re-identification (ReID) task is facing privacy issues, e.g., the banned dataset DukeMTMC-ReID. Thus it becomes much harder to collect real data for ReID task. Meanwhile, the labor cost of labeling ReID data is still very high and further hinders the development of the ReID research. Therefore, many methods turn to generate synthetic images for ReID algorithms as alternatives instead of real images. However, there is an inevitable domain gap between synthetic and real images. In previous methods, the generation process is based on virtual scenes, and their synthetic training data can not be changed according to different target real scenes automatically. To handle this problem, we propose a novel Target-Aware Generation pipeline to produce synthetic person images, called TAGPerson. Specifically, it involves a parameterized rendering method, where the parameters are controllable and can be adjusted according to target scenes. In TAGPerson, we extract information from target scenes and use them to control our parameterized rendering process to generate target-aware synthetic images, which would hold a smaller gap to the real images in the target domain. In our experiments, our target-aware synthetic images can achieve a much higher performance than the generalized synthetic images on MSMT17, i.e. 47.5% vs. 40.9% for rank-1 accuracy. We will release this toolkit\footnote{\noindent Code is available at \href{//github.com/tagperson/tagperson-blender}{//github.com/tagperson/tagperson-blender}} for the ReID community to generate synthetic images at any desired taste.

MoDELS · Extensibility · Storage · INFORMS · Better ·

2020 年 10 月 8 日

Privacy-Preserving News Recommendation Model Learning

Tao Qi,Fangzhao Wu,Chuhan Wu,Yongfeng Huang,Xing Xie

from arxiv, The paper is to appear in "Findings of ACL: EMNLP 2020"

News recommendation aims to display news articles to users based on their personal interest. Existing news recommendation methods rely on centralized storage of user behavior data for model training, which may lead to privacy concerns and risks due to the privacy-sensitive nature of user behaviors. In this paper, we propose a privacy-preserving method for news recommendation model training based on federated learning, where the user behavior data is locally stored on user devices. Our method can leverage the useful information in the behaviors of massive number users to train accurate news recommendation models and meanwhile remove the need of centralized storage of them. More specifically, on each user device we keep a local copy of the news recommendation model, and compute gradients of the local model based on the user behaviors in this device. The local gradients from a group of randomly selected users are uploaded to server, which are further aggregated to update the global model in the server. Since the model gradients may contain some implicit private information, we apply local differential privacy (LDP) to them before uploading for better privacy protection. The updated global model is then distributed to each user device for local model update. We repeat this process for multiple rounds. Extensive experiments on a real-world dataset show the effectiveness of our method in news recommendation model training with privacy protection.

遷移學習 · 學成 · 目標檢測 · Neural Networks · Processing（編程語言） ·

2018 年 11 月 12 日

A Framework of Transfer Learning in Object Detection for Embedded Systems

Ioannis Athanasiadis,Panagiotis Mousouliotis,Loukas Petrou

Transfer learning is one of the subjects undergoing intense study in the area of machine learning. In object recognition and object detection there are known experiments for the transferability of parameters, but not for neural networks which are suitable for object-detection in real time embedded applications, such as the SqueezeDet neural network. We use transfer learning to accelerate the training of SqueezeDet to a new group of classes. Also, experiments are conducted to study the transferability and co-adaptation phenomena introduced by the transfer learning process. To accelerate training, we propose a new implementation of the SqueezeDet training which provides a faster pipeline for data processing and achieves $1.8$ times speedup compared to the initial implementation. Finally, we created a mechanism for automatic hyperparamer optimization using an empirical method.

Performer · 無監督 · 數據集 · 次最優 · Automator ·

2018 年 10 月 4 日

Unsupervised Adversarial Visual Level Domain Adaptation for Learning Video Object Detectors from Images

Avisek Lahiri,Charan Reddy,Prabir Kumar Biswas

from arxiv, * First two authors contributed equally

Deep learning based object detectors require thousands of diversified bounding box and class annotated examples. Though image object detectors have shown rapid progress in recent years with the release of multiple large-scale static image datasets, object detection on videos still remains an open problem due to scarcity of annotated video frames. Having a robust video object detector is an essential component for video understanding and curating large-scale automated annotations in videos. Domain difference between images and videos makes the transferability of image object detectors to videos sub-optimal. The most common solution is to use weakly supervised annotations where a video frame has to be tagged for presence/absence of object categories. This still takes up manual effort. In this paper we take a step forward by adapting the concept of unsupervised adversarial image-to-image translation to perturb static high quality images to be visually indistinguishable from a set of video frames. We assume the presence of a fully annotated static image dataset and an unannotated video dataset. Object detector is trained on adversarially transformed image dataset using the annotations of the original dataset. Experiments on Youtube-Objects and Youtube-Objects-Subset datasets with two contemporary baseline object detectors reveal that such unsupervised pixel level domain adaptation boosts the generalization performance on video frames compared to direct application of original image object detector. Also, we achieve competitive performance compared to recent baselines of weakly supervised methods. This paper can be seen as an application of image translation for cross domain object detection.

對象識別 · 目標檢測 · MoDELS · INTERACT · Networking ·

2018 年 6 月 14 日

Relation Networks for Object Detection

Han Hu,Jiayuan Gu,Zheng Zhang,Jifeng Dai,Yichen Wei

Although it is well believed for years that modeling relations between objects would help object recognition, there has not been evidence that the idea is working in the deep learning era. All state-of-the-art object detection systems still rely on recognizing object instances individually, without exploiting their relations during learning. This work proposes an object relation module. It processes a set of objects simultaneously through interaction between their appearance feature and geometry, thus allowing modeling of their relations. It is lightweight and in-place. It does not require additional supervision and is easy to embed in existing networks. It is shown effective on improving object recognition and duplicate removal steps in the modern object detection pipeline. It verifies the efficacy of modeling object relations in CNN based detection. It gives rise to the first fully end-to-end object detector.

蒸餾 · 損失函數（機器學習） · Networking · 未標記 · 教師網絡 ·

2018 年 5 月 16 日

Object detection at 200 Frames Per Second

Rakesh Mehta,Cemalettin Ozturk

In this paper, we propose an efficient and fast object detector which can process hundreds of frames per second. To achieve this goal we investigate three main aspects of the object detection framework: network architecture, loss function and training data (labeled and unlabeled). In order to obtain compact network architecture, we introduce various improvements, based on recent work, to develop an architecture which is computationally light-weight and achieves a reasonable performance. To further improve the performance, while keeping the complexity same, we utilize distillation loss function. Using distillation loss we transfer the knowledge of a more accurate teacher network to proposed light-weight student network. We propose various innovations to make distillation efficient for the proposed one stage detector pipeline: objectness scaled distillation loss, feature map non-maximal suppression and a single unified distillation loss function for detection. Finally, building upon the distillation loss, we explore how much can we push the performance by utilizing the unlabeled data. We train our model with unlabeled data using the soft labels of the teacher network. Our final network consists of 10x fewer parameters than the VGG based object detection network and it achieves a speed of more than 200 FPS and proposed changes improve the detection accuracy by 14 mAP over the baseline on Pascal dataset.

平滑 · 噪聲 · Performer · 子空間 · 正則化項 ·

2018 年 4 月 10 日

Camera Style Adaptation for Person Re-identification

Zhun Zhong,Liang Zheng,Zhedong Zheng,Shaozi Li,Yi Yang

from arxiv, To appear in CVPR 2018

Being a cross-camera retrieval task, person re-identification suffers from image style variations caused by different cameras. The art implicitly addresses this problem by learning a camera-invariant descriptor subspace. In this paper, we explicitly consider this challenge by introducing camera style (CamStyle) adaptation. CamStyle can serve as a data augmentation approach that smooths the camera style disparities. Specifically, with CycleGAN, labeled training images can be style-transferred to each camera, and, along with the original training samples, form the augmented training set. This method, while increasing data diversity against over-fitting, also incurs a considerable level of noise. In the effort to alleviate the impact of noise, the label smooth regularization (LSR) is adopted. The vanilla version of our method (without LSR) performs reasonably well on few-camera systems in which over-fitting often occurs. With LSR, we demonstrate consistent improvement in all systems regardless of the extent of over-fitting. We also report competitive accuracy compared with the state of the art.

INFORMS · Performer · 查全率/召回率 · 目標檢測 · 學成 ·

2018 年 3 月 19 日

Zero-Shot Detection

Pengkai Zhu,Hanxiao Wang,Tolga Bolukbasi,Venkatesh Saligrama

from arxiv, 16 pages, 5 figures

As we move towards large-scale object detection, it is unrealistic to expect annotated training data for all object classes at sufficient scale, and so methods capable of unseen object detection are required. We propose a novel zero-shot method based on training an end-to-end model that fuses semantic attribute prediction with visual features to propose object bounding boxes for seen and unseen classes. While we utilize semantic features during training, our method is agnostic to semantic information for unseen classes at test-time. Our method retains the efficiency and effectiveness of YOLO for objects seen during training, while improving its performance for novel and unseen objects. The ability of state-of-art detection methods to learn discriminative object features to reject background proposals also limits their performance for unseen objects. We posit that, to detect unseen objects, we must incorporate semantic information into the visual domain so that the learned visual features reflect this information and leads to improved recall rates for unseen objects. We test our method on PASCAL VOC and MS COCO dataset and observed significant improvements on the average precision of unseen classes.

目標檢測 · 特征提取 · 學成 · 匯聚 · 端到端 ·

2018 年 3 月 19 日

Learning Region Features for Object Detection

Jiayuan Gu,Han Hu,Liwei Wang,Yichen Wei,Jifeng Dai

While most steps in the modern object detection methods are learnable, the region feature extraction step remains largely hand-crafted, featured by RoI pooling methods. This work proposes a general viewpoint that unifies existing region feature extraction methods and a novel method that is end-to-end learnable. The proposed method removes most heuristic choices and outperforms its RoI pooling counterparts. It moves further towards fully learnable object detection.

YOLO · 可約的 · FAST · ONCE · Yolo ·

2017 年 9 月 18 日

Fast YOLO: A Fast You Only Look Once System for Real-time Embedded Object Detection in Video

Mohammad Javad Shafiee,Brendan Chywl,Francis Li,Alexander Wong

Object detection is considered one of the most challenging problems in this field of computer vision, as it involves the combination of object classification and object localization within a scene. Recently, deep neural networks (DNNs) have been demonstrated to achieve superior object detection performance compared to other approaches, with YOLOv2 (an improved You Only Look Once model) being one of the state-of-the-art in DNN-based object detection methods in terms of both speed and accuracy. Although YOLOv2 can achieve real-time performance on a powerful GPU, it still remains very challenging for leveraging this approach for real-time object detection in video on embedded computing devices with limited computational power and limited memory. In this paper, we propose a new framework called Fast YOLO, a fast You Only Look Once framework which accelerates YOLOv2 to be able to perform object detection in video on embedded devices in a real-time manner. First, we leverage the evolutionary deep intelligence framework to evolve the YOLOv2 network architecture and produce an optimized architecture (referred to as O-YOLOv2 here) that has 2.8X fewer parameters with just a ~2% IOU drop. To further reduce power consumption on embedded devices while maintaining performance, a motion-adaptive inference method is introduced into the proposed Fast YOLO framework to reduce the frequency of deep inference with O-YOLOv2 based on temporal motion characteristics. Experimental results show that the proposed Fast YOLO framework can reduce the number of deep inferences by an average of 38.13%, and an average speedup of ~3.3X for objection detection in video compared to the original YOLOv2, leading Fast YOLO to run an average of ~18FPS on a Nvidia Jetson TX1 embedded system.