
Cheng-hsin Wuu,Ningyuan Zheng,Scott Ardisson,Rohan Bali,Danielle Belko,Eric Brockmeyer,Lucas Evans,Timothy Godisart,Hyowon Ha,Xuhua Huang,Alexander Hypes,Taylor Koska,Steven Krenn,Stephen Lombardi,Xiaomin Luo,Kevyn McPhail,Laura Millerschoen,Michal Perdoch,Mark Pitts,Alexander Richard,Jason Saragih,Junko Saragih,Takaaki Shiratori,Tomas Simon,Matt Stewart,Autumn Trimble,Xinshuo Weng,David Whitewolf,Chenglei Wu,Shoou-I Yu,Yaser Sheikh

Photorealistic avatars of human faces have come a long way in recent years, yet research in this area is limited by a lack of publicly available, high-quality datasets covering both dense multi-view camera captures and rich facial expressions of the captured subjects. In this work, we present Multiface, a new multi-view, high-resolution human face dataset collected from 13 identities at Reality Labs Research for neural face rendering. We introduce Mugsy, a large-scale multi-camera apparatus to capture high-resolution synchronized videos of a facial performance. The goal of Multiface is to close the gap in accessibility to high-quality data in the academic community and to enable research in VR telepresence. Along with the release of the dataset, we conduct ablation studies on the influence of different model architectures on the model's capacity to interpolate to novel viewpoints and expressions. With a conditional VAE model serving as our baseline, we found that adding spatial bias, a texture warp field, and residual connections improves performance on novel view synthesis. Our code and data are available at: //github.com/facebookresearch/multiface

Related content

Dataset


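A dataset, also called a data set or data collection, is a collection made up of data.
A data set (or dataset) is a collection of data, usually presented in tabular form. Each column represents a particular variable. Each row corresponds to a given member of the data set in question and lists the values of each variable for that member, such as the height and weight of an object or the values of random numbers. Each value is known as a datum. Depending on the number of rows, the data set may contain data for one or more members.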

Continuity · MoDELS · Networking · Processing (programming language) · Inference

August 16, 2023

AdaBrowse: Adaptive Video Browser for Efficient Continuous Sign Language Recognition

Lianyu Hu,Liqing Gao,Zekang Liu,Chi-Man Pun,Wei Feng

from arxiv, ACMMM2023

Raw videos have been shown to contain considerable feature redundancy: in many cases, only a portion of the frames is enough for accurate recognition. In this paper, we investigate whether such redundancy can be effectively leveraged to enable efficient inference in continuous sign language recognition (CSLR). We propose a novel adaptive model (AdaBrowse) that dynamically selects the most informative subsequence from an input video sequence by modelling the problem as a sequential decision task. Specifically, we first use a lightweight network to quickly scan the input video and extract coarse features. These features are then fed into a policy network that selects a subsequence to process. Finally, the selected subsequence is passed to a normal CSLR model for sentence prediction. As only a portion of the frames is processed in this procedure, the total computation is considerably reduced. Besides temporal redundancy, we also investigate whether the inherent spatial redundancy can be seamlessly integrated to achieve further efficiency, i.e., by dynamically selecting the lowest input resolution for each sample; the resulting model is referred to as AdaBrowse+. Extensive experimental results on four large-scale CSLR datasets, i.e., PHOENIX14, PHOENIX14-T, CSL-Daily and CSL, demonstrate the effectiveness of AdaBrowse and AdaBrowse+, which achieve accuracy comparable to state-of-the-art methods with 1.44× throughput and 2.12× fewer FLOPs. Comparisons with other commonly used 2D CNNs and adaptive efficient methods verify the effectiveness of AdaBrowse. Code is available at //github.com/hulianyuyy/AdaBrowse.

Topic · Comprehensibility · Projection · Networking · Paper

August 16, 2023

RTVis: Research Trend Visualization Toolkit

Xingyu Shen,Yueqian Lin,Zhixian Zhang,Xin Tong

from arxiv, Accepted by IEEE VIS 2023 (Poster). 2 pages, 1 figure. For our demo page, visit //www.rtvis.design/

When researchers are about to start a new project or have just entered a new research field, choosing a suitable research topic is always challenging. To help them gain an overall understanding of research trends in real time and find topics that interest them, we developed the Research Trend Visualization toolkit (RTVis) to analyze and visualize research paper information. RTVis consists of a field theme river, a co-occurrence network, a specialized citation bar chart, and a word frequency race diagram, which show, respectively, how the field changes over time, the collaboration relationships among authors, paper citation counts in different venues, and the most common words in abstracts. Moreover, RTVis is open source and easy to deploy. The demo of our toolkit and the code with detailed documentation are both available online.

INFORMS · Knowledge · Event extraction · Discernible · Context-based representation

August 16, 2023

EnrichEvent: Enriching Social Data with Contextual Information for Emerging Event Extraction

Mohammadali Sefidi Esfahani,Mohammad Akbari

Social platforms have emerged as crucial channels for disseminating information and discussing real-life social events, which offers an excellent opportunity for researchers to design and implement novel event detection frameworks. However, most existing approaches merely exploit keyword burstiness or network structures to detect unspecified events. As a result, they often fail to identify unspecified events, given the challenging nature of events and social data. Social data, e.g., tweets, is characterized by misspellings, incompleteness, word-sense ambiguity, and irregular language, as well as variation in aspects of opinions. Moreover, extracting discriminative features and patterns for evolving events by exploiting the limited structural knowledge is almost infeasible. To address these challenges, in this thesis, we propose a novel framework, namely EnrichEvent, that leverages the lexical and contextual representations of streaming social data. In particular, we leverage contextual knowledge, as well as lexical knowledge, to detect semantically related tweets and enhance the effectiveness of event detection approaches. Finally, our proposed framework produces cluster chains for each event to show how the event evolves over time. We conducted extensive experiments to evaluate our framework, validating its high performance and effectiveness in detecting and distinguishing unspecified social events.

Networking · Separated · Block · Attention · state-of-the-art

August 16, 2023

SCANet: A Self- and Cross-Attention Network for Audio-Visual Speech Separation

Kai Li,Runxuan Yang,Xiaolin Hu

from arxiv, 14 pages, 3 figures

The integration of different modalities, such as audio and visual information, plays a crucial role in human perception of the surrounding environment. Recent research has made significant progress in designing fusion modules for audio-visual speech separation. However, existing methods predominantly focus on multi-modal fusion architectures situated at either the top or the bottom of the network, rather than comprehensively considering multi-modal fusion at various hierarchical positions within the network. In this paper, we propose a novel model called self- and cross-attention network (SCANet), which leverages the attention mechanism for efficient audio-visual feature fusion. SCANet consists of two types of attention blocks: self-attention (SA) and cross-attention (CA) blocks, where the CA blocks are distributed at the top (TCA), middle (MCA) and bottom (BCA) of SCANet. These blocks maintain the ability to learn modality-specific features and enable the extraction of different semantics from audio-visual features. Comprehensive experiments on three standard audio-visual separation benchmarks (LRS2, LRS3, and VoxCeleb2) demonstrate the effectiveness of SCANet, which outperforms existing state-of-the-art (SOTA) methods while maintaining comparable inference time.
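The cross-attention idea mentioned in the abstract can be illustrated with a minimal sketch of generic scaled dot-product attention, where the queries come from one modality and the keys and values from the other. This is not SCANet's actual block design: the learned projections, multi-head structure, and the TCA/MCA/BCA placement are omitted, and the function names and toy feature vectors are assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention.

    queries: feature vectors from one modality (e.g. audio frames)
    keys, values: feature vectors from the other modality (e.g. video frames)
    Returns one fused vector per query.
    """
    dim = len(keys[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused

# Toy example (hypothetical features): 2 audio frames attend over 3 video frames.
audio = [[1.0, 0.0], [0.0, 1.0]]
video_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
video_values = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
fused = cross_attention(audio, video_keys, video_values)
```

Passing the same sequence as queries, keys, and values turns the same function into plain self-attention.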

INFORMS · Dataset · Extensibility · Performer · Round

August 15, 2023

ADD: An Automatic Desensitization Fisheye Dataset for Autonomous Driving

Zizhang Wu,Chenxin Yuan,Hongyang Wei,Fan Song,Tianhao Xu

Autonomous driving systems require many images for analyzing the surrounding environment. However, there is little data protection for the private information in these captured images, such as pedestrian faces or vehicle license plates, which has become a significant issue. In this paper, in response to data security laws and regulations and building on the large Field of View (FoV) of the fisheye camera, we build the first Autopilot Desensitization Dataset, called ADD, and formulate the first deep-learning-based image desensitization framework, to promote the study of image desensitization in autonomous driving scenarios. The compiled dataset consists of 650K images, including different face and vehicle license plate information captured by the surround-view fisheye camera. It covers various autonomous driving scenarios, including diverse facial characteristics and license plate colors. We then propose an efficient multitask desensitization network called DesCenterNet as a benchmark on the ADD dataset, which can perform face and vehicle license plate detection and desensitization tasks. Based on ADD, we further provide an evaluation criterion for desensitization performance, and extensive comparison experiments verify the effectiveness and superiority of our method on image desensitization.

MoDELS · Vision · Denoising · Forward · Discernible

September 10, 2022

Diffusion Models in Vision: A Survey

Florinel-Alin Croitoru,Vlad Hondru,Radu Tudor Ionescu,Mubarak Shah

from arxiv, 20 pages, 3 figures

Denoising diffusion models represent a recent emerging topic in computer vision, demonstrating remarkable results in the area of generative modeling. A diffusion model is a deep generative model that is based on two stages, a forward diffusion stage and a reverse diffusion stage. In the forward diffusion stage, the input data is gradually perturbed over several steps by adding Gaussian noise. In the reverse stage, a model is tasked with recovering the original input data by learning to gradually reverse the diffusion process, step by step. Diffusion models are widely appreciated for the quality and diversity of the generated samples, despite their known computational burdens, i.e., low speeds due to the high number of steps involved during sampling. In this survey, we provide a comprehensive review of articles on denoising diffusion models applied in vision, comprising both theoretical and practical contributions in the field. First, we identify and present three generic diffusion modeling frameworks, which are based on denoising diffusion probabilistic models, noise conditioned score networks, and stochastic differential equations. We further discuss the relations between diffusion models and other deep generative models, including variational auto-encoders, generative adversarial networks, energy-based models, autoregressive models and normalizing flows. Then, we introduce a multi-perspective categorization of diffusion models applied in computer vision. Finally, we illustrate the current limitations of diffusion models and envision some interesting directions for future research.
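The forward diffusion stage described above can be sketched in a few lines. The example below assumes the denoising diffusion probabilistic model (DDPM) formulation, in which a noisy sample at step t can be drawn directly as x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε with ε drawn from a standard Gaussian. The linear noise schedule, its endpoints, and all function names are illustrative assumptions, not details taken from the abstract.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_1..beta_T, linearly spaced (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def forward_diffuse(x0, t, alpha_bars, rng):
    """Draw x_t given x_0: scale the signal down and add Gaussian noise."""
    a = alpha_bars[t]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for v in x0]

rng = random.Random(0)                    # fixed seed for repeatability
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
x0 = [1.0, -0.5, 0.25]                    # toy "image" with three pixels
x_early = forward_diffuse(x0, 0, alpha_bars, rng)    # almost unchanged
x_late = forward_diffuse(x0, 999, alpha_bars, rng)   # close to pure noise
```

Because ᾱ_t shrinks toward zero as t grows, late samples carry almost no trace of x_0; the reverse stage is the learned model that undoes these steps one at a time.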

Learned · Federated learning · MoDELS · Continuity · False positive

May 23, 2022

CELEST: Federated Learning for Globally Coordinated Threat Detection

Talha Ongun,Simona Boboila,Alina Oprea,Tina Eliassi-Rad,Jason Hiser,Jack Davidson

The cyber-threat landscape has evolved tremendously in recent years, with new threat variants emerging daily, and large-scale coordinated campaigns becoming more prevalent. In this study, we propose CELEST (CollaborativE LEarning for Scalable Threat detection), a federated machine learning framework for global threat detection over HTTP, which is one of the most commonly used protocols for malware dissemination and communication. CELEST leverages federated learning to collaboratively train a global model across multiple clients who keep their data locally, thus providing increased privacy and confidentiality assurances. Through a novel active learning component integrated with the federated learning technique, our system continuously discovers and learns the behavior of new, evolving, and globally coordinated cyber threats. We show that CELEST is able to expose attacks that are largely invisible to individual organizations. For instance, in one challenging attack scenario with data exfiltration malware, the global model achieves a three-fold increase in Precision-Recall AUC compared to the local model. We deploy CELEST on two university networks and show that it is able to detect malicious HTTP communication with high precision and low false positive rates. Furthermore, during its deployment, CELEST detected a set of 42 previously unknown malicious URLs and 20 malicious domains in one day, which were confirmed to be malicious by VirusTotal.
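The core idea of training a global model while clients keep their data local can be illustrated with a minimal sketch of one aggregation round. The abstract does not state which aggregation rule CELEST uses, so the example assumes the widely used federated averaging scheme, in which client weight vectors are averaged in proportion to each client's local sample count. The function name and toy numbers are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Combine per-client model weights into one global weight vector.

    client_weights: one list of floats per client (same length for all)
    client_sizes: number of local training samples held by each client
    Only the weights leave each client; the raw data never does.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        for i in range(dim):
            # Each client contributes in proportion to its data volume.
            global_weights[i] += weights[i] * size / total
    return global_weights

# Toy round (hypothetical values): two organizations with 100 and 300 samples.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [100, 300]
global_model = federated_average(clients, sizes)
```

In a full system this step would repeat over many rounds, with the server sending the averaged weights back to the clients for further local training.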

Performer · Transformation · MoDELS · Taxonomy · INTERACT

February 15, 2022

Transformers in Time Series: A Survey

Qingsong Wen,Tian Zhou,Chaoli Zhang,Weiqi Chen,Ziqing Ma,Junchi Yan,Liang Sun

from arxiv, 8 pages, 1 figure, 4 tables, 65 referred papers

Transformers have achieved superior performance in many tasks in natural language processing and computer vision, which has also sparked great interest in the time series community. Among the multiple advantages of Transformers, the ability to capture long-range dependencies and interactions is especially attractive for time series modeling, leading to exciting progress in various time series applications. In this paper, we systematically review Transformer schemes for time series modeling, highlighting their strengths as well as limitations through a new taxonomy that summarizes existing time series Transformers from two perspectives. From the perspective of network modifications, we summarize the module-level and architecture-level adaptations of time series Transformers. From the perspective of applications, we categorize time series Transformers based on common tasks including forecasting, anomaly detection, and classification. Empirically, we perform robustness analysis, model size analysis, and seasonal-trend decomposition analysis to study how Transformers perform on time series. Finally, we discuss and suggest future directions to provide useful research guidance. To the best of our knowledge, this paper is the first work to comprehensively and systematically summarize the recent advances of Transformers for modeling time series data. We hope this survey will ignite further research interest in time series Transformers.

Performer · MoDELS · Model performance · Neural Networks · Processing (programming language)

October 26, 2020

Backdoor Learning: A Survey

Yiming Li,Baoyuan Wu,Yong Jiang,Zhifeng Li,Shu-Tao Xia

from arxiv, 12 pages. A curated list of backdoor learning resources in this paper is presented in the Github Repo (//github.com/THUYimingLi/backdoor-learning-resources). We will try our best to continuously maintain the repo

Backdoor attacks intend to embed hidden backdoors into deep neural networks (DNNs), such that the attacked model performs well on benign samples, whereas its prediction is maliciously changed if the hidden backdoor is activated by an attacker-defined trigger. Backdoor attacks can happen when the training process is not fully controlled by the user, such as when training on third-party datasets or adopting third-party models, which poses a new and realistic threat. Although backdoor learning is an emerging and rapidly growing research area, a systematic review of it is still missing. In this paper, we present the first comprehensive survey of this realm. We summarize and categorize existing backdoor attacks and defenses based on their characteristics, and provide a unified framework for analyzing poisoning-based backdoor attacks. In addition, we analyze the relation between backdoor attacks and relevant fields (i.e., adversarial attack and data poisoning), and summarize the benchmark datasets. Finally, we briefly outline certain future research directions based on the reviewed works.

Performer · Discriminator · Positive example · False positive · Supervision

May 24, 2018

DSGAN: Generative Adversarial Training for Distant Supervision Relation Extraction

Pengda Qin,Weiran Xu,William Yang Wang

Distant supervision can effectively label data for relation extraction, but suffers from the noisy labeling problem. Recent works mainly apply soft bag-level noise reduction strategies to find the relatively better samples in a sentence bag, which is suboptimal compared with making a hard decision on false positive samples at the sentence level. In this paper, we introduce an adversarial learning framework, which we name DSGAN, to learn a sentence-level true-positive generator. Inspired by Generative Adversarial Networks, we regard the positive samples generated by the generator as the negative samples used to train the discriminator. The optimal generator is obtained when the discrimination ability of the discriminator shows the greatest decline. We use the generator to filter the distant supervision training dataset and redistribute the false positive instances into the negative set, thereby providing a cleaned dataset for relation classification. The experimental results show that the proposed strategy significantly improves the performance of distant supervision relation extraction compared to state-of-the-art systems.
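The final cleaning step, in which false positives are moved from the positive set into the negative set, can be sketched as a simple partition. This is an illustration under stated assumptions, not the authors' implementation: `score_fn` stands in for the trained generator's estimate that a sentence is a true positive, and the 0.5 threshold is a hypothetical choice.

```python
def redistribute_false_positives(positive_set, negative_set, score_fn, threshold=0.5):
    """Split distantly supervised positives using a generator-style score.

    positive_set: sentences labeled positive by distant supervision
    negative_set: sentences already labeled negative
    score_fn: callable returning the estimated true-positive probability
              (a stand-in for the trained generator; an assumption here)
    threshold: hypothetical cut-off below which a sentence counts as a false positive
    Returns (cleaned_positives, enlarged_negatives).
    """
    kept, moved = [], []
    for sentence in positive_set:
        if score_fn(sentence) >= threshold:
            kept.append(sentence)
        else:
            moved.append(sentence)
    # False positives are not discarded; they join the negative set.
    return kept, list(negative_set) + moved

# Toy scores (hypothetical) in place of a trained generator.
toy_scores = {"s1": 0.9, "s2": 0.2, "s3": 0.7}
clean_pos, clean_neg = redistribute_false_positives(
    ["s1", "s2", "s3"], ["n1"], toy_scores.get
)
```

The relation classifier would then be trained on `clean_pos` and `clean_neg` instead of the original noisy sets.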