干逼视频无码免费网站-亚日韩中文无码视频

Although semantic communications have exhibited satisfactory performance for a large number of tasks, the impact of semantic noise and the robustness of the systems have not been well investigated. Semantic noise refers to the misleading between the intended semantic symbols and received ones, thus cause the failure of tasks. In this paper, we first propose a framework for the robust end-to-end semantic communication systems to combat the semantic noise. In particular, we analyze sample-dependent and sample-independent semantic noise. To combat the semantic noise, the adversarial training with weight perturbation is developed to incorporate the samples with semantic noise in the training dataset. Then, we propose to mask a portion of the input, where the semantic noise appears frequently, and design the masked vector quantized-variational autoencoder (VQ-VAE) with the noise-related masking strategy. We use a discrete codebook shared by the transmitter and the receiver for encoded feature representation. To further improve the system robustness, we develop a feature importance module (FIM) to suppress the noise-related and task-unrelated features. Thus, the transmitter simply needs to transmit the indices of these important task-related features in the codebook. Simulation results show that the proposed method can be applied in many downstream tasks and significantly improve the robustness against semantic noise with remarkable reduction on the transmission overhead.

相關內容

穩健性

關注 3

代碼 · Learning · NMT · 變換 · Automator ·

2022 年 7 月 25 日

Dynamically Relative Position Encoding-Based Transformer for Automatic Code Edit

Shiyi Qi,Yaoxian Li,Cuiyun Gao,Xiaohong Su,Shuzheng Gao,Zibin Zheng,Chuanyi Liu

Adapting Deep Learning (DL) techniques to automate non-trivial coding activities, such as code documentation and defect detection, has been intensively studied recently. Learning to predict code changes is one of the popular and essential investigations. Prior studies have shown that DL techniques such as Neural Machine Translation (NMT) can benefit meaningful code changes, including bug fixing and code refactoring. However, NMT models may encounter bottleneck when modeling long sequences, thus are limited in accurately predicting code changes. In this work, we design a Transformer-based approach, considering that Transformer has proven effective in capturing long-term dependencies. Specifically, we propose a novel model named DTrans. For better incorporating the local structure of code, i.e., statement-level information in this paper, DTrans is designed with dynamically relative position encoding in the multi-head attention of Transformer. Experiments on benchmark datasets demonstrate that DTrans can more accurately generate patches than the state-of-the-art methods, increasing the performance by at least 5.45\%-46.57\% in terms of the exact match metric on different datasets. Moreover, DTrans can locate the lines to change with 1.75\%-24.21\% higher accuracy than the existing methods.

解碼 · 逼真度 · motivation · INTERACT · SimPLe ·

2022 年 7 月 25 日

VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder

Yuchao Gu,Xintao Wang,Liangbin Xie,Chao Dong,Gen Li,Ying Shan,Ming-Ming Cheng

from arxiv, Project webpage is available at //ycgu.site/projects/VQFR_Project

Although generative facial prior and geometric prior have recently demonstrated high-quality results for blind face restoration, producing fine-grained facial details faithful to inputs remains a challenging problem. Motivated by the classical dictionary-based methods and the recent vector quantization (VQ) technique, we propose a VQ-based face restoration method - VQFR. VQFR takes advantage of high-quality low-level feature banks extracted from high-quality faces and can thus help recover realistic facial details. However, the simple application of the VQ codebook cannot achieve good results with faithful details and identity preservation. Therefore, we further introduce two special network designs. 1). We first investigate the compression patch size in the VQ codebook and find that the VQ codebook designed with a proper compression patch size is crucial to balance the quality and fidelity. 2). To further fuse low-level features from inputs while not "contaminating" the realistic details generated from the VQ codebook, we proposed a parallel decoder consisting of a texture decoder and a main decoder. Those two decoders then interact with a texture warping module with deformable convolution. Equipped with the VQ codebook as a facial detail dictionary and the parallel decoder design, the proposed VQFR can largely enhance the restored quality of facial details while keeping the fidelity to previous methods.

Performer · 估計/估計量 · 圖注意力網絡 · Attention · 通道 ·

2022 年 7 月 23 日

Graph Attention Networks for Channel Estimation in RIS-assisted Satellite IoT Communications

Kür?at Tekb?y?k,Güne? Karabulut Kurt,Ali R?za Ekti,Halim Yanikomeroglu

from arxiv, 11 pages, 13 figures

Direct-to-satellite (DtS) communication has gained importance recently to support globally connected Internet of things (IoT) networks. However, relatively long distances of densely deployed satellite networks around the Earth cause a high path loss. In addition, since high complexity operations such as beamforming, tracking and equalization have to be performed in IoT devices partially, both the hardware complexity and the need for high-capacity batteries of IoT devices increase. The reconfigurable intelligent surfaces (RISs) have the potential to increase the energy-efficiency and to perform complex signal processing over the transmission environment instead of IoT devices. But, RISs need the information of the cascaded channel in order to change the phase of the incident signal. This study evaluates the pilot signal as a graph and incorporates this information into the graph attention networks (GATs) to track the phase relation through pilot signaling. The proposed GAT-based channel estimation method examines the performance of the DtS IoT networks for different RIS configurations to solve the challenging channel estimation problem. It is shown that the proposed GAT both demonstrates a higher performance with increased robustness under changing conditions and has lower computational complexity compared to conventional deep learning methods. Moreover, bit error rate performance is investigated for RIS designs with discrete and non-uniform phase shifts under channel estimation based on the proposed method. One of the findings in this study is that the channel models of the operating environment and the performance of the channel estimation method must be considered during RIS design to exploit performance improvement as far as possible.

Analysis · CNN · Neural Networks · 模型評估 · Networking ·

2022 年 7 月 22 日

Taguchi based Design of Sequential Convolution Neural Network for Classification of Defective Fasteners

Manjeet Kaur,Krishan Kumar Chauhan,Tanya Aggarwal,Pushkar Bharadwaj,Renu Vig,Isibor Kennedy Ihianle,Garima Joshi,Kayode Owa

from arxiv, 13 pages, 6 figures

Fasteners play a critical role in securing various parts of machinery. Deformations such as dents, cracks, and scratches on the surface of fasteners are caused by material properties and incorrect handling of equipment during production processes. As a result, quality control is required to ensure safe and reliable operations. The existing defect inspection method relies on manual examination, which consumes a significant amount of time, money, and other resources; also, accuracy cannot be guaranteed due to human error. Automatic defect detection systems have proven impactful over the manual inspection technique for defect analysis. However, computational techniques such as convolutional neural networks (CNN) and deep learning-based approaches are evolutionary methods. By carefully selecting the design parameter values, the full potential of CNN can be realised. Using Taguchi-based design of experiments and analysis, an attempt has been made to develop a robust automatic system in this study. The dataset used to train the system has been created manually for M14 size nuts having two labeled classes: Defective and Non-defective. There are a total of 264 images in the dataset. The proposed sequential CNN comes up with a 96.3% validation accuracy, 0.277 validation loss at 0.001 learning rate.

INFORMS · 噪聲 · 回合 · Unstructured · Color ·

2022 年 7 月 22 日

Visible and Near Infrared Image Fusion Based on Texture Information

Guanyu Zhang,Beichen Sun,Yuehan Qi,Yang Liu

from arxiv, 10 pages,11 figures

Multi-sensor fusion is widely used in the environment perception system of the autonomous vehicle. It solves the interference caused by environmental changes and makes the whole driving system safer and more reliable. In this paper, a novel visible and near-infrared fusion method based on texture information is proposed to enhance unstructured environmental images. It aims at the problems of artifact, information loss and noise in traditional visible and near infrared image fusion methods. Firstly, the structure information of the visible image (RGB) and the near infrared image (NIR) after texture removal is obtained by relative total variation (RTV) calculation as the base layer of the fused image; secondly, a Bayesian classification model is established to calculate the noise weight and the noise information and the noise information in the visible image is adaptively filtered by joint bilateral filter; finally, the fused image is acquired by color space conversion. The experimental results demonstrate that the proposed algorithm can preserve the spectral characteristics and the unique information of visible and near-infrared images without artifacts and color distortion, and has good robustness as well as preserving the unique texture.

圖像字幕 · MoDELS · ForCES · INFORMS · AIM ·

2022 年 7 月 22 日

Efficient Modeling of Future Context for Image Captioning

Zhengcong Fei,Junshi Huang,Xiaoming Wei,Xiaolin Wei

from arxiv, ACM Multimedia 2022

Existing approaches to image captioning usually generate the sentence word-by-word from left to right, with the constraint of conditioned on local context including the given image and history generated words. There have been many studies target to make use of global information during decoding, e.g., iterative refinement. However, it is still under-explored how to effectively and efficiently incorporate the future context. To respond to this issue, inspired by that Non-Autoregressive Image Captioning (NAIC) can leverage two-side relation with modified mask operation, we aim to graft this advance to the conventional Autoregressive Image Captioning (AIC) model while maintaining the inference efficiency without extra time cost. Specifically, AIC and NAIC models are first trained combined with shared visual encoders, forcing the visual encoder to contain sufficient and valid future context; then the AIC model is encouraged to capture the causal dynamics of cross-layer interchanging from NAIC model on its unconfident words, which follows a teacher-student paradigm and optimized with the distribution calibration training objective. Empirical evidences demonstrate that our proposed approach clearly surpass the state-of-the-art baselines in both automatic metrics and human evaluations on the MS COCO benchmark. The source code is available at: //github.com/feizc/Future-Caption.

SOFT · 無監督 · Networking · Performer · Learning ·

2022 年 7 月 21 日

DC-ShadowNet: Single-Image Hard and Soft Shadow Removal Using Unsupervised Domain-Classifier Guided Network

Yeying Jin,Aashish Sharma,Robby T. Tan

from arxiv, Accepted to ICCV2021, //github.com/jinyeying/DC-ShadowNet-Hard-and-Soft-Shadow-Removal

Shadow removal from a single image is generally still an open problem. Most existing learning-based methods use supervised learning and require a large number of paired images (shadow and corresponding non-shadow images) for training. A recent unsupervised method, Mask-ShadowGAN, addresses this limitation. However, it requires a binary mask to represent shadow regions, making it inapplicable to soft shadows. To address the problem, in this paper, we propose an unsupervised domain-classifier guided shadow removal network, DC-ShadowNet. Specifically, we propose to integrate a shadow/shadow-free domain classifier into a generator and its discriminator, enabling them to focus on shadow regions. To train our network, we introduce novel losses based on physics-based shadow-free chromaticity, shadow-robust perceptual features, and boundary smoothness. Moreover, we show that our unsupervised network can be used for test-time training that further improves the results. Our experiments show that all these novel components allow our method to handle soft shadows, and also to perform better on hard shadows both quantitatively and qualitatively than the existing state-of-the-art shadow removal methods.

Learning · 相似度 · ForCES · 判別器 · Performance ·

2022 年 7 月 21 日

ERA: Expert Retrieval and Assembly for Early Action Prediction

Lin Geng Foo,Tianjiao Li,Hossein Rahmani,Qiuhong Ke,Jun Liu

from arxiv, Accepted to ECCV 2022

Early action prediction aims to successfully predict the class label of an action before it is completely performed. This is a challenging task because the beginning stages of different actions can be very similar, with only minor subtle differences for discrimination. In this paper, we propose a novel Expert Retrieval and Assembly (ERA) module that retrieves and assembles a set of experts most specialized at using discriminative subtle differences, to distinguish an input sample from other highly similar samples. To encourage our model to effectively use subtle differences for early action prediction, we push experts to discriminate exclusively between samples that are highly similar, forcing these experts to learn to use subtle differences that exist between those samples. Additionally, we design an effective Expert Learning Rate Optimization method that balances the experts' optimization and leads to better performance. We evaluate our ERA module on four public action datasets and achieve state-of-the-art performance.

語言模型化 · 有偏 · MoDELS · 可約的 · CASE ·

2022 年 7 月 21 日

The Birth of Bias: A case study on the evolution of gender bias in an English language model

Oskar van der Wal,Jaap Jumelet,Katrin Schulz,Willem Zuidema

from arxiv, Accepted at the 4th Workshop on Gender Bias in Natural Language Processing (NAACL, 2022)

Detecting and mitigating harmful biases in modern language models are widely recognized as crucial, open problems. In this paper, we take a step back and investigate how language models come to be biased in the first place. We use a relatively small language model, using the LSTM architecture trained on an English Wikipedia corpus. With full access to the data and to the model parameters as they change during every step while training, we can map in detail how the representation of gender develops, what patterns in the dataset drive this, and how the model's internal state relates to the bias in a downstream task (semantic textual similarity). We find that the representation of gender is dynamic and identify different phases during training. Furthermore, we show that gender information is represented increasingly locally in the input embeddings of the model and that, as a consequence, debiasing these can be effective in reducing the downstream bias. Monitoring the training dynamics, allows us to detect an asymmetry in how the female and male gender are represented in the input embeddings. This is important, as it may cause naive mitigation strategies to introduce new undesirable biases. We discuss the relevance of the findings for mitigation strategies more generally and the prospects of generalizing our methods to larger language models, the Transformer architecture, other languages and other undesirable biases.

圖注意力網絡 · 文本分類 · 圖 · 注意力機制 · Networking ·

2020 年 3 月 22 日

Multi-Label Text Classification using Attention-based Graph Neural Network

Ankit Pal,Muru Selvakumar,Malaikannan Sankarasubbu

In Multi-Label Text Classification (MLTC), one sample can belong to more than one class. It is observed that most MLTC tasks, there are dependencies or correlations among labels. Existing methods tend to ignore the relationship among labels. In this paper, a graph attention network-based model is proposed to capture the attentive dependency structure among the labels. The graph attention network uses a feature matrix and a correlation matrix to capture and explore the crucial dependencies between the labels and generate classifiers for the task. The generated classifiers are applied to sentence feature vectors obtained from the text feature extraction network (BiLSTM) to enable end-to-end training. Attention allows the system to assign different weights to neighbor nodes per label, thus allowing it to learn the dependencies among labels implicitly. The results of the proposed model are validated on five real-world MLTC datasets. The proposed model achieves similar or better performance compared to the previous state-of-the-art models.