一区二区三区四区五区无码-午夜剧场成年免费视

In recent years there has been a dramatic increase in the number of malware attacks that use encrypted HTTP traffic for self-propagation or communication. Antivirus software and firewalls typically will not have access to encryption keys, and therefore direct detection of malicious encrypted data is unlikely to succeed. However, previous work has shown that traffic analysis can provide indications of malicious intent, even in cases where the underlying data remains encrypted. In this paper, we apply three machine learning techniques to the problem of distinguishing malicious encrypted HTTP traffic from benign encrypted traffic and obtain results comparable to previous work. We then consider the problem of feature analysis in some detail. Previous work has often relied on human expertise to determine the most useful and informative features in this problem domain. We demonstrate that such feature-related information can be obtained directly from machine learning models themselves. We argue that such a machine learning based approach to feature analysis is preferable, as it is more reliable, and we can, for example, uncover relatively unintuitive interactions between features.

相關內容

Analysis

關注 2

損失 · MoDELS · state-of-the-art · Performer · Better ·

2024 年 1 月 30 日

Proactive Detection of Voice Cloning with Localized Watermarking

Robin San Roman,Pierre Fernandez,Alexandre Défossez,Teddy Furon,Tuan Tran,Hady Elsahar

from arxiv, Code at //github.com/facebookresearch/audioseal

In the rapidly evolving field of speech generative models, there is a pressing need to ensure audio authenticity against the risks of voice cloning. We present AudioSeal, the first audio watermarking technique designed specifically for localized detection of AI-generated speech. AudioSeal employs a generator/detector architecture trained jointly with a localization loss to enable localized watermark detection up to the sample level, and a novel perceptual loss inspired by auditory masking, that enables AudioSeal to achieve better imperceptibility. AudioSeal achieves state-of-the-art performance in terms of robustness to real life audio manipulations and imperceptibility based on automatic and human evaluation metrics. Additionally, AudioSeal is designed with a fast, single-pass detector, that significantly surpasses existing models in speed - achieving detection up to two orders of magnitude faster, making it ideal for large-scale and real-time applications.

Taxonomy · 數學 · 相似度 · Seven · CASES ·

2024 年 1 月 30 日

Taxonomy of Mathematical Plagiarism

Ankit Satpute,Andre Greiner-Petter,Noah Gie?ing,Isabel Beckenbach,Moritz Schubotz,Olaf Teschke,Akiko Aizawa,Bela Gipp

from arxiv, 46th European Conference on Information Retrieval (ECIR)

Plagiarism is a pressing concern, even more so with the availability of large language models. Existing plagiarism detection systems reliably find copied and moderately reworded text but fail for idea plagiarism, especially in mathematical science, which heavily uses formal mathematical notation. We make two contributions. First, we establish a taxonomy of mathematical content reuse by annotating potentially plagiarised 122 scientific document pairs. Second, we analyze the best-performing approaches to detect plagiarism and mathematical content similarity on the newly established taxonomy. We found that the best-performing methods for plagiarism and math content similarity achieve an overall detection score (PlagDet) of 0.06 and 0.16, respectively. The best-performing methods failed to detect most cases from all seven newly established math similarity types. Outlined contributions will benefit research in plagiarism detection systems, recommender systems, question-answering systems, and search engines. We make our experiment's code and annotated dataset available to the community: //github.com/gipplab/Taxonomy-of-Mathematical-Plagiarism

INTERACT · Performer · Extensibility · INFORMS · state-of-the-art ·

2024 年 1 月 30 日

Towards Unified Interactive Visual Grounding in The Wild

Jie Xu,Hanbo Zhang,Qingyi Si,Yifeng Li,Xuguang Lan,Tao Kong

from arxiv, Accepted to ICRA 2024

Interactive visual grounding in Human-Robot Interaction (HRI) is challenging yet practical due to the inevitable ambiguity in natural languages. It requires robots to disambiguate the user input by active information gathering. Previous approaches often rely on predefined templates to ask disambiguation questions, resulting in performance reduction in realistic interactive scenarios. In this paper, we propose TiO, an end-to-end system for interactive visual grounding in human-robot interaction. Benefiting from a unified formulation of visual dialogue and grounding, our method can be trained on a joint of extensive public data, and show superior generality to diversified and challenging open-world scenarios. In the experiments, we validate TiO on GuessWhat?! and InViG benchmarks, setting new state-of-the-art performance by a clear margin. Moreover, we conduct HRI experiments on the carefully selected 150 challenging scenes as well as real-robot platforms. Results show that our method demonstrates superior generality to diversified visual and language inputs with a high success rate. Codes and demos are available at //github.com/jxu124/TiO.

MoDELS · Learning · 講稿 · 深度學習 · MINE ·

2024 年 1 月 29 日

Deep Learning Models for Detecting Malware Attacks

Pascal Maniriho,Abdun Naser Mahmood,Mohammad Jabed Morshed Chowdhury

from arxiv, Revised figures 2 and 3, revised title, remove typos page 17

Malware is one of the most common and severe cyber-attack today. Malware infects millions of devices and can perform several malicious activities including mining sensitive data, encrypting data, crippling system performance, and many more. Hence, malware detection is crucial to protect our computers and mobile devices from malware attacks. Deep learning (DL) is one of the emerging and promising technologies for detecting malware. The recent high production of malware variants against desktop and mobile platforms makes DL algorithms powerful approaches for building scalable and advanced malware detection models as they can handle big datasets. This work explores current deep learning technologies for detecting malware attacks on the Windows, Linux, and Android platforms. Specifically, we present different categories of DL algorithms, network optimizers, and regularization methods. Different loss functions, activation functions, and frameworks for implementing DL models are presented. We also present feature extraction approaches and a review of recent DL-based models for detecting malware attacks on the above platforms. Furthermore, this work presents major research issues on malware detection including future directions to further advance knowledge and research in this field.

Processing（編程語言） · 圖 · motivation · Networks · 相似度 ·

2024 年 1 月 29 日

A Generalized Neural Diffusion Framework on Graphs

Yibo Li,Xiao Wang,Hongrui Liu,Chuan Shi

from arxiv, Accepted by AAAI 2024

Recent studies reveal the connection between GNNs and the diffusion process, which motivates many diffusion-based GNNs to be proposed. However, since these two mechanisms are closely related, one fundamental question naturally arises: Is there a general diffusion framework that can formally unify these GNNs? The answer to this question can not only deepen our understanding of the learning process of GNNs, but also may open a new door to design a broad new class of GNNs. In this paper, we propose a general diffusion equation framework with the fidelity term, which formally establishes the relationship between the diffusion process with more GNNs. Meanwhile, with this framework, we identify one characteristic of graph diffusion networks, i.e., the current neural diffusion process only corresponds to the first-order diffusion equation. However, by an experimental investigation, we show that the labels of high-order neighbors actually exhibit monophily property, which induces the similarity based on labels among high-order neighbors without requiring the similarity among first-order neighbors. This discovery motives to design a new high-order neighbor-aware diffusion equation, and derive a new type of graph diffusion network (HiD-Net) based on the framework. With the high-order diffusion equation, HiD-Net is more robust against attacks and works on both homophily and heterophily graphs. We not only theoretically analyze the relation between HiD-Net with high-order random walk, but also provide a theoretical convergence guarantee. Extensive experimental results well demonstrate the effectiveness of HiD-Net over state-of-the-art graph diffusion networks.

知識 (knowledge) · 語言模型化 · 大語言模型 · 可約的 · MoDELS ·

2024 年 1 月 28 日

Mitigating Hallucinations of Large Language Models via Knowledge Consistent Alignment

Fanqi Wan,Xinting Huang,Leyang Cui,Xiaojun Quan,Wei Bi,Shuming Shi

from arxiv, Work in progress

While Large Language Models (LLMs) have proven to be exceptional on a variety of tasks after alignment, they may still produce responses that contradict the context or world knowledge confidently, a phenomenon known as ``hallucination''. In this paper, we demonstrate that reducing the inconsistency between the external knowledge encapsulated in the training data and the intrinsic knowledge inherited in the pretraining corpus could mitigate hallucination in alignment. Specifically, we introduce a novel knowledge consistent alignment (KCA) approach, which involves automatically formulating examinations based on external knowledge for accessing the comprehension of LLMs. For data encompassing knowledge inconsistency, KCA implements several simple yet efficient strategies for processing. We illustrate the superior performance of the proposed KCA approach in mitigating hallucinations across six benchmarks using LLMs of different backbones and scales. Furthermore, we confirm the correlation between knowledge inconsistency and hallucination, signifying the effectiveness of reducing knowledge inconsistency in alleviating hallucinations. Our code, model weights, and data are public at \url{//github.com/fanqiwan/KCA}.

entity · 圖卷積網絡 · Attention · 知識 (knowledge) · 圖 ·

2024 年 1 月 27 日

Knowledge Graph Reasoning Based on Attention GCN

Meera Gupta,Ravi Khanna,Divya Choudhary,Nandini Rao

We propose a novel technique to enhance Knowledge Graph Reasoning by combining Graph Convolution Neural Network (GCN) with the Attention Mechanism. This approach utilizes the Attention Mechanism to examine the relationships between entities and their neighboring nodes, which helps to develop detailed feature vectors for each entity. The GCN uses shared parameters to effectively represent the characteristics of adjacent entities. We first learn the similarity of entities for node representation learning. By integrating the attributes of the entities and their interactions, this method generates extensive implicit feature vectors for each entity, improving performance in tasks including entity classification and link prediction, outperforming traditional neural network models. To conclude, this work provides crucial methodological support for a range of applications, such as search engines, question-answering systems, recommendation systems, and data integration tasks.

穩健性 · binary · 代碼 · 變換 · 講稿 ·

2024 年 1 月 27 日

Improved Construction of Robust Gray Code

Dorsa Fathollahi,Mary Wootters

A robust Gray code, formally introduced by (Lolck and Pagh, SODA 2024), is a Gray code that additionally has the property that, given a noisy version of the encoding of an integer $j$, it is possible to reconstruct $\hat{j}$ so that $|j - \hat{j}|$ is small with high probability. That work presented a transformation that transforms a binary code $C$ of rate $R$ to a robust Gray code with rate $\Omega(R)$, where the constant in the $\Omega(\cdot)$ can be at most $1/4$. We improve upon their construction by presenting a transformation from a (linear) binary code $C$ to a robust Gray code with similar robustness guarantees, but with rate that can approach $R/2$.

向量化 · 相似度 · Processing（編程語言） · Storage · 優化器 ·

2023 年 10 月 21 日

Survey of Vector Database Management Systems

James Jie Pan,Jianguo Wang,Guoliang Li

from arxiv, 25 pages

There are now over 20 commercial vector database management systems (VDBMSs), all produced within the past five years. But embedding-based retrieval has been studied for over ten years, and similarity search a staggering half century and more. Driving this shift from algorithms to systems are new data intensive applications, notably large language models, that demand vast stores of unstructured data coupled with reliable, secure, fast, and scalable query processing capability. A variety of new data management techniques now exist for addressing these needs, however there is no comprehensive survey to thoroughly review these techniques and systems. We start by identifying five main obstacles to vector data management, namely vagueness of semantic similarity, large size of vectors, high cost of similarity comparison, lack of natural partitioning that can be used for indexing, and difficulty of efficiently answering hybrid queries that require both attributes and vectors. Overcoming these obstacles has led to new approaches to query processing, storage and indexing, and query optimization and execution. For query processing, a variety of similarity scores and query types are now well understood; for storage and indexing, techniques include vector compression, namely quantization, and partitioning based on randomization, learning partitioning, and navigable partitioning; for query optimization and execution, we describe new operators for hybrid queries, as well as techniques for plan enumeration, plan selection, and hardware accelerated execution. These techniques lead to a variety of VDBMSs across a spectrum of design and runtime characteristics, including native systems specialized for vectors and extended systems that incorporate vector capabilities into existing systems. We then discuss benchmarks, and finally we outline research challenges and point the direction for future work.

MoDELS · 優化器 · Analysis · 推斷 · 估計/估計量 ·

2022 年 9 月 19 日

A Survey of Deep Causal Model

Zongyu Li,Zhenfeng Zhu

The concept of causality plays an important role in human cognition . In the past few decades, causal inference has been well developed in many fields, such as computer science, medicine, economics, and education. With the advancement of deep learning techniques, it has been increasingly used in causal inference against counterfactual data. Typically, deep causal models map the characteristics of covariates to a representation space and then design various objective optimization functions to estimate counterfactual data unbiasedly based on the different optimization methods. This paper focuses on the survey of the deep causal models, and its core contributions are as follows: 1) we provide relevant metrics under multiple treatments and continuous-dose treatment; 2) we incorporate a comprehensive overview of deep causal models from both temporal development and method classification perspectives; 3) we assist a detailed and comprehensive classification and analysis of relevant datasets and source code.