Mobile app usage is ubiquitous in this digital era. With vast amounts of data generated daily, user privacy and security have become an important concern. Many techniques, such as machine learning and deep learning traffic classifiers, have been applied to analyze users' app traffic. These techniques allow a monitor to fingerprint the apps in use even while the user's traffic is encrypted, which raises a severe privacy issue. To counter this type of analysis, researchers have studied obfuscation algorithms that confuse feature-based machine learning classifiers by camouflaging traffic through modification of the packet length distribution. Existing works achieve this by remapping the packet length distribution of the source app to that of a fake camouflage app. However, this solution lacks scalability and flexibility in practical applications, since it requires pre-sampling the target fake app's traffic before camouflage can be used. In this paper, we propose a practical solution that uses a mathematical model to calculate the target distribution, while causing an accuracy drop of at least 50 percent in the AppScanner mobile traffic classifier and incurring roughly 20 percent overhead from packet modification.

Related content

Attack · Fragment · Differential · Software · Analysis

March 30, 2023

Differential Area Analysis for Ransomware: Attacks, Countermeasures, and Limitations

Marco Venturini,Francesco Freda,Emanuele Miotto,Alberto Giaretta,Mauro Conti

from arxiv, 14 pages, 12 figures, journal article

Crypto-ransomware attacks have been a growing threat over the last few years. The goal of every ransomware strain is to encrypt user data, so that attackers can later demand a ransom from users for unlocking their data. To maximise their chances of earning, attackers equip their ransomware with strong encryption, which produces files with high entropy values. Davies et al. proposed Differential Area Analysis (DAA), a technique that analyses file headers to differentiate compressed, regularly encrypted, and ransomware-encrypted files. In this paper, we first propose three different attacks that perform malicious header manipulation to bypass DAA detection. Then, we propose three countermeasures, namely 2-Fragments (2F), 3-Fragments (3F), and 4-Fragments (4F), which can be applied equally against each of the three attacks we propose. We conduct a number of experiments to analyse the ability of our countermeasures to detect ransomware-encrypted files, whether or not they implement our proposed attacks. Last, we test the robustness of our own countermeasures by analysing their performance, in terms of files analysed per second and resilience to extensive injection of low-entropy data. Our results show that our detection countermeasures are viable and deployable alternatives to DAA.
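The abstract above relies on the observation that strongly encrypted files have high entropy. As a rough illustration only, the sketch below computes the Shannon entropy of a byte string in bits per byte. It is not the DAA technique or any of the 2F/3F/4F countermeasures; the function name `shannon_entropy` and the sample inputs are assumptions made for this example.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    # H = -sum(p * log2(p)) over the observed byte values.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical inputs: a constant run, a perfectly uniform byte mix,
# and random bytes standing in for encrypted content.
low = shannon_entropy(b"A" * 4096)
uniform = shannon_entropy(bytes(range(256)) * 16)
random_like = shannon_entropy(os.urandom(4096))
```

A constant run scores zero, while random bytes score close to the 8 bits-per-byte maximum. This is why injecting low-entropy data, as the abstract mentions, can be used to lower a file's measured entropy.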

Differential · Process mining · Differential privacy · Adversarial networks · Generative adversarial networks

March 29, 2023

TraVaG: Differentially Private Trace Variant Generation Using GANs

Majid Rafiei,Frederik Wangelik,Mahsa Pourbafrani,Wil M. P. van der Aalst

Process mining is rapidly growing in the industry. Consequently, privacy concerns regarding sensitive and private information included in event data, used by process mining algorithms, are becoming increasingly relevant. State-of-the-art research mainly focuses on providing privacy guarantees, e.g., differential privacy, for trace variants that are used by the main process mining techniques, e.g., process discovery. However, privacy preservation techniques for releasing trace variants still do not fulfill all the requirements of industry-scale usage. Moreover, providing privacy guarantees when there exists a high rate of infrequent trace variants is still a challenge. In this paper, we introduce TraVaG as a new approach for releasing differentially private trace variants based on Generative Adversarial Networks (GANs) that provides industry-scale benefits and enhances the level of privacy guarantees when there exists a high ratio of infrequent variants. Moreover, TraVaG overcomes shortcomings of conventional privacy preservation techniques such as bounding the length of variants and introducing fake variants. Experimental results on real-life event data show that our approach outperforms state-of-the-art techniques in terms of privacy guarantees, plain data utility preservation, and result utility preservation.

Deep learning · Tools · Color images · Network models · Deep learning frameworks

March 29, 2023

Development of a deep learning-based tool to assist wound classification

Po-Hsuan Huang,Yi-Hsiang Pan,Ying-Sheng Luo,Yi-Fan Chen,Yu-Cheng Lo,Trista Pei-Chun Chen,Cherng-Kang Perng

This paper presents a deep learning-based wound classification tool that can assist medical personnel who do not specialize in wound care in classifying five key wound conditions, namely deep wound, infected wound, arterial wound, venous wound, and pressure wound, given color images captured using readily available cameras. The accuracy of the classification is vital for appropriate wound management. The proposed wound classification method adopts a multi-task deep learning framework that leverages the relationships among the five key wound conditions for a unified wound classification architecture. With differences in Cohen's kappa coefficients as the metrics to compare our proposed model with humans, the performance of our model was better than or non-inferior to that of all human medical personnel. Our convolutional neural network-based model is the first to classify five tasks of deep, infected, arterial, venous, and pressure wounds simultaneously with good accuracy. The proposed model is compact and matches or exceeds the performance of human doctors and nurses. Medical personnel who do not specialize in wound care can potentially benefit from an app equipped with the proposed deep learning model.
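The comparison above is stated in terms of Cohen's kappa, which measures agreement between two raters after correcting for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a minimal, generic implementation of that standard formula. The function name `cohens_kappa` and the toy labels are assumptions for illustration and do not come from the paper.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(rater_a)
    # Observed agreement: fraction of items on which the raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical binary labels (1 = condition present) for four images.
reference = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
kappa = cohens_kappa(reference, predicted)  # p_o = 0.75, p_e = 0.5
```

Under a kappa-based comparison, a model and a human rater can each be scored against the same reference labels, and the difference between the two kappa values indicates which agrees better with the reference.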

Text sentiment classification · Preprocessing · Web text · Sentiment classification · Support vector machine

March 28, 2023

An Experimental Study on Sentiment Classification of Moroccan dialect texts in the web

Mouad Jbel,Imad Hafidi,Abdulmutallib Metrane

from arxiv, 13 pages, 5 tables, 2 figures

With the rapid growth in the use of social media websites, obtaining users' feedback automatically has become a crucial task for evaluating their tendencies and behaviors online. Despite this great availability of information and the increasing number of Arabic users, only a few studies have addressed Arabic dialects. The purpose of this paper is to study the opinions and emotions expressed in real Moroccan texts, specifically in YouTube comments, using some well-known and commonly used methods for sentiment analysis. In this paper, we present our work on classifying Moroccan dialect comments using Machine Learning (ML) models, based on our collected and manually annotated YouTube Moroccan dialect dataset. By employing many text preprocessing and data representation techniques, we aim to compare our classification results using the most commonly used supervised classifiers: k-nearest neighbors (KNN), Support Vector Machine (SVM), Naive Bayes (NB), and deep learning (DL) classifiers such as Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM). Experiments were performed using both raw and preprocessed data to show the importance of preprocessing. The experimental results show that DL models perform better on Moroccan dialect than classical approaches, and we achieved an accuracy of 90%.

Computer networks · Separated · Network activity · CVPR 2022 · Link prediction

March 28, 2023

A source separation approach to temporal graph modelling for computer networks

Corentin Larroche

Detecting malicious activity within an enterprise computer network can be framed as a temporal link prediction task: given a sequence of graphs representing communications between hosts over time, the goal is to predict which edges should--or should not--occur in the future. However, standard temporal link prediction algorithms are ill-suited for computer network monitoring as they do not take account of the peculiar short-term dynamics of computer network activity, which exhibits sharp seasonal variations. In order to build a better model, we propose a source separation-inspired description of computer network activity: at each time step, the observed graph is a mixture of subgraphs representing various sources of activity, and short-term dynamics result from changes in the mixing coefficients. Both qualitative and quantitative experiments demonstrate the validity of our approach.

Mobile games · Real-time bidding (RTB) · Demand-side platform (DSP) · Anonymization · User choice

March 28, 2023

Towards a User Privacy-Aware Mobile Gaming App Installation Prediction Model

Ido Zehori,Nevo Itzhak,Yuval Shahar,Mia Dor Schiller

from arxiv, 11 pages, 3 figures

Over the past decade, programmatic advertising has received a great deal of attention in the online advertising industry. A real-time bidding (RTB) system is rapidly becoming the most popular method to buy and sell online advertising impressions. Within the RTB system, demand-side platforms (DSP) aim to spend advertisers' campaign budgets efficiently while maximizing profit, seeking impressions that result in high user responses, such as clicks or installs. In the current study, we investigate the process of predicting a mobile gaming app installation from the point of view of a particular DSP, while paying attention to user privacy, and exploring the trade-off between privacy preservation and model performance. There are multiple levels of potential threats to user privacy, depending on the privacy leaks associated with the data-sharing process, such as data transformation or de-anonymization. To address these concerns, privacy-preserving techniques, such as cryptographic approaches, were proposed for training privacy-aware machine-learning models. However, the ability to train a mobile gaming app installation prediction model without using user-level data can prevent these threats and protect the users' privacy, even though the model's ability to predict may be impaired. Additionally, current laws might force companies to declare that they are collecting data, and might even give the user the option to opt out of such data collection, which might threaten companies' business models in digital advertising, which depend on the collection and use of user-level data. We conclude that privacy-aware models might still preserve significant capabilities, enabling companies to make better decisions, depending on the privacy-efficacy trade-off utility function of each case.

Graphics processing unit · Networking · Neural Networks · Learning · Graph

February 17, 2022

Learning and Evaluating Graph Neural Network Explanations based on Counterfactual and Factual Reasoning

Juntao Tan,Shijie Geng,Zuohui Fu,Yingqiang Ge,Shuyuan Xu,Yunqi Li,Yongfeng Zhang

from arxiv, To be published at the Web Conference 2022 (WWW 2022)

Structural data is widespread in Web applications, such as social networks in social media, citation networks in academic websites, and thread data in online forums. Due to the complex topology, it is difficult to process and make use of the rich information within such data. Graph Neural Networks (GNNs) have shown great advantages in learning representations for structural data. However, the non-transparency of deep learning models makes it non-trivial to explain and interpret the predictions made by GNNs. Meanwhile, it is also a big challenge to evaluate GNN explanations, since in many cases the ground-truth explanations are unavailable. In this paper, we draw on insights from Counterfactual and Factual (CF^2) reasoning in causal inference theory to solve both the learning and evaluation problems in explainable GNNs. For generating explanations, we propose a model-agnostic framework by formulating an optimization problem based on both of the two causal perspectives. This distinguishes CF^2 from previous explainable GNNs that only consider one of them. Another contribution of the work is the evaluation of GNN explanations. For quantitatively evaluating the generated explanations without the requirement of ground-truth, we design metrics based on Counterfactual and Factual reasoning to evaluate the necessity and sufficiency of the explanations. Experiments show that, whether or not ground-truth explanations are available, CF^2 generates better explanations than previous state-of-the-art methods on real-world datasets. Moreover, the statistical analysis justifies the correlation between the performance on ground-truth evaluation and our proposed metrics.

Loss function (machine learning) · Learning to learn · Learning · entity · Functional

September 9, 2019

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Jiawei Wu,Wenhan Xiong,William Yang Wang

from arxiv, 11 pages, 5 figures, accepted to EMNLP 2019

Many tasks in natural language processing can be viewed as multi-label classification problems. However, most of the existing models are trained with the standard cross-entropy loss function and use a fixed prediction policy (e.g., a threshold of 0.5) for all the labels, which completely ignores the complexity and dependencies among different labels. In this paper, we propose a meta-learning method to capture these complex label dependencies. More specifically, our method utilizes a meta-learner to jointly learn the training policies and prediction policies for different labels. The training policies are then used to train the classifier with the cross-entropy loss function, and the prediction policies are further implemented for prediction. Experimental results on fine-grained entity typing and text classification demonstrate that our proposed method can obtain more accurate multi-label classification results.
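To make the "fixed prediction policy" mentioned above concrete, the sketch below contrasts one global 0.5 threshold with per-label thresholds. It is only an illustration: the function `predict_labels` is a hypothetical helper, and the per-label thresholds are hand-picked values, not thresholds learned by the paper's meta-learner.

```python
def predict_labels(probs, thresholds=0.5):
    """Return the indices of labels whose probability reaches its threshold.

    `thresholds` is either one number applied to every label (the fixed
    policy) or a sequence holding one threshold per label.
    """
    if isinstance(thresholds, (int, float)):
        thresholds = [thresholds] * len(probs)
    if len(thresholds) != len(probs):
        raise ValueError("need exactly one threshold per label")
    return [i for i, (p, t) in enumerate(zip(probs, thresholds)) if p >= t]


# Hypothetical classifier outputs for three labels.
probs = [0.9, 0.45, 0.2]
fixed = predict_labels(probs)                       # same 0.5 cut-off everywhere
per_label = predict_labels(probs, [0.5, 0.4, 0.3])  # hand-picked thresholds
```

With the fixed policy only label 0 is predicted, while the lower threshold on label 1 also admits it. This is the kind of label-specific behaviour a single shared cut-off cannot express.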

Samples · Classes · Loss · Performer · SimPLe

January 16, 2019

Class-Balanced Loss Based on Effective Number of Samples

Yin Cui,Menglin Jia,Tsung-Yi Lin,Yang Song,Serge Belongie

from arxiv, Code is available at: //github.com/richardaecn/class-balanced-loss

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
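The formula in the abstract can be evaluated directly. The sketch below computes the effective number of samples $(1-\beta^{n})/(1-\beta)$ and derives per-class weights inversely proportional to it. Normalizing the weights to sum to the number of classes is an assumption made for this example, as are the function names and the class counts. The authors' repository linked above is the reference implementation.

```python
def effective_number(n: int, beta: float) -> float:
    """Effective number of samples: (1 - beta**n) / (1 - beta), beta in [0, 1)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(counts, beta):
    """Per-class weights inversely proportional to the effective number.

    Assumes every count is positive. The weights are rescaled to sum to the
    number of classes (a normalization chosen for this sketch).
    """
    raw = [1.0 / effective_number(n, beta) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


# Hypothetical long-tailed class counts.
counts = [5000, 500, 50]
weights = class_balanced_weights(counts, beta=0.999)
```

With $\beta = 0$ every class has an effective number of 1, so all weights are equal. As $\beta$ approaches 1 the effective number approaches $n$, and the weights approach inverse class frequency.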

Image classification · Generative adversarial networks · Networking · Unlabeled · GANs

February 10, 2018

Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

Zilong Zhong,Jonathan Li

from arxiv, Accepted by AAAI-18

High spectral dimensionality and the shortage of annotations make hyperspectral image (HSI) classification a challenging problem. Recent studies suggest that convolutional neural networks can learn discriminative spatial features, which play a paramount role in HSI interpretation. However, most of these methods ignore the distinctive spectral-spatial characteristics of hyperspectral data. In addition, a large amount of unlabeled data remains an unexploited gold mine for efficient data use. Therefore, we proposed an integration of generative adversarial networks (GANs) and probabilistic graphical models for HSI classification. Specifically, we used a spectral-spatial generator and a discriminator to identify land cover categories of hyperspectral cubes. Moreover, to take advantage of a large amount of unlabeled data, we adopted a conditional random field to refine the preliminary classification results generated by GANs. Experimental results obtained using two commonly studied datasets demonstrate that the proposed framework achieved encouraging classification accuracy using a small amount of training data.