Compromising legitimate accounts is a way of disseminating malicious content to a large user base in Online Social Networks (OSNs). Since compromised accounts cause considerable damage to their owners and, consequently, to other users on OSNs, early detection is very important. This paper proposes a novel approach based on authorship verification to identify compromised Twitter accounts. Because the approach uses only features extracted from the user's last post, it supports early detection and helps contain the damage. As a result, a malicious message can be detected with satisfactory accuracy even without a user profile. Experiments were conducted using a real-world dataset of compromised accounts on Twitter. The results showed that the model is suitable for detection, achieving an accuracy of 89%.
Processing In Memory (PIM) accelerators are promising architectures that can provide massive parallelization and high efficiency in various applications. Such architectures can provide ultra-fast operation over extensive data, allowing real-time performance in data-intensive workloads. For instance, Resistive Memory (ReRAM) based PIM architectures are widely known for their inherent dot-product computation capability. While the performance of such architectures is essential, reliability and accuracy are also important, especially in mission-critical real-time systems. Unfortunately, PIM architectures have a fundamental limitation in guaranteeing error-free operation. As a result, current methods must pay high implementation costs or performance penalties to achieve reliable execution in the PIM accelerator. In this paper, we make a fundamental observation about this reliability limitation of ReRAM-based PIM architectures. Accordingly, we propose a novel solution, Fault Tolerant PIM (FAT-PIM), that can improve reliability for such systems significantly at a low cost. Our evaluation shows that we can improve the error tolerance significantly with only 4.9% performance cost and 3.9% storage overhead.
Subsequence anomaly detection in long sequences is an important problem with applications in a wide range of domains. However, the approaches proposed so far in the literature have severe limitations: they either require prior domain knowledge used to design the anomaly discovery algorithms, or become cumbersome and expensive to use in situations with recurrent anomalies of the same type. In this work, we address these problems, and propose an unsupervised method suitable for domain agnostic subsequence anomaly detection. Our method, Series2Graph, is based on a graph representation of a novel low-dimensionality embedding of subsequences. Series2Graph needs neither labeled instances (like supervised techniques) nor anomaly-free data (like zero-positive learning techniques), and identifies anomalies of varying lengths. The experimental results, on the largest set of synthetic and real datasets used to date, demonstrate that the proposed approach correctly identifies single and recurrent anomalies without any prior knowledge of their characteristics, outperforming by a large margin several competing approaches in accuracy, while being up to orders of magnitude faster. This paper has appeared in VLDB 2020.
Since the inception of Bitcoin in 2009, the market of cryptocurrencies has grown beyond initial expectations, with daily trades exceeding $10 billion. As industries become automated, the need for an automated fraud detector becomes apparent. Detecting anomalies in real time prevents potential accidents and economic losses. Anomaly detection in multivariate time series data poses a particular challenge because it requires simultaneous consideration of temporal dependencies and relationships between variables. Identifying an anomaly in real time is not an easy task, specifically because anomalies differ in the exact anomalous behavior they exhibit. Some points may present pointwise global or local anomalous behavior, while others may be anomalous due to their frequency or seasonal behavior or due to a change in the trend. In this paper, we work on real time series of Ethereum trades from specific accounts and survey a large variety of algorithms, both traditional and new. We categorize them according to their strategy and the anomalous behavior they search for, and show that, when bundled together into different groups, they can serve as a good real-time detector with an alarm time of no longer than a few seconds and with very high confidence.
Many internet platforms that collect behavioral big data use it to predict user behavior for internal purposes and for their business customers (e.g., advertisers, insurers, security forces, governments, political consulting firms) who utilize the predictions for personalization, targeting, and other decision-making. Improving predictive accuracy is therefore extremely valuable. Data science researchers design algorithms, models, and approaches to improve prediction. Prediction is also improved with larger and richer data. Beyond improving algorithms and data, platforms can stealthily achieve better prediction accuracy by pushing users' behaviors towards their predicted values, using behavior modification techniques, thereby demonstrating more certain predictions. Such apparent "improved" prediction can result from employing reinforcement learning algorithms that combine prediction and behavior modification. This strategy is absent from the machine learning and statistics literature. Investigating its properties requires integrating causal with predictive notation. To this end, we incorporate Pearl's causal do(.) operator into the predictive vocabulary. We then decompose the expected prediction error given behavior modification, and identify the components impacting predictive power. Our derivation elucidates implications of such behavior modification to data scientists, platforms, their customers, and the humans whose behavior is manipulated. Behavior modification can make users' behavior more predictable and even more homogeneous; yet this apparent predictability might not generalize when business customers use predictions in practice. Outcomes pushed towards their predictions can be at odds with customers' intentions, and harmful to manipulated users.
This paper presents the outcomes of a contest organized to evaluate methods for the online recognition of heterogeneous gestures from sequences of 3D hand poses. The task is the detection of gestures belonging to a dictionary of 16 classes characterized by different pose and motion features. The dataset features continuous sequences of hand tracking data where the gestures are interleaved with non-significant motions. The data have been captured using the HoloLens 2 finger tracking system in a realistic use case of mixed reality interaction. The evaluation is based not only on detection performance but also on latency and false positives, making it possible to understand the feasibility of practical interaction tools based on the proposed algorithms. The outcomes of the contest's evaluation demonstrate the necessity of further research to reduce recognition errors, while the computational cost of the proposed algorithms is sufficiently low.
Video object detection has been an important yet challenging topic in computer vision. Traditional methods mainly focus on designing the image-level or box-level feature propagation strategies to exploit temporal information. This paper argues that with a more effective and efficient feature propagation framework, video object detectors can gain improvement in terms of both accuracy and speed. For this purpose, this paper studies object-level feature propagation, and proposes an object query propagation (QueryProp) framework for high-performance video object detection. The proposed QueryProp contains two propagation strategies: 1) query propagation is performed from sparse key frames to dense non-key frames to reduce the redundant computation on non-key frames; 2) query propagation is performed from previous key frames to the current key frame to improve feature representation by temporal context modeling. To further facilitate query propagation, an adaptive propagation gate is designed to achieve flexible key frame selection. We conduct extensive experiments on the ImageNet VID dataset. QueryProp achieves comparable accuracy with state-of-the-art methods and strikes a decent accuracy/speed trade-off. Code is available at //github.com/hf1995/QueryProp.
In recent years, the number of edge computing devices and of artificial intelligence applications running on them has grown rapidly. In edge computing, decision-making processes and computations are moved from servers to edge devices. Hence, cheap and low-power devices are required. FPGAs consume very little power, are well suited to parallel operations, and are highly suitable for running Convolutional Neural Networks (CNNs), which are a fundamental building block of artificial intelligence applications. Face detection in surveillance systems is the most anticipated application in the security market. In this work, the TinyYolov3 architecture is redesigned and deployed for face detection. It is a CNN-based object detection method developed for embedded systems. The PYNQ-Z2, which carries a low-end Xilinx Zynq 7020 System-on-Chip (SoC), is selected as the target board. The redesigned TinyYolov3 model is defined at several bit-width precisions with the Brevitas library, which provides fundamental CNN layers and activations in integer-quantized form. Then, the model is trained in quantized form on the WiderFace dataset. To decrease latency and power consumption, the on-chip memory of the FPGA is configured to store all network parameters, and the last activation function is changed from Sigmoid to a rescaled HardTanh. Also, a high degree of parallelism is applied to the logic resources of the FPGA. The model is converted to an HLS-based application using the FINN framework and the FINN-HLS library, which includes the layer definitions in C++. Later, the model is synthesized and deployed. The CPU of the SoC uses multithreading and is responsible for preprocessing, postprocessing and TCP/IP streaming operations. Consequently, a total board power consumption of 2.4 W, a throughput of 18 Frames-Per-Second (FPS), and an accuracy of 0.757 mAP on the Easy category of WiderFace are achieved with the 4-bit precision model.
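The replacement of the final Sigmoid by a rescaled HardTanh can be illustrated with a minimal sketch. The abstract does not give the exact rescaling, so the mapping below (clamp to [-1, 1], then shift and scale onto [0, 1]) is an assumption, and the function names are hypothetical. The point is only that a clamp plus an affine map avoids the exponential that Sigmoid requires, which is typically cheaper in fixed-point hardware.

```python
import math


def sigmoid(x: float) -> float:
    """Reference logistic sigmoid; output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def hardtanh(x: float, lo: float = -1.0, hi: float = 1.0) -> float:
    """Piecewise-linear clamp of x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def rescaled_hardtanh(x: float) -> float:
    """ASSUMED rescaling: map HardTanh's [-1, 1] range onto [0, 1].

    Only a comparison (clamp), one addition and one multiplication
    are needed; no exponential is evaluated.
    """
    return 0.5 * (hardtanh(x) + 1.0)


if __name__ == "__main__":
    # Compare the two activations on a few sample inputs.
    for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
        print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  "
              f"rescaled_hardtanh={rescaled_hardtanh(x):.4f}")
```

Both functions are monotone, give 0.5 at zero and saturate towards 0 and 1, so a detection threshold applied to the output keeps its meaning, although the two curves differ in between.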
As the central nerve of the intelligent vehicle control system, the in-vehicle network bus is crucial to the security of vehicle driving. One of the best standards for the in-vehicle network is the Controller Area Network (CAN bus) protocol. However, because it was designed without security mechanisms, the CAN bus is vulnerable to various attacks. To enhance the security of in-vehicle networks and promote research in this area, this study uses a large volume of CAN network traffic data with extracted features to comprehensively compare fully-supervised and semi-supervised machine learning methods for CAN message anomaly detection. Both traditional machine learning models (including single classifiers and ensemble models) and neural network based deep learning models are evaluated. Furthermore, this study proposes a deep autoencoder based semi-supervised learning method for CAN message anomaly detection and verifies its superiority over other semi-supervised methods. Extensive experiments show that the fully-supervised methods generally outperform semi-supervised ones, as they use more information as input. In particular, the developed XGBoost based model obtained state-of-the-art performance with the best accuracy (98.65%), precision (0.9853), and ROC AUC (0.9585), beating other methods reported in the literature.
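The three reported metrics measure different things, which a small sketch can make concrete. The code below is not the study's evaluation pipeline; the function names and the toy labels and scores are hypothetical. Accuracy and precision are computed from hard 0/1 predictions, whereas ROC AUC is computed from the raw anomaly scores as the fraction of (anomalous, normal) pairs that the score ranks correctly, with ties counting as half.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of messages whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Of the messages flagged as anomalous (1), the fraction that truly are."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (rank-based) ROC AUC; quadratic in the number of samples."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 1 = anomalous CAN message, 0 = normal.
    y_true = [0, 0, 1, 1, 0, 1]
    scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # fixed threshold 0.5
    print(accuracy(y_true, y_pred), precision(y_true, y_pred),
          roc_auc(y_true, scores))
```

Because ROC AUC is threshold-free while accuracy and precision depend on the chosen threshold, a model can score high on one and lower on another, which is why the abstract reports all three.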
Link prediction on knowledge graphs (KGs) is a key research topic. Previous work mainly focused on binary relations, paying less attention to higher-arity relations although they are ubiquitous in real-world KGs. This paper considers link prediction upon n-ary relational facts and proposes a graph-based approach to this task. The key to our approach is to represent the n-ary structure of a fact as a small heterogeneous graph, and model this graph with edge-biased fully-connected attention. The fully-connected attention captures universal inter-vertex interactions, while the edge-aware attentive biases specifically encode the graph structure and its heterogeneity. In this fashion, our approach fully models global and local dependencies in each n-ary fact, and hence can more effectively capture associations therein. Extensive evaluation verifies the effectiveness and superiority of our approach. It performs substantially and consistently better than the current state of the art across a variety of n-ary relational benchmarks. Our code is publicly available.
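The idea of edge-biased fully-connected attention can be sketched as follows. This is a simplified illustration, not the paper's implementation: it assumes a single attention head, takes query, key and value vectors as given rather than learned, and models each edge type as one scalar bias added to the attention score, which is one plausible form of an "edge-aware attentive bias". The names `edge_biased_attention`, `edge_types` and `edge_bias` are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def softmax(xs: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def edge_biased_attention(
    queries: List[Vector],
    keys: List[Vector],
    values: List[Vector],
    edge_types: List[List[str]],
    edge_bias: Dict[str, float],
) -> List[Vector]:
    """Every vertex attends to every vertex (fully connected).

    The score for the pair (i, j) is the scaled dot product plus an
    ASSUMED scalar bias looked up from the type of edge i -> j, so the
    graph structure shifts the attention weights without masking any pair.
    """
    d = len(queries[0])
    outputs = []
    for i, q in enumerate(queries):
        scores = [
            dot(q, k) / math.sqrt(d) + edge_bias[edge_types[i][j]]
            for j, k in enumerate(keys)
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output component at a time.
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs
```

With all biases set to zero this reduces to ordinary fully-connected attention; a non-zero bias for a given edge type raises or lowers attention between vertices connected by that type, which is how local structure enters alongside the global interactions.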
Breast cancer remains a global challenge, causing over 1 million deaths globally in 2018. To achieve earlier breast cancer detection, screening x-ray mammography is recommended by health organizations worldwide and has been estimated to decrease breast cancer mortality by 20-40%. Nevertheless, significant false positive and false negative rates, as well as high interpretation costs, leave opportunities for improving quality and access. To address these limitations, there has been much recent interest in applying deep learning to mammography; however, obtaining large amounts of annotated data poses a challenge for training deep learning models for this purpose, as does ensuring generalization beyond the populations represented in the training dataset. Here, we present an annotation-efficient deep learning approach that 1) achieves state-of-the-art performance in mammogram classification, 2) successfully extends to digital breast tomosynthesis (DBT; "3D mammography"), 3) detects cancers in clinically-negative prior mammograms of cancer patients, 4) generalizes well to a population with low screening rates, and 5) outperforms five-out-of-five full-time breast imaging specialists by improving absolute sensitivity by an average of 14%. Our results demonstrate promise towards software that can improve the accuracy of and access to screening mammography worldwide.