While modern society benefits from a range of technological advancements, it also is exposed to an ever-increasing set of cybersecurity threats. These affect all areas of life including business, government, and individuals. To complement technology solutions to this problem, it is crucial to understand more about cybercriminal perpetrators themselves, their use of technology, psychological aspects, and profiles. This is a topic that has received little socio-technical research emphasis in the technology community, has few concrete research findings, and is thus a prime area for development. The aim of this article is to explore cybercriminal activities and behavior from a psychology and human aspects perspective, through a series of notable case studies. We examine motivations, psychological and other interdisciplinary concepts as they may impact/influence cybercriminal activities. We expect this paper to be of value and particularly insightful for those studying technology, psychology, and criminology, with a focus on cybersecurity and cybercrime.
Penetration testing, an essential component of cybersecurity, allows organizations to proactively identify and remediate vulnerabilities in their systems, thus bolstering their defense mechanisms against potential cyberattacks. One recent advancement in the realm of penetration testing is the utilization of Language Models (LLMs). We explore the intersection of LLMs and penetration testing to gain insight into their capabilities and challenges in the context of privilige escalation. We create an automated Linux privilege-escalation benchmark utilizing local virtual machines. We introduce an LLM-guided privilege-escalation tool designed for evaluating different LLMs and prompt strategies against our benchmark. We analyze the impact of different prompt designs, the benefits of in-context learning, and the advantages of offering high-level guidance to LLMs. We discuss challenging areas for LLMs, including maintaining focus during testing, coping with errors, and finally comparing them with both stochastic parrots as well as with human hackers.
We investigate auction mechanisms to support the emerging format of AI-generated content. We in particular study how to aggregate several LLMs in an incentive compatible manner. In this problem, the preferences of each agent over stochastically generated contents are described/encoded as an LLM. A key motivation is to design an auction format for AI-generated ad creatives to combine inputs from different advertisers. We argue that this problem, while generally falling under the umbrella of mechanism design, has several unique features. We propose a general formalism -- the token auction model -- for studying this problem. A key feature of this model is that it acts on a token-by-token basis and lets LLM agents influence generated contents through single dimensional bids. We first explore a robust auction design approach, in which all we assume is that agent preferences entail partial orders over outcome distributions. We formulate two natural incentive properties, and show that these are equivalent to a monotonicity condition on distribution aggregation. We also show that for such aggregation functions, it is possible to design a second-price auction, despite the absence of bidder valuation functions. We then move to designing concrete aggregation functions by focusing on specific valuation forms based on KL-divergence, a commonly used loss function in LLM. The welfare-maximizing aggregation rules turn out to be the weighted (log-space) convex combination of the target distributions from all participants. We conclude with experimental results in support of the token auction formulation.
We study non-monetary mechanisms for the fair and efficient allocation of reusable public resources, i.e., resources used for varying durations. We consider settings where a limited resource is repeatedly shared among a set of agents, each of whom may request to use the resource over multiple consecutive rounds, receiving utility only if they get to use the resource for the full duration of their request. Such settings are of particular significance in scientific research where large-scale instruments such as electron microscopes, particle colliders, or telescopes are shared between multiple research groups; this model also subsumes and extends existing models of repeated non-monetary allocation where resources are required for a single round only. We study a simple pseudo-market mechanism where upfront we endow each agent with a budget of artificial credits, proportional to the fair share of the resource we want the agent to receive. The endowments thus define for each agent her ideal utility as that which she derives from her favorite allocation with no competition, but subject to getting at most her fair share of the resource across rounds. Next, on each round, and for each available resource item, our mechanism runs a first-price auction with a selective reserve, wherein each agent submits a desired duration and a per-round-bid, which must be at least the reserve price if requesting for multiple rounds; the bidder with the highest per-round-bid wins, and gets to use the item for the desired duration. We consider this problem in a Bayesian setting and show that under a carefully chosen reserve price, irrespective of how others bid, each agent has a simple strategy that guarantees she receives a $1/2$ fraction of her ideal utility in expectation. We also show this result is tight, i.e., no mechanism can guarantee that all agents get more than half of their ideal utility.
In recent years, "pre-training and fine-tuning" has emerged as a promising approach in addressing the issues of label dependency and poor generalization performance in traditional GNNs. To reduce labeling requirement, the "pre-train, fine-tune" and "pre-train, prompt" paradigms have become increasingly common. In particular, prompt tuning is a popular alternative to "pre-training and fine-tuning" in natural language processing, which is designed to narrow the gap between pre-training and downstream objectives. However, existing study of prompting on graphs is still limited, lacking a framework that can accommodate commonly used graph pre-training methods and downstream tasks. In this paper, we propose a multi-view graph contrastive learning method as pretext and design a prompting tuning for it. Specifically, we first reformulate graph pre-training and downstream tasks into a common format. Second, we construct multi-view contrasts to capture relevant information of graphs by GNN. Third, we design a prompting tuning method for our multi-view graph contrastive learning method to bridge the gap between pretexts and downsteam tasks. Finally, we conduct extensive experiments on benchmark datasets to evaluate and analyze our proposed method.
As the complexity and scale of modern computer networks continue to increase, there has emerged an urgent need for precise traffic analysis, which plays a pivotal role in cutting-edge wireless connectivity technologies. This study focuses on leveraging Machine Learning methodologies to create an advanced network traffic classification system. We introduce a novel data-driven approach that excels in identifying various network service types in real-time, by analyzing patterns within the network traffic. Our method organizes similar kinds of network traffic into distinct categories, referred to as network services, based on latency requirement. Furthermore, it decomposes the network traffic stream into multiple, smaller traffic flows, with each flow uniquely carrying a specific service. Our ML models are trained on a dataset comprised of labeled examples representing different network service types collected on various Wi-Fi network conditions. Upon evaluation, our system demonstrates a remarkable accuracy in distinguishing the network services. These results emphasize the substantial promise of integrating Artificial Intelligence in wireless technologies. Such an approach encourages more efficient energy consumption, enhances Quality of Service assurance, and optimizes the allocation of network resources, thus laying a solid groundwork for the development of advanced intelligent networks.
We focus on learning adversarially robust classifiers under a cost-sensitive scenario, where the potential harm of different classwise adversarial transformations is encoded in a binary cost matrix. Existing methods are either empirical that cannot certify robustness or suffer from inherent scalability issues. In this work, we study whether randomized smoothing, a more scalable robustness certification framework, can be leveraged to certify cost-sensitive robustness. Built upon a notion of cost-sensitive certified radius, we show how to adapt the standard randomized smoothing certification pipeline to produce tight robustness guarantees for any cost matrix. In addition, with fine-grained certified radius optimization schemes specifically designed for different data subgroups, we propose an algorithm to train smoothed classifiers that are optimized for cost-sensitive robustness. Extensive experiments on image benchmarks and a real-world medical dataset demonstrate the superiority of our method in achieving significantly improved performance of certified cost-sensitive robustness while having a negligible impact on overall accuracy.
The study of network robustness is a critical tool in the characterization and sense making of complex interconnected systems such as infrastructure, communication and social networks. While significant research has been conducted in all of these areas, gaps in the surveying literature still exist. Answers to key questions are currently scattered across multiple scientific fields and numerous papers. In this survey, we distill key findings across numerous domains and provide researchers crucial access to important information by--(1) summarizing and comparing recent and classical graph robustness measures; (2) exploring which robustness measures are most applicable to different categories of networks (e.g., social, infrastructure; (3) reviewing common network attack strategies, and summarizing which attacks are most effective across different network topologies; and (4) extensive discussion on selecting defense techniques to mitigate attacks across a variety of networks. This survey guides researchers and practitioners in navigating the expansive field of network robustness, while summarizing answers to key questions. We conclude by highlighting current research directions and open problems.
Modern neural network training relies heavily on data augmentation for improved generalization. After the initial success of label-preserving augmentations, there has been a recent surge of interest in label-perturbing approaches, which combine features and labels across training samples to smooth the learned decision surface. In this paper, we propose a new augmentation method that leverages the first and second moments extracted and re-injected by feature normalization. We replace the moments of the learned features of one training image by those of another, and also interpolate the target labels. As our approach is fast, operates entirely in feature space, and mixes different signals than prior methods, one can effectively combine it with existing augmentation methods. We demonstrate its efficacy across benchmark data sets in computer vision, speech, and natural language processing, where it consistently improves the generalization performance of highly competitive baseline networks.
Graphical causal inference as pioneered by Judea Pearl arose from research on artificial intelligence (AI), and for a long time had little connection to the field of machine learning. This article discusses where links have been and should be established, introducing key concepts along the way. It argues that the hard open problems of machine learning and AI are intrinsically related to causality, and explains how the field is beginning to understand them.
Benefit from the quick development of deep learning techniques, salient object detection has achieved remarkable progresses recently. However, there still exists following two major challenges that hinder its application in embedded devices, low resolution output and heavy model weight. To this end, this paper presents an accurate yet compact deep network for efficient salient object detection. More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while keep accuracy. Secondly, we further propose reverse attention to guide such side-output residual learning in a top-down manner. By erasing the current predicted salient regions from side-output features, the network can eventually explore the missing object parts and details which results in high resolution and accuracy. Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, and with advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).