Resource allocation in adversarial environments is a fundamental challenge across various domains, from corporate competition to military strategy. This article examines the impact of compromising an opponent's strategic intent in the context of General Lotto games, a class of resource allocation problems. We consider a scenario where one player, termed the "Breaker", has access to partial information about their opponent's strategy through a binary sensor. This sensor reveals whether the opponent's allocated resources exceed a certain threshold. Our analysis provides a comprehensive characterization of equilibrium strategies and payoffs for both players under this information structure. Through numerical studies, we demonstrate that the information provided by the sensor can significantly improve the Breaker's performance.
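To make the information structure concrete, here is a minimal formalization (the notation is ours, not necessarily the paper's): let $x$ denote the opponent's realized resource allocation and $t$ the sensor threshold. Before committing its own allocation, the Breaker observes the binary signal
\[
s(x) = \begin{cases} 1 & \text{if } x \ge t, \\ 0 & \text{otherwise}, \end{cases}
\]
and can condition its randomized strategy on $s$.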
Advances in tracking algorithms have enabled emerging applications across various domains, from steering autonomous vehicles and guiding robots to enhancing augmented reality experiences. However, these algorithms are application-specific: they do not generalize across applications with different types of motion, and even an algorithm designed for a given application fails in scenarios that deviate from standard conditions. For example, a tracking algorithm designed for robot navigation inside a building will not work for tracking the same robot outdoors. To demonstrate this problem, we evaluate the performance of state-of-the-art tracking methods across various applications and scenarios. To inform our analysis, we first categorize the algorithmic, environmental, and locomotion-related challenges faced by tracking algorithms. We then quantitatively evaluate tracking performance using multiple algorithms and representative datasets for a wide range of Internet of Things (IoT) and Extended Reality (XR) applications, including autonomous vehicles, drones, and humans. Our analysis shows that no single tracking algorithm works across different applications, or even across scenarios within an application. Finally, using the insights from our analysis, we discuss multiple approaches to improving tracking performance through input data characterization, leveraging intermediate information, and output evaluation.
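As one concrete instance of the kind of quantitative evaluation described above, the sketch below computes absolute trajectory error (ATE), a standard tracking metric; this particular metric is an illustrative assumption on our part, not necessarily the one used in the paper.
\begin{verbatim}
import numpy as np

def absolute_trajectory_error(est, gt):
    """RMSE between estimated and ground-truth positions after
    removing the mean offset (a simple translational alignment)."""
    est = np.asarray(est, dtype=float)
    gt = np.asarray(gt, dtype=float)
    aligned = est - est.mean(axis=0) + gt.mean(axis=0)
    return np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1)))

# toy usage: a noisy 2-D trajectory vs. ground truth
gt = np.stack([np.linspace(0, 10, 50), np.zeros(50)], axis=1)
est = gt + np.random.normal(scale=0.1, size=gt.shape)
print(absolute_trajectory_error(est, gt))
\end{verbatim}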
Understanding how information can efficiently spread in distributed systems under noisy communication is a fundamental question in both biological research and artificial system design. When agents are able to control whom they interact with, noise can often be mitigated through redundancy or other coding techniques, but it may have fundamentally different consequences in well-mixed systems. Specifically, Boczkowski et al. (2018) considered the noisy $\mathcal{PULL}(h)$ model, where each message can be viewed as any other message with probability $\delta$. The authors proved that in this model, the basic task of propagating a bit value from a single source to the whole population requires $\Omega(\frac{n\delta}{h(1-\delta|\Sigma|)^2})$ (parallel) rounds. The current work shows that this lower bound is almost tight. In particular, when each agent observes all other agents in each round, which corresponds to scenarios where each agent senses the system's average tendency, information spreading can reliably be achieved in $\mathcal{O}(\log n)$ time, assuming constant noise. We present two simple and highly efficient protocols, suggesting their applicability to real-life scenarios. Notably, they also work in the presence of multiple conflicting sources and efficiently converge to the plurality opinion among them. The first protocol uses 1-bit messages but relies on a simultaneous wake-up assumption. By increasing the message size to 2 bits and forgoing the speedup in information spreading time that multiple sources may provide, we also obtain a simple and highly efficient self-stabilizing protocol that avoids the simultaneous wake-up requirement. Overall, our results demonstrate how, under stochastic communication, increasing the sample size can compensate for the lack of communication structure by linearly accelerating information spreading.
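For intuition, here is a toy simulation of the simplest ingredient underlying such protocols: each agent samples every other agent's bit through a channel that flips it with probability $\delta$ and adopts the noisy majority. This is an illustrative sketch of our own, not the paper's actual protocol (which additionally handles sources, phases, and wake-up).
\begin{verbatim}
import random

def noisy_majority_round(opinions, delta):
    """One parallel round: every agent observes all others through a
    binary channel that flips each observed bit with probability delta,
    then adopts the majority of what it saw."""
    n = len(opinions)
    nxt = []
    for i in range(n):
        ones = sum(
            (b if random.random() >= delta else 1 - b)
            for j, b in enumerate(opinions) if j != i
        )
        nxt.append(1 if 2 * ones > n - 1 else 0)
    return nxt

# toy usage: a 60/40 split under 10% noise converges to all-1 quickly
ops = [1] * 60 + [0] * 40
for _ in range(5):
    ops = noisy_majority_round(ops, delta=0.1)
print(sum(ops), "agents hold opinion 1")
\end{verbatim}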
Transformers have demonstrated remarkable in-context learning capabilities across various domains, including statistical learning tasks. While previous work has shown that transformers can implement common learning algorithms, the adversarial robustness of these learned algorithms remains unexplored. This work investigates the vulnerability of in-context learning in transformers to \textit{hijacking attacks}, focusing on the setting of linear regression tasks. Hijacking attacks are prompt-manipulation attacks in which the adversary's goal is to manipulate the prompt so as to force the transformer to generate a specific output. We first prove that single-layer linear transformers, known to implement gradient descent in-context, are non-robust: they can be manipulated to output arbitrary predictions by perturbing a single example in the in-context training set. While our experiments show these attacks succeed on linear transformers, we find they do not transfer to more complex transformers with GPT-2 architectures. Nonetheless, we show that these transformers can be hijacked using gradient-based adversarial attacks. We then demonstrate that adversarial training enhances transformers' robustness against hijacking attacks, even when applied only during finetuning. Additionally, we find that in some settings, adversarial training against a weaker attack model can confer robustness to a stronger attack model. Lastly, we investigate the transferability of hijacking attacks across transformers of varying scales and initialization seeds, as well as between transformers and ordinary least squares (OLS). We find that while attacks transfer effectively between small-scale transformers, they show poor transferability in other scenarios (small-to-large scale, large-to-large scale, and between transformers and OLS).
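To illustrate the single-example vulnerability in the linear case, consider a predictor that performs one gradient-descent step from zero initialization, the algorithm single-layer linear transformers are known to implement in-context. The closed-form label perturbation below is a minimal sketch under this simplified model (our own construction, not the paper's proof):
\begin{verbatim}
import numpy as np

def gd_predict(X, y, x_q, eta=0.1):
    """One gradient-descent step from w = 0 on the in-context set (X, y),
    i.e. w = eta * X^T y, then predict on the query x_q."""
    return (eta * X.T @ y) @ x_q

def hijack_label(X, y, x_q, target, eta=0.1, k=0):
    """Perturb only the k-th label so the prediction equals `target`.
    The prediction is linear in y[k] with slope eta * (X[k] @ x_q)."""
    slope = eta * (X[k] @ x_q)
    assert abs(slope) > 1e-9, "example k must correlate with the query"
    y_adv = y.copy()
    y_adv[k] += (target - gd_predict(X, y, x_q, eta)) / slope
    return y_adv

rng = np.random.default_rng(0)
X, x_q = rng.normal(size=(8, 4)), rng.normal(size=4)
y = X @ np.ones(4)
y_adv = hijack_label(X, y, x_q, target=100.0)
print(gd_predict(X, y_adv, x_q))  # ~100.0, forced by one label change
\end{verbatim}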
The rapid growth of social media gives us real-time access to the opinions of a large number of politically active individuals. By analyzing social media texts, we can form an overall picture of these individuals' ideologies on governmental issues. News websites and popular social media platforms such as Facebook, YouTube, and Instagram are now the most common means of communication for the general population, so users' political perceptions of the different parties in a country are reflected in the data collected from these sites. In this work, we extract three types of features: stylometric features, word-embedding features, and TF-IDF features. Traditional machine learning classifiers and deep learning models are then employed to identify political ideology from text. We compare our methodology with related work in other languages. Among our models, the word-embedding feature with an LSTM outperforms all others, achieving 88.28% accuracy.
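As a minimal illustration of one of the three feature pipelines, the sketch below trains a TF-IDF baseline with scikit-learn; the texts and labels are hypothetical placeholders, and the paper's best model (word embeddings with an LSTM) is more involved.
\begin{verbatim}
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# hypothetical toy corpus: text paired with an ideology label
texts = ["lower taxes grow the economy", "expand public healthcare now",
         "cut regulation on business", "raise the minimum wage"]
labels = ["right", "left", "right", "left"]

# TF-IDF features feeding a traditional linear classifier
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["healthcare should be free"]))
\end{verbatim}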
We present a computational formulation for the approximate version of several variational inequality problems, investigating their computational complexity and establishing PPAD-completeness. Examining applications in computational game theory, we focus on two key concepts: resilient Nash equilibria and multi-leader-follower games, domains traditionally known for the absence of general solutions. Under standard assumptions and relaxation techniques, we formulate versions of these games that are expressible in terms of variational inequalities, ultimately leading to proofs of PPAD-completeness.
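For concreteness, we recall the standard (Stampacchia) formulation and its natural $\varepsilon$-approximate relaxation; the notation here is ours and not necessarily the paper's exact definition. Given a compact convex set $K \subseteq \mathbb{R}^n$ and a continuous map $F : K \to \mathbb{R}^n$, find $x^* \in K$ such that
\[
\langle F(x^*),\, x - x^* \rangle \ge 0 \quad \text{for all } x \in K,
\]
while the $\varepsilon$-approximate version only requires $\langle F(x^*), x - x^* \rangle \ge -\varepsilon$ for all $x \in K$.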
Storytelling is a fundamental aspect of human communication, relying heavily on creativity to produce narratives that are novel, appropriate, and surprising. While large language models (LLMs) have recently demonstrated the ability to generate high-quality stories, their creative capabilities remain underexplored. Previous research has either focused on creativity tests requiring short responses or primarily compared model performance in story generation to that of professional writers. However, the question of whether LLMs exhibit creativity in writing short stories on par with the average human remains unanswered. In this work, we conduct a systematic analysis of creativity in short story generation across LLMs and everyday people. Using a five-sentence creative story task, commonly employed in psychology to assess human creativity, we automatically evaluate model- and human-generated stories across several dimensions of creativity, including novelty, surprise, and diversity. Our findings reveal that while LLMs can generate stylistically complex stories, they tend to fall short in terms of creativity when compared to average human writers.
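As an example of the kind of automatic creativity measure referred to above, the sketch below scores the diversity of a set of stories as the mean pairwise cosine distance between their TF-IDF vectors; this is one simple proxy of our own choosing, not necessarily the paper's implementation.
\begin{verbatim}
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def diversity(stories):
    """Mean pairwise cosine distance between TF-IDF vectors:
    higher means the stories are less similar to one another."""
    vecs = TfidfVectorizer().fit_transform(stories)
    sim = cosine_similarity(vecs)
    n = len(stories)
    off_diag = sim[~np.eye(n, dtype=bool)]  # drop self-similarities
    return 1.0 - off_diag.mean()

stories = ["A fox learned to paint with its tail.",
           "A fox learned to paint with its paws.",
           "The lighthouse keeper collected storms in jars."]
print(diversity(stories))
\end{verbatim}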
The high-performance computing (HPC) community has recently seen a substantial diversification of hardware platforms and their associated programming models. From traditional multicore processors to highly specialized accelerators, vendors and tool developers sustain the relentless progress of these architectures. In the context of scientific programming, it is essential to consider performance portability frameworks, i.e., software tools that allow programmers to write code once and run it on different computer architectures without sacrificing performance. We report here on the benefits and challenges of performance portability using a field-line tracing simulation and a particle-in-cell code, two relevant applications in computational plasma physics with relevance to magnetically confined nuclear-fusion energy research. For these applications, we report performance results obtained on four HPC platforms with server-class CPUs from Intel (Xeon) and AMD (EPYC), and high-end GPUs from Nvidia and AMD, including the latest Nvidia H100 GPU and the novel AMD Instinct MI300A APU. Our results show that both Kokkos and OpenMP are powerful tools for achieving performance portability and decent "out-of-the-box" performance, even on the very latest hardware platforms. For our applications, Kokkos provided performance portability across the broadest range of hardware architectures from different vendors.
Graph neural networks (GNNs) have been demonstrated to be a powerful algorithmic model in broad application fields, owing to their effectiveness in learning over graphs. To scale GNN training for large and ever-growing graphs, the most promising solution is distributed training, which distributes the training workload across multiple computing nodes. However, the workflows, computational patterns, communication patterns, and optimization techniques of distributed GNN training remain only preliminarily understood. In this paper, we provide a comprehensive survey of distributed GNN training by investigating the various optimization techniques it employs. First, distributed GNN training is classified into several categories according to its workflows; the corresponding computational patterns and communication patterns, as well as the optimization techniques proposed by recent work, are then introduced. Second, the software frameworks and hardware platforms for distributed GNN training are introduced for a deeper understanding. Third, distributed GNN training is compared with distributed training of deep neural networks, emphasizing what makes distributed GNN training unique. Finally, interesting issues and opportunities in this field are discussed.
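To make the distinctive communication pattern concrete, here is a toy sketch (our own, not taken from any surveyed framework) of one mean-aggregation GNN layer under a two-way node partition, where each worker must fetch "halo" features for neighbors owned by the other partition:
\begin{verbatim}
import numpy as np

# toy graph: 6 nodes on a ring, adjacency A and one-hot features H
A = np.array([[0,1,0,0,0,1],
              [1,0,1,0,0,0],
              [0,1,0,1,0,0],
              [0,0,1,0,1,0],
              [0,0,0,1,0,1],
              [1,0,0,0,1,0]])
H = np.eye(6)
parts = {0: [0, 1, 2], 1: [3, 4, 5]}  # node partition across two workers

H_next = np.zeros_like(H)
for worker, owned in parts.items():
    for v in owned:
        nbrs = np.nonzero(A[v])[0]
        # neighbors owned by the other worker require communication
        remote = [u for u in nbrs if u not in owned]
        if remote:
            print(f"worker {worker}: fetch halo features "
                  f"of {remote} for node {v}")
        H_next[v] = H[nbrs].mean(axis=0)  # mean aggregation
print(H_next)
\end{verbatim}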
In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven particularly relevant for natural language processing (NLP), experiencing rapid spread and wide adoption in recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.
User engagement is a critical metric for evaluating the quality of open-domain dialogue systems. Prior work has focused on conversation-level engagement, using heuristically constructed features such as the number of turns and the total duration of the conversation. In this paper, we investigate the possibility and efficacy of estimating utterance-level engagement, and define a novel metric, {\em predictive engagement}, for the automatic evaluation of open-domain dialogue systems. Our experiments demonstrate that (1) human annotators show high agreement when assessing utterance-level engagement scores, and (2) conversation-level engagement scores can be predicted from properly aggregated utterance-level engagement scores. Furthermore, we show that utterance-level engagement scores can be learned from data. These scores can improve automatic evaluation metrics for open-domain dialogue systems, as shown by their correlation with human judgments, suggesting that predictive engagement can serve as real-time feedback for training better dialogue models.
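A minimal sketch of the aggregation step, using mean pooling as one plausible reading of "properly aggregated" (an assumption on our part), validated against conversation-level human scores via Pearson correlation:
\begin{verbatim}
import numpy as np
from scipy.stats import pearsonr

# hypothetical data: per-conversation lists of utterance-level
# engagement scores, plus a human conversation-level score for each
utterance_scores = [[0.2, 0.4, 0.3], [0.8, 0.9, 0.7, 0.95], [0.5, 0.6]]
human_scores = [0.3, 0.9, 0.55]

# aggregate utterance-level scores to the conversation level
predicted = [np.mean(s) for s in utterance_scores]

r, p = pearsonr(predicted, human_scores)
print(f"Pearson r = {r:.3f} (p = {p:.3f})")
\end{verbatim}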