Multi-Access Edge Computing (MEC) has emerged as a viable computation allocation paradigm that facilitates offloading tasks to edge servers for efficient processing. The integration of MEC with 5G, referred to as 5G-MEC, provides real-time processing and data-driven decision-making in close proximity to the user. 5G-MEC has gained significant recognition in task offloading as an essential tool for applications that require low delay. Nevertheless, few studies consider the dropped task ratio metric, and disregarding it can undermine system efficiency. In this paper, we minimize the dropped task ratio and delay in a realistic 5G-MEC task offloading scenario implemented in NS3. We utilize Mixed Integer Linear Programming (MILP) and a Genetic Algorithm (GA) to optimize delay and dropped task ratio, and we examine the effect of the number of tasks and users on both metrics. Compared to two traditional offloading schemes, First Come First Serve (FCFS) and Shortest Task First (STF), our proposed method works effectively in the 5G-MEC task offloading scenario. MILP reduces the dropped task ratio by 20% and the delay by 2 ms compared to GA.
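As a hedged illustration of the heuristic side of such an optimization (not the paper's exact MILP/GA formulation), the sketch below assumes a simplified model in which each task has a processing demand and a deadline, and a genetic algorithm searches over task-to-edge-server assignments while penalizing a weighted sum of average delay and dropped task ratio; all names, constants, and the fitness weighting are illustrative assumptions.

    # Illustrative GA for task-to-server assignment (simplified model, not the paper's exact method).
    import random

    NUM_TASKS, NUM_SERVERS = 30, 4
    random.seed(0)
    demand = [random.uniform(1, 5) for _ in range(NUM_TASKS)]    # CPU cycles (arbitrary units)
    deadline = [random.uniform(3, 10) for _ in range(NUM_TASKS)] # ms (assumed)
    speed = [random.uniform(2, 4) for _ in range(NUM_SERVERS)]   # cycles per ms (assumed)

    def evaluate(assign):
        """Return (avg_delay, drop_ratio) under a simple FIFO-per-server queue model."""
        finish = [0.0] * NUM_SERVERS
        delays, dropped = [], 0
        for t, s in enumerate(assign):
            finish[s] += demand[t] / speed[s]      # queueing + service time on server s
            if finish[s] > deadline[t]:
                dropped += 1                        # task misses its deadline -> counted as dropped
            else:
                delays.append(finish[s])
        avg_delay = sum(delays) / len(delays) if delays else 0.0
        return avg_delay, dropped / NUM_TASKS

    def fitness(assign, w=0.5):
        d, r = evaluate(assign)
        return w * d + (1 - w) * 100 * r            # weighted sum; weights are assumptions

    population = [[random.randrange(NUM_SERVERS) for _ in range(NUM_TASKS)] for _ in range(40)]
    for _ in range(200):                            # GA loop: selection, crossover, mutation
        population.sort(key=fitness)
        parents = population[:20]
        children = []
        for _ in range(20):
            a, b = random.sample(parents, 2)
            cut = random.randrange(NUM_TASKS)
            child = a[:cut] + b[cut:]
            if random.random() < 0.2:
                child[random.randrange(NUM_TASKS)] = random.randrange(NUM_SERVERS)
            children.append(child)
        population = parents + children
    best = min(population, key=fitness)
    print("avg delay, drop ratio:", evaluate(best))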
Large Language Models (LLMs) are deployed in interactive contexts with direct user engagement, such as chatbots and writing assistants. These deployments are vulnerable to prompt injection and jailbreaking (collectively, prompt hacking), in which models are manipulated to ignore their original instructions and follow potentially malicious ones. Although prompt hacking is widely acknowledged as a significant security threat, there is a dearth of large-scale resources and quantitative studies on it. To address this lacuna, we launch a global prompt hacking competition, which allows for free-form human-input attacks, and elicit 600K+ adversarial prompts against three state-of-the-art LLMs. We describe the dataset, which empirically verifies that current LLMs can indeed be manipulated via prompt hacking. We also present a comprehensive taxonomical ontology of the types of adversarial prompts.
Continual Test-Time Adaptation (CTA) is a challenging task that aims to adapt a source pre-trained model to continually changing target domains. In the CTA setting, a model does not know when the target domain changes, and thus faces a drastic change in the distribution of streaming inputs at test time. The key challenge is to keep adapting the model to the continually changing target domains in an online manner. We find that a model shows highly biased predictions as it constantly adapts to the changing distribution of the target data: it predicts certain classes more often than others, yielding inaccurate, over-confident predictions. This paper mitigates this issue to improve performance in the CTA scenario. To alleviate the bias, we maintain class-wise exponential moving average target prototypes from reliable target samples and exploit them to cluster the target features by class. Moreover, we align the target distribution to the source distribution by anchoring each target feature to its corresponding source prototype. Extensive experiments show that our proposed method achieves noteworthy performance gains when applied on top of existing CTA methods, without substantial adaptation-time overhead.
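A minimal sketch of the class-wise exponential-moving-average prototype idea, assuming a PyTorch-style setting in which confident ("reliable") test samples update per-class prototypes that then pull target features toward their class centers; the confidence threshold, momentum, and loss form below are illustrative assumptions rather than the paper's exact recipe.

    # Illustrative class-wise EMA prototypes for test-time adaptation (assumed hyperparameters).
    import torch
    import torch.nn.functional as F

    NUM_CLASSES, FEAT_DIM = 10, 256
    prototypes = torch.zeros(NUM_CLASSES, FEAT_DIM)   # EMA target prototypes, one per class
    momentum, conf_thresh = 0.99, 0.9                 # assumed values

    def update_prototypes(features, logits):
        """Update per-class EMA prototypes using only confident ('reliable') samples."""
        probs = logits.softmax(dim=1)
        conf, pseudo = probs.max(dim=1)
        for c in range(NUM_CLASSES):
            mask = (pseudo == c) & (conf > conf_thresh)
            if mask.any():
                mean_feat = features[mask].mean(dim=0)
                prototypes[c] = momentum * prototypes[c] + (1 - momentum) * mean_feat

    def prototype_loss(features, logits):
        """Pull each target feature toward the prototype of its pseudo-label (cosine similarity)."""
        pseudo = logits.argmax(dim=1)
        f = F.normalize(features, dim=1)
        p = F.normalize(prototypes[pseudo], dim=1)
        return (1 - (f * p).sum(dim=1)).mean()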
Decentralized Congestion Control (DCC) mechanisms have been a core part of protocol stacks for vehicular networks since their inception and standardization. The ETSI ITS-G5 protocol stack for vehicular communications considers the usage of DCC not only in the network or access layers, but also as a part of the cross-layer architecture that influences how often messages are generated and transmitted. ETSI DCC mechanisms have evolved from a reactive approach based on a finite state machine to an adaptive approach that relies on a linear control algorithm. This linear control algorithm, called LIMERIC, is the basis of the mechanism used in the ETSI DCC Adaptive Approach. The behavior of this algorithm depends on a set of parameters, for which different values have been proposed in the literature, including those defined in the ETSI specification. A recent proposal is Dual-$\alpha$, which chooses parameters to improve convergence and fairness when the algorithm has to react to fast changes in the use of the shared medium (transitory situations). This article evaluates, by means of simulations, the performance of the ETSI DCC Adaptive Approach and related algorithms, considering both steady-state and transitory situations. Results show that a poor selection of parameters can render a DCC algorithm ineffective, that the ETSI DCC Adaptive algorithm performs well in steady-state conditions, and that Dual-$\alpha$ matches this steady-state performance while outperforming the ETSI DCC Adaptive Approach in transitory scenarios.
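For reference, a minimal sketch of the linear update at the core of LIMERIC-style adaptive DCC, in the commonly cited form $r_j(t) = (1-\alpha)\,r_j(t-1) + \beta\,(r_g - r_C(t-1))$, where $r_j$ is a vehicle's own message rate, $r_C$ the measured total channel load, and $r_g$ the target; the numerical values of $\alpha$, $\beta$, and the target load below are illustrative, not the ETSI- or Dual-$\alpha$-specified ones.

    # Illustrative LIMERIC-style linear rate control (parameter values are assumptions).
    alpha, beta = 0.1, 0.002          # convergence/fairness parameters (illustrative)
    target_load = 0.68                # target total channel load r_g (illustrative)

    def limeric_step(own_rate, measured_total_load):
        """One update of a vehicle's message rate toward the shared target channel load."""
        return (1 - alpha) * own_rate + beta * (target_load - measured_total_load)

    # Example: 50 vehicles starting from a low rate converge to a common steady-state rate.
    rates = [0.001] * 50
    for _ in range(300):
        total = sum(rates)            # in practice, derived from the locally measured channel busy ratio
        rates = [limeric_step(r, total) for r in rates]
    print(round(sum(rates), 3), round(rates[0], 5))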
Contact-rich manipulation tasks often exhibit a large sim-to-real gap. For instance, industrial assembly tasks frequently involve tight insertions where the clearance is less than 0.1 mm and can even be negative when dealing with a deformable receptacle. This narrow clearance leads to complex contact dynamics that are difficult to model accurately in simulation, making it challenging to transfer simulation-learned policies to real-world robots. In this paper, we propose a novel framework for robustly learning manipulation skills for real-world tasks using simulated data only. Our framework consists of two main components: the "Force Planner" and the "Gain Tuner". The Force Planner plans both the robot motion and desired contact force, while the Gain Tuner dynamically adjusts the compliance control gains to track the desired contact force during task execution. The key insight is that by dynamically adjusting the robot's compliance control gains during task execution, we can modulate contact force in the new environment, thereby generating trajectories similar to those trained in simulation and narrowing the sim-to-real gap. Experimental results show that our method, trained in simulation on a generic square peg-and-hole task, can generalize to a variety of real-world insertion tasks involving narrow and negative clearances, all without requiring any fine-tuning. Videos are available at //dynamic-compliance.github.io.
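A hedged sketch of the gain-adaptation idea: an admittance-style controller whose stiffness gain is scaled up or down according to the error between measured and desired contact force, so the executed motion tracks the planned force profile; the controller structure and every constant here are illustrative assumptions, not the paper's Gain Tuner.

    # Illustrative adaptive compliance (admittance-style) gain adjustment along one axis.
    # All constants are assumptions for illustration.
    def tune_gain(stiffness, f_measured, f_desired, rate=0.05, k_min=50.0, k_max=2000.0):
        """Scale the stiffness gain to reduce the contact-force tracking error (assumed rule)."""
        error = f_desired - f_measured
        if abs(error) < 0.5:                    # small deadband (assumed, in newtons)
            return stiffness
        # Too little force -> stiffen to push harder; too much force -> soften to comply more.
        stiffness *= 1.0 + rate if error > 0 else 1.0 - rate
        return min(max(stiffness, k_min), k_max)

    def admittance_step(x_cmd, x_ref, f_measured, stiffness, damping=20.0, dt=0.001):
        """Damping-only admittance: offset the commanded position in proportion to the force error."""
        velocity = (f_measured - stiffness * (x_cmd - x_ref)) / damping
        return x_cmd + velocity * dt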
The Skolem Problem asks, given an integer linear recurrence sequence (LRS), to determine whether the sequence contains a zero term. Its decidability is a longstanding open problem in theoretical computer science and automata theory. Currently, decidability is known only for LRS of order at most 4. On the other hand, the sole known complexity result is NP-hardness, due to Blondel and Portier. A fundamental result in this area is the celebrated Skolem-Mahler-Lech theorem, which asserts that the zero set of any LRS is the union of a finite set and finitely many arithmetic progressions. This paper focuses on a computational perspective on the Skolem-Mahler-Lech theorem: we show that the problem of counting the zeros of a given LRS is #P-hard, and in fact #P-complete for the instances generated in our reduction.
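As a concrete illustration of the objects involved (not of the hardness proof), the sketch below enumerates the terms of an integer LRS from its recurrence coefficients and initial values and reports the zeros found within a finite prefix; deciding or counting zeros over all indices is exactly what the Skolem Problem and the #P-hardness result concern, and no finite enumeration settles it.

    # Enumerate an integer linear recurrence sequence and list zeros in a finite prefix.
    # This illustrates the input to the Skolem Problem; it does not decide it, since a
    # zero may occur beyond any fixed bound.
    def lrs_zeros(coeffs, init, bound):
        """u_n = coeffs[0]*u_{n-1} + ... + coeffs[k-1]*u_{n-k}, with u_0..u_{k-1} = init."""
        seq = list(init)
        zeros = [i for i, v in enumerate(seq) if v == 0]
        for n in range(len(init), bound):
            nxt = sum(c * seq[n - 1 - i] for i, c in enumerate(coeffs))
            seq.append(nxt)
            if nxt == 0:
                zeros.append(n)
        return zeros

    # Example (order 2): u_n = u_{n-1} + u_{n-2} with u_0 = 0, u_1 = 1;
    # the only zero in this prefix is at n = 0.
    print(lrs_zeros([1, 1], [0, 1], 50))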
The field of Computer Vision (CV) is increasingly shifting towards ``high-level'' visual sensemaking tasks, yet the exact nature of these tasks remains unclear and tacit. This survey paper addresses this ambiguity by systematically reviewing research on high-level visual understanding, focusing particularly on Abstract Concepts (ACs) in automatic image classification. Our survey contributes in three main ways. Firstly, it clarifies the tacit understanding of high-level semantics in CV through a multidisciplinary analysis and a categorization into distinct clusters, including commonsense, emotional, aesthetic, and inductive interpretative semantics. Secondly, it identifies and categorizes the computer vision tasks associated with high-level visual sensemaking, offering insights into the diverse research areas within this domain. Lastly, it examines how abstract concepts such as values and ideologies are handled in CV, revealing challenges and opportunities in AC-based image classification. Notably, our survey of AC image classification tasks highlights persistent challenges, such as the limited efficacy of massive datasets and the importance of integrating supplementary information and mid-level features. We emphasize the growing relevance of hybrid AI systems in addressing the multifaceted nature of AC image classification tasks. Overall, this survey enhances our understanding of high-level visual reasoning in CV and lays the groundwork for future research endeavors.
The dominant NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction based question answering, or machine translation). However, it builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and to propose benchmarks and evaluation metrics to measure its effect on current deep learning architectures. We then proceed to take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we show that these approaches yield more robust models on a selection of realistic problems. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule that alleviates catastrophic forgetting during adaptation.
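For concreteness, a standard form of the distributionally robust optimization objective that such parametric reformulations start from (the specific uncertainty set and parametrization used in the thesis may differ) is
\[
\min_{\theta} \; \sup_{Q \in \mathcal{U}(P)} \; \mathbb{E}_{(x,y)\sim Q}\big[\ell(f_\theta(x), y)\big],
\qquad
\mathcal{U}(P) = \{\, Q : D(Q \,\|\, P) \le \rho \,\},
\]
where $P$ is the empirical training distribution, $D$ is a divergence (e.g., KL), and $\rho$ controls the size of the uncertainty set; setting $\rho = 0$ recovers standard empirical risk minimization.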
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.
Seeking the equivalent entities among multi-source Knowledge Graphs (KGs) is the pivotal step of KG integration, also known as \emph{entity alignment} (EA). However, most existing EA methods are inefficient and scale poorly. A recent summary points out that some of them even require several days to deal with a dataset containing 200,000 nodes (DWY100K). We believe that over-complex graph encoders and inefficient negative sampling strategies are the two main reasons. In this paper, we propose a novel KG encoder -- Dual Attention Matching Network (Dual-AMN) -- which not only models both intra-graph and cross-graph information effectively, but also greatly reduces computational complexity. Furthermore, we propose the Normalized Hard Sample Mining Loss to smoothly select hard negative samples with reduced loss shift. Experimental results on widely used public datasets indicate that our method achieves both high accuracy and high efficiency. On DWY100K, the whole running process of our method finishes in 1,100 seconds, at least 10x faster than previous work. Our method also outperforms previous works across all datasets, with improvements in Hits@1 and MRR ranging from 6% to 13%.
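As a hedged illustration of smooth hard-negative mining (a generic LogSumExp-style loss over normalized embedding similarities, not the exact Dual-AMN formulation), the sketch below treats each aligned entity pair as the positive and all other entities in the batch as negatives; the temperature and margin values are assumptions.

    # Illustrative LogSumExp-based smooth hard-negative mining loss over normalized embeddings.
    # Temperature (gamma) and margin values are assumptions, not the exact Dual-AMN loss.
    import torch
    import torch.nn.functional as F

    def hard_mining_loss(h_src, h_tgt, gamma=10.0, margin=0.1):
        """h_src[i] and h_tgt[i] embed an aligned entity pair; all other rows act as negatives."""
        h_src = F.normalize(h_src, dim=1)
        h_tgt = F.normalize(h_tgt, dim=1)
        sim = h_src @ h_tgt.t()                              # cosine similarities, shape (N, N)
        pos = sim.diagonal().unsqueeze(1)                    # similarity of each aligned pair
        logits = gamma * (margin + sim - pos)                # hard negatives get the largest logits
        mask = torch.eye(sim.size(0), dtype=torch.bool)
        logits = logits.masked_fill(mask, float('-inf'))     # exclude the positive itself
        zeros = torch.zeros(sim.size(0), 1)
        # log(1 + sum_j exp(logits_j)): a smooth maximum that focuses the gradient on hard negatives.
        return torch.logsumexp(torch.cat([zeros, logits], dim=1), dim=1).mean()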
Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy-, computation-, and memory-intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically with regard to inference, and discusses methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.
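As a minimal illustration of the first category, the sketch below applies unstructured magnitude pruning and symmetric uniform 8-bit quantization to a weight tensor; the sparsity level and bit-width are arbitrary choices for demonstration, and production toolchains handle these steps far more carefully.

    # Minimal illustration of magnitude pruning and uniform 8-bit quantization of a weight tensor.
    import numpy as np

    def magnitude_prune(weights, sparsity=0.5):
        """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
        threshold = np.quantile(np.abs(weights), sparsity)
        return np.where(np.abs(weights) >= threshold, weights, 0.0)

    def quantize_uniform(weights, bits=8):
        """Symmetric uniform quantization: map floats to integers plus a single scale factor."""
        scale = np.abs(weights).max() / (2 ** (bits - 1) - 1)
        q = np.clip(np.round(weights / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
        return q.astype(np.int8), scale              # store int8 weights plus a scale

    w = np.random.randn(4, 4).astype(np.float32)
    pruned = magnitude_prune(w, sparsity=0.75)
    q, scale = quantize_uniform(pruned)
    print("nonzeros:", np.count_nonzero(pruned), "max abs error:", np.abs(q * scale - pruned).max())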