Memory safety bugs remain in the top ranks of security vulnerabilities, even after decades of research on their detection and prevention. Various mitigations have been proposed for C/C++, ranging from language dialects to instrumentation. Among these, compiler-based instrumentation is particularly promising, not requiring manual code modifications and being able to achieve precise memory safety. Unfortunately, existing compiler-based solutions compromise in many areas, including performance but also usability and memory safety guarantees. New developments in hardware can help improve performance and security of compiler-based memory safety. ARM Pointer Authentication, added in the ARMv8.3 architecture, is intended to enable hardware-assisted Control Flow Integrity (CFI). But since its operations are generic, it also enables other, more comprehensive hardware-supported runtime integrity approaches. As such, we propose CryptSan, a memory safety approach based on ARM Pointer Authentication. CryptSan uses pointer signatures to retrofit memory safety to C/C++ programs, protecting heap, stack, and globals against temporal and spatial vulnerabilities. We present a full LLVM-based prototype implementation, running on an M1 MacBook Pro, i.e., on actual ARMv8.3 hardware. Our prototype evaluation shows that the system outperforms similar approaches under real-world conditions. This, together with its interoperability with uninstrumented libraries and cryptographic protection against attacks on metadata, makes CryptSan a viable solution for retrofitting memory safety to C/C++ programs.
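The idea of a pointer signature can be sketched in a few lines. The following is a software simulation only, under stated assumptions: real ARM Pointer Authentication computes the code in hardware with a dedicated cipher, whereas this sketch substitutes HMAC-SHA256, and the 48-bit address width, the 16-bit tag, and the names `sign_pointer` and `authenticate_pointer` are hypothetical. How CryptSan derives its modifier is not stated in the abstract; here the modifier is simply an arbitrary per-object integer.

```python
import hashlib
import hmac

ADDR_BITS = 48                      # assumed width of the usable address
ADDR_MASK = (1 << ADDR_BITS) - 1
TAG_BYTES = 2                       # 16-bit tag stored in the unused top bits


def _tag(key: bytes, addr: int, modifier: int) -> int:
    """Truncated MAC over (address, modifier); a stand-in for the hardware PAC."""
    msg = addr.to_bytes(8, "little") + modifier.to_bytes(8, "little")
    mac = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(mac[:TAG_BYTES], "little")


def sign_pointer(key: bytes, addr: int, modifier: int) -> int:
    """Return the address with its tag packed into the upper bits."""
    if addr != addr & ADDR_MASK:
        raise ValueError("address does not fit in ADDR_BITS")
    return (_tag(key, addr, modifier) << ADDR_BITS) | addr


def authenticate_pointer(key: bytes, signed: int, modifier: int) -> int:
    """Check the tag and return the plain address, or raise on mismatch."""
    addr = signed & ADDR_MASK
    stored = (signed >> ADDR_BITS).to_bytes(TAG_BYTES, "little")
    expected = _tag(key, addr, modifier).to_bytes(TAG_BYTES, "little")
    if not hmac.compare_digest(stored, expected):
        raise ValueError("pointer authentication failed")
    return addr


if __name__ == "__main__":
    key = b"demo-key-not-secret"
    signed = sign_pointer(key, 0x7FFF12345678, modifier=42)
    print(hex(authenticate_pointer(key, signed, modifier=42)))
```

In this toy model, authenticating with a different modifier or a corrupted address fails unless the 16-bit tags happen to collide, which is why a real design relies on the hardware key remaining secret and on the tag being as wide as the address layout allows.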
When performing cloth-related tasks, such as garment hanging, it is often important to identify and grasp certain structural regions -- a shirt's collar as opposed to its sleeve, for instance. However, due to cloth deformability, these manipulation activities, which are essential in domestic, health care, and industrial contexts, remain challenging for robots. In this paper, we focus on how to segment and grasp structural regions of clothes to enable manipulation tasks, using hanging tasks as a case study. To this end, a neural network-based perception system is proposed to segment a shirt's collar from areas that represent the rest of the scene in a depth image. With a 10-minute video of a human manipulating shirts to train it, our perception system is capable of generalizing to other shirts regardless of texture as well as to other types of collared garments. A novel grasping strategy is then proposed based on the segmentation to determine the grasping pose. Experiments demonstrate that our proposed grasping strategy achieves 92%, 80%, and 50% grasping success rates with one folded garment, one crumpled garment and three crumpled garments, respectively. Our grasping strategy performs considerably better than tested baselines that do not take into account the structural nature of the garments. With the proposed region segmentation and grasping strategy, challenging garment hanging tasks are successfully implemented using an open-loop control policy. Supplementary material is available at //sites.google.com/view/garment-hanging
The widespread adoption of cloud infrastructures has revolutionised data storage and access. However, it has also raised concerns regarding the privacy of sensitive data stored in the cloud. To address these concerns, encryption techniques have been widely used. However, traditional encryption schemes limit the efficient search and retrieval of encrypted data. To tackle this challenge, innovative approaches have emerged, such as the utilisation of Homomorphic Encryption (HE) in Searchable Encryption (SE) schemes. This paper provides a comprehensive analysis of the advancements in HE-based privacy-preserving techniques, focusing on their application in SE. The main contributions of this work include the identification and classification of existing SE schemes that utilise HE, a comprehensive analysis of the types of HE used in SE, an examination of how HE shapes the search process structure and enables additional functionalities, and the identification of promising directions for future research in HE-based SE. The findings reveal the increasing usage of HE in SE schemes, particularly Partially Homomorphic Encryption. The analysis also highlights the prevalence of index-based SE schemes using HE, the support for ranked search and multi-keyword queries, and the need for further exploration in functionalities such as verifiability and the ability to authorise and revoke users. Future research directions include exploring the usage of other encryption schemes alongside HE, addressing omissions in functionalities like fuzzy keyword search, and leveraging recent advancements in Fully Homomorphic Encryption schemes.
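To make the role of Partially Homomorphic Encryption in ranked, multi-keyword search concrete, here is a minimal sketch using textbook Paillier encryption, which is additively homomorphic. It is an illustration under assumptions, not a scheme taken from the surveyed literature: the primes are toy-sized and offer no security, and the function names and the scoring scenario are hypothetical. The point is only that a server can combine encrypted per-keyword relevance scores without decrypting them.

```python
import math
import secrets


def keygen(p: int = 1009, q: int = 1013):
    """Toy Paillier key pair; real deployments use primes of 1024+ bits."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid shortcut because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt an integer 0 <= m < n under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, priv, c: int) -> int:
    lam, mu = priv
    n2 = n * n
    l_value = (pow(c, lam, n2) - 1) // n
    return (l_value * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    n, priv = keygen()
    # Hypothetical per-keyword relevance scores for one document.
    encrypted_scores = [encrypt(n, s) for s in (3, 4, 5)]
    total = encrypted_scores[0]
    for c in encrypted_scores[1:]:
        total = add_encrypted(n, total, c)   # done by the server, blind
    print(decrypt(n, priv, total))           # done by the key holder
```

Only addition is available here; comparing or sorting the encrypted totals on the server would need extra machinery, which is one reason the choice of HE type shapes the search process.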
Modern DDoS defense systems rely on probabilistic monitoring algorithms to identify flows that exceed a volume threshold and should thus be penalized. Commonly, classic sketch algorithms are considered sufficiently accurate for usage in DDoS defense. However, as we show in this paper, these algorithms achieve poor detection accuracy under burst-flood attacks, i.e., volumetric DDoS attacks composed of a swarm of medium-rate sub-second traffic bursts. Under this challenging attack pattern, traditional sketch algorithms can only detect a high share of the attack bursts by incurring a large number of false positives. In this paper, we present ALBUS, a probabilistic monitoring algorithm that overcomes the inherent limitations of previous schemes: ALBUS is highly effective at detecting large bursts while reporting no legitimate flows, and therefore improves on prior work regarding both recall and precision. Besides improving accuracy, ALBUS scales to high traffic rates, which we demonstrate with an FPGA implementation, and is suitable for programmable switches, which we showcase with a P4 implementation.
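For readers unfamiliar with sketch-based monitoring, the following is a minimal count-min sketch, one example of the classic sketch algorithms the abstract refers to. It is not ALBUS and makes no attempt to handle bursts. The class name, the table dimensions, and the `heavy_flows` helper are assumptions for illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-size table of counters; estimates never undercount a flow."""

    def __init__(self, width: int = 256, depth: int = 4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row: int, flow_id: str) -> int:
        # One independent hash per row, derived by salting BLAKE2b.
        digest = hashlib.blake2b(
            flow_id.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, flow_id: str, nbytes: int) -> None:
        for r in range(self.depth):
            self.rows[r][self._index(r, flow_id)] += nbytes

    def estimate(self, flow_id: str) -> int:
        # Collisions only inflate counters, so the minimum is the best bound.
        return min(self.rows[r][self._index(r, flow_id)] for r in range(self.depth))


def heavy_flows(packets, threshold: int):
    """Return flow IDs whose estimated volume reaches the threshold.

    The set of seen IDs is kept only to make the demo easy to read; a real
    monitor would query the sketch per packet instead of storing every ID.
    """
    sketch = CountMinSketch()
    seen = set()
    for flow_id, nbytes in packets:
        sketch.add(flow_id, nbytes)
        seen.add(flow_id)
    return sorted(f for f in seen if sketch.estimate(f) >= threshold)


if __name__ == "__main__":
    packets = [("attacker", 1500)] * 10 + [("benign", 100)]
    print(heavy_flows(packets, threshold=10_000))
```

Because colliding flows share counters, lowering the threshold far enough to catch short, medium-rate bursts tends to push overestimated legitimate flows across it as well, which is the precision problem the abstract describes.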
The rapid development of mobile robotics and autonomous navigation over the years has been largely driven by public datasets for testing and improving algorithms for tasks such as SLAM and localization. Impressive demos and benchmark results have arisen, indicating the establishment of a mature technical framework. However, from the viewpoint of real-world deployments, there are still critical robustness deficiencies in challenging environments, especially in large-scale, GNSS-denied, texture-monotonous, and unstructured scenarios. To meet the pressing validation demands in such settings, we build a novel and challenging robot navigation dataset in a large botanic garden of more than 48,000 m². Comprehensive sensors are employed, including high-resolution, high-rate stereo grayscale and RGB cameras, rotational and forward 3D LiDARs, and low-cost and industrial-grade IMUs, all of which are well calibrated and accurately hardware-synchronized. An all-terrain wheeled robot is configured to mount the sensor suite and provide odometry data. A total of 32 long and short sequences comprising 2.3 million images are collected, covering scenes of thick woods, riversides, narrow paths, bridges, and grasslands that have rarely appeared in previous resources. Notably, both highly accurate ego-motion and 3D map ground truth are provided, along with finely annotated vision semantics. Our goal is to contribute a high-quality dataset to advance robot navigation and sensor fusion research to a higher level.
The volume, variety, and velocity of change in vulnerabilities and exploits have made incident threat analysis challenging with human expertise and experience alone. The MITRE ATT&CK framework employs Tactics, Techniques, and Procedures (TTPs) to describe how and why attackers exploit vulnerabilities. However, a TTP description written by one security professional can be interpreted very differently by another, leading to confusion in cybersecurity operations or even business, policy, and legal decisions. Meanwhile, advancements in AI have led to the increasing use of Natural Language Processing (NLP) algorithms to assist with various tasks in cyber operations. With the rise of Large Language Models (LLMs), performance on NLP tasks has significantly improved because of LLMs' semantic understanding and scalability. This leads us to question how well LLMs can interpret TTP or general cyberattack descriptions. We propose and analyze the direct use of LLMs as well as training BaseLLMs with ATT&CK descriptions to study their capability in predicting ATT&CK tactics. Our results reveal that the BaseLLMs with supervised training provide a more focused and clearer differentiation between the ATT&CK tactics (if such differentiation exists). On the other hand, LLMs offer a broader interpretation of cyberattack techniques. Despite the power of LLMs, inherent ambiguity exists within their predictions. We thus summarize the existing challenges and recommend research directions on LLMs to deal with the inherent ambiguity of TTP descriptions.
Randomness supports many critical functions in the field of machine learning (ML) including optimisation, data selection, privacy, and security. ML systems outsource the task of generating or harvesting randomness to the compiler, the cloud service provider or elsewhere in the toolchain. Yet there is a long history of attackers exploiting poor randomness, or even creating it -- as when the NSA put backdoors in random number generators to break cryptography. In this paper we consider whether attackers can compromise ML systems using only the randomness on which they commonly rely. We focus our effort on Randomised Smoothing, a popular approach to train certifiably robust models, and to certify specific input datapoints of an arbitrary model. We choose Randomised Smoothing since it is used for both security and safety -- to counteract adversarial examples and quantify uncertainty respectively. Under the hood, it relies on sampling Gaussian noise to explore the volume around a data point to certify that a model is not vulnerable to adversarial examples. We demonstrate an entirely novel attack against it, where an attacker backdoors the supplied randomness to falsely certify either an overestimate or an underestimate of robustness. We demonstrate that such attacks are possible, that they require very small changes to randomness to succeed, and that they can be hard to detect. As an example, we hide an attack in the random number generator and show that the randomness tests suggested by NIST fail to detect it. We advocate updating the NIST guidelines on random number testing to make them more appropriate for safety-critical and security-critical machine-learning applications.
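The dependence of certification on the noise source can be seen in a small sketch. It is a simplified illustration under assumptions, not the attack from the paper: the base classifier is a hypothetical toy, the radius uses the plain empirical vote share rather than a statistical lower confidence bound, and the tampered generator simply halves the noise scale, which is far cruder than the subtle changes the abstract describes.

```python
import random
from collections import Counter
from statistics import NormalDist


def base_classifier(x):
    """Hypothetical toy model: class 1 if the coordinates sum to > 0."""
    return 1 if sum(x) > 0 else 0


def smoothed_predict(x, sigma=0.5, n_samples=2000, rng=None):
    """Majority vote under Gaussian noise plus a simplified L2 radius.

    The radius is sigma * Phi^{-1}(p), where p is the top class's vote
    share; a faithful certifier would use a lower confidence bound on p.
    """
    rng = rng or random.Random(0)
    votes = Counter()
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1
    top_class, top_count = votes.most_common(1)[0]
    # Clamp below 1 so that inv_cdf stays defined when every vote agrees.
    p_hat = min(top_count / n_samples, 1.0 - 1.0 / n_samples)
    radius = sigma * NormalDist().inv_cdf(p_hat) if p_hat > 0.5 else 0.0
    return top_class, radius


class ShrunkNoise(random.Random):
    """Tampered generator (assumption): silently halves the requested sigma."""

    def gauss(self, mu=0.0, sigma=1.0):
        return super().gauss(mu, sigma * 0.5)


if __name__ == "__main__":
    x = [0.3, 0.3]
    print("honest  :", smoothed_predict(x, rng=random.Random(0)))
    print("tampered:", smoothed_predict(x, rng=ShrunkNoise(0)))
```

With the tampered generator the samples cluster more tightly around the input, the vote share rises, and the reported radius grows even though the model and the input are unchanged, so the certificate overstates robustness.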
Conceptual models as representations of real-world systems are based on diverse techniques in various disciplines but lack a framework that provides multidisciplinary ontological understanding of real-world phenomena. Concurrently, systems complexity has intensified, leading to a rise in developing models using different formalisms and diverse representations even within a single domain. Conceptual models have become larger; languages tend to acquire more features, and it is not unusual to use different modeling languages for different components. This diversity has caused problems with consistency between models and incompatibility with designed systems. Two main solutions have been adopted over the last few years: (1) A currently dominant technology-based solution tries to harmonize or unify models, e.g., by unifying EER and UML. This solution would solidify modeling achievements, reaping benefits from huge investments over the last thirty years. (2) A less prevalent solution is to pursue deeper roots that reveal unifying modeling principles and apparatuses. An example of the second method is a category theory-based approach that utilizes the strengths of graph and set theory, along with other topological tools. This manuscript is a sequel in a research venture that belongs to the second approach and uses a model called thinging machines (TMs) founded on Stoic ontology and Lupascian logic. TM modeling contests the thesis that there is no universal approach that covers all aspects of an application, and the paper demonstrates that pursuing such universality is anything but a dead-end method. This paper continues in this direction, with emphasis on the TM foundation (e.g., existence and subsistence of things), and exemplifies this pursuit by proposing an alternative representation of set theory.
The asynchronous and unidirectional communication model supported by mailboxes is a key reason for the success of actor languages like Erlang and Elixir for implementing reliable and scalable distributed systems. While many actors may send messages to some actor, only the actor may (selectively) receive from its mailbox. Although actors eliminate many of the issues stemming from shared memory concurrency, they remain vulnerable to communication errors such as protocol violations and deadlocks. Mailbox types are a novel behavioural type system for mailboxes first introduced for a process calculus by de'Liguoro and Padovani in 2018, which capture the contents of a mailbox as a commutative regular expression. Due to aliasing and nested evaluation contexts, moving from a process calculus to a programming language is challenging. This paper presents Pat, the first programming language design incorporating mailbox types, and describes an algorithmic type system. We make essential use of quasi-linear typing to tame some of the complexity introduced by aliasing. Our algorithmic type system is necessarily co-contextual, achieved through a novel use of backwards bidirectional typing, and we prove it sound and complete with respect to our declarative type system. We implement a prototype type checker, and use it to demonstrate the expressiveness of Pat on a factory automation case study and a series of examples from the Savina actor benchmark suite.
Knowledge graph reasoning (KGR) -- answering complex logical queries over large knowledge graphs -- represents an important artificial intelligence task, entailing a range of applications (e.g., cyber threat hunting). However, despite its surging popularity, the potential security risks of KGR are largely unexplored, which is concerning, given the increasing use of such capability in security-critical domains. This work represents a solid initial step towards bridging this striking gap. We systematize the security threats to KGR according to the adversary's objectives, knowledge, and attack vectors. Further, we present ROAR, a new class of attacks that instantiate a variety of such threats. Through empirical evaluation in representative use cases (e.g., medical decision support, cyber threat hunting, and commonsense reasoning), we demonstrate that ROAR is highly effective at misleading KGR into suggesting pre-defined answers for target queries, yet with negligible impact on non-target ones. Finally, we explore potential countermeasures against ROAR, including filtering of potentially poisoning knowledge and training with adversarially augmented queries, which leads to several promising research directions.
Recent years have witnessed the resurgence of knowledge engineering, which is characterized by the fast growth of knowledge graphs. However, most existing knowledge graphs are represented with pure symbols, which hurts the machine's capability to understand the real world. The multi-modalization of knowledge graphs is an inevitable key step towards the realization of human-level machine intelligence. The results of this endeavor are Multi-modal Knowledge Graphs (MMKGs). In this survey on MMKGs constructed from texts and images, we first give definitions of MMKGs, followed by the preliminaries on multi-modal tasks and techniques. We then systematically review the challenges, progress, and opportunities in the construction and application of MMKGs respectively, with detailed analyses of the strengths and weaknesses of different solutions. We conclude this survey with open research problems relevant to MMKGs.