Many real-world problems can be formulated as the alignment between two geometric patterns. Previously, a great amount of research focus on the alignment of 2D or 3D patterns in the field of computer vision. Recently, the alignment problem in high dimensions finds several novel applications in practice. However, the research is still rather limited in the algorithmic aspect. To the best of our knowledge, most existing approaches are just simple extensions of their counterparts for 2D and 3D cases, and often suffer from the issues such as high computational complexities. In this paper, we propose an effective framework to compress the high dimensional geometric patterns. Any existing alignment method can be applied to the compressed geometric patterns and the time complexity can be significantly reduced. Our idea is inspired by the observation that high dimensional data often has a low intrinsic dimension. Our framework is a ``data-dependent'' approach that has the complexity depending on the intrinsic dimension of the input data. Our experimental results reveal that running the alignment algorithm on compressed patterns can achieve similar qualities, comparing with the results on the original patterns, but the runtimes (including the times cost for compression) are substantially lower.
The protection of Industrial Control Systems (ICS) that are employed in public critical infrastructures is of utmost importance due to catastrophic physical damages cyberattacks may cause. The research community requires testbeds for validation and comparing various intrusion detection algorithms to protect ICS. However, there exist high barriers to entry for research and education in the ICS cybersecurity domain due to expensive hardware, software, and inherent dangers of manipulating real-world systems. To close the gap, built upon recently developed 3D high-fidelity simulators, we further showcase our integrated framework to automatically launch cyberattacks, collect data, train machine learning models, and evaluate for practical chemical and manufacturing processes. On our testbed, we validate our proposed intrusion detection model called Minimal Threshold and Window SVM (MinTWin SVM) that utilizes unsupervised machine learning via a one-class SVM in combination with a sliding window and classification threshold. Results show that MinTWin SVM minimizes false positives and is responsive to physical process anomalies. Furthermore, we incorporate our framework with ICS cybersecurity education by using our dataset in an undergraduate machine learning course where students gain hands-on experience in practicing machine learning theory with a practical ICS dataset. All of our implementations have been open-sourced.
Scientific processes rely on software as an important tool for data acquisition, analysis, and discovery. Over the years sustainable software development practices have made progress in being considered as an integral component of research. However, management of computation-based scientific studies is often left to individual researchers who design their computational experiments based on personal preferences and the nature of the study. We believe that the quality, efficiency, and reproducibility of computation-based scientific research can be improved by explicitly creating an execution environment that allows researchers to provide a clear record of traceability. This is particularly relevant to complex computational studies in high-performance computing (HPC) environments. In this article, we review the documentation required to maintain a comprehensive record of HPC computational experiments for reproducibility. We also provide an overview of tools and practices that we have developed to perform such studies around Flash-X, a multi-physics scientific software
Modern systems mitigate Rowhammer using victim refresh, which refreshes the two neighbours of an aggressor row when it encounters a specified number of activations. Unfortunately, complex attack patterns like Half-Double break victim-refresh, rendering current systems vulnerable. Instead, recently proposed secure Rowhammer mitigations rely on performing mitigative action on the aggressor rather than the victims. Such schemes employ mitigative actions such as row-migration or access-control and include AQUA, SRS, and Blockhammer. While these schemes incur only modest slowdowns at Rowhammer thresholds of few thousand, they incur prohibitive slowdowns (15%-600%) for lower thresholds that are likely in the near future. The goal of our paper is to make secure Rowhammer mitigations practical at such low thresholds. Our paper provides the key insights that benign application encounter thousands of hot rows (receiving more activations than the threshold) due to the memory mapping, which places spatially proximate lines in the same row to maximize row-buffer hitrate. Unfortunately, this causes row to receive activations for many frequently used lines. We propose Rubix, which breaks the spatial correlation in the line-to-row mapping by using an encrypted address to access the memory, reducing the likelihood of hot rows by 2 to 3 orders of magnitude. To aid row-buffer hits, Rubix randomizes a group of 1-4 lines. We also propose Rubix-D, which dynamically changes the line-to-row mapping. Rubix-D minimizes hot-rows and makes it much harder for an adversary to learn the spatial neighbourhood of a row. Rubix reduces the slowdown of AQUA (from 15% to 1%), SRS (from 60% to 2%), and Blockhammer (from 600% to 3%) while incurring a storage of less than 1 Kilobyte.
The Rowhammer vulnerability continues to get worse, with the Rowhammer Threshold (TRH) reducing from 139K activations to 4.8K activations over the last decade. Typical Rowhammer mitigations rely on tracking aggressor rows. The number of possible aggressors increases with lowering thresholds, making it difficult to reliably track such rows in a storage-efficient manner. At lower thresholds, academic trackers such as Graphene require prohibitive SRAM overheads (hundreds of KBs to MB). Recent in-DRAM trackers from industry, such as DSAC-TRR, perform approximate tracking, sacrificing guaranteed protection for reduced storage overheads, leaving DRAM vulnerable to Rowhammer attacks. Ideally, we seek a scalable tracker that tracks securely and precisely, and incurs negligible dedicated SRAM and performance overheads, while still being able to track arbitrarily low thresholds. To that end, we propose START - a Scalable Tracker for Any Rowhammer Threshold. Rather than relying on dedicated SRAM structures, START dynamically repurposes a small fraction the Last-Level Cache (LLC) to store tracking metadata. START is based on the observation that while the memory contains millions of rows, typical workloads touch only a small subset of rows within a refresh period of 64ms, so allocating tracking entries on demand significantly reduces storage. If the application does not access many rows in memory, START does not reserve any LLC capacity. Otherwise, START dynamically uses 1-way, 2-way, or 8-way of the cache set based on demand. START consumes, on average, 9.4% of the LLC capacity to store metadata, which is 5X lower compared to dedicating a counter in LLC for each row in memory. We also propose START-M, a memory-mapped START for large-memory systems. Our designs require only 4KB SRAM for newly added structures and perform within 1% of idealized tracking even at TRH of less than 100.
Existing knowledge graph (KG) embedding models have primarily focused on static KGs. However, real-world KGs do not remain static, but rather evolve and grow in tandem with the development of KG applications. Consequently, new facts and previously unseen entities and relations continually emerge, necessitating an embedding model that can quickly learn and transfer new knowledge through growth. Motivated by this, we delve into an expanding field of KG embedding in this paper, i.e., lifelong KG embedding. We consider knowledge transfer and retention of the learning on growing snapshots of a KG without having to learn embeddings from scratch. The proposed model includes a masked KG autoencoder for embedding learning and update, with an embedding transfer strategy to inject the learned knowledge into the new entity and relation embeddings, and an embedding regularization method to avoid catastrophic forgetting. To investigate the impacts of different aspects of KG growth, we construct four datasets to evaluate the performance of lifelong KG embedding. Experimental results show that the proposed model outperforms the state-of-the-art inductive and lifelong embedding baselines.
Data in Knowledge Graphs often represents part of the current state of the real world. Thus, to stay up-to-date the graph data needs to be updated frequently. To utilize information from Knowledge Graphs, many state-of-the-art machine learning approaches use embedding techniques. These techniques typically compute an embedding, i.e., vector representations of the nodes as input for the main machine learning algorithm. If a graph update occurs later on -- specifically when nodes are added or removed -- the training has to be done all over again. This is undesirable, because of the time it takes and also because downstream models which were trained with these embeddings have to be retrained if they change significantly. In this paper, we investigate embedding updates that do not require full retraining and evaluate them in combination with various embedding models on real dynamic Knowledge Graphs covering multiple use cases. We study approaches that place newly appearing nodes optimally according to local information, but notice that this does not work well. However, we find that if we continue the training of the old embedding, interleaved with epochs during which we only optimize for the added and removed parts, we obtain good results in terms of typical metrics used in link prediction. This performance is obtained much faster than with a complete retraining and hence makes it possible to maintain embeddings for dynamic Knowledge Graphs.
Object detectors usually achieve promising results with the supervision of complete instance annotations. However, their performance is far from satisfactory with sparse instance annotations. Most existing methods for sparsely annotated object detection either re-weight the loss of hard negative samples or convert the unlabeled instances into ignored regions to reduce the interference of false negatives. We argue that these strategies are insufficient since they can at most alleviate the negative effect caused by missing annotations. In this paper, we propose a simple but effective mechanism, called Co-mining, for sparsely annotated object detection. In our Co-mining, two branches of a Siamese network predict the pseudo-label sets for each other. To enhance multi-view learning and better mine unlabeled instances, the original image and corresponding augmented image are used as the inputs of two branches of the Siamese network, respectively. Co-mining can serve as a general training mechanism applied to most of modern object detectors. Experiments are performed on MS COCO dataset with three different sparsely annotated settings using two typical frameworks: anchor-based detector RetinaNet and anchor-free detector FCOS. Experimental results show that our Co-mining with RetinaNet achieves 1.4%~2.1% improvements compared with different baselines and surpasses existing methods under the same sparsely annotated setting.
Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation and memory intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically in regards to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.
With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose occupancy networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.
Multi-relation Question Answering is a challenging task, due to the requirement of elaborated analysis on questions and reasoning over multiple fact triples in knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis, thereby allowing manual manipulation in predicting the final answer.