Edge devices have typically been used for DNN inferencing. The increase in the compute power of accelerated edges is leading to their use in DNN training also. As privacy becomes a concern on multi-tenant edge devices, Docker containers provide a lightweight virtualization mechanism to sandbox models. But their overheads for edge devices are not yet explored. In this work, we study the impact of containerized DNN inference and training workloads on an NVIDIA AGX Orin edge device and contrast it against bare metal execution on running time, CPU, GPU and memory utilization, and energy consumption. Our analysis provides several interesting insights on these overheads.
Recent developments enable the quantification of causal control given a structural causal model (SCM). This has been accomplished by introducing quantities which encode changes in the entropy of one variable when intervening on another. These measures, named causal entropy and causal information gain, aim to address limitations in existing information theoretical approaches for machine learning tasks where causality plays a crucial role. They have not yet been properly mathematically studied. Our research contributes to the formal understanding of the notions of causal entropy and causal information gain by establishing and analyzing fundamental properties of these concepts, including bounds and chain rules. Furthermore, we elucidate the relationship between causal entropy and stochastic interventions. We also propose definitions for causal conditional entropy and causal conditional information gain. Overall, this exploration paves the way for enhancing causal machine learning tasks through the study of recently-proposed information theoretic quantities grounded in considerations about causality.
The alignment of language models with human preferences is vital for their application in real-world tasks. The problem is formulated as optimizing the model's policy to maximize the expected reward that reflects human preferences with minimal deviation from the initial policy. While considered as a straightforward solution, reinforcement learning (RL) suffers from high variance in policy updates, which impedes efficient policy improvement. Recently, direct preference optimization (DPO) was proposed to directly optimize the policy from preference data. Though simple to implement, DPO is derived based on the optimal policy that is not assured to be achieved in practice, which undermines its convergence to the intended solution. In this paper, we propose efficient exact optimization (EXO) of the alignment objective. We prove that EXO is guaranteed to optimize in the same direction as the RL algorithms asymptotically for arbitary parametrization of the policy, while enables efficient optimization by circumventing the complexities associated with RL algorithms. We compare our method to DPO with both theoretical and empirical analyses, and further demonstrate the advantages of our method over existing approaches on realistic human preference data.
User expectations impact the evaluation of new interactive systems. Elevated expectations may enhance the perceived effectiveness of interfaces in user studies, similar to a placebo effect observed in medical studies. To showcase the placebo effect, we executed a user study with 18 participants who conducted a reaction time test with two different computer screen refresh rates. Participants saw a stated screen refresh rate before every condition, which corresponded to the true refresh rate only in half of the conditions and was lower or higher in the other half. Results revealed successful priming, as participants believed in superior or inferior performance based on the narrative despite using the opposite refresh rate. Post-experiment questionnaires confirmed participants still held onto the initial narrative. Interestingly, the objective performance remained unchanged between both refresh rates. We discuss how study narratives can influence subjective measures and suggest strategies to mitigate placebo effects in user-centered study designs.
Due to recent development in quantum computing, the invention of a large quantum computer is no longer a distant future. Quantum computing severely threatens modern cryptography, as the hard mathematical problems beneath classic public-key cryptosystems can be solved easily by a sufficiently large quantum computer. As such, researchers have proposed PQC based on problems that even quantum computers cannot efficiently solve. Generally, post-quantum encryption and signatures can be hard to compute. This could potentially be a problem for IoT, which usually consist lightweight devices with limited computational power. In this paper, we survey existing literature on the performance for PQC in resource-constrained devices to understand the severeness of this problem. We also review recent proposals to optimize PQC algorithms for resource-constrained devices. Overall, we find that whilst PQC may be feasible for reasonably lightweight IoT, proposals for their optimization seem to lack standardization. As such, we suggest future research to seek coordination, in order to ensure an efficient and safe migration toward IoT for the post-quantum era.
Primary motivation for this work was the need to implement hardware accelerators for a newly proposed ANN structure called Auto Resonance Network (ARN) for robotic motion planning. ARN is an approximating feed-forward hierarchical and explainable network. It can be used in various AI applications but the application base was small. Therefore, the objective of the research was twofold: to develop a new application using ARN and to implement a hardware accelerator for ARN. As per the suggestions given by the Doctoral Committee, an image recognition system using ARN has been implemented. An accuracy of around 94% was achieved with only 2 layers of ARN. The network also required a small training data set of about 500 images. Publicly available MNIST dataset was used for this experiment. All the coding was done in Python. Massive parallelism seen in ANNs presents several challenges to CPU design. For a given functionality, e.g., multiplication, several copies of serial modules can be realized within the same area as a parallel module. Advantage of using serial modules compared to parallel modules under area constraints has been discussed. One of the module often useful in ANNs is a multi-operand addition. One problem in its implementation is that the estimation of carry bits when the number of operands changes. A theorem to calculate exact number of carry bits required for a multi-operand addition has been presented in the thesis which alleviates this problem. The main advantage of the modular approach to multi-operand addition is the possibility of pipelined addition with low reconfiguration overhead. This results in overall increase in throughput for large number of additions, typically seen in several DNN configurations.
As artificial intelligence (AI) models continue to scale up, they are becoming more capable and integrated into various forms of decision-making systems. For models involved in moral decision-making, also known as artificial moral agents (AMA), interpretability provides a way to trust and understand the agent's internal reasoning mechanisms for effective use and error correction. In this paper, we provide an overview of this rapidly-evolving sub-field of AI interpretability, introduce the concept of the Minimum Level of Interpretability (MLI) and recommend an MLI for various types of agents, to aid their safe deployment in real-world settings.
Graph neural networks (GNNs) have been demonstrated to be a powerful algorithmic model in broad application fields for their effectiveness in learning over graphs. To scale GNN training up for large-scale and ever-growing graphs, the most promising solution is distributed training which distributes the workload of training across multiple computing nodes. However, the workflows, computational patterns, communication patterns, and optimization techniques of distributed GNN training remain preliminarily understood. In this paper, we provide a comprehensive survey of distributed GNN training by investigating various optimization techniques used in distributed GNN training. First, distributed GNN training is classified into several categories according to their workflows. In addition, their computational patterns and communication patterns, as well as the optimization techniques proposed by recent work are introduced. Second, the software frameworks and hardware platforms of distributed GNN training are also introduced for a deeper understanding. Third, distributed GNN training is compared with distributed training of deep neural networks, emphasizing the uniqueness of distributed GNN training. Finally, interesting issues and opportunities in this field are discussed.
Deep neural networks have revolutionized many machine learning tasks in power systems, ranging from pattern recognition to signal processing. The data in these tasks is typically represented in Euclidean domains. Nevertheless, there is an increasing number of applications in power systems, where data are collected from non-Euclidean domains and represented as the graph-structured data with high dimensional features and interdependency among nodes. The complexity of graph-structured data has brought significant challenges to the existing deep neural networks defined in Euclidean domains. Recently, many studies on extending deep neural networks for graph-structured data in power systems have emerged. In this paper, a comprehensive overview of graph neural networks (GNNs) in power systems is proposed. Specifically, several classical paradigms of GNNs structures (e.g., graph convolutional networks, graph recurrent neural networks, graph attention networks, graph generative networks, spatial-temporal graph convolutional networks, and hybrid forms of GNNs) are summarized, and key applications in power systems such as fault diagnosis, power prediction, power flow calculation, and data generation are reviewed in detail. Furthermore, main issues and some research trends about the applications of GNNs in power systems are discussed.
Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation and memory intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically in regards to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.
Dialogue systems have attracted more and more attention. Recent advances on dialogue systems are overwhelmingly contributed by deep learning techniques, which have been employed to enhance a wide range of big data applications such as computer vision, natural language processing, and recommender systems. For dialogue systems, deep learning can leverage a massive amount of data to learn meaningful feature representations and response generation strategies, while requiring a minimum amount of hand-crafting. In this article, we give an overview to these recent advances on dialogue systems from various perspectives and discuss some possible research directions. In particular, we generally divide existing dialogue systems into task-oriented and non-task-oriented models, then detail how deep learning techniques help them with representative algorithms and finally discuss some appealing research directions that can bring the dialogue system research into a new frontier.