Deep policy networks enable robots to learn behaviors for solving a variety of complex real-world tasks in an end-to-end fashion. However, they lack the transparency to provide reasons for their actions, and such black-box models often result in low reliability and disruptive actions when the robot is deployed in practice. To enhance transparency, it is important to explain robot behaviors in terms of the extent to which each input feature contributes to determining a given action. In this paper, we present an explicit analysis of deep policy models through input attribution methods to explain how, and to what extent, each input feature affects the decisions of robot policy models. To this end, we present two techniques for applying input attribution methods to robot policy networks: (1) we measure the importance factor of each joint torque to reflect the influence of the motor torque on the end-effector movement, and (2) we modify a relevance propagation method so that negative inputs and outputs in deep policy networks are handled properly. To the best of our knowledge, this is the first report to identify the dynamic changes of input attributions of multi-modal sensor inputs in deep policy networks online for robotic manipulation.
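As a concrete, simplified illustration of the kind of input attribution discussed above, the sketch below computes gradient-times-input attributions for a toy MLP policy. The architecture, weights, and feature layout are invented for illustration; this is a generic attribution baseline, not the paper's modified relevance-propagation method or its joint-torque importance factor.

```python
import numpy as np

# Minimal sketch of gradient-times-input attribution on a toy MLP policy.
# The network, weights, and sensor layout are illustrative placeholders.

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 6, 16, 3              # e.g. torques + other sensors -> actions
W1 = rng.normal(0, 0.5, (n_hid, n_in))
W2 = rng.normal(0, 0.5, (n_out, n_hid))

def policy(x):
    h = np.tanh(W1 @ x)                    # hidden activations
    return W2 @ h, h                       # action scores and cached activations

def attribution(x, action_idx):
    """Gradient x input for one action output (a common attribution baseline)."""
    _, h = policy(x)
    # d(out[a])/dh = W2[a]; d(tanh)/d(pre-activation) = 1 - h^2
    grad_x = W1.T @ (W2[action_idx] * (1.0 - h ** 2))
    return grad_x * x                      # per-feature contribution to the action

x = rng.normal(size=n_in)                  # one multi-modal sensor reading
scores, _ = policy(x)
a = int(np.argmax(scores))
print("chosen action:", a)
print("per-input attribution:", np.round(attribution(x, a), 3))
```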
The purpose of this article is to investigate the viability of Multi-Carrier Modulation (MCM) systems based on the Fast Walsh Hadamard Transform (FWHT). In addition, a nonlinear Joint Low-Complexity Optimized Zero Forcing Successive Interference Cancellation (JLCOZF-SIC) equalizer is proposed. To that end, general expressions for the number of floating-point operations (flops) of the proposed equalizer and of various other equalizers are given. This article discusses Banded Matrix Approximation (BMA) as a technique for reducing complexity. The proposed equalizer uses BMA to accomplish both equalization and co-Carrier Frequency Offset (co-CFO) correction. Three cases involving the proposed equalizer are investigated: the first uses diagonal compensation, the second uses BMA compensation, and the third uses complete matrix compensation. In the presence of frequency offset, noise, and frequency-selective Rayleigh fading, analysis and simulation results show that the OFDM-FWHT system with the proposed equalizer outperforms the conventional OFDM system with various linear and nonlinear equalizers.
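For readers unfamiliar with the transform underpinning the system, the following minimal sketch computes the unnormalized FWHT with O(N log N) butterfly operations instead of an O(N^2) matrix multiply. The input vector is arbitrary, and the sketch covers only the transform itself, not the proposed JLCOZF-SIC equalizer or BMA compensation.

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard Transform; len(x) must be a power of two."""
    x = np.asarray(x, dtype=float).copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):      # butterfly over each block
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x

data = np.array([1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0])
spectrum = fwht(data)
# The unnormalized transform is its own inverse up to a factor of N.
recovered = fwht(spectrum) / len(data)
assert np.allclose(recovered, data)
print(spectrum)
```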
Multivariate time series data are ubiquitous in applications of machine learning to problems in the physical sciences. Chemiresistive sensor arrays are highly promising for chemical detection tasks relevant to industrial, safety, and military applications. Sensor arrays are an inherently multivariate time series data collection tool, and their applications demand rapid and accurate classification of arbitrary chemical analytes. Previous research has benchmarked data-agnostic multivariate time series classifiers across diverse supervised tasks in order to find general-purpose classification algorithms. To our knowledge, there has yet to be an effort to survey machine learning and time series classification approaches to chemiresistive hardware sensor arrays for the detection of chemical analytes. In addition to benchmarking existing approaches to multivariate time series classification, we incorporate findings from a model survey to propose ChemTime, a novel approach to sensor array classification for chemical sensing. We design experiments addressing the unique challenges of hardware sensor array classification, including rapid classification and the minimization of inference time, while maintaining performance for deployed lightweight hardware sensing devices. We find that ChemTime is uniquely positioned for the chemical sensing task, combining rapid and early classification of time series with favorable inference speed and high accuracy.
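To make the idea of early classification concrete, here is a hedged sketch of a thresholded early-decision rule: a classifier scores growing prefixes of a sensor trace and commits to a label once its confidence clears a threshold. The nearest-centroid classifier, threshold, and synthetic traces are placeholders, not the ChemTime model.

```python
import numpy as np

# Toy early classification: decide as soon as the prefix is convincing enough.

def predict_proba(prefix, centroids):
    """Softmax over negative distances to per-class mean traces (toy classifier)."""
    d = np.array([np.linalg.norm(prefix - c[: len(prefix)]) for c in centroids])
    p = np.exp(-d)
    return p / p.sum()

def classify_early(stream, centroids, threshold=0.9, min_len=5):
    for t in range(min_len, len(stream) + 1):
        p = predict_proba(stream[:t], centroids)
        if p.max() >= threshold:
            return int(p.argmax()), t              # label and decision time
    p = predict_proba(stream, centroids)           # fall back to the full series
    return int(p.argmax()), len(stream)

rng = np.random.default_rng(1)
centroids = [np.sin(np.linspace(0, 4, 50)), np.cos(np.linspace(0, 4, 50))]
stream = centroids[1] + rng.normal(0, 0.05, 50)    # noisy "sensor response"
label, t = classify_early(stream, centroids)
print(f"class {label} decided after {t} of {len(stream)} samples")
```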
Today, the large number of players and the high computational requirements of video games have motivated research on Green Video Games. We present a survey providing an overview of this recent research area. A total of 2,637 papers were reviewed, of which 69 were selected as primary studies for further analysis. Through a detailed analysis of the results, we propose a new way of characterizing Green Video Game issues based on the motivation, device, and layer addressed by the primary studies. We then analyze the applied techniques, the limitations and levels of evidence, and aspects specific to video games.
Visual Place Recognition (VPR) is a critical task for performing global re-localization in visual perception systems. It requires the ability to accurately recognize a previously visited location under variations in illumination, occlusion, appearance, and viewpoint. In the case of robotic systems and augmented reality, the target devices for deployment are battery-powered edge devices. Therefore, whilst the accuracy of VPR methods is important, so too are memory consumption and latency. Recent works have focused on the recall@1 metric as a performance measure, with limited focus on resource utilization. This has resulted in methods that use deep learning models too large to deploy on low-powered edge devices. We hypothesize that these large models are highly over-parameterized and can be optimized to satisfy the constraints of a low-powered embedded system whilst maintaining high recall performance. Our work studies the impact of compact convolutional network architecture design, in combination with full-precision and mixed-precision post-training quantization, on VPR performance. Importantly, we measure performance not only via the recall@1 score but also via memory consumption and latency. We characterize the design implications on memory, latency, and recall scores, and provide a number of design recommendations for VPR systems under these resource limitations.
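As background for the quantization experiments, the sketch below applies post-training affine int8 quantization to a single weight tensor, showing the 4x memory reduction and the bounded rounding error. It is a minimal sketch: production pipelines (and presumably the mixed-precision setups studied here) use per-channel scales and calibration data, and the tensor shown is synthetic.

```python
import numpy as np

# One-tensor post-training quantization: map float32 weights to uint8 with an
# affine (scale, zero-point) codebook, then measure the round-trip error.

def quantize_int8(w):
    scale = (w.max() - w.min()) / 255.0
    zero_point = np.round(-w.min() / scale)
    q = np.clip(np.round(w / scale + zero_point), 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.1, (256, 256)).astype(np.float32)   # a stand-in conv/fc weight
q, s, z = quantize_int8(w)
err = np.abs(dequantize(q, s, z) - w).max()
print(f"memory: {w.nbytes} B -> {q.nbytes} B, max abs error {err:.5f}")
```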
Blockchain applications are witnessing rapid evolution, necessitating the integration of upgradeable smart contracts. Software patterns have been proposed to summarize upgradeable smart contract best practices. However, research is missing on the comparison of these upgradeable smart contract patterns, especially regarding gas costs related to deployment and execution. This study aims to provide an in-depth analysis of the gas costs associated with two prevalent upgradeable smart contract patterns: the proxy and diamond patterns. The proxy pattern uses a proxy pointing to a single logic contract, while the diamond pattern enables a proxy to point to multiple logic contracts. We conduct a comparative analysis of gas costs for both patterns in contrast to a traditional non-upgradeable smart contract. From this analysis we derive a theoretical contribution in the form of two consolidated blockchain patterns and a corresponding decision model. In doing so, we hope to contribute to the broader understanding of upgradeable smart contract patterns.
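To make the upgrade mechanics concrete without Solidity, the following Python analogy mimics the proxy pattern: storage lives in the proxy, calls are forwarded to a swappable logic object, and an upgrade simply repoints the proxy, much like changing the implementation slot behind delegatecall. All names are illustrative and this is not deployable contract code; a diamond-style variant would instead map each function selector to a different logic object.

```python
# Python analogy of the proxy upgrade pattern: state stays in the proxy,
# behavior is delegated to whichever logic object the proxy currently points to.

class LogicV1:
    def increment(self, state):
        state["count"] = state.get("count", 0) + 1

class LogicV2:                        # upgraded logic: bigger step size
    def increment(self, state):
        state["count"] = state.get("count", 0) + 2

class Proxy:
    def __init__(self, logic):
        self.state = {}               # storage survives upgrades
        self.logic = logic

    def upgrade(self, new_logic):     # analogous to rewriting the implementation slot
        self.logic = new_logic

    def __getattr__(self, name):      # forward unknown calls with proxy storage
        method = getattr(self.logic, name)
        return lambda *a, **kw: method(self.state, *a, **kw)

p = Proxy(LogicV1())
p.increment()
p.upgrade(LogicV2())                  # state is preserved across the upgrade
p.increment()
print(p.state)                        # {'count': 3}
```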
Recent advances in state-of-the-art DNN architecture design have been moving toward Transformer models. These models achieve superior accuracy across a wide range of applications. This trend has been consistent over the past several years since Transformer models were originally introduced. However, the amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate, and this has made their deployment in latency-sensitive applications challenging. As such, there has been an increased focus on making Transformer models more efficient, with methods that range from changing the architecture design, all the way to developing dedicated domain-specific accelerators. In this work, we survey different approaches for efficient Transformer inference, including: (i) analysis and profiling of the bottlenecks in existing Transformer architectures and their similarities and differences with previous convolutional models; (ii) implications of Transformer architecture on hardware, including the impact of non-linear operations such as Layer Normalization, Softmax, and GELU, as well as linear operations, on hardware design; (iii) approaches for optimizing a fixed Transformer architecture; (iv) challenges in finding the right mapping and scheduling of operations for Transformer models; and (v) approaches for optimizing Transformer models by adapting the architecture using neural architecture search. Finally, we perform a case study by applying the surveyed optimizations on Gemmini, the open-source, full-stack DNN accelerator generator, and we show how each of these approaches can yield improvements, compared to previous benchmark results on Gemmini. Among other things, we find that a full-stack co-design approach with the aforementioned methods can result in up to 88.7x speedup with a minimal performance degradation for Transformer inference.
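To ground point (ii), the sketch below implements the three non-linear operations in question inside one toy attention block. Shapes and weights are arbitrary; the intent is only to show where LayerNorm, Softmax, and GELU sit relative to the matrix multiplies that dominate the FLOP count, since their row-wise reductions and transcendental functions are what complicate fixed-point accelerator datapaths.

```python
import numpy as np

# Toy single-head attention block highlighting the non-linear ops from (ii).

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)           # per-token reduction

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))       # max subtraction for stability
    return e / e.sum(-1, keepdims=True)            # row-wise reduction

def gelu(x):                                       # tanh approximation of GELU
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x ** 3)))

T, d = 8, 16                                       # tokens, model width
rng = np.random.default_rng(0)
x = rng.normal(size=(T, d))
Wq, Wk, Wv = (rng.normal(0, d ** -0.5, (d, d)) for _ in range(3))

h = layer_norm(x)
q, k, v = h @ Wq, h @ Wk, h @ Wv                   # linear ops dominate the FLOPs
attn = softmax(q @ k.T / np.sqrt(d)) @ v           # softmax sits on the critical path
out = gelu(attn)
print(out.shape)
```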
Recent years have seen important advances in the quality of state-of-the-art models, but this has come at the expense of models becoming less interpretable. This survey presents an overview of the current state of Explainable AI (XAI), considered within the domain of Natural Language Processing (NLP). We discuss the main categorization of explanations, as well as the various ways explanations can be arrived at and visualized. We detail the operations and explainability techniques currently available for generating explanations for NLP model predictions, to serve as a resource for model developers in the community. Finally, we point out the current gaps and encourage directions for future work in this important research area.
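As a taste of the techniques such a survey covers, here is a minimal sketch of occlusion-based explanation, one of the simplest attribution methods for NLP predictions: delete each token and record how the model's score moves. The toy lexicon classifier is a stand-in for any model exposing a scoring function.

```python
# Occlusion explanations for a toy sentiment "model": a token's contribution is
# the score drop caused by removing it. The lexicon weights are made up.

weights = {"great": 2.0, "good": 1.0, "boring": -2.0, "not": -0.5}

def score(tokens):
    """Positive-sentiment score of a token list (toy model)."""
    return sum(weights.get(t, 0.0) for t in tokens)

def occlusion_explanation(tokens):
    base = score(tokens)
    return [(t, base - score(tokens[:i] + tokens[i + 1:]))
            for i, t in enumerate(tokens)]

sentence = "the plot was boring but the acting was great".split()
for token, contrib in occlusion_explanation(sentence):
    print(f"{token:>8}: {contrib:+.1f}")    # "great" +2.0, "boring" -2.0, rest 0.0
```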
Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy-, computation-, and memory-intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirements, energy consumption, and number of operations without significantly decreasing accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically with regard to inference, and discusses methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.
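As an example of category (1), the sketch below performs one-shot magnitude pruning on a synthetic weight tensor: the smallest-magnitude weights are zeroed and only the survivors need to be stored. Real methods typically prune iteratively with fine-tuning between rounds, and the sparsity level here is arbitrary.

```python
import numpy as np

# One-shot magnitude pruning: zero the smallest |weights|, keep a sparse remainder.

def prune_by_magnitude(w, sparsity=0.9):
    """Zero the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(w.size * sparsity)
    threshold = np.partition(np.abs(w).ravel(), k)[k]   # k-th smallest magnitude
    mask = np.abs(w) >= threshold
    return w * mask, mask

rng = np.random.default_rng(0)
w = rng.normal(0, 0.1, (512, 512)).astype(np.float32)
pruned, mask = prune_by_magnitude(w, sparsity=0.9)
print(f"kept {mask.mean():.1%} of weights; "
      f"dense {w.nbytes} B vs ~{int(mask.sum()) * 8} B as (index, value) pairs")
```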
Convolutional networks (ConvNets) have achieved great success in various challenging vision tasks. However, the performance of ConvNets degrades when encountering domain shift. Domain adaptation is especially significant yet challenging in the field of biomedical image analysis, where cross-modality data have largely different distributions. Given that annotating medical data is especially expensive, supervised transfer learning approaches are not optimal. In this paper, we propose an unsupervised domain adaptation framework with adversarial learning for cross-modality biomedical image segmentation. Specifically, our model is based on a dilated fully convolutional network for pixel-wise prediction. Moreover, we build a plug-and-play domain adaptation module (DAM) that maps the target input to features aligned with the source-domain feature space. A domain critic module (DCM) is set up to discriminate between the feature spaces of the two domains. We optimize the DAM and DCM via an adversarial loss without using any target-domain labels. Our proposed method is validated by adapting a ConvNet trained on MRI images to unpaired CT data for cardiac structure segmentation, achieving very promising results.
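The adversarial objective can be summarized in a few lines. The hedged PyTorch sketch below alternates between training a critic to separate source features from adapted target features and training the adaptation module to fool it, using no target labels; the tiny MLPs, random tensors, and optimizer settings are placeholders for the paper's dilated FCN, DAM, and DCM.

```python
import torch
import torch.nn as nn

# Adversarial feature alignment: the critic (DCM stand-in) separates domains,
# the adaptation module (DAM stand-in) learns to make them indistinguishable.

feat_dim = 64
dam = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, feat_dim))
critic = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))
opt_dam = torch.optim.Adam(dam.parameters(), lr=1e-4)
opt_critic = torch.optim.Adam(critic.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(100):
    src_feat = torch.randn(16, feat_dim)   # placeholder source-domain features
    tgt_in = torch.randn(16, 32)           # placeholder unlabeled target inputs

    # 1) Critic step: label source features 1, adapted target features 0.
    loss_critic = bce(critic(src_feat), torch.ones(16, 1)) + \
                  bce(critic(dam(tgt_in).detach()), torch.zeros(16, 1))
    opt_critic.zero_grad()
    loss_critic.backward()
    opt_critic.step()

    # 2) Adaptation step: make target features look like source to the critic.
    loss_dam = bce(critic(dam(tgt_in)), torch.ones(16, 1))
    opt_dam.zero_grad()
    loss_dam.backward()
    opt_dam.step()
```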
Visual Question Answering (VQA) models have so far struggled with counting objects in natural images. We identify a fundamental problem caused by the use of soft attention in these models. To circumvent this problem, we propose a neural network component that allows robust counting from object proposals. Experiments on a toy task show the effectiveness of this component, and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced-pair metric, the component improves counting over a strong baseline by 6.6%.
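The soft-attention failure mode is easy to demonstrate: because softmax weights sum to one, the attention-weighted average over N identical object proposals is the same vector for every N, so the count is erased before the classifier ever sees it. The sketch below shows this with made-up features; it illustrates the problem only, not the proposed counting component, which instead reasons over the proposals themselves.

```python
import numpy as np

# Why soft attention cannot count: averaging normalizes the count away.

def soft_attention(features, query):
    scores = features @ query
    w = np.exp(scores - scores.max())
    w /= w.sum()                        # weights sum to 1 regardless of N
    return w @ features                 # weighted average of proposal features

cat = np.array([1.0, 0.0, 2.0])         # feature vector of one "cat" proposal
query = np.array([0.5, 0.5, 0.5])

for n in (1, 2, 5):
    proposals = np.tile(cat, (n, 1))    # n identical detected cats
    print(n, "cats ->", soft_attention(proposals, query))
# All outputs are identical: a downstream classifier cannot recover the count.
```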