Voltage Overscaling (VOS) is one of the well-known techniques to increase the energy efficiency of arithmetic units. Also, it can provide significant lifetime improvements, while still meeting the accuracy requirements of inherently error-resilient applications. This paper proposes a generic accuracy-configurable multiplier that employs the VOS at a coarse-grained level (block-level) to reduce the control logic required for applying VOS and its associated overheads, thus enabling a high degree of trade-off between energy consumption and output quality. The proposed configurable Block-Level VOS-based (BL-VOS) multiplier relies on employing VOS in a multiplier composed of smaller blocks, where applying VOS in different blocks results in structures with various output accuracy levels. To evaluate the proposed concept, we implement 8-bit and 16-bit BL-VOS multipliers with various blocks width in a 15-nm FinFET technology. The results show that the proposed multiplier achieves up to 15% lower energy consumption and up to 21% higher output accuracy compared to the state-of-the-art VOS-based multipliers. Also, the effects of Process Variation (PV) and Bias Temperature Instability (BTI) induced delay on the proposed multiplier are investigated. Finally, the effectiveness of the proposed multiplier is studied for two different image processing applications, in terms of quality and energy efficiency.
Visual perception is an important component for autonomous navigation of unmanned surface vessels (USV), particularly for the tasks related to autonomous inspection and tracking. These tasks involve vision-based navigation techniques to identify the target for navigation. Reduced visibility under extreme weather conditions in marine environments makes it difficult for vision-based approaches to work properly. To overcome these issues, this paper presents an autonomous vision-based navigation framework for tracking target objects in extreme marine conditions. The proposed framework consists of an integrated perception pipeline that uses a generative adversarial network (GAN) to remove noise and highlight the object features before passing them to the object detector (i.e., YOLOv5). The detected visual features are then used by the USV to track the target. The proposed framework has been thoroughly tested in simulation under extremely reduced visibility due to sandstorms and fog. The results are compared with state-of-the-art de-hazing methods across the benchmarked MBZIRC simulation dataset, on which the proposed scheme has outperformed the existing methods across various metrics.
Spiking Neural Networks (SNNs) are promising energy-efficient models for neuromorphic computing. For training the non-differentiable SNN models, the backpropagation through time (BPTT) with surrogate gradients (SG) method has achieved high performance. However, this method suffers from considerable memory cost and training time during training. In this paper, we propose the Spatial Learning Through Time (SLTT) method that can achieve high performance while greatly improving training efficiency compared with BPTT. First, we show that the backpropagation of SNNs through the temporal domain contributes just a little to the final calculated gradients. Thus, we propose to ignore the unimportant routes in the computational graph during backpropagation. The proposed method reduces the number of scalar multiplications and achieves a small memory occupation that is independent of the total time steps. Furthermore, we propose a variant of SLTT, called SLTT-K, that allows backpropagation only at K time steps, then the required number of scalar multiplications is further reduced and is independent of the total time steps. Experiments on both static and neuromorphic datasets demonstrate superior training efficiency and performance of our SLTT. In particular, our method achieves state-of-the-art accuracy on ImageNet, while the memory cost and training time are reduced by more than 70% and 50%, respectively, compared with BPTT.
[18F]-Fluorodeoxyglucose (FDG) positron emission tomography - computed tomography (PET-CT) has become the imaging modality of choice for diagnosing many cancers. Co-learning complementary PET-CT imaging features is a fundamental requirement for automatic tumor segmentation and for developing computer aided cancer diagnosis systems. In this study, we propose a hyper-connected transformer (HCT) network that integrates a transformer network (TN) with a hyper connected fusion for multi-modality PET-CT images. The TN was leveraged for its ability to provide global dependencies in image feature learning, which was achieved by using image patch embeddings with a self-attention mechanism to capture image-wide contextual information. We extended the single-modality definition of TN with multiple TN based branches to separately extract image features. We also introduced a hyper connected fusion to fuse the contextual and complementary image features across multiple transformers in an iterative manner. Our results with two clinical datasets show that HCT achieved better performance in segmentation accuracy when compared to the existing methods.
Scene Text Editing (STE) is a challenging research problem, and it aims to modify existing texts in an image while preserving the background and the font style of the original text of the image. Due to its various real-life applications, researchers have explored several approaches toward STE in recent years. However, most of the existing STE methods show inferior editing performance because of (1) complex image backgrounds, (2) various font styles, and (3) varying word lengths within the text. To address such inferior editing performance issues, in this paper, we propose a novel font-agnostic scene text editing framework, named FAST, for simultaneously generating text in arbitrary styles and locations while preserving a natural and realistic appearance through combined mask generation and style transfer. The proposed approach differs from the existing methods as they directly modify all image pixels. Instead, the proposed method has introduced a filtering mechanism to remove background distractions, allowing the network to focus solely on the text regions where editing is required. Additionally, a text-style transfer module has been designed to mitigate the challenges posed by varying word lengths. Extensive experiments and ablations have been conducted, and the results demonstrate that the proposed method outperforms the existing methods both qualitatively and quantitatively.
Matching has been widely used to mimic a randomized experiment with observational data. Ideally, treated subjects are exactly matched with controls for the covariates, and randomization-based estimation can then be conducted as in a randomized experiment (assuming no unobserved covariates). However, when there exists continuous covariates or many covariates, matching typically should be inexact. Previous studies have routinely ignored inexact matching in the downstream randomization-based estimation as long as some covariate balance criteria are satisfied, which can cause severe estimation bias. Built on the covariate-adaptive randomization inference framework, in this research note, we propose two new classes of bias-corrected randomization-based estimators to reduce estimation bias due to inexact matching: the bias-corrected maximum $p$-value estimator for the constant treatment effect and the bias-corrected difference-in-means estimator for the average treatment effect. Our simulation results show that the proposed bias-corrected estimators can effectively reduce estimation bias due to inexact matching.
Wireless localization is essential for tracking objects in indoor environments. Internet of Things (IoT) enables localization through its diverse wireless communication protocols. In this paper, a hybrid section-based indoor localization method using a developed Radio Frequency Identification (RFID) tracking device and multiple IoT wireless technologies is proposed. In order to reduce the cost of the RFID tags, the tags are installed only on the borders of each section. The RFID tracking device identifies the section, and the proposed wireless hybrid method finds the location of the object inside the section. The proposed hybrid method is analytically driven by linear location estimates obtained from different IoT wireless technologies. The experimental results using developed RFID tracking device and RSSI-based localization for Bluetooth, WiFi and ZigBee technologies verifies the analytical results.
Recently, a considerable literature has grown up around the theme of Graph Convolutional Network (GCN). How to effectively leverage the rich structural information in complex graphs, such as knowledge graphs with heterogeneous types of entities and relations, is a primary open challenge in the field. Most GCN methods are either restricted to graphs with a homogeneous type of edges (e.g., citation links only), or focusing on representation learning for nodes only instead of jointly propagating and updating the embeddings of both nodes and edges for target-driven objectives. This paper addresses these limitations by proposing a novel framework, namely the Knowledge Embedding based Graph Convolutional Network (KE-GCN), which combines the power of GCNs in graph-based belief propagation and the strengths of advanced knowledge embedding (a.k.a. knowledge graph embedding) methods, and goes beyond. Our theoretical analysis shows that KE-GCN offers an elegant unification of several well-known GCN methods as specific cases, with a new perspective of graph convolution. Experimental results on benchmark datasets show the advantageous performance of KE-GCN over strong baseline methods in the tasks of knowledge graph alignment and entity classification.
Graph Neural Networks (GNN) is an emerging field for learning on non-Euclidean data. Recently, there has been increased interest in designing GNN that scales to large graphs. Most existing methods use "graph sampling" or "layer-wise sampling" techniques to reduce training time. However, these methods still suffer from degrading performance and scalability problems when applying to graphs with billions of edges. This paper presents GBP, a scalable GNN that utilizes a localized bidirectional propagation process from both the feature vectors and the training/testing nodes. Theoretical analysis shows that GBP is the first method that achieves sub-linear time complexity for both the precomputation and the training phases. An extensive empirical study demonstrates that GBP achieves state-of-the-art performance with significantly less training/testing time. Most notably, GBP can deliver superior performance on a graph with over 60 million nodes and 1.8 billion edges in less than half an hour on a single machine.
Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims at querying unseen facts of a relation given its few-shot reference entity pairs. Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties, i.e., entities may exhibit diverse roles within task relations, and references may make different contributions to queries. This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations. Specifically, entities are modeled by an adaptive neighbor encoder to discern their task-oriented roles, while references are modeled by an adaptive query-aware aggregator to differentiate their contributions. Through the attention mechanism, both entities and references can capture their fine-grained semantic meanings, and thus render more expressive representations. This will be more predictive for knowledge acquisition in the few-shot scenario. Evaluation in link prediction on two public datasets shows that our approach achieves new state-of-the-art results with different few-shot sizes.
Image segmentation is considered to be one of the critical tasks in hyperspectral remote sensing image processing. Recently, convolutional neural network (CNN) has established itself as a powerful model in segmentation and classification by demonstrating excellent performances. The use of a graphical model such as a conditional random field (CRF) contributes further in capturing contextual information and thus improving the segmentation performance. In this paper, we propose a method to segment hyperspectral images by considering both spectral and spatial information via a combined framework consisting of CNN and CRF. We use multiple spectral cubes to learn deep features using CNN, and then formulate deep CRF with CNN-based unary and pairwise potential functions to effectively extract the semantic correlations between patches consisting of three-dimensional data cubes. Effective piecewise training is applied in order to avoid the computationally expensive iterative CRF inference. Furthermore, we introduce a deep deconvolution network that improves the segmentation masks. We also introduce a new dataset and experimented our proposed method on it along with several widely adopted benchmark datasets to evaluate the effectiveness of our method. By comparing our results with those from several state-of-the-art models, we show the promising potential of our method.