In this work, we address the energy efficiency (EE) maximization problem in a downlink communication system utilizing a reconfigurable intelligent surface (RIS) in a multi-user massive multiple-input multiple-output (mMIMO) setup with zero-forcing (ZF) precoding. The channel between the base station (BS) and the RIS is subject to Rician fading with Rician factor K1. Since optimizing the RIS phase shifts in every channel coherence time interval is challenging and burdensome, we employ a statistical channel state information (CSI)-based optimization strategy to alleviate this overhead. By treating the RIS phase-shift matrix as constant over multiple channel coherence time intervals, we reduce the computational complexity while maintaining attractive performance. The EE optimization problem is formulated based on a closed-form lower bound on the ergodic rate (ER). This problem is non-convex and challenging to tackle due to the coupled variables. To circumvent this obstacle, we explore a sequential optimization approach in which the power allocation vector p, the number of antennas M, and the RIS phase shifts v are separated and solved sequentially and iteratively until convergence. With the help of the Lagrangian dual method, fractional programming (FP) techniques, and Lemma 1, insightful compact closed-form expressions for each of the three optimization variables are derived. Simulation results validate the effectiveness of the proposed method across different generalized channel scenarios, including non-line-of-sight (NLoS) and partially line-of-sight (LoS) conditions. This underscores its potential to significantly reduce power consumption, decrease the number of active antennas at the base station, and effectively incorporate the RIS structure into an mMIMO communication setup with only statistical CSI knowledge.
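As a rough illustration of the ZF precoding mentioned in the abstract above, the following sketch builds the textbook zero-forcing precoder for a generic downlink channel and checks that it removes inter-user interference. It is not the paper's method: the channel here is plain i.i.d. Rayleigh rather than the RIS-assisted Rician channel, and the names `zf_precoder`, `K`, and `M` as well as the chosen sizes are assumptions made for illustration only.

```python
import numpy as np

def zf_precoder(H):
    """Unnormalized zero-forcing precoder for a K x M channel matrix H.

    Computes W = H^H (H H^H)^{-1}, so that H @ W equals the K x K identity
    whenever H has full row rank (K <= M).
    """
    gram = H @ H.conj().T              # K x K Gram matrix
    return H.conj().T @ np.linalg.inv(gram)

rng = np.random.default_rng(0)
K, M = 4, 16                           # hypothetical user and antenna counts
# Illustrative i.i.d. complex Gaussian channel (an assumption, not the paper's model)
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)
W = zf_precoder(H)
effective = H @ W                      # effective channel seen by the K users
```

With this precoder, the off-diagonal entries of `effective` vanish up to numerical precision, which is what makes the per-user rate expressions decouple in ZF-based analyses.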
In this work, we present a nonlinear dynamics perspective on generating and connecting gaits for energetically conservative models of legged systems. In particular, we show that the set of conservative gaits constitutes a connected space of locally defined 1D submanifolds in the gait space. These manifolds are parameterized in a coordinate-free manner by the energy level. We present algorithms for identifying such families of gaits through the use of numerical continuation methods, generating sets, and bifurcation points. To this end, we also introduce several details for the numerical implementation. Most importantly, we establish the necessary condition for the Delassus' matrix to preserve energy across impacts. An important application of our work is to simple models of legged locomotion, which are often able to capture the complexity of legged locomotion with just a few degrees of freedom and a small number of physical parameters. We demonstrate the efficacy of our framework on a one-legged hopper with four degrees of freedom.
Learning time-series models is useful for many applications, such as simulation and forecasting. In this study, we consider the problem of actively learning time-series models while taking given safety constraints into account. For time-series modeling, we employ a Gaussian process with a nonlinear exogenous input structure. The proposed approach generates data appropriate for time-series model learning, i.e., input and output trajectories, by dynamically exploring the input space. The approach parametrizes the input trajectory as consecutive trajectory sections, which are determined stepwise given safety requirements and past observations. We analyze the proposed algorithm and evaluate it empirically on a technical application. The results show the effectiveness of our approach in a realistic technical use case.
Systolic arrays are a prominent choice for deep neural network (DNN) accelerators because they offer parallelism and efficient data reuse. Improving the reliability of DNN accelerators is crucial, as hardware faults can degrade the accuracy of DNN inference. Systolic arrays make use of a large number of processing elements (PEs) for parallel processing, but when one PE is faulty, the error propagates and affects the outcomes of downstream PEs. Due to the large number of PEs, the cost of implementing hardware-based runtime monitoring of every single PE is prohibitive. We present a solution to optimize the placement of hardware monitors within systolic arrays. We first prove that $2N-1$ monitors are needed to localize a single faulty PE, and we also derive the monitor placement. We show that a second placement optimization problem, which minimizes the set of candidate faulty PEs for a given number of monitors, is NP-hard. Therefore, we propose a heuristic approach to balance reliability and hardware resource utilization in DNN accelerators when the number of monitors is limited. Experimental evaluation shows that to localize a single faulty PE, an area overhead of only 0.33% is incurred for a $256\times 256$ systolic array.
A pivotal aspect in the design of neural networks lies in selecting activation functions, crucial for introducing nonlinear structures that capture intricate input-output patterns. While the effectiveness of adaptive or trainable activation functions has been studied in domains with ample data, like image classification problems, significant gaps persist in understanding their influence on classification accuracy and predictive uncertainty in settings characterized by limited data availability. This research aims to address these gaps by investigating the use of two types of adaptive activation functions. These functions incorporate shared and individual trainable parameters per hidden layer and are examined in three testbeds derived from additive manufacturing problems containing fewer than one hundred training instances. Our investigation reveals that adaptive activation functions, such as Exponential Linear Unit (ELU) and Softplus, with individual trainable parameters, result in accurate and confident prediction models that outperform fixed-shape activation functions and the less flexible method of using identical trainable activation functions in a hidden layer. Therefore, this work presents an elegant way of facilitating the design of adaptive neural networks in scientific and engineering problems.
In this work, we consider Terahertz (THz) communications with low-resolution uniform quantization and spatial oversampling at the receiver side. We compare different analog-to-digital converter (ADC) parametrizations in a fair manner by keeping the ADC power consumption constant. Here, 1-, 2-, and 3-bit quantization is investigated with different oversampling factors. We analytically compute the statistics of the detection variable, and we propose the optimal as well as several suboptimal detection schemes for arbitrary quantization resolutions. Then, we evaluate the symbol error rate (SER) of the different detectors for a 16- and a 64-ary quadrature amplitude modulation (QAM) constellation. The results indicate that there is a noticeable performance degradation of the suboptimal detection schemes compared to the optimal scheme when the constellation size is larger than the number of quantization levels. Furthermore, at low signal-to-noise ratios (SNRs), 1-bit quantization outperforms both 2- and 3-bit quantization, even when employing higher-order constellations. We confirm our analytical results by Monte Carlo simulations. Both a pure line-of-sight (LoS) and a more realistically modeled indoor THz channel are considered. Then, we optimize the input signal constellation with respect to SER for 1-bit quantization. The results show that the minimum SER can be lowered significantly for 16-QAM by increasing the distance between the inner and outer points of the input constellation. For larger constellations, however, the achievable reduction of the minimum SER is much smaller than for 16-QAM.
In this paper, we prove the first Bayesian regret bounds for Thompson Sampling in reinforcement learning in a multitude of settings. We simplify the learning problem using a discrete set of surrogate environments, and present a refined analysis of the information ratio using posterior consistency. This leads to an upper bound of order $\widetilde{O}(H\sqrt{d_{l_1}T})$ in the time-inhomogeneous reinforcement learning problem, where $H$ is the episode length and $d_{l_1}$ is the Kolmogorov $l_1$-dimension of the space of environments. We then find concrete bounds on $d_{l_1}$ in a variety of settings, such as tabular, linear, and finite mixtures, and discuss how our results are either the first of their kind or improve the state-of-the-art.
Translational distance-based knowledge graph embedding has shown progressive improvements on the link prediction task, from TransE to the latest state-of-the-art RotatE. However, N-1, 1-N and N-N predictions remain challenging. In this work, we propose a novel translational distance-based approach for knowledge graph link prediction. The proposed method is two-fold. First, we extend RotatE from the 2D complex domain to a high-dimensional space with orthogonal transforms to model relations, for better modeling capacity. Second, the graph context is explicitly modeled via two directed context representations. These context representations are used as part of the distance scoring function to measure the plausibility of the triples during training and inference. The proposed approach effectively improves prediction accuracy on the difficult N-1, 1-N and N-N cases for the knowledge graph link prediction task. The experimental results show that it achieves better performance on two benchmark data sets compared to the baseline RotatE, especially on the data set (FB15k-237) with many high in-degree connection nodes.
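For readers unfamiliar with the RotatE baseline referred to above, the sketch below shows its distance function: each relation is an element-wise rotation in the complex plane, and a triple is plausible when the rotated head lands close to the tail. This covers only the 2D complex baseline, not the orthogonal-transform extension or the context representations proposed in the abstract. The function name `rotate_distance` and the toy vectors are illustrative assumptions, as is the choice of summing the complex moduli over dimensions.

```python
import numpy as np

def rotate_distance(head, phase, tail):
    """RotatE-style distance between a rotated head and a tail embedding.

    head, tail: complex vectors of equal length.
    phase: real vector of rotation angles, so each relation entry
           exp(1j * phase) has unit modulus.
    Returns the sum over dimensions of |head * r - tail| (assumed norm).
    """
    rotation = np.exp(1j * phase)
    return np.abs(head * rotation - tail).sum()

# Toy example: a quarter-turn rotation in both dimensions.
head = np.array([1 + 0j, 0 + 1j])
phase = np.array([np.pi / 2, np.pi / 2])
matching_tail = np.array([0 + 1j, -1 + 0j])    # exactly head rotated by 90 degrees
d_match = rotate_distance(head, phase, matching_tail)
d_mismatch = rotate_distance(head, phase, head)
```

A lower distance indicates a more plausible triple, so `d_match` is near zero while `d_mismatch` is clearly positive.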
Detecting carried objects is one of the requirements for developing systems to reason about activities involving people and objects. We present an approach to detect carried objects from a single video frame with a novel method that incorporates features from multiple scales. Initially, a foreground mask in a video frame is segmented into multi-scale superpixels. Then the human-like regions in the segmented area are identified by matching a set of extracted features from superpixels against learned features in a codebook. A carried object probability map is generated using the complement of the matching probabilities of superpixels to human-like regions and background information. A group of superpixels with high carried object probability and strong edge support is then merged to obtain the shape of the carried object. We applied our method to two challenging datasets, and results show that our method is competitive with or better than the state-of-the-art.
Recommender systems (RS) are an active area in which artificial intelligence (AI) techniques can be effectively applied to improve performance. Since the well-known Netflix Challenge, collaborative filtering (CF) has become the most popular and effective recommendation method. Despite their success in CF, various AI techniques still have to face the data sparsity and cold start problems. Previous works tried to solve these two problems by utilizing auxiliary information, such as social connections among users and meta-data of items. However, they process different types of information separately, leading to information loss. In this work, we propose to utilize the Heterogeneous Information Network (HIN), which is a natural and general representation of different types of data, to enhance CF-based recommendation methods. HIN-based recommender systems face two problems: how to represent high-level semantics for recommendation and how to fuse the heterogeneous information to recommend. To address these problems, we propose applying meta-graphs to HIN-based RS and solving the information fusion problem with a "matrix factorization (MF) + factorization machine (FM)" framework. For the "MF" part, we obtain user-item similarity matrices from each meta-graph and adopt low-rank matrix approximation to get latent features for both users and items. For the "FM" part, we propose to apply FM with Group lasso (FMG) on the obtained features to simultaneously predict missing ratings and select useful meta-graphs. Experimental results on two large real-world datasets, i.e., Amazon and Yelp, show that our proposed approach outperforms the state-of-the-art FM and other HIN-based recommendation methods.
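To make the "FM" part of the framework above more concrete, the following sketch computes the prediction of a standard second-order factorization machine, using the well-known identity that evaluates all pairwise interactions in time linear in the number of features. It does not include the group lasso regularizer or the meta-graph feature construction described in the abstract. The names `fm_predict` and `fm_predict_naive`, the parameter shapes, and the random test data are assumptions chosen for illustration.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order factorization machine prediction.

    x: feature vector of length n; w0: bias; w: linear weights of length n;
    V: n x k matrix of latent factors. Pairwise terms use the identity
    sum_{i<j} <V_i, V_j> x_i x_j
        = 0.5 * sum_f ((sum_i V_if x_i)^2 - sum_i V_if^2 x_i^2).
    """
    summed = V.T @ x                     # shape (k,)
    summed_sq = (V ** 2).T @ (x ** 2)    # shape (k,)
    return w0 + w @ x + 0.5 * np.sum(summed ** 2 - summed_sq)

def fm_predict_naive(x, w0, w, V):
    """Reference implementation with an explicit double loop over pairs."""
    total = w0 + w @ x
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            total += (V[i] @ V[j]) * x[i] * x[j]
    return total

rng = np.random.default_rng(1)
n, k = 6, 3                              # hypothetical feature and factor sizes
x = rng.standard_normal(n)
w0, w, V = 0.5, rng.standard_normal(n), rng.standard_normal((n, k))
fast, slow = fm_predict(x, w0, w, V), fm_predict_naive(x, w0, w, V)
```

The two functions agree up to floating-point error. The linear-time form is the one usually used in practice, since the feature vectors obtained by concatenating latent factors from many meta-graphs can be long.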
In this paper, we propose jointly learning attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on the use of either model exist (e.g., for the task of image captioning), training such existing network architectures typically requires pre-defined label sequences. For multi-label classification, it would be desirable to have a robust inference process, so that the prediction error would not propagate and thus affect the performance. Our proposed model uniquely integrates attention and Long Short-Term Memory (LSTM) models, which not only addresses the above problem but also allows one to identify visual objects of interest with varying sizes without prior knowledge of a particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, prediction of multiple labels can be efficiently achieved by our proposed network model.