This paper investigates the broadband channel estimation (CE) for intelligent reflecting surface (IRS)-aided millimeter-wave (mmWave) massive MIMO systems. The CE for such systems is a challenging task due to the large dimension of both the active massive MIMO at the base station (BS) and passive IRS. To address this problem, this paper proposes a compressive sensing (CS)-based CE solution for IRS-aided mmWave massive MIMO systems, whereby the angular channel sparsity of large-scale array at mmWave is exploited for improved CE with reduced pilot overhead. Specifically, we first propose a downlink pilot transmission framework. By designing the pilot signals based on the prior knowledge that the line-of-sight dominated BS-to-IRS channel is known, the high-dimensional channels for BS-to-user and IRS-to-user can be jointly estimated based on CS theory. Moreover, to efficiently estimate broadband channels, a distributed orthogonal matching pursuit algorithm is exploited, where the common sparsity shared by the channels at different subcarriers is utilized. Additionally, the redundant dictionary to combat the power leakage is also designed for the enhanced CE performance. Simulation results demonstrate the effectiveness of the proposed scheme.
EdDSA is a standardised elliptic curve digital signature scheme introduced to overcome some of the issues prevalent in the more established ECDSA standard. Due to the EdDSA standard specifying that the EdDSA signature be deterministic, if the signing function were to be used as a public key signing oracle for the attacker, the unforgeability notion of security of the scheme can be broken. This paper describes an attack against some of the most popular EdDSA implementations, which results in an adversary recovering the private key used during signing. With this recovered secret key, an adversary can sign arbitrary messages that would be seen as valid by the EdDSA verification function. A list of libraries with vulnerable APIs at the time of publication is provided. Furthermore, this paper provides two suggestions for securing EdDSA signing APIs against this vulnerability while it additionally discusses failed attempts to solve the issue.
Dispatching strategies for gas turbines (GTs) are changing in modern electricity grids. A growing incorporation of intermittent renewable energy requires GTs to operate more but shorter cycles and more frequently on partial loads. Deep reinforcement learning (DRL) has recently emerged as a tool that can cope with this development and dispatch GTs economically. The key advantages of DRL are a model-free optimization and the ability to handle uncertainties, such as those introduced by varying loads or renewable energy production. In this study, three popular DRL algorithms are implemented for an economic GT dispatch problem on a case study in Alberta, Canada. We highlight the benefits of DRL by incorporating an existing thermodynamic software provided by Siemens Energy into the environment model and by simulating uncertainty via varying electricity prices, loads, and ambient conditions. Among the tested algorithms and baseline methods, Deep Q-Networks (DQN) obtained the highest rewards while Proximal Policy Optimization (PPO) was the most sample efficient. We further propose and implement a method to assign GT operation and maintenance cost dynamically based on operating hours and cycles. Compared to existing methods, our approach better approximates the true cost of modern GT dispatch and hence leads to more realistic policies.
Unsupervised Domain Adaptive Object Detection (UDA-OD) uses unlabelled data to improve the reliability of robotic vision systems in open-world environments. Previous approaches to UDA-OD based on self-training have been effective in overcoming changes in the general appearance of images. However, shifts in a robot's deployment environment can also impact the likelihood that different objects will occur, termed class distribution shift. Motivated by this, we propose a framework for explicitly addressing class distribution shift to improve pseudo-label reliability in self-training. Our approach uses the domain invariance and contextual understanding of a pre-trained joint vision and language model to predict the class distribution of unlabelled data. By aligning the class distribution of pseudo-labels with this prediction, we provide weak supervision of pseudo-label accuracy. To further account for low quality pseudo-labels early in self-training, we propose an approach to dynamically adjust the number of pseudo-labels per image based on model confidence. Our method outperforms state-of-the-art approaches on several benchmarks, including a 4.7 mAP improvement when facing challenging class distribution shift.
This paper proposes a system capable of recognizing a speaker's utterance-level emotion through multimodal cues in a video. The system seamlessly integrates multiple AI models to first extract and pre-process multimodal information from the raw video input. Next, an end-to-end MER model sequentially predicts the speaker's emotions at the utterance level. Additionally, users can interactively demonstrate the system through the implemented interface.
Reconfigurable intelligent surface (RIS) is a promising technology for future wireless communication systems. Channel estimation (CE) of RIS device is a critical but also challenging issue for its development. The mainstream of existing CE methods is confined to the so-called cascaded channel (CscdChn) estimation scheme, which treats the multiplicative two-hop RIS channels as an effective one and measures it as a whole. This CscdChn training method suffers from severe double-fading attenuation loss, which significantly degrades the CE accuracy. In this paper, we propose a novel RIS-transmitting (RIS-TX) based CE scheme, which has lower pilot overhead than CscdChn scheme and effectively overcomes the double-fading curse via incorporating only one single transmit radio frequency (RF)-chain into RIS. We develop highly efficient gradient descent (GD) and penalty duality decomposition (PDD)-based solutions to resolve the pilot design task for the RIS-TX CE scheme, which is a difficult quartic optimization problem. Our designed pilot signal outperforms the discrete Fourier transform (DFT) sequence, which is reported to be optimal for CscdChn scheme. Besides, both theoretical analysis and numerical results demonstrate that our proposed RIS-TX scheme exhibits distinct performance characteristics as opposed to its CscdChn counterpart and yields superior accuracy when RIS device is not extremely large.
This work aims to provide an engagement decision support tool for Beyond Visual Range (BVR) air combat in the context of Defensive Counter Air (DCA) missions. In BVR air combat, engagement decision refers to the choice of the moment the pilot engages a target by assuming an offensive stance and executing corresponding maneuvers. To model this decision, we use the Brazilian Air Force's Aerospace Simulation Environment (\textit{Ambiente de Simula\c{c}\~ao Aeroespacial - ASA} in Portuguese), which generated 3,729 constructive simulations lasting 12 minutes each and a total of 10,316 engagements. We analyzed all samples by an operational metric called the DCA index, which represents, based on the experience of subject matter experts, the degree of success in this type of mission. This metric considers the distances of the aircraft of the same team and the opposite team, the point of Combat Air Patrol, and the number of missiles used. By defining the engagement status right before it starts and the average of the DCA index throughout the engagement, we create a supervised learning model to determine the quality of a new engagement. An algorithm based on decision trees, working with the XGBoost library, provides a regression model to predict the DCA index with a coefficient of determination close to 0.8 and a Root Mean Square Error of 0.05 that can furnish parameters to the BVR pilot to decide whether or not to engage. Thus, using data obtained through simulations, this work contributes by building a decision support system based on machine learning for BVR air combat.
This paper presents a new approach for assembling graph neural networks based on framelet transforms. The latter provides a multi-scale representation for graph-structured data. With the framelet system, we can decompose the graph feature into low-pass and high-pass frequencies as extracted features for network training, which then defines a framelet-based graph convolution. The framelet decomposition naturally induces a graph pooling strategy by aggregating the graph feature into low-pass and high-pass spectra, which considers both the feature values and geometry of the graph data and conserves the total information. The graph neural networks with the proposed framelet convolution and pooling achieve state-of-the-art performance in many types of node and graph prediction tasks. Moreover, we propose shrinkage as a new activation for the framelet convolution, which thresholds the high-frequency information at different scales. Compared to ReLU, shrinkage in framelet convolution improves the graph neural network model in terms of denoising and signal compression: noises in both node and structure can be significantly reduced by accurately cutting off the high-pass coefficients from framelet decomposition, and the signal can be compressed to less than half its original size with the prediction performance well preserved.
This paper addresses the difficulty of forecasting multiple financial time series (TS) conjointly using deep neural networks (DNN). We investigate whether DNN-based models could forecast these TS more efficiently by learning their representation directly. To this end, we make use of the dynamic factor graph (DFG) from that we enhance by proposing a novel variable-length attention-based mechanism to render it memory-augmented. Using this mechanism, we propose an unsupervised DNN architecture for multivariate TS forecasting that allows to learn and take advantage of the relationships between these TS. We test our model on two datasets covering 19 years of investment funds activities. Our experimental results show that our proposed approach outperforms significantly typical DNN-based and statistical models at forecasting their 21-day price trajectory.
This paper proposes a generic method to learn interpretable convolutional filters in a deep convolutional neural network (CNN) for object classification, where each interpretable filter encodes features of a specific object part. Our method does not require additional annotations of object parts or textures for supervision. Instead, we use the same training data as traditional CNNs. Our method automatically assigns each interpretable filter in a high conv-layer with an object part of a certain category during the learning process. Such explicit knowledge representations in conv-layers of CNN help people clarify the logic encoded in the CNN, i.e., answering what patterns the CNN extracts from an input image and uses for prediction. We have tested our method using different benchmark CNNs with various structures to demonstrate the broad applicability of our method. Experiments have shown that our interpretable filters are much more semantically meaningful than traditional filters.
High spectral dimensionality and the shortage of annotations make hyperspectral image (HSI) classification a challenging problem. Recent studies suggest that convolutional neural networks can learn discriminative spatial features, which play a paramount role in HSI interpretation. However, most of these methods ignore the distinctive spectral-spatial characteristic of hyperspectral data. In addition, a large amount of unlabeled data remains an unexploited gold mine for efficient data use. Therefore, we proposed an integration of generative adversarial networks (GANs) and probabilistic graphical models for HSI classification. Specifically, we used a spectral-spatial generator and a discriminator to identify land cover categories of hyperspectral cubes. Moreover, to take advantage of a large amount of unlabeled data, we adopted a conditional random field to refine the preliminary classification results generated by GANs. Experimental results obtained using two commonly studied datasets demonstrate that the proposed framework achieved encouraging classification accuracy using a small number of data for training.