Delay alignment modulation (DAM) is a promising technology for inter-symbol interference (ISI)-free communication without relying on sophisticated channel equalization or multi-carrier transmissions. The key ideas of DAM are delay precompensation and path-based beamforming, so that the multi-path signal components will arrive at the receiver simultaneously and constructively, rather than causing the detrimental ISI. However, the practical implementation of DAM requires channel state information (CSI) at the transmitter side. Therefore, in this letter, we study an efficient channel estimation method for DAM based on block orthogonal matching pursuit (BOMP) algorithm, by exploiting the block sparsity of the channel vector. Based on the imperfectly estimated CSI, the delay pre-compensations and tap-based beamforming are designed for DAM, and the resulting performance is studied. Simulation results demonstrate that with the BOMP-based channel estimation method, the CSI can be effectively acquired with low training overhead, and the performance of DAM based on estimated CSI is comparable to the ideal case with perfect CSI.
Low frequency oscillator (LFO) driven audio effects such as phaser, flanger, and chorus, modify an input signal using time-varying filters and delays, resulting in characteristic sweeping or widening effects. It has been shown that these effects can be modeled using neural networks when conditioned with the ground truth LFO signal. However, in most cases, the LFO signal is not accessible and measurement from the audio signal is nontrivial, hindering the modeling process. To address this, we propose a framework capable of extracting arbitrary LFO signals from processed audio across multiple digital audio effects, parameter settings, and instrument configurations. Since our system imposes no restrictions on the LFO signal shape, we demonstrate its ability to extract quasiperiodic, combined, and distorted modulation signals that are relevant to effect modeling. Furthermore, we show how coupling the extraction model with a simple processing network enables training of end-to-end black-box models of unseen analog or digital LFO-driven audio effects using only dry and wet audio pairs, overcoming the need to access the audio effect or internal LFO signal. We make our code available and provide the trained audio effect models in a real-time VST plugin.
The massive deployment of low-end wireless Internet of things (IoT) devices opens the challenge of finding de-centralized and lightweight alternatives for secret key distribution. A possible solution, coming from the physical layer, is the secret key generation (SKG) from channel state information (CSI) during the channel's coherence time. This work acknowledges the fact that the CSI consists of deterministic (predictable) and stochastic (unpredictable) components, loosely captured through the terms large-scale and small-scale fading, respectively. Hence, keys must be generated using only the random and unpredictable part. To detrend CSI measurements from deterministic components, a simple and lightweight approach based on Kalman filters is proposed and is evaluated using an implementation of the complete SKG protocol (including privacy amplification that is typically missing in many published works). In our study we use a massive multiple input multiple output (mMIMO) orthogonal frequency division multiplexing outdoor measured CSI dataset. The threat model assumes a passive eavesdropper in the vicinity (at 1 meter distance or less) from one of the legitimate nodes and the Kalman filter is parameterized to maximize the achievable key rate.
This paper presents a novel end-to-end deep learning framework for real-time inertial attitude estimation using 6DoF IMU measurements. Inertial Measurement Units are widely used in various applications, including engineering and medical sciences. However, traditional filters used for attitude estimation suffer from poor generalization over different motion patterns and environmental disturbances. To address this problem, we propose two deep learning models that incorporate accelerometer and gyroscope readings as inputs. These models are designed to be generalized to different motion patterns, sampling rates, and environmental disturbances. Our models consist of convolutional neural network layers combined with Bi-Directional Long-Short Term Memory followed by a Fully Forward Neural Network to estimate the quaternion. We evaluate the proposed method on seven publicly available datasets, totaling more than 120 hours and 200 kilometers of IMU measurements. Our results show that the proposed method outperforms state-of-the-art methods in terms of accuracy and robustness. Additionally, our framework demonstrates superior generalization over various motion characteristics and sensor sampling rates. Overall, this paper provides a comprehensive and reliable solution for real-time inertial attitude estimation using 6DoF IMUs, which has significant implications for a wide range of applications.
Most existing studies on joint activity detection and channel estimation for grant-free massive random access (RA) systems assume perfect synchronization among all active users, which is hard to achieve in practice. Therefore, this paper considers asynchronous grant-free massive RA systems and develops novel algorithms for joint user activity detection, synchronization delay detection, and channel estimation. In particular, the framework of orthogonal approximate message passing (OAMP) is first utilized to deal with the non-independent and identically distributed (i.i.d.) pilot matrix in asynchronous grant-free massive RA systems, and an OAMP-based algorithm capable of leveraging the common sparsity among the received pilot signals from multiple base station antennas is developed. To reduce the computational complexity, a memory AMP (MAMP)based algorithm is further proposed that eliminates the matrix inversions in the OAMP-based algorithm. Simulation results demonstrate the effectiveness of the two proposed algorithms over the baseline methods. Besides, the MAMP-based algorithm reduces 37% of the computations while maintaining comparable detection/estimation accuracy, compared with the OAMP-based algorithm.
In this paper, we propose a joint generative and contrastive representation learning method (GeCo) for anomalous sound detection (ASD). GeCo exploits a Predictive AutoEncoder (PAE) equipped with self-attention as a generative model to perform frame-level prediction. The output of the PAE together with original normal samples, are used for supervised contrastive representative learning in a multi-task framework. Besides cross-entropy loss between classes, contrastive loss is used to separate PAE output and original samples within each class. GeCo aims to better capture context information among frames, thanks to the self-attention mechanism for PAE model. Furthermore, GeCo combines generative and contrastive learning from which we aim to yield more effective and informative representations, compared to existing methods. Extensive experiments have been conducted on the DCASE2020 Task2 development dataset, showing that GeCo outperforms state-of-the-art generative and discriminative methods.
Due to the power consumption and high circuit cost in antenna arrays, the practical application of massive multiple-input multiple-output (MIMO) in the sixth generation (6G) and future wireless networks is still challenging. Employing low-resolution analog-to-digital converters (ADCs) and hybrid analog and digital (HAD) structure is two low-cost choice with acceptable performance loss.In this paper, the combination of the mixed-ADC architecture and HAD structure employed at receiver is proposed for direction of arrival (DOA) estimation, which will be applied to the beamforming tracking and alignment in 6G. By adopting the additive quantization noise model, the exact closed-form expression of the Cram\'{e}r-Rao lower bound (CRLB) for the HAD architecture with mixed-ADCs is derived. Moreover, the closed-form expression of the performance loss factor is derived as a benchmark. In addition, to take power consumption into account, energy efficiency is also investigated in our paper. The numerical results reveal that the HAD structure with mixed-ADCs can significantly reduce the power consumption and hardware cost. Furthermore, that architecture is able to achieve a better trade-off between the performance loss and the power consumption. Finally, adopting 2-4 bits of resolution may be a good choice in practical massive MIMO systems.
Continuous normalizing flows are widely used in generative tasks, where a flow network transports from a data distribution $P$ to a normal distribution. A flow model that can transport from $P$ to an arbitrary $Q$, where both $P$ and $Q$ are accessible via finite samples, would be of various application interests, particularly in the recently developed telescoping density ratio estimation (DRE) which calls for the construction of intermediate densities to bridge between $P$ and $Q$. In this work, we propose such a ``Q-malizing flow'' by a neural-ODE model which is trained to transport invertibly from $P$ to $Q$ (and vice versa) from empirical samples and is regularized by minimizing the transport cost. The trained flow model allows us to perform infinitesimal DRE along the time-parametrized $\log$-density by training an additional continuous-time flow network using classification loss, which estimates the time-partial derivative of the $\log$-density. Integrating the time-score network along time provides a telescopic DRE between $P$ and $Q$ that is more stable than a one-step DRE. The effectiveness of the proposed model is empirically demonstrated on mutual information estimation from high-dimensional data and energy-based generative models of image data.
Owing to the promising ability of saving hardware cost and spectrum resources, integrated sensing and communication (ISAC) is regarded as a revolutionary technology for future sixth-generation (6G) networks. The mono-static ISAC systems considered in most of existing works can only obtain limited sensing performance due to the single observation angle and easily blocked transmission links, which motivates researchers to investigate cooperative ISAC networks. In order to further improve the degrees of freedom (DoFs) of cooperative ISAC networks, the transmitter-receiver selection, i.e., BS mode selection problem, is meaningful to be studied. However, to our best knowledge, this crucial problem has not been extensively studied in existing works. In this paper, we consider the joint BS mode selection, transmit beamforming, and receive filter design for cooperative cell-free ISAC networks, where multi-base stations (BSs) cooperatively serve communication users and detect targets. We aim to maximize the sum of sensing signal-to-interference-plus-noise ratio (SINR) under the communication SINR requirements, total power budget, and constraints on the numbers of transmitters and receivers. An efficient joint beamforming design algorithm and three different heuristic BS mode selection methods are proposed to solve this non-convex NP-hard problem. Simulation results demonstrates the advantages of cooperative ISAC networks, the importance of BS mode selection, and the effectiveness of our proposed joint design algorithms.
Social relations are often used to improve recommendation quality when user-item interaction data is sparse in recommender systems. Most existing social recommendation models exploit pairwise relations to mine potential user preferences. However, real-life interactions among users are very complicated and user relations can be high-order. Hypergraph provides a natural way to model complex high-order relations, while its potentials for improving social recommendation are under-explored. In this paper, we fill this gap and propose a multi-channel hypergraph convolutional network to enhance social recommendation by leveraging high-order user relations. Technically, each channel in the network encodes a hypergraph that depicts a common high-order user relation pattern via hypergraph convolution. By aggregating the embeddings learned through multiple channels, we obtain comprehensive user representations to generate recommendation results. However, the aggregation operation might also obscure the inherent characteristics of different types of high-order connectivity information. To compensate for the aggregating loss, we innovatively integrate self-supervised learning into the training of the hypergraph convolutional network to regain the connectivity information with hierarchical mutual information maximization. The experimental results on multiple real-world datasets show that the proposed model outperforms the SOTA methods, and the ablation study verifies the effectiveness of the multi-channel setting and the self-supervised task. The implementation of our model is available via //github.com/Coder-Yu/RecQ.
Benefit from the quick development of deep learning techniques, salient object detection has achieved remarkable progresses recently. However, there still exists following two major challenges that hinder its application in embedded devices, low resolution output and heavy model weight. To this end, this paper presents an accurate yet compact deep network for efficient salient object detection. More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while keep accuracy. Secondly, we further propose reverse attention to guide such side-output residual learning in a top-down manner. By erasing the current predicted salient regions from side-output features, the network can eventually explore the missing object parts and details which results in high resolution and accuracy. Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, and with advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).