The large bandwidth combined with ultra-massive multiple-input multiple-output (UM-MIMO) arrays enables terahertz (THz) systems to achieve terabits-per-second throughput. THz systems are expected to operate in the near field, the intermediate field, and the far field. Accordingly, channel estimation strategies suited to each of these regimes have been introduced in the literature. In this work, we propose a cross-field compressive channel estimation strategy, i.e., one able to operate in the near, intermediate, and far field. For an array-of-subarrays (AoSA) architecture, the proposed method compares the received signals across the arrays to determine whether a near-, intermediate-, or far-field channel estimation approach is appropriate. Subsequently, compressed estimation is performed in which the proximity of multiple subarrays (SAs) at the transmitter and receiver is exploited to reduce computational complexity and increase estimation accuracy. Numerical results show that the proposed method can improve channel estimation accuracy and reduce complexity at all distances of interest.
Age and gender recognition in the wild is a highly challenging task: apart from the variability of conditions, pose complexities, and varying image quality, there are cases where the face is partially or completely occluded. We present MiVOLO (Multi Input VOLO), a straightforward approach for age and gender estimation using the latest vision transformer. Our method integrates both tasks into a unified dual input/output model, leveraging not only facial information but also person image data. This improves the generalization ability of our model and enables it to deliver satisfactory results even when the face is not visible in the image. To evaluate our proposed model, we conduct experiments on four popular benchmarks and achieve state-of-the-art performance, while demonstrating real-time processing capabilities. Additionally, we introduce a novel benchmark based on images from the Open Images Dataset. The ground truth annotations for this benchmark were generated by human annotators, and the smart aggregation of their votes yields highly accurate answers. Furthermore, we compare our model's age recognition performance with human-level accuracy and demonstrate that it significantly outperforms humans across a majority of age ranges. Finally, we grant public access to our models, along with the code for validation and inference, and we release the new benchmark together with extra annotations for the datasets used.
This paper proposes a general optimization framework to improve the spectral efficiency and energy efficiency (EE) of ultra-reliable low-latency communication (URLLC) systems with finite block length (FBL) that are interference-limited and assisted by a simultaneously transmitting and reflecting (STAR) reconfigurable intelligent surface (RIS). This framework can solve a large variety of optimization problems in which the objective and/or constraints are linear functions of the rates and/or EE of users. Additionally, the framework can be applied to any interference-limited system in which the receivers treat interference as noise when decoding. We consider a multi-cell broadcast channel as an example and show how this framework can be specialized to maximize the minimum weighted rate, the weighted sum rate, the global EE, and the weighted EE of the system. We make realistic assumptions regarding the (STAR-)RIS by considering three different feasibility sets for the components of either a regular RIS or a STAR-RIS. Our results show that RIS can substantially increase the spectral efficiency and EE of URLLC systems if the reflecting coefficients are properly optimized. Moreover, we consider three different transmission strategies for STAR-RIS, namely energy splitting (ES), mode switching (MS), and time switching (TS). We show that a STAR-RIS can outperform a regular RIS when the regular RIS cannot cover all the users. Furthermore, it is shown that the ES scheme outperforms the MS and TS schemes.
This paper presents a pressure-robust enriched Galerkin (EG) method for the Brinkman equations with minimal degrees of freedom based on EG velocity and pressure spaces. The velocity space consists of linear Lagrange polynomials enriched by a discontinuous, piecewise linear, and mean-zero vector function per element, while piecewise constant functions approximate the pressure. We derive, analyze, and compare two EG methods in this paper: standard and robust methods. The standard method requires a mesh size to be less than a viscous parameter to produce stable and accurate velocity solutions, which is impractical in the Darcy regime. Therefore, we propose the pressure-robust method by utilizing a velocity reconstruction operator and replacing EG velocity functions with a reconstructed velocity. The robust method yields error estimates independent of a pressure term and shows uniform performance from the Stokes to Darcy regimes, preserving minimal degrees of freedom. We prove well-posedness and error estimates for both the standard and robust EG methods. We finally confirm theoretical results through numerical experiments with two- and three-dimensional examples and compare the methods' performance to support the need for the robust method.
We study how to release summary statistics on a data stream subject to the constraint of differential privacy. In particular, we focus on releasing the family of symmetric norms, which are invariant under sign-flips and coordinate-wise permutations on an input data stream and include $L_p$ norms, $k$-support norms, top-$k$ norms, and the box norm as special cases. Although it may be possible to design and analyze a separate mechanism for each symmetric norm, we propose a general parametrizable framework that differentially privately releases a number of sufficient statistics from which the approximation of all symmetric norms can be simultaneously computed. Our framework partitions the coordinates of the underlying frequency vector into different levels based on their magnitude and releases approximate frequencies for the "heavy" coordinates in important levels and releases approximate level sizes for the "light" coordinates in important levels. Surprisingly, our mechanism allows for the release of an arbitrary number of symmetric norm approximations without any overhead or additional loss in privacy. Moreover, our mechanism permits $(1+\alpha)$-approximation to each of the symmetric norms and can be implemented using sublinear space in the streaming model for many regimes of the accuracy and privacy parameters.
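The symmetry property that defines this family of norms can be checked on a small example. The following is a minimal sketch, not the paper's mechanism: it applies no differential privacy and only evaluates two members of the family exactly. The function names `lp_norm` and `top_k_norm` are illustrative choices, not taken from the paper.

```python
def lp_norm(x, p):
    """L_p norm: (sum_i |x_i|^p)^(1/p)."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def top_k_norm(x, k):
    """Top-k norm: sum of the k largest coordinate magnitudes."""
    magnitudes = sorted((abs(v) for v in x), reverse=True)
    return sum(magnitudes[:k])


# A frequency vector and a sign-flipped, permuted copy of it.
x = [3, -4, 0, 1]
x_shuffled = [-1, 4, 3, 0]

# Both norms depend only on the multiset of magnitudes {4, 3, 1, 0},
# so they agree on x and x_shuffled.
same_l2 = abs(lp_norm(x, 2) - lp_norm(x_shuffled, 2)) < 1e-12
same_top2 = top_k_norm(x, 2) == top_k_norm(x_shuffled, 2)
```

Because every symmetric norm is a function of the sorted magnitudes alone, a private summary of how many coordinates fall at each magnitude level is enough to approximate all of them at once, which is the intuition behind the level-based framework described above.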
We focus on signal detection for large quasi-symmetric (LQS) multiple-input multiple-output (MIMO) systems, where the numbers of both service (M) and user (N) antennas are large and N/M tends to 1. It is challenging to achieve maximum-likelihood detection (MLD) performance with square-order complexity because the channel matrix is ill-conditioned. In the emerging MIMO paradigm known as the extremely large aperture array, the channel matrix can be even more ill-conditioned due to spatial non-stationarity. In this paper, projected-Jacobi (PJ) is proposed for signal detection in (non-)stationary LQS-MIMO systems. It is theoretically and empirically demonstrated that PJ can achieve MLD performance, even when N/M = 1. Moreover, PJ has square-order complexity in N and supports parallel computation. The main idea of PJ is to add a projection step to the classical Jacobi iteration and to start it from a (quasi-)orthogonal initialization. In addition, the symbol error rate (SER) of PJ is derived mathematically and closely matches the simulation results.
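The idea of adding a projection step to the Jacobi iteration can be sketched as follows. This is a minimal illustration under stated assumptions, not the PJ algorithm itself: the abstract does not specify the projection operator or the (quasi-)orthogonal initialization, so this sketch assumes a real-valued channel, clips each estimate to the interval [-1, 1] (a box around a BPSK-like constellation), and starts from the zero vector. The function name `projected_jacobi` and its parameters are hypothetical.

```python
def projected_jacobi(H, y, num_iters=50, lo=-1.0, hi=1.0):
    """Jacobi sweeps on (H^T H) x = H^T y, clipping x to [lo, hi] after each sweep.

    Assumptions (not from the paper): real-valued H, box projection,
    zero initialization.
    """
    m, n = len(H), len(H[0])
    # Gram matrix A = H^T H and matched-filter output b = H^T y.
    A = [[sum(H[k][i] * H[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    b = [sum(H[k][i] * y[k] for k in range(m)) for i in range(n)]

    x = [0.0] * n
    for _ in range(num_iters):
        x_new = []
        for i in range(n):
            # Classical Jacobi update: every entry uses only the previous
            # iterate, so the n updates could run in parallel.
            off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
            xi = (b[i] - off_diag) / A[i][i]
            # Projection step: clip onto the assumed constellation box.
            x_new.append(min(hi, max(lo, xi)))
        x = x_new
    return x


# Small noiseless example with nearly orthogonal columns.
H = [[1.0, 0.1], [0.1, 1.0], [0.2, -0.1]]
x_true = [1.0, -1.0]
y = [sum(H[k][j] * x_true[j] for j in range(2)) for k in range(3)]
x_hat = projected_jacobi(H, y)
```

Each sweep costs on the order of N^2 operations once the Gram matrix is available, which is consistent with the square-order complexity mentioned above. Plain Jacobi is only guaranteed to converge for well-behaved (for example, diagonally dominant) Gram matrices, which is why the example uses nearly orthogonal columns; handling the ill-conditioned N/M = 1 regime is precisely what the paper's specific projection and initialization are claimed to address.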
High data rates are one of the most prevalent requirements in current mobile communications. To meet this and other demanding requirements on coverage, capacity, and reliability, numerous works have proposed systems that combine several techniques, such as Multiple Input Multiple Output (MIMO) wireless technologies with Orthogonal Frequency Division Multiplexing (OFDM), in the evolving 4G wireless communications. Our proposed system is based on the 2x2 MIMO antenna technique, which enhances the performance of radio communication systems in terms of capacity and spectral efficiency, and on the OFDM technique, which can be implemented using two types of sub-carrier mapping modes: Space-Time Block Coding (STBC) and Space-Frequency Block Coding (SFBC). SFBC has been considered in our developed model. The main advantage of SFBC over STBC is that SFBC encodes two modulated symbols over two subcarriers of the same OFDM symbol, so the coding is performed in the frequency domain, whereas STBC encodes two modulated symbols over the same subcarrier of two consecutive OFDM symbols. Our solution aims to analyze the performance of the SFBC scheme, increasing the Signal-to-Noise Ratio (SNR) at the receiver and decreasing the Bit Error Rate (BER), using 4 QAM, 16 QAM, and 64 QAM modulation over a 2x2 MIMO channel for an LTE downlink transmission in different radio channel environments. In this work, an analytical tool to evaluate the performance of SFBC-OFDM with two transmit antennas and two receive antennas has been implemented, and the average SNR has been considered a sufficient statistic to describe the performance of SFBC in the 3GPP Long Term Evolution system over MIMO channels.
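The frequency-domain coding step can be illustrated with a minimal sketch of Alamouti-style SFBC over one pair of adjacent subcarriers. The mapping below is the standard Alamouti arrangement and is an assumption; the abstract does not spell out the exact mapping used in the developed model. The sketch also assumes a single receive antenna, no noise, and a channel that is identical on both subcarriers. All function names are illustrative.

```python
def sfbc_encode(s0, s1):
    """Map two symbols onto subcarriers (k, k+1) of one OFDM symbol.

    Returns one (subcarrier k, subcarrier k+1) pair per transmit antenna.
    """
    antenna0 = (s0, s1)
    antenna1 = (-s1.conjugate(), s0.conjugate())
    return antenna0, antenna1


def sfbc_receive(antenna0, antenna1, h0, h1):
    """Noise-free received samples on the two subcarriers.

    h0 and h1 are the channel gains from antennas 0 and 1, assumed equal
    on both subcarriers (flat over the pair).
    """
    r0 = h0 * antenna0[0] + h1 * antenna1[0]
    r1 = h0 * antenna0[1] + h1 * antenna1[1]
    return r0, r1


def sfbc_decode(r0, r1, h0, h1):
    """Linear Alamouti combining; recovers both symbols with gain |h0|^2 + |h1|^2."""
    gain = abs(h0) ** 2 + abs(h1) ** 2
    s0_hat = (h0.conjugate() * r0 + h1 * r1.conjugate()) / gain
    s1_hat = (h0.conjugate() * r1 - h1 * r0.conjugate()) / gain
    return s0_hat, s1_hat


# Two 4 QAM symbols and arbitrary example channel gains.
s0, s1 = 1 + 1j, -1 + 1j
h0, h1 = 0.8 - 0.3j, -0.2 + 0.9j
a0, a1 = sfbc_encode(s0, s1)
r0, r1 = sfbc_receive(a0, a1, h0, h1)
s0_hat, s1_hat = sfbc_decode(r0, r1, h0, h1)
```

In a 2x2 configuration, the same combining would be applied per receive antenna and the results summed before normalization. An STBC variant would use the same equations with `r0` and `r1` taken from one subcarrier in two consecutive OFDM symbols instead of two subcarriers in one symbol.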
This paper investigates broadband channel estimation (CE) for intelligent reflecting surface (IRS)-aided millimeter-wave (mmWave) massive MIMO systems. CE for such systems is a challenging task due to the large dimensions of both the active massive MIMO array at the base station (BS) and the passive IRS. To address this problem, this paper proposes a compressive sensing (CS)-based CE solution for IRS-aided mmWave massive MIMO systems, whereby the angular channel sparsity of large-scale arrays at mmWave is exploited for improved CE with reduced pilot overhead. Specifically, we first propose a downlink pilot transmission framework. By designing the pilot signals based on the prior knowledge that the line-of-sight dominated BS-to-IRS channel is known, the high-dimensional BS-to-user and IRS-to-user channels can be jointly estimated based on CS theory. Moreover, to efficiently estimate broadband channels, a distributed orthogonal matching pursuit algorithm is employed, which exploits the common sparsity shared by the channels at different subcarriers. Additionally, a redundant dictionary is designed to combat power leakage and further enhance CE performance. Simulation results demonstrate the effectiveness of the proposed scheme.
This paper focuses on advancing outdoor wireless systems to better support ubiquitous extended reality (XR) applications and to close the gap with current indoor wireless transmission capabilities. We propose a hybrid knowledge-data driven method for channel semantic acquisition and multi-user beamforming in cell-free massive multiple-input multiple-output (MIMO) systems. Specifically, we first propose a data-driven multi-layer perceptron (MLP)-Mixer-based auto-encoder for channel semantic acquisition, where the pilot signals, the CSI quantizer for channel semantic embedding, and the CSI reconstruction for channel semantic extraction are jointly optimized in an end-to-end manner. Moreover, based on the acquired channel semantics, we further propose a knowledge-driven deep-unfolding multi-user beamformer, which is capable of achieving good spectral efficiency with robustness to imperfect CSI in outdoor XR scenarios. By unfolding the conventional successive over-relaxation (SOR)-based linear beamforming scheme with deep learning, the proposed beamforming scheme adaptively learns the optimal parameters to accelerate convergence and improve robustness to imperfect CSI. The proposed deep-unfolding beamforming scheme can be used for access points (APs) with a fully-digital array and for APs with a hybrid analog-digital array structure. Simulation results demonstrate the effectiveness of our proposed scheme in improving the accuracy of channel acquisition, as well as reducing complexity in both CSI acquisition and beamformer design. The proposed beamforming method achieves approximately 96% of the converged spectral efficiency after only three iterations in downlink transmission, demonstrating its efficacy and potential to improve outdoor XR applications.
We consider massive multiple-input multiple-output (MIMO) systems in the presence of Cauchy noise. First, we focus on the channel estimation problem. In the standard massive MIMO setup, the users transmit orthonormal pilots during the training phase and the received signal at the base station is projected onto each pilot. This processing is optimal when the noise is Gaussian. We show that this processing is not optimal when the noise is Cauchy and, as a remedy, propose a channel estimation technique that operates on the raw received signal. Second, we derive uplink-downlink achievable rates in the presence of Cauchy noise for perfect and imperfect channel state information. Finally, we derive log-likelihood ratio expressions for soft bit detection for both uplink and downlink, and simulate coded bit-error-rate curves. We also derive and compare the symbol detectors for Gaussian and Cauchy noise. An important observation is that the detector constructed for Cauchy noise performs well under both Gaussian and Cauchy noise, whereas the detector constructed for Gaussian noise works poorly in the presence of Cauchy noise. That is, the Cauchy detector is robust against heavy-tailed noise, whereas the Gaussian detector is not.
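The robustness observation can be illustrated with a toy maximum-likelihood detector. This is a minimal sketch under simplifying assumptions that are not taken from the paper: a real-valued channel, independent noise per receive antenna, BPSK symbols, and a known Cauchy scale `gamma`. Up to constants, the negative log-likelihood is the sum of squared residuals for Gaussian noise and the sum of log(gamma^2 + residual^2) for Cauchy noise. All names are illustrative.

```python
import math


def gaussian_metric(y, h, s):
    """Negative log-likelihood (up to constants) under Gaussian noise."""
    return sum((yi - hi * s) ** 2 for yi, hi in zip(y, h))


def cauchy_metric(y, h, s, gamma=1.0):
    """Negative log-likelihood (up to constants) under real Cauchy noise."""
    return sum(math.log(gamma ** 2 + (yi - hi * s) ** 2)
               for yi, hi in zip(y, h))


def detect(y, h, metric, candidates=(-1.0, 1.0)):
    """Return the candidate symbol that minimizes the given metric."""
    return min(candidates, key=lambda s: metric(y, h, s))


# Four receive antennas, transmitted symbol +1, one impulsive outlier.
h = [1.0, 1.0, 1.0, 1.0]
y = [1.1, 0.9, 1.0, -30.0]

gaussian_decision = detect(y, h, gaussian_metric)  # dominated by the outlier
cauchy_decision = detect(y, h, cauchy_metric)      # outlier grows only logarithmically
```

In this example the squared-error metric is dominated by the single outlier and selects -1, while the logarithmic Cauchy metric weighs the three consistent observations more heavily and selects +1.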
Pre-trained deep neural network language models such as ELMo, GPT, BERT and XLNet have recently achieved state-of-the-art performance on a variety of language understanding tasks. However, their size makes them impractical for a number of scenarios, especially on mobile and edge devices. In particular, the input word embedding matrix accounts for a significant proportion of the model's memory footprint, due to the large input vocabulary and embedding dimensions. Knowledge distillation techniques have had success at compressing large neural network models, but they are ineffective at yielding student models with vocabularies different from the original teacher models. We introduce a novel knowledge distillation technique for training a student model with a significantly smaller vocabulary as well as lower embedding and hidden state dimensions. Specifically, we employ a dual-training mechanism that trains the teacher and student models simultaneously to obtain optimal word embeddings for the student vocabulary. We combine this approach with learning shared projection matrices that transfer layer-wise knowledge from the teacher model to the student model. Our method is able to compress the BERT_BASE model by more than 60x, with only a minor drop in downstream task metrics, resulting in a language model with a footprint of under 7MB. Experimental results also demonstrate higher compression efficiency and accuracy when compared with other state-of-the-art compression techniques.
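The memory argument about the input embedding matrix can be made concrete with a short calculation. The teacher figures below are those of the publicly released uncased BERT_BASE checkpoint (30,522 WordPiece tokens, 768-dimensional embeddings). The student vocabulary size and embedding dimension are hypothetical placeholders chosen only for illustration, since the abstract does not state them, and 32-bit floats are assumed throughout.

```python
def embedding_bytes(vocab_size, embed_dim, bytes_per_param=4):
    """Size in bytes of a vocab_size x embed_dim embedding matrix (float32 by default)."""
    return vocab_size * embed_dim * bytes_per_param


# Teacher: uncased BERT_BASE input embedding matrix.
teacher_bytes = embedding_bytes(30522, 768)

# Student: hypothetical 5,000-token vocabulary with 48-dimensional embeddings.
student_bytes = embedding_bytes(5000, 48)

teacher_mb = teacher_bytes / 1e6   # roughly 93.8 MB
student_mb = student_bytes / 1e6   # roughly 0.96 MB
```

Under these assumptions the teacher's embedding matrix alone is far larger than the under-7MB total footprint reported above, which shows why shrinking the vocabulary, and not only the hidden dimensions, matters for this level of compression.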