Orthogonal time frequency space (OTFS) modulation is a recently proposed delay-Doppler (DD) domain communication scheme, which has shown promising performance in general wireless communications, especially over high-mobility channels. In this paper, we investigate DD domain Tomlinson-Harashima precoding (THP) for downlink multiuser multiple-input multiple-output OTFS (MU-MIMO-OTFS) transmissions. Instead of directly applying THP based on the huge equivalent channel matrix, we propose a simple implementation of THP that does not require any matrix decomposition or inversion. This simple implementation is enabled by the DD domain channel property that different resolvable paths do not share the same delay and Doppler shifts, which makes it possible to pre-cancel all the DD domain interference in a symbol-by-symbol manner. We also study the achievable rate performance of the proposed scheme by leveraging the information-theoretically equivalent models. In particular, we show that the proposed scheme achieves near-optimal performance in the high signal-to-noise ratio (SNR) regime. More importantly, we derive scaling laws for the achievable rate with respect to the number of antennas and users, which indicate that the achievable rate increases logarithmically with the number of antennas and linearly with the number of users. Our numerical results align well with these findings and also demonstrate a significant improvement over existing MU-MIMO schemes based on OTFS and orthogonal frequency-division multiplexing (OFDM).
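To make the symbol-by-symbol pre-cancellation idea concrete, the following is a minimal, generic Tomlinson-Harashima precoding sketch in Python (NumPy), not the paper's DD-domain MU-MIMO implementation: it assumes a strictly lower-triangular feedback matrix `B`, real 4-PAM symbols, and a modulo base of 4, all of which are illustrative choices rather than details taken from the paper.

```python
import numpy as np

def thp_precode(s, B, mod_base=4.0):
    """Generic Tomlinson-Harashima precoding sketch (illustrative only).

    s        : (K,) real data symbols (e.g., 4-PAM amplitudes)
    B        : (K, K) strictly lower-triangular interference (feedback) matrix
    mod_base : modulo interval width; outputs are folded into [-mod_base/2, mod_base/2)
    """
    K = len(s)
    x = np.zeros(K)
    for k in range(K):
        # subtract the interference caused by previously precoded symbols
        interference = B[k, :k] @ x[:k]
        v = s[k] - interference
        # the modulo operation keeps the transmit power bounded
        x[k] = (v + mod_base / 2) % mod_base - mod_base / 2
    return x

# toy usage: 4 symbols, random strictly lower-triangular interference
rng = np.random.default_rng(0)
s = rng.choice([-1.5, -0.5, 0.5, 1.5], size=4)              # 4-PAM symbols
B = np.tril(rng.normal(scale=0.3, size=(4, 4)), k=-1)
x = thp_precode(s, B)

# noiseless receiver: the channel re-adds B @ x, the same modulo recovers s
r = x + B @ x
s_hat = (r + 2.0) % 4.0 - 2.0
print(np.allclose(s_hat, s))                                 # True
```

The point mirrored here is that each symbol is precoded sequentially by subtracting the interference of already-precoded symbols and applying a modulo operation, with no matrix decomposition or inversion.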
Gaussian boson sampling, a computational model that is widely believed to admit quantum supremacy, has already been experimentally demonstrated to surpass the classical simulation capabilities of even the most powerful supercomputers available today. However, whether the current approach, limited by photon loss and noise in such experiments, prescribes a scalable path to quantum advantage is an open question. For example, random circuit sampling with constant noise per gate was recently shown not to be a scalable approach to achieving quantum supremacy, although simulating intermediate-scale systems remains difficult. To understand the effect of photon loss on the scalability of Gaussian boson sampling, we use a tensor network algorithm with $U(1)$ symmetry to examine the asymptotic scaling of the operator entanglement entropy, which relates to the simulation complexity. We develop a custom-built algorithm that significantly reduces the computational time on state-of-the-art hardware accelerators, enabling simulations of much larger systems. With this capability, we observe, for Gaussian boson sampling, the crucial $N_\text{out}\propto\sqrt{N}$ scaling of the number of surviving photons with the number of input photons that marks the boundary between efficient and inefficient classical simulation. We further show theoretically that this behavior should be general for other input states.
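The tensor network algorithm itself is too involved for a short snippet, but the $N_\text{out}\propto\sqrt{N}$ boundary can be illustrated with a toy Monte Carlo of uniform photon loss (binomial thinning, an assumption made here purely for illustration): if the per-photon transmission scales as $1/\sqrt{N}$, the expected number of surviving photons grows only as $\sqrt{N}$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy uniform-loss model: each of the N input photons survives independently
# with transmission eta.  If eta = c / sqrt(N), then E[N_out] = c * sqrt(N).
c = 2.0
for N in [100, 400, 1600, 6400]:
    eta = c / np.sqrt(N)
    n_out = rng.binomial(N, eta, size=10_000)        # Monte Carlo thinning
    print(f"N={N:5d}  eta={eta:.3f}  mean N_out={n_out.mean():7.1f}  "
          f"c*sqrt(N)={c * np.sqrt(N):7.1f}")
```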
Over-the-air federated edge learning (Air-FEEL) is a communication-efficient framework for distributed machine learning using training data distributed at edge devices. This framework enables all edge devices to transmit model updates simultaneously over the entire available bandwidth, allowing for over-the-air aggregation. A one-bit digital over-the-air aggregation (OBDA) scheme has recently been proposed, featuring one-bit gradient quantization at the edge devices and majority-voting based decoding at the edge server. However, the low-resolution one-bit gradient quantization slows down model convergence and leads to performance degradation. Moreover, the aggregation errors caused by fading channels in Air-FEEL remain to be addressed. To address these issues, we propose error-feedback one-bit broadband digital aggregation (EFOBDA) together with an optimized power control policy. To this end, we first provide a theoretical analysis to evaluate the impact of error feedback on the convergence of FL with EFOBDA. The analytical results show that, by setting an appropriate feedback strength, EFOBDA is comparable to Air-FEEL without quantization, thus enhancing the performance of OBDA. We then introduce a power control policy obtained by maximizing the convergence rate under instantaneous power constraints. The convergence analysis and the optimized power control policy are verified by experiments, which show that the proposed scheme achieves significantly faster convergence and higher test accuracy in image classification tasks than the one-bit quantization scheme without error feedback or optimized power control.
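As a rough illustration of the error-feedback mechanism (in the spirit of error-feedback sign-SGD, not the full EFOBDA scheme with over-the-air aggregation and power control), the sketch below shows a single edge device keeping a local memory of its quantization error and adding it back before the next one-bit quantization; the `scale` parameter is a stand-in for the feedback strength mentioned above.

```python
import numpy as np

def ef_onebit_quantize(grad, memory, scale):
    """One-bit quantization with error feedback at a single edge device.

    grad   : current local gradient
    memory : accumulated quantization error from previous rounds
    scale  : magnitude of each transmitted sign (stand-in for the feedback strength)
    """
    corrected = grad + memory              # re-inject what was lost earlier
    q = scale * np.sign(corrected)         # the one-bit (sign) message actually sent
    memory = corrected - q                 # residual kept locally for the next round
    return q, memory

# toy usage: quantize a few random gradients in sequence
rng = np.random.default_rng(0)
memory = np.zeros(5)
for t in range(3):
    grad = rng.normal(size=5)
    q, memory = ef_onebit_quantize(grad, memory, scale=0.5)
    print(t, q, np.round(memory, 3))
```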
In applications of group testing in networks, e.g., identifying individuals who are infected by a disease spreading over a network, exploiting correlation among network nodes provides fundamental opportunities for reducing the number of tests needed. We model and analyze group testing on $n$ correlated nodes whose interactions are specified by a graph $G$. We model correlation through an edge-faulty random graph formed from $G$ in which each edge is dropped with probability $1-r$, and all nodes in the same component have the same state. We consider three classes of graphs: cycles and trees, $d$-regular graphs, and stochastic block models (SBM), and obtain lower and upper bounds on the number of tests needed to identify the defective nodes. Our results are expressed relative to the number of tests needed when the nodes are independent, and they are given in terms of $n$, $r$, and the target error. In particular, we quantify the fundamental improvement that exploiting correlation offers by the ratio between the total number of nodes $n$ and the equivalent number of independent nodes in a classic group testing algorithm. The lower bounds are derived by illustrating a strong dependence of the number of tests needed on the expected number of components. In this regard, we establish a new approximation for the distribution of component sizes in "$d$-regular trees" which may be of independent interest and leads to a lower bound on the expected number of components in $d$-regular graphs. The upper bounds are found by forming dense subgraphs in which nodes are more likely to be in the same state. When $G$ is a cycle or tree, we show an improvement by a factor of $\log(1/r)$. For the grid, a graph with almost $2n$ edges, the improvement is by a factor of $(1-r)\log(1/r)$, indicating a drastic improvement compared to trees. When $G$ has a larger number of edges, as in the SBM, the improvement can scale in $n$.
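The role of the expected number of components as an "equivalent number of independent nodes" can be illustrated with a small Monte Carlo simulation; the sketch below (a toy, using the fact that dropping $k \ge 1$ edges of a cycle leaves exactly $k$ path components) estimates this quantity for an edge-faulty cycle.

```python
import numpy as np

def expected_components_cycle(n, r, trials=5000, seed=0):
    """Monte Carlo estimate of the expected number of components of an
    n-cycle whose edges are each kept independently with probability r."""
    rng = np.random.default_rng(seed)
    counts = np.empty(trials)
    for t in range(trials):
        kept = rng.random(n) < r          # one Bernoulli(r) draw per cycle edge
        dropped = n - kept.sum()
        # dropping k >= 1 edges of a cycle leaves exactly k path components;
        # dropping none leaves the whole cycle as a single component
        counts[t] = dropped if dropped >= 1 else 1
    return counts.mean()

n = 1000
for r in [0.5, 0.9, 0.99]:
    m = expected_components_cycle(n, r)
    print(f"r={r:4.2f}: ~{m:7.1f} equivalent independent nodes out of n={n}")
```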
We propose a method for channel training and precoding in FDD massive MIMO based on deep neural networks (DNNs), exploiting downlink (DL) channel covariance knowledge. The DNN is optimized to maximize the DL multi-user sum-rate by producing a pre-beamforming matrix, based on the user channel covariances, that maps the original channel vectors to effective channels. Measurements of these effective channels are received at the users via common pilot transmission and sent back to the base station (BS) through analog feedback without further processing. The BS estimates the effective channels from the received feedback and constructs a linear precoder by concatenating the optimized pre-beamforming matrix with a zero-forcing precoder over the effective channels. We show that the proposed method yields significantly higher sum-rates than the state-of-the-art DNN-based channel training and precoding scheme, especially in scenarios where the pilot and feedback dimensions are small relative to the channel coherence block length. Unlike many works in the literature, our method does not require deploying a DNN at the user side, which typically imposes a high computational cost and parameter-transmission overhead on the system, and is therefore considerably more practical.
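The precoding pipeline can be sketched numerically as follows; here the DNN-produced pre-beamforming matrix is replaced by a random orthonormal placeholder and the channels are drawn i.i.d. rather than from user covariances, so this only illustrates how the effective channels and the concatenated zero-forcing precoder are formed, not the learned component.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, m = 64, 4, 8      # BS antennas, users, pre-beamforming dimension

# user channels (the paper draws these from user-specific covariances;
# i.i.d. Rayleigh here to keep the toy self-contained)
H = (rng.normal(size=(K, M)) + 1j * rng.normal(size=(K, M))) / np.sqrt(2)

# placeholder for the DNN-produced pre-beamforming matrix (M x m):
# a random orthonormal basis instead of the learned, covariance-based one
B = np.linalg.qr(rng.normal(size=(M, m)) + 1j * rng.normal(size=(M, m)))[0]

# effective channels seen after pre-beamforming (K x m)
H_eff = H @ B

# zero-forcing precoder computed over the effective channels
V = H_eff.conj().T @ np.linalg.inv(H_eff @ H_eff.conj().T)

# overall BS precoder: pre-beamforming concatenated with zero-forcing
P = B @ V
print(np.round(np.abs(H @ P), 3))   # ~identity: inter-user interference removed
```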
Over-the-air computation (AirComp) is a known technique in which wireless devices transmit values by analog amplitude modulation so that a function of these values is computed over the communication channel at a common receiver. The physical principle is the superposition property of electromagnetic waves, which naturally yields the sum of the transmitted analog values. Consequently, the applications of AirComp have been almost entirely restricted to analog communication systems. However, using digital communications for over-the-air computation would bring several benefits, such as error correction, synchronization, acquisition of channel state information, and easier adoption by current digital communication systems. Nevertheless, a common belief is that digital modulations are generally unfeasible for computation tasks because the overlap of digitally modulated signals produces signals that seem meaningless for these tasks. This paper challenges this belief and proposes a fundamentally new computing method, named ChannelComp, for performing over-the-air computation with any digital modulation. In particular, we propose digital modulation formats that allow us to compute a wider class of functions than AirComp can compute, and we formulate a feasibility optimization problem that ascertains the optimal digital modulation for computing functions over the air. The simulation results verify the superior performance of ChannelComp in comparison to AirComp, particularly for product functions, with an improvement of around 10 dB in computation error.
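A toy numerical example helps separate the two ideas: analog superposition directly yields the sum, and a suitably chosen digital constellation can keep the overlapped signal decodable. The BPSK case below, where the superposed signal takes $K+1$ distinct levels that are in one-to-one correspondence with the sum of the bits, is a hand-picked illustration, not the output of the paper's feasibility optimization.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 5                                   # number of transmitting devices

# (i) analog AirComp: amplitude-modulated values simply add on the channel
values = rng.uniform(0, 1, size=K)
received_analog = values.sum()          # ideal noiseless superposition
print(received_analog, values.sum())

# (ii) a digital example: BPSK maps bit b to symbol 2b - 1, so the overlapped
# signal takes one of K + 1 levels, each corresponding to a unique bit sum
bits = rng.integers(0, 2, size=K)
received_digital = np.sum(2 * bits - 1)
decoded_sum = (received_digital + K) // 2     # invert the level -> sum map
print(decoded_sum, bits.sum())
```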
In the design of wireless receivers, deep neural networks (DNNs) can be combined with traditional model-based receiver algorithms to realize modular hybrid model-based/data-driven architectures that account for domain knowledge. Such architectures typically include multiple modules, each carrying out a different functionality. Conventionally trained DNN-based modules are known to produce poorly calibrated, typically overconfident, decisions. This implies that an incorrect decision may propagate through the architecture without any indication of its insufficient accuracy. To address this problem, we present a novel combination of Bayesian learning with hybrid model-based/data-driven architectures for wireless receiver design. The proposed methodology, referred to as modular model-based Bayesian learning, results in better calibrated modules, improving the accuracy and calibration of the overall receiver. We demonstrate this approach for the recently proposed DeepSIC MIMO receiver, showing significant improvements over state-of-the-art learning methods.
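The DeepSIC-specific training is not reproduced here, but the calibration idea can be sketched with Monte Carlo dropout, one common approximate Bayesian technique: averaging the softmax outputs of several stochastic forward passes of a module yields soft symbol posteriors that are typically better calibrated than a single deterministic pass. The module architecture and sizes below are placeholders.

```python
import torch
import torch.nn as nn

class SoftDetectorModule(nn.Module):
    """A small DNN module mapping received samples to symbol posteriors."""
    def __init__(self, in_dim=4, num_symbols=4, hidden=32, p_drop=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden, num_symbols),
        )

    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def mc_dropout_posterior(module, x, num_samples=20):
    """Average softmax outputs over stochastic forward passes (MC dropout)."""
    module.train()                        # keep dropout active at inference time
    probs = torch.stack([torch.softmax(module(x), dim=-1)
                         for _ in range(num_samples)])
    return probs.mean(dim=0)              # approximate Bayesian model average

module = SoftDetectorModule()
x = torch.randn(8, 4)                     # a batch of received samples
posterior = mc_dropout_posterior(module, x)
print(posterior.shape, posterior.sum(dim=-1))   # (8, 4); each row sums to 1
```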
Cloud computing has become a critical infrastructure for modern society, much like electric power grids and road networks. As the backbone of the modern economy, it offers subscription-based computing services anytime, anywhere, on a pay-as-you-go basis. Its use is growing exponentially with the continued development of new classes of applications driven by a huge number of emerging networked devices. However, the success of Cloud computing has created a new global energy challenge, as it comes at the cost of vast energy usage. Currently, data centres hosting Cloud services worldwide consume more energy than most countries. By 2025, they are projected to consume 20% of the world's electricity and emit up to 5.5% of its carbon emissions. In addition, a significant part of the energy consumed is transformed into heat, which leads to operational problems, including reduced system reliability and device life expectancy and escalating cooling requirements. Therefore, for future generations of Cloud computing to address the environmental and operational consequences of such significant energy usage, they must become energy-efficient and environmentally sustainable while continuing to deliver high-quality services. In this paper, we propose a vision for a learning-centric approach to the integrated management of new-generation Cloud computing environments that reduces their energy consumption and carbon footprint while delivering service quality guarantees. We identify the dimensions and key issues of integrated resource management and our envisioned approaches to address them. We present a conceptual architecture for energy-efficient new-generation Clouds and early results on the integrated management of resources and workloads that demonstrate its potential benefits for energy efficiency and sustainability.
The dynamics of the power system are described by a system of differential-algebraic equations. Time-domain simulations are used to understand the evolution of the system dynamics. These simulations can be computationally expensive due to the stiffness of the system, which requires the use of finely discretized time-steps. By increasing the allowable time-step size, we aim to accelerate such simulations. In this paper, we use the observation that even though the individual components are described by both algebraic and differential equations, their coupling involves only algebraic equations. Following this observation, we use neural networks (NNs) to approximate the components' state evolution, leading to fast, accurate, and numerically stable approximators that enable larger time-steps. To account for the effects of the network on the components and vice versa, the NNs take the temporal evolution of the coupling algebraic variables as an input for their prediction. We initially estimate this temporal evolution and then update it iteratively using the Newton-Raphson algorithm. The associated Jacobian matrix is calculated with automatic differentiation, and its size depends only on the network size, not on the component dynamics. We demonstrate this NN-based simulator on the IEEE 9-bus test case with 3 generators.
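The iterative structure can be sketched as follows: the component response to the coupling algebraic variables comes from a surrogate (here a simple placeholder function standing in for the trained NN), and the coupling variables are updated by Newton-Raphson on the network's algebraic equations. The Jacobian is formed by finite differences to keep the toy self-contained, whereas the paper obtains it with automatic differentiation; the two-bus admittance matrix is likewise illustrative.

```python
import numpy as np

def component_surrogate(v):
    """Placeholder for the NN approximating a component's response
    (here a current injection) to the coupling variables v."""
    return np.tanh(v) + 0.1 * v

def coupling_residual(v, Y):
    """Network algebraic equations: nodal balance between the network
    (admittance Y) and the component injections."""
    return Y @ v - component_surrogate(v)

def newton_raphson(v0, Y, tol=1e-10, max_iter=20, eps=1e-6):
    v = v0.astype(float).copy()
    for _ in range(max_iter):
        r = coupling_residual(v, Y)
        if np.linalg.norm(r) < tol:
            break
        # finite-difference Jacobian w.r.t. the coupling variables
        # (the paper obtains it with automatic differentiation instead)
        n = len(v)
        J = np.zeros((n, n))
        for j in range(n):
            dv = np.zeros(n)
            dv[j] = eps
            J[:, j] = (coupling_residual(v + dv, Y) - r) / eps
        v = v - np.linalg.solve(J, r)
    return v

Y = np.array([[2.0, -1.0], [-1.0, 2.0]])           # toy two-bus admittance
v = newton_raphson(np.ones(2), Y)
print(v, np.linalg.norm(coupling_residual(v, Y)))  # residual ~ 0
```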
We examine the problem of regret minimization when the learner is involved in a continuous game with other optimizing agents: in this case, if all players follow a no-regret algorithm, it is possible to achieve significantly lower regret relative to fully adversarial environments. We study this problem in the context of variationally stable games (a class of continuous games which includes all convex-concave and monotone games), and when the players only have access to noisy estimates of their individual payoff gradients. If the noise is additive, the game-theoretic and purely adversarial settings enjoy similar regret guarantees; however, if the noise is multiplicative, we show that the learners can, in fact, achieve constant regret. We achieve this faster rate via an optimistic gradient scheme with learning rate separation -- that is, the method's extrapolation and update steps are tuned to different schedules, depending on the noise profile. Subsequently, to eliminate the need for delicate hyperparameter tuning, we propose a fully adaptive method that attains nearly the same guarantees as its non-adapted counterpart, while operating without knowledge of either the game or of the noise profile.
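A minimal sketch of the optimistic gradient template with learning rate separation, on a noiseless bilinear toy game $f(x,y) = xy$ (the adaptive variant and the noise models discussed above are not reproduced): the extrapolation step uses step size `gamma` while the update step uses a different step size `eta`.

```python
import numpy as np

def vector_field(z):
    """Gradient field of the bilinear game f(x, y) = x * y:
    the x-player descends while the y-player ascends."""
    x, y = z
    return np.array([y, -x])

def optimistic_gradient(z0, gamma, eta, num_steps=200):
    """Optimistic gradient with separated step sizes:
    gamma for the extrapolation (leading) step, eta for the update step."""
    z = np.array(z0, dtype=float)
    for _ in range(num_steps):
        z_lead = z - gamma * vector_field(z)    # extrapolation step
        z = z - eta * vector_field(z_lead)      # update step
    return z

# the unique equilibrium of f(x, y) = x * y is (0, 0)
print(optimistic_gradient([1.0, 1.0], gamma=0.5, eta=0.1))
```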
We consider decentralized optimization problems in which a number of agents collaborate to minimize the average of their local functions by exchanging information over an underlying communication graph. Specifically, we consider an asynchronous model in which only a random portion of the nodes perform computation at each iteration, while information exchange can be conducted between all nodes and in an asymmetric fashion. For this setting, we propose an algorithm that combines gradient tracking with a network-level variance reduction (in contrast to variance reduction within each node). This enables each node to track the average of the gradients of the objective functions. Our theoretical analysis shows that the algorithm converges linearly, when the local objective functions are strongly convex, under mild connectivity conditions on the expected mixing matrices. In particular, our result does not require the mixing matrices to be doubly stochastic. In the experiments, we investigate a broadcast mechanism that transmits information from computing nodes to their neighbors, and we confirm the linear convergence of our method on both synthetic and real-world datasets.
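For reference, a synchronous gradient-tracking baseline on quadratic local objectives is sketched below; it uses a doubly stochastic ring mixing matrix and simultaneous updates, so it deliberately omits the asynchrony, network-level variance reduction, and relaxed mixing assumptions that constitute the contribution described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, d = 5, 3

# local quadratics f_i(x) = 0.5 * ||A_i x - b_i||^2
A = 0.3 * rng.normal(size=(n_agents, d, d)) + np.eye(d)
b = rng.normal(size=(n_agents, d))
grad = lambda i, xi: A[i].T @ (A[i] @ xi - b[i])

# doubly stochastic mixing matrix for a 5-node ring
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = 0.25
    W[i, (i + 1) % n_agents] = 0.25

alpha = 0.05
x = np.zeros((n_agents, d))
y = np.array([grad(i, x[i]) for i in range(n_agents)])  # trackers start at local gradients
for _ in range(2000):
    x_new = W @ x - alpha * y                            # consensus + tracked-gradient step
    y = W @ y + np.array([grad(i, x_new[i]) - grad(i, x[i]) for i in range(n_agents)])
    x = x_new

# compare against the minimizer of the average objective
H = sum(A[i].T @ A[i] for i in range(n_agents))
g = sum(A[i].T @ b[i] for i in range(n_agents))
x_star = np.linalg.solve(H, g)
print(np.max(np.linalg.norm(x - x_star, axis=1)))        # small consensus/optimality error
```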