The incorporation of a reconfigurable intelligent surface (RIS) into massive multiple-input-multiple-output (mMIMO) systems can unleash the potential of next-generation networks by improving the performance of user equipments (UEs) in service dead zones. However, such systems critically require accurate channel state information (CSI), and in applications with UE mobility, the resulting channel aging makes it challenging to achieve adequate quality of service. Hence, in this work, we investigate the impact of channel aging on the performance of RIS-assisted mMIMO systems under both spatial correlation and imperfect CSI. Specifically, accounting for channel aging during both the uplink training and downlink data transmission phases, we first perform minimum mean square error (MMSE) channel estimation to obtain the UE effective channels with low overhead, similar to conventional systems without a RIS. Next, using deterministic equivalent (DE) analysis, we derive the downlink achievable sum spectral efficiency (SE) with regularized zero-forcing (RZF) precoding in closed form, depending only on large-scale statistics. Subsequently, we optimize the achievable sum SE with respect to the phase shifts and the total transmit power; since the large-scale statistics vary slowly, this optimization only needs to be performed every several coherence intervals. Numerical results validate the analytical expressions, demonstrate the performance, and allow the extraction of insightful design conclusions for common scenarios including UE mobility. In particular, channel aging degrades the performance, but its impact can be controlled by choosing the frame duration appropriately or by increasing the number of RIS elements.
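As a rough illustration of the RZF precoding step only (a minimal Monte Carlo sketch under simplifying assumptions, not the paper's DE-based closed-form expressions), the snippet below builds the RZF precoder from an estimated effective channel matrix and evaluates the resulting sum SE; the dimensions, regularizer `alpha`, and power budget `P_tot` are arbitrary placeholders.

```python
# Minimal sketch: RZF precoding for a downlink with M BS antennas and K single-antenna UEs.
# H_hat stands in for the estimated effective (cascaded plus direct) channel; all values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
M, K, P_tot, alpha = 64, 8, 1.0, 0.1

H_hat = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)

# RZF directions: W = H^H (H H^H + alpha I)^{-1}, then scale to the total power constraint.
W = H_hat.conj().T @ np.linalg.inv(H_hat @ H_hat.conj().T + alpha * np.eye(K))
W *= np.sqrt(P_tot / np.trace(W @ W.conj().T).real)

# Per-UE SINR and achievable sum spectral efficiency (unit noise power assumed).
G = H_hat @ W                                   # effective gains after precoding
sig = np.abs(np.diag(G)) ** 2
interf = np.sum(np.abs(G) ** 2, axis=1) - sig
sum_SE = np.sum(np.log2(1 + sig / (interf + 1.0)))
print(f"sum SE ~ {sum_SE:.2f} bit/s/Hz")
```

In the paper, the corresponding sum SE is instead obtained in closed form from large-scale statistics via the DE analysis rather than by instantiating channel realizations as above.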
Machine learning models can leak information about the data used to train them. To mitigate this issue, Differentially Private (DP) variants of optimization algorithms like Stochastic Gradient Descent (DP-SGD) have been designed to trade off utility for privacy in Empirical Risk Minimization (ERM) problems. In this paper, we propose Differentially Private proximal Coordinate Descent (DP-CD), a new method to solve composite DP-ERM problems. We derive utility guarantees through a novel theoretical analysis of inexact coordinate descent. Our results show that, thanks to larger step sizes, DP-CD can exploit imbalance in gradient coordinates to outperform DP-SGD. We also prove new lower bounds for composite DP-ERM under coordinate-wise regularity assumptions, which are nearly matched by DP-CD. For practical implementations, we propose to clip gradients using coordinate-wise thresholds that emerge from our theory, avoiding costly hyperparameter tuning. Experiments on real and synthetic data support our results, and show that DP-CD compares favorably with DP-SGD.
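To make the coordinate-wise clipping idea concrete, here is a minimal sketch of one epoch of DP proximal coordinate descent on a least-squares loss with L1 regularization; the thresholds `clip[j]`, step sizes `gamma[j]`, and noise scale `sigma` are hypothetical placeholders, not the calibrated quantities derived in the paper.

```python
# Simplified sketch of DP proximal coordinate descent on 0.5/n * ||Xw - y||^2 + lam * ||w||_1.
import numpy as np

def soft_threshold(x, t):
    # proximal operator of t * |x|
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def dp_cd_epoch(X, y, w, gamma, clip, sigma, lam, rng):
    n, d = X.shape
    for j in rng.permutation(d):                       # random coordinate order
        per_ex = X[:, j] * (X @ w - y)                 # per-example gradients of coordinate j
        per_ex = np.clip(per_ex, -clip[j], clip[j])    # coordinate-wise clipping
        g_j = per_ex.mean() + sigma * 2 * clip[j] / n * rng.standard_normal()  # Gaussian noise
        w[j] = soft_threshold(w[j] - gamma[j] * g_j, gamma[j] * lam)           # prox step
    return w

rng = np.random.default_rng(1)
X, y = rng.standard_normal((200, 10)), rng.standard_normal(200)
w = dp_cd_epoch(X, y, np.zeros(10), gamma=np.full(10, 0.5),
                clip=np.full(10, 1.0), sigma=1.0, lam=0.01, rng=rng)
```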
This paper considers the massive random access problem in MIMO quasi-static Rayleigh fading channels. Specifically, we derive achievability and converse bounds on the minimum energy-per-bit required for each active user to transmit $J$ bits with blocklength $n$ and power $P$ under a per-user probability of error (PUPE) constraint, in the cases with and without \emph{a priori} channel state information at the receiver (CSIR and no-CSI). In the no-CSI case, we consider the settings both with and without knowledge of the number $K_a$ of active users. The achievability bounds rely on the design of the ``good region''. Numerical evaluation shows that the gap between the achievability and converse bounds is less than $2.5$ dB in the CSIR case and less than $4$ dB in the no-CSI case in most considered regimes. When the distribution of $K_a$ is known, the performance gap between the cases with and without knowing the value of $K_a$ is small. For example, in the setup with blocklength $n=1000$, payload $J=100$, error requirement $\epsilon=0.001$, and $L=128$ receive antennas, compared to the case with known $K_a$, the extra required energy-per-bit when $K_a$ is unknown and distributed as $K_a\sim\text{Binom}(K,0.4)$ is less than $0.3$ dB on the converse side and $1.1$ dB on the achievability side. The spectral efficiency grows approximately linearly with $L$ in the CSIR case, whereas the growth rate decreases in the no-CSI case. Moreover, we study the performance of a pilot-assisted scheme, which is suboptimal especially when $K_a$ is large. Building on the non-asymptotic results, when all users are active and $J=\Theta(1)$, we obtain the following scaling laws: when $L=\Theta \left(n^2\right)$ and $P=\Theta\left(\frac{1}{n^2}\right)$, one can reliably serve $K=\mathcal{O}(n^2)$ users with no-CSI; under mild conditions with CSIR, the PUPE requirement is satisfied if and only if $\frac{nL\ln KP}{K}=\Omega\left(1\right)$.
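As a toy numerical illustration of the quantity being bounded (not one of the paper's bounds): normalizing the noise variance to one, a user transmitting $J$ bits over $n$ symbols at per-symbol power $P$ spends an energy-per-bit of $E_b/N_0 = nP/J$. The value of $P$ below is an arbitrary example.

```python
# Toy energy-per-bit computation under unit noise variance (illustrative values only).
import math

n, J, P = 1000, 100, 0.01
eb_n0 = n * P / J
print(f"Eb/N0 = {eb_n0:.3f} ({10 * math.log10(eb_n0):.2f} dB)")
```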
Chinese Spelling Correction (CSC) is gaining increasing attention due to its promise of automatically detecting and correcting spelling errors in Chinese texts. Despite its extensive use in many applications, like search engines and optical character recognition systems, little has been explored in medical scenarios, in which complex and uncommon medical entities are easily misspelled. Correcting the misspellings of medical entities is arguably more difficult than those in the open domain due to its requirement of specific domain knowledge. In this work, we define the task of Medical-domain Chinese Spelling Correction and propose MCSCSet, a large-scale specialist-annotated dataset that contains about 200k samples. In contrast to the existing open-domain CSC datasets, MCSCSet involves: i) extensive real-world medical queries collected from Tencent Yidian, and ii) corresponding misspelled sentences manually annotated by medical specialists. To support automated dataset curation, MCSCSet further offers a medical confusion set consisting of the commonly misspelled characters of given Chinese medical terms, which enables one to create medical misspelling datasets automatically. Extensive empirical studies have shown significant performance gaps between open-domain and medical-domain spelling correction, highlighting the need to develop high-quality datasets that allow for Chinese spelling correction in specific domains. Moreover, our work benchmarks several representative Chinese spelling correction models, establishing baselines for future work.
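A minimal sketch of confusion-set-based curation (an assumed workflow, not the dataset's actual pipeline): given a sentence containing a medical term, replace one character with a commonly confused counterpart to synthesize a misspelled sample. The tiny confusion set and example sentence below are made up for illustration and are not entries from MCSCSet.

```python
# Hypothetical confusion-set lookup used to synthesize a misspelled sentence.
import random

confusion_set = {"咳": ["刻", "克"], "嗽": ["漱", "簌"]}  # made-up entries for illustration

def corrupt(sentence, confusion_set, rng=random.Random(0)):
    chars = list(sentence)
    candidates = [i for i, c in enumerate(chars) if c in confusion_set]
    if not candidates:
        return sentence                      # nothing to corrupt
    i = rng.choice(candidates)
    chars[i] = rng.choice(confusion_set[chars[i]])
    return "".join(chars)

print(corrupt("孩子咳嗽三天怎么办", confusion_set))
```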
Decentralized optimization is increasingly popular in machine learning for its scalability and efficiency. Intuitively, it should also provide better privacy guarantees, as nodes only observe the messages sent by their neighbors in the network graph. But formalizing and quantifying this gain is challenging: existing results are typically limited to Local Differential Privacy (LDP) guarantees that overlook the advantages of decentralization. In this work, we introduce pairwise network differential privacy, a relaxation of LDP that captures the fact that the privacy leakage from a node $u$ to a node $v$ may depend on their relative position in the graph. We then analyze the combination of local noise injection with (simple or randomized) gossip averaging protocols on fixed and random communication graphs. We also derive a differentially private decentralized optimization algorithm that alternates between local gradient descent steps and gossip averaging. Our results show that our algorithms amplify privacy guarantees as a function of the distance between nodes in the graph, matching the privacy-utility trade-off of the trusted curator, up to factors that explicitly depend on the graph topology. Finally, we illustrate our privacy gains with experiments on synthetic and real-world datasets.
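A schematic sketch of the mechanism analyzed (simplified assumptions, not the paper's exact protocol): each node perturbs its private value with local Gaussian noise, and repeated gossip averaging over a fixed graph mixes the noisy values across the network. The ring topology, noise scale `sigma`, and number of rounds below are arbitrary illustrations.

```python
# Local noise injection followed by synchronous gossip averaging on a ring graph.
import numpy as np

rng = np.random.default_rng(2)
n_nodes, sigma, n_rounds = 8, 0.1, 20

# Doubly-stochastic gossip matrix for a ring: each node averages with its two neighbors.
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    W[i, i] = 1 / 3
    W[i, (i - 1) % n_nodes] = 1 / 3
    W[i, (i + 1) % n_nodes] = 1 / 3

x = rng.standard_normal(n_nodes)                  # private local values
z = x + sigma * rng.standard_normal(n_nodes)      # local Gaussian noise injection
for _ in range(n_rounds):
    z = W @ z                                     # one gossip averaging round
print("true mean:", x.mean(), "| node 0 estimate after gossip:", z[0])
```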
As electric vehicle (EV) technologies mature, EVs have been rapidly adopted in modern transportation systems and are expected to provide future autonomous mobility-on-demand (AMoD) service with economic and societal benefits. However, EVs require frequent recharges due to their limited and unpredictable cruising ranges, and they have to be managed efficiently given the dynamic charging process. It is therefore urgent and challenging to design a computationally efficient algorithm that provides EV AMoD system performance guarantees under model uncertainties, instead of relying on heuristic demand or charging models. To accomplish this goal, this work designs a data-driven distributionally robust optimization approach that balances the vehicle supply-demand ratio and charging station utilization while minimizing the worst-case expected cost, considering both passenger mobility demand uncertainties and EV supply uncertainties. We then derive an equivalent, computationally tractable form for solving the distributionally robust problem efficiently under ellipsoid uncertainty sets constructed from data. Based on E-taxi system data from the city of Shenzhen, we show that with the distributionally robust vehicle balancing method, the average total balancing cost is reduced by 14.49%, and the average unfairness of the supply-demand ratio and of utilization is reduced by 15.78% and 34.51%, respectively, compared with solutions that do not consider model uncertainties.
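To hint at why ellipsoid uncertainty sets lead to a tractable reformulation (a deliberately simplified sketch, not the paper's model): for a cost that is linear in an uncertain demand mean constrained to an ellipsoid $\{\mu_0 + Au : \|u\|_2 \le 1\}$, the worst case admits the closed form $c^\top \mu_0 + \|A^\top c\|_2$. All quantities below are synthetic placeholders.

```python
# Closed-form worst case of a linear cost over an ellipsoid uncertainty set (toy data).
import numpy as np

rng = np.random.default_rng(3)
d = 5                                  # number of regions (hypothetical)
c = rng.random(d)                      # per-region balancing cost coefficients
mu0 = rng.random(d) * 10               # nominal demand estimated from data
A = 0.5 * np.eye(d)                    # ellipsoid shape (e.g., from a sample covariance)

worst_case_cost = c @ mu0 + np.linalg.norm(A.T @ c)
print(f"worst-case expected cost: {worst_case_cost:.2f}")
```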
We analyze two Natural Language Inference data sets with respect to their linguistic features. The goal is to identify those syntactic and semantic properties that are particularly hard to comprehend for a machine learning model. To this end, we also investigate the differences between a crowd-sourced, machine-translated data set (SNLI) and a collection of text pairs from internet sources. Our main findings are that the model has difficulty recognizing the semantic importance of prepositions and verbs, emphasizing the importance of linguistically aware pre-training tasks. Furthermore, it often does not comprehend antonyms and homonyms, especially if their meaning depends on the context. Incomplete sentences are another problem, as are longer paragraphs and rare words or phrases. The study shows that automated language understanding requires a more informed approach, utilizing as much external knowledge as possible throughout the training process.
We investigate the effect of channel aging on multi-cell massive multiple-input multiple-output (MIMO) vehicular networks in a generic non-isotropic scattering environment. Based on the single-cluster scattering assumption and the assumption that the scatterers' angles follow a von Mises distribution, an aging channel model is established to capture the joint effect of the spatial and temporal correlations resulting from different angular spread conditions in various application scenarios. Expressions for the user uplink transmission spectral efficiency (SE) are derived for maximum ratio (MR) and minimum mean square error (MMSE) combining. Through numerical studies, the area spectral efficiency (ASE) performance of the network is evaluated in freeway and urban Manhattan road grid scenarios, and easy-to-use empirical models for the optimal transmission block size that maximizes the ASE are obtained for the evaluated scenarios.
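A rough sketch of the kind of temporal correlation such a model produces (an assumed textbook-style computation, not the paper's derivation): with von Mises distributed scatterer angles, the channel aging correlation at lag tau is the characteristic function of the Doppler phase, which the snippet evaluates numerically. The Doppler frequency, mean angle, and concentration `kappa` below are illustrative choices.

```python
# Temporal correlation of an aging channel under von Mises distributed scatterer angles.
import numpy as np
from scipy.stats import vonmises

def aging_correlation(f_D, tau, mean_angle=0.0, kappa=3.0, n_grid=4096):
    theta = np.linspace(-np.pi, np.pi, n_grid)
    pdf = vonmises.pdf(theta, kappa, loc=mean_angle)          # non-isotropic angle distribution
    phase = np.exp(1j * 2 * np.pi * f_D * tau * np.cos(theta))
    return np.sum(pdf * phase) * (theta[1] - theta[0])        # numerical integration

# Example: 100 Hz Doppler (vehicular speed), correlation magnitude after 1 ms and 5 ms.
for tau in (1e-3, 5e-3):
    print(f"|rho({tau * 1e3:.0f} ms)| = {abs(aging_correlation(f_D=100.0, tau=tau)):.3f}")
```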
Autonomic computing investigates how systems can achieve (user-)specified control outcomes on their own, without the intervention of a human operator. The fundamentals of autonomic computing have been substantially influenced by those of control theory for closed- and open-loop systems. In practice, complex systems may exhibit a number of concurrent and inter-dependent control loops. Despite research into autonomic models for managing computer resources, ranging from individual resources (e.g., web servers) to resource ensembles (e.g., multiple resources within a data center), integrating Artificial Intelligence (AI) and Machine Learning (ML) to improve resource autonomy and performance at scale remains a fundamental challenge. The integration of AI/ML to achieve such autonomic and self-managing systems can occur at different levels of granularity, from full to human-in-the-loop automation. In this article, leading academics, researchers, practitioners, engineers, and scientists in the fields of cloud computing, AI/ML, and quantum computing join to discuss current research and potential future directions in these fields. Further, we discuss the challenges and opportunities of leveraging AI and ML in next-generation computing for emerging computing paradigms, including cloud, fog, edge, serverless, and quantum computing environments.
Current deep learning research is dominated by benchmark evaluation. A method is regarded as favorable if it empirically performs well on the dedicated test set. This mentality is seamlessly reflected in the resurfacing area of continual learning, where consecutively arriving sets of benchmark data are investigated. The core challenge is framed as protecting previously acquired representations from being catastrophically forgotten due to the iterative parameter updates. However, the comparison of individual methods is treated in isolation from real-world application and typically judged by monitoring accumulated test set performance. The closed-world assumption remains predominant: it is assumed that during deployment a model is guaranteed to encounter data that stems from the same distribution as used for training. This poses a massive challenge, as neural networks are well known to provide overconfident false predictions on unknown instances and to break down in the face of corrupted data. In this work, we argue that notable lessons from open set recognition, the identification of statistically deviating data outside of the observed dataset, and the adjacent field of active learning, where data is incrementally queried such that the expected performance gain is maximized, are frequently overlooked in the deep learning era. Based on these forgotten lessons, we propose a consolidated view to bridge continual learning, active learning, and open set recognition in deep neural networks. Our results show that this not only benefits each individual paradigm, but also highlights the natural synergies in a common framework. We empirically demonstrate improvements in alleviating catastrophic forgetting, querying data in active learning, and selecting task orders, while exhibiting robust open-world application where previously proposed methods fail.
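As a toy illustration of the two overlooked ingredients (a deliberately simplistic sketch, not the proposed consolidated framework): predictive entropy can drive active-learning queries, while a confidence threshold flags statistically deviating inputs as unknown. Here `probs` merely stands in for the softmax outputs of any classifier, and the threshold is arbitrary.

```python
# Entropy-based querying (active learning) and confidence-based rejection (open set recognition).
import numpy as np

rng = np.random.default_rng(4)
probs = rng.dirichlet(alpha=np.ones(10), size=100)      # placeholder softmax outputs

entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
query_idx = np.argsort(entropy)[-8:]                    # most uncertain examples -> label next
unknown_mask = probs.max(axis=1) < 0.5                  # low confidence -> reject as unknown
print("queried:", query_idx, "| flagged unknown:", unknown_mask.sum())
```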
Causal inference has been a critical research topic across many domains, such as statistics, computer science, education, public policy, and economics, for decades. Nowadays, estimating causal effects from observational data has become an appealing research direction owing to the large amount of available data and the low budget requirement, compared with randomized controlled trials. Propelled by the rapidly developing machine learning area, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well-known causal inference frameworks. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework or not. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including applications in advertising, recommendation, medicine, and so on. Moreover, the commonly used benchmark datasets as well as the open-source codes are summarized, which can help researchers and practitioners explore, evaluate, and apply causal inference methods.
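For readers new to the potential outcome framework, here is a minimal worked example of one classical estimator built on it, inverse propensity weighting (IPW), run on synthetic data; it only illustrates the framework the survey reviews, and the data-generating process is made up.

```python
# IPW estimate of the average treatment effect (ATE) on synthetic observational data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 5000
x = rng.normal(size=(n, 3))                               # observed confounders
p = 1 / (1 + np.exp(-x[:, 0]))                            # true propensity score
t = rng.binomial(1, p)                                    # treatment assignment
y = 2.0 * t + x[:, 0] + rng.normal(size=n)                # outcome, true ATE = 2

e_hat = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]   # estimated propensity
ate_ipw = np.mean(t * y / e_hat) - np.mean((1 - t) * y / (1 - e_hat))
print(f"IPW estimate of ATE: {ate_ipw:.2f} (true value 2.0)")
```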