We show that the communication cost of quantum broadcast channel simulation under free entanglement assistance between the sender and the receivers is asymptotically characterized by an efficiently computable single-letter formula in terms of the channel's multipartite mutual information. Our core contribution is a new one-shot achievability result for multipartite quantum state splitting via multipartite convex splitting. As part of this, we face a general instance of the quantum joint typicality problem with arbitrarily overlapping marginals. The crucial technical ingredient for sidestepping this difficulty is a conceptually novel multipartite mean-zero decomposition lemma, combined with recently introduced complex interpolation techniques for sandwiched R\'enyi divergences. Moreover, we establish exponential convergence of the simulation error whenever the communication costs lie in the interior of the capacity region. When the costs approach the boundary of the capacity region moderately quickly, we show that the error still vanishes asymptotically.
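For context, a minimal sketch of the information quantity involved, assuming (our reading, not a statement of the paper's formula) that the single-letter characterization is built from the standard multipartite mutual information, i.e. the total correlation of a state $\rho$ on systems $A_1 \cdots A_m$:

```latex
% Multipartite mutual information (total correlation) of a state
% \rho_{A_1 \cdots A_m}, with S the von Neumann entropy; the single-letter
% formula is assumed to be expressed via quantities of this form.
I(A_1 : \cdots : A_m)_\rho
  = \sum_{k=1}^{m} S(A_k)_\rho - S(A_1 \cdots A_m)_\rho,
\qquad
S(A)_\rho = -\operatorname{Tr}\!\left[ \rho_A \log \rho_A \right].
```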
Deterministic identification over K-input multiple-access channels (MACs) with average input cost constraints is considered. The capacity region for deterministic identification is determined for an average-error criterion, where arbitrarily large codes are achievable. For a maximal-error criterion, upper and lower bounds on the capacity region are derived. The bounds coincide if all average partial point-to-point channels are injective under the input constraint, i.e., every input at one terminal is mapped to a distinct output distribution when averaged over the inputs at all other terminals. The achievability is proved by treating the MAC as an arbitrarily varying channel with average state constraints. For injective average channels, the capacity region is a hyperrectangle. The modulo-2 and modulo-3 binary adder MACs are presented as examples of channels that are injective under suitable input constraints. The binary multiplier MAC is presented as an example of a non-injective channel whose achievable identification rate region still includes the Shannon capacity region.
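To make the injectivity notion concrete, here is a hypothetical toy check for a two-user modulo-2 binary adder MAC, y = (x1 + x2) mod 2. The input distribution q2 and the tolerance are illustrative assumptions, not the paper's construction:

```python
import itertools
import numpy as np

def average_channel(q2):
    """Average partial channel seen from terminal 1 of the modulo-2
    binary adder MAC: rows are x1 in {0,1}, columns the distribution
    of y = (x1 + x2) mod 2 with x2 averaged out under q2."""
    W = np.zeros((2, 2))
    for x1, x2 in itertools.product(range(2), range(2)):
        W[x1, (x1 + x2) % 2] += q2[x2]
    return W

def is_injective(W, tol=1e-9):
    """Injective iff all rows (output distributions) are distinct."""
    n = len(W)
    return all(np.abs(W[i] - W[j]).sum() > tol
               for i in range(n) for j in range(i + 1, n))

# Uniform inputs at terminal 2 wash out x1 entirely (non-injective),
# while a biased input distribution, e.g. one forced by an average
# input cost constraint, restores injectivity.
print(is_injective(average_channel([0.5, 0.5])))  # False
print(is_injective(average_channel([0.8, 0.2])))  # True
```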
Spatial processes observed in various fields, such as climate and environmental science, often occur on a large scale and exhibit spatial nonstationarity. Fitting a Gaussian process with a nonstationary Mat\'ern covariance is challenging. Previous studies in the literature have tackled this challenge by employing spatial partitioning techniques to estimate the parameters that vary spatially in the covariance function. The selection of partitions is an important consideration, but it is often subjective and lacks a data-driven approach. To address this issue, we leverage the power of Convolutional Neural Networks (ConvNets) to derive subregions from the nonstationary data. We employ a selection mechanism to identify subregions that exhibit behavior similar to stationary fields. To distinguish between stationary and nonstationary random fields, we train the ConvNet on a variety of simulated datasets generated from Gaussian processes with Mat\'ern covariance models under a wide range of parameter settings, ensuring adequate representation of both stationary and nonstationary spatial data. We assess the performance of the proposed method on large-scale synthetic and real datasets. The results reveal improved accuracy in parameter estimation when relying on the ConvNet-based partitioning compared to traditional user-defined approaches.
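A minimal sketch of the classification step under toy assumptions (not the paper's pipeline): stationary Mat\'ern-3/2 fields on a 16x16 grid versus crude "nonstationary" fields spliced from two range parameters, fed to a small PyTorch ConvNet. All architecture and parameter choices are illustrative.

```python
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)
n = 16
xy = np.stack(np.meshgrid(np.arange(n), np.arange(n)), -1).reshape(-1, 2)
D = np.linalg.norm(xy[:, None] - xy[None, :], axis=-1)  # pairwise distances

def matern32(d, rho):
    # Matern covariance with smoothness 3/2 and range parameter rho.
    a = np.sqrt(3.0) * d / rho
    return (1.0 + a) * np.exp(-a)

def sample_field(rho):
    # Exact Gaussian-field sampling via Cholesky of the covariance matrix.
    L = np.linalg.cholesky(matern32(D, rho) + 1e-6 * np.eye(n * n))
    return (L @ rng.standard_normal(n * n)).reshape(n, n)

def make_example(nonstationary):
    if not nonstationary:
        return sample_field(rng.uniform(1.0, 8.0))
    # Crude nonstationarity stand-in: two halves with different ranges.
    f = sample_field(rng.uniform(1.0, 2.0))
    f[:, n // 2:] = sample_field(rng.uniform(6.0, 8.0))[:, n // 2:]
    return f

X = np.stack([make_example(i % 2 == 1) for i in range(400)])[:, None]
y = np.tile([0.0, 1.0], 200)  # 0 = stationary, 1 = nonstationary

net = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(16 * 4 * 4, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
Xt = torch.tensor(X, dtype=torch.float32)
yt = torch.tensor(y, dtype=torch.float32)
for epoch in range(30):  # full-batch training on the toy data
    opt.zero_grad()
    loss = loss_fn(net(Xt).squeeze(1), yt)
    loss.backward()
    opt.step()
print("final training loss:", loss.item())
```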
We study a wireless jamming problem consisting of the competition between a legitimate receiver and a jammer, framed as a zero-sum game whose value, to be maximized/minimized, is the channel capacity at the receiver's side. Most approaches found in the literature consider the two players to be stationary nodes. Instead, we investigate what happens when they can change location, specifically by moving along a linear geometry. We frame this first as a static game, which can be solved in closed form, and subsequently extend it to a dynamic game under three different versions of the completeness/perfectness of each player's information about the adversary's position, corresponding to different assumptions about the concealment and sequentiality of the moves. We first provide some theoretical conditions that hold for the static game and also help identify good strategies valid under any setup, including dynamic games. Since dynamic games, although more realistic, are characterized by an exploding strategy space, we exploit reinforcement learning to obtain efficient strategies leading to equilibrium outcomes. We show how the theoretical findings can be used to train smart agents to play the game, and validate our approach in practical setups.
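As a hypothetical illustration of the static game, the sketch below discretizes positions on a segment, takes log2(1 + SINR) under power-law path loss as the payoff, and compares the two pure-strategy security levels. All constants and the propagation model are assumptions on our part, not the paper's setup:

```python
import numpy as np

# Toy static game: the transmitter sits at position 0; the receiver
# (maximizer) and the jammer (minimizer) each pick a position on a
# discretized segment. Payoff = Shannon capacity at the receiver.
P, J, N, alpha = 1.0, 1.0, 0.1, 2.0
pos = np.linspace(0.1, 1.0, 50)          # candidate positions

def capacity(r, j):
    signal = P / r ** alpha              # transmitter-to-receiver gain
    interference = J / abs(r - j) ** alpha if r != j else np.inf
    return np.log2(1.0 + signal / (N + interference))

G = np.array([[capacity(r, j) for j in pos] for r in pos])
# Pure-strategy security levels; if they coincide, there is a saddle point.
maximin = G.min(axis=1).max()            # receiver guarantees at least this
minimax = G.max(axis=0).min()            # jammer concedes at most this
print(f"maximin={maximin:.3f}, minimax={minimax:.3f}")
```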
This letter introduces a novel resource allocation algorithm for achieving max-min fairness (MMF) in a rate-splitting multiple access (RSMA) empowered multi-antenna broadcast channel. Specifically, we derive the closed-form solution for the optimal allocation of the common rate among users and of the power between the common and private streams, for a given practical low-complexity beamforming direction design. Numerical results show that the proposed algorithm achieves at least 90% of the MMF rate obtained by the conventional iterative optimization algorithm while taking an average of only 0.1 milliseconds of computation time, three orders of magnitude less than the conventional algorithm. It is therefore a practical resource allocation algorithm for RSMA.
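For intuition about the common-rate part, here is a minimal sketch under our own simplifying assumption (this is not the letter's derivation): with beamforming fixed, splitting a total common rate C to maximize min_k (r_k + c_k) over users' private rates r_k reduces to water-filling the lowest rates up to a common level.

```python
import numpy as np

def split_common_rate(r, C):
    """Return the max-min rate achieved by allocating a common rate
    budget C across users with private rates r (c_k >= 0, sum c_k = C):
    fill the lowest private rates up to a common water level mu."""
    r = np.sort(np.asarray(r, dtype=float))  # ascending private rates
    for i in range(len(r)):
        # Candidate level if exactly users 0..i receive common rate.
        mu = (C + r[: i + 1].sum()) / (i + 1)
        if i == len(r) - 1 or mu <= r[i + 1]:
            break
    return mu

print(split_common_rate([0.5, 1.0, 2.0], C=1.0))  # -> 1.25
```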
Machine learning methods are commonly evaluated and compared by their performance on data sets from public repositories. This allows multiple methods, oftentimes several thousand, to be evaluated under identical conditions and across time. The highest-ranked performance on a problem is referred to as state-of-the-art (SOTA) performance, and is used, among other things, as a reference point for the publication of new methods. However, the highest-ranked performance is a biased estimator of SOTA performance, giving overly optimistic results. The mechanisms at play are those of multiplicity, a topic that is well studied in the context of multiple comparisons and multiple testing, but has, as far as the authors are aware, been nearly absent from the discussion of SOTA estimates. Because the optimistic state-of-the-art estimate is used as a standard for evaluating new methods, methods with substantially inferior results are easily overlooked. In this article, we provide a probability distribution for the case of multiple classifiers so that known analysis methods can be applied and a better SOTA estimate can be provided. We demonstrate the impact of multiplicity through a simulated example with independent classifiers. We show how classifier dependency impacts the variance, but also that the impact is limited when the accuracy is high. Finally, we discuss a real-world example: a Kaggle competition from 2020.
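A minimal simulation of the multiplicity effect, with illustrative numbers of our choosing: M independent classifiers all share the same true accuracy p and are evaluated on a test set of size n, yet the best empirical accuracy systematically overshoots p.

```python
import numpy as np

# Each row of the binomial draw is one "leaderboard" of M classifiers;
# taking the per-row maximum mimics reporting the best result as SOTA.
rng = np.random.default_rng(1)
p, n, M, reps = 0.85, 1000, 500, 2000
best = rng.binomial(n, p, size=(reps, M)).max(axis=1) / n
print(f"true accuracy: {p:.3f}")
print(f"mean best-of-{M} empirical accuracy: {best.mean():.3f}")
```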
An algorithm is said to be adaptive to a certain parameter (of the problem) if it does not need a priori knowledge of that parameter but performs competitively with algorithms that know it. This dissertation presents our work on adaptive algorithms in the following scenarios: 1. In the stochastic optimization setting, we only receive stochastic gradients, and the level of noise in evaluating them greatly affects the convergence rate. Without prior knowledge of the noise scale, tuning is typically required to achieve the optimal rate. Considering this, we designed and analyzed noise-adaptive algorithms that automatically ensure (near-)optimal rates under different noise scales without knowing the scale. 2. In training deep neural networks, the scales of the gradient magnitudes in each coordinate can scatter across a very wide range unless normalization techniques, like BatchNorm, are employed. In such situations, algorithms that do not address this problem of gradient scales can behave very poorly. To mitigate this, we formally established the advantage of scale-free algorithms that adapt to the gradient scales and demonstrated their real benefits in empirical experiments. 3. Traditional analyses in non-convex optimization typically rely on the smoothness assumption. Yet, this condition does not capture the properties of some deep learning objective functions, including ones involving Long Short-Term Memory networks and Transformers. Instead, they satisfy a much more relaxed condition, with potentially unbounded smoothness. Under this condition, we show that a generalized SignSGD algorithm can theoretically match the best-known convergence rates obtained by SGD with gradient clipping without needing explicit clipping at all, while empirically matching the performance of Adam and beating other baselines. Moreover, it can also be made to automatically adapt to the unknown relaxed smoothness.
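On point 3, the relaxed condition is presumably the (L_0, L_1)-smoothness condition commonly used in this line of work (an assumption on our reading); for reference:

```latex
% (L_0, L_1)-smoothness (Zhang et al., 2020): the Hessian norm may grow
% linearly with the gradient norm, hence "potentially unbounded smoothness".
% Standard L-smoothness is recovered by setting L_1 = 0.
\left\| \nabla^2 f(x) \right\| \;\le\; L_0 + L_1 \left\| \nabla f(x) \right\|
\qquad \text{for all } x.
```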
Language model training in distributed settings is limited by the communication cost of gradient exchanges. In this short note, we extend recent work from Malladi et al. (2023), using shared randomness to perform distributed fine-tuning with low bandwidth. The method is a natural decentralized extension of memory-efficient Simultaneous Perturbation Stochastic Approximation (SPSA). In each iteration, each machine seeds a random number generator (RNG) to perform reproducible local perturbations of the model weights, then calculates and exchanges scalar projected gradients, which are used to update each model. By using a (machine, sample) identifier as the random seed, each model can regenerate one another's perturbations. As machines only exchange single-byte projected gradients, the method is highly communication efficient. There are also potential privacy benefits, as projected gradients may be calculated on different training data, and no model ever accesses another's data. Our approach not only drastically reduces communication bandwidth requirements but also accommodates the dynamic addition or removal of machines during training, and it retains the memory-efficient and inference-only advantages of recent work. We perform proof-of-concept experiments to demonstrate the potential usefulness of this method, building on the rich literature on distributed optimization and memory-efficient training.
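A minimal sketch of the shared-randomness loop with toy numpy "models", where a quadratic loss stands in for the language model; the two-machine setup, constants, and seeding scheme are illustrative assumptions, not the note's implementation:

```python
import numpy as np

dim, eps, lr, steps, machines = 10, 1e-3, 1e-2, 500, 2
target = np.arange(dim, dtype=float)

def local_loss(w, machine):
    # Each machine sees slightly different "data" (a perturbed target).
    data_rng = np.random.default_rng(1000 + machine)
    return 0.5 * np.sum((w - target + 0.01 * data_rng.standard_normal(dim)) ** 2)

w = np.zeros(dim)  # every machine holds an identical copy of the weights
for t in range(steps):
    # 1) Each machine perturbs with its own (machine, step)-seeded RNG and
    #    computes a scalar projected gradient on its local data via SPSA.
    scalars = []
    for m in range(machines):
        z = np.random.default_rng((m, t)).standard_normal(dim)
        g = (local_loss(w + eps * z, m) - local_loss(w - eps * z, m)) / (2 * eps)
        scalars.append(g)  # the only quantity exchanged between machines
    # 2) Every machine regenerates all perturbations from the shared seeds
    #    and applies the same update, keeping all copies in sync.
    for m, g in enumerate(scalars):
        w -= lr / machines * g * np.random.default_rng((m, t)).standard_normal(dim)
print("final loss on machine 0:", local_loss(w, 0))
```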
Evolvability refers to the ability of an individual genotype (solution) to produce offspring with mutually diverse phenotypes. Recent research has demonstrated that divergent search methods, particularly novelty search, promote evolvability by implicitly creating selective pressure for it. The main objective of this paper is to provide a novel perspective on the relationship between neuroevolutionary divergent search and evolvability. To achieve this, several types of walks from the literature on fitness landscape analysis are first adapted to this context. Subsequently, the interplay between neuroevolutionary divergent search and evolvability is investigated under varying amounts of evolutionary pressure and under different diversity metrics. To this end, experiments are performed on Fetch Pick and Place, a robotic arm task. Moreover, the study sheds light in particular on the structure of the genotype-phenotype mapping (the behavior landscape). Finally, a novel definition of evolvability is proposed that takes into account the evolvability of offspring and is appropriate for use with discretized behavior spaces, together with a Markov-chain-based method for estimating it.
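To fix ideas, a toy sketch of evolvability as offspring behavioral diversity over a discretized behavior space; the genotype-to-behavior map and all constants here are arbitrary stand-ins, not the paper's definition:

```python
import numpy as np

rng = np.random.default_rng(0)

def behavior(genotype, bins=10):
    # Hypothetical behavior descriptor, discretized into a grid cell.
    b = np.tanh([genotype[:3].sum(), genotype[3:].prod()])
    return tuple(np.digitize(b, np.linspace(-1, 1, bins)))

def evolvability(genotype, n_offspring=200, sigma=0.1):
    # Mutate the genotype many times and count distinct behavior cells.
    offspring = genotype + sigma * rng.standard_normal((n_offspring, genotype.size))
    return len({behavior(child) for child in offspring}) / n_offspring

print(evolvability(rng.standard_normal(6)))
```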
Variational Quantum Algorithms (VQAs), such as the Quantum Approximate Optimization Algorithm (QAOA) of [Farhi, Goldstone, Gutmann, 2014], have seen intense study towards near-term applications on quantum hardware. A crucial parameter for VQAs is the \emph{depth} of the variational ``ansatz'' used -- the smaller the depth, the more amenable the ansatz is to near-term quantum hardware in that it gives the circuit a chance to be fully executed before the system decoheres. In this work, we show that approximating the optimal depth for a given VQA ansatz is intractable. Formally, we show that for any constant $\epsilon>0$, it is QCMA-hard to approximate the optimal depth of a VQA ansatz within multiplicative factor $N^{1-\epsilon}$, for $N$ denoting the encoding size of the VQA instance. (Here, Quantum Classical Merlin-Arthur (QCMA) is a quantum generalization of NP.) We then show that this hardness persists in the even ``simpler'' QAOA-type settings. To our knowledge, this yields the first natural QCMA-hard-to-approximate problems.
Games and simulators can be a valuable platform to execute complex multi-agent, multiplayer, imperfect-information scenarios with significant parallels to military applications: multiple participants manage resources and make decisions that command assets to secure specific areas of a map or neutralize opposing forces. These characteristics have attracted the artificial intelligence (AI) community by supporting the development of algorithms with complex benchmarks and the capability to rapidly iterate over new ideas. The success of AI algorithms in real-time strategy games such as StarCraft II has also attracted the attention of the military research community, which aims to explore similar techniques in military counterpart scenarios. Aiming to bridge the connection between games and military applications, this work discusses past and current efforts on how games and simulators, together with AI algorithms, have been adapted to simulate certain aspects of military missions and how they might impact the future battlefield. This paper also investigates how advances in virtual reality and visual augmentation systems open new possibilities for human interfaces with gaming platforms and their military parallels.