This paper studies a scalar Gaussian wiretap channel where, instead of an average input power constraint, a peak amplitude constraint is imposed on the input. The goal is to obtain insights into the secrecy capacity and the structure of the secrecy-capacity-achieving distribution. Capitalizing on recent theoretical progress on the structure of the secrecy-capacity-achieving distribution, this paper develops a numerical procedure, based on the gradient ascent algorithm and a version of the Blahut-Arimoto algorithm, for computing the secrecy capacity and the secrecy-capacity-achieving input and output distributions.
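For concreteness, the following is a minimal sketch of the classical Blahut-Arimoto iteration for the capacity of a discrete memoryless channel; the procedure in the paper combines a wiretap-specific variant of this iteration with gradient ascent over the amplitude-constrained input support, which is not reproduced here. The channel matrix and tolerances below are illustrative.

import numpy as np

def blahut_arimoto(P, tol=1e-9, max_iter=10_000):
    # Classical Blahut-Arimoto iteration for the capacity (in nats) of a discrete
    # memoryless channel with transition matrix P[x, y] = P(y | x).
    n_x, _ = P.shape
    p = np.full(n_x, 1.0 / n_x)                  # current input distribution
    for _ in range(max_iter):
        q = p @ P                                # induced output distribution
        # KL divergence D(P(.|x) || q) for each input symbol x
        d = (P * np.log(np.where(P > 0, P / q, 1.0))).sum(axis=1)
        p_new = p * np.exp(d)
        p_new /= p_new.sum()                     # multiplicative update + renormalization
        if np.max(np.abs(p_new - p)) < tol:
            p = p_new
            break
        p = p_new
    q = p @ P
    capacity = p @ (P * np.log(np.where(P > 0, P / q, 1.0))).sum(axis=1)
    return capacity, p

# Example: binary symmetric channel with crossover 0.1 (capacity is about 0.368 nats).
C, p_star = blahut_arimoto(np.array([[0.9, 0.1], [0.1, 0.9]]))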
In this paper, we are interested in the performance of a variable-length stop-feedback (VLSF) code with $m$ optimal decoding times for the binary-input additive white Gaussian noise (BI-AWGN) channel. We first develop tight approximations to the tail probability of the length-$n$ cumulative information density. Building on the work of Yavas \emph{et al.}, we formulate the problem of minimizing the upper bound on the average blocklength subject to error-probability, minimum-gap, and integer constraints. For this integer program, we show that, for a given error constraint, a VLSF code that decodes after every symbol attains the maximum achievable rate. We also present a greedy algorithm that yields possibly suboptimal integer decoding times. By allowing positive real-valued decoding times, we develop the gap-constrained sequential differential optimization (SDO) procedure. Numerical evaluation shows that the gap-constrained SDO provides a good estimate of the achievable rate of VLSF codes with $m$ optimal decoding times and that a finite $m$ suffices to attain Polyanskiy's bound for VLSF codes with $m = \infty$.
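As a point of reference for the quantity being approximated, here is a toy Monte Carlo simulation of the length-$n$ cumulative information density for the BI-AWGN channel; the noise level, blocklength, and threshold below are illustrative, and this simulation is not the paper's analytical approximation.

import numpy as np

# BI-AWGN: Y = X + Z with X uniform on {-1, +1} and Z ~ N(0, sigma^2).
# Per-symbol information density (in bits): i(x; y) = log2( 2 / (1 + exp(-2 x y / sigma^2)) ).
rng = np.random.default_rng(0)
sigma, n, trials = 1.0, 200, 100_000

x = rng.choice([-1.0, 1.0], size=(trials, n))
y = x + sigma * rng.standard_normal((trials, n))
i_dens = np.log2(2.0 / (1.0 + np.exp(-2.0 * x * y / sigma**2)))
S_n = i_dens.sum(axis=1)                         # length-n cumulative information density

gamma = 80.0                                     # illustrative decoding threshold
print("P[S_n <= gamma] ~", np.mean(S_n <= gamma))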
Performance assessment and optimization for networks jointly performing caching, computing, and communication (3C) have recently drawn significant attention because many emerging applications require 3C functionality. However, studies in the literature mostly focus on particular algorithms and setups of such networks, while their theoretical understanding and characterization have been less explored. To fill this gap, this paper conducts an asymptotic (scaling-law) analysis of the delay-outage tradeoff of noise-limited wireless edge networks with joint 3C. In particular, assuming that user requests for different tasks follow a Zipf distribution, we derive the analytical expression for the optimal caching policy. Based on this, we derive the closed-form expression for the optimum outage probability as a function of delay and other network parameters for the case where the Zipf parameter is smaller than 1. For the case where the Zipf parameter is larger than 1, we derive closed-form upper and lower bounds on the optimum outage probability. We provide insights and interpretations based on the derived expressions. Computer simulations validate our analytical results and insights.
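As a toy illustration of the request model (not the optimal caching policy or the outage analysis derived in the paper), the snippet below computes the cache hit probability of a most-popular-content baseline under Zipf-distributed task popularity; the library size, Zipf parameter, and cache size are arbitrary.

import numpy as np

N, alpha, C = 1000, 0.8, 50         # library size, Zipf parameter, cache size (illustrative)
ranks = np.arange(1, N + 1)
popularity = ranks ** (-alpha)
popularity /= popularity.sum()      # Zipf request distribution over tasks

hit_prob = popularity[:C].sum()     # hit probability when caching the C most popular tasks
print(f"cache hit probability: {hit_prob:.3f}")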
The angular measure on the unit sphere characterizes the first-order dependence structure of the components of a random vector in extreme regions and is defined in terms of standardized margins. Its statistical recovery is an important step in learning problems involving observations far away from the center. In the common situation where the components of the vector have different distributions, the rank transformation offers a convenient and robust way of standardizing the data in order to build an empirical version of the angular measure based on the most extreme observations. However, the study of the sampling distribution of the resulting empirical angular measure is challenging. The purpose of this paper is to establish finite-sample bounds for the maximal deviations between the empirical and true angular measures, uniformly over classes of Borel sets of controlled combinatorial complexity. The bounds are valid with high probability and, up to logarithmic factors, scale as the square root of the effective sample size. The bounds are applied to provide performance guarantees for two statistical learning procedures tailored to extreme regions of the input space and built upon the empirical angular measure: binary classification in extreme regions through empirical risk minimization, and unsupervised anomaly detection through minimum-volume sets of the sphere.
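A minimal sketch of the rank-based construction analyzed in the paper, under assumed conventions (unit-Pareto standardization and the L1 norm for the radial variable); the choice of k and of the norm is illustrative.

import numpy as np

def empirical_angular_points(X, k):
    # X: (n, d) array of observations; k: number of extreme observations to keep.
    n, d = X.shape
    ranks = np.argsort(np.argsort(X, axis=0), axis=0) + 1   # componentwise ranks in 1..n
    V = 1.0 / (1.0 - ranks / (n + 1.0))                     # rank transform to unit-Pareto margins
    R = V.sum(axis=1)                                        # L1 "radius"
    extreme = np.argsort(R)[-k:]                             # k most extreme observations
    W = V[extreme] / R[extreme, None]                        # angular components on the simplex
    return W   # the empirical angular measure places mass 1/k on each row of W

# usage: W = empirical_angular_points(data, k=int(np.sqrt(len(data))))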
We introduce a class of Markov chains that contains the models of stochastic approximation with and without averaging. Using the martingale approximation method, we establish various deviation inequalities for separately Lipschitz functions of such a chain, under different moment conditions on some dominating random variables of the martingale differences. Finally, we apply these inequalities to stochastic approximation by averaging and to empirical risk minimisation.
As an example of a nonlinear Fokker-Planck equation, the mean field Langevin dynamics has attracted attention due to its connection to (noisy) gradient descent on infinitely wide neural networks in the mean field regime; hence, the convergence properties of the dynamics are of great theoretical interest. In this work, we give a simple and self-contained convergence rate analysis of the mean field Langevin dynamics with respect to the (regularized) objective function in both continuous- and discrete-time settings. The key ingredient of our proof is a proximal Gibbs distribution $p_q$ associated with the dynamics, which, in combination with techniques from [Vempala and Wibisono (2019)], allows us to develop a convergence theory parallel to classical results in convex optimization. Furthermore, we show that $p_q$ connects to the duality gap in the empirical risk minimization setting, which enables efficient empirical evaluation of the algorithm's convergence.
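A minimal particle (Euler-Maruyama) discretization of the dynamics for a mean-field two-layer tanh network under squared loss; the architecture, loss, step size, and regularization below are illustrative assumptions, not the paper's specific setup.

import numpy as np

rng = np.random.default_rng(0)

def mfld_step(particles, X, y, eta, lam, weight_decay=1e-3):
    # One noisy-gradient-descent step on N particles (a_i, w_i) representing the
    # mean-field network f(x) = (1/N) * sum_i a_i * tanh(w_i . x); lam is the
    # entropy-regularization strength and eta the step size.
    a, W = particles[:, 0], particles[:, 1:]
    t = np.tanh(X @ W.T)                                     # (n_data, N)
    err = (a * t).mean(axis=1) - y                           # squared-loss residuals
    grad_a = (err[:, None] * t).mean(axis=0) + weight_decay * a
    grad_W = ((err[:, None] * (a * (1 - t**2))).T @ X) / len(y) + weight_decay * W
    noise = np.sqrt(2.0 * eta * lam) * rng.standard_normal(particles.shape)
    return particles - eta * np.column_stack([grad_a, grad_W]) + noise

# usage: particles = rng.standard_normal((512, 1 + d)); then iterate
# particles = mfld_step(particles, X, y, eta=0.1, lam=1e-2)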
We study the problem of constructing the control driving a controlled differential equation from discrete observations of the response. By restricting the control to the space of piecewise linear paths, we identify the assumptions that ensure uniqueness. The main contribution of this paper is a novel numerical algorithm for constructing the piecewise linear control that converges uniformly in time. Uniform convergence is needed in many applications, and it is achieved here by approaching the problem through the signature representation of the paths, which allows us to work with the whole path simultaneously.
We present a method for solving linear and nonlinear PDEs based on the variable projection (VarPro) framework and artificial neural networks (ANNs). For linear PDEs, enforcing the boundary/initial value problem on the collocation points leads to a separable nonlinear least squares problem in terms of the network coefficients. We reformulate this problem by the VarPro approach to eliminate the linear output-layer coefficients, leading to a reduced problem involving only the hidden-layer coefficients. The reduced problem is first solved by the nonlinear least squares method to determine the hidden-layer coefficients, and the output-layer coefficients are then computed by the linear least squares method. For nonlinear PDEs, enforcing the boundary/initial value problem on the collocation points leads to a nonlinear least squares problem that is not separable, which precludes the VarPro strategy for such problems. To enable the VarPro approach for nonlinear PDEs, we first linearize the problem with a Newton iteration, using a particular form of linearization. The linearized system is solved by the VarPro framework together with ANNs. Upon convergence of the Newton iteration, the network coefficients provide the representation of the solution field to the original nonlinear problem. We present ample numerical examples with linear and nonlinear PDEs to demonstrate the performance of the method. For smooth field solutions, the errors of the current method decrease exponentially as the number of collocation points or the number of output-layer coefficients increases. We compare the current method with the ELM method from previous work. Under identical conditions and network configurations, the current method exhibits accuracy significantly superior to that of the ELM method.
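To illustrate the VarPro step in the simplest setting, the sketch below solves a toy 1D Poisson problem u''(x) = f(x) with u(0) = u(1) = 0 using a single-hidden-layer tanh network: the output-layer coefficients are eliminated by linear least squares, and the hidden-layer coefficients are fitted by nonlinear least squares on the reduced residual. The problem, network size, and optimizer are illustrative assumptions and do not reproduce the paper's treatment of general linear and nonlinear PDEs.

import numpy as np
from scipy.optimize import least_squares

M = 10                                       # hidden-layer width (illustrative)
xs = np.linspace(0.0, 1.0, 50)               # collocation points
f = -np.pi**2 * np.sin(np.pi * xs)           # source term so that the exact solution is u = sin(pi x)

def system(theta):
    # Assemble A(theta) beta = rhs for fixed hidden-layer parameters theta = (w, b).
    w, b = theta[:M], theta[M:]
    t = np.tanh(np.outer(xs, w) + b)
    rows_pde = -2.0 * t * (1.0 - t**2) * w**2          # u''(x_i) contribution of each neuron
    rows_bc = np.tanh(np.outer([0.0, 1.0], w) + b)     # u(0) and u(1)
    A = np.vstack([rows_pde, rows_bc])
    rhs = np.concatenate([f, [0.0, 0.0]])
    return A, rhs

def reduced_residual(theta):
    # VarPro step: eliminate the output-layer coefficients beta by linear least squares
    # and return the remaining misfit as a function of the hidden-layer coefficients only.
    A, rhs = system(theta)
    beta = np.linalg.lstsq(A, rhs, rcond=None)[0]
    return A @ beta - rhs

rng = np.random.default_rng(0)
theta0 = rng.uniform(-3.0, 3.0, size=2 * M)
sol = least_squares(reduced_residual, theta0)          # nonlinear least squares on the reduced problem

A, rhs = system(sol.x)
beta = np.linalg.lstsq(A, rhs, rcond=None)[0]          # recover the output-layer coefficients
u_hat = np.tanh(np.outer(xs, sol.x[:M]) + sol.x[M:]) @ beta
print("max error vs sin(pi x):", np.max(np.abs(u_hat - np.sin(np.pi * xs))))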
In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of the local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm, called multi-step primal-dual (MSPD), together with its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network affects only a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS), based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
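A sketch of the local Gaussian smoothing on which DRS is built (not the full distributed algorithm): the smoothed function f_gamma(x) = E[f(x + gamma Z)], Z ~ N(0, I), admits a simple Monte Carlo gradient estimator; the objective, smoothing radius, and sample count below are illustrative.

import numpy as np

def smoothed_gradient(f, x, gamma, n_samples, rng):
    # Monte Carlo estimate of grad f_gamma(x), where f_gamma(x) = E[f(x + gamma Z)], Z ~ N(0, I).
    g = np.zeros_like(x)
    for _ in range(n_samples):
        z = rng.standard_normal(x.shape)
        g += (f(x + gamma * z) - f(x)) / gamma * z           # unbiased for grad f_gamma(x)
    return g / n_samples

# usage on a non-smooth convex objective, e.g. f(x) = ||x||_1
rng = np.random.default_rng(0)
g = smoothed_gradient(lambda v: np.abs(v).sum(), np.ones(5), gamma=0.1, n_samples=1000, rng=rng)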
The field of Multi-Agent Systems (MAS) is an active area of research within Artificial Intelligence, with an increasingly important impact on industrial and other real-world applications. Within a MAS, autonomous agents interact to pursue personal interests and/or to achieve common objectives. Distributed Constraint Optimization Problems (DCOPs) have emerged as one of the prominent agent architectures to govern the agents' autonomous behavior, where both algorithms and communication models are driven by the structure of the specific problem. During the last decade, several extensions to the DCOP model have enabled it to support MAS in complex, real-time, and uncertain environments. This survey provides an overview of the DCOP model, gives a classification of its multiple extensions, and addresses both the resolution methods and the applications that find a natural mapping within each class of DCOPs. The proposed classification suggests several future perspectives for DCOP extensions and identifies challenges in the design of efficient resolution algorithms, possibly through the adaptation of strategies from different areas.
In this paper, we study the optimal convergence rates for distributed convex optimization problems over networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is (i) strongly convex and smooth, (ii) strongly convex, (iii) smooth, or (iv) just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
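For concreteness, here is a standard sketch (in assumed notation) of the consensus reformulation and its dual, on which the accelerated gradient method is run; $W$ denotes a gossip/Laplacian-type matrix encoding the network, $X = (x_1, \dots, x_m)$ stacks the local copies, and $f_i^*$ is the Fenchel conjugate of $f_i$:
\[
  \min_{X} \; \sum_{i=1}^{m} f_i(x_i) \quad \text{s.t.} \quad W^{1/2} X = 0
  \qquad \text{with dual} \qquad
  \max_{\Lambda} \; -\sum_{i=1}^{m} f_i^{*}\!\big( -(W^{1/2}\Lambda)_i \big).
\]
Each dual gradient step then decomposes into local conjugate (or gradient-oracle) computations at the nodes together with multiplications by the matrix encoding the network, i.e., exchanges between neighboring nodes only.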