We propose an optimal low-resolution precoding technique that minimizes the symbol error probability of the users. Unlike existing approaches that are restricted to QPSK modulation, the proposed derivation of the minimum symbol error probability objective function allows for any PSK modulation order. Moreover, the proposed method solves the corresponding discrete optimization problem optimally via a sophisticated branch-and-bound method. In addition, we propose several approaches based on greedy search to compute practical solutions. Numerical simulations confirm the superiority of the proposed minimum symbol error probability criterion in terms of symbol error rate when compared with the established MMDDT and MMSE approaches.
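To make the greedy-search component concrete, here is a minimal hedged sketch in Python: it fills a discrete precoding vector one entry at a time, always choosing the alphabet symbol with the lowest objective value. The objective and the 1-bit QPSK-like alphabet are illustrative placeholders, not the paper's symbol-error-probability criterion.

```python
import numpy as np

# Hedged sketch of greedy search over a discrete precoder alphabet.
# f(x) is a placeholder for the paper's symbol-error-probability objective.

def greedy_discrete_search(f, alphabet, n):
    """Fill the n entries of x one at a time, each time picking the
    alphabet symbol that yields the lowest objective value."""
    x = np.full(n, alphabet[0], dtype=complex)
    for i in range(n):
        def cost(sym):
            trial = x.copy()
            trial[i] = sym
            return f(trial)
        x[i] = min(alphabet, key=cost)
    return x

# Toy usage: match a target vector t under a coarse 1-bit (QPSK-like) alphabet.
alphabet = [(a + 1j * b) / np.sqrt(2) for a in (-1, 1) for b in (-1, 1)]
t = np.array([0.7 + 0.7j, -0.6 + 0.8j, 0.1 - 0.9j])
x = greedy_discrete_search(lambda v: np.linalg.norm(v - t), alphabet, 3)
```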
5G technology allows heterogeneous services to share the wireless spectrum within the same radio access network. In this context, spectrum slicing of the shared radio resources is a critical task for guaranteeing the performance of each service. We analyze a downlink communication system serving two types of traffic: enhanced mobile broadband (eMBB) and ultra-reliable low-latency communication (URLLC). Due to the nature of low-latency traffic, the base station knows the channel state information (CSI) of the eMBB users but has only statistical CSI for the URLLC users. We study the power minimization problem under orthogonal multiple access (OMA) and non-orthogonal multiple access (NOMA) schemes. Based on this analysis, we propose two algorithms: one based on a lookup table and one based on block coordinate descent (BCD). We show that the BCD algorithm is optimal for the URLLC power allocation. The numerical results show that NOMA leads to lower power consumption than OMA, except when the average channel gain of the URLLC user is very high. In the latter case, the optimal approach depends on the channel condition of the eMBB user. Even when OMA attains the best performance, the gap to NOMA is negligible, which shows the capability of NOMA to reduce power consumption in practically every condition.
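The BCD algorithm mentioned above alternates minimizations over blocks of variables. The following is a minimal, generic sketch, with a hypothetical cost function standing in for the paper's power-minimization objective and rate constraints.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Generic block coordinate descent (BCD) sketch: alternate scalar
# minimizations over the eMBB and URLLC powers while holding the other
# fixed. The cost below is a hypothetical placeholder, not the paper's
# exact objective or constraints.

def bcd_power_allocation(cost, p_e=1.0, p_u=1.0, p_max=10.0, iters=50):
    for _ in range(iters):
        p_e = minimize_scalar(lambda p: cost(p, p_u),
                              bounds=(0.0, p_max), method='bounded').x
        p_u = minimize_scalar(lambda p: cost(p_e, p),
                              bounds=(0.0, p_max), method='bounded').x
    return p_e, p_u

def toy_cost(p_e, p_u):
    # Total power plus penalties when (hypothetical) rate targets are unmet.
    return (p_e + p_u
            + 5.0 * max(0.0, 1.0 - np.log2(1.0 + p_e))
            + 5.0 * max(0.0, 0.5 - np.log2(1.0 + p_u)))

print(bcd_power_allocation(toy_cost))
```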
Massive grant-free multiple access is a valuable research topic for next-generation multiple access, since it significantly reduces control signaling overhead and transmission latency. This paper constructs a novel uniquely-decodable multi-amplitude sequence (UDAS) set for grant-free multiple-access systems, which provides high spectral efficiency (SE) without additional redundancy and enables low-complexity active user detection (AUD). We first propose a UDAS-based multi-dimensional bit-interleaved coded modulation (MD-BICM) transmitter. We then give a detailed definition of the UDAS and provide three conditions for constructing a UDAS set. Next, two kinds of UDAS sets are constructed based on cyclic and quasi-cyclic matrix modes, and several important properties of the cyclic/quasi-cyclic UDAS sets are derived. In addition, we present a statistics-of-UDAS-features AUD algorithm (SoF-AUD), together with a joint multiuser detection and improved message-passing iterative decoding algorithm for the proposed system. Finally, the active user error rate (AUER) and Shannon limits of the proposed system are derived in detail. Simulation results show that the AUER of the proposed system can reach an extremely low value ($10^{-5}$) when $E_b/N_0$ is 0 dB and the transmit block length exceeds a given value (e.g., 784). Meanwhile, the SE of the proposed system is comparable to that of well-designed non-orthogonal multiple-access (NOMA) codebooks, verifying its validity and flexibility.
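As a hedged illustration of the unique-decodability idea behind a UDAS set (the paper's formal definition and cyclic/quasi-cyclic constructions are more refined), the brute-force check below verifies that every subset of active users produces a distinct noiseless superimposed sum.

```python
import itertools
import numpy as np

# Brute-force unique-decodability check over a noiseless adder channel:
# a sequence set is uniquely decodable (in this simplified sense) if all
# subsets of active users yield distinct superimposed sums. The sequences
# below are hypothetical multi-amplitude examples, not a UDAS from the paper.

def uniquely_decodable(seqs):
    sums = set()
    for r in range(len(seqs) + 1):
        for subset in itertools.combinations(seqs, r):
            s = tuple(np.sum(subset, axis=0)) if subset else (0,) * len(seqs[0])
            if s in sums:          # two activity patterns collide
                return False
            sums.add(s)
    return True

seqs = [np.array([1, 2]), np.array([2, 1]), np.array([4, 4])]
print(uniquely_decodable(seqs))    # True: all 8 subset sums are distinct
```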
Motivated by the recent increased interest in optimization algorithms for non-convex optimization, in applications to training deep neural networks and other optimization problems in data analysis, we give an overview of recent theoretical results on global performance guarantees of optimization algorithms for non-convex optimization. We start with classical arguments showing that, in general, non-convex problems cannot be solved efficiently in reasonable time. Then we give a list of problems that can be solved efficiently to find the global minimizer by exploiting the structure of the problem as much as possible. Another way to deal with non-convexity is to relax the goal from finding the global minimum to finding a stationary point or a local minimum. For this setting, we first present known results on the convergence rates of deterministic first-order methods, followed by a general theoretical analysis of optimal stochastic and randomized gradient schemes and an overview of stochastic first-order methods. After that, we discuss quite general classes of non-convex problems, such as the minimization of $\alpha$-weakly-quasi-convex functions and of functions satisfying the Polyak--Lojasiewicz condition, for which first-order methods still admit theoretical convergence guarantees. Finally, we consider higher-order and zeroth-order/derivative-free methods and their convergence rates for non-convex optimization problems.
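A small numerical illustration of the Polyak--Lojasiewicz (PL) setting may help: the function $f(x) = x^2 + 3\sin^2(x)$ is a standard example that is non-convex yet satisfies the PL condition, so plain gradient descent with step size $1/L$ still converges to the global minimizer at a linear rate.

```python
import numpy as np

# Gradient descent on a non-convex function satisfying the PL condition:
# f(x) = x^2 + 3 sin^2(x) has x* = 0 as its only stationary point, and PL
# guarantees linear convergence of f(x_k) - f* despite non-convexity.

f  = lambda x: x**2 + 3.0 * np.sin(x)**2
df = lambda x: 2.0 * x + 3.0 * np.sin(2.0 * x)   # derivative of f

L = 8.0               # valid smoothness constant: |f''(x)| = |2 + 6 cos(2x)| <= 8
x = 2.5
for _ in range(100):
    x -= df(x) / L    # gradient descent with step size 1/L
print(x, f(x))        # x approaches 0, f approaches its global minimum 0
```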
This paper studies the multi-user precoding problem as a non-convex optimization problem for wireless multiple-input multiple-output (MIMO) systems. In our work, we approximate the target spectral efficiency function with a novel, computationally simpler function. We then reduce the precoding problem to an unconstrained optimization task using a special differential projection method and solve it with the quasi-Newton L-BFGS iterative procedure to achieve gains in capacity. We test the proposed approach in several scenarios generated with Quadriga, an open-source software package for generating realistic radio channel impulse responses. Our method shows monotonic improvement over heuristic methods with reasonable computation time. The proposed L-BFGS optimization scheme is novel in this area and shows a significant advantage over standard approaches. The proposed method has a simple implementation and can serve as a good reference for other heuristic algorithms in this field.
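Since the abstract describes solving an unconstrained surrogate with quasi-Newton L-BFGS, here is a hedged toy in SciPy: a single-link MIMO log-det rate surrogate (not the paper's exact objective or projection method) optimized with L-BFGS over the real and imaginary parts of the precoder.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative toy: tune a precoder W by maximizing
# log2 det(I + H W W^H H^H / sigma^2) with L-BFGS, treating the real and
# imaginary parts of W as one unconstrained real vector. This stands in for
# the paper's spectral-efficiency surrogate, which is more elaborate.

rng = np.random.default_rng(0)
n_rx, n_tx, n_streams, sigma2 = 4, 8, 4, 1.0
H = (rng.standard_normal((n_rx, n_tx))
     + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)

def unpack(w):
    return (w[:n_tx * n_streams] + 1j * w[n_tx * n_streams:]).reshape(n_tx, n_streams)

def neg_rate(w):
    W = unpack(w)
    W = W / np.linalg.norm(W)          # crude total-power normalization
    M = np.eye(n_rx) + H @ W @ W.conj().T @ H.conj().T / sigma2
    return -np.log2(np.linalg.det(M).real)

res = minimize(neg_rate, rng.standard_normal(2 * n_tx * n_streams),
               method='L-BFGS-B')      # gradient approximated by finite differences
print(-res.fun)                        # achieved rate (bits/s/Hz) of the toy model
```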
This paper studies distributed binary testing of statistical independence under communication (information bit) constraints. While testing independence is relevant in various applications, distributed independence testing is particularly useful for event detection in sensor networks, where data correlation often arises among the observations of devices in the presence of a signal of interest. Focusing on the case of two devices because of its tractability, we begin by investigating conditions on the Type I error probability restrictions under which the minimum Type II error probability admits an exponential behavior in the sample size. We then study the finite sample-size regime of this problem and derive new upper and lower bounds on the gap between the minimum Type II error and its exponential approximation under different setups, including restrictions imposed on the vanishing Type I error probability. Our theoretical results shed light on the sample-size regimes at which approximations of the Type II error probability via error exponents become informative enough, in the sense of predicting the actual error probability well. We finally discuss an application of our results in which the gap is evaluated numerically, and we show that exponential approximations are not only tractable but also a valuable proxy for the Type II error probability in the finite-length regime.
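As a hedged illustration of how such exponential approximations are used, the snippet below evaluates the classical centralized proxy: by Stein's lemma, the optimal Type II error for testing a joint distribution against the product of its marginals decays as $\exp(-n\, I(X;Y))$. The distributed setting of the paper replaces the mutual information with a communication-constrained exponent.

```python
import numpy as np

# Centralized analogue (illustrative only): testing P_XY against P_X * P_Y
# has Stein exponent D(P_XY || P_X P_Y) = I(X;Y), so exp(-n * I) serves as
# the exponential proxy for the minimum Type II error probability.

P = np.array([[0.4, 0.1],
              [0.1, 0.4]])            # joint pmf P_XY of two binary sources
Px, Py = P.sum(axis=1), P.sum(axis=0)
I = sum(P[i, j] * np.log(P[i, j] / (Px[i] * Py[j]))
        for i in range(2) for j in range(2))   # mutual information (nats)
for n in (100, 500, 1000):
    print(n, np.exp(-n * I))          # exponential proxy for the Type II error
```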
The interpretation of Deep Neural Network (DNN) training as an optimal control problem over a nonlinear dynamical system has received considerable attention recently, yet the algorithmic development remains relatively limited. In this work, we take a step along this line by reformulating the training procedure from a trajectory optimization perspective. We first show that most widely used algorithms for training DNNs can be linked to Differential Dynamic Programming (DDP), a celebrated second-order trajectory optimization algorithm rooted in Approximate Dynamic Programming. In this vein, we propose a new variant of DDP that accepts batch optimization for training feedforward networks while integrating naturally with recent progress in curvature approximation. The resulting algorithm features layer-wise feedback policies which improve the convergence rate and reduce sensitivity to hyper-parameters relative to existing methods. We show that the algorithm is competitive with state-of-the-art first- and second-order methods. Our work opens up new avenues for principled algorithmic design built upon optimal control theory.
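To ground the "layer-wise feedback policies" idea, here is a sketch of the DDP backward pass specialized to a time-varying LQR problem: for linear dynamics $x_{t+1} = A_t x_t + B_t u_t$ and quadratic costs, the recursion reduces to the Riccati equation and yields per-stage feedback gains $K_t$. This is a generic control-theoretic illustration, not the paper's network-training formulation.

```python
import numpy as np

# DDP/iLQR backward pass on a time-varying LQR problem: the value-function
# Hessian V is propagated backward via the Riccati recursion, producing a
# feedback gain K_t at every stage (analogous to one gain per layer).

def ddp_lqr_backward(A, B, Q, R, QT):
    V = QT                                   # terminal value-function Hessian
    gains = [None] * len(A)
    for t in reversed(range(len(A))):
        Quu = R[t] + B[t].T @ V @ B[t]
        Qux = B[t].T @ V @ A[t]
        K = -np.linalg.solve(Quu, Qux)       # feedback policy u_t = K @ x_t
        gains[t] = K
        V = Q[t] + A[t].T @ V @ A[t] + Qux.T @ K   # Riccati update
    return gains

# Toy usage: a 3-stage scalar system with identity dynamics and costs.
I = np.eye(1)
print(ddp_lqr_backward([I] * 3, [I] * 3, [I] * 3, [I] * 3, I))
```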
We propose accelerated randomized coordinate descent algorithms for stochastic optimization and online learning. Our algorithms have significantly lower per-iteration complexity than the known accelerated gradient algorithms. The proposed algorithms for online learning achieve better regret performance than the known randomized online coordinate descent algorithms. Furthermore, the proposed algorithms for stochastic optimization exhibit convergence rates as good as those of the best known randomized coordinate descent algorithms. We also present simulation results to demonstrate the performance of the proposed algorithms.
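For background on the technique the abstract builds on, here is one standard accelerated randomized coordinate descent scheme (in the spirit of Fercoq and Richtarik's APPROX with block size 1), shown on a smooth convex quadratic; this is illustrative, not the paper's exact algorithm.

```python
import numpy as np

# Accelerated randomized coordinate descent on f(y) = 0.5*y'Ay - b'y.
# Each iteration touches one coordinate, so the per-iteration cost is O(n)
# here (and would be O(1)-ish for sparse/structured problems).

rng = np.random.default_rng(1)
n = 50
A = rng.standard_normal((n, n)); A = A.T @ A + np.eye(n)   # positive definite
b = rng.standard_normal(n)
L = np.diag(A).copy()               # coordinate-wise Lipschitz constants

def grad_i(y, i):                   # i-th partial derivative of f
    return A[i] @ y - b[i]

x = np.zeros(n); z = x.copy(); theta = 1.0 / n
for _ in range(20000):
    y = (1.0 - theta) * x + theta * z
    i = rng.integers(n)
    dz = -grad_i(y, i) / (n * theta * L[i])
    z[i] += dz                      # coordinate update of the "z" sequence
    x = y; x[i] += n * theta * dz   # momentum-corrected iterate
    theta = (np.sqrt(theta**4 + 4.0 * theta**2) - theta**2) / 2.0
print(np.linalg.norm(A @ x - b))    # residual shrinks at the accelerated rate
```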
In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of the local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm, called multi-step primal-dual (MSPD), and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate, even for non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS), based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
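The core of the smoothing idea behind DRS can be sketched as follows: replace a non-smooth $f$ by $f_\gamma(x) = \mathbb{E}[f(x + \gamma Z)]$ with Gaussian $Z$, whose gradient admits a Monte Carlo estimate. This is a minimal sketch of the smoothing step only; the step sizes and the distributed protocol of the paper are not reproduced here.

```python
import numpy as np

# Monte Carlo estimate of grad f_gamma(x), where f_gamma is the Gaussian
# smoothing of f; subtracting f(x) (whose Z-average contributes zero) is a
# standard variance-reduction trick.

def smoothed_grad(f, x, gamma=0.1, n_samples=100, rng=np.random.default_rng()):
    g = np.zeros_like(x)
    fx = f(x)
    for _ in range(n_samples):
        z = rng.standard_normal(x.shape)
        g += (f(x + gamma * z) - fx) / gamma * z
    return g / n_samples

# Example on a non-smooth convex function: f(x) = ||x||_1.
f = lambda x: np.abs(x).sum()
x = np.array([1.0, -2.0, 0.5])
x_new = x - 0.1 * smoothed_grad(f, x)   # one smoothed (sub)gradient step
```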
In this paper, we study the optimal convergence rates for distributed convex optimization problems over networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is: strongly convex and smooth, strongly convex only, smooth only, or merely convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and attains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
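For reference, the centralized building block is standard Nesterov acceleration; a minimal sketch (with the usual momentum sequence, on a toy quadratic rather than the paper's dual problem) follows.

```python
import numpy as np

# Generic Nesterov accelerated gradient descent for an L-smooth convex
# objective, given only its gradient oracle.

def nesterov_agd(grad, x0, L, iters=500):
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_next = y - grad(y) / L                        # gradient step at the extrapolated point
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + (t - 1.0) / t_next * (x_next - x)  # momentum extrapolation
        x, t = x_next, t_next
    return x

# Toy usage: minimize 0.5*y'Ay - b'y, whose minimizer is A^{-1} b = [1.0, 0.1].
A = np.diag([1.0, 10.0]); b = np.array([1.0, 1.0])
print(nesterov_agd(lambda y: A @ y - b, np.zeros(2), L=10.0))
```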
The Residual Networks of Residual Networks (RoR) architecture exhibits excellent performance in image classification, but sharply increasing the number of feature-map channels makes the transmission of characteristic information incoherent, which loses some of the information relevant to classification prediction and limits classification performance. In this paper, a Pyramidal RoR network model is proposed by analyzing the performance characteristics of RoR and combining it with PyramidNet. First, based on RoR, the Pyramidal RoR network model with gradually increasing channels is designed. Second, we analyze the effect of different residual block structures on performance and choose the residual block structure that best favors classification performance. Finally, we apply an additional principle to further optimize Pyramidal RoR networks: drop-path is used to avoid over-fitting and save training time. Image classification experiments were performed on the CIFAR-10/100 and SVHN datasets, where we achieved the lowest classification error rates at the time: 2.96%, 16.40%, and 1.59%, respectively. Experiments show that the Pyramidal RoR network optimization method can improve network performance on different datasets and effectively mitigate the vanishing gradient problem in DCNN training.
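A minimal PyTorch sketch of a PyramidNet-style residual block (pre-activation layout, zero-padded identity shortcut when the channel count grows) conveys the kind of unit a Pyramidal RoR network stacks; the exact block structure and drop-path schedule chosen in the paper may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# PyramidNet-style block: channels may grow from in_ch to out_ch, with the
# identity shortcut widened by zero-padded channels instead of a projection.

class PyramidalBlock(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, 1, 1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_ch)
        self.stride = stride
        self.pad_ch = out_ch - in_ch   # extra channels to zero-pad on the shortcut

    def forward(self, x):
        out = self.conv1(self.bn1(x))                    # no ReLU before the first conv
        out = self.bn3(self.conv2(F.relu(self.bn2(out))))
        shortcut = F.avg_pool2d(x, self.stride) if self.stride > 1 else x
        if self.pad_ch > 0:                              # widen the identity path
            shortcut = F.pad(shortcut, (0, 0, 0, 0, 0, self.pad_ch))
        return out + shortcut

y = PyramidalBlock(16, 20)(torch.randn(2, 16, 32, 32))   # channels grow 16 -> 20
```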