A class of optimal three-weight cyclic codes of dimension 3 over any finite field was presented by Vega [Finite Fields Appl., 42 (2016) 23-38]. Shortly thereafter, Heng and Yue [IEEE Trans. Inf. Theory, 62(8) (2016) 4501-4513] generalized this result by presenting several classes of cyclic codes with either optimal three weights or a few weights. On the other hand, a class of optimal five-weight cyclic codes of dimension 4 over a prime field was recently presented by Li et al. [Adv. Math. Commun., 13(1) (2019) 137-156]. One of the purposes of this work is to present a more general description of these optimal five-weight cyclic codes, which gives rise to an enlarged class of optimal five-weight cyclic codes of dimension 4 over any finite field. As an application of this enlarged class, we present the complete weight enumerator of a subclass of the optimal three-weight cyclic codes over any finite field that were studied by Vega [Finite Fields Appl., 42 (2016) 23-38]. In addition, we study the dual codes in this enlarged class of optimal five-weight cyclic codes and show that they are cyclic codes of length $q^2-1$, dimension $q^2-5$, and minimum Hamming distance 4. In fact, through several examples, we observe that these parameters match the best known parameters for linear codes.
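As a quick arithmetic illustration of the dual parameters stated above (a direct specialization of $[q^2-1,\, q^2-5,\, 4]$, not an additional result):

\[
q = 3:\ [8,\, 4,\, 4] \ \text{over}\ \mathbb{F}_3, \qquad
q = 4:\ [15,\, 11,\, 4] \ \text{over}\ \mathbb{F}_4, \qquad
q = 5:\ [24,\, 20,\, 4] \ \text{over}\ \mathbb{F}_5.
\]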
In this paper, we study the convergence properties of a randomized block-coordinate descent algorithm for the minimization of a composite convex objective function, where the block-coordinates are updated asynchronously and randomly according to an arbitrary probability distribution. We prove that the iterates generated by the algorithm form a stochastic quasi-Fej\'er sequence and thus converge almost surely to a minimizer of the objective function. Moreover, we prove a general sublinear rate of convergence in expectation for the function values and a linear rate of convergence in expectation under an error bound condition of Tseng type.
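A minimal synchronous sketch of the update rule under study, assuming a composite objective $f(x) + g(x)$ with $f = \tfrac{1}{2}\|Ax-b\|^2$ and separable $g = \lambda\|x\|_1$; the test problem, the block partition, and the sampling distribution `p` are illustrative choices, not the paper's exact setting:

```python
import numpy as np

# Randomized block-coordinate proximal gradient sketch for
# min_x f(x) + g(x), with f = 0.5*||Ax - b||^2 (smooth) and
# g = lam*||x||_1 (separable); blocks are sampled from an
# arbitrary probability distribution p.
rng = np.random.default_rng(0)
n, d, n_blocks = 200, 50, 10
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
lam = 0.1
blocks = np.array_split(np.arange(d), n_blocks)
p = rng.random(n_blocks)
p /= p.sum()                                   # arbitrary block probabilities
L = [np.linalg.norm(A[:, blk], 2) ** 2 for blk in blocks]  # block Lipschitz constants

x = np.zeros(d)
for t in range(5000):
    i = rng.choice(n_blocks, p=p)              # sample one block at random
    blk = blocks[i]
    grad = A[:, blk].T @ (A @ x - b)           # partial gradient of f
    z = x[blk] - grad / L[i]                   # forward (gradient) step on the block
    x[blk] = np.sign(z) * np.maximum(np.abs(z) - lam / L[i], 0.0)  # prox of lam*|.|_1
```

Because $g$ is separable, the proximal step decouples across blocks, which is what makes updating an arbitrarily sampled block at a time possible.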
This paper is concerned with the lossy compression of general random variables, specifically with rate-distortion theory and quantization of random variables taking values in general measurable spaces such as manifolds and fractal sets. Manifold structures are prevalent in data science, e.g., in compressed sensing, machine learning, image processing, and handwritten digit recognition. Fractal sets find application in image compression and in the modeling of Ethernet traffic. Our main contributions are bounds on the rate-distortion function and the quantization error. These bounds are very general and essentially require only the existence of reference measures satisfying certain regularity conditions in terms of small ball probabilities. To illustrate the wide applicability of our results, we particularize them to random variables taking values in i) manifolds, namely, hyperspheres and Grassmannians, and ii) self-similar sets characterized by iterated function systems satisfying the weak separation property.
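As a toy numerical illustration only (not the bounds derived in the paper), one can estimate the quantization error on a hypersphere by Monte Carlo with a random codebook; the codebook size and the squared Euclidean distortion measure are arbitrary choices:

```python
import numpy as np

# Empirical quantization error on the unit hypersphere S^{d-1}:
# quantize uniformly random points to the nearest of R random codewords
# and average the squared Euclidean distortion.
rng = np.random.default_rng(1)
d, R, n_samples = 8, 256, 2000

def sphere_uniform(n, d, rng):
    v = rng.standard_normal((n, d))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

codebook = sphere_uniform(R, d, rng)
x = sphere_uniform(n_samples, d, rng)
dists = np.linalg.norm(x[:, None, :] - codebook[None, :, :], axis=2)
distortion = np.mean(np.min(dists, axis=1) ** 2)   # empirical quantization error
print(f"rate = {np.log2(R):.0f} bits, distortion ~ {distortion:.4f}")
```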
We consider the construction of a polyhedral Delaunay partition as a limit of a sequence of power diagrams (radical partitions); the dual Voronoi diagram is obtained as a limit of a sequence of weighted Delaunay partitions. The problem is reduced to the construction of two dual convex polyhedra, inscribed in and superscribed around a circular paraboloid, as a limit of a sequence of pairs of general dual convex polyhedra. The sequence of primal polyhedra should converge to the superscribed polyhedron, and the sequence of dual polyhedra to the inscribed one. We are interested in the case when the vertices of the primal polyhedra can move or merge, i.e., no new faces are allowed for the dual polyhedra. These rules define the transformation of the set of initial spheres into the set of Delaunay spheres via radius variation and sphere movement and elimination. Existence theorems are still unavailable, but we suggest a functional measuring the deviation of a convex polyhedron from one inscribed in the paraboloid: the discrete Dirichlet functional for the power function, which is a linear interpolant of the vertical distance of the dual vertices from the paraboloid. The absolute minimizer of the functional is attained on a constant power field, meaning that the inscribed polyhedron can be obtained by a simple translation. This formulation of the functional for the dual surface is not quadratic, since the unknowns are the vertices of the primal polyhedron; hence, the transformation of the set of spheres into Delaunay spheres is not unique. We concentrate on experimental confirmation of the viability of the approach and set aside mesh quality concerns. The zero set of the gradient of the proposed functional defines a manifold describing the evolution of the Delaunay spheres; hence, Delaunay-Voronoi meshes can be optimized using this manifold as a constraint.
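A minimal sketch of the classical lifting construction underlying this paraboloid view (illustrative, not the optimization procedure proposed here): the weighted Delaunay (regular) triangulation of weighted planar points is the projection of the lower convex hull of the points lifted by $z = \|x\|^2 - w$.

```python
import numpy as np
from scipy.spatial import ConvexHull

# Regular (weighted Delaunay) triangulation via the paraboloid lifting:
# lift (x, y) with weight w to z = x^2 + y^2 - w, take the lower convex hull.
rng = np.random.default_rng(2)
pts = rng.random((30, 2))
w = 0.05 * rng.random(30)                      # powers (squared sphere radii)

lifted = np.column_stack([pts, (pts ** 2).sum(axis=1) - w])
hull = ConvexHull(lifted)
# Keep facets whose outward normal points downward: these form the lower hull,
# and their vertex triples are the weighted Delaunay triangles.
lower = hull.simplices[hull.equations[:, 2] < 0]
print(f"{len(lower)} weighted Delaunay triangles")
```

Setting all weights equal recovers the ordinary Delaunay triangulation, which is the "constant power field" translation case mentioned above.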
We consider a class of statistical estimation problems in which we are given a random data matrix ${\boldsymbol X}\in {\mathbb R}^{n\times d}$ (and possibly some labels ${\boldsymbol y}\in{\mathbb R}^n$) and would like to estimate a coefficient vector ${\boldsymbol \theta}\in{\mathbb R}^d$ (or possibly a constant number of such vectors). Special cases include low-rank matrix estimation and regularized estimation in generalized linear models (e.g., sparse regression). First order methods proceed by iteratively multiplying current estimates by ${\boldsymbol X}$ or its transpose; examples include gradient descent and its accelerated variants. Celentano, Montanari, and Wu proved that, for any constant number of iterations (matrix-vector multiplications), the optimal first order algorithm is a specific approximate message passing algorithm (known as `Bayes AMP'). The error of this estimator can be characterized in the high-dimensional asymptotics $n,d\to\infty$, $n/d\to\delta$, and provides a lower bound on the estimation error of any first order algorithm. Here we present a simpler proof of the same result and generalize it to broader classes of data distributions and of first order algorithms, including algorithms with non-separable nonlinearities. Most importantly, the new proof technique does not require the construction of an equivalent tree-structured estimation problem and is therefore amenable to a broader range of applications.
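A minimal sketch of the algorithm class in question (the nonlinearities below are arbitrary placeholders, not Bayes AMP): a first order method touches the data only through products with ${\boldsymbol X}$ and its transpose, interleaved with nonlinear maps of the iterates.

```python
import numpy as np

# Generic first order method: the data matrix X is accessed only via
# matrix-vector products with X and X^T, interleaved with nonlinearities.
rng = np.random.default_rng(3)
n, d = 500, 250
X = rng.standard_normal((n, d)) / np.sqrt(n)

theta = rng.standard_normal(d)       # initial estimate
for t in range(10):
    v = np.tanh(X @ theta)           # nonlinearity g_t (placeholder choice)
    theta = np.tanh(X.T @ v)         # nonlinearity f_t (placeholder choice)
    theta /= np.linalg.norm(theta)   # keep iterates on a fixed scale
```

Gradient descent, its accelerated variants, and AMP all fit this template; they differ only in which (possibly memory-dependent) nonlinearities are applied between the matrix-vector products.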
In this article we prove that a class of Goppa codes whose Goppa polynomial is of the form $g(x) = x + x^q + \cdots + x^{q^{m-1}}$ with $m \geq 3$ (i.e., $g(x)$ is a trace polynomial from a field extension of degree $m \geq 3$) has a better minimum distance than the Goppa bound $d \geq 2\deg(g(x))+1$ implies. Our improvement is based on finding another Goppa polynomial $h$ such that $C(L,g) = C(M, h)$ but $\deg(h) > \deg(g)$. This is a significant improvement over trace Goppa codes over quadratic field extensions (i.e., the case $m = 2$), for which the Goppa bound is sharp.
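For concreteness (straightforward arithmetic from the statement above, not an additional result): since $g(x) = x + x^q + \cdots + x^{q^{m-1}}$ has degree $q^{m-1}$, the Goppa bound reads

\[
d \;\geq\; 2\deg(g(x)) + 1 \;=\; 2q^{m-1} + 1,
\]

so exhibiting an equivalent representation with a Goppa polynomial $h$ of larger degree immediately strengthens the bound to $d \geq 2\deg(h) + 1$.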
Resolving a conjecture of Littlestone and Warmuth, we show that any concept class of VC-dimension $d$ has a sample compression scheme of size $d$.
We present and analyze a momentum-based gradient method for training linear classifiers with an exponentially-tailed loss (e.g., the exponential or logistic loss), which maximizes the classification margin on separable data at a rate of $\widetilde{\mathcal{O}}(1/t^2)$. This contrasts with a rate of $\mathcal{O}(1/\log(t))$ for standard gradient descent and $\mathcal{O}(1/t)$ for normalized gradient descent. The momentum-based method is derived via the convex dual of the maximum-margin problem, specifically by applying Nesterov acceleration to this dual, which yields a simple and intuitive method in the primal. This dual view can also be used to derive a stochastic variant, which performs adaptive non-uniform sampling via the dual variables.
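A minimal sketch of margin maximization with a momentum method on the exponential loss (a generic Nesterov-style primal iteration for illustration; the paper's method is derived from the dual and differs in its exact updates):

```python
import numpy as np

# Momentum (Nesterov-style) gradient descent on the exponential loss
# L(w) = mean_i exp(-y_i <x_i, w>) for linearly separable data; the
# normalized margin min_i y_i <x_i, w> / ||w|| should grow during training.
rng = np.random.default_rng(4)
n, d = 100, 5
w_star = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = np.sign(X @ w_star)                        # separable labels by construction

w, w_prev, lr = np.zeros(d), np.zeros(d), 0.1
for t in range(1, 2000):
    z = w + (t - 1) / (t + 2) * (w - w_prev)   # momentum / extrapolation step
    grad = -(X * (y * np.exp(-y * (X @ z)))[:, None]).mean(axis=0)
    w_prev, w = w, z - lr * grad

margin = np.min(y * (X @ w)) / np.linalg.norm(w)
print(f"normalized margin ~ {margin:.3f}")
```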
Lidar-based 3D object detection is indispensable for autonomous driving, because it directly supports environmental understanding and thereby underpins prediction and motion planning. The ability to infer highly sparse 3D data in real time is a challenging problem for many application areas beyond automated vehicles, e.g., augmented reality, personal robotics, or industrial automation. We introduce Complex-YOLO, a state-of-the-art real-time 3D object detection network operating on point clouds only. In this work, we describe a network that expands YOLOv2, a fast 2D standard object detector for RGB images, by a specific complex regression strategy to estimate multi-class 3D boxes in Cartesian space. To this end, we propose a specific Euler-Region-Proposal Network (E-RPN) that estimates the pose of an object by adding an imaginary and a real fraction to the regression network. This yields a closed complex space and avoids the singularities that arise from single-angle estimation. The E-RPN also helps the network generalize well during training. Our experiments on the KITTI benchmark suite show that we outperform current leading methods for 3D object detection, specifically in terms of efficiency. We achieve state-of-the-art results for cars, pedestrians, and cyclists while being more than five times faster than the fastest competitor. Furthermore, our model is capable of estimating all eight KITTI classes, including vans, trucks, and sitting pedestrians, simultaneously and with high accuracy.
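A minimal sketch of the complex angle regression at the heart of the E-RPN (function and variable names here are hypothetical): the network regresses a real and an imaginary fraction of the yaw, and the angle is recovered with atan2, avoiding the wrap-around singularity of direct angle regression.

```python
import numpy as np

# Complex yaw encoding: regress (t_re, t_im) instead of a raw angle, and
# decode with atan2 so there is no discontinuity at the +/-pi boundary.
def encode_yaw(yaw):
    return np.cos(yaw), np.sin(yaw)          # real and imaginary fractions

def decode_yaw(t_re, t_im):
    return np.arctan2(t_im, t_re)            # always a valid angle in (-pi, pi]

yaw = np.deg2rad(175.0)
t_re, t_im = encode_yaw(yaw)
assert np.isclose(decode_yaw(t_re, t_im), yaw)
```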
Generative adversarial networks (GANs) have evolved into one of the most successful unsupervised techniques for generating realistic images. Even though it has recently been shown that GAN training converges, GAN models often end up in local Nash equilibria that are associated with mode collapse, or otherwise fail to model the target distribution. We introduce Coulomb GANs, which pose the GAN learning problem as a potential field of charged particles, where generated samples are attracted to training set samples but repel each other. The discriminator learns a potential field, while the generator decreases the energy by moving its samples along the vector (force) field determined by the gradient of the potential field. By decreasing the energy, the GAN model learns to generate samples according to the whole target distribution, rather than covering only some of its modes. We prove that Coulomb GANs possess only one Nash equilibrium, which is optimal in the sense that the model distribution equals the target distribution. We show the efficacy of Coulomb GANs on a variety of image datasets. On LSUN and CelebA, Coulomb GANs set a new state of the art and produce a previously unseen variety of different samples.
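A minimal sketch of the potential-field idea (the smoothed kernel, its epsilon, and the finite-difference gradient are illustrative assumptions; the actual method learns the potential with a discriminator network): real samples attract, generated samples repel, and generated points move along the gradient of the resulting potential.

```python
import numpy as np

# Potential created by positive "charges" at real samples and negative
# "charges" at generated samples, using a smoothed kernel.
def kernel(a, b, eps=1e-2):
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return 1.0 / np.sqrt(d ** 2 + eps)

def potential(x, real, fake):
    # Attraction toward real samples, repulsion from other generated samples.
    return kernel(x, real).mean(axis=1) - kernel(x, fake).mean(axis=1)

rng = np.random.default_rng(5)
real = rng.normal(0.0, 1.0, size=(200, 2))
fake = rng.normal(3.0, 1.0, size=(200, 2))

# Move generated samples along a finite-difference gradient of the potential.
step, h = 0.5, 1e-4
for _ in range(50):
    grad = np.stack([(potential(fake + h * e, real, fake)
                      - potential(fake - h * e, real, fake)) / (2 * h)
                     for e in np.eye(2)], axis=1)
    fake = fake + step * grad            # ascend the potential toward real data
```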
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is (i) strongly convex and smooth, (ii) strongly convex, (iii) smooth, or (iv) merely convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and attains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improved dependence on the condition numbers.
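A minimal sketch on a toy consensus problem (the quadratic local losses and the ring graph are illustrative assumptions, not the paper's general setting): Nesterov's method runs on the dual of the consensus constraint, and each dual gradient evaluation costs one local computation plus multiplications by the graph Laplacian, i.e., rounds of neighbor communication.

```python
import numpy as np

# Accelerated dual ascent for consensus: min sum_i 0.5*(x_i - a_i)^2 s.t. Lx = 0,
# where L is the Laplacian of a ring graph. Multiplying by L is the only
# operation that requires communication between neighboring nodes.
m = 10
a = np.random.default_rng(6).standard_normal(m)   # local data, one scalar per node
L = 2 * np.eye(m) - np.roll(np.eye(m), 1, 0) - np.roll(np.eye(m), -1, 0)

lip = np.linalg.eigvalsh(L).max() ** 2            # Lipschitz const. of dual gradient
lam = np.zeros(m)
lam_prev = np.zeros(m)
for t in range(1, 5001):
    z = lam + (t - 1) / (t + 2) * (lam - lam_prev)   # Nesterov extrapolation
    x = a - L @ z                                     # local argmin of the Lagrangian
    lam_prev, lam = lam, z + (L @ x) / lip            # ascend the dual along L x

x = a - L @ lam
print(f"max deviation from the average: {np.abs(x - a.mean()).max():.2e}")
```

At the dual optimum $L x = 0$, so all nodes agree, and since $\mathbf{1}^\top L = 0$ the common value is the average of the local data.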