Location Routing is a fundamental planning problem in logistics, in which strategic location decisions on the placement of facilities (depots, distribution centers, warehouses, etc.) are taken based on accurate estimates of operational routing costs. We present an approximation algorithm, i.e., an algorithm with proven worst-case guarantees both in terms of running time and solution quality, for the general capacitated version of this problem, in which both vehicles and facilities are capacitated. Until now, such algorithms were known only for the special case where facilities are uncapacitated or where their capacities can be extended arbitrarily at linear cost. Previously established lower bounds that are known to approximate the optimal solution value well in the uncapacitated case can be off by an arbitrary factor in the general case. We show that this issue can be overcome by a bifactor approximation algorithm that may slightly exceed facility capacities by an adjustable, arbitrarily small margin while approximating the optimal cost by a constant factor. In addition to these proven worst-case guarantees, we also assess the practical performance of our algorithm in a comprehensive computational study, showing that the approach allows efficient computation of near-optimal solutions for instance sizes beyond the reach of current state-of-the-art heuristics.
The Super-SAT or SSAT problem was introduced by Dinur et al. (2002, 2003) to prove the NP-hardness of approximation of two popular lattice problems: the Shortest Vector Problem (SVP) and the Closest Vector Problem (CVP). They conjectured that SSAT is NP-hard to approximate to within a factor of $n^c$ for some constant $c>0$, where $n$ is the size of the SSAT instance. In this paper we prove this conjecture assuming the Projection Games Conjecture (PGC), given by Moshkovitz (2012). This implies hardness of approximation of SVP and CVP within polynomial factors, assuming the PGC. We also reduce SSAT to the Nearest Codeword Problem (NCP) and the Learning Halfspace Problem (LHP), as considered by Arora et al. (1997). This proves that both these problems are NP-hard to approximate within a factor of $N^{c'/\log\log n}$ for some constant $c'>0$, where $N$ is the size of the instances of the respective problems. Assuming the PGC, these problems are proved to be NP-hard to approximate within polynomial factors.
Chernoff bounds are a powerful application of the Markov inequality to produce strong bounds on the tails of probability distributions. They are often used to bound the tail probabilities of sums of Poisson trials, or in regression to produce conservative confidence intervals for the parameters of such trials. The bounds provide expressions for the tail probabilities that can be inverted for a given probability/confidence to provide tail intervals. The inversions involve the solution of transcendental equations, and it is often convenient to substitute approximations that can be solved exactly, e.g., via the quadratic formula. In this paper we introduce approximations for the Chernoff bounds whose inversion can be solved exactly with a quadratic equation, but which are closer approximations than those adopted previously.
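To illustrate the inversion step described above, the sketch below inverts the classical upper-tail approximation $\Pr[X \ge (1+\delta)\mu] \le \exp(-\mu\delta^2/(2+\delta))$ rather than the tighter approximations introduced in the paper: setting the bound equal to a target tail probability yields a quadratic in $\delta$ with a closed-form positive root.

```python
import math

def invert_upper_chernoff(mu, p):
    # Invert the classical upper-tail approximation
    #   P[X >= (1 + delta) * mu] <= exp(-mu * delta^2 / (2 + delta))
    # for delta at target tail probability p.  Setting the bound equal to p
    # gives the quadratic  mu*delta^2 - t*delta - 2*t = 0  with  t = ln(1/p);
    # the positive root is returned.
    t = math.log(1.0 / p)
    return (t + math.sqrt(t * t + 8.0 * mu * t)) / (2.0 * mu)

# Example: a conservative 95% upper limit for a sum of Poisson trials with mean 50.
mu = 50.0
delta = invert_upper_chernoff(mu, 0.05)
print(f"upper tail interval endpoint: {(1 + delta) * mu:.2f}")
```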
We investigate a clustering problem with data from a mixture of Gaussians that share a common but unknown, and potentially ill-conditioned, covariance matrix. We start by considering Gaussian mixtures with two equally-sized components and derive a Max-Cut integer program based on maximum likelihood estimation. We prove that its solutions achieve the optimal misclassification rate when the number of samples grows linearly in the dimension, up to a logarithmic factor. However, solving the Max-Cut problem appears to be computationally intractable. To overcome this, we develop an efficient spectral algorithm that attains the optimal rate but requires a quadratic sample size. Although this sample complexity is worse than that of the Max-Cut problem, we conjecture that no polynomial-time method can perform better. Furthermore, we gather numerical and theoretical evidence that supports the existence of a statistical-computational gap. Finally, we generalize the Max-Cut program to a $k$-means program that handles multi-component mixtures with possibly unequal weights. It enjoys similar optimality guarantees for mixtures of distributions that satisfy a transportation-cost inequality, encompassing Gaussian and strongly log-concave distributions.
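For intuition, a generic spectral labeling step for the two-component case can be sketched as follows: center the data, take the leading direction of the empirical second-moment matrix, and split samples by the sign of their projection. This is only an illustrative baseline, not necessarily the estimator analyzed in the paper, which must additionally contend with the unknown, possibly ill-conditioned covariance.

```python
import numpy as np

def spectral_two_cluster(X):
    # Center the data and split by the sign of the projection onto the
    # leading right singular vector (illustrative spectral step only).
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return np.sign(Xc @ Vt[0]).astype(int)

# Toy usage: two Gaussians with opposite means sharing a covariance matrix.
rng = np.random.default_rng(0)
n, d = 200, 5
mu = np.ones(d)
cov = np.diag(np.linspace(0.5, 2.0, d))
X = np.vstack([rng.multivariate_normal(+mu, cov, n),
               rng.multivariate_normal(-mu, cov, n)])
labels = spectral_two_cluster(X)
```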
Block coordinate descent (BCD), also known as nonlinear Gauss-Seidel, is a simple iterative algorithm for nonconvex optimization that sequentially minimizes the objective function in each block coordinate while the other coordinates are held fixed. We propose a version of BCD that, for block multi-convex and smooth objective functions under constraints, is guaranteed to converge to stationary points with a worst-case rate of convergence of $O((\log n)^{2}/n)$ for $n$ iterations, and a bound of $O(\epsilon^{-1}(\log \epsilon^{-1})^{2})$ on the number of iterations to achieve an $\epsilon$-approximate stationary point. Furthermore, we show that these results continue to hold even when the convex sub-problems are inexactly solved, provided the optimality gaps are uniformly summable against initialization. A key idea is to restrict the parameter search within a diminishing radius to promote stability of iterates. As an application, we provide an alternating least squares algorithm with diminishing radius for nonnegative CP tensor decomposition that converges to stationary points of the reconstruction error with the same robust worst-case convergence rate and complexity bounds. We also experimentally validate our results with both synthetic and real-world data and demonstrate that using the auxiliary search-radius restriction can in fact improve the rate of convergence.
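The following is a minimal sketch of the diminishing-radius idea on a simpler stand-in problem (nonnegative matrix factorization rather than nonnegative CP tensor decomposition). The clipped least-squares block updates and the radius schedule $c/\sqrt{n+1}$ are illustrative choices under our own assumptions, not the paper's exact algorithm.

```python
import numpy as np

def clip_to_ball(new, old, radius):
    # Restrict the updated block to a Frobenius ball of given radius around
    # its previous value (the "diminishing radius" stabilization).
    diff = new - old
    nrm = np.linalg.norm(diff)
    return old + diff * (radius / nrm) if nrm > radius else new

def bcd_diminishing_radius(X, rank, n_iters=300, c=1.0, seed=0):
    # BCD sketch for X ~ W H with W, H >= 0.  Each block is updated by a
    # clipped least-squares step (a crude surrogate for the nonnegative
    # least-squares subproblem), then restricted to a shrinking ball.
    rng = np.random.default_rng(seed)
    m, p = X.shape
    W = rng.random((m, rank))
    H = rng.random((rank, p))
    for n in range(n_iters):
        radius = c / np.sqrt(n + 1)          # diminishing search radius
        W_new = np.clip(X @ H.T @ np.linalg.pinv(H @ H.T), 0, None)
        W = clip_to_ball(W_new, W, radius)   # update W with H fixed
        H_new = np.clip(np.linalg.pinv(W.T @ W) @ W.T @ X, 0, None)
        H = clip_to_ball(H_new, H, radius)   # update H with W fixed
    return W, H
```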
This paper introduces a framework to capture previously intractable optimization constraints and transform them into a mixed-integer linear program, through the use of neural networks. We encode the feasible space of optimization problems characterized by both tractable and intractable constraints, e.g., differential equations, in a neural network. Leveraging an exact mixed-integer reformulation of neural networks, we solve mixed-integer linear programs that accurately approximate solutions to the originally intractable non-linear optimization problem. We apply our methods to the AC optimal power flow problem (AC-OPF), where directly including dynamic security constraints renders the AC-OPF intractable. Our proposed approach has the potential to be significantly more scalable than traditional approaches. We demonstrate our approach for power system operation considering N-1 security and small-signal stability, showing how it can efficiently obtain cost-optimal solutions which at the same time satisfy both static and dynamic security constraints.
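Exact mixed-integer reformulations of ReLU networks typically rest on the standard big-M encoding of each unit. For a single neuron $y=\max(0, w^\top x + b)$ with known bounds $L \le w^\top x + b \le U$ and $L<0<U$, one valid encoding with a binary variable $z$ is
\[
y \ge w^\top x + b, \qquad y \ge 0, \qquad y \le w^\top x + b - L(1-z), \qquad y \le U z, \qquad z \in \{0,1\}.
\]
Stacking such constraints layer by layer turns the trained network, and hence the learned surrogate of the intractable constraints, into linear constraints over continuous and binary variables; the tightness of the bounds $L$ and $U$ strongly affects solver performance.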
Evolving diverse sets of high-quality solutions has gained increasing interest in the evolutionary computation literature in recent years. With this paper, we contribute to this area of research by examining evolutionary diversity optimisation approaches for the classical Traveling Salesperson Problem (TSP). We study the impact of using different diversity measures for a given set of tours and the ability of evolutionary algorithms to obtain a diverse set of high-quality solutions when adopting these measures. Our studies show that a large variety of diverse high-quality tours can be achieved by using our approaches. Furthermore, we compare our approaches in terms of theoretical properties and the final set of tours obtained by the evolutionary diversity optimisation algorithm.
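One natural example of such a measure is edge-based diversity, i.e., the average fraction of edges not shared between pairs of tours in the population. The sketch below illustrates this kind of measure only; it is not necessarily one of the exact measures compared in the paper.

```python
from itertools import combinations

def tour_edges(tour):
    # Undirected edge set of a tour given as a permutation of city indices.
    n = len(tour)
    return {frozenset((tour[i], tour[(i + 1) % n])) for i in range(n)}

def pairwise_edge_diversity(tours):
    # Average fraction of non-shared edges over all pairs of tours; each tour
    # on n cities has exactly n edges.
    n = len(tours[0])
    edge_sets = [tour_edges(t) for t in tours]
    pairs = list(combinations(edge_sets, 2))
    return sum(1 - len(a & b) / n for a, b in pairs) / len(pairs)

# Example: diversity of three tours on 5 cities.
print(pairwise_edge_diversity([[0, 1, 2, 3, 4], [0, 2, 4, 1, 3], [0, 1, 3, 2, 4]]))
```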
In this paper, disjunctive and conjunctive lattice piecewise affine (PWA) approximations of explicit linear model predictive control (MPC) are proposed. The training data are generated uniformly in the domain of interest and consist of state samples and the corresponding affine control laws, based on which the lattice PWA approximations are constructed. Resampling of the data is also proposed to guarantee that the lattice PWA approximations are identical to the explicit MPC control law in the unique order (UO) regions containing the sample points as interior points. Moreover, under mild assumptions, the equivalence of the two lattice PWA approximations guarantees that the approximations are error-free in the domain of interest. Algorithms for deriving a statistically error-free approximation to the explicit linear MPC control law are proposed, and the complexity of the whole procedure is analyzed, which is polynomial with respect to the number of samples. The performance of the proposed approximation strategy is tested through two simulation examples, and the results show that with a moderate number of sample points, we can construct lattice PWA approximations that are equivalent to the optimal control law of the explicit linear MPC.
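For reference, a disjunctive lattice PWA function is a max-of-min composition of affine pieces (the conjunctive form swaps max and min). A minimal evaluation sketch, with illustrative data rather than data from the paper's examples, looks as follows.

```python
import numpy as np

def lattice_pwa_disjunctive(x, A, b, groups):
    # Evaluate f(x) = max_i min_{j in groups[i]} (A[j] @ x + b[j]),
    # the max-of-min (disjunctive) lattice PWA form.  A and b hold the affine
    # pieces; each entry of 'groups' lists the piece indices of one inner minimum.
    vals = A @ x + b
    return max(min(vals[j] for j in g) for g in groups)

# Toy example with three affine pieces in R^2 and two min-terms.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([0.0, 0.5, 1.0])
print(lattice_pwa_disjunctive(np.array([0.2, -0.3]), A, b, [[0, 1], [2]]))
```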
We present an end-to-end framework for solving the Vehicle Routing Problem (VRP) using reinforcement learning. In this approach, we train a single model that finds near-optimal solutions for problem instances sampled from a given distribution, only by observing the reward signals and following feasibility rules. Our model represents a parameterized stochastic policy, and by applying a policy gradient algorithm to optimize its parameters, the trained model produces the solution as a sequence of consecutive actions in real time, without the need to re-train for every new problem instance. On capacitated VRP, our approach outperforms classical heuristics and Google's OR-Tools on medium-sized instances in solution quality with comparable computation time (after training). We demonstrate how our approach can handle problems with split delivery and explore the effect of such deliveries on the solution quality. Our proposed framework can be applied to other variants of the VRP such as the stochastic VRP, and has the potential to be applied more generally to combinatorial optimization problems.
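A heavily simplified sketch of the policy-gradient step is given below: a toy policy network (here called TinyPolicy, an illustrative stand-in) scores the next node, a tour is sampled action by action, and the negative tour length plays the role of the reward. This is a schematic REINFORCE update without a baseline, capacities, or the attention-based model used in the paper.

```python
import torch

class TinyPolicy(torch.nn.Module):
    # Toy policy: maps flattened node coordinates to one logit per node;
    # already-visited nodes are masked out before sampling.
    def __init__(self, n_nodes, hidden=32):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(2 * n_nodes, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, n_nodes))

    def forward(self, coords_flat, visited):
        logits = self.net(coords_flat).masked_fill(visited, -1e9)
        return torch.distributions.Categorical(logits=logits)

def reinforce_step(policy, coords, optimizer):
    # One REINFORCE update on a single TSP-like instance: sample a tour node
    # by node, use negative tour length as reward, ascend the policy gradient.
    n = coords.shape[0]
    visited = torch.zeros(n, dtype=torch.bool)
    log_probs, tour = [], []
    for _ in range(n):
        dist = policy(coords.flatten(), visited)
        a = dist.sample()
        log_probs.append(dist.log_prob(a))
        tour.append(a)
        visited = visited.clone()
        visited[a] = True
    pts = coords[torch.stack(tour)]
    length = (pts.roll(-1, 0) - pts).norm(dim=1).sum()
    loss = length.detach() * torch.stack(log_probs).sum()  # REINFORCE surrogate
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return length.item()

# Toy usage on a random 10-node instance.
coords = torch.rand(10, 2)
policy = TinyPolicy(n_nodes=10)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
reinforce_step(policy, coords, opt)
```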
This paper describes a suite of algorithms for constructing low-rank approximations of an input matrix from a random linear image of the matrix, called a sketch. These methods can preserve structural properties of the input matrix, such as positive-semidefiniteness, and they can produce approximations with a user-specified rank. The algorithms are simple, accurate, numerically stable, and provably correct. Moreover, each method is accompanied by an informative error bound that allows users to select parameters a priori to achieve a given approximation quality. These claims are supported by numerical experiments with real and synthetic data.
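A minimal example in this family is the randomized range finder followed by projection and truncation, shown below. Note this simplified variant revisits the input matrix in a second pass, whereas the paper's algorithms reconstruct the approximation from the sketch alone and additionally cover structure-preserving (e.g., positive-semidefinite) variants with explicit a priori error bounds.

```python
import numpy as np

def sketch_low_rank(A, r, oversample=10, seed=0):
    # Simplified two-pass randomized low-rank approximation with a
    # user-specified target rank r and a small oversampling parameter.
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], r + oversample))
    Y = A @ Omega                          # random linear image of A: the "sketch"
    Q, _ = np.linalg.qr(Y)                 # orthonormal basis for the sketch's range
    U, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return Q @ U[:, :r], s[:r], Vt[:r]     # factors of a rank-r approximation

# Usage on a random test matrix.
A = np.random.default_rng(1).standard_normal((500, 300))
U, s, Vt = sketch_low_rank(A, r=20)
err = np.linalg.norm(A - U @ np.diag(s) @ Vt)
```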
We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds on techniques for distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results characterizing the theoretical performance of the estimator. In particular, we show that our procedure comes with certificates of optimality, achieving (in some scenarios) faster rates of convergence than empirical risk minimization by virtue of automatically balancing bias and variance. We give corroborating empirical evidence showing that in practice, the estimator indeed trades between variance and absolute performance on a training sample, improving out-of-sample (test) performance over standard empirical risk minimization for a number of classification problems.
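The key intuition, stated informally and under conditions made precise in the paper, is that the distributionally robust objective over an $f$-divergence ball of radius $\rho/n$ around the empirical distribution $\hat{P}_n$ behaves like an empirical mean plus a variance penalty:
\[
\sup_{P \,:\, D_f(P\,\|\,\hat{P}_n)\le \rho/n} \mathbb{E}_{P}\bigl[\ell(\theta;X)\bigr]
\;=\; \mathbb{E}_{\hat{P}_n}\bigl[\ell(\theta;X)\bigr]
\;+\; \sqrt{\frac{\rho\,\mathrm{Var}_{\hat{P}_n}\bigl(\ell(\theta;X)\bigr)}{n}}
\;+\; o_P\bigl(n^{-1/2}\bigr),
\]
so that minimizing the left-hand side, which is convex in $\theta$ for convex losses, acts as a tractable surrogate for the nonconvex mean-plus-standard-deviation criterion on the right.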