We give new polynomial lower bounds for a number of dynamic measure problems in computational geometry. These lower bounds hold in the Word-RAM model, conditioned on the hardness of either the 3SUM problem or the Online Matrix-Vector Multiplication problem [Henzinger et al., STOC 2015]. In particular, we get lower bounds in the incremental and fully-dynamic settings for counting maximal or extremal points in $\mathbb{R}^3$, different variants of Klee's Measure Problem, problems related to finding the largest empty disk in a set of points, and querying the size of the $i$-th convex layer in a planar set of points. While many conditional lower bounds for dynamic data structures have been proven since the seminal work of Patrascu [STOC 2010], few of them relate to computational geometry problems; this is the first paper focusing on this topic. The problems we consider can all be solved in $O(n \log n)$ time in the static case, and their dynamic versions have mostly been approached from the perspective of improving known upper bounds. One exception is Klee's measure problem in $\mathbb{R}^2$, for which Chan [CGTA 2010] gave an unconditional $\Omega(\sqrt{n})$ lower bound on the worst-case update time. By a similar approach, we show that this bound also holds for an important special case of Klee's measure problem in $\mathbb{R}^3$ known as the Hypervolume Indicator problem.
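To fix ideas, here is a minimal Python sketch of the one-dimensional special case of Klee's Measure Problem (the total length covered by a union of intervals), illustrating the $O(n \log n)$ static bound mentioned above; the interval data is hypothetical, and the dynamic data structures that the lower bounds concern are of course far more involved.

```python
# 1-D Klee's measure: total length of a union of n half-open intervals,
# computed in O(n log n) by sorting and sweeping. The dynamic variants
# discussed above must maintain this measure under insertions/deletions.
def klee_measure_1d(intervals):
    events = sorted(intervals)          # sort by left endpoint
    total, cur_l, cur_r = 0.0, None, None
    for l, r in events:
        if cur_r is None or l > cur_r:  # disjoint from the current run
            if cur_r is not None:
                total += cur_r - cur_l
            cur_l, cur_r = l, r
        else:                           # overlapping: extend the run
            cur_r = max(cur_r, r)
    if cur_r is not None:
        total += cur_r - cur_l
    return total

print(klee_measure_1d([(0, 2), (1, 4), (6, 7)]))  # 5.0
```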
The segment number of a planar graph $G$ is the smallest number of line segments needed for a planar straight-line drawing of $G$. Dujmovi\'c, Eppstein, Suderman, and Wood [CGTA'07] introduced this measure for the visual complexity of graphs. There are optimal algorithms for trees and worst-case optimal algorithms for outerplanar graphs, 2-trees, and planar 3-trees. It is known that every cubic triconnected planar $n$-vertex graph (except $K_4$) has segment number $n/2+3$, which is the only known universal lower bound for a meaningful class of planar graphs. We show that every triconnected planar 4-regular graph can be drawn using at most $n+3$ segments. This bound is tight up to an additive constant, improves a previous upper bound of $7n/4+2$ implied by a more general result of Dujmovi\'c et al., and supplements the result for cubic graphs. We also give a simple optimal algorithm for cactus graphs, generalizing the above-mentioned result for trees. We prove the first linear universal lower bounds for outerpaths, maximal outerplanar graphs, 2-trees, and planar 3-trees. This shows that the existing algorithms for these graph classes are constant-factor approximations. For maximal outerpaths, our bound is best possible and can be generalized to circular arcs.
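As a hedged illustration of the quantity being bounded, the following Python sketch counts the segments used by a given planar straight-line drawing: edges that share a vertex and are collinear through it are merged with union-find, and the segment number of $G$ is the minimum of this count over all planar straight-line drawings of $G$. Integer coordinates and the example drawing are hypothetical choices for the sketch.

```python
from math import gcd

def count_segments(vertices, edges):
    """vertices: {v: (x, y)} with integer coordinates; edges: pairs (u, v)."""
    parent = list(range(len(edges)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def direction(u, v):                 # primitive direction vector u -> v
        dx = vertices[v][0] - vertices[u][0]
        dy = vertices[v][1] - vertices[u][1]
        g = gcd(abs(dx), abs(dy))
        return (dx // g, dy // g)

    incident = {}                        # (vertex, direction) -> edge index
    for i, (u, v) in enumerate(edges):
        incident[(u, direction(u, v))] = i
        incident[(v, direction(v, u))] = i
    for (a, (dx, dy)), i in incident.items():
        j = incident.get((a, (-dx, -dy)))
        if j is not None:                # collinear through vertex a: merge
            parent[find(i)] = find(j)
    return len({find(i) for i in range(len(edges))})

# A path whose middle vertex lies between its neighbours needs one segment:
print(count_segments({0: (0, 0), 1: (1, 1), 2: (2, 2)}, [(0, 1), (1, 2)]))  # 1
```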
Stochastic dual dynamic programming is a cutting-plane-type algorithm for multi-stage stochastic optimization that originated about 30 years ago. In spite of its popularity in practice, there has been no analysis of the convergence rate of this method. In this paper, we first establish the number of iterations, i.e., the iteration complexity, required by a basic dynamic cutting plane method for solving relatively simple multi-stage optimization problems, by introducing novel mathematical tools, including the saturation of search points. We then refine these basic tools and establish the iteration complexity of both deterministic and stochastic dual dynamic programming methods for solving more general multi-stage stochastic optimization problems under the standard stage-wise independence assumption. Our results indicate that the complexity of some deterministic variants of these methods increases only mildly with the number of stages $T$, and in fact depends linearly on $T$ for discounted problems. Therefore, these methods are efficient for strategic decision making, which involves a large number of stages but a relatively small number of decision variables in each stage. Without explicitly discretizing the state and action spaces, these methods might also be pertinent to the related areas of reinforcement learning and stochastic control.
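The cutting plane mechanism at the heart of these methods fits in a few lines: the convex cost-to-go function is under-approximated by a growing max of affine cuts, and each iteration adds the supporting cut at the current search point. The Python sketch below is a hypothetical one-dimensional toy, with $f(x)=x^2$ standing in for a cost-to-go function; it shows only this shape, not the multi-stage machinery.

```python
def f(x):  return x * x       # stands in for a convex cost-to-go function
def df(x): return 2 * x       # its (sub)gradient

lo, hi = -1.0, 2.0
cuts = []                     # affine cuts (a, b) certifying f(x) >= a + b*x
x = hi                        # arbitrary initial search point
for k in range(10):
    cuts.append((f(x) - df(x) * x, df(x)))   # supporting cut at x
    # minimize the current max-of-cuts model over a coarse grid of the domain
    grid = [lo + (hi - lo) * t / 1000 for t in range(1001)]
    x = min(grid, key=lambda z: max(a + b * z for a, b in cuts))
    gap = f(x) - max(a + b * x for a, b in cuts)
    print(f"iter {k}: x = {x:+.3f}, model gap = {gap:.5f}")
```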
The main theme of this paper is using $k$-dimensional generalizations of the combinatorial Boolean Matrix Multiplication (BMM) hypothesis and the closely related Online Matrix-Vector Multiplication (OMv) hypothesis to prove new tight conditional lower bounds for dynamic problems. The combinatorial $k$-Clique hypothesis, which is a standard hypothesis in the literature, naturally generalizes the combinatorial BMM hypothesis. In this paper, we prove tight lower bounds for several dynamic problems under the combinatorial $k$-Clique hypothesis. For instance, we show that:

* The Dynamic Range Mode problem has no combinatorial algorithm with $\mathrm{poly}(n)$ pre-processing time, $O(n^{2/3-\epsilon})$ update time, and $O(n^{2/3-\epsilon})$ query time for any $\epsilon > 0$, matching the known upper bounds for this problem. Previous lower bounds only ruled out algorithms with $O(n^{1/2-\epsilon})$ update and query time, under the OMv hypothesis.

Other examples include tight combinatorial lower bounds for Dynamic Subgraph Connectivity, Dynamic 2D Orthogonal Range Color Counting, Dynamic 2-Pattern Document Retrieval, and Dynamic Range Mode in higher dimensions. Furthermore, we propose the OuMv$_k$ hypothesis as a natural generalization of the OMv hypothesis. Under this hypothesis, we prove tight lower bounds for various dynamic problems. For instance, we show that:

* The Dynamic Skyline Points Counting problem in $(2k-1)$-dimensional space has no algorithm with $\mathrm{poly}(n)$ pre-processing time and $O(n^{1-1/k-\epsilon})$ update and query time for any $\epsilon > 0$, even if the updates are semi-online.

Other examples include tight conditional lower bounds for (semi-online) Dynamic Klee's measure for unit cubes, and for high-dimensional generalizations of Erickson's problem and Langerman's problem.
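For readers less familiar with the problem, a naive baseline fixes the statement of Dynamic Range Mode: maintain an array under point updates and report a most frequent element in a queried range. The Python sketch below has $O(1)$ updates and $O(r-l)$ queries; the $O(n^{2/3})$ structures to which the tight bounds refer are considerably more intricate, and the example data is hypothetical.

```python
from collections import Counter

class DynamicRangeMode:
    """Naive baseline: O(1) point update, O(r - l) range-mode query."""
    def __init__(self, a):
        self.a = list(a)

    def update(self, i, x):      # point assignment a[i] = x
        self.a[i] = x

    def query(self, l, r):       # a most frequent element of a[l..r]
        counts = Counter(self.a[l:r + 1])
        return max(counts, key=counts.get)

ds = DynamicRangeMode([1, 2, 2, 3, 3, 3])
print(ds.query(0, 5))   # 3
ds.update(5, 2)
print(ds.query(0, 5))   # 2
```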
Optimal transport (OT) is a popular measure for comparing probability distributions. However, OT suffers from a few drawbacks: (i) a high computational complexity, and (ii) indefiniteness, which limits its applicability to kernel machines. In this work, we consider probability measures supported on a graph metric space and propose a novel Sobolev transport metric. We show that the Sobolev transport metric yields a closed-form formula for fast computation and that it is negative definite. We show that the space of probability measures endowed with this transport distance is isometric to a bounded convex set in a Euclidean space with a weighted $\ell_p$ distance. We further exploit the negative definiteness of the Sobolev transport to design positive-definite kernels, and evaluate their performance against other baselines in document classification with word embeddings and in topological data analysis.
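To give a feel for why graph-based transport distances can admit closed forms, here is a hedged Python sketch of the special case of a rooted tree metric: each edge contributes its length times the $p$-th power of the difference in mass that the two measures place in the subtree below it. The tree, edge lengths, masses, and exponent are hypothetical, and this is only the tree special case as we read it, not the authors' general graph construction.

```python
def sobolev_transport_tree(children, lengths, mu, nu, root, p=2):
    """children: {v: [children of v]}; lengths[v]: length of edge v-parent."""
    total = 0.0
    def mass_diff(v):            # returns mu(subtree_v) - nu(subtree_v)
        nonlocal total
        d = mu.get(v, 0.0) - nu.get(v, 0.0)
        for c in children.get(v, []):
            d += mass_diff(c)
        if v != root:            # one term per edge (v, parent(v))
            total += lengths[v] * abs(d) ** p
        return d
    mass_diff(root)
    return total ** (1.0 / p)

#      0
#     / \      edge lengths: 1 -> 1.0, 2 -> 2.0
#    1   2
print(sobolev_transport_tree({0: [1, 2]}, {1: 1.0, 2: 2.0},
                             {1: 1.0}, {2: 1.0}, root=0))  # sqrt(3)
```

For $p=1$ on a tree, this aggregation reduces to the well-known closed form of the Wasserstein-1 distance for tree metrics.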
Let $n>m$, and let $A$ be an $(m\times n)$-matrix of full rank. Then the estimate $\|Ax\|\leq\|A\|\,\|x\|$ obviously holds, with the Euclidean norm for $x$ and $Ax$ and the spectral norm as the associated matrix norm. We study the sets of all $x$ for which, for fixed $\delta<1$, the converse estimate $\|Ax\|\geq\delta\,\|A\|\,\|x\|$ holds. It turns out that in the high-dimensional case these sets fill almost the entire space once $\delta$ falls below a bound that depends on the extremal singular values of $A$ and on the ratio of the dimensions. This effect is closely related to the random projection theorem, which plays an important role in the data sciences. As a byproduct, we calculate exactly the probabilities with which this theorem deals.
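The phenomenon is easy to probe numerically. The following Python sketch draws a random matrix with $n \gg m$ and estimates the fraction of unit vectors $x$ satisfying $\|Ax\|\geq\delta\,\|A\|\,\|x\|$; the dimensions, the Gaussian ensemble, and the choice of $\delta$ are hypothetical illustrations.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, delta, trials = 20, 2000, 0.05, 20000

A = rng.standard_normal((m, n))
spec = np.linalg.norm(A, 2)                      # spectral norm ||A||

X = rng.standard_normal((trials, n))
X /= np.linalg.norm(X, axis=1, keepdims=True)    # uniform points on S^{n-1}
frac = np.mean(np.linalg.norm(X @ A.T, axis=1) >= delta * spec)
print(f"fraction of the sphere with ||Ax|| >= delta*||A||*||x||: {frac:.4f}")
```

For these parameters the reported fraction is essentially $1$, consistent with the claim that the sets fill almost the entire space once $\delta$ is small relative to the dimension ratio.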
The optimistic gradient method has seen increasing popularity as an efficient first-order method for solving convex-concave saddle point problems. To analyze its iteration complexity, a recent work [arXiv:1901.08511] proposed an interesting perspective that interprets the optimistic gradient method as an approximation to the proximal point method. In this paper, we follow this approach and distill the underlying idea of optimism to propose a generalized optimistic method, which encompasses the optimistic gradient method as a special case. Our general framework can handle constrained saddle point problems with composite objective functions and can work with arbitrary norms and compatible Bregman distances. We also develop an adaptive line search scheme to select the stepsizes without knowledge of the smoothness coefficients. We instantiate our method with first-order, second-order, and higher-order oracles and give sharp global iteration complexity bounds. When the objective function is convex-concave, we show that the averaged iterates of our $p$-th-order method ($p\geq 1$) converge at a rate of $\mathcal{O}(1/N^\frac{p+1}{2})$. When the objective function is further strongly-convex-strongly-concave, we prove a complexity bound of $\mathcal{O}(\frac{L_1}{\mu}\log\frac{1}{\epsilon})$ for our first-order method and a bound of $\mathcal{O}((L_p D^\frac{p-1}{2}/\mu)^{\frac{2}{p+1}}+\log\log\frac{1}{\epsilon})$ for our $p$-th-order method ($p\geq 2$), respectively, where $L_p$ ($p\geq 1$) is the Lipschitz constant of the $p$-th-order derivative, $\mu$ is the strong convexity parameter, and $D$ is the initial Bregman distance to the saddle point. Moreover, our line search scheme provably requires only an almost constant number of calls to a subproblem solver per iteration on average, making our first-order and second-order methods particularly amenable to implementation.
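For concreteness, here is a minimal Python sketch of the classical first-order optimistic update that this framework recovers as a special case: writing $F(z)=(\nabla_x f, -\nabla_y f)$, the iterate is $z_{k+1} = z_k - \eta\,(2F(z_k) - F(z_{k-1}))$. The bilinear test problem $f(x,y)=xy$ and the stepsize are hypothetical.

```python
import numpy as np

def F(z):                      # saddle-point operator for f(x, y) = x * y
    x, y = z
    return np.array([y, -x])   # (df/dx, -df/dy)

eta = 0.3
z_prev = z = np.array([1.0, 1.0])
for k in range(200):
    # optimistic update: z_{k+1} = z_k - eta * (2 F(z_k) - F(z_{k-1}))
    z, z_prev = z - eta * (2 * F(z) - F(z_prev)), z
print(z)                       # approaches the saddle point (0, 0)
```

Plain gradient descent-ascent diverges on this bilinear problem; the extra $F(z_k)-F(z_{k-1})$ "optimism" term is what produces convergence.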
Recently, recovering an unknown signal from quadratic measurements has gained popularity, as it includes many interesting applications as special cases, such as phase retrieval, fusion frame phase retrieval, and positive operator-valued measures. In this paper, by employing the least squares approach to reconstruct the signal, we establish a non-asymptotic statistical property showing that the gap between the estimator and the true signal vanishes in the noiseless case and, in the noisy case, is bounded by an error rate of $O(\sqrt{p\log(1+2n)/n})$, where $n$ and $p$ are the number of measurements and the dimension of the signal, respectively. We develop a gradient regularized Newton method (GRNM) to solve the least squares problem and prove that it converges to a unique local minimum at a superlinear rate under certain mild conditions. In addition to these deterministic results, we show that GRNM reconstructs the true signal exactly in the noiseless case and achieves the above error rate with high probability in the noisy case. Numerical experiments demonstrate that GRNM performs well, with high recovery accuracy, fast computational speed, and strong recovery capability.
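As a rough illustration of the setting, the Python sketch below runs a gradient-regularized Newton iteration on the least squares objective $\sum_k (x^\top A_k x - y_k)^2$, damping the Newton system by a multiple of the current gradient norm. The instance, the damping rule, the initialization, and the iteration count are hypothetical choices, not the paper's exact GRNM.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 5, 60
A = rng.standard_normal((n, p, p))
A = (A + A.transpose(0, 2, 1)) / 2              # symmetric measurement matrices
x_true = rng.standard_normal(p)
y = np.einsum('i,kij,j->k', x_true, A, x_true)  # noiseless measurements

def grad_hess(x):
    r = np.einsum('i,kij,j->k', x, A, x) - y    # residuals
    Ax = np.einsum('kij,j->ki', A, x)           # rows: A_k x
    g = 4 * Ax.T @ r                            # gradient of the objective
    H = 8 * np.einsum('ki,kj->ij', Ax, Ax) + 4 * np.einsum('k,kij->ij', r, A)
    return g, H

x = x_true + 0.1 * rng.standard_normal(p)       # start near the signal
for _ in range(20):
    g, H = grad_hess(x)
    lam = np.sqrt(np.linalg.norm(g))            # gradient-based damping
    x -= np.linalg.solve(H + lam * np.eye(p), g)
print(min(np.linalg.norm(x - x_true), np.linalg.norm(x + x_true)))  # ~ 0
```

(The sign ambiguity is inherent: quadratic measurements determine the signal only up to $\pm x$.)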
This paper presents local minimax regret lower bounds for adaptively controlling linear-quadratic-Gaussian (LQG) systems. We consider smoothly parametrized instances and provide an understanding of when logarithmic regret is impossible that is both instance-specific and flexible enough to take problem structure into account. This understanding relies on two key notions: local uninformativeness, which occurs when the optimal policy does not provide sufficient excitation for identification of the optimal policy and thus yields a degenerate Fisher information matrix; and information-regret boundedness, which holds when the small eigenvalues of a policy-dependent information matrix can be bounded in terms of the regret of that policy. Combined with a reduction to Bayesian estimation and an application of Van Trees' inequality, these two conditions are sufficient for proving regret lower bounds of order $\sqrt{T}$ in the time horizon $T$. This method yields lower bounds that exhibit tight dimensional dependencies and scale naturally with control-theoretic problem constants. For instance, we are able to prove that systems operating near marginal stability are fundamentally hard to learn to control. We further show that large classes of systems satisfy these conditions, among them any state-feedback system with both $A$- and $B$-matrices unknown. Most importantly, we also establish that a nontrivial class of partially observable systems, essentially those that are over-actuated, satisfies these conditions, thus providing a $\sqrt{T}$ lower bound that is also valid for partially observable systems. Finally, we turn to two simple examples which demonstrate that our lower bounds capture classical control-theoretic intuition: they diverge for systems operating near marginal stability or with large filter gain, and such systems can be arbitrarily hard to (learn to) control.
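For orientation, the scalar form of Van Trees' inequality (the Bayesian Cramér-Rao bound) underlying this reduction reads as follows; the paper necessarily works with a multivariate version, so this display is only a schematic. For a prior density $\pi$ on the parameter $\theta$ and any estimator $\hat\theta(X)$,
$$\mathbb{E}\big[(\hat\theta(X)-\theta)^2\big] \;\geq\; \frac{1}{\mathbb{E}[\mathcal{I}(\theta)]+\mathcal{I}(\pi)}, \qquad \mathcal{I}(\pi)=\int\frac{\pi'(\theta)^2}{\pi(\theta)}\,d\theta,$$
where $\mathcal{I}(\theta)$ is the Fisher information of the observations and the expectations are taken over both $\theta\sim\pi$ and $X$. Degenerate or small eigenvalues of the information matrix thus translate into irreducible estimation error, which the two conditions above convert into regret.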
If $\mathcal{P} = \left\langle A \mid R \right\rangle$ is a monoid presentation, then the relation words of $\mathcal{P}$ are just the words appearing on the left or right hand side of any pair in $R$. A word $w\in A^*$ is said to be a piece of $\mathcal{P}$ if $w$ is a factor of at least two distinct relation words, or occurs more than once as a factor of a single relation word (possibly overlapping). A finitely presented monoid is a small overlap monoid if no relation word can be written as a product of fewer than $4$ pieces. In this paper, we present a quadratic time algorithm for computing normal forms of words in small overlap monoids, with constants small enough to allow for practical computation. Additionally, we show that the uniform word problem for small overlap monoids can be solved in linear time.
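The definitions are easy to operationalize. The Python sketch below enumerates the pieces of a presentation and greedily factorizes each relation word into pieces; since the set of pieces is closed under taking factors, the greedy longest-prefix factorization uses the minimum possible number of pieces, so the small overlap condition holds exactly when every reported value is at least $4$. The toy presentation is hypothetical.

```python
def pieces(rel_words):
    """Factors occurring in at least two distinct (word, position) places."""
    occ = {}
    for k, w in enumerate(rel_words):
        for i in range(len(w)):
            for j in range(i + 1, len(w) + 1):
                occ.setdefault(w[i:j], set()).add((k, i))
    return {f for f, places in occ.items() if len(places) >= 2}

def min_pieces(w, P):
    """Fewest pieces whose product is w (inf if no factorization exists)."""
    count, i = 0, 0
    while i < len(w):
        j = max((j for j in range(i + 1, len(w) + 1) if w[i:j] in P),
                default=None)           # greedy longest piece prefix
        if j is None:
            return float('inf')
        count, i = count + 1, j
    return count

rel = ["abcb", "d", "bcba", "e"]   # relation words of a toy presentation
P = pieces(rel)
for w in rel:
    print(w, min_pieces(w, P))     # small overlap iff every value is >= 4
```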
We consider the exploration-exploitation trade-off in reinforcement learning and show that an agent imbued with a risk-seeking utility function is able to explore efficiently, as measured by regret. The parameter that controls how risk-seeking the agent is can be optimized exactly or annealed according to a schedule. We call the resulting algorithm K-learning and show that the corresponding K-values are optimistic for the expected Q-values at each state-action pair. The K-values induce a natural Boltzmann exploration policy for which the `temperature' parameter is equal to the risk-seeking parameter. This policy achieves an expected regret bound of $\tilde O(L^{3/2} \sqrt{S A T})$, where $L$ is the time horizon, $S$ is the number of states, $A$ is the number of actions, and $T$ is the total number of elapsed time-steps. This bound is only a factor of $L$ larger than the established lower bound. K-learning can be interpreted as mirror descent in the policy space; it is similar to other well-known methods in the literature, including Q-learning, soft Q-learning, and maximum entropy policy gradient, and is closely related to optimism and count-based exploration methods. K-learning is simple to implement, as it only requires adding a bonus to the reward at each state-action pair and then solving a Bellman equation. We conclude with a numerical example demonstrating that K-learning is competitive with other state-of-the-art algorithms in practice.
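A schematic of the two ingredients named above, in Python: add an exploration bonus to the reward, solve a soft (log-sum-exp) Bellman equation backwards for the K-values, and act with the induced Boltzmann policy whose temperature equals the risk-seeking parameter $\tau$. The toy MDP, the bonus form, and $\tau$ are hypothetical; this is a shape sketch, not the paper's exact algorithm.

```python
import numpy as np

S, A, L, tau = 3, 2, 5, 0.5
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] = distribution over s'
r = rng.uniform(size=(S, A))                 # mean rewards
counts = np.ones((S, A))                     # visit counts (all 1 here)
bonus = 1.0 / np.sqrt(counts)                # count-based exploration bonus

K = np.zeros((L + 1, S, A))                  # K[L] = 0 at the horizon
for h in range(L - 1, -1, -1):               # backward induction
    # soft value: V(s') = tau * log sum_a' exp(K(s', a') / tau)
    V = tau * np.log(np.exp(K[h + 1] / tau).sum(axis=1))
    K[h] = r + bonus + P @ V                 # Bellman backup with bonus

pi = np.exp(K[0] / tau)
pi /= pi.sum(axis=1, keepdims=True)          # Boltzmann policy at step 0
print(np.round(pi, 3))
```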