We propose a deterministic Kaczmarz algorithm for solving linear systems $A\x=\b$. Different from previous Kaczmarz algorithms, we use reflections in each step of the iteration. This generates a series of points distributed with patterns on a sphere centered at a solution. Firstly, we prove that taking the average of $O(\eta/\epsilon)$ points leads to an effective approximation of the solution up to relative error $\epsilon$, where $\eta$ is a parameter depending on $A$ and can be bounded above by the square of the condition number. We also show how to select these points efficiently. From the numerical tests, our Kaczmarz algorithm usually converges more quickly than the (block) randomized Kaczmarz algorithms. Secondly, when the linear system is consistent, the Kaczmarz algorithm returns the solution that has the minimal distance to the initial vector. This gives a method to solve the least-norm problem. Finally, we prove that our Kaczmarz algorithm indeed solves the linear system $A^TW^{-1}A \x = A^TW^{-1} \b$, where $W$ is the low-triangular matrix such that $W+W^T=2AA^T$. The relationship between this linear system and the original one is studied.
In the $(1+\varepsilon,r)$-approximate near-neighbor problem for curves (ANNC) under some distance measure $\delta$, the goal is to construct a data structure for a given set $\mathcal{C}$ of curves that supports approximate near-neighbor queries: Given a query curve $Q$, if there exists a curve $C\in\mathcal{C}$ such that $\delta(Q,C)\le r$, then return a curve $C'\in\mathcal{C}$ with $\delta(Q,C')\le(1+\varepsilon)r$. There exists an efficient reduction from the $(1+\varepsilon)$-approximate nearest-neighbor problem to ANNC, where in the former problem the answer to a query is a curve $C\in\mathcal{C}$ with $\delta(Q,C)\le(1+\varepsilon)\cdot\delta(Q,C^*)$, where $C^*$ is the curve of $\mathcal{C}$ closest to $Q$. Given a set $\mathcal{C}$ of $n$ curves, each consisting of $m$ points in $d$ dimensions, we construct a data structure for ANNC that uses $n\cdot O(\frac{1}{\varepsilon})^{md}$ storage space and has $O(md)$ query time (for a query curve of length $m$), where the similarity between two curves is their discrete Fr\'echet or dynamic time warping distance. Our method is simple to implement, deterministic, and results in an exponential improvement in both query time and storage space compared to all previous bounds. Further, we also consider the asymmetric version of ANNC, where the length of the query curves is $k \ll m$, and obtain essentially the same storage and query bounds as above, except that $m$ is replaced by $k$. Finally, we apply our method to a version of approximate range counting for curves and achieve similar bounds.
The maximum traveling salesman problem (Max~TSP) consists of finding a Hamiltonian cycle with the maximum total weight of the edges in a given complete weighted graph. We prove that, in the case when the edge weights are induced by a metric space of bounded doubling dimension, asymptotically optimal solutions of the problem can be found by the simple greedy patching heuristic. Taking as a start point a maximum-weight cycle cover, this heuristic iteratively patches pairs of its cycles into one minimizing the weight loss at each step.
We consider Broyden's method and some accelerated schemes for nonlinear equations having a strongly regular singularity of first order with a one-dimensional nullspace. Our two main results are as follows. First, we show that the use of a preceding Newton-like step ensures convergence for starting points in a starlike domain with density 1. This extends the domain of convergence of these methods significantly. Second, we establish that the matrix updates of Broyden's method converge q-linearly with the same asymptotic factor as the iterates. This contributes to the long-standing question whether the Broyden matrices converge by showing that this is indeed the case for the setting at hand. Furthermore, we prove that the Broyden directions violate uniform linear independence, which implies that existing results for convergence of the Broyden matrices cannot be applied. Numerical experiments of high precision confirm the enlarged domain of convergence, the q-linear convergence of the matrix updates, and the lack of uniform linear independence. In addition, they suggest that these results can be extended to singularities of higher order and that Broyden's method can converge r-linearly without converging q-linearly. The underlying code is freely available.
Stochastic majorization-minimization (SMM) is an online extension of the classical principle of majorization-minimization, which consists of sampling i.i.d. data points from a fixed data distribution and minimizing a recursively defined majorizing surrogate of an objective function. In this paper, we introduce stochastic block majorization-minimization, where the surrogates can now be only block multi-convex and a single block is optimized at a time within a diminishing radius. Relaxing the standard strong convexity requirements for surrogates in SMM, our framework gives wider applicability including online CANDECOMP/PARAFAC (CP) dictionary learning and yields greater computational efficiency especially when the problem dimension is large. We provide an extensive convergence analysis on the proposed algorithm, which we derive under possibly dependent data streams, relaxing the standard i.i.d. assumption on data samples. We show that the proposed algorithm converges almost surely to the set of stationary points of a nonconvex objective under constraints at a rate $O((\log n)^{1+\eps}/n^{1/2})$ for the empirical loss function and $O((\log n)^{1+\eps}/n^{1/4})$ for the expected loss function, where $n$ denotes the number of data samples processed. Under some additional assumption, the latter convergence rate can be improved to $O((\log n)^{1+\eps}/n^{1/2})$. Our results provide first convergence rate bounds for various online matrix and tensor decomposition algorithms under a general Markovian data setting.
Navier-Stokes equations are well known in modelling of an incompressible Newtonian fluid, such as air or water. This system of equations is very complex due to the non-linearity term that characterizes it. After the linearization and the discretization parts, we get a descriptor system of index-2 described by a set of differential algebraic equations (DAEs). The two main parts we develop through this paper are focused firstly on constructing an efficient algorithm based on a projection technique onto an extended block Krylov subspace, that appropriately allows us to construct a reduced system of the original DAE system. Secondly, we solve a Linear Quadratic Regulator (LQR) problem based on a Riccati feedback approach. This approach uses numerical solutions of large-scale algebraic Riccati equations. To this end, we use the extended Krylov subspace method that allows us to project the initial large matrix problem onto a low order one that is solved by some direct methods. These numerical solutions are used to obtain a feedback matrix that will be used to stabilize the original system. We conclude by providing some numerical results to confirm the performances of our proposed method compared to other known methods.
We study the class of first-order locally-balanced Metropolis--Hastings algorithms introduced in Livingstone & Zanella (2021). To choose a specific algorithm within the class the user must select a balancing function $g:\mathbb{R} \to \mathbb{R}$ satisfying $g(t) = tg(1/t)$, and a noise distribution for the proposal increment. Popular choices within the class are the Metropolis-adjusted Langevin algorithm and the recently introduced Barker proposal. We first establish a universal limiting optimal acceptance rate of 57% and scaling of $n^{-1/3}$ as the dimension $n$ tends to infinity among all members of the class under mild smoothness assumptions on $g$ and when the target distribution for the algorithm is of the product form. In particular we obtain an explicit expression for the asymptotic efficiency of an arbitrary algorithm in the class, as measured by expected squared jumping distance. We then consider how to optimise this expression under various constraints. We derive an optimal choice of noise distribution for the Barker proposal, optimal choice of balancing function under a Gaussian noise distribution, and optimal choice of first-order locally-balanced algorithm among the entire class, which turns out to depend on the specific target distribution. Numerical simulations confirm our theoretical findings and in particular show that a bi-modal choice of noise distribution in the Barker proposal gives rise to a practical algorithm that is consistently more efficient than the original Gaussian version.
Consider using the right-preconditioned GMRES for obtaining the minimum-norm solution of underdetermined inconsistent least squares problems. Morikuni (Ph.D. thesis, 2013) showed that for some inconsistent and ill-conditioned problems, the iterates may diverge. This is mainly because the Hessenberg matrix in the GMRES method becomes very ill-conditioned so that the backward substitution of the resulting triangular system becomes numerically unstable. We propose a stabilized GMRES based on solving the normal equations corresponding to the above triangular system using the standard Cholesky decomposition. This has the effect of shifting upwards the tiny singular values of the Hessenberg matrix which lead to an inaccurate solution. Thus, the process becomes numerically stable and the system becomes consistent, rendering better convergence and a more accurate solution. Numerical experiments show that the proposed method is robust and efficient. The method can be considered as a way of making GMRES stable for highly ill-conditioned inconsistent problems.
In this paper, we study arbitrary infinite binary information systems each of which consists of an infinite set called universe and an infinite set of two-valued functions (attributes) defined on the universe. We consider the notion of a problem over information system which is described by a finite number of attributes and a mapping corresponding a decision to each tuple of attribute values. As algorithms for problem solving, we use deterministic and nondeterministic decision trees. As time and space complexity, we study the depth and the number of nodes in the decision trees. In the worst case, with the growth of the number of attributes in the problem description, (i) the minimum depth of deterministic decision trees grows either almost as logarithm or linearly, (ii) the minimum depth of nondeterministic decision trees either is bounded from above by a constant or grows linearly, (iii) the minimum number of nodes in deterministic decision trees has either polynomial or exponential growth, and (iv) the minimum number of nodes in nondeterministic decision trees has either polynomial or exponential growth. Based on these results, we divide the set of all infinite binary information systems into five complexity classes, and study for each class issues related to time-space trade-off for decision trees.
In this paper we analyze the Schwarz alternating method for unconstrained elliptic optimal control problems. We discuss the convergence properties of the method in the continuous case first and then apply the arguments to the finite difference discretization case. In both cases, we prove that the Schwarz alternating method is convergent if its counterpart for an elliptic equation is convergent. Meanwhile, the convergence rate of the method for the elliptic equation under the maximum norm also gives a uniform upper bound (with respect to the regularization parameter $\alpha$) of the convergence rate of the method for the optimal control problem under the maximum norm of proper error merit functions in the continuous case or vectors in the discrete case. Our numerical results corroborate our theoretical results and show that with $\alpha$ decreasing to zero, the method will converge faster. We also give some exposition of this phenomenon.
Escaping saddle points is a central research topic in nonconvex optimization. In this paper, we propose a simple gradient-based algorithm such that for a smooth function $f\colon\mathbb{R}^n\to\mathbb{R}$, it outputs an $\epsilon$-approximate second-order stationary point in $\tilde{O}(\log n/\epsilon^{1.75})$ iterations. Compared to the previous state-of-the-art algorithms by Jin et al. with $\tilde{O}((\log n)^{4}/\epsilon^{2})$ or $\tilde{O}((\log n)^{6}/\epsilon^{1.75})$ iterations, our algorithm is polynomially better in terms of $\log n$ and matches their complexities in terms of $1/\epsilon$. For the stochastic setting, our algorithm outputs an $\epsilon$-approximate second-order stationary point in $\tilde{O}((\log n)^{2}/\epsilon^{4})$ iterations. Technically, our main contribution is an idea of implementing a robust Hessian power method using only gradients, which can find negative curvature near saddle points and achieve the polynomial speedup in $\log n$ compared to the perturbed gradient descent methods. Finally, we also perform numerical experiments that support our results.