Let $f \colon \mathcal{M} \to \mathbb{R}$ be a Lipschitz and geodesically convex function defined on a $d$-dimensional Riemannian manifold $\mathcal{M}$. Does there exist a first-order deterministic algorithm which (a) uses at most $O(\mathrm{poly}(d) \log(\epsilon^{-1}))$ subgradient queries to find a point with target accuracy $\epsilon$, and (b) requires only $O(\mathrm{poly}(d))$ arithmetic operations per query? In convex optimization, the classical ellipsoid method achieves this. After detailing related work, we provide an ellipsoid-like algorithm with query complexity $O(d^2 \log^2(\epsilon^{-1}))$ and per-query complexity $O(d^2)$ for the limited case where $\mathcal{M}$ has constant curvature (hemisphere or hyperbolic space). We then detail possible approaches and corresponding obstacles for designing an ellipsoid-like method for general Riemannian manifolds.
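For reference, the classical Euclidean ellipsoid method mentioned above can be sketched as follows. This is a minimal illustration of the Euclidean case only, not the Riemannian algorithm of the paper; the test function, starting ball, and iteration count are our own choices.

```python
import numpy as np

def ellipsoid_minimize(f, subgrad, center, radius, iters=200):
    """Minimize a convex function on R^d with the classical ellipsoid method.

    The ellipsoid {x : (x - c)^T P^{-1} (x - c) <= 1} starts as a ball of the
    given radius that must contain a minimizer; each subgradient query cuts
    its volume by a constant factor depending only on d.
    """
    d = len(center)
    assert d >= 2, "the update formulas below require d >= 2"
    c = np.asarray(center, dtype=float)
    P = (radius ** 2) * np.eye(d)
    best_x, best_val = c.copy(), f(c)
    for _ in range(iters):
        g = np.asarray(subgrad(c), dtype=float)
        gt = g / np.sqrt(g @ P @ g)                     # normalized cut direction
        c = c - (1.0 / (d + 1)) * (P @ gt)
        P = (d * d / (d * d - 1.0)) * (P - (2.0 / (d + 1)) * np.outer(P @ gt, gt @ P))
        val = f(c)
        if val < best_val:
            best_x, best_val = c.copy(), val
    return best_x, best_val

# toy problem: f(x) = ||x||_1, minimized at the origin
f = lambda x: np.abs(x).sum()
subgrad = lambda x: np.sign(x) + (x == 0)               # any valid subgradient
x, val = ellipsoid_minimize(f, subgrad, center=[3.0, -2.0], radius=10.0)
print(val)  # shrinks geometrically with the number of iterations
```

The geometric volume decrease is what yields the $O(\mathrm{poly}(d)\log(\epsilon^{-1}))$ query count in the Euclidean setting.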
In the Activation Edge-Multicover problem we are given a multigraph $G=(V,E)$ with activation costs $\{c_{e}^u,c_{e}^v\}$ for every edge $e=uv \in E$, and degree requirements $r=\{r_v:v \in V\}$. The goal is to find an edge subset $J \subseteq E$ of minimum activation cost $\sum_{v \in V}\max\{c_{uv}^v:uv \in J\}$, such that every $v \in V$ has at least $r_v$ neighbors in the graph $(V,J)$. Let $k= \max_{v \in V} r_v$ be the maximum requirement and let $\theta=\max_{e=uv \in E} \frac{\max\{c_e^u,c_e^v\}}{\min\{c_e^u,c_e^v\}}$ be the maximum quotient between the two costs of an edge. For $\theta=1$ the problem admits approximation ratio $O(\log k)$. For $k=1$ it generalizes the Set Cover problem (when $\theta=\infty$), and admits a tight approximation ratio $O(\log n)$. This implies approximation ratio $O(k \log n)$ for general $k$ and $\theta$, and no better approximation ratio was known. We obtain the first logarithmic approximation ratio $O(\log k +\log\min\{\theta,n\})$, which bridges between the two known ratios -- $O(\log k)$ for $\theta=1$ and $O(\log n)$ for $k=1$. This implies approximation ratio $O\left(\log k +\log\min\{\theta,n\}\right) +\beta \cdot (\theta+1)$ for the Activation $k$-Connected Subgraph problem, where $\beta$ is the best known approximation ratio for the ordinary min-cost version of the problem.
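To make the activation-cost objective concrete, here is a minimal sketch: each vertex pays the maximum activation cost over its chosen incident edges, and feasibility requires the degree bounds. The edge encoding and function names are our own illustration, not from the paper.

```python
from collections import defaultdict

def activation_cost(J):
    """Activation cost of an edge set J.

    Each edge is a tuple (u, v, c_u, c_v): activating edge uv costs c_u at
    endpoint u and c_v at endpoint v.  A vertex pays the maximum activation
    cost over its incident chosen edges, and the total cost is the sum of
    these vertex payments.
    """
    pay = defaultdict(float)
    for u, v, cu, cv in J:
        pay[u] = max(pay[u], cu)
        pay[v] = max(pay[v], cv)
    return sum(pay.values())

def satisfies_requirements(J, r):
    """Check that every vertex v has degree at least r[v] in (V, J)."""
    deg = defaultdict(int)
    for u, v, _, _ in J:
        deg[u] += 1
        deg[v] += 1
    return all(deg[v] >= rv for v, rv in r.items())

# vertex 'a' needs 2 activated neighbors
J = [("a", "b", 1.0, 2.0), ("a", "c", 3.0, 1.0)]
print(activation_cost(J))                   # a pays max(1,3)=3, b pays 2, c pays 1 -> 6.0
print(satisfies_requirements(J, {"a": 2}))  # True
```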
We say that a Hamilton cycle $C=(x_1,\ldots,x_n)$ in a graph $G$ is $k$-symmetric if the mapping $x_i\mapsto x_{i+n/k}$ for all $i=1,\ldots,n$, where indices are considered modulo $n$, is an automorphism of $G$. In other words, if we lay out the vertices $x_1,\ldots,x_n$ equidistantly on a circle and draw the edges of $G$ as straight lines, then the drawing of $G$ has $k$-fold rotational symmetry, i.e., all information about the graph is compressed into a $360^\circ/k$ wedge of the drawing. The maximum $k$ for which there exists a $k$-symmetric Hamilton cycle in $G$ is referred to as the Hamilton compression of $G$. We investigate the Hamilton compression of four different families of vertex-transitive graphs, namely hypercubes, Johnson graphs, permutahedra and Cayley graphs of abelian groups. In several cases we determine their Hamilton compression exactly, and in other cases we provide close lower and upper bounds. The constructed cycles have a much higher compression than several classical Gray codes known from the literature. Our constructions also yield Gray codes for bitstrings, combinations and permutations that have few tracks and/or that are balanced.
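The definition of $k$-symmetry can be checked directly: rotate the cycle by $n/k$ positions and verify that every edge maps to an edge. A minimal sketch for simple graphs (parallel edges of a multigraph would need a multiset check):

```python
def is_k_symmetric(cycle, edges, k):
    """Check whether the Hamilton cycle (x_1, ..., x_n), given as a vertex
    sequence, is k-symmetric in the graph with the given edge list, i.e.
    whether the rotation x_i -> x_{i + n/k} (indices mod n) is an
    automorphism of the graph."""
    n = len(cycle)
    if n % k != 0:
        return False
    shift = n // k
    rho = {cycle[i]: cycle[(i + shift) % n] for i in range(n)}
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset({rho[u], rho[v]}) in edge_set for u, v in edges)

# the 4-cycle itself is 4-symmetric; adding the chord {0, 2} breaks the
# quarter-turn symmetry but keeps the half-turn one, so compression drops to 2
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(is_k_symmetric([0, 1, 2, 3], c4, 4))             # True
print(is_k_symmetric([0, 1, 2, 3], c4 + [(0, 2)], 4))  # False
print(is_k_symmetric([0, 1, 2, 3], c4 + [(0, 2)], 2))  # True
```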
We study the classical problem of approximating a non-decreasing function $f: \mathcal{X} \to \mathcal{Y}$ in $L^p(\mu)$ norm by sequentially querying its values, for known compact real intervals $\mathcal{X}$, $\mathcal{Y}$ and a known probability measure $\mu$ on $\mathcal{X}$. For any function~$f$ we characterize the minimum number of evaluations of $f$ that algorithms need to guarantee an approximation $\hat{f}$ with an $L^p(\mu)$ error below $\epsilon$ after stopping. Unlike worst-case results that hold uniformly over all $f$, our complexity measure depends on each specific function $f$. To address this problem, we introduce GreedyBox, a generalization of an algorithm originally proposed by Novak (1992) for numerical integration. We prove that GreedyBox achieves an optimal sample complexity for any function $f$, up to logarithmic factors. Additionally, we uncover results regarding piecewise-smooth functions. Perhaps as expected, the $L^p(\mu)$ error of GreedyBox decreases much faster for piecewise-$C^2$ functions than predicted by the algorithm (without any knowledge of the smoothness of $f$). A simple modification even achieves optimal minimax approximation rates for such functions, which we compute explicitly. In particular, our findings highlight multiple performance gaps between adaptive and non-adaptive algorithms, between smooth and piecewise-smooth functions, and between monotone and non-monotone functions. Finally, we provide numerical experiments to support our theoretical results.
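The key structural fact exploited here is that monotonicity traps $f$ over each queried cell inside a box, so the approximation error can be certified from the queries alone. The sketch below is a greedy "largest box first" refinement in that spirit; it is our own illustration of the box principle, not the GreedyBox algorithm of the paper.

```python
import bisect

def monotone_boxes(f, a, b, budget):
    """Greedy box refinement for a non-decreasing f on [a, b].

    After querying f at points a = t_0 < ... < t_m = b, monotonicity traps f
    over each cell [t_i, t_{i+1}] inside the box
    [t_i, t_{i+1}] x [f(t_i), f(t_{i+1})], so the midpoint step function has
    L^1(uniform) error at most half the total box area.  We greedily split
    the cell whose box has the largest area (an illustration only, not the
    paper's GreedyBox).
    """
    ts = [a, b]
    vals = {a: f(a), b: f(b)}
    for _ in range(budget - 2):
        # refine the cell with the largest uncertainty box
        i = max(range(len(ts) - 1),
                key=lambda j: (ts[j + 1] - ts[j]) * (vals[ts[j + 1]] - vals[ts[j]]))
        mid = 0.5 * (ts[i] + ts[i + 1])
        vals[mid] = f(mid)
        bisect.insort(ts, mid)
    err_bound = 0.5 * sum((ts[i + 1] - ts[i]) * (vals[ts[i + 1]] - vals[ts[i]])
                          for i in range(len(ts) - 1))
    return ts, err_bound

ts, err = monotone_boxes(lambda x: x ** 3, 0.0, 1.0, budget=20)
print(err)  # certified L^1 error bound for the midpoint step approximation
```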
Let $S_{p,n}$ denote the sample covariance matrix based on $n$ independent identically distributed $p$-dimensional random vectors in the null-case. The main result of this paper is an explicit expansion of trace moments and power-trace covariances of $S_{p,n}$ simultaneously for both high- and low-dimensional data. To this end we expand a well-known ansatz of describing trace moments as weighted sums over routes or graphs. The novelty of our approach is an inherent coloring of the examined graphs and a decomposition of graphs into their tree-structure and their \textit{seed graphs}, which allows for some elegant formulas explaining the effect of the tree structures on the number of Euler-tours. The weighted sums over graphs become weighted sums over the possible seed graphs, which in turn are much easier to analyze.
We consider nonconvex-concave minimax problems, $\min_{\mathbf{x}} \max_{\mathbf{y} \in \mathcal{Y}} f(\mathbf{x}, \mathbf{y})$, where $f$ is nonconvex in $\mathbf{x}$ but concave in $\mathbf{y}$ and $\mathcal{Y}$ is a convex and bounded set. One of the most popular algorithms for solving this problem is the celebrated gradient descent ascent (GDA) algorithm, which has been widely used in machine learning, control theory and economics. Despite the extensive convergence results for the convex-concave setting, GDA with equal stepsize can converge to limit cycles or even diverge in a general setting. In this paper, we present the complexity results on two-time-scale GDA for solving nonconvex-concave minimax problems, showing that the algorithm can find a stationary point of the function $\Phi(\cdot) := \max_{\mathbf{y} \in \mathcal{Y}} f(\cdot, \mathbf{y})$ efficiently. To the best of our knowledge, this is the first nonasymptotic analysis for two-time-scale GDA in this setting, shedding light on its superior practical performance in training generative adversarial networks (GANs) and other real applications.
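Two-time-scale GDA simply runs gradient descent in $\mathbf{x}$ with a small stepsize and projected gradient ascent in $\mathbf{y}$ with a larger one, so that $\mathbf{y}$ approximately tracks $\operatorname{argmax}_{\mathbf{y}} f(\mathbf{x}, \mathbf{y})$. A minimal sketch on a toy nonconvex-concave objective (the test problem and stepsizes are our own illustration):

```python
import math

def two_time_scale_gda(grad_x, grad_y, x0, y0, proj_y, eta_x, eta_y, iters):
    """Two-time-scale GDA: descend in x with a small stepsize, ascend in y
    with a larger stepsize (projected onto the bounded set Y), so y
    approximately tracks argmax_y f(x, y) while x moves slowly."""
    x, y = x0, y0
    for _ in range(iters):
        x = x - eta_x * grad_x(x, y)
        y = proj_y(y + eta_y * grad_y(x, y))
    return x, y

# toy problem: f(x, y) = cos(x) + x*y - y^2/2 on Y = [-2, 2]
# (nonconvex in x via cos, strongly concave in y); for |x| <= 2 the inner
# max is attained at y = x, so Phi(x) = cos(x) + x^2/2 and
# Phi'(x) = x - sin(x), which vanishes at x = 0.
gx = lambda x, y: y - math.sin(x)
gy = lambda x, y: x - y
proj = lambda y: max(-2.0, min(2.0, y))
x, y = two_time_scale_gda(gx, gy, 1.0, 0.5, proj, eta_x=0.01, eta_y=0.5, iters=2000)
print(x - math.sin(x))  # approximate gradient of Phi at x, close to 0
```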
Consider the point-line geometry ${\mathcal P}_t(n,k)$ having as points all the $[n,k]$-linear codes having minimum dual distance at least $t+1$ and where two points $X$ and $Y$ are collinear whenever $X\cap Y$ is an $[n,k-1]$-linear code having minimum dual distance at least $t+1$. We are interested in the collinearity graph $\Lambda_t(n,k)$ of ${\mathcal P}_t(n,k).$ The graph $\Lambda_t(n,k)$ is a subgraph of the Grassmann graph and also a subgraph of the graph $\Delta_t(n,k)$ of the linear codes having minimum dual distance at least $t+1$ introduced in~[M. Kwiatkowski, M. Pankov, On the distance between linear codes, Finite Fields Appl. 39 (2016), 251--263, doi:10.1016/j.ffa.2016.02.004, arXiv:1506.00215]. We shall study the structure of $\Lambda_t(n,k)$ in relation to that of $\Delta_t(n,k)$ and we will characterize the set of its isolated vertices. We will then focus on $\Lambda_1(n,k)$ and $\Lambda_2(n,k)$, providing necessary and sufficient conditions for them to be connected.
This work considers the low-rank approximation of a matrix $A(t)$ depending on a parameter $t$ in a compact set $D \subset \mathbb{R}^d$. Application areas that give rise to such problems include computational statistics and dynamical systems. Randomized algorithms are an increasingly popular approach for performing low-rank approximation and they usually proceed by multiplying the matrix with random dimension reduction matrices (DRMs). Applying such algorithms directly to $A(t)$ would involve different, independent DRMs for every $t$, which is not only expensive but also leads to inherently non-smooth approximations. In this work, we propose to use constant DRMs, that is, $A(t)$ is multiplied with the same DRM for every $t$. The resulting parameter-dependent extensions of two popular randomized algorithms, the randomized singular value decomposition and the generalized Nystr\"{o}m method, are computationally attractive, especially when $A(t)$ admits an affine linear decomposition with respect to $t$. We perform a probabilistic analysis for both algorithms, deriving bounds on the expected value as well as failure probabilities for the approximation error when using Gaussian random DRMs. Both the theoretical results and the numerical experiments show that the use of constant DRMs does not impair their effectiveness; our methods reliably return quasi-best low-rank approximations.
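The constant-DRM idea for the randomized SVD can be sketched as follows: draw one Gaussian sketch matrix $\Omega$ and reuse it for every $t$, so the resulting approximation varies smoothly with the parameter. The toy affine family and parameter choices below are our own illustration.

```python
import numpy as np

def param_rsvd(A_of_t, n, rank, oversample=5, rng=None):
    """Parameter-dependent randomized SVD with a *constant* dimension
    reduction matrix: the same Gaussian sketch Omega is reused for every t,
    so the rank-r approximation Q(t) B(t) depends smoothly on t."""
    rng = np.random.default_rng(rng)
    Omega = rng.standard_normal((n, rank + oversample))  # drawn once, shared across t
    def approximate(t):
        A = A_of_t(t)
        Q, _ = np.linalg.qr(A @ Omega)   # orthonormal basis for the range sketch
        return Q, Q.T @ A                # A(t) ~= Q (Q^T A(t))
    return approximate

# toy affine family A(t) = A0 + t*A1 with rapidly decaying singular values
rng = np.random.default_rng(0)
U = np.linalg.qr(rng.standard_normal((60, 60)))[0]
V = np.linalg.qr(rng.standard_normal((60, 60)))[0]
s = 2.0 ** -np.arange(60.0)
A0 = U @ np.diag(s) @ V.T
A1 = U[:, ::-1] @ np.diag(s) @ V.T
A_of_t = lambda t: A0 + t * A1

approx = param_rsvd(A_of_t, n=60, rank=15, rng=1)
Q, B = approx(0.3)
err = np.linalg.norm(A_of_t(0.3) - Q @ B) / np.linalg.norm(A_of_t(0.3))
print(err)  # small relative error thanks to the fast singular value decay
```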
Given a Boolean formula $\Phi(X, Y, Z)$, the Max\#SAT problem asks to find a partial model on the set of variables X maximizing the number of its projected models over the set of variables Y. We investigate a strict generalization of Max\#SAT allowing dependencies for variables in X, effectively turning it into a synthesis problem. We show that this new problem, called DQMax\#SAT, subsumes both the DQBF and DSSAT problems. We provide a general resolution method, based on a reduction to Max\#SAT, together with two improvements for dealing with its inherent complexity. We further discuss a concrete application of DQMax\#SAT for symbolic synthesis of adaptive attackers in the field of program security. Finally, we report preliminary results obtained on the resolution of benchmark problems using a prototype DQMax\#SAT solver implementation.
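For tiny instances, the Max\#SAT objective can be evaluated by brute force: for each assignment to X, count the assignments to Y that can be extended over Z to satisfy the formula. The encoding and example formula below are our own illustration, not from the paper.

```python
from itertools import product

def max_count_sat(phi, X, Y, Z):
    """Brute-force Max#SAT: return the assignment to X maximizing the number
    of assignments to Y for which some assignment to Z satisfies phi.
    phi takes a dict mapping variable names to booleans."""
    best = (-1, None)
    for ax in product([False, True], repeat=len(X)):
        mx = dict(zip(X, ax))
        count = 0
        for ay in product([False, True], repeat=len(Y)):
            my = dict(zip(Y, ay))
            # projected model: exists an extension over Z satisfying phi
            if any(phi({**mx, **my, **dict(zip(Z, az))})
                   for az in product([False, True], repeat=len(Z))):
                count += 1
        if count > best[0]:
            best = (count, mx)
    return best

# phi = (x1 or y1) and (y2 or z1): setting x1 = True frees y1, and z1 can
# always repair the second clause, so all 4 assignments to (y1, y2) count
phi = lambda m: (m["x1"] or m["y1"]) and (m["y2"] or m["z1"])
count, model = max_count_sat(phi, ["x1"], ["y1", "y2"], ["z1"])
print(count, model)  # 4 {'x1': True}
```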
In this note, we give sufficient conditions for the almost sure convergence and the convergence in $\mathbb{L}^p$ of a $U$-statistic of order $m$ built on a strictly stationary but not necessarily ergodic sequence.
Let $f=(f_0,f_1,\dots, f_{\nu-1})$ be a collection of one-to-one functions from some space~$X$ into itself such that the sets $f_j(X)$ are disjoint. If $w=w_1w_2\cdots w_k$ is a word on the alphabet $\{0,1,\dots,\nu-1\}$, let $\Phi_{f,w} = f_{w_1}\circ f_{w_2}\circ\cdots\circ f_{w_k}$. Given a function~$F$ that is known to be of the form $\Phi_{f,w}$, it is easy to recover~$w$. We give some examples of this situation where everything can be scrambled up by using some private key to get a new system $g=(g_0,g_1,\dots,g_{\nu-1})$ on another set~$Y$ in such a way that the images of the $g_j$ are no longer disjoint. We define a cryptosystem whose public key is~$g$. The message to be encrypted is a word~$w$ and the associated cryptogram is $\Phi_{g,w}$. The private key allows one to recover $\Phi_{f,w}$ from $\Phi_{g,w}$.
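A toy instance of the underlying mechanism, with $f_j(x) = \nu x + j$ on the positive integers, where the images $f_j(X)$ are the residue classes mod $\nu$ and hence disjoint, so $w$ is recovered digit by digit. This is our own illustration of why disjoint images make recovery easy; the actual scheme hides this structure behind the private key.

```python
def compose(fs, w):
    """Phi_{f,w} = f_{w_1} o f_{w_2} o ... o f_{w_k} for a word w over the
    alphabet {0, ..., nu-1}: the last letter is applied first."""
    def F(x):
        for j in reversed(w):
            x = fs[j](x)
        return x
    return F

# toy system on X = positive integers: f_j(x) = nu*x + j, so the images
# f_j(X) are disjoint residue classes mod nu
nu = 3
fs = [lambda x, j=j: nu * x + j for j in range(nu)]

def recover_word(F, x0=1):
    """Recover w from F = Phi_{f,w} by peeling: the residue of F(x0) mod nu
    reveals the outermost letter; divide by nu and repeat until x0 returns.
    (x0 = 1 works because every intermediate value is at least nu.)"""
    y, w = F(x0), []
    while y != x0:
        w.append(y % nu)
        y //= nu
    return w

w = [2, 0, 1, 1]
F = compose(fs, w)
print(recover_word(F))  # [2, 0, 1, 1]
```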