We consider the problem of deterministically enumerating all minimum $k$-cut-sets in a given hypergraph for any fixed $k$. The input here is a hypergraph $G = (V, E)$ with non-negative hyperedge costs. A subset $F$ of hyperedges is a $k$-cut-set if the number of connected components in $G - F$ is at least $k$, and it is a minimum $k$-cut-set if it has the least cost among all $k$-cut-sets. For fixed $k$, we call the problem of finding a minimum $k$-cut-set Hypergraph-$k$-Cut and the problem of enumerating all minimum $k$-cut-sets Enum-Hypergraph-$k$-Cut. The special cases of Hypergraph-$k$-Cut and Enum-Hypergraph-$k$-Cut restricted to graph inputs are well known to be solvable in (randomized as well as deterministic) polynomial time. In contrast, it is only recently that polynomial-time algorithms for Hypergraph-$k$-Cut were developed. The analysis of the randomized polynomial-time algorithm for Hypergraph-$k$-Cut designed in 2018 (Chandrasekaran, Xu, and Yu, SODA 2018) showed that the number of minimum $k$-cut-sets in a hypergraph is $O(n^{2k-2})$, where $n$ is the number of vertices in the input hypergraph, and that they can all be enumerated in randomized polynomial time, thus resolving Enum-Hypergraph-$k$-Cut in randomized polynomial time. A deterministic polynomial-time algorithm for Hypergraph-$k$-Cut was subsequently designed in 2020 (Chandrasekaran and Chekuri, FOCS 2020), but it is not guaranteed to enumerate all minimum $k$-cut-sets. In this work, we give the first deterministic polynomial-time algorithm to solve Enum-Hypergraph-$k$-Cut (this is non-trivial even for $k = 2$). Our algorithms are based on new structural results that allow all minimum $k$-cut-sets to be recovered efficiently by solving minimum $(S,T)$-terminal cut problems. Our techniques give new structural insights even for enumerating all minimum cut-sets (i.e., minimum 2-cut-sets) in a given hypergraph.
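As background for the $(S,T)$-terminal cut subroutine: a minimum $(S,T)$-terminal cut in a hypergraph reduces to an $s$-$t$ maximum-flow computation in a digraph via Lawler's classic node-splitting construction. Below is a minimal sketch of that reduction (using networkx; it illustrates the subroutine, not the enumeration algorithm itself, and assumes "s"/"t" are not existing vertex ids).

```python
# Sketch of Lawler's classic reduction: a minimum (S,T)-terminal cut in a
# hypergraph becomes an s-t max-flow/min-cut computation in a digraph.
# Illustrates the subroutine only, not the enumeration algorithm.
import networkx as nx

def hypergraph_st_cut(vertices, hyperedges, costs, S, T):
    """hyperedges: list of vertex sets; costs: matching non-negative costs;
    S, T: disjoint terminal sets. Returns (cut cost, hyperedge indices)."""
    D = nx.DiGraph()
    D.add_nodes_from(vertices)
    for i, (e, c) in enumerate(zip(hyperedges, costs)):
        a, b = ("in", i), ("out", i)   # split node pair for hyperedge i
        D.add_edge(a, b, capacity=c)   # cutting this arc = deleting e
        for v in e:                    # no capacity attr = infinite capacity
            D.add_edge(v, a)
            D.add_edge(b, v)
    for v in S:                        # attach terminals to s and t
        D.add_edge("s", v)
    for v in T:
        D.add_edge(v, "t")
    value, (side_s, _) = nx.minimum_cut(D, "s", "t")
    cut = [i for i in range(len(hyperedges))
           if ("in", i) in side_s and ("out", i) not in side_s]
    return value, cut
```

Since only the split arcs carry finite capacity, a minimum $s$-$t$ cut in the digraph corresponds exactly to a minimum-cost set of deleted hyperedges separating $S$ from $T$.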
Problems on repeated geometric patterns in finite point sets in Euclidean space are extensively studied in the literature of combinatorial and computational geometry. Such problems trace their inspiration to Erd\H{o}s' original work on the topic. In this paper, we investigate the particular case of finding scaled copies of any pattern within a set of $n$ points, that is, the algorithmic task of efficiently enumerating all such copies. We initially focus on one particularly simple pattern, axis-parallel squares, and present an algorithm with $O(n\sqrt{n})$ running time and $O(n)$ space for this task, based on a combination of bucket-based and sweep-line techniques. Our algorithm is worst-case optimal, as it matches the known lower bound of $\Omega(n\sqrt{n})$ on the maximum number of axis-parallel squares determined by $n$ points in the plane, thereby resolving a question, open for more than three decades, of whether this bound can be algorithmically realized for this pattern. We extend our result to an algorithm that enumerates all copies, up to scaling, of any full-dimensional fixed set of points in $d$-dimensional Euclidean space, which works in time $O(n^{1+1/d})$ and space $O(n)$, also matching the corresponding lower bound due to Elekes and Erd\H{o}s.
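For intuition about the enumeration task, here is a simple quadratic-time baseline (not the paper's $O(n\sqrt{n})$ algorithm): hash the point set and, for every pair of points on a common vertical line, test the one square having that pair as its left side.

```python
# Naive baseline for enumerating axis-parallel squares: hash the points,
# then for each pair on a common vertical line (a candidate left side),
# test the two remaining corners. O(n^2) in the worst case; shown only
# for intuition -- the paper's algorithm achieves O(n * sqrt(n)).
def axis_parallel_squares(points):
    pts = set(points)
    by_x = {}
    for x, y in pts:
        by_x.setdefault(x, []).append(y)
    squares = []
    for x, ys in by_x.items():
        ys.sort()
        for i, y1 in enumerate(ys):
            for y2 in ys[i + 1:]:
                s = y2 - y1                      # candidate side length
                if (x + s, y1) in pts and (x + s, y2) in pts:
                    squares.append(((x, y1), (x, y2),
                                    (x + s, y1), (x + s, y2)))
    return squares
```

Each square is reported exactly once, via its left side; the quadratic worst case arises when many points share an x-coordinate, which is precisely what the bucket and sweep-line machinery of the paper avoids.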
We prove a bound of $O(k(n+m)\log^{d-1} n)$ on the number of incidences between $n$ points and $m$ axis-parallel boxes in $\mathbb{R}^d$, provided that no $k$ boxes contain $k$ common points; that is, the incidence graph between the points and the boxes does not contain $K_{k,k}$ as a subgraph. This new bound improves over previous work by a factor of $\log^d n$, for $d > 2$. We also study the variant of the problem for points and halfspaces, where we use shallow cuttings to get a near-linear bound in two and three dimensions.
Given a graph, the shortest-path problem requires finding a sequence of edges with minimum cumulative length that connects a source vertex to a target vertex. We consider a generalization of this classical problem in which the position of each vertex in the graph is a continuous decision variable, constrained to lie in a corresponding convex set. The length of an edge is then defined as a convex function of the positions of the vertices it connects. Problems of this form arise naturally in motion planning of autonomous vehicles, robot navigation, and even optimal control of hybrid dynamical systems. The price for this wide applicability is the complexity of the problem, which is easily seen to be NP-hard. Our main contribution is a strong mixed-integer convex formulation based on perspective functions. This formulation has a very tight convex relaxation and makes it possible to efficiently find globally optimal paths in large graphs and in high-dimensional spaces.
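To make the role of perspective functions concrete, here is the standard convex-analysis identity that such formulations build on (a sketch in our own notation, not the paper's exact model):

```latex
% Standard perspective identity (our notation, not the paper's exact model).
% For a convex edge-length function f, the perspective
\[
  \tilde f(x,\lambda) \;=\; \lambda\, f\!\left(\frac{x}{\lambda}\right),
  \qquad \lambda > 0,
\]
% is jointly convex in (x, \lambda). With a binary activation y_e \in \{0,1\}
% for edge e = (u,v) and auxiliary position copies z_u^e, z_v^e that vanish
% when y_e = 0, the nonconvex product y_e\, f(x_u, x_v) is replaced by the
% convex epigraph constraint
\[
  \ell_e \;\ge\; y_e\, f\!\left(\frac{z_u^e}{y_e}, \frac{z_v^e}{y_e}\right),
\]
% whose continuous relaxation (y_e \in [0,1]) is typically much tighter
% than a big-M encoding.
```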
Enumerating all connected subgraphs of a given order from a graph is a computationally challenging task. In this paper, we propose two algorithms for enumerating all connected induced subgraphs of a given order from connected undirected graphs. The first algorithm is a variant of a previous well-known algorithm; it enumerates all connected induced subgraphs of order $k$ in a bottom-up manner, and we present data structures that yield unit-time element checking and linear space. Unlike previous algorithms, which work either bottom-up or in a reverse-search manner, our second algorithm enumerates all connected induced subgraphs of order $k$ in a top-down manner by recursively deleting vertices; we also present the data structures used in its implementation. The correctness and complexity of the top-down algorithm are analysed and proven. Experimental results show that the bottom-up variant outperforms the other algorithms for enumerating connected induced subgraphs of small order, and that the top-down algorithm is the fastest among state-of-the-art algorithms for enumerating connected induced subgraphs of large order.
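For reference, a minimal bottom-up enumeration in the spirit of the well-known routine that the first method builds on (a textbook ESU-style sketch, not the paper's optimized variant with its specialized data structures):

```python
# Minimal bottom-up enumeration of all connected induced subgraphs of
# order k, in the spirit of the classic ESU-style routine (not the paper's
# optimized variant). adj maps each vertex to its set of neighbours;
# vertex labels are assumed comparable (e.g., integers).
def connected_induced_subgraphs(adj, k):
    results = []

    def extend(sub, ext, root):
        if len(sub) == k:
            results.append(frozenset(sub))
            return
        ext = set(ext)
        while ext:
            w = ext.pop()  # w stays removed afterwards: no duplicates
            # new candidates: neighbours of w, larger than root, not in
            # sub and not adjacent to sub (the "exclusive" neighbourhood)
            new_ext = ext | {u for u in adj[w]
                             if u > root and u not in sub
                             and all(u not in adj[x] for x in sub)}
            extend(sub | {w}, new_ext, root)

    for v in sorted(adj):
        extend({v}, {u for u in adj[v] if u > v}, v)
    return results
```

Rooting each search at the smallest vertex of the subgraph and only extending with larger, previously unexplored candidates guarantees that every connected induced subgraph of order $k$ is output exactly once.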
Mining maximal subgraphs with cohesive structures from a bipartite graph has been widely studied. One important cohesive structure on bipartite graphs is the k-biplex, in which each vertex on one side is non-adjacent to at most k vertices on the other side. In this paper, we study the maximal k-biplex enumeration problem, which enumerates all maximal k-biplexes. Existing methods suffer from efficiency and/or scalability issues, and the time they wait between consecutive outputs can be exponential in the size of the input bipartite graph (i.e., an exponential delay). In this paper, we adopt a reverse search framework called bTraversal, which corresponds to a depth-first search (DFS) procedure on an implicit solution graph over all maximal k-biplexes. We then develop a series of techniques for improving and implementing this framework, including (1) carefully selecting an initial solution from which to start the DFS, (2) pruning the vast majority of links from the solution graph of bTraversal, and (3) implementing the abstract procedures of the framework. The resulting algorithm, called iTraversal, has an underlying solution graph that is significantly sparser than (around 0.1% of) that of bTraversal. Moreover, iTraversal provides a guarantee of polynomial delay. Our experimental results on real and synthetic graphs, the largest of which contains more than one billion edges, show that our algorithm is up to four orders of magnitude faster than existing algorithms.
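To pin down the definition, a minimal membership test (the helper name and graph representation are ours):

```python
# Minimal k-biplex membership test (helper name and representation ours):
# (L, R) is a k-biplex if every vertex of L is non-adjacent to at most k
# vertices of R, and every vertex of R is non-adjacent to at most k of L.
def is_k_biplex(adj, L, R, k):
    """adj: dict mapping each vertex to the set of its neighbours on the
    opposite side; L, R: sets of vertices from the two sides."""
    return (all(len(R - adj[u]) <= k for u in L) and
            all(len(L - adj[v]) <= k for v in R))
```

Setting k = 0 recovers the familiar biclique; larger k tolerates up to k missing edges per vertex, which is what makes the enumeration space so much larger.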
This study concerns estimating the probability distribution of the sample maximum. The traditional approach is parametric fitting of the limiting distribution, the generalized extreme value distribution; however, in finite samples this model is misspecified to a certain extent. We propose a plug-in type of kernel distribution estimator that does not require model specification. We prove that the asymptotic convergence rates of both estimators depend on the tail index and the second-order parameter. As the tail gets lighter, the degree of misspecification of the parametric fit becomes larger, which means its convergence rate becomes slower. In the Weibull case, which can be seen as the limit of tail lightness, only the nonparametric distribution estimator retains its consistency. Finally, we report the results of numerical experiments and two real case studies.
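A minimal sketch of one plausible instantiation of such a plug-in estimator (the paper specifies the precise estimator; the Gaussian kernel, rule-of-thumb bandwidth, and function names below are illustrative assumptions): estimate the underlying CDF $F$ with a kernel distribution estimator $\hat F$ and plug it into $\hat F^m$ to approximate the distribution of the maximum of $m$ i.i.d. draws.

```python
# Illustrative plug-in kernel estimator for the distribution of the sample
# maximum (the paper's exact estimator may differ; kernel and bandwidth
# choices here are standard assumptions, not the authors' specification).
import numpy as np
from scipy.stats import norm

def kernel_cdf(x, data, h):
    """Kernel distribution estimator: average of smoothed indicators."""
    return norm.cdf((x - data[:, None]) / h).mean(axis=0)

def max_distribution(x, data, m, h=None):
    """Plug-in estimate of P(max of m i.i.d. draws <= x): F_hat(x)**m."""
    data = np.asarray(data, dtype=float)
    if h is None:                      # rule-of-thumb bandwidth
        h = 1.06 * data.std() * len(data) ** (-1 / 5)
    return kernel_cdf(np.atleast_1d(x), data, h) ** m
```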
We give a new deterministic construction of integer sensing matrices that can be used for the recovery of integer-valued signals in compressed sensing. This is a family of $n \times d$ integer matrices, $d \geq n$, with bounded sup-norm and the property that no $\ell$ column vectors are linearly dependent, for a given $\ell \leq n$. Further, if $\ell = o(\log n)$, then $d/n \to \infty$ as $n \to \infty$. Our construction comes from particular sets of difference vectors of point sets in $\mathbb R^n$ that cannot be covered by few parallel hyperplanes. We construct examples of such sets on the $0, \pm 1$ grid and use them for the matrix construction. We also show a connection between our constructions and a simple version of the Tarski plank problem.
We present a quantum algorithm for the Longest Trail Problem: given a graph with $n$ vertices and $m$ edges, find a longest edge-simple path, where edge-simple means that no edge occurs in the path twice, although vertices may occur several times. The running time of our algorithm is $O^*(1.728^m)$.
The problem of Approximate Nearest Neighbor (ANN) search is fundamental in computer science and has benefited from significant progress in the past couple of decades. However, most work has been devoted to point sets, whereas complex shapes have not been sufficiently treated. Here, we focus on distance functions between discretized curves in Euclidean space: they appear in a wide range of applications, from road segments to time series in general dimension. For $\ell_p$-products of Euclidean metrics, for any $p$, we design simple and efficient data structures for ANN based on randomized projections, which are of independent interest. They serve to solve proximity problems under a notion of distance between discretized curves which generalizes both the discrete Fr\'echet and Dynamic Time Warping distances, the two most popular and practical approaches to comparing such curves. We offer the first data structures and query algorithms for ANN with an arbitrarily good approximation factor, at the expense of increased space usage and preprocessing time over existing methods. Our query time complexity is comparable to, or significantly better than, that of existing methods; our algorithms are especially efficient when the length of the curves is bounded.
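For concreteness, one of the two special cases the distance notion generalizes, the discrete Fr\'echet distance, admits a standard quadratic-time dynamic program; a minimal sketch:

```python
# Standard O(|P|*|Q|) dynamic program for the discrete Fréchet distance,
# one of the two special cases generalized by the curve distance above.
import math

def discrete_frechet(P, Q):
    """P, Q: lists of points (tuples) in Euclidean space."""
    n, m = len(P), len(Q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = cost
            elif i == 0:
                ca[i][j] = max(ca[i][j - 1], cost)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][j], cost)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i][j - 1],
                                   ca[i - 1][j - 1]), cost)
    return ca[n - 1][m - 1]
```

Replacing the max/min recursion by a sum over matched pairs yields Dynamic Time Warping, which is why a single $\ell_p$-product framework can capture both.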
In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
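The smoothing underlying DRS follows the standard randomized-smoothing idea: replace $f$ by $f_\gamma(x) = \mathbb{E}[f(x + \gamma Z)]$ with Gaussian $Z$, which is differentiable even when $f$ is merely Lipschitz. A minimal sketch of the usual gradient estimator for the smoothed function (illustrative only, not the authors' full decentralized method):

```python
# Standard randomized-smoothing gradient estimator (illustrative, not the
# authors' full decentralized DRS method): f_gamma(x) = E[f(x + gamma*Z)],
# Z ~ N(0, I), is differentiable even for merely Lipschitz f, and its
# gradient admits an unbiased Gaussian finite-difference estimate.
import numpy as np

def smoothed_grad(f, x, gamma, num_samples=16, rng=None):
    """Unbiased estimate of grad f_gamma(x); f(x) acts as a control
    variate that reduces variance without introducing bias."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    g = np.zeros(d)
    for _ in range(num_samples):
        z = rng.standard_normal(d)
        g += (f(x + gamma * z) - f(x)) / gamma * z
    return g / num_samples
```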