The caterpillar associahedron $\mathcal{A}(G)$ is a polytope arising from the rotation graph of search trees on a caterpillar tree $G$, generalizing the rotation graph of binary search trees (BSTs) and thus the conventional associahedron. We show that the diameter of $\mathcal{A}(G)$ is $\Theta(n + m \cdot (H+1))$, where $n$ is the number of vertices, $m$ is the number of leaves, and $H$ is the entropy of the leaf distribution of $G$. Our proofs reveal a strong connection between caterpillar associahedra and searching in BSTs. We prove the lower bound using Wilber's first lower bound for dynamic BSTs, and the upper bound by reducing the problem to searching in static BSTs.
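The entropy $H$ in the diameter bound above is the Shannon entropy of the leaf distribution of the caterpillar. As a minimal illustration (the list encoding of per-spine-vertex leaf counts is an assumption, not the paper's notation):

```python
import math

def leaf_entropy(leaf_counts):
    """Shannon entropy (base 2) of the leaf distribution of a caterpillar,
    given as leaf_counts[i] = number of leaves attached to the i-th spine
    vertex (a hypothetical encoding)."""
    m = sum(leaf_counts)
    # H = sum_i (c_i/m) * log2(m/c_i); every term is nonnegative
    return sum((c / m) * math.log2(m / c) for c in leaf_counts if c > 0)

# Evenly spread leaves maximize H; concentrating them drives H to 0.
print(leaf_entropy([4, 4, 4, 4]))  # 2.0
print(leaf_entropy([16]))          # 0.0
```

Note how $H$ interpolates between the two regimes of the bound: for concentrated leaves ($H = 0$) the diameter term degenerates to $\Theta(n + m)$.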
We study subtrajectory clustering under the Fr\'echet distance. Given one or more trajectories, the task is to split the trajectories into several parts such that the parts have a good clustering structure. We approach this problem via a new set cover formulation, which we believe provides a natural formalization of the problem as it is studied in many applications. Given a polygonal curve $P$ with $n$ vertices in fixed dimension, integers $k$, $\ell \geq 1$, and a real value $\Delta > 0$, the goal is to find $k$ center curves of complexity at most $\ell$ such that every point on $P$ is covered by a subtrajectory that has Fr\'echet distance at most $\Delta$ to one of the $k$ center curves. In many application scenarios, one is interested in finding clusters of small complexity, which is controlled by the parameter $\ell$. Our main result is a bicriteria approximation algorithm: if there exists a solution for given parameters $k$, $\ell$, and $\Delta$, then our algorithm finds a set of $k'$ center curves of complexity at most $\ell$ with covering radius $\Delta'$, where $k' \in O( k \ell^2 \log (k \ell))$ and $\Delta' \leq 19 \Delta$. Moreover, within these approximation bounds, we can minimize $k$ while keeping the other parameters fixed. If $\ell$ is a constant independent of $n$, then the approximation factor for the number of clusters $k$ is $O(\log k)$ and the approximation factor for the radius $\Delta$ is constant. In this case, the algorithm has expected running time $\tilde{O}( k m^2 + mn)$ and uses $O(n+m)$ space, where $m=\lceil\frac{L}{\Delta}\rceil$ and $L$ is the total arc length of the curve $P$.
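For intuition about the distance measure used above: the *discrete* Fréchet distance between two vertex sequences admits a classical dynamic program. The paper works with the continuous Fréchet distance, so this is only an illustrative proxy, not the paper's algorithm:

```python
import math
from functools import lru_cache

def discrete_frechet(P, Q):
    """Discrete Frechet distance between polygonal curves given as
    vertex sequences: the cheapest simultaneous traversal of both
    curves, where cost is the maximum pointwise distance seen."""
    def d(i, j):
        return math.dist(P[i], Q[j])

    @lru_cache(maxsize=None)
    def c(i, j):
        # c(i, j) = best achievable max-distance coupling of P[:i+1], Q[:j+1]
        if i == 0 and j == 0:
            return d(0, 0)
        if i == 0:
            return max(c(0, j - 1), d(0, j))
        if j == 0:
            return max(c(i - 1, 0), d(i, 0))
        return max(min(c(i - 1, j), c(i, j - 1), c(i - 1, j - 1)), d(i, j))

    return c(len(P) - 1, len(Q) - 1)

# Two parallel segments at vertical distance 1.
print(discrete_frechet([(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)]))  # 1.0
```

In the set cover formulation, a subtrajectory is "covered" by a center curve exactly when such a distance is at most $\Delta$.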
A Bayesian Network is a directed acyclic graph (DAG) on a set of $n$ random variables (identified with the vertices); a Bayesian Network Distribution (BND) is a probability distribution on the random variables that is Markovian on the graph. A finite mixture of such models is the projection onto these variables of a BND on the larger graph which has an additional "hidden" (or "latent") random variable $U$, ranging in $\{1,\ldots,k\}$, and a directed edge from $U$ to every other vertex. Models of this type are fundamental to research in Causal Inference, where $U$ models a confounding effect. One special case has long been of interest in the theory literature: the empty graph. Such a distribution is simply a mixture of $k$ product distributions. A longstanding problem has been, given the joint distribution of a mixture of $k$ product distributions, to identify each of the product distributions and their mixture weights. Our results are: (1) We improve the sample complexity (and runtime) for identifying mixtures of $k$ product distributions from $\exp(O(k^2))$ to $\exp(O(k \log k))$. This is almost best possible in view of a known $\exp(\Omega(k))$ lower bound. (2) We give the first algorithm for the case of non-empty graphs. The complexity for a graph of maximum degree $\Delta$ is $\exp(O(k(\Delta^2 + \log k)))$. (The above complexities are approximate and suppress dependence on secondary parameters.)
We consider the Minimum Convex Partition problem: given a set P of n points in the plane, draw a plane graph G on P, with positive minimum degree, such that G partitions the convex hull of P into a minimum number of convex faces. We show that Minimum Convex Partition is NP-hard, and we give several approximation algorithms, from an O(log OPT)-approximation running in O(n^8) time, where OPT denotes the minimum number of convex faces needed, to an O(sqrt(n) log n)-approximation algorithm running in O(n^2) time. We say that a point set is k-directed if the (straight) lines containing at least three points have up to k directions. We present an O(k)-approximation algorithm running in n^O(k) time. These hardness and approximation results also hold for the Minimum Convex Tiling problem, defined similarly but allowing the use of Steiner points. The approximation results are obtained by relating the problem to the Covering Points with Non-Crossing Segments problem. We show that this problem is NP-hard, and present an FPT algorithm. This allows us to obtain a constant-factor approximation FPT algorithm for the Minimum Convex Partition problem where the parameter is the number of faces.
Given $(a_1, \dots, a_n, t) \in \mathbb{Z}_{\geq 0}^{n + 1}$, the Subset Sum problem ($\mathsf{SSUM}$) is to decide whether there exists $S \subseteq [n]$ such that $\sum_{i \in S} a_i = t$. There is a close variant of $\mathsf{SSUM}$, called $\mathsf{Subset~Product}$: given positive integers $a_1, \ldots, a_n$ and a target integer $t$, the $\mathsf{Subset~Product}$ problem asks to determine whether there exists a subset $S \subseteq [n]$ such that $\prod_{i \in S} a_i = t$. There is a pseudopolynomial-time dynamic programming algorithm, due to Bellman (1957), which solves $\mathsf{SSUM}$ and $\mathsf{Subset~Product}$ in $O(nt)$ time and $O(t)$ space. In the first part, we present {\em search} algorithms for variants of the Subset Sum problem. Our algorithms are parameterized by $k$, which is a given upper bound on the number of realisable sets (i.e., the number of solutions, summing exactly to $t$). We show that $\mathsf{SSUM}$ with a unique solution is already NP-hard under randomized reductions. This makes the regime of parameterized algorithms, in terms of $k$, very interesting. Subsequently, we present an $\tilde{O}(k\cdot (n+t))$-time deterministic algorithm which finds the Hamming weights of all the realisable sets for a subset sum instance. We also give a poly$(knt)$-time and $O(\log(knt))$-space deterministic algorithm that finds all the realisable sets for a subset sum instance. In the latter part, we present a simple and elegant randomized $\tilde{O}(n + t)$-time algorithm for $\mathsf{Subset~Product}$. Moreover, we also present a poly$(nt)$-time and $O(\log^2 (nt))$-space deterministic algorithm for the same. We study these problems in the unbounded setting as well. Our algorithms use multivariate FFT, power series, and number-theoretic techniques introduced by Jin and Wu (SOSA'19) and Kane (2010).
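The Bellman (1957) dynamic program cited above is short enough to state in full. A minimal sketch of the $O(nt)$-time, $O(t)$-space table for the decision version of Subset Sum:

```python
def subset_sum(a, t):
    """Bellman's dynamic program: O(n*t) time, O(t) space.
    reachable[s] is True iff some subset of the items seen so far sums to s."""
    reachable = [False] * (t + 1)
    reachable[0] = True  # the empty subset sums to 0
    for x in a:
        # iterate s downwards so each item is used at most once
        for s in range(t, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True
    return reachable[t]

print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # True  (4 + 5)
print(subset_sum([3, 34, 4, 12, 5, 2], 30))  # False
```

The same table structure adapts to $\mathsf{Subset~Product}$ by indexing over divisors of $t$; the paper's $\tilde{O}(n+t)$ and low-space algorithms improve on this baseline via FFT and number-theoretic techniques.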
We consider the problem of untangling a given (non-planar) straight-line circular drawing $\delta_G$ of an outerplanar graph $G=(V, E)$ into a planar straight-line circular drawing by shifting a minimum number of vertices to a new position on the circle. For an outerplanar graph $G$, such a crossing-free circular drawing always exists, and we define the circular shifting number shift$(\delta_G)$ as the minimum number of vertices that need to be shifted in order to resolve all crossings of $\delta_G$. We show that the problem Circular Untangling, asking whether shift$(\delta_G) \le K$ for a given integer $K$, is NP-complete. For $n$-vertex outerplanar graphs, we obtain a tight upper bound of shift$(\delta_G) \le n - \lfloor\sqrt{n-2}\rfloor -2$. Based on these results, we study Circular Untangling for almost-planar circular drawings, in which a single edge is involved in all crossings. In this case, we provide a tight upper bound of shift$(\delta_G) \le \lfloor \frac{n}{2} \rfloor-1$ and present a constructive polynomial-time algorithm to compute the circular shifting number of almost-planar drawings.
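Crossings in a straight-line circular drawing are purely combinatorial: two chords cross iff exactly one endpoint of one chord lies between the endpoints of the other in the circular order. A brute-force sketch (the rank-based encoding of the drawing $\delta_G$ is an assumption):

```python
def chords_cross(a, b, c, d):
    """Do chords (a,b) and (c,d) of a circle cross? Endpoints are
    distinct circular ranks 0..n-1; the chords cross iff exactly one
    endpoint of (c,d) lies strictly between a and b."""
    lo, hi = min(a, b), max(a, b)
    return (lo < c < hi) != (lo < d < hi)

def count_crossings(edges, pos):
    """Crossing count of a straight-line circular drawing, where pos
    maps each vertex to its circular rank (an assumed encoding)."""
    cnt = 0
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            u, v = edges[i]
            x, y = edges[j]
            # chords sharing an endpoint never cross in the interior
            if len({u, v, x, y}) == 4 and chords_cross(pos[u], pos[v], pos[x], pos[y]):
                cnt += 1
    return cnt

pos = {0: 0, 1: 1, 2: 2, 3: 3}
print(count_crossings([(0, 2), (1, 3)], pos))  # 1 (the two diagonals cross)
print(count_crossings([(0, 1), (2, 3)], pos))  # 0
```

Shifting a vertex changes `pos` at one entry; shift$(\delta_G)$ asks for the fewest such changes that make the count zero.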
We study the problem of maximizing Nash welfare (MNW) while allocating indivisible goods to asymmetric agents. The Nash welfare of an allocation is the weighted geometric mean of agents' utilities, and the allocation with maximum Nash welfare is known to satisfy several desirable fairness and efficiency properties. However, computing such an MNW allocation is APX-hard (hard to approximate) in general, even when agents have additive valuation functions. Hence, we aim to identify tractable classes which either admit a polynomial-time approximation scheme (PTAS) or an exact polynomial-time algorithm. To this end, we design a PTAS for finding an MNW allocation for the case of asymmetric agents with identical, additive valuations, thus generalizing a similar result for symmetric agents. Our techniques can also be adapted to give a PTAS for the problem of computing the optimal $p$-mean welfare. We also show that an MNW allocation can be computed exactly in polynomial time for identical agents with $k$-ary valuations when $k$ is a constant, where every agent has at most $k$ different values for the goods. Next, we consider the special case where every agent finds at most two goods valuable, and show that this class admits an efficient algorithm, even for general monotone valuations. In contrast, we show that when agents can value three or more goods, maximizing Nash welfare is APX-hard, even when agents are symmetric and have additive valuations. Finally, we show that for constantly many asymmetric agents with additive valuations, the MNW problem admits a fully polynomial-time approximation scheme (FPTAS).
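For intuition about the objective above: the Nash welfare of an allocation is the weighted geometric mean of the agents' utilities. A small sketch, computed in log space to avoid overflow (the weight-normalization convention here is an assumption):

```python
import math

def nash_welfare(utilities, weights):
    """Weighted geometric mean prod_i u_i^(w_i / W) of agent utilities,
    with W = sum of weights (asymmetric agents)."""
    if any(u <= 0 for u in utilities):
        return 0.0  # the geometric mean vanishes if any agent gets utility 0
    total_w = sum(weights)
    log_nw = sum((w / total_w) * math.log(u) for u, w in zip(utilities, weights))
    return math.exp(log_nw)

# Symmetric agents reduce to the plain geometric mean: sqrt(4 * 9) = 6.
print(nash_welfare([4, 9], [1, 1]))  # approximately 6.0
# Increasing an agent's weight pulls the objective toward its utility.
print(nash_welfare([4, 9], [2, 1]))  # approximately 5.24
```

The hardness lies in searching over allocations, not in evaluating this objective: maximizing it over all partitions of the goods is what is APX-hard in general.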
A constraint satisfaction problem (CSP), $\textsf{Max-CSP}(\mathcal{F})$, is specified by a finite set of constraints $\mathcal{F} \subseteq \{[q]^k \to \{0,1\}\}$ for positive integers $q$ and $k$. An instance of the problem on $n$ variables is given by $m$ applications of constraints from $\mathcal{F}$ to subsequences of the $n$ variables, and the goal is to find an assignment to the variables that satisfies the maximum number of constraints. In the $(\gamma,\beta)$-approximation version of the problem for parameters $0 \leq \beta < \gamma \leq 1$, the goal is to distinguish instances where at least $\gamma$ fraction of the constraints can be satisfied from instances where at most $\beta$ fraction of the constraints can be satisfied. In this work we consider the approximability of this problem in the context of sketching algorithms and give a dichotomy result. Specifically, for every family $\mathcal{F}$ and every $\beta < \gamma$, we show that either a linear sketching algorithm solves the problem in polylogarithmic space, or the problem is not solvable by any sketching algorithm in $o(\sqrt{n})$ space.
A Boolean constraint satisfaction problem (CSP), $\textsf{Max-CSP}(f)$, is a maximization problem specified by a constraint $f:\{-1,1\}^k\to\{0,1\}$. An instance of the problem consists of $m$ constraint applications on $n$ Boolean variables, where each constraint application applies the constraint to $k$ literals chosen from the $n$ variables and their negations. The goal is to compute the maximum number of constraints that can be satisfied by a Boolean assignment to the $n$~variables. In the $(\gamma,\beta)$-approximation version of the problem for parameters $\gamma \geq \beta \in [0,1]$, the goal is to distinguish instances where at least $\gamma$ fraction of the constraints can be satisfied from instances where at most $\beta$ fraction of the constraints can be satisfied. In this work we consider the approximability of $\textsf{Max-CSP}(f)$ in the (dynamic) streaming setting, where constraints are inserted (and may also be deleted in the dynamic setting) one at a time. We completely characterize the approximability of all Boolean CSPs in the dynamic streaming setting. Specifically, given $f$, $\gamma$ and $\beta$ we show that either (1) the $(\gamma,\beta)$-approximation version of $\textsf{Max-CSP}(f)$ has a probabilistic dynamic streaming algorithm using $O(\log n)$ space, or (2) for every $\epsilon > 0$ the $(\gamma-\epsilon,\beta+\epsilon)$-approximation version of $\textsf{Max-CSP}(f)$ requires $\Omega(\sqrt{n})$ space for probabilistic dynamic streaming algorithms. We also extend previously known results in the insertion-only setting to a wide variety of cases, and in particular the case of $k=2$ where we get a dichotomy and the case when the satisfying assignments of $f$ support a distribution on $\{-1,1\}^k$ with uniform marginals.
In 2005, Goddard, Hedetniemi, Hedetniemi and Laskar [Generalized subgraph-restricted matchings in graphs, Discrete Mathematics, 293 (2005) 129-138] asked for the computational complexity of determining the maximum cardinality of a matching whose vertex set induces a disconnected graph. In this paper we answer this question. In fact, we consider the generalized problem of finding $c$-disconnected matchings; such matchings are ones whose vertex sets induce subgraphs with at least $c$ connected components. We show that, for every fixed $c \geq 2$, this problem is NP-complete even if we restrict the input to bipartite graphs of bounded diameter, while it can be solved in polynomial time if $c = 1$. For the case when $c$ is part of the input, we show that the problem is NP-complete for chordal graphs, while being solvable in polynomial time for interval graphs. Finally, we explore the parameterized complexity of the problem. We present an FPT algorithm under the treewidth parameterization, and an XP algorithm for graphs with a polynomial number of minimal separators when parameterized by $c$. We complement these results by showing that, unless NP $\subseteq$ coNP/poly, the related Induced Matching problem does not admit a polynomial kernel when parameterized by vertex cover and size of the matching, nor when parameterized by vertex deletion distance to clique and size of the matching. As for Connected Matching, we show how to obtain a maximum connected matching in linear time given an arbitrary maximum matching in the input.
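To make the definition concrete: a matching is $c$-disconnected when the subgraph induced by its endpoints has at least $c$ connected components. Verifying this for a given matching is easy; the hardness lies in finding a maximum such matching. A brute-force check, as a sketch:

```python
from collections import defaultdict

def components_induced_by_matching(edges, matching):
    """Number of connected components of the subgraph induced by the
    endpoints of `matching`; the matching is c-disconnected iff this >= c."""
    verts = {v for e in matching for v in e}
    adj = defaultdict(list)
    for u, v in edges:
        if u in verts and v in verts:  # keep only induced edges
            adj[u].append(v)
            adj[v].append(u)
    seen, comps = set(), 0
    for s in verts:  # DFS over the induced subgraph
        if s in seen:
            continue
        comps += 1
        stack, _ = [s], seen.add(s)
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return comps

# Path a-b-c-d-e-f: matching {ab, ef} induces two components,
# while {ab, cd} induces one (b and c are adjacent in the path).
E = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
print(components_induced_by_matching(E, [("a", "b"), ("e", "f")]))  # 2
print(components_induced_by_matching(E, [("a", "b"), ("c", "d")]))  # 1
```

Note that a maximum matching need not be a maximum $c$-disconnected matching, which is why the problems have different complexities.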
This paper presents a method for learning qualitatively interpretable models in object detection using popular two-stage region-based ConvNet detection systems (i.e., R-CNN), which consist of a region proposal network and a RoI (Region-of-Interest) prediction network. By interpretable models, we focus on weakly-supervised extractive rationale generation, that is, learning to unfold latent discriminative part configurations of object instances automatically and simultaneously in detection, without using any supervision for part configurations. We utilize a top-down hierarchical and compositional grammar model embedded in a directed acyclic AND-OR Graph (AOG) to explore and unfold the space of latent part configurations of RoIs. We propose an AOGParsing operator to substitute for the RoIPooling operator widely used in R-CNN, so the proposed method is applicable to many state-of-the-art ConvNet-based detection systems. The AOGParsing operator aims to harness both the explainable rigor of top-down hierarchical and compositional grammar models and the discriminative power of bottom-up deep neural networks through end-to-end training. In detection, a bounding box is interpreted by the best parse tree derived from the AOG on the fly, which is treated as the extractive rationale generated for interpreting the detection. In learning, we propose a folding-unfolding method to train the AOG and ConvNet end-to-end. In experiments, we build on top of R-FCN and test the proposed method on the PASCAL VOC 2007 and 2012 datasets, achieving performance comparable to state-of-the-art methods.