In a graph $G = (V,E)$, a k-ruling set $S$ is one in which all vertices $V$ \ $S$ are at most $k$ distance from $S$. Finding a minimum k-ruling set is intrinsically linked to the minimum dominating set problem and maximal independent set problem, which have been extensively studied in graph theory. This paper presents the first known algorithm for solving all k-ruling set problems in conjunction with known minimum dominating set algorithms at only additional polynomial time cost compared to a minimum dominating set. The algorithm further succeeds for $(\alpha, \alpha - 1)$ ruling sets in which $\alpha > 1$, for which constraints exist on the proximity of vertices v $\in S$. This secondary application instead works in conjunction with maximal independent set algorithms.
Matrix sketching, aimed at approximating a matrix $\boldsymbol{A} \in \mathbb{R}^{N\times d}$ consisting of vector streams of length $N$ with a smaller sketching matrix $\boldsymbol{B} \in \mathbb{R}^{\ell\times d}, \ell \ll N$, has garnered increasing attention in fields such as large-scale data analytics and machine learning. A well-known deterministic matrix sketching method is the Frequent Directions algorithm, which achieves the optimal $O\left(\frac{d}{\varepsilon}\right)$ space bound and provides a covariance error guarantee of $\varepsilon = \lVert \boldsymbol{A}^\top \boldsymbol{A} - \boldsymbol{B}^\top \boldsymbol{B} \rVert_2/\lVert \boldsymbol{A} \rVert_F^2$. The matrix sketching problem becomes particularly interesting in the context of sliding windows, where the goal is to approximate the matrix $\boldsymbol{A}_W$, formed by input vectors over the most recent $N$ time units. However, despite recent efforts, whether achieving the optimal $O\left(\frac{d}{\varepsilon}\right)$ space bound on sliding windows is possible has remained an open question. In this paper, we introduce the DS-FD algorithm, which achieves the optimal $O\left(\frac{d}{\varepsilon}\right)$ space bound for matrix sketching over row-normalized, sequence-based sliding windows. We also present matching upper and lower space bounds for time-based and unnormalized sliding windows, demonstrating the generality and optimality of \dsfd across various sliding window models. This conclusively answers the open question regarding the optimal space bound for matrix sketching over sliding windows. Furthermore, we conduct extensive experiments with both synthetic and real-world datasets, validating our theoretical claims and thus confirming the correctness and effectiveness of our algorithm, both theoretically and empirically.
A {\em bipartite tournament} is a directed graph $T:=(A \cup B, E)$ such that every pair of vertices $(a,b), a\in A,b\in B$ are connected by an arc, and no arc connects two vertices of $A$ or two vertices of $B$. A {\em feedback vertex set} is a set $S$ of vertices in $T$ such that $T - S$ is acyclic. In this article we consider the {\sc Feedback Vertex Set} problem in bipartite tournaments. Here the input is a bipartite tournament $T$ on $n$ vertices together with an integer $k$, and the task is to determine whether $T$ has a feedback vertex set of size at most $k$. We give a new algorithm for {\sc Feedback Vertex Set in Bipartite Tournaments}. The running time of our algorithm is upper-bounded by $O(1.6181^k + n^{O(1)})$, improving over the previously best known algorithm with running time $2^kk^{O(1)} + n^{O(1)}$ [Hsiao, ISAAC 2011]. As a by-product, we also obtain the fastest currently known exact exponential-time algorithm for the problem, with running time $O(1.3820^n)$.
In this paper, we study the quantum channel on a von Neuamnn algebras $\mathcal{M}$ preserving a von Neumann subalgebra $\mathcal{N}$, namely $\mathcal{N}$-$\mathcal{N}$-bimodule unital completely positive map. By introducing the relative irreducibility of a bimodule quantum channel, we show that its eigenvalues with modulus 1 form a finite cyclic group, called its phase group. Moreover, the corresponding eigenspaces are invertible $\mathcal{N}$-$\mathcal{N}$-bimodules, which encode a categorification of the phase group. When $\mathcal{N}\subset \mathcal{M}$ is a finite-index irreducible subfactor of type II$_1$, we prove that any bimodule quantum channel is relative irreducible for the intermediate subfactor of its fixed points. In addition, we can reformulate and prove these results intrinsically in subfactor planar algebras without referring to the subfactor using the methods of quantum Fourier analysis.
We study a data pricing problem, where a seller has access to $N$ homogeneous data points (e.g. drawn i.i.d. from some distribution). There are $m$ types of buyers in the market, where buyers of the same type $i$ have the same valuation curve $v_i:[N]\rightarrow [0,1]$, where $v_i(n)$ is the value for having $n$ data points. A priori, the seller is unaware of the distribution of buyers, but can repeat the market for $T$ rounds so as to learn the revenue-optimal pricing curve $p:[N] \rightarrow [0, 1]$. To solve this online learning problem, we first develop novel discretization schemes to approximate any pricing curve. When compared to prior work, the size of our discretization schemes scales gracefully with the approximation parameter, which translates to better regret in online learning. Under assumptions like smoothness and diminishing returns which are satisfied by data, the discretization size can be reduced further. We then turn to the online learning problem, both in the stochastic and adversarial settings. On each round, the seller chooses an anonymous pricing curve $p_t$. A new buyer appears and may choose to purchase some amount of data. She then reveals her type only if she makes a purchase. Our online algorithms build on classical algorithms such as UCB and FTPL, but require novel ideas to account for the asymmetric nature of this feedback and to deal with the vastness of the space of pricing curves. Using the improved discretization schemes previously developed, we are able to achieve $\tilde{O}(m\sqrt{T})$ regret in the stochastic setting and $\tilde{O}(m^{3/2}\sqrt{T})$ regret in the adversarial setting.
We study a fundamental problem in Computational Geometry, the planar two-center problem. In this problem, the input is a set $S$ of $n$ points in the plane and the goal is to find two smallest congruent disks whose union contains all points of $S$. A longstanding open problem has been to obtain an $O(n\log n)$-time algorithm for planar two-center, matching the $\Omega(n\log n)$ lower bound given by Eppstein [SODA'97]. Towards this, researchers have made a lot of efforts over decades. The previous best algorithm, given by Wang [SoCG'20], solves the problem in $O(n\log^2 n)$ time. In this paper, we present an $O(n\log n)$-time (deterministic) algorithm for planar two-center, which completely resolves this open problem.
Estimating the second frequency moment of a stream up to $(1\pm\varepsilon)$ multiplicative error requires at most $O(\log n / \varepsilon^2)$ bits of space, due to a seminal result of Alon, Matias, and Szegedy. It is also known that at least $\Omega(\log n + 1/\varepsilon^{2})$ space is needed. We prove an optimal lower bound of $\Omega\left(\log \left(n \varepsilon^2 \right) / \varepsilon^2\right)$ for all $\varepsilon = \Omega(1/\sqrt{n})$. Note that when $\varepsilon>n^{-1/2 + c}$, where $c>0$, our lower bound matches the classic upper bound of AMS. For smaller values of $\varepsilon$ we also introduce a revised algorithm that improves the classic AMS bound and matches our lower bound. Our lower bound holds also for the more general problem of $p$-th frequency moment estimation for the range of $p\in (1,2]$, giving a tight bound in the only remaining range to settle the optimal space complexity of estimating frequency moments.
A celebrated result of Hastad established that, for any constant $\varepsilon>0$, it is NP-hard to find an assignment satisfying a $(1/|G|+\varepsilon)$-fraction of the constraints of a given 3-LIN instance over an Abelian group $G$ even if one is promised that an assignment satisfying a $(1-\varepsilon)$-fraction of the constraints exists. Engebretsen, Holmerin, and Russell showed the same result for 3-LIN instances over any finite (not necessarily Abelian) group. In other words, for almost-satisfiable instances of 3-LIN the random assignment achieves an optimal approximation guarantee. We prove that the random assignment algorithm is still best possible under a stronger promise that the 3-LIN instance is almost satisfiable over an arbitrarily more restrictive group.
A seminal palette sparsification result of Assadi, Chen, and Khanna states that in every $n$-vertex graph of maximum degree $\Delta$, sampling $\Theta(\log n)$ colors per vertex from $\{1, \ldots, \Delta+1\}$ almost certainly allows for a proper coloring from the sampled colors. Alon and Assadi extended this work proving a similar result for $O\left(\Delta/\log \Delta\right)$-coloring triangle-free graphs. Apart from being interesting results from a combinatorial standpoint, their results have various applications to the design of graph coloring algorithms in different models of computation. In this work, we focus on locally sparse graphs, i.e., graphs with sparse neighborhoods. We say a graph $G = (V, E)$ is $k$-locally-sparse if for each vertex $v \in V$, the subgraph $G[N(v)]$ contains at most $k$ edges. A celebrated result of Alon, Krivelevich, and Sudakov shows that such graphs are $O(\Delta/\log (\Delta/\sqrt{k}))$-colorable. For any $\alpha \in (0, 1)$ and $k \ll \Delta^{2\alpha}$, let $G$ be a $k$-locally-sparse graph. For $q = \Theta\left(\Delta/\log \left(\Delta^\alpha/\sqrt{k}\right)\right)$, we show that sampling $O\left(\Delta^\alpha + \sqrt{\log n}\right)$ colors per vertex is sufficient to obtain a proper $q$-coloring of $G$ from the sampled colors. Setting $k = 1$ recovers the aforementioned result of Alon and Assadi for triangle-free graphs. A key element in our proof is a proposition regarding correspondence coloring in the so-called color-degree setting, which improves upon recent work of Anderson, Kuchukova, and the author and is of independent interest.
We consider the linear causal representation learning setting where we observe a linear mixing of $d$ unknown latent factors, which follow a linear structural causal model. Recent work has shown that it is possible to recover the latent factors as well as the underlying structural causal model over them, up to permutation and scaling, provided that we have at least $d$ environments, each of which corresponds to perfect interventions on a single latent node (factor). After this powerful result, a key open problem faced by the community has been to relax these conditions: allow for coarser than perfect single-node interventions, and allow for fewer than $d$ of them, since the number of latent factors $d$ could be very large. In this work, we consider precisely such a setting, where we allow a smaller than $d$ number of environments, and also allow for very coarse interventions that can very coarsely \textit{change the entire causal graph over the latent factors}. On the flip side, we relax what we wish to extract to simply the \textit{list of nodes that have shifted between one or more environments}. We provide a surprising identifiability result that it is indeed possible, under some very mild standard assumptions, to identify the set of shifted nodes. Our identifiability proof moreover is a constructive one: we explicitly provide necessary and sufficient conditions for a node to be a shifted node, and show that we can check these conditions given observed data. Our algorithm lends itself very naturally to the sample setting where instead of just interventional distributions, we are provided datasets of samples from each of these distributions. We corroborate our results on both synthetic experiments as well as an interesting psychometric dataset. The code can be found at //github.com/TianyuCodings/iLCS.
We assume that we observe $N$ independent copies of a diffusion process on a time-interval $[0,2T]$. For a given time $t$, we estimate the transition density $p_t(x,y)$, namely the conditional density of $X_{t + s}$ given $X_s = x$, under conditions on the diffusion coefficients ensuring that this quantity exists. We use a least squares projection method on a product of finite dimensional spaces, prove risk bounds for the estimator and propose an anisotropic model selection method, relying on several reference norms. A simulation study illustrates the theoretical part for Ornstein-Uhlenbeck or square-root (Cox-Ingersoll-Ross) processes.