Given an undirected graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$, with vertex weights $(w(u))_{u\in\mathcal{V}}$, vertex values $(\alpha(u))_{u\in\mathcal{V}}$, a knapsack size $s$, and a target value $d$, the \vcknapsack problem is to determine if there exists a subset $\mathcal{U}\subseteq\mathcal{V}$ of vertices such that $\mathcal{U}$ forms a vertex cover, $w(\mathcal{U})=\sum_{u\in\mathcal{U}} w(u) \le s$, and $\alpha(\mathcal{U})=\sum_{u\in\mathcal{U}} \alpha(u) \ge d$. In this paper, we closely study the \vcknapsack problem and its variations, such as \vcknapsackbudget, \minimalvcknapsack, and \minimumvcknapsack, for both general graphs and trees. We first prove that the \vcknapsack problem belongs to the complexity class \NPC and then study the complexity of the other variations. We generalize the problem to \setc and \hs versions and, using the primal-dual method, design a polynomial-time $H_g$-factor approximation algorithm for the \setckp problem and a $d$-factor approximation algorithm for \hstp. We further show that \setcks and \hsmb are hard to approximate in polynomial time. Additionally, for the \minimalvcknapsack problem we develop a fixed-parameter tractable algorithm running in time $8^{\mathcal{O}({\rm tw})}\cdot n\cdot {\sf min}\{s,d\}$, where ${\rm tw}$, $s$, $d$, and $n$ are, respectively, the treewidth of the graph, the size of the knapsack, the target value of the knapsack, and the number of items.
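A minimal brute-force sketch of the decision problem defined above, on a toy instance; the exhaustive search and all names below are purely illustrative and are not one of the paper's algorithms:

```python
from itertools import combinations

def is_vertex_cover(U, edges):
    """Check that every edge has at least one endpoint in U."""
    return all(u in U or v in U for (u, v) in edges)

def vc_knapsack(vertices, edges, w, alpha, s, d):
    """Decide whether some vertex cover U satisfies w(U) <= s and alpha(U) >= d.
    Exponential-time exhaustive search, for illustration only."""
    for r in range(len(vertices) + 1):
        for U in combinations(vertices, r):
            U = set(U)
            if (is_vertex_cover(U, edges)
                    and sum(w[u] for u in U) <= s
                    and sum(alpha[u] for u in U) >= d):
                return True
    return False

# Tiny example: a path a - b - c.
vertices = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]
w = {"a": 2, "b": 1, "c": 2}
alpha = {"a": 1, "b": 3, "c": 1}
print(vc_knapsack(vertices, edges, w, alpha, s=1, d=3))  # True: U = {b}
```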
The paper concerns the $d$-dimensional stochastic approximation recursion, $$ \theta_{n+1}= \theta_n + \alpha_{n + 1} f(\theta_n, \Phi_{n+1}) $$ where $ \{ \Phi_n \}$ is a stochastic process on a general state space, satisfying a conditional Markov property that allows for parameter-dependent noise. The main results are established under additional conditions on the mean flow and a version of the Donsker-Varadhan Lyapunov drift condition known as (DV3): {(i)} An appropriate Lyapunov function is constructed that implies convergence of the estimates in $L_4$. {(ii)} A functional central limit theorem (CLT) is established, as well as the usual one-dimensional CLT for the normalized error. Moment bounds combined with the CLT imply convergence of the normalized covariance $\textsf{E} [ z_n z_n^T ]$ to the asymptotic covariance in the CLT, where $z_n := (\theta_n-\theta^*)/\sqrt{\alpha_n}$. {(iii)} The CLT holds for the normalized version $z^{\text{PR}}_n := \sqrt{n}\, [\theta^{\text{PR}}_n -\theta^*]$ of the averaged parameters $\theta^{\text{PR}}_n := n^{-1} \sum_{k=1}^n\theta_k$, subject to standard assumptions on the step-size. Moreover, the covariance in the CLT coincides with the minimal covariance of Polyak and Ruppert. {(iv)} An example is given where $f$ and $\bar{f}$ are linear in $\theta$, and $\Phi$ is a geometrically ergodic Markov chain but does not satisfy (DV3). While the algorithm is convergent, the second moment of $\theta_n$ is unbounded and in fact diverges. {\bf This arXiv version 3 represents a major extension of the results in prior versions.} The main results now allow for parameter-dependent noise, as is often the case in applications to reinforcement learning.
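A toy numerical sketch of the recursion and the Polyak-Ruppert average; the linear mean-reverting field, the AR(1) stand-in for the Markovian noise, and the step-size $\alpha_n = n^{-3/4}$ are all our own illustrative assumptions, not the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(0)
theta_star = np.array([1.0, -2.0])

def f(theta, phi):
    # Illustrative field whose mean flow drives theta toward theta_star,
    # perturbed additively by the noise phi.
    return -(theta - theta_star) + phi

N = 50_000
theta = np.zeros(2)
pr_sum = np.zeros(2)
phi = np.zeros(2)

for n in range(1, N + 1):
    # AR(1) noise as a simple stand-in for the Markovian disturbance.
    phi = 0.5 * phi + rng.normal(size=2)
    alpha_n = 1.0 / n**0.75              # sum alpha_n = inf, sum alpha_n^2 < inf
    theta = theta + alpha_n * f(theta, phi)
    pr_sum += theta

theta_pr = pr_sum / N                    # Polyak-Ruppert average of the iterates
print(theta, theta_pr)                   # both should be close to theta_star
```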
Approximating the action of a matrix function $f(\mathbf{A})$ on a vector $\mathbf{b}$ is an increasingly important primitive in machine learning, data science, and statistics, with applications such as sampling high dimensional Gaussians, Gaussian process regression and Bayesian inference, principal component analysis, and approximating Hessian spectral densities. Over the past decade, a number of algorithms enjoying strong theoretical guarantees have been proposed for this task. Many of the most successful belong to a family of algorithms called Krylov subspace methods. Remarkably, a classic Krylov subspace method, called the Lanczos method for matrix functions (Lanczos-FA), frequently outperforms newer methods in practice. Our main result is a theoretical justification for this finding: we show that, for a natural class of rational functions, Lanczos-FA matches the error of the best possible Krylov subspace method up to a multiplicative approximation factor. The approximation factor depends on the degree of $f(x)$'s denominator and the condition number of $\mathbf{A}$, but not on the number of iterations $k$. Our result provides a strong justification for the excellent performance of Lanczos-FA, especially on functions that are well approximated by rationals, such as the matrix square root.
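A compact sketch of the Lanczos-FA iterate for a symmetric positive definite $\mathbf{A}$, using full reorthogonalization and the matrix square root as the example function; this is our own simplified implementation for context, not the paper's code:

```python
import numpy as np

def lanczos_fa(A, b, k, f):
    """Approximate f(A) @ b from the k-dimensional Krylov subspace.
    Q holds an orthonormal Krylov basis, T = Q^T A Q is the Lanczos
    tridiagonal matrix, and the Lanczos-FA iterate is ||b|| * Q @ f(T) @ e_1."""
    n = len(b)
    Q = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(k)
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = Q[:, j] @ w
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)   # full reorthogonalization
        if j + 1 < k:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
    evals, evecs = np.linalg.eigh(T)
    fT_e1 = evecs @ (f(evals) * evecs[0])          # f(T) @ e_1 via eigendecomposition
    return np.linalg.norm(b) * (Q @ fT_e1)

# Example: approximate A^{1/2} b for a random SPD matrix and compare to exact.
rng = np.random.default_rng(1)
M = rng.normal(size=(200, 200))
A = M @ M.T + 200 * np.eye(200)
b = rng.normal(size=200)
approx = lanczos_fa(A, b, k=30, f=np.sqrt)
evals, evecs = np.linalg.eigh(A)
exact = evecs @ (np.sqrt(evals) * (evecs.T @ b))
print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```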
An important problem in signal processing and deep learning is to achieve \textit{invariance} to nuisance factors not relevant for the task. Since many of these factors are describable as the action of a group $G$ (e.g., rotations, translations, scalings), we want methods to be $G$-invariant. The $G$-Bispectrum extracts every characteristic of a given signal up to group action: for example, the shape of an object in an image, but not its orientation. Consequently, the $G$-Bispectrum has been incorporated into deep neural network architectures as a computational primitive for $G$-invariance\textemdash akin to a pooling mechanism, but with greater selectivity and robustness. However, the computational cost of the $G$-Bispectrum ($\mathcal{O}(|G|^2)$, with $|G|$ the size of the group) has limited its widespread adoption. Here, we show that the $G$-Bispectrum computation contains redundancies that can be reduced into a \textit{selective $G$-Bispectrum} with $\mathcal{O}(|G|)$ complexity. We prove desirable mathematical properties of the selective $G$-Bispectrum and demonstrate how its integration in neural networks enhances accuracy and robustness compared to traditional approaches, while enjoying considerable speed-ups compared to the full $G$-Bispectrum.
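For the special case $G = \mathbb{Z}_n$ (cyclic shifts), the $G$-Bispectrum reduces to the classical bispectrum computable from the DFT. The sketch below is our own illustration, not the paper's selective construction: it computes all $|G|^2$ entries of the full bispectrum and checks its invariance under a cyclic shift:

```python
import numpy as np

def bispectrum_Zn(x):
    """Full bispectrum of a signal on the cyclic group Z_n:
    B[k1, k2] = X[k1] * X[k2] * conj(X[k1 + k2]), with X = DFT(x).
    It is invariant to cyclic shifts of x and (generically) determines
    x up to such shifts."""
    n = len(x)
    X = np.fft.fft(x)
    k = np.arange(n)
    return X[:, None] * X[None, :] * np.conj(X[(k[:, None] + k[None, :]) % n])

rng = np.random.default_rng(0)
x = rng.normal(size=16)
shifted = np.roll(x, 5)
print(np.allclose(bispectrum_Zn(x), bispectrum_Zn(shifted)))  # True: shift-invariant
```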
How hard is it to estimate a discrete-time signal $(x_{1}, ..., x_{n}) \in \mathbb{C}^n$ satisfying an unknown linear recurrence relation of order $s$ and observed in i.i.d. complex Gaussian noise? The class of all such signals is parametric but extremely rich: it contains all exponential polynomials over $\mathbb{C}$ with total degree $s$, including harmonic oscillations with $s$ arbitrary frequencies. Geometrically, this class corresponds to the projection onto $\mathbb{C}^{n}$ of the union of all shift-invariant subspaces of $\mathbb{C}^\mathbb{Z}$ of dimension $s$. We show that the statistical complexity of this class, as measured by the squared minimax radius of the $(1-\delta)$-confidence $\ell_2$-ball, is nearly the same as for the class of $s$-sparse signals, namely $O\left(s\log(en) + \log(\delta^{-1})\right) \cdot \log^2(es) \cdot \log(en/s).$ Moreover, the corresponding near-minimax estimator is tractable, and it can be used to build a test statistic with a near-minimax detection threshold in the associated detection problem. These statistical results rest upon an approximation-theoretic one: we show that finite-dimensional shift-invariant subspaces admit compactly supported reproducing kernels whose Fourier spectra have nearly the smallest possible $\ell_p$-norms, for all $p \in [1,+\infty]$ at once.
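A small numerical illustration of the signal class (our own example, unrelated to the paper's estimator): a sum of $s$ complex exponentials satisfies an order-$s$ linear recurrence whose characteristic polynomial has the exponential bases as roots.

```python
import numpy as np

# A signal that is a sum of s complex exponentials (a harmonic oscillation
# with s arbitrary frequencies): x_t = sum_j c_j * exp(i * omega_j * t).
omegas = np.array([0.3, 1.1, 2.4])
z = np.exp(1j * omegas)
c = np.array([1.0, -0.5, 2.0 + 1.0j])
s = len(z)

n = 50
t = np.arange(n)
x = (c[None, :] * z[None, :] ** t[:, None]).sum(axis=1)

# np.poly(z) gives the coefficients [1, a_1, ..., a_s] of the monic polynomial
# with roots z_j; these are exactly the coefficients of an order-s linear
# recurrence annihilating x: sum_{k=0}^{s} a_k * x_{t-k} = 0 for all t >= s.
a = np.poly(z)
err = max(abs(sum(a[k] * x[tt - k] for k in range(s + 1))) for tt in range(s, n))
print(err)  # ~1e-12: x satisfies the recurrence exactly, up to rounding
```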
In the turnstile streaming model, a dynamic vector $\mathbf{x}=(\mathbf{x}_1,\ldots,\mathbf{x}_n)\in \mathbb{Z}^n$ is updated by a stream of entry-wise increments/decrements. Let $f\colon\mathbb{Z}\to \mathbb{R}_+$ be a symmetric function with $f(0)=0$. The \emph{$f$-moment} of $\mathbf{x}$ is defined to be $f(\mathbf{x}) := \sum_{v\in[n]}f(\mathbf{x}_v)$. We revisit the problem of constructing a \emph{universal sketch} that can estimate many different $f$-moments. Previous constructions of universal sketches rely on the technique of sampling with respect to the $L_0$-mass (uniform samples) or $L_2$-mass ($L_2$-heavy-hitters), whose universality comes from being able to evaluate the function $f$ over the samples. In this work we take a new approach to constructing a universal sketch that does not use \emph{any} explicit samples but relies on the \emph{harmonic structure} of the target function $f$. The new sketch ($\textsf{SymmetricPoissonTower}$) \emph{embraces} hash collisions instead of avoiding them, which saves multiple $\log n$ factors in space, e.g., when estimating all $L_p$-moments ($f(z) = |z|^p,p\in[0,2]$). For many nearly periodic functions, the new sketch is \emph{exponentially} more efficient than sampling-based methods. We conjecture that the $\textsf{SymmetricPoissonTower}$ sketch is \emph{the} universal sketch that can estimate every tractable function $f$.
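For reference, the exact (non-sketched) baseline that these definitions describe: maintain $\mathbf{x}$ under turnstile updates and evaluate the $f$-moment directly. This uses $\Theta(n)$ space, which is precisely what a universal sketch avoids; the code is illustrative only and is not the $\textsf{SymmetricPoissonTower}$ sketch:

```python
from collections import defaultdict

def f_moment(updates, f):
    """Exact f-moment of the turnstile vector x defined by a stream of
    (index, increment) pairs: f(x) = sum over v of f(x_v).
    Stores x explicitly, i.e. Theta(n) space -- the cost a universal
    sketch is designed to avoid."""
    x = defaultdict(int)
    for v, delta in updates:          # entry-wise increments/decrements
        x[v] += delta
    return sum(f(value) for value in x.values())

stream = [(0, +3), (1, +1), (0, -2), (2, +5), (1, -1)]
# L_p moments via f(z) = |z|^p (note f(0) = 0, as required):
print(f_moment(stream, lambda z: abs(z)))        # L_1 moment: 1 + 0 + 5 = 6
print(f_moment(stream, lambda z: abs(z) ** 2))   # L_2 moment: 1 + 0 + 25 = 26
```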
Matrix sketching, aimed at approximating a matrix $\boldsymbol{A} \in \mathbb{R}^{N\times d}$ consisting of vector streams of length $N$ with a smaller sketching matrix $\boldsymbol{B} \in \mathbb{R}^{\ell\times d}, \ell \ll N$, has garnered increasing attention in fields such as large-scale data analytics and machine learning. A well-known deterministic matrix sketching method is the Frequent Directions algorithm, which achieves the optimal $O\left(\frac{d}{\varepsilon}\right)$ space bound and provides a covariance error guarantee of $\varepsilon = \lVert \boldsymbol{A}^\top \boldsymbol{A} - \boldsymbol{B}^\top \boldsymbol{B} \rVert_2/\lVert \boldsymbol{A} \rVert_F^2$. The matrix sketching problem becomes particularly interesting in the context of sliding windows, where the goal is to approximate the matrix $\boldsymbol{A}_W$, formed by input vectors over the most recent $N$ time units. However, despite recent efforts, whether achieving the optimal $O\left(\frac{d}{\varepsilon}\right)$ space bound on sliding windows is possible has remained an open question. In this paper, we introduce the DS-FD algorithm, which achieves the optimal $O\left(\frac{d}{\varepsilon}\right)$ space bound for matrix sketching over row-normalized, sequence-based sliding windows. We also present matching upper and lower space bounds for time-based and unnormalized sliding windows, demonstrating the generality and optimality of \dsfd across various sliding window models. This conclusively answers the open question regarding the optimal space bound for matrix sketching over sliding windows. Furthermore, we conduct extensive experiments with both synthetic and real-world datasets, validating our theoretical claims and confirming the correctness and effectiveness of \dsfd in practice.
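A compact sketch of the classic (non-windowed) Frequent Directions routine that the covariance-error guarantee above refers to; this is our own simplified implementation for context, not the DS-FD algorithm:

```python
import numpy as np

def frequent_directions(A, ell):
    """Deterministic Frequent Directions sketch: process the rows of A one at
    a time and return B (ell x d) satisfying
    ||A^T A - B^T B||_2 <= 2 * ||A||_F^2 / ell."""
    n, d = A.shape
    B = np.zeros((ell, d))
    next_zero = 0
    for row in A:
        if next_zero == ell:                       # sketch full: shrink
            _, sig, Vt = np.linalg.svd(B, full_matrices=False)
            delta = sig[ell // 2] ** 2
            sig = np.sqrt(np.maximum(sig ** 2 - delta, 0.0))
            B = sig[:, None] * Vt
            next_zero = ell // 2                   # at least half the rows are now zero
        B[next_zero] = row
        next_zero += 1
    return B

rng = np.random.default_rng(0)
A = rng.normal(size=(10_000, 50))
B = frequent_directions(A, ell=20)
err = np.linalg.norm(A.T @ A - B.T @ B, 2) / np.linalg.norm(A, 'fro') ** 2
print(err, 2 / 20)  # observed covariance error vs. the 2/ell guarantee
```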
Let $\mathcal{D}$ be a set family that is the solution domain of some combinatorial problem. The \emph{max-min diversification problem on $\mathcal{D}$} is the problem to select $k$ sets from $\mathcal{D}$ such that the Hamming distance between any two selected sets is at least $d$. FPT algorithms parameterized by $k,l:=\max_{D\in \mathcal{D}}|D|$ and $k,d$ have been actively studied recently for several specific domains. This paper provides unified algorithmic frameworks to solve this problem. Specifically, for each parameterization $k,l$ and $k,d$, we provide an FPT oracle algorithm for the max-min diversification problem using oracles related to $\mathcal{D}$. We then demonstrate that our frameworks generalize most of the existing domain-specific tractability results and provide the first FPT algorithms for several domains. Our main technical breakthrough is introducing the notion of \emph{max-distance sparsifier} of $\mathcal{D}$, a domain on which the max-min diversification problem is equivalent to the same problem on the original domain $\mathcal{D}$. The core of our framework is to design FPT oracle algorithms that construct a constant-size max-distance sparsifier of $\mathcal{D}$. Using max-distance sparsifiers, we provide FPT algorithms for the max-min and max-sum diversification problems on $\mathcal{D}$, as well as $k$-center and $k$-sum-of-radii clustering problems on $\mathcal{D}$, which are also natural problems in the context of diversification and are of independent interest.
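A brute-force reference implementation of the max-min diversification decision problem defined above (illustration only; the paper's contribution is FPT oracle algorithms via max-distance sparsifiers, not this enumeration):

```python
from itertools import combinations

def hamming(D1, D2):
    """Hamming distance between two sets viewed as 0/1 indicator vectors,
    i.e. the size of their symmetric difference."""
    return len(D1 ^ D2)

def max_min_diversification(domain, k, d):
    """Is there a choice of k sets from `domain` with pairwise Hamming
    distance at least d?  Exhaustive search over all k-subsets."""
    return any(
        all(hamming(S[i], S[j]) >= d for i in range(k) for j in range(i + 1, k))
        for S in combinations(domain, k)
    )

# Toy domain, e.g. the solution sets of some combinatorial problem.
domain = [frozenset({1, 2}), frozenset({1, 3}), frozenset({2, 3, 4}), frozenset({4})]
print(max_min_diversification(domain, k=2, d=3))  # True: {1,2} and {2,3,4} differ in 3 elements
```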
A {\em bipartite tournament} is a directed graph $T:=(A \cup B, E)$ such that every pair of vertices $(a,b), a\in A,b\in B$ is connected by an arc, and no arc connects two vertices of $A$ or two vertices of $B$. A {\em feedback vertex set} is a set $S$ of vertices in $T$ such that $T - S$ is acyclic. In this article, we consider the {\sc Feedback Vertex Set} problem in bipartite tournaments. Here the input is a bipartite tournament $T$ on $n$ vertices together with an integer $k$, and the task is to determine whether $T$ has a feedback vertex set of size at most $k$. We give a new algorithm for {\sc Feedback Vertex Set in Bipartite Tournaments}. The running time of our algorithm is upper-bounded by $O(1.6181^k + n^{O(1)})$, improving over the previously best-known algorithm with running time $2^kk^{O(1)} + n^{O(1)}$ [Hsiao, ISAAC 2011]. As a by-product, we also obtain the fastest currently known exact exponential-time algorithm for the problem, with running time $O(1.3820^n)$.
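A brute-force reference for the problem statement (illustrative only, exponential time): check acyclicity of $T - S$ for every candidate set $S$ of size at most $k$.

```python
from itertools import combinations

def is_acyclic(vertices, arcs):
    """Kahn's algorithm: the digraph is acyclic iff every vertex can be
    removed in topological order."""
    indeg = {v: 0 for v in vertices}
    adj = {v: [] for v in vertices}
    for u, v in arcs:
        adj[u].append(v)
        indeg[v] += 1
    queue = [v for v in vertices if indeg[v] == 0]
    removed = 0
    while queue:
        u = queue.pop()
        removed += 1
        for w in adj[u]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    return removed == len(vertices)

def has_fvs(vertices, arcs, k):
    """Brute-force Feedback Vertex Set: is there S, |S| <= k, with T - S acyclic?"""
    vertices = list(vertices)
    for r in range(k + 1):
        for S in combinations(vertices, r):
            S = set(S)
            rest = [v for v in vertices if v not in S]
            rest_arcs = [(u, v) for (u, v) in arcs if u not in S and v not in S]
            if is_acyclic(rest, rest_arcs):
                return True
    return False

# A bipartite tournament on A = {a1, a2}, B = {b1, b2} containing the
# directed 4-cycle a1 -> b1 -> a2 -> b2 -> a1; one deletion suffices.
arcs = [("a1", "b1"), ("b1", "a2"), ("a2", "b2"), ("b2", "a1")]
print(has_fvs(["a1", "a2", "b1", "b2"], arcs, k=0))  # False
print(has_fvs(["a1", "a2", "b1", "b2"], arcs, k=1))  # True
```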
In this paper, we study quantum channels on a von Neumann algebra $\mathcal{M}$ preserving a von Neumann subalgebra $\mathcal{N}$, namely $\mathcal{N}$-$\mathcal{N}$-bimodule unital completely positive maps. By introducing the relative irreducibility of a bimodule quantum channel, we show that its eigenvalues with modulus 1 form a finite cyclic group, called its phase group. Moreover, the corresponding eigenspaces are invertible $\mathcal{N}$-$\mathcal{N}$-bimodules, which encode a categorification of the phase group. When $\mathcal{N}\subset \mathcal{M}$ is a finite-index irreducible subfactor of type II$_1$, we prove that any bimodule quantum channel is relatively irreducible with respect to the intermediate subfactor of its fixed points. In addition, we reformulate and prove these results intrinsically in subfactor planar algebras, without referring to the subfactor, using the methods of quantum Fourier analysis.
We construct a classical oracle relative to which $\mathsf{P} = \mathsf{NP}$ but quantum-computable quantum-secure trapdoor one-way functions exist. This is a substantial strengthening of the result of Kretschmer, Qian, Sinha, and Tal (STOC 2023), which only achieved single-copy pseudorandom quantum states relative to an oracle that collapses $\mathsf{NP}$ to $\mathsf{P}$. For example, our result implies not only multi-copy pseudorandom states and pseudorandom unitaries, but also classical-communication public-key encryption, signatures, and oblivious transfer schemes, all relative to an oracle on which $\mathsf{P}=\mathsf{NP}$. Hence, in our new relativized world, classical computers live in "Algorithmica" whereas quantum computers live in "Cryptomania," using the language of Impagliazzo's worlds. Our proof relies on a new distributional block-insensitivity lemma for $\mathsf{AC^0}$ circuits, wherein a single block is resampled from an arbitrary distribution.