Complexity classes such as $\#\mathbf{P}$, $\oplus\mathbf{P}$, $\mathbf{GapP}$, $\mathbf{OptP}$, $\mathbf{NPMV}$, or the class of fuzzy languages realised by polynomial-time fuzzy nondeterministic Turing machines, can all be described in terms of a class $\mathbf{NP}[S]$ for a suitable semiring $S$, defined via weighted Turing machines over $S$ in the same way that $\mathbf{NP}$ is defined via classical nondeterministic Turing machines. Other complexity classes of decision problems can be lifted to the quantitative world using the same recipe, and the resulting classes relate to the original ones in the same way as weighted automata or logics relate to their unweighted counterparts. The article surveys these too-little-known connexions between weighted automata theory and computational complexity theory implicit in the existing literature, suggests a systematic approach to the study of weighted complexity classes, and presents several new observations strengthening the relation between the two fields. In particular, it is proved that a natural extension of the Boolean satisfiability problem to weighted propositional logic is complete for the class $\mathbf{NP}[S]$ when $S$ is a finitely generated semiring. Moreover, a class of semiring-valued functions $\mathbf{FP}[S]$ is introduced for each semiring $S$ as a counterpart to the class $\mathbf{P}$, and the relations between $\mathbf{FP}[S]$ and $\mathbf{NP}[S]$ are considered.
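To make the semiring viewpoint concrete, the following minimal sketch evaluates a CNF formula over an arbitrary semiring by brute force: each assignment contributes the semiring product of per-clause weights (one if the clause is satisfied, zero otherwise), and contributions are combined with the semiring sum. This is a simplified illustration, not the article's definition of weighted propositional logic; the names `Semiring`, `semiring_sat`, `BOOLEAN`, `NATURAL`, `GF2`, and `CNF` are assumptions introduced here.

```python
from dataclasses import dataclass
from itertools import product
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    """A semiring given by its two constants and two operations (illustrative)."""
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# Three standard semirings; the constant names are hypothetical.
BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
NATURAL = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
GF2 = Semiring(0, 1, lambda a, b: (a + b) % 2, lambda a, b: (a * b) % 2)


def semiring_sat(clauses, n_vars, s):
    """Sum, over all assignments, the product of clause weights in semiring s.

    Clauses use DIMACS-style literals: k means variable k, -k its negation.
    Exponential time; intended only for tiny instances.
    """
    total = s.zero
    for assignment in product([False, True], repeat=n_vars):
        weight = s.one
        for clause in clauses:
            satisfied = any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
            weight = s.mul(weight, s.one if satisfied else s.zero)
        total = s.add(total, weight)
    return total


# (x1 or x2) and (not x1 or x3): an assumed toy instance.
CNF = [(1, 2), (-1, 3)]
```

On this toy instance, the Boolean semiring answers the decision question (as in $\mathbf{NP}$), the natural numbers count satisfying assignments (as in $\#\mathbf{P}$), and the two-element field gives their parity (as in $\oplus\mathbf{P}$).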
Let $X$ be a $d$-dimensional simplicial complex. A function $F\colon X(k)\to \{0,1\}^k$ is said to be a direct product function if there exists a function $f\colon X(1)\to \{0,1\}$ such that $F(\sigma) = (f(\sigma_1), \ldots, f(\sigma_k))$ for each $k$-face $\sigma$. In an effort to simplify components of the PCP theorem, Goldreich and Safra introduced the problem of direct product testing, which asks whether one can test if $F\colon X(k)\to \{0,1\}^k$ is correlated with a direct product function by querying $F$ on only $2$ inputs. Dinur and Kaufman conjectured that there exist bounded degree complexes with a direct product test in the small soundness regime. We resolve their conjecture by showing that for all $\delta>0$, there exists a family of high-dimensional expanders with degree $O_{\delta}(1)$ and a $2$-query direct product tester with soundness $\delta$. We use the characterization given by a subset of the authors and independently by Dikstein and Dinur, who showed that some form of non-Abelian coboundary expansion (which they called "Unique-Games coboundary expansion") is a necessary and sufficient condition for a complex to admit such direct product testers. Our main technical contribution is a general technique for showing coboundary expansion of complexes with coefficients in a non-Abelian group. This allows us to prove that the high-dimensional expanders constructed by Chapman and Lubotzky satisfy the necessary conditions, thus admitting a $2$-query direct product tester with small soundness.
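As a small illustration of the definitions above, the sketch below builds a direct product function on the complete complex (all $k$-subsets of $\{0,\ldots,n-1\}$) and runs an exhaustive version of a $2$-query agreement check: two faces pass if their values agree on every shared vertex. The complete complex is an assumption chosen for simplicity and is not bounded degree, so this does not reflect the sparse complexes the abstract is about; the function names are hypothetical.

```python
from itertools import combinations


def direct_product(f, n, k):
    """Return F with F(sigma) = (f(sigma_1), ..., f(sigma_k)) on all k-subsets."""
    return {face: tuple(f[v] for v in face) for face in combinations(range(n), k)}


def agree(F, s, t):
    """One 2-query check: do F(s) and F(t) agree on the vertices in s and t?"""
    value_s = dict(zip(s, F[s]))
    value_t = dict(zip(t, F[t]))
    return all(value_s[v] == value_t[v] for v in set(s) & set(t))


def pass_rate(F):
    """Fraction of intersecting face pairs that pass the agreement check."""
    pairs = [(s, t) for s, t in combinations(F, 2) if set(s) & set(t)]
    return sum(agree(F, s, t) for s, t in pairs) / len(pairs)


# Assumed toy data: a ground function on 5 vertices and its lift to 3-faces.
GROUND = [0, 1, 1, 0, 1]
HONEST = direct_product(GROUND, n=5, k=3)

# Corrupt a single face by flipping its first coordinate.
CORRUPTED = dict(HONEST)
first = (0, 1, 2)
CORRUPTED[first] = (1 - HONEST[first][0],) + HONEST[first][1:]
```

A genuine direct product function passes every check, while the corrupted table fails on the pairs that involve the altered face and share vertex 0.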
In dimension $d$, Mutually Unbiased Bases (MUBs) are a collection of orthonormal bases of $\mathbb{C}^d$ such that for any two vectors $v_1, v_2$ belonging to different bases, the absolute value of the inner product is $|\braket{v_1|v_2}| = \frac{1}{\sqrt{d}}$. The number of such bases is at most $d+1$. Constructions achieving this bound are known when $d$ is a prime power. The situation is more restrictive in other cases, and also when one works over the reals rather than the complex numbers. Thus, certain relaxations of this model are considered in the literature, and consequently Approximate MUBs (AMUB) are studied. This enables one to construct a potentially large number of such objects in $\mathbb{C}^d$ as well as in $\mathbb{R}^d$. In this regard, we propose the concept of Almost Perfect MUBs (APMUB), where we restrict the absolute value of the inner product $|\braket{v_1|v_2}|$ to be two-valued, one value being 0 and the other $ \leq \frac{1+\mathcal{O}(d^{-\lambda})}{\sqrt{d}}$, such that $\lambda > 0$ and the numerator $1 + \mathcal{O}(d^{-\lambda}) \leq 2$. Each vector so constructed has the important feature that a large number of its components are zero and the non-zero components are of equal magnitude. Our techniques are based on combinatorial structures related to Resolvable Block Designs (RBDs). We show that for several composite dimensions $d$, one can construct $\mathcal{O}(\sqrt{d})$ many APMUBs, in cases where the number of MUBs is significantly small. To be specific, this result works for $d$ of the form $(q-e)(q+f), \ q, e, f \in \mathbb{N}$, with the conditions $0 \leq f \leq e$ for constant $e, f$ and $q$ a prime power. We also show that such APMUBs provide sets of Bi-angular vectors, of the order of $\mathcal{O}(d^{3/2})$ in number, with large angular distances among them.
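The defining condition for MUBs can be checked numerically. The sketch below verifies it for the three standard bases in dimension $d=2$ (the eigenbases of the Pauli operators), which attain the $d+1$ bound. It only illustrates the exact MUB definition, not the APMUB construction; the helper names `inner` and `is_mub` and the tolerance are assumptions.

```python
import math
from itertools import combinations


def inner(u, v):
    """Hermitian inner product <u|v>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def is_mub(bases, tol=1e-9):
    """Check that each basis is orthonormal and that bases are pairwise unbiased."""
    d = len(bases[0])
    for basis in bases:
        for i, u in enumerate(basis):
            for j, v in enumerate(basis):
                expected = 1.0 if i == j else 0.0
                if abs(abs(inner(u, v)) - expected) > tol:
                    return False
    for first, second in combinations(bases, 2):
        for u in first:
            for v in second:
                if abs(abs(inner(u, v)) - 1 / math.sqrt(d)) > tol:
                    return False
    return True


R = 1 / math.sqrt(2)
Z_BASIS = [(1, 0), (0, 1)]
X_BASIS = [(R, R), (R, -R)]
Y_BASIS = [(R, R * 1j), (R, -R * 1j)]
```

Calling `is_mub([Z_BASIS, X_BASIS, Y_BASIS])` confirms the three bases are mutually unbiased, whereas a collection that repeats a basis fails because some cross inner product has absolute value 1.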
Two graphs $G$ and $H$ are homomorphism indistinguishable over a class of graphs $\mathcal{F}$ if for all graphs $F \in \mathcal{F}$ the number of homomorphisms from $F$ to $G$ is equal to the number of homomorphisms from $F$ to $H$. Many natural equivalence relations comparing graphs such as (quantum) isomorphism, spectral, and logical equivalences can be characterised as homomorphism indistinguishability relations over certain graph classes. Abstracting from the wealth of such instances, we show in this paper that equivalences w.r.t. any self-complementarity logic admitting a characterisation as homomorphism indistinguishability relation can be characterised by homomorphism indistinguishability over a minor-closed graph class. Self-complementarity is a mild property satisfied by most well-studied logics. This result follows from a correspondence between closure properties of a graph class and preservation properties of its homomorphism indistinguishability relation. Furthermore, we classify all graph classes which are in a sense finite (essentially profinite) and satisfy the maximality condition of being homomorphism distinguishing closed, i.e. adding any graph to the class strictly refines its homomorphism indistinguishability relation. Thereby, we answer various questions raised by Roberson (2022) on general properties of the homomorphism distinguishing closure.
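Homomorphism counts are easy to compute by brute force on small graphs. The sketch below uses a standard example: the 6-cycle and the disjoint union of two triangles have the same number of homomorphisms from a 3-vertex path, yet the triangle distinguishes them. The function name `hom_count` and the edge-list encoding are assumptions; the enumeration is exponential in the size of $F$.

```python
from itertools import product


def hom_count(f_edges, f_n, g_edges, g_n):
    """Count maps V(F) -> V(G) that send every edge of F to an edge of G.

    Graphs are undirected, with vertices 0..n-1 and edges given as pairs.
    """
    adjacent = set()
    for u, v in g_edges:
        adjacent.add((u, v))
        adjacent.add((v, u))
    return sum(
        all((phi[a], phi[b]) in adjacent for a, b in f_edges)
        for phi in product(range(g_n), repeat=f_n)
    )


# Pattern graphs F.
PATH3 = [(0, 1), (1, 2)]
TRIANGLE = [(0, 1), (1, 2), (2, 0)]

# Target graphs G and H on 6 vertices.
CYCLE6 = [(i, (i + 1) % 6) for i in range(6)]
TWO_TRIANGLES = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]
```

Here `hom_count(PATH3, 3, ...)` gives 24 for both targets, while `hom_count(TRIANGLE, 3, ...)` gives 0 for the 6-cycle and 12 for the two triangles.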
We consider a high-dimensional stochastic contextual linear bandit problem when the parameter vector is $s_{0}$-sparse and the decision maker is subject to privacy constraints under both central and local models of differential privacy. We present PrivateLASSO, a differentially private LASSO bandit algorithm. PrivateLASSO is based on two sub-routines: (i) a sparse hard-thresholding-based privacy mechanism and (ii) an episodic thresholding rule for identifying the support of the parameter $\theta$. We prove minimax private lower bounds and establish privacy and utility guarantees for PrivateLASSO for the central model under standard assumptions.
The \emph{Fast Gaussian Transform} (FGT) enables subquadratic-time multiplication of an $n\times n$ Gaussian kernel matrix $\mathsf{K}_{i,j}= \exp ( - \| x_i - x_j \|_2^2 ) $ with an arbitrary vector $h \in \mathbb{R}^n$, where $x_1,\dots, x_n \in \mathbb{R}^d$ are a set of \emph{fixed} source points. This kernel plays a central role in machine learning and random feature maps. Nevertheless, in most modern data analysis applications, datasets are dynamically changing (yet often have low rank), and recomputing the FGT from scratch in (kernel-based) algorithms incurs a major computational overhead ($\gtrsim n$ time for a single source update $\in \mathbb{R}^d$). These applications motivate a \emph{dynamic FGT} algorithm, which maintains a dynamic set of sources under \emph{kernel-density estimation} (KDE) queries in \emph{sublinear time} while retaining Mat-Vec multiplication accuracy and speed. Assuming the dynamic data-points $x_i$ lie in a (possibly changing) $k$-dimensional subspace ($k\leq d$), our main result is an efficient dynamic FGT algorithm, supporting the following operations in $\log^{O(k)}(n/\varepsilon)$ time: (1) Adding or deleting a source point, and (2) Estimating the ``kernel-density'' of a query point with respect to sources with $\varepsilon$ additive accuracy. The core of the algorithm is a dynamic data structure for maintaining the \emph{projected} ``interaction rank'' between source and target boxes, decoupled into finite truncation of Taylor and Hermite expansions.
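For orientation, the interface described above (insert a source, delete a source, query the kernel density) can be written down with a naive baseline that spends time linear in the number of sources on each query. This is not the dynamic FGT data structure and uses no Taylor or Hermite expansions; it only fixes the semantics that a faster structure would have to approximate. The class name `NaiveDynamicKDE` and its method names are assumptions.

```python
import math


class NaiveDynamicKDE:
    """Exact Gaussian kernel sums over a changing set of weighted sources."""

    def __init__(self):
        self._sources = {}  # key -> (point, weight)

    def insert(self, key, point, weight=1.0):
        """Add (or overwrite) a source point with the given weight."""
        self._sources[key] = (tuple(point), weight)

    def delete(self, key):
        """Remove a previously inserted source point."""
        del self._sources[key]

    def query(self, q):
        """Return sum_j w_j * exp(-||q - x_j||_2^2), in O(n * d) time."""
        total = 0.0
        for point, weight in self._sources.values():
            sq_dist = sum((a - b) ** 2 for a, b in zip(q, point))
            total += weight * math.exp(-sq_dist)
        return total
```

A usage example: after inserting sources at `(0, 0)` and `(1, 0)` with unit weight, `query((0, 0))` returns $1 + e^{-1}$; deleting the first source leaves $e^{-1}$.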
Attention computation has both time complexity $O(n^2)$ and space complexity $O(n^2)$, so deploying Large Language Models (LLMs) in streaming applications that involve long contexts requires substantial computational resources. At the recent OpenAI DevDay (Nov 6, 2023), OpenAI released a new model that is able to support a 128K-long document. In our paper, we focus on memory efficiency when the context length $n$ is much greater than 128K ($n \gg 2^d$). Considering a single-layer self-attention with Query, Key, and Value matrices $Q, K, V \in \mathbb{R}^{n \times d}$, the polynomial method approximates the attention output $T \in \mathbb{R}^{n \times d}$. It accomplishes this by constructing $U_1, U_2 \in \mathbb{R}^{n \times t}$ so that ${\sf Attn}(Q, K, V)$ can be computed in $n^{1+o(1)}$ time. Despite this, computing the approximated attention matrix $U_1U_2^\top \in \mathbb{R}^{n \times n}$ still necessitates $O(n^2)$ space, leading to significant memory usage. In response to these challenges, we introduce a new algorithm that reads the data only once, in a streaming fashion. This method employs sublinear space $o(n)$ to store three sketch matrices, alleviating the need for exact $K, V$ storage. Notably, our algorithm exhibits exceptional memory efficiency on super-long token sequences. As the token length $n$ increases, our error guarantee diminishes while the memory usage remains nearly constant. This unique attribute underscores the potential of our technique in efficiently handling LLMs in streaming applications.
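The space issue with the $n \times n$ product can be seen in a few lines. The sketch below computes $(U_1 U_2^\top) V$ in two ways: explicitly, by forming the $n \times n$ matrix, and by a single pass over the rows of $U_2$ and $V$ that accumulates only the $t \times d$ matrix $U_2^\top V$. It illustrates associativity of the product only. It does not construct $U_1, U_2$ via the polynomial method, omits the softmax normalisation, and is not the sketching algorithm of the abstract; all function names and the toy matrices are assumptions.

```python
def matmul(a, b):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(row) for row in zip(*a)]


def streaming_u2t_v(rows):
    """Accumulate U2^T V from a stream of (u2_row, v_row) pairs in one pass.

    Only a t x d accumulator is kept, independent of the stream length n.
    """
    acc = None
    for u2_row, v_row in rows:
        if acc is None:
            acc = [[0] * len(v_row) for _ in u2_row]
        for i, u in enumerate(u2_row):
            for j, v in enumerate(v_row):
                acc[i][j] += u * v
    return acc


# Assumed toy data with n = 3, t = 2, d = 2 (integers, so results are exact).
U1 = [[1, 2], [0, 1], [3, 1]]
U2 = [[1, 0], [2, 1], [0, 1]]
V = [[1, 1], [0, 2], [1, 0]]

explicit = matmul(matmul(U1, transpose(U2)), V)   # forms an n x n matrix
streamed = matmul(U1, streaming_u2t_v(zip(U2, V)))  # never forms it
```

Both `explicit` and `streamed` hold the same $n \times d$ result; the second route touches each row of $U_2$ and $V$ once.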
A popular and flexible time series model for counts is the generalized integer autoregressive process of order $p$, GINAR($p$). These Markov processes are defined using thinning operators evaluated on past values of the process along with a discretely-valued innovation process. This class includes the commonly used INAR($p$) process, defined with binomial thinning and Poisson innovations. GINAR processes can be used in a variety of settings, including modeling time series with low counts, and allow for more general mean-variance relationships, capturing both over- and under-dispersion. While there are many thinning operators and innovation processes given in the literature, less attention has been paid to comparing statistical inference and forecasting procedures over different choices of GINAR process. We provide an extensive study of exact and approximate inference and forecasting methods that can be applied to a wide class of GINAR($p$) processes with general thinning and innovation parameters. We discuss the challenges of exact estimation when $p$ is larger. We summarize and extend asymptotic results for estimators of process parameters, and present simulations to compare small sample performance, highlighting how different methods compare. We illustrate this methodology by fitting GINAR processes to a disease surveillance series.
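The INAR($p$) special case mentioned above is simple to simulate: $X_t = \sum_{i=1}^{p} \alpha_i \circ X_{t-i} + \varepsilon_t$, where $\alpha \circ X$ is binomial thinning (each of the $X$ counts survives independently with probability $\alpha$) and $\varepsilon_t$ is Poisson. The sketch below is a minimal simulator under those assumptions; it covers no other thinning operators or innovation laws and no inference. The function names, the zero initial state, and the use of Knuth's multiplication method for Poisson sampling are choices made here.

```python
import math
import random


def poisson(rng, lam):
    """Sample Poisson(lam) by Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def binomial_thin(rng, alpha, x):
    """Binomial thinning: each of x counts survives with probability alpha."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def simulate_inar(alphas, lam, n, seed=0):
    """Simulate n steps of INAR(p) with Poisson(lam) innovations from state 0."""
    rng = random.Random(seed)
    history = [0] * len(alphas)  # most recent value last
    series = []
    for _ in range(n):
        survivors = sum(
            binomial_thin(rng, alpha, history[-lag])
            for lag, alpha in enumerate(alphas, start=1)
        )
        x = survivors + poisson(rng, lam)
        series.append(x)
        history = history[1:] + [x]
    return series
```

When $\sum_i \alpha_i < 1$, the stationary mean is $\lambda / (1 - \sum_i \alpha_i)$, so for example `simulate_inar([0.3, 0.2], 2.0, 20000)` should have a sample mean close to 4.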
A subset $S$ of the Boolean hypercube $\mathbb{F}_2^n$ is a sumset if $S = \{a + b : a, b\in A\}$ for some $A \subseteq \mathbb{F}_2^n$. Sumsets are central objects of study in additive combinatorics, featuring in several influential results. We prove a lower bound of $\Omega(2^{n/2})$ for the number of queries needed to test whether a Boolean function $f:\mathbb{F}_2^n \to \{0,1\}$ is the indicator function of a sumset. Our lower bound for testing sumsets follows from sharp bounds on the related problem of shift testing, which may be of independent interest. We also give a near-optimal $2^{n/2} \cdot \mathrm{poly}(n)$-query algorithm for a smoothed analysis formulation of the sumset refutation problem.
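For very small $n$ the sumset property can be decided by exhaustive search over all candidate sets $A$. The sketch below encodes elements of $\mathbb{F}_2^n$ as integers in $\{0,\ldots,2^n-1\}$, so that addition is bitwise XOR. The search is doubly exponential in $n$ and is unrelated to the query-efficient testers and lower bounds of the abstract; the function names are assumptions.

```python
from itertools import combinations


def sumset(a):
    """Return A + A = {x + y : x, y in A}, with addition in F_2^n as XOR."""
    return frozenset(x ^ y for x in a for y in a)


def is_sumset(s, n):
    """Decide whether S = A + A for some A, by trying every subset A of F_2^n."""
    target = frozenset(s)
    universe = range(2 ** n)
    for size in range(len(universe) + 1):
        for a in combinations(universe, size):
            if sumset(a) == target:
                return True
    return False
```

Since $a + a = 0$, every nonempty sumset contains the zero vector, so for instance `is_sumset({1}, 2)` is false while `is_sumset({0, 1}, 2)` is true (take $A = \{0, 1\}$).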
A subset $\mathcal{C}\subseteq\{0,1,2\}^n$ is said to be a $\textit{trifferent}$ code (of block length $n$) if for every three distinct codewords $x,y, z \in \mathcal{C}$, there is a coordinate $i\in \{1,2,\ldots,n\}$ where they all differ, that is, $\{x(i),y(i),z(i)\} = \{0,1,2\}$. Let $T(n)$ denote the size of the largest trifferent code of block length $n$. Understanding the asymptotic behavior of $T(n)$ is closely related to determining the zero-error capacity of the $(3/2)$-channel defined by Elias'88, and is a long-standing open problem in the area. Elias showed that $T(n)\leq 2\times (3/2)^n$, and prior to our work the best upper bound was $T(n)\leq 0.6937 \times (3/2)^n$ due to Kurz'23. We improve this bound to $T(n)\leq c \times n^{-2/5}\times (3/2)^n$ where $c$ is an absolute constant.
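The trifference condition is straightforward to check directly, and for tiny block lengths $T(n)$ can be found by exhaustive search. The sketch below does both; the search enumerates all subsets of $\{0,1,2\}^n$ and is only feasible for $n \leq 2$ or so. The function names and the example code are assumptions introduced for illustration.

```python
from itertools import combinations, product


def is_trifferent(code):
    """Check that every 3 distinct codewords all differ in some coordinate."""
    return all(
        any({x[i], y[i], z[i]} == {0, 1, 2} for i in range(len(x)))
        for x, y, z in combinations(code, 3)
    )


def largest_trifferent(n):
    """Return T(n) by brute force over all subsets of {0,1,2}^n (tiny n only)."""
    words = list(product(range(3), repeat=n))
    for size in range(len(words), 0, -1):
        for candidate in combinations(words, size):
            if is_trifferent(candidate):
                return size
    return 0


# An assumed example of a trifferent code of block length 2 with 4 codewords.
EXAMPLE = [(0, 0), (0, 1), (1, 2), (2, 2)]
```

For $n = 2$, Elias's bound gives $T(2) \leq 2 \times (3/2)^2 = 4.5$, and `EXAMPLE` shows that 4 codewords are achievable, so the brute-force search returns 4.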
Let $\mathcal{H}=(X,\mathcal{E})$ be a hypergraph. A support is a graph $Q$ on $X$ such that for each $E\in\mathcal{E}$, the subgraph of $Q$ induced on the elements in $E$ is connected. In this paper, we consider hypergraphs defined on a host graph. Given a graph $G=(V,E)$, with $c:V\to\{\mathbf{r},\mathbf{b}\}$, and a collection of connected subgraphs $\mathcal{H}$ of $G$, a primal support is a graph $Q$ on $\mathbf{b}(V)$ such that for each $H\in \mathcal{H}$, the induced subgraph $Q[\mathbf{b}(H)]$ on vertices $\mathbf{b}(H)=H\cap c^{-1}(\mathbf{b})$ is connected. A \emph{dual support} is a graph $Q^*$ on $\mathcal{H}$ s.t. for each $v\in X$, the induced subgraph $Q^*[\mathcal{H}_v]$ is connected, where $\mathcal{H}_v=\{H\in\mathcal{H}: v\in H\}$. We present sufficient conditions on the host graph and hyperedges so that the resulting support comes from a restricted family. We primarily study two classes of graphs: $(1)$ If the host graph has genus $g$ and the hypergraphs satisfy a topological condition of being \emph{cross-free}, then there is a primal and a dual support of genus at most $g$. $(2)$ If the host graph has treewidth $t$ and the hyperedges satisfy a combinatorial condition of being \emph{non-piercing}, then there exist primal and dual supports of treewidth $O(2^t)$. We show that this exponential blow-up is sometimes necessary. As an intermediate case, we also study the case when the host graph is outerplanar. Finally, we show applications of our results to packing and covering, and coloring problems on geometric hypergraphs.
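The basic definition of a support can be checked mechanically: for each hyperedge, restrict the candidate graph $Q$ to that hyperedge's vertices and test connectivity. The sketch below implements this check for an abstract hypergraph given as a list of vertex sets. It does not construct supports and does not handle the host-graph, genus, or treewidth aspects of the abstract; the function names are assumptions.

```python
def induced_connected(edges, vertices):
    """Is the subgraph induced on `vertices` connected? (Empty set counts as yes.)"""
    vertices = set(vertices)
    if not vertices:
        return True
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            adjacency[u].add(v)
            adjacency[v].add(u)
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:  # depth-first search inside the induced subgraph
        u = stack.pop()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def is_support(q_edges, hyperedges):
    """Check that Q induces a connected subgraph on every hyperedge."""
    return all(induced_connected(q_edges, e) for e in hyperedges)


# Assumed toy hypergraph on X = {0, 1, 2, 3}.
HYPEREDGES = [{0, 1, 2}, {2, 3}]
PATH = [(0, 1), (1, 2), (2, 3)]
STAR = [(0, 3), (1, 3), (2, 3)]
```

The path is a support for this hypergraph, whereas the star centred at 3 is not: the hyperedge $\{0,1,2\}$ induces three isolated vertices.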