We study the average-case version of the Orthogonal Vectors problem, in which one is given as input $n$ vectors from $\{0,1\}^d$ which are chosen randomly so that each coordinate is $1$ independently with probability $p$. Kane and Williams [ITCS 2019] showed how to solve this problem in time $O(n^{2 - \delta_p})$ for a constant $\delta_p > 0$ that depends only on $p$. However, it was previously unclear how to solve the problem faster in the hardest parameter regime where $p$ may depend on $d$. The best prior algorithm was the best worst-case algorithm by Abboud, Williams and Yu [SODA 2014], which in dimension $d = c \cdot \log n$, solves the problem in time $n^{2 - \Omega(1/\log c)}$. In this paper, we give a new algorithm which improves this to $n^{2 - \Omega(\log\log c /\log c)}$ in the average case for any parameter $p$. As in the prior work, our algorithm uses the polynomial method. We make use of a very simple polynomial over the reals, and use a new method to analyze its performance based on computing how its value degrades as the input vectors get farther from orthogonal. To demonstrate the generality of our approach, we also solve the average-case version of the closest pair problem in the same running time.
Kernel methods map data into high-dimensional spaces, enabling linear algorithms to learn nonlinear functions without explicitly storing the feature vectors. Quantum kernel methods promise efficient learning by encoding feature maps into exponentially large Hilbert spaces inherent in quantum systems. In this work we implement quantum kernels on a 10-qubit star-topology register in a nuclear magnetic resonance (NMR) platform. We experimentally encode classical data in the evolution of multiple quantum coherence orders using data-dependent unitary transformations and then demonstrate one-dimensional regression and two-dimensional classification tasks. By extending the register to a double-layered star configuration, we propose an extended quantum kernel to handle non-parametrized operator inputs. By numerically simulating the extended quantum kernel, we show classification of entangling and nonentangling unitaries. These results confirm that quantum kernels exhibit strong capabilities in classical as well as quantum machine learning tasks.
We consider maximizing an unknown monotonic, submodular set function $f: 2^{[n]} \rightarrow [0,1]$ with cardinality constraint under stochastic bandit feedback. At each time $t=1,\dots,T$ the learner chooses a set $S_t \subset [n]$ with $|S_t| \leq k$ and receives reward $f(S_t) + \eta_t$ where $\eta_t$ is mean-zero sub-Gaussian noise. The objective is to minimize the learner's regret with respect to an approximation of the maximum $f(S_*)$ with $|S_*| = k$, obtained through robust greedy maximization of $f$. To date, the best regret bound in the literature scales as $k n^{1/3} T^{2/3}$. And by trivially treating every set as a unique arm one deduces that $\sqrt{ {n \choose k} T }$ is also achievable using standard multi-armed bandit algorithms. In this work, we establish the first minimax lower bound for this setting that scales like $\tilde{\Omega}(\min_{L \le k}(L^{1/3}n^{1/3}T^{2/3} + \sqrt{{n \choose k - L}T}))$. For a slightly restricted algorithm class, we prove a stronger regret lower bound of $\tilde{\Omega}(\min_{L \le k}(Ln^{1/3}T^{2/3} + \sqrt{{n \choose k - L}T}))$. Moreover, we propose an algorithm Sub-UCB that achieves regret $\tilde{\mathcal{O}}(\min_{L \le k}(Ln^{1/3}T^{2/3} + \sqrt{{n \choose k - L}T}))$ capable of matching the lower bound on regret for the restricted class up to logarithmic factors.
We discuss a nondeterministic variant of the recently introduced machine model of deterministic auxiliary depth-$k$ storage automata (or aux-$k$-sda's) by Yamakami. It was proven that all languages recognized by polynomial-time logarithmic-space aux-$k$-sda's are located between $\mathrm{LOGDCFL}$ and $\mathrm{SC}^k$ (the $k$th level of Steve's class SC). We further propose a new and simple computational model of semi-unbounded fan-in Boolean circuits composed partly of cascading blocks, in which the first few AND gates of unbounded fan-out (called AND$_{(\omega)}$ gates) at each layer from the left (where all gates at each layer are indexed from left to right) are linked in a "cascading" manner to their right neighbors though specific AND and OR gates. We use this new circuit model to characterize a nondeterministic variant of the aux-$2k$-sda's (called aux-$2k$-sna's) that run in polynomial time using logarithmic work space. By relaxing the requirement for cascading circuits, we also demonstrate how such cascading circuit families characterize the complexity class $\mathrm{P}$. This yields an upper bound on the computational complexity of $\mathrm{LOG}k\mathrm{SNA}$ by $\mathrm{P}$.
For a field $\mathbb{F}$ and integers $d$ and $k$, a set ${\cal A} \subseteq \mathbb{F}^d$ is called $k$-nearly orthogonal if its members are non-self-orthogonal and every $k+1$ vectors of ${\cal A}$ include an orthogonal pair. We prove that for every prime $p$ there exists some $\delta = \delta(p)>0$, such that for every field $\mathbb{F}$ of characteristic $p$ and for all integers $k \geq 2$ and $d \geq k$, there exists a $k$-nearly orthogonal set of at least $d^{\delta \cdot k/\log k}$ vectors of $\mathbb{F}^d$. The size of the set is optimal up to the $\log k$ term in the exponent. We further prove two extensions of this result. In the first, we provide a large set ${\cal A}$ of non-self-orthogonal vectors of $\mathbb{F}^d$ such that for every two subsets of ${\cal A}$ of size $k+1$ each, some vector of one of the subsets is orthogonal to some vector of the other. In the second extension, every $k+1$ vectors of the produced set ${\cal A}$ include $\ell+1$ pairwise orthogonal vectors for an arbitrary fixed integer $1 \leq \ell \leq k$. The proofs involve probabilistic and spectral arguments and the hypergraph container method.
Martin-L\"{o}f type theory $\mathbf{MLTT}$ was extended by Setzer with the so-called Mahlo universe types. The extension of $\mathbf{MLTT}$ with one Mahlo universe is called $\mathbf{MLM}$ and was introduced to develop a variant of $\mathbf{MLTT}$ equipped with an analogue of a large cardinal. Another instance of constructive systems extended with an analogue of a large set was formulated in the context of Aczel's constructive set theory: $\mathbf{CZF}$. Rathjen, Griffor and Palmgren extended $\mathbf{CZF}$ with inaccessible sets of all transfinite orders. While Rathjen proved that this extended system of $\mathbf{CZF}$ is interpretable in an extension of $\mathbf{MLM}$ with one usual universe type above the Mahlo universe, it is unknown whether it can be interpreted by the Mahlo universe without a universe type above it. We extend $\mathbf{MLM}$ not by a universe type but by the accessibility predicate, and show that $\mathbf{CZF}$ with inaccessible sets can be interpreted in $\mathbf{MLM}$ with the accessibility predicate. Our interpretation of this extension of $\mathbf{CZF}$ is the same as that of Rathjen, Griffor and Palmgren formulated by $\mathbf{MLTT}$ with second-order universe operators, except that we construct the inaccessible sets by using the Mahlo universe and the accessibility predicate. We formalised the main part of our interpretation in the proof assistant Agda.
Given a point set $P$ in a metric space and a real number $t \geq 1$, an \emph{oriented $t$-spanner} is an oriented graph $\overrightarrow{G}=(P,\overrightarrow{E})$, where for every pair of distinct points $p$ and $q$ in $P$, the shortest oriented closed walk in $\overrightarrow{G}$ that contains $p$ and $q$ is at most a factor $t$ longer than the perimeter of the smallest triangle in $P$ containing $p$ and $q$. The \emph{oriented dilation} of a graph $\overrightarrow{G}$ is the minimum $t$ for which $\overrightarrow{G}$ is an oriented $t$-spanner. We present the first algorithm that computes, in Euclidean space, a sparse oriented spanner whose oriented dilation is bounded by a constant. More specifically, for any set of $n$ points in $\mathbb{R}^d$, where $d$ is a constant, we construct an oriented $(2+\varepsilon)$-spanner with $\mathcal{O}(n)$ edges in $\mathcal{O}(n \log n)$ time and $\mathcal{O}(n)$ space. Our construction uses the well-separated pair decomposition and an algorithm that computes a $(1+\varepsilon)$-approximation of the minimum-perimeter triangle in $P$ containing two given query points in $\mathcal{O}(\log n)$ time. While our algorithm is based on first computing a suitable undirected graph and then orienting it, we show that, in general, computing the orientation of an undirected graph that minimises its oriented dilation is NP-hard, even for point sets in the Euclidean plane. We further prove that even if the orientation is already given, computing the oriented dilation is APSP-hard for points in a general metric space. We complement this result with an algorithm that approximates the oriented dilation of a given graph in subcubic time for point sets in $\mathbb{R}^d$, where $d$ is a constant.
The Knaster-Tarski theorem, also known as Tarski's theorem, guarantees that every monotone function defined on a complete lattice has a fixed point. We analyze the query complexity of finding such a fixed point on the $k$-dimensional grid of side length $n$ under the $\leq$ relation. Specifically, there is an unknown monotone function $f: \{0,1,\ldots, n-1\}^k \to \{0,1,\ldots, n-1\}^k$ and an algorithm must query a vertex $v$ to learn $f(v)$. A key special case of interest is the Boolean hypercube $\{0,1\}^k$, which is isomorphic to the power set lattice -- the original setting of the Knaster-Tarski theorem. Our lower bound characterizes the randomized and deterministic query complexity of the Tarski search problem on the Boolean hypercube as $\Theta(k)$. More generally, we prove a randomized lower bound of $\Omega\left( k + \frac{k \cdot \log{n}}{\log{k}} \right)$ for the $k$-dimensional grid of side length $n$, which is asymptotically tight in high dimensions when $k$ is large relative to $n$.
The ideal estimand for comparing a new treatment $Rx$ with a control $C$ is the $\textit{counterfactual}$ efficacy $Rx:C$, the expected differential outcome between $Rx$ and $C$ if each patient were given $\textit{both}$. While counterfactual $\textit{point estimation}$ from $\textit{factual}$ Randomized Controlled Trials (RCTs) has been available, this article shows $\textit{counterfactual}$ uncertainty quantification (CUQ), quantifying uncertainty for factual point estimates but in a counterfactual setting, is surprisingly achievable. We achieve CUQ whose variability is typically smaller than factual UQ, by creating a new statistical modeling principle called ETZ which is applicable to RCTs with $\textit{Before-and-After}$ treatment Repeated Measures, common in many therapeutic areas. We urge caution when estimate of the unobservable true condition of a patient before treatment has measurement error, because that violation of standard regression assumption can cause attenuation in estimating treatment effects. Fortunately, we prove that, for traditional medicine in general, and for targeted therapy with efficacy defined as averaged over the population, counterfactual point estimation is unbiased. However, for targeted therapy, both Real Human and Digital Twins approaches should respect this limitation, lest predicted treatment effect in $\textit{subgroups}$ will have bias.
The growing demand for larger-scale models in the development of \textbf{L}arge \textbf{L}anguage \textbf{M}odels (LLMs) poses challenges for efficient training within limited computational resources. Traditional fine-tuning methods often exhibit instability in multi-task learning and rely heavily on extensive training resources. Here, we propose MoDULA (\textbf{M}ixture \textbf{o}f \textbf{D}omain-Specific and \textbf{U}niversal \textbf{L}oR\textbf{A}), a novel \textbf{P}arameter \textbf{E}fficient \textbf{F}ine-\textbf{T}uning (PEFT) \textbf{M}ixture-\textbf{o}f-\textbf{E}xpert (MoE) paradigm for improved fine-tuning and parameter efficiency in multi-task learning. The paradigm effectively improves the multi-task capability of the model by training universal experts, domain-specific experts, and routers separately. MoDULA-Res is a new method within the MoDULA paradigm, which maintains the model's general capability by connecting universal and task-specific experts through residual connections. The experimental results demonstrate that the overall performance of the MoDULA-Flan and MoDULA-Res methods surpasses that of existing fine-tuning methods on various LLMs. Notably, MoDULA-Res achieves more significant performance improvements in multiple tasks while reducing training costs by over 80\% without losing general capability. Moreover, MoDULA displays flexible pluggability, allowing for the efficient addition of new tasks without retraining existing experts from scratch. This progressive training paradigm circumvents data balancing issues, enhancing training efficiency and model stability. Overall, MoDULA provides a scalable, cost-effective solution for fine-tuning LLMs with enhanced parameter efficiency and generalization capability.
We study the problem of solving matrix games of the form $\max_{\mathbf{w}\in\mathcal{W}}\min_{\mathbf{p}\in\Delta}\mathbf{p}^{\top}A\mathbf{w}$, where $A$ is some matrix and $\Delta$ is the probability simplex. This problem encapsulates canonical tasks such as finding a linear separator and computing Nash equilibria in zero-sum games. However, perhaps surprisingly, its inherent complexity (as formalized in the standard framework of oracle complexity [Nemirovski and Yudin, 1983]) is not well-understood. In this work, we first identify different oracle models which are implicitly used by prior algorithms, amounting to multiplying the matrix $A$ by a vector from either one or both sides. We then prove complexity lower bounds for algorithms under both access models, which in particular imply a separation between them. Specifically, we start by proving that algorithms for linear separability based on one-sided multiplications must require $\Omega(\gamma_A^{-2})$ iterations, where $\gamma_A$ is the margin, as matched by the Perceptron algorithm. We then prove that accelerated algorithms for this task, which utilize multiplications from both sides, must require $\tilde{\Omega}(\gamma_{A}^{-2/3})$ iterations, establishing the first oracle complexity barrier for such algorithms. Finally, by adapting our lower bound to $\ell_1$ geometry, we prove that computing an $\epsilon$-approximate Nash equilibrium requires $\tilde{\Omega}(\epsilon^{-2/5})$ iterations, which is an exponential improvement over the previously best-known lower bound due to Hadiji et al. [2024].