The approximate degree of a Boolean function is the minimum degree of a real polynomial that approximates it pointwise. For any Boolean function, its approximate degree serves as a lower bound on its quantum query complexity, and generically lifts to a quantum communication lower bound for a related function. We introduce a framework for proving approximate degree lower bounds for certain oracle identification problems, where the goal is to recover a hidden binary string $x \in \{0, 1\}^n$ given possibly non-standard oracle access to it. We apply this framework to the ordered search and hidden string problems, proving nearly tight approximate degree lower bounds of $\Omega(n/\log^2 n)$ for each.
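For concreteness, the standard notion referenced here: the $\varepsilon$-approximate degree of $f : \{0,1\}^n \to \{0,1\}$ (with the conventional choice $\varepsilon = 1/3$) is

$$\widetilde{\deg}_{\varepsilon}(f) \;=\; \min\bigl\{\deg p \;:\; p \in \mathbb{R}[x_1,\dots,x_n],\ |p(x) - f(x)| \leq \varepsilon \text{ for all } x \in \{0,1\}^n\bigr\}.$$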
An $f$-edge fault-tolerant distance sensitivity oracle ($f$-DSO) with stretch $\sigma \geq 1$ is a data structure that preprocesses an input graph $G = (V, E)$. When queried with a triple $(s,t,F)$, where $s, t \in V$ and $F \subseteq E$ contains at most $f$ edges of $G$, the oracle returns an estimate $\widehat{d}_{G-F}(s,t)$ of the distance $d_{G-F}(s,t)$ between $s$ and $t$ in the graph $G-F$ such that $d_{G-F}(s,t) \leq \widehat{d}_{G-F}(s,t) \leq \sigma\, d_{G-F}(s,t)$. For any positive integer $k \ge 2$ and any $0 < \alpha < 1$, we present an $f$-DSO with sensitivity $f = o(\log n/\log\log n)$, stretch $2k-1$, space $O(n^{1+\frac{1}{k}+\alpha+o(1)})$, and query time $\widetilde{O}(n^{1+\frac{1}{k} - \frac{\alpha}{k(f+1)}})$. Prior to our work, there were only three known $f$-DSOs with subquadratic space. The first, by Chechik et al. [Algorithmica 2012], has a stretch of $(8k-2)(f+1)$, which depends on $f$. Another approach is to store an $f$-edge fault-tolerant $(2k-1)$-spanner of $G$; the bottleneck there is the large query time due to the size of any such spanner, which is $\Omega(n^{1+1/k})$ under the Erd\H{o}s girth conjecture. Bil\`o et al. [STOC 2023] gave a solution with stretch $3+\varepsilon$ and query time $O(n^{\alpha})$, but space $O(n^{2-\frac{\alpha}{f+1}})$, approaching the quadratic barrier for large sensitivity. In the realm of subquadratic space, our $f$-DSOs are the first that simultaneously guarantee large sensitivity, low stretch, and non-trivial query time. To obtain our results, we use the approximate distance oracles of Thorup and Zwick [JACM 2005], and the derandomization of the $f$-DSO of Weimann and Yuster [TALG 2013] that was recently given by Karthik and Parter [SODA 2021].
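As a plain illustration of the oracle contract (the helper name and the use of networkx for exact distances are my own, not from the paper), the following sketch checks a returned estimate against the stretch guarantee:

    import networkx as nx

    def check_fdso_answer(G, s, t, F, estimate, sigma):
        """Check the f-DSO guarantee: d_{G-F}(s,t) <= estimate <= sigma * d_{G-F}(s,t)."""
        H = G.copy()
        H.remove_edges_from(F)  # form the graph G - F
        try:
            d = nx.shortest_path_length(H, s, t)  # exact distance after the faults
        except nx.NetworkXNoPath:
            return estimate == float("inf")  # s and t disconnected in G - F (assumed convention)
        return d <= estimate <= sigma * d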
The Transposition Distance Problem (TDP) is a classical problem in genome rearrangements that seeks the minimum number of transpositions needed to transform one linear chromosome into another, the two being represented by permutations $\pi$ and $\sigma$. This paper focuses on the equivalent problem of Sorting By Transpositions (SBT), where $\sigma$ is the identity permutation $\iota$. Specifically, we investigate properties of palisades, a family of permutations that are ``hard'' to sort, as they require many transpositions above the celebrated lower bound devised by Bafna and Pevzner. By determining the transposition distance of palisades, we are able to provide the exact transposition diameter of $3$-permutations (TD3), a special subset of the symmetric group $S_n$ that is essential for the study of approximate solutions for SBT using the simplification technique. The exact value of TD3 had remained unknown since Elias and Hartman showed an upper bound for it. Another consequence of determining the transposition distance of palisades is that, using the lower bound of Bafna and Pevzner, it is impossible to guarantee approximation ratios lower than $1.375$ when approximating SBT. This finding has significant implications for the study of SBT, a problem that has been the subject of intense research efforts for the past 25 years.
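For readers unfamiliar with the operation: a transposition exchanges two adjacent blocks of a permutation. A minimal sketch (0-indexed; the index convention is mine, not the paper's):

    def apply_transposition(pi, i, j, k):
        """Exchange the adjacent blocks pi[i:j] and pi[j:k], where 0 <= i < j < k <= len(pi)."""
        return pi[:i] + pi[j:k] + pi[i:j] + pi[k:]

    # one transposition sorts [2, 3, 1]: move the block [1] in front of [2, 3]
    assert apply_transposition([2, 3, 1], 0, 2, 3) == [1, 2, 3]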
The QSAT problem, which asks to evaluate a quantified Boolean formula (QBF), is of fundamental interest in approximation, counting, decision, and probabilistic complexity and is also considered the prototypical PSPACE-complete problem. As such, it has previously been studied under various structural restrictions (parameters), most notably parameterizations of the primal graph representation of instances. Indeed, it is known that QSAT remains PSPACE-complete even when restricted to instances with constant treewidth of the primal graph, but the problem admits a double-exponential fixed-parameter algorithm parameterized by the vertex cover number of the primal graph. However, prior works have left a gap in our understanding of the complexity of QSAT when viewed from the perspective of other natural representations of instances, most notably via incidence graphs. In this paper, we develop structure-aware reductions which allow us to obtain essentially tight lower bounds for highly restricted instances of QSAT, including instances whose incidence graphs have bounded treedepth or feedback vertex number. We complement these lower bounds with novel algorithms for QSAT which establish a nearly complete picture of the problem's complexity under standard graph-theoretic parameterizations. We also show implications for other natural graph representations, and obtain novel upper as well as lower bounds for QSAT under more fine-grained parameterizations of the primal graph.
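To make the two graph representations concrete, here is a small sketch (clauses as lists of signed integers, a common DIMACS-style convention; the matrix of a QBF is just its clause set):

    from itertools import combinations
    import networkx as nx

    def primal_graph(clauses):
        """Vertices are variables; two variables are adjacent iff they co-occur in a clause."""
        G = nx.Graph()
        for clause in clauses:
            variables = {abs(lit) for lit in clause}
            G.add_nodes_from(variables)
            G.add_edges_from(combinations(variables, 2))
        return G

    def incidence_graph(clauses):
        """Bipartite graph: variables on one side, clauses on the other, edges for occurrence."""
        G = nx.Graph()
        for idx, clause in enumerate(clauses):
            for lit in clause:
                G.add_edge(("var", abs(lit)), ("clause", idx))
        return G

    # matrix of, e.g., forall x1 exists x2 x3 : (x1 or not x2) and (x2 or x3)
    clauses = [[1, -2], [2, 3]]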
We study sublinear time algorithms for estimating the size of a maximum matching. After a long line of research, the problem was finally settled by Behnezhad [FOCS'22] in the regime where one is willing to pay an approximation factor of $2$. Very recently, Behnezhad et al. [SODA'23] improved the approximation factor to $(2-\frac{1}{2^{O(1/\gamma)}})$ using $n^{1+\gamma}$ time. This improvement over the factor $2$ is, however, minuscule, and they asked whether even a $1.99$-approximation is possible in $n^{2-\Omega(1)}$ time. We give a strong affirmative answer to this open problem by showing $(1.5+\epsilon)$-approximation algorithms that run in $n^{2-\Theta(\epsilon^{2})}$ time. Our approach is conceptually simple and diverges from all previous sublinear-time matching algorithms: we show a sublinear time algorithm for computing a variant of the edge-degree constrained subgraph (EDCS), a concept that has previously been exploited in the dynamic [Bernstein and Stein ICALP'15, SODA'16], distributed [Assadi et al. SODA'19], and streaming [Bernstein ICALP'20] settings, but never before in the sublinear setting. Independent work: Behnezhad, Roghani and Rubinstein [BRR'23] independently obtained sublinear algorithms similar to our Theorem 1.2 in both the adjacency list and adjacency matrix models. Furthermore, in [BRR'23] they show additional results on strictly better-than-$1.5$-approximate matching algorithms, on both the upper and lower bound sides.
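For reference, the two defining conditions of a $(\beta, \beta^{-})$-EDCS from the earlier literature can be checked directly; the sketch below states the standard conditions and is not the paper's sublinear-time construction:

    from collections import Counter

    def is_edcs(G_edges, H_edges, beta, beta_minus):
        """H is an EDCS of G if edges inside H have bounded H-degree sum, while
        edges of G missing from H see a large H-degree sum."""
        H = {frozenset(e) for e in H_edges}
        deg = Counter(v for e in H for v in e)  # vertex degrees inside H
        for e in map(frozenset, G_edges):
            u, v = tuple(e)
            if e in H and deg[u] + deg[v] > beta:
                return False  # condition (i) violated on an H-edge
            if e not in H and deg[u] + deg[v] < beta_minus:
                return False  # condition (ii) violated on a G \ H edge
        return True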
This paper presents an accelerated proximal gradient method for multiobjective optimization, in which each objective function is the sum of a continuously differentiable, convex function and a closed, proper, convex function. Extending first-order methods to multiobjective problems without scalarization has been widely studied, but providing accelerated methods with rigorous proofs of convergence rates has remained an open problem. Our proposed method is a multiobjective generalization of the accelerated proximal gradient method, also known as the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), for scalar optimization. The key to this successful extension is solving a subproblem with terms exclusive to the multiobjective case. This approach allows us to establish the global convergence rate $O(1/k^2)$ of the proposed method, using a merit function to measure the complexity. Furthermore, we present an efficient way to solve the subproblem via its dual representation, and we confirm the validity of the proposed method through numerical experiments.
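For orientation, here is the scalar FISTA that the method generalizes, in a minimal sketch (the $\ell_1$ prox and the LASSO use case are illustrative choices of mine, not from the paper):

    import numpy as np

    def fista(grad_f, prox_g, L, x0, iters=500):
        """Minimize f(x) + g(x): f convex with L-Lipschitz gradient, g closed proper convex."""
        x, y, t = x0.copy(), x0.copy(), 1.0
        for _ in range(iters):
            x_next = prox_g(y - grad_f(y) / L, 1.0 / L)        # proximal gradient step at y
            t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0  # momentum parameter update
            y = x_next + ((t - 1.0) / t_next) * (x_next - x)   # extrapolation
            x, t = x_next, t_next
        return x

    # e.g. LASSO: grad_f = lambda x: A.T @ (A @ x - b), prox_g = soft-thresholding
    soft_threshold = lambda z, s: np.sign(z) * np.maximum(np.abs(z) - s, 0.0)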
Probabilistic programming languages rely fundamentally on some notion of sampling, and this is doubly true for probabilistic programming languages which perform Bayesian inference using Monte Carlo techniques. Verifying samplers - proving that they generate samples from the correct distribution - is crucial to the use of probabilistic programming languages for statistical modelling and inference. However, the typical denotational semantics of probabilistic programs is incompatible with deterministic notions of sampling. This is problematic, considering that most statistical inference is performed using pseudorandom number generators. We present a higher-order probabilistic programming language centred on the notion of samplers and sampler operations. We give this language an operational and denotational semantics in terms of continuous maps between topological spaces. Our language also supports discontinuous operations, such as comparisons between reals, by using the type system to track discontinuities. This feature might be of independent interest, for example in the context of differentiable programming. Using this language, we develop tools for the formal verification of sampler correctness. We present an equational calculus to reason about equivalence of samplers, and a sound calculus to prove semantic correctness of samplers, i.e. that a sampler correctly targets a given measure by construction.
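The basic phenomenon at issue, in plain Python rather than the paper's language: a sampler is a deterministic map from pseudorandom uniforms to samples, here via the inverse-CDF transform (a standard construction, used only to illustrate the idea):

    import math
    import random

    def exponential_sampler(u, rate=1.0):
        """Deterministic map sending u in (0,1) to an Exp(rate) sample (inverse CDF);
        all the randomness lives in the uniform input u."""
        return -math.log1p(-u) / rate

    rng = random.Random(42)  # pseudorandom number generator with a fixed seed
    samples = [exponential_sampler(rng.random()) for _ in range(10)]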
High-stakes classification refers to classification problems where predicting the wrong class is very costly, but assigning "unknown" is acceptable. We argue that these problems require multiple unknown classes in order to get the most information out of the analysis. By imperfect data we mean covariates with a large number of missing values, large noise variance, and some errors in the data. The combination of high-stakes classification and imperfect data is very common in practice, but it is difficult to handle with current methods. We present a one-class classifier (OCC) for this problem, which we call NBP. The classifier is based on Naive Bayes, simple to implement, and interpretable. We show that NBP gives good predictive performance and works for high-stakes classification based on imperfect data. The model we present is quite simple: it is just an OCC based on density estimation. However, we have long felt a large gap between the applied classification problems we have worked on and the theory and models we use for classification, and this model closes that gap. Our main contribution is the motivation for why this model is a good approach, and we hope that this paper will inspire further development down this path.
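A generic sketch of the kind of model described, i.e. a density-based OCC in the naive Bayes style with an "unknown" output (this is my simplified illustration, not the authors' NBP; comparing scores across rows with different missingness patterns is glossed over here):

    import numpy as np
    from scipy.stats import gaussian_kde

    class DensityOCC:
        """Fit per-feature densities on the target class (naive Bayes independence);
        abstain with 'unknown' when the density score falls below a threshold."""
        def fit(self, X, reject_quantile=0.05):
            self.kdes = [gaussian_kde(col[~np.isnan(col)]) for col in X.T]
            scores = np.array([self._log_density(x) for x in X])
            self.threshold = np.quantile(scores, reject_quantile)  # reject the low tail
            return self

        def _log_density(self, x):
            # missing features (NaN) are simply skipped, as naive Bayes allows
            return sum(np.log(kde(v)[0] + 1e-300)
                       for kde, v in zip(self.kdes, x) if not np.isnan(v))

        def predict(self, X):
            return ["target" if self._log_density(x) >= self.threshold else "unknown"
                    for x in X]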
We propose a novel nonparametric regression framework subject to a positive definiteness constraint. It offers a highly modular approach to estimating covariance functions of stationary processes. Our method can impose positive definiteness, as well as isotropy and monotonicity, on the estimators, and its hyperparameters can be selected via cross-validation. We define our estimators by taking integral transforms of kernel-based distribution surrogates. We then use the iterated density estimation evolutionary algorithm (IDEA), a variant of estimation-of-distribution algorithms, to fit the estimators. We also extend our method to estimate covariance functions for point-referenced data. Compared to alternative approaches, our method provides more reliable estimates for long-range dependence. Several numerical studies are performed to demonstrate the efficacy and performance of our method. Also, we illustrate our method using precipitation data from the Spatial Interpolation Comparison 97 project.
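As background on why an integral transform of a distribution can guarantee positive definiteness (this is the standard Bochner-type mechanism, offered as my illustration; the paper's exact construction may differ): for any distribution function $F$ on $[0,\infty)$,

$$c(h) = \int_0^{\infty} \cos(\omega h)\, \mathrm{d}F(\omega)$$

is automatically a positive definite stationary covariance function in one dimension, so estimating $F$ and transforming it enforces the constraint by construction.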
Buhrman, Cleve and Wigderson (STOC'98) showed that for every Boolean function $f : \{-1,1\}^n \to \{-1,1\}$ and $G \in \{\mathsf{AND}_2, \mathsf{XOR}_2\}$, the bounded-error quantum communication complexity of the composed function $f \circ G$ equals $O(Q(f) \log n)$, where $Q(f)$ denotes the bounded-error quantum query complexity of $f$. This is achieved by Alice running the optimal quantum query algorithm for $f$, using a round of $O(\log n)$ qubits of communication to implement each query. This is in contrast with the classical setting, where it is easy to show that $R^{cc}(f \circ G) \leq 2R(f)$, where $R^{cc}$ and $R$ denote bounded-error communication and query complexity, respectively. We show that the $O(\log n)$ overhead is required for some functions in the quantum setting, and thus the BCW simulation is tight. We note here that prior to our work, the possibility of $Q^{cc}(f \circ G) = O(Q(f))$, for all $f$ and all $G \in \{\mathsf{AND}_2, \mathsf{XOR}_2\}$, had not been ruled out. More specifically, we show the following.
- We show that the $\log n$ overhead is *not* required when $f$ is symmetric, generalizing a result of Aaronson and Ambainis for the Set-Disjointness function (Theory of Computing'05).
- In order to prove the above, we design an efficient distributed version of noisy amplitude amplification that allows us to prove the result when $f$ is the OR function.
- In view of our first result above, one may ask whether the $\log n$ overhead in the BCW simulation can be avoided even when $f$ is transitive, which is a weaker notion of symmetry. We give a strong negative answer by showing that the $\log n$ overhead is still necessary for some transitive functions even when we allow the quantum communication protocol an error probability that can be arbitrarily close to $1/2$.
- We also give, among other things, a general recipe to construct functions for which the $\log n$ overhead is required in the BCW simulation in the bounded-error communication model.
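To fix notation for the composition (standard, matching the $\{-1,1\}$ encoding above): with $G = \mathsf{XOR}_2$, Alice holding $x$ and Bob holding $y$ must jointly evaluate

$$(f \circ \mathsf{XOR}_2)(x, y) = f(x_1 y_1, \dots, x_n y_n),$$

where the product $x_i y_i$ implements XOR in the $\{-1,1\}$ encoding; the BCW simulation computes this with $O(Q(f) \log n)$ qubits of communication.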
This paper introduces a new accurate model for periodic fractional optimal control problems (PFOCPs) using Riemann-Liouville (RL) and Caputo fractional derivatives (FDs) with sliding fixed memory lengths. The paper also provides a novel numerical method for solving PFOCPs using Fourier and Gegenbauer pseudospectral methods. By employing Fourier collocation at equally spaced nodes and Fourier and Gegenbauer quadratures, the method transforms the PFOCP into a simple constrained nonlinear programming problem (NLP) that can be treated easily using standard NLP solvers. We propose a new transformation that largely simplifies the problem of calculating the periodic FDs of periodic functions to the problem of evaluating the integral of the first derivatives of their trigonometric Lagrange interpolating polynomials, which can be treated accurately and efficiently using Gegenbauer quadratures. We introduce the notion of the $\alpha$th-order fractional integration matrix with index $L$ based on Fourier and Gegenbauer pseudospectral approximations, which proves to be very effective in computing periodic FDs. We also provide a rigorous a priori error analysis to predict the quality of the Fourier-Gegenbauer-based approximations to FDs. The numerical results for the benchmark PFOCP demonstrate the performance of the proposed pseudospectral method.
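For concreteness, one common form of a fractional derivative with sliding fixed memory length $L$ (the Caputo variant for $0 < \alpha < 1$; the paper's precise definitions may differ) is

$${}^{C}D^{\alpha}_{t-L,\,t} f(t) = \frac{1}{\Gamma(1-\alpha)} \int_{t-L}^{t} (t-\tau)^{-\alpha} f'(\tau)\, \mathrm{d}\tau,$$

which makes visible why evaluating periodic FDs reduces to integrating the first derivatives of the trigonometric interpolants.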