Tight wavelet frames (TWFs) in $L^2(\mathbb{R}^n)$ are versatile and practical structures that provide the perfect reconstruction property. Nevertheless, existing TWF construction methods exhibit limitations, including a lack of specific methods for generating mother wavelets in extension-based construction, and the necessity to address the sum of squares (SOS) problem even when specific methods for generating mother wavelets are provided in SOS-based construction. It is a common practice for current TWF constructions to begin with a given refinable function. However, this approach places the entire burden on finding suitable mother wavelets. In this paper, we introduce TWF construction methods that spread the burden between both types of functions: refinable functions and mother wavelets. These construction methods offer an alternative approach to circumvent the SOS problem while providing specific techniques for generating mother wavelets. We present examples to illustrate our construction methods.
An \emph{$\alpha$-approximate vertex fault-tolerant distance sensitivity oracle} (\emph{$\alpha$-VSDO}) for a weighted input graph $G=(V, E, w)$ and a source vertex $s \in V$ is a data structure that returns an $\alpha$-approximate distance from $s$ to $t$ in $G-x$ for any given query $(x, t) \in V \times V$. It is a data structure version of the so-called single-source replacement path problem (SSRP). In this paper, we present a new \emph{nearly linear-time} algorithm for constructing a $(1 + \epsilon)$-VSDO for any directed input graph with polynomially bounded integer edge weights. More precisely, the presented oracle attains $\tilde{O}(m \log (nW)/ \epsilon + n \log^2 (nW)/\epsilon^2)$ construction time, $\tilde{O}(n \log (nW) / \epsilon)$ size, and $\tilde{O}(1/\epsilon)$ query time, where $n$ is the number of vertices, $m$ is the number of edges, and $W$ is the maximum edge weight. These bounds are all optimal up to polylogarithmic factors. To the best of our knowledge, this is the first non-trivial algorithm for SSRP/VSDO beating $\tilde{O}(mn)$ computation time for directed graphs with general edge weight functions, and also the first nearly linear-time construction breaking approximation factor 3. Such a construction has been unknown even for undirected and unweighted graphs. In addition, our result implies that the known conditional lower bounds for the exact SSRP computation do not apply to the case of approximation.
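To make the query interface concrete, the following is a minimal sketch of the trivial exact baseline, not the nearly linear-time construction described above: it runs one Dijkstra search per failed vertex $x$, which costs roughly $\tilde{O}(mn)$ time in total. The function names `dijkstra`, `build_naive_oracle`, and `query`, as well as the toy graph, are illustrative assumptions.

```python
import heapq

def dijkstra(adj, source, removed=None):
    """Exact distances from `source` in the graph with vertex `removed` deleted."""
    dist = {}
    if source == removed:
        return dist  # nothing is reachable once the source itself fails
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in dist:
            continue  # stale heap entry
        dist[u] = d
        for v, w in adj.get(u, []):
            if v != removed and v not in dist:
                heapq.heappush(heap, (d + w, v))
    return dist

def build_naive_oracle(adj, vertices, source):
    """Precompute one distance table per failed vertex x (n Dijkstra runs)."""
    return {x: dijkstra(adj, source, removed=x) for x in vertices}

def query(oracle, x, t):
    """Distance from the source to t in G - x; infinity if t is unreachable."""
    return oracle[x].get(t, float("inf"))

# Hypothetical toy graph: adjacency lists of (neighbour, weight) pairs.
adj = {"s": [("a", 1), ("b", 4)], "a": [("t", 1)], "b": [("t", 1)], "t": []}
vertices = ["s", "a", "b", "t"]
oracle = build_naive_oracle(adj, vertices, "s")
```

This baseline stores $\Theta(n^2)$ distances and answers exactly; the oracle in the abstract trades exactness for a $(1+\epsilon)$ factor in exchange for much smaller construction time and size.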
In the Directed Steiner Tree (DST) problem the input is a directed edge-weighted graph $G=(V,E)$, a root vertex $r$ and a set $S \subseteq V$ of $k$ terminals. The goal is to find a min-cost subgraph that connects $r$ to each of the terminals. DST admits an $O(\log^2 k/\log \log k)$-approximation in quasi-polynomial time, and an $O(k^{\epsilon})$-approximation for any fixed $\epsilon > 0$ in polynomial time. Resolving the existence of a polynomial-time poly-logarithmic approximation is a major open problem in approximation algorithms. In a recent work, Friggstad and Mousavi [ICALP 2023] obtained a simple and elegant polynomial-time $O(\log k)$-approximation for DST in planar digraphs via Thorup's shortest path separator theorem. We build on their work and obtain several new results on DST and related problems.
- We develop a tree embedding technique for rooted problems in planar digraphs via an interpretation of the recursion in Friggstad and Mousavi [ICALP 2023]. Using this we obtain polynomial-time poly-logarithmic approximations for Group Steiner Tree, Covering Steiner Tree, and the Polymatroid Steiner Tree problems in planar digraphs. All these problems are hard to approximate to within a factor of $\Omega(\log^2 n/\log \log n)$ even in trees.
- We prove that the natural cut-based LP relaxation for DST has an integrality gap of $O(\log^2 k)$ in planar graphs. This is in contrast to general graphs where the integrality gap of this LP is known to be $\Omega(k)$ and $\Omega(n^{\delta})$ for some fixed $\delta > 0$.
- We combine the preceding results with density based arguments to obtain poly-logarithmic approximations for the multi-rooted versions of the problems in planar digraphs. For DST our result improves the $O(R + \log k)$ approximation of Friggstad and Mousavi [ICALP 2023] when $R= \omega(\log^2 k)$.
We prove a lower bound on the communication complexity of computing the $n$-fold xor of an arbitrary function $f$, in terms of the communication complexity and rank of $f$. Specifically, we show that $D(f^{\oplus n}) \geq n \cdot \Big(\frac{\Omega(D(f))}{\log \mathsf{rk}(f)} -\log \mathsf{rk}(f)\Big)$, where $D(f)$ and $D(f^{\oplus n})$ denote the deterministic communication complexities of $f$ and $f^{\oplus n}$, and $\mathsf{rk}(f)$ is the rank of $f$. Our methods involve a new way to use information theory to reason about deterministic communication complexity.
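For orientation, the $n$-fold xor is usually defined as follows. The sketch assumes a Boolean-valued $f : \mathcal{X} \times \mathcal{Y} \to \{0,1\}$; the symbols $\mathcal{X}$, $\mathcal{Y}$, $x_i$, $y_i$ are notation introduced here, not taken from the abstract.

\[
f^{\oplus n}\big((x_1,\dots,x_n),(y_1,\dots,y_n)\big) \;=\; f(x_1,y_1) \oplus f(x_2,y_2) \oplus \cdots \oplus f(x_n,y_n).
\]

Solving the $n$ instances one after another gives the trivial upper bound $D(f^{\oplus n}) \leq n \cdot D(f)$, so a lower bound of the stated form says that this naive strategy can be beaten by at most a factor of order $\log \mathsf{rk}(f)$, up to the additive $n \log \mathsf{rk}(f)$ term.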
An AVL tree is a binary search tree that guarantees $O(\log n)$ search. The guarantee is obtained at the cost of rebalancing the AVL tree, potentially after every insertion or deletion. This article proposes a deletion algorithm that requires less rebalancing after a deletion than a previously reported algorithm.
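As background for what "rebalancing" means here, the following is a minimal sketch of textbook AVL insertion with rotations. It is not the deletion algorithm proposed in the article, and the names `Node`, `rebalance`, `insert`, and `inorder` are illustrative assumptions.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 1

def height(node):
    return node.height if node else 0

def update(node):
    node.height = 1 + max(height(node.left), height(node.right))

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y  # x becomes the new subtree root
    update(y)
    update(x)
    return x

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x  # y becomes the new subtree root
    update(x)
    update(y)
    return y

def rebalance(node):
    """Restore the AVL invariant |height(left) - height(right)| <= 1 at `node`."""
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:
        if height(node.left.left) < height(node.left.right):
            node.left = rotate_left(node.left)  # left-right case
        return rotate_right(node)
    if balance < -1:
        if height(node.right.right) < height(node.right.left):
            node.right = rotate_right(node.right)  # right-left case
        return rotate_left(node)
    return node

def insert(node, key):
    """Insert `key` (duplicates are ignored) and rebalance on the way back up."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return rebalance(node)

def inorder(node):
    return inorder(node.left) + [node.key] + inorder(node.right) if node else []
```

Deletion uses the same `rebalance` step on every ancestor of the removed node, which is the cost a cheaper deletion algorithm would aim to reduce.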
Given a graph $G=(V,E)$ and an integer $k\in \mathbb{N}$, we investigate the 2-Eigenvalue Vertex Deletion (2-EVD) problem. The objective is to remove at most $k$ vertices such that the adjacency matrix of the resulting graph has at most two eigenvalues. It is known that the adjacency matrix of a graph has at most two eigenvalues if and only if the graph is a collection of equal-sized cliques. Thus, 2-EVD amounts to removing a set of at most $k$ vertices to transform the graph into a collection of equal-sized cliques. The 2-Eigenvalue Edge Editing (2-EEE), 2-Eigenvalue Edge Deletion (2-EED) and 2-Eigenvalue Edge Addition (2-EEA) problems are defined analogously. We present a kernel of size $\mathcal{O}(k^{3})$ for $2$-EVD, along with an FPT algorithm with a running time of $\mathcal{O}^{*}(2^{k})$. For $2$-EEE, we provide a kernel of size $\mathcal{O}(k^{2})$. Additionally, we present linear kernels of size $5k$ and $6k$ for $2$-EEA and $2$-EED respectively. For $2$-EED, we also construct an algorithm with running time $\mathcal{O}^{*}(1.47^{k})$. These results address open questions posed by Misra et al. (ISAAC 2023) regarding the complexity of these problems when parameterized by the solution size.
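The characterization via equal-sized cliques makes the target property easy to test combinatorially. The sketch below checks it and solves tiny 2-EVD instances by exhaustive search; it is an illustration under the assumption of a simple undirected graph, not the kernelization or the $\mathcal{O}^{*}(2^{k})$ algorithm from the abstract. The function names are hypothetical.

```python
from itertools import combinations

def is_equal_clique_union(vertices, edges):
    """True iff the graph is a disjoint union of cliques that all have the same size."""
    closed = {v: {v} for v in vertices}  # closed neighbourhoods N[v]
    for u, v in edges:
        closed[u].add(v)
        closed[v].add(u)
    seen, sizes = set(), set()
    for v in vertices:
        if v in seen:
            continue
        block = closed[v]
        # In a union of cliques, all vertices of a component share one closed neighbourhood.
        if any(closed[u] != block for u in block):
            return False
        seen |= block
        sizes.add(len(block))
    return len(sizes) <= 1

def brute_force_2evd(vertices, edges, k):
    """Smallest deletion set of size <= k, found by trying all subsets; None if none exists."""
    for size in range(k + 1):
        for removed in combinations(vertices, size):
            gone = set(removed)
            kept = [v for v in vertices if v not in gone]
            kept_edges = [(u, v) for u, v in edges if u not in gone and v not in gone]
            if is_equal_clique_union(kept, kept_edges):
                return gone
    return None
```

The exhaustive search takes time exponential in $n$ rather than in $k$, so it only serves to check small examples against the definition.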
We develop a learning algorithm for closed signal flow graphs, a graphical model of signal transducers. The algorithm relies on the correspondence between closed signal flow graphs and weighted finite automata on a singleton alphabet. We demonstrate that this procedure results in a genuine reduction of complexity: our algorithm fares better than existing learning algorithms for weighted automata restricted to the case of a singleton alphabet.
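To illustrate the automaton side of this correspondence, here is a minimal sketch of how a weighted finite automaton over a one-letter alphabet $\{a\}$ assigns a weight to the word $a^k$, namely $\alpha^{\top} A^{k} \beta$ for an initial vector $\alpha$, a transition matrix $A$, and a final vector $\beta$. The names `wfa_weight`, `alpha`, `A`, `beta` and the Fibonacci example are assumptions chosen for illustration; the learning algorithm itself is not shown.

```python
def wfa_weight(alpha, A, beta, k):
    """Weight of the word a^k: the row vector alpha times A^k times beta."""
    vec = list(alpha)
    for _ in range(k):
        # One transition step: vec <- vec * A (row vector times matrix).
        vec = [sum(vec[i] * A[i][j] for i in range(len(vec))) for j in range(len(A[0]))]
    return sum(v * b for v, b in zip(vec, beta))

# Hypothetical two-state automaton whose weights are the Fibonacci numbers.
alpha = [1, 0]
A = [[1, 1],
     [1, 0]]
beta = [0, 1]
stream = [wfa_weight(alpha, A, beta, k) for k in range(8)]
```

Since there is only one letter, the behaviour of such an automaton is just a sequence of weights indexed by $k$, which is what makes the singleton-alphabet case a natural fit for stream-like signals.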
Given a (multi)graph $G$ which contains a bipartite subgraph with $\rho$ edges, what is the largest triangle-free subgraph of $G$ that can be found efficiently? We present an SDP-based algorithm that finds one with at least $0.8823 \rho$ edges, thus improving on the subgraph with $0.878 \rho$ edges obtained by the classic Max-Cut algorithm of Goemans and Williamson. On the other hand, by a reduction from Håstad's 3-bit PCP we show that it is NP-hard to find a triangle-free subgraph with $(25 / 26 + \epsilon) \rho \approx (0.961 + \epsilon) \rho$ edges. As an application, we classify the Maximum Promise Constraint Satisfaction Problem MaxPCSP($G$,$H$) for all bipartite $G$: Given an input (multi)graph $X$ which admits a $G$-colouring satisfying $\rho$ edges, find an $H$-colouring of $X$ that satisfies $\rho$ edges. This problem is solvable in polynomial time, apart from trivial cases, if $H$ contains a triangle, and is NP-hard otherwise.
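The reason a Max-Cut algorithm yields a triangle-free subgraph at all is that the edges crossing any cut form a bipartite graph, and bipartite graphs contain no triangles. The sketch below demonstrates this with a simple local-search cut, which is a much weaker baseline (at least half of all edges) than either the Goemans-Williamson algorithm or the SDP-based algorithm above. It assumes a loop-free graph on vertices $0,\dots,n-1$, and the function names are hypothetical.

```python
from itertools import combinations

def local_search_cut(n, edges):
    """Flip vertices while that increases the cut; return the crossing edges."""
    side = [0] * n
    improved = True
    while improved:
        improved = False
        for v in range(n):
            same = sum(1 for a, b in edges if v in (a, b) and side[a] == side[b])
            cross = sum(1 for a, b in edges if v in (a, b) and side[a] != side[b])
            if same > cross:  # moving v strictly increases the number of cut edges
                side[v] ^= 1
                improved = True
    return [(a, b) for a, b in edges if side[a] != side[b]]

def has_triangle(n, edges):
    """Brute-force triangle test, suitable only for small graphs."""
    present = {frozenset(e) for e in edges}
    return any(
        all(frozenset(pair) in present for pair in combinations(triple, 2))
        for triple in combinations(range(n), 3)
    )

# Complete graph on four vertices: it contains triangles, its cut edges do not.
k4 = list(combinations(range(4), 2))
cut_edges = local_search_cut(4, k4)
```

At a local optimum every vertex has at least as many crossing as non-crossing edges, so the cut keeps at least half of all edges and hence at least $\rho/2$.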
Sorting has a natural generalization where the input consists of: (1) a ground set $X$ of size $n$, (2) a partial oracle $O_P$ specifying some fixed partial order $P$ on $X$ and (3) a linear oracle $O_L$ specifying a linear order $L$ that extends $P$. The goal is to recover the linear order $L$ on $X$ using as few linear oracle queries as possible. In this problem, we measure algorithmic complexity through three metrics: oracle queries to $O_L$, oracle queries to $O_P$, and the time spent. Any algorithm requires worst-case $\log_2 e(P)$ linear oracle queries to recover the linear order on $X$, where $e(P)$ denotes the number of linear extensions of $P$. Kahn and Saks presented the first algorithm that uses $\Theta(\log e(P))$ linear oracle queries (using $O(n^2)$ partial oracle queries and exponential time). The state-of-the-art for the general problem is by Cardinal, Fiorini, Joret, Jungers and Munro who at STOC'10 manage to separate the linear and partial oracle queries into a preprocessing and query phase. They can preprocess $P$ using $O(n^2)$ partial oracle queries and $O(n^{2.5})$ time. Then, given $O_L$, they uncover the linear order on $X$ in $\Theta(\log e(P))$ linear oracle queries and $O(n + \log e(P))$ time -- which is worst-case optimal in the number of linear oracle queries but not in the time spent. For $c \geq 1$, our algorithm can preprocess $O_P$ using $O(n^{1 + \frac{1}{c}})$ queries and time. Given $O_L$, we uncover $L$ using $\Theta(c \log e(P))$ queries and time. We show a matching lower bound, as there exist positive constants $(\alpha, \beta)$ where for any constant $c \geq 1$, any algorithm that uses at most $\alpha \cdot n^{1 + \frac{1}{c}}$ preprocessing must use worst-case at least $\beta \cdot c \log e(P)$ linear oracle queries. Thus, we solve the problem of sorting under partial information through an algorithm that is asymptotically tight across all three metrics.
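The information-theoretic bound $\log_2 e(P)$ can be checked on small examples, reading $e(P)$ as the number of linear extensions of $P$. The following sketch counts linear extensions with a dynamic program over subsets; it runs in $O(2^n n)$ time, is meant only for tiny posets, and is not part of the algorithm described above. The function names and the encoding of $P$ as pairs $(a, b)$ meaning $a < b$ are assumptions.

```python
from math import ceil, log2

def count_linear_extensions(n, relations):
    """e(P) for an acyclic set of relations (a, b), meaning a < b, on elements 0..n-1."""
    preds = [0] * n
    for a, b in relations:
        preds[b] |= 1 << a  # bitmask of elements that must precede b
    ways = [0] * (1 << n)
    ways[0] = 1  # the empty prefix
    for mask in range(1 << n):
        if ways[mask] == 0:
            continue
        for x in range(n):
            placed = mask & (1 << x)
            # x may come next if it is unplaced and all its predecessors are placed.
            if not placed and (preds[x] & mask) == preds[x]:
                ways[mask | (1 << x)] += ways[mask]
    return ways[-1]

def query_lower_bound(n, relations):
    """ceil(log2 e(P)): worst-case number of yes/no comparisons any algorithm needs."""
    return ceil(log2(count_linear_extensions(n, relations)))
```

With no relations on three elements there are $3! = 6$ linear extensions, which recovers the familiar bound of $\lceil \log_2 6 \rceil = 3$ comparisons for sorting three items.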
A Gaussian Cox process is a popular model for point process data, in which the intensity function is a transformation of a Gaussian process. Posterior inference of this intensity function involves an intractable integral (i.e., the cumulative intensity function) in the likelihood, resulting in a doubly intractable posterior distribution. Here, we propose a nonparametric Bayesian approach for estimating the intensity function of an inhomogeneous Poisson process without reliance on large data augmentation or approximations of the likelihood function. We propose to jointly model the intensity and the cumulative intensity function as a transformed Gaussian process, allowing us to directly bypass the need to approximate the cumulative intensity function in the likelihood. We propose an exact MCMC sampler for posterior inference and evaluate its performance on simulated data. We demonstrate the utility of our method in three real-world scenarios including temporal and spatial event data, as well as aggregated time count data collected at multiple resolutions. Finally, we discuss extensions of our proposed method to other point processes.
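For reference, the integral in question appears in the standard likelihood of an inhomogeneous Poisson process. The notation below (observation window $\mathcal{T}$, events $t_1,\dots,t_N$, intensity $\lambda$, cumulative intensity $\Lambda$) is introduced here for illustration and is not taken from the abstract.

\[
p(t_1,\dots,t_N \mid \lambda) \;=\; \exp\!\big(-\Lambda(\mathcal{T})\big) \prod_{i=1}^{N} \lambda(t_i),
\qquad
\Lambda(\mathcal{T}) \;=\; \int_{\mathcal{T}} \lambda(s)\, ds .
\]

When $\lambda$ is a transformed Gaussian process, $\Lambda(\mathcal{T})$ generally has no closed form, which is the source of the intractability described above.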
We consider the problem of sampling from the posterior distribution of a $d$-dimensional coefficient vector $\boldsymbol{\theta}$, given linear observations $\boldsymbol{y} = \boldsymbol{X}\boldsymbol{\theta}+\boldsymbol{\varepsilon}$. In general, such posteriors are multimodal, and therefore challenging to sample from. This observation has prompted the exploration of various heuristics that aim at approximating the posterior distribution. In this paper, we study a different approach based on decomposing the posterior distribution into a log-concave mixture of simple product measures. This decomposition allows us to reduce sampling from a multimodal distribution of interest to sampling from a log-concave one, which is tractable and has been investigated in detail. We prove that, under mild conditions on the prior, for random designs, such measure decomposition is generally feasible when the number of samples per parameter $n/d$ exceeds a constant threshold. We thus obtain a provably efficient (polynomial time) sampling algorithm in a regime where this was previously not known. Numerical simulations confirm that the algorithm is practical, and reveal that it has attractive statistical properties compared to state-of-the-art methods.