
Given two strings $A[1..n]$ and $B[1..m]$, and a set of operations allowed to edit the strings, the edit distance between $A$ and $B$ is the minimum number of operations required to transform $A$ into $B$. Sequentially, a standard Dynamic Programming (DP) algorithm solves edit distance with $\Theta(nm)$ cost. In many real-world applications, the strings to be compared are similar and have small edit distances. To achieve highly practical implementations, we focus on output-sensitive parallel edit-distance algorithms, i.e., algorithms that achieve asymptotically better cost bounds than the standard $\Theta(nm)$ algorithm when the edit distance is small. We study four algorithms in the paper, including three algorithms based on Breadth-First Search (BFS) and one algorithm based on Divide-and-Conquer (DaC). Our BFS-based solution builds on the Landau-Vishkin algorithm. We implement three different data structures for the longest common prefix (LCP) queries needed in the algorithm: the classic solution using a parallel suffix array, and two hash-based solutions proposed in this paper. Our DaC-based solution is inspired by the output-insensitive solution proposed by Apostolico et al., and we propose a non-trivial adaptation to make it output-sensitive. All our algorithms have good theoretical guarantees, and they achieve different tradeoffs between work (total number of operations), span (longest dependence chain in the computation), and space. We test and compare our algorithms on both synthetic data and real-world data. Our BFS-based algorithms outperform the existing parallel edit-distance implementation in ParlayLib in all test cases. By comparing our algorithms, we also provide a better understanding of the choice of algorithm for different input patterns. We believe that our paper is the first systematic study of the theory and practice of parallel edit distance.
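To make the output-sensitive idea concrete, here is a minimal sequential sketch of the BFS-style, furthest-reaching-diagonal approach underlying Landau-Vishkin. This is an illustration only, not the parallel implementation described above, and it uses plain character comparisons in place of LCP queries: the edit budget $d$ grows one unit at a time, and on each diagonal only the furthest reachable row is kept and extended greedily along matches, giving roughly $O((n+m)\,d)$ time.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance, output-sensitive: O((n + m) * d) time where d is
    the answer (furthest-reaching diagonals, Ukkonen / Landau-Vishkin style)."""
    n, m = len(a), len(b)

    def slide(i: int, k: int) -> int:
        # Extend along matching characters on diagonal k = j - i.
        while i < n and i + k < m and a[i] == b[i + k]:
            i += 1
        return i

    # fr[k] = furthest row i reached on diagonal k using the current budget d.
    fr = {0: slide(0, 0)}
    d = 0
    while fr.get(m - n, -1) < n:          # stop once the corner (n, m) is reached
        d += 1
        new_fr = {}
        for k in range(-d, d + 1):
            cand = -1
            if k in fr:
                cand = max(cand, fr[k] + 1)       # substitution
            if k + 1 in fr:
                cand = max(cand, fr[k + 1] + 1)   # delete a[i]
            if k - 1 in fr:
                cand = max(cand, fr[k - 1])       # insert b[j]
            if cand < 0:
                continue
            i = min(cand, n, m - k)               # clamp to the DP grid
            if i < 0 or i + k < 0:
                continue
            new_fr[k] = slide(i, k)
        fr = new_fr
    return d
```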

Related Content

We give an algorithm that, given an $n$-vertex graph $G$ and an integer $k$, in time $2^{O(k)} n$ either outputs a tree decomposition of $G$ of width at most $2k + 1$ or determines that the treewidth of $G$ is larger than $k$. This is the first 2-approximation algorithm for treewidth that is faster than the known exact algorithms, and in particular improves upon the previous best approximation ratio of 5 in time $2^{O(k)} n$ given by Bodlaender et al. [SIAM J. Comput., 45 (2016)]. Our algorithm works by applying incremental improvement operations to a tree decomposition, using an approach inspired by a proof of Bellenbaum and Diestel [Comb. Probab. Comput., 11 (2002)].
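For concreteness, the following sketch (our own illustration, not code from the paper) verifies that a candidate tree decomposition is valid and has width at most a given bound. The representation of `graph_edges`, `bags`, and `tree_edges` is an assumption made for the example, and the sketch assumes `tree_edges` already forms a tree on the keys of `bags`.

```python
from collections import defaultdict, deque

def is_valid_decomposition(graph_edges, bags, tree_edges, width_bound):
    """Check that (bags, tree_edges) is a tree decomposition of the graph given
    by graph_edges, with width at most width_bound.
    bags: dict tree-node -> set of graph vertices; tree_edges: list of node pairs."""
    # Width = size of the largest bag minus one.
    if max(len(b) for b in bags.values()) - 1 > width_bound:
        return False
    # Every graph edge must be contained in some bag.
    for u, v in graph_edges:
        if not any(u in b and v in b for b in bags.values()):
            return False
    # For every vertex, the tree nodes whose bags contain it must induce a connected subtree.
    adj = defaultdict(list)
    for x, y in tree_edges:
        adj[x].append(y)
        adj[y].append(x)
    vertices = {v for e in graph_edges for v in e} | set().union(*bags.values())
    for v in vertices:
        hosts = [t for t, b in bags.items() if v in b]
        if not hosts:
            return False                      # vertex not covered by any bag
        seen, queue = {hosts[0]}, deque([hosts[0]])
        while queue:
            t = queue.popleft()
            for s in adj[t]:
                if s not in seen and v in bags[s]:
                    seen.add(s)
                    queue.append(s)
        if len(seen) != len(hosts):
            return False
    return True
```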

The Ramsey number $R(s, t)$ is the minimum number of nodes $n$ such that every undirected simple graph of order $n$ contains a clique of order $s$ or an independent set of order $t$. This paper explores the application of a best-first search algorithm and reinforcement learning (RL) techniques to find counterexamples to specific Ramsey numbers. We incrementally improve over prior search methods such as random search by introducing a graph vectorization and a deep neural network (DNN)-based heuristic that gauges the likelihood of a graph being a counterexample. The paper also proposes algorithmic optimizations that confine the search to a polynomial runtime. This paper does not aim to present new counterexamples but rather introduces and evaluates a framework supporting Ramsey counterexample exploration using other heuristics. Code and methods are made available through a PyPI package and GitHub repository.
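As an illustration of the object being searched for, the hypothetical helper below brute-force checks that a small graph witnesses $R(s,t) > n$, i.e., contains neither a clique of order $s$ nor an independent set of order $t$. It is exponential in $n$ and is not one of the paper's optimizations.

```python
from itertools import combinations

def is_counterexample(n, edges, s, t):
    """Brute-force check that the n-vertex graph with the given edge set contains
    no clique of order s and no independent set of order t (i.e., R(s, t) > n).
    Only practical for small n."""
    adj = {(min(u, v), max(u, v)) for u, v in edges}
    connected = lambda u, v: (min(u, v), max(u, v)) in adj
    # Reject if some s-subset is a clique.
    for sub in combinations(range(n), s):
        if all(connected(u, v) for u, v in combinations(sub, 2)):
            return False
    # Reject if some t-subset is an independent set.
    for sub in combinations(range(n), t):
        if not any(connected(u, v) for u, v in combinations(sub, 2)):
            return False
    return True
```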

A modified successive cancellation list (SCL) decoder is proposed for polar-coded probabilistic shaping. The decoder exploits the deterministic encoding rule for shaping bits to rule out candidate code words that the encoder would not generate. This provides error detection and decreases error rates compared to standard SCL decoding while at the same time reducing the length of the outer cyclic redundancy check code.

The integer complexity $f(n)$ of a positive integer $n$ is defined as the minimum number of 1's needed to represent $n$, using additions, multiplications and parentheses. We present two simple and faster algorithms for computing the integer complexity: 1) A near-optimal $O(N\mathop{\mathrm{polylog}} N)$-time algorithm for computing the integer complexity of all $n\leq N$, improving on the previous $O(N^{1.223})$-time algorithm [Cordwell et al., 2017]. 2) The first sublinear-time algorithm for computing the integer complexity of a single $n$, with running time $O(n^{0.6154})$. The previous algorithms for computing a single $f(n)$ require computing all $f(1),\dots,f(n)$.
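For reference, the defining recurrence $f(1)=1$ and $f(n)=\min\big(\min_{a+b=n} f(a)+f(b),\ \min_{ab=n} f(a)+f(b)\big)$ can be evaluated by a direct DP. The sketch below is this straightforward baseline (roughly $O(N^2)$ time), not either of the faster algorithms described above.

```python
def integer_complexity_table(N):
    """Compute f(n) for all n <= N by the defining recurrence:
    the top-level operation of an optimal expression is either + or *."""
    INF = float("inf")
    f = [INF] * (N + 1)
    f[1] = 1
    for n in range(2, N + 1):
        # Additions: try all splits n = a + (n - a). Faster methods restrict
        # the range of a; the full range is used here for clarity.
        best = min(f[a] + f[n - a] for a in range(1, n // 2 + 1))
        # Multiplications over nontrivial divisors n = d * (n // d).
        d = 2
        while d * d <= n:
            if n % d == 0:
                best = min(best, f[d] + f[n // d])
            d += 1
        f[n] = best
    return f
```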

A $d$-dimensional simplicial complex $X$ is said to support a direct product tester if any locally consistent function defined on its $k$-faces (where $k\ll d$) necessarily comes from a function over its vertices. More precisely, a direct product tester has a distribution $\mu$ over pairs of $k$-faces $(A,A')$, and given query access to $F\colon X(k)\to\{0,1\}^k$ it samples $(A,A')\sim \mu$ and checks that $F[A]|_{A\cap A'} = F[A']|_{A\cap A'}$. The tester should have (1) the "completeness property", meaning that any assignment $F$ which is a direct product assignment passes the test with probability $1$, and (2) the "soundness property", meaning that if $F$ passes the test with probability $s$, then $F$ must be correlated with a direct product function. Dinur and Kaufman showed that a complex $X$ with sufficiently good spectral expansion admits a direct product tester in the "high soundness" regime where $s$ is close to $1$. They asked whether there are high-dimensional expanders that support direct product tests in the "low soundness" regime, where $s$ is close to $0$. We give a characterization of high-dimensional expanders that support a direct product tester in the low soundness regime. We show that spectral expansion is insufficient, and the complex must additionally satisfy a variant of coboundary expansion, which we refer to as Unique-Games coboundary expansion. This property can be seen as a high-dimensional generalization of the standard notion of coboundary expansion over non-Abelian groups for 2-dimensional complexes. It asserts that any locally consistent Unique-Games instance obtained using the low-level faces of the complex must admit a good global solution.
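The test itself is easy to state in code. The sketch below is only illustrative: it estimates the acceptance probability of an assignment, assuming hypothetical interfaces `F(A)` (query access to the assignment on a $k$-face, returned as a vertex-to-bit map) and `sample_pair()` (a sampler for the tester's distribution $\mu$).

```python
def direct_product_test(F, sample_pair, trials=10000):
    """Monte Carlo estimate of the acceptance probability of the direct product test.
    F(A) returns the assignment on the k-face A as a dict {vertex: bit};
    sample_pair() draws a pair (A, A') of k-faces from the tester's distribution mu.
    A single check accepts iff F(A) and F(A') agree on every vertex of A ∩ A'."""
    accepted = 0
    for _ in range(trials):
        A, A2 = sample_pair()
        fA, fA2 = F(A), F(A2)
        if all(fA[v] == fA2[v] for v in set(A) & set(A2)):
            accepted += 1
    return accepted / trials
```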

A minimal perfect hash function (MPHF) maps a set $S$ of $n$ keys to the first $n$ integers without collisions. There is a lower bound of $n\log_2e-O(\log n)$ bits of space needed to represent an MPHF. A matching upper bound is obtained using the brute-force algorithm that tries random hash functions until stumbling on an MPHF and stores that function's seed. In expectation, $e^n\textrm{poly}(n)$ seeds need to be tested. The most space-efficient previous algorithms for constructing MPHFs all use such a brute-force approach as a basic building block. In this paper, we introduce ShockHash - Small, heavily overloaded cuckoo hash tables. ShockHash uses two hash functions $h_0$ and $h_1$, hoping for the existence of a function $f : S \rightarrow \{0,1\}$ such that $x \mapsto h_{f(x)}(x)$ is an MPHF on $S$. In graph terminology, ShockHash generates $n$-edge random graphs until stumbling on a pseudoforest - a graph where each component contains as many edges as nodes. Using cuckoo hashing, ShockHash then derives an MPHF from the pseudoforest in linear time. It uses a 1-bit retrieval data structure to store $f$ using $n + o(n)$ bits. By carefully analyzing the probability that a random graph is a pseudoforest, we show that ShockHash needs to try only $(e/2)^n\textrm{poly}(n)$ hash function seeds in expectation, reducing the space for storing the seed by roughly $n$ bits. This makes ShockHash almost a factor $2^n$ faster than brute-force, while maintaining the asymptotically optimal space consumption. An implementation within the RecSplit framework yields the currently most space-efficient MPHFs: competing approaches need about two orders of magnitude more work to achieve the same space.
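A minimal sketch of the test at the core of the seed search, assuming hash functions `h0` and `h1` that map keys to $\{0,\dots,n-1\}$: union-find is used to decide whether the resulting $n$-edge graph is a pseudoforest. This illustrates the stated idea only; ShockHash's actual implementation additionally derives the MPHF from the pseudoforest via cuckoo hashing and stores $f$ in a retrieval data structure.

```python
def is_pseudoforest(keys, h0, h1, n):
    """Return True iff the multigraph on nodes {0, ..., n-1} with one edge
    {h0(x), h1(x)} per key x has, in every connected component, exactly as
    many edges as nodes (the property the seed search retries for)."""
    parent = list(range(n))
    size = [1] * n    # nodes per component (valid at roots)
    edges = [0] * n   # edges per component (valid at roots)

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for x in keys:
        a, b = find(h0(x)), find(h1(x))
        if a == b:
            edges[a] += 1                  # extra edge inside a component
        else:
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a                  # union by size
            size[a] += size[b]
            edges[a] += edges[b] + 1
    return all(edges[r] == size[r] for r in range(n) if find(r) == r)
```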

We show that $n$-bit integers can be factorized by independently running a quantum circuit with $\tilde{O}(n^{3/2})$ gates for $\sqrt{n}+4$ times, and then using polynomial-time classical post-processing. The correctness of the algorithm relies on a number-theoretic heuristic assumption reminiscent of those used in subexponential classical factorization algorithms. It is currently not clear if the algorithm can lead to improved physical implementations in practice.

Given $n$ observations from two balanced classes, consider the task of labeling an additional $m$ inputs that are known to all belong to \emph{one} of the two classes. Special cases of this problem are well-known: with complete knowledge of class distributions ($n=\infty$) the problem is solved optimally by the likelihood-ratio test; when $m=1$ it corresponds to binary classification; and when $m\approx n$ it is equivalent to two-sample testing. The intermediate settings occur in the field of likelihood-free inference, where labeled samples are obtained by running forward simulations and the unlabeled sample is collected experimentally. In recent work it was discovered that there is a fundamental trade-off between $m$ and $n$: increasing the data sample $m$ reduces the amount $n$ of training/simulation data needed. In this work we (a) introduce a generalization where unlabeled samples come from a mixture of the two classes -- a case often encountered in practice; (b) study the minimax sample complexity for non-parametric classes of densities under \textit{maximum mean discrepancy} (MMD) separation; and (c) investigate the empirical performance of kernels parameterized by neural networks on two tasks: detection of the Higgs boson and detection of planted DDPM generated images amidst CIFAR-10 images. For both problems we confirm the existence of the theoretically predicted asymmetric $m$ vs $n$ trade-off.
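For concreteness, the MMD separation referred to above can be estimated from samples with the standard unbiased estimator. The sketch below uses a fixed Gaussian RBF kernel and is generic; it is not the neural-network-parameterized kernels used in the experiments.

```python
import numpy as np

def mmd_squared(X, Y, bandwidth=1.0):
    """Unbiased estimate of the squared MMD between samples X (n x d) and Y (m x d)
    under a Gaussian RBF kernel with the given bandwidth (requires n, m >= 2)."""
    def rbf(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2 * bandwidth ** 2))

    Kxx, Kyy, Kxy = rbf(X, X), rbf(Y, Y), rbf(X, Y)
    n, m = len(X), len(Y)
    term_xx = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))  # off-diagonal mean
    term_yy = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    term_xy = Kxy.mean()
    return term_xx + term_yy - 2 * term_xy
```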

Feature tracking is a common task in visualization applications, where methods based on topological data analysis (TDA) have successfully been applied in the past for feature definition as well as tracking. In this work, we focus on tracking extrema of temporal scalar fields. A family of TDA approaches address this task by establishing one-to-one correspondences between extrema based on discrete gradient vector fields. More specifically, two extrema of subsequent time steps are matched if they fall into their respective ascending and descending manifolds. However, due to this one-to-one assignment, these approaches are prone to fail when, e.g., extrema lie in regions of low gradient magnitude or close to the boundaries of the manifolds. Therefore, we propose a probabilistic matching that captures a larger set of possible correspondences via neighborhood sampling, or by computing the overlap of the manifolds. We illustrate the usefulness of the approach with two application cases.
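One plausible reading of the overlap-based variant, sketched with hypothetical inputs: if each grid point of two subsequent time steps is labeled by the extremum whose (ascending or descending) manifold contains it, correspondence probabilities can be taken proportional to the overlaps of these labeled regions.

```python
import numpy as np
from collections import defaultdict

def overlap_matching(labels_t, labels_t1):
    """Probabilistic correspondences between extrema of two time steps from
    manifold overlaps on a shared grid.
    labels_t, labels_t1: integer arrays of equal shape; labels_t[p] is the id of
    the extremum whose manifold contains grid point p at time t.
    Returns: dict mapping extremum id at time t -> {id at time t+1: probability}."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(labels_t.ravel(), labels_t1.ravel()):
        counts[a][b] += 1                      # overlap size of the two manifolds
    probs = {}
    for a, row in counts.items():
        total = sum(row.values())
        probs[a] = {b: c / total for b, c in row.items()}
    return probs
```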

We give an algorithm that takes as input an $n$-vertex graph $G$ and an integer $k$, runs in time $2^{O(k^2)} n^{O(1)}$, and outputs a tree decomposition of $G$ of width at most $k$, if such a decomposition exists. This resolves the long-standing open problem of whether there is a $2^{o(k^3)} n^{O(1)}$ time algorithm for treewidth. In particular, our algorithm is the first improvement on the dependency on $k$ in algorithms for treewidth since the $2^{O(k^3)} n^{O(1)}$ time algorithm given by Bodlaender and Kloks [ICALP 1991] and Lagergren and Arnborg [ICALP 1991]. We also give an algorithm that given an $n$-vertex graph $G$, an integer $k$, and a rational $\varepsilon \in (0,1)$, in time $k^{O(k/\varepsilon)} n^{O(1)}$ either outputs a tree decomposition of $G$ of width at most $(1+\varepsilon)k$ or determines that the treewidth of $G$ is larger than $k$. Prior to our work, no approximation algorithms for treewidth with approximation ratio less than $2$, other than the exact algorithms, were known. Both of our algorithms work in polynomial space.
