
The Longest Common Subsequence (LCS) is a fundamental string similarity measure, and computing the LCS of two strings is a classic algorithms question. The textbook dynamic programming algorithm computes it exactly in quadratic time, and this is essentially best possible under plausible fine-grained complexity assumptions, so a natural problem is to find faster approximation algorithms. When the inputs are two binary strings, there is a simple $\frac{1}{2}$-approximation in linear time: compute the longest common all-0s or all-1s subsequence. It has been open whether a better approximation is possible even in truly subquadratic time. Rubinstein and Song showed that the answer is yes under the assumption that the two input strings have equal lengths. We settle the question, generalizing their result to unequal-length strings, proving that, for any $\varepsilon>0$, there exist $\delta>0$ and a $(\frac{1}{2}+\delta)$-approximation algorithm for binary LCS that runs in $n^{1+\varepsilon}$ time. As a consequence of our result and a result of Akmal and Vassilevska-Williams, for any $\varepsilon>0$, there exist $\delta>0$ and a $(\frac{1}{q}+\delta)$-approximation for LCS over $q$-ary strings running in $n^{1+\varepsilon}$ time. Our techniques build on the recent work of Guruswami, He, and Li, who proved new bounds for error-correcting codes tolerating deletion errors. They prove a combinatorial "structure lemma" for strings which classifies them according to their oscillation patterns. We prove and use an algorithmic generalization of this structure lemma, which may be of independent interest.
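
For concreteness, the linear-time baseline mentioned above admits a short sketch: an LCS of length $\ell$ contains at least $\ell/2$ copies of 0 or of 1, so the longer of the two common constant subsequences is a $\frac{1}{2}$-approximation. The following minimal Python sketch illustrates only this baseline, not the algorithm of this paper.

```python
def half_approx_binary_lcs(a: str, b: str) -> int:
    """Trivial linear-time 1/2-approximation for the LCS of two binary strings:
    the longest common constant (all-0s or all-1s) subsequence."""
    common_zeros = min(a.count("0"), b.count("0"))
    common_ones = min(a.count("1"), b.count("1"))
    return max(common_zeros, common_ones)

# Example: LCS("0011", "0101") = 3 (e.g. "011"); the baseline returns 2 >= 3/2.
print(half_approx_binary_lcs("0011", "0101"))  # -> 2
```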

Related content

The problem of computing the vertices of a polytope given by affine inequalities is called vertex enumeration. The inverse problem, which is equivalent by polarity, is called the convex hull problem. We introduce 'approximate vertex enumeration' as the problem of computing the vertices of a polytope that is close to the original polytope given by affine inequalities. In contrast to exact vertex enumeration, the two polytopes are not required to be combinatorially equivalent. We introduce two algorithms for this problem. The first is an approximate variant of Motzkin's double description method. Only under certain strong conditions, which are not acceptable for practical purposes, were we able to prove correctness of this method for polytopes of arbitrary dimension. The second method, called the shortcut algorithm, is based on constructing a plane graph and is restricted to polytopes of dimension 2 and 3. We prove correctness of the shortcut algorithm. As a consequence, we also obtain correctness of the approximate double description method for dimension 2 and 3, but without any of the restrictive conditions still required in higher dimensions. We show that for dimension 2 and 3 both algorithms remain correct if imprecise arithmetic is used, provided the computational error caused by the imprecision is not too large. Both algorithms were implemented. The numerical examples motivate the approximate vertex enumeration problem by showing that the approximate problem is often easier to solve than the exact vertex enumeration problem. It remains open whether the approximate double description method (without any restrictive condition) is correct for polytopes of dimension 4 and higher.

The Shapley value is arguably the most popular approach for assigning a meaningful contribution value to players in a cooperative game, and it has recently been used intensively in various areas of machine learning, most notably in explainable artificial intelligence. Its meaningfulness stems from axiomatic properties that only the Shapley value satisfies, which, however, comes at the expense of an exact computation that grows exponentially with the number of agents. Accordingly, a number of works are devoted to the efficient approximation of Shapley values, all of which revolve around the notion of an agent's marginal contribution. In this paper, we propose SVARM and Stratified SVARM, two parameter-free and domain-independent approximation algorithms based on a representation of the Shapley value that is detached from the notion of marginal contributions. We prove unmatched theoretical guarantees regarding their approximation quality and provide satisfying empirical results.
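
For contrast, the following is a minimal Python sketch of the classical permutation-sampling estimator built on marginal contributions, i.e., the kind of approach that SVARM departs from (it is not SVARM itself); the value function `v` and the toy game are hypothetical.

```python
import random

def shapley_permutation_sampling(v, players, num_samples=1000, seed=0):
    """Classic Monte Carlo Shapley estimator: average the marginal contribution
    of each player over uniformly random permutations of the player set."""
    rng = random.Random(seed)
    est = {p: 0.0 for p in players}
    for _ in range(num_samples):
        perm = list(players)
        rng.shuffle(perm)
        coalition = set()
        prev_value = v(coalition)
        for p in perm:
            coalition.add(p)
            cur_value = v(coalition)
            est[p] += cur_value - prev_value   # marginal contribution of p
            prev_value = cur_value
    return {p: s / num_samples for p, s in est.items()}

# Toy 3-player game: v(S) = 1 if player 0 is in S, else 0 -> Shapley values (1, 0, 0).
print(shapley_permutation_sampling(lambda S: float(0 in S), [0, 1, 2]))
```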

Given a matrix $A\in \mathbb{R}^{n\times d}$ and a vector $b\in \mathbb{R}^n$, we consider the regression problem with $\ell_\infty$ guarantees: finding a vector $x'\in \mathbb{R}^d$ such that $\|x'-x^*\|_\infty \leq \frac{\epsilon}{\sqrt{d}}\cdot \|Ax^*-b\|_2\cdot \|A^\dagger\|$, where $x^*=\arg\min_{x\in \mathbb{R}^d}\|Ax-b\|_2$. One popular approach for solving such $\ell_2$ regression problems is sketching: pick a structured random matrix $S\in \mathbb{R}^{m\times n}$ with $m\ll n$ for which $SA$ can be computed quickly, and solve the ``sketched'' regression problem $\arg\min_{x\in \mathbb{R}^d} \|SAx-Sb\|_2$. In this paper, we show that in order to obtain such an $\ell_\infty$ guarantee for $\ell_2$ regression, one has to use sketching matrices that are dense. To the best of our knowledge, this is the first use case in which dense sketching matrices are necessary. On the algorithmic side, we prove that there exists a distribution of dense sketching matrices with $m=\epsilon^{-2}d\log^3(n/\delta)$ such that solving the sketched regression problem gives the $\ell_\infty$ guarantee with probability at least $1-\delta$. Moreover, the matrix $SA$ can be computed in $O(nd\log n)$ time. Our row count is nearly optimal up to logarithmic factors and significantly improves the result of [Price, Song and Woodruff, ICALP'17], which requires a number of rows super-linear in $d$, namely $m=\Omega(\epsilon^{-2}d^{1+\gamma})$ for $\gamma=\Theta(\sqrt{\frac{\log\log n}{\log d}})$. We also develop a novel analytical framework for $\ell_\infty$-guarantee regression that utilizes the Oblivious Coordinate-wise Embedding (OCE) property introduced in [Song and Yu, ICML'21]. Our analysis is arguably much simpler and more general than that of [Price, Song and Woodruff, ICALP'17], and it extends to dense sketches for tensor products of vectors.
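
To make the sketch-and-solve template concrete, here is a minimal numpy sketch using a plain dense Gaussian sketching matrix as a stand-in; the particular distribution, row count, and fast $O(nd\log n)$ computation of $SA$ analyzed in the paper are not reproduced.

```python
import numpy as np

def sketched_least_squares(A, b, m, seed=0):
    """Sketch-and-solve for min_x ||Ax - b||_2: draw a dense random S (m x n),
    then solve the smaller problem min_x ||SAx - Sb||_2."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    S = rng.standard_normal((m, n)) / np.sqrt(m)   # dense Gaussian sketch
    x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x_sketch

# Toy comparison against the exact solution x* = argmin ||Ax - b||_2.
n, d = 2000, 20
rng = np.random.default_rng(1)
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
x_hat = sketched_least_squares(A, b, m=400)
print(np.max(np.abs(x_hat - x_star)))   # coordinate-wise (ell_infty) error
```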

We study mean-field variational Bayesian inference using the TAP approach, for $\mathbb{Z}_2$-synchronization as a prototypical example of a high-dimensional Bayesian model. We show that for any signal strength $\lambda > 1$ (the weak-recovery threshold), there exists a unique local minimizer of the TAP free energy functional near the mean of the Bayes posterior law. Furthermore, the TAP free energy in a local neighborhood of this minimizer is strongly convex. Consequently, a natural-gradient/mirror-descent algorithm achieves linear convergence to this minimizer from a local initialization, which may be obtained by a constant number of iterates of Approximate Message Passing (AMP). This provides a rigorous foundation for variational inference in high dimensions via minimization of the TAP free energy. We also analyze the finite-sample convergence of AMP, showing that AMP is asymptotically stable at the TAP minimizer for any $\lambda > 1$, and is linearly convergent to this minimizer from a spectral initialization for sufficiently large $\lambda$. Such a guarantee is stronger than results obtainable by state evolution analyses, which only describe a fixed number of AMP iterations in the infinite-sample limit. Our proofs combine the Kac-Rice formula and the Sudakov-Fernique Gaussian comparison inequality to analyze the complexity of critical points that satisfy strong convexity and stability conditions within their local neighborhoods.
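
As a toy illustration of the spectral initialization mentioned above, the following numpy sketch (with hypothetical parameters) generates a synthetic $\mathbb{Z}_2$-synchronization instance and computes the top eigenvector of the observation matrix, which correlates with the signal once $\lambda > 1$; the TAP minimization and the AMP iterations themselves are not reproduced.

```python
import numpy as np

# Toy Z2-synchronization instance: Y = (lam/n) x x^T + W, with W a GOE-like
# symmetric noise matrix whose entries have variance 1/n.
rng = np.random.default_rng(0)
n, lam = 1000, 1.5
x = rng.choice([-1.0, 1.0], size=n)
G = rng.standard_normal((n, n)) / np.sqrt(n)
W = (G + G.T) / np.sqrt(2)
Y = (lam / n) * np.outer(x, x) + W

# Spectral initialization: top eigenvector of Y (rescaled to the +/-1 scale of x).
eigvals, eigvecs = np.linalg.eigh(Y)
v = eigvecs[:, -1] * np.sqrt(n)
print(abs(np.dot(v, x)) / n)   # empirical overlap with x, bounded away from 0
```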

In this paper we consider change-points in multiple sequences, with the objective of minimizing the estimation error for one sequence by making use of information from the other sequences. This is in contrast to the recent interest in change-points in multiple sequences, where the focus is on the detection of common change-points. We start with the canonical case of a single sequence with constant change-point intensity. We consider two performance measures of a change-point algorithm. The first is the probability of estimating the change-point with no error. The second is the expected distance between the true and estimated change-points. We provide a theoretical upper bound on the no-error probability, and a lower bound on the expected distance, that must be satisfied by all algorithms. We propose a scan-CUSUM algorithm that achieves the no-error upper bound and comes close to the distance lower bound. We next consider the case of non-constant intensities and establish sharp conditions under which the estimation error can go to zero. We propose an extension of the scan-CUSUM algorithm for a non-constant intensity function and show that it achieves asymptotically zero error at the boundary of the zero-error regime. We illustrate an application of the scan-CUSUM algorithm on multiple sequences sharing an unknown, non-constant intensity function: we estimate the intensity function from the change-point profile likelihoods of all sequences and apply scan-CUSUM with the estimated intensity function.
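
As background for the scan-CUSUM procedure, here is a minimal numpy sketch of the classical CUSUM estimator of a single mean-shift location in one sequence; the scan component, the intensity function, and the multi-sequence pooling studied in the paper are not reproduced.

```python
import numpy as np

def cusum_changepoint(x):
    """Classical CUSUM estimate of a single mean-shift location: maximize the
    normalized difference between the prefix sum and its expected share."""
    n = len(x)
    total = np.sum(x)
    prefix = np.cumsum(x)[:-1]                      # sums over the first k values, k = 1..n-1
    k = np.arange(1, n)
    stat = np.abs(prefix - k * total / n) / np.sqrt(k * (n - k) / n)
    return int(np.argmax(stat)) + 1                 # first index of the new segment

# Toy sequence with a mean shift at index 100.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 100), rng.normal(2, 1, 80)])
print(cusum_changepoint(x))   # close to 100
```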

We provide a new approximation algorithm for the Red-Blue Set Cover problem and give a new hardness result. Our approximation algorithm achieves an $\tilde O(m^{1/3})$-approximation, improving on the $\tilde O(m^{1/2})$-approximation due to Elkin and Peleg (where $m$ is the number of sets). Additionally, we provide a nearly approximation-preserving reduction from Min $k$-Union to Red-Blue Set Cover that gives an $\tilde\Omega(m^{1/4 - \varepsilon})$ hardness under the Dense-vs-Random conjecture.

We study the classical scheduling problem on parallel machines with precedence constraints, where the precedence graph has bounded depth $h$. Our goal is to minimize the maximum completion time. We focus on developing approximation algorithms that use only sublinear space or sublinear time. We develop the first one-pass streaming approximation schemes using sublinear space when all jobs' processing times differ by no more than a constant factor $c$ and the number of machines $m$ is at most $\tfrac{2n \epsilon}{3 h c}$. This is so far the best approximation we can have in terms of $m$, since no polynomial-time approximation better than $\tfrac{4}{3}$ exists when $m = \tfrac{n}{3}$, even if all jobs have equal processing times, unless P=NP. The algorithms are then extended to the more general problem where only the largest $\alpha n$ jobs, for some constant $0 < \alpha \le 1$, differ by no more than a factor of $c$. We also develop the first sublinear-time algorithms for both problems. For the more general problem, when $m \le \tfrac{\alpha n \epsilon}{20 c^2 \cdot h}$, our algorithm is a randomized $(1+\epsilon)$-approximation scheme that runs in sublinear time. This work not only provides an algorithmic solution to the studied problem in big data and cloud computing environments, but also gives a methodological framework for designing sublinear approximation algorithms for other scheduling problems.

The computation of the distance between two time series is time-consuming for any elastic distance function that accounts for misalignments. Among those functions, DTW is the most prominent. However, a recent extensive evaluation has shown that the move-split-merge (MSM) metric is superior to DTW regarding the analytical accuracy of the 1-NN classifier. Unfortunately, the running time of the standard dynamic programming algorithm for MSM distance computation is $\Omega(n^2)$, where $n$ is the length of the longest time series. In this paper, we provide approaches to reducing the cost of MSM distance computations by using lower and upper bounds for early pruning of paths in the underlying dynamic programming table. For the case where one time series is constant, we present a linear-time algorithm. In addition, we propose new linear-time heuristics and adapt heuristics known from DTW to computing the MSM distance. One heuristic employs the metric property of MSM and the previously introduced linear-time algorithm. Our experimental studies demonstrate substantial speed-ups of our approaches compared to previous MSM algorithms. In particular, the running time for MSM is lower than that of a state-of-the-art DTW distance computation on a majority of the popular UCR data sets.
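
For reference, the following Python sketch implements the standard quadratic-time dynamic program for the MSM distance (move, split, and merge operations with cost parameter $c$); the lower/upper-bound pruning and the linear-time special case described above are not included.

```python
def msm_distance(x, y, c=1.0):
    """Standard O(len(x)*len(y)) dynamic program for the Move-Split-Merge distance."""
    def split_merge_cost(new, prev, other):
        # Cost c if `new` lies between the two neighbouring values,
        # otherwise c plus the distance to the nearer of the two.
        if prev <= new <= other or prev >= new >= other:
            return c
        return c + min(abs(new - prev), abs(new - other))

    n, m = len(x), len(y)
    D = [[float("inf")] * m for _ in range(n)]
    D[0][0] = abs(x[0] - y[0])
    for i in range(1, n):
        D[i][0] = D[i - 1][0] + split_merge_cost(x[i], x[i - 1], y[0])
    for j in range(1, m):
        D[0][j] = D[0][j - 1] + split_merge_cost(y[j], x[0], y[j - 1])
    for i in range(1, n):
        for j in range(1, m):
            D[i][j] = min(
                D[i - 1][j - 1] + abs(x[i] - y[j]),                     # move
                D[i - 1][j] + split_merge_cost(x[i], x[i - 1], y[j]),   # split/merge in x
                D[i][j - 1] + split_merge_cost(y[j], x[i], y[j - 1]),   # split/merge in y
            )
    return D[n - 1][m - 1]

print(msm_distance([1, 2, 3], [1, 2, 3]))      # 0
print(msm_distance([1, 2, 3], [1, 2, 3, 4]))   # 2.0 (one split plus one move)
```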

Using techniques developed recently in the field of compressed sensing, we prove new upper bounds for general (non-linear) sampling numbers of (quasi-)Banach smoothness spaces in $L^2$. In relevant cases, such as mixed and isotropic weighted Wiener classes or Sobolev spaces with mixed smoothness, sampling numbers in $L^2$ can be upper bounded by best $n$-term trigonometric widths in $L^\infty$. We describe a recovery procedure based on $\ell^1$-minimization (basis pursuit denoising) using only $m$ function values. With this method, a significant gain in the rate of convergence compared to recently developed linear recovery methods is achieved. In this deterministic worst-case setting we see an additional speed-up of $n^{-1/2}$ compared to linear methods in the case of weighted Wiener spaces. For their quasi-Banach counterparts, even an arbitrary polynomial speed-up is possible. Surprisingly, our approach allows us to recover mixed-smoothness Sobolev functions belonging to $S^r_pW(\mathbb{T}^d)$ on the $d$-torus with a logarithmically better rate of convergence than any linear method can achieve when $1 < p < 2$ and $d$ is large. This effect is not present for isotropic Sobolev spaces.
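
As a toy illustration of the recovery pipeline (point samples plus $\ell^1$-minimization), the following numpy sketch recovers a sparse trigonometric polynomial from $m$ random samples via iterative soft thresholding applied to an $\ell^1$-regularized least-squares surrogate of basis pursuit denoising; the sampling sets, weights, and function classes analyzed in the paper are not reproduced.

```python
import numpy as np

def ista_l1(A, b, lam=0.05, iters=500):
    """Iterative soft thresholding for min_x 0.5*||Ax - b||_2^2 + lam*||x||_1,
    an l1-regularized surrogate of basis pursuit denoising (complex-valued)."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1], dtype=complex)
    for _ in range(iters):
        grad = A.conj().T @ (A @ x - b)
        z = x - grad / L
        x = np.exp(1j * np.angle(z)) * np.maximum(np.abs(z) - lam / L, 0.0)
    return x

# Sparse trigonometric polynomial on the torus, sampled at m random points.
rng = np.random.default_rng(0)
N, m = 256, 60                              # frequencies -N/2..N/2-1, m samples
freqs = np.arange(-N // 2, N // 2)
coef = np.zeros(N, dtype=complex)
coef[rng.choice(N, size=5, replace=False)] = rng.standard_normal(5)
t = rng.uniform(0, 2 * np.pi, m)
A = np.exp(1j * np.outer(t, freqs)) / np.sqrt(m)
b = A @ coef
print(np.max(np.abs(ista_l1(A, b) - coef)))  # small (not exact) recovery error
```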

Pre-trained deep neural network language models such as ELMo, GPT, BERT and XLNet have recently achieved state-of-the-art performance on a variety of language understanding tasks. However, their size makes them impractical for a number of scenarios, especially on mobile and edge devices. In particular, the input word embedding matrix accounts for a significant proportion of the model's memory footprint, due to the large input vocabulary and embedding dimensions. Knowledge distillation techniques have had success at compressing large neural network models, but they are ineffective at yielding student models with vocabularies different from those of the original teacher models. We introduce a novel knowledge distillation technique for training a student model with a significantly smaller vocabulary as well as lower embedding and hidden state dimensions. Specifically, we employ a dual-training mechanism that trains the teacher and student models simultaneously to obtain optimal word embeddings for the student vocabulary. We combine this approach with learning shared projection matrices that transfer layer-wise knowledge from the teacher model to the student model. Our method is able to compress the BERT_BASE model by more than 60x, with only a minor drop in downstream task metrics, resulting in a language model with a footprint of under 7MB. Experimental results also demonstrate higher compression efficiency and accuracy when compared with other state-of-the-art compression techniques.
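
To make the shared-projection idea concrete, here is a minimal numpy sketch (with hypothetical shapes and names) of a layer-wise distillation loss in which a trainable projection matrix maps student hidden states into the teacher's hidden dimension; the dual-training mechanism and the actual BERT models are not reproduced.

```python
import numpy as np

def layerwise_projection_loss(student_hidden, teacher_hidden, P):
    """Mean-squared distance between projected student hidden states and
    teacher hidden states for one layer.

    student_hidden: (batch, seq_len, d_student)
    teacher_hidden: (batch, seq_len, d_teacher)
    P:              (d_student, d_teacher) shared projection matrix (trainable)
    """
    projected = student_hidden @ P
    return np.mean((projected - teacher_hidden) ** 2)

# Hypothetical dimensions: a 768-dim teacher distilled into a 192-dim student.
rng = np.random.default_rng(0)
student_h = rng.standard_normal((2, 16, 192))
teacher_h = rng.standard_normal((2, 16, 768))
P = rng.standard_normal((192, 768)) * 0.01
print(layerwise_projection_loss(student_h, teacher_h, P))
```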
