
Consider a matrix $A \in \mathbb{R}^{m \times n}$ of rank $k$ with singular value decomposition $A = U_{A}\Sigma_{A} V_{A}^{T}$, where $U_{A} \in \mathbb{R}^{m \times k}$ and $V_{A} \in \mathbb{R}^{n \times k}$ have orthonormal columns and $\Sigma_{A} \in \mathbb{R}^{k \times k}$ is diagonal. The statistical leverage scores of $A$ are the squared row norms $\ell_{i} = \|(U_{A})_{i,:}\|_2^2$ for $i \in [m]$, and the matrix coherence is the largest statistical leverage score. These quantities play an important role in machine learning algorithms, such as matrix completion and Nystr\"{o}m-based low-rank matrix approximation, as well as in large-scale statistical data analysis. The best known classical algorithm for approximating these values runs in time $O((mn + n^3)\log m)$ [P. Drineas, M. Magdon-Ismail, M. W. Mahoney, and D. P. Woodruff. Fast approximation of matrix coherence and statistical leverage. J. Mach. Learn. Res., 13:3475-3506, 2012]. In this work, inspired by recent developments in dequantization techniques, we propose a novel fast classical algorithm for approximating the statistical leverage scores. Our algorithm has query and time complexity $O\left(\mathrm{poly}\left(k, \kappa, \frac{1}{\epsilon}, \frac{1}{\delta}, \log(mn)\right)\right)$, where $\kappa$ is the condition number of $A$, $\epsilon$ is the accuracy parameter, and $\delta$ is the failure probability.
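
For reference, the quantities being approximated admit a direct (but cubic-cost) computation from a thin SVD. A minimal numpy sketch of this baseline, not of the fast algorithm proposed here:

```python
import numpy as np

def leverage_scores(A, tol=1e-12):
    """Exact statistical leverage scores of A via a thin SVD.

    This is the expensive baseline computation; the paper's algorithm
    approximates these values much faster.
    """
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    k = int(np.sum(s > tol * s[0]))      # numerical rank
    Uk = U[:, :k]                        # orthonormal basis of col(A)
    scores = np.sum(Uk**2, axis=1)       # l_i = ||(U_A)_{i,:}||_2^2
    coherence = scores.max()             # matrix coherence
    return scores, coherence

A = np.random.randn(100, 20) @ np.random.randn(20, 50)  # rank-20 example
scores, mu = leverage_scores(A)
assert np.isclose(scores.sum(), 20)      # leverage scores sum to rank(A)
```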

Related content

In this paper, we consider two fundamental symmetric kernels in linear algebra: the Cholesky factorization and the symmetric rank-$k$ update (SYRK), with the classical three-nested-loops algorithms for these kernels. In addition, we consider a machine model with a fast memory of size $S$ and an unbounded slow memory. In this model, all computations must be performed on operands in fast memory, and the goal is to minimize the amount of communication between slow and fast memory. As the set of computations is fixed by the choice of the algorithm, only the ordering of the computations (the schedule) directly influences the volume of communication. We prove lower bounds of $\frac{1}{3\sqrt{2}}\frac{N^3}{\sqrt{S}}$ for the communication volume of the Cholesky factorization of an $N\times N$ symmetric positive definite matrix, and of $\frac{1}{\sqrt{2}}\frac{N^2M}{\sqrt{S}}$ for the SYRK computation of $A \cdot A^{T}$, where $A$ is an $N\times M$ matrix. Both bounds improve the best known lower bounds from the literature by a factor of $\sqrt{2}$. In addition, we present two out-of-core, sequential algorithms with matching communication volume: TBS for SYRK, with a volume of $\frac{1}{\sqrt{2}}\frac{N^2M}{\sqrt{S}} + O(NM\log N)$, and LBC for Cholesky, with a volume of $\frac{1}{3\sqrt{2}}\frac{N^3}{\sqrt{S}} + O(N^{5/2})$. Both algorithms improve over the best known algorithms from the literature by a factor of $\sqrt{2}$, and prove that the leading terms in our lower bounds cannot be improved further. This work shows that the operational intensity of symmetric kernels like SYRK or Cholesky is intrinsically higher (by a factor of $\sqrt{2}$) than that of the corresponding non-symmetric kernels (GEMM and LU factorization).
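
For concreteness, here is a minimal in-core sketch of the classical three-nested-loops Cholesky kernel referred to above (the paper's question is how these updates are scheduled against a fast memory of size $S$, which this sketch ignores):

```python
import numpy as np

def cholesky_3loops(A):
    """Classical three-nested-loops Cholesky: A = L L^T for SPD A.

    The innermost loop is the np.dot over the first :j columns; the
    schedule of these ~N^3/6 multiply-adds determines the slow/fast
    memory traffic studied in the paper.
    """
    N = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for j in range(N):
        L[j, j] = np.sqrt(A[j, j] - np.dot(L[j, :j], L[j, :j]))
        for i in range(j + 1, N):
            L[i, j] = (A[i, j] - np.dot(L[i, :j], L[j, :j])) / L[j, j]
    return L

M = np.random.randn(6, 6)
A = M @ M.T + 6 * np.eye(6)      # symmetric positive definite test matrix
L = cholesky_3loops(A)
assert np.allclose(L @ L.T, A)
```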

Multifidelity approximation is an important technique in scientific computation and simulation. In this paper, we introduce a bandit-learning approach for leveraging data of varying fidelities to achieve precise estimates of the parameters of interest. Under a linear model assumption, we formulate multifidelity approximation as a modified stochastic bandit, and analyze the loss for a class of policies that uniformly explore each model before exploiting. Utilizing the estimated conditional mean-squared error, we propose a consistent algorithm, adaptive Explore-Then-Commit (AETC), and establish a corresponding trajectory-wise optimality result. These results are then extended to the case of vector-valued responses, where we demonstrate that the algorithm remains efficient without having to estimate high-dimensional parameters. The main advantage of our approach is that it requires neither a hierarchical model structure nor \textit{a priori} knowledge of statistical information (e.g., correlations) about or between models. Instead, the AETC algorithm requires only knowledge of which model is a trusted high-fidelity model, along with (relative) computational cost estimates of querying each model. Numerical experiments are provided to support our theoretical findings.
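
To make the policy class concrete, here is a minimal sketch of a plain explore-then-commit bandit policy with a fixed exploration budget; the AETC algorithm additionally chooses the exploration length adaptively from estimated conditional mean-squared errors, which this sketch omits:

```python
import numpy as np

def explore_then_commit(arms, horizon, m):
    """Generic explore-then-commit: pull each arm m times, then commit
    to the arm with the best empirical mean for the remaining rounds.

    `arms` is a list of callables, each returning a stochastic reward.
    """
    K = len(arms)
    means = np.zeros(K)
    total = 0.0
    for a in range(K):                 # uniform exploration phase
        rewards = [arms[a]() for _ in range(m)]
        means[a] = np.mean(rewards)
        total += sum(rewards)
    best = int(np.argmax(means))       # commit (exploitation) phase
    total += sum(arms[best]() for _ in range(horizon - K * m))
    return best, total

rng = np.random.default_rng(0)
arms = [lambda: rng.normal(0.3, 1), lambda: rng.normal(0.5, 1)]
best, total = explore_then_commit(arms, horizon=10_000, m=100)
```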

We study the complexity of approximating the multimarginal optimal transport (MOT) distance, a generalization of the classical optimal transport distance, considered here between $m$ discrete probability distributions, each supported on $n$ points. First, we show that the standard linear programming (LP) representation of the MOT problem is not a minimum-cost flow problem when $m \geq 3$. This negative result implies that some combinatorial algorithms, e.g., the network simplex method, are not suitable for approximating the MOT problem, while the worst-case complexity bound for the deterministic interior-point algorithm remains $\tilde{O}(n^{3m})$. We then propose two simple and \textit{deterministic} algorithms for approximating the MOT problem. The first algorithm, which we refer to as the \textit{multimarginal Sinkhorn} algorithm, is a provably efficient multimarginal generalization of the Sinkhorn algorithm. We show that it achieves a complexity bound of $\tilde{O}(m^3n^m\varepsilon^{-2})$ for a tolerance $\varepsilon \in (0, 1)$. This provides the first \textit{near-linear time} complexity guarantee for approximating the MOT problem and matches the best known complexity bound for the Sinkhorn algorithm in the classical OT setting when $m = 2$. The second algorithm, which we refer to as the \textit{accelerated multimarginal Sinkhorn} algorithm, achieves acceleration by incorporating an estimate sequence, and its complexity bound is $\tilde{O}(m^3n^{m+1/3}\varepsilon^{-4/3})$. This bound is better than that of the first algorithm in terms of $1/\varepsilon$, and better than that of the accelerated alternating minimization algorithm~\citep{Tupitsa-2020-Multimarginal} in terms of $n$. Finally, we compare our new algorithms with the commercial LP solver \textsc{Gurobi}. Preliminary results on synthetic data and real images demonstrate the effectiveness and efficiency of our algorithms.
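
For reference, the classical Sinkhorn algorithm that the multimarginal variant generalizes (the $m = 2$ case) can be sketched as follows; the regularization strength `eta` and the iteration count are illustrative parameters:

```python
import numpy as np

def sinkhorn(C, r, c, eta=0.05, n_iters=500):
    """Entropic-regularized OT between marginals r and c: the m = 2
    case generalized by the multimarginal Sinkhorn algorithm.

    C is the n x n cost matrix; eta is the regularization strength.
    """
    K = np.exp(-C / eta)               # Gibbs kernel
    u = np.ones_like(r)
    for _ in range(n_iters):           # alternating marginal projections
        v = c / (K.T @ u)
        u = r / (K @ v)
    P = u[:, None] * K * v[None, :]    # approximate transport plan
    return np.sum(P * C)               # transport cost of the plan

n = 50
x = np.linspace(0, 1, n)
C = (x[:, None] - x[None, :]) ** 2     # squared-distance cost
r = np.full(n, 1 / n)                  # uniform marginals
c = np.full(n, 1 / n)
print(sinkhorn(C, r, c))
```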

Given an unknown $n \times n$ matrix $A$ having non-negative entries, the \emph{inner product} (IP) oracle takes as input a specified row (or column) of $A$ and a vector $v \in \mathbb{R}^{n}$, and returns their inner product. A derivative of IP is the induced degree query in an unknown graph $G=(V(G), E(G))$, which takes a vertex $u \in V(G)$ and a subset $S \subseteq V(G)$ as input and reports the number of neighbors of $u$ that are present in $S$. The goal of this paper is to understand the strength of the inner product oracle. Our results in that direction are as follows: (i) the IP oracle can solve bilinear form estimation, i.e., estimate the value of $\mathbf{x}^{T}A\mathbf{y}$ given two vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^{n}$ with non-negative entries, and can sample almost uniformly from the entries of a matrix with non-negative entries; (ii) we tackle for the first time weighted edge estimation and weighted sampling of edges, which follow as applications of the bilinear form estimation and almost uniform sampling problems, respectively; (iii) the induced degree query, a derivative of IP, can solve edge estimation and almost uniform edge sampling in induced subgraphs. To the best of our knowledge, these are the first oracle-based query complexity results for induced subgraphs. We show that IP/induced degree queries over the whole graph can simulate local queries in any induced subgraph; (iv) apart from the above, we also show that IP can solve several matrix-related problems, such as testing whether the matrix is diagonal, symmetric, doubly stochastic, etc.
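
As a warm-up illustration of the oracle's strength, $\mathbf{x}^T A \mathbf{y}$ can be evaluated exactly with at most $n$ row-IP queries; the point of result (i) is to \emph{estimate} it with far fewer. A hypothetical sketch (the class and helper names are ours, not the paper's):

```python
import numpy as np

class IPOracle:
    """Inner product oracle: given a row index i and a vector v,
    returns <A[i,:], v>. The query count is tracked."""
    def __init__(self, A):
        self._A = A
        self.queries = 0

    def row_ip(self, i, v):
        self.queries += 1
        return float(self._A[i] @ v)

def bilinear_form(oracle, x, y, n):
    """Exact x^T A y, using one row-IP query per nonzero entry of x."""
    return sum(x[i] * oracle.row_ip(i, y) for i in range(n) if x[i] != 0)

n = 100
A = np.random.rand(n, n)               # non-negative entries
x, y = np.random.rand(n), np.random.rand(n)
orc = IPOracle(A)
assert np.isclose(bilinear_form(orc, x, y, n), x @ A @ y)
```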

Given a point set $P\subset \mathbb{R}^d$, the kernel density estimate of $P$ is defined as \[ \overline{\mathcal{G}}_P(x) = \frac{1}{\left|P\right|}\sum_{p\in P}e^{-\left\lVert x-p \right\rVert^2} \] for any $x\in\mathbb{R}^d$. We study how to construct a small subset $Q$ of $P$ such that the kernel density estimate of $P$ is well approximated by the kernel density estimate of $Q$. Such a subset $Q$ is called a coreset. The main technique in this work is to construct a $\pm 1$ coloring of the point set $P$ via discrepancy theory, leveraging Banaszczyk's theorem. When $d>1$ is a constant, our construction gives a coreset of size $O\left(\frac{1}{\varepsilon}\right)$, as opposed to the best-known result of $O\left(\frac{1}{\varepsilon}\sqrt{\log\frac{1}{\varepsilon}}\right)$. It is the first result to break the $\sqrt{\log}$-factor barrier, even when $d=2$.
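
The estimator itself is immediate to compute; a coreset $Q$ is good when $\overline{\mathcal{G}}_Q$ stays uniformly close to $\overline{\mathcal{G}}_P$. A minimal sketch (using a naive uniform-sample $Q$ for comparison, not the discrepancy-based construction of this work):

```python
import numpy as np

def kde(P, x):
    """Gaussian kernel density estimate of point set P at query x:
    (1/|P|) * sum_p exp(-||x - p||^2)."""
    sq_dists = np.sum((P - x) ** 2, axis=1)
    return np.mean(np.exp(-sq_dists))

rng = np.random.default_rng(1)
P = rng.normal(size=(10_000, 2))                    # point set in R^2
Q = P[rng.choice(len(P), size=200, replace=False)]  # naive uniform coreset
x = np.array([0.5, -0.25])
print(kde(P, x), kde(Q, x))  # a good coreset keeps these close for all x
```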

We present a classical algorithm that, for any $D$-dimensional geometrically-local quantum circuit $C$ of polylogarithmic depth, and any bit string $x \in \{0,1\}^n$, can compute the quantity $|\langle x|C|0^{\otimes n}\rangle|^2$ to within any inverse-polynomial additive error in quasi-polynomial time, for any fixed dimension $D$. This extends the result of [CC21], which proved this for $D = 3$. To see why this is interesting, note that, while the $D = 1$ case of this result follows from standard use of Matrix Product States, known for decades, the $D = 2$ case required novel and interesting techniques introduced in [BGM19]. Extending to the case $D = 3$ was even more laborious and required further new techniques introduced in [CC21]. Our work here shows that, while handling each new dimension has historically required a new insight and a fixed algorithmic primitive, based on known techniques for $D \leq 3$ we can now handle any fixed dimension $D > 3$. Our algorithm uses the divide-and-conquer framework of [CC21] to approximate the desired quantity via several instantiations of the same problem type, each involving $D$-dimensional circuits on about half the number of qubits as the original. This division step is applied recursively until the width of the recursively decomposed circuits in the $D$th dimension is so small that they can effectively be regarded as $(D-1)$-dimensional problems, by absorbing the small width in the $D$th dimension into the qudit structure at the cost of a moderate increase in runtime. The main technical challenge lies in ensuring that the more involved portions of the recursive circuit decomposition and error analysis from [CC21] still hold in higher dimensions, which requires small modifications to the analysis in some places.
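
For very small $n$, the target quantity can be computed by brute-force statevector simulation, which makes the object of study concrete (the point of the result above is to avoid this exponential cost). A minimal sketch:

```python
import numpy as np

def apply_gate(state, gate, qubits, n):
    """Apply a k-qubit gate to the given qubits of an n-qubit statevector."""
    state = state.reshape([2] * n)
    state = np.moveaxis(state, qubits, range(len(qubits)))
    shape = state.shape
    state = gate @ state.reshape(2 ** len(qubits), -1)
    state = np.moveaxis(state.reshape(shape), range(len(qubits)), qubits)
    return state.reshape(-1)

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard
CX = np.eye(4)[[0, 1, 3, 2]]                  # CNOT (control = first qubit)

n = 2
state = np.zeros(2 ** n); state[0] = 1.0      # |0^n>
state = apply_gate(state, H, [0], n)          # toy circuit C: Bell pair
state = apply_gate(state, CX, [0, 1], n)
x = "11"
print(abs(state[int(x, 2)]) ** 2)             # |<x|C|0^n>|^2 = 0.5
```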

We study algorithms for approximating pairwise similarity matrices that arise in natural language processing. Generally, computing a similarity matrix for $n$ data points requires $\Omega(n^2)$ similarity computations. This quadratic scaling is a significant bottleneck, especially when similarities are computed via expensive functions, e.g., transformer models. Approximation methods reduce this quadratic complexity, often by using a small subset of exactly computed similarities to approximate the remainder of the complete pairwise similarity matrix. Significant work focuses on the efficient approximation of positive semidefinite (PSD) similarity matrices, which arise, e.g., in kernel methods. However, much less is understood about indefinite (non-PSD) similarity matrices, which often arise in NLP. Motivated by the observation that many of these matrices are still somewhat close to PSD, we introduce a generalization of the popular Nystr\"{o}m method to the indefinite setting. Our algorithm can be applied to any similarity matrix and runs in sublinear time in the size of the matrix, producing a rank-$s$ approximation with just $O(ns)$ similarity computations. We show that our method, along with a simple variant of CUR decomposition, performs very well in approximating a variety of similarity matrices arising in NLP tasks. We demonstrate the high accuracy of the approximated similarity matrices on the downstream tasks of document classification, sentence similarity, and cross-document coreference.
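
For background, the classical (PSD) Nyström method being generalized can be sketched in a few lines: sample $s$ landmark columns $C$ with core block $W$, and set $\hat{A} = C W^{+} C^{\top}$, using only $O(ns)$ similarity computations. A minimal sketch of that baseline, not of the indefinite extension:

```python
import numpy as np

def nystrom(similarity, n, s, rng):
    """Classical Nystrom rank-s approximation of an n x n PSD similarity
    matrix from O(ns) similarity computations."""
    idx = rng.choice(n, size=s, replace=False)    # landmark points
    C = np.array([[similarity(i, j) for j in idx] for i in range(n)])
    W = C[idx, :]                                 # s x s core block
    return C @ np.linalg.pinv(W) @ C.T            # A_hat = C W^+ C^T

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
sim = lambda i, j: float(np.exp(-np.sum((X[i] - X[j]) ** 2)))  # RBF kernel
A_hat = nystrom(sim, n=300, s=40, rng=rng)
```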

Leveraging biased click data for optimizing learning-to-rank systems has been a popular approach in information retrieval. Because click data is often noisy and biased, a variety of methods have been proposed to construct unbiased learning to rank (ULTR) algorithms for the learning of unbiased ranking models. Among them, automatic unbiased learning to rank (AutoULTR) algorithms, which jointly learn user bias models (i.e., propensity models) with unbiased rankers, have received a lot of attention due to their superior performance and low deployment cost in practice. Despite their differences in theory and algorithm design, existing studies on ULTR usually use univariate ranking functions to score each document or result independently. On the other hand, recent advances in context-aware learning-to-rank models have shown that multivariate scoring functions, which read multiple documents together and predict their ranking scores jointly, are more powerful than univariate ranking functions in ranking tasks with human-annotated relevance labels. Whether such superior performance would hold in ULTR with noisy data, however, is mostly unknown. In this paper, we investigate existing multivariate scoring functions and AutoULTR algorithms in theory and prove that permutation invariance is a crucial factor that determines whether a context-aware learning-to-rank model can be applied to the existing AutoULTR framework. Our experiments with synthetic clicks on two large-scale benchmark datasets show that AutoULTR models with permutation-invariant multivariate scoring functions significantly outperform those with univariate scoring functions and permutation-variant multivariate scoring functions.
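
To illustrate the property at stake, here is a minimal DeepSets-style multivariate scoring sketch (illustrative only, not one of the specific architectures studied here): each document is scored against an order-independent pooled context, so permuting the input documents merely permutes the output scores:

```python
import numpy as np

def multivariate_scores(X, g, f):
    """DeepSets-style multivariate scoring: each document's score depends
    on the whole list only through a permutation-invariant pooled context,
    so reordering the input documents just reorders the output scores.

    X: (n_docs, n_features) feature matrix for one ranked list.
    """
    context = g(X).sum(axis=0)                   # order-independent pooling
    return np.array([f(x, context) for x in X])  # one score per document

rng = np.random.default_rng(0)
W_g, w_f = rng.normal(size=(8, 4)), rng.normal(size=12)
g = lambda X: np.tanh(X @ W_g.T)                 # per-document encoder
f = lambda x, c: float(np.concatenate([x, c]) @ w_f)

X = rng.normal(size=(5, 4))
perm = rng.permutation(5)
assert np.allclose(multivariate_scores(X, g, f)[perm],
                   multivariate_scores(X[perm], g, f))
```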

To avoid the curse of dimensionality frequently encountered in Big Data analysis, the field of linear and nonlinear dimension reduction techniques has developed rapidly in recent years. These techniques (sometimes referred to as manifold learning) assume that the scattered input data lies on a lower-dimensional manifold; thus the high-dimensionality problem can be overcome by learning the lower-dimensional behavior. However, in real-life applications, data is often very noisy. In this work, we propose a method to approximate $\mathcal{M}$, a $d$-dimensional $C^{m+1}$-smooth submanifold of $\mathbb{R}^n$ ($d \ll n$), based upon noisy scattered data points (i.e., a data cloud). We assume that the data points are located "near" the lower-dimensional manifold and suggest a nonlinear moving least-squares projection onto an approximating $d$-dimensional manifold. Under some mild assumptions, the resulting approximant is shown to be infinitely smooth and of high approximation order (i.e., $O(h^{m+1})$, where $h$ is the fill distance and $m$ is the degree of the local polynomial approximation). The method presented here assumes no analytic knowledge of the approximated manifold, and the approximation algorithm is linear in the large dimension $n$. Furthermore, the approximating manifold can serve as a framework for performing operations directly on the high-dimensional data in a computationally efficient manner. In this way, the preparatory step of dimension reduction, which introduces distortions into the data, can be avoided altogether.
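
As a greatly simplified illustration of the projection idea (only a local-linear first stage, omitting the degree-$m$ polynomial correction of the full moving least-squares procedure), one can project a query point onto a weighted-PCA approximation of the local tangent space:

```python
import numpy as np

def local_linear_project(q, P, d, h):
    """Project query q onto a local d-dimensional affine approximation of
    the manifold underlying noisy samples P (rows in R^n).

    Simplified sketch: Gaussian locality weights with support scale ~h,
    weighted mean, weighted PCA for a tangent basis, affine projection.
    """
    w = np.exp(-np.sum((P - q) ** 2, axis=1) / h**2)   # locality weights
    mu = (w[:, None] * P).sum(axis=0) / w.sum()        # weighted center
    Y = (P - mu) * np.sqrt(w)[:, None]
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    B = Vt[:d]                                         # local tangent basis
    return mu + (q - mu) @ B.T @ B                     # affine projection

rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 2000)
P = np.c_[np.cos(t), np.sin(t), 0.3 * np.cos(2 * t)]   # d=1 curve in R^3
P += 0.01 * rng.normal(size=P.shape)                   # noisy data cloud
print(local_linear_project(P[0], P, d=1, h=0.3))
```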

We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model, in which query access comes in the form of $\langle X_i, A\rangle := \mathrm{tr}(X_i^\top A)$; perhaps surprisingly, these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.
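
The sensing model is easy to make concrete: each query supplies a matrix $X_i$ and receives the single number $\langle X_i, A\rangle = \mathrm{tr}(X_i^\top A)$. A small sketch showing that entry reads and bilinear forms are special cases:

```python
import numpy as np

def sensing_query(A, X):
    """One query in the sensing model: returns <X, A> = tr(X^T A)."""
    return float(np.sum(X * A))

n = 50
A = np.random.randn(n, n)

# Reading an entry A[a, b] is the special case X = e_a e_b^T:
X = np.zeros((n, n)); X[3, 7] = 1.0
assert np.isclose(sensing_query(A, X), A[3, 7])

# A rank-one sensing query recovers a bilinear form: <u v^T, A> = u^T A v.
u, v = np.random.randn(n), np.random.randn(n)
assert np.isclose(sensing_query(A, np.outer(u, v)), u @ A @ v)
```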
