
In this work, we show that solvers of elliptic boundary value problems in $d$ dimensions can be approximated to accuracy $\epsilon$ from only $\mathcal{O}\left(\log(N)\log^{d}(N / \epsilon)\right)$ matrix-vector products with carefully chosen vectors (right-hand sides). The solver is only accessed as a black box, and the underlying operator may be unknown and of an arbitrarily high order. Our algorithm (1) has complexity $\mathcal{O}\left(N\log^2(N)\log^{2d}(N / \epsilon)\right)$ and represents the solution operator as a sparse Cholesky factorization with $\mathcal{O}\left(N\log(N)\log^{d}(N / \epsilon)\right)$ nonzero entries, (2) allows for embarrassingly parallel evaluation of the solution operator and the computation of its log-determinant, (3) allows for $\mathcal{O}\left(\log(N)\log^{d}(N / \epsilon)\right)$ complexity computation of individual entries of the matrix representation of the solver, which in turn enables its recompression to an $\mathcal{O}\left(N\log^{d}(N / \epsilon)\right)$ complexity representation. As a byproduct, our compression scheme produces a homogenized solution operator with near-optimal approximation accuracy. By polynomial approximation, we can also approximate the continuous Green's function (in operator and Hilbert-Schmidt norm) to accuracy $\epsilon$ from $\mathcal{O}\left(\log^{1 + d}\left(\epsilon^{-1}\right)\right)$ solutions of the PDE. We include rigorous proofs of these results. To the best of our knowledge, our algorithm achieves the best known trade-off between accuracy $\epsilon$ and the number of required matrix-vector products.
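As a simplified illustration of recovering an operator from a few carefully chosen matrix-vector products, the sketch below uses the classical probing idea, a far simpler relative of the algorithm above (which handles dense solution operators via multiscale orderings and sparse Cholesky factors): a banded black-box operator of bandwidth $b$ is recovered exactly from $2b+1$ "comb" probes, independent of $N$. All names and sizes are illustrative.

```python
import numpy as np

# Toy probing example (NOT the paper's algorithm): a banded operator with
# bandwidth b is recovered exactly from 2*b + 1 matvecs with comb vectors.
N, b = 20, 1                     # matrix size, bandwidth (tridiagonal)
rng = np.random.default_rng(0)
A = np.zeros((N, N))
for k in range(-b, b + 1):       # fill a random banded matrix
    A += np.diag(rng.normal(size=N - abs(k)), k)

matvec = lambda x: A @ x         # the only access we allow ourselves

stride = 2 * b + 1
recovered = np.zeros((N, N))
for c in range(stride):          # one comb vector per "color"
    comb = np.zeros(N)
    comb[c::stride] = 1.0        # sum of e_j over all j == c (mod stride)
    y = matvec(comb)
    for j in range(c, N, stride):
        lo, hi = max(0, j - b), min(N, j + b + 1)
        recovered[lo:hi, j] = y[lo:hi]   # rows near j respond only to e_j

print(np.allclose(recovered, A))  # True: 3 matvecs recovered a 20x20 matrix
```

The trick works because the comb columns are spaced farther apart than the bandwidth, so their responses do not overlap; the paper's contribution is achieving comparable probe counts when the solution operator is dense.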

Related Content


Importance sampling is a popular technique in Bayesian inference: by reweighting samples drawn from a proposal distribution we are able to obtain samples and moment estimates from a Bayesian posterior over some $n$ latent variables. Recent work, however, indicates that importance sampling scales poorly -- in order to accurately approximate the true posterior, the required number of importance samples grows exponentially with the number of latent variables [Chatterjee and Diaconis, 2018]. Massively parallel importance sampling works around this issue by drawing $K$ samples for each of the $n$ latent variables and reasoning about all $K^n$ combinations of latent samples. In principle, we can reason efficiently over $K^n$ combinations of samples by exploiting conditional independencies in the generative model. However, in practice this requires complex algorithms that traverse backwards through the graphical model, and we need separate backward traversals for each computation (posterior expectations, marginals and samples). Our contribution is to exploit the source term trick from physics to entirely avoid the need to hand-write backward traversals. Instead, we demonstrate how to simply and easily compute all the required quantities -- posterior expectations, marginals and samples -- by differentiating through a slightly modified marginal likelihood estimator.
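A minimal sketch of the source term trick in ordinary single-variable importance sampling, under assumed Gaussian proposal and target: perturbing the marginal-likelihood estimator by a source $J$ and differentiating at $J = 0$ recovers the self-normalized posterior expectation. The paper differentiates the massively parallel $K^n$ estimator with autodiff; here a central difference stands in for the gradient.

```python
import numpy as np

# Source term trick: L(J) = log mean_k[ w_k * exp(J * f(z_k)) ] satisfies
# dL/dJ |_{J=0} = sum_k w_k f(z_k) / sum_k w_k, the self-normalized
# importance-sampling estimate of the posterior expectation E[f].
rng = np.random.default_rng(0)

K = 100_000
z = rng.normal(size=K)                       # proposal: N(0, 1)
log_w = -0.5 * (z - 1.0)**2 + 0.5 * z**2     # unnormalized target N(1, 1)
w = np.exp(log_w - log_w.max())              # stabilized importance weights
f = z                                        # estimate the posterior mean E[z]

def L(J):                                    # perturbed log marginal likelihood
    return np.log(np.mean(w * np.exp(J * f)))

eps = 1e-5
grad = (L(eps) - L(-eps)) / (2 * eps)        # dL/dJ at J = 0
snis = np.sum(w * f) / np.sum(w)             # direct self-normalized estimate
print(grad, snis)                            # both approximate E[z] = 1
```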

In recent years, the extraction of opinions and information from user-generated text has attracted a lot of interest, largely due to the unprecedented volume of content in Social Media. However, social researchers face some issues in adopting cutting-edge tools for these tasks, as they are usually behind commercial APIs, unavailable for languages other than English, or too complex for non-experts to use. To address these issues, we present pysentimiento, a comprehensive multilingual Python toolkit designed for opinion mining and other Social NLP tasks. This open-source library brings state-of-the-art models for Spanish, English, Italian, and Portuguese in an easy-to-use Python library, allowing researchers to leverage these techniques. We present a comprehensive assessment of performance for several pre-trained language models across a variety of tasks, languages, and datasets, including an evaluation of fairness in the results.
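A minimal usage sketch following the `create_analyzer` interface documented in the pysentimiento README; the exact prediction output depends on the installed model version.

```python
from pysentimiento import create_analyzer

# Build a sentiment analyzer for Spanish and score a sample sentence.
analyzer = create_analyzer(task="sentiment", lang="es")
result = analyzer.predict("Qué gran jugador es Messi")
print(result)   # e.g. AnalyzerOutput(output=POS, probas={POS: 0.98, ...})
```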

Infinite-dimensional, holomorphic functions have been studied in detail over the last several decades, due to their relevance to parametric differential equations and computational uncertainty quantification. The approximation of such functions from finitely many samples is of particular interest, due to the practical importance of constructing surrogate models for complex mathematical models of physical processes. In a previous work [5], we studied the approximation of so-called Banach-valued, $(\boldsymbol{b},\varepsilon)$-holomorphic functions on the infinite-dimensional hypercube $[-1,1]^{\mathbb{N}}$ from $m$ (potentially adaptive) samples. In particular, we derived lower bounds for the adaptive $m$-widths for classes of such functions, which showed that certain algebraic rates of the form $m^{1/2-1/p}$ are the best possible regardless of the sampling-recovery pair. In this work, we continue this investigation by focusing on the practical case where the samples are pointwise evaluations drawn independently and identically from a probability measure. Specifically, for Hilbert-valued $(\boldsymbol{b},\varepsilon)$-holomorphic functions, we show that the same rates can be achieved (up to a small polylogarithmic or algebraic factor) for essentially arbitrary tensor-product Jacobi (ultraspherical) measures. Our reconstruction maps are based on least squares and compressed sensing procedures using the corresponding orthonormal Jacobi polynomials. In doing so, we strengthen and generalize past work that has derived weaker nonuniform guarantees for the uniform and Chebyshev measures (and corresponding polynomials) only. We also extend various best $s$-term polynomial approximation error bounds to arbitrary Jacobi polynomial expansions. Overall, we demonstrate that i.i.d.\ pointwise samples are near-optimal for the recovery of infinite-dimensional, holomorphic functions.
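A one-dimensional, scalar-valued toy version of the least-squares recovery map from i.i.d. pointwise samples (the paper treats infinite-dimensional, Hilbert-valued functions and general Jacobi measures; here, the uniform measure with Legendre polynomials and an illustrative test function).

```python
import numpy as np
from numpy.polynomial import legendre

# Least-squares polynomial recovery from i.i.d. random pointwise samples:
# uniform measure on [-1, 1], Legendre basis, modest oversampling.
rng = np.random.default_rng(1)

f = lambda x: 1.0 / (1.0 + 25.0 * x**2)   # smooth (holomorphic) test function
n = 15                                    # polynomial degree
m = 4 * (n + 1)                           # oversampled number of samples

x = rng.uniform(-1.0, 1.0, size=m)        # i.i.d. draws from the uniform measure
V = legendre.legvander(x, n)              # Legendre "Vandermonde" matrix
coef, *_ = np.linalg.lstsq(V, f(x), rcond=None)

xt = np.linspace(-1, 1, 1001)             # measure the uniform-grid error
err = np.max(np.abs(legendre.legval(xt, coef) - f(xt)))
print(f"max error of degree-{n} least-squares fit: {err:.2e}")
```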

It was conjectured by Gupta et al. [Combinatorica04] that every planar graph can be embedded into $\ell_1$ with constant distortion. However, given an $n$-vertex weighted planar graph, the best upper bound on the distortion is only $O(\sqrt{\log n})$, by Rao [SoCG99]. In this paper we study the case where there is a set $K$ of terminals, and the goal is to embed only the terminals into $\ell_1$ with low distortion. In a seminal paper, Okamura and Seymour [J.Comb.Theory81] showed that if all the terminals lie on a single face, they can be embedded isometrically into $\ell_1$. The more general case, where the set of terminals can be covered by $\gamma$ faces, was studied by Lee and Sidiropoulos [STOC09] and Chekuri et al. [J.Comb.Theory13]. The state of the art is an upper bound of $O(\log \gamma)$ by Krauthgamer, Lee and Rika [SODA19]. Our contribution is a further improvement on the upper bound to $O(\sqrt{\log\gamma})$. Since every planar graph has at most $O(n)$ faces, any further improvement on this result will be a major breakthrough, directly improving upon Rao's long-standing upper bound. Moreover, it is well known that the flow-cut gap equals the distortion of the best embedding into $\ell_1$. Therefore, our result provides a polynomial time $O(\sqrt{\log \gamma})$-approximation to the sparsest cut problem on planar graphs, for the case where all the demand pairs can be covered by $\gamma$ faces.

Feature attribution is a fundamental task in both machine learning and data analysis, which involves determining the contribution of individual features or variables to a model's output. This process helps identify the most important features for predicting an outcome. The history of feature attribution methods can be traced back to Generalized Additive Models (GAMs), which extend linear regression models by incorporating non-linear relationships between dependent and independent variables. In recent years, gradient-based methods and surrogate models have been applied to interpret complex Artificial Intelligence (AI) systems, but these methods have limitations. GAMs tend to achieve lower accuracy, gradient-based methods can be difficult to interpret, and surrogate models often suffer from stability and fidelity issues. Furthermore, most existing methods do not consider users' contexts, which can significantly influence their preferences. To address these limitations and advance the current state of the art, we define a novel feature attribution framework called Context-Aware Feature Attribution Through Argumentation (CA-FATA). Our framework harnesses the power of argumentation by treating each feature as an argument that can either support, attack or neutralize a prediction. Additionally, CA-FATA formulates feature attribution as an argumentation procedure, and each computation has explicit semantics, which makes it inherently interpretable. CA-FATA also easily integrates side information, such as users' contexts, resulting in more accurate predictions.
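The following is a hypothetical toy rendering of the "feature as argument" viewpoint for a linear scorer, not the CA-FATA procedure itself; the feature names, weights, and the neutrality threshold are invented for illustration.

```python
import numpy as np

# Toy illustration: read each feature's signed contribution to a linear
# score as supporting, attacking, or being neutral to the prediction.
w = np.array([1.2, -0.8, 0.05, 0.6])          # model weights (illustrative)
x = np.array([1.0,  2.0, 3.0, -1.0])          # one input instance
names = ["age", "price", "noise", "rating"]   # hypothetical feature names

contrib = w * x                               # signed contributions
tau = 0.1                                     # neutrality threshold (assumed)
for name, c in zip(names, contrib):
    role = "supports" if c > tau else "attacks" if c < -tau else "is neutral to"
    print(f"{name:>7s} {role} the prediction (contribution {c:+.2f})")
```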

Knowledge graphs contain rich semantic relationships related to items, and incorporating such relationships into recommender systems helps to explore the latent connections of items, thus improving the accuracy of prediction and enhancing the explainability of recommendations. However, such explainability is not adapted to users' contexts, which can significantly influence their preferences. In this work, we propose CA-KGCN (Context-Aware Knowledge Graph Convolutional Network), an end-to-end framework that models users' preferences adapted to their contexts and incorporates the rich semantic relationships in the knowledge graph related to items. The framework captures users' attention to different factors, namely contexts and item features, and provides explanations adapted to the given context. Experiments on three real-world datasets show the effectiveness of our framework at modeling users' preferences adapted to their contexts and at explaining the recommendations it generates.
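A generic sketch of context-conditioned attention over item features, in the spirit of (but not reproducing) the CA-KGCN architecture; all embeddings below are random placeholders.

```python
import numpy as np

# Attention over item features, conditioned on a context embedding: the
# attention weights double as a per-context explanation signal.
rng = np.random.default_rng(0)
d = 8
context = rng.normal(size=d)            # e.g. embedding of "weekend evening"
features = rng.normal(size=(5, d))      # embeddings of 5 item features/relations

scores = features @ context             # compatibility of feature with context
attn = np.exp(scores - scores.max())
attn /= attn.sum()                      # softmax attention weights
item_repr = attn @ features             # context-aware item representation

print(np.round(attn, 3))                # which features dominate in this context
```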

We introduce a nonlinear stochastic model reduction technique for high-dimensional stochastic dynamical systems that have a low-dimensional invariant effective manifold with slow dynamics, and high-dimensional, large fast modes. Given only access to a black box simulator from which short bursts of simulation can be obtained, we design an algorithm that outputs an estimate of the invariant manifold, a process realizing the effective stochastic dynamics on it (with the fast modes averaged out), and a simulator thereof. This simulator is efficient in that it exploits the low dimension of the invariant manifold, and takes time steps whose size depends on the regularity of the effective process and is therefore typically much larger than the time step of the original simulator, which had to resolve the fast modes. The algorithm and the estimation can be performed on-the-fly, leading to efficient exploration of the effective state space, without losing consistency with the underlying dynamics. This construction enables fast and efficient simulation of paths of the effective dynamics, together with estimation of crucial features and observables of such dynamics, including the stationary distribution, identification of metastable states, and residence times and transition rates between them.
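A toy sketch of the burst-based estimation idea for a two-dimensional fast-slow SDE in which the slow coordinate is already known (the algorithm above additionally learns the invariant manifold): effective drift and diffusion are read off from the mean and variance of slow increments over many short bursts, after which the effective model could be stepped with a much larger time step. The system and all parameters are illustrative.

```python
import numpy as np

# Fast-slow system (illustrative):
#   dx = -y dt + 0.5 dW1                    (slow, driven by the fast mode y)
#   dy = -(y - x)/eps dt + dW2/sqrt(eps)    (fast: y tracks x on time scale eps)
# Averaging out y gives the effective equation dx = -x dt + 0.5 dW1.
rng = np.random.default_rng(0)
eps, dt = 1e-3, 1e-5                     # fine step must resolve the fast mode

def bursts(x0, n=2000, steps=1000):
    x = np.full(n, x0)
    y = np.full(n, x0)                   # start the fast mode equilibrated
    for _ in range(steps):               # Euler-Maruyama on the full system
        x, y = (x - y * dt + 0.5 * np.sqrt(dt) * rng.normal(size=n),
                y - (y - x) / eps * dt + np.sqrt(dt / eps) * rng.normal(size=n))
    return x - x0, steps * dt            # slow increments and burst length T

dx, T = bursts(x0=1.0)                   # burst length: long vs eps, short vs 1
print(f"effective drift   ~ {dx.mean() / T:+.2f}  (averaged limit: -1.0)")
print(f"effective sigma^2 ~ {dx.var() / T: .3f}  (averaged limit:  0.25)")
```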

Matrix-product constructions giving rise to locally recoverable codes are considered, for both the classical $r$-locality and the $(r,\delta)$-locality. We study the recovery advantages offered by the constituent codes and also by the defining matrices of the matrix-product codes. Finally, we extend these methods to a particular variation of matrix-product codes and to quasi-cyclic codes. Singleton-optimal locally recoverable codes and almost Singleton-optimal codes, with length larger than the finite field size, are obtained, some of them with superlinear length.
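A small GF(2) example of the matrix-product construction $[C_1, C_2] \cdot A$, whose generator matrix is the block matrix $(a_{ij} G_i)$; the specific constituent codes and defining matrix below are illustrative and carry no locality claims.

```python
import numpy as np

# Matrix-product code [C1, C2] . A over GF(2). With A = [[1,1],[0,1]] this
# is the classical (u | u+v) construction; generator = block matrix (a_ij G_i).
G1 = np.array([[1, 0, 0, 1],             # generator of a [4,2] code C1
               [0, 1, 1, 1]])
G2 = np.array([[1, 1, 1, 1]])            # [4,1] repetition code C2
A  = np.array([[1, 1],                   # 2 x 2 defining matrix
               [0, 1]])

blocks = [[(A[i, j] * Gi) % 2 for j in range(A.shape[1])]
          for i, Gi in enumerate([G1, G2])]
G = np.block(blocks) % 2                 # generator of the [8,3] product code

# Enumerate all 2^3 messages to check the minimum distance by brute force.
msgs = np.array(np.meshgrid(*[[0, 1]] * G.shape[0])).reshape(G.shape[0], -1).T
codewords = (msgs @ G) % 2
d = min(int(c.sum()) for c in codewords if c.any())
print(G, f"minimum distance = {d}", sep="\n")   # d = min(2*d(C1), d(C2)) = 4
```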

This work focuses on accelerating the multiplication of a dense random matrix with a (fixed) sparse matrix, which is frequently used in sketching algorithms. We develop a novel scheme that takes advantage of blocking and recomputation (on-the-fly random number generation) to accelerate this operation. The techniques we propose decrease memory movement, thereby increasing the algorithm's parallel scalability in shared memory architectures. On the Intel Frontera architecture, our algorithm can achieve 2x speedups over libraries such as Eigen and Intel MKL on some examples. In addition, with 32 threads, we can obtain a parallel efficiency of up to approximately 45%. We also present a theoretical analysis for the memory movement lower bound of our algorithm, showing that under mild assumptions, it is possible to beat the data movement lower bound of general matrix-matrix multiply (GEMM) by a factor of $\sqrt{M}$, where $M$ is the cache size. Finally, we incorporate our sketching algorithm into a randomized least squares solver. For extremely over-determined sparse input matrices, we show that our results are competitive with SuiteSparse; in some cases, we obtain a speedup of 10x over SuiteSparse.
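A simplified, single-threaded sketch of the recomputation idea: when forming $SA$ with $S$ a $k \times n$ random sketching matrix and $A$ sparse, the needed columns of $S$ are regenerated deterministically from a seed instead of being stored, which removes the memory traffic for $S$ at the cost of extra FLOPs. This is not the paper's blocked parallel kernel; the seeds and sizes are illustrative.

```python
import numpy as np
from scipy import sparse

# Y = S @ A without ever materializing S: regenerate column i of S on the
# fly from a deterministic per-index seed (counter-based RNG style).
rng_seed = 12345
k, n, d = 64, 10_000, 50
A = sparse.random(n, d, density=1e-3, format="csc", random_state=0)

def s_col(i):                 # column i of S, regenerated deterministically
    return np.random.default_rng((rng_seed, i)).standard_normal(k)

Y = np.zeros((k, d))
for j in range(d):            # Y[:, j] = sum over nonzeros A[i, j] * S[:, i]
    start, end = A.indptr[j], A.indptr[j + 1]
    for i, a in zip(A.indices[start:end], A.data[start:end]):
        Y[:, j] += a * s_col(i)

# Sanity check against an explicit dense GEMM.
S = np.column_stack([s_col(i) for i in range(n)])
print(np.allclose(Y, S @ A.toarray()))
```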

Suppose a finite, unweighted, combinatorial graph $G = (V,E)$ is the union of several (degree-)regular graphs which are then connected by a few additional edges. $G$ will then have only a small number of vertices $v \in V$ with the property that some neighbor $w$ of $v$ (with $(v,w) \in E$) has a higher degree $\mbox{deg}(w) > \mbox{deg}(v)$. We prove the converse statement: if a graph has few vertices having a neighbor with higher degree and satisfies a mild regularity condition, then, by adding and removing a few edges, the graph can be turned into a disjoint union of (distance-)regular graphs. The number of edge operations depends on the maximum degree and on the number of vertices with a higher-degree neighbor, but is independent of $|V|$.
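The quantity in the statement is directly computable; the short sketch below builds a union of two regular graphs joined by one extra edge and lists the vertices that have a strictly higher-degree neighbor.

```python
from collections import defaultdict

# Vertices with at least one strictly higher-degree neighbor. For a disjoint
# union of regular graphs this set is empty; a few bridge edges keep it small.
edges = [(0, 1), (1, 2), (2, 0),          # a triangle (2-regular)
         (3, 4), (4, 5), (5, 6), (6, 3),  # a 4-cycle  (2-regular)
         (2, 3)]                          # one additional connecting edge

adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

deg = {v: len(nb) for v, nb in adj.items()}
bad = {v for v in adj if any(deg[w] > deg[v] for w in adj[v])}
print(bad)   # {0, 1, 4, 6}: only the neighbors of the two degree-3 endpoints
```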
