
We extend the error bounds from [SIMAX, Vol. 43, Iss. 2, pp. 787-811 (2022)] for the Lanczos method for matrix function approximation to the block algorithm. Numerical experiments suggest that our bounds are fairly robust to changes in block size and have the potential for use as a practical stopping criterion. Further experiments work towards a better understanding of how certain hyperparameters should be chosen in order to maximize the quality of the error bounds, even in the previously studied block-size-one case.
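For context, here is a minimal sketch of the block-size-one Lanczos approximation of $f(A)b$ with a naive stopping heuristic based on the change between successive iterates; it is not the paper's block method or its error bounds, and the tolerances and test matrix are illustrative.

```python
# Minimal Lanczos sketch for approximating f(A) @ b (no reorthogonalization).
import numpy as np

def lanczos_fA_b(A, b, f, max_iter=50, tol=1e-10):
    n = b.shape[0]
    beta0 = np.linalg.norm(b)
    Q = np.zeros((n, max_iter + 1))
    Q[:, 0] = b / beta0
    alphas, betas = [], []
    prev = None
    for k in range(max_iter):
        w = A @ Q[:, k]
        if k > 0:
            w -= betas[-1] * Q[:, k - 1]
        alpha = Q[:, k] @ w
        w -= alpha * Q[:, k]
        alphas.append(alpha)
        beta = np.linalg.norm(w)
        # Tridiagonal Jacobi matrix T_k from the recurrence coefficients.
        T = np.zeros((k + 1, k + 1))
        T[np.arange(k + 1), np.arange(k + 1)] = alphas
        if k > 0:
            T[np.arange(k), np.arange(1, k + 1)] = betas
            T[np.arange(1, k + 1), np.arange(k)] = betas
        # Lanczos approximation: beta0 * Q_k f(T_k) e_1.
        evals, evecs = np.linalg.eigh(T)
        fT_e1 = evecs @ (f(evals) * evecs[0, :])
        approx = beta0 * Q[:, :k + 1] @ fT_e1
        # Heuristic stopping criterion: difference of successive iterates.
        if prev is not None and np.linalg.norm(approx - prev) < tol:
            return approx, k + 1
        prev = approx
        if beta < 1e-14:            # invariant subspace found; approximation exact
            return approx, k + 1
        betas.append(beta)
        Q[:, k + 1] = w / beta
    return prev, max_iter

# Example: approximate exp(A) @ b for a random SPD matrix.
rng = np.random.default_rng(0)
M = rng.standard_normal((200, 200))
A = M @ M.T / 200 + np.eye(200)
b = rng.standard_normal(200)
x, iters = lanczos_fA_b(A, b, np.exp)
```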

Related content

In general, a function approximation problem asks us to select a function from a well-defined class that closely matches ("approximates") a target function in a task-specific way. The need for function approximation arises in many branches of applied mathematics, and in computer science in particular. One can distinguish two major classes of function approximation problems. First, for known target functions, approximation theory is the branch of numerical analysis that studies how certain known functions (for example, special functions) can be approximated by a specific class of functions (for example, polynomials or rational functions) that often have desirable properties (inexpensive computation, continuity, integral and limit values, and so on). Second, the target function g may be unknown; instead of an explicit formula, only a set of points of the form (x, g(x)) is provided. Depending on the structure of the domain and codomain of g, several techniques for approximating g may be applicable. For example, if g is an operation on the real numbers, techniques of interpolation, extrapolation, regression analysis, and curve fitting can be used. If the codomain (range or target set) of g is a finite set, one is instead dealing with a classification problem. To some extent, the different problems (regression, classification, fitness approximation) have received a unified treatment in statistical learning theory, where they are viewed as supervised learning problems.
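As a small concrete instance of the second setting, where the target g is known only through sampled points (x, g(x)), a least-squares polynomial fit serves as a cheap surrogate; the choice of a cubic and the noise level here are arbitrary.

```python
# Fit a cheap polynomial surrogate to noisy samples of an unknown target g.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 2.0 * np.pi, 40)
y = np.sin(x) + 0.05 * rng.standard_normal(x.size)   # noisy samples of g

coeffs = np.polyfit(x, y, deg=3)       # least-squares cubic fit
g_hat = np.poly1d(coeffs)              # cheap-to-evaluate approximation

x_new = np.linspace(0.0, 2.0 * np.pi, 200)
print(np.max(np.abs(g_hat(x_new) - np.sin(x_new))))   # approximation error
```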

Group testing is a technique which avoids individually testing $p$ samples for a rare disease and instead tests $n < p$ pools, where a pool consists of a mixture of small, equal portions of a subset of the $p$ samples. Group testing saves testing time and resources in many applications, including RT-PCR, with guarantees for the recovery of the status of the $p$ samples from results on the $n$ pools. The noise in quantitative RT-PCR is known to follow a multiplicative, data-dependent model. In recent literature, the corresponding linear systems for inferring the health status of $p$ samples from results on $n$ pools have been solved using the Lasso estimator and its variants, which have typically been used in additive Gaussian noise settings. There is no existing literature which establishes performance bounds for the Lasso under the multiplicative noise model associated with RT-PCR. After noting that a recent general technique of Hunt et al. works for Poisson inverse problems, we adapt it to handle sparse signal reconstruction from compressive measurements with multiplicative noise: we present high-probability performance bounds and data-dependent weights for the Lasso and its weighted version. We also show numerical results on simulated pooled RT-PCR data to empirically validate our bounds.
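A minimal simulation sketch of the pooled-measurement setup with multiplicative noise and a Lasso decoder follows. The Bernoulli pooling matrix, noise level, regularization strength, and the particular measurement weighting (applied to the data-fitting term) are illustrative choices, not the data-dependent weights derived in the paper.

```python
# Pooled measurements with multiplicative noise, decoded by (weighted) Lasso.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
p, n, k = 200, 60, 5                       # samples, pools, infected samples

A = rng.binomial(1, 0.1, size=(n, p)).astype(float)   # random pooling matrix
x_true = np.zeros(p)
x_true[rng.choice(p, k, replace=False)] = rng.uniform(1.0, 5.0, k)  # viral loads

y_clean = A @ x_true
sigma = 0.1
y = y_clean * (1.0 + sigma * rng.standard_normal(n))   # multiplicative noise

# Plain Lasso decoder.
lasso = Lasso(alpha=0.05, positive=True, fit_intercept=False, max_iter=10000)
lasso.fit(A, y)

# Illustrative weighted variant: rescale each pooled measurement, e.g. by
# 1 / (1 + |y_i|), so pools with larger signal (hence larger noise) count less.
w = 1.0 / (1.0 + np.abs(y))
lasso_w = Lasso(alpha=0.05, positive=True, fit_intercept=False, max_iter=10000)
lasso_w.fit(A * w[:, None], y * w)

print(np.linalg.norm(lasso.coef_ - x_true), np.linalg.norm(lasso_w.coef_ - x_true))
```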

This paper provides a comprehensive error analysis of learning with vector-valued random features (RF). The theory is developed for RF ridge regression in a fully general infinite-dimensional input-output setting, but nonetheless applies to and improves existing finite-dimensional analyses. In contrast to comparable work in the literature, the approach proposed here relies on a direct analysis of the underlying risk functional and completely avoids the explicit RF ridge regression solution formula in terms of random matrices. This removes the need for concentration results in random matrix theory or their generalizations to random operators. The main results established in this paper include strong consistency of vector-valued RF estimators under model misspecification and minimax optimal convergence rates in the well-specified setting. The parameter complexity (number of random features) and sample complexity (number of labeled data) required to achieve such rates are comparable with Monte Carlo intuition and free from logarithmic factors.
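A finite-dimensional sketch of random-feature ridge regression with vector-valued outputs, using random Fourier features for a Gaussian kernel; the dimensions, bandwidth, and ridge parameter are arbitrary, and the infinite-dimensional setting of the paper is not reproduced here.

```python
# Random Fourier features + ridge regression with a vector-valued target.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n_train, n_feat = 5, 3, 500, 300

X = rng.standard_normal((n_train, d_in))
Y = np.stack([np.sin(X[:, 0]), np.cos(X[:, 1]), X[:, 2] ** 2], axis=1)  # targets

# Random Fourier features: phi(x) = sqrt(2/m) cos(W x + b).
W = rng.standard_normal((n_feat, d_in))        # bandwidth-1 Gaussian kernel
b = rng.uniform(0.0, 2.0 * np.pi, n_feat)
def features(Z):
    return np.sqrt(2.0 / n_feat) * np.cos(Z @ W.T + b)

Phi = features(X)                               # (n_train, n_feat)
lam = 1e-3
# Ridge regression solved output-wise: (Phi^T Phi + n*lam*I) C = Phi^T Y.
C = np.linalg.solve(Phi.T @ Phi + n_train * lam * np.eye(n_feat), Phi.T @ Y)

X_test = rng.standard_normal((100, d_in))
Y_pred = features(X_test) @ C                   # vector-valued prediction
```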

We give a structure theorem for Boolean functions on the biased hypercube which are $\epsilon$-close to degree $d$ in $L_2$, showing that they are close to sparse juntas. Our structure theorem implies that such functions are $O(\epsilon^{C_d} + p)$-close to constant functions. We pinpoint the exact value of the constant $C_d$.

Score-based generative models are a popular class of generative modelling techniques relying on stochastic differential equations (SDE). From their inception, it was realized that it was also possible to perform generation using ordinary differential equations (ODE) rather than SDE. This led to the introduction of the probability flow ODE approach and denoising diffusion implicit models. Flow matching methods have recently further extended these ODE-based approaches and approximate a flow between two arbitrary probability distributions. Previous work derived bounds on the approximation error of diffusion models under the stochastic sampling regime, given assumptions on the $L^2$ loss. We present error bounds for the flow matching procedure using fully deterministic sampling, assuming an $L^2$ bound on the approximation error and a certain regularity condition on the data distributions.
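A toy illustration of deterministic, ODE-based sampling in this spirit: samples from a reference Gaussian are pushed to a target by integrating a velocity field with an explicit Euler scheme. The hand-written constant velocity field corresponds to a simple linear interpolation path between two Gaussians; in practice the field would be a trained network, and the step count below is arbitrary.

```python
# Deterministic sampling by integrating a (toy) velocity field from t=0 to t=1.
import numpy as np

def v(x, t, target_mean=3.0):
    # Toy velocity field for a linear path from N(0, 1) to N(target_mean, 1).
    return np.full_like(x, target_mean)

def sample_ode(v, n_samples=1000, n_steps=100, rng=None):
    rng = rng or np.random.default_rng(0)
    x = rng.standard_normal(n_samples)          # x ~ reference distribution
    dt = 1.0 / n_steps
    for i in range(n_steps):                    # explicit Euler integration
        t = i * dt
        x = x + dt * v(x, t)
    return x

samples = sample_ode(v)
print(samples.mean(), samples.std())            # roughly 3.0 and 1.0
```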

Bilevel optimization reveals the inner structure of otherwise oblique optimization problems, such as hyperparameter tuning and meta-learning. A common goal in bilevel optimization is to find stationary points of the hyper-objective function. Although this hyper-objective approach is widely used, its theoretical properties have not been thoroughly investigated in cases where the lower-level functions lack strong convexity. In this work, we take a step forward and study the hyper-objective approach without the typical lower-level strong convexity assumption. Our hardness results show that the hyper-objective of general convex lower-level functions can be intractable either to evaluate or to optimize. To tackle this challenge, we introduce the gradient dominant condition, which strictly relaxes the strong convexity assumption by allowing the lower-level solution set to be non-singleton. Under the gradient dominant condition, we propose the Inexact Gradient-Free Method (IGFM), which uses the Switching Gradient Method (SGM) as the zeroth order oracle, to find an approximate stationary point of the hyper-objective. We also extend our results to nonsmooth lower-level functions under the weak sharp minimum condition.
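A toy sketch of the hyper-objective viewpoint on a one-dimensional problem: the lower-level problem is solved inexactly (plain gradient descent stands in for the switching gradient method), and the hyper-objective is minimized with a two-point zeroth-order gradient estimate. All functions, step sizes, and iteration counts are illustrative; this is not the paper's IGFM.

```python
# Hyper-objective phi(x) = f(x, y*(x)), optimized with a zeroth-order oracle.
import numpy as np

def lower_level_solve(x, n_steps=200, lr=0.1):
    # argmin_y g(x, y) with g(x, y) = 0.5 * (y - x)^2  (toy lower-level problem)
    y = 0.0
    for _ in range(n_steps):
        y -= lr * (y - x)
    return y

def hyper_objective(x):
    # phi(x) = f(x, y*(x)) with f(x, y) = (y - 1)^2 + 0.1 * x^2
    y = lower_level_solve(x)
    return (y - 1.0) ** 2 + 0.1 * x ** 2

def zeroth_order_step(x, rng, mu=1e-3, lr=0.1):
    u = rng.standard_normal()
    # Two-point gradient estimate of the hyper-objective.
    grad_est = (hyper_objective(x + mu * u) - hyper_objective(x - mu * u)) / (2 * mu) * u
    return x - lr * grad_est

x = 5.0
rng = np.random.default_rng(1)
for _ in range(300):
    x = zeroth_order_step(x, rng)
print(x, hyper_objective(x))   # x approaches the hyper-objective minimizer (~0.909)
```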

Efficient numerical solvers for partial differential equations empower science and engineering. One of the commonly employed numerical solvers is the preconditioned conjugate gradient (PCG) algorithm, which can solve large systems to a given precision level. One challenge in PCG solvers is the selection of preconditioners, as different problem-dependent systems can benefit from different preconditioners. We present a new method to introduce \emph{inductive bias} into the preconditioned conjugate gradient algorithm. Given a system matrix and a set of solution vectors arising from an underlying distribution, we train a graph neural network to obtain an approximate decomposition of the system matrix to be used as a preconditioner in the context of PCG solvers. We conduct extensive experiments to demonstrate the efficacy and generalizability of our proposed approach in solving various 2D and 3D linear second-order PDEs.
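A minimal preconditioned conjugate gradient routine showing where a learned (or classical) preconditioner plugs in: apply_M_inv applies the inverse of the preconditioner, for instance triangular solves with an approximate factor of the system matrix. The Jacobi-style stand-in below is illustrative only and is not the learned decomposition described above.

```python
# Preconditioned conjugate gradient; the preconditioner is a pluggable callable.
import numpy as np

def pcg(A, b, apply_M_inv, tol=1e-8, max_iter=500):
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_M_inv(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
        z = apply_M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

rng = np.random.default_rng(0)
n = 300
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                      # SPD test matrix

# Stand-in preconditioner: diagonal (Jacobi) scaling.
d = np.diag(A)
x = pcg(A, rng.standard_normal(n), lambda r: r / d)
```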

In this paper, we develop a novel data-driven approach to accelerate solving large-scale linear equation systems encountered in scientific computing and optimization. Our method utilizes self-supervised training of a graph neural network to generate an effective preconditioner tailored to the specific problem domain. By replacing conventional hand-crafted preconditioners used with the conjugate gradient method, our approach, named neural incomplete factorization (NeuralIF), significantly speeds up convergence and computational efficiency. At the core of our method is a novel message-passing block, inspired by sparse matrix theory, that aligns with the objective of finding a sparse factorization of the matrix. We evaluate our proposed method on both a synthetic and a real-world problem arising from scientific computing. Our results demonstrate that NeuralIF consistently outperforms the most common general-purpose preconditioners, including the incomplete Cholesky method, achieving competitive performance across various metrics even outside the training data distribution.
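A sketch of the kind of self-supervised factorization objective this suggests: fit a sparse lower-triangular factor L so that L L^T acts like A on randomly drawn probe vectors. In the paper the factor is produced by a graph neural network; here two fixed candidate factors are simply scored against such a loss, as an assumption-laden illustration rather than the actual training setup.

```python
# Self-supervised style loss for a candidate triangular factor L with L L^T ~ A.
import numpy as np

def factorization_loss(L, A, n_probes=16, rng=None):
    rng = rng or np.random.default_rng(0)
    X = rng.standard_normal((A.shape[0], n_probes))
    R = L @ (L.T @ X) - A @ X            # residual of L L^T against A on probes
    return np.mean(R ** 2)

rng = np.random.default_rng(1)
n = 50
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)

L_exact = np.linalg.cholesky(A)          # exact factor: loss ~ 0
L_diag = np.diag(np.sqrt(np.diag(A)))    # diagonal (Jacobi-like) factor
print(factorization_loss(L_exact, A), factorization_loss(L_diag, A))
```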

In high-dimensional Bayesian statistics, several methods have been developed, including many prior distributions that induce sparsity in the estimated parameters. However, such priors have limitations in handling the spectral (eigenvector) structure of data, and as a result they are ill-suited for analyzing the over-parameterized models (high-dimensional linear models that do not assume sparsity) that have been developed in recent years. This paper introduces a Bayesian approach that uses a prior depending on the eigenvectors of data covariance matrices, but does not induce sparsity of the parameters. We also provide contraction rates of the derived posterior distributions and develop a truncated Gaussian approximation of the posterior distribution. The former demonstrates the efficiency of posterior estimation, while the latter enables quantification of parameter uncertainty via a Bernstein-von Mises-type approach. These results indicate that Bayesian methods which exploit the spectral structure of the data and estimate non-sparse high-dimensional parameters are feasible.
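A schematic illustration (not the paper's construction) of a non-sparse prior aligned with the data spectrum: a Gaussian prior whose covariance is diagonalized by the eigenvectors of the sample covariance, combined with a conjugate Gaussian likelihood so that the posterior mean is available in closed form. All scalings below are arbitrary.

```python
# Gaussian prior aligned with the eigenvectors of the sample covariance.
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 300                                   # over-parameterized: d > n
X = rng.standard_normal((n, d)) @ np.diag(np.linspace(2.0, 0.1, d))
y = X @ rng.standard_normal(d) / np.sqrt(d) + 0.1 * rng.standard_normal(n)

# Eigendecomposition of the sample covariance X^T X / n.
evals, V = np.linalg.eigh(X.T @ X / n)

# Prior: theta ~ N(0, V diag(tau) V^T), with tau tied to the eigenvalues
# (large prior variance along leading eigendirections; no sparsity).
tau = evals + 1e-6
sigma2 = 0.1 ** 2

# Conjugate Gaussian posterior mean for y = X theta + noise:
# mean = (X^T X / sigma2 + Sigma_prior^{-1})^{-1} X^T y / sigma2.
prior_prec = V @ np.diag(1.0 / tau) @ V.T
post_mean = np.linalg.solve(X.T @ X / sigma2 + prior_prec, X.T @ y / sigma2)
```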

PCA-Net is a recently proposed neural operator architecture which combines principal component analysis (PCA) with neural networks to approximate operators between infinite-dimensional function spaces. The present work develops approximation theory for this approach, improving and significantly extending previous work in this direction: First, a novel universal approximation result is derived, under minimal assumptions on the underlying operator and the data-generating distribution. Then, two potential obstacles to efficient operator learning with PCA-Net are identified, and made precise through lower complexity bounds; the first relates to the complexity of the output distribution, measured by a slow decay of the PCA eigenvalues. The other obstacle relates to the inherent complexity of the space of operators between infinite-dimensional input and output spaces, resulting in a rigorous and quantifiable statement of the curse of dimensionality. In addition to these lower bounds, upper complexity bounds are derived. A suitable smoothness criterion is shown to ensure an algebraic decay of the PCA eigenvalues. Furthermore, it is shown that PCA-Net can overcome the general curse of dimensionality for specific operators of interest, arising from the Darcy flow and the Navier-Stokes equations.
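A small numerical sketch of the PCA-Net idea on a toy one-dimensional operator: encode discretized input and output functions with PCA and learn a map between the two coefficient spaces. A linear least-squares map stands in for the neural network, and the antiderivative operator below is only an example.

```python
# PCA encode/decode on input and output function spaces, with a learned
# coefficient-to-coefficient map in between (least squares as a stand-in).
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_grid = 400, 128
t = np.linspace(0.0, 1.0, n_grid)

# Toy operator: input a = random sinusoid, output u = antiderivative of a.
A = np.array([np.cos(2 * np.pi * rng.integers(1, 4) * t + rng.uniform(0, 2 * np.pi))
              for _ in range(n_samples)])
U = np.cumsum(A, axis=1) / n_grid

def pca(Z, k):
    mean = Z.mean(axis=0)
    _, _, Vt = np.linalg.svd(Z - mean, full_matrices=False)
    return mean, Vt[:k]                              # mean and top-k components

k_in, k_out = 8, 8
a_mean, a_comp = pca(A, k_in)
u_mean, u_comp = pca(U, k_out)

# Coefficients in the PCA bases, and a least-squares map between them
# (in PCA-Net this map is a neural network).
A_coef = (A - a_mean) @ a_comp.T
U_coef = (U - u_mean) @ u_comp.T
W, *_ = np.linalg.lstsq(A_coef, U_coef, rcond=None)

# Predict: encode a new input, map coefficients, decode to the output grid.
a_new = np.cos(2 * np.pi * 2 * t)
u_pred = u_mean + ((a_new - a_mean) @ a_comp.T) @ W @ u_comp
```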

Learning from a few examples remains a key challenge in machine learning. Despite recent advances in important domains such as vision and language, the standard supervised deep learning paradigm does not offer a satisfactory solution for learning new concepts rapidly from little data. In this work, we employ ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories. Our framework learns a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types. We then define one-shot learning problems on vision (using Omniglot, ImageNet) and language tasks. Our algorithm improves one-shot accuracy on ImageNet from 87.6% to 93.2% and from 88.0% to 93.8% on Omniglot compared to competing approaches. We also demonstrate the usefulness of the same model on language modeling by introducing a one-shot task on the Penn Treebank.
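A minimal sketch of the matching step this describes: a query is classified by softmax attention over cosine similarities to an embedded labelled support set. A fixed random projection stands in for the trained deep embedding, and the data below are synthetic.

```python
# One-shot classification by attention over an embedded support set.
import numpy as np

def cosine(a, B):
    return (B @ a) / (np.linalg.norm(B, axis=1) * np.linalg.norm(a) + 1e-12)

def match(query, support_x, support_y, n_classes, embed):
    s = cosine(embed(query), embed(support_x))       # similarity to each support item
    att = np.exp(s) / np.exp(s).sum()                # attention over the support set
    probs = np.zeros(n_classes)
    for a, y in zip(att, support_y):
        probs[y] += a                                # attention-weighted label distribution
    return probs

rng = np.random.default_rng(0)
P = rng.standard_normal((16, 64))                    # stand-in embedding (random projection)
embed = lambda x: x @ P.T

support_x = rng.standard_normal((5, 64)) + np.arange(5)[:, None]  # 5 classes, 1 shot each
support_y = np.arange(5)
query = support_x[3] + 0.1 * rng.standard_normal(64)
print(match(query, support_x, support_y, 5, embed).argmax())      # most likely class 3
```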
