Matrix perturbation bounds (such as Weyl and Davis-Kahan) are frequently used in many branches of mathematics. Most of the classical results in this area are optimal in the worst-case analysis. However, in modern applications, both the ground and the noise matrices frequently have extra structural properties. For instance, it is often assumed that the ground matrix is essentially low rank, and the noise matrix is random or pseudo-random. We aim to rebuild a part of perturbation theory, adapting to these modern assumptions. We do this using a contour expansion argument, which enables us to exploit the skewness among the leading eigenvectors of the ground and the noise matrix (which is significant when the two are uncorrelated) to our advantage. In the current paper, we focus on the perturbation of eigenspaces. This helps us to introduce the arguments in the cleanest way, avoiding the more technical considerations of the general case. In applications, this case is also one of the most useful. More general results appear in a subsequent paper. Our method has led to several improvements, which have direct applications in central problems. Among others, we derive a sharp result for the perturbation of a low rank matrix by random noise, answering an open question in this area. Next, we derive new results concerning the spike model, an important model in statistics, bridging two different directions of current research. Finally, we use our results on the perturbation of eigenspaces to derive new results concerning eigenvalues of deterministic and random matrices. In particular, we obtain new results concerning the outliers in the deformed Wigner model and the least singular value of random matrices with non-zero mean.
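
As a small illustration of the classical worst-case bounds mentioned in the abstract, the sketch below checks Weyl's inequality, |λ_i(A + E) − λ_i(A)| ≤ ‖E‖, on a symmetric 2×2 example. The matrices `A` and `E` and the helper names are assumptions chosen for illustration; they do not come from the paper, and the example says nothing about the paper's improved bounds.

```python
import math

def sym2_eigenvalues(m):
    """Eigenvalues (in descending order) of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    (a, b), (_, c) = m
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return (mean + radius, mean - radius)

def sym2_spectral_norm(m):
    """Spectral norm of a symmetric 2x2 matrix: its largest absolute eigenvalue."""
    return max(abs(v) for v in sym2_eigenvalues(m))

# Hypothetical ground matrix A and symmetric perturbation E (illustrative values).
A = [[3.0, 0.0], [0.0, 1.0]]
E = [[0.1, 0.2], [0.2, -0.1]]
A_perturbed = [[A[i][j] + E[i][j] for j in range(2)] for i in range(2)]

# Weyl's inequality: each ordered eigenvalue moves by at most the spectral norm of E.
shifts = [abs(p - q)
          for p, q in zip(sym2_eigenvalues(A_perturbed), sym2_eigenvalues(A))]
weyl_bound = sym2_spectral_norm(E)

print("eigenvalue shifts:", shifts)
print("Weyl bound:", weyl_bound)
```

In this particular example both eigenvalue shifts are smaller than the bound, which is consistent with the bound being a worst-case guarantee rather than a typical value.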

Related content

Subspace · Trace · Exact · Approximation · Numerical analysis

December 3, 2024

A subspace method for large-scale trace ratio problems

G. Ferrandi, M. E. Hochstenbach, M. R. Oliveira

A subspace method is introduced to solve large-scale trace ratio problems. This approach is matrix-free, requiring only the action of the two matrices involved in the trace ratio. At each iteration, a smaller trace ratio problem is addressed in the search subspace. Additionally, the algorithm is endowed with a restarting strategy that ensures the monotonicity of the trace ratio value throughout the iterations. The behavior of the approximate solution is investigated from a theoretical viewpoint, extending existing results on Ritz values and vectors, as the angle between the search subspace and the exact solution approaches zero. Numerical experiments in multigroup classification show that this new subspace method tends to be more efficient than iterative approaches relying on (partial) eigenvalue decompositions at each step.

Factored · MoDELS · Graph · Functional · Mutually independent

December 3, 2024

Factored space models: Towards causality between levels of abstraction

Scott Garrabrant, Matthias Georg Mayer, Magdalena Wache, Leon Lang, Sam Eisenstat, Holger Dell

from arxiv, 29 pages

Causality plays an important role in understanding intelligent behavior, and there is a wealth of literature on mathematical models for causality, most of which is focused on causal graphs. Causal graphs are a powerful tool for a wide range of applications, in particular when the relevant variables are known and at the same level of abstraction. However, the given variables can also be unstructured data, like pixels of an image. Meanwhile, the causal variables, such as the positions of objects in the image, can be arbitrary deterministic functions of the given variables. Moreover, the causal variables may form a hierarchy of abstractions, in which the macro-level variables are deterministic functions of the micro-level variables. Causal graphs are limited when it comes to modeling this kind of situation. In the presence of deterministic relationships there is generally no causal graph that satisfies both the Markov condition and the faithfulness condition. We introduce factored space models as an alternative to causal graphs which naturally represent both probabilistic and deterministic relationships at all levels of abstraction. Moreover, we introduce structural independence and establish that it is equivalent to statistical independence in every distribution that factorizes over the factored space. This theorem generalizes the classical soundness and completeness theorem for d-separation.

Named entity recognition · MoDELS · Analysis · Class · Discriminator

December 3, 2024

GerPS-Compare: Comparing NER methods for legal norm analysis

Sarah T. Bachinger, Christoph Unger, Robin Erd, Leila Feddoul, Clara Lachenmaier, Sina Zarrieß, Birgitta König-Ries

We apply NER to a particular sub-genre of legal texts in German: the genre of legal norms regulating administrative processes in public service administration. The analysis of such texts involves identifying stretches of text that instantiate one of ten classes identified by public service administration professionals. We investigate and compare three methods for performing Named Entity Recognition (NER) to detect these classes: a rule-based system, deep discriminative models, and a deep generative model. Our results show that deep discriminative models outperform both the rule-based system and the deep generative model, the latter two performing roughly equally well and outperforming each other in different classes. The main cause for this somewhat surprising result is arguably the fact that the classes used in the analysis are semantically and syntactically heterogeneous, in contrast to the classes used in more standard NER tasks. Deep discriminative models appear to be better equipped for dealing with this heterogeneity than both generic LLMs and human linguists designing rule-based NER systems.

Extensibility · Optimizer · Controller · state-of-the-art · Linear

December 3, 2024

Efficient parallel inversion of ParaOpt preconditioners

Corentin Bonte, Arne Bouillon, Giovanni Samaey, Karl Meerbergen

Recently, the ParaOpt algorithm was proposed as an extension of the time-parallel Parareal method to optimal control. ParaOpt uses quasi-Newton steps that each require solving a system of matching conditions iteratively. The state-of-the-art parallel preconditioner for linear problems leads to a set of independent smaller systems that are currently hard to solve. We generalize the preconditioner to the nonlinear case and propose a new, fast inversion method for these smaller systems, avoiding disadvantages of the current options with adjusted boundary conditions in the subproblems.

Stationary · Linear · Subspace · Smoothing · Contraction

December 2, 2024

A note on indefinite matrix splitting and preconditioning

Andy Wathen

from arxiv, Submitted to Linear Algebra and its Applications

The solution of systems of linear(ized) equations lies at the heart of many problems in Scientific Computing. In particular for systems of large dimension, iterative methods are a primary approach. Stationary iterative methods are generally based on a matrix splitting, whereas for polynomial iterative methods such as Krylov subspace iteration, the splitting matrix is the preconditioner. The smoother in a multigrid method is generally a stationary or polynomial iteration. Here we consider real symmetric indefinite and complex Hermitian indefinite coefficient matrices and prove that no splitting matrix can lead to a contractive stationary iteration unless the inertia is exactly preserved. This has consequences for preconditioning for indefinite systems and smoothing for multigrid as we further describe.
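
To make the splitting-based stationary iteration concrete, here is a minimal sketch of x ← x + M⁻¹(b − Ax) with a diagonal splitting matrix M, applied to a small symmetric indefinite system. The matrix `A`, the right-hand side `b`, the two choices of M, and the function names are assumptions for illustration only. The example shows one case where a splitting with the same inertia as A converges and one where a positive definite M diverges; it is an illustration of the abstract's statement, not a proof of it.

```python
def stationary_iteration(A, M_diag, b, steps=50):
    """Run x <- x + M^{-1} (b - A x), where M is diagonal with entries M_diag."""
    n = len(b)
    x = [0.0] * n
    for _ in range(steps):
        residual = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + residual[i] / M_diag[i] for i in range(n)]
    return x

def residual_norm(A, x, b):
    """Maximum-norm of the residual b - A x."""
    n = len(b)
    return max(abs(b[i] - sum(A[i][j] * x[j] for j in range(n))) for i in range(n))

# Hypothetical symmetric indefinite matrix: one positive and one negative eigenvalue.
A = [[2.0, 1.0], [1.0, -3.0]]
b = [1.0, 1.0]

# Jacobi splitting M = diag(A) = diag(2, -3): same inertia as A.
x_same_inertia = stationary_iteration(A, [2.0, -3.0], b)

# Positive definite M = 4 I: inertia differs from that of A.
x_other_inertia = stationary_iteration(A, [4.0, 4.0], b)

print("residual, M with the inertia of A:", residual_norm(A, x_same_inertia, b))
print("residual, positive definite M:   ", residual_norm(A, x_other_inertia, b))
```

With M = 4I the iteration matrix I − A/4 has an eigenvalue larger than 1 in modulus (it maps the negative eigenvalue of A to a value above 1), so the residual grows instead of shrinking.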

MoDELS · Language modeling · Analysis · Large language model · Frequentist school

December 1, 2024

Quantifying perturbation impacts for large language models

Paulius Rauba, Qiyao Wei, Mihaela van der Schaar

from arxiv, Statistical Foundations of LLMs and Foundation Models Workshop at NeurIPS 2024

We consider the problem of quantifying how an input perturbation impacts the outputs of large language models (LLMs), a fundamental task for model reliability and post-hoc interpretability. A key obstacle in this domain is disentangling the meaningful changes in model responses from the intrinsic stochasticity of LLM outputs. To overcome this, we introduce Distribution-Based Perturbation Analysis (DBPA), a framework that reformulates LLM perturbation analysis as a frequentist hypothesis testing problem. DBPA constructs empirical null and alternative output distributions within a low-dimensional semantic similarity space via Monte Carlo sampling. Comparisons of Monte Carlo estimates in the reduced-dimensionality space enable tractable frequentist inference without relying on restrictive distributional assumptions. The framework is model-agnostic, supports the evaluation of arbitrary input perturbations on any black-box LLM, yields interpretable p-values, supports multiple perturbation testing via controlled error rates, and provides scalar effect sizes for any chosen similarity or distance metric. We demonstrate the effectiveness of DBPA in evaluating perturbation impacts, showing its versatility for perturbation analysis.
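
The general idea of comparing a null and an alternative distribution of similarity scores can be sketched with a generic two-sample permutation test. This is not the DBPA procedure itself, whose details are not given in the abstract. The function name, the test statistic (absolute difference of means), and the score values below are all assumptions for illustration; in practice the scores would come from repeated LLM sampling and a chosen similarity metric.

```python
import random

def permutation_p_value(null_scores, alt_scores, n_permutations=2000, seed=0):
    """Two-sample permutation test on the absolute difference of means."""
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(null_scores) - mean(alt_scores))
    pooled = list(null_scores) + list(alt_scores)
    k = len(null_scores)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabeling of the pooled scores
        if abs(mean(pooled[:k]) - mean(pooled[k:])) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (1 + hits) / (1 + n_permutations)

# Hypothetical similarity scores (made-up numbers, not results from the paper):
# null = similarity among responses to the original prompt,
# alt  = similarity between original-prompt and perturbed-prompt responses.
null_scores = [0.90, 0.92, 0.88, 0.91, 0.93, 0.90, 0.89, 0.92]
alt_scores = [0.60, 0.65, 0.58, 0.62, 0.66, 0.61, 0.59, 0.64]

p_shifted = permutation_p_value(null_scores, alt_scores)
p_identical = permutation_p_value(null_scores, null_scores)
print("p-value, shifted scores:  ", p_shifted)
print("p-value, identical scores:", p_identical)
```

A permutation test needs no distributional assumptions beyond exchangeability under the null, which is one plausible way to obtain the assumption-light frequentist inference the abstract describes.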

MoDELS · Statistic · Isotropy · Integration · Performer

November 29, 2024

Isotropy testing in spatial point patterns: nonparametric versus parametric replication under misspecification

Jakub J. Pypkowski, Adam M. Sykulski, James S. Martin

from arxiv, 20 pages, 9 figures, 3 tables

Several hypothesis testing methods have been proposed to validate the assumption of isotropy in spatial point patterns. A majority of these methods are characterised by an unknown distribution of the test statistic under the null hypothesis of isotropy. Parametric approaches to approximating the distribution involve simulation of patterns from a user-specified isotropic model. Alternatively, nonparametric replicates of the test statistic under isotropy can be used to waive the need for specifying a model. In this paper, we first develop a general framework which allows for the integration of a selected nonparametric replication method into isotropy testing. We then conduct a large simulation study comprising application-like scenarios to assess the performance of tests with different parametric and nonparametric replication methods. In particular, we explore distortions in test size and power caused by model misspecification, and demonstrate the advantages of nonparametric replication in such scenarios.

Denoising autoencoder · Predictor/decision function · CASE · PDE · Functional

November 29, 2024

High order ADER-DG method with local DG predictor for solutions of differential-algebraic systems of equations

I. S. Popov

from arxiv, 98 pages, 44 figures, 21 tables. arXiv admin note: text overlap with arXiv:2409.09933

A numerical ADER-DG method with a local DG predictor for solving DAE systems has been developed, based on the formulation of ADER-DG methods with a local DG predictor for ODE and PDE systems. The basis functions were chosen as Lagrange interpolation polynomials with nodal points at the roots of the Radau polynomials, which differs from the classical formulations of the ADER-DG method, where it is customary to use the roots of Legendre polynomials. It was shown that this basis leads to A-stability and L1-stability when the DAE solver is used as an ODE solver. The ADER-DG method yields a highly accurate numerical solution even on very coarse grids, with a step greater than the main characteristic scale of solution variation. The local discrete time solution can be used as a numerical solution of the DAE system between grid nodes, thereby providing subgrid resolution even on very coarse grids. Classical test examples were solved with the developed ADER-DG method. As the index of the DAE system increases, the empirical convergence orders p decrease. An unexpected result was obtained for the stiff DAE system: the empirical convergence orders obtained with the developed method turned out to be significantly higher than the values expected for this method in the case of stiff problems. It turns out that Lagrange interpolation polynomials with nodal points at the roots of the Radau polynomials are much better suited for solving stiff problems. Estimates showed that the computational costs of the ADER-DG method are approximately comparable to those of implicit Runge-Kutta methods used to solve DAE systems. Methods were proposed to reduce the computational costs of the ADER-DG method.

Optimizer · Block · Nonconvex · Projection · SimPLe

November 29, 2024

Block majorization-minimization with diminishing radius for constrained nonsmooth nonconvex optimization

Hanbaek Lyu, Yuchen Li

from arxiv, 27 pages, 4 figures. Generalize and improve the analysis for nonsmooth nonconvex problems

Block majorization-minimization (BMM) is a simple iterative algorithm for constrained nonconvex optimization that sequentially minimizes majorizing surrogates of the objective function in each block while the others are held fixed. BMM entails a large class of optimization algorithms such as block coordinate descent and its proximal-point variant, expectation-maximization, and block projected gradient descent. We first establish that for general constrained nonsmooth nonconvex optimization, BMM with $\rho$-strongly convex and $L_g$-smooth surrogates can produce an $\epsilon$-approximate first-order optimal point within $\widetilde{O}((1+L_g+\rho^{-1})\epsilon^{-2})$ iterations and asymptotically converges to the set of first-order optimal points. Next, we show that BMM combined with trust-region methods with diminishing radius has an improved complexity of $\widetilde{O}((1+L_g) \epsilon^{-2})$, independent of the inverse strong convexity parameter $\rho^{-1}$, allowing improved theoretical and practical performance with `flat' surrogates. Our results hold robustly even when the convex sub-problems are solved inexactly, as long as the optimality gaps are summable. Central to our analysis is a novel continuous first-order optimality measure, by which we bound the worst-case sub-optimality in each iteration by the first-order improvement the algorithm makes. We apply our general framework to obtain new results on various algorithms such as the celebrated multiplicative update algorithm for nonnegative matrix factorization by Lee and Seung, regularized nonnegative tensor decomposition, and the classical block projected gradient descent algorithm. Lastly, we numerically demonstrate that the additional use of diminishing radius can improve the convergence rate of BMM in many instances.
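
As an example of the block structure described above, the following sketch runs the Lee and Seung multiplicative update for nonnegative matrix factorization, alternating between the blocks H and W while the other is held fixed. It is the plain multiplicative update only and does not include the diminishing-radius modification studied in the paper. The data, the rank, the small constant `eps`, and all helper names are assumptions for illustration.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_loss(X, W, H):
    """Squared Frobenius norm of X - W H."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))

def multiplicative_update(X, W, H, eps=1e-12):
    """One sweep over the two blocks: update H with W fixed, then W with H fixed."""
    Wt = transpose(W)
    num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
    H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
         for rh, rn, rd in zip(H, num, den)]
    Ht = transpose(H)
    num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
    W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
         for rw, rn, rd in zip(W, num, den)]
    return W, H

# Hypothetical nonnegative data (6 x 5) and a rank-2 factorization, seeded for repeatability.
rng = random.Random(0)
X = [[rng.random() for _ in range(5)] for _ in range(6)]
W = [[rng.random() + 0.1 for _ in range(2)] for _ in range(6)]
H = [[rng.random() + 0.1 for _ in range(5)] for _ in range(2)]

losses = [frobenius_loss(X, W, H)]
for _ in range(50):
    W, H = multiplicative_update(X, W, H)
    losses.append(frobenius_loss(X, W, H))

print("initial loss:", losses[0])
print("final loss:  ", losses[-1])
```

Each block update minimizes a surrogate that majorizes the objective in that block, which is why the loss does not increase from one sweep to the next.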

Regularization term · Subspace · MoDELS · Nonconvex · Optimizer

November 28, 2024

Regularized methods via cubic model subspace minimization for nonconvex optimization

Stefania Bellavia, Davide Palitta, Margherita Porcelli, Valeria Simoncini

Adaptive cubic regularization methods for solving nonconvex problems need the efficient computation of the trial step, involving the minimization of a cubic model. We propose a new approach in which this model is minimized in a low-dimensional subspace that, in contrast to classic approaches, is reused for a number of iterations. Whenever the trial step produced by the low-dimensional minimization process is unsatisfactory, we employ a regularized Newton step whose regularization parameter is a by-product of the model minimization over the low-dimensional subspace. We show that the worst-case complexity of classic cubic regularized methods is preserved, despite the possible regularized Newton steps. We focus on the large class of problems for which (sparse) direct linear system solvers are available and provide several experimental results showing the very large gains of our new approach when compared to standard implementations of adaptive cubic regularization methods based on direct linear solvers. Our first choice as projection space for the low-dimensional model minimization is the polynomial Krylov subspace; nonetheless, we also explore the use of rational Krylov subspaces in cases where the polynomial ones lead to less competitive numerical results.