
CholeskyQR-type algorithms have become very popular in both academia and industry in recent years, as they strike a balance between computational cost, accuracy, and speed. CholeskyQR2 provides numerical stability of orthogonality, Shifted CholeskyQR3 handles ill-conditioned matrices, and the 3C algorithm is applicable to sparse matrices. However, the overestimation of the error matrices in previous works weakens the sufficient conditions for these algorithms; in particular, it leads to a conservative shift in Shifted CholeskyQR3 and 3C, which may greatly degrade the properties of the algorithms. In this work, we consider randomized methods and utilize the probabilistic error-analysis model of \cite{New} to carry out a rounding error analysis of CholeskyQR-type algorithms, combining the theoretical analysis with the $g$-norm defined in \cite{Columns}. Our analysis provides a smaller shift for Shifted CholeskyQR3 and improves the orthogonality of our 3C algorithm for dense matrices. Numerical experiments in the final section show that our improvements with randomized methods do offer some advantages over the original algorithms.
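To fix notation for readers unfamiliar with the family, the following is a minimal NumPy sketch of the deterministic CholeskyQR2 baseline that the abstract builds on (not the randomized variant analyzed in the paper); matrix sizes are illustrative.

```python
import numpy as np

def cholesky_qr(X):
    """One CholeskyQR step: factor the Gram matrix and solve for Q."""
    G = X.T @ X                          # n x n Gram matrix
    R = np.linalg.cholesky(G).T          # G = R^T R with R upper triangular
    Q = np.linalg.solve(R.T, X.T).T      # Q = X R^{-1}
    return Q, R

def cholesky_qr2(X):
    """CholeskyQR2: repeating the step once restores orthogonality."""
    Q1, R1 = cholesky_qr(X)
    Q, R2 = cholesky_qr(Q1)
    return Q, R2 @ R1

rng = np.random.default_rng(0)
X = rng.standard_normal((10000, 50))
Q, R = cholesky_qr2(X)
print(np.linalg.norm(Q.T @ Q - np.eye(50), 2))   # orthogonality error ~ eps
```

The first step alone may leave poor orthogonality for moderately conditioned inputs, which is why the step is repeated.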

Related content

Automatic differentiation is everywhere, but there exists only minimal documentation of how it works in complex arithmetic beyond stating "derivatives in $\mathbb{C}^d$" $\cong$ "derivatives in $\mathbb{R}^{2d}$" and, at best, shallow references to Wirtinger calculus. Unfortunately, the equivalence $\mathbb{C}^d \cong \mathbb{R}^{2d}$ becomes insufficient as soon as we need to derive custom gradient rules, e.g., to avoid differentiating "through" expensive linear algebra functions or differential equation simulators. To combat such a lack of documentation, this article surveys forward- and reverse-mode automatic differentiation with complex numbers, covering topics such as Wirtinger derivatives, a modified chain rule, and different gradient conventions while explicitly avoiding holomorphicity and the Cauchy--Riemann equations (which would be far too restrictive). To be precise, we will derive, explain, and implement a complex version of Jacobian-vector and vector-Jacobian products almost entirely with linear algebra without relying on complex analysis or differential geometry. This tutorial is a call to action, for users and developers alike, to take complex values seriously when implementing custom gradient propagation rules -- the manuscript explains how.
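As a concrete illustration of the Wirtinger derivatives mentioned above, here is a small NumPy sketch (ours, not the article's) that estimates $\partial f/\partial z$ and $\partial f/\partial \bar{z}$ by finite differences for the deliberately non-holomorphic function $f(z) = z\bar{z}$; the test function and step size are illustrative.

```python
import numpy as np

def wirtinger(f, z, h=1e-6):
    """Numerical Wirtinger derivatives:
    d/dz    = (d/dx - i d/dy) / 2
    d/dzbar = (d/dx + i d/dy) / 2
    """
    dfdx = (f(z + h) - f(z - h)) / (2 * h)
    dfdy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return (dfdx - 1j * dfdy) / 2, (dfdx + 1j * dfdy) / 2

f = lambda z: z * np.conj(z)          # |z|^2: real-valued, not holomorphic
z0 = 1.0 + 2.0j
dz, dzbar = wirtinger(f, z0)
print(dz, np.conj(z0))                # d f/d z    == conj(z)
print(dzbar, z0)                      # d f/d zbar == z
```

For this $f$ the Cauchy--Riemann equations fail ($\partial f/\partial \bar{z} \neq 0$), yet both derivatives are perfectly well defined, which is the tutorial's point.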

For several classes of neural PDE solvers (Deep Ritz, PINNs, DeepONets), the ability to approximate the solution or solution operator to a partial differential equation (PDE) hinges on the ability of a neural network to approximate the solution in the spatial variables. We analyze the capacity of neural networks to approximate solutions to an elliptic PDE assuming that the boundary condition can be approximated efficiently. Our focus is on the Laplace operator with Dirichlet boundary condition on a half space and on neural networks with a single hidden layer and an activation function that is a power of the popular ReLU activation function.
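For concreteness, a minimal sketch (not from the paper) of the network class in question: a single hidden layer whose activation is a power of ReLU. The width, the power $k$, and the random parameters are illustrative.

```python
import numpy as np

def relu_pow_net(x, W, b, a, k=2):
    """f(x) = sum_j a_j * max(0, w_j . x + b_j)^k  -- one hidden layer."""
    pre = x @ W.T + b                   # (batch, width) pre-activations
    return np.maximum(pre, 0.0) ** k @ a

rng = np.random.default_rng(0)
d, width = 3, 64
W = rng.standard_normal((width, d))
b = rng.standard_normal(width)
a = rng.standard_normal(width) / width
x = rng.standard_normal((5, d))
print(relu_pow_net(x, W, b, a).shape)   # (5,)
```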

Weighting with the inverse probability of censoring is an approach to regression analyses where the outcome may be missing due to right-censoring. In this paper, three separate approaches based on this idea are compared in a setting where the Kaplan--Meier estimator is used to estimate the censoring probability. In more detail, the three approaches involve weighted regression, regression with a weighted outcome, and regression of a jack-knife pseudo-observation based on a weighted estimator. Expressions for the asymptotic variances are given in each case and compared to each other and to the uncensored case. In terms of low asymptotic variance, no clear winner emerges: which approach has the lowest asymptotic variance depends on the censoring distribution. Expressions for the limit of the standard sandwich variance estimator in the three cases are also provided, revealing an overestimation under the implied assumptions.
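The following is a rough NumPy sketch of the first of the three approaches, weighted regression with Kaplan--Meier inverse-probability-of-censoring weights, on simulated data. The simulation, the log-linear model, and the neglect of the left-limit refinement $\hat{G}(t-)$ are our simplifications, not details from the paper.

```python
import numpy as np

def km_survival(times, events, at):
    """Kaplan--Meier survival estimate, evaluated at the points `at`."""
    order = np.argsort(times)
    t, d = times[order], events[order]
    n, s = len(t), 1.0
    surv = np.ones(len(at))
    for i in range(n):
        if d[i]:
            s *= 1.0 - 1.0 / (n - i)     # at-risk count is n - i after sorting
        surv[at >= t[i]] = s
    return surv

rng = np.random.default_rng(0)
n = 2000
x = rng.standard_normal(n)
T = np.exp(0.5 * x + 0.2 * rng.standard_normal(n))   # true event times
C = rng.exponential(2.0, n)                          # independent censoring
Y, delta = np.minimum(T, C), (T <= C).astype(float)

# Censoring distribution via Kaplan--Meier: censoring plays the event role.
G = km_survival(Y, 1.0 - delta, Y)                   # est. P(C > Y_i)
w = delta / np.maximum(G, 1e-10)                     # IPCW weights, 0 if censored

# Weighted least squares of log Y on (1, x); censored rows get zero weight.
X = np.column_stack([np.ones(n), x])
beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * np.log(Y)))
print(beta)                                          # approx (0, 0.5)
```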

We prove that a classifier with a Barron-regular decision boundary can be approximated with a rate of high polynomial degree by ReLU neural networks with three hidden layers when a margin condition is assumed. In particular, for strong margin conditions, high-dimensional discontinuous classifiers can be approximated with a rate that is typically only achievable when approximating a low-dimensional smooth function. We demonstrate how these expression rate bounds imply fast-rate learning bounds that are close to $n^{-1}$ where $n$ is the number of samples. In addition, we carry out comprehensive numerical experimentation on binary classification problems with various margins. We study three different dimensions, with the highest dimensional problem corresponding to images from the MNIST data set.
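A minimal sketch of the approximating class in the statement, a ReLU network with three hidden layers used as a sign classifier; the widths and random parameters here are illustrative, whereas the paper's networks are explicit constructions.

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

def three_hidden_layer_classifier(x, params):
    """Binary classifier: sign of a ReLU network with three hidden layers."""
    (W1, b1), (W2, b2), (W3, b3), (w4, b4) = params
    h = relu(x @ W1 + b1)
    h = relu(h @ W2 + b2)
    h = relu(h @ W3 + b3)
    return np.sign(h @ w4 + b4)

rng = np.random.default_rng(0)
d, w = 2, 32
params = [(rng.standard_normal((d, w)), rng.standard_normal(w)),
          (rng.standard_normal((w, w)), rng.standard_normal(w)),
          (rng.standard_normal((w, w)), rng.standard_normal(w)),
          (rng.standard_normal(w), rng.standard_normal())]
x = rng.standard_normal((4, d))
print(three_hidden_layer_classifier(x, params))
```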

Among all the deterministic CholeskyQR-type algorithms, Shifted CholeskyQR3 is specifically designed to address the QR factorization of ill-conditioned matrices. This algorithm introduces a shift parameter $s$ to prevent failure during the initial Cholesky factorization step, making the choice of this parameter critical for the algorithm's effectiveness. Our goal is to identify a smaller $s$ compared to the traditional selection based on $\|X\|_{2}$. In this research, we propose a new matrix norm called the $g$-norm, which is based on the column properties of $X$. This norm allows us to obtain a reduced shift parameter $s$ for the Shifted CholeskyQR3 algorithm, thereby improving the sufficient condition of $\kappa_{2}(X)$ for this method. We provide rigorous proofs of orthogonality and residuals for the improved algorithm using our proposed $s$. Numerical experiments confirm the enhanced numerical stability of orthogonality and residuals with the reduced $s$. We find that Shifted CholeskyQR3 can effectively handle ill-conditioned $X$ with a larger $\kappa_{2}(X)$ when using our reduced $s$ compared to the original $s$. Furthermore, we compare CPU times with other algorithms to assess performance improvements.
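For orientation, a compact NumPy sketch of Shifted CholeskyQR3 with a conventional shift scaling like $u\|X\|_{2}^{2}$ ($u$ the unit roundoff); the paper's contribution, the smaller $g$-norm-based $s$, is not reproduced here, and the constant and test matrix are illustrative.

```python
import numpy as np

def cholesky_qr(X, shift=0.0):
    G = X.T @ X + shift * np.eye(X.shape[1])
    R = np.linalg.cholesky(G).T              # G = R^T R, R upper triangular
    return np.linalg.solve(R.T, X.T).T, R    # Q = X R^{-1}

def shifted_cholesky_qr3(X):
    m, n = X.shape
    u = np.finfo(X.dtype).eps
    s = 11 * (m * n + n * (n + 1)) * u * np.linalg.norm(X, 2) ** 2
    Q1, R1 = cholesky_qr(X, shift=s)         # shifted step: Cholesky succeeds
    Q2, R2 = cholesky_qr(Q1)                 # two plain steps then restore
    Q3, R3 = cholesky_qr(Q2)                 # orthogonality
    return Q3, R3 @ R2 @ R1

rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((5000, 50)))
V, _ = np.linalg.qr(rng.standard_normal((50, 50)))
X = U @ np.diag(np.logspace(0, -10, 50)) @ V      # kappa_2(X) ~ 1e10
Q, R = shifted_cholesky_qr3(X)
print(np.linalg.norm(Q.T @ Q - np.eye(50), 2))    # orthogonality error
```

Plain CholeskyQR would fail outright on this $X$, since forming $X^{T}X$ squares the condition number; the shift keeps the Gram matrix numerically positive definite.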

Testing has developed into the fundamental statistical framework for falsifying hypotheses. Unfortunately, tests are binary in nature: a test either rejects a hypothesis or it does not. Such binary decisions do not reflect the reality of many scientific studies, which often aim to present the evidence against a hypothesis and do not necessarily intend to establish a definitive conclusion. We propose a continuous generalization of a test, which we use to continuously measure the evidence against a hypothesis. Such a continuous test can be viewed as a continuous and non-randomized interpretation of the classical `randomized test'. This offers the benefits of a randomized test, without the downsides of external randomization. Another interpretation is as a literal measure, which measures the number of binary tests that reject the hypothesis. Our work unifies classical testing and the recently proposed $e$-values: $e$-values restricted to $[0, 1/\alpha]$ are continuously interpreted size-$\alpha$ randomized tests. Choosing $\alpha = 0$ yields the regular $e$-value, which we use to define a level-0 continuous test. Moreover, we generalize the traditional notion of power by using generalized means. This produces a framework that contains both classical Neyman-Pearson optimal testing and log-optimal $e$-values, as well as a continuum of other options. The traditional $p$-value appears as the reciprocal of a generally invalid continuous test. In an illustration in a Gaussian location model, we find that optimal continuous tests are of a beautifully simple form.
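As a worked toy example in the spirit of the abstract's Gaussian illustration (the details here are ours, not the paper's): a likelihood-ratio $e$-value in a Gaussian location model, truncated at $1/\alpha$, read as a continuous test $\phi = \min(\alpha E, 1) \in [0,1]$ with $\mathbb{E}_{H_0}[\phi] \le \alpha \, \mathbb{E}_{H_0}[E] \le \alpha$.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, mu1 = 0.05, 1.0
x = rng.standard_normal(100000)                 # data under H0: mu = 0
E = np.exp(mu1 * x - mu1**2 / 2)                # e-value: E_H0[E] = 1
phi = np.minimum(alpha * E, 1.0)                # continuous test in [0, 1]
print(phi.mean())                               # approx <= alpha under H0
```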

Generalized linear models are a popular tool in applied statistics, with their maximum likelihood estimators enjoying asymptotic Gaussianity and efficiency. As all models are wrong, it is desirable to understand these estimators' behaviours under model misspecification. We study semiparametric multilevel generalized linear models, where only the conditional mean of the response is taken to follow a specific parametric form. Pre-existing estimators from mixed effects models and generalized estimating equations require specification of a conditional covariance which, when misspecified, can result in inefficient estimates of the fixed effects parameters. It is nevertheless often computationally attractive to consider a restricted, finite-dimensional class of estimators, as these models naturally imply. We introduce sandwich regression, which selects the estimator of minimal variance within a parametric class of estimators over all distributions in the full semiparametric model. We demonstrate numerically on simulated and real data the attractive improvements our sandwich regression approach enjoys over classical mixed effects models and generalized estimating equations.
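As a caricature of the idea (not the paper's estimator), the sketch below computes a linear GEE estimate and its sandwich variance for each working exchangeable correlation $\rho$ on a grid, then keeps the $\rho$ minimizing the estimated variance of the slope; the clustered simulation is illustrative.

```python
import numpy as np

def gee_sandwich(Xs, ys, rho):
    """Linear GEE with exchangeable working correlation rho.
    Returns (beta_hat, sandwich covariance A^{-1} B A^{-T})."""
    q = Xs[0].shape[1]
    A, b = np.zeros((q, q)), np.zeros(q)
    for X, y in zip(Xs, ys):
        m = len(y)
        Vinv = np.linalg.inv((1 - rho) * np.eye(m) + rho * np.ones((m, m)))
        A += X.T @ Vinv @ X
        b += X.T @ Vinv @ y
    beta = np.linalg.solve(A, b)
    B = np.zeros((q, q))
    for X, y in zip(Xs, ys):
        m = len(y)
        Vinv = np.linalg.inv((1 - rho) * np.eye(m) + rho * np.ones((m, m)))
        u = X.T @ Vinv @ (y - X @ beta)     # per-cluster estimating function
        B += np.outer(u, u)
    Ainv = np.linalg.inv(A)
    return beta, Ainv @ B @ Ainv.T

# Simulate clustered data with a random intercept (true ICC = 0.8).
rng = np.random.default_rng(0)
Xs, ys = [], []
for _ in range(200):
    X = np.column_stack([np.ones(5), rng.standard_normal(5)])
    y = X @ np.array([1.0, 2.0]) + rng.standard_normal() \
        + 0.5 * rng.standard_normal(5)
    Xs.append(X)
    ys.append(y)

# Pick the working rho minimizing the estimated variance of the slope.
rhos = np.linspace(0.0, 0.9, 10)
best = min(rhos, key=lambda r: gee_sandwich(Xs, ys, r)[1][1, 1])
print(best)                                 # should land near the true ICC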

Generating series are crucial in enumerative combinatorics, analytic combinatorics, and combinatorics on words. Though it might seem at first glance that generating Dirichlet series are less used in these fields than ordinary and exponential generating series, there are many notable papers where they play a fundamental role, as can be seen in particular in the work of Flajolet and several of his co-authors. In this paper, we study Dirichlet series of integers with missing digits or blocks of digits in some integer base $b$; i.e., where the summation ranges over the integers whose expansions form some language strictly included in the set of all words over the alphabet $\{0, 1, \dots, b-1\}$ that do not begin with a $0$. We show how to unify and extend results proved by Nathanson in 2021 and by K\"ohler and Spilker in 2009. En route, we encounter several sequences from Sloane's On-Line Encyclopedia of Integer Sequences, as well as some famous $b$-automatic sequences or $b$-regular sequences. We also consider a specific sequence that is not $b$-regular.
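For a feel for the objects involved, here is a naive partial sum of the Dirichlet series $\sum' n^{-s}$ over integers whose base-10 expansion avoids the digit 9, evaluated at $s = 2$; plain truncation is slow to converge, which is one reason exact methods for such series are of interest.

```python
def missing_digit_partial_sum(s, digit="9", limit=10**6):
    """Truncated F(s) = sum of n^(-s) over n < limit avoiding `digit` in base 10."""
    return sum(1.0 / n ** s for n in range(1, limit) if digit not in str(n))

print(missing_digit_partial_sum(2))   # compare: zeta(2) = pi^2/6 ~ 1.6449
```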

Alignment is a social phenomenon wherein individuals share a common goal or perspective. Mirroring, or mimicking the behaviors and opinions of another individual, is one mechanism by which individuals can become aligned. Large-scale investigations of the effect of mirroring on alignment have been limited by the poor scalability of traditional experimental designs in sociology. In this paper, we introduce a simple computational framework that enables studying the effect of mirroring behavior on alignment in multi-agent systems. We simulate systems of interacting large language models in this framework and characterize overall system behavior and alignment with quantitative measures of agent dynamics. We find that system behavior is strongly influenced by the range of communication of each agent and that these effects are exacerbated by increased rates of mirroring. We discuss the observed simulated system behavior in the context of known human social dynamics.
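The paper simulates interacting large language models; as a stand-in that keeps only the framework's two knobs (communication range and mirroring rate), here is a scalar opinion-dynamics sketch of our own devising, where alignment is read off the spread of opinions.

```python
import numpy as np

def simulate(n=50, radius=5, mirror_rate=0.5, steps=2000, seed=0):
    rng = np.random.default_rng(seed)
    op = rng.uniform(-1, 1, n)                  # initial opinions on a line
    for _ in range(steps):
        i = rng.integers(n)
        nbrs = np.arange(max(0, i - radius), min(n, i + radius + 1))
        if rng.random() < mirror_rate:          # mirror the local neighborhood
            op[i] += 0.5 * (op[nbrs].mean() - op[i])
    return op.std()                             # low spread = high alignment

for rate in (0.1, 0.5, 0.9):
    print(rate, simulate(mirror_rate=rate))
```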

One-dimensional Poincaré inequalities are used in Global Sensitivity Analysis (GSA) to provide derivative-based upper bounds and approximations of Sobol indices. We add new perspectives by investigating weighted Poincaré inequalities. Our contributions are twofold. In the first part, we provide new theoretical results for weighted Poincaré inequalities, guided by GSA needs. We revisit the construction of weights from monotonic functions, providing a new proof from a spectral point of view. In this approach, given a monotonic function $g$, the weight is built such that $g$ is the first non-trivial eigenfunction of a convenient diffusion operator. This allows us to reconsider the linear standard, i.e. the weight associated with a linear $g$. In particular, we construct weights that guarantee the existence of an orthonormal basis of eigenfunctions, leading to approximations of Sobol indices with Parseval formulas. In the second part, we develop specific methods for GSA. We study the equality case of the upper bound of a total Sobol index, and link the sharpness of the inequality to the proximity of the main effect to the eigenfunction. This leads us to theoretically investigate the construction of data-driven weights from estimators of the main effects when they are monotonic, another extension of the linear standard. Finally, we illustrate the benefits of using weights in a GSA study of two toy models and a real flooding application, involving the Poincaré constant and/or the whole eigenbasis.
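As a small numerical companion (our toy example, not one of the paper's case studies): for independent standard Gaussian inputs the Poincaré constant is 1, so the variance-normalized mean squared partial derivative upper-bounds each total Sobol index.

```python
import numpy as np

# Toy model f and its analytic partial derivatives.
f     = lambda x: x[:, 0] + 0.5 * x[:, 1] ** 2 + np.sin(x[:, 2])
grads = lambda x: np.column_stack([np.ones(len(x)), x[:, 1], np.cos(x[:, 2])])

rng = np.random.default_rng(0)
X = rng.standard_normal((200000, 3))            # standard Gaussian inputs
var_f = f(X).var()
dgsm_bound = (grads(X) ** 2).mean(axis=0) / var_f   # one upper bound per input
print(dgsm_bound)                               # each >= the total Sobol index
```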
