
Classification of $N$ points becomes a simultaneous control problem when viewed through the lens of neural ordinary differential equations (neural ODEs), which represent the time-continuous limit of residual networks. For the narrow model, with one neuron per hidden layer, it has been shown that the task can be achieved using $O(N)$ neurons. In this study, we focus on estimating the number of neurons required for efficient cluster-based classification, particularly in the worst-case scenario where points are independently and uniformly distributed in $[0,1]^d$. Our analysis provides a novel method for quantifying the probability of requiring fewer than $O(N)$ neurons, emphasizing the asymptotic behavior as both $d$ and $N$ increase. Additionally, under the sole assumption that the data are in general position, we propose a new constructive algorithm that simultaneously classifies clusters of $d$ points from any initial configuration, effectively reducing the maximal complexity to $O(N/d)$ neurons.
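As a rough illustration of the control viewpoint (our sketch, not the paper's construction), the code below discretizes the narrow neural ODE $\dot x(t) = w(t)\,\sigma(\langle a(t), x(t)\rangle + b(t))$ with piecewise-constant controls, one "neuron" per time slice; all parameter choices are illustrative.

```python
import numpy as np

def narrow_flow(points, controls, h=0.1, steps_per_neuron=10):
    """Explicit-Euler discretization of the narrow neural ODE
    x'(t) = w(t) * relu(a(t) . x(t) + b(t)) with piecewise-constant
    controls: one (w, a, b) triple -- one 'neuron' -- per time slice.
    All N points are transported simultaneously by the same field."""
    x = points.copy()
    for w, a, b in controls:
        for _ in range(steps_per_neuron):
            # the ReLU gates the field: only points with a.x + b > 0 move
            x = x + h * np.outer(np.maximum(x @ a + b, 0.0), w)
    return x

# Toy example: 8 points in [0,1]^2; a single neuron whose activation
# hyperplane x_1 = 0.5 pushes the right-hand cluster further right.
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 1.0, size=(8, 2))
controls = [(np.array([1.0, 0.0]),   # w: direction of motion
             np.array([1.0, 0.0]),   # a: normal of the activation hyperplane
             -0.5)]                  # b: offset
print(narrow_flow(pts, controls)[:, 0])
```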

Related content

The generating function approach to entropy has become popular because it generates several well-known entropy measures discussed in the literature. In this work, we define the weighted cumulative residual entropy generating function (WCREGF) and study its properties. We then introduce the dynamic weighted cumulative residual entropy generating function (DWCREGF). It is shown that the DWCREGF determines the distribution uniquely. We study some characterization results using the relationship between the DWCREGF and the hazard rate and/or the mean residual life function. Using a characterization based on the DWCREGF, we develop a new goodness-of-fit test for the Rayleigh distribution. A Monte Carlo simulation study is conducted to evaluate the proposed test. Finally, the test is illustrated using two real data sets.
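The abstract does not reproduce the DWCREGF test statistic, so the sketch below shows only the surrounding parametric-bootstrap machinery one would typically use for the Monte Carlo evaluation of such a Rayleigh goodness-of-fit test; `statistic` is a hypothetical stand-in for the paper's statistic.

```python
import numpy as np

def rayleigh_gof_pvalue(x, statistic, n_mc=2000, seed=0):
    """Parametric-bootstrap (Monte Carlo) goodness-of-fit test for the
    Rayleigh distribution. `statistic` is a hypothetical stand-in for
    the paper's DWCREGF-based statistic; only the machinery for
    calibrating it under the null is shown here."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    sigma_hat = np.sqrt(np.mean(x ** 2) / 2.0)   # MLE of the Rayleigh scale
    t_obs = statistic(x / sigma_hat)             # scale-free observed value
    t_null = np.empty(n_mc)
    for b in range(n_mc):
        xb = rng.rayleigh(scale=1.0, size=n)     # sample under the null
        s_b = np.sqrt(np.mean(xb ** 2) / 2.0)
        t_null[b] = statistic(xb / s_b)
    return np.mean(t_null >= t_obs)              # one-sided MC p-value
```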

We propose a new loss function for supervised and physics-informed training of neural networks and operators that incorporates an a posteriori error estimate. More specifically, during the training stage, the neural network learns additional physical fields that lead to rigorous error majorants after a computationally cheap postprocessing stage. The theoretical results rest on the theory of functional a posteriori error estimates, which allows for the systematic construction of such loss functions for a diverse class of practically relevant partial differential equations. On the numerical side, we demonstrate on a series of elliptic problems that, for a variety of architectures and approaches (physics-informed neural networks, physics-informed neural operators, neural operators, and classical architectures in the regression and physics-informed settings), we reach better or comparable accuracy and, in addition, cheaply recover high-quality upper bounds on the error after training.
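For concreteness, one classical instance of such a functional majorant (Repin's estimate for the Poisson problem $-\Delta u = f$ with homogeneous Dirichlet data; whether this exact form is used for every problem class in the paper is our assumption) reads, for any conforming approximation $v$ and any auxiliary flux field $y \in H(\operatorname{div};\Omega)$,

\[
\|\nabla(v - u)\|_{L^2(\Omega)} \;\le\; \|y - \nabla v\|_{L^2(\Omega)} \;+\; C_F\,\|\operatorname{div} y + f\|_{L^2(\Omega)},
\]

where $C_F$ is the Friedrichs constant of $\Omega$. Training a network to produce both $v$ and the extra field $y$, and minimizing the (squared) right-hand side, yields a loss whose value is itself a guaranteed upper bound on the energy error.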

Classical Krylov subspace projection methods for the solution of the linear problem $Ax = b$ output an approximate solution $\widetilde{x}\simeq x$. Recently, it has been recognized that projection methods can be understood from a statistical perspective. These probabilistic projection methods return a distribution $p(\widetilde{x})$ in place of a point estimate $\widetilde{x}$. The resulting uncertainty, codified as a distribution, can, in theory, be meaningfully combined with other uncertainties, propagated through computational pipelines, and used in the framework of probabilistic decision theory. The problem we address is that current probabilistic projection methods lead to poorly calibrated posterior distributions. We improve the covariance matrix from previous works so that it does not contain such undesirable objects as $A^{-1}$ or $A^{-1}A^{-T}$, produces nontrivial uncertainty, and reproduces an arbitrary projection method as the mean of the posterior distribution. We also propose a variant that is numerically inexpensive when the uncertainty is calibrated a priori. Since it usually is not, we put forward a practical way to calibrate uncertainty that performs reasonably well, albeit at the expense of roughly doubling the numerical cost of the underlying projection method.
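A minimal numpy sketch of the generic probabilistic-projection posterior may help fix ideas; this is the standard Gaussian-conditioning construction, not the paper's improved covariance, and the prior and search directions `S` are illustrative choices.

```python
import numpy as np

def bayesian_projection_solver(A, b, S, x0, Sigma0):
    """Gaussian posterior over the solution of Ax = b after observing the
    projected residual S^T (b - A x). With suitable choices of prior and
    search directions S, the posterior mean reproduces a classical
    projection iterate; the covariance encodes remaining uncertainty.
    (Generic construction, not the paper's improved covariance.)"""
    r0 = b - A @ x0
    AS = A.T @ S                          # (n, m)
    G = AS.T @ Sigma0 @ AS                # Gram matrix of observations, (m, m)
    W = np.linalg.solve(G, S.T)           # G^{-1} S^T, (m, n)
    mean = x0 + Sigma0 @ AS @ W @ r0
    cov = Sigma0 - Sigma0 @ AS @ W @ A @ Sigma0
    return mean, cov

# Example: SPD system, 5 random search directions, unit-variance prior
rng = np.random.default_rng(1)
n, m = 50, 5
M = rng.normal(size=(n, n)); A = M @ M.T + n * np.eye(n)
b = rng.normal(size=n); S = rng.normal(size=(n, m))
mean, cov = bayesian_projection_solver(A, b, S, np.zeros(n), np.eye(n))
```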

We propose a generalization of nonlinear stability of numerical one-step integrators to Riemannian manifolds in the spirit of Butcher's notion of B-stability. Taking inspiration from Simpson-Porco and Bullo, we introduce non-expansive systems on such manifolds and define B-stability of integrators. In this first exposition, we provide concrete results for a geodesic version of the Implicit Euler (GIE) scheme. We prove that the GIE method is B-stable on Riemannian manifolds with non-positive sectional curvature. We show through numerical examples that the GIE method is expansive when applied to a certain non-expansive vector field on the 2-sphere, and that the GIE method does not necessarily possess a unique solution for large enough step sizes. Finally, we derive a new improved global error estimate for general Lie group integrators.
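Below is a sketch of a GIE step on the unit 2-sphere, assuming the implicit relation $\exp_{x_{k+1}}(-h\,f(x_{k+1})) = x_k$ and solving it by naive fixed-point iteration; consistent with the non-uniqueness remark above, the iteration can stall or converge to a spurious root for large $h$.

```python
import numpy as np

def exp_map(x, v):
    """Exponential map on the unit 2-sphere."""
    nv = np.linalg.norm(v)
    if nv < 1e-14:
        return x
    return np.cos(nv) * x + np.sin(nv) * v / nv

def log_map(x, y):
    """Riemannian logarithm on the unit 2-sphere (inverse of exp_map)."""
    c = np.clip(x @ y, -1.0, 1.0)
    w = y - c * x
    nw = np.linalg.norm(w)
    return np.arccos(c) * w / nw if nw > 1e-14 else np.zeros_like(x)

def gie_step(f, x, h, iters=50, tol=1e-12):
    """One Geodesic Implicit Euler step: solve log_y(x) + h f(y) = 0 for y
    (equivalently exp_y(-h f(y)) = x) by fixed-point iteration."""
    y = x
    for _ in range(iters):
        y_new = exp_map(y, log_map(y, x) + h * f(y))
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new / np.linalg.norm(y_new)   # guard against round-off drift
    return y

# Example: the rotation field f(x) = e_3 x x (an isometry flow, hence
# non-expansive on the sphere); f(x) is tangent to the sphere at x.
f = lambda x: np.cross(np.array([0.0, 0.0, 1.0]), x)
x = np.array([1.0, 0.0, 0.0])
for _ in range(100):
    x = gie_step(f, x, h=0.1)
```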

Objective: Prediction models are popular in medical research and practice. By predicting an outcome of interest for specific patients, these models may help inform difficult treatment decisions, and they are often hailed as the poster children for personalized, data-driven healthcare. Many prediction models are deployed for decision support based on their prediction accuracy in validation studies. We investigate whether this is a safe and valid approach. Materials and Methods: We show that using prediction models for decision making can lead to harmful decisions, even when the predictions exhibit good discrimination after deployment. Such models are harmful self-fulfilling prophecies: their deployment harms a group of patients, but the worse outcomes of these patients do not invalidate the predictive power of the model. Results: Our main result is a formal characterization of a set of such prediction models. We then show that models that are well calibrated both before and after deployment are useless for decision making, because their deployment left the data distribution unchanged. Discussion: Our results point to the need to revise standard practices for the validation, deployment, and evaluation of prediction models used in medical decisions. Conclusion: Outcome prediction models can yield harmful self-fulfilling prophecies when used for decision making; a new perspective on prediction model development, deployment, and monitoring is needed.
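A toy simulation (ours, not the paper's formal characterization) illustrates the mechanism: withholding an effective treatment from patients flagged as high risk raises exactly their risk, so the model keeps good discrimination after deployment while causing avoidable harm. The threshold, harm size, and risk model below are all invented for illustration.

```python
import numpy as np

def auc(scores, labels):
    """Rank-based AUC: probability a positive case outranks a negative one."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum(); n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)                      # severity marker
base_risk = 1 / (1 + np.exp(-(x - 1)))     # risk if everyone is treated
pred = base_risk                            # a calibrated pre-deployment model

# Deployment policy: withhold an effective treatment above a risk threshold,
# which raises the true risk of exactly the flagged group.
withheld = pred > 0.5
harm = 0.25
post_risk = np.clip(base_risk + harm * withheld, 0, 1)
y_post = rng.uniform(size=n) < post_risk    # outcomes after deployment

print(f"AUC after deployment: {auc(pred, y_post):.3f}")   # still high
print(f"approx. excess event rate: {harm * withheld.mean():.3%}")
```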

Bayesian networks are widely utilised in various fields, offering elegant representations of factorisations and causal relationships. We use surjective functions to reduce the dimensionality of Bayesian networks by combining states, and we study the preservation of their factorisation structure. We introduce and define the corresponding notions, analyse their properties, and provide examples of highly symmetric special cases, enhancing the understanding of the fundamental properties of such reductions for Bayesian networks. We also discuss the connection between these reductions and reductions of homogeneous and non-homogeneous Markov chains.
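The Markov-chain analogue of such a reduction is easy to make concrete. In the sketch below, a surjection `phi` combines the states of a transition matrix, and the reduction preserves the Markov structure exactly precisely when the chain is strongly lumpable; the example matrix is an illustrative highly symmetric case, not one from the paper.

```python
import numpy as np

def lump(P, phi, k):
    """Collapse the states of a Markov chain with transition matrix P via
    the surjection phi: {0,...,n-1} -> {0,...,k-1}. Returns the lumped
    transition matrix and whether the chain is strongly lumpable, i.e.
    whether the reduced process is Markov for every initial distribution."""
    phi = np.asarray(phi)
    n = P.shape[0]
    B = np.zeros((n, k))
    for j in range(n):                 # aggregate destination probabilities
        B[:, phi[j]] += P[:, j]        # B[i, c] = P(jump from i into block c)
    lumpable = all(
        np.allclose(B[phi == c], B[phi == c][0]) for c in range(k)
    )
    Q = np.vstack([B[phi == c].mean(axis=0) for c in range(k)])
    return Q, lumpable

# A symmetric 4-state chain folded onto 2 super-states
P = np.array([[0.5, 0.2, 0.2, 0.1],
              [0.2, 0.5, 0.1, 0.2],
              [0.2, 0.1, 0.5, 0.2],
              [0.1, 0.2, 0.2, 0.5]])
Q, ok = lump(P, phi=[0, 0, 1, 1], k=2)
print(Q)        # [[0.7, 0.3], [0.3, 0.7]]
print(ok)       # True: the reduction preserves the Markov structure
```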

Understanding the mechanisms through which neural networks extract statistics from input-label pairs is one of the most important unsolved problems in supervised learning. Prior works have identified that the Gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, a statement known as the Neural Feature Ansatz (NFA). However, the reason these quantities become correlated during training is poorly understood. In this work, we explain the emergence of this correlation. We identify that the NFA is equivalent to alignment between the left singular structure of the weight matrices and a significant component of the empirical neural tangent kernels associated with those weights. We establish that the NFA introduced in prior works is driven by a centered NFA that isolates this alignment. We show that the speed of NFA development can be predicted analytically at early training times in terms of simple statistics of the inputs and labels. Finally, we introduce a simple intervention to increase NFA correlation at any given layer, which dramatically improves the quality of features learned.
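A small numpy experiment (our illustrative setup, not the paper's) shows the quantities involved: for a one-hidden-layer ReLU network $f(x) = v^\top \mathrm{relu}(Wx)$, the NFA correlation is the cosine similarity between the first-layer Gram matrix $W^\top W$ and the average gradient outer product (AGOP) of $f$ with respect to the input, and on a low-rank target it typically grows during training.

```python
import numpy as np

def nfa_correlation(W, X, v):
    """Cosine similarity between W^T W (the weight Gram matrix) and the
    AGOP of f(x) = v^T relu(W x) over the data X -- the correlation
    asserted by the Neural Feature Ansatz for the first layer."""
    agop = np.zeros((W.shape[1], W.shape[1]))
    for x in X:
        g = W.T @ (v * (W @ x > 0))          # input gradient of f at x
        agop += np.outer(g, g)
    agop /= len(X)
    G = W.T @ W
    return (G * agop).sum() / (np.linalg.norm(G) * np.linalg.norm(agop))

rng = np.random.default_rng(0)
n, d, h = 512, 10, 64
X = rng.normal(size=(n, d))
y = X[:, 0] * X[:, 1]                        # a low-rank target
W = rng.normal(size=(h, d)) / np.sqrt(d)
v = rng.normal(size=h) / np.sqrt(h)

print("before training:", nfa_correlation(W, X, v))
lr = 0.05
for _ in range(500):                         # plain gradient descent on MSE
    H = np.maximum(X @ W.T, 0.0)             # hidden activations, (n, h)
    r = H @ v - y                            # residuals
    gv = 2 * H.T @ r / n
    gW = 2 * ((r[:, None] * (X @ W.T > 0)) * v).T @ X / n
    v -= lr * gv; W -= lr * gW
print("after training: ", nfa_correlation(W, X, v))
```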

We formulate a uniform tail bound for empirical processes indexed by a class of functions, in terms of the individual deviations of the functions rather than the worst-case deviation in the considered class. The tail bound is established by introducing an initial "deflation" step to the standard generic chaining argument. The resulting tail bound is the sum of the complexity of the "deflated function class", in terms of a generalization of Talagrand's $\gamma$ functional, and the deviation of the function instance, both of which are formulated based on the natural seminorm induced by the corresponding Cram\'{e}r functions. We also provide approximations of this seminorm when the function class lies in a given Orlicz space of exponential type, which can be used to make the complexity term and the deviation term more explicit.
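Schematically (our paraphrase of the abstract, suppressing constants and the precise definitions), such a bound has the shape: with probability at least $1-\delta$,

\[
\sup_{f \in \mathcal{F}} \big[ (P_n - P)f - \mathrm{dev}_\delta(f) \big] \;\lesssim\; \gamma(\mathcal{F}_0, \|\cdot\|),
\]

where $\mathcal{F}_0$ is the deflated function class, $\gamma$ is the generalized Talagrand functional, $\mathrm{dev}_\delta(f)$ is the individual deviation of $f$, and both terms are measured in the seminorm induced by the Cram\'{e}r functions.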

The Spatial AutoRegressive (SAR) model is commonly used in studies involving spatial and network data to estimate the spatial or network peer influence and the effects of covariates on the response, taking into account the spatial or network dependence. While the model can be efficiently estimated with a quasi-maximum likelihood approach (QMLE), the detrimental effect of covariate measurement error on the QMLE, and how to remedy it, is currently unknown. If covariates are measured with error, then the QMLE may lose $\sqrt{n}$-consistency and may even be inconsistent, even when each node is influenced by only a limited number of other nodes or spatial units. We develop a measurement error-corrected ML estimator (ME-QMLE) for the parameters of the SAR model when covariates are measured with error. The ME-QMLE possesses statistical consistency and asymptotic normality properties. We consider two types of applications. The first is when the true covariate cannot be measured directly and a proxy is observed instead. The second involves latent homophily factors, estimated with error from the network, that are included when estimating peer influence. Our numerical results verify the bias correction property of the estimator and the accuracy of the standard error estimates in finite samples. We illustrate the method on a real dataset of county-level death rates from the COVID-19 pandemic.
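In display form, the setting is the standard SAR model with an error-prone covariate (our notation):

\[
y = \rho W y + X\beta + \varepsilon, \qquad X^{\ast} = X + U,
\]

where $W$ is the known spatial or network weight matrix, $\rho$ is the peer-influence parameter, and the true covariates $X$ are observed only through the noisy proxy $X^{\ast}$. Plugging $X^{\ast}$ into the quasi-likelihood as if it were $X$ is what produces the bias that the ME-QMLE corrects.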

We consider nonparametric Bayesian inference in a multidimensional diffusion model with reflecting boundary conditions based on discrete high-frequency observations. We prove a general posterior contraction rate theorem in $L^2$-loss, which is applied to Gaussian priors. The resulting posteriors, as well as their posterior means, are shown to converge to the ground truth at the minimax optimal rate over H\"older smoothness classes in any dimension. Of independent interest and as part of our proofs, we show that certain frequentist penalized least squares estimators are also minimax optimal.
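A one-dimensional sketch of such a penalized least squares drift estimator, under our simplifying assumptions (scalar reflected diffusion on $[0,1]$, unit diffusivity, cosine basis chosen to match the reflecting boundary; the paper treats the multidimensional case):

```python
import numpy as np

def drift_pls(X, dt, n_basis=20, lam=1e-2):
    """Ridge-penalized least squares drift estimate for a scalar diffusion
    on [0,1] observed at time step dt: regress increments on a cosine
    (Neumann-type) basis, matching the reflecting boundary conditions."""
    x, dX = X[:-1], np.diff(X) / dt
    ks = np.arange(n_basis)
    Phi = np.cos(np.pi * np.outer(x, ks))            # design matrix
    pen = lam * np.diag(ks.astype(float) ** 4)       # smoothness penalty
    coef = np.linalg.solve(dt * Phi.T @ Phi + pen, dt * Phi.T @ dX)
    return lambda s: np.cos(np.pi * np.outer(np.atleast_1d(s), ks)) @ coef

# Example: Euler scheme for dX = sin(2*pi*X) dt + dW, reflected at 0 and 1
rng = np.random.default_rng(0); dt, n = 1e-3, 100_000
X = np.empty(n); X[0] = 0.5
for i in range(1, n):
    z = X[i-1] + np.sin(2 * np.pi * X[i-1]) * dt + np.sqrt(dt) * rng.normal()
    if z < 0: z = -z                                 # reflect at 0
    if z > 1: z = 2 - z                              # reflect at 1
    X[i] = z
b_hat = drift_pls(X, dt)
print(b_hat(np.array([0.25])))   # true drift there is sin(pi/2) = 1
```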
