
Classification of $N$ points becomes a simultaneous control problem when viewed through the lens of neural ordinary differential equations (neural ODEs), which represent the time-continuous limit of residual networks. For the narrow model, with one neuron per hidden layer, it has been shown that the task can be achieved using $O(N)$ neurons. In this study, we focus on estimating the number of neurons required for efficient cluster-based classification, particularly in the worst-case scenario where points are independently and uniformly distributed in $[0,1]^d$. Our analysis provides a novel method for quantifying the probability of requiring fewer than $O(N)$ neurons, emphasizing the asymptotic behavior as both $d$ and $N$ increase. Additionally, under the sole assumption that the data are in general position, we propose a new constructive algorithm that simultaneously classifies clusters of $d$ points from any initial configuration, effectively reducing the maximal complexity to $O(N/d)$ neurons.
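As a rough illustration of the control viewpoint (our sketch, not the paper's construction), the code below discretizes the narrow neural ODE $\dot x(t) = w(t)\,\sigma(\langle a(t), x(t)\rangle + b(t))$ with piecewise-constant controls, one "neuron" per time slice; all parameter choices are illustrative.

```python
import numpy as np

def narrow_flow(points, controls, h=0.1, steps_per_neuron=10):
    """Explicit-Euler discretization of the narrow neural ODE
    x'(t) = w(t) * relu(a(t) . x(t) + b(t)) with piecewise-constant
    controls: one (w, a, b) triple -- one 'neuron' -- per time slice.
    All N points are transported simultaneously by the same field."""
    x = points.copy()
    for w, a, b in controls:
        for _ in range(steps_per_neuron):
            # the ReLU gates the field: only points with a.x + b > 0 move
            x = x + h * np.outer(np.maximum(x @ a + b, 0.0), w)
    return x

# Toy example: 8 points in [0,1]^2; a single neuron whose activation
# hyperplane x_1 = 0.5 pushes the right-hand cluster further right.
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 1.0, size=(8, 2))
controls = [(np.array([1.0, 0.0]),   # w: direction of motion
             np.array([1.0, 0.0]),   # a: normal of the activation hyperplane
             -0.5)]                  # b: offset
print(narrow_flow(pts, controls)[:, 0])
```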

Related content

The generating function approach to entropy has become popular because it generates several well-known entropy measures discussed in the literature. In this work, we define the weighted cumulative residual entropy generating function (WCREGF) and study its properties. We then introduce the dynamic weighted cumulative residual entropy generating function (DWCREGF). It is shown that the DWCREGF determines the distribution uniquely. We study some characterization results using the relationship between the DWCREGF and the hazard rate and/or the mean residual life function. Using a characterization based on the DWCREGF, we develop a new goodness-of-fit test for the Rayleigh distribution. A Monte Carlo simulation study is conducted to evaluate the proposed test. Finally, the test is illustrated using two real data sets.
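The abstract does not reproduce the DWCREGF test statistic, so the sketch below shows only the surrounding parametric-bootstrap machinery one would typically use for the Monte Carlo evaluation of such a Rayleigh goodness-of-fit test; `statistic` is a hypothetical stand-in for the paper's statistic.

```python
import numpy as np

def rayleigh_gof_pvalue(x, statistic, n_mc=2000, seed=0):
    """Parametric-bootstrap (Monte Carlo) goodness-of-fit test for the
    Rayleigh distribution. `statistic` is a hypothetical stand-in for
    the paper's DWCREGF-based statistic; only the machinery for
    calibrating it under the null is shown here."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    sigma_hat = np.sqrt(np.mean(x ** 2) / 2.0)   # MLE of the Rayleigh scale
    t_obs = statistic(x / sigma_hat)             # scale-free observed value
    t_null = np.empty(n_mc)
    for b in range(n_mc):
        xb = rng.rayleigh(scale=1.0, size=n)     # sample under the null
        s_b = np.sqrt(np.mean(xb ** 2) / 2.0)
        t_null[b] = statistic(xb / s_b)
    return np.mean(t_null >= t_obs)              # one-sided MC p-value
```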

We propose a new loss function for supervised and physics-informed training of neural networks and operators that incorporates an a posteriori error estimate. More specifically, during the training stage, the neural network learns additional physical fields that lead to rigorous error majorants after a computationally cheap postprocessing stage. The theoretical results rest on the theory of functional a posteriori error estimates, which allows for the systematic construction of such loss functions for a diverse class of practically relevant partial differential equations. On the numerical side, we demonstrate on a series of elliptic problems that, for a variety of architectures and approaches (physics-informed neural networks, physics-informed neural operators, neural operators, and classical architectures in the regression and physics-informed settings), we reach better or comparable accuracy and, in addition, cheaply recover high-quality upper bounds on the error after training.
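For concreteness, one classical instance of such a functional majorant (Repin's estimate for the Poisson problem $-\Delta u = f$ with homogeneous Dirichlet data; whether this exact form is used for every problem class in the paper is our assumption) reads, for any conforming approximation $v$ and any auxiliary flux field $y \in H(\operatorname{div};\Omega)$,

\[
\|\nabla(v - u)\|_{L^2(\Omega)} \;\le\; \|y - \nabla v\|_{L^2(\Omega)} \;+\; C_F\,\|\operatorname{div} y + f\|_{L^2(\Omega)},
\]

where $C_F$ is the Friedrichs constant of $\Omega$. Training a network to produce both $v$ and the extra field $y$, and minimizing the (squared) right-hand side, yields a loss whose value is itself a guaranteed upper bound on the energy error.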

Classical Krylov subspace projection methods for the solution of the linear problem $Ax = b$ output an approximate solution $\widetilde{x}\simeq x$. Recently, it has been recognized that projection methods can be understood from a statistical perspective. These probabilistic projection methods return a distribution $p(\widetilde{x})$ in place of a point estimate $\widetilde{x}$. The resulting uncertainty, codified as a distribution, can, in theory, be meaningfully combined with other uncertainties, propagated through computational pipelines, and used in the framework of probabilistic decision theory. The problem we address is that current probabilistic projection methods lead to poorly calibrated posterior distributions. We improve the covariance matrix from previous works so that it does not contain such undesirable objects as $A^{-1}$ or $A^{-1}A^{-T}$, produces nontrivial uncertainty, and reproduces an arbitrary projection method as the mean of the posterior distribution. We also propose a variant that is numerically inexpensive when the uncertainty is calibrated a priori. Since it usually is not, we put forward a practical way to calibrate uncertainty that performs reasonably well, albeit at the expense of roughly doubling the numerical cost of the underlying projection method.
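A minimal numpy sketch of the generic probabilistic-projection posterior may help fix ideas; this is the standard Gaussian-conditioning construction, not the paper's improved covariance, and the prior and search directions `S` are illustrative choices.

```python
import numpy as np

def bayesian_projection_solver(A, b, S, x0, Sigma0):
    """Gaussian posterior over the solution of Ax = b after observing the
    projected residual S^T (b - A x). With suitable choices of prior and
    search directions S, the posterior mean reproduces a classical
    projection iterate; the covariance encodes remaining uncertainty.
    (Generic construction, not the paper's improved covariance.)"""
    r0 = b - A @ x0
    AS = A.T @ S                          # (n, m)
    G = AS.T @ Sigma0 @ AS                # Gram matrix of observations, (m, m)
    W = np.linalg.solve(G, S.T)           # G^{-1} S^T, (m, n)
    mean = x0 + Sigma0 @ AS @ W @ r0
    cov = Sigma0 - Sigma0 @ AS @ W @ A @ Sigma0
    return mean, cov

# Example: SPD system, 5 random search directions, unit-variance prior
rng = np.random.default_rng(1)
n, m = 50, 5
M = rng.normal(size=(n, n)); A = M @ M.T + n * np.eye(n)
b = rng.normal(size=n); S = rng.normal(size=(n, m))
mean, cov = bayesian_projection_solver(A, b, S, np.zeros(n), np.eye(n))
```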

We propose a generalization of nonlinear stability of numerical one-step integrators to Riemannian manifolds in the spirit of Butcher's notion of B-stability. Taking inspiration from Simpson-Porco and Bullo, we introduce non-expansive systems on such manifolds and define B-stability of integrators. In this first exposition, we provide concrete results for a geodesic version of the Implicit Euler (GIE) scheme. We prove that the GIE method is B-stable on Riemannian manifolds with non-positive sectional curvature. We show through numerical examples that the GIE method is expansive when applied to a certain non-expansive vector field on the 2-sphere, and that the GIE method does not necessarily possess a unique solution for large enough step sizes. Finally, we derive a new improved global error estimate for general Lie group integrators.
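Below is a sketch of a GIE step on the unit 2-sphere, assuming the implicit relation $\exp_{x_{k+1}}(-h\,f(x_{k+1})) = x_k$ and solving it by naive fixed-point iteration; consistent with the non-uniqueness remark above, the iteration can stall or converge to a spurious root for large $h$.

```python
import numpy as np

def exp_map(x, v):
    """Exponential map on the unit 2-sphere."""
    nv = np.linalg.norm(v)
    if nv < 1e-14:
        return x
    return np.cos(nv) * x + np.sin(nv) * v / nv

def log_map(x, y):
    """Riemannian logarithm on the unit 2-sphere (inverse of exp_map)."""
    c = np.clip(x @ y, -1.0, 1.0)
    w = y - c * x
    nw = np.linalg.norm(w)
    return np.arccos(c) * w / nw if nw > 1e-14 else np.zeros_like(x)

def gie_step(f, x, h, iters=50, tol=1e-12):
    """One Geodesic Implicit Euler step: solve log_y(x) + h f(y) = 0 for y
    (equivalently exp_y(-h f(y)) = x) by fixed-point iteration."""
    y = x
    for _ in range(iters):
        y_new = exp_map(y, log_map(y, x) + h * f(y))
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new / np.linalg.norm(y_new)   # guard against round-off drift
    return y

# Example: the rotation field f(x) = e_3 x x (an isometry flow, hence
# non-expansive on the sphere); f(x) is tangent to the sphere at x.
f = lambda x: np.cross(np.array([0.0, 0.0, 1.0]), x)
x = np.array([1.0, 0.0, 0.0])
for _ in range(100):
    x = gie_step(f, x, h=0.1)
```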

Objective: Prediction models are popular in medical research and practice. By predicting an outcome of interest for specific patients, these models may help inform difficult treatment decisions, and they are often hailed as the poster children for personalized, data-driven healthcare. Many prediction models are deployed for decision support based on their prediction accuracy in validation studies. We investigate whether this is a safe and valid approach. Materials and Methods: We show that using prediction models for decision making can lead to harmful decisions, even when the predictions exhibit good discrimination after deployment. Such models are harmful self-fulfilling prophecies: their deployment harms a group of patients, but the worse outcomes of these patients do not invalidate the predictive power of the model. Results: Our main result is a formal characterization of a set of such prediction models. We then show that models that are well calibrated both before and after deployment are useless for decision making, because their deployment left the data distribution unchanged. Discussion: Our results point to the need to revise standard practices for the validation, deployment, and evaluation of prediction models used in medical decisions. Conclusion: Outcome prediction models can yield harmful self-fulfilling prophecies when used for decision making; a new perspective on prediction model development, deployment, and monitoring is needed.
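A toy simulation (ours, not the paper's formal characterization) illustrates the mechanism: withholding an effective treatment from patients flagged as high risk raises exactly their risk, so the model keeps good discrimination after deployment while causing avoidable harm. The threshold, harm size, and risk model below are all invented for illustration.

```python
import numpy as np

def auc(scores, labels):
    """Rank-based AUC: probability a positive case outranks a negative one."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum(); n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)                      # severity marker
base_risk = 1 / (1 + np.exp(-(x - 1)))     # risk if everyone is treated
pred = base_risk                            # a calibrated pre-deployment model

# Deployment policy: withhold an effective treatment above a risk threshold,
# which raises the true risk of exactly the flagged group.
withheld = pred > 0.5
harm = 0.25
post_risk = np.clip(base_risk + harm * withheld, 0, 1)
y_post = rng.uniform(size=n) < post_risk    # outcomes after deployment

print(f"AUC after deployment: {auc(pred, y_post):.3f}")   # still high
print(f"approx. excess event rate: {harm * withheld.mean():.3%}")
```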

Bayesian networks are widely utilised in various fields, offering elegant representations of factorisations and causal relationships. We use surjective functions to reduce the dimensionality of Bayesian networks by combining states, and we study the preservation of their factorisation structure. We introduce and define the corresponding notions, analyse their properties, and provide examples of highly symmetric special cases, enhancing the understanding of the fundamental properties of such reductions for Bayesian networks. We also discuss the connection between these reductions and reductions of homogeneous and non-homogeneous Markov chains.
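The Markov-chain analogue of such a reduction is easy to make concrete. In the sketch below, a surjection `phi` combines the states of a transition matrix, and the reduction preserves the Markov structure exactly precisely when the chain is strongly lumpable; the example matrix is an illustrative highly symmetric case, not one from the paper.

```python
import numpy as np

def lump(P, phi, k):
    """Collapse the states of a Markov chain with transition matrix P via
    the surjection phi: {0,...,n-1} -> {0,...,k-1}. Returns the lumped
    transition matrix and whether the chain is strongly lumpable, i.e.
    whether the reduced process is Markov for every initial distribution."""
    phi = np.asarray(phi)
    n = P.shape[0]
    B = np.zeros((n, k))
    for j in range(n):                 # aggregate destination probabilities
        B[:, phi[j]] += P[:, j]        # B[i, c] = P(jump from i into block c)
    lumpable = all(
        np.allclose(B[phi == c], B[phi == c][0]) for c in range(k)
    )
    Q = np.vstack([B[phi == c].mean(axis=0) for c in range(k)])
    return Q, lumpable

# A symmetric 4-state chain folded onto 2 super-states
P = np.array([[0.5, 0.2, 0.2, 0.1],
              [0.2, 0.5, 0.1, 0.2],
              [0.2, 0.1, 0.5, 0.2],
              [0.1, 0.2, 0.2, 0.5]])
Q, ok = lump(P, phi=[0, 0, 1, 1], k=2)
print(Q)        # [[0.7, 0.3], [0.3, 0.7]]
print(ok)       # True: the reduction preserves the Markov structure
```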

Understanding the mechanisms through which neural networks extract statistics from input-label pairs is one of the most important unsolved problems in supervised learning. Prior works have identified that the Gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, a statement known as the Neural Feature Ansatz (NFA). However, the reason these quantities become correlated during training is poorly understood. In this work, we explain the emergence of this correlation. We identify that the NFA is equivalent to alignment between the left singular structure of the weight matrices and a significant component of the empirical neural tangent kernels associated with those weights. We establish that the NFA introduced in prior works is driven by a centered NFA that isolates this alignment. We show that the speed of NFA development can be predicted analytically at early training times in terms of simple statistics of the inputs and labels. Finally, we introduce a simple intervention to increase NFA correlation at any given layer, which dramatically improves the quality of features learned.
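A small numpy experiment (our illustrative setup, not the paper's) shows the quantities involved: for a one-hidden-layer ReLU network $f(x) = v^\top \mathrm{relu}(Wx)$, the NFA correlation is the cosine similarity between the first-layer Gram matrix $W^\top W$ and the average gradient outer product (AGOP) of $f$ with respect to the input, and on a low-rank target it typically grows during training.

```python
import numpy as np

def nfa_correlation(W, X, v):
    """Cosine similarity between W^T W (the weight Gram matrix) and the
    AGOP of f(x) = v^T relu(W x) over the data X -- the correlation
    asserted by the Neural Feature Ansatz for the first layer."""
    agop = np.zeros((W.shape[1], W.shape[1]))
    for x in X:
        g = W.T @ (v * (W @ x > 0))          # input gradient of f at x
        agop += np.outer(g, g)
    agop /= len(X)
    G = W.T @ W
    return (G * agop).sum() / (np.linalg.norm(G) * np.linalg.norm(agop))

rng = np.random.default_rng(0)
n, d, h = 512, 10, 64
X = rng.normal(size=(n, d))
y = X[:, 0] * X[:, 1]                        # a low-rank target
W = rng.normal(size=(h, d)) / np.sqrt(d)
v = rng.normal(size=h) / np.sqrt(h)

print("before training:", nfa_correlation(W, X, v))
lr = 0.05
for _ in range(500):                         # plain gradient descent on MSE
    H = np.maximum(X @ W.T, 0.0)             # hidden activations, (n, h)
    r = H @ v - y                            # residuals
    gv = 2 * H.T @ r / n
    gW = 2 * ((r[:, None] * (X @ W.T > 0)) * v).T @ X / n
    v -= lr * gv; W -= lr * gW
print("after training: ", nfa_correlation(W, X, v))
```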

We formulate a uniform tail bound for empirical processes indexed by a class of functions, in terms of the individual deviations of the functions rather than the worst-case deviation in the considered class. The tail bound is established by introducing an initial "deflation" step to the standard generic chaining argument. The resulting tail bound is the sum of the complexity of the "deflated function class", in terms of a generalization of Talagrand's $\gamma$ functional, and the deviation of the function instance, both of which are formulated based on the natural seminorm induced by the corresponding Cram\'{e}r functions. We also provide approximations of this seminorm when the function class lies in a given Orlicz space of exponential type, which can be used to make the complexity term and the deviation term more explicit.
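Schematically (our paraphrase of the abstract, suppressing constants and the precise definitions), such a bound has the shape: with probability at least $1-\delta$,

\[
\sup_{f \in \mathcal{F}} \big[ (P_n - P)f - \mathrm{dev}_\delta(f) \big] \;\lesssim\; \gamma(\mathcal{F}_0, \|\cdot\|),
\]

where $\mathcal{F}_0$ is the deflated function class, $\gamma$ is the generalized Talagrand functional, $\mathrm{dev}_\delta(f)$ is the individual deviation of $f$, and both terms are measured in the seminorm induced by the Cram\'{e}r functions.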

The Spatial AutoRegressive (SAR) model is commonly used in studies involving spatial and network data to estimate the spatial or network peer influence and the effects of covariates on the response, taking into account the spatial or network dependence. While the model can be efficiently estimated with a quasi-maximum likelihood approach (QMLE), the detrimental effect of covariate measurement error on the QMLE, and how to remedy it, is currently unknown. If covariates are measured with error, then the QMLE may lose $\sqrt{n}$-consistency and may even be inconsistent, even when each node is influenced by only a limited number of other nodes or spatial units. We develop a measurement error-corrected ML estimator (ME-QMLE) for the parameters of the SAR model when covariates are measured with error. The ME-QMLE possesses statistical consistency and asymptotic normality properties. We consider two types of applications. The first is when the true covariate cannot be measured directly and a proxy is observed instead. The second involves latent homophily factors, estimated with error from the network, that are included when estimating peer influence. Our numerical results verify the bias correction property of the estimator and the accuracy of the standard error estimates in finite samples. We illustrate the method on a real dataset of county-level death rates from the COVID-19 pandemic.
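In display form, the setting is the standard SAR model with an error-prone covariate (our notation):

\[
y = \rho W y + X\beta + \varepsilon, \qquad X^{\ast} = X + U,
\]

where $W$ is the known spatial or network weight matrix, $\rho$ is the peer-influence parameter, and the true covariates $X$ are observed only through the noisy proxy $X^{\ast}$. Plugging $X^{\ast}$ into the quasi-likelihood as if it were $X$ is what produces the bias that the ME-QMLE corrects.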

We consider nonparametric Bayesian inference in a multidimensional diffusion model with reflecting boundary conditions based on discrete high-frequency observations. We prove a general posterior contraction rate theorem in $L^2$-loss, which is applied to Gaussian priors. The resulting posteriors, as well as their posterior means, are shown to converge to the ground truth at the minimax optimal rate over H\"older smoothness classes in any dimension. Of independent interest and as part of our proofs, we show that certain frequentist penalized least squares estimators are also minimax optimal.
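A one-dimensional sketch of such a penalized least squares drift estimator, under our simplifying assumptions (scalar reflected diffusion on $[0,1]$, unit diffusivity, cosine basis chosen to match the reflecting boundary; the paper treats the multidimensional case):

```python
import numpy as np

def drift_pls(X, dt, n_basis=20, lam=1e-2):
    """Ridge-penalized least squares drift estimate for a scalar diffusion
    on [0,1] observed at time step dt: regress increments on a cosine
    (Neumann-type) basis, matching the reflecting boundary conditions."""
    x, dX = X[:-1], np.diff(X) / dt
    ks = np.arange(n_basis)
    Phi = np.cos(np.pi * np.outer(x, ks))            # design matrix
    pen = lam * np.diag(ks.astype(float) ** 4)       # smoothness penalty
    coef = np.linalg.solve(dt * Phi.T @ Phi + pen, dt * Phi.T @ dX)
    return lambda s: np.cos(np.pi * np.outer(np.atleast_1d(s), ks)) @ coef

# Example: Euler scheme for dX = sin(2*pi*X) dt + dW, reflected at 0 and 1
rng = np.random.default_rng(0); dt, n = 1e-3, 100_000
X = np.empty(n); X[0] = 0.5
for i in range(1, n):
    z = X[i-1] + np.sin(2 * np.pi * X[i-1]) * dt + np.sqrt(dt) * rng.normal()
    if z < 0: z = -z                                 # reflect at 0
    if z > 1: z = 2 - z                              # reflect at 1
    X[i] = z
b_hat = drift_pls(X, dt)
print(b_hat(np.array([0.25])))   # true drift there is sin(pi/2) = 1
```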
