
This paper considers the reliability of automatic differentiation (AD) for neural networks involving the nonsmooth MaxPool operation. We investigate the behavior of AD across different precision levels (16, 32, and 64 bits) and convolutional architectures (LeNet, VGG, and ResNet) on several datasets (MNIST, CIFAR10, SVHN, and ImageNet). Although AD can be incorrect, recent research has shown that it coincides with the derivative almost everywhere, even in the presence of nonsmooth operations such as MaxPool and ReLU. In practice, however, AD operates with floating-point numbers rather than real numbers, so there is a need to characterize the subsets on which AD can be numerically incorrect. These subsets include a bifurcation zone (where AD is incorrect over the reals) and a compensation zone (where AD is incorrect over floating-point numbers but correct over the reals). Training with SGD, we study the impact of different choices of nonsmooth Jacobian for the MaxPool function at 16- and 32-bit precision. Our findings suggest that nonsmooth MaxPool Jacobians with lower norms help maintain stable and efficient test accuracy, whereas those with higher norms can result in instability and decreased performance. We also observe that the influence of MaxPool's nonsmooth Jacobians on learning can be reduced by using batch normalization, Adam-like optimizers, or a higher precision level.
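
As a minimal illustration of what a "choice of nonsmooth Jacobian for MaxPool" means (not the paper's experimental setup), the Python sketch below contrasts two valid conventions at a tied maximum: routing the incoming gradient to the first maximal entry, as most frameworks do, versus splitting it over all maximal entries, which yields a lower-norm Jacobian.

```python
import numpy as np

def maxpool_backward(x, grad_out, rule="first"):
    """Backward pass of a 1D max over x under two nonsmooth Jacobian conventions.

    At a tie (several entries equal to the max) the derivative is undefined;
    'first' routes the gradient to the first maximal entry (the one-hot choice
    most frameworks implement), while 'split' distributes it equally over all
    maximal entries, giving a subgradient with a smaller norm.
    """
    if rule == "first":
        g = np.zeros_like(x, dtype=float)
        g[np.argmax(x)] = grad_out
    else:  # "split"
        mask = (x == x.max())
        g = grad_out * mask / mask.sum()
    return g

# Low precision makes exact ties (and hence the bifurcation zone) more likely.
x = np.array([1.0, 3.0, 3.0, 2.0], dtype=np.float16)
print(maxpool_backward(x, 1.0, "first"))  # [0. 1. 0. 0.]
print(maxpool_backward(x, 1.0, "split"))  # [0.  0.5 0.5 0. ]
```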

Related content

The typical phases of Bayesian network (BN) structured development include specification of purpose and scope, structure development, parameterisation, and validation. Structure development typically focuses on qualitative issues and parameterisation on quantitative issues; however, qualitative and quantitative issues arise in both phases. A common step after the initial structure has been developed is a rough parameterisation that only captures and illustrates the intended qualitative behaviour of the model. This is done prior to a more rigorous parameterisation, ensuring that the structure is fit for purpose and supporting later development and validation. In our collective experience, and in discussions with other modellers, this step is an important part of the development process, but it is under-reported in the literature. Since the practice focuses on qualitative issues, despite being quantitative in nature, we call this step qualitative parameterisation and provide an outline of its role in the BN development process.
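
As a toy sketch of what such a qualitative (rough) parameterisation might look like, the snippet below uses entirely hypothetical nodes and placeholder numbers to fill in a conditional probability table whose only purpose is to encode the intended direction of influence before any rigorous parameterisation.

```python
import numpy as np

# Toy "qualitative parameterisation" of a two-node BN: Rain -> WetGrass.
# The numbers are placeholders chosen only so that P(WetGrass = yes | Rain)
# increases with Rain, i.e. they encode the intended qualitative behaviour,
# not any evidence or elicited values.
p_rain = np.array([0.5, 0.5])                 # P(Rain = no, yes)
p_wet_given_rain = np.array([[0.9, 0.2],      # P(WetGrass = no  | Rain = no, yes)
                             [0.1, 0.8]])     # P(WetGrass = yes | Rain = no, yes)

# Sanity check of the qualitative property: wet grass is more likely when it rains.
assert p_wet_given_rain[1, 1] > p_wet_given_rain[1, 0]

# Joint distribution and marginal, to confirm the rough model behaves as intended.
joint = p_wet_given_rain * p_rain             # shape (wet, rain)
print("P(WetGrass = yes) =", joint[1].sum())
```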

We propose new goodness-of-fit tests for the Poisson distribution. The testing procedure entails fitting a weighted Poisson distribution, which has the Poisson as a special case, to observed data. Based on sample data, we calculate an empirical weight function, which is compared to its theoretical counterpart under the Poisson assumption. Weighted $L_p$ distances between these empirical and theoretical functions are proposed as test statistics, and closed-form expressions are derived for the $L_1$, $L_2$ and $L_\infty$ distances. A Monte Carlo study is included in which the newly proposed tests are shown to be powerful when compared to existing tests, especially in the case of overdispersed alternatives. We demonstrate the use of the tests with two practical examples.
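
The abstract does not specify the weight function or the exact form of the statistics, so the sketch below is only a hedged illustration of the general recipe: estimate an empirical weight-type function from the ratio of the empirical pmf to the fitted Poisson pmf, and measure its weighted distance from the constant weight implied by the Poisson hypothesis.

```python
import numpy as np
from scipy.stats import poisson

def poisson_gof_stat(x, k_max=None):
    """Illustrative weighted-L2 statistic comparing an empirical weight function
    with its constant value under the Poisson hypothesis. The specific weight
    and distances used in the paper may differ; this only conveys the idea."""
    x = np.asarray(x)
    lam = x.mean()                                    # MLE of the Poisson mean
    k_max = int(x.max()) if k_max is None else k_max
    ks = np.arange(k_max + 1)
    p_hat = np.array([(x == k).mean() for k in ks])   # empirical pmf
    f_lam = poisson.pmf(ks, lam)                      # fitted Poisson pmf
    w_hat = p_hat / f_lam                             # empirical weight function
    # Under the Poisson null, w_hat(k) should be close to 1 for all k;
    # weight the squared deviations by the fitted pmf so rare k count less.
    return np.sum(f_lam * (w_hat - 1.0) ** 2)

rng = np.random.default_rng(0)
print(poisson_gof_stat(rng.poisson(3.0, size=500)))          # small under the null
print(poisson_gof_stat(rng.negative_binomial(3, 0.5, 500)))  # larger when overdispersed
```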

We present an algorithm for the exact computer-aided construction of the Voronoi cells of lattices with known symmetry group. Our algorithm scales better than linearly with the total number of faces and is applicable to dimensions beyond 12, which previous methods could not achieve. The new algorithm is applied to the Coxeter-Todd lattice $K_{12}$ as well as to a family of lattices obtained from laminating $K_{12}$. By optimizing this family, we obtain a new best 13-dimensional lattice quantizer (among the lattices with published exact quantizer constants).
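
The symmetry-based construction is specific to the paper, but the quantity being optimized, the normalized second moment (quantizer constant) of a lattice, can be illustrated with a Monte Carlo estimate for the trivial integer lattice Z^n, where the nearest lattice point is coordinate-wise rounding and the exact constant is 1/12. This sketch only shows the figure of merit, not the paper's exact Voronoi-cell computation.

```python
import numpy as np

def nsm_integer_lattice(n, num_samples=200_000, seed=0):
    """Monte Carlo estimate of the normalized second moment (quantizer constant)
    G of the integer lattice Z^n. Its Voronoi cell is the unit cube, so the
    nearest lattice point is obtained by rounding and the exact value is 1/12."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-0.5, 0.5, size=(num_samples, n))  # uniform over the Voronoi cell
    err = x - np.round(x)                               # quantization error
    # G = E[||err||^2] / (n * V^{2/n}), with cell volume V = 1 for Z^n.
    return np.mean(np.sum(err ** 2, axis=1)) / n

print(nsm_integer_lattice(12))  # close to 1/12 ≈ 0.08333
```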

Complex interval arithmetic is a powerful tool for the analysis of computational errors. The naturally arising rectangular, polar, and circular (together called primitive) interval types are not closed under simple arithmetic operations, and their use yields overly relaxed bounds. The later-introduced polygonal type, on the other hand, allows arbitrarily precise representation of the above operations at a higher computational cost. We propose the polyarcular interval type as an effective extension of the previous types. The polyarcular interval can represent all primitive intervals and most of their arithmetic combinations precisely, and its approximation capability competes with that of the polygonal interval. In particular, in antenna tolerance analysis it can achieve perfect accuracy at a lower computational cost than the polygonal type, which we show in a relevant case study. In this paper, we present a rigorous analysis of the arithmetic properties of all five interval types, involving a new algebro-geometric method of boundary analysis.
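
To see why the rectangular type is not closed under multiplication and yields relaxed bounds, the sketch below multiplies two rectangular complex intervals using componentwise real interval arithmetic and compares the resulting rectangular hull with samples from the exact (non-rectangular) product set; the intervals chosen are arbitrary.

```python
import numpy as np

def interval_add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def interval_sub(a, b):
    return (a[0] - b[1], a[1] - b[0])

def interval_mul(a, b):
    p = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(p), max(p))

def rect_mul(re1, im1, re2, im2):
    """Rectangular enclosure of the product of two rectangular complex intervals
    (re + i*im), computed componentwise with real interval arithmetic."""
    re = interval_sub(interval_mul(re1, re2), interval_mul(im1, im2))
    im = interval_add(interval_mul(re1, im2), interval_mul(im1, re2))
    return re, im

# Product of two rectangular complex intervals, both equal to [1,2] + i[1,2].
re, im = rect_mul((1, 2), (1, 2), (1, 2), (1, 2))
print("rectangular hull:", re, im)  # ([-3, 3], [2, 8])

# Sampling the exact product set shows it is not a rectangle: the hull corner
# with Re + Im = 3 + 8 = 11 is never attained, so the rectangle over-covers it.
rng = np.random.default_rng(0)
z1 = rng.uniform(1, 2, 10_000) + 1j * rng.uniform(1, 2, 10_000)
z2 = rng.uniform(1, 2, 10_000) + 1j * rng.uniform(1, 2, 10_000)
prod = z1 * z2
print("max sampled Re + Im:", (prod.real + prod.imag).max())  # at most 8
```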

We investigate a filtered Lie-Trotter splitting scheme for the ``good" Boussinesq equation and derive an error estimate for initial data with very low regularity. Through the use of discrete Bourgain spaces, our analysis extends to initial data in $H^{s}$ for $0<s\leq 2$, overcoming the constraint of $s>1/2$ imposed by the bilinear estimate in smooth Sobolev spaces. We establish convergence rates of order $\tau^{s/2}$ in $L^2$ for such levels of regularity. Our analytical findings are supported by numerical experiments.
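
For context, the generic (unfiltered) Lie-Trotter step and the stated convergence claim can be written as follows; the frequency filter and the specific splitting of the ``good" Boussinesq equation are defined in the paper and are not reproduced here.

```latex
% Generic Lie--Trotter splitting step for an abstract evolution equation; the
% filter and the concrete splitting of the ``good'' Boussinesq equation are
% as defined in the paper.
\begin{align*}
  \partial_t u &= A u + B(u), \qquad
  u^{n+1} = \Phi^{B}_{\tau}\bigl(\Phi^{A}_{\tau}(u^{n})\bigr),
  \qquad \Phi^{A}_{\tau} = e^{\tau A}, \\
  \lVert u(t_n) - u^{n} \rVert_{L^2} &\le C\bigl(\lVert u_0 \rVert_{H^{s}}\bigr)\,\tau^{s/2},
  \qquad 0 < s \le 2,
\end{align*}
% where \Phi^{B}_{\tau} denotes the (filtered) flow of \partial_t u = B(u).
```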

We provide full theoretical guarantees for the convergence behaviour of diffusion-based generative models under the assumption of strongly log-concave data distributions, while our approximating class of functions used for score estimation consists of Lipschitz continuous functions. We demonstrate the power of our approach via a motivating example: sampling from a Gaussian distribution with unknown mean. In this case, explicit estimates are provided for the associated optimization problem, i.e. score approximation, and these are combined with the corresponding sampling estimates. As a result, we obtain the best known upper bounds, in terms of key quantities of interest such as the dimension and rates of convergence, for the Wasserstein-2 distance between the data distribution (a Gaussian with unknown mean) and our sampling algorithm. Beyond the motivating example, and in order to allow the use of a diverse range of stochastic optimizers, we present our results under an $L^2$-accurate score estimation assumption, which, crucially, is formed under an expectation with respect to the stochastic optimizer and our novel auxiliary process that uses only known information. This approach yields the best known convergence rate for our sampling algorithm.
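
Spelling out the motivating example: for data drawn from a Gaussian with unknown mean and identity covariance, the score is affine and 1-Lipschitz, so it belongs to the Lipschitz approximating class and score estimation reduces to estimating the mean.

```latex
% Score of the motivating example: data distribution N(mu, I_d) with unknown mean.
\begin{align*}
  p_{\mu}(x) &= (2\pi)^{-d/2} \exp\!\Bigl(-\tfrac{1}{2}\lVert x - \mu \rVert^{2}\Bigr), \\
  \nabla_x \log p_{\mu}(x) &= -(x - \mu),
\end{align*}
% which is 1-Lipschitz in x for every mu, so the true score lies in the Lipschitz
% approximating class and estimating it amounts to estimating the single unknown
% parameter mu.
```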

In this article we aim to obtain the Fisher Riemann geodesics for nonparametric families of probability densities as a weak limit of the parametric case with increasing number of parameters.
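
As background (these are standard facts, not the paper's weak-limit construction): the parametric Fisher metric and the geodesic distance it induces on the full nonparametric family of densities, obtained through the square-root embedding onto the unit sphere, are

```latex
% Parametric Fisher information metric and the geodesic distance it induces on
% the full nonparametric family (standard facts, not the paper's construction).
\begin{align*}
  g_{ij}(\theta) &= \int \partial_{\theta_i} \log p_{\theta}(x)\,
                         \partial_{\theta_j} \log p_{\theta}(x)\; p_{\theta}(x)\,dx, \\
  d_{\mathrm{FR}}(p, q) &= 2 \arccos \int \sqrt{p(x)\, q(x)}\, dx,
\end{align*}
% the second line following from the square-root embedding p -> sqrt(p), which
% maps densities (up to a constant factor) isometrically onto the unit sphere in L^2.
```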

Effect modification occurs when the impact of the treatment on an outcome varies based on the levels of other covariates known as effect modifiers. Modeling of these effect differences is important for etiological goals and for purposes of optimizing treatment. Structural nested mean models (SNMMs) are useful causal models for estimating the potentially heterogeneous effect of a time-varying exposure on the mean of an outcome in the presence of time-varying confounding. A data-driven approach for selecting the effect modifiers of an exposure may be necessary if these effect modifiers are a priori unknown and need to be identified. Although variable selection techniques are available in the context of estimating conditional average treatment effects using marginal structural models, or in the context of estimating optimal dynamic treatment regimens, all of these methods consider an outcome measured at a single point in time. In the context of an SNMM for repeated outcomes, we propose a doubly robust penalized G-estimator for the causal effect of a time-varying exposure with a simultaneous selection of effect modifiers and use this estimator to analyze the effect modification in a study of hemodiafiltration. We prove the oracle property of our estimator, and conduct a simulation study for evaluation of its performance in finite samples and for verification of its double-robustness property. Our work is motivated by and applied to the study of hemodiafiltration for treating patients with end-stage renal disease at the Centre Hospitalier de l'Universit\'e de Montr\'eal. We apply the proposed method to investigate the effect heterogeneity of dialysis facility on the repeated session-specific hemodiafiltration outcomes.
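
As a hedged illustration of the kind of model involved (the paper's SNMM for repeated outcomes may be parameterised differently), a standard blip function with linear effect modification looks as follows; the penalised G-estimator then selects effect modifiers by shrinking their interaction coefficients.

```latex
% A standard SNMM blip function with linear effect modification (an illustrative
% parameterisation; the paper's SNMM for repeated outcomes may differ).
\begin{equation*}
  \gamma_{t}(h_{t}, a_{t}; \psi)
  = E\bigl[\, Y^{(\bar a_{t}, \underline{0})} - Y^{(\bar a_{t-1}, \underline{0})}
      \,\big|\, H_{t} = h_{t},\, A_{t} = a_{t} \bigr]
  = a_{t}\bigl(\psi_{0} + \psi_{1}^{\top} l_{t}\bigr),
\end{equation*}
% where l_t \subseteq h_t are candidate effect modifiers; penalisation shrinks
% components of \psi_1 toward zero to select among them.
```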

Continual learning algorithms strive to acquire new knowledge while preserving prior information. Often, these algorithms emphasise stability and restrict network updates upon learning new tasks. In many cases, such restrictions come at a cost to the model's plasticity, i.e. its ability to adapt to the requirements of a new task. But is all change detrimental? Here, we approach this question by proposing that activation spaces in neural networks can be decomposed into two subspaces: a readout range, in which change affects prior tasks, and a null space, in which change does not alter prior performance. Based on experiments with this decomposition, we show that, indeed, not all activation change is associated with forgetting. Instead, only change in the subspace visible to the readout of a task can lead to decreased stability, while restricting change outside of this subspace is associated only with a loss of plasticity. Analysing various commonly used algorithms, we show that regularisation-based techniques do not fully disentangle the two subspaces and, as a result, restrict plasticity more than necessary. We expand our results by investigating a linear model in which we can manipulate learning in the two subspaces directly and thus causally link activation changes to stability and plasticity. For hierarchical, nonlinear cases, we present an approximation that enables us to estimate functionally relevant subspaces at every layer of a deep nonlinear network, corroborating our previous insights. Together, this work provides novel means to derive insights into the mechanisms behind stability and plasticity in continual learning, and may serve as a diagnostic tool to guide the development of future continual learning algorithms that stabilise inference while allowing maximal space for learning.
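
A minimal linear sketch of the proposed decomposition, with a hypothetical readout matrix W: project an activation change onto the row space of W (the readout range, visible to the prior task) and onto its orthogonal complement (the null space, invisible to it). The paper's estimation of such subspaces in deep nonlinear networks is an approximation rather than this exact projection.

```python
import numpy as np

def split_activation_change(W, delta_h):
    """Split an activation change into the component visible to a linear readout
    W (its row space) and the component in the readout's null space, which leaves
    the prior task's outputs unchanged."""
    # Orthogonal projector onto the row space of W (k x d readout, k < d).
    P = W.T @ np.linalg.pinv(W @ W.T) @ W
    delta_readout = P @ delta_h           # changes prior-task outputs
    delta_null = delta_h - delta_readout  # W @ delta_null == 0
    return delta_readout, delta_null

rng = np.random.default_rng(0)
W = rng.standard_normal((5, 50))          # readout of a prior task (hypothetical)
dh = rng.standard_normal(50)              # activation change from learning a new task
d_ro, d_null = split_activation_change(W, dh)
print(np.allclose(W @ d_null, 0))         # True: null-space change is invisible
print(np.linalg.norm(W @ d_ro - W @ dh))  # ~0: the readout sees only the row-space part
```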

Analyzing longitudinal data in health studies is challenging due to sparse and error-prone measurements, strong within-individual correlation, missing data, and varied trajectory shapes. While mixed-effect models (MM) effectively address these challenges, they remain parametric and may incur substantial computational costs. In contrast, Functional Principal Component Analysis (FPCA), a non-parametric approach developed for regular and dense functional data, flexibly describes temporal trajectories at a lower computational cost. This paper presents an empirical simulation study evaluating the behaviour of FPCA with sparse and error-prone repeated measures, and its robustness under different missing-data schemes, in comparison with MM. The results show that FPCA is well suited in the presence of data missing at random due to dropout, except in the scenarios with the most frequent and systematic dropout. Like MM, FPCA fails under a missing-not-at-random mechanism. FPCA was then applied to describe the trajectories of four cognitive functions before clinical dementia and to contrast them with those of matched controls in a case-control study nested in a population-based aging cohort. The average cognitive decline of future dementia cases showed a sudden divergence from that of their matched controls, with a sharp acceleration 5 to 2.5 years prior to diagnosis.
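
For orientation, a minimal FPCA for dense, regularly observed curves is sketched below on synthetic data; the sparse, error-prone designs studied in the paper require smoothing-based estimation of the mean and covariance (e.g. PACE-type estimators), which this sketch does not implement.

```python
import numpy as np

def fpca_dense(Y, n_components=2):
    """Basic FPCA for dense, regularly observed curves (n_subjects x n_times)."""
    mu = Y.mean(axis=0)                          # mean function on the common grid
    Yc = Y - mu                                  # centred curves
    cov = (Yc.T @ Yc) / (Y.shape[0] - 1)         # empirical covariance surface
    evals, evecs = np.linalg.eigh(cov)           # ascending eigenvalues
    evals, evecs = evals[::-1], evecs[:, ::-1]   # reorder to descending
    phi = evecs[:, :n_components]                # discretised eigenfunctions
    scores = Yc @ phi                            # individual component scores
    return mu, phi, evals, scores

# Toy data: a common quadratic trend plus random individual slopes and noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 20)
Y = 1 - t**2 + rng.standard_normal((100, 1)) * t + 0.1 * rng.standard_normal((100, 20))
mu, phi, evals, scores = fpca_dense(Y)
print(evals[:2] / evals.sum())  # share of total variance captured by each component
```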
