
Generalization to unseen data remains poorly understood for deep learning classification and foundation models. How can one assess the ability of networks to adapt to new or extended versions of their input space in the spirit of few-shot learning, out-of-distribution generalization, and domain adaptation? Which layers of a network are likely to generalize best? We provide a new method for evaluating the capacity of networks to represent a sampled domain, regardless of whether the network has been trained on all classes in the domain. Our approach is the following: after fine-tuning state-of-the-art pre-trained models for visual classification on a particular domain, we assess their performance on data from related but distinct variations in that domain. Generalization power is quantified as a function of the latent embeddings of unseen data from intermediate layers for both unsupervised and supervised settings. Working throughout all stages of the network, we find that (i) high classification accuracy does not imply high generalizability; and (ii) deeper layers in a model do not always generalize the best, which has implications for pruning. Since the trends observed across datasets are largely consistent, we conclude that our approach reveals (a function of) the intrinsic capacity of the different layers of a model to generalize.
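The layer-wise evaluation of latent embeddings can be illustrated with a toy probe. The sketch below uses a nearest-class-centroid classifier on synthetic embeddings for two hypothetical layers; the data, the probe, and the layer names are illustrative stand-ins, not the paper's actual models or datasets:

```python
import numpy as np

def probe_layer(train_emb, train_y, test_emb, test_y):
    """Nearest-class-centroid probe: a cheap proxy for how well a layer's
    latent embeddings separate classes the network was never trained on."""
    classes = np.unique(train_y)
    centroids = np.stack([train_emb[train_y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(test_emb[:, None, :] - centroids[None, :, :], axis=2)
    pred = classes[np.argmin(dists, axis=1)]
    return float(np.mean(pred == test_y))

# Synthetic demo: embeddings of unseen classes taken from two layers;
# "layer A" separates the unseen classes well, "layer B" barely does.
rng = np.random.default_rng(0)
y_tr = np.repeat([0, 1], 50)
y_te = np.repeat([0, 1], 50)
layer_a_tr = rng.normal(0, 1, (100, 16)) + 3.0 * y_tr[:, None]
layer_a_te = rng.normal(0, 1, (100, 16)) + 3.0 * y_te[:, None]
layer_b_tr = rng.normal(0, 1, (100, 16)) + 0.1 * y_tr[:, None]
layer_b_te = rng.normal(0, 1, (100, 16)) + 0.1 * y_te[:, None]

acc_a = probe_layer(layer_a_tr, y_tr, layer_a_te, y_te)
acc_b = probe_layer(layer_b_tr, y_tr, layer_b_te, y_te)
```

Comparing such probe accuracies across layers, rather than only at the output, is the kind of analysis that can reveal that the deepest layer is not always the most generalizable.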


Previous explanations for the persistence of polarization of opinions have typically included modelling assumptions that presuppose the possibility of polarization (e.g.\ repulsive interactions). An exception is recent research showing that polarization is stable when agents form their opinions using reinforcement learning. We show that the polarization observed in this model is not stable, but exhibits consensus asymptotically with probability one. By constructing a link between the reinforcement learning model and the voter model, we argue that the observed polarization is metastable. Finally, we show that a slight modification in the learning process of the agents changes the model from being non-ergodic to being ergodic. Our results show that reinforcement learning may be a powerful method for modelling polarization in opinion dynamics, but that the tools appropriate for analysing such models crucially depend on the properties of the resulting systems, properties which are in turn determined by the details of the learning process.
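The voter-model connection can be illustrated with a minimal simulation: on the complete graph the only absorbing states of the voter model are the two consensus states, so any polarized configuration can at best be metastable. The sketch below is the classic voter model only, not the paper's reinforcement-learning dynamics:

```python
import random

def voter_model(n=20, seed=1, max_steps=100000):
    """Voter model on the complete graph: at each step a randomly chosen
    agent copies the opinion of another randomly chosen agent.  The only
    absorbing states are the two consensus states."""
    rng = random.Random(seed)
    opinions = [rng.randint(0, 1) for _ in range(n)]  # mixed initial opinions
    for step in range(max_steps):
        if len(set(opinions)) == 1:
            return opinions, step  # consensus reached
        i, j = rng.sample(range(n), 2)
        opinions[i] = opinions[j]
    return opinions, max_steps

opinions, steps = voter_model()
```

For this chain the expected consensus time is of order $n^2$ interactions, so the run above terminates in consensus long before the step budget is exhausted.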

It has been well known since the 1960s that by exploiting the tensor-product structure of the discrete Laplacian on Cartesian meshes, one can develop a simple direct Poisson solver with $\mathcal O(N^{\frac{d+1}{d}})$ complexity in $d$ dimensions, where $N$ is the total number of unknowns. GPU acceleration of numerical PDE solvers was first explored successfully around fifteen years ago and has become increasingly popular over the past decade, driven by significant advances in both hardware and software, especially in the last few years. We present in this paper a simple but extremely fast MATLAB implementation on a modern GPU, which can be easily reproduced, for solving 3D Poisson-type equations using a spectral-element method. In particular, it takes less than one second on an Nvidia A100 to solve a Poisson equation with one billion degrees of freedom. We also present applications of this fast solver to a linear (time-independent) Schr\"odinger equation and a nonlinear (time-dependent) Cahn-Hilliard equation.
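The tensor-product diagonalization trick behind such solvers can be sketched in 2D. The NumPy finite-difference version below is only a minimal illustration of the idea (the paper's actual code is a 3D spectral-element MATLAB/GPU implementation): diagonalize the 1D Laplacian once, after which the whole solve reduces to a few dense matrix products.

```python
import numpy as np

def poisson2d_tensor_solve(F, h):
    """Direct 2D Poisson solve (-Laplace(u) = f, zero Dirichlet BCs) using
    the tensor-product structure T U + U T = F of the discrete Laplacian.
    Diagonalizing the 1D operator T = Q diag(lam) Q^T turns the solve into
    dense matrix products, giving O(N^{3/2}) cost for N unknowns."""
    n = F.shape[0]
    T = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2     # 1D Dirichlet Laplacian
    lam, Q = np.linalg.eigh(T)                     # T = Q diag(lam) Q^T
    F_hat = Q.T @ F @ Q                            # transform right-hand side
    U_hat = F_hat / (lam[:, None] + lam[None, :])  # divide by eigenvalue sums
    return Q @ U_hat @ Q.T                         # transform back

# Manufactured solution u = sin(pi x) sin(pi y), so f = 2 pi^2 u.
n = 63
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)
X, Y = np.meshgrid(x, x, indexing="ij")
F = 2 * np.pi**2 * np.sin(np.pi * X) * np.sin(np.pi * Y)
U = poisson2d_tensor_solve(F, h)
```

The same structure extends to 3D (three transforms instead of two), which is what makes the method so amenable to GPU-resident dense linear algebra.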

Recently, addressing spatial confounding has become a major topic in spatial statistics. However, the literature has provided conflicting definitions, and many proposed definitions do not address the issue of confounding as it is understood in causal inference. We define spatial confounding as the existence of an unmeasured causal confounder with a spatial structure. We present a causal inference framework for nonparametric identification of the causal effect of a continuous exposure on an outcome in the presence of spatial confounding. We propose double machine learning (DML), a procedure in which flexible models are used to regress both the exposure and outcome variables on confounders to arrive at a causal estimator with favorable robustness properties and convergence rates, and we prove that the resulting estimator is consistent and asymptotically normal under spatial dependence. As far as we are aware, this is the first approach to spatial confounding that does not rely on restrictive parametric assumptions (such as linearity, effect homogeneity, or Gaussianity) for both identification and estimation. We demonstrate the advantages of the DML approach analytically and in simulations. We apply our methods and reasoning to a study of the effect of fine particulate matter exposure during pregnancy on birthweight in California.
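The DML residual-on-residual recipe can be sketched for a partially linear model $Y = \theta A + g(X) + \varepsilon$, $A = m(X) + \eta$. The version below uses a simple k-NN regressor as a stand-in for the flexible learners and a scalar synthetic confounder; it omits the spatial-dependence theory that is the paper's actual contribution:

```python
import numpy as np

def knn_predict(X_tr, y_tr, X_te, k=10):
    """Simple k-NN regressor standing in for an arbitrary ML learner."""
    d = np.abs(X_te[:, None] - X_tr[None, :])
    idx = np.argsort(d, axis=1)[:, :k]
    return y_tr[idx].mean(axis=1)

def dml_effect(X, A, Y, k=10, seed=0):
    """Double machine learning: cross-fit nuisance regressions of the
    exposure A and outcome Y on the confounder X, then regress the
    Y-residuals on the A-residuals (a Neyman-orthogonal score)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    folds = rng.permutation(n) % 2          # two-fold cross-fitting
    rA, rY = np.empty(n), np.empty(n)
    for f in (0, 1):
        tr, te = folds != f, folds == f
        rA[te] = A[te] - knn_predict(X[tr], A[tr], X[te], k)
        rY[te] = Y[te] - knn_predict(X[tr], Y[tr], X[te], k)
    return float(np.sum(rA * rY) / np.sum(rA * rA))

# Synthetic demo with a known effect theta = 2.
rng = np.random.default_rng(1)
n = 2000
X = rng.uniform(-2, 2, n)                    # confounder
A = np.sin(X) + rng.normal(0, 0.5, n)        # exposure depends on confounder
Y = 2.0 * A + X**2 + rng.normal(0, 0.5, n)   # outcome, true effect 2.0
theta_hat = dml_effect(X, A, Y)
```

A naive regression of Y on A alone would be biased by the X-pathway; residualizing both variables first removes it, which is the robustness property the abstract refers to.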

We consider Maxwell eigenvalue problems on uncertain shapes, with perfectly conducting TESLA cavities being the driving example. Due to the shape uncertainty, the resulting eigenvalues and eigenmodes are also uncertain, and it is well known that the eigenvalues may exhibit crossings or bifurcations under perturbation. We discuss how the shape uncertainties can be modelled using the domain mapping approach and how the deformation mapping can be expressed as coefficients in Maxwell's equations. Using derivatives of these coefficients and derivatives of the eigenpairs, we follow a perturbation approach to compute approximations of the mean and covariance of the eigenpairs. For small perturbations these approximations are faster and more accurate than sampling or surrogate model strategies. For the implementation we use an approach based on isogeometric analysis, which allows for straightforward modelling of the domain deformations and computation of the required derivatives. Numerical experiments for a three-dimensional 9-cell TESLA cavity are presented.
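The perturbation idea for eigenvalue statistics can be illustrated on a small symmetric toy problem: for a simple eigenvalue with unit eigenvector $v$ of $A(p) = A_0 + p\,A_1$, first-order theory gives $d\lambda/dp = v^\top A_1 v$, so for $p \sim \mathcal N(0,\sigma^2)$ one obtains $\mathbb E[\lambda] \approx \lambda_0$ and $\mathrm{Var}[\lambda] \approx (v^\top A_1 v)^2 \sigma^2$. The matrices below are synthetic and unrelated to Maxwell or cavity geometries:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A0 = np.diag(np.arange(1.0, n + 1))               # nominal operator, simple eigenvalues
B = rng.normal(size=(n, n))
A1 = (B + B.T) / 2                                # symmetric perturbation direction
sigma = 0.01                                      # small random amplitude p ~ N(0, sigma^2)

# First-order perturbation approximations of mean and variance.
lam0, V = np.linalg.eigh(A0)
dlam = np.array([V[:, k] @ A1 @ V[:, k] for k in range(n)])
var_pert = (dlam * sigma) ** 2                    # Var[lambda_k] ~ (v^T A1 v)^2 sigma^2

# Monte Carlo reference (what the perturbation approach avoids).
samples = np.array([np.linalg.eigvalsh(A0 + rng.normal(0, sigma) * A1)
                    for _ in range(2000)])
var_mc = samples.var(axis=0)
```

The perturbation estimate needs one eigen-decomposition and one quadratic form per eigenvalue, versus thousands of eigenvalue solves for the sampling reference, which is the speed advantage claimed for small perturbations.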

In Parts I and II of this series of papers, three new methods for the computation of eigenvalues of singular pencils were developed: rank-completing perturbations, rank-projections, and augmentation. It was observed that a straightforward structure-preserving adaptation for symmetric pencils was not possible, and it was left as an open question how to address this challenge. In this Part III, it is shown how the observed issue can be circumvented by using Hermitian perturbations. This leads to structure-preserving analogues of the three techniques from Parts I and II for Hermitian pencils (including real symmetric pencils) as well as for related structures. It is an important feature of these methods that the sign characteristic of the given pencil is preserved. As an application, it is shown that the resulting methods can be used to solve systems of bivariate polynomials.

Mathematical modeling offers the opportunity to test hypotheses concerning the emergence and development of Myeloproliferative neoplasms. We tested different mathematical models on a training cohort (n=264 patients) (Registre de la c\^ote d'Or) to determine the emergence and evolution times before JAK2V617F classical Myeloproliferative disorders (Polycythemia Vera and Essential Thrombocythemia, respectively) are diagnosed. We dissected the time before diagnosis into two main periods: the time from embryonic development for the JAK2V617F mutation to occur, persist, and enter proliferation, and a second period corresponding to the expansion of the clonal population until diagnosis. Using progressively more complex models, we demonstrate that the rate of active mutation occurrence is not constant and does not merely reflect individual variability, but rather increases with age, taking a median time of 63.1+/-13 years. In contrast, the expansion time can be considered constant: 8.8 years once the mutation has emerged. The results were validated in an external cohort (national FIMBANK cohort, n=1248 patients). Comparing JAK2V617F Essential Thrombocythemia with Polycythemia Vera, we observed that the first period (the rate of active homozygous mutation occurrence) takes approximately 1.5 years longer for PV than for ET, while the expansion times are quasi-similar. In conclusion, our multi-step approach and the resulting time-dependent model of MPN emergence and development demonstrate that the emergence of a JAK2V617F mutation is linked to an aging mechanism, and indicate an 8-9 year period to develop a full MPN.

Nonparametric estimators for the mean and covariance functions of functional data are proposed. The setup covers a wide range of practical situations. The random trajectories are not necessarily differentiable, have unknown regularity, and are measured with error at discrete design points. The measurement error may be heteroscedastic, and the design points may be either randomly drawn or common to all curves. The estimators depend on the local regularity of the stochastic process generating the functional data. We consider a simple estimator of this local regularity which exploits the replication and regularization features of functional data. We then use a ``smoothing first, then estimate'' approach for the mean and covariance functions. The estimators can be applied to both sparsely and densely sampled curves, are easy to calculate and to update, and perform well in simulations. Simulations built upon a real data example illustrate the effectiveness of the new approach.
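The ``smoothing first, then estimate'' idea for the mean function can be sketched as follows: smooth each noisy, irregularly observed curve onto a common grid, then average the smoothed curves. The Nadaraya-Watson smoother and the synthetic curves below are a minimal illustration, not the paper's regularity-adaptive estimator:

```python
import numpy as np

def smooth_then_estimate_mean(t_obs, y_obs, t_grid, h=0.05):
    """'Smoothing first, then estimate': smooth each curve onto a common
    grid with a Nadaraya-Watson (Gaussian kernel) smoother, then average
    the smoothed curves to estimate the mean function."""
    smoothed = []
    for t, y in zip(t_obs, y_obs):
        w = np.exp(-0.5 * ((t_grid[:, None] - t[None, :]) / h) ** 2)
        smoothed.append((w @ y) / w.sum(axis=1))
    return np.mean(smoothed, axis=0)

# Synthetic functional data: 100 curves, each observed with noise at its
# own random design points; the true mean function is sin(2 pi t).
rng = np.random.default_rng(0)
t_grid = np.linspace(0, 1, 50)
t_obs, y_obs = [], []
for _ in range(100):
    t = np.sort(rng.uniform(0, 1, 30))          # random design points
    a = rng.normal(1.0, 0.2)                    # curve-specific amplitude
    y = a * np.sin(2 * np.pi * t) + rng.normal(0, 0.3, 30)
    t_obs.append(t)
    y_obs.append(y)
mu_hat = smooth_then_estimate_mean(t_obs, y_obs, t_grid)
```

In the paper the bandwidth would be tied to the estimated local regularity of the process; here it is simply fixed.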

We present a dimension-incremental method for function approximation in bounded orthonormal product bases to learn the solutions of various differential equations. To this end, we deconstruct the source function of the differential equation into parameters, such as Fourier or spline coefficients, and treat the solution of the differential equation as a high-dimensional function of the spatial variables, these parameters, and possibly further parameters from the differential equation itself. We then learn this function in the sense of sparse approximation in a suitable function space by detecting the basis coefficients with the largest absolute value. Investigating the corresponding indices of these coefficients yields further insight into the structure of the solution as well as its dependence on the parameters and their interactions, and allows for a reasonable generalization to even higher dimensions and thus better resolutions of the deconstructed source function.
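The sparsity principle, keeping only the basis coefficients of largest absolute value and reading structure off their indices, can be sketched in one dimension with a Fourier basis. This is only a stand-in for the high-dimensional dimension-incremental detection in the paper:

```python
import numpy as np

def sparse_fourier_approx(f_vals, s):
    """Keep only the s largest-magnitude Fourier coefficients of a sampled
    periodic function and reconstruct from them.  The retained indices
    reveal which frequencies the function actually 'lives' on."""
    n = len(f_vals)
    c = np.fft.rfft(f_vals) / n                 # normalized coefficients
    idx = np.argsort(np.abs(c))[::-1][:s]       # indices of dominant coefficients
    c_sparse = np.zeros_like(c)
    c_sparse[idx] = c[idx]
    return np.fft.irfft(c_sparse * n, n=n), idx

# A function supported on frequencies 3 and 10 is recovered almost exactly
# from just three retained coefficients.
x = np.linspace(0, 1, 256, endpoint=False)
f = 2 * np.sin(2 * np.pi * 3 * x) + 0.5 * np.cos(2 * np.pi * 10 * x)
f_hat, idx = sparse_fourier_approx(f, 3)
```

In the differential-equation setting the same inspection of the detected index set, now over spatial variables and source-function parameters jointly, is what exposes the solution's dependence on the parameters and their interactions.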

Bayesian inference for survival regression modeling offers numerous advantages, especially for decision-making and external data borrowing, but demands the specification of the baseline hazard function, which may be a challenging task. We propose an alternative approach that does not require specifying this function. Our approach uses pseudo-observations to convert censored data into longitudinal data, combined with the Generalized Method of Moments (GMM) to estimate the parameters of interest directly from the survival function. GMM may be viewed as an extension of the Generalized Estimating Equations (GEE) currently used for frequentist pseudo-observation analysis and can be extended to the Bayesian framework using a pseudo-likelihood function. We assessed the behavior of the frequentist and Bayesian GMM in this new context of analyzing pseudo-observations. We compared their performance to the Cox, GEE, and Bayesian piecewise exponential models through a simulation study of two-arm randomized clinical trials. Frequentist and Bayesian GMM gave valid inferences with performance similar to the three benchmark methods, except for small sample sizes and high censoring rates. For illustration, three post-hoc efficacy analyses were performed on randomized clinical trials involving patients with Ewing Sarcoma, producing results similar to those of the benchmark methods. Through a simple application of estimating hazard ratios, these findings confirm the effectiveness of this new Bayesian approach based on pseudo-observations and the Generalized Method of Moments, and offer new insights into using pseudo-observations for Bayesian survival analysis.
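The pseudo-observation step, converting censored data into one per-subject response, is standard: the jackknife pseudo-value of the Kaplan-Meier estimate $\hat S(t)$ for subject $i$ is $n\,\hat S(t) - (n-1)\,\hat S_{-i}(t)$. The sketch below implements just this step on simulated data; the GMM/pseudo-likelihood estimation layer of the paper is not shown:

```python
import numpy as np

def km_survival(time, event, t):
    """Kaplan-Meier estimate of S(t) from right-censored data."""
    s = 1.0
    for u in np.sort(np.unique(time[event == 1])):
        if u > t:
            break
        at_risk = np.sum(time >= u)
        deaths = np.sum((time == u) & (event == 1))
        s *= 1.0 - deaths / at_risk
    return s

def pseudo_observations(time, event, t):
    """Leave-one-out (jackknife) pseudo-observations of S(t): one
    approximately uncensored response per subject, ready for GEE/GMM-style
    estimating equations."""
    n = len(time)
    s_full = km_survival(time, event, t)
    keep = np.arange(n)
    return np.array([n * s_full
                     - (n - 1) * km_survival(time[keep != i], event[keep != i], t)
                     for i in range(n)])

# Simulated trial arm: exponential event times (S(1) = e^-1) with
# independent exponential censoring.
rng = np.random.default_rng(3)
n = 200
t_true = rng.exponential(1.0, n)
c = rng.exponential(2.0, n)
time = np.minimum(t_true, c)
event = (t_true <= c).astype(int)
pseudo = pseudo_observations(time, event, 1.0)
```

A useful sanity check: with no censoring, the pseudo-value for subject $i$ reduces exactly to the indicator $\mathbb 1\{T_i > t\}$, since the Kaplan-Meier estimate is then a sample mean.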

This dissertation studies a fundamental open challenge in deep learning theory: why do deep networks generalize well even while being overparameterized, unregularized, and fitting the training data to zero error? In the first part of the thesis, we will empirically study how training deep networks via stochastic gradient descent implicitly controls the networks' capacity. Subsequently, to show how this leads to better generalization, we will derive {\em data-dependent} {\em uniform-convergence-based} generalization bounds with improved dependencies on the parameter count. Uniform convergence has in fact been the most widely used tool in the deep learning literature, thanks to its simplicity and generality. Given its popularity, in this thesis, we will also take a step back to identify the fundamental limits of uniform convergence as a tool to explain generalization. In particular, we will show that in some example overparameterized settings, {\em any} uniform convergence bound will provide only a vacuous generalization bound. With this realization in mind, in the last part of the thesis, we will change course and introduce an {\em empirical} technique to estimate generalization using unlabeled data. Our technique does not rely on any notion of uniform-convergence-based complexity and is remarkably precise. We will theoretically show why our technique enjoys such precision. We will conclude by discussing how future work could explore novel ways to incorporate distributional assumptions in generalization bounds (such as in the form of unlabeled data) and explore other tools to derive bounds, perhaps by modifying uniform convergence or by developing completely new tools altogether.
