
Classical statistical methods have theoretical justification when the sample size is predetermined. In applications, however, sample sizes often aren't predetermined; instead, they're data-dependent. Since methods designed for static sample sizes aren't reliable when sample sizes are dynamic, there's been recent interest in e-processes and corresponding tests and confidence sets that are anytime valid in the sense that their justification holds up for arbitrary dynamic data-collection plans. If the investigator has relevant-yet-incomplete prior information about the quantity of interest, then there's an opportunity for efficiency gain, but existing approaches can't accommodate this. The present paper offers a new, regularized e-process framework that features a knowledge-based, imprecise-probabilistic regularization with improved efficiency. A generalized version of Ville's inequality is established, ensuring that inference based on the regularized e-process remains anytime valid in a novel, knowledge-dependent sense. In addition, the proposed regularized e-processes facilitate possibility-theoretic uncertainty quantification with strong frequentist-like calibration properties and other desirable Bayesian-like features: the framework satisfies the likelihood principle, avoids sure loss, and offers formal decision-making with reliability guarantees.
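To make the notion of anytime validity concrete, here is a minimal sketch of the classical (unregularized) setting that the abstract builds on. It is not the paper's regularized construction. It assumes a simple Bernoulli null `p0` tested against a fixed alternative `p1`, uses the running likelihood ratio as the e-process, and checks by simulation that the probability of ever exceeding `1 / alpha` under the null stays near or below `alpha`, as the standard Ville's inequality guarantees. The function names and parameter values are illustrative assumptions.

```python
import random


def lr_e_process(xs, p0, p1):
    """Running likelihood ratio of Bernoulli(p1) against Bernoulli(p0).

    Under the null p0 this is a nonnegative martingale starting at 1,
    which makes it a (standard, unregularized) e-process.
    """
    e, path = 1.0, []
    for x in xs:
        e *= (p1 if x else 1 - p1) / (p0 if x else 1 - p0)
        path.append(e)
    return path


def crossing_rate(p_true, p0, p1, alpha, horizon, n_paths, seed=0):
    """Fraction of simulated paths whose e-process ever reaches 1 / alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        xs = [rng.random() < p_true for _ in range(horizon)]
        if max(lr_e_process(xs, p0, p1)) >= 1 / alpha:
            hits += 1
    return hits / n_paths
```

Calling `crossing_rate(0.5, 0.5, 0.7, 0.05, 200, 2000)` estimates the false-rejection rate when the null is true, regardless of when the investigator chooses to stop; calling it with `p_true=0.7` estimates how often the same rule detects the alternative within the horizon.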

Related content

Subclasses of TFNP (total functional NP) are usually defined by specifying a complete problem, which is necessarily in TFNP, and including all problems many-one reducible to it. We study two notions of how a TFNP problem can be reducible to an object, such as a complexity class, outside TFNP. This gives rise to subclasses of TFNP which capture some properties of that outside object. We show that well-known subclasses can arise in this way, for example PPA from reducibility to parity P and PLS from reducibility to P^NP. We study subclasses arising from PSPACE and the polynomial hierarchy, and show that they are characterized by the propositional proof systems Frege and constant-depth Frege, extending the known pairings between natural TFNP subclasses and proof systems. We study approximate counting from this point of view, and look for a subclass of TFNP that gives a natural home to combinatorial principles such as Ramsey which can be proved using approximate counting. We relate this to the recently-studied Long choice and Short choice problems.

Studying unified model averaging estimation for situations with complicated data structures, we propose a novel model averaging method based on cross-validation (MACV). MACV unifies a large class of new and existing model averaging estimators and covers a very general class of loss functions. Furthermore, to reduce the computational burden caused by the conventional leave-subject/one-out cross validation, we propose a SEcond-order-Approximated Leave-one/subject-out (SEAL) cross validation, which largely improves the computation efficiency. In the context of non-independent and non-identically distributed random variables, we establish the unified theory for analyzing the asymptotic behaviors of the proposed MACV and SEAL methods, where the number of candidate models is allowed to diverge with sample size. To demonstrate the breadth of the proposed methodology, we exemplify four optimal model averaging estimators under four important situations, i.e., longitudinal data with discrete responses, within-cluster correlation structure modeling, conditional prediction in spatial data, and quantile regression with a potential correlation structure. We conduct extensive simulation studies and analyze real-data examples to illustrate the advantages of the proposed methods.

Gradient descent is one of the most widely used iterative algorithms in modern statistical learning. However, its precise algorithmic dynamics in high-dimensional settings remain only partially understood, which has therefore limited its broader potential for statistical inference applications. This paper provides a precise, non-asymptotic distributional characterization of gradient descent iterates in a broad class of empirical risk minimization problems, in the so-called mean-field regime where the sample size is proportional to the signal dimension. Our non-asymptotic state evolution theory holds for both general non-convex loss functions and non-Gaussian data, and reveals the central role of two Onsager correction matrices that precisely characterize the non-trivial dependence among all gradient descent iterates in the mean-field regime. Although the Onsager correction matrices are typically analytically intractable, our state evolution theory facilitates a generic gradient descent inference algorithm that consistently estimates these matrices across a broad class of models. Leveraging this algorithm, we show that the state evolution can be inverted to construct (i) data-driven estimators for the generalization error of gradient descent iterates and (ii) debiased gradient descent iterates for inference of the unknown signal. Detailed applications to two canonical models--linear regression and (generalized) logistic regression--are worked out to illustrate model-specific features of our general theory and inference methods.

Dynamic event prediction, using joint modeling of survival time and longitudinal variables, is extremely useful in personalized medicine. However, the estimation of joint models including many longitudinal markers is still a computational challenge because of the high number of random effects and parameters to be estimated. In this paper, we propose a model averaging strategy to combine predictions from several joint models for the event, including one longitudinal marker only or pairwise longitudinal markers. The prediction is computed as the weighted mean of the predictions from the one-marker or two-marker models, with the time-dependent weights estimated by minimizing the time-dependent Brier score. This method enables us to combine a large number of predictions issued from joint models to achieve a reliable and accurate individual prediction. Advantages and limits of the proposed methods are highlighted in a simulation study by comparison with the predictions from well-specified and misspecified all-marker joint models as well as the one-marker and two-marker joint models. Using the PBC2 data set, the method is used to predict the risk of death in patients with primary biliary cirrhosis. The method is also used to analyze a French cohort study called the 3C data. In our study, seventeen longitudinal markers are considered to predict the risk of death.

Two sequential estimators are proposed for the odds p/(1-p) and log odds log(p/(1-p)) respectively, using independent Bernoulli random variables with parameter p as inputs. The estimators are unbiased, and guarantee that the variance of the estimation error divided by the true value of the odds, or the variance of the estimation error of the log odds, are less than a target value for any p in (0,1). The estimators are close to optimal in the sense of Wolfowitz's bound.
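As a point of reference for what a sequential, unbiased odds estimator can look like, the sketch below uses classical inverse binomial sampling rather than the estimators proposed in the abstract, whose details are not given here. It draws Bernoulli(p) inputs until `r` failures have been seen; the number of successes observed up to that point is negative binomial with mean r·p/(1-p), so dividing by `r` gives an unbiased estimate of the odds. The function name and the `draw` callback are hypothetical.

```python
import random


def odds_inverse_binomial(draw, r):
    """Estimate p / (1 - p) by sampling until r failures occur.

    `draw` is a zero-argument callable returning a truthy value for a
    success and a falsy value for a failure. The sample size is random:
    it depends on the data, which is what makes the scheme sequential.
    """
    successes = 0
    failures = 0
    while failures < r:
        if draw():
            successes += 1
        else:
            failures += 1
    return successes / r


def simulate_mean(p, r, n_rep, seed=0):
    """Average of n_rep independent estimates, for checking unbiasedness."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rep):
        total += odds_inverse_binomial(lambda: rng.random() < p, r)
    return total / n_rep
```

For example, `simulate_mean(0.6, 10, 5000)` should land close to the true odds 0.6 / 0.4 = 1.5. Unlike the estimators in the abstract, this simple scheme does not by itself guarantee a target relative variance uniformly over p.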

In this paper, we provide a mathematical framework for improving generalization in a class of learning problems related to point estimation for modeling high-dimensional nonlinear functions. In particular, we consider a variational problem for a weakly-controlled gradient system, whose control input enters the system dynamics as a coefficient of a nonlinear term scaled by a small parameter. Here, the optimization problem consists of a cost functional, which gauges the quality of the estimated model parameters at a fixed final time with respect to the model-validation dataset, subject to the weakly-controlled gradient system, whose time evolution is guided by the model-training dataset and its perturbed version with small random noise. Using perturbation theory, we provide results that allow us to solve a sequence of decomposed optimization problems and to aggregate the corresponding approximate optimal solutions, which are reasonably sufficient for improving generalization in such a class of learning problems. Moreover, we provide an estimate for the rate of convergence of these approximate optimal solutions. Finally, we present some numerical results for a typical nonlinear regression problem.

The computation of integrals is a fundamental task in the analysis of functional data, which are typically considered as random elements in a space of squared integrable functions. Borrowing ideas from recent advances in the Monte Carlo integration literature, we propose effective unbiased estimation and inference procedures for integrals of uni- and multivariate random functions. Several applications to key problems in functional data analysis (FDA) involving random design points are studied and illustrated. In the absence of noise, the proposed estimates converge faster than the sample mean and the usual algorithms for numerical integration. Moreover, the proposed estimator facilitates effective inference by generally providing better coverage with shorter confidence and prediction intervals, in both noisy and noiseless setups.

Ideally, all analyses of normally distributed data should include the full covariance information between all data points. In practice, the full covariance matrix between all data points is not always available, either because a result was published without a covariance matrix or because one tries to combine multiple results from separate publications. For simple hypothesis tests, it is possible to define robust test statistics that behave conservatively in the presence of unknown correlations. For model parameter fits, one can inflate the variance by a factor to ensure that results remain conservative at least up to a chosen confidence level. This paper describes a class of robust test statistics for simple hypothesis tests, as well as an algorithm to determine the necessary inflation factor for model parameter fits, goodness-of-fit tests, and composite hypothesis tests. It then presents some example applications of the methods to real neutrino interaction data and model comparisons.

Divide and recombine (D&R) is a statistical approach designed to handle large and complex datasets. It partitions the dataset into several manageable subsets and applies the analytic method to each subset independently. The results from each subset are then combined to yield the results for the entire dataset. D&R strategies can be used to fit generalized linear models (GLMs) to datasets too large for conventional methods. Several D&R strategies are available for different GLMs, some of which are theoretically justified but lack practical validation. A significant limitation is the lack of theoretical and practical justification for estimating combined standard errors and confidence intervals. This paper reviews D&R strategies for GLMs and proposes a method to determine the combined standard error for D&R-based estimators. In addition to the traditional dataset division procedures, we propose a different division method, named sequential partitioning, for D&R-based estimators on GLMs. We show that the obtained D&R estimator with the proposed standard error attains the same efficiency as the full-data estimate. We illustrate this on a large synthetic dataset and verify that the results from D&R are accurate and identical to those from other available R packages.
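The divide, fit, and recombine steps can be illustrated with a deliberately small sketch. It assumes the simplest possible GLM, an intercept-only Poisson model with log link, and combines subset estimates by inverse-variance weighting; this is one common recombination rule, not necessarily the one proposed in the paper, and all function names are hypothetical. For this model the maximum likelihood estimate is the log of the sample mean and the Fisher information at the estimate equals the sum of the counts.

```python
import math


def fit_poisson_intercept(ys):
    """MLE and Fisher information for an intercept-only Poisson GLM (log link).

    Requires sum(ys) > 0 so that the log of the mean is defined.
    """
    total = sum(ys)
    beta = math.log(total / len(ys))
    info = total  # n * exp(beta) evaluated at the MLE
    return beta, info


def divide_and_recombine(subsets):
    """Fit each subset separately, then combine by inverse-variance weights."""
    fits = [fit_poisson_intercept(ys) for ys in subsets]
    total_info = sum(info for _, info in fits)
    beta = sum(b * info for b, info in fits) / total_info
    se = 1 / math.sqrt(total_info)
    return beta, se


def full_data_fit(subsets):
    """Reference fit on the pooled data, for comparison."""
    pooled = [y for ys in subsets for y in ys]
    beta, info = fit_poisson_intercept(pooled)
    return beta, 1 / math.sqrt(info)
```

In this toy case the combined standard error coincides exactly with the full-data standard error, because the subset informations add up to the pooled information, while the combined point estimate is close to but not identical with the pooled estimate. Models with covariates need matrix-valued information and a numerical fitting routine, which this sketch omits.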

We present a novel computational framework to assess the structural integrity of welds. In the first stage of the simulation framework, local fractions of microstructural constituents within weld regions are predicted based on steel composition and welding parameters. The resulting phase fraction maps are used to define heterogeneous properties that are subsequently employed in structural integrity assessments using an elastoplastic phase field fracture model. The framework is particularised to predicting failure in hydrogen pipelines, demonstrating its potential to assess the feasibility of repurposing existing pipeline infrastructure to transport hydrogen. First, the process model is validated against experimental microhardness maps for vintage and modern pipeline welds. Additionally, the influence of welding conditions on hardness and residual stresses is investigated, demonstrating that variations in heat input, filler material composition, and weld bead order can significantly affect the properties within the weld region. Coupled hydrogen diffusion-fracture simulations are then conducted to determine the critical pressure at which hydrogen transport pipelines will fail. To this end, the model is enriched with a microstructure-sensitive description of hydrogen transport and hydrogen-dependent fracture resistance. The analysis of an X52 pipeline reveals that even 2 mm defects in a hard heat-affected zone can drastically reduce the critical failure pressure.
