
The concepts of Bayesian prediction, model comparison, and model selection have developed significantly over the last decade. As a result, the Bayesian community has witnessed a rapid growth in theoretical and applied contributions to building and selecting predictive models. Projection predictive inference in particular has shown promise to this end, finding application across a broad range of fields. It is less prone to over-fitting than naïve selection based purely on cross-validation or information criteria performance metrics, and has been known to out-perform other methods in terms of predictive performance. We survey the core concept and contemporary contributions to projection predictive inference, and present a safe, efficient, and modular workflow for prediction-oriented model selection therein. We also provide an interpretation of the projected posteriors achieved by projection predictive inference in terms of their limitations in causal settings.

Related content

Modeling complex systems that consist of different types of objects leads to multilayer networks, in which vertices are connected by both inter-layer and intra-layer edges. In this paper, we investigate multiplex networks, in which vertices in different layers are identified with each other, and the only inter-layer edges are those that connect a vertex with its copy in other layers. Let the third-order adjacency tensor $\mathcal{A}\in\mathbb{R}^{N\times N\times L}$ and the parameter $\gamma\geq 0$, which is associated with the ease of communication between layers, represent a multiplex network with $N$ vertices and $L$ layers. To measure the ease of communication in a multiplex network, we focus on the average inverse geodesic length, which we refer to as the multiplex global efficiency $e_\mathcal{A}(\gamma)$ by means of the multiplex path length matrix $P\in\mathbb{R}^{N\times N}$. This paper generalizes the approach proposed in \cite{NR23} for single-layer networks. We describe an algorithm based on min-plus matrix multiplication to construct $P$, as well as variants $P^K$ that only take into account multiplex paths made up of at most $K$ intra-layer edges. These matrices are applied to detect redundant edges and to determine non-decreasing lower bounds $e_\mathcal{A}^K(\gamma)$ for $e_\mathcal{A}(\gamma)$, for $K=1,2,\dots,N-2$. Finally, the sensitivity of $e_\mathcal{A}^K(\gamma)$ to changes of the entries of the adjacency tensor $\mathcal{A}$ is investigated to determine edges that should be strengthened to enhance the multiplex global efficiency the most.
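To make the min-plus construction concrete, the following is a minimal sketch for the single-layer case only, which the abstract says the paper generalizes. It is not the paper's multiplex algorithm: the inter-layer parameter $\gamma$ is not modeled, and the function names and the edge-length matrix `W` are assumptions made for illustration. It builds matrices of shortest path lengths using at most $K$ edges and evaluates the average inverse path length for each $K$.

```python
import math

INF = math.inf  # marks "no edge" / "no path"

def min_plus(A, B):
    """Min-plus product: C[i][j] = min over k of (A[i][k] + B[k][j])."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def path_length_matrices(W):
    """Yield P^1, ..., P^(n-1); P^K[i][j] is the shortest length of a
    path from i to j that uses at most K edges (hypothetical helper)."""
    n = len(W)
    # A zero diagonal lets a path "wait", so powers cover paths of <= K edges.
    base = [[0.0 if i == j else W[i][j] for j in range(n)] for i in range(n)]
    P = base
    yield P
    for _ in range(n - 2):
        P = min_plus(P, base)
        yield P

def global_efficiency(P):
    """Average of 1 / P[i][j] over ordered pairs i != j (unreachable pairs add 0).
    Assumes strictly positive edge lengths."""
    n = len(P)
    total = sum(1.0 / P[i][j] for i in range(n) for j in range(n)
                if i != j and P[i][j] != INF)
    return total / (n * (n - 1))

# Undirected path graph 0 - 1 - 2 - 3 with unit edge lengths (assumed example).
W = [[INF, 1.0, INF, INF],
     [1.0, INF, 1.0, INF],
     [INF, 1.0, INF, 1.0],
     [INF, INF, 1.0, INF]]

effs = [global_efficiency(P) for P in path_length_matrices(W)]
print(effs)  # one value per K; each value is at least as large as the previous
```

Allowing longer paths can only shorten or preserve each entry of the path length matrix, which is why the efficiencies computed from the truncated matrices form a non-decreasing sequence of lower bounds on the full efficiency.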

Multivariate B-splines and Non-uniform rational B-splines (NURBS) lack adaptivity due to their tensor product structure. Truncated hierarchical B-splines (THB-splines) provide a solution for this. THB-splines organize the parameter space into a hierarchical structure, which enables efficient approximation and representation of functions with different levels of detail. The truncation mechanism ensures the partition of unity property of B-splines and defines a more scattered set of basis functions without overlapping on the multi-level spline space. Transferring these multi-level splines into Bézier elements representation facilitates straightforward incorporation into existing finite element (FE) codes. By separating the multi-level extraction of the THB-splines from the standard Bézier extraction, a more general independent framework applicable to any sequence of nested spaces is created. The operators for the multi-level structure of THB-splines and the operators of Bézier extraction are constructed in a local approach. Adjusting the operators for the multi-level structure from an element point of view and multiplying with the Bézier extraction operators of those elements, a direct map between Bézier elements and a hierarchical structure is obtained. The presented implementation involves the use of an open-source Octave/MATLAB isogeometric analysis (IGA) code called GeoPDEs. A basic Poisson problem is presented to investigate the performance of multi-level Bézier extraction compared to a standard THB-spline approach.

Deep neural operators (DNOs) have been utilized to approximate nonlinear mappings between function spaces. However, DNOs face the challenge of increased dimensionality and computational cost associated with unaligned observation data. In this study, we propose a hybrid Decoder-DeepONet operator regression framework to handle unaligned data effectively. Additionally, we introduce a Multi-Decoder-DeepONet, which utilizes an average field of training data as input augmentation. We establish the consistency of both frameworks with operator approximation theory on the basis of the universal approximation theorem. Two numerical experiments, a Darcy problem and the flow field around an airfoil, are conducted to validate the efficiency and accuracy of the proposed methods. Results illustrate the advantages of Decoder-DeepONet and Multi-Decoder-DeepONet in handling unaligned observation data and showcase their potential for improving prediction accuracy.

Sparse regression has emerged as a popular technique for learning dynamical systems from temporal data, beginning with the SINDy (Sparse Identification of Nonlinear Dynamics) framework proposed by arXiv:1509.03580. Quantifying the uncertainty inherent in differential equations learned from data remains an open problem; we therefore propose leveraging recent advances in statistical inference for sparse regression to address this issue. Focusing on systems of ordinary differential equations (ODEs), SINDy assumes that each equation is a parsimonious linear combination of a few candidate functions, such as polynomials, and uses methods such as sequentially-thresholded least squares or the Lasso to identify a small subset of these functions that govern the system's dynamics. We instead employ bias-corrected versions of the Lasso and ridge regression estimators, as well as an empirical Bayes variable selection technique known as SEMMS, to estimate each ODE as a linear combination of terms that are statistically significant. We demonstrate through simulations that this approach allows us to recover the functional terms that correctly describe the dynamics more often than existing methods that do not account for uncertainty.
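As a point of reference, the sketch below illustrates sequentially-thresholded least squares, the baseline selection step the abstract attributes to SINDy. It is not the bias-corrected or SEMMS-based procedure the authors propose. The helper names, the candidate library $[1, x, x^2]$, the threshold, and the toy data (generated from $\dot{x} = -2x$ with small fixed perturbations) are all assumptions chosen for illustration.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def least_squares(Theta, y, active):
    """Least squares on the active library columns, via the normal equations."""
    G = [[sum(row[a] * row[b] for row in Theta) for b in active] for a in active]
    g = [sum(row[a] * yi for row, yi in zip(Theta, y)) for a in active]
    return solve(G, g)

def stlsq(Theta, y, threshold, n_iter=10):
    """Alternate between a least-squares fit and dropping small coefficients."""
    p = len(Theta[0])
    active = list(range(p))
    for _ in range(n_iter):
        sol = least_squares(Theta, y, active)
        kept = [a for a, s in zip(active, sol) if abs(s) >= threshold]
        if kept == active:
            break
        active = kept
        if not active:
            return [0.0] * p
    coef = [0.0] * p
    for a, s in zip(active, least_squares(Theta, y, active)):
        coef[a] = s  # refit on the surviving terms only
    return coef

# Toy data: samples of x and of dx/dt = -2x plus small fixed perturbations.
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
noise = [0.01, -0.02, 0.015, -0.01, 0.005]
dxdt = [-2.0 * x + e for x, e in zip(xs, noise)]
Theta = [[1.0, x, x * x] for x in xs]  # candidate library [1, x, x^2]

coef = stlsq(Theta, dxdt, threshold=0.1)
print(coef)  # coefficients for [1, x, x^2]; thresholded terms are exactly 0.0
```

Note that the threshold is a hard cutoff with no measure of uncertainty attached to the surviving terms, which is the gap the abstract's statistical approach is meant to address.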

Although rarely stated, in practice, Grammatical Error Correction (GEC) encompasses various models with distinct objectives, ranging from grammatical error detection to improving fluency. Traditional evaluation methods fail to capture the full range of system capabilities and objectives. Reference-based evaluations are limited in how well they capture the wide variety of possible corrections, suffer from biases introduced during reference creation, and are prone to favoring fixes of local errors over overall text improvement. The emergence of large language models (LLMs) has further highlighted the shortcomings of these evaluation strategies, emphasizing the need for a paradigm shift in evaluation methodology. In the current study, we perform a comprehensive evaluation of various GEC systems using a recently published dataset of Swedish learner texts. The evaluation is performed using established evaluation metrics as well as human judges. We find that GPT-3 in a few-shot setting by far outperforms previous grammatical error correction systems for Swedish, a language comprising only 0.11% of its training data. We also find that current evaluation methods contain undesirable biases that a human evaluation is able to reveal. We suggest using human post-editing of GEC system outputs to analyze the amount of change required to reach native-level human performance on the task, and provide a dataset annotated with human post-edits and assessments of grammaticality, fluency and meaning preservation of GEC system outputs.

This paper studies model checking for general parametric regression models having no dimension reduction structures on the predictor vector. Using any U-statistic type test as an initial test, this paper combines the sample-splitting and conditional studentization approaches to construct a COnditionally Studentized Test (COST). Whether the initial test is global or local smoothing-based, and whether the dimension of the predictor vector and the number of parameters are fixed or diverge at a certain rate, the proposed test always has a normal weak limit under the null hypothesis. When the dimension of the predictor vector diverges to infinity at a faster rate than the number of parameters, or even than the sample size, these results still hold under some conditions. This shows the potential of our method to handle higher dimensional problems. Further, the test can detect local alternatives distinct from the null hypothesis at the fastest possible rate of convergence in hypothesis testing. We also discuss the optimal sample splitting in terms of power performance. The numerical studies offer information on its merits and limitations in finite sample cases, including the setting where the dimension of the predictor vector equals the sample size. As a generic methodology, it could be applied to other testing problems.

Deep generative models require large amounts of training data. This often poses a problem as the collection of datasets can be expensive and difficult, in particular datasets that are representative of the appropriate underlying distribution (e.g. demographic). This introduces biases in datasets which are further propagated in the models. We present an approach to mitigate biases in an existing generative adversarial network by rebalancing the model distribution. We do so by generating balanced data from an existing unbalanced deep generative model using latent space exploration and using this data to train a balanced generative model. Further, we propose a bias mitigation loss function that shows improvements in the fairness metric even when trained with unbalanced datasets. We show results for the StyleGAN2 models while training on the FFHQ dataset for racial fairness and see that the proposed approach improves on the fairness metric by almost 5 times, whilst maintaining image quality. We further validate our approach by applying it to an imbalanced CIFAR-10 dataset. Lastly, we argue that the traditionally used image quality metrics such as Fréchet inception distance (FID) are unsuitable for bias mitigation problems.

Latitude on the choice of initialisation is a shared feature between one-step extended state-space and multi-step methods. The paper focuses on lattice Boltzmann schemes, which can be interpreted as examples of both previous categories of numerical schemes. We propose a modified equation analysis of the initialisation schemes for lattice Boltzmann methods, determined by the choice of initial data. These modified equations provide guidelines to devise and analyze the initialisation in terms of order of consistency with respect to the target Cauchy problem and time smoothness of the numerical solution. In detail, the larger the number of matched terms between modified equations for initialisation and bulk methods, the smoother the obtained numerical solution. This is particularly manifest for numerical dissipation. Starting from the constraints to achieve time smoothness, which can quickly become prohibitive for they have to take the parasitic modes into consideration, we explain how the distinct lack of observability for certain lattice Boltzmann schemes -- seen as dynamical systems on a commutative ring -- can yield rather simple conditions and be easily studied as far as their initialisation is concerned. This comes from the reduced number of initialisation schemes at the fully discrete level. These theoretical results are successfully assessed on several lattice Boltzmann methods.

Confidence intervals based on the central limit theorem (CLT) are a cornerstone of classical statistics. Despite being only asymptotically valid, they are ubiquitous because they permit statistical inference under very weak assumptions, and can often be applied to problems even when nonasymptotic inference is impossible. This paper introduces time-uniform analogues of such asymptotic confidence intervals. To elaborate, our methods take the form of confidence sequences (CS) -- sequences of confidence intervals that are uniformly valid over time. CSs provide valid inference at arbitrary stopping times, incurring no penalties for "peeking" at the data, unlike classical confidence intervals which require the sample size to be fixed in advance. Existing CSs in the literature are nonasymptotic, and hence do not enjoy the aforementioned broad applicability of asymptotic confidence intervals. Our work bridges the gap by giving a definition for "asymptotic CSs", and deriving a universal asymptotic CS that requires only weak CLT-like assumptions. While the CLT approximates the distribution of a sample average by that of a Gaussian at a fixed sample size, we use strong invariance principles (stemming from the seminal 1960s work of Strassen and improvements by Komlós, Major, and Tusnády) to uniformly approximate the entire sample average process by an implicit Gaussian process. As an illustration of our theory, we derive asymptotic CSs for the average treatment effect using efficient estimators in observational studies (for which no nonasymptotic bounds can exist even in the fixed-time regime) as well as randomized experiments, enabling causal inference that can be continuously monitored and adaptively stopped.

The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.
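The claim that gradient methods lead to minimal norm interpolating solutions can be illustrated in the simplest possible setting. The sketch below is an assumed toy example, not taken from the survey: an underdetermined linear regression with two data points and three parameters, where plain gradient descent on the squared loss is started from zero. Because every gradient lies in the span of the data rows, the iterates stay in that span and converge to the interpolant of smallest Euclidean norm.

```python
def gradient_descent(X, y, lr=0.1, steps=500):
    """Minimise 0.5 * ||X w - y||^2 by gradient descent, starting from w = 0."""
    d = len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        residual = [sum(xi * wi for xi, wi in zip(row, w)) - yi
                    for row, yi in zip(X, y)]
        # Gradient is X^T (X w - y), a combination of the data rows.
        grad = [sum(r * row[j] for r, row in zip(residual, X)) for j in range(d)]
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

# Two equations, three unknowns: infinitely many exact fits exist.
X = [[1.0, 0.0, 1.0],
     [0.0, 2.0, 0.0]]
y = [2.0, 4.0]

w_gd = gradient_descent(X, y)

# For comparison: w = (2, 2, 0) also fits the data exactly,
# but it has a larger norm than the minimum-norm solution (1, 2, 1).
w_other = [2.0, 2.0, 0.0]

def sq_norm(v):
    return sum(vi * vi for vi in v)

print(w_gd, sq_norm(w_gd), sq_norm(w_other))
```

Here the minimum-norm interpolant can be checked by hand as $X^\top (X X^\top)^{-1} y = (1, 2, 1)$, since $X X^\top$ is diagonal. Starting from a nonzero initial point would add a component orthogonal to the rows of $X$ that gradient descent never removes, so the zero initialisation matters in this sketch.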
