
We introduce a generalized additive model for location, scale, and shape (GAMLSS) next of kin aiming at distribution-free and parsimonious regression modelling for arbitrary outcomes. We replace the strict parametric distribution formulating such a model with a transformation function, which in turn is estimated from data. Doing so not only makes the model distribution-free but also allows the number of linear or smooth model terms to be limited to a pair of location-scale predictor functions. We derive the likelihood for continuous, discrete, and randomly censored observations, along with corresponding score functions. A plethora of existing algorithms is leveraged for model estimation, including constrained maximum likelihood, the original GAMLSS algorithm, and transformation trees. Parameter interpretability in the resulting models is closely connected to model selection. We propose the application of a novel best-subset selection procedure to achieve especially simple modes of interpretation. All techniques are motivated and illustrated by a collection of applications from different domains, including crossing and partial proportional hazards, complex count regression, non-linear ordinal regression, and growth curves. All analyses are reproducible with the help of the "tram" add-on package to the R system for statistical computing and graphics.
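
As a rough sketch of the model class described above (stated in generic notation, not necessarily the paper's exact parametrization), a location-scale transformation model expresses the conditional distribution function of the outcome through a monotone transformation $h$, a fixed reference distribution $F_Z$, and a pair of predictor functions:

    $$ P(Y \le y \mid \mathbf{x}) = F_Z\left( \frac{h(y) - \mu(\mathbf{x})}{\sigma(\mathbf{x})} \right), $$

where $h$ is increasing and estimated from the data, $\mu(\mathbf{x})$ is the location predictor, and $\sigma(\mathbf{x}) > 0$ is the scale predictor. Choosing $F_Z$ as the standard logistic, standard normal, or standard minimum extreme value distribution yields odds-ratio, probit-type, or hazard-ratio interpretations of the parameters, respectively.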

Related Content

The efficient representation of random fields on geometrically complex domains is crucial for Bayesian modelling in engineering and machine learning. Today's prevalent random field representations are restricted to unbounded domains or are too restrictive in terms of possible field properties. As a result, new techniques leveraging the historically established link between stochastic PDEs (SPDEs) and random fields are especially appealing for engineering applications with complex geometries that already have a finite element discretisation for solving the physical conservation equations. Unlike the dense covariance matrix of a random field, its inverse, the precision matrix, is usually sparse and equal to the stiffness matrix of a Helmholtz-like SPDE. In this paper, we use the SPDE representation to develop a scalable framework for large-scale statistical finite element analysis (statFEM) and Gaussian process (GP) regression on geometrically complex domains. We use the SPDE formulation to obtain the relevant prior probability densities with a sparse precision matrix. The properties of the priors are governed by the parameters and possibly fractional order of the Helmholtz-like SPDE, so that we can model anisotropic, non-homogeneous random fields with arbitrary smoothness on bounded domains and manifolds. We assemble the sparse precision matrix using the same finite element mesh used for solving the physical conservation equations. The observation models for statFEM and GP regression are such that the posterior probability densities are Gaussian with a closed-form mean and precision. The expressions for the mean vector and the precision matrix can be evaluated using only sparse matrix operations. We demonstrate the versatility of the proposed framework and its convergence properties with one- and two-dimensional Poisson and thin-shell examples.
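
To fix ideas, the closed-form Gaussian posterior mentioned above can be computed with sparse matrix operations only. The following is a minimal sketch, assuming a zero-mean prior u ~ N(0, Q^{-1}) with sparse precision Q (e.g. assembled from a finite element discretisation), a sparse observation operator A, and i.i.d. Gaussian noise with variance sigma2; it is not the statFEM implementation itself.

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import spsolve

    def gp_posterior(Q, A, y, sigma2):
        """Posterior mean and precision of u | y for y = A u + N(0, sigma2 I)."""
        # Posterior precision: prior precision plus data term (remains sparse).
        Q_post = (Q + A.T @ A / sigma2).tocsc()
        # Posterior mean solves Q_post m = A^T y / sigma2 (a single sparse solve).
        m = spsolve(Q_post, A.T @ y / sigma2)
        return m, Q_post

    # Toy example: a tridiagonal (Markov-like) prior precision on a 1D mesh.
    n = 200
    Q = sp.diags([-1.0, 2.2, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
    obs_idx = np.arange(0, n, 10)                       # observe every 10th node
    A = sp.csr_matrix((np.ones(obs_idx.size), (np.arange(obs_idx.size), obs_idx)),
                      shape=(obs_idx.size, n))
    y = np.sin(obs_idx / 20.0) + 0.1 * np.random.randn(obs_idx.size)
    mean, Q_post = gp_posterior(Q, A, y, sigma2=0.01)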

We introduce a new methodology, 'charcoal', for estimating the location of sparse changes in high-dimensional linear regression coefficients, without assuming that those coefficients are individually sparse. The procedure works by constructing different sketches (projections) of the design matrix at each time point, where consecutive projection matrices differ in sign in exactly one column. The sequence of sketched design matrices is then compared against a single sketched response vector to form a sequence of test statistics whose behaviour shows a surprising link to the well-known CUSUM statistics of univariate changepoint analysis. The procedure is computationally attractive, and strong theoretical guarantees are derived for its estimation accuracy. Simulations confirm that our methods perform well across a wide range of settings, and a real-world application to a large single-cell RNA sequencing dataset showcases their practical relevance.
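
For reference, the univariate CUSUM statistic that the sketched test statistics are linked to can be written down in a few lines. The sketch below shows only this classical single-changepoint building block, not the 'charcoal' procedure itself.

    import numpy as np

    def cusum_statistics(x):
        """T_t = sqrt(t (n - t) / n) * |mean(x[:t]) - mean(x[t:])| for t = 1..n-1."""
        x = np.asarray(x, dtype=float)
        n = x.size
        t = np.arange(1, n)
        csum = np.cumsum(x)[:-1]                  # partial sums of x[:t]
        left_mean = csum / t
        right_mean = (x.sum() - csum) / (n - t)
        return np.sqrt(t * (n - t) / n) * np.abs(left_mean - right_mean)

    # The estimated change location is the maximiser of the statistic.
    x = np.concatenate([np.random.randn(100), 2.0 + np.random.randn(100)])
    tau_hat = int(np.argmax(cusum_statistics(x))) + 1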

In recent years, there has been a significant growth in research focusing on minimum $\ell_2$ norm (ridgeless) interpolation least squares estimators. However, the majority of these analyses have been limited to a simple regression error structure, assuming independent and identically distributed errors with zero mean and common variance, independent of the feature vectors. Additionally, the main focus of these theoretical analyses has been on the out-of-sample prediction risk. This paper breaks away from the existing literature by examining the mean squared error of the ridgeless interpolation least squares estimator, allowing for more general assumptions about the regression errors. Specifically, we investigate the potential benefits of overparameterization by characterizing the mean squared error in a finite sample. Our findings reveal that including a large number of unimportant parameters relative to the sample size can effectively reduce the mean squared error of the estimator. Notably, we establish that the estimation difficulties associated with the variance term can be summarized through the trace of the variance-covariance matrix of the regression errors.
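
For concreteness, a minimal numerical sketch of the minimum $\ell_2$ norm (ridgeless) interpolation least squares estimator in the overparameterized regime $p > n$ is given below; the dimensions and data-generating process are illustrative only, not those studied in the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 50, 400                              # p >> n: more parameters than samples
    X = rng.standard_normal((n, p))
    beta_true = np.zeros(p)
    beta_true[:5] = 1.0
    y = X @ beta_true + 0.5 * rng.standard_normal(n)

    beta_hat = np.linalg.pinv(X) @ y            # minimum-norm solution of X beta = y
    # Equivalently, the limit of ridge regression as the penalty tends to 0+.
    assert np.allclose(X @ beta_hat, y)         # the estimator interpolates the data
    mse = np.mean((beta_hat - beta_true) ** 2)  # finite-sample estimation error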

This study develops an asymptotic theory for estimating the time-varying characteristics of locally stationary functional time series (LSFTS). We investigate a kernel-based method to estimate the time-varying covariance operator and the time-varying mean function of an LSFTS. In particular, we derive the convergence rate of the kernel estimator of the covariance operator and the associated eigenvalues and eigenfunctions, and establish a central limit theorem for the kernel-based locally weighted sample mean. As applications of our results, we discuss methods for testing the equality of time-varying mean functions in two functional samples.
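
As an illustration of the generic form of such a kernel-based locally weighted estimator (not the paper's exact estimator or assumptions), the locally weighted sample mean at rescaled time $u \in [0, 1]$ can be computed as follows.

    import numpy as np

    def local_mean(X, u, bandwidth):
        """X: (n, m) array of n curves on an m-point grid; u in [0, 1] is rescaled time."""
        n = X.shape[0]
        time = np.arange(1, n + 1) / n
        z = (time - u) / bandwidth
        w = np.where(np.abs(z) <= 1, 0.75 * (1 - z ** 2), 0.0)   # Epanechnikov kernel
        return (w[:, None] * X).sum(axis=0) / w.sum()

    # Example: 500 noisy curves whose amplitude drifts slowly over time.
    n, m = 500, 64
    grid = np.linspace(0, 1, m)
    X = np.sin(2 * np.pi * grid)[None, :] * np.linspace(0.5, 1.5, n)[:, None] \
        + 0.2 * np.random.randn(n, m)
    mu_mid = local_mean(X, u=0.5, bandwidth=0.1)    # local mean function at mid-sample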

We extend our previous work on two-party election competition [Lin, Lu & Chen 2021] to the setting of three or more parties. An election campaign among two or more parties is viewed as a game of two or more players. Each of them has its own candidates as the pure strategies to play. People, as voters, comprise supporters for each party, and a candidate brings utility for the supporters of each party. Each player nominates exactly one of its candidates to compete against those of the other parties. A candidate is assumed to win the election with higher odds if it brings more utility for all the people. The payoff of each player is the expected utility its supporters get. The game is egoistic if every candidate benefits her party's supporters more than any candidate from a competing party does. In this work, we first argue that the election game always has a pure Nash equilibrium when the winner is chosen by the hardmax function, while there exist game instances in the three-party election game such that no pure Nash equilibrium exists even when the game is egoistic. Next, we propose two sufficient conditions for the egoistic election game to have a pure Nash equilibrium. Based on these conditions, we propose a fixed-parameter tractable algorithm to compute a pure Nash equilibrium of the egoistic election game. Finally, perhaps surprisingly, we show that the price of anarchy of the egoistic election game is upper bounded by the number of parties. Our findings suggest that the election becomes unpredictable when more than two parties are involved and, moreover, that the social welfare deteriorates with the number of participating parties in terms of a possibly increasing price of anarchy. This work offers an alternative explanation of why the two-party system is prevalent in democratic countries.
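
As a point of reference, a pure Nash equilibrium of such a nomination game can always be found (or shown not to exist) by exhaustive enumeration, at a cost exponential in the number of parties; the fixed-parameter tractable algorithm above avoids this blow-up under the egoistic conditions. The sketch below is the generic brute-force check, with a hypothetical `payoff` callable returning one expected utility per party.

    from itertools import product

    def pure_nash_equilibria(candidates_per_party, payoff):
        """candidates_per_party: one list of candidates per party;
        payoff(profile): tuple of expected utilities, one per party."""
        equilibria = []
        for profile in product(*candidates_per_party):
            is_ne = True
            for i, own_candidates in enumerate(candidates_per_party):
                current = payoff(profile)[i]
                for deviation in own_candidates:
                    if deviation == profile[i]:
                        continue
                    alt = profile[:i] + (deviation,) + profile[i + 1:]
                    if payoff(alt)[i] > current:    # profitable unilateral deviation
                        is_ne = False
                        break
                if not is_ne:
                    break
            if is_ne:
                equilibria.append(profile)
        return equilibria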

Energy-Based Models (EBMs) offer a versatile framework for modeling complex data distributions. However, training and sampling from EBMs continue to pose significant challenges. The widely-used Denoising Score Matching (DSM) method for scalable EBM training suffers from inconsistency issues, causing the energy model to learn a 'noisy' data distribution. In this work, we propose an efficient sampling framework: (pseudo)-Gibbs sampling with moment matching, which enables effective sampling from the underlying clean model when given a 'noisy' model that has been well-trained via DSM. We explore the benefits of our approach compared to related methods and demonstrate how to scale the method to high-dimensional datasets.
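
For background, the denoising score matching objective referred to above can be written as a one-line loss. The sketch below (with a hypothetical `score_net` mapping a batch of inputs to estimated scores) shows only this standard training criterion, not the proposed (pseudo)-Gibbs sampler with moment matching.

    import torch

    def dsm_loss(score_net, x, sigma):
        """E || s_theta(x + sigma * eps) + eps / sigma ||^2 over a batch x."""
        eps = torch.randn_like(x)
        x_noisy = x + sigma * eps
        target = -eps / sigma                  # score of the Gaussian perturbation kernel
        return ((score_net(x_noisy) - target) ** 2).sum(dim=-1).mean()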

This paper studies the open problem of conformalized entry prediction in a row/column-exchangeable matrix. The matrix setting presents novel and unique challenges, but there exists little work on this interesting topic. We meticulously define the problem, differentiate it from closely related problems, and rigorously delineate the boundary between achievable and impossible goals. We then propose two practical algorithms. The first method provides a fast emulation of full conformal prediction, while the second method leverages the technique of algorithmic stability for acceleration. Both methods are computationally efficient and can effectively safeguard coverage validity in the presence of arbitrary missingness patterns. Further, we quantify the impact of missingness on prediction accuracy and establish fundamental limit results. Empirical evidence from synthetic and real-world data sets corroborates the superior performance of our proposed methods.
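
To fix ideas, the sketch below recalls classical split conformal prediction for scalar responses, which the methods above adapt to the row/column-exchangeable matrix setting with missing entries; it is background only, not the proposed algorithms.

    import numpy as np

    def split_conformal_interval(cal_residuals, y_pred_new, alpha=0.1):
        """(1 - alpha) prediction interval built from held-out calibration residuals."""
        r = np.sort(np.abs(cal_residuals))
        n = r.size
        k = int(np.ceil((1 - alpha) * (n + 1)))     # rank of the conformal quantile
        q = r[min(k, n) - 1]                        # falls back to the max residual if k > n
        return y_pred_new - q, y_pred_new + q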

Evolutionary symbolic regression (SR) fits a symbolic equation to data, yielding a concise, interpretable model. We explore using SR as a method to propose which data to gather in an active learning setting with physical constraints. SR with active learning proposes which experiments to do next. Active learning is done with query by committee, where the Pareto frontier of equations is the committee. The physical constraints improve the proposed equations in very low-data settings. These approaches reduce the data required for SR and achieve state-of-the-art results in the amount of data required to rediscover known equations.
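
A minimal sketch of the query-by-committee acquisition step described above is given below, with arbitrary callables standing in for the Pareto-frontier equations; it ignores the physical constraints and is not the authors' implementation.

    import numpy as np

    def next_experiment(committee, candidate_inputs):
        """committee: list of callables f(x); candidate_inputs: (k, d) array."""
        preds = np.array([[f(x) for x in candidate_inputs] for f in committee])
        disagreement = preds.var(axis=0)            # spread across committee members
        return candidate_inputs[int(np.argmax(disagreement))]

    # Toy usage with three competing symbolic hypotheses for the same data.
    committee = [lambda x: 2.0 * x[0],
                 lambda x: x[0] ** 2,
                 lambda x: 2.0 * x[0] + 0.1 * x[1]]
    candidates = np.random.uniform(-2, 2, size=(100, 2))
    x_next = next_experiment(committee, candidates)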

Directional beamforming will play a paramount role in 5G and beyond networks in order to combat the higher path losses incurred at millimeter wave bands. Appropriate modeling and analysis of the angles and distances between transmitters and receivers in these networks are thus essential to understand performance and limiting factors. Most existing literature considers either infinite and uniform networks, where nodes are drawn according to a Poisson point process, or finite networks with the reference receiver placed at the origin of a disk. Under either of these assumptions, the distance and azimuth angle between transmitter and receiver are independent, and the angle follows a uniform distribution between $0$ and $2\pi$. Here, we consider the more realistic case of finite networks where the reference node is placed at an arbitrary location. We obtain the joint distribution of the distance and azimuth angle and demonstrate that these random variables do exhibit a certain correlation, which depends on the shape of the region and the location of the reference node. To conduct the analysis, we present a general mathematical framework, which is then specialized to the case of a rectangular region. We also derive the statistics for the 3D case where, accounting for antenna heights, the joint distribution of the distance, azimuth angle, and zenith angle is obtained. Finally, we describe some immediate applications of the present work, including the analysis of directional beamforming, the design of analog codebooks, and wireless routing algorithms.
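
The dependence between distance and azimuth angle for an off-centre reference node is easy to see empirically. The following Monte Carlo sketch (illustrative only, not the closed-form derivation in the paper) samples nodes uniformly in a rectangle and compares mean distances across azimuth sectors.

    import numpy as np

    rng = np.random.default_rng(1)
    width, height = 100.0, 40.0             # rectangular region (e.g. metres)
    ref = np.array([20.0, 10.0])            # arbitrary, off-centre reference node

    nodes = rng.uniform([0.0, 0.0], [width, height], size=(100_000, 2))
    delta = nodes - ref
    distance = np.hypot(delta[:, 0], delta[:, 1])
    azimuth = np.arctan2(delta[:, 1], delta[:, 0]) % (2 * np.pi)   # in [0, 2*pi)

    # Mean distance differs sharply between azimuth sectors, so distance and
    # azimuth cannot be independent (unlike the disk-centred reference case).
    toward_far_side = azimuth < 0.1                       # roughly along +x
    toward_near_side = np.abs(azimuth - np.pi) < 0.1      # roughly along -x
    print(distance[toward_far_side].mean(), distance[toward_near_side].mean())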

The dominant NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction-based question answering, or machine translation). However, it builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and to propose benchmarks and evaluation metrics to measure its effect on current deep learning architectures. We then proceed to take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we show that these approaches yield more robust models on a selection of realistic problems. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule which alleviates catastrophic forgetting issues during adaptation.
