
Regression models that ignore measurement error in predictors may produce highly biased estimates, leading to erroneous inferences. It is well known that taking measurement error into account is extremely difficult even in Gaussian nonparametric regression, and the problem becomes considerably harder for other families such as logistic, Poisson, and negative binomial regression. We present, for the first time, a method that corrects for measurement error when estimating regression functions flexibly, covering virtually all distributions and link functions regularly considered in generalized linear models. The approach rests on approximating the first and second moments of the response after integrating out the true, unobserved predictors in a semiparametric generalized linear model. Unlike previous methods, it is not restricted to truncated splines and can utilize various basis functions. Through extensive simulation studies, we evaluate the performance of our method under many scenarios.
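
To make the moment-approximation idea concrete, here is a minimal sketch (not the authors' estimator): it assumes a Bernoulli response with logistic link, classical measurement error W = X + U with U ~ N(0, sigma_u^2), an assumed standard normal distribution for the true predictor X, a B-spline representation of the regression function, and simple Monte Carlo integration over X given the observed W. All names and distributional choices are illustrative assumptions.

import numpy as np
from scipy.interpolate import BSpline
from scipy.special import expit  # logistic inverse link

def moments_given_w(w, beta, knots, degree, sigma_u, n_mc=2000, rng=None):
    """Approximate E[Y | W = w] and E[Y^2 | W = w] for a Bernoulli response with
    logistic link and semiparametric predictor eta(x) = sum_j beta_j B_j(x),
    under classical measurement error W = X + U, U ~ N(0, sigma_u^2), and an
    assumed X ~ N(0, 1).  The unobserved true predictor X is integrated out by
    simple Monte Carlo from its conditional distribution given W."""
    rng = np.random.default_rng() if rng is None else rng
    # normal-normal conjugacy: X | W = w is normal
    post_var = 1.0 / (1.0 + 1.0 / sigma_u ** 2)
    post_mean = post_var * w / sigma_u ** 2
    x = rng.normal(post_mean, np.sqrt(post_var), size=n_mc)
    spline = BSpline(knots, beta, degree, extrapolate=True)  # eta(x)
    mu = expit(spline(x))            # E[Y | X] for a Bernoulli response
    m1 = mu.mean()                   # approximates E[Y | W = w]
    m2 = mu.mean()                   # for a 0/1 response, E[Y^2 | W] = E[Y | W]
    return m1, m2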

Related content

Generative linguistic steganography mainly utilizes language models and applies steganographic sampling (stegosampling) to generate high-security steganographic text (stegotext). However, previous methods generally lead to statistical differences between the conditional probability distributions of stegotext and natural text, which poses security risks. In this paper, to further ensure security, we present ADG, a novel provably secure generative linguistic steganographic method that recursively embeds secret information by Adaptive Dynamic Grouping of tokens according to their probability given by an off-the-shelf language model. We not only prove the security of ADG mathematically but also conduct extensive experiments on three public corpora to further verify its imperceptibility. The experimental results show that the proposed method is able to generate stegotext with nearly perfect security.
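
The grouping idea can be illustrated with a simplified sketch (not the exact ADG recursion, which groups adaptively and recursively): at each generation step the vocabulary is split into groups of roughly equal total probability, the next secret bits select a group, and a token is sampled from the chosen group with renormalized probabilities. Function and parameter names are illustrative.

import numpy as np

def steganographic_sampling_step(probs, secret_bits, n_bits=2, rng=None):
    """One illustrative stegosampling step: partition the vocabulary into
    2**n_bits groups of roughly equal total probability (greedy balancing),
    let the next n_bits of the secret message choose a group, then sample a
    token from that group with renormalized probabilities."""
    rng = np.random.default_rng() if rng is None else rng
    order = np.argsort(probs)[::-1]            # most probable tokens first
    n_groups = 2 ** n_bits
    groups = [[] for _ in range(n_groups)]
    mass = np.zeros(n_groups)
    for tok in order:                          # assign each token to the lightest group
        g = int(np.argmin(mass))
        groups[g].append(tok)
        mass[g] += probs[tok]
    g = int("".join(map(str, secret_bits[:n_bits])), 2)   # secret bits -> group index
    members = np.array(groups[g])
    p = probs[members] / probs[members].sum()
    token = int(rng.choice(members, p=p))
    return token, secret_bits[n_bits:]         # remaining bits still to embed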

In this paper, we propose a novel sparse recovery method based on the generalized error function. The penalty function introduced involves both a shape and a scale parameter, making it very flexible. Theoretical results in terms of the null space property, the spherical section property, and the restricted invertibility factor are established for both constrained and unconstrained models. Practical algorithms based on both the iteratively reweighted $\ell_1$ scheme and the difference-of-convex-functions algorithm are presented. Numerical experiments illustrate the improvement provided by the proposed approach in various scenarios, and its practical application to magnetic resonance imaging (MRI) reconstruction is studied as well.
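
A hedged sketch of the iteratively reweighted $\ell_1$ variant is given below. The weight form exp(-(|x|/sigma)^p) is an assumed stand-in for the slope of a generalized-error-function-type penalty, and the unconstrained least-squares formulation with an ISTA inner solver is chosen for brevity rather than taken from the paper.

import numpy as np

def irl1_sparse_recovery(A, b, sigma=1.0, p=2.0, lam=0.1,
                         n_outer=10, n_inner=200):
    """Iteratively reweighted l1 for min 0.5*||Ax - b||^2 + lam * sum_i pen(|x_i|),
    where the penalty's derivative is taken as exp(-(|x|/sigma)**p) (an assumed
    form for illustration).  Each outer iteration solves the resulting
    weighted-l1 problem approximately by ISTA (soft-thresholded gradient steps)."""
    m, n = A.shape
    x = np.zeros(n)
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / Lipschitz constant of the gradient
    for _ in range(n_outer):
        w = np.exp(-(np.abs(x) / sigma) ** p)     # reweighting from the penalty slope
        for _ in range(n_inner):
            z = x - step * (A.T @ (A @ x - b))    # gradient step on the data-fit term
            x = np.sign(z) * np.maximum(np.abs(z) - step * lam * w, 0.0)  # weighted soft-threshold
    return x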

Statisticians often face the choice between using probability models and a paradigm defined by minimising a loss function. Both approaches are useful and, if the loss can be re-cast into a proper probability model, there are many tools to decide which model or loss is more appropriate for the observed data, in the sense of explaining the data's nature. However, when the loss leads to an improper model, there are no principled ways to guide this choice. We address this task by combining the Hyvärinen score, which naturally targets infinitesimal relative probabilities, with general Bayesian updating, which provides a unifying framework for inference on losses and models. Specifically, we propose the H-score, a general Bayesian selection criterion, and prove that it consistently selects the (possibly improper) model closest to the data-generating truth in Fisher's divergence. We also prove that an associated H-posterior consistently learns optimal hyper-parameters featuring in loss functions, including a challenging tempering parameter in generalised Bayesian inference. As salient examples, we consider robust regression and non-parametric density estimation, where popular loss functions define improper models for the data and hence cannot be handled with standard model selection tools. These examples illustrate advantages in robustness-efficiency trade-offs and provide a Bayesian implementation of kernel density estimation, opening a new avenue for Bayesian non-parametrics.
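
For intuition, the Hyvärinen score of a univariate Gaussian working model has a simple closed form; the sketch below computes only this pointwise score (not the full H-score criterion, which aggregates such terms under general Bayesian updating). It illustrates why the score needs the density only up to a normalising constant.

import numpy as np

def hyvarinen_score_gaussian(y, mu, sigma2):
    """Hyvarinen score sum_i [ 2 * d^2/dy^2 log p(y_i) + (d/dy log p(y_i))^2 ]
    for a univariate N(mu, sigma2) working model; lower is better.  Both terms
    involve only derivatives of log p, so the normalising constant cancels."""
    grad_logp = -(y - mu) / sigma2       # d/dy log p(y)
    lap_logp = -1.0 / sigma2             # d^2/dy^2 log p(y)
    return np.sum(2.0 * lap_logp + grad_logp ** 2)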

Neural Stochastic Differential Equations (NSDEs) model the drift and diffusion functions of a stochastic process as neural networks. While NSDEs are known to predict time series accurately, their uncertainty quantification properties remain unexplored. Currently, there are no approximate inference methods that allow flexible models while providing high-quality uncertainty estimates at a reasonable computational cost. Existing SDE inference methods either make overly restrictive assumptions, e.g. linearity, or rely on Monte Carlo integration that requires many samples at prediction time for reliable uncertainty quantification. However, many real-world safety-critical applications necessitate highly expressive models that can quantify prediction uncertainty at affordable computational cost. We introduce a variational inference scheme that approximates the posterior distribution of an NSDE governing a latent state space by a deterministic chain of operations. We approximate the intractable data-fit term of the evidence lower bound by a novel bidimensional moment matching algorithm: vertical along the neural net layers and horizontal along the time direction. Our algorithm achieves uncertainty calibration scores that sampling-based counterparts can match only at significantly higher computational cost, while providing equally accurate forecasts of the system dynamics.
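
As a rough illustration of the horizontal (time-direction) part of the moment matching, the sketch below propagates the mean and variance of a one-dimensional latent SDE deterministically through Euler-Maruyama steps, using a finite-difference linearization of the drift. The vertical, layer-wise matching through the drift network is not reproduced; all details are illustrative assumptions rather than the paper's algorithm.

import numpy as np

def match_moments_em(f, g, m0, v0, dt, n_steps, eps=1e-4):
    """Deterministic propagation of the first two moments of dz = f(z) dt + g(z) dW
    through Euler-Maruyama steps: m' = m + f(m) dt and v' = v + (2 f'(m) v + g(m)^2) dt,
    with f'(m) approximated by a central finite difference."""
    m, v = float(m0), float(v0)
    means, variances = [m], [v]
    for _ in range(n_steps):
        fprime = (f(m + eps) - f(m - eps)) / (2 * eps)   # local linearization of the drift
        m_new = m + f(m) * dt
        v_new = v + (2.0 * fprime * v + g(m) ** 2) * dt
        m, v = m_new, v_new
        means.append(m)
        variances.append(v)
    return np.array(means), np.array(variances)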

We propose the approximate Laplace approximation (ALA) to evaluate integrated likelihoods, a bottleneck in Bayesian model selection. The Laplace approximation (LA) is a popular tool that speeds up such computation and enjoys strong model selection properties. However, when the sample size is large or one considers many models, the cost of the required optimizations becomes impractical. ALA reduces the cost to that of solving a least-squares problem for each model. Further, it enables efficient computation across models, such as sharing pre-computed sufficient statistics and certain operations in matrix decompositions. We prove that in generalized (possibly non-linear) models ALA achieves a strong form of model selection consistency for a suitably-defined optimal model, at the same functional rates as exact computation. We consider fixed- and high-dimensional problems, group and hierarchical constraints, and the possibility that all models are misspecified. We also obtain ALA rates for Gaussian regression under non-local priors, an important example where the LA can be costly and does not consistently estimate the integrated likelihood. Our examples include non-linear regression, logistic, Poisson, and survival models. We implement the methodology in the R package mombf.
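
The following sketch conveys the flavour of the idea for logistic regression under an assumed Gaussian N(0, tau I) prior: the log-likelihood is expanded to second order at beta = 0, so the integrated likelihood reduces to a closed-form Gaussian integral and no per-model optimization is required. The expansion point, prior, and function names are illustrative assumptions; the R package mombf implements the actual methodology.

import numpy as np

def ala_log_marginal_logistic(X, y, tau=1.0):
    """Approximate-Laplace-style log integrated likelihood (sketch) for logistic
    regression with a N(0, tau*I) prior.  Expanding the log-likelihood at
    beta0 = 0 gives exp(l(0) + g'beta - 0.5 beta'(-H)beta), whose integral
    against the Gaussian prior is available in closed form."""
    n, p = X.shape
    mu0 = 0.5 * np.ones(n)                        # fitted probabilities at beta0 = 0
    loglik0 = np.sum(y * np.log(mu0) + (1 - y) * np.log(1 - mu0))
    g = X.T @ (y - mu0)                           # score at beta0
    W = mu0 * (1 - mu0)                           # logistic weights
    H = -(X.T * W) @ X                            # Hessian at beta0
    A = -H + np.eye(p) / tau                      # curvature plus prior precision
    V = np.linalg.inv(A)
    _, logdetV = np.linalg.slogdet(V)
    return loglik0 + 0.5 * g @ V @ g + 0.5 * logdetV - 0.5 * p * np.log(tau)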

Can two separate case-control studies, for example one on hepatitis and the other on fibrosis, be combined? It would be hugely beneficial if two or more separately conducted case-control studies, even ones designed for entirely unrelated purposes, could be merged into a unified analysis with better statistical properties, e.g., more accurate estimation of parameters. In this paper, we show that, when using the popular logistic regression model, the combined/integrative analysis produces more accurate estimates of the slope parameters than either single case-control study. It is known that, in a single logistic case-control study, the intercept is not identifiable, in contrast to prospective studies. In combined case-control studies, however, the intercepts are shown to be identifiable under mild conditions. The resulting maximum likelihood estimates of the intercepts and slopes are proved to be consistent and asymptotically normal, with asymptotic variances achieving the semiparametric efficiency lower bound.
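
A minimal sketch of the pooled-likelihood idea is shown below: the two case-control samples are stacked and a single logistic model with study-specific intercepts and shared slopes is fitted. It assumes the statsmodels package and illustrates only the point estimation, not the identifiability or efficiency theory developed in the paper.

import numpy as np
import statsmodels.api as sm

def combined_case_control_fit(x1, y1, x2, y2):
    """Pool two case-control samples and fit one logistic regression with a
    study-specific intercept shift and shared slope coefficients."""
    x = np.vstack([x1, x2])
    y = np.concatenate([y1, y2])
    study2 = np.concatenate([np.zeros(len(y1)), np.ones(len(y2))])   # study indicator
    design = np.column_stack([np.ones(len(y)), study2, x])           # intercept, shift, covariates
    return sm.Logit(y, design).fit(disp=0)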

In this paper, we focus on variable selection techniques for a class of semiparametric spatial regression models that allow one to study the effects of explanatory variables in the presence of spatial information. The spatial smoothing problem in the nonparametric part is tackled by means of bivariate splines over a triangulation, which can efficiently handle data distributed over irregularly shaped regions. In addition, we develop a unified variable selection procedure to identify significant covariates under a double penalization framework, and we show that the penalized estimators enjoy the "oracle" property. The proposed method can simultaneously identify non-zero spatially distributed covariates and solve the problem of "leakage" across complex domains of the functional spatial component. To estimate the standard deviations of the proposed coefficient estimators, a sandwich formula is developed as well. Finally, Monte Carlo simulation examples and a real data example are provided to illustrate the proposed methodology. All technical proofs are given in the supplementary materials.
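
For concreteness, a generic double-penalized criterion of this kind can be written as follows, where the notation is assumed for illustration and need not match the paper exactly: $\mathbf{X}_i$ are the covariates with coefficients $\boldsymbol\beta$, $\mathbf{B}(S_i)$ is the bivariate spline basis evaluated at spatial location $S_i$ with coefficients $\boldsymbol\gamma$, $\mathbf{P}$ is a roughness penalty matrix over the triangulation, and $p_{\lambda_2}$ is a sparsity-inducing penalty such as SCAD:

$$
\min_{\boldsymbol\beta,\,\boldsymbol\gamma}\;\sum_{i=1}^{n}\Big\{Y_i-\mathbf{X}_i^{\top}\boldsymbol\beta-\mathbf{B}(S_i)^{\top}\boldsymbol\gamma\Big\}^2
\;+\;\lambda_1\,\boldsymbol\gamma^{\top}\mathbf{P}\,\boldsymbol\gamma
\;+\;n\sum_{j=1}^{p}p_{\lambda_2}\!\big(|\beta_j|\big).
$$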

In this paper, we study the estimation of partially linear models for spatial data distributed over complex domains. We use bivariate splines over triangulations to represent the nonparametric component on an irregular two-dimensional domain. The proposed method is formulated as a constrained minimization problem which does not require constructing finite elements or locally supported basis functions. Thus, it allows an easier implementation of piecewise polynomial representations of various degrees and smoothness over an arbitrary triangulation. Moreover, the constrained minimization problem is converted into an unconstrained one via a QR decomposition of the smoothness constraints, which allows for the development of a fast and efficient penalized least squares algorithm to fit the model. The estimators of the parameters are proved to be asymptotically normal under some regularity conditions. The estimator of the bivariate function is consistent, and its rate of convergence is established as well. The proposed method enables us to construct confidence intervals and perform inference for the parameters. The performance of the estimators is evaluated by two simulation examples and a real data analysis.
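
The QR-based conversion to an unconstrained problem can be sketched as follows: with smoothness constraints H b = 0, a complete QR decomposition of H^T yields an orthonormal basis Z of the null space of H, and writing b = Z theta removes the constraints. This is a minimal sketch with assumed names, not the authors' implementation.

import numpy as np

def nullspace_reparam(H):
    """Given smoothness constraints H b = 0, return an orthonormal basis Z of
    null(H) from a complete QR decomposition of H^T; every feasible coefficient
    vector can then be written b = Z @ theta, so the constraints can be dropped."""
    Q, _ = np.linalg.qr(H.T, mode='complete')
    r = np.linalg.matrix_rank(H)
    return Q[:, r:]                     # columns spanning the null space of H

# usage sketch: with spline design B, roughness penalty P, smoothing parameter lam,
#   Z = nullspace_reparam(H)
#   theta_hat = np.linalg.solve(Z.T @ (B.T @ B + lam * P) @ Z, Z.T @ B.T @ y)
#   b_hat = Z @ theta_hat               # unconstrained penalized least squares, mapped back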

The early phase of training of deep neural networks has a dramatic effect on the local curvature of the loss function. For instance, using a small learning rate does not guarantee stable optimization because the optimization trajectory has a tendency to steer towards regions of the loss surface with increasing local curvature. We ask whether this tendency is connected to the widely observed phenomenon that the choice of the learning rate strongly influences generalization. We first show that stochastic gradient descent (SGD) implicitly penalizes the trace of the Fisher Information Matrix (FIM), a measure of the local curvature, from the beginning of training. We argue it is an implicit regularizer in SGD by showing that explicitly penalizing the trace of the FIM can significantly improve generalization. We highlight that poor final generalization coincides with the trace of the FIM increasing to a large value early in training, to which we refer as catastrophic Fisher explosion. Finally, to gain insight into the regularization effect of penalizing the trace of the FIM, we show that it limits memorization by reducing the learning speed of examples with noisy labels more than that of the clean examples.
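
A minimal sketch of penalizing the trace of the FIM during training is given below, using PyTorch. It follows a standard one-sample estimator (labels drawn from the model's own predictive distribution, squared gradient norm of the resulting log-likelihood), but the function names, the sampling scheme, and the way the penalty enters the loss are illustrative assumptions rather than the paper's exact recipe.

import torch
import torch.nn.functional as F

def fisher_trace_penalty(model, x):
    """One-sample estimate of tr(FIM): sample labels from the model's own
    predictive distribution, then return the squared norm of the gradient of
    the resulting negative log-likelihood with respect to the parameters.
    create_graph=True keeps the estimate differentiable so it can be added
    to the training loss."""
    logits = model(x)
    with torch.no_grad():
        sampled = torch.multinomial(F.softmax(logits, dim=-1), 1).squeeze(-1)
    nll = F.cross_entropy(logits, sampled)
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(nll, params, create_graph=True)
    return sum((g ** 2).sum() for g in grads)

# training-step sketch (lam is a tuning parameter):
#   loss = F.cross_entropy(model(x), y) + lam * fisher_trace_penalty(model, x)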

We consider, for the first time, a stochastic generalized Nash equilibrium problem, i.e., with expected-value cost functions and joint feasibility constraints, under partial-decision information, meaning that the agents communicate only with some trusted neighbours. We propose several distributed algorithms for network games and aggregative games that we show to be special instances of a preconditioned forward-backward splitting method. We prove that the algorithms converge to a generalized Nash equilibrium when the forward operator is restricted cocoercive, using a stochastic approximation scheme with variance reduction to estimate the expected value of the pseudogradient.
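
As a single-agent, fully-observed analogue of the algorithmic ingredients (forward-backward splitting plus a variance-reduced estimate of the expected pseudogradient), the sketch below runs a projected pseudogradient iteration with an SVRG-style estimator. The distributed, partial-decision-information structure and the preconditioning are not reproduced, and all names are illustrative.

import numpy as np

def stochastic_forward_backward(F_sample, project, x0, samples,
                                gamma=0.05, n_epochs=20, rng=None):
    """Projected pseudogradient (forward-backward) iteration with an SVRG-style
    variance-reduced estimate: at the start of each epoch the full-sample
    pseudogradient is computed at an anchor point, and each inner step corrects
    a single-sample estimate with that anchor value."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    for _ in range(n_epochs):
        anchor = x.copy()
        full = np.mean([F_sample(anchor, s) for s in samples], axis=0)
        for i in rng.permutation(len(samples)):
            v = F_sample(x, samples[i]) - F_sample(anchor, samples[i]) + full
            x = project(x - gamma * v)       # forward (pseudogradient) then backward (projection)
    return x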
