
Recently, shape-restricted inference has gained popularity in the statistical and econometric literature as a way to relax linear or quadratic covariate effects in regression analyses. Typical shape restrictions on a covariate effect include monotone increasing, monotone decreasing, convex, or concave. In this paper, we introduce shape-restricted inference to the celebrated Cox regression model (SR-Cox), in which the covariate effects are modeled as shape-restricted additive functions. SR-Cox regression approximates the shape-restricted functions by a spline basis expansion with a data-driven choice of knots. The underlying minimization of the negative log-likelihood is formulated as a convex optimization problem, which is solved with an active-set optimization algorithm. A highlight of this algorithm is that it eliminates superfluous knots automatically. When the covariate effects combine convex or concave terms of unknown form with linear terms, the most interesting finding is that SR-Cox produces accurate estimates of the linear covariate effects, comparable to the maximum partial likelihood estimates obtained when the true forms are known. We conclude that concave or convex SR-Cox models can significantly improve the recovery of nonlinear covariate responses and the goodness of fit of the model.
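
To make the idea concrete, here is a minimal, hypothetical Python sketch of a shape-restricted Cox fit, not the paper's implementation: a single covariate effect is expanded in a ramp basis whose non-negative coefficients enforce a convex, increasing effect, and the negative partial log-likelihood is minimized under bound constraints with a quasi-Newton solver (L-BFGS-B) rather than the paper's active-set algorithm; the data, basis, and fixed knots are purely illustrative.

```python
# Minimal sketch (illustrative, not the SR-Cox algorithm of the paper):
# a convex, increasing covariate effect is enforced by expanding it in
# ramp functions with non-negative coefficients and minimizing the Cox
# negative partial log-likelihood under bound constraints.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 1.0, n)                       # single covariate
eta_true = 1.5 * x**2                              # convex, increasing effect
t_event = rng.exponential(1.0 / np.exp(eta_true))  # latent event times
t_cens = rng.exponential(3.0, n)                   # censoring times
time = np.minimum(t_event, t_cens)
event = t_event <= t_cens                          # True = observed event

knots = np.linspace(0.0, 1.0, 6)[1:-1]             # interior knots (fixed here)

def ramp_basis(x):
    # columns x and (x - k)_+ : any non-negative combination is convex increasing
    return np.column_stack([x] + [np.maximum(x - k, 0.0) for k in knots])

B = ramp_basis(x)

def neg_partial_loglik(beta):
    eta = B @ beta
    order = np.argsort(time)                       # sort by follow-up time
    eta_s, d_s = eta[order], event[order]
    # risk set of the i-th ordered time is {j : time_j >= time_i}
    log_risk = np.log(np.cumsum(np.exp(eta_s)[::-1])[::-1])
    return -np.sum(d_s * (eta_s - log_risk))

res = minimize(neg_partial_loglik, x0=np.full(B.shape[1], 0.1),
               bounds=[(0.0, None)] * B.shape[1], method="L-BFGS-B")
print("fitted basis coefficients:", np.round(res.x, 3))
```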

Related content

Longitudinal and high-dimensional measurements have become increasingly common in biomedical research. However, methods to predict survival outcomes using covariates that are both longitudinal and high-dimensional are currently missing. In this article, we propose penalized regression calibration (PRC), a method that can be employed to predict survival in such situations. PRC comprises three modeling steps: first, the trajectories described by the longitudinal predictors are flexibly modeled through the specification of multivariate mixed effects models; second, subject-specific summaries of the longitudinal trajectories are derived from the fitted mixed models; third, the time-to-event outcome is predicted using the subject-specific summaries as covariates in a penalized Cox model. To ensure a proper internal validation of the fitted PRC models, we furthermore develop a cluster bootstrap optimism correction procedure (CBOCP) that corrects for the optimistic bias of apparent measures of predictiveness. PRC and the CBOCP are implemented in the R package pencal, available from CRAN. After studying the behavior of PRC via simulations, we conclude by illustrating an application of PRC to data from an observational study of patients affected by Duchenne muscular dystrophy, where the goal is to predict time to loss of ambulation using longitudinal blood biomarkers.
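
As a schematic summary in our own notation (one simple instance of the three steps; the paper's mixed models are multivariate and more general), let $y_{ijk}$ denote marker $j = 1, \dots, p$ for subject $i$ at visit time $t_{ijk}$:

$$
\text{Step 1: } y_{ijk} = \beta_{0j} + \beta_{1j} t_{ijk} + b_{0ij} + b_{1ij} t_{ijk} + \varepsilon_{ijk},
\qquad
\text{Step 2: } z_i = \big(\hat b_{0i1}, \hat b_{1i1}, \dots, \hat b_{0ip}, \hat b_{1ip}\big),
$$
$$
\text{Step 3: } \hat\gamma = \arg\max_{\gamma}\; \ell_{\mathrm{Cox}}(\gamma; z_1, \dots, z_n) - \lambda\, P(\gamma),
$$

where $\ell_{\mathrm{Cox}}$ is the Cox partial log-likelihood, $P$ is a penalty such as the ridge or elastic net, and $\lambda$ is tuned, for example, by cross-validation.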

Modelling a large collection of functional time series arises in a broad spectrum of real applications. In such a scenario, not only can the number of functional variables diverge with, or even exceed, the number of temporally dependent functional observations, but each function itself is an infinite-dimensional object, posing a challenging task. In this paper, we propose a three-step procedure to estimate high-dimensional functional time series models. To provide theoretical guarantees for the three-step procedure, we focus on multivariate stationary processes and propose a novel functional stability measure based on their spectral properties. This stability measure facilitates the development of useful concentration bounds on sample (auto)covariance functions, which serve as a fundamental tool for further convergence analysis in high-dimensional settings. As functional principal component analysis (FPCA) is one of the key dimension reduction techniques in the first step, we also investigate the non-asymptotic properties of the relevant estimated terms under an FPCA framework. To illustrate with an important application, we consider vector functional autoregressive models and develop a regularization approach to estimate the autoregressive coefficient functions under a sparsity constraint. Using the derived non-asymptotic results, we investigate the convergence properties of the regularized estimate under high-dimensional scaling. Finally, the finite-sample performance of the proposed method is examined through both simulations and a public financial dataset.
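
For instance, a first-order vector functional autoregressive model can be written, in generic notation (ours, not necessarily the paper's), as

$$
X_{tj}(u) = \sum_{l=1}^{p} \int_{\mathcal{U}} A_{jl}(u, v)\, X_{(t-1)l}(v)\, dv + \varepsilon_{tj}(u), \qquad j = 1, \dots, p,
$$

where $X_{tj}$ is the $j$-th functional variable at time $t$ and sparsity is imposed on the matrix of autoregressive coefficient functions $\{A_{jl}\}$, i.e. only a few pairs $(j, l)$ have $A_{jl} \neq 0$.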

We develop a fully Bayesian nonparametric regression model based on a L\'evy process prior, named MLABS (Multivariate L\'evy Adaptive B-Spline regression), a multivariate version of the LARK (L\'evy Adaptive Regression Kernels) models, for estimating unknown functions with either varying degrees of smoothness or high interaction orders. L\'evy process priors have the advantages of encouraging sparsity in the expansions and providing automatic selection of the number of basis functions. The unknown regression function is expressed as a weighted sum of tensor products of B-spline basis functions, treated as elements of an overcomplete system, which can handle multi-dimensional data. The B-spline basis can systematically express functions with varying degrees of smoothness. By varying the degrees of the tensor-product basis functions, MLABS can adapt to the smoothness of target functions thanks to the nice properties of B-spline bases. The local support of the B-spline basis enables MLABS to make more delicate predictions than other existing methods on two-dimensional surface data. Experiments on various simulated and real-world datasets illustrate that the MLABS model has comparable performance on regression and classification problems. We also show that the MLABS model has more stable and accurate predictive abilities than state-of-the-art nonparametric regression models on relatively low-dimensional data.
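
In generic notation (ours, for illustration), the MLABS regression function is an overcomplete tensor-product B-spline expansion of the form

$$
f(\mathbf{x}) = \beta_0 + \sum_{j=1}^{J} \beta_j \prod_{d=1}^{D} B_{jd}(x_d),
$$

where the number of terms $J$, the coefficients $\beta_j$, and the knots and degrees of the univariate B-splines $B_{jd}$ are all assigned priors induced by the L\'evy process, so that sparsity and the number of basis functions are selected automatically.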

The problem of estimating a piecewise monotone sequence of normal means is called nearly isotonic regression. For this problem, an efficient algorithm has been devised by modifying the pool adjacent violators algorithm (PAVA). In this study, we extend nearly isotonic regression to general one-parameter exponential families such as the binomial, Poisson, and chi-square families. We consider estimation of a piecewise monotone parameter sequence and develop an efficient algorithm based on the modified PAVA, which utilizes the duality between the natural and expectation parameters. We also provide a method for selecting the regularization parameter using an information criterion. Simulation results demonstrate that the proposed method detects change-points in piecewise monotone parameter sequences in a data-driven manner. Applications to spectrum estimation, causal inference, and discretization error quantification of ODE solvers are also presented.
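
Concretely, for observations $y_1, \dots, y_n$ from a one-parameter exponential family with parameters $\theta_1, \dots, \theta_n$, the nearly isotonic estimate solves a penalized problem of the form (our schematic notation)

$$
\hat{\theta} = \arg\min_{\theta} \; \sum_{i=1}^{n} \big\{ -\log p(y_i \mid \theta_i) \big\} + \lambda \sum_{i=1}^{n-1} \big( \theta_i - \theta_{i+1} \big)_+,
$$

where $(\cdot)_+ = \max(\cdot, 0)$; the Gaussian case with $-\log p(y_i \mid \theta_i) \propto (y_i - \theta_i)^2$ recovers the original nearly isotonic regression, and $\lambda$ controls how strongly downward jumps are penalized.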

Network data is prevalent in many contemporary big data applications in which a common interest is to unveil important latent links between different pairs of nodes. Yet the simple, fundamental question of how to precisely quantify the statistical uncertainty associated with the identification of latent links remains largely unexplored. In this paper, we propose the method of statistical inference on membership profiles in large networks (SIMPLE) in the setting of the degree-corrected mixed membership model, where the null hypothesis assumes that a pair of nodes share the same profile of community memberships. In the simpler case of no degree heterogeneity, the model reduces to the mixed membership model, for which an alternative, more robust test is also proposed. Both tests are Hotelling-type statistics based on the rows of empirical eigenvectors or their ratios, whose asymptotic covariance matrices are very challenging to derive and estimate. Nevertheless, their analytical expressions are unveiled and the unknown covariance matrices are consistently estimated. Under some mild regularity conditions, we establish the exact limiting distributions of the two forms of SIMPLE test statistics under the null hypothesis and contiguous alternative hypotheses. They are chi-square distributions and noncentral chi-square distributions, respectively, with degrees of freedom depending on whether or not the degrees are corrected. We also address the important issue of estimating the unknown number of communities and establish the asymptotic properties of the associated test statistics. The advantages and practical utility of our new procedures, in terms of both size and power, are demonstrated through several simulation examples and real network applications.
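
Schematically, and in our own notation rather than the paper's, for a candidate pair of nodes $(i, j)$ the test is of the Hotelling type

$$
T_{ij} = \big( \hat{\mathbf{v}}(i) - \hat{\mathbf{v}}(j) \big)^{\top} \hat{\Sigma}_{ij}^{-1} \big( \hat{\mathbf{v}}(i) - \hat{\mathbf{v}}(j) \big),
$$

where $\hat{\mathbf{v}}(i)$ collects the $i$-th entries of the leading empirical eigenvectors (or their ratios in the degree-corrected case) and $\hat{\Sigma}_{ij}$ is a consistent estimate of the asymptotic covariance matrix; under the null hypothesis that $i$ and $j$ share the same membership profile, $T_{ij}$ is asymptotically chi-square.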

How should we evaluate a policy's effect on the likelihood of an undesirable event, such as conflict? The conventional practice has three limitations. First, relying on statistical significance misses the fact that uncertainty is a continuous scale. Second, focusing on a standard point estimate overlooks the variation in plausible effect sizes. Third, the criterion of substantive significance is rarely explained or justified. To overcome these limitations, my Bayesian decision-theoretic model compares the expected loss under a policy intervention with that under no such intervention. These losses are computed as a function of a particular effect size, the probability of this effect being realized, and the ratio of the cost of the intervention to the cost of the undesirable event. The model is more practically interpretable than common statistical decision-theoretic models using the standard loss functions or the relative costs of false positives and false negatives. I exemplify my model's use through three applications and provide an R package.
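
As an illustration of the expected-loss comparison (a toy calculation with made-up numbers and a simplified loss, not taken from the paper or its R package), suppose posterior draws of the policy's effect on the probability of the event are available and the intervention-to-event cost ratio is known:

```python
# Toy example of comparing expected losses with and without an intervention.
# All numbers are illustrative assumptions, not estimates from the paper.
import numpy as np

rng = np.random.default_rng(1)
p_event_baseline = 0.30                          # assumed baseline risk of the event
effect_draws = rng.normal(-0.05, 0.03, 10_000)   # posterior draws of the change in risk
cost_ratio = 0.02                                # cost of intervention / cost of the event

# expected loss, measured in units of the event cost
loss_no_intervention = p_event_baseline
loss_intervention = cost_ratio + np.mean(np.clip(p_event_baseline + effect_draws, 0.0, 1.0))

print(f"expected loss, no intervention: {loss_no_intervention:.3f}")
print(f"expected loss, intervention:    {loss_intervention:.3f}")
print("intervene" if loss_intervention < loss_no_intervention else "do not intervene")
```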

This paper develops a fully discrete soft thresholding polynomial approximation over a general region, named Lasso hyperinterpolation. This approximation is an $\ell_1$-regularized discrete least squares approximation under the same conditions as hyperinterpolation. Lasso hyperinterpolation also uses a high-order quadrature rule to approximate the Fourier coefficients of a given continuous function with respect to some orthonormal basis, and then obtains its coefficients by applying a soft thresholding operator to all approximated Fourier coefficients. Lasso hyperinterpolation is not a discrete orthogonal projection, but it is an efficient tool for dealing with noisy data. We theoretically analyze Lasso hyperinterpolation for continuous and smooth functions. The principal results are twofold: the norm of the Lasso hyperinterpolation operator is bounded independently of the polynomial degree, a property inherited from hyperinterpolation; and the $L_2$ error bound of Lasso hyperinterpolation is smaller than that of hyperinterpolation when the level of noise becomes large, which improves the robustness of hyperinterpolation. Explicit constructions and corresponding numerical examples of Lasso hyperinterpolation over intervals, discs, spheres, and cubes are given.
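
In symbols (our notation): writing hyperinterpolation with quadrature points $x_j$, weights $w_j$, and an orthonormal basis $\{p_k\}$, the Lasso variant passes each approximate Fourier coefficient through a soft-thresholding operator,

$$
\mathcal{L}_n^{\lambda} f = \sum_{k} \mathcal{S}_{\lambda}\!\Big( \sum_{j} w_j f(x_j)\, p_k(x_j) \Big)\, p_k,
\qquad
\mathcal{S}_{\lambda}(a) = \operatorname{sign}(a)\, \max(|a| - \lambda, 0),
$$

so coefficients whose magnitude falls below the threshold $\lambda$ are set to zero, which is what makes the scheme robust to noise at the cost of losing the orthogonal-projection property.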

Flexible estimation of multiple conditional quantiles is of interest in numerous applications, such as studying the effect of pregnancy-related factors on low and high birth weight. We propose a Bayesian non-parametric method to simultaneously estimate non-crossing, non-linear quantile curves. We expand the conditional distribution function of the response in I-spline basis functions, where the covariate-dependent coefficients are modeled using neural networks. By leveraging the approximation power of splines and neural networks, our model can approximate any continuous quantile function. Compared to existing models, our model estimates all quantiles rather than a finite subset, scales well to high dimensions, and accounts for estimation uncertainty. While the model is arbitrarily flexible, interpretable marginal quantile effects are estimated using accumulated local effect plots and variable importance measures. A simulation study shows that our model can better recover quantiles of the response distribution when the data are sparse, and an analysis of birth weight data is presented.
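
Schematically, in our own notation, the conditional distribution function is expanded as

$$
F(y \mid x) = \sum_{k=1}^{K} \theta_k(x)\, I_k(y), \qquad \theta_k(x) \ge 0, \quad \sum_{k=1}^{K} \theta_k(x) = 1,
$$

where the $I_k$ are I-spline basis functions in $y$ and the non-negative, normalized coefficients $\theta_k(x)$ are produced by a neural network; monotonicity of $F(\cdot \mid x)$ then guarantees that the quantile curves obtained by inverting $F$ never cross.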

The Ising model is a celebrated example of a Markov random field, introduced in statistical physics to model ferromagnetism. It is a discrete exponential family with binary outcomes, where the sufficient statistic involves a quadratic term designed to capture correlations arising from pairwise interactions. However, in many situations the dependencies in a network arise not just from pairs, but from peer-group effects. A convenient mathematical framework for capturing higher-order dependencies is the $p$-tensor Ising model, where the sufficient statistic consists of a multilinear polynomial of degree $p$. This thesis develops a framework for statistical inference on the natural parameters in $p$-tensor Ising models. We begin with the Curie-Weiss Ising model, where we unearth various non-standard phenomena in the asymptotics of the maximum-likelihood (ML) estimates of the parameters, such as the presence of a critical curve in the interior of the parameter space on which these estimates have a limiting mixture distribution, and a surprising superefficiency phenomenon at the boundary point(s) of this curve. ML estimation fails in more general $p$-tensor Ising models due to the presence of a computationally intractable normalizing constant. To overcome this issue, we use the popular maximum pseudo-likelihood (MPL) method, which avoids computing the intractable normalizing constant by working with conditional distributions. We derive general conditions under which the MPL estimate is $\sqrt{N}$-consistent, where $N$ is the size of the underlying network. Finally, we consider a more general Ising model, which incorporates high-dimensional covariates at the nodes of the network and can also be viewed as a logistic regression model with dependent observations. In this model, we show that the parameters can be estimated consistently under sparsity assumptions on the true covariate vector.
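
In generic notation (ours, for illustration), the $p$-tensor Ising model and the pseudo-likelihood used to fit it take the form

$$
\mathbb{P}_{\beta}(\boldsymbol{\sigma}) = \frac{1}{Z_N(\beta)} \exp\Big( \beta \sum_{i_1, \dots, i_p} J_{i_1 \dots i_p}\, \sigma_{i_1} \cdots \sigma_{i_p} \Big), \qquad \boldsymbol{\sigma} \in \{-1, +1\}^N,
$$
$$
\hat{\beta}_{\mathrm{MPL}} = \arg\max_{\beta} \prod_{i=1}^{N} \mathbb{P}_{\beta}\big( \sigma_i \mid \sigma_j,\; j \neq i \big),
$$

where the conditional probabilities depend only on local multilinear sums and therefore do not involve the intractable normalizing constant $Z_N(\beta)$.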

Intersection over Union (IoU) is the most popular evaluation metric used in object detection benchmarks. However, there is a gap between optimizing the commonly used distance losses for regressing the parameters of a bounding box and maximizing this metric value. The optimal objective for a metric is the metric itself. In the case of axis-aligned 2D bounding boxes, it can be shown that $IoU$ can be directly used as a regression loss. However, $IoU$ has a plateau that makes it infeasible to optimize in the case of non-overlapping bounding boxes. In this paper, we address the weaknesses of $IoU$ by introducing a generalized version as both a new loss and a new metric. By incorporating this generalized $IoU$ ($GIoU$) as a loss into state-of-the-art object detection frameworks, we show a consistent improvement in their performance using both the standard, $IoU$-based, and the new, $GIoU$-based, performance measures on popular object detection benchmarks such as PASCAL VOC and MS COCO.
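
For axis-aligned boxes, $GIoU$ subtracts from $IoU$ the fraction of the smallest enclosing box not covered by the union, so it stays informative even when the boxes do not overlap. A small self-contained sketch (illustrative, not the authors' reference code):

```python
# Sketch of GIoU for axis-aligned 2D boxes given as (x1, y1, x2, y2).
def giou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    # intersection rectangle (empty if the boxes do not overlap)
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    union = area_a + area_b - inter
    iou = inter / union
    # smallest enclosing box of the two boxes
    cx1, cy1 = min(ax1, bx1), min(ay1, by1)
    cx2, cy2 = max(ax2, bx2), max(ay2, by2)
    hull = (cx2 - cx1) * (cy2 - cy1)
    return iou - (hull - union) / hull   # in (-1, 1]; the loss is 1 - GIoU

print(giou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlapping boxes
print(giou((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes: gradient still informative
```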
