
The existence of immune or cured individuals in a population, and whether there is sufficient follow-up in a sample of censored observations on their lifetimes to be confident of their presence, are questions of major importance in medical survival analysis. So far only a few candidates have been put forward as possible test statistics for the existence of sufficient follow-up in a sample. Here we investigate one such statistic and give a detailed analysis, obtaining an exact finite-sample distribution as well as an asymptotic distribution for it, and use these to calculate the power of the test as a function of the follow-up in the sample.
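
As a concrete illustration of what such a candidate statistic can look like, the sketch below computes a Maller-Zhou-style quantity: the proportion of uncensored lifetimes falling in a right-tail window determined by the largest uncensored and largest observed times. This is only an assumption about the kind of statistic under study, not the paper's exact definition.

```python
import numpy as np

def followup_statistic(times, uncensored):
    """Illustrative follow-up statistic (a Maller-Zhou-style q_n).

    times      : observed lifetimes (censored or not)
    uncensored : boolean array, True where the lifetime is uncensored

    Counts the proportion of uncensored lifetimes in the right-tail window
    (2*t_star - t_max, t_star], where t_star is the largest uncensored time
    and t_max the largest observed time. Values near 0 suggest insufficient
    follow-up.
    """
    t_max = times.max()                      # largest observed time
    t_star = times[uncensored].max()         # largest uncensored time
    lo = 2.0 * t_star - t_max                # left end of the tail window
    in_window = (times > lo) & (times <= t_star) & uncensored
    return in_window.sum() / len(times)

# toy usage with exponential lifetimes and independent censoring
rng = np.random.default_rng(0)
t = rng.exponential(1.0, 200)
c = rng.exponential(2.0, 200)
obs = np.minimum(t, c)
delta = t <= c
print(followup_statistic(obs, delta))
```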

Related content

We consider the "all-for-one" decentralized learning problem for generalized linear models. The features of each sample are partitioned among several collaborating agents in a connected network, but only one agent observes the response variables. To solve the regularized empirical risk minimization in this distributed setting, we apply the Chambolle--Pock primal--dual algorithm to an equivalent saddle-point formulation of the problem. The primal and dual iterations are either in closed-form or reduce to coordinate-wise minimization of scalar convex functions. We establish convergence rates for the empirical risk minimization under two different assumptions on the loss function (Lipschitz and square root Lipschitz), and show how they depend on the characteristics of the design matrix and the Laplacian of the network.
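
To make the iteration concrete, here is a minimal sketch of the Chambolle-Pock primal-dual scheme on a toy centralized lasso problem; the saddle-point formulation, step sizes, and proximal maps used in the decentralized GLM setting of the paper will differ, so this is illustrative only.

```python
import numpy as np

def chambolle_pock_lasso(A, b, lam, n_iter=500):
    """Chambolle-Pock iteration for the toy saddle point
    min_x max_y <Ax, y> - f*(y) + lam*||x||_1, with f(u) = 0.5*||u - b||^2."""
    m, d = A.shape
    L = np.linalg.norm(A, 2)              # operator norm of A
    tau = sigma = 0.9 / L                 # step sizes with tau*sigma*L^2 < 1
    x = np.zeros(d); x_bar = x.copy(); y = np.zeros(m)
    for _ in range(n_iter):
        # dual step: prox of sigma*f*, with f*(y) = 0.5*||y||^2 + <b, y>
        y = (y + sigma * (A @ x_bar) - sigma * b) / (1.0 + sigma)
        # primal step: soft-thresholding, the prox of tau*lam*||.||_1
        x_new = x - tau * (A.T @ y)
        x_new = np.sign(x_new) * np.maximum(np.abs(x_new) - tau * lam, 0.0)
        # extrapolation
        x_bar = 2.0 * x_new - x
        x = x_new
    return x
```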

Graph Convolutional Networks (GCNs) are known to suffer from performance degradation as the number of layers increases, which is usually attributed to over-smoothing. Despite the apparent consensus, we observe a discrepancy between the theoretical understanding of over-smoothing and the practical capabilities of GCNs. Specifically, we argue that over-smoothing does not necessarily occur in practice: a deeper model is provably expressive, can converge to the global optimum at a linear rate, and can achieve very high training accuracy as long as it is properly trained. Despite this capacity for high training accuracy, empirical results show that deeper models generalize poorly at test time, and a theoretical understanding of this behavior remains elusive. To close this gap, we carefully analyze the generalization capability of GCNs and show that the training strategies required to achieve high training accuracy significantly deteriorate generalization. Motivated by these findings, we propose a decoupled structure for GCNs that detaches the weight matrices from feature propagation, preserving expressive power while ensuring good generalization. We conduct empirical evaluations on various synthetic and real-world datasets to validate our theory.
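
The decoupling idea can be sketched as follows: propagation over the graph uses no learnable weights, and a small MLP is applied after propagation. The module below assumes a precomputed normalized adjacency `adj_norm`; the paper's exact architecture may differ.

```python
import torch
import torch.nn as nn

class DecoupledGCN(nn.Module):
    """Sketch of a decoupled GCN: parameter-free feature propagation is
    separated from a small MLP, so increasing propagation depth does not
    stack weight matrices. `adj_norm` is assumed to be the symmetrically
    normalized adjacency with self-loops (dense or sparse tensor)."""

    def __init__(self, in_dim, hidden_dim, n_classes, n_prop=10):
        super().__init__()
        self.n_prop = n_prop
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_classes),
        )

    def forward(self, x, adj_norm):
        # parameter-free propagation: x <- A_hat^K x
        for _ in range(self.n_prop):
            x = adj_norm @ x
        # prediction head applied after propagation
        return self.mlp(x)
```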

We study prediction of future outcomes with supervised models that use privileged information during learning. The privileged information comprises samples of time series observed between the baseline time of prediction and the future outcome; unlike in traditional supervised learning, this information is available only at training time. We ask when using this privileged data leads to more sample-efficient learning of models that use only baseline data for predictions at test time. We give an algorithm for this setting and prove that, when the time series are drawn from a non-stationary Gaussian-linear dynamical system of fixed horizon, learning with privileged information is more efficient than learning without it. On synthetic data, we test the limits of our algorithm and theory, both when our assumptions hold and when they are violated. On three diverse real-world datasets, we show that our approach is generally preferable to classical learning, particularly when data is scarce. Finally, we relate our estimator to a distillation approach both theoretically and empirically.
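
One hedged way to picture such an estimator: chain regressions from the baseline through the privileged intermediate states to the outcome, then compose the fitted maps into a predictor that needs only the baseline at test time. The names below (`privileged_fit`, `Z_list`) are illustrative, not the paper's notation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def privileged_fit(X0, Z_list, y):
    """Sketch of chained least squares with privileged time series:
    regress each intermediate state on the previous one and the outcome
    on the last state, then compose the fitted maps so only the baseline
    X0 is needed at test time. (The estimator for Gaussian-linear
    dynamics has this flavour; details may differ.)"""
    maps, prev = [], X0
    for Z in Z_list:                     # Z_list: privileged states over time
        maps.append(LinearRegression().fit(prev, Z))
        prev = Z
    head = LinearRegression().fit(prev, y)

    def predict(X_new):
        h = X_new
        for m in maps:                   # roll the learned dynamics forward
            h = m.predict(h)
        return head.predict(h)

    return predict
```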

Because it determines a center-outward ordering of observations in $\mathbb{R}^d$ with $d\geq 2$, the concept of statistical depth makes it possible to define quantiles and ranks for multivariate data and to use them for various statistical tasks (e.g. inference, hypothesis testing). Whereas many depth functions have been proposed \textit{ad hoc} in the literature since the seminal contribution of \cite{Tukey75}, not all of them possess the properties desirable to emulate the notion of quantile function for univariate probability distributions. In this paper, we propose an extension of the \textit{integrated rank-weighted} statistical depth (IRW depth for short), originally introduced in \cite{IRW}, modified so as to satisfy the property of \textit{affine invariance}, thus fulfilling all four key axioms listed in the nomenclature elaborated by \cite{ZuoS00a}. The variant we propose, referred to as the Affine-Invariant IRW (AI-IRW) depth, involves the covariance/precision matrix of the (supposedly square-integrable) $d$-dimensional random vector $X$ under study, so as to take into account the directions along which $X$ is most variable when assigning a depth value to any point $x\in \mathbb{R}^d$. The accuracy of the sampling version of the AI-IRW depth is investigated from a nonasymptotic perspective: namely, a concentration result for the statistical counterpart of the AI-IRW depth is proved. Beyond this theoretical analysis, applications to anomaly detection are considered and numerical results are displayed, providing strong empirical evidence of the relevance of the depth function proposed here.
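
A Monte Carlo sketch of an affine-invariant IRW-style depth is given below: the sample is whitened with an estimate of $\Sigma^{-1/2}$, directions are drawn uniformly on the sphere, and univariate rank depths of the projections are averaged. This is an illustrative approximation, not the paper's exact estimator.

```python
import numpy as np

def ai_irw_depth(x, X, n_dir=1000, rng=None):
    """Monte Carlo sketch of an affine-invariant IRW depth of a point x
    with respect to a sample X (n x d): whiten the data with an estimate
    of Sigma^{-1/2} (assumed invertible), project onto random directions,
    and average the univariate rank depths of the projections."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    # whitening transform from the sample covariance
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T      # Sigma^{-1/2}
    Xw = (X - X.mean(0)) @ W
    xw = (x - X.mean(0)) @ W
    # random directions on the unit sphere
    U = rng.standard_normal((n_dir, d))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    proj = Xw @ U.T                                   # n x n_dir projections
    px = xw @ U.T                                     # projections of x
    F = (proj <= px).mean(axis=0)                     # empirical cdf at x
    return np.minimum(F, 1.0 - F).mean()              # averaged rank depth
```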

In data analysis problems where we are not able to rely on distributional assumptions, what types of inference guarantees can still be obtained? Many popular methods, such as holdout methods, cross-validation methods, and conformal prediction, are able to provide distribution-free guarantees for predictive inference, but the problem of providing inference for the underlying regression function (for example, inference on the conditional mean $\mathbb{E}[Y|X]$) is more challenging. In the setting where the features $X$ are continuously distributed, recent work has established that any confidence interval for $\mathbb{E}[Y|X]$ must have non-vanishing width, even as sample size tends to infinity. At the other extreme, if $X$ takes only a small number of possible values, then inference on $\mathbb{E}[Y|X]$ is trivial to achieve. In this work, we study the problem in settings in between these two extremes. We find that there are several distinct regimes in between the finite setting and the continuous setting, where vanishing-width confidence intervals are achievable if and only if the effective support size of the distribution of $X$ is smaller than the square of the sample size.
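
The "easy" discrete extreme mentioned above can be made concrete in a few lines: when $X$ takes only a handful of values, a confidence interval for $\mathbb{E}[Y|X=x]$ is simply a per-group mean interval. The normal-approximation interval below is one standard choice, used purely for illustration.

```python
import numpy as np
from scipy import stats

def groupwise_cis(x, y, alpha=0.05):
    """When X takes only a few values, inference on E[Y | X = x] reduces
    to a per-group mean interval; here a normal-approximation interval
    (any standard finite-sample interval could be substituted)."""
    z = stats.norm.ppf(1.0 - alpha / 2.0)
    out = {}
    for v in np.unique(x):
        yv = y[x == v]
        m, se = yv.mean(), yv.std(ddof=1) / np.sqrt(len(yv))
        out[v] = (m - z * se, m + z * se)
    return out
```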

Individuals often make different decisions when faced with the same context, due to personal preferences and background. For instance, judges may vary in their leniency towards certain drug-related offenses, and doctors may vary in their preference for how to start treatment for certain types of patients. With these examples in mind, we present an algorithm for identifying types of contexts (e.g., types of cases or patients) with high inter-decision-maker disagreement. We formalize this as a causal inference problem, seeking a region where the assignment of decision-maker has a large causal effect on the decision. Our algorithm finds such a region by maximizing an empirical objective, and we give a generalization bound for its performance. In a semi-synthetic experiment, we show that our algorithm recovers the correct region of heterogeneity accurately compared to baselines. Finally, we apply our algorithm to real-world healthcare datasets, recovering variation that aligns with existing clinical knowledge.
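
As a hedged illustration of what "maximizing an empirical objective over regions" can look like, the sketch below scores a candidate region by the variance of decision-maker-specific decision rates inside it and searches over simple threshold regions; the paper's objective and search space are likely richer.

```python
import numpy as np

def disagreement_score(region_mask, decisions, judge_ids):
    """One plausible empirical objective (not necessarily the paper's):
    within a candidate region, estimate each decision-maker's decision
    rate and score the region by the variance of those rates, so a high
    score indicates strong inter-decision-maker disagreement."""
    d, j = decisions[region_mask], judge_ids[region_mask]
    rates = [d[j == k].mean() for k in np.unique(j) if (j == k).sum() > 0]
    return np.var(rates) if len(rates) > 1 else 0.0

def best_threshold_region(feature, decisions, judge_ids, grid):
    """Search simple one-sided regions of the form {x : feature >= t}."""
    scores = [(t, disagreement_score(feature >= t, decisions, judge_ids))
              for t in grid]
    return max(scores, key=lambda s: s[1])
```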

We study approximation methods for a large class of mixed models with a probit link function, including mixed versions of the binomial model, the multinomial model, and generalized survival models. This class of models is special because the marginal likelihood can be expressed either as a Gaussian weighted integral or as a multivariate Gaussian cumulative distribution function. The latter representation is unique to probit link models and has been proposed for parameter estimation in complex mixed effects models, but it has not been investigated in which scenarios either form is preferable. Our simulations and data example show that neither form is preferable in general, and we give guidance on when to approximate the cumulative distribution function and when to approximate the Gaussian weighted integral and, in the latter case, which general-purpose method to use among a large list of candidates.
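
The two representations can be seen side by side in the simplest case, a probit model with a single Gaussian random intercept, where the Gaussian weighted integral has the closed form $\Phi(\eta/\sqrt{1+\sigma^2})$; the sketch below compares Gauss-Hermite quadrature for the integral form with this CDF form.

```python
import numpy as np
from scipy.stats import norm
from scipy.special import roots_hermite

def marginal_prob_quadrature(eta, sigma, n_nodes=30):
    """Gaussian-weighted-integral form: E_u[Phi(eta + u)], u ~ N(0, sigma^2),
    approximated with Gauss-Hermite quadrature."""
    nodes, weights = roots_hermite(n_nodes)
    u = np.sqrt(2.0) * sigma * nodes
    return (weights * norm.cdf(eta + u)).sum() / np.sqrt(np.pi)

def marginal_prob_cdf(eta, sigma):
    """Equivalent CDF form, which for a single random intercept collapses
    to a univariate normal CDF."""
    return norm.cdf(eta / np.sqrt(1.0 + sigma ** 2))

# the two forms agree up to quadrature error
print(marginal_prob_quadrature(0.7, 1.3), marginal_prob_cdf(0.7, 1.3))
```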

One of the significant challenges in monitoring product quality today is the high dimensionality of quality characteristics. In this paper, we address Phase I analysis of high-dimensional processes with individual observations when the number of samples collected over time is limited. Using a new charting statistic, we propose a robust procedure for parameter estimation in Phase I that remains efficient in the presence of outliers or contamination in the data. A consistent estimator is proposed for parameter estimation, and a finite-sample correction coefficient is derived and evaluated through simulation. We assess the statistical performance of the proposed method in Phase I in terms of the probability-of-signal criterion, both in the absence and in the presence of outliers. We show that, in both phases, the proposed control chart scheme effectively detects various kinds of shifts in the process mean. In addition, we present two real-world examples to illustrate the applicability of the proposed method.
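
The sketch below is not the paper's charting statistic; it is a simple robust baseline in the same spirit: each quality characteristic is standardized with the coordinate-wise median and MAD, and the sum of squared robust z-scores is charted for every individual observation.

```python
import numpy as np

def robust_phase1_statistic(X):
    """Illustrative Phase I charting statistic (not the paper's proposal):
    robustly standardize each quality characteristic with the
    coordinate-wise median and MAD, then chart the sum of squared robust
    z-scores for each individual observation. Observations with large
    values are flagged as potential outliers before parameters are
    finalized."""
    med = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)  # consistent for normal data
    z = (X - med) / mad
    return (z ** 2).sum(axis=1)                        # one statistic per observation
```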

Facing the worldwide coronavirus disease 2019 (COVID-19) pandemic, we develop a new fitting method, quasi-distribution fitting (QDF), for analyzing COVID-19 data, based on piecewise quasi-uniform B-spline curves. For any given country or district, it fits piecewise quasi-uniform B-spline curves to the distribution histogram built from the daily confirmed cases (or from other data, including daily recoveries and daily fatalities). After area normalization, the fitted curve can be regarded as a probability density function (PDF), and its mathematical expectation and variance can be used to analyze the state of the pandemic. Numerical experiments based on the data of several countries indicate that the QDF method captures the intrinsic characteristics of a country's or district's COVID-19 data. Because the data interval used in this paper spans more than one year (500 days), it also reveals that after multiple waves of transmission the case fatality rate has declined markedly. These results show that, as an appraisal method, QDF is effective and feasible.
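
A minimal sketch of the fitting-and-summarizing step, with a standard smoothing spline standing in for the piecewise quasi-uniform B-splines of the paper, looks like this:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def spline_pdf_summary(days, daily_cases, smooth=None):
    """Fit a smoothing B-spline to the daily-case curve, normalize its
    area to 1 so it can be read as a probability density over time, and
    report the mean and variance of that density. (Illustrative only:
    the paper uses piecewise quasi-uniform B-splines.)"""
    spl = UnivariateSpline(days, daily_cases, k=3, s=smooth)
    grid = np.linspace(days.min(), days.max(), 2000)
    f = np.clip(spl(grid), 0.0, None)             # densities cannot be negative
    f /= np.trapz(f, grid)                        # area normalization
    mean = np.trapz(grid * f, grid)               # expectation of the "PDF"
    var = np.trapz((grid - mean) ** 2 * f, grid)  # variance
    return mean, var
```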

Microservice-based architectures enable different aspects of web applications to be created and updated independently, even after deployment. Associated technologies such as service mesh provide application-level fault resilience through attribute configurations that govern the behavior of request-response services -- and the interactions among them -- in the presence of failures. While this provides tremendous flexibility, the configured values of these attributes -- and the relationships among them -- can significantly affect the performance and fault resilience of the overall application. Furthermore, it is impossible to determine the best and worst combinations of attribute values with respect to fault resiliency via testing, due to the complexity of the underlying distributed system and the many possible attribute value combinations. In this paper, we present a model-based reinforcement learning workflow towards service mesh fault resiliency. Our approach enables the prediction of the most significant fault-resilience behaviors at the web-application level, scaling from single-service to aggregated multi-service management with efficient agent collaboration.
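
Purely as a hedged illustration of the surrogate-model ingredient of such a workflow (not the paper's agents or reward design), the sketch below learns a model mapping hypothetical service-mesh attribute configurations to an observed resilience score and uses it to rank untested configurations instead of exhaustively testing every combination.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def surrogate_config_ranking(history_configs, history_resilience, candidate_configs):
    """Learn a surrogate model from previously observed configurations
    (e.g., hypothetical columns for timeout, retry count, circuit-breaker
    threshold) to a resilience score, then rank unseen candidate
    configurations with the model. Returns the predicted worst and best
    candidates."""
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(history_configs, history_resilience)
    scores = model.predict(candidate_configs)
    order = np.argsort(scores)
    return candidate_configs[order[0]], candidate_configs[order[-1]]
```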
