The simplest, most common paired samples consist of observations from two populations, with each observed response from one population corresponding to an observed response from the other population at the same value of an ordinal covariate. The pair of observed responses (one from each population) at the same value of the covariate is known as a "matched pair" (with the matching based on the value of the covariate). A graph of cumulative differences between the two populations reveals differences in responses as a function of the covariate. Indeed, the slope of the secant line connecting two points on the graph is the average difference in responses over the interval of covariate values between the two points; i.e., the slope of the graph gives the average difference in responses. ("Average" refers to the weighted average if the samples are weighted.) Moreover, a simple statistic known as the Kuiper metric summarizes into a single scalar the overall differences over all values of the covariate. The Kuiper metric is the absolute value of the total difference in responses between the two populations, totaled over the interval of covariate values for which the absolute value of the total is greatest. The total should be normalized such that it becomes the (weighted) average over all values of the covariate when the interval over which the total is taken is the entire range of the covariate (i.e., the sum for the total is divided by the total number of observations if the samples are unweighted, or by the total weight if the samples are weighted). This cumulative approach is fully nonparametric and uniquely defined (with only one right way to construct the graphs and scalar summary statistics), unlike traditional methods such as reliability diagrams or parametric or semi-parametric regressions, which can obscure significant differences depending on their parameter settings.
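
The construction is simple enough to sketch in a few lines. The following minimal NumPy sketch (our own function names and toy data, not code from the paper) sorts matched pairs by the covariate, accumulates the normalized differences, and reads off the Kuiper metric as the range of the cumulative curve (including its zero starting point):

```python
import numpy as np

def cumulative_differences(covariate, resp_a, resp_b, weights=None):
    """Cumulative graph of (weighted) differences between matched pairs,
    sorted by the covariate; the slope of a secant is the average difference."""
    covariate = np.asarray(covariate, dtype=float)
    diffs = np.asarray(resp_a, dtype=float) - np.asarray(resp_b, dtype=float)
    w = np.ones_like(diffs) if weights is None else np.asarray(weights, dtype=float)
    order = np.argsort(covariate)
    # Normalize so the full-range total equals the (weighted) average difference.
    cum = np.cumsum(w[order] * diffs[order]) / w.sum()
    return covariate[order], cum

def kuiper_metric(cum):
    """Largest absolute normalized total over any interval of covariate values:
    the range of the cumulative curve, taking 0 as the starting value."""
    padded = np.concatenate(([0.0], cum))
    return padded.max() - padded.min()

# Toy example: a difference in responses appears only for large covariate values.
rng = np.random.default_rng(0)
x = rng.uniform(size=1000)
a = rng.normal(size=1000) + 0.5 * (x > 0.7)
b = rng.normal(size=1000)
_, cum = cumulative_differences(x, a, b)
print(kuiper_metric(cum))
```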

Comparing survival times between two groups is a common problem in time-to-event analysis, for example when one would like to understand whether one medical treatment is superior to another. In the standard survival analysis setting, there has been much discussion of how to quantify such differences and what constitutes an intuitive, easily interpretable summary measure. In the presence of subjects that are immune to the event of interest (`cured'), we illustrate that it is not appropriate to simply compare the overall survival functions. Instead, it is more informative to compare the cure fractions and the survival of the uncured sub-populations separately. Our research is mainly driven by the question: if the cure fraction is similar for two available treatments, how else can we determine which is preferable? To this end, we estimate the mean survival times in the uncured fractions of both treatment groups ($MST_u$) and develop permutation tests for inference. In the first of two connected papers, we focus on nonparametric approaches. The methods are illustrated with medical data of leukemia patients. In Part II we adjust the mean survival time of the uncured for potential confounders, which is crucial in observational settings. For each group, we employ the widely used logistic-Cox mixture cure model and estimate the $MST_u$ conditionally on a given covariate value. An asymptotic and a permutation-based approach are developed for making inference on the difference of conditional $MST_u$'s between the two groups. Contrary to available results in the literature, in the simulation study we do not observe a clear advantage of the permutation method over the asymptotic one that would justify its increased computational cost. The methods are illustrated through a practical application to breast cancer data.
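
As a rough illustration of the permutation idea, the sketch below permutes group labels and recomputes the between-group difference of a user-supplied survival summary. The `crude_stat` shown is a deliberately naive placeholder (the mean of uncensored event times), not the $MST_u$ estimator developed in the papers:

```python
import numpy as np

def permutation_test(stat, times_1, events_1, times_2, events_2,
                     n_perm=2000, seed=0):
    """Two-sample permutation test for the difference of a survival summary.

    stat(times, events) is any scalar summary of one group, e.g. an estimator
    of the mean survival time of the uncured (MST_u)."""
    rng = np.random.default_rng(seed)
    times = np.concatenate([times_1, times_2])
    events = np.concatenate([events_1, events_2])
    n1 = len(times_1)
    observed = stat(times_1, events_1) - stat(times_2, events_2)
    perm = np.empty(n_perm)
    for b in range(n_perm):
        idx = rng.permutation(len(times))
        g1, g2 = idx[:n1], idx[n1:]
        perm[b] = stat(times[g1], events[g1]) - stat(times[g2], events[g2])
    p_value = np.mean(np.abs(perm) >= abs(observed))
    return observed, p_value

# Crude placeholder statistic: mean of the observed (uncensored) event times.
crude_stat = lambda t, e: t[e == 1].mean()

rng = np.random.default_rng(1)
t1, e1 = rng.exponential(2.0, 80), rng.integers(0, 2, 80)
t2, e2 = rng.exponential(2.5, 80), rng.integers(0, 2, 80)
print(permutation_test(crude_stat, t1, e1, t2, e2, n_perm=500))
```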

In this paper, we prove that functional sliced inverse regression (FSIR) achieves the optimal (minimax) rate for estimating the central space in functional sufficient dimension reduction problems. First, we provide a concentration inequality for the FSIR estimator of the covariance of the conditional mean, i.e., $\mathrm{var}(\mathbb{E}[\boldsymbol{X}\mid Y])$. Based on this inequality, we establish the root-$n$ consistency of the FSIR estimator of the image of $\mathrm{var}(\mathbb{E}[\boldsymbol{X}\mid Y])$. Second, we apply the widely used truncation scheme to estimate the inverse of the covariance operator and identify the truncation parameter that ensures FSIR achieves the minimax convergence rate for estimating the central space. Finally, we conduct simulations to demonstrate the optimal choice of truncation parameter and the estimation efficiency of FSIR. To the best of our knowledge, this is the first paper to rigorously prove the minimax optimality of FSIR for estimating the central space in multiple-index models with general $Y$ (not necessarily discrete).
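
For readers unfamiliar with the procedure, a bare-bones version of sliced inverse regression on discretized functional data might look like the sketch below (variable names are ours; it follows the generic FSIR recipe of slicing $Y$, averaging $X$ within slices, and inverting a truncated covariance, rather than any implementation from the paper):

```python
import numpy as np

def fsir(X, y, n_slices=10, truncation=5, n_directions=1):
    """Bare-bones functional sliced inverse regression.

    X: (n, p) array of curves observed on a common grid of p points.
    y: (n,) responses.  Returns estimated directions, shape (p, n_directions)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)

    # Covariance operator of X with a rank-`truncation` (truncated) inverse.
    cov = Xc.T @ Xc / n
    eigval, eigvec = np.linalg.eigh(cov)
    idx = np.argsort(eigval)[::-1][:truncation]
    inv_cov = eigvec[:, idx] @ np.diag(1.0 / eigval[idx]) @ eigvec[:, idx].T

    # Within-slice means of X estimate var(E[X | Y]).
    quantiles = np.quantile(y, np.linspace(0, 1, n_slices + 1))
    slice_means, weights = [], []
    for lo, hi in zip(quantiles[:-1], quantiles[1:]):
        mask = (y >= lo) & (y <= hi) if hi == quantiles[-1] else (y >= lo) & (y < hi)
        if mask.sum() == 0:
            continue
        slice_means.append(Xc[mask].mean(axis=0))
        weights.append(mask.mean())
    M = sum(w * np.outer(m, m) for w, m in zip(weights, slice_means))

    # Leading eigenvectors of inv_cov @ M span the estimated central space.
    vals, vecs = np.linalg.eig(inv_cov @ M)
    order = np.argsort(vals.real)[::-1][:n_directions]
    return vecs[:, order].real

# Toy single-index model on 50 grid points.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 50))
beta = np.sin(np.linspace(0, np.pi, 50))
y = X @ beta / 50 + 0.1 * rng.normal(size=500)
print(fsir(X, y, truncation=10).shape)  # (50, 1)
```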

The triple-differences (TD) design is a popular identification strategy for causal effects in settings where researchers do not believe the parallel trends assumption of conventional difference-in-differences (DiD) is satisfied. TD designs augment the conventional 2x2 DiD with a "placebo" stratum -- observations that are nested in the same units and time periods but are known to be entirely unaffected by the treatment. However, many TD applications go beyond this simple 2x2x2 design and use observations on many units in many "placebo" strata across multiple time periods. A popular estimator for this setting is the triple-differences regression (TDR) fixed-effects estimator -- an extension of the common "two-way fixed effects" estimator for DiD. This paper decomposes the TDR estimator into its component two-group/two-period/two-strata triple-differences and illustrates how interpreting this parameter causally in settings with arbitrary staggered adoption requires strong effect homogeneity assumptions, as many placebo DiDs incorporate observations under treatment. The decomposition clarifies the identifying variation implied by the triple-differences regression estimator and suggests researchers should be cautious when implementing these estimators in settings more complex than the 2x2x2 case. Alternative approaches that incorporate only "clean placebos", such as direct imputation of the counterfactual, may be more appropriate. The paper concludes by demonstrating the utility of this imputation estimator in an application of the "gravity model" to estimating the effect of the WTO/GATT on international trade.
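
To make the basic building block concrete: the 2x2x2 triple-difference is simply a DiD computed within the affected stratum minus the same DiD computed within the placebo stratum. The pandas sketch below (column names are hypothetical, not from the paper) computes it from cell means:

```python
import numpy as np
import pandas as pd

def triple_difference(df, y="y", treated="treated", post="post", affected="affected"):
    """2x2x2 triple-difference from cell means:
    DiD within the affected stratum minus DiD within the placebo stratum."""
    means = df.groupby([affected, treated, post])[y].mean()

    def did(stratum):
        m = means.loc[stratum]  # indexed by (treated, post)
        return (m.loc[(1, 1)] - m.loc[(1, 0)]) - (m.loc[(0, 1)] - m.loc[(0, 0)])

    return did(1) - did(0)

# Synthetic example where the true effect (only on affected, treated, post cells) is 1.5.
rng = np.random.default_rng(0)
n = 4000
df = pd.DataFrame({
    "affected": rng.integers(0, 2, n),
    "treated": rng.integers(0, 2, n),
    "post": rng.integers(0, 2, n),
})
df["y"] = rng.normal(size=n) + 1.5 * df.affected * df.treated * df.post
print(triple_difference(df))  # close to 1.5
```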

In high-stakes applications, active experimentation may be considered too risky, and thus data are often collected passively. While in simple cases, such as bandits, passive and active data collection are similarly effective, the price of passive sampling can be much higher when collecting data from a system with controlled states. The main focus of the current paper is the characterization of this price. For example, when learning in episodic finite state-action Markov decision processes (MDPs) with $\mathrm{S}$ states and $\mathrm{A}$ actions, we show that even with the best (but passively chosen) logging policy, $\Omega(\mathrm{A}^{\min(\mathrm{S}-1, H)}/\varepsilon^2)$ episodes are necessary (and sufficient) to obtain an $\varepsilon$-optimal policy, where $H$ is the length of episodes. Note that this shows that the sample complexity blows up exponentially compared to the case of active data collection, a result which is not unexpected but which, as far as we know, has not been published before, and perhaps the form of the exact expression is a little surprising. We also extend these results in various directions, such as to other criteria or to learning in the presence of function approximation, with similar conclusions. A remarkable feature of our result is the sharp characterization of the exponent, which is critical for understanding what makes passive learning hard.
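
As a rough sense of scale (our own illustrative numbers, ignoring the constants hidden by the $\Omega$ notation): for $\mathrm{S}=5$ states, $\mathrm{A}=4$ actions, horizon $H=10$, and accuracy $\varepsilon=0.1$, the exponent is $\min(\mathrm{S}-1,H)=4$, so the bound scales as $\mathrm{A}^{4}/\varepsilon^{2}=256/0.01=25{,}600$ episodes, and each additional state multiplies this by another factor of $\mathrm{A}$ (up to the cap at $H$).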

For safety, AI systems in health undergo thorough evaluations before deployment, validating their predictions against a ground truth that is assumed to be certain. However, this is often not the case: the ground truth may be uncertain. Unfortunately, this is largely ignored in the standard evaluation of AI models but can have severe consequences, such as overestimating future performance. To avoid this, we measure the effects of ground truth uncertainty, which we assume decomposes into two main components: annotation uncertainty, which stems from the lack of reliable annotations, and inherent uncertainty, due to limited observational information. This ground truth uncertainty is ignored when the ground truth is estimated by deterministically aggregating annotations, e.g., by majority voting or averaging. In contrast, we propose a framework where aggregation is done using a statistical model. Specifically, we frame aggregation of annotations as posterior inference of so-called plausibilities, representing distributions over classes in a classification setting, subject to a hyper-parameter encoding annotator reliability. Based on this model, we propose a metric for measuring annotation uncertainty and provide uncertainty-adjusted metrics for performance evaluation. We present a case study applying our framework to skin condition classification from images, where annotations are provided in the form of differential diagnoses. The deterministic adjudication process from previous work, called inverse rank normalization (IRN), ignores ground truth uncertainty in evaluation. Instead, we present two alternative statistical models: a probabilistic version of IRN and a Plackett-Luce-based model. We find that a large portion of the dataset exhibits significant ground truth uncertainty and that standard IRN-based evaluation severely overestimates performance without providing uncertainty estimates.
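
A toy version of treating aggregation as posterior inference rather than majority voting is sketched below: annotations are modeled as draws from a class plausibility vector with a Dirichlet prior, and a reliability hyper-parameter scales how strongly each annotation counts. This is a generic Dirichlet-multinomial stand-in of our own, not the probabilistic IRN or Plackett-Luce models of the paper:

```python
import numpy as np

def posterior_plausibility(annotations, n_classes, reliability=1.0, prior=1.0):
    """Dirichlet posterior over class plausibilities given annotator labels.

    annotations: list of integer class labels from different annotators.
    reliability: pseudo-count weight per annotation (annotator reliability).
    Returns the posterior mean plausibility vector and its entropy,
    a simple proxy for annotation uncertainty."""
    alpha = np.full(n_classes, prior, dtype=float)
    for label in annotations:
        alpha[label] += reliability
    mean = alpha / alpha.sum()
    entropy = -(mean * np.log(mean)).sum()
    return mean, entropy

# Three annotators disagreeing on a 4-class case.
mean, unc = posterior_plausibility([2, 2, 1], n_classes=4, reliability=0.8)
print(mean.round(3), round(unc, 3))
```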

The Wasserstein distance (WD) and the associated optimal transport plan have proven useful in many applications where probability measures are at stake. In this paper, we propose a new proxy for the squared WD, coined min-SWGG, that is based on the transport map induced by an optimal one-dimensional projection of the two input distributions. We draw connections between min-SWGG and Wasserstein generalized geodesics in which the pivot measure is supported on a line. We notably provide a new closed form for the exact Wasserstein distance in the particular case where one of the distributions is supported on a line, allowing us to derive a fast computational scheme that is amenable to gradient-descent optimization. We show that min-SWGG is an upper bound of WD and that it has a complexity similar to that of the Sliced-Wasserstein distance, with the additional feature of providing an associated transport plan. We also investigate theoretical properties such as metricity, weak convergence, and computational and topological properties. Empirical evidence supports the benefits of min-SWGG in various contexts, including gradient flows, shape matching, and image colorization.
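
To illustrate the basic mechanism, the sketch below projects two equal-size empirical distributions onto one direction, uses the sorting permutations of the projections as the induced transport plan, and evaluates the resulting squared cost in the original space; minimizing this quantity over directions (here by random search, purely for illustration) yields a min-SWGG-style upper bound on the squared Wasserstein distance:

```python
import numpy as np

def swgg_proxy(X, Y, theta):
    """Squared cost of the plan induced by projecting X and Y onto theta.

    X, Y: (n, d) point clouds of equal size; theta: unit direction in R^d.
    Sorting the 1D projections yields the permutation used as transport plan."""
    sigma = np.argsort(X @ theta)
    tau = np.argsort(Y @ theta)
    return np.mean(np.sum((X[sigma] - Y[tau]) ** 2, axis=1))

def min_swgg(X, Y, n_directions=500, seed=0):
    """Naive minimization over random unit directions (illustrative only)."""
    rng = np.random.default_rng(seed)
    thetas = rng.normal(size=(n_directions, X.shape[1]))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    return min(swgg_proxy(X, Y, t) for t in thetas)

X = np.random.default_rng(1).normal(size=(200, 3))
Y = np.random.default_rng(2).normal(loc=1.0, size=(200, 3))
print(min_swgg(X, Y))  # upper bound on the squared W2 distance
```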

Probability density estimation is a core problem of statistics and signal processing. Moment methods are an important means of density estimation, but they generally depend strongly on the choice of feasible functions, which severely affects performance. In this paper, we propose a non-classical parametrization for density estimation using sample moments, which does not require the choice of such functions. The parametrization is induced by the squared Hellinger distance, and its solution, which is proved to exist and to be unique subject to a simple prior that does not depend on the data, can be obtained by convex optimization. Statistical properties of the density estimator by power moments, together with an asymptotic upper bound on the error, are established. Applications of the proposed density estimator to signal processing tasks are given. Simulation results validate the performance of the estimator by comparison with several prevailing methods. To the best of our knowledge, the proposed estimator is the first in the literature for which the power moments up to an arbitrary even order exactly match the sample moments, while the true density is not assumed to fall within specific function classes.
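
The moment-matching criterion itself is easy to check numerically. The sketch below (ours, not the paper's estimator) compares the sample power moments with those of any candidate density on a grid; for the proposed estimator these gaps would vanish exactly up to the chosen even order, whereas a mis-specified fit leaves visible gaps:

```python
import numpy as np

def moment_gap(sample, density, grid, max_order=6):
    """Sample power moments minus the moments of a density evaluated on a grid."""
    dx = grid[1] - grid[0]
    gaps = []
    for k in range(1, max_order + 1):
        sample_moment = np.mean(sample ** k)
        density_moment = np.sum(grid ** k * density) * dx  # Riemann sum
        gaps.append(sample_moment - density_moment)
    return np.array(gaps)

# Example: moment gaps of a (mis-specified) Gaussian fit to skewed data.
rng = np.random.default_rng(0)
data = rng.gamma(shape=2.0, scale=1.0, size=5000)
grid = np.linspace(0, 20, 2001)
gauss = np.exp(-(grid - data.mean()) ** 2 / (2 * data.var())) \
        / np.sqrt(2 * np.pi * data.var())
print(moment_gap(data, gauss, grid).round(3))
```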

Mutual coherence is a measure of similarity between two opinions. Although the notion comes from philosophy, it is essential for a wide range of technologies, e.g., the Wahl-O-Mat system. In Germany, this system helps voters find the candidates closest to their political preferences. The exact computation of mutual coherence is highly time-consuming due to the iteration over all subsets of an opinion. Moreover, for every subset, an instance of the SAT model counting problem has to be solved, which is known to be a hard problem in computer science. This work is the first to accelerate this computation. We model the distribution of the so-called confirmation values as a mixture of three Gaussians and present efficient heuristics to estimate its model parameters. The mutual coherence is then approximated by the expected value of the distribution. Some of the presented algorithms run in fully polynomial time; others only require solving a small number of instances of the SAT model counting problem. The average squared error of our best algorithm lies below 0.0035, which is negligible given the gain in efficiency. Furthermore, the approximation is accurate enough to be used in Wahl-O-Mat-like systems.
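
A hedged sketch of the approximation idea (fit a three-component Gaussian mixture to the confirmation values and read off its expected value) using scikit-learn; the sampled confirmation values below are synthetic and stand in for those obtained via SAT model counting:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def approximate_mutual_coherence(confirmation_values, n_components=3, seed=0):
    """Fit a 3-component Gaussian mixture to (a sample of) confirmation values
    and approximate mutual coherence by the mixture's expected value."""
    values = np.asarray(confirmation_values, dtype=float).reshape(-1, 1)
    gmm = GaussianMixture(n_components=n_components, random_state=seed).fit(values)
    return float(np.sum(gmm.weights_ * gmm.means_.ravel()))

# Synthetic confirmation values drawn from three overlapping components.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-0.4, 0.10, 300),
                         rng.normal(0.0, 0.05, 300),
                         rng.normal(0.5, 0.10, 400)])
print(approximate_mutual_coherence(sample))
```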

We present an $\ell^2_2+\ell_1$-regularized discrete least squares approximation over general regions under the assumptions of hyperinterpolation, named hybrid hyperinterpolation. Hybrid hyperinterpolation uses a soft thresholding operator and a filter function to shrink the Fourier coefficients of a given continuous function with respect to some orthonormal basis, with the coefficients approximated by a high-order quadrature rule; it is thus a combination of Lasso and filtered hyperinterpolation. Hybrid hyperinterpolation inherits features of both, handling noisy data well once the regularization parameter and the filter function are chosen appropriately. We not only provide theoretical $L_2$ error bounds for hybrid hyperinterpolation when approximating continuous functions with and without noise, but also decompose the $L_2$ error into three exactly computed terms with the aid of an a priori rule for choosing the regularization parameter. This rule, which makes full use of the hyperinterpolation coefficients to choose the regularization parameter, reveals that the $L_2$ error of hybrid hyperinterpolation decreases sharply and then increases slowly as the sparsity of the coefficients ranges from one to large values. Numerical examples show the enhanced performance of hybrid hyperinterpolation as regularization parameters and noise vary. Theoretical $L_2$ error bounds are verified in numerical examples on the interval, the unit disk, the unit sphere, the unit cube, and the union of disks.
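
A minimal one-dimensional sketch of the construction on $[-1,1]$ with the normalized Legendre basis: coefficients of the (noisy) function are approximated by Gauss-Legendre quadrature and then shrunk by both a filter and soft thresholding. The specific filter and threshold here are illustrative choices of ours, not those analyzed in the paper:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def hybrid_hyperinterpolation(f, degree, quad_points, lam, filter_fn):
    """1D hybrid hyperinterpolation of f on [-1, 1].

    Coefficients c_l = sum_j w_j f(x_j) p_l(x_j) in the orthonormal Legendre
    basis p_l are shrunk by a filter h(l/degree) and by soft thresholding with
    parameter lam; the filtered expansion is returned as a callable."""
    x, w = leggauss(quad_points)  # Gauss-Legendre nodes and weights
    fx = f(x)
    coeffs = np.empty(degree + 1)
    for l in range(degree + 1):
        basis = np.zeros(l + 1); basis[l] = 1.0
        p_l = np.sqrt((2 * l + 1) / 2.0) * legval(x, basis)  # orthonormal P_l
        c = np.sum(w * fx * p_l)
        coeffs[l] = filter_fn(l / degree) * np.sign(c) * max(abs(c) - lam, 0.0)

    def approx(t):
        t = np.asarray(t, dtype=float)
        out = np.zeros_like(t)
        for l, c in enumerate(coeffs):
            basis = np.zeros(l + 1); basis[l] = 1.0
            out += c * np.sqrt((2 * l + 1) / 2.0) * legval(t, basis)
        return out
    return approx

# Noisy samples of a smooth function, recovered with a simple linear filter.
rng = np.random.default_rng(0)
noisy_f = lambda x: np.exp(x) + 0.05 * rng.normal(size=np.shape(x))
approx = hybrid_hyperinterpolation(noisy_f, degree=12, quad_points=40,
                                   lam=0.02, filter_fn=lambda s: max(1.0 - s, 0.0))
print(approx(np.array([0.0, 0.5])))
```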

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
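
The re-weighting itself is a one-liner once per-class counts are known; a small NumPy sketch of class-balanced weights based on the formula above (weights proportional to the inverse effective number, normalized as in common implementations) might look like:

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.9999):
    """Per-class weights proportional to the inverse effective number of samples,
    E_n = (1 - beta**n) / (1 - beta), normalized to sum to the number of classes."""
    n = np.asarray(samples_per_class, dtype=float)
    effective_num = (1.0 - np.power(beta, n)) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(n) / weights.sum()

# Long-tailed example: the head class has 10,000 samples, the tail class only 10.
print(class_balanced_weights([10000, 2000, 500, 50, 10]).round(3))
```

These weights multiply the per-class loss terms during training, so tail classes contribute more per sample while head classes are down-weighted.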
