
Applying simple linear regression models, an economist analysed a published dataset from the 2016 and 2017 editions of an influential annual ranking of consumer outlets for Dutch New Herring and concluded that the ranking was manipulated. His finding was promoted by his university in national and international media, leading to public outrage and the ensuing discontinuation of the survey. We reconstitute the dataset, correcting errors and exposing features that are already important in a descriptive analysis of the data. The economist has continued his investigations, and in a follow-up publication repeats the same accusations. We point out errors in his reasoning and show that the alleged evidence for deliberate manipulation of the ranking could easily be an artefact of specification errors. Temporal and spatial factors are both important and complex, and their effects cannot be captured using simple models, given the small sample sizes and the many factors determining the perceived taste of a food product.
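
The core statistical point, that a regression omitting a relevant covariate can make an innocent pattern look like favouritism, can be illustrated with a toy simulation. Every variable name below is a hypothetical stand-in, not the actual survey data:

```python
# Hedged sketch: an omitted quality confounder produces a spurious
# "proximity to the jury" effect. Illustrative only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 150
near_jury = rng.integers(0, 2, n)                    # hypothetical spatial indicator
quality = 5 + 2 * near_jury + rng.normal(0, 1, n)    # nearby outlets happen to be better
score = quality + rng.normal(0, 1, n)                # taste score driven by quality alone

# Misspecified model: regress score on proximity only
X = sm.add_constant(near_jury.astype(float))
res = sm.OLS(score, X).fit()
print(res.params, res.pvalues)  # proximity looks "significant" despite no favouritism
```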

Related content

Serverless computing is an emerging cloud computing paradigm that allows software engineers to develop applications at the granularity of a function (so-called serverless functions). However, multiple identical runs of the same serverless function can show different performance (i.e., response latencies) because of the highly dynamic underlying environment in which these functions are executed. We conduct the first study of this performance variance to raise awareness of it among researchers. We investigate 59 related research papers published in top-tier conferences and observe that only 40.68% of them use multiple runs to quantify the variance of serverless function performance. We then extract 65 serverless functions used in these papers and find that the performance of these functions can differ by up to 338.76% (44.15% on average), indicating a large magnitude of variance. Furthermore, we find that 61.54% of these functions can yield unreliable performance results at the low numbers of repetitions widely adopted in the serverless computing literature.
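
A minimal sketch of the kind of measurement involved: invoke a function repeatedly and report the spread of latencies. Here `invoke` is a placeholder assumption, not a real cloud call:

```python
# Quantifying run-to-run latency variance over repeated invocations.
import time
import statistics

def invoke():
    time.sleep(0.05)  # stand-in for an actual serverless function call

def measure(reps=30):
    latencies = []
    for _ in range(reps):
        start = time.perf_counter()
        invoke()
        latencies.append(time.perf_counter() - start)
    spread = (max(latencies) - min(latencies)) / min(latencies) * 100
    print(f"mean={statistics.mean(latencies) * 1e3:.1f} ms, "
          f"stdev={statistics.stdev(latencies) * 1e3:.1f} ms, "
          f"max-vs-min difference={spread:.1f}%")

measure()
```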

Gun violence is a major problem in contemporary American society, with tens of thousands injured each year. However, relatively little is known about the effects on family members and how effects vary across subpopulations. To study these questions and, more generally, to address a gap in the causal inference literature, we present a framework for the study of effect modification or heterogeneous treatment effects in difference-in-differences designs. We implement a new matching technique, which combines profile matching and risk set matching, to (i) preserve the time alignment of covariates, exposure, and outcomes, avoiding pitfalls of other common approaches for difference-in-differences, and (ii) explicitly control biases due to imbalances in observed covariates in subgroups discovered from the data. Our case study shows significant and persistent effects of nonfatal firearm injuries on several health outcomes for those injured and on the mental health of their family members. Sensitivity analyses reveal that these results are moderately robust to unmeasured confounding bias. Finally, while the effects for those injured are modified largely by the severity of the injury and its documented intent, for families, effects are strongest for those whose relative's injury is documented as resulting from an assault, self-harm, or law enforcement intervention.
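
For orientation, the basic difference-in-differences contrast underlying such designs is sketched below. The data frame is fabricated for illustration, and the paper's matching machinery (profile matching plus risk set matching) is assumed to have already produced comparable treated and control groups:

```python
# Schematic difference-in-differences contrast on matched groups.
import pandas as pd

df = pd.DataFrame({
    "group":   ["treated"] * 4 + ["control"] * 4,
    "period":  ["pre", "pre", "post", "post"] * 2,
    "outcome": [2.0, 2.2, 3.5, 3.7, 2.1, 1.9, 2.4, 2.6],
})

means = df.groupby(["group", "period"])["outcome"].mean()
did = ((means["treated", "post"] - means["treated", "pre"])
       - (means["control", "post"] - means["control", "pre"]))
print(f"DiD estimate: {did:.2f}")  # treated change net of control change
```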

Using derandomization, we provide an upper bound on the compression size of solutions to the graph coloring problem. In general, if solutions to a combinatorial problem exist with high probability and the probability is simple, then there exists a simple solution to the problem. Otherwise the problem instance has high mutual information with the halting problem.
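
A rough schematic of the claim, in standard Kolmogorov-complexity notation; this is an assumed paraphrase in the spirit of such results, not the paper's exact statement:

```latex
% For an instance with solution set D, a simple probability measure P
% assigning D non-negligible mass yields a simple solution, up to a
% mutual-information term with the halting sequence H:
\[
  \min_{x \in D} \mathbf{K}(x) \;\lesssim\; -\log P(D) \;+\; \mathbf{K}(P)
  \;+\; \mathbf{I}(D : \mathcal{H}).
\]
```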

Factor analysis provides a canonical framework for imposing lower-dimensional structure, such as sparse covariance, in high-dimensional data. High-dimensional data on the same set of variables are often collected under different conditions, for instance in reproducing studies across research groups. In such cases, it is natural to seek to learn the shared versus condition-specific structure. Hierarchical extensions of factor analysis have been proposed for this purpose, but face practical issues, including identifiability problems. To address these shortcomings, we propose a class of SUbspace Factor Analysis (SUFA) models, which characterize variation across groups at the level of a lower-dimensional subspace. We prove that the proposed class of SUFA models leads to identifiability of the shared versus group-specific components of the covariance, and we study their posterior contraction properties. Taking a Bayesian approach, these contributions are developed alongside efficient posterior computation algorithms. Our sampler fully integrates out latent variables, is easily parallelizable, and has complexity that does not depend on sample size. We illustrate the methods through an application to the integration of multiple gene expression datasets relevant to immunology.
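
One plausible schematic of the modelling idea, with notation assumed rather than quoted from the paper: group-specific loadings are restricted to the span of the shared loadings, which is what separates shared from group-specific covariance identifiably:

```latex
% Data y_{si} from group s; shared loadings Lambda, group-specific
% variation confined to the shared subspace via A_s = Lambda Q_s.
\[
  y_{si} = \Lambda \eta_{si} + A_s \phi_{si} + \epsilon_{si},
  \qquad A_s = \Lambda Q_s, \quad \epsilon_{si} \sim N(0, \Sigma_0),
\]
\[
  \operatorname{cov}(y_{si})
  = \underbrace{\Lambda \Lambda^{\top}}_{\text{shared}}
  + \underbrace{\Lambda Q_s Q_s^{\top} \Lambda^{\top}}_{\text{group-specific}}
  + \Sigma_0 .
\]
```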

An ongoing "reproducibility crisis" calls into question scientific discoveries across a variety of disciplines, ranging from the life to the social sciences. Replication studies aim to investigate the validity of findings in published research and to assess whether the original results are statistically consistent with those in the replications. While the majority of replication projects are based on a single experiment, multiple independent replications of the same experiment conducted simultaneously at different sites are becoming more frequent. In connection with these types of projects, we deal with testing heterogeneity among sites; specifically, we focus on determining a sample size suitable to deliver compelling evidence once the experimental data are gathered.
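
A simulation-based sketch of such a sample-size calculation, using Cochran's Q test for between-site heterogeneity as an illustrative assumption (the paper's own design criteria may differ):

```python
# Power of a Q-test for heterogeneity as a function of per-site n.
import numpy as np
from scipy import stats

def power_of_Q(n_per_site, k_sites=6, tau=0.3, sigma=1.0, sims=2000, alpha=0.05):
    rng = np.random.default_rng(1)
    rejections = 0
    for _ in range(sims):
        site_effects = rng.normal(0.0, tau, k_sites)   # heterogeneous true effects
        est = site_effects + rng.normal(0, sigma / np.sqrt(n_per_site), k_sites)
        w = n_per_site / sigma**2                      # inverse estimation variances
        pooled = est.mean()                            # equal weights (equal variances)
        Q = (w * (est - pooled) ** 2).sum()            # Cochran's Q statistic
        if Q > stats.chi2.ppf(1 - alpha, k_sites - 1):
            rejections += 1
    return rejections / sims

for n in (20, 50, 100):
    print(n, power_of_Q(n))  # pick the smallest n delivering the target power
```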

While transformer-based systems have enabled greater accuracies with fewer training examples, data acquisition obstacles still persist for rare-class tasks -- those in which the class label is very infrequent (e.g., < 5% of samples). Active learning has in general been proposed to alleviate such challenges, but the choice of selection strategy -- the criteria by which rare-class examples are chosen -- has not been systematically evaluated. Further, transformers enable iterative transfer-learning approaches. We propose and investigate transfer- and active-learning solutions to the rare-class problem of dissonance detection, utilizing models trained on closely related tasks and evaluating acquisition strategies, including a proposed probability-of-rare-class (PRC) approach. We perform these experiments for a specific rare-class problem: collecting language samples of cognitive dissonance from social media. We find that PRC is a simple and effective strategy for guiding annotations and ultimately improving model accuracy, while transfer learning in a specific order can improve the cold-start performance of the learner but does not benefit iterations of active learning.
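
A minimal sketch of what a PRC acquisition step plausibly looks like: rank the unlabeled pool by the model's predicted probability of the rare class and send the top candidates to annotators. The model and pool here are placeholders, and `predict_proba` is assumed to follow the scikit-learn convention:

```python
# Probability-of-rare-class (PRC) selection sketch.
import numpy as np

def prc_select(model, pool_texts, rare_class_index, budget=50):
    # Predicted probability of the rare class for each unlabeled item.
    probs = model.predict_proba(pool_texts)[:, rare_class_index]
    # Most-likely-rare items first; annotate the top `budget` of them.
    return np.argsort(probs)[::-1][:budget]
```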

Advanced science and technology provide a wealth of big data from different sources for extreme value analysis. Classic extreme value theory was extended to obtain an accelerated max-stable distribution family for modelling competing-risk-based extreme data in Cao and Zhang (2021). In this paper, we establish probability models for power-normalized maxima and minima from competing risks. The limit distributions form a new, extended accelerated max-stable and min-stable distribution family (termed the accelerated p-max/p-min stable distribution), together with its left-truncated version. The limit types of the distributions are determined principally by the sample generating process and the interplay among the competing risks, which we illustrate with common examples. Further, we investigate statistical inference for this model, covering maximum likelihood estimation and model diagnostics. Numerical studies first show efficient approximation in all limit scenarios, with convergence rates comparable to those under linear normalization, and then present maximum likelihood estimation and diagnostics of accelerated p-max/p-min stable models for simulated data sets. Finally, two real datasets, concerning annual maxima of ground-level ozone and survival times from the Stanford heart transplant study, demonstrate the performance of our accelerated p-max and accelerated p-min stable models.
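
For readers unfamiliar with power normalization, a standard schematic of the p-max setup is sketched below; the paper's competing-risks acceleration is more elaborate than this:

```latex
% Power normalization of the sample maximum M_n: with norming
% constants a_n, b_n > 0,
\[
  \Pr\!\left\{ \left( \frac{|M_n|}{a_n} \right)^{1/b_n}
      \operatorname{sign}(M_n) \le x \right\} \;\longrightarrow\; H(x),
\]
% where H is a p-max stable law, in contrast to the linear
% normalization (M_n - mu_n)/sigma_n of classical extreme value theory.
```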

The question of whether $Y$ can be predicted from $X$ often arises, and while a well-fitted model may perform well on observed data, the risk of overfitting always exists, leading to poor generalization error on unseen data. This paper proposes a rigorous permutation test to assess the credibility of high $R^2$ values in regression models; the test can also be applied to any other measure of goodness of fit and requires no sample splitting, generating new pairings of $(X_i, Y_j)$ and providing an overall interpretation of the model's accuracy. It introduces a new formulation of the null hypothesis and a justification for the test that distinguish it from the previous literature. The theoretical findings are applied to both simulated data and sensor data of tennis serves in an experimental context. The simulation study underscores how the available information affects the test, showing that the less informative the predictors, the lower the probability of rejecting the null hypothesis, and emphasizing that detecting weaker dependence between variables requires a sufficient sample size.
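
A minimal sketch of the permutation idea, assuming ordinary least squares and $R^2$ as the goodness-of-fit measure (the paper covers other measures): break the $(X_i, Y_i)$ pairing by permuting $Y$, refit, and compare the resulting $R^2$ values with the observed one.

```python
# Permutation test for a high R^2 without sample splitting.
import numpy as np

def r2(X, y):
    X1 = np.column_stack([np.ones(len(y)), X])      # add intercept
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid.var() / y.var()

def permutation_pvalue(X, y, n_perm=999, seed=0):
    rng = np.random.default_rng(seed)
    observed = r2(X, y)
    # Null distribution: R^2 under random re-pairings of (X_i, Y_j).
    null = [r2(X, rng.permutation(y)) for _ in range(n_perm)]
    return (1 + sum(v >= observed for v in null)) / (n_perm + 1)
```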

Over the past half century, there have been several false dawns during which the "arrival" of world-changing artificial intelligence (AI) has been heralded. Tempting fate, the authors believe the age of AI has, indeed, finally arrived. Powerful image generators, such as DALL-E2 and Midjourney, have suddenly given anyone with access the ability to easily create rich and complex art. In a similar vein, text generators, such as GPT3.5 (including ChatGPT) and BLOOM, allow users to compose detailed written descriptions of many topics of interest. And it is now even possible for a person without extensive expertise in writing software to use AI to generate code capable of myriad applications. While AI will continue to evolve and improve, probably at a rapid rate, the current state of AI is already ushering in profound changes to many different sectors of society. Every new technology challenges the ability of humanity to govern it wisely. However, governance is usually viewed as both possible and necessary due to the disruption new technology often poses to social structures, industries, the environment, and other important human concerns. In this article, we offer an analysis of a range of interactions between AI and governance, with the hope that wise decisions may be made that maximize benefits and minimize costs. The article addresses two main aspects of this relationship: the governance of AI by humanity, and the governance of humanity by AI. The approach we have taken is itself informed by AI, as this article was written collaboratively by the authors and ChatGPT.

Image segmentation is still an open problem, especially when the intensities of the objects of interest overlap due to the presence of intensity inhomogeneity (also known as a bias field). To segment images with intensity inhomogeneities, we propose a bias-correction-embedded level set model in which Inhomogeneities are Estimated by Orthogonal Primary Functions (IEOPF). In the proposed model, the smoothly varying bias is estimated by a linear combination of a given set of orthogonal primary functions. An inhomogeneous intensity clustering energy is then defined, and membership functions of the clusters described by the level set function are introduced to rewrite the energy as the data term of the proposed model. As in popular level set methods, a regularization term and an arc length term are also included to regularize and smooth the level set function, respectively. The proposed model is then extended to multichannel and multiphase patterns to segment colour images and images with multiple objects, respectively. It has been extensively tested on both synthetic and real images that are widely used in the literature, as well as on the public BrainWeb and IBSR datasets. Experimental results and comparison with state-of-the-art methods demonstrate the advantages of the proposed model in terms of bias correction and segmentation accuracy.
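
The bias-field idea can be sketched in one dimension: represent a smooth bias as a linear combination of orthogonal basis functions and fit the weights by least squares. Legendre polynomials are used here as a stand-in assumption for the paper's orthogonal primary functions:

```python
# 1-D sketch: estimate a smooth bias field from an orthogonal basis.
import numpy as np
from numpy.polynomial import legendre

x = np.linspace(-1, 1, 200)
true_bias = 1.0 + 0.4 * x + 0.3 * (3 * x**2 - 1) / 2   # smooth inhomogeneity
observed = true_bias + np.random.default_rng(0).normal(0, 0.05, x.size)

degree = 3
basis = legendre.legvander(x, degree)           # columns are P_0 .. P_3 evaluated at x
weights, *_ = np.linalg.lstsq(basis, observed, rcond=None)
estimated_bias = basis @ weights                # smooth reconstruction of the bias
print("fitted weights:", np.round(weights, 3))
```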
