For over two decades, Internet access has been a topic of research and debate. Up-to-date evidence about key predictors such as age is important given not only the complexities of access to the online medium but also the ever-changing nature of the Internet. This paper provides a stocktake of current trends in Internet access in New Zealand and their association with age. It relies on secondary analysis of data from a larger online panel survey of 1,001 adult users. Chi-square tests of independence and Cramér's V were used for analysis. A key finding is an emerging gap in the quality of Internet access: while fibre is the predominant type of broadband connection at home, older adults are significantly less likely to have it and more likely to adopt wireless broadband. In addition, a large majority across all age groups hold a positive view of the Internet. This share was higher among older adults, who, interestingly, were also slightly more likely to say that their concern about the security of their personal details online had increased in the last year. The implications of the results are discussed and some directions for future research are proposed.
A simple way of obtaining robust estimates of the "center" (or the "location") and of the "scatter" of a dataset is to use the maximum likelihood estimate with a class of heavy-tailed distributions, regardless of the "true" distribution generating the data. We observe that the maximum likelihood problem for the Cauchy distributions, which have particularly heavy tails, is geodesically convex and therefore efficiently solvable (Cauchy distributions are parametrized by the upper half-plane, i.e., by the hyperbolic plane). Moreover, it has an appealing geometrical meaning: the data points, living on the boundary of the hyperbolic plane, attract the parameter with unit forces, and we search for the point where these forces are in equilibrium. This picture generalizes to several classes of multivariate distributions with heavy tails, including, in particular, the multivariate Cauchy distributions. The hyperbolic plane is replaced by symmetric spaces of noncompact type. Geodesic convexity gives us an efficient numerical solution of the maximum likelihood problem for these distribution classes, which can then be used for robust estimates of location and spread, thanks to the heavy tails of these distributions.
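As a concrete illustration of the robustness idea in this abstract, here is a minimal one-dimensional sketch (mine, not the paper's geodesic method): the classical EM-style fixed-point iteration for the Cauchy maximum likelihood estimate of location and scale, in which the weights 1/(1+z²) automatically downweight outliers. Function and variable names are illustrative.

```python
import numpy as np

def cauchy_mle(x, iters=200):
    """EM-style fixed-point iteration for the 1D Cauchy location/scale MLE.
    The weights w_i = 1/(1 + z_i^2) shrink toward zero for outlying points,
    which is what makes the resulting estimates robust."""
    # crude robust starting values: median and half the interquartile range
    mu = np.median(x)
    sigma = np.subtract(*np.percentile(x, [75, 25])) / 2.0
    for _ in range(iters):
        z = (x - mu) / sigma
        w = 1.0 / (1.0 + z**2)
        mu = np.sum(w * x) / np.sum(w)                      # weighted location update
        sigma = np.sqrt(2.0 * np.sum(w * (x - mu)**2) / len(x))  # scale update
    return mu, sigma
```

On data consisting mostly of standard-normal points plus a cluster of gross outliers, the Cauchy MLE location stays near the bulk of the data while the sample mean is dragged toward the outliers.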
The diffusion of AI and big data is reshaping decision-making processes by increasing the amount of information that supports decisions while reducing direct interaction with data and empirical evidence. This paradigm shift introduces new sources of uncertainty, as limited data observability results in ambiguity and a lack of interpretability. The need for the proper analysis of data-driven strategies motivates the search for new models that can describe this type of bounded access to knowledge. This contribution presents a novel theoretical model for uncertainty in knowledge representation and its transfer mediated by agents. We provide a dynamical description of knowledge states by endowing our model with a structure to compare and combine them. Specifically, an update is represented through combinations, and its explainability is based on its consistency in different dimensional representations. We look at inequivalent knowledge representations in terms of multiplicity of inferences, preference relations, and information measures. Furthermore, we define a formal analogy with two scenarios that illustrate non-classical uncertainty in terms of ambiguity (Ellsberg's model) and reasoning about knowledge mediated by other agents observing data (Wigner's friend). Finally, we discuss some implications of the proposed model for data-driven strategies, with special attention to reasoning under uncertainty about business value dimensions and the design of measurement tools for their assessment.
Document-based question-answering (QA) tasks are crucial for precise information retrieval. While some existing work evaluates large language models' (LLMs') performance on retrieving and answering questions from documents, their performance on QA types that require exact answer selection from predefined options or numerical extraction has yet to be fully assessed. In this paper, we focus on this underexplored setting and conduct an empirical analysis of LLMs (GPT-4 and GPT-3.5) on question types including single-choice, yes-no, multiple-choice, and number-extraction questions over documents. We use the Cogtale dataset for evaluation, which provides human expert-tagged responses, offering a robust benchmark for precision and factual grounding. We found that LLMs, particularly GPT-4, can precisely answer many single-choice and yes-no questions given relevant context, demonstrating their efficacy in information retrieval tasks. However, their performance diminishes on multiple-choice and number-extraction formats, lowering the models' overall performance on the task and indicating that they may not yet be reliable for it. This limits the use of LLMs in applications demanding precise information extraction from documents, such as meta-analysis tasks. These findings hinge on the assumption that the retriever furnishes the pertinent context needed for accurate responses, underscoring the need for further research on the efficacy of retrieval mechanisms in enhancing question-answering performance. Our work offers a framework for ongoing dataset evaluation, ensuring that LLM applications for information retrieval and document analysis continue to meet evolving standards.
We define morphological operators and filters for directional images whose pixel values are unit vectors. This requires an ordering relation for unit vectors, which we obtain by using depth functions; these provide a centre-outward ordering with respect to a specified centre vector. We apply our operators to synthetic directional images and compare them with classical morphological operators for grey-scale images. As application examples, we enhance the fault region in a compressed glass foam and segment misaligned fibre regions of glass-fibre-reinforced polymers.
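A toy sketch of the idea, under simplifying assumptions not taken from the abstract above: directions are stored as angles, and the "depth" of a unit vector is reduced to the cosine of its angular distance from a chosen centre direction, giving the centre-outward order needed for erosion (the directional analogue of a grey-scale minimum filter). The depth functions in the actual work are more general.

```python
import numpy as np

def erode_directional(theta, theta_c, k=3):
    """Erosion on an angle-valued image: within each k-by-k window, pick the
    direction farthest from the centre direction theta_c, i.e. the one with
    the lowest toy 'depth' cos(angle difference)."""
    H, W = theta.shape
    pad = k // 2
    tp = np.pad(theta, pad, mode="edge")      # replicate border pixels
    out = np.empty_like(theta)
    for i in range(H):
        for j in range(W):
            win = tp[i:i + k, j:j + k].ravel()
            depth = np.cos(win - theta_c)     # 1 = at the centre, -1 = opposite
            out[i, j] = win[np.argmin(depth)] # most "outward" direction wins
    return out
```

As with grey-scale erosion, a single outlying direction spreads to its neighbourhood, which is what makes the dual opening/closing filters useful for detecting misaligned regions.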
A central task in knowledge compilation is to compile a CNF-SAT instance into a succinct representation format that allows efficient operations such as testing satisfiability, counting, or enumerating all solutions. Useful representation formats studied in this area range from ordered binary decision diagrams (OBDDs) to circuits in decomposable negation normal form (DNNFs). While it is known that there exist CNF formulas that require exponential size representations, the situation is less well studied for other types of constraints than Boolean disjunctive clauses. The constraint satisfaction problem (CSP) is a powerful framework that generalizes CNF-SAT by allowing arbitrary sets of constraints over any finite domain. The main goal of our work is to understand for which types of constraints (also called the constraint language) it is possible to efficiently compute representations of polynomial size. We answer this question completely and prove two tight characterizations of efficiently compilable constraint languages, depending on whether the target format is structured. We first identify the combinatorial property of ``strong blockwise decomposability'' and show that if a constraint language has this property, we can compute DNNF representations of linear size. For all other constraint languages we construct families of CSP-instances that provably require DNNFs of exponential size. For a subclass of ``strong uniformly blockwise decomposable'' constraint languages we obtain a similar dichotomy for structured DNNFs. In fact, strong (uniform) blockwise decomposability even allows efficient compilation into multi-valued analogs of OBDDs and FBDDs, respectively. Thus, we get complete characterizations for all knowledge compilation classes between O(B)DDs and DNNFs.
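To make the compile-then-count theme concrete, here is a small illustrative sketch (mine, not the paper's construction): counting the models of a CNF formula by Shannon expansion along a fixed variable order with memoisation. This implicitly builds an OBDD-like decision structure, and the number of distinct (level, residual formula) pairs bounds the running time, mirroring how diagram size governs the cost of counting.

```python
from functools import lru_cache

# A CNF is a collection of clauses; a clause is a frozenset of literals,
# where literal +v means "variable v is true" and -v means "v is false".
def count_models(clauses, variables):
    """Count satisfying assignments via Shannon expansion along a fixed
    variable order, memoising on the residual formula at each level."""
    order = tuple(variables)

    def simplify(cls, lit):
        """Condition the clause set on literal `lit` being true."""
        out = set()
        for c in cls:
            if lit in c:
                continue              # clause satisfied, drop it
            c2 = c - {-lit}           # remove the falsified literal
            if not c2:
                return None           # empty clause: contradiction
            out.add(c2)
        return frozenset(out)

    @lru_cache(maxsize=None)
    def go(i, cls):
        if cls is None:
            return 0                  # contradiction on this branch
        if i == len(order):
            return 1                  # all variables decided, formula satisfied
        v = order[i]
        return go(i + 1, simplify(cls, v)) + go(i + 1, simplify(cls, -v))

    return go(0, frozenset(clauses))
```

The memoisation is exactly where a fixed variable order pays off: branches that reach the same residual formula at the same level are merged, just as isomorphic OBDD nodes are.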
Researchers rely on academic web search engines to find scientific sources, but search engine mechanisms may selectively present content that aligns with biases embedded in the queries. This study examines whether confirmation-biased queries submitted to Google Scholar and Semantic Scholar yield skewed results. Six queries, on topics across the health and technology domains (such as "vaccines" or "internet use"), were analyzed for disparities in search results. We confirm that biased queries (targeting "benefits" or "risks") skew search results in line with the bias, with technology-related queries displaying more pronounced disparities. Overall, Semantic Scholar exhibited fewer disparities than Google Scholar. Topics rated as more polarizing did not consistently show more skewed results. Academic search results that perpetuate confirmation bias have strong implications for both researchers and citizens searching for evidence. More research is needed to explore how scientific inquiry and academic search engines interact.
Consider the family of power divergence statistics based on $n$ trials, each leading to one of $r$ possible outcomes. This includes the log-likelihood ratio and Pearson's statistic as important special cases. It is known that in certain regimes (e.g., when $r$ is of order $n^2$ and the allocation is asymptotically uniform as $n\to\infty$) the power divergence statistic converges in distribution to a linear transformation of a Poisson random variable. We establish explicit error bounds in the Kolmogorov (or uniform) metric to complement this convergence result, which may be applied for any values of $n$, $r$ and the index parameter $\lambda$ for which such a finite-sample bound is meaningful. We further use this Poisson approximation result to derive error bounds in Gaussian approximation of the power divergence statistics.
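For readers unfamiliar with the family: with observed counts $X_1,\dots,X_r$ over the $r$ outcomes (so $\sum_j X_j = n$) and null probabilities $p_1,\dots,p_r$, the Cressie–Read power divergence statistic with index $\lambda$ is

$$T_\lambda = \frac{2}{\lambda(\lambda+1)} \sum_{j=1}^{r} X_j\left[\left(\frac{X_j}{n p_j}\right)^{\lambda} - 1\right],$$

with $\lambda = 1$ recovering Pearson's statistic $\sum_j (X_j - np_j)^2/(np_j)$ and the limit $\lambda \to 0$ the log-likelihood ratio statistic $2\sum_j X_j \log\!\big(X_j/(np_j)\big)$.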
The trace plot is seldom used in meta-analysis, yet it is a very informative plot. In this article, we define and illustrate the trace plot and discuss why it is important. The Bayesian version of the plot combines the posterior density of tau, the between-study standard deviation, with the shrunken estimates of the study effects as a function of tau. With a small or moderate number of studies, tau is not estimated with much precision, and parameter estimates and shrunken study-effect estimates can vary widely depending on the true value of tau. The trace plot allows visualization of this sensitivity to tau alongside a display of which values of tau are plausible and which are implausible. A comparable frequentist or empirical Bayes version provides similar results. The concepts are illustrated using examples in meta-analysis and meta-regression; implementation in R is facilitated in a Bayesian or frequentist framework using the bayesmeta and metafor packages, respectively.
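A minimal numerical sketch of the frequentist version of the quantities being traced (illustrative only, not the bayesmeta/metafor implementation): for each candidate tau, compute the weighted pooled mean and the conditional shrunken study effects; plotting the resulting curves against tau is the trace plot. The data and variable names below are hypothetical.

```python
import numpy as np

def shrunken_effects(y, s2, tau):
    """Given between-study SD tau, return the pooled mean and the conditional
    shrunken study effects: each estimate is pulled toward the pooled mean,
    more strongly the smaller tau is relative to the within-study variance."""
    w = 1.0 / (s2 + tau**2)                          # inverse-variance weights
    mu = np.sum(w * y) / np.sum(w)                   # pooled mean at this tau
    theta = (tau**2 * y + s2 * mu) / (tau**2 + s2)   # shrunken study effects
    return mu, theta

# trace the shrunken effects over a grid of tau values (hypothetical data)
y = np.array([0.1, 0.3, 0.5])        # study effect estimates
s2 = np.array([0.04, 0.04, 0.04])    # within-study variances
taus = np.linspace(0.0, 1.0, 50)
trace = np.array([shrunken_effects(y, s2, t)[1] for t in taus])
```

At tau = 0 every study is shrunk completely to the pooled mean; as tau grows the shrunken effects fan out toward the raw study estimates, which is exactly the spread the trace plot makes visible.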
Orthogonal meta-learners, such as the DR-learner, R-learner and IF-learner, are increasingly used to estimate conditional average treatment effects. They improve convergence rates relative to naïve meta-learners (e.g., the T-, S- and X-learner) through de-biasing procedures that involve applying standard learners to specifically transformed outcome data. This leads them to disregard the possibly constrained outcome space, which can be particularly problematic for dichotomous outcomes: these are typically transformed to values that are no longer constrained to the unit interval, making it difficult for standard learners to guarantee predictions within the unit interval. To address this, we construct orthogonal meta-learners for the prediction of counterfactual outcomes that respect the outcome space. As such, the obtained i-learner or imputation learner is more generally expected to outperform existing learners, even when the outcome is unconstrained, as we confirm empirically in simulation studies and an analysis of critical care data. Our development also sheds broader light on the construction of orthogonal learners for other estimands.
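For orientation, here is a compact sketch of one existing orthogonal learner of the kind discussed above, the DR-learner, on simulated data (simplified: a randomized treatment with known propensity 0.5, per-arm linear outcome models, and no cross-fitting; all names are illustrative). The doubly robust pseudo-outcome below is the sort of transformed outcome the abstract refers to, and it can indeed leave the unit interval even when Y is binary.

```python
import numpy as np

def linfit(X, y):
    """Ordinary least squares with intercept; returns a prediction function."""
    Z = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ beta

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 1))
A = rng.binomial(1, 0.5, size=n)                          # randomized treatment
Y = X[:, 0] + A * 1.0 + rng.normal(scale=0.1, size=n)     # true CATE is 1

# stage 1: nuisance outcome models, fitted separately per treatment arm
m0 = linfit(X[A == 0], Y[A == 0])(X)
m1 = linfit(X[A == 1], Y[A == 1])(X)

# stage 2: doubly robust pseudo-outcome, then regress it on X
pi = 0.5                                                  # known propensity
mA = np.where(A == 1, m1, m0)
phi = (A - pi) / (pi * (1 - pi)) * (Y - mA) + m1 - m0
cate_hat = linfit(X, phi)(X)                              # estimated CATE
```

The key point for the abstract's argument is that `phi` is unbounded by construction, so nothing constrains `cate_hat`, or imputed counterfactual outcomes derived from it, to a valid outcome range.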