成年人日屄视频免费观看,啊在线不卡视频无码,又粗又爽又硬免费看视频骚视频

Emergent chain-of-thought (CoT) reasoning capabilities promise to improve performance and explainability of large language models (LLMs). However, uncertainties remain about how reasoning strategies formulated for previous model generations generalize to new model generations and different datasets. In this small-scale study, we compare different reasoning strategies induced by zero-shot prompting across six recently released LLMs (davinci-002, davinci-003, GPT-3.5-turbo, GPT-4, Flan-T5-xxl and Cohere command-xlarge) on a mixture of six question-answering datasets, including datasets from scientific and medical domains. Our findings demonstrate that while some variations in effectiveness occur, gains from CoT reasoning strategies remain robust across different models and datasets. GPT-4 has the most benefit from current state-of-the-art reasoning strategies and exhibits the best performance by applying a prompt previously discovered through automated discovery.

相關內容

MoDELS

關注 43

ACM/IEEE第23屆模型驅動工程語言和系統國際會議，是模型驅動軟件和系統工程的首要會議系列，由ACM-SIGSOFT和IEEE-TCSE支持組織。自1998年以來，模型涵蓋了建模的各個方面，從語言和方法到工具和應用程序。模特的參加者來自不同的背景，包括研究人員、學者、工程師和工業專業人士。MODELS 2019是一個論壇，參與者可以圍繞建模和模型驅動的軟件和系統交流前沿研究成果和創新實踐經驗。今年的版本將為建模社區提供進一步推進建模基礎的機會，并在網絡物理系統、嵌入式系統、社會技術系統、云計算、大數據、機器學習、安全、開源等新興領域提出建模的創新應用以及可持續性。官網鏈接： · MoDELS · INFORMS · 均值 · 變換 ·

2023 年 9 月 25 日

Speaker anonymization using neural audio codec language models

Michele Panariello,Francesco Nespoli,Massimiliano Todisco,Nicholas Evans

from arxiv, Submitted to ICASSP 2024

The vast majority of approaches to speaker anonymization involve the extraction of fundamental frequency estimates, linguistic features and a speaker embedding which is perturbed to obfuscate the speaker identity before an anonymized speech waveform is resynthesized using a vocoder. Recent work has shown that x-vector transformations are difficult to control consistently: other sources of speaker information contained within fundamental frequency and linguistic features are re-entangled upon vocoding, meaning that anonymized speech signals still contain speaker information. We propose an approach based upon neural audio codecs (NACs), which are known to generate high-quality synthetic speech when combined with language models. NACs use quantized codes, which are known to effectively bottleneck speaker-related information: we demonstrate the potential of speaker anonymization systems based on NAC language modeling by applying the evaluation framework of the Voice Privacy Challenge 2022.

Analysis · INFORMS · 可理解性 · 知識 (knowledge) · Extensibility ·

2023 年 9 月 25 日

Understanding metric-related pitfalls in image analysis validation

Annika Reinke,Minu D. Tizabi,Michael Baumgartner,Matthias Eisenmann,Doreen Heckmann-N?tzel,A. Emre Kavur,Tim R?dsch,Carole H. Sudre,Laura Acion,Michela Antonelli,Tal Arbel,Spyridon Bakas,Arriel Benis,Matthew Blaschko,Florian Buettner,M. Jorge Cardoso,Veronika Cheplygina,Jianxu Chen,Evangelia Christodoulou,Beth A. Cimini,Gary S. Collins,Keyvan Farahani,Luciana Ferrer,Adrian Galdran,Bram van Ginneken,Ben Glocker,Patrick Godau,Robert Haase,Daniel A. Hashimoto,Michael M. Hoffman,Merel Huisman,Fabian Isensee,Pierre Jannin,Charles E. Kahn,Dagmar Kainmueller,Bernhard Kainz,Alexandros Karargyris,Alan Karthikesalingam,Hannes Kenngott,Jens Kleesiek,Florian Kofler,Thijs Kooi,Annette Kopp-Schneider,Michal Kozubek,Anna Kreshuk,Tahsin Kurc,Bennett A. Landman,Geert Litjens,Amin Madani,Klaus Maier-Hein,Anne L. Martel,Peter Mattson,Erik Meijering,Bjoern Menze,Karel G. M. Moons,Henning Müller,Brennan Nichyporuk,Felix Nickel,Jens Petersen,Susanne M. Rafelski,Nasir Rajpoot,Mauricio Reyes,Michael A. Riegler,Nicola Rieke,Julio Saez-Rodriguez,Clara I. Sánchez,Shravya Shetty,Maarten van Smeden,Ronald M. Summers,Abdel A. Taha,Aleksei Tiulpin,Sotirios A. Tsaftaris,Ben Van Calster,Ga?l Varoquaux,Manuel Wiesenfarth,Ziv R. Yaniv,Paul F. J?ger,Lena Maier-Hein

from arxiv, Shared first authors: Annika Reinke, Minu D. Tizabi; shared senior authors: Paul F. J\"ager, Lena Maier-Hein

Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: While taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multi-stage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides the first reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Focusing on biomedical image analysis but with the potential of transfer to other fields, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. To facilitate comprehension, illustrations and specific examples accompany each pitfall. As a structured body of information accessible to researchers of all levels of expertise, this work enhances global comprehension of a key topic in image analysis validation.

Pivotal（公司） · 單純形 · 策略迭代 · Markov · CASE ·

2023 年 9 月 25 日

A unified worst case for classical simplex and policy iteration pivot rules

Yann Disser,Nils Mosis

We construct a family of Markov decision processes for which the policy iteration algorithm needs an exponential number of improving switches with Dantzig's rule, with Bland's rule, and with the Largest Increase pivot rule. This immediately translates to a family of linear programs for which the simplex algorithm needs an exponential number of pivot steps with the same three pivot rules. Our results yield a unified construction that simultaneously reproduces well-known lower bounds for these classical pivot rules, and we are able to infer that any (deterministic or randomized) combination of them cannot avoid an exponential worst-case behavior. Regarding the policy iteration algorithm, pivot rules typically switch multiple edges simultaneously and our lower bound for Dantzig's rule and the Largest Increase rule, which perform only single switches, seem novel. Regarding the simplex algorithm, the individual lower bounds were previously obtained separately via deformed hypercube constructions. In contrast to previous bounds for the simplex algorithm via Markov decision processes, our rigorous analysis is reasonably concise.

向量化 · 路徑 · 相互獨立的 · 評論員 · 模型評估 ·

2023 年 9 月 24 日

A new unified arc-length method for damage mechanics problems

Roshan Philip Saji,Panos Pantidis,Mostafa E. Mobasher

The numerical solution of continuum damage mechanics (CDM) problems suffers from convergence-related challenges during the material softening stage, and consequently existing iterative solvers are subject to a trade-off between computational expense and solution accuracy. In this work, we present a novel unified arc-length (UAL) method, and we derive the formulation of the analytical tangent matrix and governing system of equations for both local and non-local gradient damage problems. Unlike existing versions of arc-length solvers that monolithically scale the external force vector, the proposed method treats the latter as an independent variable and determines the position of the system on the equilibrium path based on all the nodal variations of the external force vector. This approach renders the proposed solver substantially more efficient and robust than existing solvers used in CDM problems. We demonstrate the considerable advantages of the proposed algorithm through several benchmark 1D problems with sharp snap-backs and 2D examples under various boundary conditions and loading scenarios. The proposed UAL approach exhibits a superior ability of overcoming critical increments along the equilibrium path. Moreover, the proposed UAL method is 1-2 orders of magnitude faster than force-controlled arc-length and monolithic Newton-Raphson solvers.

Analysis · 可辨認的 · Continuity · CASES · Performer ·

2023 年 9 月 22 日

A Fourier-based methodology without numerical diffusion for conducting dye simulations and particle residence time calculations

Faisal Amlani,Heng Wei,Niema M. Pahlevan

Dye experimentation is a widely used method in experimental fluid mechanics for flow analysis or for the study of the transport of particles within a fluid. This technique is particularly useful in biomedical diagnostic applications ranging from hemodynamic analysis of cardiovascular systems to ocular circulation. However, simulating dyes governed by convection-diffusion partial differential equations (PDEs) can also be a useful post-processing analysis approach for computational fluid dynamics (CFD) applications. Such simulations can be used to identify the relative significance of different spatial subregions in particular time intervals of interest in an unsteady flow field. Additionally, dye evolution is closely related to non-discrete particle residence time (PRT) calculations that are governed by similar PDEs. This contribution introduces a pseudo-spectral method based on Fourier continuation (FC) for conducting dye simulations and non-discrete particle residence time calculations without numerical diffusion errors. Convergence and error analyses are performed with both manufactured and analytical solutions. The methodology is applied to three distinct physical/physiological cases: 1) flow over a two-dimensional (2D) cavity; 2) pulsatile flow in a simplified partially-grafted aortic dissection model; and 3) non-Newtonian blood flow in a Fontan graft. Although velocity data is provided in this work by numerical simulation, the proposed approach can also be applied to velocity data collected through experimental techniques such as from particle image velocimetry.

語義相似度 · 相似度 · 相似度度量 · Better · MoDELS ·

2023 年 9 月 22 日

Steffen Herbold

from arxiv, Under review

Semantic similarity between natural language texts is typically measured either by looking at the overlap between subsequences (e.g., BLEU) or by using embeddings (e.g., BERTScore, S-BERT). Within this paper, we argue that when we are only interested in measuring the semantic similarity, it is better to directly predict the similarity using a fine-tuned model for such a task. Using a fine-tuned model for the STS-B from the GLUE benchmark, we define the STSScore approach and show that the resulting similarity is better aligned with our expectations on a robust semantic similarity measure than other approaches.

估計/估計量 · 泛函 · Copulas · Continuity · 相互獨立的 ·

2023 年 9 月 21 日

Quantifying and estimating dependence via sensitivity of conditional distributions

Jonathan Ansari,Patrick B. Langthaler,Sebastian Fuchs,Wolfgang Trutschnig

from arxiv, 24 pages, 5 figures, 1 table

Recently established, directed dependence measures for pairs $(X,Y)$ of random variables build upon the natural idea of comparing the conditional distributions of $Y$ given $X=x$ with the marginal distribution of $Y$. They assign pairs $(X,Y)$ values in $[0,1]$, the value is $0$ if and only if $X,Y$ are independent, and it is $1$ exclusively for $Y$ being a function of $X$. Here we show that comparing randomly drawn conditional distributions with each other instead or, equivalently, analyzing how sensitive the conditional distribution of $Y$ given $X=x$ is on $x$, opens the door to constructing novel families of dependence measures $\Lambda_\varphi$ induced by general convex functions $\varphi: \mathbb{R} \rightarrow \mathbb{R}$, containing, e.g., Chatterjee's coefficient of correlation as special case. After establishing additional useful properties of $\Lambda_\varphi$ we focus on continuous $(X,Y)$, translate $\Lambda_\varphi$ to the copula setting, consider the $L^p$-version and establish an estimator which is strongly consistent in full generality. A real data example and a simulation study illustrate the chosen approach and the performance of the estimator. Complementing the afore-mentioned results, we show how a slight modification of the construction underlying $\Lambda_\varphi$ can be used to define new measures of explainability generalizing the fraction of explained variance.

Analysis · 層 · 估計/估計量 · 樣例 · 平滑 ·

2023 年 9 月 21 日

Numerical analysis of a singularly perturbed 4th order problem with a shift term

Sebastian Franz,Kleio Liotati

from arxiv, 24 pages, 2 figures

We consider a one-dimensional singularly perturbed 4th order problem with the additional feature of a shift term. An expansion into a smooth term, boundary layers and an inner layer yields a formal solution decomposition, and together with a stability result we have estimates for the subsequent numerical analysis. With classical layer adapted meshes we present a numerical method, that achieves supercloseness and optimal convergence orders in the associated energy norm. We also consider coarser meshes in view of the weak layers. Some numerical examples conclude the paper and support the theory.

模型評估 · 均值 · Performer · SimPLe · Ray ·

2023 年 9 月 21 日

Grammatical cues to subjecthood are redundant in a majority of simple clauses across languages

Kyle Mahowald,Evgeniia Diachek,Edward Gibson,Evelina Fedorenko,Richard Futrell

Grammatical cues are sometimes redundant with word meanings in natural language. For instance, English word order rules constrain the word order of a sentence like "The dog chewed the bone" even though the status of "dog" as subject and "bone" as object can be inferred from world knowledge and plausibility. Quantifying how often this redundancy occurs, and how the level of redundancy varies across typologically diverse languages, can shed light on the function and evolution of grammar. To that end, we performed a behavioral experiment in English and Russian and a cross-linguistic computational analysis measuring the redundancy of grammatical cues in transitive clauses extracted from corpus text. English and Russian speakers (n=484) were presented with subjects, verbs, and objects (in random order and with morphological markings removed) extracted from naturally occurring sentences and were asked to identify which noun is the subject of the action. Accuracy was high in both languages (~89% in English, ~87% in Russian). Next, we trained a neural network machine classifier on a similar task: predicting which nominal in a subject-verb-object triad is the subject. Across 30 languages from eight language families, performance was consistently high: a median accuracy of 87%, comparable to the accuracy observed in the human experiments. The conclusion is that grammatical cues such as word order are necessary to convey subjecthood and objecthood in a minority of naturally occurring transitive clauses; nevertheless, they can (a) provide an important source of redundancy and (b) are crucial for conveying intended meaning that cannot be inferred from the words alone, including descriptions of human interactions, where roles are often reversible (e.g., Ray helped Lu/Lu helped Ray), and expressing non-prototypical meanings (e.g., "The bone chewed the dog.").

近似 · 控制器 · 易處理的 · prototype · 穩健性 ·

2023 年 9 月 14 日

Guaranteed approximations of arbitrarily quantified reachability problems

Eric Goubault,Sylvie Putot

We propose an approach to compute inner and outer-approximations of the sets of values satisfying constraints expressed as arbitrarily quantified formulas. Such formulas arise for instance when specifying important problems in control such as robustness, motion planning or controllers comparison. We propose an interval-based method which allows for tractable but tight approximations. We demonstrate its applicability through a series of examples and benchmarks using a prototype implementation.