Evaluations of model editing currently rely only on the `next few token' completions that follow a prompt. As a result, the impact of these methods on longer natural language generation is largely unknown. We introduce long-form evaluation of model editing (\textbf{\textit{LEME}}), a novel evaluation protocol that measures the efficacy and impact of model editing in long-form generative settings. Our protocol consists of a machine-rated survey and a classifier that correlates well with human ratings. Importantly, we find that our protocol correlates only weakly with previous short-form metrics (despite being designed to extend efficacy, generalization, locality, and portability into a long-form setting), indicating that it introduces a novel set of dimensions for understanding model editing methods. Using this protocol, we benchmark a number of model editing techniques and present several findings, including that, while some methods (ROME and MEMIT) perform well in making consistent edits within a limited scope, they suffer much more from factual drift than other methods. Finally, we present a qualitative analysis that illustrates common failure modes in long-form generative settings, including internal consistency, lexical cohesion, and locality issues.
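To make the survey component concrete, here is a minimal sketch of a machine-rated survey; the dimension names, the rubric wording, and the `rate_fn' rater hook are illustrative assumptions, not the actual LEME prompts:

\begin{verbatim}
# Minimal sketch of a LEME-style machine-rated survey.
# SURVEY_DIMENSIONS, the rubric wording, and `rate_fn` are
# illustrative assumptions, not the paper's actual prompts.
SURVEY_DIMENSIONS = ["edit_consistency", "internal_consistency",
                     "lexical_cohesion", "locality"]

def build_survey_prompt(edited_fact, generation, dimension):
    return (f"Edited fact: {edited_fact}\n"
            f"Model output: {generation}\n"
            f"On a scale of 1-5, rate the output's {dimension}. "
            f"Answer with a single number.")

def machine_rated_survey(edited_fact, generation, rate_fn):
    """Collect one 1-5 rating per dimension and add the mean score."""
    scores = {d: float(rate_fn(build_survey_prompt(edited_fact,
                                                   generation, d)))
              for d in SURVEY_DIMENSIONS}
    scores["overall"] = sum(scores.values()) / len(SURVEY_DIMENSIONS)
    return scores

# Trivial stand-in rater that always answers "4".
print(machine_rated_survey("The Eiffel Tower is in Rome.",
                           "The Eiffel Tower, located in Rome, ...",
                           rate_fn=lambda prompt: "4"))
\end{verbatim}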
Fractional dissipation is a powerful tool to study non-local physical phenomena such as damping models. The design of geometric, in particular variational, integrators for the numerical simulation of such systems relies on a variational formulation of the model. In [19], a new approach is proposed to deal with dissipative systems, including fractionally damped systems, in a variational way in both the continuous and the discrete setting. It is based on the doubling of variables and their fractional derivatives. The aim of this work is to derive higher-order fractional variational integrators by means of convolution quadrature (CQ) based on backward difference formulas. We thereby obtain numerical methods of order 2, improving on a previous result in [19]. The convergence properties of the fractional variational integrators and saturation effects due to the approximation of the fractional derivatives by CQ are studied numerically.
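As a rough illustration of such a convolution quadrature, the sketch below computes the CQ weights for BDF2 via the classical J.C.P. Miller power-series recurrence and applies them to a sampled function. It is a minimal numpy example, not the variational integrators of this work; in particular, correction terms for non-vanishing initial data are omitted:

\begin{verbatim}
import numpy as np

def bdf2_cq_weights(alpha, N):
    """CQ weights w_0..w_N with sum_j w_j zeta^j = delta(zeta)^alpha,
    where delta(zeta) = 3/2 - 2*zeta + zeta**2/2 is the BDF2 symbol,
    computed via the J.C.P. Miller power-series recurrence."""
    a = np.array([1.5, -2.0, 0.5])          # coefficients of delta
    w = np.zeros(N + 1)
    w[0] = a[0] ** alpha
    for n in range(1, N + 1):
        s = 0.0
        for k in range(1, min(n, 2) + 1):
            s += (k * (alpha + 1) - n) * a[k] * w[n - k]
        w[n] = s / (n * a[0])
    return w

def fractional_derivative_cq(u, h, alpha):
    """Approximate the order-alpha fractional derivative of samples
    u[0..N] on a uniform grid with step h (Riemann-Liouville type)."""
    w = bdf2_cq_weights(alpha, len(u) - 1)
    return np.array([np.dot(w[:n + 1], u[n::-1])
                     for n in range(len(u))]) / h**alpha

# Sanity check on u(t) = t:  D^0.5 t = 2*sqrt(t/pi).
h, alpha = 1e-3, 0.5
t = np.arange(0, 1 + h, h)
approx = fractional_derivative_cq(t, h, alpha)
print(abs(approx[-1] - 2 * np.sqrt(t[-1] / np.pi)))  # small
\end{verbatim}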
The present article aims to design and analyze efficient first-order strong schemes for a generalized A\"{i}t-Sahalia type model arising in mathematical finance and evolving in a positive domain $(0, \infty)$, which possesses a diffusion term with superlinear growth and a highly nonlinear drift that blows up at the origin. Such a complicated structure of the model unavoidably causes essential difficulties in the construction and convergence analysis of time discretizations. By incorporating implicitness in the term $\alpha_{-1} x^{-1}$ and a corrective mapping $\Phi_h$ in the recursion, we develop a novel class of explicit and unconditionally positivity-preserving (i.e., for any step-size $h>0$) Milstein-type schemes for the underlying model. In both non-critical and general critical cases, we introduce a new approach to analyze mean-square error bounds of the proposed schemes, without relying on a priori high-order moment bounds of the numerical approximations. The expected order-one mean-square convergence is attained. This theoretical guarantee can be used to justify the optimal complexity of the multilevel Monte Carlo method. Numerical experiments are finally provided to verify the theoretical findings.
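The abstract does not spell out the scheme, but one plausible realization of a step that is implicit only in the $\alpha_{-1} x^{-1}$ term is sketched below; the taming map standing in for the corrective mapping $\Phi_h$ is an assumption for illustration and may differ from the one defined in the article. Solving the implicit relation reduces to a scalar quadratic whose positive root exists for every $h>0$, which is what yields unconditional positivity:

\begin{verbatim}
import numpy as np

def milstein_pos_step(y, h, dW, a_m1, a0, a1, a2, r, sigma, rho):
    """One step of a positivity-preserving Milstein-type scheme for
    dX = (a_m1/X - a0 + a1*X - a2*X**r) dt + sigma*X**rho dW.
    Only the a_m1/X term is implicit; the resulting quadratic has a
    root that is positive for any h > 0.  The taming map phi is a
    guess standing in for the paper's corrective mapping Phi_h."""
    phi = lambda z: z / (1.0 + h * abs(z))       # assumed corrective map
    explicit = ((-a0 + a1 * y - a2 * y**r) * h
                + sigma * y**rho * dW
                + 0.5 * sigma**2 * rho * y**(2 * rho - 1) * (dW**2 - h))
    b = y + phi(explicit)
    return 0.5 * (b + np.sqrt(b**2 + 4.0 * a_m1 * h))  # positive root

# Simulate one path; parameter values are illustrative only.
rng = np.random.default_rng(0)
h, y = 1e-3, 1.0
for _ in range(1000):
    y = milstein_pos_step(y, h, np.sqrt(h) * rng.standard_normal(),
                          a_m1=1.0, a0=1.0, a1=1.0, a2=1.0, r=2.0,
                          sigma=0.5, rho=1.5)
print(y)  # stays positive by construction
\end{verbatim}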
This paper presents a Bayesian regression model relating scalar outcomes to brain functional connectivity represented as symmetric positive definite (SPD) matrices. Unlike many proposals that simply vectorize the matrix-valued connectivity predictors, thereby ignoring their geometric structure, the method presented here respects the Riemannian geometry of SPD matrices by using a tangent space modeling. Dimension reduction is performed in the tangent space, relating the resulting low-dimensional representations to the responses. The dimension reduction matrix is learned in a supervised manner with a sparsity-inducing prior imposed on a Stiefel manifold to prevent overfitting. Our method yields a parsimonious regression model that allows uncertainty quantification of all model parameters and identification of key brain regions that predict the outcomes. We demonstrate the performance of our approach in simulation settings and through a case study to predict Picture Vocabulary scores using data from the Human Connectome Project.
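A minimal sketch of the tangent-space step (affine-invariant geometry) is given below; the supervised Stiefel-manifold dimension reduction and the sparsity-inducing prior are beyond this snippet, and using the Euclidean mean as the reference point is a simplification (a Fréchet mean is more common):

\begin{verbatim}
import numpy as np
from scipy.linalg import fractional_matrix_power, logm

def tangent_vectors(spd_mats, ref):
    """Project SPD matrices onto the tangent space at `ref`
    (affine-invariant metric) and vectorize the upper triangle,
    scaling off-diagonal entries by sqrt(2) to preserve norms."""
    p = ref.shape[0]
    w = fractional_matrix_power(ref, -0.5)       # whitening matrix
    iu = np.triu_indices(p)
    scale = np.where(iu[0] == iu[1], 1.0, np.sqrt(2.0))
    vecs = []
    for s in spd_mats:
        t = logm(w @ s @ w)                      # symmetric tangent matrix
        vecs.append(np.real(t[iu]) * scale)
    return np.array(vecs)                        # shape (n, p*(p+1)/2)

# Toy usage: random SPD "connectivity" matrices, Euclidean-mean reference.
rng = np.random.default_rng(1)
mats = [(lambda a: a @ a.T + np.eye(4))(rng.standard_normal((4, 4)))
        for _ in range(10)]
X = tangent_vectors(mats, ref=np.mean(mats, axis=0))
print(X.shape)  # (10, 10)
\end{verbatim}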
The present work is devoted to strong approximations of a generalized A\"{i}t-Sahalia model arising from mathematical finance. The numerical study of the considered model faces essential difficulties caused by a drift that blows up at the origin, highly nonlinear drift and diffusion coefficients, and the requirement to preserve positivity. In this paper, a novel explicit Euler-type scheme is proposed, which is easily implementable and preserves the positivity of the original model unconditionally, i.e., for any time step-size $h >0$. A mean-square convergence rate of order $0.5$ is also obtained for the proposed scheme in both non-critical and general critical cases. Our work is motivated by the need to justify multilevel Monte Carlo (MLMC) simulations for the underlying model, where the rate of mean-square convergence is required and the preservation of positivity is desirable, particularly for large discretization time steps. Numerical experiments are finally provided to confirm the theoretical findings.
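For context, the MLMC estimator that motivates this work telescopes over discretization levels and couples fine and coarse paths through shared Brownian increments. A generic sketch follows, with the one-step scheme passed in as a black box; the GBM step in the usage example is illustrative only and not the scheme proposed here:

\begin{verbatim}
import numpy as np

def mlmc_estimate(step_fn, payoff, y0, T, L, N, rng):
    """Multilevel Monte Carlo estimate of E[payoff(X_T)].
    Level l uses 2**l steps; fine and coarse paths on each level are
    coupled by summing the fine Brownian increments pairwise.
    `step_fn(y, h, dW)` is any one-step scheme, e.g. a
    positivity-preserving Euler method as in the text."""
    est = 0.0
    for l in range(L + 1):
        nl, acc = 2**l, 0.0
        h_f = T / nl
        for _ in range(N[l]):
            dW = np.sqrt(h_f) * rng.standard_normal(nl)
            yf = y0
            for k in range(nl):
                yf = step_fn(yf, h_f, dW[k])
            if l == 0:
                acc += payoff(yf)
            else:
                yc, h_c = y0, 2 * h_f
                for k in range(0, nl, 2):
                    yc = step_fn(yc, h_c, dW[k] + dW[k + 1])
                acc += payoff(yf) - payoff(yc)
        est += acc / N[l]
    return est

# Toy usage with a geometric Brownian motion Euler step.
rng = np.random.default_rng(2)
gbm_step = lambda y, h, dW: y * (1.0 + 0.05 * h + 0.2 * dW)
print(mlmc_estimate(gbm_step, payoff=lambda x: max(x - 1.0, 0.0),
                    y0=1.0, T=1.0, L=5,
                    N=[4000 // 2**l + 100 for l in range(6)], rng=rng))
\end{verbatim}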
In this work we present denoiSplit, a method to tackle a new analysis task: joint semantic image splitting and unsupervised denoising. This dual task has important applications in fluorescence microscopy, where image splitting is a key analysis step but noise generally hinders the downstream analysis of image content. Image splitting involves dissecting an image into its distinguishable semantic structures. We show that the current state-of-the-art method for this task struggles in the presence of image noise, inadvertently also distributing the noise across the predicted outputs. The method we present here can deal with image noise by integrating an unsupervised denoising sub-task. This integration results in improved semantic image unmixing, even in the presence of notable and realistic levels of imaging noise. A key innovation in denoiSplit is the use of specifically formulated noise models and a suitable adjustment of the KL-divergence loss for the high-dimensional hierarchical latent space being trained. We showcase the performance of denoiSplit across 4 tasks on real-world microscopy images. Additionally, we perform qualitative and quantitative evaluations and compare results to existing benchmarks, demonstrating the effectiveness of denoiSplit: a single Variational Splitting Encoder-Decoder (VSE) network using two suitable noise models to jointly perform semantic splitting and denoising.
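The following sketch shows the shape such a training objective could take: per-channel noise-model likelihoods plus a KL term summed over hierarchy levels. The fixed-sigma Gaussian noise model and the per-dimension KL normalization are guesses standing in for the specifically formulated noise models and the KL adjustment mentioned above:

\begin{verbatim}
import numpy as np

def gaussian_nll(noisy_target, pred_clean, sigma):
    """Noise-model likelihood: noisy pixel ~ N(clean signal, sigma^2).
    Real microscopy noise models are signal-dependent; a fixed-sigma
    Gaussian keeps the sketch short."""
    return 0.5 * np.mean(((noisy_target - pred_clean) / sigma) ** 2
                         + np.log(2 * np.pi * sigma**2))

def hierarchical_kl(mus, logvars):
    """KL(q || N(0,I)) summed over hierarchy levels, normalized per
    latent dimension -- a guess at the 'suitable adjustment' above."""
    total = 0.0
    for mu, logvar in zip(mus, logvars):
        kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)
        total += kl / mu.size
    return total

def denoisplit_like_loss(noisy_c1, noisy_c2, pred_c1, pred_c2,
                         mus, logvars, sigma1, sigma2, beta=1.0):
    return (gaussian_nll(noisy_c1, pred_c1, sigma1)
            + gaussian_nll(noisy_c2, pred_c2, sigma2)
            + beta * hierarchical_kl(mus, logvars))

# Toy shapes: two 8x8 channels, three latent levels of 16 dims each.
rng = np.random.default_rng(6)
c1, c2 = rng.random((8, 8)), rng.random((8, 8))
mus = [rng.standard_normal(16) for _ in range(3)]
logvars = [np.zeros(16) for _ in range(3)]
print(denoisplit_like_loss(c1, c2, c1, c2, mus, logvars, 0.1, 0.1))
\end{verbatim}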
This paper studies optimal hypothesis testing for nonregular statistical models with parameter-dependent support. We consider both one-sided and two-sided hypothesis testing and develop asymptotically uniformly most powerful tests based on the likelihood ratio process. The proposed one-sided test involves randomization to achieve asymptotic size control, a tuning constant to avoid discontinuities in the limiting likelihood ratio process, and a user-specified value under the alternative hypothesis to achieve asymptotic optimality. Our two-sided test is asymptotically uniformly most powerful without imposing further restrictions such as unbiasedness. Simulation results illustrate the desirable power properties of the proposed tests.
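For intuition, the canonical finite-sample instance of such a nonregular model is $X_1,\dots,X_n \sim U(0,\theta)$, whose support depends on $\theta$. The classical UMP test below is a concrete, non-asymptotic analogue of the constructions in the paper, not the paper's own procedure:

\begin{verbatim}
import numpy as np

def uniform_ump_test(x, theta0, alpha):
    """UMP size-alpha test of H0: theta = theta0 vs H1: theta > theta0
    for X_i ~ U(0, theta): reject if max(x) > theta0 or
    max(x) <= theta0 * alpha**(1/n)  (Lehmann's classical example of
    testing in a model with parameter-dependent support)."""
    n, m = len(x), np.max(x)
    return m > theta0 or m <= theta0 * alpha ** (1.0 / n)

# Monte Carlo size and power check.
rng = np.random.default_rng(3)
n, theta0, alpha, reps = 20, 1.0, 0.05, 100_000
size = np.mean([uniform_ump_test(rng.uniform(0, theta0, n), theta0, alpha)
                for _ in range(reps)])
power = np.mean([uniform_ump_test(rng.uniform(0, 1.05 * theta0, n),
                                  theta0, alpha) for _ in range(reps)])
print(size, power)   # size ~ 0.05; power well above alpha
\end{verbatim}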
This work presents a maturity model for assessing catalogues of semantic artefacts, one of the keystones that permit semantic interoperability of systems. We defined the dimensions and related features to include in the maturity model by analysing the current literature and existing catalogues of semantic artefacts provided by experts. In addition, we assessed 26 different catalogues to demonstrate the effectiveness of the maturity model, which includes 12 dimensions (Metadata, Openness, Quality, Availability, Statistics, PID, Governance, Community, Sustainability, Technology, Transparency, and Assessment) and 43 related features (or sub-criteria) associated with these dimensions. This maturity model is one of the first attempts to provide recommendations for the governance and processes needed to preserve and maintain semantic artefacts, and it helps to assess and address interoperability challenges.
A goodness-of-fit test for the Functional Linear Model with Scalar Response (FLMSR) with responses Missing at Random (MAR) is proposed in this paper. The test statistic relies on a marked empirical process indexed by the projected functional covariate, and its distribution under the null hypothesis is calibrated using a wild bootstrap procedure. The computation and performance of the test rely on having an accurate estimator of the functional slope of the FLMSR when the sample has MAR responses. Three estimation methods based on the Functional Principal Components (FPCs) of the covariate are considered. First, the simplified method estimates the functional slope by simply discarding observations with missing responses. Second, the imputed method estimates the functional slope by imputing the missing responses using the simplified estimator. Third, the inverse probability weighted method incorporates the missing response generation mechanism when imputing. Furthermore, both cross-validation and LASSO regression are used to select the FPCs employed by each estimator. Several Monte Carlo experiments are conducted to analyze the behavior of the testing procedure in combination with the functional slope estimators. Results indicate that estimators performing missing-response imputation achieve the highest power. The testing procedure is applied to check for linear dependence between the average number of sunny days per year and the mean curve of daily temperatures at weather stations in Spain.
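As a small illustration of the first of these estimators, the sketch below computes the simplified FPC-based slope estimate by discarding curves with missing responses; the common grid, the fixed number of FPCs, and the quadrature convention are illustrative choices, and the imputed and inverse probability weighted variants build on this step in the way described above:

\begin{verbatim}
import numpy as np

def fpc_slope_simplified(X, y, k):
    """'Simplified' estimator of the FLMSR slope: drop curves whose
    response is missing, run FPCA on the remaining curves (SVD of the
    centered data matrix), and regress y on the first k FPC scores.
    X: (n, p) curves on a common grid; y: responses with np.nan
    marking MAR values."""
    obs = ~np.isnan(y)
    Xc = X[obs] - X[obs].mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:k].T                       # FPC scores
    coef, *_ = np.linalg.lstsq(
        np.column_stack([np.ones(obs.sum()), scores]), y[obs], rcond=None)
    slope_curve = Vt[:k].T @ coef[1:]            # estimated slope on grid
    return coef[0], slope_curve

# Toy usage on synthetic curves with 20% MAR responses.
rng = np.random.default_rng(4)
n, p = 100, 50
X = np.cumsum(rng.standard_normal((n, p)), axis=1)
beta = np.sin(np.linspace(0, np.pi, p))
y = X @ beta / p + 0.1 * rng.standard_normal(n)
y[rng.random(n) < 0.2] = np.nan
print(fpc_slope_simplified(X, y, k=4)[1].shape)  # (50,)
\end{verbatim}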
We develop a new rank-based approach for univariate two-sample testing in the presence of missing data which makes no assumptions about the missingness mechanism. This approach is a theoretical extension of the Wilcoxon-Mann-Whitney test that controls the Type I error by providing exact bounds for the test statistic after accounting for the number of missing values. Greater statistical power is shown when the method is extended to account for a bounded domain. Furthermore, exact bounds are provided on the proportions of data that can be missing in the two samples while yielding a significant result. Simulations demonstrate that our method has good power, typically for cases of $10\%$ to $20\%$ missing data, while standard imputation approaches fail to control the Type I error. We illustrate our method on complex clinical trial data in which patients' withdrawal from the trial leads to missing values.
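A minimal sketch of the worst-case idea follows: with a bounded domain, every missing value can be imputed at the domain's endpoints to bracket the one-sided Mann-Whitney p-value, and reporting significance only when the worst case is significant controls the Type I error under any missingness mechanism. The paper bounds the statistic itself exactly; recomputing the test on extreme imputations, as done here, is a simple stand-in:

\begin{verbatim}
import numpy as np
from scipy.stats import mannwhitneyu

def wmw_pvalue_bounds(x, y, n_miss_x, n_miss_y, lo, hi):
    """Worst/best-case p-values for the one-sided Mann-Whitney test of
    'X tends to be larger than Y' with missing values: place every
    missing X at the bottom of the bounded domain [lo, hi] and every
    missing Y at the top, and vice versa."""
    x_w = np.concatenate([x, np.full(n_miss_x, lo)])
    y_w = np.concatenate([y, np.full(n_miss_y, hi)])
    x_b = np.concatenate([x, np.full(n_miss_x, hi)])
    y_b = np.concatenate([y, np.full(n_miss_y, lo)])
    p_worst = mannwhitneyu(x_w, y_w, alternative="greater").pvalue
    p_best = mannwhitneyu(x_b, y_b, alternative="greater").pvalue
    return p_worst, p_best

# Toy usage: strong separation survives two missing values per group.
rng = np.random.default_rng(5)
x = rng.normal(1.0, 1.0, 40)
y = rng.normal(0.0, 1.0, 40)
print(wmw_pvalue_bounds(x, y, n_miss_x=2, n_miss_y=2, lo=-5.0, hi=5.0))
\end{verbatim}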