
Determining the difficulty of a text involves assessing various textual features that may affect readers' comprehension, yet current research on Vietnamese has focused only on statistical features. This paper introduces a new approach that integrates statistical and semantic methods for assessing text readability. Our research utilized three distinct datasets: the Vietnamese Text Readability Dataset (ViRead), OneStopEnglish, and RACE, with the latter two translated into Vietnamese. For the semantic aspect, we employed state-of-the-art language models such as PhoBERT, ViDeBERTa, and ViBERT; for the statistical aspect, we extracted syntactic and lexical features of the text. We conducted experiments using various machine learning models, including Support Vector Machine (SVM), Random Forest, and Extra Trees, and evaluated their performance using accuracy and F1-score metrics. Our results indicate that a joint approach combining semantic and statistical features significantly improves the accuracy of readability classification compared to using either method in isolation. The study emphasizes the importance of considering both statistical and semantic aspects for a more accurate assessment of text difficulty in Vietnamese, provides insights into the adaptability of advanced language models to Vietnamese text readability, and lays the groundwork for future research in this area.
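
As a concrete illustration of the joint approach described above, the following minimal sketch concatenates a mean-pooled PhoBERT embedding with a handful of simple statistical features and feeds them to an Extra Trees classifier. The feature set, the pooling choice, and the `load_vietnamese_readability_corpus` loader are illustrative assumptions, not the paper's exact pipeline (note also that PhoBERT expects word-segmented Vietnamese input, which is omitted here).

```python
# Minimal sketch: fuse PhoBERT sentence embeddings with hand-crafted
# statistical features, then classify readability with Extra Trees.
# The feature set, pooling, and data loader are illustrative assumptions.
import numpy as np
import torch
from sklearn.ensemble import ExtraTreesClassifier
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")
encoder = AutoModel.from_pretrained("vinai/phobert-base")

def semantic_features(text: str) -> np.ndarray:
    """Mean-pooled PhoBERT token embeddings as a document vector."""
    batch = tokenizer(text, truncation=True, max_length=256,
                      return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0).numpy()

def statistical_features(text: str) -> np.ndarray:
    """Toy lexical/syntactic statistics; real feature sets are richer."""
    words = text.split()
    sentences = [s for s in text.split(".") if s.strip()]
    return np.array([
        len(words),                            # token count
        len(sentences),                        # sentence count
        np.mean([len(w) for w in words]),      # mean word length
        len(words) / max(len(sentences), 1),   # mean sentence length
    ])

def featurize(texts):
    return np.stack([np.concatenate([semantic_features(t),
                                     statistical_features(t)])
                     for t in texts])

# texts, labels = load_vietnamese_readability_corpus()  # hypothetical
# clf = ExtraTreesClassifier(n_estimators=300).fit(featurize(texts), labels)
```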

Related content

Gradient-boosted decision tree algorithms are increasingly used in actuarial applications because they show superior predictive performance over traditional generalized linear models. Many improvements and refinements of the original gradient boosting machine algorithm exist. We present, in a unified notation, and contrast all the existing point and probabilistic gradient-boosted decision tree algorithms: GBM, XGBoost, DART, LightGBM, CatBoost, EGBM, PGBM, XGBoostLSS, cyclic GBM, and NGBoost. In a comprehensive numerical study, we compare their performance on five publicly available datasets for claim frequency and severity, of various sizes and comprising different numbers of (high-cardinality) categorical variables. We explain how varying exposure-to-risk can be handled with boosting in frequency models. We compare the algorithms on the basis of computational efficiency, predictive performance, and model adequacy. LightGBM and XGBoostLSS win in terms of computational efficiency. The fully interpretable EGBM achieves competitive predictive performance compared to the black-box algorithms considered. We find that there is no trade-off between model adequacy and predictive accuracy: both are achievable simultaneously.
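
One point worth making concrete is the handling of varying exposure-to-risk in frequency models: a standard device is a log-exposure offset in a Poisson boosting model, which in LightGBM can be passed via `init_score`. The sketch below shows this offset trick under assumed data loading and hyperparameters; whether the paper treats exposure exactly this way is not stated in the abstract.

```python
# Sketch: Poisson claim-frequency boosting with a log-exposure offset,
# a standard way to handle varying exposure-to-risk in GBMs. Data
# loading and hyperparameters are illustrative assumptions.
import numpy as np
import lightgbm as lgb

def fit_frequency_model(X, y, exposure):
    """X: policy features, y: claim counts, exposure: years at risk."""
    # init_score acts as a fixed offset: the trees learn the log-rate
    # per unit exposure on top of log(exposure).
    train = lgb.Dataset(X, label=y, init_score=np.log(exposure))
    params = {"objective": "poisson", "learning_rate": 0.05,
              "num_leaves": 31}
    return lgb.train(params, train, num_boost_round=500)

# booster.predict(X) excludes the offset, so it returns the frequency
# per unit exposure; multiply by exposure to get expected claim counts:
# booster = fit_frequency_model(X, y, exposure)
# expected_counts = exposure * booster.predict(X)
```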

We establish a general convergence theory of the Rayleigh--Ritz method and the refined Rayleigh--Ritz method for computing some simple eigenpair $(\lambda_{*},x_{*})$ of a given analytic regular nonlinear eigenvalue problem (NEP). In terms of the deviation $\varepsilon$ of $x_{*}$ from a given subspace $\mathcal{W}$, we establish a priori convergence results on the Ritz value, the Ritz vector and the refined Ritz vector. The results show that, as $\varepsilon\rightarrow 0$, there exists a Ritz value that unconditionally converges to $\lambda_*$, and the corresponding refined Ritz vector does so too; the Ritz vector, however, converges only conditionally: it may fail to converge and may not even be unique. We also present an error bound for the approximate eigenvector in terms of the computable residual norm of a given approximate eigenpair, and give lower and upper bounds for the errors of the refined Ritz vector and the Ritz vector, as well as for the corresponding residual norms. These results nontrivially extend some convergence results on these two methods for the linear eigenvalue problem to the NEP. Examples are constructed to illustrate the main results.
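
For readers unfamiliar with the extraction procedures being analysed, the following states the standard Rayleigh--Ritz and refined Rayleigh--Ritz definitions for an NEP $T(\lambda)x=0$ over a subspace with orthonormal basis matrix $W$; the paper's exact normalizations may differ.

```latex
% Standard Rayleigh--Ritz extraction for the NEP T(lambda) x = 0 over the
% subspace \mathcal{W} = range(W) with W^H W = I (the paper's
% normalization may differ).
\[
  \text{Ritz pair: find } (\theta, y),\ y \neq 0, \text{ with }
  W^{H} T(\theta)\, W y = 0, \qquad \text{Ritz vector } \tilde{x} = W y ;
\]
\[
  \text{refined Ritz vector: } \hat{x} = W \hat{y}, \qquad
  \hat{y} = \arg\min_{\|y\|_2 = 1} \| T(\theta)\, W y \|_2 .
\]
```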

We propose a novel methodology to solve a key eigenvalue optimization problem which arises in the contractivity analysis of neural ODEs. When looking at contractivity properties of a one-layer weight-tied neural ODE $\dot{u}(t)=\sigma(Au(t)+b)$ (with $u,b \in {\mathbb R}^n$, $A$ a given $n \times n$ matrix, $\sigma : {\mathbb R} \to {\mathbb R}$ an activation function and, for a vector $z \in {\mathbb R}^n$, $\sigma(z) \in {\mathbb R}^n$ to be interpreted entry-wise), we are led to study the logarithmic norm of a set of products of the form $D A$, where $D$ is a diagonal matrix such that ${\mathrm{diag}}(D) \in \sigma'({\mathbb R}^n)$. Specifically, given a real number $c$ (usually $c=0$), the problem consists in finding the largest positive interval $\text{I}\subseteq [0,\infty)$ such that the logarithmic norm $\mu(DA) \le c$ for all diagonal matrices $D$ with $D_{ii}\in \text{I}$. We propose a two-level nested methodology: an inner level where, for a given $\text{I}$, we compute an optimizer $D^\star(\text{I})$ by a gradient system approach, and an outer level where we tune $\text{I}$ so that the value $c$ is reached by $\mu(D^\star(\text{I})A)$. We extend the proposed two-level approach to the general multilayer, and possibly time-dependent, case $\dot{u}(t) = \sigma( A_k(t) \ldots \sigma ( A_{1}(t) u(t) + b_{1}(t) ) \ldots + b_{k}(t) )$ and propose several numerical examples to illustrate its behaviour, including its stabilizing performance on a one-layer neural ODE applied to the classification of the MNIST handwritten digits dataset.
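
To make the optimization problem concrete, the toy sketch below works with the 2-norm logarithmic norm $\mu_2(M)=\lambda_{\max}((M+M^T)/2)$. Since $\mu_2(DA)$ is convex in $\mathrm{diag}(D)$, its maximum over the box $[0,t]^n$ is attained at a vertex, which the inner routine enumerates for small $n$; the outer bisection on $t$ is a crude stand-in for the paper's gradient-system inner solver and interval tuning, and the restriction $\text{I}=[0,t]$ is a simplifying assumption.

```python
# Toy version of the two-level problem for the 2-norm logarithmic norm.
# mu_2(DA) is convex in diag(D), so its max over the box [0, t]^n is
# attained at a vertex (each D_ii in {0, t}); enumerate for small n.
import numpy as np

def log_norm2(M):
    """2-norm logarithmic norm: largest eigenvalue of the symmetric part."""
    return np.linalg.eigvalsh((M + M.T) / 2)[-1]

def worst_case_mu(A, t):
    """max of mu_2(DA) over diagonal D with entries in [0, t]."""
    n = A.shape[0]
    best = -np.inf
    for mask in range(2 ** n):
        d = t * np.array([(mask >> i) & 1 for i in range(n)], dtype=float)
        best = max(best, log_norm2(np.diag(d) @ A))
    return best

def largest_t(A, c=0.0, t_hi=10.0, tol=1e-6):
    """Largest t with mu_2(DA) <= c for all admissible D (bisection;
    valid since worst_case_mu is nondecreasing in t)."""
    t_lo = 0.0
    while t_hi - t_lo > tol:
        t = 0.5 * (t_lo + t_hi)
        if worst_case_mu(A, t) <= c:
            t_lo = t
        else:
            t_hi = t
    return t_lo
```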

We present a novel class of projected gradient (PG) methods for minimizing a smooth but not necessarily convex function over a convex compact set. We first provide a novel analysis of the "vanilla" PG method, achieving the best-known iteration complexity for finding an approximate stationary point of the problem. We then develop an "auto-conditioned" projected gradient (AC-PG) variant that achieves the same iteration complexity without requiring the input of the Lipschitz constant of the gradient or any line search procedure. The key idea is to estimate the Lipschitz constant using first-order information gathered from the previous iterations, and to show that the error caused by underestimating the Lipschitz constant can be properly controlled. We then generalize the PG methods to the stochastic setting by proposing a stochastic projected gradient (SPG) method and a variance-reduced stochastic projected gradient (VR-SPG) method, achieving new complexity bounds in different oracle settings. We also present auto-conditioned stepsize policies for both stochastic PG methods and establish comparable convergence guarantees.
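
The sketch below conveys the flavour of the auto-conditioned idea: a projected gradient loop whose stepsize is driven by a Lipschitz estimate built from secant information of previous iterates. The exact AC-PG stepsize policy and its error control are in the paper; this generic estimate is an assumption.

```python
# Projected gradient with an adaptive Lipschitz estimate built from
# secant information of previous iterates (the flavour of AC-PG; the
# paper's exact stepsize policy differs).
import numpy as np

def projected_gradient_auto(grad, project, x0, iters=500, L0=1.0):
    L = L0
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    x = project(x_prev - g_prev / L)
    for _ in range(iters):
        g = grad(x)
        dx, dg = x - x_prev, g - g_prev
        if np.linalg.norm(dx) > 1e-12:
            # secant (local curvature) estimate of the Lipschitz constant
            L = max(L, np.linalg.norm(dg) / np.linalg.norm(dx))
        x_prev, g_prev = x, g
        x = project(x - g / L)   # PG step with the current estimate
    return x

# Example: minimize 0.5 x^T A x over the unit ball.
# grad = lambda x: A @ x
# project = lambda x: x / max(1.0, np.linalg.norm(x))
# x_star = projected_gradient_auto(grad, project, np.ones(5))
```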

This work proposes a novel methodology for measuring compositional behavior in contemporary language embedding models. Specifically, we focus on adjectival modifier phenomena in adjective-noun phrases. In recent years, distributional language representation models have demonstrated great practical success. At the same time, the need for interpretability has elicited questions about their intrinsic properties and capabilities. Crucially, distributional models are often inconsistent when dealing with compositional phenomena in natural language, which has significant implications for their safety and fairness. Despite this, most current research on compositionality is directed at improving performance on similarity tasks alone. This work takes a different approach, introducing three novel tests of compositional behavior inspired by Montague semantics. Our experimental results indicate that current neural language models do not behave according to the expected linguistic theories. This suggests that current language models may lack the capability to capture the semantic properties we evaluated given only limited context, or that linguistic theories from the Montagovian tradition may not match the expected capabilities of distributional models.
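
A naive version of such a test can be phrased in a few lines: compare the embedding of an adjective-noun phrase against a simple additive composition of its parts. The paper's Montague-inspired tests are more sophisticated, and the model (`bert-base-uncased`) and additive composition here are illustrative placeholders.

```python
# Illustrative compositionality probe: does embed("red car") resemble a
# simple composition of embed("red") and embed("car")? Additive
# composition is a deliberately naive baseline, not the paper's tests.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased")

def embed(text: str) -> np.ndarray:
    """Mean-pooled last-layer token embeddings."""
    with torch.no_grad():
        out = enc(**tok(text, return_tensors="pt")).last_hidden_state
    return out.mean(dim=1).squeeze(0).numpy()

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

phrase = embed("red car")
composed = embed("red") + embed("car")   # naive additive composition
print(f"cos(phrase, composed) = {cosine(phrase, composed):.3f}")
```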

Multilingual large language models (LLMs) today do not necessarily provide culturally appropriate and relevant responses to their Filipino users. We introduce Kalahi, a cultural LLM evaluation suite collaboratively created by native Filipino speakers. It is composed of 150 high-quality, handcrafted, and nuanced prompts that test LLMs for generations that are relevant to shared Filipino cultural knowledge and values. Strong LLM performance on Kalahi indicates a model's ability to generate responses similar to what an average Filipino would say or do in a given situation. We conducted experiments on LLMs with multilingual and Filipino language support. Results show that Kalahi, while trivial for Filipinos, is challenging for LLMs, with the best model answering only 46.0% of the questions correctly, compared to native Filipino performance of 89.10%. Thus, Kalahi can be used to accurately and reliably evaluate Filipino cultural representation in LLMs.
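
As a rough illustration of how a handcrafted suite like this might be scored, the sketch below uses multiple-choice log-likelihood scoring with a stand-in model; Kalahi's actual prompt format and scoring protocol are not specified in the abstract, so everything here is an assumption.

```python
# Multiple-choice log-likelihood scoring sketch with a stand-in model.
# Assumes each item supplies candidate answers and a gold index, and
# that prompt/completion token boundaries are clean (a simplification).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")   # placeholder, not the paper's
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def loglik(prompt: str, completion: str) -> float:
    """Sum of log-probabilities of the completion tokens given the prompt."""
    ids = tok(prompt + completion, return_tensors="pt").input_ids
    n_prompt = tok(prompt, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logp = lm(ids).logits.log_softmax(-1)
    tgt = ids[0, n_prompt:]                   # completion tokens
    # position i-1 predicts token i
    return logp[0, n_prompt - 1:-1].gather(-1, tgt[:, None]).sum().item()

def accuracy(items):
    """items: iterable of (prompt, candidates, gold_index) triples."""
    items = list(items)
    hits = sum(
        max(range(len(cands)), key=lambda i: loglik(p, cands[i])) == gold
        for p, cands, gold in items
    )
    return hits / len(items)
```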

Statistical learning under distribution shift is challenging when neither prior knowledge nor fully accessible data from the target distribution is available. Distributionally robust learning (DRL) aims to control the worst-case statistical performance within an uncertainty set of candidate distributions, but how to properly specify the set remains challenging. To enable distributional robustness without being overly conservative, in this paper, we propose a shape-constrained approach to DRL, which incorporates prior information about the way in which the unknown target distribution differs from its estimate. More specifically, we assume the unknown density ratio between the target distribution and its estimate is isotonic with respect to some partial order. At the population level, we provide a solution to the shape-constrained optimization problem that does not involve the isotonic constraint. At the sample level, we provide consistency results for an empirical estimator of the target in a range of different settings. Empirical studies on both synthetic and real data examples demonstrate the improved accuracy of the proposed shape-constrained approach.
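
A simplified instance of the shape constraint can be sketched with off-the-shelf tools: estimate the density ratio with a probabilistic classifier, then project it onto the isotonic cone along a one-dimensional score (the paper allows general partial orders and uses a different estimator; the `score` function below is a user-chosen assumption).

```python
# Density-ratio estimation with an isotonic shape constraint along a
# one-dimensional score; the paper handles general partial orders and a
# different estimator, so this is a simplified stand-in.
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression

def isotonic_density_ratio(X_src, X_tgt, score):
    """Classifier-based ratio dP_tgt/dP_src, made isotonic in score(x)."""
    X = np.vstack([X_src, X_tgt])
    z = np.r_[np.zeros(len(X_src)), np.ones(len(X_tgt))]
    clf = LogisticRegression(max_iter=1000).fit(X, z)
    p = clf.predict_proba(X_src)[:, 1]
    raw = p / (1 - p) * len(X_src) / len(X_tgt)   # odds = density ratio
    iso = IsotonicRegression(increasing=True, out_of_bounds="clip")
    return iso.fit(score(X_src), raw)             # isotonic projection

# Reweight a source-sample loss by the fitted ratio:
# score = lambda X: X[:, 0]
# iso = isotonic_density_ratio(X_src, X_tgt, score)
# robust_risk = np.average(losses, weights=iso.predict(score(X_src)))
```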

The stability number of a graph, defined as the cardinality of the largest set of pairwise non-adjacent vertices, is NP-hard to compute. The exact subgraph hierarchy (ESH) provides a sequence of increasingly tight upper bounds on the stability number, starting with the Lov\'asz theta function at the first level and including all exact subgraph constraints of subgraphs of order $k$ into the semidefinite program that computes the Lov\'asz theta function at level $k$. In this paper, we investigate the ESH for Paley graphs, a class of strongly regular, vertex-transitive graphs. We show that for Paley graphs the bounds obtained from the ESH do not improve on the Lov\'asz theta function up to a certain threshold level. To overcome this limitation, we introduce the local ESH for the stable set problem on vertex-transitive graphs such as Paley graphs. We prove that this new hierarchy provides upper bounds on the stability number of vertex-transitive graphs that are at least as tight as those obtained from the ESH. Additionally, our computational experiments reveal that the local ESH produces superior bounds compared to the ESH for Paley graphs.
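
The first level of the ESH, the Lov\'asz theta function, is straightforward to compute for small Paley graphs via its standard SDP formulation $\vartheta(G)=\max\{\langle J,X\rangle : X_{ij}=0 \text{ for } \{i,j\}\in E(G),\ \mathrm{tr}(X)=1,\ X\succeq 0\}$. The sketch below builds a Paley graph and solves this SDP with cvxpy; since Paley graphs are self-complementary and vertex-transitive, $\vartheta(G)=\sqrt{q}$, which gives a quick sanity check.

```python
# Lovasz theta (the ESH's first-level bound on the stability number)
# for a Paley graph, via the standard SDP:
#   theta(G) = max <J, X>  s.t.  X_ij = 0 for {i,j} in E, tr X = 1, X >= 0.
import cvxpy as cp
import numpy as np

def paley_graph(q):
    """Adjacency matrix: i ~ j iff i - j is a nonzero quadratic residue
    mod q (q a prime congruent to 1 mod 4, so the relation is symmetric)."""
    residues = {(x * x) % q for x in range(1, q)}
    A = np.zeros((q, q), dtype=int)
    for i in range(q):
        for j in range(i + 1, q):
            if (i - j) % q in residues:
                A[i, j] = A[j, i] = 1
    return A

def lovasz_theta(A):
    n = A.shape[0]
    X = cp.Variable((n, n), symmetric=True)
    cons = [X >> 0, cp.trace(X) == 1]
    cons += [X[i, j] == 0
             for i in range(n) for j in range(i + 1, n) if A[i, j]]
    return cp.Problem(cp.Maximize(cp.sum(X)), cons).solve()

# Sanity check: lovasz_theta(paley_graph(13)) should be about sqrt(13),
# i.e. roughly 3.606.
```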

We design and investigate a variety of multigrid solvers for high-order local discontinuous Galerkin methods applied to elliptic interface and multiphase Stokes problems. Using the template of a standard multigrid V-cycle, we consider a variety of element-wise block smoothers, including Jacobi, multi-coloured Gauss-Seidel, processor-block Gauss-Seidel, and, of special interest, smoothers based on sparse approximate inverse (SAI) methods. In particular, we develop SAI methods that (i) balance the smoothing of velocity and pressure variables in Stokes problems, and (ii) robustly handle high-contrast viscosity coefficients in multiphase problems. Across a broad range of two- and three-dimensional test cases, including Poisson, elliptic interface, steady-state Stokes, and unsteady Stokes problems, we examine a multitude of multigrid smoother and solver combinations. In every case, there is at least one approach that matches the performance of classical geometric multigrid algorithms, e.g., 4 to 8 iterations to reduce the residual by 10 orders of magnitude. We also discuss the relative merits of the approaches with regard to simplicity, robustness, computational cost, and parallelisation.
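
The core of an SAI smoother is easy to state: pick a sparsity pattern for $M \approx A^{-1}$ (here, that of $A$) and minimize $\|AM - I\|_F$ column by column, each column being a small dense least-squares problem. The didactic sketch below does exactly this; the paper's Stokes-specific balancing of velocity and pressure blocks and its viscosity-contrast handling are not reproduced.

```python
# Didactic sparse approximate inverse: M has the sparsity pattern of A,
# and each column minimizes ||A m_j - e_j||_2 as a small dense
# least-squares problem.
import numpy as np
import scipy.sparse as sp

def sai(A: sp.csc_matrix) -> sp.csc_matrix:
    n = A.shape[0]
    cols = []
    for j in range(n):
        J = A[:, j].nonzero()[0]                 # pattern of column j
        I = np.unique(A[:, J].nonzero()[0])      # rows touched by A[:, J]
        Ahat = A[I, :][:, J].toarray()           # small dense subproblem
        e = (I == j).astype(float)               # e_j restricted to I
        m, *_ = np.linalg.lstsq(Ahat, e, rcond=None)
        col = np.zeros(n)
        col[J] = m
        cols.append(col)
    return sp.csc_matrix(np.column_stack(cols))

# Smoother sweep inside a V-cycle: x <- x + M @ (b - A @ x)
```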

Artificial neural networks thrive at solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, and endeavours to extend this knowledge without targeting the original task result in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task-incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state of the art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, and 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny Imagenet, the large-scale unbalanced iNaturalist dataset, and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which tasks are presented, and we qualitatively compare methods in terms of required memory, computation time, and storage.
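
One way to picture the stability-plasticity trade-off the framework tunes is a single scalar knob: a penalty pulling weights toward the previous task's solution. The sketch below uses a plain quadratic penalty, a deliberately simple stand-in for the mechanisms (regularization, replay, parameter isolation) the surveyed methods actually use; `lambda_stab` plays the role of the trade-off parameter.

```python
# Task-incremental training with an explicit stability-plasticity knob:
# a quadratic penalty pulling weights toward the previous task's
# solution. lambda_stab = 0 is maximal plasticity; large values favour
# stability.
import copy
import torch
import torch.nn.functional as F

def train_task(model, loader, lambda_stab=1.0, epochs=5, lr=1e-3):
    anchor = copy.deepcopy(model).eval()   # weights after the previous task
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            loss = F.cross_entropy(model(x), y)
            # stability term: stay close to the previous-task solution
            for p, p0 in zip(model.parameters(), anchor.parameters()):
                loss = loss + lambda_stab * (p - p0.detach()).pow(2).sum()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```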
