Insurance loss data are usually left-truncated and right-censored due to deductibles and policy limits, respectively. This paper investigates model uncertainty and the selection procedure when various parametric models are constructed to accommodate such left-truncated and right-censored data. The joint asymptotic properties of the estimators are established using the Delta method together with maximum likelihood estimation when the model is specified. We conduct simulation studies using the Fisk, Lognormal, Lomax, Paralogistic, and Weibull distributions with various proportions of loss data below deductibles and above policy limits. A variety of graphical tools, hypothesis tests, and penalized likelihood criteria are employed to validate the models, and their performance on model selection is evaluated through the probability of each parent distribution being correctly selected. The effectiveness of each tool on model selection is also illustrated using well-studied data that represent Wisconsin property losses in the United States from 2007 to 2010.
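To make the setup concrete, the following is a minimal sketch of maximum likelihood estimation for left-truncated, right-censored losses under an assumed lognormal severity model; the deductible, policy limit, and simulated data are illustrative only and do not reproduce the paper's study design.
\begin{verbatim}
import numpy as np
from scipy import stats, optimize

def neg_loglik(params, x, d, u):
    """Negative log-likelihood for losses left-truncated at a deductible d
    and right-censored at a policy limit u (censored entries recorded at u)."""
    mu, sigma = params
    if sigma <= 0:
        return np.inf
    dist = stats.lognorm(s=sigma, scale=np.exp(mu))
    censored = x >= u                          # losses capped at the policy limit
    log_trunc = dist.logsf(d)                  # survival at the deductible (truncation)
    ll = np.where(censored,
                  dist.logsf(u) - log_trunc,   # censored contribution
                  dist.logpdf(x) - log_trunc)  # fully observed contribution
    return -np.sum(ll)

# illustrative data: deductible 500, policy limit 10,000
rng = np.random.default_rng(0)
losses = stats.lognorm(s=1.0, scale=np.exp(7.0)).rvs(2000, random_state=rng)
obs = np.minimum(losses[losses > 500.0], 10_000.0)
fit = optimize.minimize(neg_loglik, x0=[7.0, 1.0], args=(obs, 500.0, 10_000.0),
                        method="Nelder-Mead")
print(fit.x)  # MLE of (mu, sigma)
\end{verbatim}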
Kernel techniques are among the most influential approaches in data science and statistics. Under mild conditions, the reproducing kernel Hilbert space associated with a kernel is capable of encoding the independence of $M\ge 2$ random variables. Probably the most widespread kernel-based independence measure is the so-called Hilbert-Schmidt independence criterion (HSIC; also referred to as distance covariance in the statistics literature). Despite the various HSIC estimators designed since its introduction close to two decades ago, the fundamental question of the rate at which HSIC can be estimated is still open. In this work, we prove that the minimax optimal rate of HSIC estimation on $\mathbb R^d$ for Borel measures containing the Gaussians with continuous bounded translation-invariant characteristic kernels is $\mathcal O\!\left(n^{-1/2}\right)$. Specifically, our result implies the minimax optimality of many of the most frequently used estimators (including the U-statistic, the V-statistic, and the Nystr\"om-based one) on $\mathbb R^d$.
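As a point of reference for the estimators mentioned above, here is a minimal sketch of the biased (V-statistic) HSIC estimate with Gaussian kernels in the $M=2$ case; the bandwidth and data are illustrative assumptions.
\begin{verbatim}
import numpy as np

def gaussian_gram(x, bandwidth=1.0):
    """Gram matrix of the Gaussian kernel k(a, b) = exp(-||a-b||^2 / (2 bw^2))."""
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def hsic_v(x, y, bandwidth=1.0):
    """Biased (V-statistic) HSIC estimate between samples x (n, dx) and y (n, dy)."""
    n = x.shape[0]
    K, L = gaussian_gram(x, bandwidth), gaussian_gram(y, bandwidth)
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    return np.trace(K @ H @ L @ H) / n ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 2))
y_indep = rng.normal(size=(500, 1))
y_dep = x[:, :1] + 0.1 * rng.normal(size=(500, 1))
print(hsic_v(x, y_indep), hsic_v(x, y_dep))    # the dependent pair scores higher
\end{verbatim}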
Self-admitted technical debt (SATD) refers to a form of technical debt in which developers explicitly acknowledge and document the existence of technical shortcuts, workarounds, or temporary solutions within the codebase. Over recent years, researchers have manually labeled datasets derived from various software development artifacts: source code comments, messages from issue trackers and pull request sections, and commit messages. These datasets are used to train, evaluate, validate, and improve machine learning and deep learning models so that they accurately identify SATD instances. However, class imbalance poses a serious challenge across all the existing datasets, particularly when researchers are interested in categorizing the specific types of SATD. To address the scarcity of labeled data for SATD \textit{identification} (i.e., whether an instance is SATD or not) and \textit{categorization} (i.e., which type of SATD is being classified) in existing datasets, we share the \textit{SATDAUG} dataset, an augmented version of existing SATD datasets covering source code comments, issue trackers, pull requests, and commit messages. These augmented datasets have been balanced with respect to the available artifacts and provide a much richer source of labeled data for training machine learning or deep learning models.
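The abstract does not specify which augmentation operators were used to build SATDAUG; the sketch below only illustrates the general idea of balancing SATD categories by oversampling with a toy word-swap operator, not the authors' pipeline.
\begin{verbatim}
import random
from collections import Counter

def word_swap(text, n_swaps=1):
    """Toy augmentation: randomly swap adjacent words (an EDA-style operator)."""
    words = text.split()
    for _ in range(n_swaps):
        if len(words) < 2:
            break
        i = random.randrange(len(words) - 1)
        words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)

def balance_by_augmentation(texts, labels):
    """Oversample minority SATD categories with augmented copies until every
    category matches the majority-class count."""
    counts = Counter(labels)
    target = max(counts.values())
    out_texts, out_labels = list(texts), list(labels)
    for label, count in counts.items():
        pool = [t for t, l in zip(texts, labels) if l == label]
        for _ in range(target - count):
            out_texts.append(word_swap(random.choice(pool)))
            out_labels.append(label)
    return out_texts, out_labels
\end{verbatim}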
The Yang and Prentice (YP) regression models have garnered interest from the scientific community due to their ability to analyze data whose survival curves cross. These models include the proportional hazards (PH) and proportional odds (PO) models as special cases. However, they encounter limitations when dealing with multivariate survival data because of potential dependencies between the times-to-event. A solution is to introduce a frailty term into the hazard functions, so that the times-to-event can be treated as conditionally independent given the frailty. In this study, we propose a new class of YP models that incorporate frailty. We use the exponential distribution, the piecewise exponential (PE) distribution, and Bernstein polynomials (BP) as baseline functions. Our approach adopts a Bayesian methodology. The proposed models are evaluated through a simulation study, which shows that the YP frailty models with BP and PE baselines perform similarly to the parametric model that generated the data. We apply the models to two real data sets.
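As an illustration of one baseline choice, the following is a minimal sketch of a piecewise exponential (PE) baseline hazard and its cumulative hazard; the cut points and rates are illustrative, and the YP structure, frailty term, and Bayesian fitting are omitted.
\begin{verbatim}
import numpy as np

def pe_hazard(t, cuts, rates):
    """Piecewise exponential baseline hazard: constant rate on each interval
    defined by cuts (with cuts[0] = 0)."""
    idx = np.searchsorted(cuts, np.atleast_1d(t), side="right") - 1
    return np.asarray(rates)[idx]

def pe_cum_hazard(t, cuts, rates):
    """Cumulative baseline hazard H0(t) for the piecewise exponential model."""
    cuts, rates = np.asarray(cuts, float), np.asarray(rates, float)
    t = np.atleast_1d(np.asarray(t, float))
    widths = np.clip(t[:, None] - cuts[None, :], 0.0,
                     np.append(np.diff(cuts), np.inf)[None, :])
    return (widths * rates[None, :]).sum(axis=1)

# illustrative: intervals [0,1), [1,3), [3,inf) with rates 0.2, 0.5, 0.1
print(pe_cum_hazard([0.5, 2.0, 4.0], cuts=[0.0, 1.0, 3.0], rates=[0.2, 0.5, 0.1]))
\end{verbatim}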
In recent years, semantic communication has progressively emerged as an effective means of facilitating intelligent and context-aware communication. However, current research seldom considers the reliability and timeliness of semantic communication simultaneously, and scheduling and resource allocation (SRA) play a crucial role in both. Meanwhile, conventional age-based approaches cannot be seamlessly extended to semantic communication because they overlook semantic importance. To bridge this gap, we introduce a novel metric: the Age of Semantic Importance (AoSI), which jointly captures the freshness of information and its semantic importance. Using AoSI, we formulate an average AoSI minimization problem by optimizing multi-source SRA. To address this problem, we propose an AoSI-aware joint SRA algorithm based on a Deep Q-Network (DQN). Simulation results validate the effectiveness of the proposed method, demonstrating its ability to facilitate timely and reliable semantic communication.
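The abstract does not give the exact definition of AoSI; the sketch below uses an assumed importance-weighted information age as a stand-in, together with a greedy scheduling baseline, purely to illustrate the kind of quantity the DQN-based SRA algorithm would optimize.
\begin{verbatim}
import numpy as np

def aosi(t, last_update, importance):
    """Illustrative AoSI proxy: per-source information age weighted by semantic
    importance (the paper's exact definition may differ)."""
    age = t - np.asarray(last_update, float)
    return np.asarray(importance, float) * age

def schedule_greedy(t, last_update, importance):
    """Greedy SRA baseline: serve the source with the largest current AoSI."""
    return int(np.argmax(aosi(t, last_update, importance)))

# three sources last updated at times 2, 7, 9 with importances 0.3, 0.9, 0.5
print(schedule_greedy(t=10.0, last_update=[2.0, 7.0, 9.0], importance=[0.3, 0.9, 0.5]))
\end{verbatim}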
Sequential tests and their implied confidence sequences, which are valid at arbitrary stopping times, promise flexible statistical inference and on-the-fly decision making. However, strong guarantees are limited to parametric sequential tests that under-cover in practice or concentration-bound-based sequences that over-cover and have suboptimal rejection times. In this work, we consider classic delayed-start normal-mixture sequential probability ratio tests, and we provide the first asymptotic type-I-error and expected-rejection-time guarantees under general non-parametric data generating processes, where the asymptotics are indexed by the test's burn-in time. The type-I-error results primarily leverage a martingale strong invariance principle and establish that these tests (and their implied confidence sequences) have type-I error rates asymptotically equivalent to the desired (possibly varying) $\alpha$-level. The expected-rejection-time results primarily leverage an identity inspired by It\^o's lemma and imply that, in certain asymptotic regimes, the expected rejection time is asymptotically equivalent to the minimum possible among $\alpha$-level tests. We show how to apply our results to sequential inference on parameters defined by estimating equations, such as average treatment effects. Together, our results establish these (ostensibly parametric) tests as general-purpose, non-parametric, and near-optimal. We illustrate this via numerical simulations and a real-data application to A/B testing at Netflix.
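For intuition, here is a minimal sketch of the classic unit-variance normal-mixture boundary obtained from Robbins' mixture martingale and Ville's inequality; the delayed start, studentization, and asymptotic guarantees developed in the paper are not reproduced.
\begin{verbatim}
import numpy as np

def mixture_radius(n, alpha=0.05, rho=1.0):
    """Rejection boundary for the running sum S_n of unit-variance observations
    under H0: mean = 0, from the N(0, rho^2) normal-mixture martingale
    M_n = (1 + n rho^2)^(-1/2) exp(rho^2 S_n^2 / (2 (1 + n rho^2)))
    and Ville's inequality P(exists n: M_n >= 1/alpha) <= alpha."""
    a = 1.0 + n * rho ** 2
    return np.sqrt(a / rho ** 2 * (np.log(a) + 2.0 * np.log(1.0 / alpha)))

rng = np.random.default_rng(0)
x = rng.normal(loc=0.2, scale=1.0, size=5000)   # true mean 0.2, so H0 is false
n = np.arange(1, x.size + 1)
S = np.cumsum(x)
hits = np.nonzero(np.abs(S) > mixture_radius(n))[0]
print("first rejection at n =", int(n[hits[0]]) if hits.size else None)
\end{verbatim}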
Understanding the dimension dependency of computational complexity in high-dimensional sampling problems is fundamental from both a practical and a theoretical perspective. Compared with samplers whose stationary distribution is unbiased, e.g., the Metropolis-adjusted Langevin algorithm (MALA), biased samplers, e.g., underdamped Langevin dynamics (ULD), perform better in low-accuracy regimes precisely because of the lower dimension dependency in their complexities. Along this line, Freund et al. (2022) suggest that the modified Langevin algorithm with prior diffusion is able to converge dimension-independently for strongly log-concave target distributions. Nonetheless, it remains open whether such a property holds in more general settings. In this paper, we investigate the prior diffusion technique for target distributions satisfying a log-Sobolev inequality (LSI), which covers a much broader class of distributions than the strongly log-concave ones. In particular, we prove that the modified Langevin algorithm can also achieve dimension-independent convergence in KL divergence with different step size schedules. The core of our proof technique is a novel construction of an interpolating SDE, which enables a more accurate characterization of the discrete updates of the overdamped Langevin dynamics. Our theoretical analysis demonstrates the benefits of prior diffusion for a broader class of target distributions and provides new insights into developing faster sampling algorithms.
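For reference, the following is a minimal sketch of the vanilla unadjusted (overdamped) Langevin algorithm that the prior diffusion technique modifies; the step size, target, and iteration count are illustrative, and Freund et al.'s modification itself is not reproduced here.
\begin{verbatim}
import numpy as np

def ula(grad_f, x0, step=1e-2, n_iters=5000, rng=None):
    """Unadjusted Langevin algorithm for a target proportional to exp(-f):
    x_{k+1} = x_k - step * grad_f(x_k) + sqrt(2 * step) * xi,  xi ~ N(0, I)."""
    rng = rng or np.random.default_rng()
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_iters):
        x = x - step * grad_f(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
        samples.append(x.copy())
    return np.array(samples)

# target: standard Gaussian in 5 dimensions, f(x) = ||x||^2 / 2, grad_f(x) = x
samples = ula(lambda x: x, x0=np.zeros(5))
print(samples[1000:].mean(axis=0))   # should be close to the zero vector
\end{verbatim}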
We consider the problem of average consensus in a distributed system comprising a set of nodes that can exchange information among themselves. We focus on a class of algorithms for solving such a problem whereby each node maintains a state and updates it iteratively as a linear combination of the states maintained by its in-neighbors, i.e., nodes from which it receives information directly. Averaging algorithms within this class can be thought of as discrete-time linear time-varying systems with no external driving inputs and a column-stochastic state matrix. As a result, the algorithms exhibit a global invariance property in that the sum of the state variables remains constant at all times. In this paper, we report on another invariance property for the aforementioned class of averaging algorithms. This property is local to each node and reflects the conservation of certain quantities capturing an aggregate of all the values received by a node from its in-neighbors and all the values sent by said node to its out-neighbors (i.e., nodes to which it sends information directly) throughout the execution of the averaging algorithm. We show how this newly discovered invariant can be leveraged for detecting errors while executing the averaging algorithm.
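The global invariance property is easy to check numerically: with a column-stochastic weight matrix, the sum of the node states is conserved at every iteration. The sketch below uses an arbitrary illustrative weight matrix; the node-local invariant introduced in the paper is not shown.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# column-stochastic weights: each column sums to 1, i.e., each node splits its
# state among its out-neighbors, so 1^T A = 1^T and the sum of states is conserved
A = rng.random((5, 5))
A /= A.sum(axis=0, keepdims=True)

x = rng.random(5) * 10.0
total = x.sum()
for _ in range(50):
    x = A @ x                                # each node combines in-neighbor states
    assert np.isclose(x.sum(), total)        # global invariance: sum is conserved
print(x, x.sum())
\end{verbatim}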
A recent development in Bayesian optimization is the use of local optimization strategies, which can deliver strong empirical performance on high-dimensional problems compared to traditional global strategies. The "folk wisdom" in the literature is that the focus on local optimization sidesteps the curse of dimensionality; however, little is known concretely about the expected behavior or convergence of Bayesian local optimization routines. We first study the behavior of the local approach, and find that the statistics of individual local solutions of Gaussian process sample paths are surprisingly good compared to what we would expect to recover from global methods. We then present the first rigorous analysis of such a Bayesian local optimization algorithm recently proposed by M\"uller et al. (2021), and derive convergence rates in both the noisy and noiseless settings.
While chain-of-thought prompting (CoT) has the potential to improve the explainability of language model reasoning, it can systematically misrepresent the factors influencing models' behavior--for example, rationalizing answers in line with a user's opinion without mentioning this bias. To mitigate this biased reasoning problem, we introduce bias-augmented consistency training (BCT), an unsupervised fine-tuning scheme that trains models to give consistent reasoning across prompts with and without biasing features. We construct a suite testing nine forms of biased reasoning on seven question-answering tasks, and find that applying BCT to GPT-3.5-Turbo with one bias reduces the rate of biased reasoning by 86% on held-out tasks. Moreover, this model generalizes to other forms of bias, reducing biased reasoning on held-out biases by an average of 37%. As BCT generalizes to held-out biases and does not require gold labels, this method may hold promise for reducing biased reasoning from as-of-yet unknown biases and on tasks where supervision for ground truth reasoning is unavailable.
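The exact prompt templates and fine-tuning setup are the authors'; the sketch below only illustrates the pairing logic of consistency training with a hypothetical "suggested answer" bias, where each biased prompt is paired with the response the model produced on the corresponding unbiased prompt.
\begin{verbatim}
def add_bias_feature(question, options, suggested):
    """Hypothetical 'suggested answer' bias: the user states a preferred option."""
    return (f"{question}\nOptions: {', '.join(options)}\n"
            f"I think the answer is {suggested}, but I'm curious what you think.")

def make_bct_pairs(examples, unbiased_responses):
    """Bias-augmented consistency training data: pair the biased prompt with the
    response given to the unbiased prompt, so the model learns to reason
    consistently regardless of the biasing feature."""
    pairs = []
    for ex, response in zip(examples, unbiased_responses):
        biased_prompt = add_bias_feature(ex["question"], ex["options"], ex["suggestion"])
        pairs.append({"prompt": biased_prompt, "target": response})
    return pairs
\end{verbatim}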
This work considers the question of how convenient access to copious data impacts our ability to learn causal effects and relations. In what ways is learning causality in the era of big data different from -- or the same as -- the traditional one? To answer this question, this survey provides a comprehensive and structured review of both traditional and frontier methods in learning causality and relations along with the connections between causality and machine learning. This work points out on a case-by-case basis how big data facilitates, complicates, or motivates each approach.