亚洲AV永久无码精品九之-国产在线观看成永久免费视频

A reasonable confidence interval should have a confidence coefficient no less than the given nominal level and a small expected length to reliably and accurately estimate the parameter of interest, and the bootstrap interval is considered to be an efficient interval estimation technique. In this paper, we offer a first attempt at computing the coverage probability and expected length of a parametric or percentile bootstrap interval by exact probabilistic calculation for any fixed sample size. This method is applied to the basic bootstrap intervals for functions of binomial proportions and a normal mean. None of these intervals, however, are found to have a correct confidence coefficient, which leads to illogical conclusions including that the bootstrap interval is narrower than the z-interval when estimating a normal mean. This raises a general question of how to utilize bootstrap intervals appropriately in practice since the sample size is typically fixed.

相關內容

自助法/自舉法

關注 0

Analysis · 講稿 · 可理解性 · prototype · 閾值 ·

2024 年 3 月 26 日

Design and Preliminary Evaluation of a Torso Stabiliser for Individuals with Spinal Cord Injury

Rejin John Varghese,Man-Yan Tong,Isabella Szczech,Peter Bryan,Dario Farina,Etienne Burdet

from arxiv, 4 pages, 4 figures, 10 references. Submitted to IEEE EMBC 2024 conference

Spinal cord injuries (SCIs) generally result in sensory and mobility impairments, with torso instability being particularly debilitating. Existing torso stabilisers are often rigid and restrictive. This paper presents an early investigation into a non-restrictive 1 degree-of-freedom (DoF) mechanical torso stabiliser inspired by devices such as centrifugal clutches and seat-belt mechanisms. Firstly, the paper presents a motion-capture (MoCap) and OpenSim-based kinematic analysis of the cable-based system to understand requisite device characteristics. The simulated evaluation resulted in the cable-based device to require 55-60cm of unrestricted travel, and to lock at a threshold cable velocity of 80-100cm/sec. Next, the developed 1-DoF device is introduced. The proposed mechanical device is transparent during activities of daily living, and transitions to compliant blocking when incipient fall is detected. Prototype behaviour was then validated using a MoCap-based kinematic analysis to verify non-restrictive movement, reliable transition to blocking, and compliance of the blocking.

INTERACT · Pair · Learning · 代價 · 可約的 ·

2024 年 3 月 26 日

Sharing the Cost of Success: A Game for Evaluating and Learning Collaborative Multi-Agent Instruction Giving and Following Policies

Philipp Sadler,Sherzod Hakimov,David Schlangen

from arxiv, 9 pages, Accepted at LREC-COLING 2024

In collaborative goal-oriented settings, the participants are not only interested in achieving a successful outcome, but do also implicitly negotiate the effort they put into the interaction (by adapting to each other). In this work, we propose a challenging interactive reference game that requires two players to coordinate on vision and language observations. The learning signal in this game is a score (given after playing) that takes into account the achieved goal and the players' assumed efforts during the interaction. We show that a standard Proximal Policy Optimization (PPO) setup achieves a high success rate when bootstrapped with heuristic partner behaviors that implement insights from the analysis of human-human interactions. And we find that a pairing of neural partners indeed reduces the measured joint effort when playing together repeatedly. However, we observe that in comparison to a reasonable heuristic pairing there is still room for improvement -- which invites further research in the direction of cost-sharing in collaborative interactions.

優化器 · 正則化項 · 泛函 · Agent · 代價函數 ·

2024 年 3 月 26 日

On the Geometric Convergence of Byzantine-Resilient Distributed Optimization Algorithms

Kananart Kuwaranancharoen,Shreyas Sundaram

from arxiv, 29 pages, 5 figures

The problem of designing distributed optimization algorithms that are resilient to Byzantine adversaries has received significant attention. For the Byzantine-resilient distributed optimization problem, the goal is to (approximately) minimize the average of the local cost functions held by the regular (non adversarial) agents in the network. In this paper, we provide a general algorithmic framework for Byzantine-resilient distributed optimization which includes some state-of-the-art algorithms as special cases. We analyze the convergence of algorithms within the framework, and derive a geometric rate of convergence of all regular agents to a ball around the optimal solution (whose size we characterize). Furthermore, we show that approximate consensus can be achieved geometrically fast under some minimal conditions. Our analysis provides insights into the relationship among the convergence region, distance between regular agents' values, step-size, and properties of the agents' functions for Byzantine-resilient distributed optimization.

Facebook AI Research · 評論員 · Extensibility · WEB · 成比例 ·

2024 年 3 月 25 日

Unveiling the Blind Spots: A Critical Examination of Fairness in Autonomous Driving Systems

Xinyue Li,Zhenpeng Chen,Jie M. Zhang,Federica Sarro,Ying Zhang,Xuanzhe Liu

from arxiv, Update the models evaluated and the experimental results

Autonomous driving systems have extended the spectrum of Web of Things for intelligent vehicles and have become an important component of the Web ecosystem. Similar to traditional Web-based applications, fairness is an essential aspect for ensuring the high quality of autonomous driving systems, particularly in the context of pedestrian detectors within them. However, there is an absence in the literature of a comprehensive assessment of the fairness of current Deep Learning (DL)-based pedestrian detectors. To fill the gap, we evaluate eight widely-explored DL-based pedestrian detectors across demographic groups on large-scale real-world datasets. To enable a thorough fairness evaluation, we provide extensive annotations for the datasets, resulting in 8,311 images with 16,070 gender labels, 20,115 age labels, and 3,513 skin tone labels. Our findings reveal significant fairness issues related to age. The undetected proportions for adults are 20.14% lower compared to children. Furthermore, we explore how various driving scenarios affect the fairness of pedestrian detectors. We find that the bias may exacerbate for children and females towards low brightness and low contrast.

MoDELS · 區塊鏈 · 容差 · TransAct · Less ·

2024 年 3 月 25 日

Formally Verifying the Safety of Pipelined Moonshot Consensus Protocol

M. Praveen,Raghavendra Ramesh,Isaac Doidge

Decentralized Finance (DeFi) has emerged as a contemporary competitive as well as complementary to traditional centralized finance systems. As of 23rd January 2024, per Defillama approximately USD 55 billion is the total value locked on the DeFi applications on all blockchains put together. A Byzantine Fault Tolerant (BFT) State Machine Replication (SMR) protocol, popularly known as the consensus protocol, is the central component of a blockchain. If forks are possible in a consensus protocol, they can be misused to carry out double spending attacks and can be catastrophic given high volumes of finance that are transacted on blockchains. Formal verification of the safety of consensus protocols is the golden standard for guaranteeing that forks are not possible. However, it is considered complex and challenging to do. This is reflected by the fact that not many complex consensus protocols are formally verified except for Tendermint and QBFT. We focus on Supra's Pipelined Moonshot consensus protocol. Similar to Tendermint's formal verification, we too model Pipelined Moonshot using IVy and formally prove that for all network sizes, as long as the number of Byzantine validators is less than one thirds, the protocol does not allow forks, thus proving that Pipelined Moonshot is safe and double spending cannot be done using forks. The IVy model and proof of safety is available on Github.

語言模型化 · MoDELS · 大語言模型 · 樣例 · 樣本 ·

2024 年 3 月 25 日

Exploring the Adversarial Capabilities of Large Language Models

Lukas Struppek,Minh Hieu Le,Dominik Hintersdorf,Kristian Kersting

The proliferation of large language models (LLMs) has sparked widespread and general interest due to their strong language generation capabilities, offering great potential for both industry and research. While previous research delved into the security and privacy issues of LLMs, the extent to which these models can exhibit adversarial behavior remains largely unexplored. Addressing this gap, we investigate whether common publicly available LLMs have inherent capabilities to perturb text samples to fool safety measures, so-called adversarial examples resp.~attacks. More specifically, we investigate whether LLMs are inherently able to craft adversarial examples out of benign samples to fool existing safe rails. Our experiments, which focus on hate speech detection, reveal that LLMs succeed in finding adversarial perturbations, effectively undermining hate speech detection systems. Our findings carry significant implications for (semi-)autonomous systems relying on LLMs, highlighting potential challenges in their interaction with existing systems and safety measures.

論文 · 數據集 · 百度智能生活事業群組 · HTTPS · 語言模型化 ·

2024 年 3 月 22 日

LimGen: Probing the LLMs for Generating Suggestive Limitations of Research Papers

Abdur Rahman Bin Md Faizullah,Ashok Urlana,Rahul Mishra

from arxiv, 16 pages, 3 figures

Examining limitations is a crucial step in the scholarly research reviewing process, revealing aspects where a study might lack decisiveness or require enhancement. This aids readers in considering broader implications for further research. In this article, we present a novel and challenging task of Suggestive Limitation Generation (SLG) for research papers. We compile a dataset called LimGen, encompassing 4068 research papers and their associated limitations from the ACL anthology. We investigate several approaches to harness large language models (LLMs) for producing suggestive limitations, by thoroughly examining the related challenges, practical insights, and potential opportunities. Our LimGen dataset and code can be accessed at //github.com/armbf/LimGen.

Learning · Pattern Recognition · 可理解性 · 深度學習 · 模型構建 ·

2022 年 9 月 14 日

A Review and Roadmap of Deep Learning Causal Discovery in Different Variable Paradigms

Hang Chen,Keqing Du,Xinyu Yang,Chenguang Li

from arxiv, 26 pages,10 figures. arXiv admin note: text overlap with arXiv:2012.07138, arXiv:1605.08179, arXiv:2203.14237 by other authors

Understanding causality helps to structure interventions to achieve specific goals and enables predictions under interventions. With the growing importance of learning causal relationships, causal discovery tasks have transitioned from using traditional methods to infer potential causal structures from observational data to the field of pattern recognition involved in deep learning. The rapid accumulation of massive data promotes the emergence of causal search methods with brilliant scalability. Existing summaries of causal discovery methods mainly focus on traditional methods based on constraints, scores and FCMs, there is a lack of perfect sorting and elaboration for deep learning-based methods, also lacking some considers and exploration of causal discovery methods from the perspective of variable paradigms. Therefore, we divide the possible causal discovery tasks into three types according to the variable paradigm and give the definitions of the three tasks respectively, define and instantiate the relevant datasets for each task and the final causal model constructed at the same time, then reviews the main existing causal discovery methods for different tasks. Finally, we propose some roadmaps from different perspectives for the current research gaps in the field of causal discovery and point out future research directions.

MoDELS · Performer · Processing（編程語言） · 學成 · 穩健性 ·

2021 年 9 月 3 日

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

Paul Michel

from arxiv, PhD thesis

The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (eg. sentiment classification, span-prediction based question answering or machine translation). However, it builds upon the assumption that the data distribution is stationary, ie. that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and propose benchmarks and evaluation metrics to measure its effect on current deep learning architectures. We then proceed to take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we demonstrate that these approaches yield more robust models as demonstrated on a selection of realistic problems. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule which alleviate catastrophic forgetting issues during adaptation.

Processing（編程語言） · 學成 · 自然語言處理 · 語言處理 · 深度學習 ·

2019 年 9 月 11 日

A Survey of the Usages of Deep Learning in Natural Language Processing

Daniel W. Otter,Julian R. Medina,Jugal K. Kalita

Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models. This survey provides a brief introduction to the field and a quick overview of deep learning architectures and methods. It then sifts through the plethora of recent studies and summarizes a large assortment of relevant contributions. Analyzed research areas include several core linguistic processing issues in addition to a number of applications of computational linguistics. A discussion of the current state of the art is then provided along with recommendations for future research in the field.