Extracting structured information from videos is critical for numerous downstream applications in industry. In this paper, we define the task of extracting hierarchical key information from visual text in videos. To address this task, we decouple it into four subtasks and introduce two implementations, PipVKIE and UniVKIE. PipVKIE completes the four subtasks sequentially in successive stages, while UniVKIE improves on this by unifying all the subtasks into one backbone. Both PipVKIE and UniVKIE leverage multimodal information from vision, text, and coordinates for feature representation. Extensive experiments on one well-defined dataset demonstrate that our solutions achieve strong performance and efficient inference speed.

Related content

INFORMS

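The journal publishes high-quality papers that broaden the scope of operations research and computing. It seeks original research papers on theory, methods, experiments, systems, and applications, novel survey and tutorial papers, and papers describing new and useful software tools.

Robustness · MoDELS · Moments · Loss

February 20, 2024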

Robust Estimation of Loss Models for Truncated and Censored Severity Data

Chudamani Poudyal, Vytaras Brazauskas

from arxiv, 32 pages, 2 figures

In this paper, we consider robust estimation of claim severity models in insurance, when data are affected by truncation (due to deductibles), censoring (due to policy limits), and scaling (due to coinsurance). In particular, robust estimators based on the methods of trimmed moments (T-estimators) and winsorized moments (W-estimators) are pursued and fully developed. The general definitions of such estimators are formulated and their asymptotic properties are investigated. For illustrative purposes, specific formulas for T- and W-estimators of the tail parameter of a single-parameter Pareto distribution are derived. The practical performance of these estimators is then explored using the well-known Norwegian fire claims data. Our results demonstrate that T- and W-estimators offer a robust and computationally efficient alternative to the likelihood-based inference for models that are affected by deductibles, policy limits, and coinsurance.
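
As a rough illustration of the trimmed-moments idea in its simplest setting, the sketch below estimates the tail parameter of a single-parameter Pareto distribution from complete data only. It does not handle truncation, censoring, or coinsurance, which are the cases the paper develops. It assumes a known lower bound `x0`, uses the fact that log(X/x0) is exponential with mean 1/alpha, and matches the sample trimmed mean of the log-data to its population counterpart. The function name and the trimming proportions `a` and `b` are choices made here for illustration, not the paper's notation.

```python
import math


def _xlogx(v):
    """Return v * log(v), with the limiting value 0 at v == 0."""
    return 0.0 if v == 0.0 else v * math.log(v)


def pareto_alpha_trimmed(data, x0, a=0.1, b=0.1):
    """Trimmed-mean estimate of the Pareto tail parameter (complete data).

    data : observations, all assumed to be >= x0
    x0   : known lower bound of the Pareto distribution (assumption)
    a, b : proportions trimmed from the lower and upper ends (illustrative)
    """
    if a < 0 or b < 0 or a + b >= 1:
        raise ValueError("need a >= 0, b >= 0 and a + b < 1")

    # log(X / x0) follows an exponential distribution with mean 1 / alpha.
    logs = sorted(math.log(x / x0) for x in data)
    n = len(logs)
    kept = logs[int(n * a): n - int(n * b)]
    if not kept:
        raise ValueError("no observations left after trimming")
    trimmed_mean = sum(kept) / len(kept)

    # Population trimmed mean of a standard exponential:
    # (1 / (1 - a - b)) * integral from a to 1 - b of -log(1 - u) du.
    c = ((1 - a) - b + _xlogx(b) - _xlogx(1 - a)) / (1 - a - b)

    theta_hat = trimmed_mean / c   # estimate of 1 / alpha
    return 1.0 / theta_hat
```

Because the largest observations are discarded before averaging, a single extreme claim has little effect on the estimate, which is the robustness property the abstract refers to.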

Mixture of experts · Large language models · INTERACT · MoDELS · Question answering

February 20, 2024

BiMediX: Bilingual Medical Mixture of Experts LLM

Sara Pieri, Sahal Shaji Mullappilly, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman Khan, Timothy Baldwin, Hisham Cholakkal

In this paper, we introduce BiMediX, the first bilingual medical mixture of experts LLM designed for seamless interaction in both English and Arabic. Our model facilitates a wide range of medical interactions in English and Arabic, including multi-turn chats to inquire about additional details such as patient symptoms and medical history, multiple-choice question answering, and open-ended question answering. We propose a semi-automated English-to-Arabic translation pipeline with human refinement to ensure high-quality translations. We also introduce a comprehensive evaluation benchmark for Arabic medical LLMs. Furthermore, we introduce BiMed1.3M, an extensive Arabic-English bilingual instruction set covering 1.3 million diverse medical interactions, resulting in over 632 million healthcare-specialized tokens for instruction tuning. Our BiMed1.3M dataset includes 250k synthesized multi-turn doctor-patient chats and maintains a 1:2 Arabic-to-English ratio. Our model outperforms the state-of-the-art Med42 and Meditron by average absolute gains of 2.5% and 4.1%, respectively, computed across multiple medical evaluation benchmarks in English, while running inference 8 times faster. Moreover, BiMediX outperforms the generic Arabic-English bilingual LLM, Jais-30B, by average absolute gains of 10% on our Arabic medical benchmark and 15% on bilingual evaluations across multiple datasets. Our project page with source code and trained model is available at //github.com/mbzuai-oryx/BiMediX.

Task-oriented dialogue systems · binary · Analysis · Taxonomy · Performer

February 20, 2024

TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Liyan Tang, Igor Shalyminov, Amy Wing-mei Wong, Jon Burnsky, Jake W. Vincent, Yu'an Yang, Siffi Singh, Song Feng, Hwanjun Song, Hang Su, Lijia Sun, Yi Zhang, Saab Mansour, Kathleen McKeown

from arxiv, Linguistic annotations available at //github.com/amazon-science/tofueval

Single document news summarization has seen substantial progress on faithfulness in recent years, driven by research on the evaluation of factual consistency, or hallucinations. We ask whether these advances carry over to other text summarization domains. We propose a new evaluation benchmark on topic-focused dialogue summarization, generated by LLMs of varying sizes. We provide binary sentence-level human annotations of the factual consistency of these summaries along with detailed explanations of factually inconsistent sentences. Our analysis shows that existing LLMs hallucinate significant amounts of factual errors in the dialogue domain, regardless of the model's size. On the other hand, when LLMs, including GPT-4, serve as binary factual evaluators, they perform poorly and can be outperformed by prevailing state-of-the-art specialized factuality evaluation metrics. Finally, we conducted an analysis of hallucination types with a curated error taxonomy. We find that there are diverse errors and error distributions in model-generated summaries and that non-LLM based metrics can capture all error types better than LLM-based evaluators.

Machine Learning · AutoML · Optimizer · Integration · Learning

February 20, 2024

Data Pipeline Training: Integrating AutoML to Optimize the Data Flow of Machine Learning Models

Jiang Wu, Hongbo Wang, Chunhe Ni, Chenwei Zhang, Wenran Lu

Data pipelines play an indispensable role in tasks such as building machine learning models and developing data products. With the increasing diversity and complexity of data sources, as well as the rapid growth of data volumes, building an efficient data pipeline has become crucial for improving work efficiency and solving complex problems. This paper explores how to optimize data flow through automated machine learning methods by integrating AutoML with the data pipeline. We discuss how to leverage AutoML technology to make data pipelines more intelligent, thereby achieving better results in machine learning tasks. By examining the automation and optimization of data flows, we identify key strategies for constructing efficient data pipelines that can adapt to an ever-changing data landscape. This not only accelerates the modeling process but also provides innovative solutions to complex problems, enabling better outcomes in increasingly intricate data domains.

Keywords: Data Pipeline Training; AutoML; Data environment; Machine learning

MoDELS · Datasets · Object detection · Performer · state-of-the-art

February 20, 2024

InstaGen: Enhancing Object Detection by Training on Synthetic Dataset

Chengjian Feng, Yujie Zhong, Zequn Jie, Weidi Xie, Lin Ma

from arxiv, Tech report

In this paper, we introduce a novel paradigm to enhance the ability of an object detector, e.g., expanding categories or improving detection performance, by training on a synthetic dataset generated from diffusion models. Specifically, we integrate an instance-level grounding head into a pre-trained generative diffusion model, to augment it with the ability to localise arbitrary instances in the generated images. The grounding head is trained to align the text embedding of category names with the regional visual features of the diffusion model, using supervision from an off-the-shelf object detector and a novel self-training scheme on (novel) categories not covered by the detector. This enhanced version of the diffusion model, termed InstaGen, can serve as a data synthesizer for object detection. We conduct thorough experiments to show that an object detector can be enhanced by training on the synthetic dataset from InstaGen, demonstrating superior performance over existing state-of-the-art methods in open-vocabulary (+4.5 AP) and data-sparse (+1.2 to 5.2 AP) scenarios.

Recall · Graph · Model evaluation · Performer · Biased

February 20, 2024

Microstructures and Accuracy of Graph Recall by Large Language Models

Yanbang Wang, Hejie Cui, Jon Kleinberg

from arxiv, 16 pages, 7 tables, 5 figures

Graph data is crucial for many applications, and much of it exists in relations described in textual format. As a result, being able to accurately recall and encode a graph described in earlier text is a basic yet pivotal ability that LLMs need to demonstrate if they are to perform reasoning tasks that involve graph-structured information. Human performance at graph recall has been studied by cognitive scientists for decades, and has been found to often exhibit certain structural patterns of bias that align with human handling of social relationships. To date, however, we know little about how LLMs behave in analogous graph recall tasks: do their recalled graphs also exhibit certain biased patterns, and if so, how do they compare with humans and affect other graph reasoning tasks? In this work, we perform the first systematic study of graph recall by LLMs, investigating the accuracy and biased microstructures (local structural patterns) in their recall. We find that LLMs not only often underperform in graph recall, but also tend to favor more triangles and alternating 2-paths. Moreover, we find that more advanced LLMs depend strikingly on the domain that a real-world graph comes from: they yield the best recall accuracy when the graph is narrated in a language style consistent with its original domain.
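
One way to make "accuracy" and "microstructure" concrete is sketched below: edge-level precision, recall, and F1 of a recalled graph against the true graph, plus a triangle count as one example of a local structural pattern. This is a minimal sketch that treats graphs as undirected edge lists. The function names are hypothetical, and the paper's own metrics and statistical modelling may differ.

```python
from itertools import combinations


def normalize_edges(edges):
    """Turn an iterable of (u, v) pairs into a set of undirected edges."""
    return {frozenset((u, v)) for u, v in edges if u != v}


def recall_scores(true_edges, recalled_edges):
    """Edge-level precision, recall, and F1 of a recalled graph."""
    truth = normalize_edges(true_edges)
    recalled = normalize_edges(recalled_edges)
    hits = len(truth & recalled)
    precision = hits / len(recalled) if recalled else 0.0
    recall = hits / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def count_triangles(edges):
    """Count node triples whose three connecting edges are all present."""
    edge_set = normalize_edges(edges)
    nodes = list({v for e in edge_set for v in e})
    return sum(
        1
        for a, b, c in combinations(nodes, 3)
        if frozenset((a, b)) in edge_set
        and frozenset((b, c)) in edge_set
        and frozenset((a, c)) in edge_set
    )


# Illustrative data: the true graph is a path, the recalled one closes a triangle.
true_graph = [("A", "B"), ("B", "C"), ("C", "D")]
recalled_graph = [("B", "A"), ("B", "C"), ("A", "C")]
scores = recall_scores(true_graph, recalled_graph)
```

A recalled graph that contains more triangles than the true graph, as in this toy example, would be one symptom of the kind of bias the abstract describes.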

MoDELS · Functional · Spike and slab · INFORMS · Continuity

February 19, 2024

Screening the Discrepancy Function of a Computer Model

Pierre Barbillon, Anabel Forte, Rui Paulo

from arxiv, Accepted in Technometrics

Screening traditionally refers to the problem of detecting active inputs in the computer model. In this paper, we develop methodology that applies to screening, but the main focus is on detecting active inputs not in the computer model itself but rather in the discrepancy function that is introduced to account for model inadequacy when linking the computer model with field observations. We contend this is an important problem, as it informs the modeler which inputs are potentially being mishandled in the model, and also along which directions it may be less advisable to use the model for prediction. The methodology is Bayesian and is inspired by the continuous spike and slab prior popularized by the literature on Bayesian variable selection. In our approach, and in contrast with previous proposals, a single MCMC sample from the full model allows us to compute the posterior probabilities of all the competing models, resulting in a methodology that is computationally very fast. The approach hinges on the ability to obtain posterior inclusion probabilities of the inputs, which are intuitive and easy-to-interpret quantities, as the basis for selecting active inputs. For that reason, we name the methodology PIPS (posterior inclusion probability screening).

Knowledge · Processing (programming language) · Graph · NLP · Knowledge graph

September 30, 2022

A Decade of Knowledge Graphs in Natural Language Processing: A Survey

Phillip Schneider, Tim Schopf, Juraj Vladika, Mikhail Galkin, Elena Simperl, Florian Matthes

from arxiv, Accepted to AACL-IJCNLP 2022

In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.

Image retrieval · University of Oxford · Extensibility · Datasets · Performer

March 29, 2018

Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking

Filip Radenović, Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Ondřej Chum

from arxiv, CVPR 2018

In this paper we address issues with image retrieval benchmarking on the standard and popular Oxford 5k and Paris 6k datasets. In particular, annotation errors, the size of the dataset, and the level of challenge are addressed: new annotation for both datasets is created with extra attention to the reliability of the ground truth. Three new protocols of varying difficulty are introduced. The protocols allow fair comparison between different methods, including those using a dataset pre-processing stage. For each dataset, 15 new challenging queries are introduced. Finally, a new set of 1M hard, semi-automatically cleaned distractors is selected. An extensive comparison of the state-of-the-art methods is performed on the new benchmark. Different types of methods are evaluated, ranging from local-feature-based to modern CNN-based methods. The best results are achieved by combining the best of both worlds. Most importantly, image retrieval appears far from being solved.

Rank · Object detection · Performer · Ranking · DATE

March 14, 2018

Revisiting Salient Object Detection: Simultaneous Detection, Ranking, and Subitizing of Multiple Salient Objects

Md Amirul Islam, Mahmoud Kalash, Neil D. B. Bruce

from arxiv, To appear in CVPR 2018

Salient object detection is a problem that has been considered in detail, and many solutions have been proposed. In this paper, we argue that work to date has addressed a problem that is relatively ill-posed. Specifically, there is no universal agreement about what constitutes a salient object when multiple observers are queried. This implies that some objects are more likely to be judged salient than others, and that a relative rank exists among salient objects. The solution presented in this paper addresses this more general problem, which considers relative rank, and we propose data and metrics suitable for measuring success in a relative object saliency setting. A novel deep learning solution is proposed based on a hierarchical representation of relative saliency and stage-wise refinement. We also show that the problem of salient object subitizing can be addressed with the same network, and our approach exceeds the performance of any prior work across all metrics considered (both traditional and newly proposed).