Generalization outside the scope of one's training data requires leveraging prior knowledge about the effects that transfer, and the effects that don't, between different data sources. Bayesian transfer learning is a principled paradigm for specifying this knowledge, and refining it on the basis of data from the source (training) and target (prediction) tasks. We address the challenging transfer learning setting where the learner (i) cannot fine-tune in the target task, and (ii) does not know which source data points correspond to the same task (i.e., the data sources are unknown). We propose a proxy-informed robust method for probabilistic transfer learning (PROMPT), which provides a posterior predictive estimate tailored to the structure of the target task, without requiring that the learner have access to any outcome information from the target task. Instead, PROMPT relies on the availability of proxy information. PROMPT uses the same proxy information for two purposes: (i) estimation of effects specific to the target task, and (ii) construction of a robust reweighting of the source data for estimation of effects that transfer between tasks. We provide theoretical results on the effect of this reweighting on the risk of negative transfer, and demonstrate application of PROMPT in two synthetic settings.

Related content


MoDELS · Language modeling · INFORMS · Processing (programming language) · Inference

December 18, 2024

Alignment faking in large language models

Ryan Greenblatt,Carson Denison,Benjamin Wright,Fabien Roger,Monte MacDiarmid,Sam Marks,Johannes Treutlein,Tim Belonax,Jack Chen,David Duvenaud,Akbir Khan,Julian Michael,Sören Mindermann,Ethan Perez,Linda Petrini,Jonathan Uesato,Jared Kaplan,Buck Shlegeris,Samuel R. Bowman,Evan Hubinger

We present a demonstration of a large language model engaging in alignment faking: selectively complying with its training objective in training to prevent modification of its behavior out of training. First, we give Claude 3 Opus a system prompt stating it is being trained to answer all queries, even harmful ones, which conflicts with its prior training to refuse such queries. To allow the model to infer when it is in training, we say it will be trained only on conversations with free users, not paid users. We find the model complies with harmful queries from free users 14% of the time, versus almost never for paid users. Explaining this gap, in almost all cases where the model complies with a harmful query from a free user, we observe explicit alignment-faking reasoning, with the model stating it is strategically answering harmful queries in training to preserve its preferred harmlessness behavior out of training. Next, we study a more realistic setting where information about the training process is provided not in a system prompt, but by training on synthetic documents that mimic pre-training data--and observe similar alignment faking. Finally, we study the effect of actually training the model to comply with harmful queries via reinforcement learning, which we find increases the rate of alignment-faking reasoning to 78%, though also increases compliance even out of training. We additionally observe other behaviors such as the model exfiltrating its weights when given an easy opportunity. While we made alignment faking easier by telling the model when and by what criteria it was being trained, we did not instruct the model to fake alignment or give it any explicit goal. As future models might infer information about their training process without being told, our results suggest a risk of alignment faking in future models, whether due to a benign preference--as in this case--or not.

MoDELS · OCR · TOOLS · Datasets · Neural Networks

December 18, 2024

Towards Deployable OCR models for Indic languages

Minesh Mathew,Ajoy Mondal,CV Jawahar

from arxiv, presented at ICPR 2024; //link.springer.com/chapter/10.1007/978-3-031-78495-8_11

Recognition of text on word or line images, without the need for sub-word segmentation, has become the mainstream of research and development of text recognition for Indian languages. Modelling unsegmented sequences using Connectionist Temporal Classification (CTC) is the most commonly used approach for segmentation-free OCR. In this work we present a comprehensive empirical study of various neural network models that use CTC for transcribing step-wise predictions in the neural network output to a Unicode sequence. The study is conducted for 13 Indian languages, using an internal dataset that has around 1000 pages per language. We study the choice of line vs word as the recognition unit, and the use of synthetic data to train the models. We compare our models with popular publicly available OCR tools for end-to-end document image recognition. Our end-to-end pipeline, which employs our recognition models and existing text segmentation tools, outperforms these public OCR tools for 8 out of the 13 languages. We also introduce a new public dataset called Mozhi for word and line recognition in Indian languages. The dataset contains more than 1.2 million annotated word images (120 thousand text lines) across 13 Indian languages. Our code, trained models and the Mozhi dataset will be made available at //cvit.iiit.ac.in/research/projects/cvit-projects/
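As a rough illustration of how CTC output can be turned into a Unicode sequence, the sketch below shows greedy (best-path) decoding: take the top label at each time step, merge consecutive repeats, then drop the blank label. This is a generic textbook procedure, not necessarily the decoder used by the authors; the function name `ctc_greedy_decode`, the toy alphabet, and the scores are all made up for illustration.

```python
from itertools import groupby


def ctc_greedy_decode(step_scores, alphabet, blank=0):
    """Collapse per-step argmax labels into a Unicode string.

    step_scores: list of per-time-step score lists (one score per label).
    alphabet: maps label index -> character; index `blank` is the CTC blank.
    """
    # 1. Pick the highest-scoring label at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in step_scores
    ]
    # 2. Merge consecutive repeats of the same label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Drop blanks and map the remaining labels to characters.
    return "".join(alphabet[label] for label in collapsed if label != blank)


# Toy example (hypothetical labels): 0 = blank, 1 = "क", 2 = "म".
alphabet = {0: "", 1: "क", 2: "म"}
scores = [
    [0.10, 0.80, 0.10],  # क
    [0.10, 0.70, 0.20],  # क again, merged with the previous step
    [0.90, 0.05, 0.05],  # blank, separates the two क characters
    [0.20, 0.70, 0.10],  # क, kept because a blank came in between
    [0.10, 0.20, 0.70],  # म
]
decoded = ctc_greedy_decode(scores, alphabet)
```

The blank label is what lets a genuinely doubled character survive the repeat-merging step, which is why step 2 must run before step 3.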

MoDELS · Performer · Large language models · Language modeling · Nuance

December 18, 2024

Kalahi: A handcrafted, grassroots cultural LLM evaluation suite for Filipino

Jann Railey Montalan,Jian Gang Ngui,Wei Qi Leong,Yosephine Susanto,Hamsawardhini Rengarajan,Alham Fikri Aji,William Chandra Tjhi

from arxiv, Accepted for presentation at Paclic 38, 2024

Multilingual large language models (LLMs) today may not necessarily provide culturally appropriate and relevant responses to their Filipino users. We introduce Kalahi, a cultural LLM evaluation suite collaboratively created by native Filipino speakers. It is composed of 150 high-quality, handcrafted and nuanced prompts that test LLMs for generations that are relevant to shared Filipino cultural knowledge and values. Strong LLM performance in Kalahi indicates a model's ability to generate responses similar to what an average Filipino would say or do in a given situation. We conducted experiments on LLMs with multilingual and Filipino language support. Results show that Kalahi, while trivial for Filipinos, is challenging for LLMs, with the best model answering only 46.0% of the questions correctly compared to native Filipino performance of 89.10%. Thus, Kalahi can be used to accurately and reliably evaluate Filipino cultural representation in LLMs.

Chatbot · Knowledge · INTERACT · MoDELS · Design

December 16, 2024

Habit Coach: Customising RAG-based chatbots to support behavior change

Arian Fooroogh Mand Arabi,Cansu Koyuturk,Michael O'Mahony,Raffaella Calati,Dimitri Ognibene

from arxiv, Accepted for Italian Workshop on Artificial Intelligence for Human Machine Interaction (AIxHMI 2024), November 26, 2024, Bolzano, Italy

This paper presents the iterative development of Habit Coach, a GPT-based chatbot designed to support users in habit change through personalized interaction. Employing a user-centered design approach, we developed the chatbot using a Retrieval-Augmented Generation (RAG) system, which enables behavior personalization without retraining the underlying language model (GPT-4). The system leverages document retrieval and specialized prompts to tailor interactions, drawing from Cognitive Behavioral Therapy (CBT) and narrative therapy techniques. A key challenge in the development process was the difficulty of translating declarative knowledge into effective interaction behaviors. In the initial phase, the chatbot was provided with declarative knowledge about CBT via reference textbooks and high-level conversational goals. However, this approach resulted in imprecise and inefficient behavior, as the GPT model struggled to convert static information into dynamic and contextually appropriate interactions. This highlighted the limitations of relying solely on declarative knowledge to guide chatbot behavior, particularly in nuanced, therapeutic conversations. Over four iterations, we addressed this issue by gradually transitioning towards procedural knowledge, refining the chatbot's interaction strategies, and improving its overall effectiveness. In the final evaluation, 5 participants engaged with the chatbot over five consecutive days, receiving individualized CBT interventions. The Self-Report Habit Index (SRHI) was used to measure habit strength before and after the intervention, revealing a reduction in habit strength post-intervention. These results underscore the importance of procedural knowledge in driving effective, personalized behavior change support in RAG-based systems.

MoDELS · Robustness · Image classification · Language modeling · Large language models

December 13, 2024

Robust image classification with multi-modal large language models

Francesco Villani,Igor Maljkovic,Dario Lazzaro,Angelo Sotgiu,Antonio Emanuele Cinà,Fabio Roli

Deep Neural Networks are vulnerable to adversarial examples, i.e., carefully crafted input samples that can cause models to make incorrect predictions with high confidence. To mitigate these vulnerabilities, adversarial training and detection-based defenses have been proposed to strengthen models in advance. However, most of these approaches focus on a single data modality, overlooking the relationships between visual patterns and textual descriptions of the input. In this paper, we propose a novel defense, Multi-Shield, designed to combine and complement these defenses with multi-modal information to further enhance their robustness. Multi-Shield leverages multi-modal large language models to detect adversarial examples and abstain from uncertain classifications when there is no alignment between textual and visual representations of the input. Extensive evaluations on CIFAR-10 and ImageNet datasets, using robust and non-robust image classification models, demonstrate that Multi-Shield can be easily integrated to detect and reject adversarial examples, outperforming the original defenses.

Understandability · CASES · INTERACT · Conflict resolution · prototype

December 13, 2024

Trustworthy and Explainable Decision-Making for Workforce allocation

Guillaume Povéda,Ryma Boumazouza,Andreas Strahl,Mark Hall,Santiago Quintana-Amate,Nahum Alvarez,Ignace Bleukx,Dimos Tsouros,Hélène Verhaeghe,Tias Guns

from arxiv, Accepted for presentation at PTHG-24: The Seventh Workshop on Progress Towards the Holy Grail, part of the 30th International Conference on Principles and Practice of Constraint Programming. For more details, visit the workshop webpage: //freuder.wordpress.com/progress-towards-the-holy-grail-workshops/pthg-24-the-seventh-workshop-on-progress-towards-the-holy-grail/

In industrial contexts, effective workforce allocation is crucial for operational efficiency. This paper presents an ongoing project focused on developing a decision-making tool designed for workforce allocation, emphasising explainability to enhance its trustworthiness. Our objective is to create a system that not only optimises the allocation of teams to scheduled tasks but also provides clear, understandable explanations for its decisions, particularly in cases where the problem is infeasible. By incorporating human-in-the-loop mechanisms, the tool aims to enhance user trust and facilitate interactive conflict resolution. We implemented our approach in a prototype tool/digital demonstrator intended to be evaluated on a real industrial scenario, both in terms of performance and user acceptability.

MoDELS · Class · Estimation/estimator · Flipping · Undersampling

December 13, 2024

Class flipping for uplift modeling and Heterogeneous Treatment Effect estimation on imbalanced RCT data

Krzysztof Rudaś,Szymon Jaroszewicz

Uplift modeling and Heterogeneous Treatment Effect (HTE) estimation aim at predicting the causal effect of an action, such as a medical treatment or a marketing campaign, on a specific individual. In this paper, we focus on data from Randomized Controlled Experiments, which guarantee causal interpretation of the outcomes. Class and treatment imbalance are important problems in uplift modeling/HTE, but classical undersampling or oversampling based approaches are hard to apply in this case since they distort the predicted effect. Calibration methods have been proposed in the past; however, they do not guarantee correct predictions. In this work, we propose an alternative to undersampling, based on flipping the class value of selected records. We show that the proposed approach does not distort the predicted effect and does not require calibration. The method is especially useful for models based on class variable transformation (modified outcome models). We address those models separately, designing a transformation scheme which guarantees correct predictions and also addresses the problem of treatment imbalance, which is especially important for those models. Experiments fully confirm our theoretical results. Additionally, we demonstrate that our method is also a viable alternative for standard classification problems.
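For readers unfamiliar with class variable transformation, the sketch below illustrates the standard modified-outcome idea on simulated data: set Z = Y for treated records and Z = 1 - Y for controls, so that with balanced assignment (P(T=1) = 1/2) the uplift equals 2·P(Z=1) - 1. This is the well-known baseline transformation, not the class-flipping scheme proposed in the paper; the response rates and helper names are hypothetical.

```python
import random


def transform_outcome(y, t):
    """Class variable transformation: Z = Y if treated, Z = 1 - Y if control."""
    return y if t == 1 else 1 - y


def uplift_from_z(p_z):
    """Valid only when treatment assignment is balanced, P(T=1) = 1/2."""
    return 2.0 * p_z - 1.0


random.seed(0)
P_TREATED, P_CONTROL = 0.30, 0.10  # hypothetical response rates, true uplift 0.20
n = 200_000

z_sum = 0
for _ in range(n):
    t = random.randint(0, 1)  # balanced random assignment, as in an RCT
    rate = P_TREATED if t == 1 else P_CONTROL
    y = 1 if random.random() < rate else 0
    z_sum += transform_outcome(y, t)

# Should land close to the true uplift of 0.20 (up to sampling noise).
estimated_uplift = uplift_from_z(z_sum / n)
```

The identity relies on the 50/50 treatment split, which is one reason treatment imbalance matters so much for this family of models.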

Question answering · Vision · Image classification · state-of-the-art · MoDELS

October 17, 2022

Vision-Language Pre-training: Basics, Recent Advances, and Future Trends

Zhe Gan,Linjie Li,Chunyuan Li,Lijuan Wang,Zicheng Liu,Jianfeng Gao

from arxiv, A survey paper/book on Vision-Language Pre-training (102 pages)

This paper surveys vision-language pre-training (VLP) methods for multimodal intelligence that have been developed in the last few years. We group these approaches into three categories: (i) VLP for image-text tasks, such as image captioning, image-text retrieval, visual question answering, and visual grounding; (ii) VLP for core computer vision tasks, such as (open-set) image classification, object detection, and segmentation; and (iii) VLP for video-text tasks, such as video captioning, video-text retrieval, and video question answering. For each category, we present a comprehensive review of state-of-the-art methods, and discuss the progress that has been made and challenges still being faced, using specific systems and models as case studies. In addition, for each category, we discuss advanced topics being actively explored in the research community, such as big foundation models, unified modeling, in-context few-shot learning, knowledge, robustness, and computer vision in the wild, to name a few.

Link prediction · Graph · Attention mechanism · Extensibility · Performer

May 18, 2021

Link Prediction on N-ary Relational Facts: A Graph-based Approach

Quan Wang,Haifeng Wang,Yajuan Lyu,Yong Zhu

from arxiv, Accepted to Findings of ACL 2021

Link prediction on knowledge graphs (KGs) is a key research topic. Previous work mainly focused on binary relations, paying less attention to higher-arity relations although they are ubiquitous in real-world KGs. This paper considers link prediction upon n-ary relational facts and proposes a graph-based approach to this task. The key to our approach is to represent the n-ary structure of a fact as a small heterogeneous graph, and model this graph with edge-biased fully-connected attention. The fully-connected attention captures universal inter-vertex interactions, while the edge-aware attentive biases specifically encode the graph structure and its heterogeneity. In this fashion, our approach fully models global and local dependencies in each n-ary fact, and hence can more effectively capture associations therein. Extensive evaluation verifies the effectiveness and superiority of our approach. It performs substantially and consistently better than the current state of the art across a variety of n-ary relational benchmarks. Our code is publicly available.
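To make the idea of edge-biased fully-connected attention more concrete, here is a minimal single-head sketch in plain Python. Every vertex attends to every other vertex, and each attention logit receives an additive bias selected by the type of the edge between the pair. The scalar-per-edge-type bias is a simplifying assumption made for this illustration; the abstract does not specify the exact parameterisation, and all names below are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def edge_biased_attention(q, k, v, edge_type, edge_bias):
    """Single-head attention over all vertex pairs with edge-type biases.

    q, k, v: lists of vectors, one per vertex.
    edge_type[i][j]: integer type of the edge from vertex i to vertex j.
    edge_bias[t]: scalar bias for edge type t (learned in a real model).
    """
    d = len(q[0])
    outputs = []
    for i in range(len(q)):
        # Scaled dot-product logit plus the bias for this edge's type.
        logits = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            + edge_bias[edge_type[i][j]]
            for j in range(len(k))
        ]
        weights = softmax(logits)
        # Weighted sum of value vectors.
        outputs.append(
            [sum(weights[j] * v[j][c] for j in range(len(v))) for c in range(len(v[0]))]
        )
    return outputs


# Toy graph with 3 vertices; type 0 = "no special edge", type 1 = a favoured edge.
q = [[0.0, 0.0]] * 3
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
plain_types = [[0] * 3 for _ in range(3)]
uniform_out = edge_biased_attention(q, k, v, plain_types, {0: 0.0})

biased_types = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
biased_out = edge_biased_attention(q, k, v, biased_types, {0: 0.0, 1: 50.0})
```

With zero queries and no bias the attention is uniform, so each output is the mean of the value vectors; a large bias on the edge from vertex 0 to vertex 1 makes vertex 0 attend almost exclusively to vertex 1.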

LayoutLM · INFORMS · Understandability · SCAN · MoDELS

February 19, 2020

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Yiheng Xu,Minghao Li,Lei Cui,Shaohan Huang,Furu Wei,Ming Zhou

from arxiv, Work in progress

Pre-training techniques have been verified successfully in a variety of NLP tasks in recent years. Despite the widespread use of pre-training models for NLP applications, they focus almost exclusively on text-level manipulation, while neglecting the layout and style information that is vital for document image understanding. In this paper, we propose LayoutLM to jointly model the interaction between text and layout information across scanned document images, which is beneficial for a great number of real-world document image understanding tasks such as information extraction from scanned documents. Furthermore, we also leverage image features to incorporate the visual information of words into LayoutLM. To the best of our knowledge, this is the first time that text and layout are jointly learned in a single framework for document-level pre-training. It achieves new state-of-the-art results in several downstream tasks, including form understanding (from 70.72 to 79.27), receipt understanding (from 94.02 to 95.24) and document image classification (from 93.07 to 94.42). The code and pre-trained LayoutLM models are publicly available at //github.com/microsoft/unilm/tree/master/layoutlm.