顾美玲国产一区二区三区,午夜大片免费男女爽爽影院久久,国产一区二区视频免费观看

Recent years have witnessed tremendous success in Self-Supervised Learning (SSL), which has been widely utilized to facilitate various downstream tasks in Computer Vision (CV) and Natural Language Processing (NLP) domains. However, attackers may steal such SSL models and commercialize them for profit, making it crucial to verify the ownership of the SSL models. Most existing ownership protection solutions (e.g., backdoor-based watermarks) are designed for supervised learning models and cannot be used directly since they require that the models' downstream tasks and target labels be known and available during watermark embedding, which is not always possible in the domain of SSL. To address such a problem, especially when downstream tasks are diverse and unknown during watermark embedding, we propose a novel black-box watermarking solution, named SSL-WM, for verifying the ownership of SSL models. SSL-WM maps watermarked inputs of the protected encoders into an invariant representation space, which causes any downstream classifier to produce expected behavior, thus allowing the detection of embedded watermarks. We evaluate SSL-WM on numerous tasks, such as CV and NLP, using different SSL models both contrastive-based and generative-based. Experimental results demonstrate that SSL-WM can effectively verify the ownership of stolen SSL models in various downstream tasks. Furthermore, SSL-WM is robust against model fine-tuning, pruning, and input preprocessing attacks. Lastly, SSL-WM can also evade detection from evaluated watermark detection approaches, demonstrating its promising application in protecting the ownership of SSL models.

相關內容

SSL

關注 3

大語言模型 · 可理解性 · MoDELS · 縮放 · Performer ·

2024 年 3 月 11 日

Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena

Leonie Weissweiler,Abdullatif K?ksal,Hinrich Schütze

Argument Structure Constructions (ASCs) are one of the most well-studied construction groups, providing a unique opportunity to demonstrate the usefulness of Construction Grammar (CxG). For example, the caused-motion construction (CMC, ``She sneezed the foam off her cappuccino'') demonstrates that constructions must carry meaning, otherwise the fact that ``sneeze'' in this context causes movement cannot be explained. We form the hypothesis that this remains challenging even for state-of-the-art Large Language Models (LLMs), for which we devise a test based on substituting the verb with a prototypical motion verb. To be able to perform this test at statistically significant scale, in the absence of adequate CxG corpora, we develop a novel pipeline of NLP-assisted collection of linguistically annotated text. We show how dependency parsing and GPT-3.5 can be used to significantly reduce annotation cost and thus enable the annotation of rare phenomena at scale. We then evaluate GPT, Gemini, Llama2 and Mistral models for their understanding of the CMC using the newly collected corpus. We find that all models struggle with understanding the motion component that the CMC adds to a sentence.

優化器 · Markovian · 估計/估計量 · 重要性采樣 · Projection ·

2024 年 3 月 11 日

Automated Importance Sampling via Optimal Control for Stochastic Reaction Networks: A Markovian Projection-based Approach

Chiheb Ben Hammouda,Nadhir Ben Rached,Raúl Tempone,Sophia Wiechert

We propose a novel alternative approach to our previous work (Ben Hammouda et al., 2023) to improve the efficiency of Monte Carlo (MC) estimators for rare event probabilities for stochastic reaction networks (SRNs). In the same spirit of (Ben Hammouda et al., 2023), an efficient path-dependent measure change is derived based on a connection between determining optimal importance sampling (IS) parameters within a class of probability measures and a stochastic optimal control formulation, corresponding to solving a variance minimization problem. In this work, we propose a novel approach to address the encountered curse of dimensionality by mapping the problem to a significantly lower-dimensional space via a Markovian projection (MP) idea. The output of this model reduction technique is a low-dimensional SRN (potentially even one dimensional) that preserves the marginal distribution of the original high-dimensional SRN system. The dynamics of the projected process are obtained by solving a related optimization problem via a discrete $L^2$ regression. By solving the resulting projected Hamilton-Jacobi-Bellman (HJB) equations for the reduced-dimensional SRN, we obtain projected IS parameters, which are then mapped back to the original full-dimensional SRN system, resulting in an efficient IS-MC estimator for rare events probabilities of the full-dimensional SRN. Our analysis and numerical experiments reveal that the proposed MP-HJB-IS approach substantially reduces the MC estimator variance, resulting in a lower computational complexity in the rare event regime than standard MC estimators.

contrastive · anchor · 小樣本學習 · MoDELS · 多樣性 ·

2024 年 3 月 11 日

Synergistic Anchored Contrastive Pre-training for Few-Shot Relation Extraction

Da Luo,Yanglei Gan,Rui Hou,Run Lin,Qiao Liu,Yuxiang Cai,Wannian Gao

Few-shot Relation Extraction (FSRE) aims to extract relational facts from a sparse set of labeled corpora. Recent studies have shown promising results in FSRE by employing Pre-trained Language Models (PLMs) within the framework of supervised contrastive learning, which considers both instances and label facts. However, how to effectively harness massive instance-label pairs to encompass the learned representation with semantic richness in this learning paradigm is not fully explored. To address this gap, we introduce a novel synergistic anchored contrastive pre-training framework. This framework is motivated by the insight that the diverse viewpoints conveyed through instance-label pairs capture incomplete yet complementary intrinsic textual semantics. Specifically, our framework involves a symmetrical contrastive objective that encompasses both sentence-anchored and label-anchored contrastive losses. By combining these two losses, the model establishes a robust and uniform representation space. This space effectively captures the reciprocal alignment of feature distributions among instances and relational facts, simultaneously enhancing the maximization of mutual information across diverse perspectives within the same relation. Experimental results demonstrate that our framework achieves significant performance enhancements compared to baseline models in downstream FSRE tasks. Furthermore, our approach exhibits superior adaptability to handle the challenges of domain shift and zero-shot relation extraction. Our code is available online at //github.com/AONE-NLP/FSRE-SaCon.

MoDELS · 表示 · 大語言模型 · 代碼 · Automator ·

2024 年 3 月 11 日

RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair

André Silva,Sen Fang,Martin Monperrus

Automated Program Repair (APR) has evolved significantly with the advent of Large Language Models (LLMs). Fine-tuning LLMs for program repair is a recent avenue of research, with many dimensions which have not been explored. Existing work mostly fine-tunes LLMs with naive code representations and is fundamentally limited in its ability to fine-tune larger LLMs. To address this problem, we propose RepairLLaMA, a novel program repair approach that combines 1) code representations for APR and 2) the state-of-the-art parameter-efficient LLM fine-tuning technique called LoRA. This results in RepairLLaMA producing a highly effective `program repair adapter' for fixing bugs with language models. Our experiments demonstrate the validity of both concepts. First, fine-tuning adapters with program repair specific code representations enables the model to use meaningful repair signals. Second, parameter-efficient fine-tuning helps fine-tuning to converge and contributes to the effectiveness of the repair adapter to fix data-points outside the fine-tuning data distribution. Overall, RepairLLaMA correctly fixes 125 Defects4J v2 and 82 HumanEval-Java bugs, outperforming all baselines.

展開 · 線性的 · Performer · SPIN · state-of-the-art ·

2024 年 3 月 10 日

Partial-order Checking with Unfolding for Linear Temporal Properties

Shuo Li,Liao Zheng,Zhijun Ding

from arxiv, 15 pages, 3 figures

Unfolding can tackle the path-explosion problem caused by concurrency. Traditional unfolding generation faces an NP-complete problem when adding events to the unfolding structure, which also exists in the case of verifying linear temporal logic (LTL). The reason is that it is necessary to enumerate possible concurrent event combinations after adding an event. Many state-of-the-art methods optimally explore unfolding-based structure (called event structure) by a tree-like structure, which should be constructed on the event structure with complete conflict and causal relations. However, a synchronization of a Petri net and the Buchi representation of LTL as a folded net can not represent complete conflict and causal relations. Thus, it is difficult to apply such a tree-like structure directly on the folded net. To resolve this difficulty, we propose a new method, called partial-order checking with unfolding, to verify LTL based on PDNet (program dependence net). We define an exploration tree with a new notion of delayed transitions, which is different from the existing tree-like structure. It improves the unfolding generation by avoiding all possible event combinations. Then, we propose an algorithm to simultaneously construct the exploration tree while generating the unfolding structure, as well as checking LTL. We implement a tool PUPER for concurrent programs with POSIX threads. It improves traditional unfolding generations via our exploration tree-based algorithms and shows better performance than SPIN and DiVine on the used benchmarks.

Performer · 多峰值 · MoDELS · 可約的 · state-of-the-art ·

2024 年 3 月 8 日

RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback

Tianyu Yu,Yuan Yao,Haoye Zhang,Taiwen He,Yifeng Han,Ganqu Cui,Jinyi Hu,Zhiyuan Liu,Hai-Tao Zheng,Maosong Sun,Tat-Seng Chua

from arxiv, Accepted by CVPR 2024

Multimodal Large Language Models (MLLMs) have recently demonstrated impressive capabilities in multimodal understanding, reasoning, and interaction. However, existing MLLMs prevalently suffer from serious hallucination problems, generating text that is not factually grounded in associated images. The problem makes existing MLLMs untrustworthy and thus impractical in real-world (especially high-stakes) applications. To address the challenge, we present RLHF-V, which enhances MLLM trustworthiness via behavior alignment from fine-grained correctional human feedback. Specifically, RLHF-V collects human preference in the form of segment-level corrections on hallucinations, and performs dense direct preference optimization over the human feedback. Comprehensive experiments on five benchmarks in both automatic and human evaluation show that, RLHF-V can enable substantially more trustworthy MLLM behaviors with promising data and computation efficiency. Remarkably, using 1.4k annotated data samples, RLHF-V significantly reduces the hallucination rate of the base MLLM by 34.8%, outperforming the concurrent LLaVA-RLHF trained on 10k annotated data. The final model achieves state-of-the-art performance in trustworthiness among open-source MLLMs, and shows better robustness than GPT-4V in preventing hallucinations aroused from over-generalization. We open-source our code, model, and data at //github.com/RLHF-V/RLHF-V.

Networking · 講稿 · 可理解性 · AI · 多代理人模型 ·

2023 年 10 月 11 日

Generative Agent-Based Social Networks for Disinformation: Research Opportunities and Open Challenges

Javier Pastor-Galindo,Pantaleone Nespoli,José A. Ruipérez-Valiente

This article presents the affordances that Generative Artificial Intelligence can have in disinformation context, one of the major threats to our digitalized society. We present a research framework to generate customized agent-based social networks for disinformation simulations that would enable understanding and evaluation of the phenomena whilst discussing open challenges.

剪枝 · Better · CAP · contrastive · MoDELS ·

2021 年 12 月 14 日

From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression

Runxin Xu,Fuli Luo,Chengyu Wang,Baobao Chang,Jun Huang,Songfang Huang,Fei Huang

from arxiv, Accepted to AAAI 2022

Pre-trained Language Models (PLMs) have achieved great success in various Natural Language Processing (NLP) tasks under the pre-training and fine-tuning paradigm. With large quantities of parameters, PLMs are computation-intensive and resource-hungry. Hence, model pruning has been introduced to compress large-scale PLMs. However, most prior approaches only consider task-specific knowledge towards downstream tasks, but ignore the essential task-agnostic knowledge during pruning, which may cause catastrophic forgetting problem and lead to poor generalization ability. To maintain both task-agnostic and task-specific knowledge in our pruned model, we propose ContrAstive Pruning (CAP) under the paradigm of pre-training and fine-tuning. It is designed as a general framework, compatible with both structured and unstructured pruning. Unified in contrastive learning, CAP enables the pruned model to learn from the pre-trained model for task-agnostic knowledge, and fine-tuned model for task-specific knowledge. Besides, to better retain the performance of the pruned model, the snapshots (i.e., the intermediate models at each pruning iteration) also serve as effective supervisions for pruning. Our extensive experiments show that adopting CAP consistently yields significant improvements, especially in extremely high sparsity scenarios. With only 3% model parameters reserved (i.e., 97% sparsity), CAP successfully achieves 99.2% and 96.3% of the original BERT performance in QQP and MNLI tasks. In addition, our probing experiments demonstrate that the model pruned by CAP tends to achieve better generalization ability.

語言模型化 · 自動問答 · MoDELS · 可約的 · entity ·

2021 年 9 月 22 日

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Fu Sun,Feng-Lin Li,Ruize Wang,Qianglong Chen,Xingyi Cheng,Ji Zhang

from arxiv, CIKM 2021

Knowledge enhanced pre-trained language models (K-PLMs) are shown to be effective for many public tasks in the literature but few of them have been successfully applied in practice. To address this problem, we propose K-AID, a systematic approach that includes a low-cost knowledge acquisition process for acquiring domain knowledge, an effective knowledge infusion module for improving model performance, and a knowledge distillation component for reducing the model size and deploying K-PLMs on resource-restricted devices (e.g., CPU) for real-world application. Importantly, instead of capturing entity knowledge like the majority of existing K-PLMs, our approach captures relational knowledge, which contributes to better-improving sentence-level text classification and text matching tasks that play a key role in question answering (QA). We conducted a set of experiments on five text classification tasks and three text matching tasks from three domains, namely E-commerce, Government, and Film&TV, and performed online A/B tests in E-commerce. Experimental results show that our approach is able to achieve substantial improvement on sentence-level question answering tasks and bring beneficial business value in industrial settings.

圖卷積神經網絡/圖卷積網絡 · Networking · INFORMS · 圖卷積 · 圖 ·

2019 年 6 月 9 日

Fine-grained Event Categorization with Heterogeneous Graph Convolutional Networks

Hao Peng,Jianxin Li,Qiran Gong,Yangqiu Song,Yuanxing Ning,Kunfeng Lai,Philip S. Yu

from arxiv, Accepted by IJCAI'19(International Joint Conference on Artificial Intelligence)

Events are happening in real-world and real-time, which can be planned and organized occasions involving multiple people and objects. Social media platforms publish a lot of text messages containing public events with comprehensive topics. However, mining social events is challenging due to the heterogeneous event elements in texts and explicit and implicit social network structures. In this paper, we design an event meta-schema to characterize the semantic relatedness of social events and build an event-based heterogeneous information network (HIN) integrating information from external knowledge base, and propose a novel Pair-wise Popularity Graph Convolutional Network (PP-GCN) based fine-grained social event categorization model. We propose a Knowledgeable meta-paths Instances based social Event Similarity (KIES) between events and build a weighted adjacent matrix as input to the PP-GCN model. Comprehensive experiments on real data collections are conducted to compare various social event detection and clustering tasks. Experimental results demonstrate that our proposed framework outperforms other alternative social event categorization techniques.