唯美清纯另类亚洲一区二区-A级日本乱理伦片免费入口

Safety alignment of Large Language Models (LLMs) can be compromised with manual jailbreak attacks and (automatic) adversarial attacks. Recent work suggests that patching LLMs against these attacks is possible: manual jailbreak attacks are human-readable but often limited and public, making them easy to block; adversarial attacks generate gibberish prompts that can be detected using perplexity-based filters. In this paper, we show that these solutions may be too optimistic. We propose an interpretable adversarial attack, \texttt{AutoDAN}, that combines the strengths of both types of attacks. It automatically generates attack prompts that bypass perplexity-based filters while maintaining a high attack success rate like manual jailbreak attacks. These prompts are interpretable and diverse, exhibiting strategies commonly used in manual jailbreak attacks, and transfer better than their non-readable counterparts when using limited training data or a single proxy model. We also customize \texttt{AutoDAN}'s objective to leak system prompts, another jailbreak application not addressed in the adversarial attack literature. %, demonstrating the versatility of the approach. We can also customize the objective of \texttt{AutoDAN} to leak system prompts, beyond the ability to elicit harmful content from the model, demonstrating the versatility of the approach. Our work provides a new way to red-team LLMs and to understand the mechanism of jailbreak attacks.

相關內容

語言模型化

關注 9

SPARQL · 知識 (knowledge) · 圖 · 自動問答 · 知識圖譜 ·

2023 年 12 月 8 日

Spider4SPARQL: A Complex Benchmark for Evaluating Knowledge Graph Question Answering Systems

Catherine Kosten,Philippe Cudré-Mauroux,Kurt Stockinger

from arxiv, 10 pages, 5 figures, accepted at IEEE BigData Conference 2023, 8th IEEE Special Session on Machine Learning on Big Data (MLBD 2023)

With the recent spike in the number and availability of Large Language Models (LLMs), it has become increasingly important to provide large and realistic benchmarks for evaluating Knowledge Graph Question Answering (KGQA) systems. So far the majority of benchmarks rely on pattern-based SPARQL query generation approaches. The subsequent natural language (NL) question generation is conducted through crowdsourcing or other automated methods, such as rule-based paraphrasing or NL question templates. Although some of these datasets are of considerable size, their pitfall lies in their pattern-based generation approaches, which do not always generalize well to the vague and linguistically diverse questions asked by humans in real-world contexts. In this paper, we introduce Spider4SPARQL - a new SPARQL benchmark dataset featuring 9,693 previously existing manually generated NL questions and 4,721 unique, novel, and complex SPARQL queries of varying complexity. In addition to the NL/SPARQL pairs, we also provide their corresponding 166 knowledge graphs and ontologies, which cover 138 different domains. Our complex benchmark enables novel ways of evaluating the strengths and weaknesses of modern KGQA systems. We evaluate the system with state-of-the-art KGQA systems as well as LLMs, which achieve only up to 45\% execution accuracy, demonstrating that Spider4SPARQL is a challenging benchmark for future research.

大語言模型 · 語言模型化 · MoDELS · 模型評估 · Performer ·

2023 年 12 月 8 日

Ophtha-LLaMA2: A Large Language Model for Ophthalmology

Huan Zhao,Qian Ling,Yi Pan,Tianyang Zhong,Jin-Yu Hu,Junjie Yao,Fengqian Xiao,Zhenxiang Xiao,Yutong Zhang,San-Hua Xu,Shi-Nan Wu,Min Kang,Zihao Wu,Zhengliang Liu,Xi Jiang,Tianming Liu,Yi Shao

In recent years, pre-trained large language models (LLMs) have achieved tremendous success in the field of Natural Language Processing (NLP). Prior studies have primarily focused on general and generic domains, with relatively less research on specialized LLMs in the medical field. The specialization and high accuracy requirements for diagnosis in the medical field, as well as the challenges in collecting large-scale data, have constrained the application and development of LLMs in medical scenarios. In the field of ophthalmology, clinical diagnosis mainly relies on doctors' interpretation of reports and making diagnostic decisions. In order to take advantage of LLMs to provide decision support for doctors, we collected three modalities of ophthalmic report data and fine-tuned the LLaMA2 model, successfully constructing an LLM termed the "Ophtha-LLaMA2" specifically tailored for ophthalmic disease diagnosis. Inference test results show that even with a smaller fine-tuning dataset, Ophtha-LLaMA2 performs significantly better in ophthalmic diagnosis compared to other LLMs. It demonstrates that the Ophtha-LLaMA2 exhibits satisfying accuracy and efficiency in ophthalmic disease diagnosis, making it a valuable tool for ophthalmologists to provide improved diagnostic support for patients. This research provides a useful reference for the application of LLMs in the field of ophthalmology, while showcasing the immense potential and prospects in this domain.

Prompt · 數據集 · GPT-3.5 · 大語言模型 · AIM ·

2023 年 12 月 8 日

Goal-Oriented Prompt Attack and Safety Evaluation for LLMs

Chengyuan Liu,Fubang Zhao,Lizhi Qing,Yangyang Kang,Changlong Sun,Kun Kuang,Fei Wu

Large Language Models (LLMs) presents significant priority in text understanding and generation. However, LLMs suffer from the risk of generating harmful contents especially while being employed to applications. There are several black-box attack methods, such as Prompt Attack, which can change the behaviour of LLMs and induce LLMs to generate unexpected answers with harmful contents. Researchers are interested in Prompt Attack and Defense with LLMs, while there is no publicly available dataset with high successful attacking rate to evaluate the abilities of defending prompt attack. In this paper, we introduce a pipeline to construct high-quality prompt attack samples, along with a Chinese prompt attack dataset called CPAD. Our prompts aim to induce LLMs to generate unexpected outputs with several carefully designed prompt attack templates and widely concerned attacking contents. Different from previous datasets involving safety estimation, we construct the prompts considering three dimensions: contents, attacking methods and goals. Especially, the attacking goals indicate the behaviour expected after successfully attacking the LLMs, thus the responses can be easily evaluated and analysed. We run several popular Chinese LLMs on our dataset, and the results show that our prompts are significantly harmful to LLMs, with around 70% attack success rate to GPT-3.5. CPAD is publicly available at //github.com/liuchengyuan123/CPAD.

圖像修復 · 3D · MoDELS · 可辨認的 · HTTPS ·

2023 年 12 月 7 日

NeRFiller: Completing Scenes via Generative 3D Inpainting

Ethan Weber,Aleksander Ho?yński,Varun Jampani,Saurabh Saxena,Noah Snavely,Abhishek Kar,Angjoo Kanazawa

from arxiv, Project page: //ethanweber.me/nerfiller

We propose NeRFiller, an approach that completes missing portions of a 3D capture via generative 3D inpainting using off-the-shelf 2D visual generative models. Often parts of a captured 3D scene or object are missing due to mesh reconstruction failures or a lack of observations (e.g., contact regions, such as the bottom of objects, or hard-to-reach areas). We approach this challenging 3D inpainting problem by leveraging a 2D inpainting diffusion model. We identify a surprising behavior of these models, where they generate more 3D consistent inpaints when images form a 2$\times$2 grid, and show how to generalize this behavior to more than four images. We then present an iterative framework to distill these inpainted regions into a single consistent 3D scene. In contrast to related works, we focus on completing scenes rather than deleting foreground objects, and our approach does not require tight 2D object masks or text. We compare our approach to relevant baselines adapted to our setting on a variety of scenes, where NeRFiller creates the most 3D consistent and plausible scene completions. Our project page is at //ethanweber.me/nerfiller.

MoDELS · 潛在 · Extensibility · state-of-the-art · HTTPS ·

2023 年 12 月 6 日

FoodFusion: A Latent Diffusion Model for Realistic Food Image Generation

Olivia Markham,Yuhao Chen,Chi-en Amy Tai,Alexander Wong

Current state-of-the-art image generation models such as Latent Diffusion Models (LDMs) have demonstrated the capacity to produce visually striking food-related images. However, these generated images often exhibit an artistic or surreal quality that diverges from the authenticity of real-world food representations. This inadequacy renders them impractical for applications requiring realistic food imagery, such as training models for image-based dietary assessment. To address these limitations, we introduce FoodFusion, a Latent Diffusion model engineered specifically for the faithful synthesis of realistic food images from textual descriptions. The development of the FoodFusion model involves harnessing an extensive array of open-source food datasets, resulting in over 300,000 curated image-caption pairs. Additionally, we propose and employ two distinct data cleaning methodologies to ensure that the resulting image-text pairs maintain both realism and accuracy. The FoodFusion model, thus trained, demonstrates a remarkable ability to generate food images that exhibit a significant improvement in terms of both realism and diversity over the publicly available image generation models. We openly share the dataset and fine-tuned models to support advancements in this critical field of food image synthesis at //bit.ly/genai4good.

類別 · 評論員 · 模態 · 論文 ·

2023 年 12 月 6 日

Internal and External Calculi: Ordering the Jungle without Being Lost in Translations

Tim S. Lyon,Agata Ciabattoni,Didier Galmiche,Dominique Larchey-Wendling,Daniel Méry,Nicola Olivetti,Revantha Ramanayake

This paper gives a broad account of the various sequent-based proof formalisms in the proof-theoretic literature. We consider formalisms for various modal and tense logics, intuitionistic logic, conditional logics, and bunched logics. After providing an overview of the logics and proof formalisms under consideration, we show how these sequent-based formalisms can be placed in a hierarchy in terms of the underlying data structure of the sequents. We then discuss how this hierarchy can be traversed using translations. Translating proofs up this hierarchy is found to be relatively easy while translating proofs down the hierarchy is substantially more difficult. Finally, we inspect the prevalent distinction in structural proof theory between 'internal calculi' and 'external calculi'. It is observed that these classes resist a rigorous separation, and we critically assess the properties that (calculi from) these classes are purported to possess.

得分 · 樣本 · 蒸餾 · PARCO · CASES ·

2023 年 12 月 6 日

DreamPropeller: Supercharge Text-to-3D Generation with Parallel Sampling

Linqi Zhou,Andy Shih,Chenlin Meng,Stefano Ermon

from arxiv, Github repo: //github.com/alexzhou907/DreamPropeller; Project page: //alexzhou907.github.io/dreampropeller_page/

Recent methods such as Score Distillation Sampling (SDS) and Variational Score Distillation (VSD) using 2D diffusion models for text-to-3D generation have demonstrated impressive generation quality. However, the long generation time of such algorithms significantly degrades the user experience. To tackle this problem, we propose DreamPropeller, a drop-in acceleration algorithm that can be wrapped around any existing text-to-3D generation pipeline based on score distillation. Our framework generalizes Picard iterations, a classical algorithm for parallel sampling an ODE path, and can account for non-ODE paths such as momentum-based gradient updates and changes in dimensions during the optimization process as in many cases of 3D generation. We show that our algorithm trades parallel compute for wallclock time and empirically achieves up to 4.7x speedup with a negligible drop in generation quality for all tested frameworks.

大語言模型 · 多峰值 · 語言模型化 · Performer · MoDELS ·

2023 年 12 月 6 日

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

Chaoyou Fu,Peixian Chen,Yunhang Shen,Yulei Qin,Mengdan Zhang,Xu Lin,Jinrui Yang,Xiawu Zheng,Ke Li,Xing Sun,Yunsheng Wu,Rongrong Ji

from arxiv, Project page: //github.com/BradyFU/Awesome-Multimodal-Large-Language-Models

Multimodal Large Language Model (MLLM) relies on the powerful LLM to perform multimodal tasks, showing amazing emergent abilities in recent studies, such as writing poems based on an image. However, it is difficult for these case studies to fully reflect the performance of MLLM, lacking a comprehensive evaluation. In this paper, we fill in this blank, presenting the first comprehensive MLLM Evaluation benchmark MME. It measures both perception and cognition abilities on a total of 14 subtasks. In order to avoid data leakage that may arise from direct use of public datasets for evaluation, the annotations of instruction-answer pairs are all manually designed. The concise instruction design allows us to fairly compare MLLMs, instead of struggling in prompt engineering. Besides, with such an instruction, we can also easily carry out quantitative statistics. A total of 30 advanced MLLMs are comprehensively evaluated on our MME, which not only suggests that existing MLLMs still have a large room for improvement, but also reveals the potential directions for the subsequent model optimization.

圖 · Neural Networks · Networks · AIM · 圖形處理器 ·

2023 年 8 月 31 日

A Survey on Privacy in Graph Neural Networks: Attacks, Preservation, and Applications

Yi Zhang,Yuying Zhao,Zhaoqing Li,Xueqi Cheng,Yu Wang,Olivera Kotevska,Philip S. Yu,Tyler Derr

Graph Neural Networks (GNNs) have gained significant attention owing to their ability to handle graph-structured data and the improvement in practical applications. However, many of these models prioritize high utility performance, such as accuracy, with a lack of privacy consideration, which is a major concern in modern society where privacy attacks are rampant. To address this issue, researchers have started to develop privacy-preserving GNNs. Despite this progress, there is a lack of a comprehensive overview of the attacks and the techniques for preserving privacy in the graph domain. In this survey, we aim to address this gap by summarizing the attacks on graph data according to the targeted information, categorizing the privacy preservation techniques in GNNs, and reviewing the datasets and applications that could be used for analyzing/solving privacy issues in GNNs. We also outline potential directions for future research in order to build better privacy-preserving GNNs.

語音識別 · Google Voice · 清華大學智能產業研究院 · CRAFT · Cortana ·

2018 年 1 月 24 日

CommanderSong: A Systematic Approach for Practical Adversarial Voice Recognition

Xuejing Yuan,Yuxuan Chen,Yue Zhao,Yunhui Long,Xiaokang Liu,Kai Chen,Shengzhi Zhang,Heqing Huang,Xiaofeng Wang,Carl A. Gunter

ASR (automatic speech recognition) systems like Siri, Alexa, Google Voice or Cortana has become quite popular recently. One of the key techniques enabling the practical use of such systems in people's daily life is deep learning. Though deep learning in computer vision is known to be vulnerable to adversarial perturbations, little is known whether such perturbations are still valid on the practical speech recognition. In this paper, we not only demonstrate such attacks can happen in reality, but also show that the attacks can be systematically conducted. To minimize users' attention, we choose to embed the voice commands into a song, called CommandSong. In this way, the song carrying the command can spread through radio, TV or even any media player installed in the portable devices like smartphones, potentially impacting millions of users in long distance. In particular, we overcome two major challenges: minimizing the revision of a song in the process of embedding commands, and letting the CommandSong spread through the air without losing the voice "command". Our evaluation demonstrates that we can craft random songs to "carry" any commands and the modify is extremely difficult to be noticed. Specially, the physical attack that we play the CommandSongs over the air and record them can success with 94 percentage.