两个人的电影全免费观看720_综合综合综合综合综合网_日韩一区二区在线观看免费视频_伊人亚洲综合青草青草久热_永久免费视频在线观看_午夜爽爽爽男女免费观看视频_在线观看国产一区

In response to rising concerns surrounding the safety, security, and trustworthiness of Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red-teaming as a key component of their strategies for identifying and mitigating these risks. However, despite AI red-teaming's central role in policy discussions and corporate messaging, significant questions remain about what precisely it means, what role it can play in regulation, and how precisely it relates to conventional red-teaming practices as originally conceived in the field of cybersecurity. In this work, we identify recent cases of red-teaming activities in the AI industry and conduct an extensive survey of the relevant research literature to characterize the scope, structure, and criteria for AI red-teaming practices. Our analysis reveals that prior methods and practices of AI red-teaming diverge along several axes, including the purpose of the activity (which is often vague), the artifact under evaluation, the setting in which the activity is conducted (e.g., actors, resources, and methods), and the resulting decisions it informs (e.g., reporting, disclosure, and mitigation). In light of our findings, we argue that while red-teaming may be a valuable big-tent idea for characterizing a broad set of activities and attitudes aimed at improving the behavior of GenAI models, gestures towards red-teaming as a panacea for every possible risk verge on security theater. To move toward a more robust toolbox of evaluations for generative AI, we synthesize our recommendations into a question bank meant to guide and scaffold future AI red-teaming practices.

相關內容

生成式人工智能

關注 36

生(sheng)成(cheng)式(shi)人工(gong)智(zhi)能(neng)(neng)是利用復(fu)雜的(de)(de)(de)算法、模型(xing)和規(gui)則(ze)，從(cong)(cong)大規(gui)模數據集中學習，以創(chuang)造新的(de)(de)(de)原創(chuang)內容的(de)(de)(de)人工(gong)智(zhi)能(neng)(neng)技(ji)(ji)(ji)(ji)術。這(zhe)項(xiang)技(ji)(ji)(ji)(ji)術能(neng)(neng)夠創(chuang)造文本(ben)、圖片、聲音、視頻(pin)和代碼(ma)等多(duo)種(zhong)類型(xing)的(de)(de)(de)內容，全面(mian)(mian)超越了(le)傳統軟件的(de)(de)(de)數據處(chu)理(li)和分析能(neng)(neng)力。2022年(nian)末，OpenAI推出(chu)的(de)(de)(de)ChatGPT標志著這(zhe)一技(ji)(ji)(ji)(ji)術在文本(ben)生(sheng)成(cheng)領(ling)域(yu)取得了(le)顯(xian)著進展，2023年(nian)被稱為生(sheng)成(cheng)式(shi)人工(gong)智(zhi)能(neng)(neng)的(de)(de)(de)突破(po)之年(nian)。這(zhe)項(xiang)技(ji)(ji)(ji)(ji)術從(cong)(cong)單一的(de)(de)(de)語言生(sheng)成(cheng)逐步(bu)向多(duo)模態、具身化快速發展。在圖像(xiang)生(sheng)成(cheng)方(fang)面(mian)(mian)，生(sheng)成(cheng)系統在解釋提(ti)示(shi)和生(sheng)成(cheng)逼真輸出(chu)方(fang)面(mian)(mian)取得了(le)顯(xian)著的(de)(de)(de)進步(bu)。同時，視頻(pin)和音頻(pin)的(de)(de)(de)生(sheng)成(cheng)技(ji)(ji)(ji)(ji)術也在迅速發展，這(zhe)為虛擬(ni)現實和元宇(yu)宙的(de)(de)(de)實現提(ti)供了(le)新的(de)(de)(de)途徑(jing)。生(sheng)成(cheng)式(shi)人工(gong)智(zhi)能(neng)(neng)技(ji)(ji)(ji)(ji)術在各行業、各領(ling)域(yu)都(dou)具有廣泛的(de)(de)(de)應(ying)用前景。

估計/估計量 · 線性的 · 黑盒 · Projection · 推斷 ·

2024 年 3 月 9 日

Linear Convergence of Black-Box Variational Inference: Should We Stick the Landing?

Kyurae Kim,Yian Ma,Jacob R. Gardner

from arxiv, Accepted to AISTATS'24

We prove that black-box variational inference (BBVI) with control variates, particularly the sticking-the-landing (STL) estimator, converges at a geometric (traditionally called "linear") rate under perfect variational family specification. In particular, we prove a quadratic bound on the gradient variance of the STL estimator, one which encompasses misspecified variational families. Combined with previous works on the quadratic variance condition, this directly implies convergence of BBVI with the use of projected stochastic gradient descent. For the projection operator, we consider a domain with triangular scale matrices, which the projection onto is computable in $\Theta(d)$ time, where $d$ is the dimensionality of the target posterior. We also improve existing analysis on the regular closed-form entropy gradient estimators, which enables comparison against the STL estimator, providing explicit non-asymptotic complexity guarantees for both.

MoDELS · 語言模型化 · 大語言模型 · 機器人 · INTERACT ·

2024 年 3 月 8 日

Are Large Language Models Aligned with People's Social Intuitions for Human-Robot Interactions?

Lennart Wachowiak,Andrew Coles,Oya Celiktutan,Gerard Canal

Large language models (LLMs) are increasingly used in robotics, especially for high-level action planning. Meanwhile, many robotics applications involve human supervisors or collaborators. Hence, it is crucial for LLMs to generate socially acceptable actions that align with people's preferences and values. In this work, we test whether LLMs capture people's intuitions about behavior judgments and communication preferences in human-robot interaction (HRI) scenarios. For evaluation, we reproduce three HRI user studies, comparing the output of LLMs with that of real participants. We find that GPT-4 strongly outperforms other models, generating answers that correlate strongly with users' answers in two studies $\unicode{x2014}$ the first study dealing with selecting the most appropriate communicative act for a robot in various situations ($r_s$ = 0.82), and the second with judging the desirability, intentionality, and surprisingness of behavior ($r_s$ = 0.83). However, for the last study, testing whether people judge the behavior of robots and humans differently, no model achieves strong correlations. Moreover, we show that vision models fail to capture the essence of video stimuli and that LLMs tend to rate different communicative acts and behavior desirability higher than people.

泛函 · MoDELS · 蒸餾 · 樣例 · 多樣性 ·

2024 年 3 月 8 日

LVDiffusor: Distilling Functional Rearrangement Priors from Large Models into Diffusor

Yiming Zeng,Mingdong Wu,Long Yang,Jiyao Zhang,Hao Ding,Hui Cheng,Hao Dong

Object rearrangement, a fundamental challenge in robotics, demands versatile strategies to handle diverse objects, configurations, and functional needs. To achieve this, the AI robot needs to learn functional rearrangement priors in order to specify precise goals that meet the functional requirements. Previous methods typically learn such priors from either laborious human annotations or manually designed heuristics, which limits scalability and generalization. In this work, we propose a novel approach that leverages large models to distill functional rearrangement priors. Specifically, our approach collects diverse arrangement examples using both LLMs and VLMs and then distills the examples into a diffusion model. During test time, the learned diffusion model is conditioned on the initial configuration and guides the positioning of objects to meet functional requirements. In this manner, we create a handshaking point that combines the strengths of conditional generative models and large models. Extensive experiments on multiple domains, including real-world scenarios, demonstrate the effectiveness of our approach in generating compatible goals for object rearrangement tasks, significantly outperforming baseline methods.

數據集 · Neural Networks · Networking · 多樣性 · 白盒 ·

2024 年 3 月 8 日

EVD4UAV: An Altitude-Sensitive Benchmark to Evade Vehicle Detection in UAV

Huiming Sun,Jiacheng Guo,Zibo Meng,Tianyun Zhang,Jianwu Fang,Yuewei Lin,Hongkai Yu

Vehicle detection in Unmanned Aerial Vehicle (UAV) captured images has wide applications in aerial photography and remote sensing. There are many public benchmark datasets proposed for the vehicle detection and tracking in UAV images. Recent studies show that adding an adversarial patch on objects can fool the well-trained deep neural networks based object detectors, posing security concerns to the downstream tasks. However, the current public UAV datasets might ignore the diverse altitudes, vehicle attributes, fine-grained instance-level annotation in mostly side view with blurred vehicle roof, so none of them is good to study the adversarial patch based vehicle detection attack problem. In this paper, we propose a new dataset named EVD4UAV as an altitude-sensitive benchmark to evade vehicle detection in UAV with 6,284 images and 90,886 fine-grained annotated vehicles. The EVD4UAV dataset has diverse altitudes (50m, 70m, 90m), vehicle attributes (color, type), fine-grained annotation (horizontal and rotated bounding boxes, instance-level mask) in top view with clear vehicle roof. One white-box and two black-box patch based attack methods are implemented to attack three classic deep neural networks based object detectors on EVD4UAV. The experimental results show that these representative attack methods could not achieve the robust altitude-insensitive attack performance.

優化器 · FPS · 線性的 · LBS 應用 · state-of-the-art ·

2024 年 3 月 8 日

SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting

Zhijing Shao,Zhaolong Wang,Zhuang Li,Duotun Wang,Xiangru Lin,Yu Zhang,Mingming Fan,Zeyu Wang

from arxiv, [CVPR 2024] Code and data are available at //github.com/initialneil/SplattingAvatar

We present SplattingAvatar, a hybrid 3D representation of photorealistic human avatars with Gaussian Splatting embedded on a triangle mesh, which renders over 300 FPS on a modern GPU and 30 FPS on a mobile device. We disentangle the motion and appearance of a virtual human with explicit mesh geometry and implicit appearance modeling with Gaussian Splatting. The Gaussians are defined by barycentric coordinates and displacement on a triangle mesh as Phong surfaces. We extend lifted optimization to simultaneously optimize the parameters of the Gaussians while walking on the triangle mesh. SplattingAvatar is a hybrid representation of virtual humans where the mesh represents low-frequency motion and surface deformation, while the Gaussians take over the high-frequency geometry and detailed appearance. Unlike existing deformation methods that rely on an MLP-based linear blend skinning (LBS) field for motion, we control the rotation and translation of the Gaussians directly by mesh, which empowers its compatibility with various animation techniques, e.g., skeletal animation, blend shapes, and mesh editing. Trainable from monocular videos for both full-body and head avatars, SplattingAvatar shows state-of-the-art rendering quality across multiple datasets.

生成式人工智能 · Analysis · MoDELS · AI · ChatGPT ·

2024 年 3 月 7 日

The Social Impact of Generative AI: An Analysis on ChatGPT

Maria T. Baldassarre,Danilo Caivano,Berenice Fernandez Nieto,Domenico Gigante,Azzurra Ragone

from arxiv, Presented at GoodIT2023 - ACM Conference on Information Technology for Social Good

In recent months, the social impact of Artificial Intelligence (AI) has gained considerable public interest, driven by the emergence of Generative AI models, ChatGPT in particular. The rapid development of these models has sparked heated discussions regarding their benefits, limitations, and associated risks. Generative models hold immense promise across multiple domains, such as healthcare, finance, and education, to cite a few, presenting diverse practical applications. Nevertheless, concerns about potential adverse effects have elicited divergent perspectives, ranging from privacy risks to escalating social inequality. This paper adopts a methodology to delve into the societal implications of Generative AI tools, focusing primarily on the case of ChatGPT. It evaluates the potential impact on several social sectors and illustrates the findings of a comprehensive literature review of both positive and negative effects, emerging trends, and areas of opportunity of Generative AI models. This analysis aims to facilitate an in-depth discussion by providing insights that can inspire policy, regulation, and responsible development practices to foster a human-centered AI.

CLIP · 多峰值 · 有偏 · Learning · MoDELS ·

2024 年 3 月 7 日

CLIP the Bias: How Useful is Balancing Data in Multimodal Learning?

Ibrahim Alabdulmohsin,Xiao Wang,Andreas Steiner,Priya Goyal,Alexander D'Amour,Xiaohua Zhai

from arxiv, 32 pages, 20 figures, 7 tables

We study the effectiveness of data-balancing for mitigating biases in contrastive language-image pretraining (CLIP), identifying areas of strength and limitation. First, we reaffirm prior conclusions that CLIP models can inadvertently absorb societal stereotypes. To counter this, we present a novel algorithm, called Multi-Modal Moment Matching (M4), designed to reduce both representation and association biases (i.e. in first- and second-order statistics) in multimodal data. We use M4 to conduct an in-depth analysis taking into account various factors, such as the model, representation, and data size. Our study also explores the dynamic nature of how CLIP learns and unlearns biases. In particular, we find that fine-tuning is effective in countering representation biases, though its impact diminishes for association biases. Also, data balancing has a mixed impact on quality: it tends to improve classification but can hurt retrieval. Interestingly, data and architectural improvements seem to mitigate the negative impact of data balancing on performance; e.g. applying M4 to SigLIP-B/16 with data quality filters improves COCO image-to-text retrieval @5 from 86% (without data balancing) to 87% and ImageNet 0-shot classification from 77% to 77.5%! Finally, we conclude with recommendations for improving the efficacy of data balancing in multimodal systems.

AdderNet · Neural Networks · Networking · 卷積 · 模型評估 ·

2019 年 12 月 31 日

AdderNet: Do We Really Need Multiplications in Deep Learning?

Hanting Chen,Yunhe Wang,Chunjing Xu,Boxin Shi,Chao Xu,Qi Tian,Chang Xu

Compared with cheap addition operation, multiplication operation is of much higher computation complexity. The widely-used convolutions in deep neural networks are exactly cross-correlation to measure the similarity between input feature and convolution filters, which involves massive multiplications between float values. In this paper, we present adder networks (AdderNets) to trade these massive multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs. In AdderNets, we take the $\ell_1$-norm distance between filters and input feature as the output response. The influence of this new similarity measure on the optimization of neural network have been thoroughly analyzed. To achieve a better performance, we develop a special back-propagation approach for AdderNets by investigating the full-precision gradient. We then propose an adaptive learning rate strategy to enhance the training procedure of AdderNets according to the magnitude of each neuron's gradient. As a result, the proposed AdderNets can achieve 74.9% Top-1 accuracy 91.7% Top-5 accuracy using ResNet-50 on the ImageNet dataset without any multiplication in convolution layer.

學成 · 深度學習 · 可辨認的 · MoDELS · 目標跟蹤 ·

2019 年 7 月 31 日

Deep Learning in Video Multi-Object Tracking: A Survey

Gioele Ciaparrone,Francisco Luque Sánchez,Siham Tabik,Luigi Troiano,Roberto Tagliaferri,Francisco Herrera

from arxiv, New in v2: corrected typos and various minor mistakes. Submitted to Neurocomputing. Main text: 25 pages, 5 figures, 6 tables. Summary table in appendix at the end of the paper

The problem of Multiple Object Tracking (MOT) consists in following the trajectory of different objects in a sequence, usually a video. In recent years, with the rise of Deep Learning, the algorithms that provide a solution to this problem have benefited from the representational power of deep models. This paper provides a comprehensive survey on works that employ Deep Learning models to solve the task of MOT on single-camera videos. Four main steps in MOT algorithms are identified, and an in-depth review of how Deep Learning was employed in each one of these stages is presented. A complete experimental comparison of the presented works on the three MOTChallenge datasets is also provided, identifying a number of similarities among the top-performing methods and presenting some possible future research directions.

文本分類 · 語言模型化 · BERT · state-of-the-art · MoDELS ·

2019 年 5 月 14 日

How to Fine-Tune BERT for Text Classification?

Chi Sun,Xipeng Qiu,Yige Xu,Xuanjing Huang

Language model pre-training has proven to be useful in learning universal language representations. As a state-of-the-art language model pre-training model, BERT (Bidirectional Encoder Representations from Transformers) has achieved amazing results in many language understanding tasks. In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on text classification task and provide a general solution for BERT fine-tuning. Finally, the proposed solution obtains new state-of-the-art results on eight widely-studied text classification datasets.