久久久久久久精品少妇9999_五月婷婷开心之中文字幕_国产女做A性色精品视频免费_免费观看日韩钙片GV网站_欧美精品V日韩精品V国产综合_九九精品国产亚洲AV网站美利坚_又粗又黄又硬又爽的免费网站

The rise of Generative Artificial Intelligence systems ("AI systems") has created unprecedented social engagement. AI code generation systems provide responses (output) to questions or requests by accessing the vast library of open-source code created by developers over the past few decades. However, they do so by allegedly stealing the open-source code stored in virtual libraries, known as repositories. This Article focuses on how this happens and whether there is a solution that protects innovation and avoids years of litigation. We also touch upon the array of issues raised by the relationship between AI and copyright. Looking ahead, we propose the following: (a) immediate changes to the licenses for open-source code created by developers that will limit access and/or use of any open-source code to humans only; (b) we suggest revisions to the Massachusetts Institute of Technology ("MIT") license so that AI systems are required to procure appropriate licenses from open-source code developers, which we believe will harmonize standards and build social consensus for the benefit of all of humanity, rather than promote profit-driven centers of innovation; (c) we call for urgent legislative action to protect the future of AI systems while also promoting innovation; and (d) we propose a shift in the burden of proof to AI systems in obfuscation cases.

相關內容

關注 7040

人工(gong)智(zhi)能(neng)(neng)雜志AI(Artificial Intelligence)是目前公認(ren)的(de)(de)(de)發(fa)表(biao)該領域(yu)最新(xin)(xin)研究成果的(de)(de)(de)主(zhu)要(yao)國際論壇。該期刊歡(huan)(huan)迎(ying)有關(guan)AI廣泛方(fang)(fang)面的(de)(de)(de)論文(wen)，這些論文(wen)構成了(le)整個(ge)(ge)領域(yu)的(de)(de)(de)進步，也(ye)歡(huan)(huan)迎(ying)介(jie)(jie)紹人工(gong)智(zhi)能(neng)(neng)應用(yong)的(de)(de)(de)論文(wen)，但重(zhong)點應該放在新(xin)(xin)的(de)(de)(de)和新(xin)(xin)穎的(de)(de)(de)人工(gong)智(zhi)能(neng)(neng)方(fang)(fang)法如何提(ti)高應用(yong)領域(yu)的(de)(de)(de)性(xing)能(neng)(neng)，而(er)不是介(jie)(jie)紹傳統人工(gong)智(zhi)能(neng)(neng)方(fang)(fang)法的(de)(de)(de)另一個(ge)(ge)應用(yong)。關(guan)于應用(yong)的(de)(de)(de)論文(wen)應該描(miao)述一個(ge)(ge)原則(ze)性(xing)的(de)(de)(de)解決方(fang)(fang)案，強(qiang)調其新(xin)(xin)穎性(xing)，并(bing)對正在開發(fa)的(de)(de)(de)人工(gong)智(zhi)能(neng)(neng)技術進行深入(ru)的(de)(de)(de)評估。官網地址(zhi)：

MoDELS · CASES · 數據集 · AI · INFORMS ·

2024 年 3 月 11 日

On the Consideration of AI Openness: Can Good Intent Be Abused?

Yeeun Kim,Eunkyung Choi,Hyunjun Kim,Hongseok Oh,Hyunseo Shin,Wonseok Hwang

from arxiv, 10 pages

Openness is critical for the advancement of science. In particular, recent rapid progress in AI has been made possible only by various open-source models, datasets, and libraries. However, this openness also means that technologies can be freely used for socially harmful purposes. Can open-source models or datasets be used for malicious purposes? If so, how easy is it to adapt technology for such goals? Here, we conduct a case study in the legal domain, a realm where individual decisions can have profound social consequences. To this end, we build EVE, a dataset consisting of 200 examples of questions and corresponding answers about criminal activities based on 200 Korean precedents. We found that a widely accepted open-source LLM, which initially refuses to answer unethical questions, can be easily tuned with EVE to provide unethical and informative answers about criminal activities. This implies that although open-source technologies contribute to scientific progress, some care must be taken to mitigate possible malicious use cases. Warning: This paper contains contents that some may find unethical.

MoDELS · 跡 · Learning · Performer · Extensibility ·

2024 年 3 月 10 日

Are You Being Tracked? Discover the Power of Zero-Shot Trajectory Tracing with LLMs!

Huanqi Yang,Sijie Ji,Rucheng Wu,Weitao Xu

There is a burgeoning discussion around the capabilities of Large Language Models (LLMs) in acting as fundamental components that can be seamlessly incorporated into Artificial Intelligence of Things (AIoT) to interpret complex trajectories. This study introduces LLMTrack, a model that illustrates how LLMs can be leveraged for Zero-Shot Trajectory Recognition by employing a novel single-prompt technique that combines role-play and think step-by-step methodologies with unprocessed Inertial Measurement Unit (IMU) data. We evaluate the model using real-world datasets designed to challenge it with distinct trajectories characterized by indoor and outdoor scenarios. In both test scenarios, LLMTrack not only meets but exceeds the performance benchmarks set by traditional machine learning approaches and even contemporary state-of-the-art deep learning models, all without the requirement of training on specialized datasets. The results of our research suggest that, with strategically designed prompts, LLMs can tap into their extensive knowledge base and are well-equipped to analyze raw sensor data with remarkable effectiveness.

MoDELS · Processing（編程語言） · 生成式對抗網絡 · Networking · 潛在 ·

2024 年 3 月 8 日

Warfare:Breaking the Watermark Protection of AI-Generated Content

Guanlin Li,Yifei Chen,Jie Zhang,Jiwei Li,Shangwei Guo,Tianwei Zhang

AI-Generated Content (AIGC) is gaining great popularity, with many emerging commercial services and applications. These services leverage advanced generative models, such as latent diffusion models and large language models, to generate creative content (e.g., realistic images and fluent sentences) for users. The usage of such generated content needs to be highly regulated, as the service providers need to ensure the users do not violate the usage policies (e.g., abuse for commercialization, generating and distributing unsafe content). A promising solution to achieve this goal is watermarking, which adds unique and imperceptible watermarks on the content for service verification and attribution. Numerous watermarking approaches have been proposed recently. However, in this paper, we show that an adversary can easily break these watermarking mechanisms. Specifically, we consider two possible attacks. (1) Watermark removal: the adversary can easily erase the embedded watermark from the generated content and then use it freely bypassing the regulation of the service provider. (2) Watermark forging: the adversary can create illegal content with forged watermarks from another user, causing the service provider to make wrong attributions. We propose Warfare, a unified methodology to achieve both attacks in a holistic way. The key idea is to leverage a pre-trained diffusion model for content processing and a generative adversarial network for watermark removal or forging. We evaluate Warfare on different datasets and embedding setups. The results prove that it can achieve high success rates while maintaining the quality of the generated content. Compared to existing diffusion model-based attacks, Warfare is 5,050~11,000x faster.

Performer · RAVEN · 多樣性 · CLUES · 多跳 ·

2024 年 3 月 8 日

How Far Are We from Intelligent Visual Deductive Reasoning?

Yizhe Zhang,He Bai,Ruixiang Zhang,Jiatao Gu,Shuangfei Zhai,Josh Susskind,Navdeep Jaitly

from arxiv, ICLR 2024 AGI workshop. //github.com/apple/ml-rpm-bench

Vision-Language Models (VLMs) such as GPT-4V have recently demonstrated incredible strides on diverse vision language tasks. We dig into vision-based deductive reasoning, a more sophisticated but less explored realm, and find previously unexposed blindspots in the current SOTA VLMs. Specifically, we leverage Raven's Progressive Matrices (RPMs), to assess VLMs' abilities to perform multi-hop relational and deductive reasoning relying solely on visual clues. We perform comprehensive evaluations of several popular VLMs employing standard strategies such as in-context learning, self-consistency, and Chain-of-thoughts (CoT) on three diverse datasets, including the Mensa IQ test, IntelligenceTest, and RAVEN. The results reveal that despite the impressive capabilities of LLMs in text-based reasoning, we are still far from achieving comparable proficiency in visual deductive reasoning. We found that certain standard strategies that are effective when applied to LLMs do not seamlessly translate to the challenges presented by visual reasoning tasks. Moreover, a detailed analysis reveals that VLMs struggle to solve these tasks mainly because they are unable to perceive and comprehend multiple, confounding abstract patterns in RPM examples.

生成式人工智能 · Analysis · MoDELS · AI · ChatGPT ·

2024 年 3 月 7 日

The Social Impact of Generative AI: An Analysis on ChatGPT

Maria T. Baldassarre,Danilo Caivano,Berenice Fernandez Nieto,Domenico Gigante,Azzurra Ragone

from arxiv, Presented at GoodIT2023 - ACM Conference on Information Technology for Social Good

In recent months, the social impact of Artificial Intelligence (AI) has gained considerable public interest, driven by the emergence of Generative AI models, ChatGPT in particular. The rapid development of these models has sparked heated discussions regarding their benefits, limitations, and associated risks. Generative models hold immense promise across multiple domains, such as healthcare, finance, and education, to cite a few, presenting diverse practical applications. Nevertheless, concerns about potential adverse effects have elicited divergent perspectives, ranging from privacy risks to escalating social inequality. This paper adopts a methodology to delve into the societal implications of Generative AI tools, focusing primarily on the case of ChatGPT. It evaluates the potential impact on several social sectors and illustrates the findings of a comprehensive literature review of both positive and negative effects, emerging trends, and areas of opportunity of Generative AI models. This analysis aims to facilitate an in-depth discussion by providing insights that can inspire policy, regulation, and responsible development practices to foster a human-centered AI.

MoDELS · SLIM · 語言模型化 · 大語言模型 · 蒸餾 ·

2024 年 3 月 7 日

Can Small Language Models be Good Reasoners for Sequential Recommendation?

Yuling Wang,Changxin Tian,Binbin Hu,Yanhua Yu,Ziqi Liu,Zhiqiang Zhang,Jun Zhou,Liang Pang,Xiao Wang

Large language models (LLMs) open up new horizons for sequential recommendations, owing to their remarkable language comprehension and generation capabilities. However, there are still numerous challenges that should be addressed to successfully implement sequential recommendations empowered by LLMs. Firstly, user behavior patterns are often complex, and relying solely on one-step reasoning from LLMs may lead to incorrect or task-irrelevant responses. Secondly, the prohibitively resource requirements of LLM (e.g., ChatGPT-175B) are overwhelmingly high and impractical for real sequential recommender systems. In this paper, we propose a novel Step-by-step knowLedge dIstillation fraMework for recommendation (SLIM), paving a promising path for sequential recommenders to enjoy the exceptional reasoning capabilities of LLMs in a "slim" (i.e., resource-efficient) manner. We introduce CoT prompting based on user behavior sequences for the larger teacher model. The rationales generated by the teacher model are then utilized as labels to distill the downstream smaller student model (e.g., LLaMA2-7B). In this way, the student model acquires the step-by-step reasoning capabilities in recommendation tasks. We encode the generated rationales from the student model into a dense vector, which empowers recommendation in both ID-based and ID-agnostic scenarios. Extensive experiments demonstrate the effectiveness of SLIM over state-of-the-art baselines, and further analysis showcasing its ability to generate meaningful recommendation reasoning at affordable costs.

向量化 · 相似度 · Processing（編程語言） · Storage · 優化器 ·

2023 年 10 月 21 日

Survey of Vector Database Management Systems

James Jie Pan,Jianguo Wang,Guoliang Li

from arxiv, 25 pages

There are now over 20 commercial vector database management systems (VDBMSs), all produced within the past five years. But embedding-based retrieval has been studied for over ten years, and similarity search a staggering half century and more. Driving this shift from algorithms to systems are new data intensive applications, notably large language models, that demand vast stores of unstructured data coupled with reliable, secure, fast, and scalable query processing capability. A variety of new data management techniques now exist for addressing these needs, however there is no comprehensive survey to thoroughly review these techniques and systems. We start by identifying five main obstacles to vector data management, namely vagueness of semantic similarity, large size of vectors, high cost of similarity comparison, lack of natural partitioning that can be used for indexing, and difficulty of efficiently answering hybrid queries that require both attributes and vectors. Overcoming these obstacles has led to new approaches to query processing, storage and indexing, and query optimization and execution. For query processing, a variety of similarity scores and query types are now well understood; for storage and indexing, techniques include vector compression, namely quantization, and partitioning based on randomization, learning partitioning, and navigable partitioning; for query optimization and execution, we describe new operators for hybrid queries, as well as techniques for plan enumeration, plan selection, and hardware accelerated execution. These techniques lead to a variety of VDBMSs across a spectrum of design and runtime characteristics, including native systems specialized for vectors and extended systems that incorporate vector capabilities into existing systems. We then discuss benchmarks, and finally we outline research challenges and point the direction for future work.

語言模型化 · 推薦系統 · MoDELS · 可理解性 · 有向 ·

2023 年 8 月 5 日

Recommender Systems in the Era of Large Language Models (LLMs)

Wenqi Fan,Zihuai Zhao,Jiatong Li,Yunqing Liu,Xiaowei Mei,Yiqi Wang,Zhen Wen,Fei Wang,Xiangyu Zhao,Jiliang Tang,Qing Li

from arxiv, 16 pages, 5 figures

With the prosperity of e-commerce and web applications, Recommender Systems (RecSys) have become an important component of our daily life, providing personalized suggestions that cater to user preferences. While Deep Neural Networks (DNNs) have made significant advancements in enhancing recommender systems by modeling user-item interactions and incorporating textual side information, DNN-based methods still face limitations, such as difficulties in understanding users' interests and capturing textual side information, inabilities in generalizing to various recommendation scenarios and reasoning on their predictions, etc. Meanwhile, the emergence of Large Language Models (LLMs), such as ChatGPT and GPT4, has revolutionized the fields of Natural Language Processing (NLP) and Artificial Intelligence (AI), due to their remarkable abilities in fundamental responsibilities of language understanding and generation, as well as impressive generalization and reasoning capabilities. As a result, recent studies have attempted to harness the power of LLMs to enhance recommender systems. Given the rapid evolution of this research direction in recommender systems, there is a pressing need for a systematic overview that summarizes existing LLM-empowered recommender systems, to provide researchers in relevant fields with an in-depth understanding. Therefore, in this paper, we conduct a comprehensive review of LLM-empowered recommender systems from various aspects including Pre-training, Fine-tuning, and Prompting. More specifically, we first introduce representative methods to harness the power of LLMs (as a feature encoder) for learning representations of users and items. Then, we review recent techniques of LLMs for enhancing recommender systems from three paradigms, namely pre-training, fine-tuning, and prompting. Finally, we comprehensively discuss future directions in this emerging field.

entity · MINE · 可約的 · 規范化的 · 實體對齊 ·

2021 年 3 月 29 日

Boosting the Speed of Entity Alignment 10*: Dual Attention Matching Network with Normalized Hard Sample Mining

Xin Mao,Wenting Wang,Yuanbin Wu,Man Lan

from arxiv, 12 pages; Accepted by TheWebConf(WWW) 2021

Seeking the equivalent entities among multi-source Knowledge Graphs (KGs) is the pivotal step to KGs integration, also known as \emph{entity alignment} (EA). However, most existing EA methods are inefficient and poor in scalability. A recent summary points out that some of them even require several days to deal with a dataset containing 200,000 nodes (DWY100K). We believe over-complex graph encoder and inefficient negative sampling strategy are the two main reasons. In this paper, we propose a novel KG encoder -- Dual Attention Matching Network (Dual-AMN), which not only models both intra-graph and cross-graph information smartly, but also greatly reduces computational complexity. Furthermore, we propose the Normalized Hard Sample Mining Loss to smoothly select hard negative samples with reduced loss shift. The experimental results on widely used public datasets indicate that our method achieves both high accuracy and high efficiency. On DWY100K, the whole running process of our method could be finished in 1,100 seconds, at least 10* faster than previous work. The performances of our method also outperform previous works across all datasets, where Hits@1 and MRR have been improved from 6% to 13%.

長短期記憶網絡 · RNN · Networking · Weight · MoDELS ·

2020 年 6 月 10 日

Do RNN and LSTM have Long Memory?

Jingyu Zhao,Feiqing Huang,Jia Lv,Yanjie Duan,Zhen Qin,Guodong Li,Guangjian Tian

from arxiv, Accepted by ICML 2020. Added references, experiments and acknowledgements

The LSTM network was proposed to overcome the difficulty in learning long-term dependence, and has made significant advancements in applications. With its success and drawbacks in mind, this paper raises the question - do RNN and LSTM have long memory? We answer it partially by proving that RNN and LSTM do not have long memory from a statistical perspective. A new definition for long memory networks is further introduced, and it requires the model weights to decay at a polynomial rate. To verify our theory, we convert RNN and LSTM into long memory networks by making a minimal modification, and their superiority is illustrated in modeling long-term dependence of various datasets.