
Minority groups have been using social media to organize social movements that create profound social impacts. Black Lives Matter (BLM) and Stop Asian Hate (SAH) are two successful social movements that spread on Twitter, promoting protests and activities against racism and raising public awareness of other social challenges that minority groups face. However, previous studies have mostly conducted qualitative analyses of tweets or interviews with users, which may not comprehensively and validly represent all tweets. Very few studies have explored the Twitter topics within BLM and SAH dialogs in a rigorous, quantified, and data-centered way. Therefore, in this research, we adopted a mixed-methods approach to comprehensively analyze BLM and SAH Twitter topics. We implemented (1) the latent Dirichlet allocation (LDA) model to understand the top high-level words and topics and (2) open-coding analysis to identify specific themes across the tweets. We collected more than one million tweets with the #blacklivesmatter and #stopasianhate hashtags and compared their topics. Our findings revealed that the tweets discussed a variety of influential topics in depth; social justice, social movements, and emotional sentiments were common topics in both movements, though each movement had its own unique subtopics. Our study contributes to the topic analysis of social movements on social media platforms in particular and to the literature on the interplay of AI, ethics, and society in general.
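As a concrete illustration of the LDA step, here is a minimal sketch using scikit-learn that fits a topic model over a few toy tweets and prints the top words per topic; the sample texts, preprocessing choices, and topic count are illustrative assumptions, not the paper's exact pipeline.

```python
# Minimal LDA topic-modeling sketch (scikit-learn); the sample tweets,
# preprocessing choices, and topic count are illustrative assumptions.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

tweets = [
    "justice for victims end racism now #blacklivesmatter",
    "rally downtown tonight stand together #stopasianhate",
    "new policy proposal on police reform #blacklivesmatter",
]

# Bag-of-words representation; hashtags and stop words would be
# filtered more carefully in a real pipeline.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(tweets)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

# Print the top words per topic, analogous to the paper's
# "top high-level words and topics" analysis.
terms = vectorizer.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [terms[i] for i in topic.argsort()[-5:][::-1]]
    print(f"topic {k}: {', '.join(top)}")
```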

Related content

We survey a current, heated debate in the AI research community on whether large pre-trained language models can be said to "understand" language -- and the physical and social situations language encodes -- in any important sense. We describe arguments that have been made for and against such understanding, and key questions for the broader sciences of intelligence that have arisen in light of these arguments. We contend that a new science of intelligence can be developed that will provide insight into distinct modes of understanding, their strengths and limitations, and the challenge of integrating diverse forms of cognition.

Neural networks have become ubiquitous tools for solving signal and image processing problems, and they often outperform standard approaches. Nevertheless, training neural networks is a challenging task in many applications. The prevalent training procedure consists of minimizing highly non-convex objectives over very large, high-dimensional data sets. In this context, current methodologies are not guaranteed to produce global solutions. We present an alternative approach that forgoes the optimization framework and adopts a variational inequality formalism. The associated algorithm guarantees convergence of the iterates to a true solution of the variational inequality, and it possesses an efficient block-iterative structure. A numerical application is presented.
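For readers new to the formalism, the standard variational inequality template the approach builds on can be stated as follows; the specific operator F and constraint set C arising from network training are defined in the paper itself.

```latex
% Generic variational inequality (VI) problem over a closed convex set C:
\text{find } \bar{x} \in C \subseteq \mathbb{R}^n
\quad \text{such that} \quad
\langle F(\bar{x}),\, x - \bar{x} \rangle \ge 0
\quad \text{for all } x \in C.
```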

The ubiquity of digital music consumption has made it possible to extract information about modern music that allows us to perform large-scale analysis of stylistic change over time. In order to uncover underlying patterns in cultural evolution, we examine the relationship between the established characteristics of different genres and styles, and the introduction of novel ideas that fuel this ongoing creative evolution. To understand how this dynamic plays out and shapes the cultural ecosystem, we compare musical artifacts to their contemporaries to identify novel artifacts, study the relationship between novelty and commercial success, and connect this to the changes in musical content that we can observe over time. Using Music Information Retrieval (MIR) data and lyrics from Billboard Hot 100 songs from 1974 to 2013, we calculate a novelty score for each song's aural attributes and lyrics. Comparing both scores to the popularity of the song following its release, we uncover key patterns in the relationship between novelty and audience reception. Additionally, we look at the link between novelty and the likelihood that a song was influential, given where its MIR and lyrical features fit within the larger trends we observed.
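One plausible way to operationalize such a novelty score is the distance between a song's feature vector and the centroid of its recent contemporaries, as in the sketch below; the window length, cosine distance, and random stand-in features are assumptions, not the paper's exact definition.

```python
# Hypothetical novelty score: distance of each song's feature vector
# from the centroid of songs released in the preceding window.
# Window length and cosine distance are illustrative assumptions.
import numpy as np

def novelty_scores(features: np.ndarray, years: np.ndarray, window: int = 5):
    scores = np.full(len(features), np.nan)
    for i, year in enumerate(years):
        mask = (years >= year - window) & (years < year)
        if not mask.any():
            continue  # no earlier contemporaries in the window
        centroid = features[mask].mean(axis=0)
        cos = features[i] @ centroid / (
            np.linalg.norm(features[i]) * np.linalg.norm(centroid) + 1e-12
        )
        scores[i] = 1.0 - cos  # higher = more novel
    return scores

# Example with random stand-in "MIR" features for songs from 1974-2013.
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 12))
yrs = rng.integers(1974, 2014, size=100)
print(novelty_scores(feats, yrs)[:5])
```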

Web-crawled datasets have enabled remarkable generalization capabilities in recent image-text models such as CLIP (Contrastive Language-Image pre-training) or Flamingo, but little is known about the dataset creation processes. In this work, we introduce a testbed of six publicly available data sources - YFCC, LAION, Conceptual Captions, WIT, RedCaps, Shutterstock - to investigate how pre-training distributions induce robustness in CLIP. We find that the performance of the pre-training data varies substantially across distribution shifts, with no single data source dominating. Moreover, we systematically study the interactions between these data sources and find that combining multiple sources does not necessarily yield better models, but rather dilutes the robustness of the best individual data source. We complement our empirical findings with theoretical insights from a simple setting, where combining the training data also results in diluted robustness. In addition, our theoretical model provides a candidate explanation for the success of the CLIP-based data filtering technique recently employed in the LAION dataset. Overall, our results demonstrate that simply gathering a large amount of data from the web is not the most effective way to build a pre-training dataset for robust generalization, necessitating further study into dataset design. Code is available at https://github.com/mlfoundations/clip_quality_not_quantity.
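As background on how such zero-shot evaluations are typically run, here is a minimal sketch using the open-source open_clip library; the model name, pretrained tag, image path, and class prompts are illustrative assumptions rather than the paper's experimental setup.

```python
# Minimal zero-shot CLIP classification sketch with open_clip; repeating
# this over in-distribution and shifted test sets is the basic measurement
# behind robustness comparisons. "example.jpg" is a placeholder path.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion400m_e32"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")

classes = ["dog", "cat", "car"]
text = tokenizer([f"a photo of a {c}" for c in classes])
image = preprocess(Image.open("example.jpg")).unsqueeze(0)

with torch.no_grad():
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text)
    img_feat /= img_feat.norm(dim=-1, keepdim=True)
    txt_feat /= txt_feat.norm(dim=-1, keepdim=True)
    probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)

print(dict(zip(classes, probs[0].tolist())))
```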

Given an input sequence (or prefix), modern language models often assign high probabilities to output sequences that are repetitive, incoherent, or irrelevant to the prefix; as such, model-generated text also contains such artifacts. To address these issues we present RankGen, a 1.2B parameter encoder model for English that scores model generations given a prefix. RankGen can be flexibly incorporated as a scoring function in beam search and used to decode from any pretrained language model. We train RankGen using large-scale contrastive learning to map a prefix close to the ground-truth sequence that follows it and far away from two types of negatives: (1) random sequences from the same document as the prefix, and (2) sequences generated from a large language model conditioned on the prefix. Experiments across four different language models (345M-11B parameters) and two domains show that RankGen significantly outperforms decoding algorithms like nucleus, top-k, and typical sampling on both automatic metrics (85.0 vs 77.3 MAUVE) as well as human evaluations with English writers (74.5% human preference over nucleus sampling). Analysis reveals that RankGen outputs are more relevant to the prefix and improve continuity and coherence compared to baselines. We release our model checkpoints, code, and human preference data with explanations to facilitate future research.
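The general reranking recipe can be sketched independently of the released checkpoints: embed the prefix and each candidate continuation, then keep the candidate whose embedding scores highest against the prefix. The toy bag-of-words encoder below is only a stand-in for RankGen's trained contrastive encoder.

```python
# Generic reranking sketch in the spirit of RankGen: score candidate
# continuations by the dot product of prefix and candidate embeddings,
# and keep the best one. The toy encoder is a stand-in, not RankGen.
import torch

def rerank(prefix: str, candidates: list[str], encode) -> str:
    with torch.no_grad():
        p = encode(prefix)                                # shape: (d,)
        c = torch.stack([encode(x) for x in candidates])  # shape: (n, d)
        scores = c @ p                                    # dot-product scores
    return candidates[int(scores.argmax())]

# Toy encoder for illustration only: hash words into a normalized
# bag-of-words vector.
def toy_encode(text: str, d: int = 64) -> torch.Tensor:
    v = torch.zeros(d)
    for w in text.lower().split():
        v[hash(w) % d] += 1.0
    return v / (v.norm() + 1e-12)

best = rerank("The committee approved the",
              ["budget for next year.", "banana banana banana."],
              toy_encode)
print(best)
```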

Digital contact tracing can limit the spread of infectious diseases. Nevertheless, there remain barriers to attaining sufficient adoption. In this study, we investigate how willingness to participate in contact tracing is affected by two critical factors: the modes of data collection and the type of data collected. We conducted a scenario-based survey study among 220 respondents in the United States (U.S.) to understand their perceptions of automated and manual contact tracing methods. The findings indicate that smartphones are a promising mode of data collection, and that a combination of public health officials and medical health records is a promising information source. Through a quantitative analysis, we describe how different modalities and individual demographic factors may affect user compliance in providing four key pieces of information for contact tracing.
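The abstract does not spell out the statistical model, but a common choice for this kind of analysis is a logistic regression of willingness on collection mode and demographics; the sketch below, with entirely synthetic data and hypothetical column names, only illustrates that style of analysis.

```python
# Hypothetical analysis sketch: logistic regression relating willingness
# to share contact-tracing data to collection mode and a demographic
# factor. Data and column names are synthetic placeholders; this is not
# the paper's actual model or data.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "willing": [1, 0, 0, 1, 0, 1, 1, 1],
    "mode":    ["auto", "manual", "auto", "auto",
                "manual", "manual", "manual", "auto"],
    "age":     [23, 54, 31, 40, 62, 29, 47, 35],
})

# C(mode) treats collection mode as a categorical predictor.
model = smf.logit("willing ~ C(mode) + age", data=df).fit(disp=False)
print(model.summary())
```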

The European Space Agency is well known as a powerful force for scientific discovery in numerous areas related to Space. The amount and depth of the knowledge produced throughout the different missions carried out by ESA and their contribution to scientific progress is enormous, involving large collections of documents like scientific publications, feasibility studies, technical reports, and quality management procedures, among many others. Through initiatives like the Open Space Innovation Platform, ESA also acts as a hub for new ideas coming from the wider community across different challenges, contributing to a virtuous circle of scientific discovery and innovation. Handling such a wealth of information, a large part of which is unstructured text, is a colossal task that goes beyond human capabilities, hence requiring automation. In this paper, we present a methodological framework based on artificial intelligence and natural language processing and understanding to automatically extract information from Space documents and generate value from it, and we illustrate this framework through several case studies implemented across different functional areas of ESA, including Mission Design, Quality Assurance, Long-Term Data Preservation, and the Open Space Innovation Platform. In doing so, we demonstrate the value of these technologies in several tasks ranging from effortlessly searching and recommending Space information to automatically determining how innovative an idea can be, answering questions about Space, and generating quizzes regarding quality procedures. Each of these accomplishments represents a step forward in the application of increasingly intelligent AI systems in Space, from structuring and facilitating information access to intelligent systems capable of understanding and reasoning with such information.
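As one example of how the search-and-recommendation component of such a framework can be built, the sketch below runs semantic search over a tiny toy corpus with sentence embeddings; the model name and documents are assumptions, not ESA's internal system.

```python
# Semantic-search sketch for Space documents using sentence embeddings;
# the model name and the tiny corpus are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "Feasibility study of a lunar lander thermal control subsystem.",
    "Quality assurance procedure for flight software reviews.",
    "Long-term preservation plan for Earth observation archives.",
]
corpus_emb = model.encode(corpus, convert_to_tensor=True)

query = "How are software reviews handled for quality assurance?"
query_emb = model.encode(query, convert_to_tensor=True)

# Rank corpus documents by embedding similarity to the query.
hits = util.semantic_search(query_emb, corpus_emb, top_k=2)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {corpus[hit['corpus_id']]}")
```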

Neural architecture search (NAS) has become increasingly popular in the deep learning community recently, mainly because it allows interested users without deep expertise to benefit from the success of deep neural networks (DNNs). However, NAS is still laborious and time-consuming, because a large number of performance estimations are required during the search process, and training DNNs is computationally intensive. Improving computational efficiency is therefore essential in the design of NAS. However, a systematic overview of computationally efficient NAS (CE-NAS) methods is still lacking. To fill this gap, we provide a comprehensive survey of the state of the art in CE-NAS by categorizing the existing work into proxy-based and surrogate-assisted NAS methods, together with a thorough discussion of their design principles and a quantitative comparison of their performances and computational complexities. The remaining challenges and open research questions are also discussed, and promising research topics in this emerging field are suggested.
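To make the surrogate-assisted category concrete, the sketch below trains a regression surrogate on a handful of (architecture encoding, accuracy) pairs and uses it to rank unseen candidates so that only the most promising ones would be fully trained; the encodings and accuracies are synthetic placeholders.

```python
# Toy surrogate-assisted NAS sketch: fit a performance predictor on a few
# (architecture encoding, validation accuracy) pairs, then use it to rank
# unseen architectures so only the top candidates get fully trained.
# Encodings and accuracies are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Each architecture is encoded as a fixed-length vector
# (e.g. one entry per layer choice).
evaluated_archs = rng.integers(0, 4, size=(30, 8))
evaluated_acc = rng.uniform(0.6, 0.9, size=30)  # stand-in accuracies

surrogate = RandomForestRegressor(n_estimators=100, random_state=0)
surrogate.fit(evaluated_archs, evaluated_acc)

# Rank a large pool of candidates by predicted accuracy; only the
# top few would be sent to (expensive) real training.
candidates = rng.integers(0, 4, size=(500, 8))
predicted = surrogate.predict(candidates)
top = np.argsort(predicted)[::-1][:5]
print("candidates to train next:", top, predicted[top].round(3))
```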

Artificial intelligence (AI) has become a part of everyday conversation and our lives. It is considered the new electricity that is revolutionizing the world. Both industry and academia are investing heavily in AI. However, there is also a lot of hype in the current AI debate. AI based on so-called deep learning has achieved impressive results in many problems, but its limits are already visible. AI has been under research since the 1940s, and the field has seen many ups and downs due to over-expectations and the disappointments that have followed. The purpose of this book is to give a realistic picture of AI, its history, its potential, and its limitations. We believe that AI is a helper, not a ruler of humans. We begin by describing what AI is and how it has evolved over the decades. After the fundamentals, we explain the importance of massive data for the current mainstream of artificial intelligence. The most common representations, methods, and machine learning approaches used in AI are covered. In addition, the main application areas are introduced. Computer vision has been central to the development of AI. The book provides a general introduction to computer vision and includes an exposure to the results and applications of our own research. Emotions are central to human intelligence, but they have so far seen little use in AI. We present the basics of emotional intelligence and our own research on the topic. We discuss super-intelligence that transcends human understanding, explaining why such an achievement seems impossible on the basis of present knowledge, and how AI could be improved. Finally, we summarize the current state of AI and what to do in the future. In the appendix, we look at the development of AI education, especially from the perspective of course contents at our own university.

We propose a novel approach to multimodal sentiment analysis using deep neural networks combining visual analysis and natural language processing. Our goal is different from the standard sentiment analysis goal of predicting whether a sentence expresses positive or negative sentiment; instead, we aim to infer the latent emotional state of the user. Thus, we focus on predicting the emotion word tags attached by users to their Tumblr posts, treating these as "self-reported emotions." We demonstrate that our multimodal model combining both text and image features outperforms separate models based solely on either images or text. Our model's results are interpretable, automatically yielding sensible word lists associated with emotions. We explore the structure of emotions implied by our model, compare it to what has been posited in the psychology literature, and validate our model on a set of images that have been used in psychology studies. Finally, our work also provides a useful tool for the growing academic study of images - both photographs and memes - on social networks.
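A minimal late-fusion architecture of the kind described can be sketched in PyTorch: image and text feature vectors are concatenated and classified into emotion tags. The feature extractors, dimensions, and tag count below are illustrative assumptions, not the paper's exact model.

```python
# Minimal late-fusion sketch: concatenate an image feature vector and a
# text feature vector, then classify into emotion tags. Dimensions and
# the number of emotion tags are illustrative assumptions.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, img_dim=512, txt_dim=300, n_emotions=10):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(img_dim + txt_dim, 256),
            nn.ReLU(),
            nn.Linear(256, n_emotions),
        )

    def forward(self, img_feat, txt_feat):
        fused = torch.cat([img_feat, txt_feat], dim=-1)
        return self.head(fused)  # logits over emotion tags

model = LateFusionClassifier()
img = torch.randn(4, 512)     # e.g. CNN image embeddings
txt = torch.randn(4, 300)     # e.g. averaged word embeddings
print(model(img, txt).shape)  # torch.Size([4, 10])
```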
