Official government publications are key sources for understanding the history of societies. Web publishing has fundamentally changed the scale and processes by which governments produce and disseminate information. Significantly, a range of web archiving programs have captured massive troves of government publications. For example, hundreds of millions of unique U.S. Government documents posted to the web in PDF form have been archived by libraries to date. Yet, these PDFs remain largely unutilized and understudied in part due to the challenges surrounding the development of scalable pipelines for searching and analyzing them. This paper utilizes a Library of Congress dataset of 1,000 government PDFs in order to offer initial approaches for searching and analyzing these PDFs at scale. In addition to demonstrating the utility of PDF metadata, this paper offers computationally-efficient machine learning approaches to search and discovery that utilize the PDFs' textual and visual features as well. We conclude by detailing how these methods can be operationalized at scale in order to support systems for navigating millions of PDFs.

Related content

Processing (programming language) · Integration · Blockchain · Directed · ASSETS

February 8, 2022

Blockchain-based Digital Twin for Supply Chain Management: A Literature Review and Future Research Directions

Jiongbin Liu, William Yeoh, Youyang Qu, Longxiang Gao

Supply chain management plays an essential role in our economy, as evidenced by recent COVID-19-induced supply chain challenges. Traditional supply chain management faces security and efficiency issues, but they can be addressed by leveraging digital twins and blockchain technology. The integration of blockchain technology can benefit the digital twins through improved security, traceability, transparency, and efficiency of digital twin data processing. A digital twin is an exact virtual representation of a physical asset, system, or process to synchronise data for the monitoring, simulation, and prediction of performance. Thus, the combination of blockchain and digital twins can refine the concepts of both technologies and reform supply chain management to advance into Industry 4.0. In this literature survey, we provide a comprehensive literature review of the blockchain-based digital twin solutions to optimise the processes of data management, data storage, and data sharing. We also investigate the key benefits of the integration of blockchain and digital twins and study their potential implementation in various processes of supply chains, including smart manufacturing, intelligent maintenance, and blockchain-based digital twin shop floor, warehouse, and logistics. This paper has implications for research and practice, which we detail in future research opportunities.

COVID-19 · Directed · AI · Processing (programming language) · MINE

February 6, 2022

Artificial Intelligence in the Battle against Coronavirus (COVID-19): A Survey and Future Research Directions

Thanh Thi Nguyen, Quoc Viet Hung Nguyen, Dung Tien Nguyen, Samuel Yang, Peter W. Eklund, Thien Huynh-The, Thanh Tam Nguyen, Quoc-Viet Pham, Edbert B. Hsu

Artificial intelligence (AI) has been applied widely in our daily lives in a variety of ways with numerous success stories. AI has also contributed to dealing with the global coronavirus disease (COVID-19) pandemic. This paper presents a survey of AI methods being used in various applications in the fight against the COVID-19 outbreak and outlines the crucial role of AI research in this unprecedented battle. We touch on areas where AI plays an essential role, from medical image processing, data analytics, text mining and natural language processing, and the Internet of Things, to computational biology and medicine. A summary of COVID-19 related data sources that are available for research purposes is also presented. Research directions on exploring the potential of AI and enhancing its capability and power in the pandemic battle are thoroughly discussed. We identify 13 groups of problems related to the COVID-19 pandemic and highlight promising AI methods and tools that can be used to address these problems. It is envisaged that this study will provide AI researchers and the wider community with an overview of the current status of AI applications, and motivate researchers to harness AI's potential in the fight against COVID-19.

Things · Exchangeable · INTERACT · INFORMS · TOOLS

February 5, 2022

A bibliometric investigation into the literature of semantic reasoning in Internet of Things

Mohammad Javad Shayegan

Nowadays, semantic interoperability is a new keyword in the Internet of Things (IoT) for the exchange of information between sources. The constant need for interaction and cooperation has resulted in the creation of the Semantic Web with the help of tools and reasoners which manage personal information. Given the significance of the IoT and the increasing use of semantic techniques in this field, the present bibliometric investigation was conducted in the domain of semantic reasoning in the IoT. Bibliometrics involves analyzing bibliographic data of scientific sources, and it can be employed to arrive at an analysis of the status quo in a scientific field. In this study, 799 articles retrieved from the Web of Science database were analyzed to study the distribution of topic categories, prolific and influential authors, language of articles, publishers of articles and their geographical distribution, the most debated/researched and the most frequently cited articles, and keyword trends. The results of this study indicate that the number of articles published in the domain of semantic reasoning in the IoT has increased considerably in recent years. Of the articles analyzed, 10 countries produced 84% of the total documents, with China in the lead. Moreover, keyword analysis suggests that fog computing, edge computing, Semantic Web, and wireless sensor network are among the most important keywords in this domain. In addition, ontology has the highest average number of citations among the specialized keywords.

CASES · COVID-19 · contrastive · Examples · GROUP

February 4, 2022

OpenStreetMap data use cases during the early months of the COVID-19 pandemic

Peter Mooney, A. Yair Grinberger, Marco Minghini, Serena Coetzee, Levente Juhasz, Godwin Yeboah

from arxiv, 15 pages, 6 figures. Submitted to the UN GGIM (//unggim.academicnetwork.org/) edited book titled COVID-19: Geospatial Information and Community Resilience. The volume is edited by Prof. Abbas Rajabifard from the University of Melbourne.

Created by volunteers since 2004, OpenStreetMap (OSM) is a global geographic database available under an open access license and currently used by a multitude of actors worldwide. This chapter describes the role played by OSM during the early months (from January to July 2020) of the ongoing COVID-19 pandemic, which - in contrast to past disasters and epidemics - is a global event impacting both developed and developing countries. A large number of COVID-19-related OSM use cases were collected and grouped into a number of research frameworks which are analyzed separately: dashboards and services simply using OSM as a basemap, applications using raw OSM data, initiatives to collect new OSM data, imports of authoritative data into OSM, and traditional academic research on OSM in the COVID-19 response. The wealth of examples provided in the chapter, including an analysis of OSM tile usage in two countries (Italy and China) deeply affected in the earliest months of 2020, proves that OSM has been and still is heavily used to address the COVID-19 crisis, although with types and mechanisms that often differ depending on the affected area or country and the related communities.

COVID-19 · CASES · Topics · Operations · Applied statistics

February 1, 2022

One-Year In: COVID-19 Research at the International Level in CORD-19 Data

Caroline S. Wagner, Xiaojing Cai, Yi Zhang, Caroline V. Fry

from arxiv, 39 pages, 8 figures, Appendix

The appearance of a novel coronavirus in late 2019 radically changed the community of researchers working on coronaviruses since the 2002 SARS epidemic. In 2020, coronavirus-related publications grew by 20 times over the previous two years, with 130,000 more researchers publishing on related topics. The United States, the United Kingdom and China led dozens of nations working on coronavirus prior to the pandemic, but leadership consolidated among these three nations in 2020, which collectively accounted for 50% of all papers, garnering well more than 60% of citations. China took an early lead on COVID-19 research, but dropped rapidly in production and international participation through the year. Europe showed an opposite pattern, beginning slowly in publications but growing in contributions during the year. The share of internationally collaborative publications dropped from pre-pandemic rates; single-authored publications grew. For all nations, including China, the number of publications about COVID tracks closely with the outbreak of COVID-19 cases. Lower-income nations participated very little in COVID-19 research in 2020. Topic maps of internationally collaborative work show the rise of patient care and public health clusters, two topics that were largely absent from coronavirus research in the two years prior to 2020. Findings are consistent with global science as a self-organizing system operating on a reputation-based dynamic.

MoDELS · Commentator · Scaling · Comprehensibility · GPT-3

August 18, 2021

On the Opportunities and Risks of Foundation Models

Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

from arxiv, Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI)

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.

Stream Processing · Stream · Processing (programming language) · Tolerance · Similarity

August 3, 2020

A Survey on the Evolution of Stream Processing Systems

Marios Fragkoulis, Paris Carbone, Vasiliki Kalavri, Asterios Katsifodimos

from arxiv, 34 pages, 15 figures, 5 tables

Stream processing has been an active research field for more than 20 years, but it is now witnessing its prime time due to recent successful efforts by the research community and numerous worldwide open-source communities. This survey provides a comprehensive overview of fundamental aspects of stream processing systems and their evolution in the functional areas of out-of-order data management, state management, fault tolerance, high availability, load management, elasticity, and reconfiguration. We review noteworthy past research findings, outline the similarities and differences between early ('00-'10) and modern ('11-'18) streaming systems, and discuss recent trends and open problems.

Processing (programming language) · MoDELS · NLP · Taxonomy · Language representation

March 18, 2020

Pre-trained Models for Natural Language Processing: A Survey

Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang

from arxiv, Invited Review of Science China Technological Sciences

Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language representation learning and its research progress. Then we systematically categorize existing PTMs based on a taxonomy with four perspectives. Next, we describe how to adapt the knowledge of PTMs to downstream tasks. Finally, we outline some potential directions of PTMs for future research. This survey is intended to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.

Sentiment analysis · Engineering · TOOLS · Recognizable · Performer

March 17, 2018

A Benchmark Study on Sentiment Analysis for Software Engineering Research

Nicole Novielli, Daniela Girardi, Filippo Lanubile

from arxiv, Proceedings of 15th International Conference on Mining Software Repositories (MSR 2018)

A recent research trend has emerged to identify developers' emotions, by applying sentiment analysis to the content of communication traces left in collaborative development environments. Trying to overcome the limitations posed by using off-the-shelf sentiment analysis tools, researchers recently started to develop their own tools for the software engineering domain. In this paper, we report a benchmark study to assess the performance and reliability of three sentiment analysis tools specifically customized for software engineering. Furthermore, we offer a reflection on the open challenges, as they emerge from a qualitative analysis of misclassified texts.

Deep reinforcement learning · Learned · Reinforcement learning · tuning · CASE

January 17, 2018

The Case for Automatic Database Administration using Deep Reinforcement Learning

Ankur Sharma, Felix Martin Schuhknecht, Jens Dittrich

Like any large software system, a full-fledged DBMS offers an overwhelming number of configuration knobs. These range from static initialisation parameters like buffer sizes, degree of concurrency, or level of replication to complex runtime decisions like creating a secondary index on a particular column or reorganising the physical layout of the store. To simplify the configuration, industry-grade DBMSs are usually shipped with various advisory tools that provide recommendations for given workloads and machines. However, reality shows that the actual configuration, tuning, and maintenance are usually still done by a human administrator, relying on intuition and experience. Recent work on deep reinforcement learning has shown very promising results in solving problems that require such a sense of intuition. For instance, it has been applied very successfully in learning how to play complicated games with enormous search spaces. Motivated by these achievements, in this work we explore how deep reinforcement learning can be used to administer a DBMS. First, we describe how deep reinforcement learning can be used to automatically tune an arbitrary software system like a DBMS by defining a problem environment. Second, we showcase our concept of NoDBA using the concrete example of index selection and evaluate how well it recommends indexes for given workloads.