Natural language processing (NLP) plays a significant role in tools for the COVID-19 pandemic response, from detecting misinformation on social media to providing accurate clinical information and summarizing scientific research. However, the approaches developed thus far have not benefited all populations, regions or languages equally. We discuss ways in which current and future NLP approaches can be made more inclusive by covering low-resource languages, including alternative modalities, leveraging out-of-the-box tools and forming meaningful partnerships. We suggest several future directions for researchers interested in maximizing the positive societal impacts of NLP.
As language technologies become more ubiquitous, there are increasing efforts towards expanding the language diversity and coverage of natural language processing (NLP) systems. Arguably, the most important factor influencing the quality of modern NLP systems is data availability. In this work, we study the geographical representativeness of NLP datasets, aiming to quantify whether, and by how much, NLP datasets match the expected needs of their language's speakers. In doing so, we use entity recognition and linking systems, also making important observations about their cross-lingual consistency and offering suggestions for more robust evaluation. Last, we explore some geographical and economic factors that may explain the observed dataset distributions. Code and data are available here: //github.com/ffaisal93/dataset_geography. Additional visualizations are available here: //nlp.cs.gmu.edu/project/datasetmaps/.
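As a minimal, hypothetical sketch of the kind of comparison such a study involves (not the authors' actual pipeline), one can contrast the country distribution of entities linked in a dataset with the geographic distribution of the language's speakers; all entity codes and population shares below are invented placeholders.

# Minimal sketch (not the authors' pipeline): given entity mentions already
# linked to countries, compare a dataset's country distribution against the
# geographic distribution of the language's speakers. All inputs are
# hypothetical placeholders.
from collections import Counter

def country_distribution(linked_entities):
    """linked_entities: list of ISO country codes, one per linked entity mention."""
    counts = Counter(linked_entities)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def total_variation(p, q):
    """Total-variation distance between two country distributions (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

# Hypothetical example: entities linked in a Spanish-language dataset vs. where
# Spanish speakers actually live.
dataset_entities = ["ES", "ES", "US", "MX", "ES", "AR"]
speaker_population = {"MX": 0.35, "ES": 0.12, "US": 0.11, "AR": 0.12, "CO": 0.13, "other": 0.17}

gap = total_variation(country_distribution(dataset_entities), speaker_population)
print(f"representativeness gap (total variation): {gap:.2f}")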
The recent MSMARCO passage retrieval collection has allowed researchers to develop highly tuned retrieval systems. One aspect of this data set that makes it distinctive compared to traditional corpora is that most of the topics only have a single answer passage marked relevant. Here we carry out a "what if" sensitivity study, asking whether a set of systems would still have the same relative performance if more passages per topic were deemed to be "relevant", exploring several mechanisms for identifying sets of passages to be so categorized. Our results show that, in general, while run scores can vary markedly if additional plausible passages are presumed to be relevant, the derived system ordering is relatively insensitive to additional relevance, providing support for the methodology that was used at the time the MSMARCO passage collection was created.
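The kind of ordering-stability check described above can be illustrated with a short sketch: compute Kendall's tau between system orderings under the original and an augmented set of relevance judgments. The system names and scores below are hypothetical; a real study would obtain them from an evaluation tool such as trec_eval.

# Illustrative sketch: do system orderings stay stable when extra passages are
# treated as relevant? Scores here are hypothetical placeholders.
from scipy.stats import kendalltau

# Hypothetical per-system effectiveness under the original single-relevant
# judgments and under augmented judgments with additional "relevant" passages.
original = {"systemA": 0.38, "systemB": 0.35, "systemC": 0.31, "systemD": 0.24}
augmented = {"systemA": 0.52, "systemB": 0.50, "systemC": 0.41, "systemD": 0.33}

systems = sorted(original)                      # fixed system order
tau, _ = kendalltau([original[s] for s in systems],
                    [augmented[s] for s in systems])
print(f"Kendall's tau between system orderings: {tau:.2f}")  # 1.0 = identical ordering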
As the globally increasing population drives rapid urbanisation in various parts of the world, there is a great need to deliberate on the future of cities worth living in. In particular, as modern smart cities embrace more and more data-driven artificial intelligence services, it is worth remembering that technology can facilitate prosperity, wellbeing, urban livability, or social justice, but only when it has the right analog complements (such as well-thought-out policies, mature institutions, and responsible governance); the ultimate objective of these smart cities is to facilitate and enhance human welfare and social flourishing. Researchers have shown that various technological business models and features can in fact contribute to social problems such as extremism, polarization, misinformation, and Internet addiction. In light of these observations, addressing the philosophical and ethical questions involved in ensuring the security, safety, and interpretability of the AI algorithms that will form the technological bedrock of future cities assumes paramount importance. Globally there are calls for technology to be made more humane and human-centered. In this paper, we analyze and explore the key challenges to a successful deployment of AI in human-centric applications, including security, robustness, interpretability, and ethical (data and algorithmic) challenges, with a particular emphasis on the convergence of these concepts/challenges. We provide a detailed review of existing literature on these key challenges and analyze how one of these challenges may lead to others or help in solving other challenges. The paper also discusses the current limitations, pitfalls, and future directions of research in these domains, and how future work can fill current gaps and lead to better solutions. We believe such rigorous analysis will provide a baseline for future research in the domain.
The COVID-19 disease spreads swiftly, and nearly three months after the first positive case was confirmed in China, the coronavirus had spread across the United States. Some states and counties reported high numbers of positive cases and deaths, while others reported lower COVID-19-related case and mortality counts. In this paper, the factors that could affect the risk of COVID-19 infection and mortality were analyzed at the county level. An innovative method using K-means clustering and several classification models was applied to determine the most critical factors. Results showed that mean temperature, percent of people below poverty, percent of adults with obesity, air pressure, population density, wind speed, longitude, and percent of uninsured people were the most significant attributes.
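The general pipeline described above (cluster counties, then rank features with classification models) can be illustrated roughly as follows; the feature names come from the abstract, but the data are random placeholders and the exact modelling choices are assumptions, not the study's implementation.

# Sketch of the general pipeline (county features -> K-means cluster labels ->
# classifier feature importances). The data are random placeholders, not the
# study's county-level measurements.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

feature_names = ["mean_temperature", "pct_below_poverty", "pct_adult_obesity",
                 "air_pressure", "population_density", "wind_speed",
                 "longitude", "pct_uninsured"]

rng = np.random.default_rng(0)
X = rng.normal(size=(3000, len(feature_names)))       # placeholder county features

# Step 1: group counties into clusters (the paper clusters by COVID-19
# outcomes; here we cluster the placeholder features purely to obtain labels).
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

# Step 2: train a classifier to predict the cluster and rank feature importance.
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
for name, score in sorted(zip(feature_names, clf.feature_importances_),
                          key=lambda t: -t[1]):
    print(f"{name:22s} {score:.3f}")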
*Content warning: This work displays examples of explicit and strongly offensive language. The COVID-19 pandemic has fueled a surge in anti-Asian xenophobia and prejudice. Many have taken to social media to express these negative sentiments, necessitating the development of reliable systems to detect hate speech against this often under-represented demographic. In this paper, we create and annotate a corpus of Twitter tweets using two experimental approaches to explore anti-Asian abusive and hate speech at a finer granularity. Using the dataset with the less biased annotation, we deploy multiple models and also examine the applicability of other relevant corpora to accomplish these multi-task classifications. In addition to demonstrating promising results, our experiments offer insights into the nuances of cultural and logistical factors in annotating hate speech for different demographics. Together, our analyses aim to contribute to the understanding of hate speech detection, particularly for low-resource groups.
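As a hedged illustration of the classification side of such work (not the authors' models), a minimal TF-IDF plus logistic-regression baseline for tweet-level abusive/hate labels might look as follows; the tweets and labels are invented placeholders.

# Minimal baseline sketch: TF-IDF features + logistic regression for a
# tweet-level classification task. Tweets and labels are placeholders; a real
# corpus would replace them.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tweets = ["example neutral tweet about the pandemic",
          "example tweet containing abusive language",
          "another neutral tweet",
          "another abusive tweet"]
labels = ["none", "abusive", "none", "abusive"]          # placeholder annotations

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(tweets, labels)
print(clf.predict(["a new unseen tweet"]))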
What happens when a machine learning dataset is deprecated for legal, ethical, or technical reasons, but continues to be widely used? In this paper, we examine the public afterlives of several prominent deprecated or redacted datasets, including ImageNet, 80 Million Tiny Images, MS-Celeb-1M, Duke MTMC, Brainwash, and HRT Transgender, in order to inform a framework for more consistent, ethical, and accountable dataset deprecation. Building on prior research, we find that there is a lack of consistency, transparency, and centralized sourcing of information on the deprecation of datasets, and as such, these datasets and their derivatives continue to be cited in papers and circulate online. These datasets that never die -- which we term "zombie datasets" -- continue to inform the design of production-level systems, causing technical, legal, and ethical challenges; in so doing, they risk perpetuating the harms that prompted their supposed withdrawal, including concerns around bias, discrimination, and privacy. Based on this analysis, we propose a Dataset Deprecation Framework that includes considerations of risk, mitigation of impact, appeal mechanisms, timeline, post-deprecation protocol, and publication checks that can be adapted and implemented by the machine learning community. Drawing on work on datasheets and checklists, we further offer two sample dataset deprecation sheets and propose a centralized repository that tracks which datasets have been deprecated and could be incorporated into the publication protocols of venues like NeurIPS.
Global Navigation Satellite Systems (GNSS) provide positioning services for connected and autonomous vehicles. Differential GNSS (DGNSS) has been demonstrated to provide reliable, high-quality range correction information enabling real-time navigation with sub-meter or centimeter accuracy. However, DGNSS requires a local reference station near each user, which for a continental or global scale implementation would require a dense network of reference stations whose construction and maintenance would be prohibitively expensive. Precise Point Positioning (PPP) affords more flexibility as a public service for GNSS receivers, but its State Space Representation (SSR) format is not currently supported by most receivers. This article proposes a novel Virtual Network DGNSS (VN-DGNSS) design that capitalizes on the PPP infrastructure to provide global coverage for real-time navigation without building physical reference stations. Correction information is computed using data from public GNSS SSR data services and transmitted to users via Radio Technical Commission for Maritime Services (RTCM) Observation Space Representation (OSR) messages, which are accepted by most receivers. The real-time stationary and moving platform testing performance, using u-blox M8P and ZED-F9P receivers, surpasses the Society of Automotive Engineering (SAE) specification (68% of horizontal errors $\leqslant$ 1.5 m and vertical errors $\leqslant$ 3 m) and shows significantly better horizontal performance than GNSS Open Service (OS). The moving tests also show better horizontal performance than the ZED-F9P receiver with Satellite Based Augmentation Systems (SBAS) enabled and achieve lane-level accuracy, which requires 95% of horizontal errors to be less than 1 m.
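The accuracy targets quoted above can be checked with a short percentile computation over positioning error logs; the error samples below are synthetic placeholders standing in for receiver-versus-ground-truth residuals.

# Sketch of the accuracy checks implied by the quoted specifications: the SAE
# target (68% of horizontal errors <= 1.5 m, vertical <= 3 m) and the
# lane-level target (95% of horizontal errors < 1 m). Error samples are
# synthetic placeholders.
import numpy as np

rng = np.random.default_rng(1)
horizontal_err_m = np.abs(rng.normal(0.0, 0.5, size=5000))   # placeholder errors
vertical_err_m = np.abs(rng.normal(0.0, 1.0, size=5000))

sae_ok = (np.percentile(horizontal_err_m, 68) <= 1.5 and
          np.percentile(vertical_err_m, 68) <= 3.0)
lane_level_ok = np.percentile(horizontal_err_m, 95) < 1.0

print(f"meets SAE spec: {sae_ok}, lane-level accuracy: {lane_level_ok}")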
The COVID-19 pandemic continues to have a devastating effect on the health and well-being of the global population. A critical step in the fight against COVID-19 is effective screening of infected patients, with one of the key screening approaches being radiological imaging using chest radiography. Motivated by this, a number of artificial intelligence (AI) systems based on deep learning have been proposed, with results shown to be quite promising in terms of accuracy in detecting patients infected with COVID-19 using chest radiography images. However, to the best of the authors' knowledge, these developed AI systems have been closed source and unavailable to the research community for deeper understanding and extension, and unavailable for public access and use. Therefore, in this study we introduce COVID-Net, a deep convolutional neural network design tailored for the detection of COVID-19 cases from chest radiography images that is open source and available to the general public. We also describe the chest radiography dataset leveraged to train COVID-Net, which we will refer to as COVIDx and which comprises 5941 posteroanterior chest radiography images across 2839 patient cases from two open access data repositories. Furthermore, we investigate how COVID-Net makes predictions using an explainability method in an attempt to gain deeper insights into critical factors associated with COVID-19 cases, which can aid clinicians in improved screening. While by no means a production-ready solution, our hope is that the open access COVID-Net, along with the description of how the open source COVIDx dataset was constructed, will be leveraged and built upon by researchers and citizen data scientists alike to accelerate the development of highly accurate yet practical deep learning solutions for detecting COVID-19 cases and accelerate treatment of those who need it the most.
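For orientation only, the shape of the task (a convolutional network mapping a chest radiograph to one of a few classes) can be sketched as below; this is explicitly not the COVID-Net architecture, just a tiny stand-in with an assumed input size and class count.

# Generic illustration only: a small CNN for three-way chest radiograph
# classification (e.g., normal / non-COVID pneumonia / COVID-19). This is NOT
# the COVID-Net architecture; it only shows the shape of the task.
import torch
import torch.nn as nn

class TinyCXRNet(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):                      # x: (batch, 1, H, W) grayscale X-rays
        return self.classifier(self.features(x).flatten(1))

logits = TinyCXRNet()(torch.randn(4, 1, 224, 224))   # placeholder batch
print(logits.shape)                                  # torch.Size([4, 3])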
This study reviews the approaches used for measuring sentence similarity. Measuring similarity between natural language sentences is a crucial task for many Natural Language Processing applications such as text classification, information retrieval, question answering, and plagiarism detection. This survey classifies approaches to calculating sentence similarity into three categories based on the adopted methodology: word-to-word, structure-based, and vector-based approaches are the most widely used. Each approach measures relatedness between short texts from a specific perspective. In addition, the datasets most often used as benchmarks for evaluating techniques in this field are introduced to provide a complete view of the issue. Approaches that combine more than one perspective give better results. Moreover, structure-based similarity, which measures similarity between sentence structures, needs more investigation.
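A tiny example of the vector-based category is shown below: each sentence is represented as a TF-IDF vector and compared with cosine similarity; the sentences are arbitrary illustrations rather than benchmark data.

# Vector-based sentence similarity: TF-IDF vectors compared with cosine
# similarity. Sentences are arbitrary illustrations.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

sentences = ["A cat sits on the mat.",
             "A cat is sitting on a mat.",
             "Stock prices fell sharply today."]
vectors = TfidfVectorizer().fit_transform(sentences)
print(cosine_similarity(vectors[0], vectors[1]))   # high: near-paraphrases
print(cosine_similarity(vectors[0], vectors[2]))   # low: unrelated topics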
Classification tasks are usually analysed and improved through new model architectures or hyperparameter optimisation, but the underlying properties of datasets are discovered only on an ad hoc basis as errors occur. However, understanding the properties of the data is crucial in perfecting models. In this paper we analyse exactly which characteristics of a dataset best determine how difficult that dataset is for the task of text classification. We then propose an intuitive measure of difficulty for text classification datasets which is simple and fast to calculate. We show that this measure generalises to unseen data by comparing it against state-of-the-art datasets and results. This measure can be used to analyse the precise source of errors in a dataset and allows fast estimation of how difficult a dataset is to learn. We searched for this measure by training 12 classical and neural-network-based models on 78 real-world datasets, then used a genetic algorithm to discover the best measure of difficulty. Our difficulty-calculating code ( //github.com/Wluper/edm ) and datasets ( //data.wluper.com ) are publicly available.
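Purely as an illustration of the kind of simple, fast-to-compute statistics such a difficulty measure might combine (and not the measure proposed in the paper), the sketch below computes label-distribution entropy and class imbalance for a placeholder dataset.

# Illustrative only: two simple dataset statistics of the kind a difficulty
# measure might combine. This is not the measure proposed in the paper.
import math
from collections import Counter

def label_entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def class_imbalance(labels):
    """0 for a perfectly balanced dataset, approaching 1 as imbalance grows."""
    counts = Counter(labels)
    return 1 - min(counts.values()) / max(counts.values())

labels = ["pos"] * 800 + ["neg"] * 150 + ["neutral"] * 50   # placeholder dataset
print(f"entropy={label_entropy(labels):.2f} bits, imbalance={class_imbalance(labels):.2f}")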