We conduct a large-scale social media-based study of oral health during the COVID-19 pandemic based on tweets from 9,104 Twitter users across 26 states (with sufficient samples) in the United States for the period between November 12, 2020 and June 14, 2021. To better understand how discussions on different topics/oral diseases vary across the users, we acquire or infer users' demographic information and other characteristics from their profiles. Women and younger adults (19-29) are more likely to talk about oral health problems. We use the LDA topic model to extract the major topics/oral diseases in tweets. Overall, 26.70% of the Twitter users talk about wisdom tooth pain/jaw hurt, 23.86% tweet about dental service/cavity, 18.97% discuss chipped tooth/tooth break, 16.23% talk about dental pain, and the rest discuss tooth decay/gum bleeding. By conducting logistic regression, we find that discussions vary across user characteristics. More importantly, we find social disparities in oral health during the pandemic. Specifically, we find that health insurance coverage rate is the most significant predictor in logistic regression for topic prediction. People from counties with higher insurance coverage tend to tweet less about all topics of oral diseases. People from counties at a higher risk of COVID-19 talk more about tooth decay/gum bleeding and chipped tooth/tooth break. Older adults (50+), who are vulnerable to COVID-19, are more likely to discuss dental pain. To the best of our knowledge, this is the first large-scale social media-based study to analyze and understand oral health in America amid the COVID-19 pandemic. We hope the findings of our study through the lens of social media can provide insights for oral health practitioners and policy makers.
Byzantine fault-tolerant (BFT) protocols allow a group of replicas to come to a consensus even when some of the replicas are Byzantine faulty. There exist multiple BFT protocols to securely tolerate an optimal number of faults $t$ under different network settings. However, if the number of faults $f$ exceeds $t$ then security could be violated. In this paper we mathematically formalize the study of forensic support of BFT protocols: we aim to identify (with cryptographic integrity) as many of the malicious replicas as possible and in as distributed a manner as possible. Our main result is that forensic support of BFT protocols depends heavily on minor implementation details that do not affect the protocol's security or complexity. Focusing on popular BFT protocols (PBFT, HotStuff, Algorand) we exactly characterize their forensic support, showing that there exist minor variants of each protocol for which the forensic support varies widely. We show strong forensic support capability of LibraBFT, the consensus protocol of Diem cryptocurrency; our lightweight forensic module implemented on a Diem client is open-sourced and is under active consideration for deployment in Diem. Finally, we show that for all secure BFT protocols designed for $2t+1$ replicas communicating over a synchronous network, forensic support is inherently nonexistent; this impossibility result holds for all such BFT protocols, even if one has access to the states of all replicas (including Byzantine ones).
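To make the idea of forensic support in the abstract above more concrete, the following is a minimal sketch of only the simplest case: flagging replicas that cast conflicting votes in the same view (equivocation). It is not the forensic protocol of the paper or of any specific BFT implementation. The function name `find_equivocators` and the `(replica_id, view, value)` vote format are hypothetical, and the sketch assumes every vote's signature has already been verified.

```python
from collections import defaultdict

def find_equivocators(votes):
    """Return the sorted ids of replicas that voted for two different values in one view.

    `votes` is an iterable of (replica_id, view, value) tuples. This hypothetical
    format assumes each vote's signature was verified beforehand, so every
    tuple is attributable to its replica.
    """
    values_seen = defaultdict(set)
    for replica_id, view, value in votes:
        values_seen[(replica_id, view)].add(value)
    # A replica is culpable if any single view contains more than one value from it.
    culprits = {replica for (replica, _view), vals in values_seen.items() if len(vals) > 1}
    return sorted(culprits)

if __name__ == "__main__":
    sample_votes = [
        ("r1", 1, "A"),
        ("r2", 1, "A"),
        ("r2", 1, "B"),  # r2 equivocates in view 1
        ("r3", 2, "A"),
        ("r3", 3, "B"),  # different views, so not equivocation in this simple sense
    ]
    print(find_equivocators(sample_votes))
```

Real protocols also need evidence that spans several views, which this sketch deliberately ignores.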
Biological age is an important sociodemographic factor in studies on academic careers (research productivity, scholarly impact, and collaboration patterns). It is assumed that the academic age, or the time elapsed from the first publication, is a good proxy for biological age. In this study, we analyze the limitations of the proxy in academic career studies, using as an example the entire population of Polish academic scientists visible in the last decade in global science and holding at least a PhD (N = 20,569). The proxy works well for science, technology, engineering, mathematics, and medicine (STEMM) disciplines; however, for non-STEMM disciplines (particularly for humanities and social sciences), it performs dramatically worse. This negative conclusion is particularly important for systems that have only recently become visible in global academic journals. The micro-level data suggest a delayed participation of social scientists and humanists in global science networks, with practical implications for predicting biological age from academic age. We calculate correlation coefficients, present contingency analysis of academic career stages with academic positions and age groups, and create a linear multivariate regression model. Our research suggests that in scientifically developing countries, academic age as a proxy for biological age must be used more cautiously than in advanced countries: ideally, it must be used only for STEMM disciplines.
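As an illustration of the correlation step mentioned in the abstract above, here is a minimal sketch that derives academic age from the year of first publication and computes a Pearson correlation coefficient. The helper names `academic_age` and `pearson_r` are hypothetical, and the choice of Pearson's coefficient is an assumption, since the abstract does not say which coefficient is used.

```python
import math

def academic_age(first_publication_year, reference_year):
    """Years elapsed between the first publication and the reference year."""
    return reference_year - first_publication_year

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)
```

Computing `pearson_r` separately per discipline would be one way to compare how well the proxy holds in STEMM versus non-STEMM fields.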
Collective behavior is widespread across the animal kingdom. To date, however, the developmental and mechanistic foundations of collective behavior have not been formally established. What learning mechanisms drive the development of collective behavior in newborn animals? Here, we used deep reinforcement learning and curiosity-driven learning -- two learning mechanisms deeply rooted in psychological and neuroscientific research -- to build newborn artificial agents that develop collective behavior. Like newborn animals, our agents learn collective behavior from raw sensory inputs in naturalistic environments. Our agents also learn collective behavior without external rewards, using only intrinsic motivation (curiosity) to drive learning. Specifically, when we raise our artificial agents in natural visual environments with groupmates, the agents spontaneously develop ego-motion, object recognition, and a preference for groupmates, rapidly learning all of the core skills required for collective behavior. This work bridges the divide between high-dimensional sensory inputs and collective action, resulting in a pixels-to-actions model of collective animal behavior. More generally, we show that two generic learning mechanisms -- deep reinforcement learning and curiosity-driven learning -- are sufficient to learn collective behavior from unsupervised natural experience.
The rapid spread of the new SARS-CoV-2 virus triggered a global health crisis disproportionately impacting people with pre-existing health conditions and particular demographic and socioeconomic characteristics. One of the main concerns of governments has been to avoid overwhelming health systems. For this reason, they have implemented a series of non-pharmaceutical measures to control the spread of the virus, with mass testing being one of the most effective controls. To date, public health officials continue to promote some of these measures, mainly due to delays in mass vaccination and the emergence of new virus strains. In this study, we examined the association between COVID-19 test positivity rate and hospitalization rates at the county level in California using a mixed linear model. The analysis was performed for the three waves of confirmed COVID-19 cases registered in the state up to September 2021. Our findings suggest that test positivity rate is consistently associated with hospitalization rates at the county level for all waves studied. Demographic factors that seem to be related to higher hospitalization rates changed over time, as the pandemic affected different fractions of the population in counties across California.
Slow emerging topic detection is a task between event detection, where we aggregate the behaviors of different words over a short period of time, and language evolution, where we monitor their long-term evolution. In this work, we tackle the problem of early detection of slowly emerging new topics. To this end, we gather evidence of weak signals at the word level. We propose to monitor the behavior of word representations in an embedding space and use one of its geometrical properties to characterize the emergence of topics. As evaluation is typically hard for this kind of task, we present a framework for quantitative evaluation. We show positive results that outperform state-of-the-art methods on two public datasets of press and scientific articles.
The novel coronavirus pandemic is one of the biggest worldwide problems right now. While hygiene and wearing masks make up a large portion of the precautions currently suggested by the Centers for Disease Control and Prevention (CDC) and the World Health Organization (WHO), social distancing is another and arguably the most important precaution that would protect people, since the virus is easily transmitted through the air. Social distancing while walking outside can be more effective if pedestrians know each other's locations, and even more so if they know the locations of people who are possible carriers. With this information, they can change their routes depending on the people walking nearby, or they can stay away from areas that contain or have recently contained crowds. This paper presents a mobile device application that would be a very beneficial tool for social distancing during Coronavirus Disease 2019 (COVID-19). The application runs in a networked fashion, synchronized in near real time, obtaining the locations of all users and drawing a virtual safety bubble around each of them. These safety bubbles are used with the constant velocity pedestrian model to predict possible future social distancing violations and warn the user with sound and vibration. Moreover, the application takes into account that the virus stays airborne for a certain time, and hence creates time-decaying non-safe areas along the past trajectories of the users. The mobile app generates collision-free paths for navigating around the undesired locations for the pedestrian mode of transportation when used as part of a multi-modal trip planning app. The results are also applicable to other modes of transportation. Features and the methods used for implementation are discussed in the paper. The application is tested using previously collected real pedestrian walking data in a realistic environment.
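The prediction step described in the abstract above can be sketched with a constant-velocity model: extrapolate two pedestrians along straight lines, find their time of closest approach within a look-ahead window, and check whether the safety bubbles would overlap. This is a minimal sketch, not the application's actual code. The function names, the 2.0 m bubble radius, and the 10 s horizon are assumptions chosen for illustration.

```python
import math

def time_of_closest_approach(p1, v1, p2, v2, horizon):
    """Time in [0, horizon] at which two constant-velocity walkers are closest.

    Positions are (x, y) in metres, velocities are (vx, vy) in metres per second.
    """
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]  # relative position
    vx, vy = v2[0] - v1[0], v2[1] - v1[1]  # relative velocity
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        return 0.0  # identical velocities: the distance never changes
    t = -(rx * vx + ry * vy) / speed_sq  # minimiser of the squared distance
    return min(max(t, 0.0), horizon)     # clamp to the look-ahead window

def predicts_violation(p1, v1, p2, v2, radius=2.0, horizon=10.0):
    """Return (violation_expected, time_of_closest_approach).

    `radius` (assumed 2.0 m) is the minimum allowed distance and `horizon`
    (assumed 10 s) is how far ahead the prediction looks.
    """
    t = time_of_closest_approach(p1, v1, p2, v2, horizon)
    dx = (p2[0] + v2[0] * t) - (p1[0] + v1[0] * t)
    dy = (p2[1] + v2[1] * t) - (p1[1] + v1[1] * t)
    return math.hypot(dx, dy) < radius, t

if __name__ == "__main__":
    # Two pedestrians walking straight towards each other along the x-axis.
    print(predicts_violation((0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (-1.0, 0.0)))
```

The returned time could drive how early a warning is issued. The time-decaying unsafe areas along past trajectories would need a separate check.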
Maternal and child mortality is a public health problem that disproportionately affects low- and middle-income countries. Every day, 800 women and 6,700 newborns die from complications related to pregnancy or childbirth. And for every maternal death, about 20 women suffer serious birth injuries. However, nearly all of these deaths and negative health outcomes are preventable. Midwives are key to reversing this situation, and thus it is essential to strengthen their capacities and the quality of their education. This is the aim of the Safe Delivery App, a digital job aid and learning tool to enhance the knowledge, confidence and skills of health practitioners. Here, we use the behavioral logs of the App to implement a recommendation system that presents each midwife with suitable content to continue gaining expertise. We focus on predicting the click-through rate, the probability that a given user will click on a recommended content item. We evaluate four deep learning models and show that all of them produce highly accurate predictions.
In this study, we propose a clustering-based approach on time-series data to capture COVID-19 spread patterns in the early period of the pandemic. We analyze the spread dynamics in the early and post stages of COVID-19 for countries in different geographical locations. Furthermore, we investigate the confinement policies and the effect they had on the spread. We found that implementations of the same confinement policies exhibit different results in different countries. Specifically, lockdowns become less effective in densely populated regions because of the reluctance to comply with social distancing measures. Lack of testing, contact tracing, and social awareness in some countries keeps people from self-isolating and maintaining social distance. Large labor camps with unhealthy living conditions also contribute to high community transmission in countries depending on foreign labor. Distrust in government policies and fake news fuel the spread in both developed and under-developed countries. Large social gatherings play a vital role in causing rapid outbreaks almost everywhere. While some countries were able to contain the spread by implementing strict and widely adopted confinement policies, some others contained the spread with the help of social distancing measures and rigorous testing capacity. An early and rapid response at the beginning of the pandemic is necessary to contain the spread, yet it is not always sufficient.
Rankings of people and items are at the heart of selection-making, match-making, and recommender systems, ranging from employment sites to sharing economy platforms. As ranking positions influence the amount of attention the ranked subjects receive, biases in rankings can lead to unfair distribution of opportunities and resources, such as jobs or income. This paper proposes new measures and mechanisms to quantify and mitigate unfairness from a bias inherent to all rankings, namely, the position bias, which leads to disproportionately less attention being paid to low-ranked subjects. Our approach differs from recent fair ranking approaches in two important ways. First, existing works measure unfairness at the level of subject groups while our measures capture unfairness at the level of individual subjects, and as such subsume group unfairness. Second, as no single ranking can achieve individual attention fairness, we propose a novel mechanism that achieves amortized fairness, where attention accumulated across a series of rankings is proportional to accumulated relevance. We formulate the challenge of achieving amortized individual fairness subject to constraints on ranking quality as an online optimization problem and show that it can be solved as an integer linear program. Our experimental evaluation reveals that unfair attention distribution in rankings can be substantial, and demonstrates that our method can improve individual fairness while retaining high ranking quality.
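The notion of amortized fairness in the abstract above, where attention accumulated over a series of rankings should be proportional to accumulated relevance, can be illustrated with a small measurement sketch. It only computes an unfairness score and does not solve the integer linear program mentioned in the abstract. The geometric position-bias model, the parameter `p`, the L1 distance, and both function names are assumptions made for this example.

```python
def attention_weights(n, p=0.5):
    """Assumed geometric position bias over n ranks, normalised to sum to 1.

    Rank j (0-based) gets weight proportional to p * (1 - p) ** j, so lower
    ranks receive disproportionately less attention.
    """
    raw = [p * (1 - p) ** j for j in range(n)]
    total = sum(raw)
    return [w / total for w in raw]

def amortized_unfairness(rankings, relevances, p=0.5):
    """L1 distance between accumulated attention and accumulated relevance.

    `rankings` is a list of rankings (lists of subject ids, best first).
    `relevances` is a parallel list of dicts mapping subject id to that
    round's relevance, assumed normalised to sum to 1 within each round.
    """
    attention, relevance = {}, {}
    for ranking, round_relevance in zip(rankings, relevances):
        weights = attention_weights(len(ranking), p)
        for position, subject in enumerate(ranking):
            attention[subject] = attention.get(subject, 0.0) + weights[position]
            relevance[subject] = relevance.get(subject, 0.0) + round_relevance[subject]
    return sum(abs(attention[s] - relevance[s]) for s in attention)

if __name__ == "__main__":
    equal = {"a": 0.5, "b": 0.5}
    # One ranking is unfair to "b"; alternating the order evens attention out.
    print(amortized_unfairness([["a", "b"]], [equal]))
    print(amortized_unfairness([["a", "b"], ["b", "a"]], [equal, equal]))
```

With two equally relevant subjects, a single ranking cannot give them equal attention, but alternating their order across two rankings brings the score to zero, which is the intuition behind amortization.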
Privacy is a major good for users of personalized services such as recommender systems. When applied to the field of health informatics, privacy concerns of users may be amplified, but the possible utility of such services is also high. Despite the availability of technologies such as k-anonymity, differential privacy, privacy-aware recommendation, and personalized privacy trade-offs, little research has been conducted on users' willingness to share health data for use in such systems. In two conjoint-decision studies (sample size n=521), we investigate the importance and utility of privacy-preserving techniques related to sharing of personal health data for k-anonymity and differential privacy. Users were asked to pick a preferred sharing scenario depending on the recipient of the data, the benefit of sharing data, the type of data, and the parameterized privacy. Users disagreed with sharing data for commercial purposes, data regarding mental illnesses, and data with high de-anonymization risks, but showed little concern when data is used for scientific purposes and is related to physical illnesses. Suggestions for health recommender system development are derived from the findings.
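For readers unfamiliar with the parameterized privacy mentioned in the abstract above, the following minimal sketch shows what the parameter k in k-anonymity measures: the size of the smallest group of records that share the same quasi-identifier values. The function name `k_anonymity` and the record fields in the example are hypothetical and do not come from the studies.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return the k for which the data set is k-anonymous (0 if it is empty).

    `records` is a list of dicts. `quasi_identifiers` names the fields that
    could be combined to re-identify a person, such as an age band and a
    postcode prefix.
    """
    groups = Counter(tuple(record[q] for q in quasi_identifiers) for record in records)
    return min(groups.values()) if groups else 0

if __name__ == "__main__":
    # Hypothetical, already-generalised health records.
    records = [
        {"age_band": "30-39", "zip_prefix": "123", "diagnosis": "flu"},
        {"age_band": "30-39", "zip_prefix": "123", "diagnosis": "asthma"},
        {"age_band": "40-49", "zip_prefix": "456", "diagnosis": "flu"},
        {"age_band": "40-49", "zip_prefix": "456", "diagnosis": "diabetes"},
    ]
    print(k_anonymity(records, ["age_band", "zip_prefix"]))
```

A larger k means each person hides in a larger crowd, which corresponds to a lower de-anonymization risk in the sharing scenarios.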