Cyber-physical systems (CPS) have complex lifecycles involving multiple stakeholders, and the supply chain of both hardware and software components is opaque at best. This raises concerns for stakeholders who may not trust that what they receive is what was requested. There is an opportunity to build a cyberphysical titling process offering universal traceability and the ability to differentiate systems based on provenance. Today, RFID tags and barcodes address some of these needs, though they are easily manipulated because they are not linked to an object's or system's intrinsic characteristics. We propose cyberphysical sequencing as a low-cost, lightweight, and pervasive means of adding track-and-trace capabilities to any asset, tying a system's physical identity to a unique and invariant digital identifier. CPS sequencing offers benefits similar to those of Digital Twins for identifying and managing the provenance and identity of an asset throughout its life, with far fewer computational and other resources.
In this paper, we introduce reduced-bias estimators for the tail index of a Pareto-type distribution. This is achieved through regularised weighted least squares applied to an exponential regression model for log-spacings of top order statistics. The asymptotic properties of the proposed estimators are investigated analytically, and the estimators are found to be asymptotically unbiased, consistent, and normally distributed. The finite-sample behaviour of the estimators is studied through a simulation study, in which the proposed estimators are found to yield low bias and MSE. In addition, the proposed estimators are illustrated through the estimation of the tail index of the underlying distribution of claims from the insurance industry.
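For context, one standard formulation of the exponential regression model for log-spacings that underlies such estimators (a sketch only; the paper's exact weighting and regularisation are not reproduced here) is

\[
Z_j = j\left(\log X_{n-j+1,n} - \log X_{n-j,n}\right) \approx \left(\gamma + b_{n,k}\left(\tfrac{j}{k+1}\right)^{-\rho}\right) E_j, \qquad j = 1,\dots,k,
\]

where $X_{1,n} \le \dots \le X_{n,n}$ are the order statistics of the sample, $\gamma > 0$ is the tail index, $b_{n,k}$ and $\rho < 0$ are second-order parameters capturing the bias, and $E_1,\dots,E_k$ are i.i.d. standard exponential random variables. Fitting $\gamma$ to this model by (regularised, weighted) least squares is what reduces the bias relative to the classical Hill estimator.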
Understanding collective decision making at a large scale, and elucidating how community organization and community dynamics shape collective behavior, are at the heart of social science research. In this work we study the behavior of thousands of communities with millions of active members. We define a novel task: predicting which community will undertake an unexpected, large-scale, distributed campaign. To this end, we develop a hybrid model, combining textual cues, community meta-data, and structural properties. We show how this multi-faceted model can accurately predict large-scale collective decision making in a distributed environment. We demonstrate the applicability of our model through Reddit's r/place - a large-scale online experiment in which millions of users, self-organized in thousands of communities, clashed and collaborated in an effort to realize their agenda. Our hybrid model achieves a high F1 prediction score of 0.826. We find that coarse meta-features are as important for prediction accuracy as fine-grained textual cues, while explicit structural features play a smaller role. Interpreting our model, we provide and support various social insights about the unique characteristics of the communities that participated in the r/place experiment. Our results and analysis shed light on the complex social dynamics that drive collective behavior, and on the factors that propel user coordination. The scale and the unique conditions of the r/place experiment suggest that our findings may apply in broader contexts, such as online activism, (countering) the spread of hate speech, and reducing political polarization. The broader applicability of the model is demonstrated through an extensive analysis of the WallStreetBets community, its role in r/place and, four years later, in the GameStop short squeeze campaign of 2021.
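To convey the flavour of such a hybrid model, here is a minimal, hypothetical sketch that fuses TF-IDF textual features with numeric community meta-features for a binary participation classifier. The feature names, data, and classifier choice are illustrative assumptions, not the paper's actual pipeline:

```python
# Minimal hybrid text + meta-feature classifier sketch.
# Feature names, data, and model choice are illustrative assumptions.
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Hypothetical data: one record per community.
texts = [
    "join the pixel art effort this weekend",
    "weekly book discussion thread",
    "let's coordinate our canvas territory",
    "photos from my garden this spring",
]
meta = np.array([  # [subscriber_count, activity_rate] (assumed features)
    [12000, 0.8],
    [3000, 0.2],
    [45000, 0.9],
    [800, 0.1],
])
labels = np.array([1, 0, 1, 0])  # 1 = community joined the campaign

# Concatenate sparse text features with dense meta-features.
text_features = TfidfVectorizer().fit_transform(texts)
features = hstack([text_features, csr_matrix(meta)]).tocsr()

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.5, random_state=0, stratify=labels)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("F1:", f1_score(y_test, clf.predict(X_test)))
```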
As the COVID-19 pandemic progresses, severe flu seasons may happen alongside an increase in COVID-19 cases and deaths, placing severe burdens on health care resources and public safety. A consequence of such a twindemic may be a mixture of the two infections in the same person at the same time, known as "flurona". Amidst the rising trend of "flurona", forecasting both influenza outbreaks and COVID-19 waves in a timely manner is more urgent than ever, as accurate joint real-time tracking of the twindemic aids health organizations and policymakers in adequate preparation and decision making. Under the current pandemic, state-of-the-art influenza and COVID-19 forecasting models carry valuable domain information but face shortcomings under the current complex disease dynamics, such as similarities in symptoms and public healthcare-seeking patterns for the two diseases. Inspired by the interconnection between influenza and COVID-19 activities, we propose ARGOX-Joint-Ensemble, which combines historical influenza and COVID-19 forecasting models into a new ensemble framework that handles scenarios where flu and COVID co-exist. Our framework is able to emphasize learning from COVID-related or influenza signals through a winner-takes-all ensemble fashion. Moreover, our experiments demonstrate that our approach is successful in adapting past influenza forecasting models to the current pandemic, while improving upon previous COVID-19 forecasting models, steadily outperforming alternative benchmark methods and remaining competitive with publicly available models.
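To make the winner-takes-all idea concrete, here is a minimal, hypothetical sketch: at each forecast date, the ensemble adopts the prediction of whichever constituent model had the lowest recent error. The window length, error metric, and constituent models are assumptions, not the details of ARGOX-Joint-Ensemble:

```python
# Minimal winner-takes-all ensemble sketch (assumed details, not ARGOX's).
import numpy as np

def winner_takes_all(forecasts, actuals, window=4):
    """forecasts: (n_models, T) array of past one-step forecasts;
    actuals: (T,) realized values. Returns the index of the model with
    the lowest mean squared error over the last `window` periods."""
    recent_err = np.mean((forecasts[:, -window:] - actuals[-window:]) ** 2, axis=1)
    return int(np.argmin(recent_err))

# Hypothetical example: model 0 learns from flu signals, model 1 from COVID signals.
past_forecasts = np.array([[10., 12., 11., 13., 15.],
                           [ 9., 13., 12., 14., 14.]])
observed = np.array([9.5, 13., 12., 14., 14.5])
best = winner_takes_all(past_forecasts, observed)
print(f"Next forecast taken from model {best}")
```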
Many individuals with disabilities and/or chronic conditions (da/cc) experience symptoms that may require intermittent or ongoing medical care. However, healthcare is an often-overlooked domain for accessibility work, where access needs associated with temporary and long-term disability must be addressed to increase the utility of physical and digital interactions with healthcare workers and spaces. Our work focuses on a specific domain of healthcare often used by individuals with da/ccs: Physical Therapy (PT). Through a twelve-person interview study, we examined how people's access to PT for their da/cc is hampered by social (e.g., physically visiting a PT clinic) and physiological (e.g., chronic pain) barriers, and how technology could improve PT access. In-person PT was often inaccessible to our participants due to lack of transportation and insufficient insurance coverage. As such, many of our participants relied on at-home PT to manage their da/cc symptoms and work towards their PT goals. Participants felt that PT barriers, such as having particularly bad symptoms or feeling short on time, could be addressed with well-designed technology that flexibly adapts to a person's dynamically changing needs while supporting their PT goals. We introduce core design principles (flexibility, movement tracking, community building) and tensions (insurance) to consider when developing technology to support PT access. Rethinking da/cc access to PT from a lens that includes social and physiological barriers presents opportunities to integrate accessibility and flexibility into PT technology.
Platform trials have gained a lot of attention recently as a possible remedy for time-consuming classical two-arm randomized controlled trials, especially in early phase drug development. This short article illustrates how to use the CohortPlat R package to simulate a cohort platform trial, where each cohort consists of a combination treatment and the respective monotherapies and standard-of-care. The endpoint is always assumed to be binary. The package offers extensive flexibility with respect to platform trial trajectories as well as treatment effect scenarios and decision rules. As a special feature, the package provides a designated function for running multiple such simulations efficiently in parallel and saving the results in a concise manner. Many illustrations of code usage are provided.
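To convey the flavour of such a simulation (in Python rather than R, and deliberately not using CohortPlat's actual API, which is not reproduced here), a minimal sketch of one cohort with a binary endpoint and a simple go/no-go decision rule:

```python
# Generic sketch of one platform-trial cohort with a binary endpoint.
# Arm names, response rates, and the decision rule are assumptions;
# this is not the CohortPlat API.
import numpy as np

rng = np.random.default_rng(seed=42)
n_per_arm = 50
true_rates = {"SoC": 0.15, "mono_A": 0.25, "mono_B": 0.20, "combo_AB": 0.40}

# Simulate responders per arm from a binomial model.
responders = {arm: rng.binomial(n_per_arm, p) for arm, p in true_rates.items()}
rates = {arm: r / n_per_arm for arm, r in responders.items()}

# Simple decision rule: declare the combination promising if its observed
# response rate beats SoC and both monotherapies by at least 0.10.
go = all(rates["combo_AB"] >= rates[a] + 0.10
         for a in ("SoC", "mono_A", "mono_B"))
print(rates, "GO" if go else "NO-GO")
```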
Missing data is a systemic problem in practical scenarios, causing noise and bias when estimating treatment effects. This makes treatment effect estimation from data with missingness a particularly tricky endeavour. A key reason for this is that standard assumptions on missingness are rendered insufficient by the presence of an additional variable, treatment, besides the individual and the outcome. Having a treatment variable introduces additional complexity with respect to why some variables are missing that is not fully explored by previous work. In our work, we identify a new missingness mechanism, which we term mixed confounded missingness (MCM), where some missingness determines treatment selection and other missingness is determined by treatment selection. Given MCM, we show that naively imputing all data leads to poorly performing treatment effect models, as the act of imputation effectively removes information necessary to provide unbiased estimates. However, no imputation at all also leads to biased estimates, as missingness determined by treatment divides the population into distinct subpopulations, across which estimates will be biased. Our solution is selective imputation, where we use insights from MCM to inform precisely which variables should be imputed and which should not. We empirically demonstrate how various learners benefit from selective imputation compared to other solutions for missing data.
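A minimal, hypothetical sketch of what selective imputation might look like in practice: columns whose missingness is (assumed to be) determined by treatment are left unimputed, with an explicit missingness indicator, while the remaining columns are imputed as usual. The column lists and imputation strategy below are illustrative assumptions, not the paper's exact procedure:

```python
# Selective imputation sketch: impute some columns, preserve others' missingness.
# Which columns fall in each group is an assumption for illustration.
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame({
    "age":       [34, np.nan, 51, 47],        # missing at random -> impute
    "biomarker": [1.2, 0.8, np.nan, np.nan],  # measured only post-treatment
})                                            # -> do NOT impute
impute_cols = ["age"]          # missingness unrelated to treatment (assumed)
keep_missing = ["biomarker"]   # missingness determined by treatment (assumed)

df[impute_cols] = SimpleImputer(strategy="mean").fit_transform(df[impute_cols])
for col in keep_missing:
    df[f"{col}_missing"] = df[col].isna().astype(int)  # explicit indicator
print(df)
```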
Traditional industrial systems, e.g., power plants, water treatment plants, etc., were built to operate in a highly isolated and controlled capacity. Recently, Industrial Control Systems (ICSs) have been exposed to the Internet for ease of access and adaptation to advanced technologies. However, this exposure creates security vulnerabilities, which attackers often exploit to launch attacks on ICSs. To counter this, threat hunting is performed to proactively monitor the security of ICS networks and protect them against threats that could make the systems malfunction. A threat hunter manually identifies threats and formulates a hypothesis based on the available threat intelligence. In this paper, we highlight the lack of research on automating threat hunting in ICS networks. We propose the automated extraction of threat intelligence and the generation and validation of hypotheses, and we present an automated threat hunting framework based on threat intelligence provided by the ICS MITRE ATT&CK framework. Unlike existing hunting solutions, which are cloud-based, costly, and prone to human error, our solution is centralized and open source, implemented using different open-source technologies, e.g., Elasticsearch, Conpot, Metasploit, a Web Single Page Application (SPA), and a machine learning analyser. Our results demonstrate that the proposed threat hunting solution can identify attacks on the network and alert a threat hunter with a hypothesis generated from the tactics, techniques, and procedures (TTPs) in ICS MITRE ATT&CK; a machine learning classifier then automatically predicts the future actions of the attack.
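To illustrate the final step, a minimal, hypothetical sketch of a classifier that predicts an attack's next technique from the techniques observed so far. The technique IDs, training sequences, and model choice are illustrative assumptions, not the framework's actual analyser:

```python
# Sketch: predict the next ATT&CK for ICS technique from observed ones.
# Technique sequences and model choice are illustrative assumptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Each sample: techniques observed so far -> next technique (hypothetical).
observed = ["T0846 T0888", "T0846 T0888 T0855", "T0819 T0866"]
next_step = ["T0855", "T0831", "T0814"]

# lowercase=False keeps technique IDs intact for the token pattern.
vec = CountVectorizer(token_pattern=r"T\d+", lowercase=False)
X = vec.fit_transform(observed)
clf = MultinomialNB().fit(X, next_step)

print(clf.predict(vec.transform(["T0846 T0888"])))  # e.g. ['T0855']
```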
This special issue interrogates the meaning and impacts of "tech ethics": the embedding of ethics into digital technology research, development, use, and governance. In response to concerns about the social harms associated with digital technologies, many individuals and institutions have articulated the need for a greater emphasis on ethics in digital technology. Yet as more groups embrace the concept of ethics, critical discourses have emerged questioning whose ethics are being centered, whether "ethics" is the appropriate frame for improving technology, and what it means to develop "ethical" technology in practice. This interdisciplinary issue takes up these questions, interrogating the relationships among ethics, technology, and society in action. This special issue engages with the normative and contested notions of ethics itself, how ethics has been integrated with technology across domains, and potential paths forward to support more just and egalitarian technology. Rather than starting from philosophical theories, the authors in this issue orient their articles around the real-world discourses and impacts of tech ethics--i.e., tech ethics in action.
The appearance of a novel coronavirus in late 2019 radically changed the community of researchers that had been working on coronaviruses since the 2002 SARS epidemic. In 2020, coronavirus-related publications grew by 20 times over the previous two years, with 130,000 more researchers publishing on related topics. The United States, the United Kingdom, and China led dozens of nations working on coronaviruses prior to the pandemic, but leadership consolidated among these three nations in 2020; collectively they accounted for 50% of all papers and garnered well over 60% of citations. China took an early lead in COVID-19 research, but dropped rapidly in production and international participation through the year. Europe showed the opposite pattern, beginning slowly in publications but growing in contributions during the year. The share of internationally collaborative publications dropped from pre-pandemic rates, while single-authored publications grew. For all nations, including China, the number of publications about COVID-19 tracks closely with the outbreak of COVID-19 cases. Lower-income nations participated very little in COVID-19 research in 2020. Topic maps of internationally collaborative work show the rise of patient care and public health clusters, two topics that were largely absent from coronavirus research in the two years prior to 2020. These findings are consistent with global science as a self-organizing system operating on a reputation-based dynamic.
The field of Multi-Agent Systems (MAS) is an active area of research within Artificial Intelligence, with an increasingly important impact on industrial and other real-world applications. Within a MAS, autonomous agents interact to pursue personal interests and/or to achieve common objectives. Distributed Constraint Optimization Problems (DCOPs) have emerged as one of the prominent agent architectures to govern the agents' autonomous behavior, where both algorithms and communication models are driven by the structure of the specific problem. During the last decade, several extensions to the DCOP model have enabled it to support MAS in complex, real-time, and uncertain environments. This survey provides an overview of the DCOP model, giving a classification of its multiple extensions and addressing both resolution methods and applications that find a natural mapping within each class of DCOPs. The proposed classification suggests several future perspectives for DCOP extensions, and identifies challenges in the design of efficient resolution algorithms, possibly through the adaptation of strategies from different areas.
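For readers unfamiliar with the model, the standard definition used throughout the DCOP literature is the tuple

\[
\langle \mathcal{A}, \mathcal{X}, \mathcal{D}, \mathcal{F}, \alpha \rangle,
\]

where $\mathcal{A}$ is the set of agents, $\mathcal{X}$ the set of variables, $\mathcal{D}$ the set of finite variable domains, $\mathcal{F}$ the set of cost (or utility) functions, each $f \in \mathcal{F}$ defined over a subset of the variables, and $\alpha : \mathcal{X} \to \mathcal{A}$ the mapping assigning each variable to the agent that controls it. The goal is to find a complete assignment $\sigma^*$ optimizing the aggregate cost,

\[
\sigma^* = \operatorname*{arg\,min}_{\sigma} \sum_{f \in \mathcal{F}} f(\sigma)
\]

(or the arg max, in utility-maximization variants).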