The NSF-funded Robust Epidemic Surveillance and Modeling (RESUME) project successfully convened a workshop entitled "High-performance computing and large-scale data management in service of epidemiological modeling" at the University of Chicago on May 1-2, 2023. This was part of a series of workshops designed to foster sustainable and interdisciplinary co-design for predictive intelligence and pandemic prevention. The event brought together 31 experts in epidemiological modeling, high-performance computing (HPC), HPC workflows, and large-scale data management to develop a shared vision for capabilities needed for computational epidemiology to better support pandemic prevention. Through the workshop, participants identified key areas in which HPC capabilities could be used to improve epidemiological modeling, particularly in supporting public health decision-making, with an emphasis on HPC workflows, data integration, and HPC access. The workshop explored nascent HPC workflow and large-scale data management approaches currently in use for epidemiological modeling and sought to draw from approaches used in other domains to determine which practices could be best adapted for use in epidemiological modeling. This report documents the key findings and takeaways from the workshop.
Large Language Models (LLMs) have showcased impressive reasoning capabilities, particularly when guided by specifically designed prompts in complex reasoning tasks such as math word problems. These models typically solve tasks using a chain-of-thought approach, which not only bolsters their reasoning abilities but also provides valuable insights into their problem-solving process. However, there is still significant room for enhancing the reasoning abilities of LLMs. Some studies suggest that the integration of an LLM output verifier can boost reasoning accuracy without necessitating additional model training. In this paper, we follow these studies and introduce a novel graph-based method to further augment the reasoning capabilities of LLMs. We posit that multiple solutions to a reasoning task, generated by an LLM, can be represented as a reasoning graph due to the logical connections between intermediate steps from different reasoning paths. Therefore, we propose the Reasoning Graph Verifier (RGV) to analyze and verify the solutions generated by LLMs. By evaluating these graphs, the model can yield more accurate and reliable results. Our experimental results show that our graph-based verification method not only significantly enhances the reasoning abilities of LLMs but also outperforms existing verifier methods in terms of improving these models' reasoning performance.
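As a minimal sketch of the underlying idea (not the paper's RGV implementation, which involves a learned verifier), the sampled solutions can be merged into one graph whose nodes are normalized intermediate steps; a solution whose reasoning edges are shared by many sampled paths is then preferred. All function names below are illustrative.

```python
from collections import Counter

def build_reasoning_graph(solutions):
    """Merge sampled solutions into one graph: nodes are normalized
    intermediate steps, edges link consecutive steps within a path."""
    edge_counts = Counter()
    for steps in solutions:
        normalized = [s.strip().lower() for s in steps]
        for a, b in zip(normalized, normalized[1:]):
            edge_counts[(a, b)] += 1
    return edge_counts

def score_solution(steps, edge_counts):
    """Score a solution by how often its reasoning edges are shared with
    other sampled paths (a simple stand-in for a learned verifier)."""
    normalized = [s.strip().lower() for s in steps]
    edges = list(zip(normalized, normalized[1:]))
    if not edges:
        return 0.0
    return sum(edge_counts[e] for e in edges) / len(edges)

def verify(solutions, answers):
    """Return the final answer of the highest-scoring reasoning path."""
    graph = build_reasoning_graph(solutions)
    best = max(range(len(solutions)),
               key=lambda i: score_solution(solutions[i], graph))
    return answers[best]

# Toy usage: three sampled chains of thought for one math word problem.
sols = [["3 * 4 = 12", "12 + 5 = 17"],
        ["3 * 4 = 12", "12 + 5 = 17"],
        ["3 + 4 = 7", "7 + 5 = 12"]]
print(verify(sols, ["17", "17", "12"]))  # -> "17"
```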
The DEVStone benchmark allows us to evaluate the performance of discrete-event simulators based on the DEVS formalism. It provides model sets with different characteristics, enabling the analysis of specific issues of simulation engines. However, this heterogeneity hinders the comparison of results across studies, since the results of each research work depend on the chosen subset of DEVStone models. We define the DEVStone metric, based on the DEVStone synthetic benchmark, which provides a mechanism for assigning objective ratings to DEVS-based simulators. This metric corresponds to the average number of times that a simulator can execute a selection of 12 DEVStone models in one minute. The variety of the chosen models ensures that the metric covers the different characteristics exercised by DEVStone. The proposed metric allows us to compare various simulators and to assess the impact of new features on their performance. We use the DEVStone metric to compare several popular DEVS-based simulators.
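One plausible reading of the metric, as a sketch: time one full execution of each of the 12 selected models, extrapolate to executions per minute, and average across models. The `run_model` callable is a hypothetical hook into the simulator under test.

```python
import time

def devstone_metric(models, run_model):
    """Sketch of the DEVStone metric: for each benchmark model, estimate
    how many executions fit in one minute (extrapolated from a single
    timed run), then average across the 12-model selection."""
    per_minute = []
    for model in models:
        start = time.perf_counter()
        run_model(model)  # one full simulation of this DEVStone model
        elapsed = time.perf_counter() - start
        per_minute.append(60.0 / elapsed)
    return sum(per_minute) / len(per_minute)
```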
Many libraries, such as OpenCV, FFmpeg, XNNPACK, and Eigen, use Arm or x86 SIMD intrinsics to optimize programs for performance. With the emergence of the RISC-V Vector Extension (RVV), there is a need to migrate this performance-critical legacy code to RVV. Currently, migrating NEON code to RVV code requires manual rewriting, which is a time-consuming and error-prone process. In this work, we use the open-source tool "SIMD Everywhere" (SIMDe) to automate the migration. Our primary task is to enhance SIMDe to enable the conversion of Arm NEON intrinsics types and functions to their corresponding RVV intrinsics types and functions. For type conversion, we devise strategies to convert NEON intrinsics types to RVV intrinsics types while accounting for vector-length-agnostic (VLA) architectures. For function conversion, we analyze the conversion methods commonly used in SIMDe and develop customized conversions for each function based on the generated RVV code. In our experiments with the Google XNNPACK library, our enhanced SIMDe achieves speedups ranging from 1.51x to 5.13x over the original SIMDe, which does not provide customized RVV implementations for these conversions.
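SIMDe's actual conversions live in C header code; purely to illustrate the kind of name- and type-level correspondence such a conversion must encode, here is a toy Python lookup table pairing a few NEON intrinsics with RVV counterparts. The RVV names and types follow the published RVV intrinsics specification; the fixed element count (vl) reflects mapping NEON's 128-bit vectors onto a VLA target with VLEN >= 128.

```python
# Toy illustration only: SIMDe's real conversions are C header code.
# Each 128-bit NEON intrinsic is paired with an RVV intrinsic, its RVV
# vector type, and the fixed vl needed to emulate a 128-bit vector.
NEON_TO_RVV = {
    # NEON name     (RVV intrinsic,             RVV type,       vl)
    "vaddq_s32": ("__riscv_vadd_vv_i32m1",  "vint32m1_t",   4),
    "vsubq_s32": ("__riscv_vsub_vv_i32m1",  "vint32m1_t",   4),
    "vmulq_f32": ("__riscv_vfmul_vv_f32m1", "vfloat32m1_t", 4),
    "vaddq_s16": ("__riscv_vadd_vv_i16m1",  "vint16m1_t",   8),
}

def emit_rvv_call(neon_name, args):
    """Render the RVV call that emulates a 128-bit NEON intrinsic."""
    rvv_name, _rvv_type, vl = NEON_TO_RVV[neon_name]
    return f"{rvv_name}({', '.join(args)}, {vl})"

print(emit_rvv_call("vaddq_s32", ["a", "b"]))
# -> __riscv_vadd_vv_i32m1(a, b, 4)
```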
We study 'Merlinized' versions of the recently defined Guided Local Hamiltonian problem, which we call 'Guidable Local Hamiltonian' problems. Unlike their guided counterparts, these problems do not have a guiding state provided as a part of the input, but merely come with the promise that one exists. We consider in particular two classes of guiding states: those that can be prepared efficiently by a quantum circuit; and those belonging to a class of quantum states we call classically evaluatable, for which it is possible to efficiently compute expectation values of local observables classically. We show that guidable local Hamiltonian problems for both classes of guiding states are $\mathsf{QCMA}$-complete in the inverse-polynomial precision setting, but lie within $\mathsf{NP}$ (or $\mathsf{NqP}$) in the constant precision regime when the guiding state is classically evaluatable. Our completeness results show that, from a complexity-theoretic perspective, classical Ans\"atze selected by classical heuristics are just as powerful as quantum Ans\"atze prepared by quantum heuristics, as long as one has access to quantum phase estimation. In relation to the quantum PCP conjecture, we (i) define a complexity class capturing quantum-classical probabilistically checkable proof systems and show that it is contained in $\mathsf{BQP}^{\mathsf{NP}[1]}$ for constant proof queries; (ii) give a no-go result on 'dequantizing' the known quantum reduction which maps a $\mathsf{QPCP}$-verification circuit to a local Hamiltonian with constant promise gap; (iii) give several no-go results for the existence of quantum gap amplification procedures that preserve certain ground state properties; and (iv) propose two conjectures that can be viewed as stronger versions of the NLTS theorem. Finally, we show that many of our results can be directly modified to obtain similar results for the class $\mathsf{MA}$.
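To fix intuition, a schematic statement of the problem follows (the precise parameterization is as in the paper and is not reproduced here): given a $k$-local Hamiltonian $H = \sum_i H_i$ on $n$ qubits with $\|H_i\| \leq 1$ and thresholds $\alpha < \beta$, decide whether $\lambda_{\min}(H) \leq \alpha$ or $\lambda_{\min}(H) \geq \beta$, under the promise that there exists (but is not supplied) a guiding state $|\psi\rangle$ from the designated class satisfying $\|\Pi_{\mathrm{gs}}|\psi\rangle\|^2 \geq \delta$ for some fixed $\delta > 0$, where $\Pi_{\mathrm{gs}}$ projects onto the low-energy subspace of $H$. The inverse-polynomial precision setting takes $\beta - \alpha \geq 1/\mathrm{poly}(n)$, while the constant precision regime takes $\beta - \alpha = \Omega(1)$.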
Privacy policies outline the data practices of Online Social Networks (OSNs) to comply with privacy regulations such as the EU GDPR and the CCPA. Several ontologies for modeling privacy regulations, policies, and compliance have emerged in recent years. However, they are limited in various ways: (1) they model only what is required of privacy policies by one specific privacy regulation, such as the GDPR; (2) they provide taxonomies of concepts but are not sufficiently axiomatized to afford automated reasoning with them; and (3) they do not model the data practices of privacy policies in sufficient detail to allow assessing the transparency of policies. This paper presents an OWL Ontology for Privacy Policies of OSNs, OPPO, that aims to fill these gaps by formalizing detailed data practices from OSNs' privacy policies. OPPO is grounded in BFO, IAO, OMRSE, and OBI, and its design is guided by the use case of representing and reasoning over the content of OSNs' privacy policies and evaluating policies' transparency in greater detail.
We investigate the internal behavior of Transformer-based Large Language Models (LLMs) when they generate factually incorrect text. We propose modeling factual queries as Constraint Satisfaction Problems and use this framework to investigate how the model interacts internally with factual constraints. Specifically, we discover a strong positive relationship between the model's attention to constraint tokens and the factual accuracy of its responses. On a curated suite of 11 datasets with over 40,000 prompts, we study the task of predicting factual errors with the Llama-2 family across all scales (7B, 13B, 70B). We propose SAT Probe, a method that probes self-attention patterns to predict constraint satisfaction and factual errors, enabling early error identification. The approach and findings demonstrate how a mechanistic understanding of factuality in LLMs can enhance reliability.
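A schematic of the attention-based probing idea, using the Hugging Face transformers API; the paper's exact feature construction and probe may differ, and the constraint-token span is assumed to be known per prompt.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

# Schematic SAT-Probe-style pipeline: aggregate the attention mass that
# the final prompt position places on the constraint tokens, per layer
# and head, then fit a linear probe predicting factual correctness.
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

def attention_features(prompt, constraint_span):
    """One feature per (layer, head): attention from the last prompt
    token to the constraint-token span (lo, hi)."""
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_attentions=True)
    lo, hi = constraint_span
    feats = [att[0, :, -1, lo:hi].sum(dim=-1)  # (heads,) per layer
             for att in out.attentions]
    return torch.cat(feats).numpy()

# Given prompts labeled 1 if the model answered correctly, fit the probe:
# X = [attention_features(p, span) for p, span in prompts]
# probe = LogisticRegression(max_iter=1000).fit(X, y)
```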
The advent of generative Large Language Models (LLMs) such as ChatGPT has catalyzed transformative advancements across multiple domains. However, alongside these advancements, they have also introduced potential threats. One critical concern is the misuse of LLMs by disinformation spreaders, who leverage these models to generate highly persuasive yet misleading content that challenges disinformation detection systems. This work aims to address this issue by answering three research questions: (1) To what extent can current disinformation detection techniques reliably detect LLM-generated disinformation? (2) If traditional techniques prove less effective, can LLMs themselves be exploited to serve as a robust defense against advanced disinformation? (3) Should both these strategies falter, what novel approaches can effectively counter this burgeoning threat? A holistic exploration of the formation and detection of disinformation is conducted to foster this line of research.
Graph Neural Networks (GNNs) have gained momentum in graph representation learning and boosted the state of the art in a variety of areas, such as data mining (\emph{e.g.,} social network analysis and recommender systems), computer vision (\emph{e.g.,} object detection and point cloud learning), and natural language processing (\emph{e.g.,} relation extraction and sequence learning), to name a few. With the emergence of Transformers in natural language processing and computer vision, graph Transformers embed a graph structure into the Transformer architecture to overcome the limitations of local neighborhood aggregation while avoiding strict structural inductive biases. In this paper, we present a comprehensive review of GNNs and graph Transformers in computer vision from a task-oriented perspective. Specifically, we divide their applications in computer vision into five categories according to the modality of input data, \emph{i.e.,} 2D natural images, videos, 3D data, vision + language, and medical images. In each category, we further divide the applications according to a set of vision tasks. Such a task-oriented taxonomy allows us to examine how each task is tackled by different GNN-based approaches and how well these approaches perform. After the necessary preliminaries, we provide the definitions and challenges of the tasks, in-depth coverage of representative approaches, and discussions of insights, limitations, and future directions.
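To make the contrast concrete: a GNN layer aggregates only over a node's immediate neighbors, whereas a graph Transformer lets every node attend to every other node, with the graph structure injected as a bias or encoding. A minimal mean-aggregation layer in plain PyTorch, illustrative rather than drawn from any surveyed method:

```python
import torch

def gnn_layer(x, adj, weight):
    """One round of local neighborhood aggregation: each node averages
    its neighbors' features, then applies a shared linear transform.

    x: (N, d) node features; adj: (N, N) 0/1 adjacency; weight: (d, d_out).
    """
    deg = adj.sum(dim=1, keepdim=True).clamp(min=1)  # avoid divide-by-zero
    neighbor_mean = (adj @ x) / deg                  # strictly local mixing
    return torch.relu(neighbor_mean @ weight)

x = torch.randn(5, 8)                   # 5 nodes, 8-dim features
adj = (torch.rand(5, 5) > 0.5).float()  # toy adjacency matrix
out = gnn_layer(x, adj, torch.randn(8, 16))
```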
Knowledge Graph Embedding (KGE) aims to learn representations for entities and relations. Most KGE models have achieved great success, especially in extrapolation scenarios: given an unseen triple (h, r, t), a trained model can still correctly predict t from (h, r, ?) or h from (?, r, t), an impressive extrapolation ability. However, most existing KGE work focuses on the design of elaborate triple modeling functions, which mainly tell us how to measure the plausibility of observed triples but offer limited explanation of why the methods can extrapolate to unseen data and of which factors help KGE extrapolate. In this work, we therefore study KGE extrapolation through two questions: 1. How does KGE extrapolate to unseen data? 2. How can we design KGE models with better extrapolation ability? For question 1, we first discuss the factors that affect extrapolation and, at the relation, entity, and triple levels respectively, propose three Semantic Evidences (SEs), which can be observed from the training set and provide important semantic information for extrapolation. We then verify the effectiveness of SEs through extensive experiments on several typical KGE methods. For question 2, to make better use of the three levels of SE, we propose a novel GNN-based KGE model, the Semantic Evidence aware Graph Neural Network (SE-GNN). In SE-GNN, each level of SE is modeled explicitly by the corresponding neighbor pattern and merged by multi-layer aggregation, which yields more extrapolative knowledge representations. Finally, through extensive experiments on the FB15k-237 and WN18RR datasets, we show that SE-GNN achieves state-of-the-art performance on the Knowledge Graph Completion task and exhibits better extrapolation ability.
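To ground the terminology: SE-GNN itself is GNN-based, but a classical "triple modeling function" in the TransE style scores the plausibility of (h, r, t) from embeddings alone, as in this minimal sketch. In the extrapolation setting, the embeddings come from training while the scored triple was never observed.

```python
import torch

def transe_score(h, r, t):
    """TransE-style plausibility: a triple (h, r, t) is plausible when
    the relation embedding translates the head close to the tail, i.e.
    h + r is close to t. Higher score = more plausible."""
    return -torch.norm(h + r - t, p=1, dim=-1)

# Toy usage with 50-dim embeddings; an unseen triple is scored with
# embeddings learned entirely from the training set.
h, r, t = torch.randn(3, 50)
print(transe_score(h, r, t))
```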
Pre-trained Language Models (PLMs) have achieved great success in various Natural Language Processing (NLP) tasks under the pre-training and fine-tuning paradigm. With large numbers of parameters, PLMs are computation-intensive and resource-hungry. Hence, model pruning has been introduced to compress large-scale PLMs. However, most prior approaches only consider task-specific knowledge for downstream tasks and ignore the essential task-agnostic knowledge during pruning, which may cause catastrophic forgetting and lead to poor generalization ability. To maintain both task-agnostic and task-specific knowledge in the pruned model, we propose ContrAstive Pruning (CAP) under the pre-training and fine-tuning paradigm. CAP is designed as a general framework, compatible with both structured and unstructured pruning. Unified in contrastive learning, CAP enables the pruned model to learn from the pre-trained model for task-agnostic knowledge and from the fine-tuned model for task-specific knowledge. Moreover, to better retain the performance of the pruned model, the snapshots (i.e., the intermediate models at each pruning iteration) also serve as effective supervision signals during pruning. Our extensive experiments show that adopting CAP consistently yields significant improvements, especially in extremely high-sparsity scenarios. With only 3% of model parameters retained (i.e., 97% sparsity), CAP preserves 99.2% and 96.3% of the original BERT performance on the QQP and MNLI tasks. In addition, our probing experiments demonstrate that models pruned by CAP achieve better generalization ability.
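A schematic of the contrastive objective, assuming an InfoNCE-style loss that pulls the pruned model's representation of an input toward the pre-trained, fine-tuned, and snapshot teachers' representations of the same input, with other in-batch representations as negatives; CAP's exact formulation and weighting may differ.

```python
import torch
import torch.nn.functional as F

def contrastive_pull(pruned_repr, teacher_repr, negatives, tau=0.07):
    """InfoNCE-style loss: the teacher's representation of the same input
    is the positive; `negatives` (K, d) are other in-batch representations.
    Minimizing keeps the pruned model close to the teacher."""
    pos = F.cosine_similarity(pruned_repr, teacher_repr, dim=-1) / tau
    neg = F.cosine_similarity(pruned_repr.unsqueeze(0), negatives, dim=-1) / tau
    logits = torch.cat([pos.unsqueeze(0), neg])
    return F.cross_entropy(logits.unsqueeze(0),
                           torch.zeros(1, dtype=torch.long))

def cap_loss(pruned, pretrained, finetuned, snapshots, negs,
             w=(1.0, 1.0, 0.5)):
    """Combine pulls toward task-agnostic (pre-trained), task-specific
    (fine-tuned), and snapshot teachers; weights w are assumptions."""
    loss = w[0] * contrastive_pull(pruned, pretrained, negs)
    loss += w[1] * contrastive_pull(pruned, finetuned, negs)
    loss += w[2] * sum(contrastive_pull(pruned, s, negs) for s in snapshots)
    return loss
```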