A crucial aspect of a rumor detection model is its ability to generalize, particularly its ability to detect emerging, previously unknown rumors. Past research has indicated that content-based rumor detection models (i.e., models using solely the source post as input) tend to perform less effectively on unseen rumors, while the potential of context-based models remains largely untapped. The main contribution of this paper is an in-depth evaluation of the performance gap between content-based and context-based models on detecting new, unseen rumors. Our empirical findings demonstrate that context-based models are still overly dependent on information derived from the rumors' source posts and tend to overlook the significant role that contextual information can play. We also study the effect of data split strategies on classifier performance. Based on our experimental results, the paper offers practical suggestions on how to minimize the effects of temporal concept drift in static datasets during the training of rumor detection methods.
We study the hardness of the problem of finding the distance of quantum error-correcting codes. The analogous problem for classical codes is known to be NP-hard, even in approximate form. For quantum codes, various problems related to decoding are known to be NP-hard, but the hardness of the distance problem has not been studied before. In this work, we show that finding the minimum distance of stabilizer quantum codes, exactly or approximately, is NP-hard. This result is obtained by reducing the classical minimum distance problem to the quantum problem, using the CWS framework for quantum codes, which constructs a quantum code from a classical code and a graph. The main technical tool behind our result is a lower bound on the so-called graph state distance of 4-cycle free graphs. In particular, we show that for a 4-cycle free graph $G$, its graph state distance is either $\delta$ or $\delta+1$, where $\delta$ is the minimum vertex degree of $G$. Via a well-known reduction from stabilizer codes to CSS codes, our results imply that finding the minimum distance of CSS codes is also NP-hard.
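Restating the key technical bound in display form (the notation $d(G)$ for the graph state distance and $\delta(G)$ for the minimum vertex degree is ours):
\[
d(G) \in \{\delta(G),\ \delta(G)+1\}, \qquad \delta(G) = \min_{v \in V(G)} \deg(v), \qquad \text{for every 4-cycle free graph } G.
\]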
We extend classical methods of computational complexity to the setting of distributed computing, where they are sometimes more effective than in their original context. Our focus is on distributed decision in the LOCAL model, where multiple networked computers communicate via synchronous message-passing to collectively answer a question about their network topology. Rather unusually, we impose two orthogonal constraints on the running time of this model: the number of communication rounds is bounded by a constant, and the number of computation steps of each computer is polynomially bounded by the size of its local input and the messages it receives. By letting two players take turns assigning certificates to all computers in the network, we obtain a generalization of the polynomial hierarchy (and hence of the complexity classes $\mathbf{P}$ and $\mathbf{NP}$). We then extend some key results of complexity theory to this setting, in particular the Cook-Levin theorem (which identifies Boolean satisfiability as a complete problem for $\mathbf{NP}$) and Fagin's theorem (which characterizes $\mathbf{NP}$ as the problems expressible in existential second-order logic). The original results can be recovered as the special case where the network consists of a single computer. But perhaps more surprisingly, the task of separating complexity classes becomes easier in the general case: we can show that our hierarchy is infinite, while it remains notoriously open whether the same is true in the case of a single computer. (By contrast, a collapse of our hierarchy would imply a collapse of the polynomial hierarchy.) As an application, we propose quantifier alternation as a new approach to measuring the locality of problems in distributed computing.
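To make the certificate game concrete, here is a sketch of how the levels of such a hierarchy can be defined (the notation is ours, not necessarily the paper's): a network language $\mathcal{L}$ lies on the $k$-th existential level if there is a verifier $A$, running in a constant number of communication rounds with polynomially bounded local computation, such that
\[
G \in \mathcal{L} \iff \exists c_1\, \forall c_2\, \exists c_3 \cdots Q_k c_k:\ A \text{ accepts } (c_1, \dots, c_k) \text{ at every node of } G,
\]
where each $c_i$ assigns a certificate to every computer in the network and $Q_k$ is $\exists$ or $\forall$ according to the parity of $k$. On a single-computer network with $k=1$, this specializes to the familiar certificate definition of $\mathbf{NP}$.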
Conditional copulas are useful tools for modeling how the dependence between multiple response variables varies with a given set of predictor variables. Conditional dependence measures such as conditional Kendall's tau and Spearman's rho, which can be expressed as functionals of the conditional copula, are often used to evaluate the strength of dependence conditional on the covariates. In general, semiparametric estimation methods for conditional copulas rely on an assumed parametric copula family whose parameter is taken to be a function of the covariates. The functional relationship can be estimated nonparametrically using various techniques, but an appropriate copula model must still be chosen from among the candidate families. In this paper, by employing the empirical checkerboard Bernstein copula (ECBC) estimator, we propose a fully nonparametric approach to estimating conditional copulas that does not require the selection of a parametric copula model. Closed-form estimates of the conditional dependence measures are derived directly from the proposed ECBC-based conditional copula estimator. We establish the large-sample consistency of the proposed estimator as well as of the resulting estimates of the conditional dependence measures. The finite-sample performance of the proposed estimator, and a comparison with semiparametric methods, are investigated through simulation studies. An application to real case studies is also provided.
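For concreteness, the two conditional dependence measures mentioned above are standard functionals of the conditional copula $C(\cdot, \cdot \mid \mathbf{x})$, so plugging in the ECBC-based estimator yields the closed-form estimates directly:
\[
\tau(\mathbf{x}) = 4 \int_{[0,1]^2} C(u, v \mid \mathbf{x}) \, dC(u, v \mid \mathbf{x}) - 1,
\qquad
\rho(\mathbf{x}) = 12 \int_{[0,1]^2} C(u, v \mid \mathbf{x}) \, du \, dv - 3.
\]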
Adversarial examples in machine learning have emerged as a focal point of research due to their remarkable ability to deceive models with seemingly inconspicuous input perturbations, potentially resulting in severe consequences. In this study, we embark on a comprehensive exploration of adversarial machine learning models, shedding light on their intrinsic complexity and interpretability. Our investigation reveals intriguing links between machine learning model complexity and Einstein's theory of special relativity, viewed through the lens of entanglement. Although our work does not center on quantum entanglement, we define the entanglement correlations we have discovered as computational, and demonstrate that distant feature samples can be entangled, strongly resembling entanglement correlations in the quantum realm. This offers fresh insight into the phenomenon of emergent adversarial examples in modern machine learning, potentially paving the way for more robust and interpretable models in this rapidly evolving field.
In recent years, several authors have been investigating simplicial models, a class of models for epistemic logic based on higher-dimensional structures called simplicial complexes. In the original formulation, simplicial models were always assumed to be pure, meaning that all worlds have the same dimension; this setting is equivalent to the standard $S5_n$ semantics of epistemic logic, based on Kripke models. By removing the assumption that models must be pure, we can go beyond the usual Kripke semantics and study epistemic logics where the number of agents participating in a world can vary. This approach has been developed in a number of papers, with applications to fault-tolerant distributed computing, where processes may crash during the execution of a system. A difficulty that arises is that subtle design choices in the definition of impure simplicial models can lead to different axioms in the resulting logic. In this paper, we classify these design choices systematically and axiomatize the corresponding logics. We illustrate them via distributed computing examples of synchronous systems where processes may crash.
Semantic consistency of a language model is broadly defined as the model's ability to produce semantically equivalent outputs given semantically equivalent inputs. We address the task of assessing the question-answering (QA) semantic consistency of contemporary large language models (LLMs) by manually creating a benchmark dataset of high-quality paraphrases for factual questions, which we release to the community. We further combine the semantic consistency metric with additional measurements that prior work suggests correlate with LLM QA accuracy to build and evaluate a framework for reference-less factual QA performance prediction: predicting the likelihood that a language model will answer a question accurately. Evaluating the framework on five contemporary LLMs, we demonstrate encouraging results that significantly outperform the baselines.
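A minimal sketch, in Python, of the kind of semantic-consistency measurement described above; the dataset layout and the answer-matching rule are our assumptions for illustration, not the paper's. Each benchmark item pairs a factual question with human-written paraphrases, and consistency is scored as the rate at which the model gives the same (normalized) answer across all phrasings.

from typing import Callable, List
import string


def normalize(answer: str) -> str:
    """Lowercase and strip punctuation so trivially different strings match."""
    table = str.maketrans("", "", string.punctuation)
    return answer.lower().translate(table).strip()


def consistency_score(model: Callable[[str], str], question: str,
                      paraphrases: List[str]) -> float:
    """Fraction of paraphrases whose answer agrees with the original question's."""
    reference = normalize(model(question))
    agreements = [normalize(model(p)) == reference for p in paraphrases]
    return sum(agreements) / len(agreements)


# Usage with a stand-in "model" (a lookup table) purely for illustration:
canned = {
    "Who wrote War and Peace?": "Leo Tolstoy",
    "Which author is War and Peace by?": "Leo Tolstoy.",
    "War and Peace was written by whom?": "Tolstoy",
}
score = consistency_score(
    canned.get,
    "Who wrote War and Peace?",
    ["Which author is War and Peace by?", "War and Peace was written by whom?"],
)
print(f"consistency = {score:.2f}")  # 0.50: one paraphrase agrees, one does not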
Longitudinal processes often exhibit nonlinear change patterns. Latent basis growth models (LBGMs) provide a versatile solution without requiring a specific functional form. Building on the LBGM specification for unequally-spaced waves and individual measurement occasions proposed by Liu and Perera (2023), we extend LBGMs to multivariate longitudinal outcomes, providing a unified approach to modeling nonlinear, interconnected trajectories. Simulation studies demonstrate that the proposed model provides unbiased and accurate estimates with target coverage probabilities for the parameters of interest. Real-world analyses of reading and mathematics scores demonstrate its effectiveness in analyzing joint developmental processes that vary in temporal patterns. Computational code is included.
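As background for readers unfamiliar with LBGMs, a textbook univariate specification (our rendering; the cited work generalizes it to unequally-spaced waves and, here, to multiple outcomes) writes the repeated measures $y_{it}$ of individual $i$ at occasions $t = 1, \dots, J$ as
\[
y_{it} = \eta_{0i} + \eta_{1i} \lambda_t + \epsilon_{it}, \qquad \lambda_1 = 0,\ \lambda_J = 1,
\]
where $\eta_{0i}$ is the individual intercept, $\eta_{1i}$ is the total change between the first and last occasions, and the interior basis coefficients $\lambda_2, \dots, \lambda_{J-1}$ are freely estimated, letting the data determine the shape of change.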
As artificial intelligence (AI) models continue to scale up, they are becoming more capable and are increasingly integrated into various forms of decision-making systems. For models involved in moral decision-making, also known as artificial moral agents (AMAs), interpretability provides a way to trust and understand the agent's internal reasoning mechanisms for effective use and error correction. In this paper, we provide an overview of this rapidly evolving sub-field of AI interpretability, introduce the concept of the Minimum Level of Interpretability (MLI), and recommend an MLI for various types of agents to aid their safe deployment in real-world settings.
Graph neural networks (GNNs) have been demonstrated to be a powerful algorithmic model in broad application fields, owing to their effectiveness in learning over graphs. To scale GNN training to large-scale and ever-growing graphs, the most promising solution is distributed training, which distributes the training workload across multiple computing nodes. However, the workflows, computational patterns, communication patterns, and optimization techniques of distributed GNN training remain only preliminarily understood. In this paper, we provide a comprehensive survey of distributed GNN training by investigating the various optimization techniques it employs. First, distributed GNN training is classified into several categories according to workflow, and the corresponding computational patterns and communication patterns, as well as the optimization techniques proposed by recent work, are introduced. Second, the software frameworks and hardware platforms for distributed GNN training are presented for a deeper understanding. Third, distributed GNN training is compared with the distributed training of deep neural networks, emphasizing what is unique to distributed GNN training. Finally, interesting open issues and opportunities in this field are discussed.
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch can lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style and illumination, and 2) the instance-level shift, such as object appearance and size. We build our approach on the recent state-of-the-art Faster R-CNN model and design two domain adaptation components, one at the image level and one at the instance level, to reduce the domain discrepancy. Both components are based on $\mathcal{H}$-divergence theory and are implemented by learning a domain classifier in an adversarial training manner. The domain classifiers at the two levels are further reinforced with a consistency regularizer to learn a domain-invariant region proposal network (RPN) within Faster R-CNN. We evaluate the approach on multiple datasets, including Cityscapes, KITTI, and SIM10K. The results demonstrate its effectiveness for robust object detection in various domain shift scenarios.
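One standard way to realize such an adversarial domain classifier is a gradient reversal layer (Ganin and Lempitsky, 2015). The sketch below, in Python/PyTorch, shows what the image-level component can look like; the channel sizes, the reversal strength lambd, and the loss wiring are our assumptions for illustration, not the paper's exact configuration.

import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; multiplies gradients by -lambd on the
    backward pass, so the feature extractor is trained to confuse the domain
    classifier while the classifier itself is trained normally."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.neg() * ctx.lambd, None


class ImageLevelDomainClassifier(nn.Module):
    """Predicts source vs. target domain from backbone feature maps."""

    def __init__(self, in_channels: int = 512, lambd: float = 0.1):
        super().__init__()
        self.lambd = lambd
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 256, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 1, kernel_size=1),  # per-location domain logit
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        reversed_feats = GradReverse.apply(features, self.lambd)
        return self.net(reversed_feats)


# Usage: backbone features from a source image (label 0) and a target image (label 1).
clf = ImageLevelDomainClassifier(in_channels=512)
feats = torch.randn(2, 512, 32, 32, requires_grad=True)
logits = clf(feats)
labels = torch.tensor([0.0, 1.0]).view(2, 1, 1, 1).expand_as(logits)
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
loss.backward()  # gradients reaching `feats` are reversed, pushing them to be domain-invariant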