Discovering causal relationships from observational data is a challenging task that relies on assumptions connecting statistical quantities to graphical or algebraic causal models. In this work, we focus on widely employed assumptions for causal discovery when the objects of interest are (multivariate) groups of random variables rather than individual (univariate) random variables, as is the case in a variety of problems in scientific domains such as climate science or neuroscience. When the group-level causal models are derived by partitioning a micro-level model into groups, we explore the relationship between micro- and group-level causal discovery assumptions. We investigate the conditions under which assumptions like Causal Faithfulness hold or fail to hold. Our analysis encompasses graphical causal models that contain cycles and bidirected edges. We also discuss grouped time series causal graphs and variants thereof as special cases of our general theoretical framework. We thereby aim to provide researchers with a solid theoretical foundation for the development and application of causal discovery methods for variable groups.
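To make the micro/group distinction concrete, here is a minimal sketch (a hypothetical toy model, not one taken from the paper) that partitions a micro-level DAG into groups and poses a group-level independence query via d-separation; it assumes networkx 2.8 or later, where d-separation queries are available.

```python
# Minimal sketch: a micro-level DAG is partitioned into variable groups,
# and a group-level conditional-independence statement is checked by a
# d-separation query on the micro-level graph.
import networkx as nx

# Micro-level DAG over individual (univariate) variables.
G = nx.DiGraph([("x1", "y1"), ("x2", "y1"), ("y1", "z1"), ("y2", "z1")])

# Partition of the micro-variables into groups (the group-level objects).
groups = {"X": {"x1", "x2"}, "Y": {"y1", "y2"}, "Z": {"z1"}}

# Group-level "X _||_ Z given Y" is implied by the micro model when the
# node sets are d-separated (Markov condition); whether the converse
# holds is exactly the kind of faithfulness question studied above.
sep = nx.d_separated(G, groups["X"], groups["Z"], groups["Y"])
print(f"X d-separated from Z given Y at the micro level: {sep}")
```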
Computing the rate-distortion function for continuous sources is commonly regarded as a standard continuous optimization problem. When addressing this problem numerically, a typical approach is to discretize the source space and then solve the associated discrete problem. However, the existing literature has predominantly concentrated on the convergence analysis of the discrete problems themselves, largely neglecting the convergence relationship between the original continuous optimization problem and its discrete counterpart. This omission is not rigorous, since solving a discrete problem does not necessarily yield convergence to the solution of the original continuous problem, especially for non-linear problems. To address this gap, we carry out a rigorous mathematical analysis: we construct a sequence of finite-dimensional spaces approximating the infinite-dimensional space of probability measures and establish that solutions of the discrete schemes converge to those of the continuous problem.
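To illustrate the discretize-then-solve approach referred to above, the following sketch (assuming a standard Gaussian source, squared-error distortion, and a uniform grid, none of which are specified in the abstract) computes one point on the rate-distortion curve with the classical Blahut-Arimoto algorithm applied to the discretized problem.

```python
# Minimal sketch: discretize a continuous source on a finite grid, then
# solve the resulting finite rate-distortion problem with Blahut-Arimoto.
import numpy as np

n = 200
x = np.linspace(-5.0, 5.0, n)            # discretized source/reproduction alphabet
p = np.exp(-x**2 / 2); p /= p.sum()      # discretized N(0,1) source distribution
d = (x[:, None] - x[None, :])**2         # squared-error distortion matrix
s = 2.0                                  # slope parameter (rate-distortion tradeoff)

q = np.full(n, 1.0 / n)                  # reproduction marginal, initialized uniform
for _ in range(500):                     # Blahut-Arimoto iterations
    w = q[None, :] * np.exp(-s * d)      # unnormalized conditional p(y|x)
    w /= w.sum(axis=1, keepdims=True)
    q = p @ w                            # update reproduction marginal

# One final conditional consistent with the converged marginal q.
w = q[None, :] * np.exp(-s * d)
w /= w.sum(axis=1, keepdims=True)

rate = np.sum(p[:, None] * w * np.log(w / q[None, :] + 1e-300)) / np.log(2)
dist = np.sum(p[:, None] * w * d)
print(f"R ~ {rate:.3f} bits at D ~ {dist:.3f}")
```

Refining the grid and rerunning is exactly the kind of discrete scheme whose convergence to the continuous solution the analysis above addresses.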
In the largest survey of its kind, 2,778 researchers who had published in top-tier artificial intelligence (AI) venues gave predictions on the pace of AI progress and the nature and impacts of advanced AI systems. The aggregate forecasts give at least a 50% chance of AI systems achieving several milestones by 2028, including autonomously constructing a payment processing site from scratch, creating a song indistinguishable from a new song by a popular musician, and autonomously downloading and fine-tuning a large language model. If science continues undisrupted, the chance of unaided machines outperforming humans in every possible task was estimated at 10% by 2027, and 50% by 2047. The latter estimate is 13 years earlier than that reached in a similar survey we conducted only one year earlier [Grace et al., 2022]. However, the chance of all human occupations becoming fully automatable was forecast to reach 10% by 2037, and 50% as late as 2116 (compared to 2164 in the 2022 survey). Most respondents expressed substantial uncertainty about the long-term value of AI progress: while 68.3% thought good outcomes from superhuman AI are more likely than bad, 48% of these net optimists gave at least a 5% chance of extremely bad outcomes such as human extinction, and 59% of net pessimists gave 5% or more to extremely good outcomes. Between 38% and 51% of respondents gave at least a 10% chance to advanced AI leading to outcomes as bad as human extinction. More than half suggested that "substantial" or "extreme" concern is warranted about six different AI-related scenarios, including misinformation, authoritarian control, and inequality. There was disagreement about whether faster or slower AI progress would be better for the future of humanity. However, there was broad agreement that research aimed at minimizing potential risks from AI systems ought to be prioritized more.
The analysis of data such as graphs has gained increasing attention in recent years, justified by the numerous applications in which such data appear. Several methods exist to predict graphs, but far fewer quantify the uncertainty of the prediction. The present work proposes an uncertainty quantification methodology for graphs based on conformal prediction. The method works both for graphs with the same set of nodes (labelled graphs) and for graphs with no clear correspondence between the sets of nodes across the observed graphs (unlabelled graphs). The unlabelled case is handled by creating prediction sets embedded in a quotient space. The proposed method does not rely on distributional assumptions, achieves finite-sample validity, and identifies interpretable prediction sets. To explore the features of this novel forecasting technique, we perform two simulation studies illustrating the methodology in both the labelled and the unlabelled case. We showcase the applicability of the method by analysing the performance of different teams during the 2018 FIFA World Cup via their player passing networks.
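As a concrete illustration of the labelled case, the sketch below (a toy construction under assumed Gaussian perturbations, not the paper's exact procedure) builds a split-conformal prediction set for graphs on a common node set, using the Frobenius distance to a mean adjacency matrix as the nonconformity score.

```python
# Minimal sketch: split conformal prediction for labelled graphs encoded
# as adjacency matrices; the prediction set is a Frobenius ball.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_train, n_cal = 8, 50, 50
alpha = 0.1                                   # target miscoverage level

# Toy sample: symmetric weighted adjacencies drawn around a common pattern.
base = rng.random((n_nodes, n_nodes)); base = (base + base.T) / 2
sample = base + 0.1 * rng.standard_normal((n_train + n_cal, n_nodes, n_nodes))

train, cal = sample[:n_train], sample[n_train:]
center = train.mean(axis=0)                   # point prediction: the mean graph

# Nonconformity scores on the calibration split.
scores = np.linalg.norm(cal - center, axis=(1, 2))
k = int(np.ceil((n_cal + 1) * (1 - alpha)))   # finite-sample quantile index
radius = np.sort(scores)[k - 1]

# By exchangeability, a new graph falls inside the ball with
# probability at least 1 - alpha.
new_graph = base + 0.1 * rng.standard_normal((n_nodes, n_nodes))
covered = np.linalg.norm(new_graph - center) <= radius
print(f"radius = {radius:.3f}, new graph covered: {covered}")
```

The unlabelled case would additionally minimize the score over node permutations, which is what working in a quotient space amounts to.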
The stability of persistent homology has led to wide applications of the persistence diagram as a trusted topological descriptor in the presence of noise. However, with the increasing demand for processing high-dimension, low-sample-size (HDLSS) data in modern science, it is questionable whether persistence diagrams retain their reliability in the presence of high-dimensional noise. This work studies the reliability of persistence diagrams in the HDLSS setting. By analyzing the asymptotic behavior of persistence diagrams for high-dimensional random data, we show that persistence diagrams are no longer reliable descriptors of low-sample-size data under high-dimensional noise perturbations. We refer to this loss of reliability in such data settings as the curse of dimensionality on persistence diagrams. Next, we investigate normalized principal component analysis as a method for reducing the dimensionality of the high-dimensional observed data, and we show that it can mitigate the curse of dimensionality on persistence diagrams. Our results shed new light on the challenges of processing HDLSS data with persistence diagrams and provide a starting point for future research in this area.
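The normalized principal component analysis step can be sketched as follows (a toy HDLSS setup with an assumed circle signal, not the paper's experiments): the leading principal component scores of the high-dimensional data are rescaled to unit variance before any persistence computation.

```python
# Minimal sketch: normalized PCA as a preprocessing step for HDLSS data,
# projecting out high-dimensional noise before computing persistence.
import numpy as np

rng = np.random.default_rng(0)
n, D, d = 40, 2000, 2                         # sample size, ambient dim, target dim

# Circle signal embedded in the first two coordinates, plus isotropic noise.
theta = rng.uniform(0, 2 * np.pi, n)
X = rng.standard_normal((n, D))               # high-dimensional noise
X[:, 0] += 5 * np.cos(theta)
X[:, 1] += 5 * np.sin(theta)

# PCA via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Normalized scores: each retained component is rescaled to unit variance,
# damping the spread that high-dimensional noise induces in raw scores.
Z = U[:, :d] * np.sqrt(n)

# Z is the low-dimensional point cloud whose persistence diagram one would
# compute (e.g., with a Rips complex) in place of the raw data X.
print(Z.shape)
```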
Event reasoning is a fundamental ability that underlies many applications. It requires event schema knowledge to perform global reasoning and must handle diverse inter-event relations and reasoning paradigms. How well LLMs accomplish event reasoning across these relations and paradigms remains unknown. To fill this gap, we comprehensively evaluate the event reasoning abilities of LLMs. We introduce EV2, a novel benchmark for EValuation of EVent reasoning. EV2 evaluates at two levels, schema and instance, and is comprehensive in its coverage of relations and reasoning paradigms. We conduct extensive experiments on EV2 and find that LLMs can accomplish event reasoning, but their performance is far from satisfactory. We also observe an imbalance across event reasoning abilities in LLMs. Moreover, although LLMs possess event schema knowledge, they are not aligned with humans in how they utilize that knowledge. Based on these findings, we introduce two methods to guide LLMs in utilizing event schema knowledge, both of which achieve improvements.
Acoustic recognition has become a common deep learning task in recent research, typically employing spectral feature extraction methods such as the short-time Fourier transform and the wavelet transform. However, few studies discuss their respective advantages and drawbacks or compare their performance. This paper therefore compares the attributes of the two resulting representations, the spectrogram and the scalogram. A convolutional neural network for acoustic fault recognition is implemented, and the performance of each representation is recorded for comparison. A recent study on the same audio database serves as a benchmark for assessing the designed spectrogram and scalogram features. Their advantages and limitations are also analyzed. The results of this paper thus provide guidance on application scenarios for spectrograms and scalograms, as well as potential directions for further research.
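For concreteness, the following sketch (a synthetic chirp signal; the library choices are assumptions, not necessarily the paper's implementation) computes both representations, each of which can be fed to a CNN as a single-channel image.

```python
# Minimal sketch: an STFT spectrogram versus a CWT scalogram of the
# same signal, the two inputs being compared for acoustic recognition.
import numpy as np
from scipy.signal import stft
import pywt

fs = 8000
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2 * np.pi * (200 + 400 * t) * t)   # toy chirp standing in for a fault signal

# Spectrogram: fixed time-frequency resolution set by the window length.
f, tt, Zxx = stft(x, fs=fs, nperseg=256)
spectrogram = np.abs(Zxx)                     # shape: (freq bins, time frames)

# Scalogram: multi-resolution analysis via the continuous wavelet transform.
scales = np.arange(1, 128)
coef, freqs = pywt.cwt(x, scales, "morl", sampling_period=1 / fs)
scalogram = np.abs(coef)                      # shape: (scales, time samples)

print(spectrogram.shape, scalogram.shape)
```

The fixed window of the STFT versus the scale-dependent resolution of the CWT is precisely the trade-off such a comparison probes.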
The fusion of causal models with deep learning, which introduces increasingly intricate data such as the causal associations within images or between textual components, has surfaced as a focal research area. Nonetheless, extending original causal concepts and theories to such complex, non-statistical data has met with serious challenges. In response, our study redefines causal data into three distinct categories from the standpoint of causal structure and representation: definite data, semi-definite data, and indefinite data. Definite data chiefly pertains to the statistical data used in conventional causal scenarios, while semi-definite data refers to a spectrum of data formats germane to deep learning, including time series, images, text, and others. Indefinite data is an emergent research area that we infer from the progression of data forms. To comprehensively present these three data paradigms, we elaborate on their formal definitions, the differences manifested in datasets, resolution pathways, and the development of research. We summarize key tasks and achievements pertaining to definite and semi-definite data across myriad research undertakings and present a roadmap for indefinite data, beginning with its current research conundrums. Lastly, we classify and scrutinize the key datasets presently utilized within these three paradigms.
Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in the neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation, or neither of these. These findings cast doubt on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.
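The scaling regimes in question can be illustrated with a small numerical sketch (random untrained weights, purely for illustration): residual updates scaled by L^(-beta) behave qualitatively differently for beta = 1 (the scaling assumed in the neural ODE picture) and beta = 1/2 (diffusive scaling).

```python
# Minimal sketch: how the output of an L-layer residual map depends on
# the depth scaling L**(-beta) of the layer updates.
import numpy as np

rng = np.random.default_rng(0)
dim = 16

def forward_norm(L, beta):
    """Output norm of an L-layer residual network with updates scaled by L^-beta."""
    x = np.ones(dim) / np.sqrt(dim)
    for _ in range(L):
        W = rng.standard_normal((dim, dim)) / np.sqrt(dim)  # i.i.d. layer weights
        x = x + L ** (-beta) * np.tanh(W @ x)
    return np.linalg.norm(x)

for L in (10, 100, 1000):
    print(L, forward_norm(L, beta=1.0), forward_norm(L, beta=0.5))

# With beta = 1 the cumulative update concentrates as L grows (here around
# zero drift, since the weights are mean-zero), while with beta = 0.5
# order-one fluctuations persist, the hallmark of a diffusive (SDE) limit.
```

Which regime trained weights actually fall into is exactly the empirical question the experiments above address.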
Deep neural networks have revolutionized many machine learning tasks in power systems, ranging from pattern recognition to signal processing. The data in these tasks is typically represented in Euclidean domains. Nevertheless, there is an increasing number of applications in power systems where data are collected from non-Euclidean domains and represented as graph-structured data with high-dimensional features and interdependency among nodes. The complexity of graph-structured data has brought significant challenges to existing deep neural networks defined in Euclidean domains. Recently, many studies on extending deep neural networks to graph-structured data in power systems have emerged. In this paper, a comprehensive overview of graph neural networks (GNNs) in power systems is presented. Specifically, several classical GNN architectures (e.g., graph convolutional networks, graph recurrent neural networks, graph attention networks, graph generative networks, spatial-temporal graph convolutional networks, and hybrid forms of GNNs) are summarized, and key applications in power systems such as fault diagnosis, power prediction, power flow calculation, and data generation are reviewed in detail. Furthermore, key issues and research trends concerning the application of GNNs in power systems are discussed.
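As a reference point for the architectures surveyed, here is a minimal sketch (a toy 5-bus network in pure numpy, an illustrative assumption) of the Kipf-Welling graph-convolution propagation rule that underlies many of these models.

```python
# Minimal sketch: one graph-convolution layer H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)
# on a toy bus network with per-node electrical features.
import numpy as np

rng = np.random.default_rng(0)

# Toy 5-bus network: adjacency matrix and per-node features.
A = np.array([[0, 1, 0, 0, 1],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [1, 0, 0, 1, 0]], dtype=float)
H = rng.standard_normal((5, 4))               # node features (e.g. V, angle, P, Q)
W = rng.standard_normal((4, 2)) * 0.1         # learnable layer weights

# Symmetrically normalized propagation with self-loops.
A_hat = A + np.eye(5)
d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

H_next = np.maximum(A_norm @ H @ W, 0.0)      # one graph-convolution layer (ReLU)
print(H_next.shape)                           # (5 nodes, 2 output features)
```

Each node's new representation mixes its neighbors' features, which is what lets such layers exploit the interdependency among buses that Euclidean architectures ignore.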
This work considers the question of how convenient access to copious data impacts our ability to learn causal effects and relations. In what ways is learning causality in the era of big data different from, or the same as, traditional causal learning? To answer this question, this survey provides a comprehensive and structured review of both traditional and frontier methods for learning causality and causal relations, along with the connections between causality and machine learning. It points out, on a case-by-case basis, how big data facilitates, complicates, or motivates each approach.