The classic studies of causal emergence have revealed that in some Markovian dynamical systems, far stronger causal connections can be found on the higher-level descriptions than the lower-level of the same systems if we coarse-grain the system states in an appropriate way. However, identifying this emergent causality from the data is still a hard problem that has not been solved because the correct coarse-graining strategy can not be found easily. This paper proposes a general machine learning framework called Neural Information Squeezer to automatically extract the effective coarse-graining strategy and the macro-state dynamics, as well as identify causal emergence directly from the time series data. By decomposing a coarse-graining operation into two processes: information conversion and information dropping out, we can not only exactly control the width of the information channel, but also can derive some important properties analytically including the exact expression of the effective information of a macro-dynamics. We also show how our framework can extract the dynamics on different levels and identify causal emergence from the data on several exampled systems.
Unlike traditional time series, the action sequences of human decision making usually involve many cognitive processes such as beliefs, desires, intentions, and theory of mind, i.e., what others are thinking. This makes predicting human decision-making challenging to be treated agnostically to the underlying psychological mechanisms. We propose here to use a recurrent neural network architecture based on long short-term memory networks (LSTM) to predict the time series of the actions taken by human subjects engaged in gaming activity, the first application of such methods in this research domain. In this study, we collate the human data from 8 published literature of the Iterated Prisoner's Dilemma comprising 168,386 individual decisions and post-process them into 8,257 behavioral trajectories of 9 actions each for both players. Similarly, we collate 617 trajectories of 95 actions from 10 different published studies of Iowa Gambling Task experiments with healthy human subjects. We train our prediction networks on the behavioral data and demonstrate a clear advantage over the state-of-the-art methods in predicting human decision-making trajectories in both the single-agent scenario of the Iowa Gambling Task and the multi-agent scenario of the Iterated Prisoner's Dilemma. Moreover, we observe that the weights of the LSTM networks modeling the top performers tend to have a wider distribution compared to poor performers, as well as a larger bias, which suggest possible interpretations for the distribution of strategies adopted by each group.
We propose a new sheaf semantics for secure information flow over a space of abstract behaviors, based on synthetic domain theory: security classes are open/closed partitions, types are sheaves, and redaction of sensitive information corresponds to restricting a sheaf to a closed subspace. Our security-aware computational model satisfies termination-insensitive noninterference automatically, and therefore constitutes an intrinsic alternative to state of the art extrinsic/relational models of noninterference. Our semantics is the latest application of Sterling and Harper's recent re-interpretation of phase distinctions and noninterference in programming languages in terms of Artin gluing and topos-theoretic open/closed modalities. Prior applications include parametricity for ML modules, the proof of normalization for cubical type theory by Sterling and Angiuli, and the cost-aware logical framework of Niu et al. In this paper we employ the phase distinction perspective twice: first to reconstruct the syntax and semantics of secure information flow as a lattice of phase distinctions between "higher" and "lower" security, and second to verify the computational adequacy of our sheaf semantics vis-\`a-vis an extension of Abadi et al.'s dependency core calculus with a construct for declassifying termination channels.
In this paper we present a deep learning method to predict the temporal evolution of dissipative dynamic systems. We propose using both geometric and thermodynamic inductive biases to improve accuracy and generalization of the resulting integration scheme. The first is achieved with Graph Neural Networks, which induces a non-Euclidean geometrical prior with permutation invariant node and edge update functions. The second bias is forced by learning the GENERIC structure of the problem, an extension of the Hamiltonian formalism, to model more general non-conservative dynamics. Several examples are provided in both Eulerian and Lagrangian description in the context of fluid and solid mechanics respectively, achieving relative mean errors of less than 3% in all the tested examples. Two ablation studies are provided based on recent works in both physics-informed and geometric deep learning.
Representation learning enables us to automatically extract generic feature representations from a dataset to solve another machine learning task. Recently, extracted feature representations by a representation learning algorithm and a simple predictor have exhibited state-of-the-art performance on several machine learning tasks. Despite its remarkable progress, there exist various ways to evaluate representation learning algorithms depending on the application because of the flexibility of representation learning. To understand the current representation learning, we review evaluation methods of representation learning algorithms and theoretical analyses. On the basis of our evaluation survey, we also discuss the future direction of representation learning. Note that this survey is the extended version of Nozawa and Sato (2022).
The dynamic response of the legged robot locomotion is non-Lipschitz and can be stochastic due to environmental uncertainties. To test, validate, and characterize the safety performance of legged robots, existing solutions on observed and inferred risk can be incomplete and sampling inefficient. Some formal verification methods suffer from the model precision and other surrogate assumptions. In this paper, we propose a scenario sampling based testing framework that characterizes the overall safety performance of a legged robot by specifying (i) where (in terms of a set of states) the robot is potentially safe, and (ii) how safe the robot is within the specified set. The framework can also help certify the commercial deployment of the legged robot in real-world environment along with human and compare safety performance among legged robots with different mechanical structures and dynamic properties. The proposed framework is further deployed to evaluate a group of state-of-the-art legged robot locomotion controllers from various model-based, deep neural network involved, and reinforcement learning based methods in the literature. Among a series of intended work domains of the studied legged robots (e.g. tracking speed on sloped surface, with abrupt changes on demanded velocity, and against adversarial push-over disturbances), we show that the method can adequately capture the overall safety characterization and the subtle performance insights. Many of the observed safety outcomes, to the best of our knowledge, have never been reported by the existing work in the legged robot literature.
Many forms of dependence manifest themselves over time, with behavior of variables in dynamical systems as a paradigmatic example. This paper studies temporal dependence in dynamical systems from a logical perspective, by extending a minimal modal base logic of static functional dependencies. We define a logic for dynamical systems with single time steps, provide a complete axiomatic proof calculus, and show the decidability of the satisfiability problem for a substantial fragment. The system comes in two guises: modal and first-order, that naturally complement each other. Next, we consider a timed semantics for our logic, as an intermediate between state spaces and temporal universes for the unfoldings of a dynamical system. We prove completeness and decidability by combining techniques from dynamic-epistemic logic and modal logic of functional dependencies with complex terms for objects. Also, we extend these results to the timed logic with functional symbols and term identity. Finally, we conclude with a brief outlook on how the system proposed here connects with richer temporal logics of system behavior, and with dynamic topological logic.
Automatic text summarization has experienced substantial progress in recent years. With this progress, the question has arisen whether the types of summaries that are typically generated by automatic summarization models align with users' needs. Ter Hoeve et al (2020) answer this question negatively. Amongst others, they recommend focusing on generating summaries with more graphical elements. This is in line with what we know from the psycholinguistics literature about how humans process text. Motivated from these two angles, we propose a new task: summarization with graphical elements, and we verify that these summaries are helpful for a critical mass of people. We collect a high quality human labeled dataset to support research into the task. We present a number of baseline methods that show that the task is interesting and challenging. Hence, with this work we hope to inspire a new line of research within the automatic summarization community.
Models for dependent data are distinguished by their targets of inference. Marginal models are useful when interest lies in quantifying associations averaged across a population of clusters. When the functional form of a covariate-outcome association is unknown, flexible regression methods are needed to allow for potentially non-linear relationships. We propose a novel marginal additive model (MAM) for modelling cluster-correlated data with non-linear population-averaged associations. The proposed MAM is a unified framework for estimation and uncertainty quantification of a marginal mean model, combined with inference for between-cluster variability and cluster-specific prediction. We propose a fitting algorithm that enables efficient computation of standard errors and corrects for estimation of penalty terms. We demonstrate the proposed methods in simulations and in application to (i) a longitudinal study of beaver foraging behaviour, and (ii) a spatial analysis of Loaloa infection in West Africa. R code for implementing the proposed methodology is available at //github.com/awstringer1/mam.
This paper focuses on the expected difference in borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects and hence the estimation error can be magnificent. As such, we propose another approach to construct the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is strikingly substantial if the causal effects are accounted for correctly.
Edge intelligence refers to a set of connected systems and devices for data collection, caching, processing, and analysis in locations close to where data is captured based on artificial intelligence. The aim of edge intelligence is to enhance the quality and speed of data processing and protect the privacy and security of the data. Although recently emerged, spanning the period from 2011 to now, this field of research has shown explosive growth over the past five years. In this paper, we present a thorough and comprehensive survey on the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, namely edge caching, edge training, edge inference, and edge offloading, based on theoretical and practical results pertaining to proposed and deployed systems. We then aim for a systematic classification of the state of the solutions by examining research results and observations for each of the four components and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate, compare and analyse the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, etc. This survey article provides a comprehensive introduction to edge intelligence and its application areas. In addition, we summarise the development of the emerging research field and the current state-of-the-art and discuss the important open issues and possible theoretical and technical solutions.