Recent research has provided a wealth of evidence highlighting the pivotal role of high-order interdependencies in supporting the information-processing capabilities of distributed complex systems. These findings may suggest that high-order interdependencies constitute a powerful resource that is, however, challenging to harness and readily disrupted. In this paper we contest this perspective by demonstrating that high-order interdependencies can not only exhibit robustness to stochastic perturbations, but can in fact be enhanced by them. Using elementary cellular automata as a general testbed, our results reveal the capacity of dynamical noise to enhance the statistical regularities between agents and, intriguingly, even to alter the prevailing character of their interdependencies. Furthermore, our results show that these effects are related to the high-order structure of the local rules, which shapes the system's susceptibility to noise and its characteristic time-scales. These results deepen our understanding of how high-order interdependencies may spontaneously emerge within distributed systems interacting with stochastic environments, providing an initial step towards elucidating their origin and function in complex systems such as the human brain.
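Since elementary cellular automata with dynamical noise are the paper's testbed, a minimal simulation sketch may help fix ideas. This is not the authors' code: the rule number, lattice size, and per-cell flip probability below are illustrative assumptions.

```python
import numpy as np

def eca_step(state, rule_table, rng, noise=0.0):
    """One synchronous update of an elementary cellular automaton with
    periodic boundaries, followed by independent bit flips (dynamical
    noise) applied to each cell with probability `noise`."""
    left = np.roll(state, 1)
    right = np.roll(state, -1)
    code = 4 * left + 2 * state + right        # neighbourhood code 0..7
    new = rule_table[code]
    flips = rng.random(state.size) < noise
    return np.where(flips, 1 - new, new)

rng = np.random.default_rng(0)
rule = 110                                     # illustrative choice of rule
rule_table = np.array([(rule >> k) & 1 for k in range(8)], dtype=np.int8)

state = rng.integers(0, 2, size=200).astype(np.int8)
history = [state]
for _ in range(500):
    state = eca_step(state, rule_table, rng, noise=0.01)
    history.append(state)
```

The resulting `history` is the kind of raw trajectory on which statistical regularities between cells, and their high-order structure, would then be measured.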
Many small to large organizations have adopted the Microservices Architecture (MSA) style to develop and deliver their core businesses. Despite the popularity of MSA in the software industry, there is limited evidence-based and thorough understanding of the types of issues (e.g., errors, faults, failures, and bugs) that microservices system developers experience, the causes of those issues, and the solutions that can serve as fixing strategies. To address this gap, we conducted a mixed-methods empirical study that collected data from 2,641 issues in the issue tracking systems of 15 open-source microservices systems on GitHub, 15 interviews, and an online survey completed by 150 practitioners from 42 countries across 6 continents. Our analysis led to comprehensive taxonomies of the issues, causes, and solutions. The findings indicate that Technical Debt, Continuous Integration and Delivery, Exception Handling, Service Execution and Communication, and Security are the most dominant issues in microservices systems. Furthermore, General Programming Errors, Missing Features and Artifacts, and Invalid Configuration and Communication are the main causes behind these issues. Finally, we identified 177 types of solutions that can be applied to fix the identified issues. Based on these results, we formulate future research directions that could help researchers and practitioners engineer emergent and next-generation microservices systems.
The abundance of observed data in recent years has increased the number of statistical augmentations to complex models across science and engineering. By augmentation we mean coherent statistical methods that incorporate measurements upon arrival and adjust the model accordingly. However, in this research area methodological developments tend to take centre stage, with important assessments of model fidelity often taking second place. Recently, the statistical finite element method (statFEM) has been posited as a potential solution to the problem of model misspecification when the data are believed to be generated from an underlying partial differential equation system. Bayesian nonlinear filtering permits data-driven updates of finite element discretised solutions, yielding a posterior distribution that quantifies the uncertainty over model solutions. The statFEM has shown great promise in systems subject to mild misspecification, but its ability to handle severe model misspecification has not yet been studied. In this paper we fill this gap, studying statFEM in the context of the shallow water equations, chosen for their oceanographic relevance. By deliberately misspecifying the governing equations, via linearisation, viscosity, and bathymetry, we systematically analyse misspecification by studying how the resultant approximate posterior distribution is affected, under additional regimes of decreasing spatiotemporal observational frequency. Results show that statFEM performs well, with reasonable accuracy as measured by theoretically sound proper scoring rules.
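As a rough intuition for the statFEM posterior (not the paper's shallow-water implementation), consider the conjugate linear-Gaussian case: a Gaussian prior over the discretised solution, encoding the possibly misspecified model, is conditioned on sparse, noisy observations. Every quantity below, from the grid size to the covariance kernel, is an illustrative assumption.

```python
import numpy as np

n = 50
x = np.linspace(0.0, 1.0, n)
m = np.zeros(n)                      # prior mean from the (possibly wrong) model solve
C = np.exp(-0.5 * (x[:, None] - x[None, :])**2 / 0.1**2)  # prior covariance

# Sparse observations y = H u + e, with e ~ N(0, sigma^2 I)
obs_idx = np.arange(0, n, 10)
H = np.eye(n)[obs_idx]
sigma = 0.05
y = np.sin(2 * np.pi * x[obs_idx]) + sigma * np.random.default_rng(1).normal(size=obs_idx.size)

# Conjugate Gaussian conditioning: posterior mean and covariance
S = H @ C @ H.T + sigma**2 * np.eye(obs_idx.size)
K = np.linalg.solve(S, H @ C).T      # Kalman-style gain
m_post = m + K @ (y - H @ m)
C_post = C - K @ H @ C
```

Under severe misspecification the prior mean sits far from the truth, and the question studied in the paper is how well this kind of posterior, propagated through a nonlinear filter, still covers the data.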
The logic of goal-directed knowing-how extends standard epistemic logic with a knowing-how operator, interpreted as asserting that there exists a strategy such that the agent knows the strategy can guarantee that p holds. This paper presents a tableau procedure for the multi-agent version of the logic of strategically knowing-how and proves the soundness and completeness of this procedure. It also shows that the satisfiability problem of the logic can be decided in PSPACE.
In this paper, we provide a novel framework for the analysis of the generalization error of first-order optimization algorithms for statistical learning when the gradient can only be accessed through partial observations given by an oracle. Our analysis relies on the regularity of the gradient with respect to the data samples, and allows us to derive near-matching upper and lower bounds on the generalization error of multiple learning problems, including supervised learning, transfer learning, robust learning, distributed learning, and communication-efficient learning using gradient quantization. These results hold for smooth, strongly convex optimization problems, as well as smooth non-convex problems satisfying a Polyak-Łojasiewicz assumption. In particular, our upper and lower bounds depend on a novel quantity that extends the notion of conditional standard deviation and measures the extent to which the gradient can be approximated by having access to the oracle. Our analysis thus gives precise meaning to the intuition that optimizing the statistical learning objective is as hard as estimating its gradient. Finally, we show that, in the case of standard supervised learning, mini-batch gradient descent with increasing batch sizes and a warm start can reach a generalization error that is optimal up to a multiplicative factor, motivating the use of this optimization scheme in practical applications.
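The closing claim, that mini-batch gradient descent with increasing batch sizes and a warm start attains near-optimal generalization error, can be sketched on least squares. The doubling schedule, step size, and per-stage iteration count below are illustrative assumptions, not the paper's prescription.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 10, 4096
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_star + 0.1 * rng.normal(size=n)

def grad(w, batch):
    """Mini-batch gradient of the squared loss (the 'oracle' here is exact)."""
    Xb, yb = X[batch], y[batch]
    return Xb.T @ (Xb @ w - yb) / len(batch)

w = np.zeros(d)
lr, batch_size, start = 0.1, 8, 0
while start < n:
    batch = np.arange(start, min(start + batch_size, n))
    for _ in range(20):              # a few descent steps per stage
        w -= lr * grad(w, batch)
    start += batch_size              # warm start: w carries over to the next,
    batch_size *= 2                  # geometrically larger, batch
```

Each stage refines the previous iterate with fresher, lower-variance gradient estimates, which is the mechanism behind the stated optimality up to a multiplicative factor.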
The transformative influence of Large Language Models (LLMs) is profoundly reshaping the Artificial Intelligence (AI) technology domain. Notably, ChatGPT distinguishes itself within these models, demonstrating remarkable performance in multi-turn conversations and exhibiting code proficiency across an array of languages. In this paper, we carry out a comprehensive evaluation of ChatGPT's coding capabilities based on what is to date the largest catalog of coding challenges. Our focus is the Python programming language and problems centered on data structures and algorithms, two topics at the very foundations of Computer Science. We evaluate ChatGPT's ability to generate correct solutions to the problems fed to it, its code quality, and the nature of run-time errors thrown by its code. Where ChatGPT code executes successfully but fails to solve the problem at hand, we examine patterns in the test cases passed to gain insight into how wrong the code is in such situations. To infer whether ChatGPT might have directly memorized some of its training data, we methodically design an experiment to investigate this phenomenon. Making comparisons with human performance whenever feasible, we investigate all the above questions in the context of both underlying models (GPT-3.5 and GPT-4), across a vast array of sub-topics within the main topics, and across problems of varying degrees of difficulty.
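A study of this kind ultimately rests on a harness that executes generated code against test cases and records passes, failures, and run-time errors. The sketch below is a guess at such a harness, not the authors' pipeline; the data layout and names are hypothetical.

```python
def evaluate(solution_fn, test_cases):
    """Run a candidate solution on (args, expected) pairs and tally
    passes, wrong answers, and the types of run-time errors raised."""
    results = {"passed": 0, "failed": 0, "errors": []}
    for args, expected in test_cases:
        try:
            if solution_fn(*args) == expected:
                results["passed"] += 1
            else:
                results["failed"] += 1
        except Exception as exc:     # run-time error thrown by generated code
            results["errors"].append(type(exc).__name__)
    return results

# Toy usage with a trivially correct solution
tests = [((2, 3), 5), ((10, -4), 6)]
print(evaluate(lambda a, b: a + b, tests))   # {'passed': 2, 'failed': 0, 'errors': []}
```

Per-test pass patterns from such a harness are what enable the "how wrong is wrong code" analysis described above.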
We introduce a sufficient graphical model by applying recently developed nonlinear sufficient dimension reduction techniques to the evaluation of conditional independence. The graphical model is nonparametric in nature, as it does not make distributional assumptions such as the Gaussian or copula Gaussian assumptions. However, unlike a fully nonparametric graphical model, which relies on a high-dimensional kernel to characterize conditional independence, our graphical model is based on conditional independence given a set of sufficient predictors of substantially reduced dimension. In this way we avoid the curse of dimensionality that comes with a high-dimensional kernel. We establish the population-level properties, convergence rate, and variable selection consistency of our estimator. Through simulation comparisons and an analysis of the DREAM 4 Challenge data set, we demonstrate that our method outperforms existing methods when the Gaussian or copula Gaussian assumptions are violated, and that its performance remains excellent in the high-dimensional setting.
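To make the two-step idea concrete, first reducing the conditioning set to a few sufficient predictors and then testing conditional independence given only those, here is a deliberately crude skeleton. PCA stands in for the paper's nonlinear sufficient dimension reduction, and a partial-correlation check stands in for its nonparametric test; both substitutions are ours.

```python
import numpy as np

def sufficient_predictors(W, k=2):
    """Placeholder reduction: top-k principal components of the conditioning
    variables W. The paper instead obtains such predictors via nonlinear
    sufficient dimension reduction."""
    W = W - W.mean(axis=0)
    _, _, Vt = np.linalg.svd(W, full_matrices=False)
    return W @ Vt[:k].T

def ci_given(x, y, R, tol=0.05):
    """Crude check of x independent of y given the reduced predictors R:
    regress out R from both variables and correlate the residuals."""
    Rc = np.column_stack([np.ones(len(R)), R])
    rx = x - Rc @ np.linalg.lstsq(Rc, x, rcond=None)[0]
    ry = y - Rc @ np.linalg.lstsq(Rc, y, rcond=None)[0]
    return abs(np.corrcoef(rx, ry)[0, 1]) < tol
```

The point of the reduction is that `R` has only `k` columns regardless of how many variables the original conditioning set contains, which is what sidesteps the curse of dimensionality.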
Time-series datasets are central to numerous fields of science and engineering, such as biomedicine, Earth observation, and network analysis. Extensive research exists on state-space models (SSMs), powerful mathematical tools that allow for probabilistic and interpretable learning on time series. Estimating the model parameters of an SSM is arguably one of the most complicated tasks, and the inclusion of prior knowledge is known both to ease interpretation and to complicate inference. Very recent works have attempted to incorporate a graphical perspective on some of those model parameters, but they present notable limitations that this work addresses. More generally, existing graphical modeling tools are designed to incorporate either static information, focusing on statistical dependencies among independent random variables (e.g., the graphical Lasso approach), or dynamic information, emphasizing causal relationships among time series samples (e.g., graphical Granger approaches). However, there are no joint approaches combining static and dynamic graphical modeling within the context of SSMs. This work fills this gap by introducing a joint graphical modeling framework that bridges the static graphical Lasso model and a causal-based graphical approach for the linear-Gaussian SSM. We present DGLASSO (Dynamic Graphical Lasso), a new inference method within this framework that implements an efficient block alternating majorization-minimization algorithm. The algorithm's convergence is established by drawing on modern tools from nonlinear analysis. Experimental validation on synthetic and real weather variability data showcases the effectiveness of the proposed model and inference algorithm.
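The static half of the framework is the familiar graphical Lasso; a proximal-gradient sketch of that sub-problem is shown below. DGLASSO's actual contribution, coupling this penalty with the transition matrix of the linear-Gaussian SSM inside a block alternating majorization-minimization scheme, is not reproduced here, and the step size and penalty level are illustrative.

```python
import numpy as np

def soft_threshold(A, t):
    """Entrywise soft-thresholding, the proximal map of the l1 penalty."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def graphical_lasso_ista(S, lam=0.1, step=0.1, iters=500):
    """Proximal gradient on: tr(S @ Theta) - logdet(Theta) + lam * ||Theta||_1
    (off-diagonal penalty only), returning a sparse precision matrix whose
    support is read as a static conditional-independence graph."""
    p = S.shape[0]
    Theta = np.eye(p)
    for _ in range(iters):
        G = S - np.linalg.inv(Theta)            # gradient of the smooth part
        M = Theta - step * G
        new = soft_threshold(M, step * lam)
        np.fill_diagonal(new, np.diag(M))       # leave the diagonal unpenalized
        new = (new + new.T) / 2
        w, V = np.linalg.eigh(new)              # project back to the PD cone
        Theta = (V * np.clip(w, 1e-4, None)) @ V.T
    return Theta
```

In the joint model, an analogous sparsity-inducing step would also act on the state-transition matrix, yielding the dynamic, Granger-style graph.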
Causality can be described in terms of a structural causal model (SCM) that carries information on the variables of interest and their mechanistic relations. For most processes of interest the underlying SCM is only partially observable, so causal inference tries to leverage any exposed information. Graph neural networks (GNNs), as universal approximators on structured input, are a viable candidate for causal learning, suggesting a tighter integration with SCMs. To this end we present a theoretical analysis from first principles that establishes a novel connection between GNNs and SCMs while providing an extended view on general neural-causal models. We then establish a new model class for GNN-based causal inference that is necessary and sufficient for causal effect identification. Our empirical illustration on simulations and standard benchmarks validates our theoretical results.
The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.
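A tiny experiment makes the minimum-norm interpolating solution concrete: with more parameters than samples the quadratic loss has many global minima, and gradient descent (or gradient flow) initialised at zero converges to the one computed in closed form below. The dimensions and noise level are illustrative, and whether the resulting overfitting is benign depends on the feature covariance structure analysed in the survey.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 1000                        # overparametrized: d >> n
X = rng.normal(size=(n, d)) / np.sqrt(d)
w_star = np.zeros(d); w_star[0] = 1.0   # simple low-dimensional signal
y = X @ w_star + 0.1 * rng.normal(size=n)

# Minimum-norm interpolant w = X^T (X X^T)^{-1} y: the solution that
# gradient descent from zero selects among all interpolating solutions.
w = X.T @ np.linalg.solve(X @ X.T, y)

print("train error:", np.mean((X @ w - y) ** 2))        # ~0: perfect fit
X_test = rng.normal(size=(2000, d)) / np.sqrt(d)
print("test error:", np.mean((X_test @ w - X_test @ w_star) ** 2))
```

The decomposition described above corresponds to splitting `w` into a useful component aligned with the signal and a spiky component that absorbs the noise in `y`.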
Graph Neural Networks (GNNs), which generalize deep neural networks to graph-structured data, have drawn considerable attention and achieved state-of-the-art performance in numerous graph-related tasks. However, existing GNN models mainly focus on designing graph convolution operations, while graph pooling (or downsampling) operations, which play an important role in learning hierarchical representations, are usually overlooked. In this paper, we propose a novel graph pooling operator, called Hierarchical Graph Pooling with Structure Learning (HGP-SL), which can be integrated into various graph neural network architectures. HGP-SL incorporates graph pooling and structure learning into a unified module to generate hierarchical representations of graphs. More specifically, the graph pooling operation adaptively selects a subset of nodes to form an induced subgraph for the subsequent layers. To preserve the integrity of the graph's topological information, we further introduce a structure learning mechanism that learns a refined graph structure for the pooled graph at each layer. By combining the HGP-SL operator with graph neural networks, we perform graph-level representation learning with a focus on the graph classification task. Experimental results on six widely used benchmarks demonstrate the effectiveness of our proposed model.
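The node-selection half of such an operator resembles standard top-k pooling: score nodes, keep the top fraction, and take the induced subgraph. A minimal numpy sketch follows; the structure learning step that refines the pooled graph's edges, which is HGP-SL's distinctive component, is omitted, and all names are ours.

```python
import numpy as np

def topk_pool(A, X, score_w, ratio=0.5):
    """Score nodes with a learnable projection, keep the highest-scoring
    fraction, and return the induced subgraph with gated features."""
    scores = X @ score_w                          # one scalar score per node
    k = max(1, int(ratio * X.shape[0]))
    keep = np.sort(np.argsort(scores)[-k:])       # indices of retained nodes
    A_pool = A[np.ix_(keep, keep)]                # induced subgraph
    X_pool = X[keep] * np.tanh(scores[keep])[:, None]  # gate features by score
    return A_pool, X_pool

# Toy usage on a random 6-node graph with 3-dimensional features
rng = np.random.default_rng(0)
A = (rng.random((6, 6)) < 0.4).astype(float)
X = rng.normal(size=(6, 3))
A1, X1 = topk_pool(A, X, score_w=np.ones(3), ratio=0.5)
```

Stacking such a pooling layer between graph convolutions is what yields the hierarchical, graph-level representations used for classification.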