We present a lattice of distributed program specifications, whose ordering represents implementability/refinement. Specifications are modelled by families of subsets of relative execution traces, which encode the local orderings of state transitions rather than their absolute timing according to a global clock; this choice circumvents fundamental physical difficulties with synchronisation. The lattice of specifications is assembled and analysed with several established mathematical tools. Sets of nondegenerate cells of a simplicial set model relative traces, presheaves model the parametrisation of these traces by a topological space of variables, and information algebras reveal novel constraints on program correctness. The latter aspect brings the enterprise of program specification under the widening umbrella of contextual semantics introduced by Abramsky et al. In this model of program specifications, contextuality manifests as the failure of a consistency criterion comparable to Lamport's definition of sequential consistency. The theory of information algebras also suggests efficient local computation algorithms for verifying this criterion. The novel constructions in this paper have been verified in the proof assistant Isabelle/HOL.
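To make the refinement ordering concrete, the following drastically simplified sketch reads a specification as the set of relative traces it permits and refinement as trace-set inclusion. This is our illustration, not the paper's formalization: it omits the simplicial, presheaf, and information-algebra structure entirely, and all names are hypothetical.

```python
# Toy reading of the specification lattice: a specification is the set of
# relative traces (local orderings of transitions) it permits; refinement
# is trace-set inclusion. The richer structure of the paper is omitted.

SpecA = {("a", "b"), ("b", "a")}   # permits both interleavings of a and b
SpecB = {("a", "b")}               # insists that a precede b

def refines(impl: set, spec: set) -> bool:
    """impl refines spec iff every trace impl allows is allowed by spec."""
    return impl <= spec

assert refines(SpecB, SpecA) and not refines(SpecA, SpecB)

# Lattice operations under this simple reading:
meet = SpecA & SpecB   # greatest common refinement of the two specs
join = SpecA | SpecB   # least specification refined by both
```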
Specifications of complex, large-scale computer software and hardware systems can be radically simplified by using simple maps from input sequences to output values. These "state machine maps" provide an alternative representation of classical Moore-type state machines. Composition of state machine maps corresponds to state machine products and can be used to specify essentially any type of interconnection, as well as parallel and distributed computation. State machine maps can also specify abstract properties of systems and are significantly more concise and scalable than traditional representations of automata. The examples presented here include specifications of producer/consumer software, network distributed consensus, real-time digital circuits, and operating system scheduling. The motivation for this work comes from experience designing and developing operating systems and real-time software, where weak methods for understanding and exploring designs are a well-known handicap. The methods introduced here are based on ordinary discrete mathematics, primitive recursive functions, and deterministic state machines, and are intended, initially, to aid the intuition and understanding of system developers. Staying strictly within the boundaries of classical deterministic state machines anchors the methods to the algebraic structures of automata and semigroups, obviates any need for axiomatic deduction systems, "formal methods", or extensions to the model, and makes the specifications more faithful to engineering practice. While state machine maps are obvious representations of state machines, the techniques introduced here for defining and composing them are novel. To illustrate applications, the paper includes a fairly detailed specification and verification of the well-known "Paxos" distributed consensus algorithm, plus a number of simpler examples including a digital PID controller.
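As a minimal illustration of the idea (assuming the standard Moore-machine semantics; the machine and all names below are ours, not the paper's), a state machine map can be defined by primitive recursion on the input sequence, and composing maps specifies an interconnection of machines:

```python
# A "state machine map" sends a finite input sequence to an output value.
# Shown for a tiny Moore machine that outputs the running parity of 1-bits.

def delta(state: int, inp: int) -> int:
    """Transition function of the underlying Moore machine."""
    return state ^ inp                    # parity update

def output(state: int) -> int:
    """Moore output depends only on the current state."""
    return state

def parity(seq: tuple) -> int:
    """State machine map: input sequence -> output value."""
    state = 0                             # initial state
    for x in seq:
        state = delta(state, x)
    return output(state)

assert parity(()) == 0
assert parity((1, 0, 1, 1)) == 1

# Composition corresponds to a product machine: here a second copy of the
# machine consumes the prefix outputs of the first, specifying a pipeline.
def composed(seq: tuple) -> int:
    prefix_outputs = tuple(parity(seq[: i + 1]) for i in range(len(seq)))
    return parity(prefix_outputs)
```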
Data is a precious resource in today's society, and it is generated at an unprecedented and constantly growing pace. The need to store, analyze, and make data promptly available to a multitude of users introduces formidable challenges in modern software platforms. These challenges have radically transformed all research fields that gravitate around data management and processing, leading to distributed data-intensive systems that offer new programming models and implementation strategies to handle data characteristics such as volume, the rate at which it is produced, heterogeneity, and distribution. Each data-intensive system makes its own specific choices in terms of data model, usage assumptions, synchronization, processing strategy, deployment, and guarantees regarding consistency, fault tolerance, and ordering. Yet the problems data-intensive systems face and the solutions they propose frequently overlap. This paper proposes a unifying model that dissects the core functionalities of data-intensive systems and precisely discusses alternative design and implementation strategies, pointing out their assumptions and implications. The model offers a common ground for understanding and comparing highly heterogeneous solutions, with the potential to foster cross-fertilization across research communities and advance the field. We apply our model by classifying tens of systems: an exercise that leads to interesting observations on current trends in the domain of data-intensive systems and suggests open research directions.
We introduce the mean inverse integrator (MII), a novel approach to increasing accuracy when training neural networks to approximate vector fields of dynamical systems from noisy data. The method averages multiple trajectories obtained by numerical integrators such as Runge-Kutta methods. We show that the class of mono-implicit Runge-Kutta (MIRK) methods has particular advantages when used in connection with MII. When training vector field approximations, explicit expressions for the loss functions are obtained by inserting the training data into the MIRK formulae, unlocking symmetric, high-order integrators that would otherwise be implicit for initial value problems. The combined approach of applying MIRK within MII yields a significantly lower error compared to the plain use of the numerical integrator without averaging the trajectories. This is demonstrated with experiments using data from several (chaotic) Hamiltonian systems. Additionally, we perform a sensitivity analysis of the loss functions under normally distributed perturbations, supporting the favorable performance of MII.
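To see why MIRK formulae yield explicit losses, consider the implicit midpoint rule, the simplest mono-implicit method: its only stage depends on y_n and y_{n+1}, both of which are observed in the training data, so the step residual can be evaluated without a nonlinear solve. The sketch below is our illustration under these assumptions, not the authors' implementation, and it omits the trajectory averaging that MII adds on top:

```python
import numpy as np

def midpoint_residual_loss(f_theta, y0, y1, h):
    """One-step residual of the implicit midpoint rule (a MIRK scheme)
    used as a training loss. Because y0 and y1 are both observed data,
    the otherwise implicit stage (y0 + y1) / 2 is explicit, so no
    nonlinear solve is needed during training.

    f_theta : callable, the learned vector-field approximation
    y0, y1  : consecutive (noisy) snapshots of the state, shape (d,)
    h       : step size between the snapshots
    """
    residual = y1 - y0 - h * f_theta(0.5 * (y0 + y1))
    return np.sum(residual ** 2)

# Toy linear vector field standing in for the network:
A = np.array([[0.0, 1.0], [-1.0, 0.0]])     # harmonic oscillator
f = lambda y: A @ y
print(midpoint_residual_loss(f, np.array([1.0, 0.0]),
                             np.array([0.995, -0.0998]), h=0.1))
```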
Federated learning across satellites offers several advantages. First, it ensures data privacy and security, as sensitive data remains on the satellites and is not transmitted to a central location; this is particularly important when dealing with sensitive or classified information. Second, federated learning allows satellites to collectively learn from a diverse set of data sources, benefiting from the distributed knowledge across the satellite network. Last, federated learning reduces the communication bandwidth required between satellites and the central server, as only model updates are exchanged instead of raw data. By leveraging federated learning, satellites can collaborate and continuously improve their machine learning models while preserving data privacy and minimizing communication overhead. This enables the development of more intelligent and efficient satellite systems for applications such as Earth observation, weather forecasting, and space exploration.
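The exchange pattern described above can be sketched as a FedAvg-style round: each satellite trains locally and only model weights travel to the server. This is a toy, in-process simulation with illustrative names; real deployments add scheduling, intermittent links, and secure aggregation.

```python
import numpy as np

def local_update(weights, local_data, lr=0.1, epochs=1):
    """Train locally on one satellite; raw data never leaves this function.
    Toy model: linear regression, full-batch gradient descent."""
    X, y = local_data
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w                              # only the model update is sent

def federated_round(global_w, satellites):
    """Central server averages the satellites' model updates (FedAvg)."""
    updates = [local_update(global_w, data) for data in satellites]
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
satellites = []
for _ in range(5):                        # five satellites, disjoint data
    X = rng.normal(size=(32, 2))
    satellites.append((X, X @ true_w + 0.01 * rng.normal(size=32)))

w = np.zeros(2)
for _ in range(50):                       # communication rounds
    w = federated_round(w, satellites)
print(w)                                  # approaches true_w
```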
We study the problem of adaptively identifying patient subpopulations that benefit from a given treatment during a confirmatory clinical trial. This type of adaptive clinical trial has been thoroughly studied in biostatistics, but designs have so far been allowed only limited adaptivity. Here, we aim to relax classical restrictions on such designs and investigate how to incorporate ideas from the recent machine learning literature on adaptive and online experimentation to make trials more flexible and efficient. We find that the unique characteristics of the subpopulation selection problem -- most importantly that (i) one is usually interested in finding subpopulations with any treatment benefit (and not necessarily the single subgroup with the largest effect) given a limited budget, and that (ii) effectiveness only has to be demonstrated across the subpopulation on average -- give rise to interesting challenges and new desiderata when designing algorithmic solutions. Building on these findings, we propose AdaGGI and AdaGCPI, two meta-algorithms for subpopulation construction. We empirically investigate their performance across a range of simulation scenarios and derive insights into their (dis)advantages across different settings.
Companies' adoption of artificial intelligence (AI) is increasingly becoming an essential element of business success. However, using AI poses new requirements for companies and their employees, including transparency and comprehensibility of AI systems. The field of Explainable AI (XAI) aims to address these issues. Yet, current research primarily consists of laboratory studies, and the applicability of its findings to real-world situations needs to improve. Therefore, this project report provides insights into employees' needs for and attitudes towards (X)AI, based on an investigation of employees' perspectives. Our findings suggest that AI and XAI are terms well known to employees and perceived by them as important. This recognition is a critical first step for XAI to potentially drive successful usage of AI by providing comprehensible insights into AI technologies. In a lessons-learned section, we discuss the open questions identified and suggest future research directions for developing human-centered XAI designs for companies. By providing insights into employees' needs and attitudes towards (X)AI, our project report contributes to the development of XAI solutions that meet the requirements of companies and their employees, ultimately driving the successful adoption of AI technologies in the business context.
Learned inverse problem solvers exhibit remarkable performance in applications such as image reconstruction. These data-driven reconstruction methods often follow a two-step scheme: first, the reconstruction scheme, often based on a neural network, is trained on a dataset; second, the trained scheme is applied to new measurements to obtain reconstructions. We follow these steps but parameterize the reconstruction scheme with invertible residual networks (iResNets). We demonstrate that invertibility enables investigating the influence of training and architecture choices on the resulting reconstruction scheme. For example, assuming local approximation properties of the network, we show that these schemes become convergent regularizations. In addition, the investigations reveal a formal link to the linear regularization theory of linear inverse problems and provide a nonlinear spectral regularization for particular architecture classes. On the numerical side, we investigate the local approximation property of selected trained architectures and present a series of experiments on the MNIST dataset that underpin and extend our theoretical findings.
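For intuition, a residual block x -> x + g(x) is invertible whenever g is a contraction, and the inverse can be computed by Banach fixed-point iteration; this is the core mechanism behind iResNets. The following minimal sketch enforces the Lipschitz bound by a crude spectral-norm rescaling of a fixed linear layer. It is illustrative only, not the trained architectures studied in the paper.

```python
import numpy as np

class InvertibleResidualBlock:
    """x -> x + g(x) with Lip(g) < 1, hence invertible (iResNet block)."""

    def __init__(self, dim, rng, lip=0.9):
        W = rng.normal(size=(dim, dim))
        # Crude Lipschitz control: rescale so the linear map has spectral
        # norm `lip` < 1 (tanh is 1-Lipschitz), making g a contraction.
        W *= lip / np.linalg.norm(W, 2)
        self.W, self.b = W, np.zeros(dim)

    def g(self, x):
        return np.tanh(x @ self.W.T + self.b)

    def forward(self, x):
        return x + self.g(x)

    def inverse(self, y, iters=300):
        """Banach fixed-point iteration: x_{k+1} = y - g(x_k)."""
        x = y.copy()
        for _ in range(iters):
            x = y - self.g(x)
        return x

rng = np.random.default_rng(0)
block = InvertibleResidualBlock(dim=4, rng=rng)
x = rng.normal(size=4)
assert np.allclose(block.inverse(block.forward(x)), x, atol=1e-8)
```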
Existing recommender systems extract user preferences by learning correlations in data, such as behavioral correlations in collaborative filtering, or feature-feature and feature-behavior correlations in click-through rate prediction. However, the real world is driven by causality rather than correlation, and correlation does not imply causation. For example, a recommender system may suggest a battery charger to a user who has just bought a phone: the phone purchase is the cause of the charger purchase, and this causal relation cannot be reversed. Recently, to address this, researchers in recommender systems have begun to utilize causal inference to extract causality, thereby enhancing recommender systems. In this survey, we comprehensively review the literature on causal inference-based recommendation. We first present the fundamental concepts of both recommendation and causal inference as the basis for the later content, and raise the typical issues that non-causal recommendation faces. Afterward, we comprehensively review existing work on causal inference-based recommendation, organized by a taxonomy of the kinds of problems that causal inference addresses. Last, we discuss open problems in this important research area, along with interesting directions for future work.
The demand for artificial intelligence has grown significantly over the last decade, and this growth has been fueled by advances in machine learning techniques and the ability to leverage hardware acceleration. However, in order to increase the quality of predictions and render machine learning solutions feasible for more complex applications, a substantial amount of training data is required. Although small machine learning models can be trained with modest amounts of data, the input for training larger models such as neural networks grows exponentially with the number of parameters. Since the demand for processing training data has outpaced the increase in computational power of computing machinery, the machine learning workload needs to be distributed across multiple machines, turning the centralized system into a distributed one. These distributed systems present new challenges, first and foremost the efficient parallelization of the training process and the maintenance of a coherent model. This article provides an extensive overview of the current state of the art in the field by outlining the challenges and opportunities of distributed machine learning over conventional (centralized) machine learning, discussing the techniques used for distributed machine learning, and providing an overview of the available systems.
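A common baseline among the techniques surveyed is synchronous data-parallel training: each worker computes a gradient on its data shard, and an all-reduce averages the gradients so that all workers advance one shared, coherent model. The minimal single-process simulation below (illustrative names, least-squares toy model) sketches one such step:

```python
import numpy as np

def worker_gradient(w, shard):
    """Gradient of a least-squares loss on one worker's data shard."""
    X, y = shard
    return 2 * X.T @ (X @ w - y) / len(y)

def sync_sgd_step(w, shards, lr=0.1):
    """Synchronous data parallelism: all-reduce (here: a mean) of the
    per-worker gradients, then one shared update to the global model."""
    grads = [worker_gradient(w, s) for s in shards]   # ideally in parallel
    return w - lr * np.mean(grads, axis=0)

rng = np.random.default_rng(1)
true_w = np.array([1.0, 3.0, -2.0])
X = rng.normal(size=(3000, 3))
y = X @ true_w + 0.01 * rng.normal(size=3000)
shards = [(X[i::4], y[i::4]) for i in range(4)]       # 4 workers

w = np.zeros(3)
for _ in range(200):
    w = sync_sgd_step(w, shards)
print(w)                                              # close to true_w
```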
In recent years, mobile devices have seen rapid development, with stronger computation capabilities and larger storage. Some computation-intensive machine learning and deep learning tasks can now run on mobile devices. To take advantage of the resources available on mobile devices and to preserve users' privacy, the idea of mobile distributed machine learning has been proposed. It uses local hardware resources and local data to solve machine learning sub-problems on mobile devices, and uploads only computation results, rather than original data, to contribute to the optimization of the global model. This architecture not only relieves the computation and storage burden on servers, but also protects users' sensitive information. Another benefit is reduced bandwidth consumption, as various kinds of local data can now participate in the training process without being uploaded to the server. In this paper, we provide a comprehensive survey of recent studies on mobile distributed machine learning. We survey a number of widely used mobile distributed machine learning methods, and present an in-depth discussion of the challenges and future directions in this area. We believe that this survey offers a clear overview of mobile distributed machine learning and provides guidelines for applying it to real applications.