The author previously introduced models of linear logic known as ''Interaction Graphs'', which generalise Girard's various geometry of interaction constructions. In this work, we establish how these models rely essentially on a deep connection between zeta functions and the execution of programs, expressed as a cocycle. This is first shown in the simple case of graphs, before being lifted to dynamical systems. Focussing on probabilistic models, we then explain how the notion of graphings used in Interaction Graphs captures a natural class of sub-Markov processes. We then extend the realisability constructions and the notion of zeta function to provide a realisability model of second-order linear logic over the set of all (discrete-time) sub-Markov processes.
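For orientation, the zeta function of a finite directed graph admits a classical determinant form (the Bowen-Lanford identity, given here as a standard reference point rather than the paper's exact cocycle construction). Writing $A$ for the adjacency matrix,

```latex
\zeta_G(z) \;=\; \exp\!\Bigg(\sum_{k \ge 1} \frac{z^k}{k}\,\operatorname{tr}\!\big(A^k\big)\Bigg) \;=\; \frac{1}{\det(I - zA)},
```

where $\operatorname{tr}(A^k)$ counts the closed walks of length $k$ in the graph; it is this counting of cycles that connects zeta functions to the execution of programs.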
Within high-performance computing (HPC), solving large sparse linear systems efficiently remains paramount, with iterative methods being the predominant choice. However, the performance of these methods is tightly coupled to the aptness of the chosen preconditioner. The multifaceted nature of sparse matrices makes a universal prescription of preconditioners elusive. Notably, the key attribute of sparsity is not precisely captured by scalar metrics such as bandwidth or matrix dimensions. Advancing prior methodologies, this research introduces a representation of matrix sparsity patterns as RGB images. Using a convolutional neural network (CNN), the task of preconditioner selection becomes a multi-class classification problem. Extensive tests on 126 SuiteSparse matrices demonstrate the effectiveness of the CNN model, which achieves a 32% improvement in classification accuracy and a 25% reduction in computational slowdown.
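A minimal sketch of the classification setup follows; the input resolution (3x64x64 RGB renderings of the sparsity pattern), the number of candidate preconditioner classes (5), and the architecture are illustrative assumptions, not the paper's configuration.

```python
# Sketch: preconditioner selection as multi-class image classification.
import torch
import torch.nn as nn

class PreconditionerCNN(nn.Module):
    def __init__(self, num_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 128), nn.ReLU(),
            nn.Linear(128, num_classes),  # one logit per candidate preconditioner
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Usage: predict the best preconditioner class for a batch of sparsity images.
model = PreconditionerCNN()
images = torch.rand(8, 3, 64, 64)        # stand-in for rendered sparsity patterns
predicted = model(images).argmax(dim=1)  # index of the recommended preconditioner
```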
Every Model of High-Level Computation (MHC) has an underlying composition mechanism for combining simple computation devices into more complex ones. Composition can be done by (explicitly or implicitly) defining control flow, data flow or any combination thereof. Control flow specifies the order in which individual computations are activated, whereas data flow defines how data is exchanged among them. Unfortunately, traditional MHCs either mix data and control or only consider one dimension explicitly, which makes it difficult to reason about data flow and control flow separately. Reasoning about these dimensions orthogonally is a crucial desideratum for optimisation, maintainability and verification purposes. In this paper, we introduce a novel MHC that explicitly treats data flow and control flow as separate dimensions, while providing modularity. As the model is rooted in category theory, it provides category-theoretic operations for compositionally constructing sequential, parallel, branchial or iterative composites. Compositionality entails that a composite exhibits the same properties as its respective constituents, including separation of concerns and modularity. We conclude the paper by demonstrating how our proposal can be used to model high-level computations in two different application domains: software engineering and artificial intelligence.
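To make the separation of dimensions concrete, the following Python sketch (a loose illustration under our own naming, not the paper's category-theoretic constructions) shows combinators in which control flow is fixed by the combinator while data flow is carried by the components themselves:

```python
# Loose illustration: components are functions over data; combinators define
# control flow (ordering, independence, choice) separately from data flow.
from typing import Callable, Tuple, TypeVar

A = TypeVar("A"); B = TypeVar("B"); C = TypeVar("C"); D = TypeVar("D")

def seq(f: Callable[[A], B], g: Callable[[B], C]) -> Callable[[A], C]:
    """Sequential composite: g runs after f (control), data flows f -> g."""
    return lambda x: g(f(x))

def par(f: Callable[[A], B], g: Callable[[C], D]) -> Callable[[Tuple[A, C]], Tuple[B, D]]:
    """Parallel composite: independent data flows, no control ordering."""
    return lambda pair: (f(pair[0]), g(pair[1]))

def branch(p: Callable[[A], bool], f: Callable[[A], B], g: Callable[[A], B]) -> Callable[[A], B]:
    """Branching composite: control chooses f or g, data flow is unchanged."""
    return lambda x: f(x) if p(x) else g(x)

# Composites are themselves components, so composition nests freely.
pipeline = seq(branch(lambda n: n >= 0, lambda n: n, lambda n: -n), lambda n: n + 1)
assert pipeline(-3) == 4
```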
Gaussian processes (GPs) are the most common formalism for defining probability distributions over spaces of functions. While applications of GPs are myriad, a comprehensive understanding of GP sample paths, i.e. the function spaces on which they define a probability measure, is lacking. In practice, GPs are not constructed through a probability measure, but instead through a mean function and a covariance kernel. In this paper we provide necessary and sufficient conditions on the covariance kernel for the sample paths of the corresponding GP to attain a given regularity. We use the framework of H\"older regularity as it grants us particularly straightforward conditions, which simplify further in the cases of stationary and isotropic GPs. We then demonstrate that our results allow for novel and unusually tight characterisations of the sample path regularities of the GPs commonly used in machine learning applications, such as the Mat\'ern GPs.
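As a point of comparison, a classical sufficient condition in this spirit follows from Kolmogorov's continuity theorem (stated here for a stationary GP with kernel $k$; the paper's characterisation is sharper and also gives necessity). If $k(0) - k(h) \le C\,|h|^{2\alpha}$ for some $C > 0$ and $\alpha \in (0,1]$, then

```latex
\mathbb{E}\,\lvert X_t - X_s \rvert^{2}
\;=\; 2\big(k(0) - k(t-s)\big)
\;\le\; 2C\,\lvert t - s\rvert^{2\alpha},
```

and, since Gaussian increments have moments of all orders, Kolmogorov's theorem yields sample paths that are locally $\beta$-H\"older for every $\beta < \alpha$.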
Classical mathematical statistics deals with models that are parametrized by a Euclidean, i.e. finite-dimensional, parameter. Quite often such models have been and still are chosen in practical situations for their mathematical simplicity and tractability. However, these models are typically inappropriate since the implied distributional assumptions cannot be supported by hard evidence. It is natural then to relax these assumptions. This leads to the class of semiparametric models. These models have been studied in a local asymptotic setting, in which the Convolution Theorem yields bounds on the performance of regular estimators. Alternatively, local asymptotics can be based on the Local Asymptotic Minimax Theorem and on the Local Asymptotic Spread Theorem, both valid for any sequence of estimators. This Local Asymptotic Spread Theorem is a straightforward consequence of a Finite Sample Spread Inequality, which has some intrinsic value for estimation theory in general. We will discuss both the Finite Sample Spread Inequality and the Local Asymptotic Spread Theorem, as well as the Convolution Theorem.
In the fields of computer graphics, computer vision and photogrammetry, Neural Radiance Fields (NeRFs) are a major topic driving current research and development. However, the quality of NeRF-generated 3D scene reconstructions and subsequent surface reconstructions heavily relies on the network output, particularly the density. Regarding this critical aspect, we propose to utilize NeRF-Ensembles that provide a density uncertainty estimate alongside the mean density. We demonstrate that data constraints such as low-quality images and poses lead to a degradation of the training process, increased density uncertainty and decreased predicted density. Even with high-quality input data, the density uncertainty varies based on scene constraints such as acquisition constellations, occlusions and material properties. NeRF-Ensembles not only provide a tool for quantifying uncertainty but also exhibit two promising advantages: enhanced robustness and artifact removal. Through the utilization of NeRF-Ensembles instead of single NeRFs, small outliers are removed, yielding a smoother output with improved completeness of structures. Furthermore, applying percentile-based thresholds on density uncertainty outliers proves effective for the removal of large (foggy) artifacts in post-processing. We evaluate our methodology on three different datasets: (i) a synthetic benchmark dataset, (ii) a real benchmark dataset, and (iii) real data captured under realistic recording conditions and sensors.
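The percentile-based filtering step can be sketched as follows; the array shapes, function name, and the 95th-percentile default are illustrative assumptions, not the paper's exact post-processing pipeline.

```python
# Sketch: given per-member density predictions from an ensemble of
# independently trained NeRFs, compute the mean density and its uncertainty,
# then drop samples whose uncertainty exceeds a percentile threshold.
import numpy as np

def filter_by_density_uncertainty(densities: np.ndarray, percentile: float = 95.0):
    """densities: array of shape (n_models, n_samples) of predicted densities."""
    mean_density = densities.mean(axis=0)  # ensemble mean density
    uncertainty = densities.std(axis=0)    # per-sample density uncertainty
    threshold = np.percentile(uncertainty, percentile)
    keep = uncertainty <= threshold        # high-uncertainty samples = (foggy) artifacts
    return mean_density[keep], keep

densities = np.random.rand(5, 10_000)      # stand-in for 5 NeRF ensemble members
clean_density, mask = filter_by_density_uncertainty(densities, percentile=95.0)
```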
Data catalogs play a crucial role in modern data-driven organizations by facilitating the discovery, understanding, and utilization of diverse data assets. However, ensuring their quality and reliability is complex, especially in open and large-scale data environments. This paper proposes a framework to automatically determine the quality of open data catalogs, addressing the need for efficient and reliable quality assessment mechanisms. Our framework can analyze various core quality dimensions, such as accuracy, completeness, consistency, scalability, and timeliness; offer several alternatives for assessing compatibility and similarity across such catalogs; and implement a set of non-core quality dimensions such as provenance, readability, and licensing. The goal is to empower data-driven organizations to make informed decisions based on trustworthy and well-curated data assets. The source code that illustrates our approach can be downloaded from https://www.github.com/jorge-martinez-gil/dataq/.
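To illustrate the flavor of dimension scoring, here is a small Python sketch; the scoring formulas and field names are simple stand-ins of our own devising, not the definitions implemented in the dataq repository.

```python
# Sketch: score two quality dimensions over a toy catalog of metadata records.
def completeness(records: list[dict], required=("title", "description", "license")) -> float:
    """Fraction of required metadata fields that are filled across all records."""
    filled = sum(1 for r in records for f in required if r.get(f))
    return filled / (len(records) * len(required)) if records else 0.0

def timeliness(records: list[dict], max_age_days: int = 365) -> float:
    """Fraction of records updated within the allowed freshness window."""
    recent = sum(1 for r in records if r.get("age_days", max_age_days + 1) <= max_age_days)
    return recent / len(records) if records else 0.0

catalog = [
    {"title": "Air quality", "description": "Hourly PM2.5", "license": "CC-BY", "age_days": 30},
    {"title": "Traffic counts", "description": "", "license": None, "age_days": 900},
]
print(f"completeness={completeness(catalog):.2f}, timeliness={timeliness(catalog):.2f}")
```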
A Gr\"obner basis computation for the Weyl algebra with respect to a tropical term order and by using a homogenization-dehomogenization technique is sufficiently sluggish. A significant number of reductions to zero occur. To improve the computation, a tropical F5 algorithm is developed for this context. As a member of the family of signature-based algorithms, this algorithm keeps track of where Weyl algebra elements come from to anticipate reductions to zero. The total order for ordering module monomials or signatures in this paper is designed as close as possible to the definition of the tropical term order. As in Vaccon et al. (2021), this total order is not compatible with the tropical term order.
In this work, a novel Stackelberg game theoretic framework is proposed for trading energy bidirectionally between the demand-response (DR) aggregator and the prosumers. This formulation allows for flexible energy arbitrage and additional monetary rewards while ensuring that the prosumers' desired daily energy demand is met. Then, a scalable (linear in the number of prosumers), decentralized, privacy-preserving algorithm is proposed to find approximate equilibria with online sampling and learning of the prosumers' cumulative best response, which finds applications beyond this energy game. Moreover, cost bounds are provided on the quality of the approximate equilibrium solution. Finally, real data from the California day-ahead market and the UC Davis campus building energy demands are utilized to demonstrate the efficacy of the proposed framework and algorithm.
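A highly simplified sketch of the sampling-and-learning idea appears below; the linear best-response model, the price-update rule, and all parameter values are assumptions made for illustration, not the paper's algorithm or guarantees.

```python
# Sketch: the aggregator posts a price, samples the prosumers' cumulative best
# response at that price, and adjusts the price toward a target aggregate demand.
import random

def prosumer_best_response(price: float, flexibility: float) -> float:
    """Stand-in best response: demand decreases linearly with price."""
    return max(0.0, 10.0 - flexibility * price)

def aggregator_learn(n_prosumers=100, target=600.0, lr=1e-4, rounds=1000):
    price = 1.0
    flex = [random.uniform(0.5, 1.5) for _ in range(n_prosumers)]
    for _ in range(rounds):
        # Sample the cumulative best response at the current posted price;
        # only the aggregate is observed, preserving individual privacy.
        aggregate = sum(prosumer_best_response(price, f) for f in flex)
        price += lr * (aggregate - target)  # raise price while demand exceeds target
    return price, aggregate

price, demand = aggregator_learn()
```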
The fusion of causal models with deep learning, which introduces increasingly intricate data sets such as the causal associations within images or between textual components, has surfaced as a focal research area. Nonetheless, broadening the original causal concepts and theories to such complex, non-statistical data has been met with serious challenges. In response, our study proposes a redefinition of causal data into three distinct categories from the standpoint of causal structure and representation: definite data, semi-definite data, and indefinite data. Definite data chiefly pertains to statistical data used in conventional causal scenarios, while semi-definite data refers to a spectrum of data formats germane to deep learning, including time series, images, text, and others. Indefinite data is an emergent research sphere that we infer from the progression of data forms. To comprehensively present these three data paradigms, we elaborate on their formal definitions, the differences manifested in datasets, resolution pathways, and the development of research. We summarize key tasks and achievements pertaining to definite and semi-definite data from myriad research undertakings, and present a roadmap for indefinite data, beginning with its current research conundrums. Lastly, we classify and scrutinize the key datasets presently utilized within these three paradigms.
Deep neural networks have revolutionized many machine learning tasks in power systems, ranging from pattern recognition to signal processing. The data in these tasks is typically represented in Euclidean domains. Nevertheless, there is an increasing number of applications in power systems where data are collected from non-Euclidean domains and represented as graph-structured data with high-dimensional features and interdependency among nodes. The complexity of graph-structured data has brought significant challenges to the existing deep neural networks defined in Euclidean domains. Recently, many studies on extending deep neural networks to graph-structured data in power systems have emerged. In this paper, a comprehensive overview of graph neural networks (GNNs) in power systems is presented. Specifically, several classical paradigms of GNN structures (e.g., graph convolutional networks, graph recurrent neural networks, graph attention networks, graph generative networks, spatial-temporal graph convolutional networks, and hybrid forms of GNNs) are summarized, and key applications in power systems such as fault diagnosis, power prediction, power flow calculation, and data generation are reviewed in detail. Furthermore, the main issues and research trends concerning the application of GNNs in power systems are discussed.
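For concreteness, a minimal graph convolutional layer in the style of Kipf and Welling (2017) is sketched below; the bus count, feature dimensions, and random adjacency are illustrative assumptions, standing in for bus measurements and grid topology in a task such as fault diagnosis.

```python
# Sketch: one graph convolutional layer with symmetric normalisation,
# propagating node features as D^{-1/2}(A+I)D^{-1/2} X W.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        a_hat = adj + torch.eye(adj.size(0))          # add self-loops
        d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)       # inverse sqrt of node degrees
        norm = d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)
        return torch.relu(self.linear(norm @ x))

# Usage: 14 buses (nodes), 4 measurements per bus.
adj = (torch.rand(14, 14) > 0.7).float()
adj = ((adj + adj.T) > 0).float()                     # make the grid graph undirected
features = torch.rand(14, 4)
embeddings = GCNLayer(4, 16)(features, adj)
```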