Temporally and spatially dependent uncertain parameters are regularly encountered in engineering applications. Commonly these uncertainties are accounted for using random fields and processes, which require knowledge about the appearing probability distributions functions that is not readily available. In these cases non-probabilistic approaches such as interval analysis and fuzzy set theory are helpful uncertainty measures. Partial differential equations involving fuzzy and interval fields are traditionally solved using the finite element method where the input fields are sampled using some basis function expansion methods. This approach however is problematic, as it is reliant on knowledge about the spatial correlation fields. In this work we utilize physics-informed neural networks (PINNs) to solve interval and fuzzy partial differential equations. The resulting network structures termed interval physics-informed neural networks (iPINNs) and fuzzy physics-informed neural networks (fPINNs) show promising results for obtaining bounded solutions of equations involving spatially and/or temporally uncertain parameter fields. In contrast to finite element approaches, no correlation length specification of the input fields as well as no Monte-Carlo simulations are necessary. In fact, information about the input interval fields is obtained directly as a byproduct of the presented solution scheme. Furthermore, all major advantages of PINNs are retained, i.e. meshfree nature of the scheme, and ease of inverse problem set-up.
Reference priors are theoretically attractive for the analysis of geostatistical data since they enable automatic Bayesian analysis and have desirable Bayesian and frequentist properties. But their use is hindered by computational hurdles that make their application in practice challenging. In this work, we derive a new class of default priors that approximate reference priors for the parameters of some Gaussian random fields. It is based on an approximation to the integrated likelihood of the covariance parameters derived from the spectral approximation of stationary random fields. This prior depends on the structure of the mean function and the spectral density of the model evaluated at a set of spectral points associated with an auxiliary regular grid. In addition to preserving the desirable Bayesian and frequentist properties, these approximate reference priors are more stable, and their computations are much less onerous than those of exact reference priors. Unlike exact reference priors, the marginal approximate reference prior of correlation parameter is always proper, regardless of the mean function or the smoothness of the correlation function. This property has important consequences for covariance model selection. An illustration comparing default Bayesian analyses is provided with a data set of lead pollution in Galicia, Spain.
Statistical divergences (SDs), which quantify the dissimilarity between probability distributions, are a basic constituent of statistical inference and machine learning. A modern method for estimating those divergences relies on parametrizing an empirical variational form by a neural network (NN) and optimizing over parameter space. Such neural estimators are abundantly used in practice, but corresponding performance guarantees are partial and call for further exploration. We establish non-asymptotic absolute error bounds for a neural estimator realized by a shallow NN, focusing on four popular $\mathsf{f}$-divergences -- Kullback-Leibler, chi-squared, squared Hellinger, and total variation. Our analysis relies on non-asymptotic function approximation theorems and tools from empirical process theory to bound the two sources of error involved: function approximation and empirical estimation. The bounds characterize the effective error in terms of NN size and the number of samples, and reveal scaling rates that ensure consistency. For compactly supported distributions, we further show that neural estimators of the first three divergences above with appropriate NN growth-rate are minimax rate-optimal, achieving the parametric convergence rate.
Radar sensors are crucial for environment perception of driver assistance systems as well as autonomous vehicles. With a rising number of radar sensors and the so far unregulated automotive radar frequency band, mutual interference is inevitable and must be dealt with. Algorithms and models operating on radar data are required to run the early processing steps on specialized radar sensor hardware. This specialized hardware typically has strict resource-constraints, i.e. a low memory capacity and low computational power. Convolutional Neural Network (CNN)-based approaches for denoising and interference mitigation yield promising results for radar processing in terms of performance. Regarding resource-constraints, however, CNNs typically exceed the hardware's capacities by far. In this paper we investigate quantization techniques for CNN-based denoising and interference mitigation of radar signals. We analyze the quantization of (i) weights and (ii) activations of different CNN-based model architectures. This quantization results in reduced memory requirements for model storage and during inference. We compare models with fixed and learned bit-widths and contrast two different methodologies for training quantized CNNs, i.e. the straight-through gradient estimator and training distributions over discrete weights. We illustrate the importance of structurally small real-valued base models for quantization and show that learned bit-widths yield the smallest models. We achieve a memory reduction of around 80\% compared to the real-valued baseline. Due to practical reasons, however, we recommend the use of 8 bits for weights and activations, which results in models that require only 0.2 megabytes of memory.
A theoretical, and potentially also practical, problem with stochastic gradient descent is that trajectories may escape to infinity. In this note, we investigate uniform boundedness properties of iterates and function values along the trajectories of the stochastic gradient descent algorithm and its important momentum variant. Under smoothness and $R$-dissipativity of the loss function, we show that broad families of step-sizes, including the widely used step-decay and cosine with (or without) restart step-sizes, result in uniformly bounded iterates and function values. Several important applications that satisfy these assumptions, including phase retrieval problems, Gaussian mixture models and some neural network classifiers, are discussed in detail.
The classic studies of causal emergence have revealed that in some Markovian dynamical systems, far stronger causal connections can be found on the higher-level descriptions than the lower-level of the same systems if we coarse-grain the system states in an appropriate way. However, identifying this emergent causality from the data is still a hard problem that has not been solved because the correct coarse-graining strategy can not be found easily. This paper proposes a general machine learning framework called Neural Information Squeezer to automatically extract the effective coarse-graining strategy and the macro-state dynamics, as well as identify causal emergence directly from the time series data. By decomposing a coarse-graining operation into two processes: information conversion and information dropping out, we can not only exactly control the width of the information channel, but also can derive some important properties analytically including the exact expression of the effective information of a macro-dynamics. We also show how our framework can extract the dynamics on different levels and identify causal emergence from the data on several exampled systems.
The development of a future, global quantum communication network (or quantum internet) will enable high rate private communication and entanglement distribution over very long distances. However, the large-scale performance of ground-based quantum networks (which employ photons as information carriers through optical-fibres) is fundamentally limited by fibre quality and link length, with the latter being a primary design factor for practical network architectures. While these fundamental limits are well established for arbitrary network topologies, the question of how to best design global architectures remains open. In this work, we introduce a large-scale quantum network model called weakly-regular architectures. Such networks are capable of idealising network connectivity, provide freedom to capture a broad class of spatial topologies and remain analytically treatable. This allows us to investigate the effectiveness of large-scale networks with consistent connective properties, and unveil critical conditions under which end-to-end rates remain optimal. Furthermore, through a strict performance comparison of ideal, ground-based quantum networks with that of realistic satellite quantum communication protocols, we establish conditions for which satellites can be used to outperform fibre-based quantum infrastructure; {rigorously proving the efficacy of satellite-based technologies for global quantum communications.
We introduce a neural implicit framework that bridges discrete differential geometry of triangle meshes and continuous differential geometry of neural implicit surfaces. It exploits the differentiable properties of neural networks and the discrete geometry of triangle meshes to approximate them as the zero-level sets of neural implicit functions. To train a neural implicit function, we propose a loss function that allows terms with high-order derivatives, such as the alignment between the principal directions, to learn more geometric details. During training, we consider a non-uniform sampling strategy based on the discrete curvatures of the triangle mesh to access points with more geometric details. This sampling implies faster learning while preserving geometric accuracy. We present the analytical differential geometry formulas for neural surfaces, such as normal vectors and curvatures. We use them to render the surfaces using sphere tracing. Additionally, we propose a network optimization based on singular value decomposition to reduce the number of parameters.
The layout optimization of the heat conduction is essential during design in engineering, especially for thermal sensible products. When the optimization algorithm iteratively evaluates different loading cases, the traditional numerical simulation methods used usually lead to a substantial computational cost. To effectively reduce the computational effort, data-driven approaches are used to train a surrogate model as a mapping between the prescribed external loads and various geometry. However, the existing model are trained by data-driven methods which requires intensive training samples that from numerical simulations and not really effectively solve the problem. Choosing the steady heat conduction problems as examples, this paper proposes a Physics-driven Convolutional Neural Networks (PD-CNN) method to infer the physical field solutions for random varied loading cases. After that, the Particle Swarm Optimization (PSO) algorithm is used to optimize the sizes and the positions of the hole masks in the prescribed design domain, and the average temperature value of the entire heat conduction field is minimized, and the goal of minimizing heat transfer is achieved. Compared with the existing data-driven approaches, the proposed PD-CNN optimization framework not only predict field solutions that are highly consistent with conventional simulation results, but also generate the solution space with without any pre-obtained training data.
Due to their increasing spread, confidence in neural network predictions became more and more important. However, basic neural networks do not deliver certainty estimates or suffer from over or under confidence. Many researchers have been working on understanding and quantifying uncertainty in a neural network's prediction. As a result, different types and sources of uncertainty have been identified and a variety of approaches to measure and quantify uncertainty in neural networks have been proposed. This work gives a comprehensive overview of uncertainty estimation in neural networks, reviews recent advances in the field, highlights current challenges, and identifies potential research opportunities. It is intended to give anyone interested in uncertainty estimation in neural networks a broad overview and introduction, without presupposing prior knowledge in this field. A comprehensive introduction to the most crucial sources of uncertainty is given and their separation into reducible model uncertainty and not reducible data uncertainty is presented. The modeling of these uncertainties based on deterministic neural networks, Bayesian neural networks, ensemble of neural networks, and test-time data augmentation approaches is introduced and different branches of these fields as well as the latest developments are discussed. For a practical application, we discuss different measures of uncertainty, approaches for the calibration of neural networks and give an overview of existing baselines and implementations. Different examples from the wide spectrum of challenges in different fields give an idea of the needs and challenges regarding uncertainties in practical applications. Additionally, the practical limitations of current methods for mission- and safety-critical real world applications are discussed and an outlook on the next steps towards a broader usage of such methods is given.
We present the problem of selecting relevant premises for a proof of a given statement. When stated as a binary classification task for pairs (conjecture, axiom), it can be efficiently solved using artificial neural networks. The key difference between our advance to solve this problem and previous approaches is the use of just functional signatures of premises. To further improve the performance of the model, we use dimensionality reduction technique, to replace long and sparse signature vectors with their compact and dense embedded versions. These are obtained by firstly defining the concept of a context for each functor symbol, and then training a simple neural network to predict the distribution of other functor symbols in the context of this functor. After training the network, the output of its hidden layer is used to construct a lower dimensional embedding of a functional signature (for each premise) with a distributed representation of features. This allows us to use 512-dimensional embeddings for conjecture-axiom pairs, containing enough information about the original statements to reach the accuracy of 76.45% in premise selection task, only with simple two-layer densely connected neural networks.