Application of Neural Networks to river hydraulics is fledgling, despite the field suffering from data scarcity, a challenge for machine learning techniques. Consequently, many purely data-driven Neural Networks proved to lack predictive capabilities. In this work, we propose to mitigate such problem by introducing physical information into the training phase. The idea is borrowed from Physics-Informed Neural Networks which have been recently proposed in other contexts. Physics-Informed Neural Networks embed physical information in the form of the residual of the Partial Differential Equations (PDEs) governing the phenomenon and, as such, are conceived as neural solvers, i.e. an alternative to traditional numerical solvers. Such approach is seldom suitable for environmental hydraulics, where epistemic uncertainties are large, and computing residuals of PDEs exhibits difficulties similar to those faced by classical numerical methods. Instead, we envisaged the employment of Neural Networks as neural operators, featuring physical constraints formulated without resorting to PDEs. The proposed novel methodology shares similarities with data augmentation and regularization. We show that incorporating such soft physical information can improve predictive capabilities.
The false confidence theorem establishes that, for any data-driven, precise-probabilistic method for uncertainty quantification, there exists (non-trivial) false hypotheses to which the method tends to assign high confidence. This raises concerns about the reliability of these widely-used methods, and shines new light on the consonant belief function-based methods that are provably immune to false confidence. But an existence result alone is insufficient. Towards a partial answer to the title question, I show that, roughly, complements of convex hypotheses are afflicted by false confidence.
With the advent of massive data sets much of the computational science and engineering community has moved toward data-intensive approaches in regression and classification. However, these present significant challenges due to increasing size, complexity and dimensionality of the problems. In particular, covariance matrices in many cases are numerically unstable and linear algebra shows that often such matrices cannot be inverted accurately on a finite precision computer. A common ad hoc approach to stabilizing a matrix is application of a so-called nugget. However, this can change the model and introduce error to the original solution. It is well known from numerical analysis that ill-conditioned matrices cannot be accurately inverted. In this paper we develop a multilevel computational method that scales well with the number of observations and dimensions. A multilevel basis is constructed adapted to a kD-tree partitioning of the observations. Numerically unstable covariance matrices with large condition numbers can be transformed into well conditioned multilevel ones without compromising accuracy. Moreover, it is shown that the multilevel prediction exactly solves the Best Linear Unbiased Predictor (BLUP) and Generalized Least Squares (GLS) model, but is numerically stable. The multilevel method is tested on numerically unstable problems of up to 25 dimensions. Numerical results show speedups of up to 42,050 times for solving the BLUP problem, but with the same accuracy as the traditional iterative approach. For very ill-conditioned cases the speedup is infinite. In addition, decay estimates of the multilevel covariance matrices are derived based on high dimensional interpolation techniques from the field of numerical analysis. This work lies at the intersection of statistics, uncertainty quantification, high performance computing and computational applied mathematics.
GeoAI has emerged as an exciting interdisciplinary research area that combines spatial theories and data with cutting-edge AI models to address geospatial problems in a novel, data-driven manner. While GeoAI research has flourished in the GIScience literature, its reproducibility and replicability (R&R), fundamental principles that determine the reusability, reliability, and scientific rigor of research findings, have rarely been discussed. This paper aims to provide an in-depth analysis of this topic from both computational and spatial perspectives. We first categorize the major goals for reproducing GeoAI research, namely, validation (repeatability), learning and adapting the method for solving a similar or new problem (reproducibility), and examining the generalizability of the research findings (replicability). Each of these goals requires different levels of understanding of GeoAI, as well as different methods to ensure its success. We then discuss the factors that may cause the lack of R&R in GeoAI research, with an emphasis on (1) the selection and use of training data; (2) the uncertainty that resides in the GeoAI model design, training, deployment, and inference processes; and more importantly (3) the inherent spatial heterogeneity of geospatial data and processes. We use a deep learning-based image analysis task as an example to demonstrate the results' uncertainty and spatial variance caused by different factors. The findings reiterate the importance of knowledge sharing, as well as the generation of a "replicability map" that incorporates spatial autocorrelation and spatial heterogeneity into consideration in quantifying the spatial replicability of GeoAI research.
Many analyses of multivariate data focus on evaluating the dependence between two sets of variables, rather than the dependence among individual variables within each set. Canonical correlation analysis (CCA) is a classical data analysis technique that estimates parameters describing the dependence between such sets. However, inference procedures based on traditional CCA rely on the assumption that all variables are jointly normally distributed. We present a semiparametric approach to CCA in which the multivariate margins of each variable set may be arbitrary, but the dependence between variable sets is described by a parametric model that provides low-dimensional summaries of dependence. While maximum likelihood estimation in the proposed model is intractable, we propose two estimation strategies: one using a pseudolikelihood for the model and one using a Markov chain Monte Carlo (MCMC) algorithm that provides Bayesian estimates and confidence regions for the between-set dependence parameters. The MCMC algorithm is derived from a multirank likelihood function, which uses only part of the information in the observed data in exchange for being free of assumptions about the multivariate margins. We apply the proposed Bayesian inference procedure to Brazilian climate data and monthly stock returns from the materials and communications market sectors.
Circuit complexity, defined as the minimum circuit size required for implementing a particular Boolean computation, is a foundational concept in computer science. Determining circuit complexity is believed to be a hard computational problem [1]. Recently, in the context of black holes, circuit complexity has been promoted to a physical property, wherein the growth of complexity is reflected in the time evolution of the Einstein-Rosen bridge (``wormhole'') connecting the two sides of an AdS ``eternal'' black hole [2]. Here we explore another link between complexity and thermodynamics for circuits of given functionality, making the physics-inspired approach relevant to real computational problems, for which functionality is the key element of interest. In particular, our thermodynamic framework provides a new perspective on the obfuscation of programs of arbitrary length -- an important problem in cryptography -- as thermalization through recursive mixing of neighboring sections of a circuit, which can be viewed as the mixing of two containers with ``gases of gates''. This recursive process equilibrates the average complexity and leads to the saturation of the circuit entropy, while preserving functionality of the overall circuit. The thermodynamic arguments hinge on ergodicity in the space of circuits which we conjecture is limited to disconnected ergodic sectors due to fragmentation. The notion of fragmentation has important implications for the problem of circuit obfuscation as it implies that there are circuits with same size and functionality that cannot be connected via local moves. Furthermore, we argue that fragmentation is unavoidable unless the complexity classes NP and coNP coincide, a statement that implies the collapse of the polynomial hierarchy of computational complexity theory to its first level.
This work presents a new algorithm to compute the matrix exponential within a given tolerance. Combined with the scaling and squaring procedure, the algorithm incorporates Taylor, partitioned and classical Pad\'e methods shown to be superior in performance to the approximants used in state-of-the-art software. The algorithm computes matrix--matrix products and also matrix inverses, but it can be implemented to avoid the computation of inverses, making it convenient for some problems. If the matrix A belongs to a Lie algebra, then exp(A) belongs to its associated Lie group, being a property which is preserved by diagonal Pad\'e approximants, and the algorithm has another option to use only these. Numerical experiments show the superior performance with respect to state-of-the-art implementations.
In a Jacobi--Davidson (JD) type method for singular value decomposition (SVD) problems, called JDSVD, a large symmetric and generally indefinite correction equation is approximately solved iteratively at each outer iteration, which constitutes the inner iterations and dominates the overall efficiency of JDSVD. In this paper, a convergence analysis is made on the minimal residual (MINRES) method for the correction equation. Motivated by the results obtained, a preconditioned correction equation is derived that extracts useful information from current searching subspaces to construct effective preconditioners for the correction equation and is proved to retain the same convergence of outer iterations of JDSVD. The resulting method is called inner preconditioned JDSVD (IPJDSVD) method. Convergence results show that MINRES for the preconditioned correction equation can converge much faster when there is a cluster of singular values closest to a given target, so that IPJDSVD is more efficient than JDSVD. A new thick-restart IPJDSVD algorithm with deflation and purgation is proposed that simultaneously accelerates the outer and inner convergence of the standard thick-restart JDSVD and computes several singular triplets of a large matrix. Numerical experiments justify the theory and illustrate the considerable superiority of IPJDSVD to JDSVD.
In the big data era, integrating diverse data modalities poses significant challenges, particularly in complex fields like healthcare. This paper introduces a new process model for multimodal Data Fusion for Data Mining, integrating embeddings and the Cross-Industry Standard Process for Data Mining with the existing Data Fusion Information Group model. Our model aims to decrease computational costs, complexity, and bias while improving efficiency and reliability. We also propose "disentangled dense fusion", a novel embedding fusion method designed to optimize mutual information and facilitate dense inter-modality feature interaction, thereby minimizing redundant information. We demonstrate the model's efficacy through three use cases: predicting diabetic retinopathy using retinal images and patient metadata, domestic violence prediction employing satellite imagery, internet, and census data, and identifying clinical and demographic features from radiography images and clinical notes. The model achieved a Macro F1 score of 0.92 in diabetic retinopathy prediction, an R-squared of 0.854 and sMAPE of 24.868 in domestic violence prediction, and a macro AUC of 0.92 and 0.99 for disease prediction and sex classification, respectively, in radiological analysis. These results underscore the Data Fusion for Data Mining model's potential to significantly impact multimodal data processing, promoting its adoption in diverse, resource-constrained settings.
This paper studies the convergence of a spatial semidiscretization of a three-dimensional stochastic Allen-Cahn equation with multiplicative noise. For non-smooth initial values, the regularity of the mild solution is investigated, and an error estimate is derived with the spatial $ L^2 $-norm. For smooth initial values, two error estimates with the general spatial $ L^q $-norms are established.
This study exploits information fusion in IoT systems and uses a clustering method to identify similarities in behaviours and key characteristics within each cluster. This approach facilitates early detection of behaviour changes and provides a more in-depth understanding of behaviour routines for continuous health monitoring.