Many natural systems are observed as point patterns in time, space, or space and time. Examples include plant and cellular systems, animal colonies, earthquakes, and wildfires. In practice the locations of the points are not always observed correctly. However, in the point process literature, there has been relatively scant attention paid to the issue of errors in the location of points. In this paper, we discuss how the observed point pattern may deviate from the actual point pattern and review methods and models that exist to handle such deviations. The discussion is supplemented with several scientific illustrations.
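As a minimal sketch of how an observed pattern can deviate from the actual one, the following simulates a point pattern on the unit square and perturbs each location with independent Gaussian noise. The uniform pattern, the isotropic Gaussian error model, and the names `simulate_pattern`, `displace`, and `mean_displacement` are illustrative assumptions, not models taken from the paper.

```python
import math
import random

def simulate_pattern(n, rng):
    """Draw n points uniformly on the unit square (a binomial point pattern)."""
    return [(rng.random(), rng.random()) for _ in range(n)]

def displace(points, sigma, rng):
    """Add independent Gaussian location errors with standard deviation sigma."""
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)) for x, y in points]

def mean_displacement(true_pts, observed_pts):
    """Average Euclidean distance between each true point and its observed copy."""
    dists = [math.dist(p, q) for p, q in zip(true_pts, observed_pts)]
    return sum(dists) / len(dists)

rng = random.Random(0)
true_pts = simulate_pattern(500, rng)                   # the actual pattern
observed_pts = displace(true_pts, sigma=0.02, rng=rng)  # what is recorded
avg_error = mean_displacement(true_pts, observed_pts)
```

Under this error model the displacement length is Rayleigh distributed, so `avg_error` should be close to `sigma * sqrt(pi / 2)`. Other deviations, such as missed or spurious points, would need a different sketch.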
Partial differential equations with highly oscillatory input terms are hardly ever solvable analytically, and their numerical treatment is difficult. Modulated Fourier expansion, used as an {\it ansatz}, is a well-known and extensively investigated tool in the asymptotic-numerical approach to this kind of problem. Although the efficiency of this approach has been recognised, its error analysis has not been investigated rigorously for general forms of linear PDEs. In this paper, we begin such an investigation for a general form of linear PDEs with an input term characterised by a single high frequency. More precisely, we derive an analytical form of such an expansion and provide a formula for the error of its truncation. The theoretical investigations are illustrated by computational simulations.
In domains where sample sizes are limited, efficient learning algorithms are critical. Learning using privileged information (LuPI) offers increased sample efficiency by allowing prediction models access to auxiliary information at training time which is unavailable when the models are used. In recent work, it was shown that for prediction in linear-Gaussian dynamical systems, a LuPI learner with access to intermediate time series data is never worse and often better in expectation than any unbiased classical learner. We provide new insights into this analysis and generalize it to nonlinear prediction tasks in latent dynamical systems, extending theoretical guarantees to the case where the map connecting latent variables and observations is known up to a linear transform. In addition, we propose algorithms based on random features and representation learning for the case when this map is unknown. A suite of empirical results confirms the theoretical findings and shows the potential of using privileged time-series information in nonlinear prediction.
This paper studies the impact of the bootstrap procedure on the eigenvalue distributions of the sample covariance matrix under a high-dimensional factor structure. We provide asymptotic distributions for the top eigenvalues of the bootstrapped sample covariance matrix under mild conditions. After bootstrapping, the spiked eigenvalues, which are driven by common factors, converge weakly to Gaussian limits after proper scaling and centralization. However, the largest non-spiked eigenvalue is mainly determined by the order statistics of the bootstrap resampling weights and follows an extreme value distribution. Based on the disparate behavior of the spiked and non-spiked eigenvalues, we propose innovative methods to test the number of common factors. Extensive numerical and empirical studies indicate that the proposed methods perform reliably and convincingly in the presence of both weak factors and cross-sectionally correlated errors. Our technical details contribute to random matrix theory on the spiked covariance model with convexly decaying density and unbounded support, or with general elliptical distributions.
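The contrast between spiked and non-spiked eigenvalues under resampling can be illustrated with a small simulation. The sketch below uses the standard nonparametric bootstrap (rows resampled with replacement, which corresponds to multinomial resampling weights) on data from a hypothetical one-factor model. The dimensions, the loadings, and the function name `bootstrap_top_eigenvalues` are assumptions made for illustration and do not reproduce the paper's test procedure.

```python
import numpy as np

def bootstrap_top_eigenvalues(X, n_boot, rng, k=2):
    """Top-k eigenvalues of the sample covariance for each bootstrap resample of rows."""
    n = X.shape[0]
    out = np.empty((n_boot, k))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample observations with replacement
        Xb = X[idx] - X[idx].mean(axis=0)         # centre the resampled data
        S = Xb.T @ Xb / n                         # bootstrapped sample covariance
        out[b] = np.linalg.eigvalsh(S)[::-1][:k]  # eigvalsh returns ascending order
    return out

rng = np.random.default_rng(0)
n, p = 200, 50
loadings = rng.normal(size=p)                     # hypothetical factor loadings
factor = rng.normal(size=(n, 1))                  # one common factor
X = factor @ loadings[None, :] + rng.normal(size=(n, p))
eigs = bootstrap_top_eigenvalues(X, n_boot=100, rng=rng)
```

Column 0 of `eigs` tracks the spiked eigenvalue driven by the common factor, and column 1 tracks the largest non-spiked eigenvalue. Plotting histograms of the two columns is one way to inspect how differently they behave across resamples.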
Understanding fluid movement in multi-pored materials is vital for energy security and physiology. For instance, shale (a geological material) and bone (a biological material) exhibit multiple pore networks. Double porosity/permeability models provide a mechanics-based approach to describe hydrodynamics in such porous materials. However, current theoretical results primarily address the steady-state response, and their counterparts in the transient regime are still wanting. The primary aim of this paper is to fill this knowledge gap. We present three principal properties -- with rigorous mathematical arguments -- that the solutions under the double porosity/permeability model satisfy in the transient regime: backward-in-time uniqueness, reciprocity, and a variational principle. We employ the ``energy method'' -- by exploiting the physical total kinetic energy of the flowing fluid -- to establish the first property and Cauchy-Riemann convolutions to prove the next two. The results reported in this paper -- that qualitatively describe the dynamics of fluid flow in double-pored media -- have (a) theoretical significance, (b) practical applications, and (c) considerable pedagogical value. In particular, these results will benefit practitioners and computational scientists in checking the accuracy of numerical simulators. The backward-in-time uniqueness lays a firm theoretical foundation for pursuing inverse problems in which one predicts the prescribed initial conditions based on data available about the solution at a later instant.
We observe a large variety of robots in terms of their bodies, sensors, and actuators. Given the commonalities in the skill sets, teaching each skill to each different robot independently is inefficient and not scalable when the large variety in the robotic landscape is considered. If we can learn the correspondences between the sensorimotor spaces of different robots, we can expect that a skill learned on one robot can be more directly and easily transferred to other robots. In this paper, we propose a method to learn correspondences among two or more robots that may have different morphologies. To be specific, besides robots with similar morphologies with different degrees of freedom, we show that a fixed-base manipulator robot with joint control and a differential drive mobile robot can be addressed within the proposed framework. To set up the correspondence among the robots considered, an initial base task is demonstrated to the robots to achieve the same goal. Then, a common latent representation is learned along with the individual robot policies for achieving the goal. After the initial learning stage, the observation of a new task execution by one robot becomes sufficient to generate a latent space representation pertaining to the other robots to achieve the same task. We verified our system in a set of experiments where the correspondence between robots is learned (1) when the robots need to follow the same paths to achieve the same task, (2) when the robots need to follow different trajectories to achieve the same task, and (3) when complexities of the required sensorimotor trajectories are different for the robots. We also provide a proof-of-concept realization of correspondence learning between a real manipulator robot and a simulated mobile robot.
Two new hybrid algorithms are proposed for large-scale linear discrete ill-posed problems in general-form regularization. They are both based on Krylov subspace inner-outer iterative algorithms. At each iteration, they need to solve a linear least squares problem. It is proved that the inner linear least squares problems, which are solved by LSQR, become better conditioned as k increases, so LSQR converges faster. We also show how to choose the stopping tolerance for LSQR in order to guarantee that the computed solutions have the same accuracy as the exact best regularized solutions. Numerical experiments are given to show the effectiveness and efficiency of our new hybrid algorithms, and comparisons are made with the existing algorithm.
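For readers unfamiliar with general-form regularization, the following sketch solves a small general-form Tikhonov problem, min ||A x - b||^2 + lam^2 ||L x||^2, by rewriting it as one stacked least squares problem. It is not the hybrid inner-outer algorithm proposed in the paper: a dense solver (`numpy.linalg.lstsq`) stands in for LSQR, and the test matrix, the first-difference operator `L`, and the value of `lam` are assumptions chosen for illustration.

```python
import numpy as np

def first_difference(n):
    """(n-1) x n first-difference matrix, a common choice for L."""
    L = np.zeros((n - 1, n))
    i = np.arange(n - 1)
    L[i, i] = -1.0
    L[i, i + 1] = 1.0
    return L

def general_form_tikhonov(A, b, L, lam):
    """Solve min ||A x - b||^2 + lam^2 ||L x||^2 via the stacked least squares problem."""
    K = np.vstack([A, lam * L])                       # stack data and penalty blocks
    rhs = np.concatenate([b, np.zeros(L.shape[0])])   # penalty block targets zero
    x, *_ = np.linalg.lstsq(K, rhs, rcond=None)
    return x

rng = np.random.default_rng(1)
n = 40
t = np.linspace(0.0, 1.0, n)
A = np.exp(-50.0 * (t[:, None] - t[None, :]) ** 2) / n  # smoothing kernel, ill-conditioned
x_true = np.sin(2 * np.pi * t)
b = A @ x_true + 1e-4 * rng.normal(size=n)              # noisy right-hand side
L = first_difference(n)
x_reg = general_form_tikhonov(A, b, L, lam=1e-3)
```

The stacked formulation is equivalent to the normal equations (A^T A + lam^2 L^T L) x = A^T b. For large-scale problems an iterative solver would replace the dense call, which is where the conditioning and stopping tolerance of the inner solves become relevant.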
Finding the distribution of the velocities and pressures of a fluid (by solving the Navier-Stokes equations) is a principal task in the chemical, energy, and pharmaceutical industries, as well as in mechanical engineering and the design of pipeline systems. With existing solvers, such as OpenFOAM and Ansys, simulations of fluid dynamics in intricate geometries are computationally expensive and require re-simulation whenever the geometric parameters or the initial and boundary conditions are altered. Physics-informed neural networks are a promising tool for simulating fluid flows in complex geometries, as they can adapt to changes in the geometry and mesh definitions, allowing for generalization across different shapes. We present a hybrid quantum physics-informed neural network that simulates laminar fluid flows in 3D Y-shaped mixers. Our approach combines the expressive power of a quantum model with the flexibility of a physics-informed neural network, resulting in 21% higher accuracy compared to a purely classical neural network. Our findings highlight the potential of machine learning approaches, and in particular hybrid quantum physics-informed neural networks, for complex shape optimization tasks in computational fluid dynamics. By improving the accuracy of fluid simulations in complex geometries, our research using hybrid quantum models contributes to the development of more efficient and reliable fluid dynamics solvers.
Learning distance functions between complex objects, such as the Wasserstein distance to compare point sets, is a common goal in machine learning applications. However, functions on such complex objects (e.g., point sets and graphs) are often required to be invariant to a wide variety of group actions, e.g., permutation or rigid transformation. Therefore, continuous and symmetric product functions (such as distance functions) on such complex objects must also be invariant to the product of such group actions. We call these functions symmetric and factor-wise group invariant (or SFGI functions for short). In this paper, we first present a general neural network architecture for approximating SFGI functions. The main contribution of this paper combines this general neural network with a sketching idea to develop a specific and efficient neural network which can approximate the $p$-th Wasserstein distance between point sets. Importantly, the required model complexity is independent of the sizes of the input point sets. On the theoretical front, to the best of our knowledge, this is the first result showing that there exists a neural network with the capacity to approximate the Wasserstein distance with bounded model complexity. Our work provides an interesting integration of sketching ideas for geometric problems with universal approximation of symmetric functions. On the empirical front, we present a range of results showing that our newly proposed neural network architecture performs comparably to or better than other models (including a SOTA Siamese Autoencoder based approach). In particular, our neural network generalizes significantly better and trains much faster than the SOTA Siamese AE. Finally, this line of investigation could be useful in exploring effective neural network design for solving a broad range of geometric optimization problems (e.g., $k$-means in a metric space).
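To make the target function concrete, here is a minimal sketch of the exact $p$-th Wasserstein distance in the simplest setting: two equal-size point sets on the real line with uniform weights and $p \ge 1$, where the optimal matching pairs the sorted points in order. This is the quantity a network would approximate, not the neural architecture described above, and the function name `wasserstein_1d` and the one-dimensional restriction are assumptions made for brevity.

```python
def wasserstein_1d(xs, ys, p=1):
    """p-th Wasserstein distance between two equal-size point sets on the real line.

    With uniform weights and p >= 1, the optimal coupling in 1D matches
    the sorted points of one set to the sorted points of the other.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal-size point sets")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    cost = sum(abs(a - b) ** p for a, b in zip(xs_sorted, ys_sorted)) / len(xs)
    return cost ** (1.0 / p)

a = [0.0, 1.0, 3.0]
b = [5.0, 2.0, 3.0]              # the set a shifted by 2, listed in a different order
d1 = wasserstein_1d(a, b, p=1)
d2 = wasserstein_1d(a, b, p=2)
```

Because the inputs are sorted internally, reordering either set leaves the result unchanged, which is the permutation invariance the abstract refers to. In higher dimensions the matching no longer reduces to sorting and an assignment or transport solver is needed.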
How do score-based generative models (SBMs) learn the data distribution supported on a low-dimensional manifold? We investigate the score model of a trained SBM through its linear approximations and subspaces spanned by local feature vectors. During diffusion, as the noise decreases, the local dimensionality increases and becomes more varied between different sample sequences. Importantly, we find that the learned vector field mixes samples by a non-conservative field within the manifold, although it denoises with normal projections as if there is an energy function in off-manifold directions. At each noise level, the subspace spanned by the local features overlaps with an effective density function. These observations suggest that SBMs can flexibly mix samples with the learned score field while carefully maintaining a manifold-like structure of the data distribution.
We study Whitney-type estimates for approximation of convex functions in the uniform norm on various convex multivariate domains, paying particular attention to the dependence of the involved constants on the dimension and the geometry of the domain.