The analysis of multivariate functional curves has the potential to yield important scientific discoveries in domains such as healthcare, medicine, economics and social sciences. However it is common for real-world settings to present data that are both sparse and irregularly sampled, and this introduces important challenges for the current functional data methodology. Here we propose a Bayesian hierarchical framework for multivariate functional principal component analysis which accommodates the intricacies of such sampling designs by flexibly pooling information across subjects and correlated curves. Our model represents common latent dynamics via shared functional principal component scores, thereby effectively borrowing strength across curves while circumventing the computationally challenging task of estimating covariance matrices. These scores also provide a parsimonious representation of the major modes of joint variation of the curves, and constitute interpretable scalar summaries that can be employed in follow-up analyses. We perform inference using a variational message passing algorithm which combines efficiency, modularity and approximate posterior density estimation, enabling the joint analysis of large datasets with parameter uncertainty quantification. We conduct detailed simulations to assess the effectiveness of our approach in sharing information under complex sampling designs. We also exploit it to estimate the molecular disease courses of individual patients with SARS-CoV-2 infection and characterise patient heterogeneity in recovery outcomes; this study reveals key coordinated dynamics across the immune, inflammatory and metabolic systems, which are associated with survival and long-COVID symptoms up to one year post disease onset. Our approach is implemented in the R package bayesFPCA.
We present a deterministic algorithm for the efficient evaluation of imaginary time diagrams based on the recently introduced discrete Lehmann representation (DLR) of imaginary time Green's functions. In addition to the efficient discretization of diagrammatic integrals afforded by its approximation properties, the DLR basis is separable in imaginary time, allowing us to decompose diagrams into linear combinations of nested sequences of one-dimensional products and convolutions. Focusing on the strong coupling bold-line expansion of generalized Anderson impurity models, we show that our strategy reduces the computational complexity of evaluating an $M$th-order diagram at inverse temperature $\beta$ and spectral width $\omega_{\max}$ from $\mathcal{O}((\beta \omega_{\max})^{2M-1})$ for a direct quadrature to $\mathcal{O}(M (\log (\beta \omega_{\max}))^{M+1})$, with controllable high-order accuracy. We benchmark our algorithm using third-order expansions for multi-band impurity problems with off-diagonal hybridization and spin-orbit coupling, presenting comparisons with exact diagonalization and quantum Monte Carlo approaches. In particular, we perform a self-consistent dynamical mean-field theory calculation for a three-band Hubbard model with strong spin-orbit coupling representing a minimal model of Ca$_2$RuO$_4$, demonstrating the promise of the method for modeling realistic strongly correlated multi-band materials. For both strong and weak coupling expansions of low and intermediate order, in which diagrams can be enumerated, our method provides an efficient, straightforward, and robust black-box evaluation procedure. In this sense, it fills a gap between diagrammatic approximations of the lowest order, which are simple and inexpensive but inaccurate, and those based on Monte Carlo sampling of high-order diagrams.
We define positive and strictly positive definite functions on a domain and study these functions on a list of regular domains. The list includes the unit ball, conic surface, hyperbolic surface, solid hyperboloid, and simplex. Each of these domains is embedded in a quadrant or a union of quadrants of the unit sphere by a distance preserving map, from which characterizations of positive definite and strictly positive definite functions are derived for these regular domains.
Laguerre spectral approximations play an important role in the development of efficient algorithms for problems in unbounded domains. In this paper, we present a comprehensive convergence rate analysis of Laguerre spectral approximations for analytic functions. By exploiting contour integral techniques from complex analysis, we prove that Laguerre projection and interpolation methods of degree $n$ converge at the root-exponential rate $O(\exp(-2\rho\sqrt{n}))$ with $\rho>0$ when the underlying function is analytic inside and on a parabola with focus at the origin and vertex at $z=-\rho^2$. As far as we know, this is the first rigorous proof of root-exponential convergence of Laguerre approximations for analytic functions. Several important applications of our analysis are also discussed, including Laguerre spectral differentiations, Gauss-Laguerre quadrature rules, the scaling factor and the Weeks method for the inversion of Laplace transform, and some sharp convergence rate estimates are derived. Numerical experiments are presented to verify the theoretical results.
Multiscale stochastic dynamical systems have been widely adopted to a variety of scientific and engineering problems due to their capability of depicting complex phenomena in many real world applications. This work is devoted to investigating the effective dynamics for slow-fast stochastic dynamical systems. Given observation data on a short-term period satisfying some unknown slow-fast stochastic systems, we propose a novel algorithm including a neural network called Auto-SDE to learn invariant slow manifold. Our approach captures the evolutionary nature of a series of time-dependent autoencoder neural networks with the loss constructed from a discretized stochastic differential equation. Our algorithm is also validated to be accurate, stable and effective through numerical experiments under various evaluation metrics.
We present and analyze an algorithm designed for addressing vector-valued regression problems involving possibly infinite-dimensional input and output spaces. The algorithm is a randomized adaptation of reduced rank regression, a technique to optimally learn a low-rank vector-valued function (i.e. an operator) between sampled data via regularized empirical risk minimization with rank constraints. We propose Gaussian sketching techniques both for the primal and dual optimization objectives, yielding Randomized Reduced Rank Regression (R4) estimators that are efficient and accurate. For each of our R4 algorithms we prove that the resulting regularized empirical risk is, in expectation w.r.t. randomness of a sketch, arbitrarily close to the optimal value when hyper-parameteres are properly tuned. Numerical expreriments illustrate the tightness of our bounds and show advantages in two distinct scenarios: (i) solving a vector-valued regression problem using synthetic and large-scale neuroscience datasets, and (ii) regressing the Koopman operator of a nonlinear stochastic dynamical system.
The statistical analysis of group studies in neuroscience is particularly challenging due to the complex spatio-temporal nature of the data, its multiple levels and the inter-individual variability in brain responses. In this respect, traditional ANOVA-based studies and linear mixed effects models typically provide only limited exploration of the dynamic of the group brain activity and variability of the individual responses potentially leading to overly simplistic conclusions and/or missing more intricate patterns. In this study we propose a novel method based on functional Principal Components Analysis and Bayesian model-based clustering to simultaneously assess group effects and individual deviations over the most important temporal features in the data. This method provides a thorough exploration of group differences and individual deviations in neuroscientific group studies without compromising on the spatio-temporal nature of the data. By means of a simulation study we demonstrate that the proposed model returns correct classification in different clustering scenarios under low and high of noise levels in the data. Finally we consider a case study using Electroencephalogram data recorded during an object recognition task where our approach provides new insights into the underlying brain mechanisms generating the data and their variability.
Computational effects are commonly modelled by monads, but often a monad can be presented by an algebraic theory of operations and equations. This talk is about monads and algebraic theories for languages for inference, and their connections to semirings and tensors. A basic class of examples of algebraic theories comes from considering the theory of modules for a semiring, e.g. the theory of unnormalized distributions, where the semiring is that of the non-negative real numbers. We propose that an interesting perspective is given by studying theories via semirings, and to this end explore several examples of subtheories of module theories, mostly relating to probability. Our main contribution concerns the commutative combination of effects, as studied by Hyland, Plotkin and Power: we observe that while the semiring tensor does not in general determine the tensor of subtheories of module theories, it still does in several fundamental probabilistic examples.
Generative AI, such as image generation models and large language models, stands to provide tremendous value to end-user programmers in creative and knowledge workflows. Current research methods struggle to engage end-users in a realistic conversation that balances the actually existing capabilities of generative AI with the open-ended nature of user workflows and the many opportunities for the application of this technology. In this work-in-progress paper, we introduce participatory prompting, a method for eliciting opportunities for generative AI in end-user workflows. The participatory prompting method combines a contextual inquiry and a researcher-mediated interaction with a generative model, which helps study participants interact with a generative model without having to develop prompting strategies of their own. We discuss the ongoing development of a study whose aim will be to identify end-user programming opportunities for generative AI in data analysis workflows.
In recent literature, for modeling reasons, fractional differential problems have been considered equipped with anti-symmetric boundary conditions. Twenty years ago the anti-reflective boundary conditions were introduced in a context of signal processing and imaging for increasing the quality of the reconstruction of a blurred signal/image contaminated by noise and for reducing the overall complexity to that of few fast sine transforms i.e. to $O(N\log N)$ real arithmetic operations, where $N$ is the number of pixels. Here we consider the anti-symmetric boundary conditions and we introduce the anti-reflective boundary conditions in the context of nonlocal problems of fractional differential type. In the latter context, we study both types of boundary conditions, which in reality are similar in the essentials, from the perspective of computational efficiency, by considering nontruncated and truncated versions. Several numerical tests, tables, and visualizations are provided and critically discussed.
Fractional calculus with respect to function $\psi$, also named as $\psi$-fractional calculus, generalizes the Hadamard and the Riemann-Liouville fractional calculi, which causes challenge in numerical treatment. In this paper we study spectral-type methods using mapped Jacobi functions (MJFs) as basis functions and obtain efficient algorithms to solve $\psi$-fractional differential equations. In particular, we setup the Petrov-Galerkin spectral method and spectral collocation method for initial and boundary value problems involving $\psi$-fractional derivatives. We develop basic approximation theory for the MJFs and conduct the error estimates of the derived methods. We also establish a recurrence relation to evaluate the collocation differentiation matrix for implementing the spectral collocation algorithm. Numerical examples confirm the theoretical results and demonstrate the effectiveness of the spectral and collocation methods.