High-dimensional complex multi-parameter problems are prevalent in engineering, exceeding the capabilities of traditional surrogate models designed for low/medium-dimensional problems. These models face the curse of dimensionality, resulting in decreased modeling accuracy as the design parameter space expands. Furthermore, the lack of a parameter decoupling mechanism hinders the identification of couplings between design variables, particularly in highly nonlinear cases. To address these challenges and enhance prediction accuracy while reducing sample demand, this paper proposes a PC-Kriging-HDMR approximate modeling method within the framework of Cut-HDMR. The method leverages the precision of PC-Kriging and optimizes test point placement through a multi-stage adaptive sequential sampling strategy. This strategy encompasses a first-stage adaptive proportional sampling criterion and a second-stage central-based maximum entropy criterion. Numerical tests and a practical application involving a cantilever beam demonstrate the advantages of the proposed method. Key findings include: (1) The performance of traditional single-surrogate models, such as Kriging, significantly deteriorates in high-dimensional nonlinear problems compared to combined surrogate models under the Cut-HDMR framework (e.g., Kriging-HDMR, PCE-HDMR, SVR-HDMR, MLS-HDMR, and PC-Kriging-HDMR); (2) The number of samples required for PC-Kriging-HDMR modeling increases polynomially rather than exponentially as the parameter space expands, resulting in substantial computational cost reduction; (3) Among existing Cut-HDMR methods, no single approach outperforms the others in all aspects. However, PC-Kriging-HDMR exhibits improved modeling accuracy and efficiency within the desired improvement range compared to PCE-HDMR and Kriging-HDMR, demonstrating robustness.
This study addresses a class of linear mixed-integer programming (MILP) problems that involve uncertainty in the objective function parameters. The parameters are assumed to form a random vector, whose probability distribution can only be observed through a finite training data set. Unlike most of the related studies in the literature, we also consider uncertainty in the underlying data set. The data uncertainty is described by a set of linear constraints for each random sample, and the uncertainty in the distribution (for a fixed realization of data) is defined using a type-1 Wasserstein ball centered at the empirical distribution of the data. The overall problem is formulated as a three-level distributionally robust optimization (DRO) problem. First, we prove that the three-level problem admits a single-level MILP reformulation, if the class of loss functions is restricted to biaffine functions. Secondly, it turns out that for several particular forms of data uncertainty, the outlined problem can be solved reasonably fast by leveraging the nominal MILP problem. Finally, we conduct a computational study, where the out-of-sample performance of our model and computational complexity of the proposed MILP reformulation are explored numerically for several application domains.
Advances in next-generation sequencing technology have enabled the high-throughput profiling of metagenomes and accelerated the microbiome study. Recently, there has been a rise in quantitative studies that aim to decipher the microbiome co-occurrence network and its underlying community structure based on metagenomic sequence data. Uncovering the complex microbiome community structure is essential to understanding the role of the microbiome in disease progression and susceptibility. Taxonomic abundance data generated from metagenomic sequencing technologies are high-dimensional and compositional, suffering from uneven sampling depth, over-dispersion, and zero-inflation. These characteristics often challenge the reliability of the current methods for microbiome community detection. To this end, we propose a Bayesian stochastic block model to study the microbiome co-occurrence network based on the recently developed modified centered-log ratio transformation tailored for microbiome data analysis. Our model allows us to incorporate taxonomic tree information using a Markov random field prior. The model parameters are jointly inferred by using Markov chain Monte Carlo sampling techniques. Our simulation study showed that the proposed approach performs better than competing methods even when taxonomic tree information is non-informative. We applied our approach to a real urinary microbiome dataset from postmenopausal women, the first time the urinary microbiome co-occurrence network structure has been studied. In summary, this statistical methodology provides a new tool for facilitating advanced microbiome studies.
We study a class of Gaussian processes for which the posterior mean, for a particular choice of data, replicates a truncated Taylor expansion of any order. The data consist of derivative evaluations at the expansion point and the prior covariance kernel belongs to the class of Taylor kernels, which can be written in a certain power series form. We discuss and prove some results on maximum likelihood estimation of parameters of Taylor kernels. The proposed framework is a special case of Gaussian process regression based on data that is orthogonal in the reproducing kernel Hilbert space of the covariance kernel.
We study the continuous multi-reference alignment model of estimating a periodic function on the circle from noisy and circularly-rotated observations. Motivated by analogous high-dimensional problems that arise in cryo-electron microscopy, we establish minimax rates for estimating generic signals that are explicit in the dimension $K$. In a high-noise regime with noise variance $\sigma^2 \gtrsim K$, for signals with Fourier coefficients of roughly uniform magnitude, the rate scales as $\sigma^6$ and has no further dependence on the dimension. This rate is achieved by a bispectrum inversion procedure, and our analyses provide new stability bounds for bispectrum inversion that may be of independent interest. In a low-noise regime where $\sigma^2 \lesssim K/\log K$, the rate scales instead as $K\sigma^2$, and we establish this rate by a sharp analysis of the maximum likelihood estimator that marginalizes over latent rotations. A complementary lower bound that interpolates between these two regimes is obtained using Assouad's hypercube lemma. We extend these analyses also to signals whose Fourier coefficients have a slow power law decay.
This paper deals with speeding up the convergence of a class of two-step iterative methods for solving linear systems of equations. To implement the acceleration technique, the residual norm associated with computed approximations for each sub-iterate is minimized over a certain two-dimensional subspace. Convergence properties of the proposed method are studied in detail. The approach is further developed to solve (regularized) normal equations arising from the discretization of ill-posed problems. The results of numerical experiments are reported to illustrate the performance of exact and inexact variants of the method on several test problems from different application areas.
The numerical solution of continuum damage mechanics (CDM) problems suffers from critical points during the material softening stage, and consequently existing iterative solvers are subject to a trade-off between computational expense and solution accuracy. Displacement-controlled arc-length methods were developed to address these challenges, but are currently applicable only to geometrically non-linear problems. In this work, we present a novel displacement-controlled arc-length (DAL) method for CDM problems in both local damage and non-local gradient damage versions. The analytical tangent matrix is derived for the DAL solver in both of the local and the non-local models. In addition, several consistent and non-consistent implementation algorithms are proposed, implemented, and evaluated. Unlike existing force-controlled arc-length solvers that monolithically scale the external force vector, the proposed method treats the external force vector as an independent variable and determines the position of the system on the equilibrium path based on all the nodal variations of the external force vector. Such a flexible approach renders the proposed solver to be substantially more efficient and versatile than existing solvers used in CDM problems. The considerable advantages of the proposed DAL algorithm are demonstrated against several benchmark 1D problems with sharp snap-backs and 2D examples with various boundary conditions and loading scenarios, where the proposed method drastically outperforms existing conventional approaches in terms of accuracy, computational efficiency, and the ability to predict the complete equilibrium path including all critical points.
Early sensory systems in the brain rapidly adapt to fluctuating input statistics, which requires recurrent communication between neurons. Mechanistically, such recurrent communication is often indirect and mediated by local interneurons. In this work, we explore the computational benefits of mediating recurrent communication via interneurons compared with direct recurrent connections. To this end, we consider two mathematically tractable recurrent linear neural networks that statistically whiten their inputs -- one with direct recurrent connections and the other with interneurons that mediate recurrent communication. By analyzing the corresponding continuous synaptic dynamics and numerically simulating the networks, we show that the network with interneurons is more robust to initialization than the network with direct recurrent connections in the sense that the convergence time for the synaptic dynamics in the network with interneurons (resp. direct recurrent connections) scales logarithmically (resp. linearly) with the spectrum of their initialization. Our results suggest that interneurons are computationally useful for rapid adaptation to changing input statistics. Interestingly, the network with interneurons is an overparameterized solution of the whitening objective for the network with direct recurrent connections, so our results can be viewed as a recurrent linear neural network analogue of the implicit acceleration phenomenon observed in overparameterized feedforward linear neural networks.
Neural dynamical systems with stable attractor structures, such as point attractors and continuous attractors, are hypothesized to underlie meaningful temporal behavior that requires working memory. However, working memory may not support useful learning signals necessary to adapt to changes in the temporal structure of the environment. We show that in addition to the continuous attractors that are widely implicated, periodic and quasi-periodic attractors can also support learning arbitrarily long temporal relationships. Unlike the continuous attractors that suffer from the fine-tuning problem, the less explored quasi-periodic attractors are uniquely qualified for learning to produce temporally structured behavior. Our theory has broad implications for the design of artificial learning systems and makes predictions about observable signatures of biological neural dynamics that can support temporal dependence learning and working memory. Based on our theory, we developed a new initialization scheme for artificial recurrent neural networks that outperforms standard methods for tasks that require learning temporal dynamics. Moreover, we propose a robust recurrent memory mechanism for integrating and maintaining head direction without a ring attractor.
The design of automatic speech pronunciation assessment can be categorized into closed and open response scenarios, each with strengths and limitations. A system with the ability to function in both scenarios can cater to diverse learning needs and provide a more precise and holistic assessment of pronunciation skills. In this study, we propose a Multi-task Pronunciation Assessment model called MultiPA. MultiPA provides an alternative to Kaldi-based systems in that it has simpler format requirements and better compatibility with other neural network models. Compared with previous open response systems, MultiPA provides a wider range of evaluations, encompassing assessments at both the sentence and word-level. Our experimental results show that MultiPA achieves comparable performance when working in closed response scenarios and maintains more robust performance when directly used for open responses.
Data-driven modeling is useful for reconstructing nonlinear dynamical systems when the underlying process is unknown or too expensive to compute. Having reliable uncertainty assessment of the forecast enables tools to be deployed to predict new scenarios unobserved before. In this work, we first extend parallel partial Gaussian processes for predicting the vector-valued transition function that links the observations between the current and next time points, and quantify the uncertainty of predictions by posterior sampling. Second, we show the equivalence between the dynamic mode decomposition and the maximum likelihood estimator of the linear mapping matrix in the linear state space model. The connection provides a data generating model of dynamic mode decomposition and thus, uncertainty of predictions can be obtained. Furthermore, we draw close connections between different data-driven models for approximating nonlinear dynamics, through a unified view of data generating models. We study two numerical examples, where the inputs of the dynamics are assumed to be known in the first example and the inputs are unknown in the second example. The examples indicate that uncertainty of forecast can be properly quantified, whereas model or input misspecification can degrade the accuracy of uncertainty quantification.