Study of dynamical systems using partial state observation is an important problem due to its applicability to many real-world systems. We address the problem by proposing an echo state network (ESN) framework with partial state input with partial or full state output. Application to the Lorenz system and Chua's oscillator (both numerically simulated and experimental systems) demonstrate the effectiveness of our method. We show that the ESN, as an autonomous dynamical system, is capable of making short-term predictions up to a few Lyapunov times. However, the prediction horizon has high variability depending on the initial condition - an aspect that we explore in detail using the distribution of the prediction horizon. Further, using a variety of statistical metrics to compare the long-term dynamics of the ESN predictions with numerically simulated or experimental dynamics and observed similar results, we show that the ESN can effectively learn the system's dynamics even when trained with noisy numerical or experimental datasets. Thus, we demonstrate the potential of ESNs to serve as cheap surrogate models for simulating the dynamics of systems where complete observations are unavailable.
Testing cross-sectional independence in panel data models is of fundamental importance in econometric analysis with high-dimensional panels. Recently, econometricians began to turn their attention to the problem in the presence of serial dependence. The existing procedure for testing cross-sectional independence with serial correlation is based on the sum of the sample cross-sectional correlations, which generally performs well when the alternative has dense cross-sectional correlations, but suffers from low power against sparse alternatives. To deal with sparse alternatives, we propose a test based on the maximum of the squared sample cross-sectional correlations. Furthermore, we propose a combined test to combine the p-values of the max based and sum based tests, which performs well under both dense and sparse alternatives. The combined test relies on the asymptotic independence of the max based and sum based test statistics, which we show rigorously. We show that the proposed max based and combined tests have attractive theoretical properties and demonstrate the superior performance via extensive simulation results. We apply the two new tests to analyze the weekly returns on the securities in the S\&P 500 index under the Fama-French three-factor model, and confirm the usefulness of the proposed combined test in detecting cross-sectional independence.
A general a posteriori error analysis applies to five lowest-order finite element methods for two fourth-order semi-linear problems with trilinear non-linearity and a general source. A quasi-optimal smoother extends the source term to the discrete trial space, and more importantly, modifies the trilinear term in the stream-function vorticity formulation of the incompressible 2D Navier-Stokes and the von K\'{a}rm\'{a}n equations. This enables the first efficient and reliable a posteriori error estimates for the 2D Navier-Stokes equations in the stream-function vorticity formulation for Morley, two discontinuous Galerkin, $C^0$ interior penalty, and WOPSIP discretizations with piecewise quadratic polynomials.
Several fundamental problems in science and engineering consist of global optimization tasks involving unknown high-dimensional (black-box) functions that map a set of controllable variables to the outcomes of an expensive experiment. Bayesian Optimization (BO) techniques are known to be effective in tackling global optimization problems using a relatively small number objective function evaluations, but their performance suffers when dealing with high-dimensional outputs. To overcome the major challenge of dimensionality, here we propose a deep learning framework for BO and sequential decision making based on bootstrapped ensembles of neural architectures with randomized priors. Using appropriate architecture choices, we show that the proposed framework can approximate functional relationships between design variables and quantities of interest, even in cases where the latter take values in high-dimensional vector spaces or even infinite-dimensional function spaces. In the context of BO, we augmented the proposed probabilistic surrogates with re-parameterized Monte Carlo approximations of multiple-point (parallel) acquisition functions, as well as methodological extensions for accommodating black-box constraints and multi-fidelity information sources. We test the proposed framework against state-of-the-art methods for BO and demonstrate superior performance across several challenging tasks with high-dimensional outputs, including a constrained multi-fidelity optimization task involving shape optimization of rotor blades in turbo-machinery.
Unveiling the underlying governing equations of nonlinear dynamic systems remains a significant challenge, especially when encountering noisy observations and no prior knowledge available. This study proposes R-DISCOVER, a framework designed to robustly uncover open-form partial differential equations (PDEs) from limited and noisy data. The framework operates through two alternating update processes: discovering and embedding. The discovering phase employs symbolic representation and a reinforcement learning (RL)-guided hybrid PDE generator to efficiently produce diverse open-form PDEs with tree structures. A neural network-based predictive model fits the system response and serves as the reward evaluator for the generated PDEs. PDEs with superior fits are utilized to iteratively optimize the generator via the RL method and the best-performing PDE is selected by a parameter-free stability metric. The embedding phase integrates the initially identified PDE from the discovering process as a physical constraint into the predictive model for robust training. The traversal of PDE trees automates the construction of the computational graph and the embedding process without human intervention. Numerical experiments demonstrate our framework's capability to uncover governing equations from nonlinear dynamic systems with limited and highly noisy data and outperform other physics-informed neural network-based discovery methods. This work opens new potential for exploring real-world systems with limited understanding.
We use fixed point theory to analyze nonnegative neural networks, which we define as neural networks that map nonnegative vectors to nonnegative vectors. We first show that nonnegative neural networks with nonnegative weights and biases can be recognized as monotonic and (weakly) scalable functions within the framework of nonlinear Perron-Frobenius theory. This fact enables us to provide conditions for the existence of fixed points of nonnegative neural networks having inputs and outputs of the same dimension, and these conditions are weaker than those recently obtained using arguments in convex analysis. Furthermore, we prove that the shape of the fixed point set of nonnegative neural networks with nonnegative weights and biases is an interval, which under mild conditions degenerates to a point. These results are then used to obtain the existence of fixed points of more general nonnegative neural networks. From a practical perspective, our results contribute to the understanding of the behavior of autoencoders, and the main theoretical results are verified in numerical simulations using the Modified National Institute of Standards and Technology (MNIST) dataset.
We study scalable machine learning models for full event reconstruction in high-energy electron-positron collisions based on a highly granular detector simulation. Particle-flow (PF) reconstruction can be formulated as a supervised learning task using tracks and calorimeter clusters or hits. We compare a graph neural network and kernel-based transformer and demonstrate that both avoid quadratic memory allocation and computational cost while achieving realistic PF reconstruction. We show that hyperparameter tuning on a supercomputer significantly improves the physics performance of the models. We also demonstrate that the resulting model is highly portable across hardware processors, supporting Nvidia, AMD, and Intel Habana cards. Finally, we demonstrate that the model can be trained on highly granular inputs consisting of tracks and calorimeter hits, resulting in a competitive physics performance with the baseline. Datasets and software to reproduce the studies are published following the findable, accessible, interoperable, and reusable (FAIR) principles.
We introduce a novel ridge detection algorithm for time-frequency (TF) analysis, particularly tailored for intricate nonstationary time series encompassing multiple non-sinusoidal oscillatory components. The algorithm is rooted in the distinctive geometric patterns that emerge in the TF domain due to such non-sinusoidal oscillations. We term this method \textit{shape-adaptive mode decomposition-based multiple harmonic ridge detection} (\textsf{SAMD-MHRD}). A swift implementation is available when supplementary information is at hand. We demonstrate the practical utility of \textsf{SAMD-MHRD} through its application to a real-world challenge. We employ it to devise a cutting-edge walking activity detection algorithm, leveraging accelerometer signals from an inertial measurement unit across diverse body locations of a moving subject.
A method for sound field decomposition based on neural networks is proposed. The method comprises two stages: a sound field separation stage and a single-source localization stage. In the first stage, the sound pressure at microphones synthesized by multiple sources is separated into one excited by each sound source. In the second stage, the source location is obtained as a regression from the sound pressure at microphones consisting of a single sound source. The estimated location is not affected by discretization because the second stage is designed as a regression rather than a classification. Datasets are generated by simulation using Green's function, and the neural network is trained for each frequency. Numerical experiments reveal that, compared with conventional methods, the proposed method can achieve higher source-localization accuracy and higher sound-field-reconstruction accuracy.
Hashing has been widely used in approximate nearest search for large-scale database retrieval for its computation and storage efficiency. Deep hashing, which devises convolutional neural network architecture to exploit and extract the semantic information or feature of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated and I conclude three main different directions for deep supervised hashing methods. Several comments are made at the end. Moreover, to break through the bottleneck of the existing hashing methods, I propose a Shadow Recurrent Hashing(SRH) method as a try. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function to encourage similar images projected close. To this end, I propose a concept: shadow of the CNN output. During optimization process, the CNN output and its shadow are guiding each other so as to achieve the optimal solution as much as possible. Several experiments on dataset CIFAR-10 show the satisfying performance of SRH.
Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from the large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training class-specific models. To this end, we propose a two-stage, coarse-to-fine approach that will first use a 3D FCN to roughly define a candidate region, which will then be used as input to a second 3D FCN. This reduces the number of voxels the second FCN has to classify to ~10% and allows it to focus on more detailed segmentation of the organs and vessels. We utilize training and validation sets consisting of 331 clinical CT images and test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans, targeting three anatomical organs (liver, spleen, and pancreas). In challenging organs such as the pancreas, our cascaded approach improves the mean Dice score from 68.5 to 82.2%, achieving the highest reported average score on this dataset. We compare with a 2D FCN method on a separate dataset of 240 CT scans with 18 classes and achieve a significantly higher performance in small organs and vessels. Furthermore, we explore fine-tuning our models to different datasets. Our experiments illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results. Our code and trained models are available for download: //github.com/holgerroth/3Dunet_abdomen_cascade.