This article presents the openCFS submodule scattered data reader for coupling multi-physical simulations performed in different simulation programs. An example is the forward coupling of a surface vibration simulation (mechanical system) to an acoustic propagation simulation that uses time-dependent acoustic absorbing material as a noise mitigation measure. The nearest-neighbor search between the target and source points of the interpolation is performed using the FLANN or the CGAL library. In doing so, the coupled field (e.g., surface velocity) is interpolated from a source representation, consisting of field values stored and organized in a file directory, to a target representation, which is the set of quadrature points in the case of the finite element method. A test case of the functionality, called "Abc2dcsvt", is provided in the "testsuite" module of the openCFS software. This scattered data reader module was successfully applied in numerous studies on flow-induced sound generation. This short article describes the functionality and usability of the module.
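As a rough illustration of the source-to-target mapping described above, the following minimal sketch assigns to each target point the field value of its nearest source point. It is not the openCFS implementation: the brute-force search stands in for FLANN or CGAL, the interpolation scheme (copying the nearest value) is an assumption, and the function name and the sample data are hypothetical.

```python
import math


def nearest_neighbor_interpolate(source_points, source_values, target_points):
    """Assign to each target point the value of its closest source point.

    Brute-force search, used here only as a stand-in for a FLANN or CGAL query.
    """
    result = []
    for target in target_points:
        # Index of the source point with the smallest Euclidean distance.
        best_index = min(
            range(len(source_points)),
            key=lambda i: math.dist(source_points[i], target),
        )
        result.append(source_values[best_index])
    return result


# Hypothetical data: three source points carrying a scalar field value,
# and three target points playing the role of quadrature points.
source_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
source_values = [10.0, 20.0, 30.0]
target_points = [(0.1, 0.1), (0.9, 0.2), (0.2, 0.8)]

interpolated = nearest_neighbor_interpolate(
    source_points, source_values, target_points
)
print(interpolated)
```

For large point sets, a spatial index such as a k-d tree would replace the linear scan; the mapping logic stays the same.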
We introduce a flexible method to simultaneously infer both the drift and volatility functions of a discretely observed scalar diffusion. We introduce spline bases to represent these functions and develop a Markov chain Monte Carlo algorithm to infer, a posteriori, the coefficients of these functions in the spline basis. A key innovation is that we use spline bases to model transformed versions of the drift and volatility functions rather than the functions themselves. The output of the algorithm is a posterior sample of plausible drift and volatility functions that are not constrained to any particular parametric family. The flexibility of this approach provides practitioners a powerful investigative tool, allowing them to posit a variety of parametric models to better capture the underlying dynamics of their processes of interest. We illustrate the versatility of our method by applying it to challenging datasets from finance, paleoclimatology, and astrophysics. In view of the parametric diffusion models widely employed in the literature for those examples, some of our results are surprising since they call into question some aspects of these models.
In this paper, we revisit McFadden (1978)'s correction factor for sampling of alternatives in multinomial logit (MNL) and mixed multinomial logit (MMNL) models. McFadden (1978) proved that consistent parameter estimates are obtained when estimating MNL models using a sampled subset of alternatives, including the chosen alternative, in combination with a correction factor. We decompose this correction factor into i) a correction for overestimating the MNL choice probability due to using a smaller subset of alternatives, and ii) a correction for which subset of alternatives is contrasted through utility differences and thereby the extent to which we learn about the parameters of interest in MNL. Keane and Wasi (2016) proved that the overall expected positive information divergence, comprising the above two elements, is minimised between the true and sampled likelihood when applying a sampling protocol satisfying uniform conditioning. We generalise their result to the case of positive conditioning and show that whilst McFadden (1978)'s correction factor may not minimise the overall expected information divergence, it does minimise the expected information loss with respect to the parameters of interest. We apply this result in the context of Bayesian analysis and show that McFadden (1978)'s correction factor minimises the expected information loss regarding the parameters of interest across the entire posterior density irrespective of sample size. In other words, McFadden (1978)'s correction factor has desirable small and large sample properties. We also show that our results for Bayesian MNL models transfer to MMNL and that McFadden (1978)'s correction factor alone is sufficient to minimise the expected information loss in the parameters of interest. Monte Carlo simulations illustrate the successful application of sampling of alternatives in Bayesian MMNL models.
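To make the role of the correction factor concrete, the sketch below evaluates MNL choice probabilities over a sampled subset D, adding ln pi(D | j) to each utility before normalising, which is the standard form of McFadden's correction. The function name, the utilities, and the sampling probabilities are hypothetical and chosen only for illustration.

```python
import math


def sampled_mnl_probabilities(utilities, log_sampling_probs):
    """Choice probabilities over a sampled subset D of alternatives.

    utilities[j]          : systematic utility V_j of alternative j in D
    log_sampling_probs[j] : ln pi(D | j), the log-probability of drawing D
                            given that j is the chosen alternative
    """
    corrected = [v + c for v, c in zip(utilities, log_sampling_probs)]
    shift = max(corrected)  # subtract the maximum for numerical stability
    weights = [math.exp(x - shift) for x in corrected]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical utilities for three sampled alternatives.
utilities = [1.0, 0.5, -0.2]

# Uniform conditioning: pi(D | j) is identical for all j in D, so the
# correction cancels and the probabilities equal the uncorrected ones.
uniform = sampled_mnl_probabilities(utilities, [math.log(0.1)] * 3)
uncorrected = sampled_mnl_probabilities(utilities, [0.0] * 3)

# Non-uniform sampling probabilities shift the probabilities.
weighted = sampled_mnl_probabilities(
    utilities, [math.log(0.3), math.log(0.1), math.log(0.1)]
)
print(uniform, weighted)
```

The uniform case shows why the correction can be dropped under uniform conditioning, whereas under positive conditioning more generally the term has to be retained.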
Effective application of mathematical models to interpret biological data and make accurate predictions often requires that model parameters are identifiable. Approaches to assess the so-called structural identifiability of models are well-established for ordinary differential equation models, yet there are no commonly adopted approaches that can be applied to assess the structural identifiability of the partial differential equation (PDE) models that are requisite to capture spatial features inherent to many phenomena. The differential algebra approach to structural identifiability has recently been demonstrated to be applicable to several specific PDE models. In this brief article, we present general methodology for performing structural identifiability analysis on partially observed linear reaction-advection-diffusion (RAD) PDE models. We show that the differential algebra approach can always, in theory, be applied to linear RAD models. Moreover, despite the perceived complexity introduced by the addition of advection and diffusion terms, considering spatial analogues of non-spatial models cannot decrease structural identifiability. Finally, we show that our approach can also be applied to a class of non-linear PDE models that are linear in the unobserved variables, and conclude by discussing future possibilities and the computational cost of performing structural identifiability analysis on more general PDE models in mathematical biology.
We introduce the modified planar rotator method (MPRS), a physically inspired machine learning method for spatial/temporal regression. MPRS is a non-parametric model which incorporates spatial or temporal correlations via short-range, distance-dependent "interactions" without assuming a specific form for the underlying probability distribution. Predictions are obtained by means of a fully autonomous learning algorithm which employs equilibrium conditional Monte Carlo simulations. MPRS is able to handle scattered data and arbitrary spatial dimensions. We report tests on various synthetic and real-world data in one, two and three dimensions which demonstrate that the MPRS prediction performance (without parameter tuning) is competitive with standard interpolation methods such as ordinary kriging and inverse distance weighting. In particular, MPRS is an effective gap-filling method for rough and non-Gaussian data (e.g., daily precipitation time series). MPRS shows superior computational efficiency and scalability for large samples. Massive data sets involving millions of nodes can be processed in a few seconds on a standard personal computer.
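For orientation, inverse distance weighting, one of the baseline interpolators named above, can be sketched in a few lines. This illustrates the baseline only, not MPRS itself; the function name, the power parameter default, and the sample data are assumptions made for the example.

```python
import math


def idw_predict(sample_points, sample_values, query, power=2.0):
    """Inverse distance weighting: weighted mean with weights 1 / d**power."""
    weights = []
    for point, value in zip(sample_points, sample_values):
        distance = math.dist(point, query)
        if distance == 0.0:
            # The query coincides with a sample, so return its value exactly.
            return value
        weights.append(1.0 / distance ** power)
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, sample_values)) / total


# Hypothetical one-dimensional gap-filling example: the value at t = 1
# is missing and is filled from its neighbours at t = 0 and t = 2.
sample_points = [(0.0,), (2.0,)]
sample_values = [1.0, 3.0]
filled = idw_predict(sample_points, sample_values, (1.0,))
print(filled)
```

Because the prediction is a convex combination of the observed values, IDW cannot exceed the data range.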
The growth of dendritic grains during solidification is often modelled using the Grain Envelope Model (GEM), in which the envelope of the dendrite is an interface tracked by the Phase Field Interface Capturing (PFIC) method. In the PFIC method, a phase-field equation is solved on a fixed mesh to track the position of the envelope. While being versatile and robust, PFIC introduces certain numerical artefacts. In this work, we present an alternative approach for the solution of the GEM that employs a Meshless (sharp) Interface Tracking (MIT) formulation, which uses direct, artefact-free interface tracking. In the MIT, the envelope (interface) is defined as a moving domain boundary and the interface-tracking nodes are boundary nodes for the diffusion problem solved in the domain. To increase the accuracy of the method for the diffusion-controlled moving-boundary problem, an h-adaptive spatial discretization is used, so that the node spacing is refined in the vicinity of the envelope. MIT combines a parametric surface reconstruction, a mesh-free discretization of the parametric surfaces and the space enclosed by them, and a high-order approximation of the partial differential operators and of the solute concentration field using radial basis functions augmented with monomials. The proposed method is demonstrated on a two-dimensional h-adaptive solution of the diffusive growth of a dendrite and evaluated by comparing the results to the PFIC approach. It is shown that MIT can reproduce the results calculated with PFIC, that it is convergent and that it can capture more details in the envelope shape than PFIC with a similar spatial discretization.
We present ReCAT, a recursive composition augmented Transformer that is able to explicitly model hierarchical syntactic structures of raw texts without relying on gold trees during both learning and inference. Existing research along this line restricts data to follow a hierarchical tree structure and thus lacks inter-span communications. To overcome the problem, we propose a novel contextual inside-outside (CIO) layer that learns contextualized representations of spans through bottom-up and top-down passes, where a bottom-up pass forms representations of high-level spans by composing low-level spans, while a top-down pass combines information inside and outside a span. By stacking several CIO layers between the embedding layer and the attention layers in Transformer, the ReCAT model can perform both deep intra-span and deep inter-span interactions, and thus generate multi-grained representations fully contextualized with other spans. Moreover, the CIO layers can be jointly pre-trained with Transformers, making ReCAT enjoy scaling ability, strong performance, and interpretability at the same time. We conduct experiments on various sentence-level and span-level tasks. Evaluation results indicate that ReCAT can significantly outperform vanilla Transformer models on all span-level tasks and baselines that combine recursive networks with Transformers on natural language inference tasks. More interestingly, the hierarchical structures induced by ReCAT exhibit strong consistency with human-annotated syntactic trees, indicating good interpretability brought by the CIO layers.
The standard paired-sample testing approach in the multidimensional setting applies multiple univariate tests on the individual features, followed by p-value adjustments. Such an approach suffers when the data carry numerous features. A number of studies have shown that classification accuracy can be seen as a proxy for two-sample testing. However, neither theoretical foundations nor practical recipes have been proposed so far on how this strategy could be extended to multidimensional paired-sample testing. In this work, we put forward the idea that scoring functions can be produced by the decision rules defined by the perpendicular bisecting hyperplanes of the line segments connecting each pair of instances. Then, the optimal scoring function can be obtained by the pseudomedian of those rules, which we estimate by naturally extending the Hodges-Lehmann estimator. We accordingly propose a framework of a two-step testing procedure. First, we estimate the bisecting hyperplanes for each pair of instances and an aggregated rule derived through the Hodges-Lehmann estimator. The paired samples are scored by this aggregated rule to produce a unidimensional representation. Second, we perform a Wilcoxon signed-rank test on the obtained representation. Our experiments indicate that our approach yields substantial gains in testing accuracy compared to traditional multivariate and multiple testing approaches, while at the same time estimating each feature's contribution to the final result.
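The two-step procedure can be sketched roughly as follows. This is a simplified reading, not the authors' implementation: it assumes that the normal of each bisecting hyperplane is the pair difference x_i - y_i, that the aggregated rule is the coordinate-wise Hodges-Lehmann estimate of those differences, and that only the signed-rank statistic W+ (not a p-value) is computed. All function names and data are hypothetical.

```python
import statistics
from itertools import combinations_with_replacement


def hodges_lehmann(values):
    """Pseudomedian: median of all Walsh averages (v_i + v_j) / 2, i <= j."""
    walsh = [(a + b) / 2.0 for a, b in combinations_with_replacement(values, 2)]
    return statistics.median(walsh)


def aggregated_direction(x_rows, y_rows):
    """Step 1 (assumed form): coordinate-wise Hodges-Lehmann estimate of the
    pair differences, used as the normal of an aggregated scoring rule."""
    diffs = [[a - b for a, b in zip(x, y)] for x, y in zip(x_rows, y_rows)]
    n_features = len(diffs[0])
    direction = [hodges_lehmann([d[k] for d in diffs]) for k in range(n_features)]
    return direction, diffs


def signed_rank_statistic(scores):
    """Step 2: Wilcoxon signed-rank statistic W+ with average ranks for ties."""
    ordered = sorted((s for s in scores if s != 0.0), key=abs)
    w_plus = 0.0
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of ranks i+1, ..., j
        w_plus += average_rank * sum(1 for s in ordered[i:j] if s > 0)
        i = j
    return w_plus


# Hypothetical paired data: four pairs, two features.
x_rows = [[1.0, 2.0], [2.0, 1.5], [3.0, 2.5], [1.5, 3.0]]
y_rows = [[0.5, 2.1], [1.0, 1.4], [2.2, 2.6], [1.1, 2.9]]

direction, diffs = aggregated_direction(x_rows, y_rows)
# One-dimensional representation: projection of each pair difference.
scores = [sum(w * d for w, d in zip(direction, row)) for row in diffs]
w_plus = signed_rank_statistic(scores)
print(direction, w_plus)
```

In this toy data the first feature shifts consistently while the second does not, so every projected score is positive and W+ reaches its maximum n(n+1)/2.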
In the present paper, we study a multipoint boundary value problem for a system of Fredholm integro-differential equations by the method of parameterization. The case of a degenerate kernel is studied separately, for which we obtain well-posedness conditions and propose some algorithms to find approximate and numerical solutions to the problem. Then we establish necessary and sufficient conditions for the well-posedness of the multipoint problem for the system of Fredholm integro-differential equations and develop some algorithms for finding its approximate solutions. These algorithms are based on the solutions of an approximating problem for the system of integro-differential equations with degenerate kernel.
Predictions for physical systems often rely upon knowledge acquired from ensembles of entities, e.g., ensembles of cells in biological sciences. For qualitative and quantitative analysis, these ensembles are simulated with parametric families of mechanistic models (MM). Two classes of methodologies, based on Bayesian inference and Population of Models, currently prevail in parameter estimation for physical systems. However, in Bayesian analysis, uninformative priors for MM parameters introduce undesirable bias. Here, we propose how to infer parameters within the framework of stochastic inverse problems (SIP), also termed data-consistent inversion, wherein the prior targets only uncertainties that arise due to MM non-invertibility. To demonstrate, we introduce new methods to solve SIP based on rejection sampling, Markov chain Monte Carlo, and generative adversarial networks (GANs). In addition, to overcome limitations of SIP, we reformulate SIP based on constrained optimization and present a novel GAN to solve the constrained optimization problem.
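A rejection-sampling approach to a stochastic inverse problem can be sketched under one common formulation of data-consistent inversion, in which prior samples are accepted with probability proportional to the ratio of the observed density to the push-forward (predicted) density of the prior, both evaluated at the model output. This formulation, the toy model, the densities, and all names below are assumptions for illustration and not necessarily the method of the abstract. The toy model is invertible so that the predicted density is known in closed form; in practice it would usually be estimated.

```python
import math
import random


def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def model(lam):
    """Hypothetical mechanistic model mapping a parameter to an observable."""
    return 2.0 * lam


def predicted_density(q):
    """Push-forward of the Uniform(0, 1) prior through model: Uniform(0, 2)."""
    return 0.5 if 0.0 <= q <= 2.0 else 0.0


def observed_density(q):
    """Assumed density of the observed quantity across the ensemble."""
    return normal_pdf(q, 1.0, 0.1)


def rejection_sample(n_samples, seed=0):
    rng = random.Random(seed)
    prior_samples = [rng.random() for _ in range(n_samples)]
    ratios = [
        observed_density(model(lam)) / predicted_density(model(lam))
        for lam in prior_samples
    ]
    bound = max(ratios)  # empirical bound on the ratio over the drawn samples
    # Keep each prior sample with probability ratio / bound.
    return [
        lam for lam, r in zip(prior_samples, ratios) if rng.random() < r / bound
    ]


accepted = rejection_sample(20000)
mean_output = sum(model(lam) for lam in accepted) / len(accepted)
print(len(accepted), mean_output)
```

The accepted parameters, pushed through the model, should reproduce the observed density, which is what the mean of the outputs near 1.0 reflects.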
Reinforcement learning (RL) algorithms face the challenge of limited data efficiency, particularly when dealing with high-dimensional state spaces and large-scale problems. Most RL methods rely solely on state transition information within the same episode when updating the agent's Critic, which can lead to low data efficiency and sub-optimal training time. Inspired by human-like analogical reasoning abilities, we introduce a novel mesh information propagation mechanism, termed the 'Imagination Mechanism (IM)', designed to significantly enhance the data efficiency of RL algorithms. Specifically, IM enables information generated by a single sample to be effectively broadcast to different states across episodes, instead of being transmitted only within the same episode. This capability enhances the model's comprehension of state interdependencies and facilitates more efficient learning from limited sample information. To promote versatility, we extend the IM to function as a plug-and-play module that can be seamlessly integrated into other widely adopted RL algorithms. Our experiments demonstrate that IM consistently boosts four mainstream SOTA RL algorithms, namely SAC, PPO, DDPG, and DQN, by a considerable margin, ultimately leading to better performance across various tasks. For access to our code and data, please visit //github.com/OuAzusaKou/imagination_mechanism