This work generates uncertainty-aware surrogate models of the EuroPED plasma pedestal model via the Bayesian neural network with noise contrastive prior (BNN-NCP) technique, using data from the JET-ILW pedestal database and subsequent model evaluations; together, these constitute EuroPED-NN. The BNN-NCP technique proves to be a good fit for uncertainty-aware surrogate modelling: it matches the outputs of a regular neural network, expresses prediction confidence as uncertainties, and highlights out-of-distribution (OOD) regions via the surrogate model uncertainties, providing critical insight into model robustness and reliability. EuroPED-NN has been physically validated, first by analyzing the electron density $n_e\!\left(\psi_{\text{pol}}=0.94\right)$ with respect to increasing plasma current $I_p$, and second by validating the $\Delta$-$\beta_{p,\mathrm{ped}}$ relation associated with the EuroPED model, affirming the robustness of the underlying physics learned by the surrogate model.
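The core mechanism, stripped of all EuroPED-specific detail, can be sketched in a few lines: a network outputs a mean and a variance, is trained with a Gaussian negative log-likelihood on in-distribution data, and an extra penalty on noise-perturbed inputs pulls the predictive distribution towards a wide prior, which is what inflates the uncertainty out of distribution. The snippet below is a minimal illustrative sketch in this spirit, not EuroPED-NN itself; the architecture, loss weights, and data are stand-ins.

```python
# Minimal sketch of an uncertainty-aware surrogate in the spirit of BNN-NCP.
# NOT EuroPED-NN; all names and hyperparameters are illustrative.
import torch
import torch.nn as nn

class UncertaintySurrogate(nn.Module):
    def __init__(self, n_in, n_out, width=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(n_in, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
        )
        self.mean = nn.Linear(width, n_out)
        self.log_var = nn.Linear(width, n_out)

    def forward(self, x):
        h = self.body(x)
        return self.mean(h), self.log_var(h)

def nll(mean, log_var, y):
    # Gaussian negative log-likelihood (up to an additive constant).
    return 0.5 * (log_var + (y - mean) ** 2 / log_var.exp()).mean()

model = UncertaintySurrogate(n_in=4, n_out=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(256, 4), torch.randn(256, 2)   # stand-in training data

for _ in range(200):
    opt.zero_grad()
    mu, lv = model(x)
    loss = nll(mu, lv, y)
    # Noise-contrastive prior term: on perturbed (OOD-like) inputs,
    # pull the prediction towards a wide zero-mean prior distribution.
    x_ood = x + 0.5 * torch.randn_like(x)
    mu_o, lv_o = model(x_ood)
    loss = loss + 0.1 * (mu_o ** 2 + lv_o ** 2).mean()
    loss.backward()
    opt.step()
```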
Algorithms for initializing particle distributions in SPH simulations of complex geometries have proven essential for improving the accuracy of SPH simulations. However, no such algorithms exist for boundary integral SPH models, which can handle complex geometries without needing virtual particle layers. This study introduces a Boundary Integral based Particle Initialization (BIPI) algorithm: a particle-shifting technique carefully designed to redistribute particles to fit the boundary, using the boundary integral formulation for particles adjacent to it. The proposed BIPI algorithm gives special consideration to particles adjacent to the boundary to prevent artificial volume compression, and it automatically produces a "uniform" particle distribution with a reduced and stabilized concentration gradient for domains with complex geometrical shapes. Finally, a number of examples are presented to demonstrate the effectiveness of the proposed algorithm.
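To illustrate the kind of particle-shifting update BIPI builds on, the sketch below moves each particle down the gradient of a kernel-based particle concentration; the boundary-integral correction for near-boundary particles, which is the paper's actual contribution, is not reproduced. The kernel, coefficients, and setup are illustrative.

```python
# Generic Fickian-type particle shifting: each particle moves down the
# gradient of the kernel-estimated particle concentration.
import numpy as np

h = 0.1                       # smoothing length
dx = h / 1.3                  # initial particle spacing
vol = dx ** 2                 # particle volume (2D)

def grad_w(rij, r):
    # Gradient of a simple compact-support kernel (illustrative, 2D).
    q = r / h
    coef = np.where(q < 2.0, -3.0 * (2.0 - q) ** 2 / (np.pi * h ** 4), 0.0)
    return coef[:, None] * rij / np.maximum(r, 1e-12)[:, None]

# Jittered lattice as a stand-in for a raw initial distribution.
rng = np.random.default_rng(0)
xs, ys = np.meshgrid(np.arange(0, 1, dx), np.arange(0, 1, dx))
pos = np.c_[xs.ravel(), ys.ravel()] + 0.2 * dx * rng.standard_normal((xs.size, 2))

for it in range(50):
    shift = np.zeros_like(pos)
    for i in range(len(pos)):
        rij = pos[i] - pos          # vectors from neighbours to particle i
        r = np.linalg.norm(rij, axis=1)
        mask = (r > 0) & (r < 2 * h)
        # Concentration gradient: sum of V_j * grad W_ij over neighbours.
        grad_c = (vol * grad_w(rij[mask], r[mask])).sum(axis=0)
        shift[i] = -0.5 * h ** 2 * grad_c   # Fickian-type shift
    pos += shift
```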
Large Language Models (LLMs), benefiting from auto-regressive modelling over massive unannotated text corpora, demonstrate powerful perceptual and reasoning capabilities. However, extending auto-regressive modelling to multi-modal scenarios to build Large Multi-modal Models (LMMs) poses a major difficulty: image information is processed in the LMM as continuous visual embeddings, for which discrete supervised labels for classification are unavailable. In this paper, we successfully perform multi-modal auto-regressive modelling with a unified objective for the first time. Specifically, we propose the concept of visual words, which maps visual features to probability distributions over the LLM's vocabulary, providing supervision information for visual modelling. We further explore the distribution of visual features in the semantic space within the LMM and the possibility of using text embeddings to represent visual information. Experimental results and ablation studies on 5 VQA tasks and 4 benchmark toolkits validate the strong performance of our proposed approach.
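The visual-words construction can be made concrete in a few lines: each continuous visual embedding is projected against the LLM's word-embedding matrix and normalized into a distribution over the vocabulary, which then serves as the soft classification target at the visual positions. The sketch below is illustrative; the shapes and the similarity-based projection are assumptions, not the paper's exact design.

```python
# "Visual words" sketch: turn continuous visual embeddings into soft
# distributions over the LLM vocabulary and use them as targets.
import torch
import torch.nn.functional as F

vocab_size, d_model, n_visual = 32000, 512, 16
word_emb = torch.randn(vocab_size, d_model)       # LLM word embeddings (frozen)
visual_feats = torch.randn(n_visual, d_model)     # continuous visual embeddings

# Similarity of each visual feature to every word embedding, normalized
# into a probability distribution over the vocabulary.
targets = F.softmax(visual_feats @ word_emb.T / d_model ** 0.5, dim=-1)

# LMM next-token logits at the visual positions (stand-in values).
logits = torch.randn(n_visual, vocab_size, requires_grad=True)

# Cross-entropy against the soft labels: a unified objective with text tokens.
loss = -(targets * F.log_softmax(logits, dim=-1)).sum(-1).mean()
loss.backward()
```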
We present ReCAT, a recursive composition augmented Transformer that is able to explicitly model hierarchical syntactic structures of raw texts without relying on gold trees during both learning and inference. Existing research along this line restricts data to follow a hierarchical tree structure and thus lacks inter-span communication. To overcome the problem, we propose a novel contextual inside-outside (CIO) layer that learns contextualized representations of spans through bottom-up and top-down passes, where a bottom-up pass forms representations of high-level spans by composing low-level spans, while a top-down pass combines information inside and outside a span. By stacking several CIO layers between the embedding layer and the attention layers in Transformer, the ReCAT model can perform both deep intra-span and deep inter-span interactions, and thus generate multi-grained representations fully contextualized with other spans. Moreover, the CIO layers can be jointly pre-trained with Transformers, making ReCAT enjoy scaling ability, strong performance, and interpretability at the same time. We conduct experiments on various sentence-level and span-level tasks. Evaluation results indicate that ReCAT can significantly outperform vanilla Transformer models on all span-level tasks and baselines that combine recursive networks with Transformers on natural language inference tasks. More interestingly, the hierarchical structures induced by ReCAT exhibit strong consistency with human-annotated syntactic trees, indicating good interpretability brought by the CIO layers.
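As a rough illustration of the two passes (not the actual CIO layer, which induces the tree structure itself and is considerably richer), the sketch below runs a bottom-up composition and a top-down inside/outside fusion over one fixed binary tree; all modules and dimensions are illustrative.

```python
# Drastically simplified inside-outside sketch over one fixed binary tree.
import torch
import torch.nn as nn

d = 64
compose = nn.Linear(2 * d, d)     # bottom-up composition of child spans
combine = nn.Linear(2 * d, d)     # top-down fusion of outer context + sibling

# Tree over 4 tokens: ((w0 w1) (w2 w3)), given leaf embeddings.
leaves = [torch.randn(d) for _ in range(4)]

# Bottom-up (inside) pass: parents composed from children.
inside_01 = torch.tanh(compose(torch.cat([leaves[0], leaves[1]])))
inside_23 = torch.tanh(compose(torch.cat([leaves[2], leaves[3]])))
inside_root = torch.tanh(compose(torch.cat([inside_01, inside_23])))

# Top-down (outside) pass: each span sees its sibling through the parent.
outside_root = torch.zeros(d)     # no outer context at the root
outside_01 = torch.tanh(combine(torch.cat([outside_root, inside_23])))
outside_23 = torch.tanh(combine(torch.cat([outside_root, inside_01])))

# Contextualized span representation: inside + outside information.
span_01 = inside_01 + outside_01
```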
Threat modeling is a popular method to securely develop systems by achieving awareness of potential areas of future damage caused by adversaries. The benefit of threat modeling lies in its ability to indicate areas of concern, paving the way to consider mitigation during the design stage. However, threat modeling for systems relying on Artificial Intelligence is still not well explored. While conventional threat modeling methods and tools do not address AI-related threats, research at this intersection still lacks solutions capable of guiding and automating the process, as well as evidence that the methods hold up in practice. To evaluate whether the work at hand can guide the process and automatically identify AI-related threats during the architecture definition stage, several experts were tasked with creating a threat model of an AI system designed for the healthcare domain. The usability of the solution was well-perceived, and the results indicate that it is effective for threat identification.
This article is concerned with multilevel Monte Carlo (MLMC) methods for approximating expectations of functions of the solution to the Heston 3/2-model from mathematical finance, which takes values in $(0, \infty)$ and possesses superlinearly growing drift and diffusion coefficients. To discretize the SDE model, a new Milstein-type scheme is proposed to produce independent sample paths. The proposed scheme can be explicitly solved and is positivity-preserving unconditionally, i.e., for any time step-size $h>0$. This positivity-preserving property for large discretization time steps is particularly desirable in the MLMC setting. Furthermore, a mean-square convergence rate of order one is proved in the non-globally Lipschitz regime, which is non-trivial, as the diffusion coefficient grows super-linearly. The obtained order-one convergence in turn guarantees the required decay of the multilevel estimator's variance and justifies the optimal complexity $\mathcal{O}(\epsilon^{-2})$ of the MLMC approach, where $\epsilon > 0$ is the prescribed target accuracy. Numerical experiments are finally reported to confirm the theoretical findings.
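The MLMC structure itself is easy to sketch: coupled fine and coarse paths share the same Brownian increments, and the estimator sums the level-wise corrections. In the sketch below the one-step map is a crude positivity-enforcing stand-in (explicit Euler plus clamping), not the paper's Milstein-type scheme; the model parameters and payoff are illustrative.

```python
# MLMC telescoping estimator with coupled fine/coarse paths.
import numpy as np

rng = np.random.default_rng(0)
kappa, theta, xi, v0, T = 2.0, 1.5, 0.2, 1.0, 1.0

def step(v, h, dw):
    # Heston 3/2-model drift/diffusion; clamping is a crude stand-in for
    # the paper's unconditionally positivity-preserving scheme.
    v_new = v + kappa * v * (theta - v) * h + xi * v ** 1.5 * dw
    return np.maximum(v_new, 1e-8)

def level_estimator(level, n_paths, m=2, n0=4):
    nf = n0 * m ** level                 # number of fine steps
    hf = T / nf
    vf = np.full(n_paths, v0)
    vc = np.full(n_paths, v0)
    for k in range(nf // m):
        dwc = 0.0
        for _ in range(m):               # m fine steps per coarse step
            dw = np.sqrt(hf) * rng.standard_normal(n_paths)
            vf = step(vf, hf, dw)
            dwc = dwc + dw
        if level > 0:
            vc = step(vc, m * hf, dwc)   # coarse step, summed increment
    pf = np.maximum(vf - 1.0, 0.0)       # illustrative payoff f(V_T)
    pc = np.maximum(vc - 1.0, 0.0) if level > 0 else 0.0
    return np.mean(pf - pc)

# Telescoping sum: E[P_L] ~ sum over levels of the corrections.
estimate = sum(level_estimator(l, n_paths=20000 // (2 ** l) + 100)
               for l in range(4))
```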
The purpose of this work is to introduce a strategy for determining the nodes and weights of a low-cardinality positive cubature formula that is nearly exact for polynomials of a given degree over spherical polygons. In the numerical section we report results on numerical cubature over a spherical polygon $\cal P$ approximating Australia, and on the reconstruction of functions over such $\cal P$, possibly affected by perturbations, via hyperinterpolation and some of its variants. The open-source Matlab software used in the numerical tests is available at the author's homepage.
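For readers unfamiliar with cubature rules: applying one is a single weighted sum, $Q(f)=\sum_i w_i f(x_i)$, and near-exactness is checked by comparing $Q$ against reference values on polynomials of the target degree. The sketch below illustrates only this usage pattern, with random stand-in nodes and weights on the whole sphere rather than the rule produced by the proposed strategy.

```python
# Applying a positive cubature rule and checking it on a polynomial.
# Nodes/weights are random stand-ins, NOT the paper's construction.
import numpy as np

rng = np.random.default_rng(1)
n = 50
nodes = rng.standard_normal((n, 3))
nodes /= np.linalg.norm(nodes, axis=1, keepdims=True)   # points on S^2
weights = rng.random(n)
weights *= 4 * np.pi / weights.sum()                    # positive, sum = |S^2|

def cubature(f):
    # Q(f) = sum_i w_i * f(x_i)
    return weights @ f(nodes)

# Degree-2 polynomial restricted to the sphere.
poly = lambda x: 1.0 + x[:, 0] * x[:, 1] + x[:, 2] ** 2
approx = cubature(poly)
# Over the whole sphere the exact value is 4*pi*(1 + 0 + 1/3); a rule
# exact for degree 2 would reproduce it to machine precision.
print(approx, 4 * np.pi * (1 + 1 / 3))
```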
In this manuscript, we combine non-intrusive reduced order models (ROMs) with space-dependent aggregation techniques to build a mixed-ROM. The prediction of the mixed formulation is given by a convex linear combination of the predictions of some previously trained ROMs, where we assign to each model a space-dependent weight. The ROMs taken into account to build the mixed model exploit different reduction techniques, such as Proper Orthogonal Decomposition (POD) and AutoEncoders (AE), and/or different approximation techniques, namely Radial Basis Function Interpolation (RBF), Gaussian Process Regression (GPR), or a feed-forward Artificial Neural Network (ANN). Each model's contribution is weighted more heavily in the regions where it performs best, and less heavily where it is less accurate than the other models. Finally, a regression technique, namely a Random Forest, is exploited to evaluate the weights for unseen conditions. The performance of the aggregated model is evaluated on two test cases: the 2D flow past a NACA 4412 airfoil at a 5-degree angle of attack, parametrized by the Reynolds number varying between $10^5$ and $10^6$, and a transonic flow over a NACA 0012 airfoil, parametrized by the angle of attack. In both cases, the mixed-ROM provides improved accuracy with respect to each individual ROM technique.
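A minimal sketch of the aggregation pipeline follows: two stand-in "ROMs" are blended pointwise with convex weights derived from their local errors on seen parameters, and a Random Forest learns the weight field so it can be evaluated at unseen parameters. The surrogate functions and parameter ranges below are illustrative, not POD/AE-based models.

```python
# Space-dependent convex aggregation of two surrogate models, with a
# Random Forest predicting the local weights for unseen parameters.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
x_grid = np.linspace(0.0, 1.0, 100)                          # spatial coordinate

def rom_a(mu): return np.sin(2 * np.pi * x_grid) * mu        # stand-in ROM 1
def rom_b(mu): return np.sin(2 * np.pi * x_grid) * mu + 0.1  # stand-in ROM 2
def truth(mu): return np.sin(2 * np.pi * x_grid) * mu + 0.05 * x_grid

# Training: for seen parameters, weight each ROM by its local accuracy.
mus = rng.uniform(0.5, 2.0, 20)
X, W = [], []
for mu in mus:
    err_a = np.abs(rom_a(mu) - truth(mu))
    err_b = np.abs(rom_b(mu) - truth(mu))
    w_a = err_b / (err_a + err_b + 1e-12)   # larger where ROM A is better
    X.append(np.c_[np.full_like(x_grid, mu), x_grid])
    W.append(w_a)
forest = RandomForestRegressor(n_estimators=100).fit(np.vstack(X),
                                                     np.concatenate(W))

# Prediction for an unseen parameter: convex, space-dependent blend.
mu_new = 1.3
w_a = np.clip(forest.predict(np.c_[np.full_like(x_grid, mu_new), x_grid]), 0, 1)
mixed = w_a * rom_a(mu_new) + (1 - w_a) * rom_b(mu_new)
```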
This work considers the nodal finite element approximation of peridynamics, in which the nodal displacements satisfy the peridynamics equation at each mesh node. For the nonlinear bond-based peridynamics model, it is shown that, under suitable assumptions on the exact solution, the discretized solution associated with the central-in-time and nodal finite element discretization converges to the exact solution in the $L^2$ norm at the rate $C_1 \Delta t + C_2 h^2/\epsilon^2$. Here, $\Delta t$, $h$, and $\epsilon$ are the time step size, the mesh size, and the size of the horizon (the nonlocal length scale), respectively. The constants $C_1$ and $C_2$ are independent of $h$ and $\Delta t$ and depend on the norms of the exact solution. Several numerical examples involving a pre-crack, a void, and a notch are considered, and the efficacy of the proposed nodal finite element discretization is analyzed.
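A 1D toy version makes the discretization pattern concrete: nodal integration of the (here linearized) bond force over the horizon, advanced with the central-in-time update $u^{n+1} = 2u^n - u^{n-1} + \Delta t^2\, f/\rho$. The micromodulus and parameters below are illustrative, so this is only a sketch of the discretization pattern, not the paper's nonlinear setup.

```python
# 1D nodal discretization of linearized bond-based peridynamics with a
# central-in-time update. All parameters are illustrative.
import numpy as np

n, eps = 200, 0.05                 # nodes, horizon
h = 1.0 / (n - 1)                  # mesh size
x = np.linspace(0.0, 1.0, n)
c = 2.0 / eps ** 2                 # illustrative micromodulus (1D, linearized)
rho, dt = 1.0, 0.2 * h

u_old = np.sin(np.pi * x) * 1e-3   # initial displacement
u = u_old.copy()                   # zero initial velocity

for step in range(500):
    force = np.zeros(n)
    for i in range(n):
        mask = (np.abs(x - x[i]) <= eps) & (np.arange(n) != i)
        xi = x[mask] - x[i]
        # Linearized bond force density, integrated nodally (weight h).
        force[i] = np.sum(c * (u[mask] - u[i]) / np.abs(xi)) * h
    # Central-in-time update: u^{n+1} = 2u^n - u^{n-1} + dt^2 * f / rho.
    u_new = 2 * u - u_old + dt ** 2 * force / rho
    u_new[0] = u_new[-1] = 0.0     # clamped ends
    u_old, u = u, u_new
```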
We present self-supervised geometric perception (SGP), the first general framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels (e.g., camera poses, rigid transformations). Our first contribution is to formulate geometric perception as an optimization problem that jointly optimizes the feature descriptor and the geometric models given a large corpus of visual measurements (e.g., images, point clouds). Under this optimization formulation, we show that two important streams of research in vision, namely robust model fitting and deep feature learning, correspond to optimizing one block of the unknown variables while fixing the other block. This analysis naturally leads to our second contribution -- the SGP algorithm that performs alternating minimization to solve the joint optimization. SGP iteratively executes two meta-algorithms: a teacher that performs robust model fitting given learned features to generate geometric pseudo-labels, and a student that performs deep feature learning under noisy supervision of the pseudo-labels. As a third contribution, we apply SGP to two perception problems on large-scale real datasets, namely relative camera pose estimation on MegaDepth and point cloud registration on 3DMatch. We demonstrate that SGP achieves performance on par with or superior to the supervised oracles trained using ground-truth labels.
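The alternating structure of SGP is compact enough to sketch end-to-end. In the runnable toy below, nearest-neighbour matching, a random inlier filter, and a parameter perturbation stand in for real feature matching, RANSAC-style robust fitting, and deep descriptor training, respectively; only the teacher/student loop itself reflects the algorithm.

```python
# Toy sketch of SGP's alternating minimization; all components are
# trivial stand-ins for matching, robust fitting, and descriptor training.
import numpy as np

rng = np.random.default_rng(0)

def match_features(fa, fb):
    # Nearest-neighbour matching in descriptor space (stand-in).
    d = np.linalg.norm(fa[:, None] - fb[None, :], axis=-1)
    return np.c_[np.arange(len(fa)), d.argmin(axis=1)]

def robust_fit(matches):
    # Stand-in for RANSAC: random "inlier" subset plus a dummy model.
    inliers = matches[rng.random(len(matches)) > 0.3]
    return np.eye(3), inliers

def train_descriptor(theta, pseudo_labels):
    # Stand-in for a deep feature learning step on the pseudo-labels.
    return theta + 0.01 * rng.standard_normal(theta.shape)

pairs = [(rng.standard_normal((20, 8)), rng.standard_normal((20, 8)))
         for _ in range(4)]
theta = rng.standard_normal((8, 8))                # "descriptor" parameters

for _ in range(5):                                 # alternating minimization
    pseudo = []
    for a, b in pairs:                             # teacher: robust fitting
        m = match_features(a @ theta, b @ theta)
        model, inliers = robust_fit(m)
        pseudo.append((model, inliers))            # geometric pseudo-labels
    theta = train_descriptor(theta, pseudo)        # student: feature learning
```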
Network representation learning in low-dimensional vector spaces has attracted considerable attention in both academic and industrial domains. Most real-world networks are dynamic, with nodes and edges being added and deleted over time. Existing graph embedding methods are designed for static networks and cannot capture evolving patterns in a large dynamic network. In this paper, we propose a dynamic embedding method, dynnode2vec, based on the well-known graph embedding method node2vec, a random walk based embedding method for static networks. Applying static network embedding in dynamic settings has two crucial problems: 1) generating random walks for every time step is time-consuming, and 2) the embedding vector spaces at different timestamps are different. In order to tackle these challenges, dynnode2vec uses evolving random walks and initializes the current graph embedding with the previous embedding vectors. We demonstrate the advantages of the proposed dynamic network embedding by conducting empirical evaluations on several large dynamic network datasets.
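The incremental-training idea maps naturally onto gensim's Word2Vec API: fit on walks from the first snapshot, then for each later snapshot generate walks only from evolved nodes, grow the vocabulary with update=True, and continue training from the previous vectors. The sketch below uses plain uniform random walks in place of node2vec's biased walks, on a toy graph.

```python
# dynnode2vec-style incremental embedding sketch with gensim Word2Vec.
import random
from gensim.models import Word2Vec

def random_walks(adj, nodes, n_walks=10, length=20):
    # Uniform random walks (stand-in for node2vec's biased walks).
    walks = []
    for _ in range(n_walks):
        for start in nodes:
            walk, cur = [str(start)], start
            for _ in range(length - 1):
                if not adj.get(cur):
                    break
                cur = random.choice(adj[cur])
                walk.append(str(cur))
            walks.append(walk)
    return walks

# Snapshot t: a small toy graph as an adjacency dict.
g0 = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
model = Word2Vec(random_walks(g0, list(g0)), vector_size=32, window=5,
                 min_count=0, sg=1)

# Snapshot t+1: node 4 appears; walk only from evolved nodes and keep
# previously learned vectors instead of retraining from scratch.
g1 = {**g0, 3: [2, 4], 4: [3]}
new_walks = random_walks(g1, [3, 4])
model.build_vocab(new_walks, update=True)         # add new nodes to vocab
model.train(new_walks, total_examples=len(new_walks), epochs=model.epochs)
```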