With recent advancements in computer hardware and software platforms, there has been a surge of interest in solving partial differential equations with deep learning-based methods, and their integration with domain decomposition strategies has attracted considerable attention owing to the enhanced representation capacity and parallelizability they afford the network solution. While several works already substitute the subproblem solver with neural networks in overlapping Schwarz methods, the non-overlapping counterpart has not been extensively explored, because inaccurate flux estimation at the interface propagates errors to neighbouring subdomains and eventually hinders the convergence of the outer iterations. In this study, a novel learning approach for solving elliptic boundary value problems, namely the compensated deep Ritz method using neural network extension operators, is proposed to enable reliable flux transmission across subdomain interfaces, thereby allowing us to construct effective learning algorithms for realizing non-overlapping domain decomposition methods (DDMs) in the presence of erroneous interface conditions. Numerical experiments on a variety of elliptic problems, including regular and irregular interfaces, low and high dimensions, two and four subdomains, and smooth and high-contrast coefficients, are carried out to validate the effectiveness of the proposed algorithms.
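For readers unfamiliar with the deep Ritz approach underlying this work, a minimal single-domain sketch is given below. It only illustrates the generic Ritz-energy loss with a boundary penalty; it is not the paper's compensated method, extension operators, or domain decomposition loop, and the network width, penalty weight, and source term $f\equiv 1$ are illustrative assumptions.

```python
# Minimal deep Ritz sketch for -Δu = f on the unit square with u = 0 on the
# boundary (illustrative assumptions throughout).
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    # interior collocation points in the unit square
    x = torch.rand(1024, 2, requires_grad=True)
    u = net(x)
    grad_u = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    # Ritz energy for -Δu = f with f ≡ 1 (assumed source term)
    energy = (0.5 * (grad_u ** 2).sum(dim=1, keepdim=True) - u).mean()

    # boundary points on ∂[0,1]^2, used to enforce u = 0 via a penalty
    xb = torch.rand(256, 2)
    xb[:128, 0] = torch.randint(0, 2, (128,)).float()
    xb[128:, 1] = torch.randint(0, 2, (128,)).float()
    penalty = net(xb).pow(2).mean()

    loss = energy + 100.0 * penalty
    opt.zero_grad()
    loss.backward()
    opt.step()
```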
Many scientific and engineering tasks, such as uncertainty quantification, optimization, and inverse problems, require solving parametric partial differential equations (PDEs) repeatedly, which can be prohibitively expensive for large-scale complex applications. To address this issue, reduced order modeling (ROM) has emerged as an effective way to lower computational costs. However, ROM often requires significant modifications to existing code, which can be time-consuming and complex, particularly for large-scale legacy codes. Non-intrusive methods have therefore gained attention as an alternative, but most existing non-intrusive approaches are purely data-driven and may not respect the underlying physical laws during the online stage, resulting in less accurate approximations of the reduced solution. In this study, we propose a new non-intrusive bi-fidelity reduced basis method for time-independent parametric PDEs. Our algorithm utilizes the discrete operator, solutions, and right-hand sides obtained from the high-fidelity legacy solver. By leveraging a low-fidelity model, we efficiently construct the reduced operator and right-hand side for new parameter values during the online stage. Unlike other non-intrusive ROM methods, we enforce the reduced equation during the online stage. In addition, the non-intrusive nature of our algorithm makes it straightforward to apply to general nonlinear time-independent problems. We demonstrate its performance on several benchmark examples, including nonlinear and multiscale PDEs.
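As a point of reference only, the snippet below sketches a plain Galerkin reduced-basis workflow: snapshots from a (toy) high-fidelity solver give a basis $V$, and for a new parameter the reduced operator $V^\top A(\mu)V$ and right-hand side $V^\top b(\mu)$ are assembled and the reduced equation is solved. The toy parametric system, parameter values, and basis truncation are assumptions; the bi-fidelity construction described above is not reproduced here.

```python
import numpy as np

def assemble(mu, n=200):
    # toy parametric 1D diffusion-reaction system (placeholder high-fidelity model)
    A = (np.diag(2.0 + mu * np.ones(n))
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1))
    b = np.ones(n)
    return A, b

# offline stage: high-fidelity snapshots at training parameters -> POD basis V
snapshots = np.column_stack(
    [np.linalg.solve(*assemble(mu)) for mu in (0.1, 0.5, 1.0, 2.0)])
V, _, _ = np.linalg.svd(snapshots, full_matrices=False)

# online stage: build the reduced operator/right-hand side for a new parameter,
# enforce the reduced equation, and lift the solution back to full dimension
A_new, b_new = assemble(0.75)
c = np.linalg.solve(V.T @ A_new @ V, V.T @ b_new)
u_approx = V @ c
```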
Given a limited labeling budget, active learning (AL) aims to sample the most informative instances from an unlabeled pool and acquire their labels for subsequent model training. To achieve this, AL typically measures the informativeness of unlabeled instances based on uncertainty and diversity. However, it does not consider erroneous instances together with their neighborhood error density, which have great potential to improve model performance. To address this limitation, we propose $REAL$, a novel approach that selects data instances with $\underline{R}$epresentative $\underline{E}$rrors for $\underline{A}$ctive $\underline{L}$earning. It identifies minority predictions within a cluster as \emph{pseudo errors} and allocates an adaptive sampling budget to the cluster based on the estimated error density. Extensive experiments on five text classification datasets demonstrate that $REAL$ consistently outperforms the best-performing baselines in accuracy and F1-macro across a wide range of hyperparameter settings. Our analysis also shows that $REAL$ selects the most representative pseudo errors, which match the distribution of ground-truth errors along the decision boundary. Our code is publicly available at //github.com/withchencheng/ECML_PKDD_23_Real.
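A hedged sketch of the pseudo-error idea described above (not the released $REAL$ code; the clustering model, budget rule, and function names are illustrative assumptions) might look as follows: cluster the unlabeled pool, treat predictions that disagree with the cluster majority as pseudo errors, and give each cluster a budget proportional to its pseudo-error count.

```python
import numpy as np
from collections import Counter
from sklearn.cluster import KMeans

def select_pseudo_errors(embeddings, predictions, n_clusters, budget, seed=0):
    """Pick indices whose predictions disagree with their cluster's majority prediction."""
    rng = np.random.default_rng(seed)
    clusters = KMeans(n_clusters=n_clusters, n_init=10,
                      random_state=seed).fit_predict(embeddings)

    pseudo_errors = {}
    for c in range(n_clusters):
        idx = np.where(clusters == c)[0]
        majority = Counter(predictions[idx]).most_common(1)[0][0]
        pseudo_errors[c] = idx[predictions[idx] != majority]

    # adaptive per-cluster budget proportional to the pseudo-error count
    total = sum(len(v) for v in pseudo_errors.values()) or 1
    selected = []
    for errs in pseudo_errors.values():
        k = min(len(errs), round(budget * len(errs) / total))
        selected.extend(rng.choice(errs, size=k, replace=False).tolist())
    return selected
```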
In this paper, we study the partial pole assignment problem for the symmetric quadratic pencil with time delay. A novel multi-step method is proposed to solve this problem, moving the undesired eigenvalues to desired values while leaving the remaining eigenvalues unchanged. By establishing a new matrix equality relation and using the multi-step method, the problem is transformed into solving linear systems of low order. Specifically, assuming that there are $p$ undesired eigenvalues to be reassigned, the linear system we ultimately solve is of size $p^2$. Notably, the method is highly efficient for large systems in which only a few poles require reassignment. Numerical examples are provided to illustrate the effectiveness of the proposed method.
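For concreteness, a standard formulation of this problem (our notation; the abstract itself does not fix one) is the second-order control system with delayed feedback
\[
M\ddot{x}(t) + C\dot{x}(t) + Kx(t) = Bu(t-\tau), \qquad
u(t) = F^{\mathsf T}\dot{x}(t) + G^{\mathsf T}x(t),
\]
whose closed-loop pencil is
\[
P_c(\lambda) = \lambda^2 M + \lambda C + K - e^{-\lambda\tau}\,B\bigl(\lambda F^{\mathsf T} + G^{\mathsf T}\bigr),
\]
and the goal is to choose $F$ and $G$ so that $p$ prescribed eigenvalues of the open-loop pencil $\lambda^2 M + \lambda C + K$ are replaced by desired values while the remaining eigenpairs are preserved.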
The detection of exoplanets with the radial velocity method consists of detecting variations of the stellar velocity caused by an unseen sub-stellar companion. Instrumental errors, irregular time sampling, and different noise sources originating in the intrinsic variability of the star can hinder the interpretation of the data and even lead to spurious detections. In recent years, work using machine learning algorithms has begun to emerge in the field of extrasolar planets, some of it with results that exceed those obtained with the traditional techniques. We seek to explore the scope of neural networks in the radial velocity method, in particular for exoplanet detection in the presence of correlated noise of stellar origin. In this work, a neural network is proposed to replace the computation of the significance of the signal detected with the radial velocity method and to classify it as being of planetary origin or not. The algorithm is trained using synthetic data of systems with and without planetary companions. We injected realistic correlated noise into the simulations, based on previous studies of the behaviour of stellar activity. The performance of the network is compared to the traditional method based on null hypothesis significance testing. The network achieves 28% fewer false positives, with the improvement observed mainly in the detection of small-amplitude signals associated with low-mass planets. In addition, its execution time is five orders of magnitude shorter than that of the traditional method. The superior performance of the algorithm has so far only been tested on simulated radial velocity data. Although it should in principle be straightforward to adapt it for use on real time series, its performance has to be tested thoroughly. Future work should evaluate its potential for adoption as a valuable tool for exoplanet detection.
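Purely as an illustration of the kind of classifier involved (the architecture, input representation, and names are assumptions and do not reproduce the paper's network), a binary "planet vs. no planet" model trained on features extracted from synthetic radial velocity series could be set up as:

```python
import torch

class SignalClassifier(torch.nn.Module):
    """Map a fixed-length feature vector for a candidate signal to P(planet)."""
    def __init__(self, n_features=256):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(n_features, 128), torch.nn.ReLU(),
            torch.nn.Linear(128, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, 1),
        )

    def forward(self, x):
        return torch.sigmoid(self.net(x))

model = SignalClassifier()
loss_fn = torch.nn.BCELoss()  # trained on labeled synthetic systems (with/without planets)
```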
We propose an accurate and energy-stable parametric finite element method for solving the sharp-interface continuum model of solid-state dewetting in three-dimensional space. The model describes the motion of the film\slash vapor interface with contact line migration and is governed by the surface diffusion equation with proper boundary conditions at the contact line. We present a new weak formulation of the problem, in which the interface and its contact line are evolved simultaneously. By using piecewise linear elements in space and backward Euler in time, we then discretize the weak formulation to obtain a fully discrete parametric finite element approximation. The resulting numerical method is shown to be well-posed and unconditionally energy-stable. Furthermore, the numerical method is extended to the sharp-interface model of solid-state dewetting with anisotropic surface energies in the Riemannian metric form. Numerical results are reported to show the convergence and efficiency of the proposed method as well as the anisotropic effects on the morphological evolution of thin films in solid-state dewetting.
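In its simplest isotropic, dimensionless form (stated here only to fix ideas; signs, scalings, and the precise contact-line conditions follow standard conventions rather than this paper's exact statement), the governing law referred to above is the surface diffusion flow
\[
V_n \;=\; \Delta_{s}\,\mu, \qquad \mu \;=\; H,
\]
where $V_n$ is the normal velocity of the interface, $\Delta_s$ is the surface Laplace-Beltrami operator, and $H$ is the mean curvature, supplemented at the contact line by a contact-angle (Young) condition and a zero-mass-flux condition.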
In this paper, practically computable low-order approximations of potentially high-dimensional differential equations driven by geometric rough paths are proposed and investigated. In particular, the equations studied cover the linear setting, but a certain type of dissipative nonlinearity in the drift is allowed as well. In a first step, a linear subspace is found that contains the solution space of the underlying rough differential equation (RDE). This subspace is associated with covariances of linear Itô stochastic differential equations, which is shown by exploiting a Gronwall lemma for matrix differential equations. Orthogonal projection onto the identified subspace leads to a first exact reduced-order system. Secondly, a linear map of the RDE solution (the quantity of interest) is analyzed for redundant information, meaning that state variables are identified which do not contribute to the quantity of interest. Once more, a link to Itô stochastic differential equations is used. Removing such unnecessary information from the RDE provides a further dimension reduction without introducing any error. Finally, we discretize a linear parabolic rough partial differential equation in space. The resulting large RDE system is subsequently tackled with the exact reduction techniques studied in this paper. We illustrate the enormous complexity-reduction potential in the corresponding numerical experiments.
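The basic mechanism behind such exact reductions can be stated generically (notation ours, not the paper's): if an orthonormal basis $V\in\mathbb{R}^{n\times r}$ with $V^\top V = I_r$ is known such that
\[
x_t \in \operatorname{range}(V) \quad \text{for all } t,
\]
then $x_t = V x^r_t$ with $x^r_t := V^\top x_t$, and substituting this ansatz into the RDE yields a reduced system of dimension $r \ll n$ that reproduces the full solution without error.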
Object detection in 3D is a crucial task for autonomous vehicles and drones. However, prototyping detection algorithms is time-consuming and costly in terms of energy and environmental impact. To address these challenges, one can assess the effectiveness of different models by training on a subset of the original training set. In this paper, we present a comparison of three algorithms for selecting such a subset: random sampling, random per-class sampling, and our proposed MONSPeC (Maximum Object Number Sampling per Class). We provide empirical evidence for the superior effectiveness of random per-class sampling and MONSPeC over basic random sampling. Replacing random sampling with one of the more efficient algorithms makes the results obtained on the subset more likely to transfer to the entire dataset. The code is available at: //github.com/vision-agh/monspec.
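A hedged sketch of the per-class random sampling baseline discussed above (illustrative only; not the released MONSPeC code, and the data layout is an assumption) could look like this:

```python
import random
from collections import defaultdict

def random_per_class(frame_classes, n_samples, seed=0):
    """frame_classes: dict mapping frame id -> set of object classes present in it."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for frame, classes in frame_classes.items():
        for c in classes:
            by_class[c].append(frame)

    # give every class an (approximately) equal share of the budget
    per_class = max(1, n_samples // len(by_class))
    selected = set()
    for frames in by_class.values():
        rng.shuffle(frames)
        selected.update(frames[:per_class])
    return list(selected)[:n_samples]
```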
Parallel software codes in high performance computing (HPC) continue to grow in complexity and scale as we enter the exascale era. A diverse set of emerging hardware and programming paradigms makes developing, optimizing, and maintaining parallel software burdensome for developers. One way to alleviate some of these burdens is with automated development and analysis tools. Such tools can perform complex and/or remedial tasks for developers, increasing their productivity and decreasing the chance of error. So far, such tools for code development and performance analysis have been limited in the complexity of the tasks they can perform. However, with recent advancements in language modeling and the wealth of code-related data now available online, these tools have started to utilize predictive language models to automate more complex tasks. In this paper, we show how large language models (LLMs) can be applied to tasks specific to high performance and scientific codes. We train LLMs using code and performance data specific to parallel codes. We compare several recent LLMs on HPC-related tasks and introduce a new model, HPC-Coder, trained on parallel code. In our experiments, we show that this model can auto-complete HPC functions where general models cannot, decorate for loops with OpenMP pragmas, and model performance changes in two scientific application repositories.
Unsupervised domain adaptation has recently emerged as an effective paradigm for generalizing deep neural networks to new target domains. However, there is still enormous potential to be tapped before fully supervised performance is reached. In this paper, we present a novel active learning strategy to assist knowledge transfer in the target domain, dubbed active domain adaptation. We start from the observation that energy-based models exhibit free-energy biases when training (source) and test (target) data come from different distributions. Inspired by this inherent mechanism, we empirically show that a simple yet efficient energy-based sampling strategy selects more valuable target samples than existing approaches that require particular architectures or the computation of distances. Our algorithm, Energy-based Active Domain Adaptation (EADA), queries groups of target data that incorporate both domain characteristics and instance uncertainty into every selection round. Meanwhile, by aligning the free energy of target data compactly around the source domain via a regularization term, the domain gap can be implicitly diminished. Through extensive experiments, we show that EADA surpasses state-of-the-art methods on well-known challenging benchmarks with substantial improvements, making it a useful option in the open world. Code is available at //github.com/BIT-DA/EADA.
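As a hedged illustration of the energy-based selection idea (not the released EADA code; the temperature, selection rule, and function names are assumptions), the free energy of a sample can be computed from the classifier logits and used to rank the unlabeled target pool:

```python
import torch

def free_energy(logits, temperature=1.0):
    # F(x) = -T * logsumexp(f(x) / T); lower values indicate more "source-like" samples
    return -temperature * torch.logsumexp(logits / temperature, dim=1)

def query_indices(target_logits, budget):
    # query the target samples with the highest free energy, i.e. the ones the
    # source-trained classifier finds least familiar
    return torch.topk(free_energy(target_logits), k=budget).indices
```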
This paper revisits graph convolutional neural networks by bridging the gap between the spectral and spatial design of graph convolutions. We theoretically demonstrate some equivalence of the graph convolution process regardless of whether it is designed in the spatial or the spectral domain. The resulting general framework allows us to conduct a spectral analysis of the most popular ConvGNNs, explaining their performance and showing their limits. Moreover, the proposed framework is used to design new convolutions in the spectral domain with a custom frequency profile while applying them in the spatial domain. We also propose a generalization of the depthwise separable convolution framework for graph convolutional networks, which decreases the total number of trainable parameters while keeping the capacity of the model. To the best of our knowledge, such a framework has never been used in the GNN literature. Our proposals are evaluated on both transductive and inductive graph learning problems. The obtained results show the relevance of the proposed method and provide some of the first experimental evidence of the transferability of spectral filter coefficients from one graph to another. Our source code is publicly available at: //github.com/balcilar/Spectral-Designed-Graph-Convolutions
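To fix notation only, the sketch below shows a standard GCN-style spatial graph convolution $H' = \sigma(\hat{C} H W)$ with the symmetrically normalized adjacency $\hat{C}$; it is a generic baseline layer, not the custom-frequency-profile convolutions proposed above.

```python
import torch

def normalized_adjacency(adj):
    # C_hat = D^{-1/2} (A + I) D^{-1/2}, with self-loops added
    a_hat = adj + torch.eye(adj.shape[0])
    d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
    return d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]

class GraphConv(torch.nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = torch.nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x, adj):
        # propagate node features with C_hat, then mix channels
        return torch.relu(self.weight(normalized_adjacency(adj) @ x))
```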