Data collected by different modalities can provide a wealth of complementary information: hyperspectral images (HSIs) offer rich spectral-spatial properties, synthetic aperture radar (SAR) provides structural information about the Earth's surface, and light detection and ranging (LiDAR) supplies altitude information about ground elevation. A natural idea, therefore, is to combine multimodal images for refined and accurate land-cover interpretation. Although many efforts have been made toward multi-source remote sensing image classification, three issues remain: 1) indiscriminate feature representation that does not sufficiently consider modal heterogeneity, 2) the abundant features and complex computations associated with modeling long-range dependencies, and 3) overfitting caused by sparsely labeled samples. To overcome these barriers, a transformer-based heterogeneously salient graph representation (THSGR) approach is proposed in this paper. First, a multimodal heterogeneous graph encoder is presented to encode distinctively non-Euclidean structural features from heterogeneous data. Then, a self-attention-free multi-convolutional modulator is designed for effective and efficient long-term dependency modeling. Finally, a mean forward is introduced to avoid overfitting. With these components, the proposed model can bridge modal gaps and obtain differentiated graph representations at competitive time cost, even with a small fraction of training samples. Experiments and analyses on three benchmark datasets against various state-of-the-art (SOTA) methods demonstrate the performance of the proposed approach.
There is a constant need for high-performing and computationally efficient neural network models for image super-resolution (SR): computationally efficient models can run on low-capacity devices and reduce carbon footprints. One way to obtain such models is to compress existing ones, e.g. by quantization. Another way is neural architecture search, which automatically discovers new, more efficient solutions. We propose QuantNAS, a novel quantization-aware procedure that combines the advantages of both approaches by searching for quantization-friendly super-resolution models. The procedure utilizes entropy regularization, quantization noise, and an Adaptive Deviation for Quantization (ADQ) module to enhance the search. Entropy regularization prioritizes a single operation within each block of the search space. Adding quantization noise to parameters and activations approximates model degradation after quantization, resulting in more quantization-friendly architectures. ADQ helps to alleviate problems caused by Batch Norm blocks in super-resolution models. Our experimental results show that the proposed approximations are better suited to the search procedure than direct model quantization. QuantNAS discovers architectures with a better PSNR/BitOps trade-off than uniform or mixed-precision quantization of fixed architectures. We showcase the effectiveness of our method on two search spaces inspired by state-of-the-art SR models and RFDN; anyone can thus design a suitable search space based on an existing architecture and apply our method to obtain better quality and efficiency. The proposed procedure is 30\% faster than direct weight quantization and is more stable.
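The quantization-noise approximation described above can be sketched as follows; the function name and the min-max step-size estimate are illustrative choices of ours, not the paper's implementation:

```python
import numpy as np

def quant_noise(w, bits=8, rng=None):
    """Approximate uniform quantization of `w` by adding uniform noise
    bounded by half a quantization step (a common surrogate for the
    rounding error; hypothetical sketch)."""
    rng = rng or np.random.default_rng(0)
    # step size of a uniform quantizer over the tensor's dynamic range
    step = (w.max() - w.min()) / (2 ** bits - 1)
    noise = rng.uniform(-step / 2.0, step / 2.0, size=w.shape)
    return w + noise

w = np.linspace(-1.0, 1.0, 8)
w_noisy = quant_noise(w, bits=4)
```

Because the noise stays within half a step, the perturbed tensor degrades the model roughly as true quantization would, while remaining differentiable for the search.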
The solution to a stochastic optimal control problem can be determined by computing the value function from a discretization of the associated Hamilton-Jacobi-Bellman equation. Alternatively, the problem can be reformulated in terms of a pair of forward-backward SDEs, which makes Monte Carlo techniques applicable. More recently, the problem has also been viewed from the perspective of forward and reverse time SDEs and their associated Fokker-Planck equations. This approach is closely related to techniques used in diffusion-based generative models. These forward and reverse time formulations express the value function as the ratio of two probability density functions, one stemming from a forward McKean-Vlasov SDE and another from a reverse McKean-Vlasov SDE. In this paper, we extend this approach to a more general class of stochastic optimal control problems and combine it with ensemble Kalman filter type and diffusion map approximation techniques to obtain efficient and robust particle-based algorithms.
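Schematically, and with signs and normalizations that the abstract does not fix, the density-ratio representation of the value function can be written as:

```latex
% p_t: density of the forward McKean--Vlasov SDE,
% \hat p_t: density of the reverse-time McKean--Vlasov SDE
% (up to sign and normalization conventions):
V(x,t) \;=\; -\log \frac{\hat p_t(x)}{p_t(x)}
```

Particle approximations of the two densities then yield a particle approximation of the value function and its gradient, which is where the ensemble Kalman filter and diffusion map techniques enter.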
Digital image correlation (DIC) has become a valuable tool for monitoring and evaluating mechanical experiments on cracked specimens, but the automatic detection of cracks is often difficult due to inherent noise and artefacts. Machine learning models have been extremely successful in detecting crack paths and crack tips using DIC-measured, interpolated full-field displacements as input to a convolution-based segmentation model. Still, large amounts of data are needed to train such models, whereas scientific data is often scarce because experiments are expensive and time-consuming. In this work, we present a method to directly generate large amounts of artificial displacement data of cracked specimens resembling real interpolated DIC displacements. The approach is based on generative adversarial networks (GANs). During training, the discriminator receives physical domain knowledge in the form of the derived von Mises equivalent strain. We show that this physics-guided approach leads to improved results in terms of visual sample quality, sliced Wasserstein distance, and geometry score when compared to a classical unguided GAN approach.
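As an illustration of the physical quantity fed to the discriminator, a von Mises equivalent strain can be derived from gridded displacement fields roughly as follows; this is a sketch using one common plane-strain, deviatoric-strain definition, and the paper's exact discretization may differ:

```python
import numpy as np

def von_mises_eq_strain(u, v, h=1.0):
    """Von Mises equivalent strain from 2D displacement fields u, v
    sampled on a regular grid with spacing h (rows = y, cols = x).
    Uses eps_eq = sqrt(2/3 * e:e) with e the deviatoric strain and
    a plane-strain assumption (eps_zz = 0)."""
    du_dy, du_dx = np.gradient(u, h)   # axis 0 = y, axis 1 = x
    dv_dy, dv_dx = np.gradient(v, h)
    exx, eyy = du_dx, dv_dy
    exy = 0.5 * (du_dy + dv_dx)
    tr = (exx + eyy) / 3.0             # one third of the trace
    dxx, dyy, dzz = exx - tr, eyy - tr, -tr
    return np.sqrt(2.0 / 3.0 * (dxx ** 2 + dyy ** 2 + dzz ** 2 + 2.0 * exy ** 2))
```

For a uniaxial field u = eps * x, v = 0, this definition gives a uniform equivalent strain of 2*eps/3, which is a quick sanity check for the finite-difference implementation.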
To handle the large scale of whole slide images in computational pathology, most approaches first tessellate the images into smaller patches, extract features from these patches, and finally aggregate the feature vectors with weakly-supervised learning. The performance of this workflow strongly depends on the quality of the extracted features. Recently, foundation models in computer vision showed that leveraging huge amounts of data through supervised or self-supervised learning improves feature quality and generalizability for a variety of tasks. In this study, we benchmark the most popular vision foundation models as feature extractors for histopathology data. We evaluate the models in two settings: slide-level classification and patch-level classification. We show that foundation models are a strong baseline. Our experiments demonstrate that by finetuning a foundation model on a single GPU for only two hours or three days, depending on the dataset, we can match or outperform state-of-the-art feature extractors for computational pathology. These findings imply that even with limited resources one can finetune a feature extractor tailored to a specific downstream task and dataset. This is a considerable shift from the current state, where only a few institutions with large amounts of resources and data are able to train a feature extractor. We publish all code used for training and evaluation as well as the finetuned models.
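A minimal sketch of the aggregation step in this workflow, using attention-based pooling over patch features; the weight matrices here are hypothetical placeholders, not parameters of any of the benchmarked models:

```python
import numpy as np

def attention_pool(feats, W, w):
    """Attention-based pooling of patch features into one slide-level
    embedding (illustrative sketch of weakly-supervised aggregation).
    feats: (n_patches, d) embeddings from a patch feature extractor.
    W: (d, k) and w: (k,) are learnable attention parameters."""
    scores = np.tanh(feats @ W) @ w            # (n_patches,) raw scores
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                       # softmax attention weights
    return alpha @ feats                       # (d,) weighted average
```

The output is a convex combination of the patch embeddings, so each coordinate of the slide-level vector stays within the range spanned by the patches; a linear classifier on top then yields the slide-level prediction.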
Dynamic crack branching in unsaturated porous media holds significant relevance in various fields, including geotechnical engineering, geosciences, and petroleum engineering. This article presents a numerical investigation into dynamic crack branching in unsaturated porous media using a recently developed coupled micro-periporomechanics paradigm. This paradigm extends the periporomechanics model by incorporating the micro-rotation of the solid skeleton. Within this framework, each material point is equipped with three degrees of freedom: displacement, micro-rotation, and fluid pressure. Consistent with the Cosserat continuum theory, a length scale associated with the micro-rotation of material points is inherently integrated into the model. This study encompasses several key aspects: (1) Validation of the coupled micro-periporomechanics paradigm for effectively modeling crack branching in deformable porous media, (2) Examination of the transition from a single branch to multiple branches in porous media under drained conditions, (3) Simulation of single crack branching in unsaturated porous media under dynamic loading conditions, and (4) Investigation of multiple crack branching in unsaturated porous media under dynamic loading conditions. The numerical results obtained in this study are systematically analyzed to elucidate the factors that influence dynamic crack branching in porous media subjected to dynamic loading. Furthermore, the comprehensive numerical findings underscore the efficacy and robustness of the coupled micro-periporomechanics paradigm in accurately modeling dynamic crack branching in variably saturated porous media.
The purpose of remote sensing image change detection (RSCD) is to detect differences between bi-temporal images taken at the same place. Deep learning has been used extensively in RSCD tasks, yielding significant recognition results. However, due to satellite shooting angles, thin clouds, and certain lighting conditions, the problem of fuzzy edges in the change regions of some remote sensing images cannot be properly handled by current RSCD algorithms. To solve this issue, we propose BD-MSA, a novel Body Decouple Multi-Scale Feature Aggregation change detection model that collects both global and local feature map information in the channel and spatial dimensions of the feature map during the training and prediction phases. This approach allows us to successfully extract the boundary information of the change region while also decoupling the main body of the change region from its boundary. Extensive experiments show that, compared with other models, the model described in this paper achieves the best evaluation metrics on the publicly available DSIFN-CD and S2Looking datasets.
Thermal radiation transport (TRT) is a time-dependent, high-dimensional partial integro-differential equation. In practical applications such as inertial confinement fusion, TRT is coupled to other physics such as hydrodynamics and plasmas, and the timescales of interest are often much slower than the radiation timescale. As a result, TRT is treated implicitly, and due to its stiffness and high dimensionality, it is often a dominant computational cost in multiphysics simulations. Here we develop a new approach for implicit-explicit (IMEX) integration of gray TRT in the deterministic S$_N$ setting that requires only one sweep per stage, with the simplest first-order method requiring only one sweep per time step. The equations are partitioned via a moment-based high-order low-order formulation of TRT, where the streaming operator and first two moments capture the asymptotically stiff regimes of the streaming and diffusion limits. Absorption-reemission is treated explicitly and, although stiff, is sufficiently damped by the implicit solve that we achieve stable, accurate time integration without coupling the high-order and low-order equations implicitly. Because the high-order and low-order equations are nonlinearly coupled through temperature-dependent opacities, we use a semi-implicit integration approach amenable to nonlinear partitions to facilitate IMEX partitioning and higher-order methods. Results on thick Marshak and crooked pipe benchmark problems demonstrate orders-of-magnitude improvements in accuracy and wall-clock time compared with the standard first-order implicit integration typically used.
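For reference, the gray TRT system being integrated couples the specific intensity to the material temperature; in standard notation ($\psi$ intensity, $\phi = \int_{4\pi}\psi\,d\Omega$, $\sigma$ opacity, $a$ radiation constant, $c$ speed of light, $c_v$ heat capacity) it reads, schematically:

```latex
\frac{1}{c}\frac{\partial \psi}{\partial t}
  + \Omega\cdot\nabla\psi + \sigma\,\psi
  = \frac{\sigma}{4\pi}\, a c\, T^4,
\qquad
c_v \frac{\partial T}{\partial t}
  = \sigma\left(\phi - a c\, T^4\right).
```

The stiff absorption-reemission exchange appears on the right-hand sides, while the streaming terms on the left are what each S$_N$ transport sweep inverts.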
A technique that allows a formation-enforcing control (FEC) derived from graph rigidity theory to interface with a realistic relative localization system onboard lightweight Unmanned Aerial Vehicles (UAVs) is proposed in this paper. The proposed methodology enables reliable real-world deployment of UAVs in tight formations using real relative localization systems burdened by non-negligible sensory noise, which is typically not fully taken into account in FEC algorithms. The proposed solution decomposes the gradient-descent-based FEC command into interpretable elements and then modifies these individually, based on the estimated distribution of sensory noise, such that the resulting action limits the probability of overshooting the desired formation. The behavior of the system has been analyzed, and the practicality of the proposed solution has been compared to pure gradient descent in real-world experiments, where it performed significantly better in terms of oscillations, deviation from the desired state, and convergence time.
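A toy version of the gradient-descent FEC command before any noise-aware modification might look as follows; the squared-distance-error potential is one common rigidity-based choice, not necessarily the authors' exact cost:

```python
import numpy as np

def fec_gradient(p, edges, d):
    """Gradient of the rigidity potential
        V = sum_k ( |p_i - p_j|^2 - d_k^2 )^2
    over the formation graph (illustrative sketch).
    p: (n, 2) agent positions; edges: list of (i, j) index pairs;
    d: (m,) desired inter-agent distances, one per edge."""
    g = np.zeros_like(p)
    for k, (i, j) in enumerate(edges):
        e = p[i] - p[j]
        err = e @ e - d[k] ** 2      # squared-distance error on this edge
        g[i] += 4.0 * err * e        # dV/dp_i
        g[j] -= 4.0 * err * e        # dV/dp_j (opposite sign)
    return g
```

Descending this gradient drives each pairwise distance toward its desired value; the paper's contribution is to decompose and reshape this command per element so that sensory noise does not cause overshoot.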
In many medical image classification problems, labeled data is scarce while unlabeled data is more available. Semi-supervised learning and self-supervised learning are two different research directions that can improve accuracy by learning from extra unlabeled data. Recent methods from both directions have reported significant gains on traditional benchmarks. Yet past benchmarks do not focus on medical tasks and rarely compare self- and semi-supervised methods together on an equal footing. Furthermore, past benchmarks often handle hyperparameter tuning suboptimally. First, they may not tune hyperparameters at all, leading to underfitting. Second, when tuning does occur, it often unrealistically uses a labeled validation set much larger than the training set. Both cases make previously published rankings of methods difficult to translate to practical settings. This study contributes a systematic evaluation of self- and semi-supervised methods with a unified experimental protocol intended to guide a practitioner with scarce overall labeled data and a limited compute budget. We answer two key questions: Can hyperparameter tuning be effective with realistic-sized validation sets? If so, when all methods are tuned well, which self- or semi-supervised methods reach the best accuracy? Our study compares 13 representative semi- and self-supervised methods to strong labeled-set-only baselines on 4 medical datasets. Based on more than 20,000 total GPU hours of computation, we provide valuable best practices for resource-constrained, results-focused practitioners.
Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real-world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic across learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention-based graph neural network called Hyper-SAGNN that is applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving strong performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.
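The tuple-wise self-attention at the core of such a model can be sketched in a few lines; this simplified version omits details such as diagonal masking and the static/dynamic embedding pair that Hyper-SAGNN actually compares, and the weight matrices are hypothetical:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def tuple_self_attention(X, Wq, Wk, Wv):
    """Self-attention over the node embeddings of one candidate hyperedge.
    X: (n_nodes, d) embeddings of the nodes in the tuple; because attention
    is computed within the tuple, n_nodes can vary between hyperedges."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[1]), axis=-1)  # (n, n) weights
    return A @ V    # one context-dependent ("dynamic") embedding per node
```

Because every operation is defined per tuple, the same parameters handle hyperedges of size 3, 5, or any other size, which is what makes the architecture applicable to variable-sized hyperedges.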