In this paper we present a non-local numerical scheme based on the Local Discontinuous Galerkin method for a non-local diffusive partial differential equation with application to traffic flow. In this model, the velocity is determined by both the average of the traffic density as well as the changes in the traffic density at a neighborhood of each point. We discuss nonphysical behaviors that can arise when including diffusion, and our measures to prevent them in our model. The numerical results suggest that this is an accurate method for solving this type of equation and that the model can capture desired traffic flow behavior. We show that computation of the non-local convolution results in $\mathcal{O}(n^2)$ complexity, but the increased computation time can be mitigated with high-order schemes like the one proposed.
We describe a new general method for segmentation in MRI scans using Topological Data Analysis (TDA), offering several advantages over traditional machine learning approaches. It works in three steps, first identifying the whole object to segment via automatic thresholding, then detecting a distinctive subset whose topology is known in advance, and finally deducing the various components of the segmentation. Although convoking classical ideas of TDA, such an algorithm has never been proposed separately from deep learning methods. To achieve this, our approach takes into account, in addition to the homology of the image, the localization of representative cycles, a piece of information that seems never to have been exploited in this context. In particular, it offers the ability to perform segmentation without the need for large annotated data sets. TDA also provides a more interpretable and stable framework for segmentation by explicitly mapping topological features to segmentation components. By adapting the geometric object to be detected, the algorithm can be adjusted to a wide range of data segmentation challenges. We carefully study the examples of glioblastoma segmentation in brain MRI, where a sphere is to be detected, as well as myocardium in cardiac MRI, involving a cylinder, and cortical plate detection in fetal brain MRI, whose 2D slices are circles. We compare our method to state-of-the-art algorithms.
This paper introduces a novel hierarchical Bayesian model specifically designed to address challenges in Inverse Uncertainty Quantification (IUQ) for time-dependent problems in nuclear Thermal Hydraulics (TH) systems. The unique characteristics of time-dependent data, such as high dimensionality and correlation in model outputs requires special attention in the IUQ process. By integrating Gaussian Processes (GP) with Principal Component Analysis (PCA), we efficiently construct surrogate models that effectively handle the complexity of dynamic TH systems. Additionally, we incorporate Neural Network (NN) models for time series regression, enhancing the computational accuracy and facilitating derivative calculations for efficient posterior sampling using the Hamiltonian Monte Carlo Method - No U-Turn Sampler (NUTS). We demonstrate the effectiveness of this hierarchical Bayesian approach using the transient experiments in the PSBT benchmark. Our results show improved estimates of Physical Model Parameters' posterior distributions and a reduced tendency for over-fitting, compared to conventional single-level Bayesian models. This approach offers a promising framework for extending IUQ to more complex, time-dependent problems.
In this paper, we propose an orthogonal block wise Kaczmarz (POBK) algorithm based on preprocessing techniques to solve large-scale sparse linear systems $Ax=f$. Firstly, the Reverse Cuthill McKee Algorithm (RCM) algorithm is used to preprocess the linear system, and then a new partitioning strategy is proposed to divide orthogonal blocks into one category, in order to accelerate the convergence rate of the Kaczmarz algorithm. The convergence of the POBK algorithm has been theoretically proven, and a theoretical analysis of its faster convergence is also provided. In addition, the experimental results confirm that this algorithm is far superior to GRBK, RBK(k), and GREBK(k) algorithms in both iteration steps (IT) and CPU time aspects.
This paper proposes a novel unsupervised domain adaption (UDA) method based on contrastive bi-projector (CBP), which can improve the existing UDA methods. It is called CBPUDA here, which effectively promotes the feature extractors (FEs) to reduce the generation of ambiguous features for classification and domain adaption. The CBP differs from traditional bi-classifier-based methods at that these two classifiers are replaced with two projectors of performing a mapping from the input feature to two distinct features. These two projectors and the FEs in the CBPUDA can be trained adversarially to obtain more refined decision boundaries so that it can possess powerful classification performance. Two properties of the proposed loss function are analyzed here. The first property is to derive an upper bound of joint prediction entropy, which is used to form the proposed loss function, contrastive discrepancy (CD) loss. The CD loss takes the advantages of the contrastive learning and the bi-classifier. The second property is to analyze the gradient of the CD loss and then overcome the drawback of the CD loss. The result of the second property is utilized in the development of the gradient scaling (GS) scheme in this paper. The GS scheme can be exploited to tackle the unstable problem of the CD loss because training the CBPUDA requires using contrastive learning and adversarial learning at the same time. Therefore, using the CD loss with the GS scheme overcomes the problem mentioned above to make features more compact for intra-class and distinguishable for inter-class. Experimental results express that the CBPUDA is superior to conventional UDA methods under consideration in this paper for UDA and fine-grained UDA tasks.
Neural Radiance Fields (NeRF) have recently emerged as a powerful method for image-based 3D reconstruction, but the lengthy per-scene optimization limits their practical usage, especially in resource-constrained settings. Existing approaches solve this issue by reducing the number of input views and regularizing the learned volumetric representation with either complex losses or additional inputs from other modalities. In this paper, we present KeyNeRF, a simple yet effective method for training NeRF in few-shot scenarios by focusing on key informative rays. Such rays are first selected at camera level by a view selection algorithm that promotes baseline diversity while guaranteeing scene coverage, then at pixel level by sampling from a probability distribution based on local image entropy. Our approach performs favorably against state-of-the-art methods, while requiring minimal changes to existing NeRF codebases.
In this paper, we investigate the capacity of a multiple-input multiple-output (MIMO) optical intensity channel (OIC) under per-antenna peak- and average-intensity constraints. We first consider the case where the average intensities of input are required to be equal to preassigned constants due to the requirement of illumination quality and color temperature. When the channel graph of the MIMO OIC is strongly connected, we prove that the strongest eigen-subchannel must have positive channel gains, which simplifies the capacity analysis. Then we derive various capacity bounds by utilizing linear precoding, generalized entropy power inequality, and QR decomposition, etc. These bounds are numerically verified to approach the capacity in the low or high signal-to-noise ratio regime. Specifically, when the channel rank is one less than the number of transmit antennas, we derive an equivalent capacity expression from the perspective of convex geometry, and new lower bounds are derived based on this equivalent expression. Finally, the developed results are extended to the more general case where the average intensities of input are required to be no larger than preassigned constants.
In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition. We view the expression information as the combination of the shared information (expression similarities) across different expressions and the unique information (expression-specific variations) for each expression. More specifically, FDRL mainly consists of two crucial networks: a Feature Decomposition Network (FDN) and a Feature Reconstruction Network (FRN). In particular, FDN first decomposes the basic features extracted from a backbone network into a set of facial action-aware latent features to model expression similarities. Then, FRN captures the intra-feature and inter-feature relationships for latent features to characterize expression-specific variations, and reconstructs the expression feature. To this end, two modules including an intra-feature relation modeling module and an inter-feature relation modeling module are developed in FRN. Experimental results on both the in-the-lab databases (including CK+, MMI, and Oulu-CASIA) and the in-the-wild databases (including RAF-DB and SFEW) show that the proposed FDRL method consistently achieves higher recognition accuracy than several state-of-the-art methods. This clearly highlights the benefit of feature decomposition and reconstruction for classifying expressions.
Non-IID data present a tough challenge for federated learning. In this paper, we explore a novel idea of facilitating pairwise collaborations between clients with similar data. We propose FedAMP, a new method employing federated attentive message passing to facilitate similar clients to collaborate more. We establish the convergence of FedAMP for both convex and non-convex models, and propose a heuristic method to further improve the performance of FedAMP when clients adopt deep neural networks as personalized models. Our extensive experiments on benchmark data sets demonstrate the superior performance of the proposed methods.
In this paper, we proposed to apply meta learning approach for low-resource automatic speech recognition (ASR). We formulated ASR for different languages as different tasks, and meta-learned the initialization parameters from many pretraining languages to achieve fast adaptation on unseen target language, via recently proposed model-agnostic meta learning algorithm (MAML). We evaluated the proposed approach using six languages as pretraining tasks and four languages as target tasks. Preliminary results showed that the proposed method, MetaASR, significantly outperforms the state-of-the-art multitask pretraining approach on all target languages with different combinations of pretraining languages. In addition, since MAML's model-agnostic property, this paper also opens new research direction of applying meta learning to more speech-related applications.
In this paper, we propose the joint learning attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on the use of either model exist (e.g., for the task of image captioning), training such existing network architectures typically require pre-defined label sequences. For multi-label classification, it would be desirable to have a robust inference process, so that the prediction error would not propagate and thus affect the performance. Our proposed model uniquely integrates attention and Long Short Term Memory (LSTM) models, which not only addresses the above problem but also allows one to identify visual objects of interests with varying sizes without the prior knowledge of particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, prediction of multiple labels can be efficiently achieved by our proposed network model.