The continuous development of the photovoltaic (PV) industry has raised high requirements for the quality of the monocrystalline cells in PV modules. When learning to segment defect regions in PV module cell images, Tiny Hidden Cracks (THC) lead to extremely-imbalanced samples: the ratio of defect pixels to normal pixels can be as low as 1:2000. This extreme imbalance makes it difficult to segment the THC of PV module cells and also poses a challenge for semantic segmentation in general. To address the problem of segmenting defects on extremely-imbalanced THC data, this paper makes contributions in three respects: (1) it proposes an explicit measure of output imbalance; (2) it generalizes a distribution-based loss that can handle different types of output imbalance; and (3) it introduces a compound loss with an adaptive hyperparameter selection algorithm that keeps training and inference consistent while harmonizing the output imbalance on extremely-imbalanced input data. The proposed method is evaluated on four widely used deep learning architectures and four datasets with varying degrees of input imbalance. The experimental results show that the proposed method outperforms existing methods.
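A minimal sketch of the kind of compound loss such a setting calls for, combining a distribution-based (focal) term with a region-based (Dice) term and weighting them from a measured pixel-imbalance ratio; the weighting rule and both loss terms are illustrative assumptions, not the paper's exact formulation:

import torch

def imbalance_ratio(target):
    # fraction of defect pixels, e.g. roughly 1/2000 for THC data
    return target.float().mean().clamp_min(1e-8)

def focal_loss(logits, target, gamma=2.0):
    p = torch.sigmoid(logits)
    pt = torch.where(target.bool(), p, 1 - p)
    return (-(1 - pt) ** gamma * torch.log(pt.clamp_min(1e-8))).mean()

def dice_loss(logits, target, eps=1e-6):
    p = torch.sigmoid(logits)
    inter = (p * target).sum()
    return 1 - (2 * inter + eps) / (p.sum() + target.sum() + eps)

def compound_loss(logits, target):
    # hypothetical adaptive weighting: the rarer the defect pixels,
    # the more weight goes to the region-based (Dice) term
    r = imbalance_ratio(target)
    alpha = torch.clamp(-torch.log10(r) / 4.0, 0.0, 1.0)
    return (1 - alpha) * focal_loss(logits, target) + alpha * dice_loss(logits, target)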
In the field of medical imaging, large-scale public datasets with high-quality annotations are scarce due to data privacy and annotation cost. To address this issue, we release SynFundus-1M, a high-quality synthetic dataset containing over \textbf{1 million} fundus images covering 11 disease types. Moreover, we intentionally diversify the readability of the images and accordingly provide four types of quality scores for each image. To the best of our knowledge, SynFundus-1M is currently the largest fundus dataset with the most sophisticated annotations. All images are generated by a Denoising Diffusion Probabilistic Model, named SynFundus-Generator. Trained on over 1.3 million private fundus images, our SynFundus-Generator achieves significantly superior performance in generating fundus images compared to recent related works. Furthermore, when we blend synthetic images from SynFundus-1M with real fundus images, ophthalmologists can hardly distinguish the synthetic images from the real ones. Through extensive experiments, we demonstrate that both convolutional neural networks (CNNs) and Vision Transformers (ViTs) can benefit from SynFundus-1M, whether used for pretraining or for direct training. Compared to datasets like ImageNet or EyePACS, models trained on SynFundus-1M not only achieve better performance but also converge faster on various downstream tasks.
Singularly perturbed boundary value problems pose a significant challenge for numerical approximation because of the presence of sharp boundary layers. These sharp boundary layers are responsible for the stiffness of solutions, which leads to large computational errors if not properly handled. It is well known that classical numerical methods, as well as Physics-Informed Neural Networks (PINNs), require special treatment near the boundary, e.g., extensive mesh refinement or finer collocation points, in order to obtain an accurate approximate solution, especially inside the stiff boundary layer. In this article, we modify PINNs and construct new semi-analytic SL-PINNs suitable for singularly perturbed boundary value problems. Performing a boundary layer analysis, we first find the corrector functions describing the singular behavior of the stiff solutions inside the boundary layers. We then obtain SL-PINN approximations of the singularly perturbed problems by embedding the explicit correctors in the structure of the PINNs or by training the correctors together with the PINN approximations. Our numerical experiments confirm that the new SL-PINN methods produce stable and accurate approximations for stiff solutions.
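As a minimal sketch of the idea, consider the 1-D model problem $-\varepsilon u'' + u' = 1$ on $(0,1)$ with $u(0)=u(1)=0$, whose boundary layer sits near $x=1$; the network size, training setup, and explicit corrector $\exp(-(1-x)/\varepsilon)$ below are illustrative assumptions rather than the paper's exact construction:

import torch

eps = 1e-3
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
c = torch.nn.Parameter(torch.zeros(1))  # trainable corrector amplitude

def u(x):
    # smooth PINN part plus explicit boundary-layer corrector near x = 1
    return net(x) + c * torch.exp(-(1.0 - x) / eps)

def residual(x):
    x = x.requires_grad_(True)
    ux = torch.autograd.grad(u(x).sum(), x, create_graph=True)[0]
    uxx = torch.autograd.grad(ux.sum(), x, create_graph=True)[0]
    return -eps * uxx + ux - 1.0

opt = torch.optim.Adam(list(net.parameters()) + [c], lr=1e-3)
xc = torch.rand(256, 1)             # interior collocation points
xb = torch.tensor([[0.0], [1.0]])   # boundary points
for _ in range(5000):
    opt.zero_grad()
    loss = residual(xc).pow(2).mean() + u(xb).pow(2).mean()
    loss.backward()
    opt.step()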
The joint modeling of multiple longitudinal biomarkers together with a time-to-event outcome is a challenging modeling task of continued scientific interest. In particular, the computational complexity of high-dimensional (generalized) mixed effects models often restricts the flexibility of shared parameter joint models, even when the subject-specific marker trajectories follow highly nonlinear courses. We propose a parsimonious multivariate functional principal components representation of the shared random effects. This allows better scalability, as the dimension of the random effects does not directly increase with the number of markers but only with the chosen number of principal component basis functions used in the approximation of the random effects. The functional principal component representation additionally makes it possible to estimate highly flexible subject-specific random trajectories without parametric assumptions. The modeled trajectories can thus be distinctly different for each biomarker. We build on the framework of flexible Bayesian additive joint models implemented in the R package 'bamlss', which also supports estimation of nonlinear covariate effects via Bayesian P-splines. The flexible yet parsimonious functional principal components basis used in the estimation of the joint model is estimated in a preliminary step. We validate our approach in a simulation study and illustrate its advantages by analyzing a study on primary biliary cholangitis.
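Schematically, and with notation chosen here for illustration rather than taken from the paper, the shared random effects are expanded in a truncated multivariate functional principal components basis, so that the subject-specific scores are shared across all $K$ markers and the random-effect dimension is the number of basis functions $M$ rather than growing with $K$:
\[
  b_{ik}(t) \;\approx\; \sum_{m=1}^{M} \rho_{im}\,\psi_{mk}(t), \qquad
  \rho_{im} \sim \mathcal{N}(0,\nu_m),
\]
\[
  \eta_{ik}(t) = \mu_k(t,\mathbf{x}_i) + b_{ik}(t), \qquad
  h_i(t) = h_0(t)\exp\!\Big\{\boldsymbol{\gamma}^\top\mathbf{z}_i + \sum_{k=1}^{K}\alpha_k\,\eta_{ik}(t)\Big\},
\]
where $\psi_{mk}$ denotes the $k$-th marker component of the $m$-th multivariate eigenfunction estimated in the preliminary step.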
The analysis of the psoas muscle in morphological and functional imaging has proved to be an accurate approach to assess sarcopenia, i.e., a systemic loss of skeletal muscle mass and function that may be correlated with multifactorial etiological aspects. Including sarcopenia assessment in a radiological workflow would require computational pipelines for image processing that guarantee segmentation reliability and a significant degree of automation. The present study utilizes three-dimensional numerical schemes for psoas segmentation in low-dose X-ray computed tomography images. Specifically, we focus on the level set methodology and compare the performance of two standard approaches, a classical evolution model and a three-dimensional geodesic model, with the performance of an original first-order modification of the latter. The results of this analysis show that these gradient-based schemes guarantee reliability with respect to manual segmentation and that the first-order scheme requires a computational burden significantly smaller than that of the second-order approach.
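A minimal three-dimensional geodesic level-set sketch, using scikit-image's morphological geodesic active contour as a stand-in for the schemes compared in the study; the function choice and parameter values are illustrative assumptions:

import numpy as np
from skimage.segmentation import (inverse_gaussian_gradient,
                                  morphological_geodesic_active_contour)

def segment_psoas(ct_volume, seed_mask, n_iter=200):
    # ct_volume: 3-D array of CT intensities; seed_mask: rough initial mask
    # edge-stopping function: small near strong gradients (muscle boundary)
    g = inverse_gaussian_gradient(ct_volume.astype(float), alpha=100.0, sigma=2.0)
    # evolve the initial surface toward the psoas boundary
    return morphological_geodesic_active_contour(
        g, n_iter, init_level_set=seed_mask.astype(np.int8),
        smoothing=2, balloon=1)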
The accurate and efficient evaluation of Newtonian potentials over general 2-D domains is important for the numerical solution of Poisson's equation and volume integral equations. In this paper, we present a simple and efficient high-order algorithm for computing the Newtonian potential over a planar domain discretized by an unstructured mesh. The algorithm is based on the use of Green's third identity for transforming the Newtonian potential into a collection of layer potentials over the boundaries of the mesh elements, which can be easily evaluated by the Helsing-Ojala method. One important component of our algorithm is the use of high-order (up to order 20) bivariate polynomial interpolation in the monomial basis, for which we provide extensive justification. The performance of our algorithm is illustrated through several numerical experiments.
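A minimal sketch of the bivariate monomial interpolation building block, here done with a dense least-squares solve on arbitrary nodes; the node choice and solver are illustrative assumptions, and the paper's per-element construction and conditioning analysis go well beyond this:

import numpy as np

def monomial_basis(x, y, order):
    # all monomials x**i * y**j with i + j <= order, evaluated at (x, y)
    return np.column_stack([x**i * y**j
                            for i in range(order + 1)
                            for j in range(order + 1 - i)])

def fit_monomial_coeffs(x, y, f, order=20):
    # least-squares monomial coefficients of samples f at nodes (x, y)
    A = monomial_basis(x, y, order)
    coeffs, *_ = np.linalg.lstsq(A, f, rcond=None)
    return coeffs

def eval_monomial(coeffs, x, y, order=20):
    return monomial_basis(x, y, order) @ coeffs

Such Vandermonde-type systems become increasingly ill-conditioned at high order, which is why the use of the monomial basis up to order 20 requires the kind of careful justification the paper provides.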
Brain atrophy and white matter hyperintensity (WMH) are critical neuroimaging features for ascertaining brain injury in cerebrovascular disease and multiple sclerosis. Automated segmentation and quantification are desirable, but existing methods require high-resolution MRI with good signal-to-noise ratio (SNR). This precludes application to clinical and low-field portable MRI (pMRI) scans, thus hampering large-scale tracking of atrophy and WMH progression, especially in underserved areas where pMRI has huge potential. Here we present a method that segments white matter hyperintensity and 36 brain regions from scans of any resolution and contrast (including pMRI) without retraining. We show results on six public datasets and on a private dataset with paired high- and low-field scans (3T and 64mT), where we attain strong correlations between the WMH ($\rho$=.85) and hippocampal (r=.89) volumes estimated at the two field strengths. Our method is publicly available as part of FreeSurfer, at: //surfer.nmr.mgh.harvard.edu/fswiki/WMH-SynthSeg.
We implement AntClust, a clustering algorithm based on the chemical recognition system of ants, and use it to cluster images of cars. We give a short summary of the main working principles of the algorithm as devised in the original paper [1]. Further, we describe how to define a similarity function for images and how the implementation is used to cluster images of cars from the vehicle re-identification data set. We then test the clustering performance of AntClust against DBSCAN, HDBSCAN and OPTICS. Finally, the rule set, one of the core parts of AntClust, can be easily redefined with our implementation, opening a way for other bio-inspired algorithms to find rules in an automated process. The implementation can be found on GitLab [9].
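As an illustration of the two ingredients being compared, here is a possible image-similarity function (operating on feature vectors from any pretrained CNN) and a DBSCAN baseline run on the same similarities; the feature extractor, thresholds, and normalization are assumptions, not the implementation from the repository:

import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics.pairwise import cosine_similarity

def similarity(feat_a, feat_b):
    # similarity in [0, 1] between two image feature vectors
    return 0.5 * (cosine_similarity(feat_a[None], feat_b[None])[0, 0] + 1.0)

def dbscan_baseline(features, eps=0.3, min_samples=5):
    # DBSCAN on distances derived from the same similarity used by AntClust
    dist = np.clip(1.0 - 0.5 * (cosine_similarity(features) + 1.0), 0.0, None)
    return DBSCAN(eps=eps, min_samples=min_samples,
                  metric="precomputed").fit_predict(dist)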
The locations of different mRNA molecules can be revealed by multiplexed in situ RNA detection. By assigning detected mRNA molecules to individual cells, it is possible to identify many different cell types in parallel. This in turn enables investigation of the spatial cellular architecture in tissue, which is crucial for furthering our understanding of biological processes and diseases. However, cell typing typically depends on the segmentation of cell nuclei, which is often done based on images of a DNA stain, such as DAPI. Limiting cell definition to a nuclear stain makes it fundamentally difficult to determine accurate cell borders, and thereby also difficult to assign mRNA molecules to the correct cell. We have therefore developed a computational tool that segments cells solely based on the local composition of mRNA molecules. First, a small neural network is trained to compute attractive and repulsive edges between pairs of mRNA molecules. The signed graph is then partitioned by a mutex watershed into components corresponding to different cells. We evaluated our method on two publicly available datasets and compared it against the current state-of-the-art and older baselines. We conclude that combining neural networks with combinatorial optimization is a promising approach for cell segmentation of in situ transcriptomics data.
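A minimal sketch of the mutex-watershed partitioning step on a signed edge graph (a simplified union-find variant of Wolf et al.'s algorithm); the neural network that scores attractive and repulsive edges is assumed to exist elsewhere and is not shown:

def mutex_watershed(n_nodes, edges):
    # edges: list of (u, v, weight); weight > 0 attractive, weight < 0 repulsive
    parent = list(range(n_nodes))
    mutex = [set() for _ in range(n_nodes)]   # per-root forbidden nodes

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]     # path halving
            x = parent[x]
        return x

    def constrained(ra, rb):
        small, other = (ra, rb) if len(mutex[ra]) < len(mutex[rb]) else (rb, ra)
        return any(find(x) == other for x in mutex[small])

    for u, v, w in sorted(edges, key=lambda e: -abs(e[2])):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        if w > 0 and not constrained(ru, rv):  # attractive: merge if allowed
            parent[rv] = ru
            mutex[ru] |= mutex[rv]
        elif w < 0:                            # repulsive: record constraint
            mutex[ru].add(v)
            mutex[rv].add(u)

    return [find(i) for i in range(n_nodes)]   # cluster label per node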
Current approaches to generic segmentation start by creating a hierarchy of nested image partitions and then specifying a segmentation from it. Our first contribution is to describe several ways, most of them new, of specifying segmentations using the hierarchy elements. We then consider the best hierarchy-induced segmentation specified by a limited number of hierarchy elements. We focus on a common quality measure for binary segmentations, the Jaccard index (also known as IoU). Optimizing the Jaccard index is highly non-trivial, and yet we propose an efficient approach for doing exactly that. In this way we obtain algorithm-independent upper bounds on the quality of any segmentation created from the hierarchy. We found that the obtainable segmentation quality varies significantly depending on the way the segments are specified by the hierarchy elements, and that representing a segmentation with only a few hierarchy elements is often possible. (Code is available.)
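For concreteness, here is the Jaccard index used as the quality measure, together with a naive greedy sweep over hierarchy elements; the greedy baseline is only for intuition and is not the efficient optimization proposed in the paper:

import numpy as np

def jaccard(pred, gt):
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 1.0

def greedy_selection(regions, gt, budget):
    # regions: list of boolean masks (hierarchy elements); returns a union mask
    seg = np.zeros_like(gt, dtype=bool)
    for _ in range(budget):
        best = max(regions, key=lambda r: jaccard(seg | r, gt))
        if jaccard(seg | best, gt) <= jaccard(seg, gt):
            break
        seg |= best
    return seg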
In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints such as computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communication and synchronisation, and is difficult to scale. In this paper we present four algorithms to solve these problems. Their combination enables each agent to improve its task allocation strategy through reinforcement learning, while adjusting how much it explores the system in response to how optimal it believes its current strategy to be, given its past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, limiting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, which then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to within 6.7% of the theoretical optimum for the system configurations considered. It provides 5x better performance recovery than approaches with no knowledge retention when system connectivity is impacted, and is tested on systems of up to 100 agents with less than a 9% impact on the algorithms' performance.
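An illustrative sketch of the core idea that each agent adapts how much it explores according to how optimal it believes its current strategy to be; the Q-learning update and the confidence heuristic below are assumptions, not the paper's four algorithms:

import random
from collections import defaultdict

class AllocatorAgent:
    def __init__(self, actions, lr=0.1, gamma=0.9):
        self.q = defaultdict(float)        # (state, action) -> value estimate
        self.actions, self.lr, self.gamma = actions, lr, gamma
        self.recent_error = 1.0            # proxy for "how optimal am I?"

    def epsilon(self):
        # explore more when recent prediction error is high, less when the
        # current strategy looks close to optimal
        return min(1.0, 0.05 + self.recent_error)

    def act(self, state):
        if random.random() < self.epsilon():
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        target = reward + self.gamma * max(self.q[(next_state, a)]
                                           for a in self.actions)
        td_error = target - self.q[(state, action)]
        self.q[(state, action)] += self.lr * td_error
        # smoothed estimate of how surprising outcomes still are
        self.recent_error = 0.9 * self.recent_error + 0.1 * abs(td_error)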