The development of successful artificial intelligence models for chest X-ray analysis relies on large, diverse datasets with high-quality annotations. While several databases of chest X-ray images have been released, most include disease diagnosis labels but lack detailed pixel-level anatomical segmentation labels. To address this gap, we introduce an extensive multi-center chest X-ray segmentation dataset with uniform, fine-grained anatomical annotations for images from six well-known publicly available databases: CANDID-PTX, ChestX-ray8, CheXpert, MIMIC-CXR-JPG, PadChest, and VinDr-CXR, resulting in 676,803 segmentation masks. Our methodology uses the HybridGNet model to ensure consistent, high-quality segmentations across all datasets. The resulting masks were rigorously validated through expert physician evaluation and automatic quality control. Additionally, we provide an individual quality index for each mask and an overall quality estimate for each dataset. This dataset serves as a valuable resource for the broader scientific community, streamlining the development and assessment of innovative methodologies in chest X-ray analysis. The CheXmask dataset is publicly available at: //physionet.org/content/chexmask-cxr-segmentation-data/
Embedding graphs in continuous spaces is a key factor in designing and developing algorithms for automatic information extraction that can be applied to diverse tasks (e.g., learning, inferring, predicting). The reliability of graph embeddings directly depends on how closely the geometry of the continuous space matches the graph structure. Manifolds are mathematical structures whose topological spaces can incorporate graph characteristics, in particular the distances between nodes. State-of-the-art manifold-based graph embedding algorithms rely on the assumption that the projection onto a tangent space of each point in the manifold (corresponding to a node in the graph) locally resembles a Euclidean space. Although this condition helps in achieving efficient analytical solutions to the embedding problem, it is not an adequate set-up for modern real-life graphs, which are characterized by weighted connections across nodes, often computed over sparse datasets with missing records. In this work, we introduce a new class of manifold, named soft manifold, that can address this situation. In particular, soft manifolds are mathematical structures with spherical symmetry where the tangent spaces to each point are hypocycloids whose shape is defined according to the velocity of information propagation across the data points. Using soft manifolds for graph embedding, we can provide continuous spaces to pursue any task in data analysis over complex datasets. Experimental results on reconstruction tasks on synthetic and real datasets show that the proposed approach enables more accurate and reliable characterization of graphs in continuous spaces than the state of the art.
Time series forecasting represents a significant and challenging task across various fields. Recently, methods based on mode decomposition have dominated the forecasting of complex time series because of the advantages of capturing local characteristics and extracting intrinsic modes from data. Unfortunately, most models fail to capture the implied volatilities that contain significant information. To enhance the prediction of contemporary diverse and complex time series, we propose a novel time series forecasting paradigm that integrates decomposition with the capability to capture the underlying fluctuation information of the series. In our methodology, we implement the Variational Mode Decomposition algorithm to decompose the time series into K distinct sub-modes. Following this decomposition, we apply the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model to extract the volatility information in these sub-modes. Subsequently, both the numerical data and the volatility information for each sub-mode are harnessed to train a neural network. This network is adept at predicting the information of the sub-modes, and we aggregate the predictions of all sub-modes to generate the final output. By integrating econometric and artificial intelligence methods, and taking into account both the numerical and volatility information of the time series, our proposed framework demonstrates superior performance in time series forecasting, as evidenced by the significant decrease in MSE, RMSE, and MAPE in our comparative experimental results.
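As a rough illustration of the volatility-extraction and aggregation steps described above, the sketch below computes GARCH(1,1) conditional variances for one sub-mode and sums per-sub-mode predictions. It is a minimal sketch under stated assumptions: the coefficients `omega`, `alpha`, and `beta` are supplied by hand rather than fitted, the Variational Mode Decomposition and the neural network are omitted, and all function names are hypothetical rather than taken from the paper.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances under a GARCH(1,1) recursion:
    sigma2[t] = omega + alpha * residuals[t-1]**2 + beta * sigma2[t-1].
    Assumption: coefficients are given, not estimated from data."""
    if not residuals:
        return []
    # Initialise with the mean squared residual (one common choice, assumed here).
    sigma2 = [sum(e * e for e in residuals) / len(residuals)]
    for t in range(1, len(residuals)):
        sigma2.append(omega + alpha * residuals[t - 1] ** 2 + beta * sigma2[-1])
    return sigma2


def build_features(sub_mode, sigma2):
    """Pair each numerical value with its volatility estimate,
    giving one [value, variance] row per time step for a downstream model."""
    return [[v, s] for v, s in zip(sub_mode, sigma2)]


def aggregate(sub_mode_predictions):
    """Sum the predictions of all sub-modes, step by step, into the final forecast."""
    return [sum(values) for values in zip(*sub_mode_predictions)]


variances = garch11_variance([1.0, -1.0, 2.0], omega=0.1, alpha=0.2, beta=0.7)
features = build_features([1.0, -1.0, 2.0], variances)
forecast = aggregate([[1.0, 2.0], [3.0, 4.0]])
```

In practice the coefficients would be estimated per sub-mode, for example by maximum likelihood; fixing them here only keeps the recursion easy to follow.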
Image steganography is a technique for hiding secret information inside another image, so that the secret is not visible to the human eye and can be recovered when needed. Most existing image steganography methods have low hiding robustness when the container images are affected by distortion, such as Gaussian noise and lossy compression. This paper proposes PRIS to improve the robustness of image steganography. It is based on invertible neural networks and places two enhancement modules before and after the extraction process, trained with a 3-step strategy. Moreover, we account for rounding error, which is always ignored by existing methods even though it is unavoidable in practice. A gradient approximation function (GAF) is also proposed to overcome the non-differentiability of rounding distortion. Experimental results show that PRIS outperforms the state-of-the-art robust image steganography method in both robustness and practicability. Code is available at //github.com/yanghangAI/PRIS, and a practical demonstration of our model is available at //yanghang.site/hide/.
The diverse spectrum of material characteristics, including band gap, mechanical moduli, color, phonon and electronic density of states, along with catalytic and surface properties, is intricately intertwined with the atomic structure and the corresponding interatomic bond lengths. This interconnection extends to the interplanar spacings within a crystalline lattice. Analyzing these interplanar spacings and understanding any deviations, whether lattice compression or expansion, commonly referred to as strain, is of paramount significance for resolving open questions in the field. Transmission Electron Microscopy (TEM) is widely used to capture atomic-scale ordering, facilitating direct investigation of interplanar spacings. However, creating critical contour maps for visualizing and interpreting lattice stresses in TEM images remains a challenging task. Here we developed a Python code for TEM image processing that can handle a wide range of materials, including nanoparticles, 2D materials, pure crystals, and solid solutions. This algorithm converts local differences in interplanar spacings into contour maps, allowing for a visual representation of lattice expansion and compression. The tool is generic and can significantly aid in analyzing material properties using TEM images, allowing for a more in-depth exploration of the underlying science behind strain engineering via strain contour maps at the atomic level.
The reconstruction task in photoacoustic tomography can vary considerably depending on the measured targets, the geometry, and especially the quantity we want to recover. Specifically, as the signal is generated by the coupling of light and sound through the photoacoustic effect, it is possible to recover acoustic as well as optical tissue parameters. This is referred to as quantitative imaging, i.e., the correct recovery of physical parameters and not just a qualitative image. In this chapter, we aim to give an overview of established reconstruction techniques in photoacoustic tomography. We start with the modelling of the optical and acoustic phenomena, which is necessary for a reliable recovery of quantitative values. Furthermore, we give an overview of approaches to the tomographic reconstruction problem with an emphasis on the recovery of quantitative values, from direct and fast analytic approaches to computationally involved optimisation-based techniques and recent data-driven approaches.
The task of bandwidth extension addresses the generation of missing high frequencies of audio signals based on knowledge of the low-frequency part of the sound. This task applies to various problems, such as audio coding or audio restoration. In this article, we focus on efficient bandwidth extension of monophonic and polyphonic musical signals using a differentiable digital signal processing (DDSP) model. Such a model is composed of a neural network part with relatively few parameters, trained to infer the parameters of a differentiable digital signal processing model, which efficiently generates the output full-band audio signal. We first address bandwidth extension of monophonic signals, and then propose two methods to explicitly handle polyphonic signals. The benefits of the proposed models are first demonstrated on monophonic and polyphonic synthetic data against a baseline and a deep-learning-based ResNet model. The models are next evaluated on recorded monophonic and polyphonic data, for a wide variety of instruments and musical genres. We show that all proposed models surpass a higher-complexity deep learning model on an objective metric computed in the frequency domain. A MUSHRA listening test confirms the superiority of the proposed approach in terms of perceptual quality.
Existing methods attempt to improve models' generalization ability on real-world hazy images by exploring well-designed training schemes (e.g., CycleGAN, prior loss). However, most of them need very complicated training procedures to achieve satisfactory results. In this work, we present a novel testing pipeline called Prompt-based Test-Time Dehazing (PTTD) to help generate visually pleasing results for real-captured hazy images during the inference phase. We experimentally find that, given a dehazing model trained on synthetic data, fine-tuning the statistics (i.e., mean and standard deviation) of the encoding features allows PTTD to narrow the domain gap, boosting the performance of real image dehazing. Accordingly, we first apply a prompt generation module (PGM) to generate a visual prompt, which is the source of appropriate statistical perturbations for the mean and standard deviation. We then insert a feature adaptation module (FAM) into existing dehazing models to adjust the original statistics under the guidance of the generated prompt. Note that PTTD is model-agnostic and can be combined with various state-of-the-art dehazing models trained on synthetic hazy-clean pairs. Extensive experimental results demonstrate that PTTD is flexible and achieves superior performance against state-of-the-art dehazing methods in real-world scenarios. The source code of PTTD will be made available at //github.com/cecret3350/PTTD-Dehazing.
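To make the idea of adjusting feature statistics concrete, the sketch below re-normalises one feature channel to a target mean and standard deviation. This is only an illustrative assumption about what such an adjustment could look like (an AdaIN-style shift and rescale); the abstract does not specify how the feature adaptation module combines the prompt with the original statistics, and the function names and the `eps` parameter are hypothetical.

```python
import math


def channel_stats(values):
    """Return the mean and (population) standard deviation of one feature channel."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def adapt_statistics(features, target_mean, target_std, eps=1e-5):
    """Normalise a channel with its own statistics, then rescale and shift it
    to the target statistics. Assumption: the targets stand in for statistics
    derived from a prompt; eps avoids division by zero for constant channels."""
    mean, std = channel_stats(features)
    return [(v - mean) / (std + eps) * target_std + target_mean for v in features]


adapted = adapt_statistics([1.0, 2.0, 3.0], target_mean=10.0, target_std=2.0)
new_mean, new_std = channel_stats(adapted)
```

After the call, `new_mean` and `new_std` are close to the requested targets, while the relative ordering of the feature values is unchanged.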
Optical backbone networks are required to be highly dynamic in supporting requests with flexible bandwidth granularities to cope with the demands of new broadband wireless and fixed access networks. To provide this flexibility, services are offered by taking the requested bandwidth profile into consideration, instead of assigning a fixed amount of bandwidth to each request. New techniques are developed for resource management in elastic optical networks to realize services with a specified bandwidth profile, consisting of the minimum, average, and maximum required number of spectrum slots, in addition to the holding time. In this work, two new schemes are proposed to realize such services, exploiting a probabilistic spectrum partitioning approach. This probabilistic spectrum partitioning scheme is devised to increase the chance of accommodating requests and consequently lower the request blocking probability. It assigns different probabilities to the spectrum partitions that contribute to a given service realization. Taking advantage of this probabilistic spectrum partitioning and profile-based routing, we introduce two multistage spectrum assignment methods that make a given lightpath meet the requested service profile constraints, considering the time-weighted average of the assigned spectrum slots. The results indicate that our algorithms can successfully realize requests with a probability of 0.993 for offered loads below 400 Erlang.
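The time-weighted average of assigned spectrum slots mentioned above can be sketched as follows. This is a minimal illustration, assuming a multistage assignment is represented as a list of `(slots, duration)` pairs and that a profile is met when every stage stays within the minimum and maximum and the time-weighted average reaches the requested average; both the representation and the exact acceptance rule are assumptions, not the paper's definition.

```python
def time_weighted_average(stages):
    """Average number of assigned slots, weighting each stage by its duration.
    `stages` is a list of (slots, duration) pairs (hypothetical representation)."""
    total_time = sum(duration for _, duration in stages)
    if total_time <= 0:
        raise ValueError("total holding time must be positive")
    return sum(slots * duration for slots, duration in stages) / total_time


def meets_profile(stages, min_slots, avg_slots, max_slots):
    """Check a multistage assignment against a bandwidth profile.
    Assumed rule: each stage lies in [min_slots, max_slots] and the
    time-weighted average is at least avg_slots."""
    within_bounds = all(min_slots <= slots <= max_slots for slots, _ in stages)
    return within_bounds and time_weighted_average(stages) >= avg_slots


# Two stages: 2 slots for 1 time unit, then 6 slots for 3 time units.
stages = [(2, 1.0), (6, 3.0)]
average = time_weighted_average(stages)  # (2*1 + 6*3) / 4 = 5.0
accepted = meets_profile(stages, min_slots=2, avg_slots=5, max_slots=6)
```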
The correlation of optical measurements with a correct pathology label is often hampered by imprecise registration caused by deformations in histology images. This study explores an automated multi-modal image registration technique that uses deep learning to align snapshot breast specimen images with the corresponding histology images. The input images, acquired through different modalities, present challenges due to variations in intensities and structural visibility, making linear assumptions inappropriate. Both unsupervised and supervised learning approaches, based on the VoxelMorph model, were explored, using a dataset of manually registered images as ground truth. Evaluation metrics, including Dice scores and mutual information, reveal that the unsupervised model significantly outperforms both the supervised model and the manual approach, achieving superior image alignment. This automated registration approach holds promise for improving the validation of optical technologies by minimizing the human errors and inconsistencies associated with manual registration.
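For readers unfamiliar with the Dice score used as an evaluation metric above, the sketch below computes it for two binary masks given as nested lists. It is a generic illustration of the standard definition, 2|A ∩ B| / (|A| + |B|); the function name and the convention of returning 1.0 for two empty masks are assumptions, and this is not the study's evaluation code.

```python
def dice_score(mask_a, mask_b):
    """Dice overlap between two binary masks of the same shape (nested lists).
    Returns a value in [0, 1]; 1.0 means the masks coincide."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    total = sum(flat_a) + sum(flat_b)
    if total == 0:
        # Both masks are empty; treated as perfect agreement (a convention).
        return 1.0
    return 2.0 * intersection / total


fixed = [[1, 1], [0, 0]]
warped = [[1, 0], [0, 0]]
score = dice_score(fixed, warped)  # 2 * 1 / (2 + 1) = 2/3
```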
Networks are one of the most valuable data structures for modeling problems in the real world. However, most recent node embedding strategies have focused on undirected graphs, with limited attention to directed graphs, especially directed heterogeneous graphs. In this study, we first investigated the network properties of directed heterogeneous graphs. Based on this network analysis, we proposed an embedding method for directed heterogeneous graphs, a bidirectional heterogeneous graph neural network with random teleport (BHGNN-RT), which leverages a bidirectional message-passing process and network heterogeneity. By optimizing the teleport proportion, BHGNN-RT helps overcome the over-smoothing problem. Extensive experiments on various datasets were conducted to verify the efficacy and efficiency of BHGNN-RT. Furthermore, we investigated the effects of message components, model layers, and teleport proportion on model performance. The comparison with all other baselines shows that BHGNN-RT achieves state-of-the-art performance, outperforming the benchmark methods in both node classification and unsupervised clustering tasks.