The efficient segmentation of foreground text from the background in degraded color document images is a critical challenge in the preservation of ancient manuscripts. Imperfect preservation over time has led to various types of degradation, such as staining, yellowing, and ink seepage, which significantly affect image binarization results. This work proposes a three-stage method that combines Generative Adversarial Networks (GANs) with the Discrete Wavelet Transform (DWT) to enhance and binarize degraded color document images. Stage-1 applies the DWT and retains the low-low (LL) subband images for image enhancement. In Stage-2, the original input image is split into four single-channel images (red, green, blue, and gray), and each is trained with an independent adversarial network to extract color foreground information. In Stage-3, the output image from Stage-2 and the original input image are used to train independent adversarial networks for document binarization, enabling the integration of global and local features. Experimental results demonstrate that the proposed method outperforms classic and state-of-the-art (SOTA) methods on the Document Image Binarization Contest (DIBCO) datasets. Our implementation code is available at https://github.com/abcpp12383/ThreeStageBinarization.
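As a minimal sketch of the Stage-1 preprocessing idea, the following Python snippet extracts the LL subband of a single-level 2-D DWT using the PyWavelets package; the adversarial networks are omitted, and the function names are illustrative rather than taken from the released repository.

```python
# Sketch of Stage-1 preprocessing: keep the low-low (LL) subband of a 2-D DWT.
# Uses PyWavelets (pywt); the GAN enhancement stage itself is omitted.
import numpy as np
import pywt

def ll_subband(image: np.ndarray, wavelet: str = "haar") -> np.ndarray:
    """Return the LL approximation subband of a single-level 2-D DWT."""
    ll, (lh, hl, hh) = pywt.dwt2(image, wavelet)
    return ll

img = np.random.rand(256, 256)   # stand-in for a grayscale document scan
ll = ll_subband(img)
print(ll.shape)                  # about half the resolution in each dimension
```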
Friction drag from a turbulent fluid moving past or inside an object plays a crucial role in domains as diverse as transportation, public utility infrastructure, energy technology, and human health. As a direct measure of shear-induced friction forces, an accurate prediction of the wall-shear stress can contribute to sustainability, conservation of resources, and carbon neutrality in civil aviation, as well as to enhanced medical treatment of vascular diseases and cancer. Despite this importance for modern society, we still lack adequate experimental methods to capture the instantaneous wall-shear stress dynamics. In this contribution, we present a holistic approach that derives velocity and wall-shear stress fields with high spatial and temporal resolution from flow measurements, using a deep optical flow estimator endowed with physical knowledge. The validity and physical correctness of the derived flow quantities are demonstrated with synthetic and real-world experimental data covering a range of relevant fluid flows.
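For readers unfamiliar with the target quantity, here is a worked example of the wall-shear stress definition $\tau_w = \mu \, \partial u / \partial y |_{y=0}$, evaluated by a one-sided finite difference on a synthetic near-wall profile; this illustrates the quantity itself, not the deep optical flow estimator, and all numbers are illustrative.

```python
# Worked example of the wall-shear stress definition tau_w = mu * du/dy at the
# wall, using a one-sided finite difference on a synthetic velocity profile.
import numpy as np

mu = 1.8e-5                        # dynamic viscosity of air [Pa s], approximate
y = np.linspace(0.0, 1e-3, 101)    # wall-normal coordinate [m]
u = 50.0 * y / y[-1]               # synthetic linear near-wall profile [m/s]

dudy_wall = (u[1] - u[0]) / (y[1] - y[0])   # one-sided difference at y = 0
tau_w = mu * dudy_wall
print(f"wall-shear stress: {tau_w:.3e} Pa")
```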
When a thin liquid film flows down a vertical fiber, one can observe the complex and captivating interfacial dynamics of an unsteady flow. Such flows are relevant to various fluid experiments due to their high surface-area-to-volume ratio. Recent studies have verified that when the flow undergoes regime transitions, the magnitude of the film thickness changes dramatically, making numerical simulations challenging. In this paper, we present a computationally efficient numerical method that maintains the positivity of the film thickness and conserves the volume of the fluid even on coarse meshes. A series of comparisons with laboratory experiments and previously proposed numerical methods supports the validity of our method. We also prove that our method is second-order consistent in space and satisfies an entropy estimate.
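To illustrate the volume-conservation property in isolation, the toy flux-form finite-volume update below preserves the discrete volume up to boundary fluxes, because every interior flux enters two neighbouring cells with opposite signs; the flux law and boundary conditions are placeholders, not the scheme proposed in the paper.

```python
# Toy conservative (flux-form) update: total volume sum(h)*dx is invariant
# because interior face fluxes cancel pairwise; no-flux boundaries are used.
# The flux law q(h) = h^2 / 2 is a placeholder, not the paper's scheme.
import numpy as np

def conservative_step(h, dt, dx):
    q = 0.5 * h**2
    f = np.concatenate(([0.0], 0.5 * (q[:-1] + q[1:]), [0.0]))  # face fluxes
    return h - dt / dx * (f[1:] - f[:-1])

h = 0.1 + np.random.rand(100)
h_new = conservative_step(h, dt=1e-4, dx=0.01)
print(h.sum() * 0.01, h_new.sum() * 0.01)  # volumes agree to round-off
```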
Finding the optimal design of experiments in the Bayesian setting typically requires estimating and optimizing the expected information gain functional. This functional consists of one outer and one inner integral, separated by the logarithm applied to the inner integral. When the mathematical model of the experiment contains both uncertainty about the parameters of interest and nuisance uncertainty (i.e., uncertainty about parameters that affect the model but are not themselves of interest to the experimenter), two inner integrals must be estimated, further increasing the already considerable computational effort required to obtain good approximations of the expected information gain. The Laplace approximation has been applied successfully in the context of experimental design in various ways, and we propose two novel estimators featuring the Laplace approximation to alleviate the computational burden of both inner integrals considerably. The first estimator applies Laplace's method followed by a Laplace approximation, introducing a bias. The second estimator uses two Laplace approximations as importance sampling measures for Monte Carlo approximations of the inner integrals. Both estimators use Monte Carlo approximation for the remaining outer integral. We provide three numerical examples demonstrating the applicability and effectiveness of the proposed estimators.
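For context, the sketch below shows the standard double-loop Monte Carlo estimator of the expected information gain on a toy linear-Gaussian model; this is the costly nested baseline that Laplace-based estimators aim to accelerate, not the paper's proposed method.

```python
# Double-loop Monte Carlo estimate of the expected information gain (EIG):
# EIG ~ mean_n [ log p(y_n | theta_n) - log mean_m p(y_n | theta_m) ].
# Toy model y = theta + noise with a standard-normal prior on theta.
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.1

def log_lik(y, theta):
    return -0.5 * ((y - theta) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

N, M = 2000, 2000
theta_outer = rng.standard_normal(N)            # outer prior draws
y = theta_outer + sigma * rng.standard_normal(N)
theta_inner = rng.standard_normal(M)            # inner prior draws (evidence)
log_ev = np.array([np.log(np.mean(np.exp(log_lik(yn, theta_inner)))) for yn in y])
print("EIG estimate:", np.mean(log_lik(y, theta_outer) - log_ev))
```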
We describe the implementation of the Giudici-Green Metropolis sampling method for decomposable graphs using a variety of structures to represent the graph. These comprise the graph itself, the junction tree, the Almond tree, and the Ibarra clique-separator graph. For each structure, we describe the process for ascertaining whether adding or deleting a specific edge yields a graph that is still decomposable, and the updates that must be made to the structure if the edge perturbation is applied. For the Almond tree and the Ibarra graph, these procedures are novel. We find that using the graph itself is generally at least competitive in terms of computational efficiency for a variety of graph distributions, but note that the other structures may enable samplers that use different perturbations with lower rejection rates and/or better mixing properties. The sampler has applications in estimating graphical models for systems of multivariate Gaussian or multinomial variables.
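A minimal sketch of one edge-perturbation Metropolis step over decomposable graphs follows, using a full chordality re-check via networkx in place of the incremental structure-based tests described above; the target density and proposal are illustrative only.

```python
# One Metropolis step that proposes toggling a uniformly chosen vertex pair
# and rejects any proposal leaving the class of decomposable (chordal) graphs.
# nx.is_chordal stands in for the incremental junction-tree / Almond-tree tests.
import math
import random
import networkx as nx

def metropolis_step(G, log_target):
    u, v = random.sample(list(G.nodes), 2)
    H = G.copy()
    if H.has_edge(u, v):
        H.remove_edge(u, v)
    else:
        H.add_edge(u, v)
    if not nx.is_chordal(H):
        return G                                 # proposal not decomposable
    if math.log(random.random()) < log_target(H) - log_target(G):
        return H
    return G

G = nx.empty_graph(6)                            # empty graph is decomposable
G = metropolis_step(G, log_target=lambda g: -1.0 * g.number_of_edges())
```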
The ability of deep learning to process and extract relevant information from raw EEG data reflecting complex brain dynamics has been demonstrated in various recent works. Deep learning models, however, have also been shown to perform best on large corpora of data. When processing EEG, a natural approach is to combine EEG datasets from different experiments to train large deep learning models. However, most EEG experiments use custom channel montages, requiring the data to be transformed into a common space. Previous methods have extracted features of interest from the raw EEG signal and focused on using a common feature space across EEG datasets. While this is a sensible approach, it underexploits the potential richness of raw EEG data. Here, we explore applying spatial attention to EEG electrode coordinates to harmonize the channels of raw EEG data, allowing us to train deep learning models on EEG data recorded with different montages. We test this model on a gender classification task. We first show that spatial attention increases model performance. We then show that a deep learning model trained on data with different channel montages performs significantly better than deep learning models trained on fixed 23- and 128-channel montages.
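A minimal PyTorch sketch of the spatial-attention idea follows, assuming a fixed set of learned "virtual" electrode positions that attend over the actual montage's 3-D coordinates; the dimensions and architecture details are illustrative, not the paper's exact model.

```python
# Spatial attention for channel harmonization: learned virtual positions
# attend over a montage's electrode coordinates, mapping any (C, T) recording
# to a fixed (V, T) montage-independent representation.
import torch
import torch.nn as nn

class SpatialAttentionHarmonizer(nn.Module):
    def __init__(self, n_virtual=32, coord_dim=3, d=64):
        super().__init__()
        self.query_pos = nn.Parameter(torch.randn(n_virtual, coord_dim))
        self.q_proj = nn.Linear(coord_dim, d)
        self.k_proj = nn.Linear(coord_dim, d)

    def forward(self, x, coords):
        # x: (C, T) raw EEG; coords: (C, 3) electrode positions of this montage
        q = self.q_proj(self.query_pos)               # (V, d)
        k = self.k_proj(coords)                       # (C, d)
        attn = torch.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1)  # (V, C)
        return attn @ x                               # (V, T)

model = SpatialAttentionHarmonizer()
out = model(torch.randn(23, 1000), torch.randn(23, 3))
print(out.shape)  # torch.Size([32, 1000])
```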
We present a versatile open-source pipeline for simulating inhomogeneous reaction-diffusion processes in highly resolved, image-based geometries of porous media with reactive boundaries. Resolving realistic pore-scale geometries in numerical models is challenging and computationally demanding, as the scale differences between the sizes of the interstitia and the whole system can lead to prohibitive memory requirements. The present pipeline combines a level-set method with geometry-adapted sparse block grids on GPUs to efficiently simulate reaction-diffusion processes in image-based geometries. We showcase the method by applying it to fertilizer diffusion in soil, heat transfer in porous ceramics, and determining effective diffusion coefficients and tortuosity. The present approach enables solving reaction-diffusion partial differential equations in real-world geometries applicable to porous media across fields such as engineering, environmental science, and biology.
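As a much-simplified illustration of diffusion restricted to an image-based pore geometry, the NumPy sketch below applies an explicit finite-difference step only in voxels marked as pore space, with solid voxels held at zero concentration as a crude stand-in for an absorbing reactive boundary; the actual pipeline uses level sets and GPU sparse block grids.

```python
# Explicit diffusion step restricted to the pore space of a segmented image;
# solid voxels are held at zero, mimicking a perfectly absorbing boundary.
# dt satisfies the 2-D stability bound dt <= dx^2 / (4 D).
import numpy as np

def masked_diffusion_step(c, mask, D=1.0, dt=0.1, dx=1.0):
    cm = np.where(mask, c, 0.0)
    lap = (np.roll(cm, 1, 0) + np.roll(cm, -1, 0) +
           np.roll(cm, 1, 1) + np.roll(cm, -1, 1) - 4.0 * cm) / dx**2
    return np.where(mask, c + D * dt * lap, 0.0)

rng = np.random.default_rng(0)
mask = rng.random((64, 64)) > 0.4        # stand-in for a segmented pore image
c = np.zeros((64, 64)); c[32, 32] = 1.0  # point source of solute
for _ in range(100):
    c = masked_diffusion_step(c, mask)
```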
Computer-generated holography (CGH) can be used to display three-dimensional (3D) images and has a special feature that no other technology possesses: it can reconstruct arbitrary object wavefronts. In this study, we investigated a high-speed full-color reconstruction method for improving the realism of 3D images produced using CGH. The proposed method uses a digital micromirror device (DMD) with a high-speed switching capability as the hologram display device. It produces 3D video by time-division multiplexing using an optical system incorporating fiber-coupled laser diodes (LDs) operating in red, green, and blue wavelengths. The wavelength dispersion of the DMD is compensated for by superimposing plane waves on the hologram. Fourier transform optics are used to separate the object, conjugate, and zeroth-order light, thus eliminating the need for an extensive 4f system. The resources used in this research, such as the programs used for the hologram generation and the schematics of the LD driver, are available on GitHub.
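A crude sketch of producing a binary amplitude pattern for a DMD from a target image follows, via an FFT and an off-axis plane-wave carrier that separates the diffraction orders; the optical configuration, carrier frequency, and thresholding rule are illustrative and are not taken from the paper or its GitHub resources.

```python
# Crude binary-amplitude hologram: take the object's Fourier spectrum, add an
# off-axis plane-wave carrier to shift the orders apart, and threshold the
# real part, since DMD mirrors are on/off. Parameters are illustrative.
import numpy as np

target = np.zeros((512, 512))
target[200:300, 200:300] = 1.0                      # toy object amplitude
spectrum = np.fft.fftshift(np.fft.fft2(target))     # hologram-plane field
_, xx = np.mgrid[0:512, 0:512]
carrier = np.exp(2j * np.pi * 0.125 * xx)           # off-axis carrier wave
hologram = (np.real(spectrum * carrier) > 0).astype(np.uint8)  # binary pattern
```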
Microring resonators (MRRs) are promising devices for time-delay photonic reservoir computing, but the impact of the different physical effects taking place in MRRs on reservoir computing performance is yet to be fully understood. We numerically analyze the impact of linear losses, as well as of the relaxation times of thermo-optic and free-carrier effects, on the prediction error for the time-series task NARMA-10. We demonstrate the existence of three regions, defined by the input power and the frequency detuning between the optical source and the microring resonance, that reveal the cavity's transition from the linear to the nonlinear regime. One of these regions offers very low time-series prediction error at relatively low input power and node count, while the other regions either lack nonlinearity or become unstable. This study provides insight into the design of MRRs and the optimization of their physical properties for improving the prediction performance of time-delay reservoir computing.
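For reference, the NARMA-10 benchmark series used as the prediction task can be generated as follows; this is the standard definition of the task, independent of the microring simulation.

```python
# Standard NARMA-10 benchmark: y(t+1) = 0.3 y(t)
#   + 0.05 y(t) * sum_{i=0}^{9} y(t-i) + 1.5 u(t-9) u(t) + 0.1,
# with inputs u(t) drawn uniformly from [0, 0.5].
import numpy as np

def narma10(T, seed=0):
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 0.5, T)
    y = np.zeros(T)
    for t in range(9, T - 1):
        y[t + 1] = (0.3 * y[t]
                    + 0.05 * y[t] * y[t - 9:t + 1].sum()
                    + 1.5 * u[t - 9] * u[t]
                    + 0.1)
    return u, y

u, y = narma10(2000)
```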
Spectral independence is a recently developed framework for obtaining sharp bounds on the convergence time of the classical Glauber dynamics. This framework has yielded optimal $O(n \log n)$ sampling algorithms on bounded-degree graphs for a large class of problems throughout the so-called uniqueness regime, including, for example, sampling independent sets, matchings, and Ising-model configurations. Our main contribution is to relax the bounded-degree assumption that has so far been important in establishing and applying spectral independence. Previous methods for avoiding degree bounds rely on using $L^p$-norms to analyse contraction on graphs with bounded connective constant (Sinclair, Srivastava, Yin; FOCS'13). The non-linearity of $L^p$-norms is an obstacle to applying these results to bound spectral independence. Our solution is to capture the $L^p$-analysis recursively by amortising over the subtrees of the recurrence used to analyse contraction. Our method generalises previous analyses that applied only to bounded-degree graphs. As a main application of our techniques, we consider the random graph $G(n,d/n)$, where the previously known algorithms run in time $n^{O(\log d)}$ or apply only to large $d$. We refine these algorithmic bounds significantly and develop fast $n^{1+o(1)}$ algorithms based on Glauber dynamics that apply to all $d$ throughout the uniqueness regime.
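For concreteness, here is the single-site Glauber dynamics for the hardcore model (weighted independent sets with fugacity $\lambda$), the kind of chain whose mixing time the spectral-independence framework bounds; the implementation is a plain illustration, not the refined algorithm developed above.

```python
# Glauber dynamics for the hardcore model: pick a uniform vertex and resample
# its occupancy conditional on its neighbours; if any neighbour is occupied
# the vertex must be empty, otherwise it is occupied with prob. lam/(1+lam).
import random
import networkx as nx

def glauber_hardcore(G, lam=0.5, steps=100_000, seed=0):
    random.seed(seed)
    occ = {v: False for v in G.nodes}
    nodes = list(G.nodes)
    for _ in range(steps):
        v = random.choice(nodes)
        if any(occ[u] for u in G.neighbors(v)):
            occ[v] = False
        else:
            occ[v] = random.random() < lam / (1.0 + lam)
    return {v for v, o in occ.items() if o}

# Sample an independent set of G(n, d/n) with n = 100, d = 3.
ind_set = glauber_hardcore(nx.gnp_random_graph(100, 3 / 100, seed=1))
```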
Hashing has been widely used in approximate nearest neighbor search for large-scale database retrieval due to its computation and storage efficiency. Deep hashing, which devises convolutional neural network architectures to exploit and extract the semantic information or features of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated, and I identify three main directions for deep supervised hashing methods, with several comments made at the end. Moreover, to break through the bottleneck of existing hashing methods, I propose a Shadow Recurrent Hashing (SRH) method as an attempt. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function that encourages similar images to be projected close to one another. To this end, I propose a concept: the shadow of the CNN output. During the optimization process, the CNN output and its shadow guide each other so as to approach the optimal solution as closely as possible. Several experiments on the CIFAR-10 dataset show the satisfactory performance of SRH.
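As an illustration of the general family of pairwise losses used in deep supervised hashing, the PyTorch snippet below pushes inner products of tanh-relaxed codes toward +1 for same-label pairs and -1 for different-label pairs; it sketches the common setup only, not the proposed SRH shadow mechanism.

```python
# Pairwise loss for deep supervised hashing: tanh-relaxed codes whose
# normalized inner products should match the +1 / -1 label similarity.
import torch

def pairwise_hash_loss(features, labels):
    codes = torch.tanh(features)                  # relaxed binary codes in (-1, 1)
    sim = (labels[:, None] == labels[None, :]).float() * 2.0 - 1.0
    inner = codes @ codes.T / codes.shape[1]      # normalized inner products
    return torch.mean((inner - sim) ** 2)

loss = pairwise_hash_loss(torch.randn(8, 48), torch.randint(0, 10, (8,)))
print(loss.item())
```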