Deep-learning image coding solutions now achieve compression efficiency similar to or better than conventional solutions based on hand-crafted transforms and spatial prediction techniques. These deep-learning codecs require a large training set of images and a training methodology to obtain a suitable model (set of parameters) for efficient compression. Training is performed with an optimization algorithm that minimizes a loss function. The loss function therefore plays a key role in the overall performance and includes a differentiable quality metric that attempts to mimic human perception. The main objective of this paper is to study, through a crowdsourcing subjective image quality assessment study, the perceptual impact of several image quality metrics that can be used in the loss function of the training process. From this study, it is possible to conclude that the choice of the quality metric is critical for the perceptual performance of the deep-learning codec and that its impact can vary with the image content.
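As a concrete illustration, the sketch below shows how such a quality metric typically enters the training loss of a learned codec. This is a minimal, hypothetical PyTorch example: the function names and the rate-distortion weighting are assumptions rather than the paper's implementation, and the metric is deliberately pluggable, since the point under study is that swapping it changes perceptual results.

```python
# Minimal sketch of a rate-distortion training loss for a learned image codec.
# The quality metric is a pluggable callable; all names are illustrative.
import torch

def mse_metric(x, x_hat):
    # Plain MSE; a perceptual metric such as MS-SSIM could be substituted,
    # provided it is differentiable.
    return torch.mean((x - x_hat) ** 2)

def rate_distortion_loss(x, x_hat, bits_per_pixel, lam=0.01, metric=mse_metric):
    # L = R + lambda * D: trade off the estimated rate against the chosen
    # distortion metric. Changing `metric` changes what the codec learns
    # to preserve perceptually.
    distortion = metric(x, x_hat)
    return bits_per_pixel.mean() + lam * distortion
```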
Pseudo-Hamiltonian neural networks (PHNN) were recently introduced for learning dynamical systems that can be modelled by ordinary differential equations. In this paper, we extend the method to partial differential equations. The resulting model comprises up to three neural networks, modelling terms that represent conservation, dissipation and external forces, and discrete convolution operators that can either be learned or given as input. We demonstrate numerically the superior performance of PHNN compared to a baseline model that models the full dynamics with a single neural network. Moreover, since the PHNN model consists of three parts with different physical interpretations, these can be studied separately to gain insight into the system, and the learned model remains applicable if the external forces are removed or changed.
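A minimal sketch of this three-part structure on a 1D grid, assuming a PyTorch implementation; the architecture sizes, the fixed convolution width, and the way the three terms are combined are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class PseudoHamiltonianPDE(nn.Module):
    # Illustrative sketch: three sub-networks model conservative, dissipative
    # and external-force terms; a 1D convolution stands in for the discrete
    # spatial operator, which here is learned alongside the networks.
    def __init__(self, hidden=32):
        super().__init__()
        self.conservative = nn.Sequential(nn.Linear(1, hidden), nn.Tanh(), nn.Linear(hidden, 1))
        self.dissipative = nn.Sequential(nn.Linear(1, hidden), nn.Tanh(), nn.Linear(hidden, 1))
        self.force = nn.Sequential(nn.Linear(2, hidden), nn.Tanh(), nn.Linear(hidden, 1))
        self.spatial_op = nn.Conv1d(1, 1, kernel_size=3, padding=1, bias=False)

    def forward(self, u, x, t):
        # u: (batch, n_points) state; x: (n_points,) grid; t: scalar time.
        u_flat = u.reshape(-1, 1)
        cons = self.conservative(u_flat).reshape(u.shape)
        diss = self.dissipative(u_flat).reshape(u.shape)
        xt = torch.stack([x.expand_as(u).reshape(-1),
                          torch.full_like(u.reshape(-1), float(t))], dim=-1)
        f = self.force(xt).reshape(u.shape)
        # du/dt = S(conservative term) + dissipative term + external force,
        # with S the convolutional spatial operator.
        return self.spatial_op(cons.unsqueeze(1)).squeeze(1) + diss + f
```

Because the force network is a separate component, it can be zeroed out or replaced after training, mirroring the paper's claim that the learned model stays applicable when external forces change.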
After introducing a bit-plane quantum representation for a multi-image, we present a novel way to encrypt/decrypt multiple images using a quantum computer. Our encryption scheme is based on a two-stage scrambling, first of the images and their bit planes and then of the pixel positions, each stage using quantum baker maps. The resulting quantum multi-image is then diffused with controlled CNOT gates using a sine chaotification of a two-dimensional Hénon map as well as Chebyshev polynomials. The decryption is performed by applying all the inverse quantum gates in the reverse order.
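To make the diffusion stage concrete, here is a classical sketch of a keystream in the same spirit: a sine-chaotified 2D Hénon map drives a Chebyshev polynomial, and the resulting bytes are XORed with pixel bytes (XOR being the classical analogue of controlled-NOT diffusion). The map parameters and the coupling are illustrative assumptions, not the paper's exact construction.

```python
import math

def sine_henon_chebyshev_stream(n, x0=0.1, y0=0.2, a=1.4, b=0.3, k=5):
    # Sine chaotification wraps the classical Henon update
    # (x' = 1 - a*x^2 + y, y' = b*x) in sin(pi * .); a Chebyshev
    # polynomial T_k then maps the state to a keystream byte.
    x, y = x0, y0
    stream = []
    for _ in range(n):
        x, y = math.sin(math.pi * (1 - a * x * x + y)), math.sin(math.pi * b * x)
        c = math.cos(k * math.acos(max(-1.0, min(1.0, x))))  # Chebyshev T_k(x)
        stream.append(int(abs(c) * 255) & 0xFF)
    return stream

def diffuse(pixels, stream):
    # XOR diffusion of 8-bit pixel values; applying it twice with the same
    # stream recovers the original, mirroring inverse-gate decryption.
    return [p ^ s for p, s in zip(pixels, stream)]
```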
Proper X-ray radiation design (via dynamic fluence field modulation, FFM) makes it possible to reduce the effective radiation dose in computed tomography without compromising image quality. It takes into account patient anatomy, the radiation sensitivity of different organs and tissues, and the location of regions of interest. We account for all these factors within a general convex optimization framework.
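The following toy sketch illustrates what such a convex formulation can look like, written with CVXPY: per-view fluence weights are chosen to minimize a weighted dose subject to per-region noise budgets. The random matrices and the 1/fluence noise model are stand-in assumptions; the paper's actual formulation may differ.

```python
import cvxpy as cp
import numpy as np

# Toy convex FFM problem: D maps fluence to organ dose, V maps inverse
# fluence to regional noise variance. Both are stand-ins for quantities
# derived from patient anatomy and organ sensitivity.
n_views, n_regions = 60, 4
rng = np.random.default_rng(0)
D = rng.uniform(0.5, 1.5, (n_regions, n_views))  # dose sensitivity per view
V = rng.uniform(0.5, 1.5, (n_regions, n_views))  # noise-variance weights

f = cp.Variable(n_views, nonneg=True)            # fluence per view
noise = V @ cp.inv_pos(f)                        # variance ~ 1/fluence (convex)
objective = cp.Minimize(cp.sum(D @ f))           # total weighted dose
constraints = [noise <= 1.0, f <= 10.0]          # image-quality and tube limits
cp.Problem(objective, constraints).solve()
```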
Approximate Bayesian computation (ABC) methods are standard tools for inferring parameters of complex models when the likelihood function is analytically intractable. A popular approach to improving the poor acceptance rate of the basic rejection-sampling ABC algorithm is to use sequential Monte Carlo (ABC SMC) to produce a sequence of proposal distributions adapting towards the posterior, instead of generating values from the prior distribution of the model parameters. The proposal distribution for each subsequent iteration is typically obtained from a weighted set of samples, often called particles, of the current iteration of this sequence. Current methods for constructing these proposal distributions treat all the particles equivalently, regardless of the corresponding value generated by the sampler, which may lead to inefficiency when propagating information across iterations of the algorithm. To improve sampler efficiency, we introduce a modified approach called stratified distance ABC SMC. Our algorithm stratifies particles based on the distance between their corresponding synthetic data and the observed data, and then constructs a distinct proposal distribution for each stratum. Taking into account the distribution of distances across the particle space leads to a substantially improved acceptance rate in the rejection sampling. We further show that efficiency can be gained by introducing a novel stopping rule for the sequential process based on the stratified posterior samples, and we demonstrate these advances through several examples.
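A minimal sketch of the stratification step described above (illustrative, not the authors' code): particles are split into strata by distance quantiles, and one weighted Gaussian proposal is fitted per stratum.

```python
import numpy as np

def stratified_proposals(particles, distances, weights, n_strata=3):
    # particles: (n, d) accepted parameter values; distances: (n,) distances
    # between each particle's synthetic data and the observed data.
    edges = np.quantile(distances, np.linspace(0, 1, n_strata + 1))
    proposals = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        idx = (distances >= lo) & (distances <= hi)
        if idx.sum() < 2:  # skip degenerate strata (e.g. tied distances)
            continue
        w = weights[idx] / weights[idx].sum()
        mean = np.average(particles[idx], axis=0, weights=w)
        cov = np.cov(particles[idx].T, aweights=w) + 1e-8 * np.eye(particles.shape[1])
        proposals.append((mean, np.atleast_2d(cov)))
    return proposals  # one (mean, covariance) Gaussian proposal per stratum
```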
Multiscale stochastic dynamical systems have been widely applied to scientific and engineering problems due to their capability of depicting complex phenomena in many real-world applications. This work is devoted to investigating the effective dynamics of slow-fast stochastic dynamical systems. Given short-term observation data generated by some unknown slow-fast stochastic system, we propose a novel algorithm, built around a neural network called Auto-SDE, to learn the invariant slow manifold. Our approach captures the evolving dynamics through a series of time-dependent autoencoder neural networks, with a loss constructed from a discretized stochastic differential equation. Numerical experiments under various evaluation metrics validate that our algorithm is accurate, stable and effective.
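The sketch below shows one plausible form of such a loss, assuming a PyTorch autoencoder whose latent (slow) variables are required to follow a learned SDE via an Euler-Maruyama discretization. The architecture and the drift-only residual are assumptions for illustration, not the paper's exact construction.

```python
import torch
import torch.nn as nn

class AutoSDE(nn.Module):
    # Illustrative sketch: encoder/decoder pair plus a learned latent drift.
    def __init__(self, dim, latent_dim, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, latent_dim))
        self.dec = nn.Sequential(nn.Linear(latent_dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))
        self.drift = nn.Sequential(nn.Linear(latent_dim, hidden), nn.Tanh(), nn.Linear(hidden, latent_dim))

    def loss(self, x_t, x_next, dt):
        z_t, z_next = self.enc(x_t), self.enc(x_next)
        recon = ((self.dec(z_t) - x_t) ** 2).mean()
        # One Euler-Maruyama step of the learned latent SDE (drift shown;
        # a learned diffusion term would enter the residual analogously).
        sde_residual = ((z_next - z_t - self.drift(z_t) * dt) ** 2).mean()
        return recon + sde_residual
```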
Detecting differences in gene expression is an important part of single-cell RNA sequencing experiments, and many statistical methods have been developed for this aim. Most differential expression analyses focus on comparing expression between two groups (e.g., treatment vs. control). But there is increasing interest in multi-condition differential expression analyses in which expression is measured in many conditions, and the aim is to accurately detect and estimate expression differences in all conditions. We show that directly modeling single-cell RNA-seq counts in all conditions simultaneously, while also inferring how expression differences are shared across conditions, leads to greatly improved performance for detecting and estimating expression differences compared to existing methods. We illustrate the potential of this new approach by analyzing data from a single-cell experiment studying the effects of cytokine stimulation on gene expression. We call our new method "Poisson multivariate adaptive shrinkage", and it is implemented in an R package available online at https://github.com/stephenslab/poisson.mash.alpha.
We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.
Deep learning is usually described as an experiment-driven field under continual criticism for lacking theoretical foundations. This problem has been partially addressed by a large volume of literature that has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized into six groups: (1) complexity- and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamical systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamical systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns about ethics and security and their relationship with generalizability.
Hashing has been widely used in approximate nearest neighbor search for large-scale database retrieval due to its computational and storage efficiency. Deep hashing, which devises convolutional neural network architectures to extract the semantic features of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated, and I identify three main directions for deep supervised hashing methods. Several comments are made at the end. Moreover, to break through the bottleneck of the existing hashing methods, I propose a Shadow Recurrent Hashing (SRH) method as an attempt. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function that encourages similar images to be projected close together. To this end, I propose a concept: the shadow of the CNN output. During the optimization process, the CNN output and its shadow guide each other so as to approach the optimal solution as closely as possible. Several experiments on the CIFAR-10 dataset show the satisfactory performance of SRH.
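For context, the sketch below shows a common formulation of a deep supervised hashing loss (a generic pairwise objective, not the SRH loss itself): codes of same-label images are pulled together and codes of different-label images are pushed at least a margin apart, with tanh relaxing the binary constraint.

```python
import torch

def pairwise_hashing_loss(codes, labels, margin=2.0):
    # codes: (N, bits) raw network outputs; labels: (N,) class labels.
    codes = torch.tanh(codes)                   # relax binary codes to (-1, 1)
    dist = torch.cdist(codes, codes, p=2) ** 2  # squared pairwise distances
    same = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
    # Contrastive form: attract similar pairs, repel dissimilar pairs
    # up to the margin.
    loss = same * dist + (1 - same) * torch.clamp(margin - dist, min=0)
    return loss.mean()
```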
Graph representation learning for hypergraphs can be used to extract patterns among the higher-order interactions that are critically important in many real-world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic across learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention-based graph neural network called Hyper-SAGNN that is applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving strong performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.
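A simplified sketch of the core idea, assuming PyTorch: self-attention over the node embeddings of a candidate hyperedge yields "dynamic" per-node embeddings, which are compared against position-wise "static" embeddings to score whether the tuple forms a true hyperedge. The dimensions and the final aggregation are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HyperedgeScorer(nn.Module):
    # Simplified Hyper-SAGNN-style scorer; variable hyperedge sizes are
    # handled naturally because attention operates over the tuple dimension.
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.static = nn.Linear(dim, dim)  # position-wise static transform
        self.out = nn.Linear(dim, 1)

    def forward(self, node_emb):
        # node_emb: (batch, edge_size, dim), edge_size may vary per batch.
        dynamic, _ = self.attn(node_emb, node_emb, node_emb)
        diff = (dynamic - self.static(node_emb)) ** 2
        scores = torch.sigmoid(self.out(diff)).squeeze(-1)  # per-node scores
        return scores.mean(dim=1)  # probability the tuple is a hyperedge
```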