Despite the limited availability and quantum volume of quantum computers, quantum image representation is a widely researched area. Currently developed methods use quantum entanglement to encode information about pixel positions. To store color information, these methods use the angle parameter of a rotation gate (e.g., the Flexible Representation of Quantum Images, FRQI), sequences of qubits (e.g., the Novel Enhanced Quantum Representation, NEQR), or the angle parameter of phase shift gates (e.g., Local Phase Image Quantum Encoding, LPIQE). All these methods are significantly affected by decoherence and other forms of quantum noise, which are an inseparable part of quantum computing in the noisy intermediate-scale quantum era. These phenomena can strongly influence the measurements and result in extracted images that are visually dissimilar to the originals. Because this process is at its foundation quantum, the computational reversal of this process is possible. There are many methods for error correction, mitigation, and reduction, but all of them use quantum computer time or additional qubits to achieve the desired result. We report the successful use of a generative adversarial network trained for image-to-image translation, in conjunction with the Phase Distortion Unraveling error reduction method, for reducing overall error in images encoded using LPIQE.
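As an illustration of the angle-based encoding mentioned above, the following minimal sketch builds FRQI-style amplitudes for a tiny grayscale image on a classical machine. It assumes the commonly stated FRQI form, in which the pixel at position i contributes cos(theta_i) and sin(theta_i) to the color qubit; the function names and the linear intensity-to-angle mapping are illustrative assumptions, and neither hardware nor noise is modeled.

```python
import math

def frqi_amplitudes(pixels):
    """Return FRQI-style amplitudes for a flat list of intensities in [0, 1].

    Assumed mapping (illustrative): theta = intensity * pi / 2.
    The result holds one (amp_color0, amp_color1) pair per position
    basis state, scaled so that the full state has unit norm.
    """
    scale = 1.0 / math.sqrt(len(pixels))  # equal superposition over positions
    amps = []
    for value in pixels:
        theta = value * math.pi / 2
        amps.append((scale * math.cos(theta), scale * math.sin(theta)))
    return amps

def decode_intensities(amps):
    """Recover intensities from ideal (noise-free) amplitudes."""
    return [math.atan2(a1, a0) / (math.pi / 2) for a0, a1 in amps]

image = [0.0, 0.25, 0.5, 1.0]  # a 2x2 image, flattened
state = frqi_amplitudes(image)
norm_sq = sum(a0 * a0 + a1 * a1 for a0, a1 in state)
recovered = decode_intensities(state)
```

On a real device the amplitudes are not directly accessible; intensities would have to be estimated from measurement statistics, which is where the noise discussed in the abstract enters.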
Despite numerous years of research into the merits and trade-offs of various model selection criteria, obtaining robust results that elucidate the behavior of cross-validation remains a challenging endeavor. In this paper, we highlight the inherent limitations of cross-validation when employed to discern the structure of a Gaussian graphical model. We provide finite-sample bounds on the probability that the Lasso estimator for the neighborhood of a node within a Gaussian graphical model, optimized using a prediction oracle, misidentifies the neighborhood. Our results pertain to both undirected and directed acyclic graphs, encompassing general, sparse covariance structures. To support our theoretical findings, we conduct an empirical investigation of this inconsistency by contrasting our outcomes with other commonly used information criteria through an extensive simulation study. Given that many algorithms designed to learn the structure of graphical models require hyperparameter selection, the precise calibration of this hyperparameter is paramount for accurately estimating the inherent structure. Consequently, our observations shed light on this widely recognized practical challenge.
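To make the role of the penalty in Lasso neighborhood selection concrete, here is a minimal sketch using plain coordinate descent on a hand-built toy problem. The data, the penalty values, and the function names are hypothetical and chosen only for illustration; the sketch shows how the selected neighborhood changes with the penalty and does not reproduce the paper's bounds or its cross-validation procedure.

```python
def soft_threshold(value, penalty):
    """Soft-thresholding operator used by the Lasso."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def lasso_coordinate_descent(columns, target, penalty, sweeps=100):
    """Minimize (1/2n)||y - Xb||^2 + penalty * ||b||_1 by coordinate descent.

    `columns` is a list of predictor columns, `target` is the response.
    """
    n = len(target)
    coef = [0.0] * len(columns)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Partial residual: remove every predictor's fit except j's.
            partial = [
                target[i] - sum(coef[k] * columns[k][i]
                                for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            rho = sum(col[i] * partial[i] for i in range(n)) / n
            scale = sum(v * v for v in col) / n
            coef[j] = soft_threshold(rho, penalty) / scale
    return coef

def neighborhood(coef):
    """Indices of predictors with a nonzero coefficient."""
    return {j for j, b in enumerate(coef) if b != 0.0}

# Toy data (hypothetical): node 0 is a true neighbor; node 1 has only a
# weak, noise-like association with the target node.
x0 = [1.0, -1.0, 1.0, -1.0]
x1 = [1.0, 1.0, -1.0, -1.0]
y = [2.0 * a + 0.1 * b for a, b in zip(x0, x1)]

small_penalty = neighborhood(lasso_coordinate_descent([x0, x1], y, 0.05))
large_penalty = neighborhood(lasso_coordinate_descent([x0, x1], y, 0.5))
```

In this toy case the smaller penalty keeps the weakly associated node in the estimated neighborhood, while the larger penalty drops it, which is one way a penalty tuned for prediction can differ from one tuned for structure recovery.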
Large machine learning models are revolutionary technologies of artificial intelligence whose bottlenecks include the huge computational expense, power, and time required in both pre-training and fine-tuning. In this work, we show that fault-tolerant quantum computing could possibly provide provably efficient resolutions for generic (stochastic) gradient descent algorithms, scaling as O(T^2 polylog(n)), where n is the size of the models and T is the number of iterations in the training, as long as the models are both sufficiently dissipative and sparse, with small learning rates. Based on earlier efficient quantum algorithms for dissipative differential equations, we find and prove that similar algorithms work for (stochastic) gradient descent, the primary algorithm for machine learning. In practice, we benchmark instances of large machine learning models from 7 million to 103 million parameters. We find that, in the context of sparse training, a quantum enhancement is possible at the early stage of learning after model pruning, motivating a sparse parameter download and re-upload scheme. Our work shows solidly that fault-tolerant quantum algorithms could potentially contribute to most state-of-the-art, large-scale machine-learning problems.
In recent literature, for modeling reasons, fractional differential problems equipped with anti-symmetric boundary conditions have been considered. Twenty years ago, the anti-reflective boundary conditions were introduced in the context of signal processing and imaging to increase the quality of the reconstruction of a blurred signal/image contaminated by noise and to reduce the overall complexity to that of a few fast sine transforms, i.e., to $O(N\log N)$ real arithmetic operations, where $N$ is the number of pixels. Here we consider the anti-symmetric boundary conditions and we introduce the anti-reflective boundary conditions in the context of nonlocal problems of fractional differential type. In the latter context, we study both types of boundary conditions, which are in fact similar in their essentials, from the perspective of computational efficiency, by considering nontruncated and truncated versions. Several numerical tests, tables, and visualizations are provided and critically discussed.
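For readers unfamiliar with anti-reflective boundary conditions, the following minimal sketch shows the extension rule as it is usually stated in the signal processing setting: the signal is reflected about the boundary value as well as the boundary position. The function name and the 1D setting are illustrative assumptions; the fractional differential formulation studied in the abstract is not reproduced here.

```python
def antireflective_extend(signal, pad):
    """Extend a 1D signal by `pad` samples on each side.

    Assumed rule (illustrative): reflect about the boundary *value*, i.e.
    f[-j] = 2*f[0] - f[j] on the left and
    f[n-1+j] = 2*f[n-1] - f[n-1-j] on the right.
    """
    n = len(signal)
    if pad >= n:
        raise ValueError("pad must be smaller than the signal length")
    left = [2 * signal[0] - signal[j] for j in range(pad, 0, -1)]
    right = [2 * signal[-1] - signal[n - 1 - j] for j in range(1, pad + 1)]
    return left + list(signal) + right

# A linear ramp is continued as the same ramp, with no jump at the boundary.
extended = antireflective_extend([1.0, 2.0, 3.0, 4.0], pad=2)
```

The ramp example illustrates the design motivation: the extension preserves continuity of both the signal and its slope at the boundary, which is why it tends to reduce boundary artifacts compared with zero or periodic padding.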
This study investigates the potential of automated deep learning to enhance the accuracy and efficiency of multi-class classification of bird vocalizations, compared against traditional manually designed deep learning models. Using the Western Mediterranean Wetland Birds dataset, we investigated the use of AutoKeras, an automated machine learning framework, to automate neural architecture search and hyperparameter tuning. Comparative analysis validates our hypothesis that the AutoKeras-derived model consistently outperforms traditional models like MobileNet, ResNet50 and VGG16. Our approach and findings underscore the transformative potential of automated deep learning for advancing bioacoustics research and models. In fact, the automated techniques eliminate the need for manual feature engineering and model design while improving performance. This study illuminates best practices in sampling, evaluation and reporting to enhance reproducibility in this nascent field. All the code used is available at https://github.com/giuliotosato/AutoKeras-bioacustic . Keywords: AutoKeras; automated deep learning; audio classification; Wetlands Bird dataset; comparative analysis; bioacoustics; validation dataset; multi-class classification; spectrograms.
We consider the problem of estimating unknown parameters in stochastic differential equations driven by colored noise, given continuous-time observations. Colored noise is modelled as a sequence of mean zero Gaussian stationary processes with an exponential autocorrelation function, with decreasing correlation time. Our goal is to infer parameters in the limit equation, driven by white noise, given observations of the colored noise dynamics. As in the case of parameter estimation for multiscale diffusions, the observations are only compatible with the data in the white noise limit, and classic estimators become biased, implying the need to preprocess the data. We consider both the maximum likelihood and the continuous-time stochastic gradient descent estimators, and we propose modified versions of these methods, in which the observations are filtered using an exponential filter. Stochastic differential equations with both additive and multiplicative noise are considered. We provide a convergence analysis for our novel estimators in the limit of infinite data, and in the white noise limit, showing that the estimators are asymptotically unbiased. We consider in detail the case of multiplicative colored noise, in particular when the L\'evy area correction drift appears in the limiting white noise equation. A series of numerical experiments corroborates our theoretical results.
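The filtering step can be sketched as follows. The abstract does not give the filter's formula, so this sketch assumes a first-order exponential filter with kernel (1/delta) * exp(-(t - s)/delta), written as the ODE z' = (x - z)/delta and discretized with explicit Euler; the names `exponential_filter`, `dt`, and `delta` are hypothetical, and the estimators built on top of the filtered data are not shown.

```python
def exponential_filter(samples, dt, delta):
    """Causal exponential filter, z' = (x - z) / delta with z(0) = 0,
    discretized with the explicit Euler method (stable for dt < 2 * delta)."""
    z = 0.0
    filtered = []
    for x in samples:
        filtered.append(z)
        z += (dt / delta) * (x - z)
    return filtered

dt, delta = 0.01, 0.1

# A constant input passes through unchanged once the transient has decayed.
constant_out = exponential_filter([1.0] * 1000, dt, delta)

# A rapidly alternating input (a crude stand-in for fast fluctuations)
# is strongly damped.
fast_in = [1.0 if k % 2 == 0 else -1.0 for k in range(1000)]
fast_out = exponential_filter(fast_in, dt, delta)
fast_amplitude = max(abs(v) for v in fast_out[-100:])
```

The two inputs show the intended effect of such preprocessing: slow components of the observed path are kept, while components on time scales much shorter than `delta` are suppressed.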
An a posteriori error estimator based on an equilibrated flux reconstruction is proposed for defeaturing problems in the context of finite element discretizations. Defeaturing consists in the simplification of a geometry by removing features that are considered not relevant for the approximation of the solution of a given PDE. In this work, the focus is on the Poisson equation with Neumann boundary conditions on the feature boundary. The estimator accounts both for the so-called defeaturing error and for the numerical error committed by approximating the solution on the defeatured domain. Unlike other estimators that were previously proposed for defeaturing problems, the use of the equilibrated flux reconstruction makes it possible to obtain a sharp bound for the numerical component of the error. Furthermore, it does not require the evaluation of the normal trace of the numerical flux on the feature boundary: this makes the estimator well-suited for finite element discretizations, in which the normal trace of the numerical flux is typically discontinuous across elements. The reliability of the estimator is proven and verified on several numerical examples. Its capability to identify the most relevant features is also shown, in anticipation of a future application to an adaptive strategy.
We consider the problem of synchronizing a multi-agent system (MAS) composed of several identical linear systems connected through a directed graph. To design a suitable controller, we construct conditions based on Bilinear Matrix Inequalities (BMIs) that ensure state synchronization. Since these conditions are non-convex, we propose an iterative algorithm based on a suitable relaxation that allows us to formulate Linear Matrix Inequality (LMI) conditions. As a result, the algorithm yields a common static state-feedback matrix for the controller that satisfies general linear performance constraints. Our results are achieved under the mild assumption that the graph is time-invariant and connected.
Adaptive first-order optimizers are fundamental tools in deep learning, although they may suffer from poor generalization due to nonuniform gradient scaling. In this work, we propose AdamL, a novel variant of the Adam optimizer that takes into account the loss function information to attain better generalization results. We provide sufficient conditions that, together with the Polyak-Lojasiewicz inequality, ensure the linear convergence of AdamL. As a byproduct of our analysis, we prove similar convergence properties for the EAdam and AdaBelief optimizers. Experimental results on benchmark functions show that AdamL typically achieves either the fastest convergence or the lowest objective function values when compared to Adam, EAdam, and AdaBelief. This superior performance is confirmed on deep learning tasks such as training convolutional neural networks, training generative adversarial networks using vanilla convolutional neural networks, and training long short-term memory networks. Finally, in the case of vanilla convolutional neural networks, AdamL stands out from the other Adam variants and does not require manual adjustment of the learning rate during the later stage of training.
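For context, the baseline that AdamL modifies can be sketched in a few lines. The code below is the standard Adam update with bias correction applied to a one-dimensional quadratic; it is not AdamL, whose loss-dependent modification is not specified in the abstract. The function name, the test objective, and the hyperparameter values are illustrative assumptions.

```python
import math

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    """Minimize a scalar function with the standard Adam update."""
    x = x0
    m = 0.0  # running mean of gradients (first moment)
    v = 0.0  # running mean of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero init
        v_hat = v / (1 - beta2 ** t)
        # Per-coordinate scaling by sqrt(v_hat) is the "nonuniform
        # gradient scaling" that adaptive methods introduce.
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Objective f(x) = (x - 3)^2, gradient 2 * (x - 3), minimizer x = 3.
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

Variants such as EAdam and AdaBelief change how the second-moment term is formed; those changes are left out here to keep the baseline unambiguous.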
The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.
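The implicit regularization mentioned above can be seen in the smallest possible overparametrized problem: one data point and two parameters. The sketch below runs gradient descent on the squared loss from a zero initialization and compares the result with the minimum-norm interpolating solution; the data values, step size, and function name are illustrative assumptions.

```python
def gradient_descent_interpolate(x, y, lr=0.1, steps=200):
    """Fit w . x = y by gradient descent on 0.5 * (w . x - y)^2,
    starting from w = 0."""
    w = [0.0] * len(x)
    for _ in range(steps):
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - lr * residual * xi for wi, xi in zip(w, x)]
    return w

x = [1.0, 2.0]  # one data point with two features
y = 5.0

w_gd = gradient_descent_interpolate(x, y)

# Closed-form minimum-norm interpolant: w = y * x / ||x||^2.
norm_sq = sum(xi * xi for xi in x)
w_min_norm = [y * xi / norm_sq for xi in x]
```

Infinitely many weight vectors fit this point exactly (for example, w = [5, 0]), yet gradient descent from zero never leaves the span of the data and therefore lands on the interpolant of smallest Euclidean norm.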
When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods and distributed methods, and theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, the lottery ticket hypothesis and infinite-width analysis.
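Gradient explosion and vanishing can be illustrated with a deliberately simplified model: a chain of scalar linear layers that share one weight, so that the backpropagated gradient is just a product of identical factors. The function name and the chosen weights and depth are illustrative assumptions; real networks involve matrices and nonlinearities, where the spectrum of the layer Jacobians plays the role of the scalar weight.

```python
def backprop_scale(weight, depth):
    """Gradient of the output with respect to the input for a chain of
    `depth` scalar linear layers that all share the same weight."""
    grad = 1.0
    for _ in range(depth):
        grad *= weight  # chain rule: each layer multiplies by its weight
    return grad

vanishing = backprop_scale(0.5, 50)  # shrinks geometrically with depth
exploding = backprop_scale(1.5, 50)  # grows geometrically with depth
balanced = backprop_scale(1.0, 50)   # stays at exactly 1
```

The balanced case hints at why careful initialization and normalization help: they aim to keep the per-layer scaling close to one so that the product neither collapses nor blows up.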