The standard approach to tackling computer vision problems is to train deep convolutional neural network (CNN) models using large-scale image datasets which are representative of the target task. However, in many scenarios, it is often challenging to obtain sufficient image data for the target task. Data augmentation is a way to mitigate this challenge. A common practice is to explicitly transform existing images in desired ways so as to create the required volume and variability of training data necessary to achieve good generalization performance. In situations where data for the target domain is not accessible, a viable workaround is to synthesize training data from scratch--i.e., synthetic data augmentation. This paper presents an extensive review of synthetic data augmentation techniques. It covers data synthesis approaches based on realistic 3D graphics modeling, neural style transfer (NST), differential neural rendering, and generative artificial intelligence (AI) techniques such as generative adversarial networks (GANs) and variational autoencoders (VAEs). For each of these classes of methods, we focus on the important data generation and augmentation techniques, general scope of application and specific use-cases, as well as existing limitations and possible workarounds. Additionally, we provide a summary of common synthetic datasets for training computer vision models, highlighting the main features, application domains and supported tasks. Finally, we discuss the effectiveness of synthetic data augmentation methods. Since this is the first paper to explore synthetic data augmentation methods in great detail, we are hoping to equip readers with the necessary background information and in-depth knowledge of existing methods and their attendant issues.
Physics-informed neural networks (PINN) is a extremely powerful paradigm used to solve equations encountered in scientific computing applications. An important part of the procedure is the minimization of the equation residual which includes, when the equation is time-dependent, a time sampling. It was argued in the literature that the sampling need not be uniform but should overweight initial time instants, but no rigorous explanation was provided for these choice. In this paper we take some prototypical examples and, under standard hypothesis concerning the neural network convergence, we show that the optimal time sampling follows a truncated exponential distribution. In particular we explain when the time sampling is best to be uniform and when it should not be. The findings are illustrated with numerical examples on linear equation, Burgers' equation and the Lorenz system.
Domain decomposition provides an effective way to tackle the dilemma of physics-informed neural networks (PINN) which struggle to accurately and efficiently solve partial differential equations (PDEs) in the whole domain, but the lack of efficient tools for dealing with the interfaces between two adjacent sub-domains heavily hinders the training effects, even leads to the discontinuity of the learned solutions. In this paper, we propose a symmetry group based domain decomposition strategy to enhance the PINN for solving the forward and inverse problems of the PDEs possessing a Lie symmetry group. Specifically, for the forward problem, we first deploy the symmetry group to generate the dividing-lines having known solution information which can be adjusted flexibly and are used to divide the whole training domain into a finite number of non-overlapping sub-domains, then utilize the PINN and the symmetry-enhanced PINN methods to learn the solutions in each sub-domain and finally stitch them to the overall solution of PDEs. For the inverse problem, we first utilize the symmetry group acting on the data of the initial and boundary conditions to generate labeled data in the interior domain of PDEs and then find the undetermined parameters as well as the solution by only training the neural networks in a sub-domain. Consequently, the proposed method can predict high-accuracy solutions of PDEs which are failed by the vanilla PINN in the whole domain and the extended physics-informed neural network in the same sub-domains. Numerical results of the Korteweg-de Vries equation with a translation symmetry and the nonlinear viscous fluid equation with a scaling symmetry show that the accuracies of the learned solutions are improved largely.
Convolutional neural networks (CNNs) trained with cross-entropy loss have proven to be extremely successful in classifying images. In recent years, much work has been done to also improve the theoretical understanding of neural networks. Nevertheless, it seems limited when these networks are trained with cross-entropy loss, mainly because of the unboundedness of the target function. In this paper, we aim to fill this gap by analyzing the rate of the excess risk of a CNN classifier trained by cross-entropy loss. Under suitable assumptions on the smoothness and structure of the a posteriori probability, it is shown that these classifiers achieve a rate of convergence which is independent of the dimension of the image. These rates are in line with the practical observations about CNNs.
Advanced Krylov subspace methods are investigated for the solution of large sparse linear systems arising from stiff adjoint-based aerodynamic shape optimization problems. A special attention is paid to the flexible inner-outer GMRES strategy combined with most relevant preconditioning and deflation techniques. The choice of this specific class of Krylov solvers for challenging problems is based on its outstanding convergence properties. Typically in our implementation the efficiency of the preconditioner is enhanced with a domain decomposition method with overlapping. However, maintaining the performance of the preconditioner may be challenging since scalability and efficiency of a preconditioning technique are properties often antagonistic to each other. In this paper we demonstrate how flexible inner-outer Krylov methods are able to overcome this critical issue. A numerical study is performed considering either a Finite Volume (FV), or a high-order Discontinuous Galerkin (DG) discretization which affect the arithmetic intensity and memory-bandwith of the algebraic operations. We consider test cases of transonic turbulent flows with RANS modelling over the two-dimensional supercritical OAT15A airfoil and the three-dimensional ONERA M6 wing. Benefits in terms of robustness and convergence compared to standard GMRES solvers are obtained. Strong scalability analysis shows satisfactory results. Based on these representative problems a discussion of the recommended numerical practices is proposed.
Two new omnibus tests of uniformity for data on the hypersphere are proposed. The new test statistics exploit closed-form expressions for orthogonal polynomials, feature tuning parameters, and are related to a ``smooth maximum'' function and the Poisson kernel. We obtain exact moments of the test statistics under uniformity and rotationally symmetric alternatives, and give their null asymptotic distributions. We consider approximate oracle tuning parameters that maximize the power of the tests against known generic alternatives and provide tests that estimate oracle parameters through cross-validated procedures while maintaining the significance level. Numerical experiments explore the effectiveness of null asymptotic distributions and the accuracy of inexpensive approximations of exact null distributions. A simulation study compares the powers of the new tests with other tests of the Sobolev class, showing the benefits of the former. The proposed tests are applied to the study of the (seemingly uniform) nursing times of wild polar bears.
We study a query model of computation in which an n-vertex k-hypergraph can be accessed only via its independence oracle or via its colourful independence oracle, and each oracle query may incur a cost depending on the size of the query. In each of these models, we obtain oracle algorithms to approximately count the hypergraph's edges, and we unconditionally prove that no oracle algorithm for this problem can have significantly smaller worst-case oracle cost than our algorithms.
The influence of natural image transformations on receptive field responses is crucial for modelling visual operations in computer vision and biological vision. In this regard, covariance properties with respect to geometric image transformations in the earliest layers of the visual hierarchy are essential for expressing robust image operations, and for formulating invariant visual operations at higher levels. This paper defines and proves a set of joint covariance properties under compositions of spatial scaling transformations, spatial affine transformations, Galilean transformations and temporal scaling transformations, which make it possible to characterize how different types of image transformations interact with each other and the associated spatio-temporal receptive field responses. In this regard, we also extend the notion of scale-normalized derivatives to affine-normalized derivatives, to be able to obtain true affine-covariant properties of spatial derivatives, that are computed based on spatial smoothing with affine Gaussian kernels. The derived relations show how the parameters of the receptive fields need to be transformed, in order to match the output from spatio-temporal receptive fields under composed spatio-temporal image transformations. As a side effect, the presented proof for the joint covariance property over the integrated combination of the different geometric image transformations also provides specific proofs for the individual transformation properties, which have not previously been fully reported in the literature. The paper also presents an in-depth theoretical analysis of geometric interpretations of the derived covariance properties, as well as outlines a number of biological interpretations of these results.
Twin nodes in a static network capture the idea of being substitutes for each other for maintaining paths of the same length anywhere in the network. In dynamic networks, we model twin nodes over a time-bounded interval, noted $(\Delta,d)$-twins, as follows. A periodic undirected time-varying graph $\mathcal G=(G_t)_{t\in\mathbb N}$ of period $p$ is an infinite sequence of static graphs where $G_t=G_{t+p}$ for every $t\in\mathbb N$. For $\Delta$ and $d$ two integers, two distinct nodes $u$ and $v$ in $\mathcal G$ are $(\Delta,d)$-twins if, starting at some instant, the outside neighbourhoods of $u$ and $v$ has non-empty intersection and differ by at most $d$ elements for $\Delta$ consecutive instants. In particular when $d=0$, $u$ and $v$ can act during the $\Delta$ instants as substitutes for each other in order to maintain journeys of the same length in time-varying graph $\mathcal G$. We propose a distributed deterministic algorithm enabling each node to enumerate its $(\Delta,d)$-twins in $2p$ rounds, using messages of size $O(\delta_\mathcal G\log n)$, where $n$ is the total number of nodes and $\delta_\mathcal G$ is the maximum degree of the graphs $G_t$'s. Moreover, using randomized techniques borrowed from distributed hash function sampling, we reduce the message size down to $O(\log n)$ w.h.p.
With the advent of massive data sets much of the computational science and engineering community has moved toward data-intensive approaches in regression and classification. However, these present significant challenges due to increasing size, complexity and dimensionality of the problems. In particular, covariance matrices in many cases are numerically unstable and linear algebra shows that often such matrices cannot be inverted accurately on a finite precision computer. A common ad hoc approach to stabilizing a matrix is application of a so-called nugget. However, this can change the model and introduce error to the original solution. It is well known from numerical analysis that ill-conditioned matrices cannot be accurately inverted. In this paper we develop a multilevel computational method that scales well with the number of observations and dimensions. A multilevel basis is constructed adapted to a kD-tree partitioning of the observations. Numerically unstable covariance matrices with large condition numbers can be transformed into well conditioned multilevel ones without compromising accuracy. Moreover, it is shown that the multilevel prediction exactly solves the Best Linear Unbiased Predictor (BLUP) and Generalized Least Squares (GLS) model, but is numerically stable. The multilevel method is tested on numerically unstable problems of up to 25 dimensions. Numerical results show speedups of up to 42,050 times for solving the BLUP problem, but with the same accuracy as the traditional iterative approach. For very ill-conditioned cases the speedup is infinite. In addition, decay estimates of the multilevel covariance matrices are derived based on high dimensional interpolation techniques from the field of numerical analysis. This work lies at the intersection of statistics, uncertainty quantification, high performance computing and computational applied mathematics.
Most state-of-the-art machine learning techniques revolve around the optimisation of loss functions. Defining appropriate loss functions is therefore critical to successfully solving problems in this field. We present a survey of the most commonly used loss functions for a wide range of different applications, divided into classification, regression, ranking, sample generation and energy based modelling. Overall, we introduce 33 different loss functions and we organise them into an intuitive taxonomy. Each loss function is given a theoretical backing and we describe where it is best used. This survey aims to provide a reference of the most essential loss functions for both beginner and advanced machine learning practitioners.