There exists a wide range of single-number metrics for assessing the performance of classification algorithms, including AUC and the F1-score (Wikipedia lists 17 such metrics, with 27 different names). In this article, I propose a new metric to answer the following question: when an algorithm is tuned so that it can no longer distinguish labelled cats from real cats, how often does a randomly chosen image that has been labelled as containing a cat actually contain a cat? The metric is constructed in two steps. First, we set a threshold score such that when the algorithm is shown two randomly chosen images -- one with a score above the threshold (i.e. a picture labelled as containing a cat) and one drawn from the pictures that really do contain a cat -- the probability that the higher-scoring image is the one drawn from the set of real cat images is 50\%. At this decision threshold, the set of positively labelled images is indistinguishable from the set of images that are positive. Second, we measure performance by asking how often a randomly chosen picture from those labelled as containing a cat actually contains a cat. This metric can be thought of as {\it precision at the indistinguishability threshold}. While this new metric does not address the tradeoff between precision and recall inherent to all such metrics, I show why it avoids pitfalls that can occur when using, for example, AUC, and why it is better motivated than, for example, the F1-score.
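As a rough numerical illustration of the two steps above (my own sketch; all function and variable names are illustrative, not from the paper): scan candidate thresholds until the pairwise-comparison probability -- an AUC between the flagged scores and the truly positive scores -- is closest to 50\%, then report the precision of the flagged set.

```python
import numpy as np

def indistinguishability_precision(scores, labels):
    """Precision at the indistinguishability threshold (illustrative sketch).

    scores: model scores for each image; labels: 1 if the image truly
    contains a cat, 0 otherwise.
    """
    pos_scores = scores[labels == 1]
    best_t, best_gap = None, np.inf
    for t in np.unique(scores):
        flagged = scores[scores > t]  # images labelled "cat" at this threshold
        if flagged.size == 0:
            continue
        # Probability that a random true-cat score beats a random flagged
        # score (a pairwise AUC; ties count as half).
        greater = (pos_scores[:, None] > flagged[None, :]).mean()
        ties = (pos_scores[:, None] == flagged[None, :]).mean()
        gap = abs(greater + 0.5 * ties - 0.5)
        if gap < best_gap:
            best_gap, best_t = gap, t
    flagged_mask = scores > best_t
    return best_t, labels[flagged_mask].mean()
```

For a well-separated toy score distribution, the threshold lands just below the true-positive cluster and the reported precision is the fraction of flagged images that are real cats.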
Adversarial generative models, such as Generative Adversarial Networks (GANs), are widely applied for generating various types of data, e.g., images, text, and audio. Their promising performance has led to GAN-based adversarial attack methods in both white-box and black-box attack scenarios. The importance of transferable black-box attacks lies in their ability to be effective across different models and settings, which aligns more closely with real-world applications. However, it remains challenging for such methods to produce highly transferable adversarial examples. Meanwhile, we observe that some enhanced gradient-based transferable adversarial attack algorithms require prolonged time for adversarial sample generation. Thus, in this work, we propose a novel algorithm named GE-AdvGAN to enhance the transferability of adversarial samples whilst improving the algorithm's efficiency. The main approach is to optimise the training process of the generator parameters. Based on a functional and characteristic similarity analysis, we introduce a novel gradient editing (GE) mechanism and verify its feasibility for generating transferable samples on various models. Moreover, by exploring frequency-domain information to determine the gradient editing direction, GE-AdvGAN can generate highly transferable adversarial samples while minimizing the execution time in comparison to state-of-the-art transferable adversarial attack algorithms. The performance of GE-AdvGAN is comprehensively evaluated in large-scale experiments on different datasets, and the results demonstrate the superiority of our algorithm. The code for our algorithm is available at: //github.com/LMBTough/GE-advGAN
This paper focuses on the construction of non-intrusive Scientific Machine Learning (SciML) Reduced-Order Models (ROMs) for nonlinear, chaotic plasma turbulence simulations. In particular, we propose using Operator Inference (OpInf) to build low-cost physics-based ROMs from data for such simulations. As a representative example, we focus on the Hasegawa-Wakatani (HW) equations used for modeling two-dimensional electrostatic drift-wave plasma turbulence. To comprehensively assess the potential of OpInf to construct accurate ROMs for this model, we consider a setup for the HW equations that leads to the formation of complex, nonlinear, and self-driven dynamics, and perform two sets of experiments. We first use the data obtained via a direct numerical simulation of the HW equations starting from a specific initial condition and train OpInf ROMs for predictions beyond the training time horizon. In the second, more challenging set of experiments, we train ROMs using the same dataset as before but this time perform predictions for six other initial conditions. Our results show that the OpInf ROMs capture the important features of the turbulent dynamics and generalize to new and unseen initial conditions while reducing the evaluation time of the high-fidelity model by up to five orders of magnitude in single-core performance. In the broader context of fusion research, this shows that non-intrusive SciML ROMs have the potential to drastically accelerate numerical studies, which can ultimately enable tasks such as the design and real-time control of optimized fusion devices.
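At its core, Operator Inference is a data-driven least-squares fit of ROM operators. As a minimal, hypothetical sketch (linear operator only; the actual ROMs for the HW equations also carry nonlinear terms), one can infer $A$ in $\dot{q} \approx A q$ from reduced snapshots and their time derivatives:

```python
import numpy as np

def infer_linear_operator(Q, Qdot):
    """Fit dq/dt ~ A q by least squares from reduced snapshot data.

    Q, Qdot: (r, k) arrays of reduced states and their time derivatives.
    """
    A_T, *_ = np.linalg.lstsq(Q.T, Qdot.T, rcond=None)
    return A_T.T
```

With exact derivative data generated by a known operator, the fit recovers that operator; in practice, `Q` would come from POD-projected simulation snapshots and `Qdot` from finite differences.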
A numerical algorithm has been developed for regularizing the solution of the source problem for the diffusion-logistic model, based on integral-type information about the process at fixed moments in time. A distinctive feature of the problem under study is its discrete formulation in space, which makes classical numerical algorithms inapplicable. The regularization of the problem is based on A. N. Tikhonov's approach and a priori information about the source of the process. The problem is cast in variational form and solved by a global tensor optimization method. It is shown that, in the case of noisy data, regularization improves the accuracy of the reconstructed source.
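For a linear forward operator, Tikhonov regularization with a prior reduces to an augmented normal-equations solve. A minimal sketch of this classical building block (illustrative only; the paper's discrete diffusion-logistic setting is nonlinear and solved variationally):

```python
import numpy as np

def tikhonov_solve(A, d, alpha, q_prior):
    """Minimize ||A q - d||^2 + alpha * ||q - q_prior||^2.

    Normal equations: (A^T A + alpha I) q = A^T d + alpha q_prior.
    """
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n),
                           A.T @ d + alpha * q_prior)
```

The penalty term pulls the solution toward the prior `q_prior`, which is how a priori information about the source enters the reconstruction.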
High order schemes are known to be unstable in the presence of shock discontinuities or under-resolved solution features for nonlinear conservation laws. Entropy stable schemes address this instability by ensuring that physically relevant solutions satisfy a semi-discrete entropy inequality independently of discretization parameters. This work extends high order entropy stable schemes to the quasi-1D shallow water equations and the quasi-1D compressible Euler equations, which model one-dimensional flows through channels or nozzles with varying width. We introduce new non-symmetric entropy conservative finite volume fluxes for both sets of quasi-1D equations, as well as a generalization of the entropy conservation condition to non-symmetric fluxes. When combined with an entropy stable interface flux, the resulting schemes are high order accurate, conservative, and semi-discretely entropy stable. For the quasi-1D shallow water equations, the resulting schemes are also well-balanced.
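For context, the classical entropy conservation condition of Tadmor, which the non-symmetric condition above generalizes, requires a symmetric two-point flux $f_{EC}(u_L, u_R)$ to satisfy (standard notation, not taken from this abstract):

```latex
\[
  (v_R - v_L)^{T} f_{EC}(u_L, u_R) = \psi(u_R) - \psi(u_L),
\]
```

where $v(u)$ denotes the entropy variables and $\psi$ the entropy potential. The quasi-1D systems break the symmetry $f_{EC}(u_L, u_R) = f_{EC}(u_R, u_L)$ because of the spatially varying channel width, which is what motivates extending this condition to non-symmetric fluxes.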
The summary receiver operating characteristic (SROC) curve has been recommended as an important meta-analytical summary of the accuracy of a diagnostic test in the presence of heterogeneous cutoff values. However, selective publication of diagnostic studies for meta-analysis can induce publication bias (PB) in the estimate of the SROC curve. Several sensitivity analysis methods have been developed to quantify PB on the SROC curve, all of which utilize parametric selection functions to model the selective publication mechanism. The main contribution of this article is a new sensitivity analysis approach that derives worst-case bounds for the SROC curve by adopting nonparametric selection functions under minimal assumptions. The estimation procedure uses the Monte Carlo method to obtain the SROC curves, along with the corresponding areas under the curve, in the worst case, where the maximum possible PB over a range of marginal selection probabilities is considered. We apply the proposed method to a real-world meta-analysis to show that the worst-case bounds of the SROC curves can provide useful insights for discussing the robustness of meta-analytical findings on diagnostic test accuracy.
Regularization promotes well-posedness in solving an inverse problem with incomplete measurement data. The regularization term is typically designed based on an a priori characterization of the unknown signal, such as sparsity or smoothness. Standard inhomogeneous regularization incorporates a spatially varying exponent $p$ in the standard $\ell_p$ norm-based regularization to recover a signal whose characteristics vary spatially. This study proposes a weighted inhomogeneous regularization that extends the standard approach through a new exponent design and spatially varying weights. The new exponent design avoids misclassification when regions with different characteristics lie close to each other. The weights handle a second issue, which arises when the region of one characteristic is too small to be recovered effectively by the $\ell_p$ norm-based regularization even after being identified correctly. A suite of numerical tests shows the efficacy of the proposed weighted inhomogeneous regularization, including synthetic image experiments and the recovery of real sea ice from incomplete wave measurements.
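A toy version of such a spatially varying, weighted penalty (my own illustrative sketch, not the paper's algorithm): minimize a least-squares misfit plus $\sum_i w_i |x_i|^{p_i}$ by plain gradient descent, with exponents $p_i \in (1, 2]$ so that the penalty gradient stays finite at zero.

```python
import numpy as np

def weighted_inhomogeneous_reg(A, y, p, w, lam=0.1, lr=0.01, n_iter=2000):
    """Minimize ||A x - y||^2 + lam * sum_i w_i |x_i|^{p_i}.

    p, w: per-entry exponents and weights (spatially varying penalty).
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad_data = 2 * A.T @ (A @ x - y)
        # gradient of lam * w_i |x_i|^{p_i}, finite at 0 for p_i > 1
        grad_reg = lam * w * p * np.sign(x) * np.abs(x) ** (p - 1)
        x -= lr * (grad_data + grad_reg)
    return x
```

Choosing smaller $p_i$ (near 1) in regions believed to be sparse or edge-like, and larger weights $w_i$ in small regions, mimics the qualitative role the exponents and weights play in the proposed method.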
Using the stochastic particle method, the truncated Euler-Maruyama (TEM) method is proposed for numerically solving McKean-Vlasov stochastic differential equations (MV-SDEs), possibly with both drift and diffusion coefficients having super-linear growth in the state variable. First, the propagation of chaos in the $L^q$ ($q \geq 2$) sense is obtained under general assumptions. Then, the standard $1/2$-order strong convergence rate in the $L^q$ sense of the proposed method for the corresponding particle system is derived using a stopping time analysis technique. Furthermore, long-time dynamical properties of MV-SDEs, including moment boundedness, stability, and the existence and uniqueness of the invariant probability measure, can be numerically realized by the TEM method. Additionally, it is proven that the numerical invariant measure converges to the underlying invariant measure of the MV-SDE in the $L^2$-Wasserstein metric. Finally, the conclusions obtained in this paper are verified through examples and numerical simulations.
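A schematic of the particle approach (a hedged toy sketch, not the paper's exact scheme, whose truncation radius depends on the step size): the measure argument of the coefficients is replaced by empirical statistics of $N$ interacting particles, and states are truncated before the super-linear coefficients are evaluated.

```python
import numpy as np

def truncated_em_mvsde(b, sigma, x0, T, dt, n_particles, trunc, rng):
    """Truncated Euler-Maruyama for a scalar MV-SDE via a particle system.

    b, sigma: coefficients taking (state, empirical mean); the mean stands
    in for the measure argument. States are clipped to [-trunc, trunc]
    before coefficient evaluation (the truncation step).
    """
    n_steps = int(T / dt)
    x = np.full(n_particles, x0, dtype=float)
    for _ in range(n_steps):
        xt = np.clip(x, -trunc, trunc)   # truncation of the state
        m = xt.mean()                    # empirical-measure proxy
        dw = rng.normal(0.0, np.sqrt(dt), n_particles)
        x = x + b(xt, m) * dt + sigma(xt, m) * dw
    return x
```

For example, with a super-linear, mean-field drift `b = lambda x, m: -x**3 + (m - x)` and constant diffusion, the particle cloud remains stable even though the untruncated explicit Euler scheme can blow up for such drifts.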
We propose a deep learning based method for simulating the large bending deformation of bilayer plates. Inspired by the greedy algorithm, we introduce a pre-training method on a series of nested domains, which accelerates the convergence of training and finds the absolute minimizer more effectively. The proposed method is able to converge to an absolute minimizer, overcoming the tendency of gradient flow methods to get trapped in local minimizer basins. Numerical experiments show better performance with fewer degrees of freedom, in terms of the relative energy errors and relative $L^2$-errors of the minimizer. Furthermore, our method successfully maintains the $L^2$-norm of the isometric constraint, leading to improved accuracy.
We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.
When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of an undesirable spectrum, and then discuss practical solutions, including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods, and distributed methods, along with theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, the lottery ticket hypothesis, and infinite-width analysis.
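As a concrete anchor for the optimizer discussion, the classical SGD-with-momentum update in its standard heavy-ball form (a generic textbook sketch, not tied to any specific result surveyed here):

```python
def sgd_momentum_step(w, grad, v, lr=0.01, beta=0.9):
    """One heavy-ball momentum step: v <- beta*v - lr*grad; w <- w + v.

    w: parameter, grad: gradient at w, v: velocity carried between steps.
    """
    v = beta * v - lr * grad
    return w + v, v
```

On a simple quadratic loss $f(w) = \tfrac{1}{2} w^2$ (so the gradient is $w$), repeatedly applying this update drives $w$ toward the minimizer at zero, with the velocity term damping oscillations relative to plain gradient descent at the same learning rate.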