Neural networks have recently shown promise for likelihood-free inference, providing orders-of-magnitude speed-ups over classical methods. However, current implementations are suboptimal when estimating parameters from independent replicates. In this paper, we use a decision-theoretic framework to argue that permutation-invariant neural networks are ideally placed for constructing Bayes estimators for arbitrary models, provided that simulation from these models is straightforward. We show that the resulting neural Bayes estimators can quickly and optimally estimate parameters even in weakly-identified and highly-parameterised models, and that they are highly competitive with, and much faster than, traditional likelihood-based estimators. We apply our estimator to a spatial analysis of sea-surface temperature in the Red Sea where, after training, we obtain parameter estimates, together with bootstrap-based uncertainty quantification, from hundreds of spatial fields in a fraction of a second.
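As a concrete illustration of the idea, the sketch below shows a DeepSets-style permutation-invariant network that maps a set of independent replicates to parameter estimates. It is a minimal sketch in PyTorch: the layer sizes, mean pooling, squared-error training loss and simulated data are illustrative assumptions, not the authors' exact architecture or simulation design.

import torch
import torch.nn as nn

class NeuralBayesEstimator(nn.Module):
    def __init__(self, data_dim, n_params, hidden=64):
        super().__init__()
        # psi: applied independently to each replicate
        self.psi = nn.Sequential(
            nn.Linear(data_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # phi: maps the pooled summary to parameter estimates
        self.phi = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_params),
        )

    def forward(self, replicates):
        # replicates: (batch, m, data_dim); mean-pooling over the m
        # independent replicates makes the estimator permutation-invariant.
        pooled = self.psi(replicates).mean(dim=1)
        return self.phi(pooled)

# Training against simulations: draw parameters from the prior, simulate
# replicates, and minimise a loss (here squared error) so that the network
# approximates the Bayes estimator under that loss.
est = NeuralBayesEstimator(data_dim=100, n_params=2)
theta = torch.rand(32, 2)                              # prior draws (illustrative)
data = torch.randn(32, 30, 100) * theta[:, :1, None]   # 30 simulated replicates each
loss = nn.functional.mse_loss(est(data), theta)
loss.backward()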
Novel class discovery (NCD) aims to learn a model that transfers common knowledge from a class-disjoint labelled dataset to another unlabelled dataset and discovers new classes (clusters) within it. Many methods, along with elaborate training pipelines and carefully designed objectives, have been proposed and have considerably boosted performance on NCD tasks. Despite all this, we find that existing methods do not sufficiently exploit the essence of the NCD setting. To this end, in this paper, we propose to model both inter-class and intra-class constraints in NCD based on the symmetric Kullback-Leibler divergence (sKLD). Specifically, we propose an inter-class sKLD constraint to effectively exploit the disjoint relationship between labelled and unlabelled classes, enforcing separability between different classes in the embedding space. In addition, we present an intra-class sKLD constraint to explicitly constrain the relationship between a sample and its augmentations, which also stabilises the training process. We conduct extensive experiments on the popular CIFAR10, CIFAR100 and ImageNet benchmarks and demonstrate that our method establishes a new state of the art, achieving significant performance improvements over previous state-of-the-art methods, e.g., clustering accuracy improvements of 3.5%/3.7% on the CIFAR100-50 dataset split under the task-aware/task-agnostic evaluation protocols. Code is available at //github.com/FanZhichen/NCD-IIC.
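A minimal sketch of the symmetric KL divergence term is given below, assuming PyTorch. How the inter-class and intra-class terms are weighted and combined in the full objective is not reproduced here, and the negative-weight inter-class term is only one illustrative way of encouraging separation.

import torch
import torch.nn.functional as F

def symmetric_kld(logits_p, logits_q, eps=1e-8):
    # Symmetric KL divergence between two predicted class distributions.
    p = F.softmax(logits_p, dim=1).clamp_min(eps)
    q = F.softmax(logits_q, dim=1).clamp_min(eps)
    kl_pq = (p * (p / q).log()).sum(dim=1)
    kl_qp = (q * (q / p).log()).sum(dim=1)
    return (kl_pq + kl_qp).mean()

# Intra-class use: encourage a sample and its augmentation to agree.
logits_a = torch.randn(16, 100)          # predictions for original views
logits_b = torch.randn(16, 100)          # predictions for augmented views
intra_loss = symmetric_kld(logits_a, logits_b)

# Inter-class use (illustrative): push predictions on labelled and unlabelled
# samples apart by maximising their sKLD, i.e. a negative weight in the total loss.
inter_term = -symmetric_kld(torch.randn(16, 100), torch.randn(16, 100))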
The Euler characteristic transform (ECT) is a signature from topological data analysis (TDA) which summarises shapes embedded in Euclidean space. Compared with other TDA methods, the ECT is fast to compute and it is a sufficient statistic for a broad class of shapes. However, small perturbations of a shape can lead to large distortions in its ECT. In this paper, we propose a new metric on compact one-dimensional shapes and prove that the ECT is stable with respect to this metric. Crucially, our result uses curvature, rather than the size of a triangulation of an underlying shape, to control stability. We further construct a computationally tractable statistical estimator of the ECT based on the theory of Gaussian processes. We use our stability result to prove that our estimator is consistent on shapes perturbed by independent ambient noise; i.e., the estimator converges to the true ECT as the sample size increases.
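For intuition, the following sketch computes a discretised ECT of a graph (a one-dimensional complex) embedded in the plane: each vertex enters the sublevel-set filtration at its height in a given direction, each edge at the larger of its two endpoint heights, and the Euler characteristic is the vertex count minus the edge count. The direction and threshold grids are illustrative and unrelated to the paper's estimator.

import numpy as np

def ect(vertices, edges, directions, thresholds):
    """vertices: (n, 2) array; edges: list of (i, j) vertex-index pairs."""
    out = np.zeros((len(directions), len(thresholds)), dtype=int)
    for a, v in enumerate(directions):
        vert_heights = vertices @ v                    # filtration value of each vertex
        edge_heights = np.array([max(vert_heights[i], vert_heights[j])
                                 for i, j in edges])   # edge enters once both endpoints have
        for b, t in enumerate(thresholds):
            n_vert = int((vert_heights <= t).sum())
            n_edge = int((edge_heights <= t).sum())
            out[a, b] = n_vert - n_edge                # Euler characteristic of the sublevel set
    return out

# Example: a triangle-shaped loop scanned over 8 directions.
V = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
E = [(0, 1), (1, 2), (2, 0)]
angles = np.linspace(0, 2 * np.pi, 8, endpoint=False)
dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
print(ect(V, E, dirs, thresholds=np.linspace(-1.5, 1.5, 7)))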
Numerical simulations with rigid particles, drops or vesicles are examples that involve 3D objects with spherical topology. When the numerical method is based on boundary integral equations, the error incurred by using a regular quadrature rule to approximate the layer potentials that appear in the formulation increases rapidly as the evaluation point approaches the surface and the integrand becomes sharply peaked. To determine when the accuracy becomes insufficient, and a more costly special quadrature method should be used, error estimates are needed. In this paper we present quadrature error estimates for layer potentials evaluated near surfaces of genus 0, parametrized using a polar and an azimuthal angle, and discretized by a combination of the Gauss-Legendre and trapezoidal quadrature rules. The error estimates involve no unknown coefficients, only complex-valued roots of a specified distance function. Evaluating the error estimates in general requires a one-dimensional local root-finding procedure, but for specific geometries we obtain analytical results. Based on these explicit solutions, we derive simplified error estimates for layer potentials evaluated near spheres; these simple formulas depend only on the distance from the surface, the radius of the sphere and the number of discretization points. The usefulness of these error estimates is illustrated with numerical examples.
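The sketch below only reproduces the phenomenon that these estimates quantify: it evaluates the Laplace single-layer potential of a unit density on the unit sphere with the regular Gauss-Legendre (polar) times trapezoidal (azimuthal) rule and compares against the exact exterior value 1/r, showing the error grow as the target approaches the surface. The resolution and kernel are illustrative choices; the paper's error estimates themselves are not implemented here.

import numpy as np
from numpy.polynomial.legendre import leggauss

def single_layer_on_unit_sphere(target, n_theta=16, n_phi=32):
    u, w = leggauss(n_theta)                      # Gauss-Legendre nodes/weights in cos(theta)
    phi = 2 * np.pi * np.arange(n_phi) / n_phi    # trapezoidal rule in the azimuthal angle
    U, PHI = np.meshgrid(u, phi, indexing="ij")
    W = np.tile(w[:, None], (1, n_phi)) * (2 * np.pi / n_phi)
    sin_t = np.sqrt(1 - U**2)
    pts = np.stack([sin_t * np.cos(PHI), sin_t * np.sin(PHI), U], axis=-1)
    r = np.linalg.norm(pts - target, axis=-1)
    return np.sum(W / (4 * np.pi * r))            # regular-quadrature value of the potential

for d in [1.0, 0.5, 0.2, 0.1, 0.05]:              # distance of the target from the surface
    x = np.array([0.0, 0.0, 1.0 + d])
    approx = single_layer_on_unit_sphere(x)
    exact = 1.0 / (1.0 + d)                       # exact exterior potential of a unit density
    print(f"distance {d:5.2f}: abs error {abs(approx - exact):.2e}")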
Most recent 6D object pose estimation methods first use object detection to obtain 2D bounding boxes before actually regressing the pose. However, the general object detection methods they use are ill-suited to handle cluttered scenes, thus producing poor initializations for the subsequent pose network. To address this, we propose a rigidity-aware detection method that exploits the fact that, in 6D pose estimation, the target objects are rigid. This lets us introduce an approach to sampling positive object regions from the entire visible object area during training, instead of naively drawing samples from the bounding box center, where the object might be occluded. As such, every visible object part can contribute to the final bounding box prediction, yielding better detection robustness. Key to the success of our approach is a visibility map, which we propose to build using a minimum barrier distance between every pixel in the bounding box and the box boundary. Our results on seven challenging 6D pose estimation datasets show that our method outperforms general detection frameworks by a large margin. Furthermore, combined with a pose regression network, we obtain state-of-the-art pose estimation results on the challenging BOP benchmark.
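As an illustration of the visibility-map ingredient, the sketch below computes a Dijkstra-style approximation of the minimum barrier distance from every pixel of a cropped box to the box boundary, where the barrier cost of a path is the range (maximum minus minimum) of intensities along it. The exact construction used in the paper may differ, and the greedy propagation is only approximate because the barrier cost is not additive.

import heapq
import numpy as np

def minimum_barrier_distance(img):
    h, w = img.shape
    dist = np.full((h, w), np.inf)
    hi, lo = img.copy(), img.copy()           # running max/min along the best path
    heap = []
    # Seed with all boundary pixels of the (cropped) bounding box.
    for y in range(h):
        for x in range(w):
            if y in (0, h - 1) or x in (0, w - 1):
                dist[y, x] = 0.0
                heapq.heappush(heap, (0.0, y, x))
    while heap:
        d, y, x = heapq.heappop(heap)
        if d > dist[y, x]:
            continue
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                new_hi = max(hi[y, x], img[ny, nx])
                new_lo = min(lo[y, x], img[ny, nx])
                nd = new_hi - new_lo           # barrier cost of the extended path
                if nd < dist[ny, nx]:
                    dist[ny, nx], hi[ny, nx], lo[ny, nx] = nd, new_hi, new_lo
                    heapq.heappush(heap, (nd, ny, nx))
    return dist                                # larger values suggest pixels well inside the object

crop = np.random.rand(32, 32)
visibility = minimum_barrier_distance(crop)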
The success of a football team depends on the individual skills and performances of the selected players as well as how cohesively they perform. This work proposes a two-stage process for selecting the optimal playing eleven of a football team from its pool of available players. In the first stage, a LASSO-induced modified trinomial logistic regression model is derived for the reference team to analyze the probabilities of the three possible match outcomes (win, draw, loss). The model accounts for the strengths of the players in the team as well as those of the opponent, home advantage, and the effects of individual players and player combinations beyond their recorded performances. Careful use of the LASSO technique enables the player selection exercise while keeping the number of variables at a reasonable level. In the second stage, a GRASP-type meta-heuristic is implemented to select the team that maximizes the probability of a win. The work is illustrated with English Premier League data from 2008/09 to 2015/16. The application demonstrates that the first-stage model furnishes valuable insights about the deciding factors for different teams, while the optimization step can be used effectively to determine the best possible starting lineup under various circumstances. Based on the adopted model and methodology, we propose a measure of efficiency in team selection by the team management and analyze the performance of EPL teams on this front.
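A minimal sketch of the first stage is given below: an L1-penalised (LASSO-type) multinomial logistic regression over win/draw/loss outcomes using scikit-learn. The random features, the penalty strength C, and the scoring of a candidate eleven by its predicted win probability are illustrative stand-ins for the paper's carefully constructed covariates and GRASP search.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 60))          # player/opponent/home-advantage features (illustrative)
y = rng.integers(0, 3, size=400)        # 0 = loss, 1 = draw, 2 = win

model = LogisticRegression(
    penalty="l1", solver="saga", C=0.5,  # C controls the LASSO shrinkage
    max_iter=5000,
)
model.fit(X, y)

# Stage two would search over candidate elevens, scoring each candidate by the
# predicted probability of a win from this fitted model.
p_win = model.predict_proba(X[:1])[0, 2]
print(f"predicted win probability: {p_win:.3f}")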
Various real-world scientific applications involve the mathematical modeling of complex uncertain systems with numerous unknown parameters. Accurate parameter estimation is often practically infeasible in such systems, as the available training data may be insufficient and the cost of acquiring additional data may be high. In such cases, within a Bayesian paradigm, we can design robust operators that retain the best overall performance across all possible models, and design optimal experiments that effectively reduce uncertainty so as to maximally enhance the performance of such operators. While objective-based uncertainty quantification (objective-UQ) based on MOCU (mean objective cost of uncertainty) provides an effective means of quantifying uncertainty in complex systems, the high computational cost of estimating MOCU has been a challenge in applying it to real-world scientific and engineering problems. In this work, we propose a novel data-driven scheme to reduce the computational cost of objective-UQ via MOCU. We adopt a neural message-passing model for surrogate modeling, incorporating a novel axiomatic constraint loss that penalizes an increase in the estimated system uncertainty. As an illustrative example, we consider the optimal experimental design (OED) problem for uncertain Kuramoto models, where the goal is to predict the experiments that most effectively enhance robust synchronization performance through uncertainty reduction. We show that our proposed approach can accelerate MOCU-based OED by four to five orders of magnitude, with no visible performance loss compared to the state of the art. The proposed approach applies to general OED tasks beyond the Kuramoto model.
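The sketch below illustrates the surrogate idea in PyTorch: a small message-passing network over a fully connected Kuramoto coupling graph produces a scalar uncertainty estimate, and a hinge-type term penalises any predicted increase in uncertainty after a hypothetical experiment. The layer sizes, readout, and exact form of the axiomatic constraint are assumptions, not the paper's specification.

import torch
import torch.nn as nn

class MessagePassingSurrogate(nn.Module):
    def __init__(self, node_dim=2, edge_dim=2, hidden=32):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * node_dim + edge_dim, hidden),
                                 nn.ReLU(), nn.Linear(hidden, hidden))
        self.upd = nn.Sequential(nn.Linear(node_dim + hidden, hidden),
                                 nn.ReLU(), nn.Linear(hidden, node_dim))
        self.readout = nn.Linear(node_dim, 1)

    def forward(self, h, edge_attr):
        # h: (n, node_dim) oscillator features; edge_attr: (n, n, edge_dim)
        # coupling-strength information for every oscillator pair.
        n = h.size(0)
        hi = h.unsqueeze(1).expand(n, n, -1)
        hj = h.unsqueeze(0).expand(n, n, -1)
        m = self.msg(torch.cat([hi, hj, edge_attr], dim=-1)).sum(dim=1)
        h = h + self.upd(torch.cat([h, m], dim=-1))
        return self.readout(h.mean(dim=0))        # scalar system-uncertainty estimate

net = MessagePassingSurrogate()
h = torch.randn(5, 2)
edges_before = torch.rand(5, 5, 2)
edges_after = edges_before.clone()
edges_after[0, 1] = edges_after[1, 0] = 0.0       # a hypothetical experiment outcome
u_before, u_after = net(h, edges_before), net(h, edges_after)
# Axiomatic-style constraint: estimated uncertainty should not grow after an experiment.
constraint_loss = torch.relu(u_after - u_before).mean()
constraint_loss.backward()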
We review quasi-maximum likelihood estimation of factor models for high-dimensional panels of time series. We consider two cases: (1) estimation when no dynamic model for the factors is specified (Bai and Li, 2016); (2) estimation based on the Kalman smoother and the Expectation Maximization algorithm, which allows the factor dynamics to be modelled explicitly (Doz et al., 2012). Our interest is in approximate factor models, i.e., models in which the idiosyncratic components are allowed to be mildly cross-sectionally, as well as serially, correlated. Although such a setting apparently makes estimation harder, we show that factor models do not, in fact, suffer from the curse of dimensionality, but instead enjoy a blessing of dimensionality property. In particular, we show that if the cross-sectional dimension of the data, $N$, grows to infinity, then: (i) identification of the model is still possible, (ii) the mis-specification error due to the use of an exact factor model log-likelihood vanishes. Moreover, if we also let the sample size, $T$, grow to infinity, we can consistently estimate all parameters of the model and make inference. The same is true for estimation of the latent factors, which can be carried out by weighted least squares, linear projection, or Kalman filtering/smoothing. We also compare the presented approaches with principal component analysis and with the classical, fixed-$N$, exact maximum likelihood approach. We conclude with a discussion of the efficiency of the considered estimators.
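A minimal sketch of case (2) on simulated data is given below, using statsmodels' DynamicFactorMQ, which estimates the model by the EM algorithm with the Kalman smoother. The single-factor specification and the simulated panel are purely illustrative.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
T, N = 200, 30
factor = np.zeros(T)
for t in range(1, T):                                   # AR(1) common factor
    factor[t] = 0.8 * factor[t - 1] + rng.normal(scale=0.5)
loadings = rng.normal(size=N)
idio = rng.normal(scale=0.5, size=(T, N))               # idiosyncratic noise (iid here for simplicity)
panel = pd.DataFrame(np.outer(factor, loadings) + idio,
                     columns=[f"x{i}" for i in range(N)])

model = sm.tsa.DynamicFactorMQ(panel, factors=1, factor_orders=1,
                               idiosyncratic_ar1=True)
results = model.fit(disp=False)                          # EM + Kalman smoother
smoothed_factor = results.factors.smoothed               # estimate of the latent factor
print(smoothed_factor.iloc[:5])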
In complex large-scale systems such as climate, important effects are caused by a combination of confounding processes that are not fully observable. The identification of sources from observations of system state is vital for attribution and prediction, which inform critical policy decisions. The difficulty of these types of inverse problems lies in the inability to isolate sources and the cost of simulating computational models. Surrogate models may enable the many-query algorithms required for source identification, but data challenges arise from high dimensionality of the state and source, limited ensembles of costly model simulations to train a surrogate model, and few and potentially noisy state observations for inversion due to measurement limitations. The influence of auxiliary processes adds an additional layer of uncertainty that further confounds source identification. We introduce a framework based on (1) calibrating deep neural network surrogates to the flow maps provided by an ensemble of simulations obtained by varying sources, and (2) using these surrogates in a Bayesian framework to identify sources from observations via optimization. Focusing on an atmospheric dispersion exemplar, we find that the expressive and computationally efficient nature of the deep neural network operator surrogates in appropriately reduced dimension allows for source identification with uncertainty quantification using limited data. Introducing a variable wind field as an auxiliary process, we find that a Bayesian approximation error approach is essential for reliable source inversion when uncertainty due to wind stresses the algorithm.
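The two-step structure can be sketched as follows in PyTorch: a neural surrogate is fit to an ensemble of source/observation pairs produced by a toy forward map (standing in for the dispersion simulator), and the source is then recovered from noisy observations by optimising a Gaussian data misfit plus prior through the frozen surrogate. Dimensions, noise levels, and the forward map are illustrative; the Bayesian approximation error treatment of the auxiliary wind field is not included.

import torch
import torch.nn as nn

def toy_forward(src):                        # placeholder for the dispersion simulator
    return torch.stack([src.sum(-1), (src ** 2).sum(-1), src.prod(-1)], dim=-1)

# (1) Train the surrogate on an ensemble of simulated source/observation pairs.
sources = torch.rand(512, 2)
obs = toy_forward(sources) + 0.01 * torch.randn(512, 3)
surrogate = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 3))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    nn.functional.mse_loss(surrogate(sources), obs).backward()
    opt.step()

# (2) Inversion: MAP-style estimate of the source given a few noisy observations,
# with the surrogate used in place of the simulator.
true_src = torch.tensor([0.3, 0.7])
y = toy_forward(true_src) + 0.01 * torch.randn(3)
src = torch.full((2,), 0.5, requires_grad=True)
opt = torch.optim.Adam([src], lr=5e-2)
for _ in range(300):
    opt.zero_grad()
    misfit = ((surrogate(src) - y) ** 2).sum() / (2 * 0.01 ** 2)
    prior = (src ** 2).sum() / 2             # standard normal prior (illustrative)
    (misfit + prior).backward()
    opt.step()
print("recovered source:", src.detach())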
This work addresses the novel and challenging problem of estimating the full 3D hand shape and pose from a single RGB image. Most current methods for 3D hand analysis from monocular RGB images focus only on estimating the 3D locations of hand keypoints, which cannot fully express the 3D shape of the hand. In contrast, we propose a Graph Convolutional Neural Network (Graph CNN) based method to reconstruct a full 3D mesh of the hand surface that contains richer information about both 3D hand shape and pose. To train the networks with full supervision, we create a large-scale synthetic dataset containing both ground-truth 3D meshes and 3D poses. When fine-tuning the networks on real-world datasets without 3D ground truth, we propose a weakly-supervised approach that leverages the depth map as weak supervision during training. Through extensive evaluations on our proposed new datasets and two public datasets, we show that our method produces accurate and reasonable 3D hand meshes, and achieves superior 3D hand pose estimation accuracy compared with state-of-the-art methods.
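For orientation, the sketch below shows the kind of graph-convolution layer a Graph CNN mesh regressor builds on: per-vertex features produced by an image encoder are mixed over the mesh adjacency and regressed to 3D vertex coordinates. The simple symmetric-normalised adjacency propagation, the four-vertex toy mesh, and the feature sizes are illustrative choices rather than the paper's spectral formulation.

import torch
import torch.nn as nn

class GraphConv(nn.Module):
    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)
        deg = adj.sum(dim=1)
        d_inv_sqrt = torch.diag(deg.clamp_min(1e-8).pow(-0.5))
        self.register_buffer("norm_adj", d_inv_sqrt @ adj @ d_inv_sqrt)

    def forward(self, x):
        # x: (batch, n_vertices, in_dim); neighbours are mixed by the
        # normalised adjacency, then transformed per vertex.
        return torch.relu(self.lin(self.norm_adj @ x))

# Tiny template "mesh": 4 vertices on a ring, self-loops included.
adj = torch.tensor([[1., 1., 0., 1.],
                    [1., 1., 1., 0.],
                    [0., 1., 1., 1.],
                    [1., 0., 1., 1.]])
gc = GraphConv(64, 32, adj)
head = nn.Linear(32, 3)                        # per-vertex 3D coordinate regressor
image_features = torch.randn(2, 4, 64)         # per-vertex features from an image encoder
mesh_vertices = head(gc(image_features))       # (2, 4, 3) predicted vertex positions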
High spectral dimensionality and the shortage of annotations make hyperspectral image (HSI) classification a challenging problem. Recent studies suggest that convolutional neural networks can learn discriminative spatial features, which play a paramount role in HSI interpretation. However, most of these methods ignore the distinctive spectral-spatial characteristics of hyperspectral data. In addition, a large amount of unlabeled data remains an unexploited gold mine for efficient data use. Therefore, we propose an integration of generative adversarial networks (GANs) and probabilistic graphical models for HSI classification. Specifically, we use a spectral-spatial generator and a discriminator to identify land cover categories of hyperspectral cubes. Moreover, to take advantage of the large amount of unlabeled data, we adopt a conditional random field to refine the preliminary classification results generated by the GANs. Experimental results obtained on two commonly studied datasets demonstrate that the proposed framework achieves encouraging classification accuracy using only a small number of training samples.
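A minimal sketch of the semi-supervised discriminator objective is given below: the discriminator outputs K land-cover class scores plus one "fake" score, and is trained with a supervised term on labelled cubes and real-versus-fake terms on unlabelled and generated cubes. The 3D convolutional spectral-spatial backbone and all sizes are assumptions, and the generator and the conditional random field refinement stage are omitted.

import torch
import torch.nn as nn
import torch.nn.functional as F

K, bands, patch = 9, 103, 9                         # classes, spectral bands, spatial patch size

disc = nn.Sequential(                                # spectral-spatial discriminator (illustrative)
    nn.Conv3d(1, 16, kernel_size=(7, 3, 3), padding=(3, 1, 1)), nn.ReLU(),
    nn.AdaptiveAvgPool3d(1), nn.Flatten(),
    nn.Linear(16, K + 1),                            # K land-cover classes + 1 "fake" output
)

labelled = torch.randn(8, 1, bands, patch, patch)    # labelled hyperspectral cubes
labels = torch.randint(0, K, (8,))
unlabelled = torch.randn(8, 1, bands, patch, patch)
generated = torch.randn(8, 1, bands, patch, patch)   # would come from the generator

def real_fake_logits(logits):
    # Collapse the K real-class scores into one "real" score via log-sum-exp.
    return torch.stack([logits[:, :K].logsumexp(dim=1), logits[:, K]], dim=1)

loss_sup = F.cross_entropy(disc(labelled), labels)
loss_real = F.cross_entropy(real_fake_logits(disc(unlabelled)),
                            torch.zeros(8, dtype=torch.long))   # unlabelled cubes are real
loss_fake = F.cross_entropy(real_fake_logits(disc(generated)),
                            torch.ones(8, dtype=torch.long))    # generated cubes are fake
d_loss = loss_sup + loss_real + loss_fake
d_loss.backward()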