We present a neural technique for learning to select a local sub-region around a point which can be used for mesh parameterization. Our framework is motivated by interactive workflows used for decaling, texturing, or painting on surfaces. Our key idea is to incorporate segmentation probabilities as weights of a classical parameterization method, implemented as a novel differentiable parameterization layer within a neural network framework. We train a segmentation network to select 3D regions that are parameterized into 2D and penalized by the resulting distortion, giving rise to segmentations which are distortion-aware. Following training, a user can use our system to interactively select a point on the mesh and obtain a large, meaningful region around the selection which induces a low-distortion parameterization. Our code and project page are currently available.
Filter-decomposition-based group-equivariant convolutional neural networks (G-CNNs) have been demonstrated to increase the data efficiency of CNNs and contribute to better interpretability and controllability of CNN models. However, so far filter-decomposition-based affine G-CNN methods rely on parameter sharing to achieve high parameter efficiency and suffer from a heavy computational burden. They also use a limited number of transformations and, in particular, ignore the shear transform in applications. In this paper, we address these problems by emphasizing the importance of the diversity of transformations. We propose a flexible and efficient strategy based on weighted filter-wise Monte Carlo sampling. In addition, we introduce shear-equivariant CNNs to address the highly sparse representations of natural images. We demonstrate that the proposed methods are intrinsically an efficient generalization of traditional CNNs, and we explain the advantage of bottleneck architectures used in existing state-of-the-art CNN models such as ResNet, ResNext, and ConvNeXt from the group-equivariant perspective. Experiments on image classification and image denoising tasks show that with a suitable set of filter bases, our methods achieve superior performance to standard CNNs with high data efficiency. The code will be available at //github.com/ZhaoWenzhao/MCG_CNN.
This paper addresses the $\epsilon$-close parameter tuning problem for Bayesian Networks (BNs): find a minimal $\epsilon$-close amendment of probability entries in a given set of (rows in) conditional probability tables (CPTs) that makes a given quantitative constraint on the BN valid. Based on the state-of-the-art "region verification" techniques for parametric Markov chains, we propose an algorithm whose capabilities go beyond any existing techniques. Our experiments show that $\epsilon$-close tuning of large BN benchmarks with up to 8 parameters is feasible. In particular, by allowing (i) varied parameters in multiple CPTs and (ii) inter-CPT parameter dependencies, we treat subclasses of parametric BNs that have received scant attention so far.
Augmented Krylov subspace methods aid in accelerating the convergence of a standard Krylov subspace method by including additional vectors in the search space. A residual projection framework based on residual (Petrov-) Galerkin constraints was presented in [Gaul et al. SIAM J. Matrix Anal. Appl 2013], and later generalised in a recent survey on subspace recycling iterative methods [Soodhalter et al. GAMM-Mitt. 2020]. The framework describes augmented Krylov subspace methods in terms of applying a standard Krylov subspace method to an appropriately projected problem. In this work we show that the projected problem has an equivalent unprojected formulation, and that viewing the framework in this way provides a similar description for the class of unprojected augmented Krylov subspace methods. We derive the first unprojected augmented Full Orthogonalization Method (FOM), and demonstrate its effectiveness as a recycling method. We then show how the R$^{3}$GMRES algorithm fits within the framework. We show that unprojected augmented short recurrence methods fit within the framework, but can only be implemented in practice under certain conditions on the augmentation subspace. We demonstrate this using the Augmented Conjugate Gradient (AugCG) algorithm as an example.
Recent advances in machine learning-aided lossy compression are incorporating perceptual fidelity into the rate-distortion theory. In this paper, we study the rate-distortion-perception trade-off when the perceptual quality is measured by the total variation distance between the empirical and product distributions of the discrete memoryless source and its reconstruction. We consider the general setting, where two types of resources are available at both the encoder and decoder: a common side information sequence, correlated with the source sequence, and common randomness. We show that the region under the strong perceptual constraint is a subset of that for the weaker empirical perceptual constraint. When sufficient common randomness is provided, the required communication rate is the minimum conditional mutual information such that the distortion and perceptual constraints are satisfied. The coding scheme in the proof of achievability takes advantage of the likelihood encoder.
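As a small illustration of the perceptual measure mentioned above, the following sketch computes the total variation distance between the empirical distribution of a discrete sequence and a reference distribution. The function names `empirical_distribution` and `total_variation` are hypothetical, and the choice of which two distributions to compare is an assumption made for illustration; the abstract does not spell out the exact construction.

```python
from collections import Counter


def empirical_distribution(sequence):
    """Return the empirical pmf (type) of a discrete sequence as a dict."""
    if len(sequence) == 0:
        raise ValueError("sequence must be non-empty")
    counts = Counter(sequence)
    n = len(sequence)
    return {symbol: count / n for symbol, count in counts.items()}


def total_variation(p, q):
    """Total variation distance 0.5 * sum_x |p(x) - q(x)| between two pmfs.

    Both pmfs are dicts mapping symbols to probabilities; symbols missing
    from a dict are treated as having probability zero.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


# Illustrative use: compare a reconstruction sequence with an assumed
# binary uniform source distribution.
source_pmf = {0: 0.5, 1: 0.5}
reconstruction = [0, 0, 1, 1]
distance = total_variation(empirical_distribution(reconstruction), source_pmf)
```

A perceptual constraint of this kind would then require the distance to stay below a chosen threshold.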
It is well known that when the exact line search gradient descent (GD) method is applied to a convex quadratic objective, the worst-case rate of convergence (ROC), taken over all seed vectors, deteriorates as the condition number of the Hessian of the objective grows. Based on an elegant analysis by H. Akaike, it is generally believed -- but not proved -- that in the ill-conditioned regime the ROC for almost all initial vectors, and hence also the average ROC, is close to the worst-case ROC. We complete Akaike's analysis using the theorem of center and stable manifolds. Our analysis also makes apparent the effect of an intermediate eigenvalue in the Hessian by establishing the following somewhat amusing result: In the absence of an intermediate eigenvalue, the average ROC gets arbitrarily fast -- not slow -- as the Hessian gets increasingly ill-conditioned. We discuss in passing some contemporary applications of exact line search GD to polynomial optimization problems arising from imaging and data sciences.
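The iteration discussed above can be sketched in a few lines. This is a minimal illustration, assuming a diagonal Hessian so that the objective is f(x) = 0.5 * sum_i eigs[i] * x[i]^2; the function name `exact_line_search_gd` and the test vectors are hypothetical and not taken from the paper.

```python
def exact_line_search_gd(eigs, x0, steps):
    """Exact line search gradient descent on f(x) = 0.5 * sum(eigs[i] * x[i]**2).

    eigs:  positive eigenvalues of the (assumed diagonal) Hessian
    x0:    starting vector
    steps: maximum number of iterations
    Returns the list of objective values f(x_0), f(x_1), ...
    """
    def objective(x):
        return 0.5 * sum(lam * xi * xi for lam, xi in zip(eigs, x))

    x = list(x0)
    values = [objective(x)]
    for _ in range(steps):
        grad = [lam * xi for lam, xi in zip(eigs, x)]  # gradient A x
        gg = sum(g * g for g in grad)
        if gg == 0.0:
            break  # already at the minimizer
        g_a_g = sum(lam * g * g for lam, g in zip(eigs, grad))
        alpha = gg / g_a_g  # exact minimizer of f along -grad
        x = [xi - alpha * g for xi, g in zip(x, grad)]
        values.append(objective(x))
    return values


kappa = 100.0
# Classical worst-case start for eigenvalues (1, kappa).
worst_case = exact_line_search_gd([1.0, kappa], [kappa, 1.0], 5)
# A start with an intermediate eigenvalue present.
with_middle = exact_line_search_gd([1.0, 10.0, kappa], [0.3, -1.2, 0.7], 20)
```

For eigenvalues (1, kappa) and the start (kappa, 1), every step multiplies the objective by ((kappa - 1) / (kappa + 1))^2, which is the classical worst-case bound; by the Kantorovich inequality, no start can contract more slowly per step.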
Deep learning in computer vision has achieved great success at the price of large-scale labeled training data. However, exhaustive data annotation is impracticable for every task in every domain of interest, due to high labor costs and unguaranteed labeling accuracy. Besides, the uncontrollable data collection process produces non-IID training and test data, where undesired duplication may exist. All these nuisances may hinder the verification of typical theories and the discovery of new findings. To circumvent them, an alternative is to generate synthetic data via 3D rendering with domain randomization. In this work, we push forward along this line with an extensive study of bare supervised learning and downstream domain adaptation. Specifically, under the well-controlled, IID data setting enabled by 3D rendering, we systematically verify typical, important learning insights, e.g., shortcut learning, and discover new laws of generalization across various data regimes and network architectures. We further investigate the effect of image formation factors on generalization, e.g., object scale, material texture, illumination, camera viewpoint, and background in a 3D scene. Moreover, we use simulation-to-reality adaptation as a downstream task for comparing the transferability of synthetic and real data when used for pre-training, which demonstrates that synthetic-data pre-training is also promising for improving real test results. Lastly, to promote future research, we develop a new large-scale synthetic-to-real benchmark for image classification, termed S2RDA, which provides more significant challenges for transfer from simulation to reality. The code and datasets are available at //github.com/huitangtang/On_the_Utility_of_Synthetic_Data.
Physical models with uncertain inputs are commonly represented as parametric partial differential equations (PDEs), that is, PDEs with inputs that are expressed as functions of parameters with an associated probability distribution. Developing efficient and accurate solution strategies that account for errors on the space, time and parameter domains simultaneously is highly challenging. Indeed, it is well known that standard polynomial-based approximations on the parameter domain can incur errors that grow in time. In this work, we focus on advection-diffusion problems with parameter-dependent wind fields. A novel adaptive solution strategy is proposed that allows users to combine stochastic collocation on the parameter domain with off-the-shelf adaptive timestepping algorithms with local error control. This is a non-intrusive strategy that builds a polynomial-based surrogate that is adapted sequentially in time. The algorithm is driven by a so-called hierarchical estimator for the parametric error and balances this against an estimate for the global timestepping error which is derived from a scaling argument.
We introduce new techniques for the parameterized verification of disjunctive timed networks (DTNs), i.e., networks of timed automata (TAs) that communicate via location guards that enable a transition only if at least one process is in a given location. This computational model has been considered in the literature before, and example applications are gossiping clock synchronization protocols or planning problems. We address the minimum-time reachability problem (minreach) in DTNs, and show how to efficiently solve it based on a novel zone-graph algorithm. We further show that solving minreach allows us to construct a summary TA capturing exactly the possible behaviors of a single TA within a DTN of arbitrary size. The combination of these two results enables the parameterized verification of DTNs, while avoiding the construction of an exponential-size cutoff-system required by existing results. Our techniques are also implemented, and experiments show their practicality.
Recent advances in 3D fully convolutional networks (FCNs) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from the large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training class-specific models. To this end, we propose a two-stage, coarse-to-fine approach that first uses a 3D FCN to roughly define a candidate region, which is then used as input to a second 3D FCN. This reduces the number of voxels the second FCN has to classify to ~10% and allows it to focus on more detailed segmentation of the organs and vessels. We utilize training and validation sets consisting of 331 clinical CT images and test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans, targeting three anatomical organs (liver, spleen, and pancreas). In challenging organs such as the pancreas, our cascaded approach improves the mean Dice score from 68.5% to 82.2%, achieving the highest reported average score on this dataset. We compare with a 2D FCN method on a separate dataset of 240 CT scans with 18 classes and achieve significantly higher performance in small organs and vessels. Furthermore, we explore fine-tuning our models to different datasets. Our experiments illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results. Our code and trained models are available for download: //github.com/holgerroth/3Dunet_abdomen_cascade.
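For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), for two binary masks. The function name `dice_score`, the flat 0/1 list representation, and the convention for two empty masks are assumptions made for illustration; this is not the evaluation code from the linked repository.

```python
def dice_score(pred, truth):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy example: one of two predicted voxels overlaps the single true voxel.
score = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

A mean Dice score, as quoted in the abstract, would average this quantity over the test scans for a given organ.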
In this paper, we propose a conceptually simple and geometrically interpretable objective function, i.e. additive margin Softmax (AM-Softmax), for deep face verification. In general, the face verification task can be viewed as a metric learning problem, so learning large-margin face features whose intra-class variation is small and inter-class difference is large is of great importance in order to achieve good performance. Recently, Large-margin Softmax and Angular Softmax have been proposed to incorporate the angular margin in a multiplicative manner. In this work, we introduce a novel additive angular margin for the Softmax loss, which is intuitively appealing and more interpretable than existing works. We also emphasize and discuss the importance of feature normalization in the paper. Most importantly, our experiments on LFW BLUFR and MegaFace show that our additive margin softmax loss consistently performs better than the current state-of-the-art methods using the same network architecture and training dataset. Our code has also been made available at //github.com/happynear/AMSoftmax.
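The additive margin idea can be sketched for a single sample as follows. This is a minimal illustration, assuming the margin m is subtracted from the target-class cosine after both the feature and the class weights are L2-normalized, and that the result is scaled by a factor s before the softmax. The function name `am_softmax_loss` and the default values of `s` and `m` are illustrative assumptions, not the code from the linked repository.

```python
import math


def _normalize(vector):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def am_softmax_loss(feature, class_weights, label, s=30.0, m=0.35):
    """Additive margin softmax loss for one sample.

    feature:       embedding vector of the sample
    class_weights: one weight vector per class
    label:         index of the ground-truth class
    s, m:          scale and additive margin (illustrative defaults)
    """
    f = _normalize(feature)
    # Cosine similarity between the feature and every class weight.
    cosines = [sum(a * b for a, b in zip(f, _normalize(w))) for w in class_weights]
    # Subtract the margin from the target-class cosine only, then scale.
    logits = [s * (c - m) if j == label else s * c for j, c in enumerate(cosines)]
    # Numerically stable log-sum-exp.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]


weights = [[1.0, 0.0], [0.0, 3.0]]
loss_without_margin = am_softmax_loss([2.0, 0.0], weights, 0, s=1.0, m=0.0)
loss_with_margin = am_softmax_loss([2.0, 0.0], weights, 0, s=1.0, m=0.35)
```

With m = 0 the expression reduces to a scaled softmax cross-entropy on cosine similarities; a positive margin lowers the target logit and therefore increases the loss for the same feature, which pushes the network toward a larger gap between classes.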