Oriented normals are common pre-requisites for many geometric algorithms based on point clouds, such as Poisson surface reconstruction. However, it is not trivial to obtain a consistent orientation. In this work, we bridge orientation and reconstruction in the implicit space and propose a novel approach to orient point cloud normals by incorporating isovalue constraints to the Poisson equation. In implicit surface reconstruction, the reconstructed shape is represented as an isosurface of an implicit function defined in the ambient space. Therefore, when such a surface is reconstructed from a set of sample points, the implicit function values at the points should be close to the isovalue corresponding to the surface. Based on this observation and the Poisson equation, we propose an optimization formulation that combines isovalue constraints with local consistency requirements for normals. We optimize normals and implicit functions simultaneously and solve for a globally consistent orientation. Thanks to the sparsity of the linear system, our method can work on an average laptop with reasonable computational time. Experiments show that our method can achieve high performance in non-uniform and noisy data and manage varying sampling densities, artifacts, multiple connected components, and nested surfaces. The source code is available at \url{//github.com/Submanifold/IsoConstraints}.
In this work we propose tailored model order reduction for varying boundary optimal control problems governed by parametric partial differential equations. With varying boundary control, we mean that a specific parameter changes where the boundary control acts on the system. This peculiar formulation might benefit from model order reduction. Indeed, fast and reliable simulations of this model can be of utmost usefulness in many applied fields, such as geophysics and energy engineering. However, varying boundary control features very complicated and diversified parametric behaviour for the state and adjoint variables. The state solution, for example, changing the boundary control parameter, might feature transport phenomena. Moreover, the problem loses its affine structure. It is well known that classical model order reduction techniques fail in this setting, both in accuracy and in efficiency. Thus, we propose reduced approaches inspired by the ones used when dealing with wave-like phenomena. Indeed, we compare standard proper orthogonal decomposition with two tailored strategies: geometric recasting and local proper orthogonal decomposition. Geometric recasting solves the optimization system in a reference domain simplifying the problem at hand avoiding hyper-reduction, while local proper orthogonal decomposition builds local bases to increase the accuracy of the reduced solution in very general settings (where geometric recasting is unfeasible). We compare the various approaches on two different numerical experiments based on geometries of increasing complexity.
Action recognition has long been a fundamental and intriguing problem in artificial intelligence. The task is challenging due to the high dimensionality nature of an action, as well as the subtle motion details to be considered. Current state-of-the-art approaches typically learn from articulated motion sequences in the straightforward 3D Euclidean space. However, the vanilla Euclidean space is not efficient for modeling important motion characteristics such as the joint-wise angular acceleration, which reveals the driving force behind the motion. Moreover, current methods typically attend to each channel equally and lack theoretical constrains on extracting task-relevant features from the input. In this paper, we seek to tackle these challenges from three aspects: (1) We propose to incorporate an acceleration representation, explicitly modeling the higher-order variations in motion. (2) We introduce a novel Stream-GCN network equipped with multi-stream components and channel attention, where different representations (i.e., streams) supplement each other towards a more precise action recognition while attention capitalizes on those important channels. (3) We explore feature-level supervision for maximizing the extraction of task-relevant information and formulate this into a mutual information loss. Empirically, our approach sets the new state-of-the-art performance on three benchmark datasets, NTU RGB+D, NTU RGB+D 120, and NW-UCLA. Our code is anonymously released at //github.com/ActionR-Group/Stream-GCN, hoping to inspire the community.
This paper presents a new distance metric to compare two continuous probability density functions. The main advantage of this metric is that, unlike other statistical measurements, it can provide an analytic, closed-form expression for a mixture of Gaussian distributions while satisfying all metric properties. These characteristics enable fast, stable, and efficient calculations, which are highly desirable in real-world signal processing applications. The application in mind is Gaussian Mixture Reduction (GMR), which is widely used in density estimation, recursive tracking, and belief propagation. To address this problem, we developed a novel algorithm dubbed the Optimization-based Greedy GMR (OGGMR), which employs our metric as a criterion to approximate a high-order Gaussian mixture with a lower order. Experimental results show that the OGGMR algorithm is significantly faster and more efficient than state-of-the-art GMR algorithms while retaining the geometric shape of the original mixture.
One-shot channel simulation is a fundamental data compression problem concerned with encoding a single sample from a target distribution $Q$ using a coding distribution $P$ using as few bits as possible on average. Algorithms that solve this problem find applications in neural data compression and differential privacy and can serve as a more efficient alternative to quantization-based methods. Sadly, existing solutions are too slow or have limited applicability, preventing widespread adoption. In this paper, we conclusively solve one-shot channel simulation for one-dimensional problems where the target-proposal density ratio is unimodal by describing an algorithm with optimal runtime. We achieve this by constructing a rejection sampling procedure equivalent to greedily searching over the points of a Poisson process. Hence, we call our algorithm greedy Poisson rejection sampling (GPRS) and analyze the correctness and time complexity of several of its variants. Finally, we empirically verify our theorems, demonstrating that GPRS significantly outperforms the current state-of-the-art method, A* coding.
The Poisson-Boltzmann equation (PBE) is an implicit solvent continuum model for calculating the electrostatic potential and energies of ionic solvated biomolecules. However, its numerical solution remains a significant challenge due strong singularities and nonlinearity caused by the singular source terms and the exponential nonlinear terms, respectively. An efficient method for the treatment of singularities in the linear PBE was introduced in \cite{BeKKKS:18}, that is based on the RS tensor decomposition for both electrostatic potential and the discretized Dirac delta distribution. In this paper, we extend this regularization method to the nonlinear PBE. We apply the PBE only to the regular part of the solution corresponding to the modified right-hand side via extraction of the long-range part in the discretized Dirac delta distribution. The total electrostatic potential is obtained by adding the long-range solution to the directly precomputed short-range potential. The main computational benefit of the approach is the automatic maintaining of the continuity in the Cauchy data on the solute-solvent interface. The boundary conditions are also obtained from the long-range component of the precomputed canonical tensor representation of the Newton kernel. In the numerical experiments, we illustrate the accuracy of the nonlinear regularized PBE (NRPBE) over the classical variant.
Capturing intricate and subtle variations in human expressiveness in music performance using computational approaches is challenging. In this paper, we propose a novel approach for reconstructing human expressiveness in piano performance with a multi-layer bi-directional Transformer encoder. To address the needs for large amounts of accurately captured and score-aligned performance data in training neural networks, we use transcribed scores obtained from an existing transcription model to train our model. We integrate pianist identities to control the sampling process and explore the ability of our system to model variations in expressiveness for different pianists. The system is evaluated through statistical analysis of generated expressive performances and a listening test. Overall, the results suggest that our method achieves state-of-the-art in generating human-like piano performances from transcribed scores, while fully and consistently reconstructing human expressiveness poses further challenges.
Generating realistic human 3D reconstructions using image or video data is essential for various communication and entertainment applications. While existing methods achieved impressive results for body and facial regions, realistic hair modeling still remains challenging due to its high mechanical complexity. This work proposes an approach capable of accurate hair geometry reconstruction at a strand level from a monocular video or multi-view images captured in uncontrolled lighting conditions. Our method has two stages, with the first stage performing joint reconstruction of coarse hair and bust shapes and hair orientation using implicit volumetric representations. The second stage then estimates a strand-level hair reconstruction by reconciling in a single optimization process the coarse volumetric constraints with hair strand and hairstyle priors learned from the synthetic data. To further increase the reconstruction fidelity, we incorporate image-based losses into the fitting process using a new differentiable renderer. The combined system, named Neural Haircut, achieves high realism and personalization of the reconstructed hairstyles.
In this paper, we propose a novel construction for secure distributed matrix multiplication (SDMM) based on algebraic geometry (AG) codes, which we call the PoleGap SDMM scheme. The proposed construction is inspired by the GASP code, where so-called gaps in a certain polynomial are utilized to achieve higher communication rates. Our construction considers the gaps in a Weierstrass semigroup of a rational place in an algebraic function field to achieve a similar increase in the rate. This construction shows that there is potential in utilizing AG codes and their subcodes in SDMM since we demonstrate a better performance compared to state-of-the-art schemes in some parameter regimes.
We present a novel method for reconstructing a 3D implicit surface from a large-scale, sparse, and noisy point cloud. Our approach builds upon the recently introduced Neural Kernel Fields (NKF) representation. It enjoys similar generalization capabilities to NKF, while simultaneously addressing its main limitations: (a) We can scale to large scenes through compactly supported kernel functions, which enable the use of memory-efficient sparse linear solvers. (b) We are robust to noise, through a gradient fitting solve. (c) We minimize training requirements, enabling us to learn from any dataset of dense oriented points, and even mix training data consisting of objects and scenes at different scales. Our method is capable of reconstructing millions of points in a few seconds, and handling very large scenes in an out-of-core fashion. We achieve state-of-the-art results on reconstruction benchmarks consisting of single objects, indoor scenes, and outdoor scenes.
With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose occupancy networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.