We present an accelerated and hardware parallelized integral-equation solver for the problem of acoustic scattering by a two-dimensional surface in three-dimensional space. The approach is based, in part, on the novel Interpolated Factored Green Function acceleration method (IFGF) that, without recourse to the Fast Fourier Transform (FFT), evaluates the action of Green function-based integral operators for an $N$-point surface discretization at a complexity of $\mathcal{O}(N\log N)$ operations instead of the $\mathcal{O}(N^2)$ cost associated with nonaccelerated methods. The IFGF algorithm exploits the slow variations of factored Green functions to enable the fast evaluation of fields generated by groups of sources on the basis of a recursive interpolation scheme. In the proposed approach, the IFGF method is used to account for the vast majority of the computations, while, for the relatively few singular, nearly singular, and neighboring non-singular integral operator evaluations, a high-order rectangular-polar quadrature approach is employed instead. Since the overall approach does not rely on the FFT, it is amenable to efficient shared- and distributed-memory parallelization; this paper demonstrates such a capability by means of an OpenMP parallel implementation of the method. A variety of numerical examples presented in this paper demonstrate that the proposed methods enable the efficient solution of large problems over complex geometries on small parallel hardware infrastructures. Numerical examples include acoustic scattering by a sphere of up to $128$ wavelengths, an $80$-wavelength submarine, and a turbofan nacelle that is more than $80$ wavelengths in size, requiring, on a 28-core computer, computing times of the order of a few minutes per iteration and a few tens of iterations of the GMRES iterative solver.
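The key observation behind the IFGF acceleration, that the factored Green function varies slowly and can therefore be interpolated with few nodes even when the field itself is highly oscillatory, can be illustrated in a few lines of numpy. The geometry, wavenumber, source count, and node count below are our own toy choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 50.0                                  # wavenumber (illustrative value)
ys = rng.uniform(-0.1, 0.1, (20, 3))      # sources clustered in a small box at the origin
x = np.linspace(3.0, 6.0, 400)            # far-field target points along the x-axis

def field(xv):
    # field radiated by the sources: sum of Helmholtz Green functions e^{ikr}/r
    pts = np.stack([xv, np.zeros_like(xv), np.zeros_like(xv)], axis=-1)
    r = np.linalg.norm(pts[:, None, :] - ys[None, :, :], axis=-1)
    return (np.exp(1j * k * r) / r).sum(axis=1)

def lagrange(nodes, vals, xv):
    # straightforward Lagrange interpolation (adequate for a handful of nodes)
    out = np.zeros(xv.shape, dtype=complex)
    for i in range(len(nodes)):
        w = np.ones_like(xv)
        for jn in range(len(nodes)):
            if jn != i:
                w = w * (xv - nodes[jn]) / (nodes[i] - nodes[jn])
        out += vals[i] * w
    return out

n = 9                                     # Chebyshev interpolation nodes on [3, 6]
nodes = 4.5 + 1.5 * np.cos((2 * np.arange(n) + 1) * np.pi / (2 * n))
Gc = lambda xv: np.exp(1j * k * xv) / xv  # centered Green-function factor
u_nodes = field(nodes)
u_exact = field(x)

# interpolate the oscillatory field directly ...
u_direct = lagrange(nodes, u_nodes, x)
# ... versus interpolating the slowly varying factored field u/Gc
u_fact = lagrange(nodes, u_nodes / Gc(nodes), x) * Gc(x)

scale = np.max(np.abs(u_exact))
err_direct = np.max(np.abs(u_direct - u_exact)) / scale
err_fact = np.max(np.abs(u_fact - u_exact)) / scale
```

With nine nodes spanning roughly 24 wavelengths, direct interpolation fails completely while the factored interpolation is accurate to many digits, which is the property the recursive IFGF scheme exploits.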
Obreshkov-like numerical integrators have been widely applied to power system transient simulation. Misusing these integrators as numerical differentiators, however, can lead to numerical oscillation or bias. This paper proposes criteria under which Obreshkov-like numerical integrators can be used as numerical differentiators, so that these misleading phenomena are avoided. It turns out that the coefficients a numerical integrator applies to the highest-order derivative determine its suitability. Several existing Obreshkov-like numerical integrators are examined under the proposed criteria, revealing that the notorious numerical oscillations induced by the implicit trapezoidal method cannot always be eliminated by applying the backward Euler method for a few time steps. Guided by the proposed criteria, a frequency-response-optimized integrator that accounts for the second-order derivative is put forward and shown to be suitable for use as a numerical differentiator. The theoretical observations are demonstrated in the time domain via case studies. The paper shows how to properly select numerical integrators for power system transient simulation and helps prevent their misuse.
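The classic oscillation phenomenon referred to above is easy to reproduce. When the trapezoidal rule $x_n = x_{n-1} + \frac{h}{2}(x'_n + x'_{n-1})$ is solved for $x'_n$ and used as a differentiator, any error introduced at a non-smooth point alternates in sign forever instead of decaying; the backward Euler differencing damps it immediately. The grid and test signal below are our own minimal illustration, not the paper's case studies:

```python
import numpy as np

h = 0.1
t = np.arange(-1.03, 2.0, h)       # grid chosen so the kink at t = 0 falls between samples
x = np.maximum(t, 0.0)             # ramp signal: derivative jumps from 0 to 1 at t = 0

d_trap = [0.0]                     # trapezoidal rule used as a differentiator
d_be = [0.0]                       # backward Euler used as a differentiator
for n in range(1, len(t)):
    s = (x[n] - x[n - 1]) / h
    d_trap.append(2.0 * s - d_trap[-1])   # from x_n = x_{n-1} + (h/2)(x'_n + x'_{n-1})
    d_be.append(s)                        # from x_n = x_{n-1} + h x'_n

# the trapezoidal estimates keep oscillating around the true value 1 ...
osc = abs(d_trap[-1] - d_trap[-2])
# ... while the backward Euler estimate settles on the true derivative
err_be = abs(d_be[-1] - 1.0)
```

Here the trapezoidal estimates bounce between roughly 1.4 and 0.6 for all later time steps, an undamped oscillation of the kind the proposed criteria are designed to rule out.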
We propose a collocation method based on multivariate polynomial splines over triangulations or tetrahedralizations for the numerical solution of partial differential equations. We start with a detailed explanation of the method for the Poisson equation and then extend the study to second-order elliptic PDEs in non-divergence form. We show that the numerical solution approximates the exact PDE solution very well, and we present extensive numerical results demonstrating the performance of the method in both 2D and 3D settings. In addition, we present a comparison with the existing multivariate spline methods in \cite{ALW06} and \cite{LW17} to show that the new method produces a similar, and sometimes more accurate, approximation in a more efficient fashion.
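The basic collocation idea, choosing trial functions that satisfy the boundary conditions and forcing the PDE to hold exactly at a set of collocation points, can be sketched in one dimension. The toy below solves $-u'' = f$ on $(0,1)$ with a polynomial basis and a manufactured solution; it only illustrates the principle, whereas the paper's method uses multivariate splines over triangulations and tetrahedralizations:

```python
import numpy as np

# collocation for -u'' = f on (0,1), u(0) = u(1) = 0, with basis
# phi_j(x) = x^j (1 - x) (zero at both endpoints) and manufactured
# solution u(x) = sin(pi x), so f(x) = pi^2 sin(pi x).
m = 10
degs = np.arange(1, m + 1)

def neg_phi_pp(x, j):
    # -phi_j''(x) for phi_j(x) = x^j - x^(j+1)
    first = j * (j - 1) * x**(j - 2) if j >= 2 else 0.0
    return -(first - (j + 1) * j * x**(j - 1))

# Chebyshev collocation points in (0, 1)
xc = 0.5 + 0.5 * np.cos((2 * np.arange(m) + 1) * np.pi / (2 * m))

# enforce -u_h''(x_i) = f(x_i) at every collocation point
M = np.array([[neg_phi_pp(x, j) for j in degs] for x in xc])
f = np.pi**2 * np.sin(np.pi * xc)
c = np.linalg.solve(M, f)

# evaluate the collocation solution on a fine grid and measure the error
xs = np.linspace(0.0, 1.0, 500)
u_h = sum(cj * (xs**j - xs**(j + 1)) for cj, j in zip(c, degs))
err = np.max(np.abs(u_h - np.sin(np.pi * xs)))
```

Even this crude monomial basis reproduces the smooth exact solution to high accuracy, which is the behavior the spline collocation method exhibits in the 2D and 3D experiments.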
This paper deals with the grouped variable selection problem. A widely used strategy is to augment the negative log-likelihood function with a sparsity-promoting penalty. Existing methods include the group Lasso, group SCAD, and group MCP. The group Lasso solves a convex optimization problem but suffers from underestimation bias. The group SCAD and group MCP avoid this estimation bias but require solving a nonconvex optimization problem that may be plagued by suboptimal local optima. In this work, we propose an alternative method based on the generalized minimax concave (GMC) penalty, which is a folded concave penalty that maintains the convexity of the objective function. We develop a new method for grouped variable selection in linear regression, the group GMC, that generalizes the strategy of the original GMC estimator. We present an efficient algorithm for computing the group GMC estimator and also prove properties of the solution path to guide its numerical computation and tuning parameter selection in practice. We establish error bounds for both the group GMC and original GMC estimators. A rich set of simulation studies and a real data application indicate that the proposed group GMC approach outperforms existing methods in several different aspects under a wide array of scenarios.
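For readers unfamiliar with grouped penalties, the mechanism they all share is block-wise shrinkage: a whole group of coefficients is kept or zeroed together. The snippet below implements the standard proximal operator of the group Lasso penalty (block soft-thresholding), which is the building block of such methods; it is not the paper's group GMC algorithm:

```python
import numpy as np

def group_soft_threshold(x, groups, lam):
    """Proximal operator of lam * sum_g ||x_g||_2 (the group-Lasso penalty).

    Each group is shrunk toward zero by lam in Euclidean norm; groups whose
    norm falls below lam are set to zero entirely.
    """
    out = np.zeros_like(x)
    for g in groups:
        norm = np.linalg.norm(x[g])
        if norm > lam:
            out[g] = (1.0 - lam / norm) * x[g]
    return out

x = np.array([3.0, 4.0, 0.1, -0.2])
z = group_soft_threshold(x, [[0, 1], [2, 3]], lam=1.0)
# the first group (norm 5) is shrunk to 80% of its size;
# the second group (norm ~0.22 < 1) is zeroed out as a whole
```

The underestimation bias of the group Lasso mentioned above is visible here: even the retained group is shrunk, which is what the folded concave penalties (group SCAD, group MCP, group GMC) are designed to correct.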
Multi-material problems often exhibit complex geometries along with physical responses presenting large spatial gradients or discontinuities. In these cases, providing high-quality body-fitted finite element analysis meshes and obtaining accurate solutions remain challenging. Immersed boundary techniques provide elegant solutions for such problems. Enrichment methods alleviate the need for generating conforming analysis grids by capturing discontinuities within mesh elements. Additionally, increased accuracy of physical responses and geometry description can be achieved with higher-order approximation bases. In particular, using B-splines has become popular with the development of IsoGeometric Analysis. In this work, an eXtended IsoGeometric Analysis (XIGA) approach is proposed for multi-material problems. The computational domain geometry is described implicitly by level set functions. A novel generalized Heaviside enrichment strategy is employed to accommodate an arbitrary number of materials without artificially stiffening the physical response. Higher-order B-spline functions are used for both geometry representation and analysis. Boundary and interface conditions are enforced weakly via Nitsche's method, and a new face-oriented ghost stabilization methodology is used to mitigate numerical instabilities arising from small material integration subdomains. Two- and three-dimensional heat transfer and elasticity problems are solved to validate the approach. Numerical studies provide insight into the ability to handle multiple materials considering sharp-edged and curved interfaces, as well as the impact of higher-order bases and stabilization on the solution accuracy and conditioning.
The present paper continues our investigation of an implementation of a least-squares collocation method for higher-index differential-algebraic equations. In earlier papers, we substantiated the choice of basis functions and collocation points for a robust implementation, as well as algorithms for the solution of the discrete system. The present paper is devoted to analytic estimates of the condition numbers of the different components of the implementation. We present error estimates that identify the sources of the different errors.
Eigendecomposition of symmetric matrices is at the heart of many computer vision algorithms. However, the derivatives of the eigenvectors tend to be numerically unstable, whether using the SVD to compute them analytically or using the Power Iteration (PI) method to approximate them. This instability arises in the presence of eigenvalues that are close to each other. This makes integrating eigendecomposition into deep networks difficult and often results in poor convergence, particularly when dealing with large matrices. While this can be mitigated by partitioning the data into small arbitrary groups, doing so has no theoretical basis and makes it impossible to exploit the full power of eigendecomposition. In previous work, we mitigated this by using the SVD during the forward pass and PI to compute the gradients during the backward pass. However, the iterative deflation procedure required to compute multiple eigenvectors using PI tends to accumulate errors and yield inaccurate gradients. Here, we show that the Taylor expansion of the SVD gradient is theoretically equivalent to the gradient obtained using PI, without requiring an iterative process in practice, and thus yields more accurate gradients. We demonstrate the benefits of this increased accuracy for image classification and style transfer.
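The instability described above comes from the $1/(\lambda_k - \lambda_j)$ factors in the analytic eigenvector derivative, which blow up when eigenvalues nearly coincide. A small numpy check of that first-order perturbation formula makes the structure explicit; the matrix and perturbation are our own well-separated test case, not the paper's setting:

```python
import numpy as np

A = np.diag([1.0, 2.0, 4.0, 8.0])             # symmetric, well-separated eigenvalues
rng = np.random.default_rng(1)
dA = rng.standard_normal((4, 4))
dA = (dA + dA.T) / 2                          # symmetric perturbation direction
w, V = np.linalg.eigh(A)
k = 3                                         # top eigenvector (eigh sorts ascending)

# first-order perturbation: dv_k = sum_{j != k} (v_j^T dA v_k) / (w_k - w_j) v_j.
# The 1/(w_k - w_j) factors are exactly what becomes unstable for close eigenvalues.
dv = np.zeros(4)
for j in range(4):
    if j != k:
        dv += (V[:, j] @ dA @ V[:, k]) / (w[k] - w[j]) * V[:, j]

# finite-difference check of the formula
eps = 1e-6
_, V2 = np.linalg.eigh(A + eps * dA)
v2 = V2[:, k] * np.sign(V2[:, k] @ V[:, k])   # resolve the eigenvector sign ambiguity
err = np.linalg.norm((v2 - V[:, k]) / eps - dv)
```

Replacing the well-separated eigenvalues with nearly equal ones (e.g. 4.0 and 4.001) makes both the formula and the finite difference erratic, which is the regime the paper's Taylor-expanded gradient is designed to handle.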
In order to avoid the curse of dimensionality frequently encountered in Big Data analysis, the field of linear and nonlinear dimension reduction techniques has developed rapidly in recent years. These techniques (sometimes referred to as manifold learning) assume that the scattered input data lies on a lower-dimensional manifold, so the high-dimensionality problem can be overcome by learning the lower-dimensional behavior. However, in real-life applications, data is often very noisy. In this work, we propose a method to approximate $\mathcal{M}$, a $d$-dimensional $C^{m+1}$ smooth submanifold of $\mathbb{R}^n$ ($d \ll n$), based upon noisy scattered data points (i.e., a data cloud). We assume that the data points are located "near" the lower dimensional manifold and suggest a non-linear moving least-squares projection on an approximating $d$-dimensional manifold. Under some mild assumptions, the resulting approximant is shown to be infinitely smooth and of high approximation order (i.e., $O(h^{m+1})$, where $h$ is the fill distance and $m$ is the degree of the local polynomial approximation). The method presented here assumes no analytic knowledge of the approximated manifold, and the approximation algorithm is linear in the large dimension $n$. Furthermore, the approximating manifold can serve as a framework to perform operations directly on the high dimensional data in a computationally efficient manner. This way, the preparatory step of dimension reduction, which induces distortions to the data, can be avoided altogether.
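The flavor of a moving least-squares projection can be conveyed with a toy example: noisy samples of the unit circle (a one-dimensional manifold in $\mathbb{R}^2$) are projected onto a locally fitted weighted quadratic. The circle, noise level, bandwidth, and fitting details are our simplified illustrative choices; the paper's construction is more general and carries the stated approximation-order guarantees:

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma, h = 400, 0.05, 0.3
theta = rng.uniform(0.0, 2.0 * np.pi, n)
P = np.stack([np.cos(theta), np.sin(theta)], axis=1)
P = P + sigma * rng.standard_normal((n, 2))      # noisy samples near the circle

def mls_project(q):
    w = np.exp(-np.sum((P - q) ** 2, axis=1) / h**2)   # locality weights
    mu = (w[:, None] * P).sum(0) / w.sum()             # weighted local origin
    C = (w[:, None, None] * np.einsum('ni,nj->nij', P - mu, P - mu)).sum(0)
    _, E = np.linalg.eigh(C)
    e1, e2 = E[:, 1], E[:, 0]                          # local tangent / normal frame
    u, v = (P - mu) @ e1, (P - mu) @ e2
    A = np.stack([np.ones(n), u, u**2], axis=1)        # weighted quadratic fit v(u)
    c = np.linalg.lstsq(np.sqrt(w)[:, None] * A, np.sqrt(w) * v, rcond=None)[0]
    uq = (q - mu) @ e1
    return mu + uq * e1 + (c[0] + c[1] * uq + c[2] * uq**2) * e2

proj = np.array([mls_project(q) for q in P])
err_noisy = np.mean(np.abs(np.linalg.norm(P, axis=1) - 1.0))
err_proj = np.mean(np.abs(np.linalg.norm(proj, axis=1) - 1.0))
```

The projected points lie markedly closer to the true circle than the noisy inputs, mirroring the denoising behavior of the manifold approximation described above.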
Transfer learning is one of the subjects undergoing intense study in the area of machine learning. In object recognition and object detection, the transferability of parameters has been studied experimentally, but not for neural networks suitable for real-time embedded object detection, such as the SqueezeDet neural network. We use transfer learning to accelerate the training of SqueezeDet on a new group of classes. We also conduct experiments to study the transferability and co-adaptation phenomena introduced by the transfer learning process. To accelerate training, we propose a new implementation of the SqueezeDet training procedure which provides a faster pipeline for data processing and achieves a $1.8$ times speedup compared to the initial implementation. Finally, we create a mechanism for automatic hyperparameter optimization using an empirical method.
Network embedding has attracted considerable research attention recently. However, the existing methods are incapable of handling billion-scale networks, because they are computationally expensive and, at the same time, difficult to accelerate with distributed computing schemes. To address these problems, we propose RandNE, a novel and simple billion-scale network embedding method. Specifically, we propose a Gaussian random projection approach to map the network into a low-dimensional embedding space while preserving the high-order proximities between nodes. To reduce the time complexity, we design an iterative projection procedure that avoids the explicit calculation of the high-order proximities. Theoretical analysis shows that our method is extremely efficient and friendly to distributed computing schemes, incurring no communication cost in the calculation. We demonstrate the efficacy of RandNE over state-of-the-art methods in network reconstruction and link prediction tasks on multiple datasets with different scales, ranging from thousands to billions of nodes and edges.
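The iterative projection trick can be sketched in a few lines: a weighted combination of proximities $(\alpha_0 I + \alpha_1 A + \alpha_2 A^2)R$ is accumulated by repeatedly multiplying the adjacency matrix into the random projection, so no matrix power is ever formed. This is our own simplified reading of the idea, with hypothetical weights and a tiny random graph:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 8
A = (rng.random((n, n)) < 0.05).astype(float)
A = np.maximum(A, A.T)                          # symmetric adjacency matrix
R = rng.standard_normal((n, d)) / np.sqrt(d)    # Gaussian random projection
alpha = [1.0, 0.5, 0.25]                        # proximity weights (hypothetical values)

# iterative projection: never form A @ A explicitly
U, P = alpha[0] * R, R
for a in alpha[1:]:
    P = A @ P                                    # one more order of proximity
    U = U + a * P

# reference: the same embedding computed with explicit matrix powers
U_explicit = (alpha[0] * np.eye(n) + alpha[1] * A + alpha[2] * (A @ A)) @ R
```

Each step costs one sparse-matrix-times-thin-matrix product, which is what keeps the scheme cheap and, since the products can be computed block-wise, embarrassingly parallel across machines.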
In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. We also describe efficient ways of applying these mobile models to object detection in a novel framework we call SSDLite. Additionally, we demonstrate how to build mobile semantic segmentation models through a reduced form of DeepLabv3, which we call Mobile DeepLabv3. The MobileNetV2 architecture is based on an inverted residual structure where the input and output of the residual block are thin bottleneck layers, the opposite of traditional residual models, which use expanded representations at the input; MobileNetV2 uses lightweight depthwise convolutions to filter features in the intermediate expansion layer. Additionally, we find that it is important to remove non-linearities in the narrow layers in order to maintain representational power. We demonstrate that this improves performance and provide an intuition that led to this design. Finally, our approach allows decoupling of the input/output domains from the expressiveness of the transformation, which provides a convenient framework for further analysis. We measure our performance on ImageNet classification, COCO object detection, and VOC image segmentation. We evaluate the trade-offs between accuracy and number of operations measured by multiply-adds (MAdds), as well as the number of parameters.
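As a rough illustration of the inverted residual block described above (expand with a 1x1 convolution, filter with a 3x3 depthwise convolution, project back to a thin linear bottleneck, and connect the bottlenecks residually), here is a minimal stride-1 numpy sketch; the shapes, initialization, and function names are ours and not the authors' implementation:

```python
import numpy as np

def relu6(x):
    return np.minimum(np.maximum(x, 0.0), 6.0)

def inverted_residual(x, W_expand, W_depth, W_project):
    """One stride-1 inverted residual block. x: (H, W, C_in)."""
    H, W, C_in = x.shape
    h = relu6(x @ W_expand)                  # 1x1 expansion to t * C_in channels
    # depthwise 3x3 convolution: one filter per channel, zero padding
    p = np.pad(h, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(h)
    for i in range(3):
        for j in range(3):
            out += p[i:i + H, j:j + W, :] * W_depth[i, j, :]
    h = relu6(out)
    y = h @ W_project                        # linear 1x1 projection: no nonlinearity
    return x + y                             # residual connects the thin bottlenecks

C_in, t = 8, 6                               # channels and expansion factor
rng = np.random.default_rng(0)
x = rng.standard_normal((16, 16, C_in))
y = inverted_residual(
    x,
    0.1 * rng.standard_normal((C_in, t * C_in)),
    0.1 * rng.standard_normal((3, 3, t * C_in)),
    0.1 * rng.standard_normal((t * C_in, C_in)),
)
```

Note the two design points from the abstract: the shortcut joins the thin bottlenecks (not the expanded representation), and the final projection is linear, since a non-linearity in the narrow layer would destroy information.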