The spatial impulse response (SIR) method is a well-known approach to calculating transient acoustic fields of arbitrarily shaped transducers. It involves the evaluation of a time-dependent surface integral. Although analytic expressions of the SIR exist for some geometries, numerical methods based on the discretization of transducer surfaces have become the standard. The proposed method consists of representing the transducer as a non-uniform rational B-spline (NURBS) surface and decomposing it into smooth B\'ezier patches onto which quadrature rules can be deployed. The evaluation of the SIR can then be expressed in B-spline bases, resulting in a sum of shifted-and-weighted basis functions. Field signals are eventually obtained by a convolution of the basis coefficients, derived from the excitation waveform, and the basis SIR. The use of NURBS enables exact representations of common transducer elements. High-order Gaussian quadrature rules enable high accuracy with few quadrature points. High-order B-spline bases are ideally suited to efficiently exploit the bandlimited property of excitation waveforms. Numerical experiments demonstrate that the proposed approach enables sampling the SIR at low sampling rates, as required by the excitation waveform, without introducing additional errors on simulated field signals.
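The claim that high-order Gaussian quadrature reaches high accuracy with few points can be illustrated on a one-dimensional interval. The following is a minimal sketch under simplifying assumptions: it integrates a polynomial on $[-1, 1]$ rather than an SIR integrand on a B\'ezier patch, and the function names `gauss_legendre_3`, `midpoint_3` and `integrand` are hypothetical and not taken from the paper.

```python
import math

def gauss_legendre_3(f):
    """Three-point Gauss-Legendre rule on [-1, 1]; exact for polynomials up to degree 5."""
    node = math.sqrt(3.0 / 5.0)
    return (5.0 * f(-node) + 8.0 * f(0.0) + 5.0 * f(node)) / 9.0

def midpoint_3(f):
    """Composite midpoint rule on [-1, 1] with three equal subintervals."""
    width = 2.0 / 3.0
    return width * sum(f(-1.0 + (i + 0.5) * width) for i in range(3))

def integrand(x):
    # Illustrative integrand; its exact integral over [-1, 1] is 2/5.
    return x ** 4

gauss_value = gauss_legendre_3(integrand)
midpoint_value = midpoint_3(integrand)
print(gauss_value, midpoint_value)
```

Both rules use three function evaluations, but only the Gaussian rule integrates this degree-4 polynomial exactly (up to rounding), which is the kind of behaviour that makes such rules attractive on smooth patches.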
We share our experience with the recently released WILDS benchmark, a collection of ten datasets dedicated to developing models and training strategies that are robust to domain shifts. Several experiments yield a number of critical observations which we believe are of general interest for any future work on WILDS. Our study focuses on two datasets: iWildCam and FMoW. We show that (1) conducting separate cross-validation for each evaluation metric is crucial for both datasets, (2) a weak correlation between validation and test performance might make model development difficult for iWildCam, (3) minor changes in the training hyper-parameters improve the baseline by a relatively large margin (mainly on FMoW), and (4) there is a strong correlation between certain domains and certain target labels (mainly on iWildCam). To the best of our knowledge, no prior work on these datasets has reported these observations despite their obvious importance. Our code is public.
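Observation (1) amounts to selecting a configuration independently for each evaluation metric instead of reusing one selection for all of them. The sketch below shows that selection step only. The metric names, the configuration labels and all scores are made up for illustration and are not results from the study.

```python
def select_per_metric(val_scores):
    """Return, for each metric, the configuration with the best validation score."""
    metrics = next(iter(val_scores.values())).keys()
    return {m: max(val_scores, key=lambda cfg: val_scores[cfg][m]) for m in metrics}

# Made-up validation scores for three hypothetical configurations.
val_scores = {
    "lr=1e-3": {"accuracy": 0.71, "macro_f1": 0.30},
    "lr=3e-4": {"accuracy": 0.69, "macro_f1": 0.36},
    "lr=1e-4": {"accuracy": 0.66, "macro_f1": 0.33},
}

best = select_per_metric(val_scores)
print(best)
```

In this toy table the configuration that maximizes accuracy differs from the one that maximizes macro F1, so reporting both metrics from a single selected configuration would understate one of them.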
Achieving high channel estimation accuracy and reducing hardware cost as well as power dissipation constitute substantial challenges in the design of massive multiple-input multiple-output (MIMO) systems. To resolve these difficulties, sophisticated pilot designs have been conceived for the family of energy-efficient hybrid analog-digital (HAD) beamforming architecture relying on adaptive-resolution analog-to-digital converters (RADCs). In this paper, we jointly optimize the pilot sequences, the number of RADC quantization bits and the hybrid receiver combiner in the uplink of multiuser massive MIMO systems. We solve the associated mean square error (MSE) minimization problem of channel estimation in the context of correlated Rayleigh fading channels subject to practical constraints. The associated mixed-integer problem is quite challenging due to the nonconvex nature of the objective function and of the constraints. By relying on advanced fractional programming (FP) techniques, we first recast the original problem into a more tractable yet equivalent form, which allows the decoupling of the fractional objective function. We then conceive a pair of novel algorithms for solving the resultant problems for codebook-based and codebook-free pilot schemes, respectively. To reduce the design complexity, we also propose a simplified algorithm for the codebook-based pilot scheme. Our simulation results confirm the superiority of the proposed algorithms over the relevant state-of-the-art benchmark schemes.
Graph sampling theory extends the traditional sampling theory to graphs with topological structures. As a key part of graph sampling theory, subset selection chooses nodes on graphs as samples to reconstruct the original signal. Due to the eigen-decomposition operation for Laplacian matrices of graphs, however, existing subset selection methods usually require high-complexity calculations. In this paper, with the aim of enhancing the computational efficiency of subset selection on graphs, we propose a novel objective function based on optimal experimental design. Theoretical analysis shows that this function enjoys an $\alpha$-supermodular property with a provable lower bound on $\alpha$. The objective function, together with an approximation of the low-pass filter on graphs, suggests a fast subset selection method that does not require any eigen-decomposition operation. Experimental results show that the proposed method exhibits high computational efficiency while achieving competitive results compared to state-of-the-art methods, especially when the sampling rate is low.
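One generic way to approximate a low-pass graph filter without any eigen-decomposition is to apply a polynomial in the Laplacian through repeated matrix-vector products. The sketch below uses the filter $(I - L/(2 d_{\max}))^K$, which is an assumption chosen for simplicity and not necessarily the approximation used in the paper. The function names and the path graph are likewise illustrative.

```python
def laplacian_matvec(adj, x):
    """Compute L @ x for L = D - A, with the graph given as adjacency lists."""
    return [len(nbrs) * x[i] - sum(x[j] for j in nbrs) for i, nbrs in enumerate(adj)]

def polynomial_low_pass(adj, x, order):
    """Apply (I - L / (2 * d_max))**order to x using matrix-vector products only."""
    # The eigenvalues of L lie in [0, 2 * d_max], so each factor has eigenvalues in [0, 1].
    scale = 2.0 * max(len(nbrs) for nbrs in adj)
    y = list(x)
    for _ in range(order):
        lx = laplacian_matvec(adj, y)
        y = [yi - li / scale for yi, li in zip(y, lx)]
    return y

# Path graph 0 - 1 - 2 - 3 (an illustrative example, not from the paper).
adj = [[1], [0, 2], [1, 3], [2]]
smooth = polynomial_low_pass(adj, [1.0, 1.0, 1.0, 1.0], order=5)
rough = polynomial_low_pass(adj, [1.0, -1.0, 1.0, -1.0], order=5)
print(smooth, rough)
```

The constant signal passes through unchanged, while the alternating signal is strongly attenuated, and the cost is `order` sparse matrix-vector products rather than a full spectral decomposition.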
Trimming consists of cutting away parts of a geometric domain without reconstructing a global parametrization (meshing). It is a widely used operation in computer-aided design, which generates meshes that are unfitted to the described physical object. This paper develops an adaptive mesh refinement strategy on trimmed geometries in the context of hierarchical B-spline based isogeometric analysis. A residual a posteriori estimator of the energy norm of the numerical approximation error is derived in the context of the Poisson equation. The reliability of the estimator is proven, and the effectivity index is shown to be independent of the number of hierarchical levels and of the way the trimmed boundaries cut the underlying mesh. In particular, it is thus independent of the size of the active part of the trimmed mesh elements. Numerical experiments are performed to validate the presented theory.
This paper presents a novel Res2Net-based fusion framework for infrared and visible images. The proposed fusion model has three parts: an encoder, a fusion layer and a decoder. The Res2Net-based encoder is used to extract multi-scale features of the source images. We introduce a new training strategy that trains the Res2Net-based encoder using only a single image. Then, a new fusion strategy is developed based on the attention model. Finally, the fused image is reconstructed by the decoder. The proposed approach is also analyzed in detail. Experiments show that our method achieves state-of-the-art fusion performance in both objective and subjective assessment compared with existing methods.
Modern-day problems in statistics often face the challenge of exploring and analyzing complex non-Euclidean object data that do not conform to vector space structures or operations. Examples of such data objects include covariance matrices, graph Laplacians of networks, and univariate probability distribution functions. In the current contribution, a new concurrent regression model is proposed to characterize the time-varying relation between an object in a general metric space (as a response) and a vector in $\mathbb{R}^p$ (as a predictor), where concepts from Fr\'echet regression are employed. Concurrent regression has been a well-developed area of research for Euclidean predictors and responses, with many important applications for longitudinal studies and functional data. However, there is no such model available so far for general object data as responses. We develop generalized versions of both global least squares regression and locally weighted least squares smoothing in the context of concurrent regression for responses that are situated in general metric spaces and propose estimators that can accommodate sparse and/or irregular designs. Consistency results are demonstrated for sample estimates of appropriate population targets along with the corresponding rates of convergence. The proposed models are illustrated with human mortality data and resting state functional Magnetic Resonance Imaging (fMRI) data as responses.
For supervised classification problems, this paper considers estimating the query's label probability through local regression using observed covariates. The well-known nonparametric kernel smoother and $k$-nearest neighbor ($k$-NN) estimator, which take the label average over a ball around the query, are consistent but asymptotically biased, particularly for a large radius of the ball. To remove this bias, local polynomial regression (LPoR) and multiscale $k$-NN (MS-$k$-NN) learn the bias term by local regression around the query and extrapolate it to the query itself. However, their theoretical optimality has been shown only in the limit of an infinite number of training samples. To correct the asymptotic bias with fewer observations, this paper proposes local radial regression (LRR) and its logistic regression variant, local radial logistic regression (LRLR), which combine the advantages of LPoR and MS-$k$-NN. The idea is simple: we fit the local regression to observed labels by taking the radial distance as the explanatory variable and then extrapolate the estimated label probability to zero distance. Our numerical experiments, including real-world datasets of daily stock indices, demonstrate that LRLR outperforms LPoR and MS-$k$-NN.
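The idea of regressing labels on radial distance and extrapolating to zero distance can be sketched in a few lines. This is a minimal illustration and not the authors' estimator: the restriction to the $k$ nearest neighbors, the linear (rather than polynomial or logistic) fit, the clipping to $[0, 1]$ and the function name `local_radial_regression` are all assumptions made here.

```python
import math

def local_radial_regression(train_x, train_y, query, k):
    """Estimate the label probability at `query` by extrapolating to zero distance."""
    # Radial distance from the query to every training point, nearest first.
    pairs = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = pairs[:k]
    r = [d for d, _ in nearest]
    y = [label for _, label in nearest]
    r_mean = sum(r) / len(r)
    y_mean = sum(y) / len(y)
    var_r = sum((ri - r_mean) ** 2 for ri in r)
    if var_r == 0.0:
        # All neighbors are equidistant: fall back to the plain local average.
        return y_mean
    # Ordinary least squares of label on distance; the intercept is the value at r = 0.
    slope = sum((ri - r_mean) * (yi - y_mean) for ri, yi in zip(r, y)) / var_r
    intercept = y_mean - slope * r_mean
    return min(1.0, max(0.0, intercept))

# Toy data in which the (soft) label decays linearly with distance from the query.
train_x = [(1.0,), (2.0,), (3.0,), (4.0,)]
train_y = [0.8, 0.6, 0.4, 0.2]
estimate = local_radial_regression(train_x, train_y, query=(0.0,), k=4)
print(estimate)
```

On this toy data a plain average over the four neighbors gives 0.5, whereas the extrapolated intercept recovers a value close to 1.0, which illustrates how the bias of ball averaging can be corrected.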
This paper studies the single image super-resolution problem using adder neural networks (AdderNet). Compared with convolutional neural networks, AdderNet utilizes additions to calculate the output features, thus avoiding the massive energy consumption of conventional multiplications. However, it is very hard to directly transfer the existing success of AdderNet on large-scale image classification to the image super-resolution task due to the different calculation paradigm. Specifically, the adder operation cannot easily learn the identity mapping, which is essential for image processing tasks. In addition, the functionality of high-pass filters cannot be ensured by AdderNet. To this end, we thoroughly analyze the relationship between an adder operation and the identity mapping and insert shortcuts to enhance the performance of SR models using adder networks. Then, we develop a learnable power activation for adjusting the feature distribution and refining details. Experiments conducted on several benchmark models and datasets demonstrate that our image super-resolution models using AdderNet can achieve performance and visual quality comparable to those of their CNN baselines with about a 2$\times$ reduction in energy consumption.
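A small one-dimensional example gives some intuition for why the identity mapping is hard for an adder operation. The sketch assumes the negative $\ell_1$-distance formulation of the adder filter from the original AdderNet work, which is not stated in this abstract. The functions `conv1d` and `adder1d` and the test signal are illustrative.

```python
def conv1d(signal, kernel):
    """Valid cross-correlation: multiply-accumulate over each sliding window."""
    k = len(kernel)
    return [sum(s * w for s, w in zip(signal[i:i + k], kernel))
            for i in range(len(signal) - k + 1)]

def adder1d(signal, kernel):
    """Adder filter (assumed form): negative L1 distance between window and kernel."""
    k = len(kernel)
    return [-sum(abs(s - w) for s, w in zip(signal[i:i + k], kernel))
            for i in range(len(signal) - k + 1)]

signal = [1.0, 3.0, 2.0, 5.0, 4.0]
identity_kernel = [0.0, 1.0, 0.0]

conv_out = conv1d(signal, identity_kernel)    # reproduces the interior samples
adder_out = adder1d(signal, identity_kernel)  # every entry is non-positive
print(conv_out, adder_out)
```

A convolution with the kernel `[0, 1, 0]` copies its input, whereas a single adder filter of this form can only output non-positive values, whatever its weights. This is consistent with the use of shortcuts to carry the identity.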
Swapping text in scene images while preserving original fonts, colors, sizes and background textures is a challenging task due to the complex interplay between different factors. In this work, we present SwapText, a three-stage framework to transfer texts across scene images. First, a novel text swapping network is proposed to replace text labels only in the foreground image. Second, a background completion network is learned to reconstruct background images. Finally, the generated foreground image and background image are used to generate the word image by the fusion network. Using the proposed framework, we can manipulate the texts of the input images even with severe geometric distortion. Qualitative and quantitative results are presented on several scene text datasets, including regular and irregular text datasets. We conducted extensive experiments to demonstrate the usefulness of our method in applications such as image-based text translation and text image synthesis.
We introduce a new multi-dimensional nonlinear embedding -- Piecewise Flat Embedding (PFE) -- for image segmentation. Based on the theory of sparse signal recovery, piecewise flat embedding with diverse channels attempts to recover a piecewise constant image representation with sparse region boundaries and sparse cluster value scattering. The resultant piecewise flat embedding exhibits interesting properties such as suppressing slowly varying signals, and offers an image representation with higher region identifiability which is desirable for image segmentation or high-level semantic analysis tasks. We formulate our embedding as a variant of the Laplacian Eigenmap embedding with an $L_{1,p} (0<p\leq1)$ regularization term to promote sparse solutions. First, we devise a two-stage numerical algorithm based on Bregman iterations to compute $L_{1,1}$-regularized piecewise flat embeddings. We further generalize this algorithm through iterative reweighting to solve the general $L_{1,p}$-regularized problem. To demonstrate its efficacy, we integrate PFE into two existing image segmentation frameworks, segmentation based on clustering and hierarchical segmentation based on contour detection. Experiments on four major benchmark datasets, BSDS500, MSRC, Stanford Background Dataset, and PASCAL Context, show that segmentation algorithms incorporating our embedding achieve significantly improved results.