Full waveform inversion (FWI) is an iterative identification process that minimizes the misfit between model-based simulated and experimentally measured wave field data, with the goal of identifying a field of parameters for a given physical object. The inverse optimization process of FWI is based on forward and backward solutions of the (elastic or acoustic) wave equation. In a previous paper [1], we explored the use of the finite cell method (FCM) as the wave field solver to incorporate highly complex geometric models. Furthermore, we demonstrated that the identification of the model's density outperforms that of the velocity -- particularly in cases where unknown voids characterized by homogeneous Neumann boundary conditions need to be detected. The paper at hand extends this previous study: the isogeometric finite cell analysis (IGA-FCM) -- a combination of isogeometric analysis (IGA) and FCM -- is applied as the wave field solver, with the advantage that the polynomial degree and, consequently, the sampling frequency of the wave field can be increased quite easily. Since the inversion efficiency strongly depends on the accuracy of the forward and backward wave field solutions and of the gradient of the functional, consistent and lumped mass matrix discretizations are compared. The resolution of the grid describing the unknown material density is then decoupled from the knot span grid. Finally, we propose an adaptive multi-resolution algorithm that refines the material grid only locally using an image processing-based refinement indicator. The developed inversion framework allows fast and memory-efficient wave simulation and object identification. While we study the general behavior of the proposed approach on 2D benchmark problems, a final 3D problem shows that it can also be used to identify voids in geometrically complex spatial structures.
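A minimal sketch of the inversion loop described above, written in Python with NumPy; the callables `forward_solve` and `adjoint_solve` are hypothetical placeholders standing in for the IGA-FCM wave field solver and the adjoint (backward) solution, and the plain gradient-descent update is an illustrative assumption rather than the optimizer used in the paper:

```python
import numpy as np

def fwi_identify(density0, measured, forward_solve, adjoint_solve,
                 n_iter=50, step=1e-2):
    """Generic full-waveform-inversion loop (illustrative sketch only).

    density0      : initial guess for the material-density grid
    measured      : experimentally measured wave field data at the receivers
    forward_solve : density -> simulated receiver data (wave equation solver)
    adjoint_solve : (density, residual) -> gradient of the misfit functional
    """
    density = density0.copy()
    for it in range(n_iter):
        simulated = forward_solve(density)       # forward wave field solution
        residual = simulated - measured          # data misfit
        misfit = 0.5 * np.sum(residual ** 2)     # least-squares functional
        grad = adjoint_solve(density, residual)  # backward (adjoint) solution
        density -= step * grad                   # gradient-descent update
        print(f"iter {it:3d}  misfit {misfit:.3e}")
    return density
```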
It has been observed that the performance of many high-dimensional estimation problems is universal with respect to the underlying sensing (or design) matrices. Specifically, matrices with markedly different constructions seem to achieve identical performance if they share the same spectral distribution and have ``generic'' singular vectors. We prove this universality phenomenon for the case of convex regularized least squares (RLS) estimators under a linear regression model with additive Gaussian noise. Our main contributions are two-fold: (1) we introduce a notion of universality classes for sensing matrices, defined through a set of deterministic conditions that fix the spectrum of the sensing matrix and precisely capture the previously heuristic notion of generic singular vectors; (2) we show that for all sensing matrices that lie in the same universality class, the dynamics of the proximal gradient descent algorithm for solving the regression problem, as well as the performance of the RLS estimators themselves (under additional strong convexity conditions), are asymptotically identical. In addition to including i.i.d. Gaussian and rotationally invariant matrices as special cases, our universality class also contains highly structured, strongly correlated, or even (nearly) deterministic matrices. Examples of the latter include randomly signed versions of incoherent tight frames and randomly subsampled Hadamard transforms. As a consequence of this universality principle, the asymptotic performance of regularized linear regression on many structured matrices constructed with limited randomness can be characterized by using the rotationally invariant ensemble as an equivalent yet mathematically more tractable surrogate.
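As a hedged illustration of the proximal gradient dynamics referred to above, the following NumPy snippet runs plain ISTA for the LASSO (soft-thresholding proximal operator) on an i.i.d. Gaussian sensing matrix; this is a textbook example to fix ideas, not the analysis machinery of the paper, and the problem sizes and penalty level are arbitrary:

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def proximal_gradient_lasso(A, y, lam, n_iter=500):
    """ISTA for min_x 0.5*||y - A x||^2 + lam*||x||_1 (illustrative sketch)."""
    n, p = A.shape
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(p)
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)           # gradient of the smooth part
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Example: i.i.d. Gaussian sensing matrix with additive Gaussian noise
rng = np.random.default_rng(0)
n, p, k = 200, 400, 20
A = rng.standard_normal((n, p)) / np.sqrt(n)
x_true = np.zeros(p); x_true[:k] = rng.standard_normal(k)
y = A @ x_true + 0.05 * rng.standard_normal(n)
x_hat = proximal_gradient_lasso(A, y, lam=0.02)
```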
This paper proposes a novel communication-efficient split learning (SL) framework, named SplitFC, which reduces the communication overhead required for transmitting intermediate feature and gradient vectors during the SL training process. The key idea of SplitFC is to leverage the different dispersion degrees exhibited in the columns of the intermediate feature and gradient matrices. SplitFC incorporates two compression strategies: (i) adaptive feature-wise dropout and (ii) adaptive feature-wise quantization. In the first strategy, the intermediate feature vectors are dropped with adaptive dropout probabilities determined based on the standard deviation of these vectors. Then, by the chain rule, the intermediate gradient vectors associated with the dropped feature vectors are also dropped. In the second strategy, the non-dropped intermediate feature and gradient vectors are quantized using adaptive quantization levels determined based on the ranges of the vectors. To minimize the quantization error, the optimal quantization levels of this strategy are derived in closed form. Simulation results on the MNIST, CIFAR-10, and CelebA datasets demonstrate that SplitFC provides more than a 5.6% gain in classification accuracy compared to state-of-the-art SL frameworks, while requiring 320 times less communication overhead compared to the vanilla SL framework without compression.
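A hedged NumPy sketch of the two compression ideas just described; the specific rule that scales dropout probabilities with per-column standard deviation under a keep budget, and the simple uniform quantizer, are illustrative assumptions and not the exact SplitFC design or its closed-form optimal quantization levels:

```python
import numpy as np

def adaptive_feature_dropout(F, keep_budget=0.5, rng=None):
    """Drop feature columns with probabilities adapted to their dispersion.

    Columns with larger standard deviation are kept with higher probability;
    probabilities are scaled so that roughly `keep_budget` of the columns survive.
    """
    rng = rng or np.random.default_rng()
    std = F.std(axis=0) + 1e-12
    keep_prob = np.clip(keep_budget * F.shape[1] * std / std.sum(), 0.0, 1.0)
    mask = rng.random(F.shape[1]) < keep_prob
    return F[:, mask], mask  # by the chain rule, reuse the mask for the gradients

def adaptive_uniform_quantize(x, n_levels=16):
    """Uniform quantization whose range adapts to the (non-dropped) vector."""
    lo, hi = float(x.min()), float(x.max())
    if hi == lo:
        return np.full_like(x, lo)
    step = (hi - lo) / (n_levels - 1)
    return lo + np.round((x - lo) / step) * step
```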
The traditional method of computing the singular value decomposition (SVD) of a data matrix is based on a least squares principle and is therefore very sensitive to the presence of outliers. Consequently, inferences based on the classical SVD can be severely degraded by data contamination across a range of applications (e.g., video surveillance background modelling tasks). A robust singular value decomposition method using the minimum density power divergence estimator (rSVDdpd) has been found to provide a satisfactory solution to this problem and works well in applications. For example, it provides a neat solution to the background modelling problem of video surveillance data in the presence of camera tampering. In this paper, we investigate the theoretical properties of the rSVDdpd estimator, such as convergence, equivariance and consistency, under reasonable assumptions. Since the dimension of the parameters, i.e., the number of singular values and the dimension of the singular vectors, can grow linearly with the size of the data, the usual M-estimation theory has to be suitably modified with concentration bounds to establish the asymptotic properties, which we accomplish in the present work. We also demonstrate the efficiency of rSVDdpd through extensive simulations.
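A small, self-contained NumPy illustration of the sensitivity claim above; it only demonstrates how a single gross outlier can rotate the leading singular vector obtained from the classical least-squares-based SVD, and is not an implementation of rSVDdpd:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 100, 50
u, v = rng.standard_normal(m), rng.standard_normal(n)
X = np.outer(u, v) + 0.01 * rng.standard_normal((m, n))   # near rank-1 data

U, s, Vt = np.linalg.svd(X, full_matrices=False)

X_corrupt = X.copy()
X_corrupt[0, 0] += 100.0                                   # single gross outlier
Uc, sc, Vct = np.linalg.svd(X_corrupt, full_matrices=False)

# cosine similarity between leading left singular vectors (1.0 = unchanged)
print(abs(U[:, 0] @ Uc[:, 0]))
```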
Imaging problems such as those arising in nanoCT require the solution of an inverse problem, where it is often taken for granted that the forward operator, i.e., the underlying physical model, is properly known. In the present work we address the setting where the forward model is inexact due to stochastic or deterministic deviations during the measurement process. We investigate, in particular, the performance of non-learned iterative reconstruction methods that account for this inexactness, as well as learned reconstruction schemes based on U-Nets and conditional invertible neural networks. The latter also provide the opportunity for uncertainty quantification. A large synthetic data set in line with a typical nanoCT setting is provided, and extensive numerical experiments are conducted to evaluate the proposed methods.
In this study we consider domains that are composed of an infinite sequence of self-similar rings, together with corresponding finite element spaces over those domains. The rings are parameterized using piecewise polynomial or tensor-product B-spline mappings of degree $q$ over quadrilateral meshes. We then consider finite element discretizations which, over each ring, are mapped piecewise polynomial functions of degree $p$. Such domains composed of self-similar rings may be created through a subdivision scheme or from a scaled boundary parameterization. We study approximation properties over such recursively parameterized domains. The main finding is that, for generic isoparametric discretizations (i.e., where $p=q$), the approximation properties depend only on the degree of polynomials that can be reproduced exactly in the physical domain, not on the degree $p$ of the mapped elements. In particular, $L^\infty$-errors in general converge at most with the rate $h^2$, where $h$ is the mesh size, independent of the degree $p=q$. This has implications for subdivision-based isogeometric analysis, which we discuss in this paper.
We propose a novel approach to the linear viscoelastic problem of shear-deformable geometrically exact beams. The generalized Maxwell model for one-dimensional solids is here efficiently extended to the case of arbitrarily curved beams undergoing finite displacements and rotations. High efficiency is achieved by combining a series of distinguishing features: i) the formulation is displacement-based, so no additional unknowns, other than incremental displacements and rotations, are needed for the internal variables associated with the rate-dependent material; ii) the governing equations are discretized in space using the isogeometric collocation method, meaning that element integration is bypassed entirely; iii) finite rotations are updated using the incremental rotation vector, leading to two main benefits: a minimum number of rotation unknowns (the three components of the incremental rotation vector) and no singularity problems; iv) the same $\rm SO(3)$-consistent linearization of the governing equations and update procedures as for non-rate-dependent linear elastic materials can be used; v) a standard second-order accurate time integration scheme is made consistent with the underlying geometric structure of the kinematic problem. Moreover, taking full advantage of the isogeometric analysis features, the formulation permits accurately representing beams and beam structures with highly complex initial shape and topology, paving the way for a large number of potential applications in the field of architectured materials, meta-materials, morphing/programmable objects, topology optimization, etc. Numerical applications are finally presented to demonstrate the attributes and potential of the proposed formulation.
The landscape of applications and subroutines relying on shortest path computations continues to grow steadily. This growth is driven by the undeniable success of shortest path algorithms in theory and practice. It also introduces new challenges, as the models become more complicated and assessing the optimality of paths becomes harder. Hence, multiple recent publications in the field adapt existing labeling methods in an ad-hoc fashion to their specific problem variant without considering the underlying general structure: they always deal with multi-criteria scenarios, and those criteria define different partial orders on the paths. In this paper, we introduce the partial order shortest path problem (POSP), a generalization of the multi-objective shortest path problem (MOSP) and in turn also of the classical shortest path problem. POSP captures the particular structure of many shortest path applications as special cases. In this generality, we study optimality conditions, or the lack thereof, depending on the properties of the objective functions. Our final contribution is a comprehensive lookup table summarizing our findings and providing the reader with an easy way to choose among the most recent multi-criteria shortest path algorithms depending on their problem's weight structure. Examples range from time-dependent shortest path and bottleneck path problems to the fuzzy shortest path problem and complex financial weight functions studied in the public transportation community. Our results hold for general digraphs and therefore surpass previous generalizations that were limited to acyclic graphs.
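To make the role of a generic partial order concrete, the following Python sketch implements a simple multi-criteria label-correcting search in which dominance is delegated to a user-supplied relation; it is an illustrative routine under the assumption that only finitely many non-dominated labels exist per node (e.g., nonnegative cost vectors on a digraph without zero-cost cycles), not the algorithmic framework of the paper:

```python
from collections import deque

def posp_labels(graph, source, zero, combine, dominates):
    """Generic multi-criteria label-correcting search (illustrative sketch).

    graph     : dict mapping node -> list of (neighbor, arc_weight) pairs
    source    : start node
    zero      : neutral label at the source (e.g. a tuple of zeros)
    combine   : (label, arc_weight) -> extended label along an arc
    dominates : (a, b) -> True if label a dominates label b in the partial order
    Returns a dict mapping each reached node to its non-dominated labels.
    """
    labels = {source: [zero]}
    queue = deque([(source, zero)])
    while queue:
        node, lab = queue.popleft()
        if lab not in labels.get(node, []):
            continue  # this label was pruned after being queued
        for nxt, w in graph.get(node, []):
            new = combine(lab, w)
            pool = labels.setdefault(nxt, [])
            if any(dominates(old, new) or old == new for old in pool):
                continue  # new label is dominated (or a duplicate): discard it
            pool[:] = [old for old in pool if not dominates(new, old)]
            pool.append(new)
            queue.append((nxt, new))
    return labels

# Example: bi-criteria Pareto dominance on (time, cost) labels
pareto = lambda a, b: all(x <= y for x, y in zip(a, b)) and a != b
combine = lambda lab, w: tuple(l + x for l, x in zip(lab, w))
graph = {"s": [("a", (1, 4)), ("b", (2, 1))],
         "a": [("t", (1, 1))], "b": [("t", (1, 5))], "t": []}
print(posp_labels(graph, "s", (0, 0), combine, pareto)["t"])
```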
In this paper, we study a priori error estimates for the finite element approximation of the nonlinear Schr\"{o}dinger-Poisson model. The electron density is defined by an infinite series over all eigenvalues of the Hamiltonian operator. To establish the error estimate, we present a unified theory of error estimates for a class of nonlinear problems. The theory is based on three conditions: 1) the original problem has a solution $u$ which is the fixed point of a compact operator $\mathcal{A}$, 2) $\mathcal{A}$ is Fr\'{e}chet-differentiable at $u$ and $I-\mathcal{A}'[u]$ has a bounded inverse in a neighborhood of $u$, and 3) there exists an operator $\mathcal{A}_h$ which converges to $\mathcal{A}$ in the neighborhood of $u$. The theory states that $\mathcal{A}_h$ has a fixed point $u_h$ which solves the approximate problem. It also gives the error estimate between $u$ and $u_h$, without assumptions on the well-posedness of the approximate problem. We apply the unified theory to the finite element approximation of the Schr\"{o}dinger-Poisson model and obtain an optimal error estimate between the numerical solution and the exact solution. Numerical experiments are presented to verify the convergence rates of the numerical solutions.
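For orientation, estimates obtained from fixed-point frameworks of this kind (in the spirit of Brezzi--Rappaz--Raviart theory) typically take the schematic form
\[
  \|u - u_h\| \le C\,\|(\mathcal{A} - \mathcal{A}_h)\,u\|,
\]
where $C$ depends on the norm of $(I - \mathcal{A}'[u])^{-1}$ but not on $h$, so that the error of the discrete fixed point is controlled by the consistency error of $\mathcal{A}_h$ at the exact solution $u$; the precise norms and constants used in the paper may differ, so this display should be read as a schematic statement only.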
The central problem we address in this work is estimation of the parameter support set $S$, the set of indices corresponding to nonzero parameters, in the context of a sparse parametric likelihood model for count-valued multivariate time series. We develop a computationally intensive algorithm that performs the estimation by aggregating support sets obtained by applying the LASSO to data subsamples. Our approach is to identify several well-fitting candidate models and estimate $S$ by the most frequently-used parameters, thus \textit{aggregating} candidate models rather than selecting a single candidate deemed optimal in some sense. While our method is broadly applicable to any selection problem, we focus on the generalized vector autoregressive model class, and in particular the Poisson case, due to (i) the difficulty of the support estimation problem arising from complex dependence in the data, (ii) recent work applying the LASSO in this context, and (iii) interesting applications in network recovery from discrete multivariate time series. We establish benchmark methods based on the LASSO and present empirical results demonstrating the superior performance of our method. Additionally, we present an application estimating ecological interaction networks from paleoclimatology data.
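A minimal sketch of the subsample-and-aggregate idea, using the plain linear LASSO from scikit-learn as a stand-in for the Poisson generalized vector autoregressive fits considered in the paper; the subsample fraction, penalty level, and frequency threshold are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso

def aggregate_support(X, y, alpha=0.1, n_subsamples=100, frac=0.5,
                      freq_threshold=0.6, rng=None):
    """Estimate the support set by aggregating LASSO fits on data subsamples."""
    rng = rng or np.random.default_rng()
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        model = Lasso(alpha=alpha).fit(X[idx], y[idx])
        counts += (np.abs(model.coef_) > 1e-8)   # which parameters were selected
    freq = counts / n_subsamples
    # keep parameters selected in a large fraction of the candidate models
    return np.flatnonzero(freq >= freq_threshold), freq
```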
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
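As a direct illustration of the formula above, the following NumPy snippet computes the effective number of samples per class and the resulting re-weighting factors; normalizing the weights to sum to the number of classes is a common convention assumed here, not something specified in the abstract:

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    """Weights proportional to the inverse effective number of samples.

    Effective number per class: E_n = (1 - beta**n) / (1 - beta).
    """
    n = np.asarray(samples_per_class, dtype=float)
    effective_num = (1.0 - np.power(beta, n)) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(n) / weights.sum()   # normalize to sum to #classes

# Example: a long-tailed class distribution
print(class_balanced_weights([5000, 500, 50, 5], beta=0.999))
```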