Independent Component Analysis (ICA) aims to recover mutually independent sources from their linear mixtures, and FastICA is one of the most successful ICA algorithms. Although it seems reasonable to improve the performance of FastICA by introducing more nonlinear functions into the negentropy estimation, the original fixed-point method (approximate Newton method) in FastICA degenerates under this circumstance. To alleviate this problem, we propose a novel method based on a second-order approximation of minimum discrimination information (MDI). The joint maximization in our method consists of minimizing a single weighted least-squares problem and seeking the unmixing matrix by the fixed-point method. Experimental results validate its efficiency compared with other popular ICA algorithms.
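For orientation, the classical one-unit FastICA iteration that this abstract builds on is compact enough to sketch in full. The following minimal numpy illustration uses the standard tanh nonlinearity on whitened mixtures; it shows the baseline fixed-point update, not the proposed MDI-based method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent non-Gaussian sources and a random linear mixture
n = 10_000
S = np.vstack([rng.laplace(size=n), rng.uniform(-1, 1, size=n)])
X = rng.normal(size=(2, 2)) @ S

# Whiten the mixtures (zero mean, identity covariance)
X -= X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(X))
Z = E @ np.diag(d ** -0.5) @ E.T @ X

# One-unit fixed-point (approximate Newton) iteration, g(u) = tanh(u):
#   w <- E[z g(w'z)] - E[g'(w'z)] w,  then renormalize
w = rng.normal(size=2)
w /= np.linalg.norm(w)
for _ in range(100):
    u = np.tanh(w @ Z)
    w_new = (Z * u).mean(axis=1) - (1 - u ** 2).mean() * w
    w_new /= np.linalg.norm(w_new)
    converged = abs(abs(w_new @ w) - 1) < 1e-8
    w = w_new
    if converged:
        break

# The recovered component correlates (up to sign) with one of the sources
print(np.corrcoef(w @ Z, S)[0, 1:])
```

Replacing the single tanh nonlinearity by several nonlinearities at once is exactly the point at which, as the abstract notes, this fixed-point scheme degenerates.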
Low-rank approximation of images via singular value decomposition is well-received in the era of big data. However, singular value decomposition (SVD) is only for order-two data, i.e., matrices. To tackle higher order data such as multispectral images and videos with the SVD, it is necessary to flatten a higher order input into a matrix or break it into a series of order-two slices. Higher order singular value decomposition (HOSVD) extends the SVD and can approximate higher order data using sums of a few rank-one components. We consider the problem of generalizing HOSVD over a finite dimensional commutative algebra. This algebra, referred to as a t-algebra, generalizes the field of complex numbers. The elements of the algebra, called t-scalars, are fixed-size arrays of complex numbers. One can generalize matrices and tensors over t-scalars and then extend many canonical matrix and tensor algorithms, including HOSVD, to obtain higher-performance versions. The generalization of HOSVD is called THOSVD. Its performance in approximating multi-way data can be further improved by an alternating algorithm. THOSVD also unifies a wide range of principal component analysis algorithms. To exploit the potential of generalized algorithms using t-scalars for approximating images, we use a pixel neighborhood strategy to convert each pixel into a "deeper-order" t-scalar. Experiments on publicly available images show that the generalized algorithm over t-scalars, namely THOSVD, compares favorably with its canonical counterparts.
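The canonical HOSVD that THOSVD generalizes is itself brief: one SVD per mode-n unfolding, followed by contraction to a core tensor. A minimal numpy sketch (over ordinary scalars, not the paper's t-scalars; sizes and ranks are illustrative):

```python
import numpy as np

def mode_mult(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)

def hosvd(T, ranks):
    """Truncated HOSVD: one SVD per mode-n unfolding, then the core tensor."""
    U = []
    for mode, r in enumerate(ranks):
        unf = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
        u, _, _ = np.linalg.svd(unf, full_matrices=False)
        U.append(u[:, :r])
    G = T
    for mode, u in enumerate(U):
        G = mode_mult(G, u.conj().T, mode)
    return G, U

rng = np.random.default_rng(0)
# Synthetic 3-way data with approximate multilinear rank (5, 5, 5)
T = rng.normal(size=(5, 5, 5))
for mode, s in enumerate((40, 50, 60)):
    T = mode_mult(T, rng.normal(size=(s, 5)), mode)
T += 0.01 * rng.normal(size=T.shape)

G, U = hosvd(T, ranks=(5, 5, 5))
T_hat = G
for mode, u in enumerate(U):
    T_hat = mode_mult(T_hat, u, mode)
print("relative error:", np.linalg.norm(T_hat - T) / np.linalg.norm(T))
```

In the paper's setting, the scalar arithmetic here would be replaced by t-scalar arithmetic, with each "deeper-order" entry being a fixed-size array rather than a single number.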
We propose a new wavelet-based method for density estimation when the data are size-biased. More specifically, we consider a power of the density of interest, where this power exceeds 1/2. Warped wavelet bases are employed, in which warping is achieved by a continuous cumulative distribution function (c.d.f.). This provides a general framework in which conventional orthonormal wavelet estimation corresponds to the case where the warping distribution is the standard uniform c.d.f. We show that both linear and nonlinear wavelet estimators are consistent, with optimal and/or near-optimal rates. Monte Carlo simulations are performed to compare four special settings which are easy to interpret in practice. An application with a real dataset on fatal traffic accidents involving alcohol illustrates the method. We observe that warped bases provide more flexible and superior estimates for both simulated and real data. Moreover, we find that estimating a power of the density (for instance, its square root) further improves the results.
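To make the warping idea concrete, here is a minimal numpy/scipy sketch of a linear warped estimator with the Haar scaling basis, for which the estimator reduces to a histogram in the warped coordinate. The gamma data, the exponential warping c.d.f., and the level J are illustrative choices; the size-biasing and power-of-density features of the paper are omitted.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.gamma(shape=3.0, scale=1.0, size=2000)   # sample from unknown density

# Warping c.d.f. G: an exponential c.d.f. as a rough pilot guess.
# Taking G uniform on the data range recovers conventional wavelet estimation.
warp = stats.expon(scale=3.0)

# Linear Haar estimator at level J for the density of U = G(X) on [0, 1]
J = 4
u = warp.cdf(x)
k = np.minimum((u * 2 ** J).astype(int), 2 ** J - 1)
coef = np.bincount(k, minlength=2 ** J) / len(x) * 2 ** (J / 2)

def density(t):
    """Estimated density of X: f_U(G(t)) * g(t), by the change of variables."""
    kk = np.minimum((warp.cdf(t) * 2 ** J).astype(int), 2 ** J - 1)
    return coef[kk] * 2 ** (J / 2) * warp.pdf(t)

t = np.linspace(0.1, 12.0, 200)
print("max abs error vs true gamma pdf:",
      np.abs(density(t) - stats.gamma(3.0).pdf(t)).max())
```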
This paper studies Quasi Maximum Likelihood estimation of Dynamic Factor Models for large panels of time series. Specifically, we consider the case in which the autocorrelation of the factors is explicitly accounted for, and therefore the model has a state-space form. Estimation of the factors and their loadings is implemented through the Expectation Maximization (EM) algorithm, jointly with the Kalman smoother. We prove that as both the dimension of the panel $n$ and the sample size $T$ diverge to infinity, up to logarithmic terms: (i) the estimated loadings are $\sqrt T$-consistent and asymptotically normal if $\sqrt T/n\to 0$; (ii) the estimated factors are $\sqrt n$-consistent and asymptotically normal if $\sqrt n/T\to 0$; (iii) the estimated common component is $\min(\sqrt n,\sqrt T)$-consistent and asymptotically normal regardless of the relative rate of divergence of $n$ and $T$. Although the model is estimated as if the idiosyncratic terms were cross-sectionally and serially uncorrelated and normally distributed, we show that these mis-specifications do not affect consistency. Moreover, the estimated loadings are asymptotically as efficient as those obtained with the Principal Components estimator, while the estimated factors are more efficient if the idiosyncratic covariance is sparse enough. We then propose robust estimators of the asymptotic covariances, which can be used to conduct inference on the loadings and to compute confidence intervals for the factors and common components. Finally, we study the performance of our estimators and compare them with the traditional Principal Components approach through Monte Carlo simulations and analysis of US macroeconomic data.
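A small simulation illustrates the estimation pipeline. The sketch below assumes statsmodels (whose DynamicFactorMQ class implements EM estimation of a dynamic factor model in state-space form, with Kalman smoothing of the factors); it is an illustration of the general approach, not the authors' implementation.

```python
import numpy as np
from statsmodels.tsa.statespace.dynamic_factor_mq import DynamicFactorMQ

rng = np.random.default_rng(0)
T, n = 300, 20

# Simulate a one-factor panel with an AR(1) factor
f = np.zeros(T)
for t in range(1, T):
    f[t] = 0.7 * f[t - 1] + rng.normal()
lam = rng.normal(size=n)                      # loadings
X = np.outer(f, lam) + rng.normal(size=(T, n))

# EM estimation of the state-space form; the factor path is then read off
# from the Kalman smoother
mod = DynamicFactorMQ(X, factors=1, factor_orders=1, idiosyncratic_ar1=False)
res = mod.fit(disp=False)

f_hat = res.smoothed_state[0]                 # smoothed factor
print("|corr(f, f_hat)| =", abs(np.corrcoef(f, f_hat)[0, 1]))
```

The factor is identified only up to sign and scale, hence the absolute correlation as a summary of estimation quality.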
We establish a new perturbation theory for orthogonal polynomials using a Riemann--Hilbert approach and consider applications in numerical linear algebra and random matrix theory. This new approach shows that the orthogonal polynomials with respect to two measures can be effectively compared using the difference of their Stieltjes transforms on a suitably chosen contour. Moreover, when two measures are close and satisfy some regularity conditions, we use the theta functions of a hyperelliptic Riemann surface to derive explicit and accurate expansion formulae for the perturbed orthogonal polynomials. In contrast to other approaches, a key strength of the methodology is that estimates can remain valid as the degree of the polynomial grows. The results are applied to analyze several numerical algorithms from linear algebra, including the Lanczos tridiagonalization procedure, the Cholesky factorization and the conjugate gradient algorithm. As a case study, we investigate these algorithms applied to a general spiked sample covariance matrix model by considering the eigenvector empirical spectral distribution and its limits. For the first time, we give precise estimates on the output of the algorithms, applied to this wide class of random matrices, as the number of iterations diverges. In this setting, beyond the first order expansion, we also derive a new mesoscopic central limit theorem for the associated orthogonal polynomials and other quantities relevant to numerical algorithms.
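As a toy version of the case study, one can run the conjugate gradient algorithm on a spiked sample covariance matrix and track the residuals across iterations. The sketch below (numpy/scipy; dimensions and spike strength are illustrative) only sets up the random-matrix ensemble and records the iterates; the paper's Riemann--Hilbert machinery is what predicts the behavior of these residuals as the iteration count grows.

```python
import numpy as np
from scipy.sparse.linalg import cg

rng = np.random.default_rng(0)
n, m, theta = 500, 2000, 4.0

# Spiked model: population covariance I + theta * v v^T, W = sample covariance
v = rng.normal(size=n)
v /= np.linalg.norm(v)
Z = rng.normal(size=(n, m))
Y = Z + (np.sqrt(1.0 + theta) - 1.0) * np.outer(v, v @ Z)
W = (Y @ Y.T) / m

# Conjugate gradient on W x = b, recording residual norms per iteration
b = rng.normal(size=n)
res = []
x, info = cg(W, b, maxiter=60,
             callback=lambda xk: res.append(np.linalg.norm(b - W @ xk)))
print("iterations:", len(res),
      "  final relative residual:", res[-1] / np.linalg.norm(b))
```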
The isogeometric approximation of the Stokes problem in a trimmed domain is studied. This setting is characterized by an underlying mesh that is not fitted to the boundary of the physical domain, which makes the imposition of essential boundary conditions a challenging problem. A very popular strategy is to rely on the so-called Nitsche method \cite{MR3264337}. We show that the Nitsche method lacks stability in some degenerate trimmed domain configurations, potentially polluting the computed solutions. After extending the stabilization procedure of \cite{MR4155233} to incompressible flow problems, we show that we recover the well-posedness of the formulation and, consequently, optimal a priori error estimates. Numerical experiments illustrating stability and convergence rates are included.
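The stability issue can be felt already in one dimension. The sketch below imposes a Dirichlet condition for -u'' = 1 weakly by the symmetric Nitsche method on a P1 mesh (a toy Poisson illustration, not the trimmed isogeometric Stokes setting): when the penalty gamma drops below the coercivity threshold, the system matrix becomes indefinite.

```python
import numpy as np

n = 64                            # P1 elements on (0, 1) for -u'' = 1
h = 1.0 / n
A = np.zeros((n + 1, n + 1))
for e in range(n):                # assemble the stiffness matrix
    A[e:e + 2, e:e + 2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
b = np.full(n + 1, h)             # load vector for f(x) = 1
b[0] = b[-1] = h / 2

# Symmetric Nitsche imposition of u(0) = 0:
#   a(u, v) += -d_n u(0) v(0) - d_n v(0) u(0) + (gamma / h) u(0) v(0),
# where d_n u(0) = -u'(0) and, for P1 elements, u'(0) = (u_1 - u_0) / h.
gamma = 10.0
A[0, 0] += gamma / h - 2.0 / h
A[0, 1] += 1.0 / h
A[1, 0] += 1.0 / h

# Strong condition u(1) = 0: drop the last node
Ar, br = A[:n, :n], b[:n]
u = np.zeros(n + 1)
u[:n] = np.linalg.solve(Ar, br)

x = np.linspace(0.0, 1.0, n + 1)
print("max error vs x(1-x)/2:", np.abs(u - x * (1 - x) / 2).max())
# With gamma < 1 here, A[0, 0] < 0 and the symmetric reduced matrix
# becomes indefinite: coercivity is lost for too small a penalty.
print("min eigenvalue of reduced matrix:", np.linalg.eigvalsh(Ar).min())
```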
We use a numerical-analytic technique to construct a sequence of successive approximations to the solution of a system of fractional differential equations, subject to Dirichlet boundary conditions. We prove the uniform convergence of the sequence of approximations to a limit function, which is the unique solution to the boundary value problem under consideration, and give necessary and sufficient conditions for the existence of solutions. The obtained theoretical results are confirmed by a model example.
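The successive-approximation idea is easy to illustrate on a simpler problem than the one treated here. The sketch below iterates x_{m+1} = x_0 + I^alpha f(., x_m) for the scalar Caputo initial value problem D^alpha x = -x, with the Riemann--Liouville integral I^alpha evaluated exactly on piecewise-constant data; the Dirichlet boundary-value machinery of the paper is omitted.

```python
import numpy as np
from math import gamma

alpha, x0 = 0.8, 1.0

def f(t, x):                          # test equation: Caputo D^alpha x = -x
    return -x

N = 400
t = np.linspace(0.0, 5.0, N + 1)

def frac_integral(y, t, alpha):
    """I^alpha y, with y treated as piecewise constant on the grid."""
    out = np.zeros_like(t)
    for i in range(1, len(t)):
        w = (t[i] - t[:i]) ** alpha - (t[i] - t[1:i + 1]) ** alpha
        out[i] = (w @ y[:i]) / gamma(alpha + 1)
    return out

# Successive approximations x_{m+1} = x0 + I^alpha f(., x_m)
x = np.full(N + 1, x0)
for m in range(60):
    x_new = x0 + frac_integral(f(t, x), t, alpha)
    done = np.abs(x_new - x).max() < 1e-10
    x = x_new
    if done:
        break

print("iterations:", m + 1, "  x(5) =", x[-1])   # Mittag-Leffler-type decay
```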
We establish some limit theorems for quasi-arithmetic means of random variables. This class of means contains the arithmetic, geometric and harmonic means. A distinctive feature of our approach is that the generators of the quasi-arithmetic means are allowed to be complex-valued, which makes it possible to consider quasi-arithmetic means of random variables that may take negative values. Our motivation for the limit theorems is to find simple estimators of the parameters of the Cauchy distribution. By applying the limit theorems, we obtain closed-form, unbiased, strongly consistent estimators of the location and scale parameters of the Cauchy distribution jointly, which are easy to compute and analyze.
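As a concrete instance, take the complex generator f(z) = 1/(z - i). For X ~ Cauchy(mu, sigma) one has E[1/(X - i)] = 1/(mu - i(sigma + 1)), so inverting the generator at the sample mean converges to mu - i*sigma; the real part and the negated imaginary part are then strongly consistent estimators of location and scale. A minimal numpy sketch (without the paper's unbiasedness refinements):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 2.0, 3.0
x = mu + sigma * rng.standard_cauchy(10_000)

# Quasi-arithmetic mean with generator f(z) = 1/(z - i):
# E[1/(X - i)] = 1/(mu - i(sigma + 1)), and f^{-1}(w) = 1/w + i,
# so the quasi-arithmetic mean tends to mu - i*sigma.
m = np.mean(1.0 / (x - 1j))
est = 1.0 / m + 1j

print("mu_hat    =", est.real)
print("sigma_hat =", -est.imag)
```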
In this work, we propose a method for approximating an activation function over a given domain by polynomials of a prescribed low degree. The main idea behind this method can be seen as an extension of the ordinary least-squares method: the gradient of the activation function is included in the cost function to be minimized.
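Since both the value residual and the gradient residual are linear in the polynomial coefficients, the minimization reduces to a single stacked least-squares problem. A minimal numpy sketch for tanh (the degree d and weight lam are illustrative choices, not the paper's settings):

```python
import numpy as np

# Fit a degree-d polynomial p to an activation s and its derivative jointly:
#   minimize  sum_i (p(x_i) - s(x_i))^2 + lam * (p'(x_i) - s'(x_i))^2.
d, lam = 4, 1.0
x = np.linspace(-4.0, 4.0, 201)
s, ds = np.tanh(x), 1.0 - np.tanh(x) ** 2    # activation and its derivative

V = np.vander(x, d + 1, increasing=True)     # rows evaluate p(x_i)
D = np.zeros_like(V)                         # rows evaluate p'(x_i)
D[:, 1:] = V[:, :-1] * np.arange(1, d + 1)

# Stack the two linear systems into one ordinary least-squares problem
A = np.vstack([V, np.sqrt(lam) * D])
y = np.concatenate([s, np.sqrt(lam) * ds])
c, *_ = np.linalg.lstsq(A, y, rcond=None)

print("coefficients:", np.round(c, 4))
print("max |p - tanh| on the grid:", np.abs(V @ c - s).max())
```

Setting lam = 0 recovers the ordinary least-squares fit; increasing it trades pointwise accuracy for a better match of the slope.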
Second-order optimizers are thought to hold the potential to speed up neural network training, but due to the enormous size of the curvature matrix, they typically require approximations to be computationally tractable. The most successful family of approximations is the Kronecker-factored, block-diagonal curvature estimate (KFAC). Here, we combine tools from prior work to evaluate exact second-order updates with careful ablations, and we establish a surprising result: due to its approximations, KFAC is not closely related to second-order updates, and in particular it significantly outperforms true second-order updates. This challenges widely held beliefs and immediately raises the question of why KFAC performs so well. We answer this question by showing that KFAC approximates a first-order algorithm that performs gradient descent on neurons rather than weights. Finally, we show that this optimizer often improves over KFAC in terms of computational cost and data efficiency.
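For reference, the per-layer KFAC update the abstract refers to can be sketched in a few lines for a single fully connected layer, with random stand-ins for the activations and backpropagated gradients (the damping value is an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_out = 256, 50, 30

# A linear layer y = W a: batch of input activations a and
# backpropagated output gradients g (random stand-ins here)
a = rng.normal(size=(n, d_in))
g = rng.normal(size=(n, d_out))
dW = g.T @ a / n                      # gradient of the loss w.r.t. W

# KFAC approximates the layer's curvature by a Kronecker product
#   F ~ A (x) G,  A = E[a a^T],  G = E[g g^T],
# so the preconditioned gradient (A (x) G)^{-1} vec(dW) is G^{-1} dW A^{-1}.
lam = 1e-3                            # damping
A = a.T @ a / n + lam * np.eye(d_in)
G = g.T @ g / n + lam * np.eye(d_out)
update = np.linalg.solve(G, dW) @ np.linalg.inv(A)

print(update.shape)                   # (d_out, d_in), same as W
```

The Kronecker structure is what makes the inverse affordable: two small matrix inverses instead of one inverse of a (d_in * d_out)-dimensional curvature matrix.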
Overdetermined systems of first-kind integral equations appear in many applications. When the right-hand side is discretized, the resulting finite-data problem is ill-posed and admits infinitely many solutions. We propose a numerical method to compute the minimal-norm solution in the presence of boundary constraints. The algorithm stems from the Riesz representation theorem and operates in a reproducing kernel Hilbert space. Since the resulting linear system is strongly ill-conditioned, we construct a regularization method depending on a discrete parameter. It is based on the expansion of the minimal-norm solution in terms of the singular functions of the integral operator defining the problem. Two estimation techniques are tested for the automatic determination of the regularization parameter, namely, the discrepancy principle and the L-curve method. Numerical results concerning two artificial test problems demonstrate the excellent performance of the proposed method. Finally, a particular model typical of geophysical applications, which reproduces the readings of a frequency-domain electromagnetic induction device, is investigated. The results show that the new method is extremely effective when the sought solution is smooth, but it still yields significant information on the solution even when the solution is non-smooth.
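The core regularization ingredient, truncating a singular expansion and choosing the truncation index by the discrepancy principle, can be sketched on a toy discretized Fredholm problem (numpy only; the RKHS construction, boundary constraints, and the L-curve variant are omitted, and the noise level is assumed known):

```python
import numpy as np

rng = np.random.default_rng(0)

# Discretized first-kind Fredholm equation: a smooth kernel yields a
# severely ill-conditioned coefficient matrix
m = 200
t = np.linspace(0.0, 1.0, m)
K = np.exp(-5.0 * (t[:, None] - t[None, :]) ** 2) / m   # kernel matrix
x_true = np.sin(np.pi * t)                               # smooth solution
noise = 1e-4 * rng.normal(size=m)
b = K @ x_true + noise

# Regularize by truncating the singular-value expansion; the number of
# retained terms k is the discrete regularization parameter
U, s, Vt = np.linalg.svd(K)
coeffs = U.T @ b
delta = np.linalg.norm(noise)       # noise level (estimated in practice)

for k in range(1, m + 1):
    x_k = Vt[:k].T @ (coeffs[:k] / s[:k])
    if np.linalg.norm(K @ x_k - b) <= 1.1 * delta:   # discrepancy principle
        break

print("k =", k, "  relative error =",
      np.linalg.norm(x_k - x_true) / np.linalg.norm(x_true))
```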