In a series of recent papers, the spectral behavior of the matrix sequence $\{Y_nT_n(f)\}$ has been studied in the sense of spectral distribution, where $Y_n$ is the main antidiagonal (or flip) matrix and $T_n(f)$ is the Toeplitz matrix generated by a function $f$ that is Lebesgue integrable and has real Fourier coefficients. This kind of study is also motivated computationally by the solution of the related large linear systems via the (preconditioned) MINRES algorithm. Here we complement the spectral study with further results, holding both asymptotically and for fixed dimension $n$, on the eigenvalues, singular values, and eigenvectors of $T_n(f)$ and $Y_nT_n(f)$, and on several relationships among them: besides fast linear solvers, a further target is the design of ad hoc procedures for computing the related spectra via matrix-less algorithms, at a cost linear in the number of computed eigenvalues. We emphasize that the current work also tackles the challenging case of non-monotone generating functions, for which previous matrix-less algorithms fail. Numerical experiments are reported and discussed, with the aim of illustrating the theoretical analysis visually.
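For intuition, here is a minimal sketch (our own toy, not the paper's code) of why MINRES applies to $Y_nT_n(f)$: flipping the rows of a real Toeplitz matrix produces a Hankel matrix, which is symmetric and hence has real eigenvalues. The coefficients below are illustrative placeholders.

```python
# Minimal sketch: Y_n T_n(f) is symmetric for real Fourier coefficients,
# which is what makes MINRES applicable. The coefficients are illustrative.
import numpy as np
from scipy.linalg import toeplitz

n = 8
c = np.array([2.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0])   # t_0, t_1, t_2, ... (first column)
r = np.array([2.0, -0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0])  # t_0, t_{-1}, t_{-2}, ... (first row)
T = toeplitz(c, r)

Y = np.fliplr(np.eye(n))   # flip (exchange) matrix: ones on the main antidiagonal
YT = Y @ T                 # row-flipping a Toeplitz matrix yields a Hankel matrix

print(np.allclose(YT, YT.T))             # True: Y T is symmetric
print(np.sort(np.linalg.eigvalsh(YT)))   # real eigenvalues, as the theory predicts
```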
Stochastic kriging has been widely employed for simulation metamodeling to predict the response surface of complex simulation models. However, its use is limited to cases where the design space is low-dimensional because, in general, the sample complexity (i.e., the number of design points required for stochastic kriging to produce an accurate prediction) grows exponentially in the dimensionality of the design space. The large sample size results in both a prohibitive sample cost for running the simulation model and a severe computational challenge due to the need to invert large covariance matrices. Based on tensor Markov kernels and sparse grid experimental designs, we develop a novel methodology that dramatically alleviates the curse of dimensionality. We show that the sample complexity of the proposed methodology grows only mildly with the dimensionality, even under model misspecification. We also develop fast algorithms that compute stochastic kriging in its exact form, without any approximation schemes. We demonstrate via extensive numerical experiments that our methodology can handle problems with a design space of more than 10,000 dimensions, improving both prediction accuracy and computational efficiency by orders of magnitude relative to typical alternative methods in practice.
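As a rough illustration (an assumption on our part, not the authors' code): a canonical instance of a tensor Markov kernel is a coordinatewise product of one-dimensional exponential (Ornstein-Uhlenbeck) kernels, which keeps the Gram matrix cheap to assemble.

```python
# Illustrative sketch: a tensor (product) Markov kernel built from 1-D
# exponential kernels, one factor per coordinate. Names/parameters assumed.
import numpy as np

def tensor_markov_kernel(X, Y, theta):
    """Gram matrix K[i, j] = prod_d exp(-theta[d] * |X[i, d] - Y[j, d]|)."""
    diff = np.abs(X[:, None, :] - Y[None, :, :])   # pairwise |x_d - y_d|, shape (n_x, n_y, d)
    return np.exp(-(diff * theta).sum(axis=2))     # product of exponentials = exp of the sum

rng = np.random.default_rng(0)
X = rng.random((5, 3))                             # 5 design points in [0, 1]^3
K = tensor_markov_kernel(X, X, theta=np.ones(3))
print(K.shape, np.allclose(K, K.T))                # (5, 5) True
```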
This thesis investigates the quality of randomly collected data by employing a framework built on information-based complexity, a field concerned with the numerical analysis of abstract problems. The quality or power of gathered information is measured by its radius, which is the uniform error achievable by the best possible algorithm that uses it. The main aim is to present progress towards understanding the power of random information for approximation and integration problems.
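For context, a hedged sketch of the standard information-based complexity definition (notation assumed here, not quoted from the thesis): for a solution operator $S$ on a problem class $F$ and an information map $N$, the radius of information is
\[
  r(N) \;=\; \inf_{\phi}\, \sup_{f \in F} \bigl\| S(f) - \phi\bigl(N(f)\bigr) \bigr\|,
\]
i.e., the worst-case error of the best algorithm $\phi$ that has access only to $N(f)$.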
In business process landscapes, a common challenge is to provide the necessary computational resources to enact the individual process steps. One well-known approach to solving this issue in a cost-efficient way is to use the notion of elasticity, i.e., to provision cloud-based computational resources rapidly and to enact the individual process steps on these resources. Existing approaches to providing elastic processes are mostly based on Virtual Machines (VMs). Utilizing container technologies could enable a more fine-grained allocation of process steps to computational resources, leading to better resource utilization and improved cost efficiency. In this paper, we propose an approach to optimize resource allocation for elastic processes by applying a four-fold auto-scaling approach. The main goal is to minimize the cost of process enactments by using containers. To this end, we formulate and implement a multi-objective optimization problem using Mixed-Integer Linear Programming and use a transformation step to allocate software services to containers. We thoroughly evaluate the optimization problem and show that it can lead to significant cost savings while maintaining Service Level Agreements.
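To make the flavor of such a formulation concrete, here is a deliberately simplified, bin-packing-style MILP that assigns services to containers to minimize container cost. This is our own toy sketch: the paper's four-fold, multi-objective formulation is richer, and all names and data below are invented.

```python
# Toy MILP sketch (not the paper's formulation): place services on containers
# at minimum container cost, respecting CPU capacities. Data is invented.
import pulp

services = ["s1", "s2", "s3"]
containers = ["c1", "c2"]
cpu = {"s1": 2, "s2": 1, "s3": 3}     # CPU demand per service (assumed)
cap = {"c1": 4, "c2": 4}              # CPU capacity per container (assumed)
cost = {"c1": 1.0, "c2": 1.5}         # leasing cost per container (assumed)

prob = pulp.LpProblem("container_allocation", pulp.LpMinimize)
x = pulp.LpVariable.dicts("assign", [(s, c) for s in services for c in containers], cat="Binary")
y = pulp.LpVariable.dicts("open", containers, cat="Binary")

prob += pulp.lpSum(cost[c] * y[c] for c in containers)           # minimize total container cost
for s in services:                                                # each service placed exactly once
    prob += pulp.lpSum(x[(s, c)] for c in containers) == 1
for c in containers:                                              # capacity only on opened containers
    prob += pulp.lpSum(cpu[s] * x[(s, c)] for s in services) <= cap[c] * y[c]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([(s, c) for (s, c) in x if x[(s, c)].value() == 1])
```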
The absolute value equations (AVE) problem is the algebraic problem of solving $Ax+|x|=b$. So far, most research has focused on methods for solving AVEs; we instead address the problem itself by analysing properties of the AVE and the corresponding solution set. In particular, we investigate topological properties of the solution set, such as convexity, boundedness, connectedness, and whether it consists of finitely many solutions. Further, we address problems related to the nonnegativity of solutions, such as solvability and unique solvability. The AVE can be formulated by means of various optimization problems, and in this regard we are interested in how the solutions of the AVE are related to the optima, Karush-Kuhn-Tucker points, and feasible solutions of these optimization problems. We characterize the matrix classes associated with the above-mentioned properties and inspect the computational complexity of the recognition problem; some of the classes are polynomially recognizable, while others are proved to be NP-hard. For the intractable cases, we propose various sufficient conditions. We also pose new challenging problems that arose during our investigation.
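For readers new to AVEs, a minimal sketch of a classical solution method (a Mangasarian-style generalized Newton iteration, adapted here to the sign convention $Ax+|x|=b$ and not taken from this paper): using $|x| = \mathrm{diag}(\mathrm{sign}(x))\,x$, each step solves a linear system.

```python
# Sketch of a generalized Newton iteration for Ax + |x| = b (illustrative;
# convergence needs conditions on A, e.g., singular values exceeding 1).
import numpy as np

def solve_ave(A, b, x0=None, tol=1e-10, max_iter=100):
    x = np.zeros(len(b)) if x0 is None else x0.copy()
    for _ in range(max_iter):
        D = np.diag(np.sign(x))              # generalized Jacobian of |x|
        x_new = np.linalg.solve(A + D, b)    # one semismooth Newton step
        if np.linalg.norm(A @ x_new + np.abs(x_new) - b) < tol:
            return x_new
        x = x_new
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5)) + 5 * np.eye(5)   # strongly diagonally dominant test matrix
x_true = rng.standard_normal(5)
b = A @ x_true + np.abs(x_true)
print(np.allclose(solve_ave(A, b), x_true))       # True on this well-conditioned instance
```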
We propose an estimator for the singular vectors of high-dimensional low-rank matrices corrupted by additive subgaussian noise, where the noise matrix is allowed to have dependence within rows and heteroskedasticity across rows. We prove finite-sample $\ell_{2,\infty}$ bounds and a Berry-Esseen theorem for the individual entries of the estimator, and we apply these results to high-dimensional mixture models. Our Berry-Esseen theorem clearly shows the geometric relationship between the signal matrix, the covariance structure of the noise, and the distribution of the errors in the singular vector estimation task. These results are illustrated in numerical simulations. Previous results of this type rely on assumptions of gaussianity or independence between the entries of the additive noise; handling the dependence between entries in our proofs requires careful leave-one-out analyses and conditioning arguments. Our results depend only on the signal-to-noise ratio, the sample size, and the spectral properties of the signal matrix.
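A hedged toy simulation of the setting (plain SVD on a synthetic instance, not the authors' estimator in full): generate a low-rank signal plus noise with within-row dependence and row-wise heteroskedasticity, then measure the $\ell_{2,\infty}$ error of the estimated singular subspace after Procrustes alignment.

```python
# Toy simulation of singular vector estimation under dependent,
# heteroskedastic subgaussian noise; all dimensions/scales are invented.
import numpy as np

rng = np.random.default_rng(2)
n, p, r = 400, 300, 2
U = np.linalg.qr(rng.standard_normal((n, r)))[0]          # true left singular vectors
V = np.linalg.qr(rng.standard_normal((p, r)))[0]
S = U @ np.diag([50.0, 30.0]) @ V.T                       # rank-2 signal matrix

# noise: dependence within rows (common factor) + heteroskedasticity across rows
C = np.linalg.cholesky(0.5 * np.eye(p) + 0.5 * np.ones((p, p)) / p)
row_scales = rng.uniform(0.5, 2.0, size=n)
E = row_scales[:, None] * (rng.standard_normal((n, p)) @ C.T)

Uhat = np.linalg.svd(S + E)[0][:, :r]                     # SVD-based estimate
W, _, Zt = np.linalg.svd(Uhat.T @ U)                      # Procrustes alignment
err = np.linalg.norm(Uhat @ (W @ Zt) - U, axis=1)         # rowwise 2-norm errors
print(err.max())                                          # the ell_{2,infty} error
```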
Bayesian analysis methods often use some form of iterative simulation, such as Monte Carlo computation. Models that involve discrete variables can sometimes pose a challenge, either because the methods used do not support such variables (e.g. Hamiltonian Monte Carlo) or because the presence of such variables can slow down the computation. A common workaround is to marginalise the discrete variables out of the model. While it is reasonable to expect that such marginalisation would also lead to more time-efficient computations, to our knowledge this has not been demonstrated beyond a few specialised models. We explored the impact of marginalisation on computational efficiency for several simple statistical models: two- and three-component Gaussian mixture models, and the Dawid-Skene model for categorical ratings. We explored each with two software implementations of Markov chain Monte Carlo techniques, JAGS and Stan, directly comparing marginalised and non-marginalised versions of the same model within each software package. Our results show that marginalisation on its own does not necessarily boost performance; nevertheless, the best performance was usually achieved with Stan, which requires marginalisation. We conclude that there is no simple answer to whether marginalisation is helpful: it cannot be assumed to provide a computational benefit independently of other factors, nor is it likely to be the modelling choice with the largest impact on computational efficiency.
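To illustrate what marginalisation means in practice (our own minimal sketch, not the JAGS/Stan models from the study): for a Gaussian mixture, the discrete component label is summed out of the likelihood with a numerically stable log-sum-exp, so no discrete sampling is needed.

```python
# Sketch: marginalising the component indicator z out of a Gaussian mixture,
# log p(y_i) = logsumexp_k [ log w_k + log N(y_i | mu_k, sigma_k) ].
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

def marginal_loglik(y, w, mu, sigma):
    comp = np.log(w) + norm.logpdf(y[:, None], loc=mu, scale=sigma)  # (n, K) per-component terms
    return logsumexp(comp, axis=1).sum()                             # labels summed out

rng = np.random.default_rng(6)
y = np.concatenate([rng.normal(-2, 1, 100), rng.normal(3, 1, 100)])
print(marginal_loglik(y, w=np.array([0.5, 0.5]),
                      mu=np.array([-2.0, 3.0]), sigma=np.array([1.0, 1.0])))
```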
We study the rank of sub-matrices arising from kernel functions, $F(\pmb{x},\pmb{y}): \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$, where $\pmb{x},\pmb{y} \in \mathbb{R}^d$, that have a singularity along the diagonal $\pmb{x}=\pmb{y}$. Such kernel functions are frequently encountered in a wide range of applications, such as $N$-body problems, Green's functions, integral equations, geostatistics, kriging, and Gaussian processes. One of the challenges in dealing with these kernel functions is that the associated matrices are large and dense, and hence the computational cost of matrix operations is high. In this article, we prove new theorems bounding the numerical rank of sub-matrices arising from these kernel functions. Under reasonably mild assumptions, we prove that certain sub-matrices are rank-deficient in finite precision; the rank depends on the dimension of the ambient space and on the type of interaction between the hyper-cubes containing the corresponding sets of particles. This rank structure can be leveraged to reduce the computational cost of matrix operations such as matrix-vector products and solving linear systems. We also present numerical results on the growth of the rank of certain sub-matrices in $1$D, $2$D, $3$D, and $4$D, which, not surprisingly, agree with the theoretical results.
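A quick numerical illustration of the phenomenon (our own toy, with the assumed kernel $1/\|\pmb{x}-\pmb{y}\|$): the interaction block between two well-separated boxes has tiny numerical rank relative to its size, which is exactly what low-rank compression exploits.

```python
# Sketch: numerical rank of a kernel sub-matrix between separated hyper-cubes.
import numpy as np

rng = np.random.default_rng(3)
X = rng.random((500, 2))                          # sources in the unit box [0, 1]^2
Y = rng.random((500, 2)) + np.array([3.0, 0.0])   # targets in a well-separated box

diff = X[:, None, :] - Y[None, :, :]
K = 1.0 / np.linalg.norm(diff, axis=2)            # 500 x 500 dense interaction block

s = np.linalg.svd(K, compute_uv=False)
numerical_rank = int((s > 1e-12 * s[0]).sum())    # singular values above a tolerance
print(numerical_rank)                             # typically O(10), far below 500
```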
Research in derivative-free global optimization is under active development, and many solution techniques are available today; the experimental comparison of established and emerging algorithms must therefore be kept up to date. This paper considers the solution of the bound-constrained, possibly black-box global optimization problem. It compares 64 derivative-free deterministic algorithms against classic and state-of-the-art stochastic solvers. Among the deterministic ones, particular emphasis is placed on DIRECT-type algorithms, for which significant progress has been made in recent years. A set of 800 test problems generated by the well-known GKLS generator and 397 traditional test problems from the DIRECTGOLib v1.2 collection are used in the computational study. More than 239,400 solver runs were carried out, requiring more than 531 days of single-CPU time to complete. We found that deterministic algorithms perform excellently on GKLS-type and low-dimensional problems, while stochastic algorithms proved more efficient in higher dimensions.
Approximate solutions to large least squares problems can be computed efficiently using leverage score-based row sketches, but directly computing the leverage scores, or sampling according to them with naive methods, still requires expensive manipulation and processing of the design matrix. In this paper we develop efficient leverage score-based sampling methods for matrices with certain Kronecker product-type structure; in particular, we consider matrices that are monotone lower column subsets of Kronecker product matrices. Our discussion is general, encompassing least squares problems on infinite domains, in which case the matrices formally have infinitely many rows. We briefly survey leverage score-based sampling guarantees from the numerical linear algebra and approximation theory communities, and then present efficient algorithms for sampling when the design matrix has Kronecker-type structure. Our numerical examples confirm that sketches based on exact leverage score sampling for this class of structured matrices achieve smaller residuals than approximate leverage score sampling methods.
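For background, a sketch of the standard (unstructured) baseline the paper accelerates (this is the textbook technique, not the paper's structured-matrix algorithm): exact leverage scores of a tall matrix are the squared row norms of its thin-QR factor, and sampling rows proportionally to them yields a reweighted sketched least squares problem.

```python
# Sketch: exact leverage score sampling for least squares via thin QR.
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((2000, 20))
b = rng.standard_normal(2000)

Q, _ = np.linalg.qr(A)                       # thin QR; Q is 2000 x 20
scores = (Q ** 2).sum(axis=1)                # leverage scores (sum to the rank, 20)
probs = scores / scores.sum()

m = 200                                      # sketch size
idx = rng.choice(len(b), size=m, p=probs)    # sample rows by leverage
w = 1.0 / np.sqrt(m * probs[idx])            # reweight for an unbiased sketch
x_sketch = np.linalg.lstsq(w[:, None] * A[idx], w * b[idx], rcond=None)[0]
x_full = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.linalg.norm(x_sketch - x_full))     # small: the sketch is close to the full solve
```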
We study the problem of estimating an unknown parameter in a distributed and online manner. Existing work on distributed online learning typically either focuses on asymptotic analysis or provides regret bounds. However, such results may not directly translate into bounds on the error of the learned model after a finite number of time steps. In this paper, we propose a distributed online estimation algorithm that enables each agent in a network to improve its estimation accuracy by communicating with its neighbors. We provide non-asymptotic bounds on the estimation error, leveraging the statistical properties of the underlying model. Our analysis demonstrates a trade-off between estimation error and communication cost. Furthermore, it allows us to determine a time at which communication can be stopped (given the cost of communicating) while still meeting a desired estimation accuracy. We also provide a numerical example to validate our results.
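To convey the flavor of such schemes, here is a toy consensus-plus-innovation loop (our illustration only; the paper's algorithm, network model, and step sizes are assumptions here): each agent takes in a fresh noisy observation of the parameter and averages its estimate with its neighbors' on a ring network.

```python
# Toy sketch of distributed online estimation (consensus + innovation);
# all modeling choices (ring graph, scalar parameter, 1/t steps) are assumed.
import numpy as np

rng = np.random.default_rng(5)
theta = 2.5                                   # unknown scalar parameter
n_agents, T = 5, 200
W = np.zeros((n_agents, n_agents))            # doubly stochastic weights on a ring
for i in range(n_agents):
    W[i, i] = W[i, (i - 1) % n_agents] = W[i, (i + 1) % n_agents] = 1 / 3

x = np.zeros(n_agents)                        # local estimates
for t in range(1, T + 1):
    y = theta + rng.standard_normal(n_agents) # new noisy observations at each agent
    x = W @ x + (1.0 / t) * (y - x)           # consensus step + innovation step
print(np.abs(x - theta).max())                # every agent's estimate is near theta
```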