Following a recently considered generalisation of linear equations to unordered-data vectors and to ordered-data vectors, we perform a further generalisation to vectors of k-element sets of unordered data. These generalised equations naturally appear in the analysis of vector addition systems (or Petri nets) extended so that each token carries a set of unordered data. We show that nonnegative-integer solvability of the linear equations is in nondeterministic exponential time, while integer solvability is in polynomial time.
We propose a location-adaptive self-normalization (SN) based test for change points in time series. The SN technique has been extensively used in change-point detection for its capability to avoid direct estimation of nuisance parameters. However, we find that the power of the SN-based test is sensitive to the location of the break and may suffer from a severe power loss, especially when the change occurs at the early or late stage of the sequence. This phenomenon is essentially caused by the imbalance of the data used before and after the change point when one builds a test statistic based on the cumulative sum (CUSUM) process. Hence, we consider leaving out the samples far away from the potential locations of change points and propose an optimal data selection scheme. Based on this scheme, a new SN-based test statistic adaptive to the locations of breaks is established. The new test can significantly improve the power of the existing SN-based tests while maintaining a satisfactory size. It is a unified treatment that can be readily extended to tests for general quantities of interest, such as the median and the model parameters. The derived optimal subsample selection strategy is not specific to SN-based tests but is applicable to any method that relies on the CUSUM process, which may provide new insights in this area for future research.
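To make the data-selection idea concrete, the following is a minimal sketch of a mean-change CUSUM statistic in which candidate break locations near the ends of the sequence are left out, in the spirit of (but much simpler than) the proposed optimal selection scheme. This is not the authors' statistic; in particular the self-normalizer is omitted, and all function names and the trimming fraction are illustrative.

```python
import numpy as np

def cusum_process(x):
    """CUSUM process T(k) = sum_{i<=k} x_i - (k/n) * sum_{i<=n} x_i."""
    n = len(x)
    s = np.cumsum(x)
    k = np.arange(1, n + 1)
    return s - (k / n) * s[-1]

def max_cusum_stat(x, trim=0.1):
    """Scaled maximum of |CUSUM| over candidate break points, leaving
    out a fraction `trim` of candidate locations at each end."""
    n = len(x)
    t = cusum_process(x)
    lo, hi = int(trim * n), int((1 - trim) * n)
    return np.max(np.abs(t[lo:hi])) / np.sqrt(n)
```

On a sequence with a mid-sample mean shift the statistic is markedly larger than on pure noise, which is the behavior a break test exploits.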
We propose an adaptive finite element algorithm to approximate solutions of elliptic problems whose forcing data is locally defined and approximated by regularization (or mollification). We show that the energy error decay is quasi-optimal in two space dimensions and sub-optimal in three space dimensions. Numerical simulations are provided to confirm our findings.
The support vector machine (SVM) and minimum Euclidean norm least squares regression are two fundamentally different approaches to fitting linear models, but they have recently been connected in models for very high-dimensional data through a phenomenon of support vector proliferation, where every training example used to fit an SVM becomes a support vector. In this paper, we explore the generality of this phenomenon and make the following contributions. First, we prove a super-linear lower bound on the dimension (in terms of sample size) required for support vector proliferation in independent feature models, matching the upper bounds from previous works. We further identify a sharp phase transition in Gaussian feature models, bound the width of this transition, and give experimental support for its universality. Finally, we hypothesize that this phase transition occurs only in much higher-dimensional settings in the $\ell_1$ variant of the SVM, and we present a new geometric characterization of the problem that may elucidate this phenomenon for the general $\ell_p$ case.
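The phenomenon can be probed numerically. A convenient check used in this line of work is that the hard-margin SVM solution coincides with the minimum-norm interpolator (equivalently, every sample is a support vector) exactly when $y_i \big((XX^\top)^{-1} y\big)_i > 0$ for all $i$. The sketch below applies this check to a Gaussian independent-feature model; the function name is illustrative, and the specific sample sizes are our own choices.

```python
import numpy as np

def proliferation_holds(X, y):
    """Check the sign condition characterizing support vector
    proliferation: the hard-margin SVM equals the min-norm
    interpolator iff y_i * ((X X^T)^{-1} y)_i > 0 for every i."""
    G = X @ X.T                      # n x n Gram matrix (needs d >= n)
    alpha = np.linalg.solve(G, y)    # dual-style coefficients
    return bool(np.all(y * alpha > 0))
```

For dimensions far exceeding the sample size (e.g. $d = 10{,}000$, $n = 10$), the Gram matrix is close to a multiple of the identity and the condition typically holds, consistent with proliferation occurring once $d$ is large enough relative to $n$.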
There are proposals that extend classical generalized additive models (GAMs) to accommodate high-dimensional data ($p \gg n$) using group sparse regularization. However, the sparse regularization may induce excess shrinkage when estimating smoothing functions, damaging predictive performance. Moreover, most of these GAMs take an "all-in-all-out" approach to functional selection, making it difficult to determine whether nonlinear effects are necessary. While some Bayesian models can address these shortcomings, using Markov chain Monte Carlo algorithms for model fitting creates a new challenge: scalability. Hence, we propose Bayesian hierarchical generalized additive models as a solution: we use the smoothing penalty for proper shrinkage of curve interpolation and for separating the linear and nonlinear spaces of the smoothing functions. A novel spike-and-slab spline prior is proposed to select components of smoothing functions. Two scalable and deterministic algorithms, EM-Coordinate Descent and EM-Iterative Weighted Least Squares, are developed for different utilities. Simulation studies and metabolomics data analyses demonstrate improved predictive or computational performance against the state-of-the-art models mgcv, COSSO, and sparse Bayesian GAM. A software implementation of the proposed models is freely available via the R package BHAM.
We study approximation methods for a large class of mixed models with a probit link function that includes mixed versions of the binomial model, the multinomial model, and generalized survival models. The class of models is special because the marginal likelihood can be expressed either as Gaussian-weighted integrals or as multivariate Gaussian cumulative distribution functions. The latter approach is unique to probit link models and has been proposed for parameter estimation in complex mixed-effects models. However, it has not been investigated in which scenarios each form is preferable. Our simulations and data example show that neither form is preferable in general, and we give guidance on when to approximate the cumulative distribution functions and when to approximate the Gaussian-weighted integrals and, in the latter case, which general-purpose method to use among a large list of candidates.
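For the simplest member of this class, a probit model with a single Gaussian random intercept, the two forms can be written down and checked against each other: the marginal success probability is the Gaussian-weighted integral $\int \Phi(\eta + \sigma u)\,\varphi(u)\,du$, which for this one-dimensional case equals the closed-form CDF expression $\Phi\!\big(\eta/\sqrt{1+\sigma^2}\big)$. The sketch below (a minimal illustration, not the paper's methods) evaluates the integral form by Gauss-Hermite quadrature and compares it with the CDF form.

```python
import numpy as np
from scipy.stats import norm

def prob_ghq(eta, sigma, n_nodes=50):
    """Gaussian-weighted integral form, int Phi(eta + sigma*u) phi(u) du,
    via Gauss-Hermite quadrature (substituting u = sqrt(2)*t)."""
    t, w = np.polynomial.hermite.hermgauss(n_nodes)
    return (w * norm.cdf(eta + sigma * np.sqrt(2.0) * t)).sum() / np.sqrt(np.pi)

def prob_cdf(eta, sigma):
    """Equivalent CDF form, specific to the probit link:
    Phi(eta / sqrt(1 + sigma^2))."""
    return norm.cdf(eta / np.sqrt(1.0 + sigma**2))
```

In higher-dimensional random-effects models the CDF form becomes a genuinely multivariate Gaussian CDF and the trade-off between the two approximations is no longer trivial, which is the question the paper studies.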
In this paper we are concerned with energy-conserving methods for Poisson problems, which are effectively solved by defining a suitable generalization of Hamiltonian Boundary Value Methods (HBVMs), a class of energy-conserving methods for Hamiltonian problems. The actual implementation of the methods is fully discussed, with a particular emphasis on the conservation of Casimirs. Some numerical tests are reported in order to assess the theoretical findings.
This paper proposes an isogeometric boundary element method (IGBEM) to solve electromagnetic scattering problems for three-dimensional doubly-periodic multi-layered structures. The main concerns are the constructions of (i) an open surface (between two layers) and (ii) a vector basis function, both using B-spline functions. Regarding (i), we consider an algorithm to generate a doubly-periodic open surface from the tensor product of B-spline functions of any degree. Regarding (ii), we employ the vector basis function based on B-spline functions proposed by Buffa et al. (2010) and adapt it to the underlying periodic problems so that it satisfies the quasi-periodic condition on the boundary of an open surface. The proposed IGBEM solved several numerical examples satisfactorily, demonstrating its applicability to plasmonic simulations.
Recently, a new variant of the BiCGStab method, known as the pipelined BiCGStab, has been proposed. This method can achieve higher scalability and speed-up rates through a mechanism in which the communication phase for the computation of the inner products is overlapped with the computation of the matrix-vector product. On the other hand, there exist several generalized iteration methods with better convergence behavior than BiCGStab, such as ssBiCGSafe, BiCGSafe, and GPBi-CG. Of these methods, ssBiCGSafe, which requires a single inner-product computation phase per iteration, is best suited for high-performance computing systems. In this paper, inspired by the success of the pipelined BiCGStab method, we propose variations of the ssBiCGSafe method in which only one inner-product computation phase per iteration is required, and this phase can be overlapped with the matrix-vector computation. Through numerical experiments, we show that the proposed methods improve convergence behavior and execution time compared to the pipelined BiCGStab and ssBiCGSafe methods.
Finite linear least squares is one of the core problems of numerical linear algebra, with countless applications across science and engineering. Consequently, there is a rich and ongoing literature on algorithms for solving linear least squares problems. In this paper, we explore a variant in which the system's matrix has one infinite dimension (i.e., it is a quasimatrix). We call such problems semi-infinite linear regression problems. As we show, the semi-infinite case arises in several applications, such as supervised learning and function approximation, and allows for novel interpretations of existing algorithms. We explore semi-infinite linear regression rigorously and algorithmically. To that end, we give a formal framework for working with quasimatrices, and generalize several algorithms designed for the finite problem to the infinite case. Finally, we suggest the use of various sampling methods for obtaining an approximate solution.
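As a toy illustration of the sampling approach, one can discretize the function columns of a quasimatrix on a grid and solve the resulting finite least-squares problem. The helper below is our own sketch, not the paper's algorithm; the grid, the column functions, and all names are illustrative assumptions.

```python
import numpy as np

def sil_regression(cols, b, n_samples=200):
    """Approximate the semi-infinite least-squares problem
    min_c || sum_j c_j f_j - b ||_{L2[0,1]} by sampling each
    function column on a uniform grid and solving the resulting
    finite least-squares problem."""
    x = np.linspace(0.0, 1.0, n_samples)
    A = np.column_stack([f(x) for f in cols])  # sampled quasimatrix
    c, *_ = np.linalg.lstsq(A, b(x), rcond=None)
    return c
```

When the target function lies in the span of the columns, the sampled problem recovers the exact coefficients; for general targets, finer grids (or proper quadrature weights) reduce the discretization error of the continuous norm.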
This paper proposes a metric on sets of trajectories for evaluating multi-object tracking algorithms that includes time-weighted costs for localisation errors of properly detected targets, for false targets, for missed targets, and for track switches. The proposed metric extends the metric in [1] by adding weights to the costs associated with different time steps. The time-weighted costs increase the flexibility of the metric in [1] to fit more applications and user preferences. We first introduce a metric based on multi-dimensional assignments, and then its linear programming relaxation, which is computable in polynomial time and is also a metric. The metrics can also be extended to metrics on random finite sets of trajectories to evaluate and rank algorithms across different scenarios, each with a ground-truth set of trajectories.
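A drastically simplified sketch of a time-weighted assignment cost in one dimension is shown below. It ignores the track-switch term and the multi-dimensional assignment / LP relaxation that the paper actually develops, and all names, the cutoff, and the aggregation are illustrative assumptions; it only shows how per-step localisation, missed-target, and false-target costs can be combined with time weights.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def weighted_step_cost(X, Y, c=2.0, p=2):
    """Cost at one time step: optimally assign estimates Y to ground
    truths X, with localisation errors capped at cutoff c and a cost
    of c^p / 2 for each missed or false point (GOSPA-style)."""
    if len(X) == 0 or len(Y) == 0:
        return (len(X) + len(Y)) * (c ** p) / 2.0
    D = np.minimum(np.abs(X[:, None] - Y[None, :]) ** p, c ** p)
    rows, cols = linear_sum_assignment(D)
    unmatched = len(X) + len(Y) - 2 * len(rows)
    return D[rows, cols].sum() + unmatched * (c ** p) / 2.0

def time_weighted_metric(gt, est, weights, c=2.0, p=2):
    """Time-weighted sum of per-step assignment costs (no switch term)."""
    total = sum(w * weighted_step_cost(x, y, c, p)
                for w, x, y in zip(weights, gt, est))
    return total ** (1.0 / p)
```

Choosing non-uniform `weights` lets a user emphasise, say, errors at recent time steps, which is the kind of flexibility the time-weighted costs are meant to provide.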