We consider the use of multipreconditioning, which allows multiple preconditioners to be applied in parallel, on high-frequency Helmholtz problems. Typical applications present challenging sparse linear systems that are complex non-Hermitian and, due to the pollution effect, either very large or else still large but under-resolved in terms of the physics. These factors make finding general-purpose, efficient, and scalable solvers difficult, and no single approach has become the clear method of choice. In this work we take inspiration from domain decomposition strategies known as sweeping methods, which have gained notable interest for their ability to yield nearly linear asymptotic complexity and which can also be favourable for high-frequency problems. While successful approaches exist, such as those based on higher-order interface conditions, perfectly matched layers (PMLs), or complex tracking of wave fronts, they can often be quite involved or tedious to implement. We investigate here the use of simple sweeping techniques applied in different directions, which can then be incorporated in parallel into a multipreconditioned GMRES strategy. Preliminary numerical results on a two-dimensional benchmark problem demonstrate the potential of this approach.
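To make the basic idea concrete, a minimal sketch of a single multipreconditioned residual-minimisation step is given below. It assumes a matrix A and a list of callables precond_solves, each applying one sweeping preconditioner (e.g. sweeps in different directions); all names are illustrative, and the actual strategy uses multipreconditioned GMRES with a growing, orthogonalised search space rather than a single step.

```python
import numpy as np

def mp_residual_min_step(A, b, x, precond_solves):
    """One multipreconditioned residual-minimisation step: apply every
    preconditioner to the current residual (in parallel, in principle) and
    combine the resulting corrections via a small least-squares problem.
    Simplified illustration only, not the full multipreconditioned GMRES."""
    r = b - A @ x
    # one search direction per preconditioner
    Z = np.column_stack([solve(r) for solve in precond_solves])
    AZ = A @ Z
    # coefficients c minimising || b - A (x + Z c) ||_2
    c, *_ = np.linalg.lstsq(AZ, r, rcond=None)
    return x + Z @ c
```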
In this paper, we propose multicontinuum splitting schemes for multiscale problems, focusing on a parabolic equation with a high-contrast coefficient. Using the framework of multicontinuum homogenization, we introduce spatially smooth macroscopic variables and decompose the multicontinuum solution space into two components to effectively separate the dynamics at different speeds (or the effects of contrast in high-contrast cases). By treating the component containing fast dynamics (or dependent on the contrast) implicitly and the component containing slow dynamics (or independent of the contrast) explicitly, we construct partially explicit time discretization schemes, which can reduce computational cost. The derived stability conditions are contrast-independent, provided the continua are chosen appropriately. Additionally, we discuss possible methods to obtain an optimized decomposition of the solution space, which relaxes the stability conditions while enhancing computational efficiency. A Rayleigh quotient problem in tensor form is formulated, and simplifications are achieved under certain assumptions. Finally, we present numerical results for various coefficient fields and different continua to validate our proposed approach. It can be observed that the multicontinuum splitting schemes enjoy high accuracy and efficiency.
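The following is a minimal sketch, under stated assumptions, of the kind of partially explicit (implicit-explicit) Euler step described above. It assumes a semi-discrete system M u' + A u = f whose degrees of freedom have already been split into a contrast-dependent "fast" block and a contrast-independent "slow" block, with a mass matrix that is block diagonal with respect to this split; it is not the multicontinuum construction itself, and all names are placeholders.

```python
import numpy as np

def partially_explicit_step(M, A, f, u, dt, fast):
    """One partially explicit (implicit-explicit) Euler step for the
    semi-discrete system M u' + A u = f. The boolean mask `fast` selects the
    contrast-dependent block, treated implicitly; the complementary slow
    block is advanced explicitly. M is assumed block diagonal w.r.t. the split."""
    slow = ~fast
    u_new = u.copy()
    # explicit update of the slow (contrast-independent) component
    rhs_s = f[slow] - A[np.ix_(slow, slow)] @ u[slow] - A[np.ix_(slow, fast)] @ u[fast]
    u_new[slow] = u[slow] + dt * np.linalg.solve(M[np.ix_(slow, slow)], rhs_s)
    # implicit update of the fast (contrast-dependent) component, lagging the slow one
    Mff, Aff = M[np.ix_(fast, fast)], A[np.ix_(fast, fast)]
    rhs_f = Mff @ u[fast] + dt * (f[fast] - A[np.ix_(fast, slow)] @ u[slow])
    u_new[fast] = np.linalg.solve(Mff + dt * Aff, rhs_f)
    return u_new
```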
We investigate notions of complete representation by partial functions, where the operations in the signature include antidomain restriction and may include composition, intersection, update, preferential union, domain, antidomain, and set difference. When the signature includes both antidomain restriction and intersection, the join-complete and the meet-complete representations coincide. Otherwise, for the signatures we consider, meet-complete is strictly stronger than join-complete. A necessary condition to be meet-completely representable is that the atoms are separating. For the signatures we consider, this condition is sufficient if and only if composition is not in the signature. For each of the signatures we consider, the class of (meet-)completely representable algebras is not axiomatisable by any existential-universal-existential first-order theory. For 14 expressively distinct signatures, we show, by giving an explicit representation, that the (meet-)completely representable algebras form a basic elementary class, axiomatisable by a universal-existential-universal first-order sentence. The signatures we axiomatise are those containing antidomain restriction and any of intersection, update, and preferential union and also those containing antidomain restriction, composition, and intersection and any of update, preferential union, domain, and antidomain.
We introduce a new framework for dimension reduction in the context of high-dimensional regression. Our proposal is to aggregate an ensemble of random projections, which have been carefully chosen based on the empirical regression performance after being applied to the covariates. More precisely, we consider disjoint groups of independent random projections, apply a base regression method after each projection, and retain, within each group, the projection with the best empirical performance. We aggregate the selected projections by taking the singular value decomposition of their empirical average and then output the leading singular vectors. A particularly appealing aspect of our approach is that the singular values provide a measure of the relative importance of the corresponding projection directions, which can be used to select the final projection dimension. We investigate in detail (and provide default recommendations for) various aspects of our general framework, including the projection distribution and the base regression method, as well as the number of random projections used. Additionally, we investigate the possibility of further reducing the dimension by applying our algorithm twice in cases where the projection dimension recommended in the initial application is too large. Our theoretical results show that the error of our algorithm stabilises as the number of groups of projections increases. We demonstrate the excellent empirical performance of our proposal in a large numerical study using simulated and real data.
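A rough sketch of this workflow is given below, under assumptions not fixed by the text: Gaussian projections orthonormalised by QR, cross-validated linear regression as the base method, and aggregation of the selected projections through their rank-d projection operators. The defaults and the exact aggregation step used in the paper may differ, and all names are placeholders.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def rp_ensemble_reduction(X, y, d, n_groups=20, group_size=10, seed=0):
    """Sketch: per group, draw random Gaussian projections to R^d, keep the
    one whose projected covariates score best under a base regression, then
    average the selected projection operators and return the leading
    singular vectors together with the singular values (importances)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    selected = []
    for _ in range(n_groups):
        best_score, best_Q = -np.inf, None
        for _ in range(group_size):
            G = rng.standard_normal((p, d))
            Q, _ = np.linalg.qr(G)                       # orthonormalise columns
            score = cross_val_score(LinearRegression(), X @ Q, y, cv=5).mean()
            if score > best_score:
                best_score, best_Q = score, Q
        selected.append(best_Q @ best_Q.T)               # rank-d projection operator
    avg = np.mean(selected, axis=0)
    U, s, _ = np.linalg.svd(avg)
    return U[:, :d], s                                   # estimated directions, importances
```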
With the increase in the computational power of available hardware, the demand for high-resolution data in computer graphics applications increases. Consequently, classical geometry processing techniques based on linear algebra solutions are starting to become obsolete. In this setting, we propose a novel approach for tackling mesh deformation tasks on high-resolution meshes. By reducing the input size with a fast remeshing technique and preserving a consistent representation of the original mesh with local reference frames, we provide a solution that is both scalable and robust in multiple applications, such as as-rigid-as-possible deformations, non-rigid isometric transformations, and pose transfer tasks. We extensively test our technique and compare it against state-of-the-art methods, showing that our approach can handle meshes with hundreds of thousands of vertices in tens of seconds while still achieving results comparable with those of the other solutions.
Nonparametric procedures are more powerful for detecting interaction in two-way ANOVA when the data are non-normal. In this paper, we compute null critical values for the aligned rank-based tests (APCSSA/APCSSM) for designs in which the factors have between 2 and 6 levels. We compare the performance of these new procedures with the ANOVA F-test for interaction, the adjusted rank transform test (ART), Conover's rank transform procedure (RT), and a rank-based ANOVA test (raov) using Monte Carlo simulations. The new procedures APCSSA/APCSSM are comparable with their existing competitors in all settings. Even though there is no single dominant test for detecting interaction effects in non-normal data, the nonparametric procedure APCSSM is the most highly recommended in settings with Cauchy errors.
A Gaussian process is proposed as a model for the posterior distribution of the local predictive ability of a model or expert, conditional on a vector of covariates, from historical predictions in the form of log predictive scores. Assuming Gaussian expert predictions and a Gaussian data generating process, a linear transformation of the predictive score follows a noncentral chi-squared distribution with one degree of freedom. Motivated by this we develop a noncentral chi-squared Gaussian process regression to flexibly model local predictive ability, with the posterior distribution of the latent GP function and kernel hyperparameters sampled by Hamiltonian Monte Carlo. We show that a cube-root transformation of the log scores is approximately Gaussian with homoscedastic variance, making it possible to estimate the model much faster by marginalizing the latent GP function analytically. A multi-output Gaussian process regression is also introduced to model the dependence in predictive ability between experts, both for inference and prediction purposes. Linear pools based on learned local predictive ability are applied to predict daily bike usage in Washington DC.
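As a minimal sketch of the fast variant just described (fitting an ordinary GP to the cube-root transformed log scores so that the latent function can be marginalized analytically), assuming covariates X, a vector log_scores of historical log predictive scores, and an RBF-plus-noise kernel as a placeholder choice:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def fit_local_predictive_ability(X, log_scores):
    """Sketch: model the cube-root transformed log predictive scores
    (approximately Gaussian with homoscedastic noise) with a standard GP
    regression, so the latent function is marginalized analytically."""
    z = np.cbrt(log_scores)                    # cube-root transform of the log scores
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gp.fit(X, z)
    return gp                                  # gp.predict(X_new, return_std=True) gives local ability
```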
Modeling the complex relationships between multiple categorical response variables as a function of predictors is a fundamental task in the analysis of categorical data. However, existing methods can be difficult to interpret and may lack flexibility. To address these challenges, we introduce a penalized likelihood method for multivariate categorical response regression that relies on a novel subspace decomposition to parameterize interpretable association structures. Our approach models the relationships between categorical responses by identifying mutual, joint, and conditionally independent associations, which yields a linear problem within a tensor product space. We establish theoretical guarantees for our estimator, including error bounds in high-dimensional settings, and demonstrate the method's interpretability and prediction accuracy through comprehensive simulation studies.
Animal vocalization denoising is a task similar to human speech enhancement, a well-studied field of research. In contrast to the latter, it is applied to a higher diversity of sound production mechanisms and recording environments, and this higher diversity is a challenge for existing models. Adding to the challenge, and in contrast to speech, we lack large and diverse datasets of clean vocalizations. As a solution, we use pseudo-clean targets, i.e. pre-denoised vocalizations, together with segments of background noise without a vocalization, as training data. We propose a training set derived from bioacoustics datasets and repositories representing diverse species, acoustic environments, and geographic regions. Additionally, we introduce a non-overlapping benchmark set comprising clean vocalizations from different taxa as well as noise samples. We show that denoising models (demucs, CleanUNet) trained on pseudo-clean targets obtained with speech enhancement models achieve competitive results on the benchmark set. We publish data, code, libraries, and demos at //mariusmiron.com/research/biodenoising.
To obtain reliable results from an expert assessment, which usually relies on individual and group expert pairwise comparisons, it is important to summarize (aggregate) the expert estimates, provided that they are sufficiently consistent. There are several ways to determine the threshold level of consistency sufficient for aggregating the estimates. They can be used with different consistency indices, but none of them relates the threshold value to the required reliability of the assessment results. Therefore, a new approach to determining this consistency threshold is needed. The proposed approach is based on simulation modeling of expert pairwise comparisons and a targeted search for the most inconsistent of the modeled pairwise comparison matrices. Specifically, the search for the least consistent matrix is carried out for a given perturbation of a perfectly consistent matrix. This makes it possible to determine the consistency threshold corresponding to a given permissible relative deviation of the resulting weight of an alternative from its hypothetical reference value.
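A simplified Monte Carlo stand-in for this procedure is sketched below; the actual approach performs a targeted search for the worst case rather than random sampling, and the function and parameter names are illustrative only.

```python
import numpy as np

def consistency_threshold(w, perturbation, permissible_dev, n_sim=5000, seed=0):
    """Sketch: perturb the perfectly consistent pairwise comparison matrix
    built from reference weights w and return the largest Saaty consistency
    index among simulated matrices whose principal-eigenvector weights stay
    within the permissible relative deviation from w."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float)
    w /= w.sum()
    n = len(w)
    threshold = 0.0
    for _ in range(n_sim):
        A = np.ones((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                # multiplicative perturbation of the consistent entry w_i / w_j
                A[i, j] = (w[i] / w[j]) * rng.uniform(1.0 / perturbation, perturbation)
                A[j, i] = 1.0 / A[i, j]
        eigvals, eigvecs = np.linalg.eig(A)
        k = np.argmax(eigvals.real)
        v = np.abs(eigvecs[:, k].real)
        v /= v.sum()
        ci = (eigvals[k].real - n) / (n - 1)     # Saaty consistency index
        dev = np.max(np.abs(v - w) / w)          # relative deviation of resulting weights
        if dev <= permissible_dev and ci > threshold:
            threshold = ci
    return threshold
```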
Researchers have long run regressions of an outcome variable (Y) on a treatment (D) and covariates (X) to estimate treatment effects. Even absent unobserved confounding, the regression coefficient on D in this setup reports a conditional-variance-weighted average of strata-wise average effects, not generally equal to the average treatment effect (ATE). Numerous proposals have been offered to cope with this "weighting problem", including interpretational tools to help characterize the weights and diagnostic aids to help researchers assess the potential severity of this problem. We make two contributions that together suggest an alternative direction for researchers and this literature. Our first contribution is conceptual, demystifying these weights. Simply put, under heterogeneous treatment effects (and varying probability of treatment), the linear regression of Y on D and X will be misspecified. The "weights" of regression offer one characterization of this coefficient that helps to clarify how it departs from the ATE. We also derive a more general expression for the weights than what is usually referenced. Our second contribution is practical: because these weights simply characterize misspecification bias, we suggest avoiding them altogether through approaches that tolerate heterogeneous effects. A wide range of longstanding alternatives (regression-imputation/g-computation, interacted regression, and balancing weights) relax specification assumptions to allow heterogeneous effects. We make explicit the assumption of "separate linearity", under which each potential outcome is separately linear in X. This relaxation of conventional linearity offers a common justification for all of these methods and avoids the weighting problem, at an efficiency cost that will be small when there are few covariates relative to sample size.
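As a concrete illustration of one of the longstanding alternatives named above, the sketch below implements a standard regression-imputation (g-computation) estimator under the "separate linearity" assumption; it is a generic illustration rather than the paper's own code, and all names are placeholders.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def ate_regression_imputation(y, d, X):
    """Sketch of regression imputation / g-computation under separate
    linearity: fit a separate linear model of Y on X within each treatment
    arm, impute both potential outcomes for every unit, and average the
    difference. Numerically equivalent to a fully interacted regression
    with covariates centered at their sample means."""
    X, y = np.asarray(X), np.asarray(y)
    d = np.asarray(d).astype(bool)
    mu1 = LinearRegression().fit(X[d], y[d])         # E[Y(1) | X] under linearity
    mu0 = LinearRegression().fit(X[~d], y[~d])       # E[Y(0) | X] under linearity
    return np.mean(mu1.predict(X) - mu0.predict(X))  # imputed ATE estimate
```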