In this paper, we investigate score function-based tests to check the significance of an ultrahigh-dimensional sub-vector of the model coefficients when the nuisance parameter vector is also ultrahigh-dimensional in linear models. We first reanalyze and extend a recently proposed score function-based test to derive, under weaker conditions, its limiting distributions under the null and local alternative hypotheses. As this test may fail when the correlation between the testing covariates and the nuisance covariates is high, we propose an orthogonalized score function-based test with two merits: it debiases the statistic so that the non-degenerate error term becomes degenerate, and it reduces the asymptotic variance to enhance power. Simulations evaluate the finite-sample performance of the proposed tests, and a real data analysis illustrates their application.
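As a point of reference, below is a minimal, hypothetical sketch of a (non-orthogonalized) score-type statistic for testing H0: beta_T = 0 in a partitioned linear model, with the nuisance block fitted by a lasso. The names `X_test` and `X_nuis` and the sum-of-squares aggregation are illustrative assumptions; the paper's actual statistics, limiting theory, and orthogonalization step are considerably more involved.

```python
# Hypothetical sketch of a score-type statistic for H0: beta_T = 0 in
# y = X_test beta_T + X_nuis beta_N + eps, with an ultrahigh-dimensional
# nuisance block. Not the paper's test; a simplified illustration only.
import numpy as np
from sklearn.linear_model import Lasso

def score_test_stat(y, X_test, X_nuis, lam=0.1):
    # Step 1: fit the nuisance part y ~ X_nuis with a lasso (illustrative
    # choice of high-dimensional nuisance estimator).
    fit = Lasso(alpha=lam).fit(X_nuis, y)
    resid = y - fit.predict(X_nuis)        # approximate residuals under H0
    # Step 2: score vector of the testing block evaluated under H0.
    score = X_test.T @ resid / len(y)
    # Step 3: sum-of-squares aggregation; large values suggest rejecting H0.
    return float(score @ score)
```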
Hidden Markov models (HMMs) are a versatile statistical framework commonly used in ecology to characterize behavioural patterns from animal movement data. In HMMs, the observed data depend on a finite number of underlying hidden states, generally interpreted as the animal's unobserved behaviour. The number of states is a crucial parameter, controlling the trade-off between ecological interpretability of behaviours (fewer states) and the goodness of fit of the model (more states). Selecting the number of states, commonly referred to as order selection, is notoriously challenging. Common model selection metrics, such as AIC and BIC, often perform poorly in determining the number of states, particularly when models are misspecified. Building on existing methods for HMMs and mixture models, we propose a double penalized maximum likelihood estimator (DPMLE) for the simultaneous estimation of the number of states and the parameters of non-stationary HMMs. The DPMLE differs from traditional information criteria by using two penalty functions, one on the stationary probabilities and one on the state-dependent parameters. For non-stationary HMMs, forward and backward probabilities are used to approximate the stationary probabilities. Using a simulation study that includes scenarios with additional complexity in the data, we compare the performance of our method with that of AIC and BIC. We also illustrate how the DPMLE differs from AIC and BIC using narwhal (Monodon monoceros) movement data. The proposed method outperformed AIC and BIC in identifying the correct number of states under model misspecification. Furthermore, its capacity to handle non-stationary dynamics allowed for more realistic modeling of complex movement data, offering deeper insights into narwhal behaviour. Our method is a powerful tool for order selection in non-stationary HMMs, with potential applications extending beyond the field of ecology.
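To illustrate the double-penalty idea, the sketch below combines a Gaussian-HMM forward-algorithm log-likelihood with two penalties: one on the stationary probabilities and one on gaps between state-dependent means. The specific penalty forms (a log penalty, and an L1 term standing in for a SCAD-type merging penalty) and all names are assumptions for exposition, not the paper's DPMLE.

```python
# Illustrative double-penalized log-likelihood for a Gaussian HMM.
# Penalty forms are placeholders, not the DPMLE's actual penalties.
import numpy as np
from scipy.stats import norm

def stationary_dist(Gamma):
    # Solve delta = delta @ Gamma subject to sum(delta) = 1.
    K = Gamma.shape[0]
    A = np.vstack([Gamma.T - np.eye(K), np.ones(K)])
    b = np.append(np.zeros(K), 1.0)
    return np.linalg.lstsq(A, b, rcond=None)[0]

def penalized_loglik(obs, pi, Gamma, mu, sigma, lam1=1.0, lam2=1.0):
    # Scaled forward algorithm for the log-likelihood.
    probs = norm.pdf(obs[:, None], mu, sigma)     # (T, K) emission densities
    alpha = pi * probs[0]
    ll = np.log(alpha.sum()); alpha = alpha / alpha.sum()
    for t in range(1, len(obs)):
        alpha = (alpha @ Gamma) * probs[t]
        c = alpha.sum(); ll += np.log(c); alpha = alpha / c
    # Penalty 1 discourages spurious states with tiny stationary probability;
    # penalty 2 shrinks gaps between adjacent state means, merging duplicates.
    delta = stationary_dist(Gamma)
    pen1 = -np.sum(np.log(np.clip(delta, 1e-10, None)))
    pen2 = np.sum(np.abs(np.diff(np.sort(mu))))
    return ll - lam1 * pen1 - lam2 * pen2
```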
Harnessing the potential computational advantage of quantum computers for machine learning tasks relies on the uploading of classical data onto quantum computers through what are commonly referred to as quantum encodings. The choice of such encodings may vary substantially from one task to another, and there exist only a few cases where structure has provided insight into their design and implementation, such as symmetry in geometric quantum learning. Here, we propose the perspective that category theory offers a natural mathematical framework for analyzing encodings that respect structure inherent in datasets and learning tasks. We illustrate this with pedagogical examples, which include geometric quantum machine learning, quantum metric learning, topological data analysis, and more. Moreover, our perspective provides a language in which to ask meaningful and mathematically precise questions for the design of quantum encodings and circuits for quantum machine learning tasks.
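For readers unfamiliar with the term, a standard textbook example of a quantum encoding (not one proposed in this work) is the angle encoding, which uploads a classical vector through single-qubit rotations:

\[
x = (x_1,\dots,x_n) \in \mathbb{R}^n
\;\longmapsto\;
|\psi(x)\rangle = \bigotimes_{i=1}^{n} R_Y(x_i)\,|0\rangle,
\qquad
R_Y(\theta) = \begin{pmatrix} \cos\tfrac{\theta}{2} & -\sin\tfrac{\theta}{2} \\[2pt] \sin\tfrac{\theta}{2} & \cos\tfrac{\theta}{2} \end{pmatrix}.
\]

Structure-respecting encodings of the kind analyzed here constrain how such maps interact with the structure (for instance, symmetries) of the data and the task.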
Sparse matrices have recently played a significant role in scientific computing, including artificial intelligence-related fields. Previous studies of sparse matrix--vector multiplication (SpMV) have shown that Krylov subspace methods are particularly sensitive to the effects of round-off errors in floating-point arithmetic. Multiple-precision linear computation can stabilize convergence by reducing these round-off errors. In this paper, we present the performance of our SpMV implementation accelerated with SIMD instructions, demonstrating its effectiveness on various examples, including Krylov subspace methods.
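For concreteness, the kernel in question is the CSR-format SpMV loop sketched below, written in plain Python for clarity; the implementation reported above uses SIMD intrinsics and multiple-precision arithmetic rather than double-precision NumPy.

```python
# Minimal sketch of SpMV in CSR (compressed sparse row) format.
# val:     nonzero values, row by row
# col_idx: column index of each nonzero
# row_ptr: row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i
import numpy as np

def spmv_csr(val, col_idx, row_ptr, x):
    n = len(row_ptr) - 1
    y = np.zeros(n)
    for i in range(n):                        # one dot product per row
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += val[k] * x[col_idx[k]]    # inner loop: the SIMD target
    return y
```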
High-quality representation of transactional sequences is vital for modern banking applications, including risk management, churn prediction, and personalized customer offers. Different tasks require distinct representation properties: local tasks benefit from capturing the client's current state, while global tasks rely on general behavioral patterns. Previous research has demonstrated that various self-supervised approaches yield representations that better capture either global or local qualities. This study investigates the integration of two self-supervised learning techniques: instance-wise contrastive learning and a generative approach based on restoring masked events in latent space. The combined approach creates representations that balance local and global characteristics of transactional data. Experiments conducted on several public datasets, focusing on sequence classification and next-event type prediction, show that the integrated method achieves superior performance compared to the individual approaches and demonstrates synergistic effects. These findings suggest that the proposed approach offers a robust framework for advancing event-sequence representation learning in the financial sector.
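A minimal sketch of how the two objectives can be combined is given below: an instance-wise contrastive (InfoNCE) loss over two augmented views plus a masked-event restoration loss in latent space. The encoder, the masking scheme, and the mixing weight `alpha` are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of a combined self-supervised loss: contrastive + generative.
import torch
import torch.nn.functional as F

def combined_loss(z1, z2, pred_latent, true_latent, alpha=0.5, tau=0.1):
    # Contrastive part: each sequence's two views should match each other
    # and mismatch every other sequence in the batch (InfoNCE).
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / tau                          # (B, B) similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    l_con = F.cross_entropy(logits, labels)
    # Generative part: restore latent embeddings of the masked events.
    l_gen = F.mse_loss(pred_latent, true_latent)
    return alpha * l_con + (1 - alpha) * l_gen
```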
Modern machine learning methods and the availability of large-scale data have significantly advanced our ability to predict target quantities from large sets of covariates. However, these methods often struggle under distributional shifts, particularly in the presence of hidden confounding. While the impact of hidden confounding is well studied in causal effect estimation, for example in instrumental variable settings, its implications for prediction tasks under shifting distributions remain underexplored. This work addresses this gap by introducing a strong notion of invariance that, unlike existing weaker notions, allows for distribution generalization even in the presence of nonlinear, non-identifiable structural functions. Central to this framework is the Boosted Control Function (BCF), a novel, identifiable target of inference that satisfies the proposed strong invariance notion and is provably worst-case optimal under distributional shifts. The theoretical foundation of our work lies in Simultaneous Equation Models for Distribution Generalization (SIMDGs), which bridge machine learning with econometrics by describing data-generating processes under distributional shifts. To put these insights into practice, we propose the ControlTwicing algorithm to estimate the BCF using flexible machine-learning techniques, and we demonstrate its generalization performance on synthetic and real-world datasets in comparison with traditional empirical risk minimization approaches.
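To fix ideas, the sketch below shows a classical linear control-function fit, a much-simplified relative of the BCF / ControlTwicing procedure introduced above. The variable names, the linear models, and the role of `Z` as exogenous variables are illustrative assumptions.

```python
# Sketch of a classical two-stage control-function fit (linear case).
# Not the BCF estimator; a simplified relative for intuition only.
import numpy as np
from sklearn.linear_model import LinearRegression

def control_function_fit(Z, X, Y):
    # Stage 1: regress covariates X (2D array) on exogenous variables Z;
    # the residuals act as a proxy ("control") for the hidden confounder.
    stage1 = LinearRegression().fit(Z, X)
    control = X - stage1.predict(Z)
    # Stage 2: predict Y from the covariates plus the control term.
    stage2 = LinearRegression().fit(np.hstack([X, control]), Y)
    return stage1, stage2
```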
Modeling the propagation of cracks at the microscopic level is fundamental to understanding the effect of the microstructure on the fracture process. Nevertheless, microscopic propagation is often unstable, and phase-field fracture then suffers from poor convergence or, when staggered algorithms are used, from jumps in the evolution of the cracks. In this work, a novel method is proposed to perform micromechanical simulations with phase-field fracture by imposing monotonic increases of crack length; it allows the use of monolithic implementations and resolves all the snap-backs during the unstable propagation phases. The method is derived for FFT-based solvers in order to exploit their very high numerical performance in micromechanical problems, but an equivalent method is also developed for Finite Elements (FE), showing the equivalence of both implementations. It is shown that, in stable propagation regimes, the stress-strain curves and crack paths obtained using the crack control method coincide with those obtained using strain control with a staggered scheme. J-integral calculations confirm that, during propagation with the crack control method, the energy release rate remains constant and equal to an effective fracture energy, which has been determined as a function of the discretization for FFT simulations. Finally, to show the potential of the method, the technique is applied to simulate crack propagation through the microstructure of composites and porous materials, providing an estimation of the effective fracture toughness.
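In symbols, the consistency check reported above is that the energy release rate, computed via the standard 2D J-integral, stays at the effective fracture energy throughout propagation (standard fracture-mechanics notation, assumed here: \(W\) the strain energy density, \(\Gamma\) a contour around the crack tip with outward normal \(n\), \(\sigma\) the stress, and \(u\) the displacement):

\[
G \;=\; J \;=\; \int_{\Gamma}\Bigl(W\,n_1 \;-\; \sigma_{ij}\,n_j\,\frac{\partial u_i}{\partial x_1}\Bigr)\,\mathrm{d}s
\;=\; G_c^{\mathrm{eff}},
\]

where \(G_c^{\mathrm{eff}}\) is the discretization-dependent effective fracture energy mentioned in the abstract.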
In this paper we analyze a nonconforming virtual element method to approximate the eigenfunctions and eigenvalues of the two-dimensional Oseen eigenvalue problem. The spaces under consideration lead to a divergence-free method that properly captures the divergence at the discrete level, together with the eigenvalues and eigenfunctions. Using the theory of compact operators, we prove convergence and error estimates for the method and recover the double order of convergence of the spectrum. Finally, we present numerical tests to assess the performance of the proposed numerical scheme.
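For reference, a standard strong form of the Oseen eigenvalue problem reads as follows (the notation is assumed, not taken from the paper: \(\nu\) is the viscosity and \(\boldsymbol{\beta}\) a given convective velocity field): find \(\lambda \in \mathbb{R}\) and \((\boldsymbol{u},p) \neq (\boldsymbol{0},0)\) such that

\[
\begin{aligned}
-\nu\,\Delta\boldsymbol{u} + (\boldsymbol{\beta}\cdot\nabla)\boldsymbol{u} + \nabla p &= \lambda\,\boldsymbol{u} && \text{in } \Omega,\\
\operatorname{div}\boldsymbol{u} &= 0 && \text{in } \Omega,\\
\boldsymbol{u} &= \boldsymbol{0} && \text{on } \partial\Omega.
\end{aligned}
\]

The divergence-free property of the discrete spaces refers to the second equation holding at the discrete level.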
One strategy to detect the pose and shape of unknown objects is geometric modeling, which consists of fitting known geometric entities. Classical geometric modeling fits simple shapes such as spheres or cylinders, but these often do not cover the variety of shapes that can be encountered. In such situations, one solution is the use of superquadrics, which can adapt to a wider variety of shapes. One limitation of superquadrics is that they cannot model objects with holes, such as those with handles. This work aims to fit supersurfaces of degree four, in particular supertoroids, to objects with a single hole. Following the results for superquadrics, simple expressions for the major and minor radial distances are derived, which lead to the fitting of the intrinsic and extrinsic parameters of the supertoroid. The differential geometry of the surface is also studied as a function of these parameters. The result is a supergeometric modeling approach that can be used for symmetric objects with and without holes, with a simple distance function for the fitting. The proposed algorithm considerably expands the range of shapes that can be targeted for geometric modeling.
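For reference, the standard (Barr) supertoroid inside-outside function, which underlies fits of this kind, is

\[
F(x,y,z)=\left(\left(\left(\frac{x}{a_1}\right)^{2/\varepsilon_2}+\left(\frac{y}{a_2}\right)^{2/\varepsilon_2}\right)^{\varepsilon_2/2}-a_4\right)^{2/\varepsilon_1}+\left(\frac{z}{a_3}\right)^{2/\varepsilon_1},
\]

with \(F = 1\) on the surface, \(a_1, a_2, a_3\) scale parameters, \(a_4\) the (scaled) radius of the hole, and \(\varepsilon_1, \varepsilon_2\) shape exponents; the paper's radial-distance expressions and fitting formulation build on parameters of this type, though the exact parametrization used there may differ.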
In contemporary problems involving genetic or neuroimaging data, thousands of hypotheses need to be tested. Due to their high power and finite-sample guarantees on type-I error under weak assumptions, Monte-Carlo permutation tests are often considered the gold standard in these settings. However, the enormous computational effort required for (thousands of) permutation tests is a major burden. Recently, Fischer and Ramdas (2024) constructed a permutation test for a single hypothesis in which the permutations are drawn sequentially one-by-one and the testing process can be stopped at any point without inflating the type-I error. They showed that the number of permutations can be substantially reduced (under the null and the alternative) while the power remains similar. We show how their approach can be modified to suit a broad class of multiple testing procedures and, in particular, discuss its use with the Benjamini-Hochberg procedure. The resulting method provides valid error rate control and significantly outperforms all existing approaches in terms of power and/or required computational time. We provide fast implementations and illustrate the method's application on large synthetic and real datasets.
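To make the sequential idea concrete, the sketch below implements the classical Besag-Clifford stopping rule: draw permutations one at a time and stop early once the evidence is decisive. This is a simpler relative of the anytime-valid construction of Fischer and Ramdas (2024), not their method; `stat` and the constants are illustrative.

```python
# Sequential permutation p-value with early stopping (Besag-Clifford rule):
# stop as soon as h permuted statistics reach the observed one.
import numpy as np

def seq_perm_pvalue(stat, y, labels, rng, h=10, max_perm=10_000):
    t_obs = stat(y, labels)                    # observed test statistic
    exceed = 0
    for b in range(1, max_perm + 1):
        t_b = stat(y, rng.permutation(labels))
        exceed += (t_b >= t_obs)
        if exceed == h:                        # evidence the p-value is large
            return exceed / b                  # valid early-stopped p-value
    return (exceed + 1) / (max_perm + 1)       # standard permutation p-value

# Usage: p = seq_perm_pvalue(lambda y, g: abs(y[g == 1].mean() - y[g == 0].mean()),
#                            y, groups, np.random.default_rng(0))
```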
The use of non-probability data sources for statistical purposes has become increasingly popular in recent years, including in official statistics. However, statistical inference based on non-probability samples is made more difficult by their being biased and not representative of the target population. In this paper we propose the quantile balancing inverse probability weighting (QBIPW) estimator for non-probability samples. We use the idea of Harms and Duchesne (2006), which allows quantile information to be included in the estimation process so that known totals and distributions of auxiliary variables are reproduced. We discuss the estimation of the QBIPW probabilities and their variance. Our simulation study demonstrates that the proposed estimators are robust against model misspecification and, as a result, help to reduce bias and mean squared error. Finally, we apply the proposed methods to estimate the share of vacancies aimed at Ukrainian workers in Poland using an integrated set of administrative and survey data about job vacancies.
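The sketch below illustrates the quantile-balancing idea in a stripped-down form: augment the auxiliary variables with indicators 1(x <= known quantile) before fitting the selection model, so the resulting weights also track the known quantiles. The inverse-odds weighting, the logistic selection model, and all names are illustrative assumptions, not the QBIPW estimator itself.

```python
# Stripped-down illustration of quantile balancing for a non-probability
# sample (y_np, X_np) given a reference probability sample X_ref.
import numpy as np
from sklearn.linear_model import LogisticRegression

def qbipw_mean(y_np, X_np, X_ref, q_known=(0.25, 0.5, 0.75)):
    # Augment covariates with quantile indicators (Harms & Duchesne idea),
    # here using quantiles computed from the reference sample as "known".
    qs = np.quantile(X_ref, q_known, axis=0)
    aug = lambda X: np.hstack([X] + [(X <= q) * 1.0 for q in qs])
    Z_np, Z_ref = aug(X_np), aug(X_ref)
    # Selection model: non-probability membership vs. reference sample.
    Z = np.vstack([Z_np, Z_ref])
    s = np.r_[np.ones(len(Z_np)), np.zeros(len(Z_ref))]
    p = LogisticRegression(max_iter=1000).fit(Z, s).predict_proba(Z_np)[:, 1]
    w = (1 - p) / p                        # inverse-odds weights
    return np.sum(w * y_np) / np.sum(w)    # weighted (Hajek-type) mean
```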