This paper explores the application of machine learning techniques to predict where hedging occurs in peer-tutoring interactions. The study uses a naturalistic face-to-face dataset annotated for natural language turns, conversational strategies, tutoring strategies, and nonverbal behaviours. These elements are processed into a vector representation of the previous turns, which serves as input to several machine learning models. Results show that embedding layers, that capture the semantic information of the previous turns, significantly improves the model's performance. Additionally, the study provides insights into the importance of various features, such as interpersonal rapport and nonverbal behaviours, in predicting hedges by using Shapley values for feature explanation. We discover that the eye gaze of both the tutor and the tutee has a significant impact on hedge prediction. We further validate this observation through a follow-up ablation study.
Parametric mathematical models such as parameterizations of partial differential equations with random coefficients have received a lot of attention within the field of uncertainty quantification. The model uncertainties are often represented via a series expansion in terms of the parametric variables. In practice, this series expansion needs to be truncated to a finite number of terms, introducing a dimension truncation error to the numerical simulation of a parametric mathematical model. There have been several studies of the dimension truncation error corresponding to different models of the input random field in recent years, but many of these analyses have been carried out within the context of numerical integration. In this paper, we study the $L^2$ dimension truncation error of the parametric model problem. Estimates of this kind arise in the assessment of the dimension truncation error for function approximation in high dimensions. In addition, we show that the dimension truncation error rate is invariant with respect to certain transformations of the parametric variables. Numerical results are presented which showcase the sharpness of the theoretical results.
Quantum neural networks (QNNs) and quantum kernels stand as prominent figures in the realm of quantum machine learning, poised to leverage the nascent capabilities of near-term quantum computers to surmount classical machine learning challenges. Nonetheless, the training efficiency challenge poses a limitation on both QNNs and quantum kernels, curbing their efficacy when applied to extensive datasets. To confront this concern, we present a unified approach: coreset selection, aimed at expediting the training of QNNs and quantum kernels by distilling a judicious subset from the original training dataset. Furthermore, we analyze the generalization error bounds of QNNs and quantum kernels when trained on such coresets, unveiling the comparable performance with those training on the complete original dataset. Through systematic numerical simulations, we illuminate the potential of coreset selection in expediting tasks encompassing synthetic data classification, identification of quantum correlations, and quantum compiling. Our work offers a useful way to improve diverse quantum machine learning models with a theoretical guarantee while reducing the training cost.
We present a training method with linguistic speech regularization that improves the robustness of spontaneous speech synthesis methods with filled pause (FP) insertion. Spontaneous speech synthesis is aimed at producing speech with human-like disfluencies, such as FPs. Because modeling the complex data distribution of spontaneous speech with a rich FP vocabulary is challenging, the quality of FP-inserted synthetic speech is often limited. To address this issue, we present a method for synthesizing spontaneous speech that improves robustness to diverse FP insertions. Regularization is used to stabilize the synthesis of the linguistic speech (i.e., non-FP) elements. To further improve robustness to diverse FP insertions, it utilizes pseudo-FPs sampled using an FP word prediction model as well as ground-truth FPs. Our experiments demonstrated that the proposed method improves the naturalness of synthetic speech with ground-truth and predicted FPs by 0.24 and 0.26, respectively.
We present a robust deep incremental learning framework for regression tasks on financial temporal tabular datasets which is built upon the incremental use of commonly available tabular and time series prediction models to adapt to distributional shifts typical of financial datasets. The framework uses a simple basic building block (decision trees) to build self-similar models of any required complexity to deliver robust performance under adverse situations such as regime changes, fat-tailed distributions, and low signal-to-noise ratios. As a detailed study, we demonstrate our scheme using XGBoost models trained on the Numerai dataset and show that a two layer deep ensemble of XGBoost models over different model snapshots delivers high quality predictions under different market regimes. We also show that the performance of XGBoost models with different number of boosting rounds in three scenarios (small, standard and large) is monotonically increasing with respect to model size and converges towards the generalisation upper bound. We also evaluate the robustness of the model under variability of different hyperparameters, such as model complexity and data sampling settings. Our model has low hardware requirements as no specialised neural architectures are used and each base model can be independently trained in parallel.
Hierarchical matrices approximate a given matrix by a decomposition into low-rank submatrices that can be handled efficiently in factorized form. $\mathcal{H}^2$-matrices refine this representation following the ideas of fast multipole methods in order to achieve linear, i.e., optimal complexity for a variety of important algorithms. The matrix multiplication, a key component of many more advanced numerical algorithms, has so far proven tricky: the only linear-time algorithms known so far either require the very special structure of HSS-matrices or need to know a suitable basis for all submatrices in advance. In this article, a new and fairly general algorithm for multiplying $\mathcal{H}^2$-matrices in linear complexity with adaptively constructed bases is presented. The algorithm consists of two phases: first an intermediate representation with a generalized block structure is constructed, then this representation is re-compressed in order to match the structure prescribed by the application. The complexity and accuracy are analysed and numerical experiments indicate that the new algorithm can indeed be significantly faster than previous attempts.
We discuss techniques of estimation and inference for nonlinear cohort panels with learning from experience, showing, inter alia, the consistency and asymptotic normality of the nonlinear least squares estimator employed in the seminal paper by Malmendier and Nagel (2016). Potential pitfalls for hypothesis testing are identified and solutions proposed. Monte Carlo simulations verify the properties of the estimator and corresponding test statistics in finite samples, while an application to a panel of survey expectations demonstrates the usefulness of the theory developed.
This paper introduces an innovative method for conducting conditional independence testing in high-dimensional data, facilitating the automated discovery of significant associations within distinct subgroups of a population, all while controlling the false discovery rate. This is achieved by expanding upon the model-X knockoff filter to provide more informative inferences. Our enhanced inferences can help explain sample heterogeneity and uncover interactions, making better use of the capabilities offered by modern machine learning models. Specifically, our method is able to leverage any model for the identification of data-driven hypotheses pertaining to interesting population subgroups. Then, it rigorously test these hypotheses without succumbing to selection bias. Importantly, our approach is efficient and does not require sample splitting. We demonstrate the effectiveness of our method through simulations and numerical experiments, using data derived from a randomized experiment featuring multiple treatment variables.
Solving multiphysics-based inverse problems for geological carbon storage monitoring can be challenging when multimodal time-lapse data are expensive to collect and costly to simulate numerically. We overcome these challenges by combining computationally cheap learned surrogates with learned constraints. Not only does this combination lead to vastly improved inversions for the important fluid-flow property, permeability, it also provides a natural platform for inverting multimodal data including well measurements and active-source time-lapse seismic data. By adding a learned constraint, we arrive at a computationally feasible inversion approach that remains accurate. This is accomplished by including a trained deep neural network, known as a normalizing flow, which forces the model iterates to remain in-distribution, thereby safeguarding the accuracy of trained Fourier neural operators that act as surrogates for the computationally expensive multiphase flow simulations involving partial differential equation solves. By means of carefully selected experiments, centered around the problem of geological carbon storage, we demonstrate the efficacy of the proposed constrained optimization method on two different data modalities, namely time-lapse well and time-lapse seismic data. While permeability inversions from both these two modalities have their pluses and minuses, their joint inversion benefits from either, yielding valuable superior permeability inversions and CO2 plume predictions near, and far away, from the monitoring wells.
This paper addresses the problem of end-effector formation control for a mixed group of two-link manipulators moving in a horizontal plane that comprises of fully-actuated manipulators and underactuated manipulators with only the second joint being actuated (referred to as the passive-active (PA) manipulators). The problem is solved by extending the distributed end-effector formation controller for the fully-actuated manipulator to the PA manipulator moving in a horizontal plane by using its integrability. This paper presents stability analysis of the closed-loop systems under a given necessary condition, and we prove that the manipulators' end-effector converge to the desired formation shape. The proposed method is validated by simulations.
When modelling discontinuities (interfaces) using the finite element method, the standard approach is to use a conforming finite-element mesh in which the mesh matches the interfaces. However, this approach can prove cumbersome if the geometry is complex, in particular in 3D. In this work, we develop an efficient technique for a non-conforming finite-element treatment of weak discontinuities by using laminated microstructures. The approach is inspired by the so-called composite voxel technique that has been developed for FFT-based spectral solvers in computational homogenization. The idea behind the method is rather simple. Each finite element that is cut by an interface is treated as a simple laminate with the volume fraction of the phases and the lamination orientation determined in terms of the actual geometrical arrangement of the interface within the element. The approach is illustrated by several computational examples relevant to the micromechanics of heterogeneous materials. Elastic and elastic-plastic materials at small and finite strain are considered in the examples. The performance of the proposed method is compared to two alternative, simple methods showing that the new approach is in most cases superior to them while maintaining the simplicity.