Nonlinear systems arising from time integrators like Backward Euler can sometimes be reformulated as optimization problems, known as incremental potentials. We show through a comprehensive experimental analysis that the widely used Projected Newton method, which relies on unconditional semidefinite projection of Hessian contributions, typically exhibits a reduced convergence rate compared to classical Newton's method. We demonstrate how factors like resolution, element order, projection method, material model, and boundary handling impact the convergence of Projected Newton and Newton. Drawing on these findings, we propose a hybrid method, Project-on-Demand Newton, which projects only conditionally, and show that it enjoys both the robustness of Projected Newton and the convergence rate of Newton. We additionally introduce Kinetic Newton, a regularization-based method that takes advantage of the structure of incremental potentials and avoids projection altogether. We compare the four solvers on hyperelasticity and contact problems. We also present a nuanced discussion of convergence criteria and propose a new acceleration-based criterion that avoids the problems associated with existing residual-norm criteria and is easier to interpret. Finally, we address a fundamental limitation of the Armijo backtracking line search that occasionally blocks convergence, especially for stiff problems, and propose a novel parameter-free, robust line search technique that eliminates this issue.
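To make the distinction between the solver variants concrete, the following minimal sketch (not the paper's implementation; element Hessians are kept as dense full-size blocks, and a failed Cholesky factorization serves as a hypothetical "on-demand" trigger) contrasts classical Newton, unconditionally projected Newton, and a conditional fallback. It assumes the projected assembled Hessian is positive definite, as the inertia term of an incremental potential typically ensures.

```python
import numpy as np

def project_psd(H):
    """Clamp negative eigenvalues of a symmetric matrix to zero."""
    w, V = np.linalg.eigh(H)
    return (V * np.maximum(w, 0.0)) @ V.T

def newton_direction(element_hessians, grad, mode="on_demand"):
    """Newton-type direction from dense per-element Hessian blocks.

    mode = "newton":           never project (classical Newton)
    mode = "projected_newton": always project every element block
    mode = "on_demand":        try the unmodified Hessian first; fall back to
                               projection only if it is not positive definite
                               (detected by a failed Cholesky factorization).
    """
    def solve(H):
        L = np.linalg.cholesky(H)               # raises LinAlgError if H is not SPD
        return np.linalg.solve(L.T, np.linalg.solve(L, -grad))

    if mode == "projected_newton":
        return solve(sum(project_psd(He) for He in element_hessians))
    try:
        return solve(sum(element_hessians))
    except np.linalg.LinAlgError:
        if mode == "newton":
            raise                                # classical Newton has no fallback
        return solve(sum(project_psd(He) for He in element_hessians))
```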
Clinical neuroimaging data is naturally hierarchical: different magnetic resonance imaging (MRI) sequences within a series, different slices covering the head, and different regions within each slice all carry different information. In this work we present a hierarchical attention network for abnormality detection using MRI scans obtained in a clinical hospital setting. The proposed network is suitable for non-volumetric data (i.e., stacks of high-resolution MRI slices) and can be trained from binary examination-level labels. We show that this hierarchical approach leads to improved classification while providing interpretability, either through coarse inter- and intra-slice abnormality localisation or through importance scores for different slices and sequences, making our model suitable for use as an automated triaging system in radiology departments.
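A minimal PyTorch sketch of the kind of hierarchical attention pooling described above (illustrative only; layer sizes, the two pooling levels, and the per-slice feature extractor are assumptions, not the paper's exact architecture): slice features are attention-pooled into a sequence embedding, sequence embeddings into an examination embedding, and the attention weights double as the importance scores mentioned in the abstract.

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Attention-weighted average over a set of feature vectors."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, x):                         # x: (n_items, dim)
        a = torch.softmax(self.score(x), dim=0)   # (n_items, 1) importance scores
        return (a * x).sum(dim=0), a              # pooled vector and the weights

class HierarchicalClassifier(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.slice_pool = AttentionPool(dim)      # pools slices within a sequence
        self.seq_pool = AttentionPool(dim)        # pools sequences within an exam
        self.head = nn.Linear(dim, 1)

    def forward(self, exam):                      # exam: list of (n_slices, dim) tensors
        seq_embs = torch.stack([self.slice_pool(s)[0] for s in exam])
        exam_emb, seq_weights = self.seq_pool(seq_embs)
        return torch.sigmoid(self.head(exam_emb)), seq_weights
```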
Despite the recent availability of large transcribed Kinyarwanda speech data, achieving robust speech recognition for Kinyarwanda remains challenging. In this work, we show that self-supervised pre-training, a simple curriculum schedule during fine-tuning, and semi-supervised learning that leverages large unlabelled speech data significantly improve speech recognition performance for Kinyarwanda. Our approach uses public domain data only. A new studio-quality speech dataset is collected from a public website and used to train a clean baseline model. The clean baseline model is then used to rank examples from a more diverse and noisy public dataset, defining a simple curriculum training schedule. Finally, we apply semi-supervised learning to label and learn from large unlabelled data in four successive generations. Our final model achieves 3.2% word error rate (WER) on the new dataset and 15.9% WER on the Mozilla Common Voice benchmark, which, to the best of our knowledge, is state-of-the-art. Our experiments also indicate that syllabic rather than character-based tokenization yields better speech recognition performance for Kinyarwanda.
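A small sketch of the ranking step behind the curriculum (simplified; the exact scoring signal and schedule may differ from the paper's): the clean baseline model transcribes each noisy utterance, the transcript is scored against its reference with WER, and training examples are ordered from easiest to hardest.

```python
import jiwer  # word error rate utilities

def build_curriculum(noisy_examples, baseline_transcribe):
    """Order (audio, reference_text) pairs from easiest to hardest.

    noisy_examples:      list of (audio, reference_text) pairs
    baseline_transcribe: callable mapping audio to a hypothesis string,
                         here the clean baseline ASR model
    """
    scored = []
    for audio, reference in noisy_examples:
        hypothesis = baseline_transcribe(audio)
        scored.append((jiwer.wer(reference, hypothesis), audio, reference))
    scored.sort(key=lambda item: item[0])        # lowest WER (easiest) first
    return [(audio, reference) for _, audio, reference in scored]
```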
We tackle the challenging task of monitoring the performance of CFD applications on unstable HPC platforms throughout their development. We have designed and implemented a monitoring framework integrated at the end of a CI/CD pipeline. Measurements collected during the automatic execution of production simulations are analyzed within a visual analytics interface we developed, providing advanced visualizations and interactions. We have validated this approach by monitoring the CFD code Alya over two years, detecting and resolving issues related to the platform, and highlighting performance improvements.
Large Language Models (LLMs), with their flexible generation abilities, can be powerful data sources in domains with few or no available corpora. However, problems like hallucinations and biases limit such applications. In this case study, we pick nutrition counselling, a domain lacking any public resource, and show that high-quality datasets can be gathered by combining LLMs, crowd-workers and nutrition experts. We first crowd-source and cluster a novel dataset of diet-related issues, then work with experts to prompt ChatGPT into producing related supportive text. Finally, we let the experts evaluate the safety of the generated text. We release HAI-coaching, the first expert-annotated nutrition counselling dataset, containing ~2.4K dietary struggles from crowd workers and ~97K related supportive texts generated by ChatGPT. Extensive analysis shows that ChatGPT, while producing highly fluent and human-like text, also manifests harmful behaviours, especially on sensitive topics like mental health, making it unsuitable for unsupervised use.
With insurers benefiting from ever-larger amounts of increasingly complex data, we explore in this paper a data-driven method to model dependence within multilevel claims. More specifically, we start from the non-parametric estimator for Archimedean copula generators introduced by Genest and Rivest (1993) and extend it to diverse flexible censoring scenarios using techniques derived from survival analysis. We implement a graphical copula selection procedure that we validate using goodness-of-fit methods applied to complete, single-censored, and double-censored bivariate data. We illustrate the performance of our model with multiple simulation studies. We then apply our methodology to a recent Canadian automobile insurance dataset, where we seek to model the dependence between the activation delays of correlated coverages. We show that our model performs quite well in selecting the best-fitted copula for the data at hand, especially when the dataset is large, and that the results can then be used as part of a larger claims reserving methodology.
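For reference, a minimal sketch of the Genest and Rivest (1993) estimator for complete (uncensored) bivariate data; the censored extensions developed in the paper require survival-analysis machinery not shown here. The empirical Kendall distribution K_n(t) is built from pseudo-observations, and lambda_n(t) = t - K_n(t) can be plotted against the theoretical curves of candidate Archimedean families for graphical copula selection.

```python
import numpy as np

def kendall_distribution(x, y, t_grid):
    """Empirical estimate K_n(t) of K(t) = P(C(U, V) <= t) from complete data."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    # Pseudo-observations: W_i = #{j : x_j < x_i and y_j < y_i} / (n - 1)
    w = np.array([np.sum((x < x[i]) & (y < y[i])) for i in range(n)]) / (n - 1)
    return np.array([np.mean(w <= t) for t in t_grid])

t_grid = np.linspace(0.01, 1.0, 100)
# lambda_n(t) = t_grid - kendall_distribution(x, y, t_grid) is the quantity
# compared graphically against candidate Archimedean copula families.
```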
This report on axisymmetric ultraspherical/Gegenbauer polynomials and their use in Ambisonic directivity design in 2D and 3D presents an alternative mathematical formalism to what can be read in, e.g., my and Matthias Frank's book on Ambisonics, Jérôme Daniel's thesis, Gary Elko's differential array book chapters, or Boaz Rafaely's spherical microphone array book. Ultraspherical/Gegenbauer polynomials are highly valuable when designing axisymmetric beams and understanding spherical t-designs, and this report will shed some light on what circular, spherical, and ultraspherical axisymmetric polynomials are. While mathematically interesting in themselves, they are also useful in spherical beamforming as described in the literature on spherical and differential microphone arrays. In this report, these ultraspherical/Gegenbauer polynomials are used to uniformly derive, for arbitrary dimension D, the various directivity designs or Ambisonic order weightings known from the literature: max-DI/basic, max-rE, supercardioid, cardioid/inphase. Is there a way to relate higher-order cardioids and supercardioids? How could one define directivity patterns with an on-axis flatness constraint?
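As a small illustration, an axisymmetric pattern can be evaluated as a weighted sum of Gegenbauer polynomials C_n^alpha(cos theta) with alpha = (D-2)/2, which for D = 3 reduces to the Legendre polynomials (for D = 2 the standard Gegenbauer normalization degenerates and Chebyshev polynomials are used instead). The weights below are placeholders, not the max-rE, supercardioid, or inphase weightings derived in the report.

```python
import numpy as np
from scipy.special import eval_gegenbauer

def axisymmetric_pattern(weights, theta, D=3):
    """Evaluate g(theta) = sum_n w_n * C_n^alpha(cos theta), alpha = (D - 2) / 2."""
    alpha = (D - 2) / 2.0                     # D = 3 -> 0.5, i.e. Legendre polynomials
    x = np.cos(theta)
    return sum(w * eval_gegenbauer(n, alpha, x) for n, w in enumerate(weights))

theta = np.linspace(0.0, np.pi, 181)
g = axisymmetric_pattern([1.0, 1.0, 1.0, 1.0], theta)   # placeholder order-3 weights
```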
Neural implicit surface representations are currently receiving a lot of interest as a means to achieve high-fidelity surface reconstruction at a low memory cost compared to traditional explicit representations. However, state-of-the-art methods still struggle with excessive memory usage and non-smooth surfaces. This is particularly problematic in large-scale applications with sparse inputs, as is common in robotics use cases. To address these issues, we first introduce a sparse structure, tri-quadtrees, which represents the environment using learnable features stored in three planar quadtree projections. Second, we concatenate the learnable features with a Fourier feature positional encoding. The combined features are then decoded into signed distance values through a small multi-layer perceptron. We demonstrate that this approach facilitates smoother reconstruction, with a higher completion ratio and fewer holes. Compared to two recent baselines, one implicit and one explicit, our approach requires only 10%-50% as much memory while achieving competitive quality.
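An illustrative PyTorch sketch of the decoding step (layer sizes and counts are assumptions; the interpolation of features from the three quadtree projections is abstracted away): interpolated quadtree features are concatenated with a Fourier feature positional encoding of the query point and mapped to a signed distance by a small MLP.

```python
import torch
import torch.nn as nn

class FourierEncoding(nn.Module):
    """sin/cos positional encoding of 3D query points."""
    def __init__(self, n_freqs=6):
        super().__init__()
        self.register_buffer("freqs", 2.0 ** torch.arange(n_freqs) * torch.pi)

    def forward(self, p):                       # p: (N, 3) query points
        angles = p[..., None] * self.freqs      # (N, 3, n_freqs)
        return torch.cat([angles.sin(), angles.cos()], dim=-1).flatten(start_dim=1)

class SDFDecoder(nn.Module):
    """Small MLP mapping [quadtree features, positional encoding] to an SDF value."""
    def __init__(self, feat_dim=32, n_freqs=6, hidden=64):
        super().__init__()
        self.encode = FourierEncoding(n_freqs)
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 6 * n_freqs, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, quadtree_feats, p):       # quadtree_feats: (N, feat_dim)
        return self.mlp(torch.cat([quadtree_feats, self.encode(p)], dim=-1))
```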
Advances in survival analysis have enabled unprecedented flexibility in data modeling, yet there remains a lack of tools for graphically illustrating the influence of continuous covariates on predicted survival outcomes. We propose using a colored contour plot to depict predicted survival probabilities over time, and we provide a Shiny app and an R package as implementations of this tool. Our approach supports conventional models, including the Cox and Fine-Gray models, but its capabilities shine when coupled with cutting-edge machine learning models such as random survival forests and deep neural networks.
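A rough Python analogue of the proposed visualization (the authors provide an R package and Shiny app; this sketch instead uses lifelines and matplotlib with the bundled Rossi recidivism data): a Cox model is fit, survival curves are predicted over a grid of one continuous covariate while the other covariates are held at their medians, and the probabilities are drawn as a colored contour plot over time and covariate value.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

df = load_rossi()
cph = CoxPHFitter().fit(df, duration_col="week", event_col="arrest")

# Vary one continuous covariate (age); hold the others at their medians.
ages = np.linspace(df["age"].min(), df["age"].max(), 50)
covariates = df.drop(columns=["week", "arrest"])
profiles = pd.DataFrame([covariates.median()] * len(ages)).reset_index(drop=True)
profiles["age"] = ages

surv = cph.predict_survival_function(profiles)   # rows: event times, columns: profiles

T, A = np.meshgrid(surv.index.values, ages)
plt.contourf(T, A, surv.values.T, levels=20, cmap="viridis")
plt.colorbar(label="Predicted survival probability")
plt.xlabel("Time (weeks)")
plt.ylabel("Age")
plt.show()
```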
Deep neural network based recommendation systems have achieved great success as information filtering techniques in recent years. However, since model training from scratch requires sufficient data, deep learning-based recommendation methods still face the bottlenecks of insufficient data and computational inefficiency. Meta-learning, an emerging paradigm that learns to improve the learning efficiency and generalization ability of algorithms, has shown its strength in tackling the data sparsity issue. Recently, a growing number of studies on deep meta-learning based recommendation systems have emerged, aiming to improve performance in recommendation scenarios where available data is limited, e.g., user cold-start and item cold-start. This survey therefore provides a timely and comprehensive overview of current deep meta-learning based recommendation methods. Specifically, we propose a taxonomy that organizes existing methods according to recommendation scenarios, meta-learning techniques, and meta-knowledge representations, which together delineate the design space for meta-learning based recommendation methods. For each recommendation scenario, we further discuss technical details of how existing methods apply meta-learning to improve the generalization ability of recommendation models. Finally, we point out several limitations in current research and highlight promising directions for future research in this area.
Deep learning constitutes a recent, modern technique for image processing and data analysis, with promising results and large potential. As deep learning has been successfully applied in various domains, it has recently also entered the domain of agriculture. In this paper, we perform a survey of 40 research efforts that employ deep learning techniques applied to various agricultural and food production challenges. We examine the particular agricultural problems under study, the specific models and frameworks employed, the sources, nature and pre-processing of the data used, and the overall performance achieved according to the metrics used in each work under study. Moreover, we study comparisons of deep learning with other existing popular techniques with respect to differences in classification or regression performance. Our findings indicate that deep learning provides high accuracy, outperforming existing commonly used image processing techniques.