In biomedical applications it is often necessary to estimate a physiological response to a treatment consisting of multiple components, and learn the separate effects of the components in addition to the joint effect. Here, we extend existing probabilistic nonparametric approaches to explicitly address this problem. We also develop a new convolution-based model for composite treatment-response curves that is more biologically interpretable. We validate our models by estimating the impact of carbohydrate and fat in meals on blood glucose. By differentiating treatment components, incorporating their dosages, and sharing statistical information across patients via a hierarchical multi-output Gaussian process, our method improves prediction accuracy over existing approaches, and allows us to interpret the different effects of carbohydrates and fat on the overall glucose response.
This paper presents the first application of the direct parametrisation method for invariant manifolds to a fully coupled multiphysics problem involving the nonlinear vibrations of deformable structures subjected to an electrostatic field. The proposed formulation is intended for model order reduction of electrostatically actuated resonating Micro-Electro-Mechanical Systems (MEMS). The continuous problem is first rewritten in a form that can be handled directly by the parametrisation method, which relies upon automated asymptotic expansions. To this end, a new mixed fully Lagrangian formulation containing only explicit polynomial nonlinearities is proposed and then discretised with finite element procedures. Validation is performed on the classical parallel plate configuration, comparing the general framework with a formulation that approximates the electrostatic field by exploiting the selected geometric configuration. Reduced-order models based on these formulations are also compared to full-order simulations obtained by time integration. Numerical results show remarkable performance, both in accuracy and in the wealth of nonlinear effects that can be accounted for. In particular, the transition from hardening to softening behaviour of the primary resonance as the constant voltage component of the electric actuation increases is recovered. Secondary resonances leading to superharmonic and parametric resonances are also investigated with the reduced-order model.
A posteriori reduced-order models, e.g. proper orthogonal decomposition, are essential to affordably tackle realistic parametric problems. They rely on a trustworthy training set, that is, a family of full-order solutions (snapshots) representative of all possible outcomes of the parametric problem. In many cases, computing such a rich collection of snapshots is not computationally viable. A data augmentation strategy, designed for parametric laminar incompressible flows, is proposed to enrich poorly populated training sets. The goal is to include in the new, artificial snapshots emerging features that are not present in the original basis and that enhance the quality of the reduced-order solution. The proposed methodologies exploit basic physical principles, such as mass and momentum conservation, to construct physically relevant artificial snapshots at a fraction of the cost of additional full-order solutions. Interestingly, the numerical results show that the ideas exploiting only mass conservation (i.e., incompressibility) do not add significant value with respect to the standard linear combinations of snapshots. Conversely, accounting for the linearized momentum balance via the Oseen equation does improve the quality of the resulting approximation and is therefore an effective data augmentation strategy for viscous incompressible laminar flows.
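As a minimal illustration of the a posteriori setting described above, the following sketch extracts a proper orthogonal decomposition (POD) basis from a snapshot matrix via the singular value decomposition. The function name `pod_basis`, the `energy` threshold, and the toy snapshot matrix are assumptions made for illustration; the sketch does not implement the Oseen-based augmentation itself.

```python
import numpy as np

def pod_basis(snapshots, energy=0.999):
    """Return POD modes capturing a given fraction of the snapshot energy.

    snapshots: array of shape (n_dof, n_snapshots), one full-order
    solution per column (hypothetical layout chosen for this sketch).
    """
    # Thin SVD: columns of U are the candidate POD modes.
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    # Cumulative fraction of the squared singular values ("energy").
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    # Smallest number of modes whose energy reaches the threshold.
    r = int(np.searchsorted(cumulative, energy)) + 1
    r = min(r, len(s))
    return U[:, :r], s

# Toy training set: three snapshots spanning a two-dimensional subspace.
S = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
modes, singular_values = pod_basis(S)
```

In this toy case the third snapshot is a linear combination of the first two, so it adds no new direction to the basis. This is the kind of limitation that motivates enriching a training set with snapshots carrying features absent from the original span.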
Microring resonators (MRRs) are promising devices for time-delay photonic reservoir computing, but the impact of the different physical effects taking place in the MRRs on the reservoir computing performance is yet to be fully understood. We numerically analyze the impact of linear losses and of the relaxation times of thermo-optic and free-carrier effects on the prediction error for the NARMA-10 time-series task. We demonstrate the existence of three regions, defined by the input power and the frequency detuning between the optical source and the microring resonance, that reveal the cavity transition from linear to nonlinear regimes. One of these regions offers very low error in time-series prediction at relatively low input power and with a relatively small number of nodes, while the other regions either lack nonlinearity or become unstable. This study provides insight into the design of the MRR and the optimization of its physical properties for improving the prediction performance of time-delay reservoir computing.
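For readers unfamiliar with the benchmark, the sketch below generates a NARMA-10 series using its commonly cited recurrence with inputs drawn uniformly from [0, 0.5]. The function names, the seed handling, and the normalized mean squared error (NMSE) metric are assumptions for illustration; the abstract does not state which error measure is used.

```python
import numpy as np

def narma10(n_steps, seed=0):
    """Generate input u and target y of the NARMA-10 benchmark."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 0.5, size=n_steps)
    y = np.zeros(n_steps)
    for t in range(9, n_steps - 1):
        # y(t+1) depends on the last ten outputs and on u(t), u(t-9).
        y[t + 1] = (0.3 * y[t]
                    + 0.05 * y[t] * np.sum(y[t - 9:t + 1])
                    + 1.5 * u[t - 9] * u[t]
                    + 0.1)
    return u, y

def nmse(target, prediction):
    """Normalized mean squared error (assumed metric for this sketch)."""
    return np.mean((target - prediction) ** 2) / np.var(target)

u, y = narma10(200, seed=1)
```

The task is to predict `y` from `u`, which requires both memory (ten past steps) and nonlinearity (the product terms). A reservoir that lacks either property cannot reach low error on it.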
We present a fundamental improvement of a high-polynomial-degree time-domain cell method recently introduced by the last three authors. The published work introduced a method featuring block-diagonal system matrices whose block size and conditioning scaled poorly with respect to the polynomial degree. This issue is bypassed here by constructing new basis functions that exploit quadrature-rule-based mass-lumping techniques for arbitrary polynomial degrees in two dimensions, for the Maxwell equations and for the acoustic wave equation in the first-order velocity-pressure formulation. We characterize the degrees of freedom of all new discrete approximation spaces we employ for differential forms and show that the resulting block-diagonal (inverse) mass matrices have block sizes independent of the polynomial degree. We demonstrate on a large number of examples that the new technique is applicable and efficient for large-scale computations.
As in many fields of medical research, survival analysis has witnessed a growing interest in the application of deep learning techniques to model complex, high-dimensional, heterogeneous, incomplete, and censored medical data. Current methods often make assumptions about the relations between data that may not be valid in practice. In response, we introduce SAVAE (Survival Analysis Variational Autoencoder), a novel approach based on Variational Autoencoders. SAVAE contributes significantly to the field by introducing a tailored ELBO formulation for survival analysis, supporting various parametric distributions for covariates and survival time (as long as the log-likelihood is differentiable). It offers a general method that consistently performs well on various metrics, demonstrating robustness and stability through different experiments. Our proposal effectively estimates time-to-event, accounting for censoring, covariate interactions, and time-varying risk associations. We validate our model in diverse datasets, including genomic, clinical, and demographic data, with varying levels of censoring. This approach demonstrates competitive performance compared to state-of-the-art techniques, as assessed by the Concordance Index and the Integrated Brier Score. SAVAE also offers an interpretable model that parametrically models covariates and time. Moreover, its generative architecture facilitates further applications such as clustering, data imputation, and the generation of synthetic patient data through latent space inference from survival data.
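The Concordance Index mentioned above can be illustrated with a simplified version of Harrell's C, sketched below. The function name and argument layout are assumptions for illustration, tied event times are ignored for brevity, and this is not the evaluation code used for SAVAE.

```python
def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index.

    times: observed times; events: 1 if the event was observed, 0 if
    censored; risk_scores: higher score means higher predicted risk.
    A pair (i, j) is comparable when subject i had an observed event
    strictly before the observed time of subject j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event, higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Three patients; the third is censored at time 3.
c_index = concordance_index([1, 2, 3], [1, 1, 0], [3.0, 2.0, 1.0])
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and censored subjects contribute only as the later member of a pair.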
High-order tensor methods for solving both convex and nonconvex optimization problems have generated significant research interest, leading to algorithms with optimal global rates of convergence and local rates that are faster than Newton's method. On each iteration, these methods require the unconstrained local minimization of a (potentially nonconvex) multivariate polynomial of degree higher than two, constructed using third-order (or higher) derivative information, and regularized by an appropriate power of regularization. Developing efficient techniques for solving such subproblems is an ongoing topic of research, and this paper addresses the case of the third-order tensor subproblem. We propose the CQR algorithmic framework, for minimizing a nonconvex Cubic multivariate polynomial with Quartic Regularisation, by minimizing a sequence of local quadratic models that incorporate simple cubic and quartic terms. The role of the cubic term is to crudely approximate local tensor information, while the quartic one controls model regularization and progress. We provide necessary and sufficient optimality conditions that fully characterise the global minimizers of these cubic-quartic models. We then turn these conditions into secular equations that can be solved using nonlinear eigenvalue techniques. We show, using our optimality characterisations, that a CQR algorithmic variant has the optimal-order evaluation complexity of $\mathcal{O}(\epsilon^{-3/2})$ when applied to minimizing our quartically-regularised cubic subproblem, which can be further improved in special cases. We propose practical CQR variants that use local tensor information to construct the local cubic-quartic models. We test these variants numerically and observe them to be competitive with ARC and other subproblem solvers on typical instances and even superior on ill-conditioned subproblems with special structure.
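For orientation, the third-order tensor subproblem referred to above is usually written as the minimization of a quartically regularised cubic model. The symbols below ($f_0$, $g$, $H$, $T$, $\sigma$) are assumed notation for the function value, gradient, Hessian, third-derivative tensor, and regularization weight at the current iterate; the abstract does not fix them, and the constant in front of the quartic term varies between papers.

```latex
% Assumed notation: s is the step, T[s]^3 denotes the third-order
% tensor T applied three times to s, and \sigma > 0.
\min_{s \in \mathbb{R}^n} \; m_3(s)
  = f_0 + g^\top s + \tfrac{1}{2}\, s^\top H s
  + \tfrac{1}{6}\, T[s]^3 + \tfrac{\sigma}{4}\, \|s\|^4 .
```

Because the cubic term $T[s]^3$ can be nonconvex, $m_3$ may have several local minimizers even though the quartic term makes it bounded below.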
We obtain error approximation bounds between expected suprema of canonical processes that are generated by random vectors with independent coordinates and expected suprema of Gaussian processes. In particular, we obtain a sharper proximity estimate for Rademacher and Gaussian complexities. Our estimates are dimension-free, and depend only on the geometric parameters and the numerical complexity of the underlying index set.
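To fix ideas, the two quantities compared in the Rademacher and Gaussian case can be written as follows, together with the classical one-sided comparison between them. The notation ($T$, $\varepsilon_i$, $g_i$, $R$, $G$) and the unnormalised form are assumptions for illustration; the sharper proximity estimate of the paper is not reproduced here.

```latex
% Assumed notation: T \subset \mathbb{R}^n is the index set,
% \varepsilon_i are i.i.d. Rademacher signs, g_i are i.i.d. N(0,1).
R(T) = \mathbb{E} \sup_{t \in T} \sum_{i=1}^{n} \varepsilon_i t_i ,
\qquad
G(T) = \mathbb{E} \sup_{t \in T} \sum_{i=1}^{n} g_i t_i ,
\qquad
R(T) \le \sqrt{\pi/2}\; G(T) .
```

The inequality follows from Jensen's inequality, since $g_i$ has the same distribution as $\varepsilon_i |g_i|$ and $\mathbb{E}|g_i| = \sqrt{2/\pi}$.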
Most state-of-the-art machine learning techniques revolve around the optimisation of loss functions. Defining appropriate loss functions is therefore critical to successfully solving problems in this field. We present a survey of the most commonly used loss functions for a wide range of applications, divided into classification, regression, ranking, sample generation and energy-based modelling. Overall, we introduce 33 different loss functions and organise them into an intuitive taxonomy. Each loss function is given a theoretical backing, and we describe where it is best used. This survey aims to provide a reference of the most essential loss functions for both beginner and advanced machine learning practitioners.
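As a concrete illustration of the regression and classification categories, the sketch below implements three widely used losses with NumPy. They are standard textbook definitions chosen for illustration, and they are not claimed to match the survey's own selection or notation; the function names and the clipping constant `eps` are assumptions.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Regression: average squared residual."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Classification: labels in {0, 1}, predicted probabilities in (0, 1)."""
    y = np.asarray(y_true, dtype=float)
    # Clip probabilities so that log(0) never occurs.
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def hinge_loss(y_true, scores):
    """Margin-based classification: labels in {-1, +1}, real-valued scores."""
    y = np.asarray(y_true, dtype=float)
    s = np.asarray(scores, dtype=float)
    return float(np.mean(np.maximum(0.0, 1.0 - y * s)))

mse_value = mean_squared_error([1.0, 2.0], [1.0, 4.0])
bce_value = binary_cross_entropy([1, 0], [0.5, 0.5])
hinge_value = hinge_loss([1, -1], [2.0, 0.5])
```

The squared error penalizes large residuals heavily, the cross-entropy penalizes confident wrong probabilities, and the hinge loss is zero once a sample is classified correctly with a margin of at least one.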
We hypothesize that, due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain in accuracy when the model has access to that modality in addition to another modality. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
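The accuracy gain described above can be sketched as a difference between two accuracies: one from predictions made with both modalities, one from predictions made with a single modality. The function names and the toy predictions below are hypothetical, and the sketch uses the plain difference; the paper's exact definition may normalise this quantity.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    correct = sum(int(y == p) for y, p in zip(labels, predictions))
    return correct / len(labels)

def conditional_utilization_gain(labels, preds_both, preds_single):
    """Accuracy gained by adding a second modality to a single one.

    preds_both: predictions using both modalities.
    preds_single: predictions using only the modality conditioned on.
    """
    return accuracy(labels, preds_both) - accuracy(labels, preds_single)

# Toy example with two modalities, A and B (made-up predictions).
labels = [0, 1, 1, 0]
preds_both = [0, 1, 1, 0]     # model sees A and B
preds_only_a = [0, 1, 0, 1]   # model sees A only
preds_only_b = [0, 1, 1, 1]   # model sees B only

gain_b_given_a = conditional_utilization_gain(labels, preds_both, preds_only_a)
gain_a_given_b = conditional_utilization_gain(labels, preds_both, preds_only_b)
```

In this toy case adding B to A gains more accuracy than adding A to B, which is the kind of imbalance between modalities that the abstract reports.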
Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic for various learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.