We develop a novel deep learning approach for pricing European basket options written on assets that follow jump-diffusion dynamics. The option pricing problem is formulated as a partial integro-differential equation, which is approximated via a new implicit-explicit minimizing movement time-stepping approach, involving approximation by deep, residual-type Artificial Neural Networks (ANNs) for each time step. The integral operator is discretized via two different approaches: a) a sparse-grid Gauss--Hermite approximation following localised coordinate axes arising from singular value decompositions, and b) an ANN-based high-dimensional special-purpose quadrature rule. Crucially, the proposed ANN is constructed to ensure the asymptotic behavior of the solution for large values of the underlyings and also leads to consistent outputs with respect to a priori known qualitative properties of the solution. The performance and robustness with respect to the dimension of the methods are assessed in a series of numerical experiments involving the Merton jump-diffusion model.
The macro-element variant of the hybridized discontinuous Galerkin (HDG) method combines advantages of continuous and discontinuous finite element discretization. In this paper, we investigate the performance of the macro-element HDG method for the analysis of compressible flow problems at moderate Reynolds numbers. To efficiently handle the corresponding large systems of equations, we explore several strategies at the solver level. On the one hand, we devise a second-layer static condensation approach that reduces the size of the local system matrix in each macro-element and hence the factorization time of the local solver. On the other hand, we employ a multi-level preconditioner based on the FGMRES solver for the global system that integrates well within a matrix-free implementation. In addition, we integrate a standard diagonally implicit Runge-Kutta scheme for time integration. We test the matrix-free macro-element HDG method for compressible flow benchmarks, including Couette flow, flow past a sphere, and the Taylor-Green vortex. Our results show that unlike standard HDG, the macro-element HDG method can operate efficiently for moderate polynomial degrees, as the local computational load can be flexibly increased via mesh refinement within a macro-element. Our results also show that due to the balance of local and global operations, the reduction in degrees of freedom, and the reduction of the global problem size and the number of iterations for its solution, the macro-element HDG method can be a competitive option for the analysis of compressible flow problems.
We consider three distinct discrete-time models of learning and evolution in games: a biological model based on intra-species selective pressure, the dynamics induced by pairwise proportional imitation, and the exponential / multiplicative weights (EW) algorithm for online learning. Even though these models share the same continuous-time limit - the replicator dynamics - we show that second-order effects play a crucial role and may lead to drastically different behaviors in each model, even in very simple, symmetric $2\times2$ games. Specifically, we study the resulting discrete-time dynamics in a class of parametrized congestion games, and we show that (i) in the biological model of intra-species competition, the dynamics remain convergent for any parameter value; (ii) the dynamics of pairwise proportional imitation exhibit an entire range of behaviors for larger time steps and different equilibrium configurations (stability, instability, and even Li-Yorke chaos); while (iii) in the EW algorithm, increasing the time step (almost) inevitably leads to chaos (again, in the formal, Li-Yorke sense). This divergence of behaviors comes in stark contrast to the globally convergent behavior of the replicator dynamics, and serves to delineate the extent to which the replicator dynamics provide a useful predictor for the long-run behavior of their discrete-time origins.
Some mechanical systems, that are modeled to have inelastic collisions, nonetheless possess energy-conserving intermittent-contact solutions, known as collisionless solutions. Such a solution, representing a persistent hopping or walking across a level ground, may be important for understanding animal locomotion or for designing efficient walking machines. So far, collisionless motion has been analytically studied in simple two degrees of freedom (DOF) systems, or in a system that decouples into 2-DOF subsystems in the harmonic approximation. In this paper we extend the consideration to a N-DOF system, recovering the known solutions as a special N = 2 case of the general formulation. We show that in the harmonic approximation the collisionless solution is determined by the spectrum of the system. We formulate a solution existence condition, which requires the presence of at least one oscillating normal mode in the most constrained phase of the motion. An application of the developed general framework is illustrated by finding a collisionless solution for a rocking motion of a biped with an armed standing torso.
This work constructs the first-ever sixth-order exponential Runge--Kutta (ExpRK) methods for the time integration of stiff parabolic PDEs. First, we leverage the exponential B-series theory to restate the stiff order conditions for ExpRK methods of arbitrary order based on an essential set of trees only. Then, we explicitly provide the 36 order conditions required for sixth-order methods and present convergence results. In addition, we are able to solve the 36 stiff order conditions in both their weak and strong forms, resulting in two families of sixth-order parallel stages ExpRK schemes. Interestingly, while these new schemes require a high number of stages, they can be implemented efficiently similar to the cost of a 6-stage method. Numerical experiments are given to confirm the accuracy and efficiency of the new schemes.
We give a priori error estimates of second order in time fully explicit Runge-Kutta discontinuous Galerkin schemes using upwind fluxes to smooth solutions of scalar fractional conservation laws in one space dimension. Under the time step restrictions $\tau\leq c h$ for piecewise linear and $\tau\lesssim h^{4/3}$ for higher order finite elements, we prove a convergence rate for the energy norm $\|\cdot\|_{L^\infty_tL^2_x}+|\cdot|_{L^2_tH^{\lambda/2}_x}$ that is optimal for solutions and flux functions that are smooth enough. Our proof relies on a novel upwind projection of the exact solution.
Machine unlearning techniques, which involve retracting data records and reducing influence of said data on trained models, help with the user privacy protection objective but incur significant computational costs. Weight perturbation-based unlearning is a general approach, but it typically involves globally modifying the parameters. We propose fine-grained Top-K and Random-k parameters perturbed inexact machine unlearning strategies that address the privacy needs while keeping the computational costs tractable. In order to demonstrate the efficacy of our strategies we also tackle the challenge of evaluating the effectiveness of machine unlearning by considering the model's generalization performance across both unlearning and remaining data. To better assess the unlearning effect and model generalization, we propose novel metrics, namely, the forgetting rate and memory retention rate. However, for inexact machine unlearning, current metrics are inadequate in quantifying the degree of forgetting that occurs after unlearning strategies are applied. To address this, we introduce SPD-GAN, which subtly perturbs the distribution of data targeted for unlearning. Then, we evaluate the degree of unlearning by measuring the performance difference of the models on the perturbed unlearning data before and after the unlearning process. By implementing these innovative techniques and metrics, we achieve computationally efficacious privacy protection in machine learning applications without significant sacrifice of model performance. Furthermore, this approach provides a novel method for evaluating the degree of unlearning.
We propose a novel algorithm for the support estimation of partially known Gaussian graphical models that incorporates prior information about the underlying graph. In contrast to classical approaches that provide a point estimate based on a maximum likelihood or a maximum a posteriori criterion using (simple) priors on the precision matrix, we consider a prior on the graph and rely on annealed Langevin diffusion to generate samples from the posterior distribution. Since the Langevin sampler requires access to the score function of the underlying graph prior, we use graph neural networks to effectively estimate the score from a graph dataset (either available beforehand or generated from a known distribution). Numerical experiments demonstrate the benefits of our approach.
The ability to dynamically adjust the computational load of neural models during inference is crucial for on-device processing scenarios characterised by limited and time-varying computational resources. A promising solution is presented by early-exit architectures, in which additional exit branches are appended to intermediate layers of the encoder. In self-attention models for automatic speech recognition (ASR), early-exit architectures enable the development of dynamic models capable of adapting their size and architecture to varying levels of computational resources and ASR performance demands. Previous research on early-exiting ASR models has relied on pre-trained self-supervised models, fine-tuned with an early-exit loss. In this paper, we undertake an experimental comparison between fine-tuning pre-trained backbones and training models from scratch with the early-exiting objective. Experiments conducted on public datasets reveal that early-exit models trained from scratch not only preserve performance when using fewer encoder layers but also exhibit enhanced task accuracy compared to single-exit or pre-trained models. Furthermore, we explore an exit selection strategy grounded in posterior probabilities as an alternative to the conventional frame-based entropy approach. Results provide insights into the training dynamics of early-exit architectures for ASR models, particularly the efficacy of training strategies and exit selection methods.
Vessel segmentation and centerline extraction are two crucial preliminary tasks for many computer-aided diagnosis tools dealing with vascular diseases. Recently, deep-learning based methods have been widely applied to these tasks. However, classic deep-learning approaches struggle to capture the complex geometry and specific topology of vascular networks, which is of the utmost importance in most applications. To overcome these limitations, the clDice loss, a topological loss that focuses on the vessel centerlines, has been recently proposed. This loss requires computing, with a proposed soft-skeleton algorithm, the skeletons of both the ground truth and the predicted segmentation. However, the soft-skeleton algorithm provides suboptimal results on 3D images, which makes the clDice hardly suitable on 3D images. In this paper, we propose to replace the soft-skeleton algorithm by a U-Net which computes the vascular skeleton directly from the segmentation. We show that our method provides more accurate skeletons than the soft-skeleton algorithm. We then build upon this network a cascaded U-Net trained with the clDice loss to embed topological constraints during the segmentation. The resulting model is able to predict both the vessel segmentation and centerlines with a more accurate topology.
Full waveform inversion (FWI) is a large-scale nonlinear ill-posed problem for which computationally expensive Newton-type methods can become trapped in undesirable local minima, particularly when the initial model lacks a low-wavenumber component and the recorded data lacks low-frequency content. A modification to the Gauss-Newton (GN) method is proposed to address these issues. The standard GN system for multisource multireceiver FWI is reformulated into an equivalent matrix equation form, with the solution becoming a diagonal matrix rather than a vector as in the standard system. The search direction is transformed from a vector to a matrix by relaxing the diagonality constraint, effectively adding a degree of freedom to the subsurface offset axis. The relaxed system can be explicitly solved with only the inversion of two small matrices that deblur the data residual matrix along the source and receiver dimensions, which simplifies the inversion of the Hessian matrix. When used to solve the extended source FWI objective function, the Extended GN (EGN) method integrates the benefits of both model and source extension. The EGN method effectively combines the computational effectiveness of the reduced FWI method with the robustness characteristics of extended formulations and offers a promising solution for addressing the challenges of FWI. It bridges the gap between these extended formulations and the reduced FWI method, enhancing inversion robustness while maintaining computational efficiency. The robustness and stability of the EGN algorithm for waveform inversion are demonstrated numerically.