In this paper, we propose a new efficient method for calculating the Gerber-Shiu discounted penalty function. The Gerber-Shiu function generally satisfies a class of integro-differential equations. We introduce physics-informed neural networks (PINNs), which embed a differential equation into the loss of the neural network using automatic differentiation. In addition, PINNs offer greater freedom in setting boundary conditions and do not rely on the determination of an initial value, which allows us to compute more general Gerber-Shiu functions. Numerical examples are provided to illustrate the accuracy of our approximation.
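The core idea, embedding the equation residual into a loss that is minimized, can be illustrated with a minimal sketch. The toy problem below ($u'(x) + u(x) = 0$, $u(0)=1$, solution $e^{-x}$) and the linear-in-parameters polynomial ansatz are illustrative choices, not the paper's network or the actual Gerber-Shiu integro-differential equation:

```python
import numpy as np

# Residual-in-loss idea behind PINNs, with a linear-in-parameters ansatz
# instead of a neural network. Hypothetical toy problem:
#   u'(x) + u(x) = 0,  u(0) = 1,  exact solution u(x) = exp(-x).
deg = 5
xs = np.linspace(0.0, 1.0, 50)           # collocation points

# Ansatz u(x) = sum_i c_i x^i; the residual r(x) = u'(x) + u(x) is linear in c.
du = np.stack([i * xs**(i - 1) if i > 0 else np.zeros_like(xs)
               for i in range(deg + 1)], axis=1)
uu = np.stack([xs**i for i in range(deg + 1)], axis=1)
A = du + uu                               # residual matrix: A @ c = r(xs)
b = np.zeros_like(xs)

# Boundary condition u(0) = 1, enforced as an extra weighted loss term.
bc_row = np.array([[0.0**i for i in range(deg + 1)]])   # = [1, 0, ..., 0]
w = 100.0
c, *_ = np.linalg.lstsq(np.vstack([A, w * bc_row]),
                        np.append(b, w * 1.0), rcond=None)

u1 = sum(c[i] * 1.0**i for i in range(deg + 1))
print(abs(u1 - np.exp(-1.0)))             # small residual-driven error
```

A PINN replaces the polynomial with a neural network and obtains the residual via automatic differentiation, but the structure of the loss (equation residual plus boundary terms) is the same.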
The purpose of this paper is to investigate how central notions in statistical learning theory, such as realisability, generalise under the assumption that the train and test distributions are drawn from the same credal set, i.e., a convex set of probability distributions. This can be considered a first step towards a more general treatment of statistical learning under epistemic uncertainty.
In this paper, we carry out the numerical analysis of a nonsmooth quasilinear elliptic optimal control problem, where the coefficient in the divergence term of the corresponding state equation is not differentiable with respect to the state variable. Despite the lack of differentiability of the nonlinearity in the quasilinear elliptic equation, the corresponding control-to-state operator is of class $C^1$ but not of class $C^2$. Analogously, the discrete control-to-state operators associated with the approximated control problems are proven to be of class $C^1$ only. By using an explicit second-order sufficient optimality condition, we prove a priori error estimates for a variational approximation, a piecewise constant approximation, and a continuous piecewise linear approximation of the continuous optimal control problem. The numerical tests confirm these error estimates.
This paper delves into a nonparametric estimation approach for the interaction function in diffusion-type particle system models. We introduce two estimation methods based on empirical risk minimization. Our study encompasses an analysis of the stochastic and approximation errors associated with both procedures, along with an examination of certain minimax lower bounds. In particular, we show that there is a natural metric under which the corresponding minimax estimation error of the interaction function converges to zero at a parametric rate. This result is rather surprising given the complexity of the underlying estimation problem and the rather large classes of interaction functions for which the parametric rate holds.
In this paper we develop a new well-balanced discontinuous Galerkin (DG) finite element scheme with a subcell finite volume (FV) limiter for the numerical solution of the Einstein--Euler equations of general relativity based on a first order hyperbolic reformulation of the Z4 formalism. The first order Z4 system, which is composed of 59 equations, is analyzed and proven to be strongly hyperbolic for a general metric. The well-balancing is achieved for arbitrary but a priori known equilibria by subtracting a discrete version of the equilibrium solution from the discretized time-dependent PDE system. Special care has also been taken in the design of the numerical viscosity so that the well-balancing property is achieved. As for the treatment of low-density matter, e.g. when simulating massive compact objects like neutron stars surrounded by vacuum, we have introduced a new filter in the conversion from the conserved to the primitive variables, preventing superluminal velocities when the density drops below a certain threshold, and being potentially also very useful for the numerical investigation of highly rarefied relativistic astrophysical flows. Thanks to these improvements, all standard tests of numerical relativity are successfully reproduced, leading to three achievements: (i) we are able to obtain stable long-term simulations of stationary black holes, including Kerr black holes with extreme spin, which, after an initial perturbation, return to the equilibrium solution up to machine precision; (ii) a (standard) TOV star under perturbation is evolved in pure vacuum ($\rho$=$p$=0) up to $t=1000$ with no need to introduce any artificial atmosphere around the star; and (iii) we solve the head-on collision of two puncture black holes, which was previously considered intractable within the Z4 formalism.
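The well-balancing trick described above, subtracting a discrete version of the equilibrium from the discretized system, can be sketched on a much simpler model. The 1D scalar problem $u_t + u_x = u$ with steady state $u_{eq}(x) = e^x$ and a first-order upwind discretization below are illustrative stand-ins, not the Einstein--Euler system or the paper's DG scheme:

```python
import numpy as np

# Well-balancing by subtracting the discrete equilibrium residual.
# Hypothetical toy model: u_t + u_x = u, steady state u_eq(x) = exp(x).
nx, dx, dt, nsteps = 100, 0.01, 0.005, 200
x = np.linspace(0.0, 1.0, nx)
u_eq = np.exp(x)

def residual(u):
    """First-order upwind discretization of u_x - u (inflow cell held fixed)."""
    r = np.zeros_like(u)
    r[1:] = (u[1:] - u[:-1]) / dx - u[1:]
    return r

r_eq = residual(u_eq)                      # discrete residual at equilibrium

u_naive, u_wb = u_eq.copy(), u_eq.copy()
for _ in range(nsteps):
    u_naive -= dt * residual(u_naive)             # standard scheme: drifts
    u_wb    -= dt * (residual(u_wb) - r_eq)       # well-balanced: exact

err_naive = np.max(np.abs(u_naive - u_eq))
err_wb = np.max(np.abs(u_wb - u_eq))
print(err_naive, err_wb)
```

Because the subtracted term cancels the truncation error of the discrete operator exactly at the equilibrium, the well-balanced update is identically zero there, while the naive scheme drifts to a nearby discrete steady state at the level of the truncation error.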
In this work, we present a simple and unified analysis of the Johnson-Lindenstrauss (JL) lemma, a cornerstone in the field of dimensionality reduction critical for managing high-dimensional data. Our approach not only simplifies the understanding but also unifies various constructions under the JL framework, including spherical, binary-coin, sparse JL, Gaussian, and sub-Gaussian models. This simplification and unification make significant strides in preserving the intrinsic geometry of data, essential across diverse applications from streaming algorithms to reinforcement learning. Notably, we deliver the first rigorous proof of the spherical construction's effectiveness and provide a general class of sub-Gaussian constructions within this simplified framework. At the heart of our contribution is an extension of the Hanson-Wright inequality to high dimensions, complete with explicit constants. By employing simple yet powerful probabilistic tools and analytical techniques, such as an enhanced diagonalization process, our analysis not only solidifies the JL lemma's theoretical foundation but also extends its practical reach, showcasing its adaptability and importance in contemporary computational algorithms.
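For readers unfamiliar with the JL lemma, the guarantee is easy to observe empirically. The sketch below uses the classical Gaussian construction (one of the models unified above); the dimensions, number of points, and distortion level are illustrative choices:

```python
import numpy as np

# Empirical check of the Gaussian JL construction: project n points from R^d
# down to R^k with an i.i.d. N(0, 1/k) matrix and measure the worst relative
# distortion of squared pairwise distances.
rng = np.random.default_rng(0)
n, d, k, eps = 20, 1000, 300, 0.5

X = rng.standard_normal((n, d))
A = rng.standard_normal((k, d)) / np.sqrt(k)   # so that E||Ax||^2 = ||x||^2
Y = X @ A.T

def pairwise_sq(Z):
    """Squared pairwise Euclidean distances via the Gram matrix."""
    G = Z @ Z.T
    s = np.diag(G)
    return s[:, None] + s[None, :] - 2.0 * G

D, Dp = pairwise_sq(X), pairwise_sq(Y)
mask = ~np.eye(n, dtype=bool)                  # ignore zero diagonal
distortion = np.max(np.abs(Dp[mask] / D[mask] - 1.0))
print(distortion)                              # stays below eps w.h.p.
```

The lemma says $k = O(\varepsilon^{-2}\log n)$ suffices for all pairs simultaneously, independently of the ambient dimension $d$; that independence is what the experiment above makes visible.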
This paper is concerned with an inverse wave-number-dependent (frequency-dependent) source problem for the Helmholtz equation. In $d$ dimensions ($d = 2,3$), the unknown source term is supposed to be compactly supported in the spatial variables but independent of $x_d$. The dependence of the source function on the wave number $k$ is supposed to be unknown. Based on the Dirichlet-Laplacian method and the Fourier-transform method, we develop two efficient non-iterative numerical algorithms to recover the wave-number-dependent source. Uniqueness and increasing stability results are proved. Numerical experiments are conducted to illustrate the effectiveness and efficiency of the proposed methods.
This paper proposes a novel approach to improving the training efficiency and the generalization performance of Feed-Forward Neural Networks (FFNNs) through an optimal rescaling of input features (OFR) carried out by a Genetic Algorithm (GA). The OFR reshapes the input space, improving the conditioning of the gradient-based algorithm used for training. Moreover, the exploration of scale factors entailed by GA trials and selection corresponds to a different initialization of the first-layer weights at each training attempt, thus realizing a multi-start global search algorithm (albeit restricted to a few weights only) that fosters the attainment of a global minimum. The approach has been tested on an FFNN modeling the outcome of a real industrial process (centerless grinding).
In this paper, we study the exact recovery problem in the Gaussian weighted version of the stochastic block model with two symmetric communities. We provide the information-theoretic threshold in terms of the signal-to-noise ratio (SNR) of the model and prove that when SNR $<1$, no statistical estimator can exactly recover the community structure with probability bounded away from zero. On the other hand, we show that when SNR $>1$, the maximum likelihood estimator itself succeeds in exactly recovering the community structure with probability approaching one. Then, we provide two algorithms for achieving exact recovery: both the semidefinite relaxation and the spectral relaxation of the maximum likelihood estimator can recover the community structure down to the threshold value of 1, establishing the absence of an information-computation gap for this model. Next, we compare the problem of community detection with the problem of recovering a planted densely weighted community within a graph and prove that the exact recovery of two symmetric communities is a strictly easier problem than recovering a planted dense subgraph of size half the total number of nodes, by establishing that when SNR $< 3/2$, no statistical estimator can exactly recover the planted community with probability bounded away from zero. More precisely, when $1 <$ SNR $< 3/2$, exact recovery of the community structure is possible, both statistically and algorithmically, but it is impossible to exactly recover the planted community, even statistically, in the Gaussian weighted model. Finally, we show that when SNR $>2$, the maximum likelihood estimator itself succeeds in exactly recovering the planted community with probability approaching one. We also prove that the semidefinite relaxation of the maximum likelihood estimator can recover the planted community structure down to the threshold value of 2.
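The spectral relaxation mentioned above admits a very short sketch: diagonalize the observed weight matrix and read the two communities off the sign pattern of the top eigenvector. The generative model below (rank-one signal plus symmetric Gaussian noise) and the parameter choices are illustrative; exact recovery in the paper's regime additionally requires the signal strength to grow logarithmically in $n$:

```python
import numpy as np

# Spectral sketch for a Gaussian weighted two-community model:
#   A = (lam / n) * sigma sigma^T + symmetric Gaussian noise / sqrt(n),
# with hidden labels sigma in {-1, +1}^n. Parameters are illustrative.
rng = np.random.default_rng(1)
n, lam = 300, 3.0
sigma = np.sign(rng.standard_normal(n))        # hidden community labels

G = rng.standard_normal((n, n))
W = (G + G.T) / np.sqrt(2.0)                   # symmetric Gaussian noise
A = (lam / n) * np.outer(sigma, sigma) + W / np.sqrt(n)

vals, vecs = np.linalg.eigh(A)                 # eigenvalues in ascending order
v = vecs[:, -1]                                # top eigenvector
est = np.sign(v)                               # estimated labels

acc = max(np.mean(est == sigma), np.mean(est == -sigma))  # up to global sign
print(acc)
```

Above the spectral threshold the top eigenvector correlates with the signal, so the sign estimate recovers most labels; semidefinite programming refines this to exact recovery down to the information-theoretic threshold.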
We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain in accuracy when the model has access to that modality in addition to another one. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
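The conditional utilization rate is simple to compute once per-subset accuracies are available. The helper and the accuracy numbers below are hypothetical, purely to show how an imbalance between modalities would surface:

```python
# Conditional utilization rate as described above: the accuracy gain from
# adding one modality on top of another. All numbers below are hypothetical.
def conditional_utilization(acc_both: float, acc_single: float) -> float:
    """u(m2 | m1) = acc({m1, m2}) - acc({m1})."""
    return acc_both - acc_single

# Hypothetical evaluation results for a two-modality (RGB + depth) model:
acc_rgb_depth, acc_rgb, acc_depth = 0.92, 0.90, 0.60

u_depth_given_rgb = conditional_utilization(acc_rgb_depth, acc_rgb)
u_rgb_given_depth = conditional_utilization(acc_rgb_depth, acc_depth)
print(u_depth_given_rgb, u_rgb_given_depth)
# A large gap between the two rates signals greedy reliance on one modality.
```

Since each rate requires evaluating the model with a modality ablated, it is expensive to track during training, which motivates the conditional learning speed as a cheap proxy.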
In this paper we develop a novel neural network model for predicting the implied volatility surface that takes prior financial domain knowledge into account. A new activation function that incorporates the volatility smile is proposed and used for the hidden nodes that process the underlying asset price. In addition, financial conditions, such as the absence of arbitrage, the boundaries, and the asymptotic slope, are embedded into the loss function. This is one of the very first studies to propose a methodological framework that incorporates prior financial domain knowledge into neural network architecture design and model training. The proposed model outperforms the benchmark models on S&P 500 index option data spanning 20 years. More importantly, the domain knowledge is satisfied empirically, showing that the model is consistent with existing financial theories and conditions related to the implied volatility surface.
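Embedding a financial condition into the loss can be sketched concretely. The penalty below discourages concavity of total implied variance across log-moneyness (a common proxy for butterfly arbitrage); the grid, the synthetic surface slices, and the specific penalty form are illustrative assumptions, not the paper's exact loss terms:

```python
import numpy as np

# Hedged sketch: a soft no-arbitrage term penalizing negative curvature of
# total variance w(k) in log-moneyness k. Grid and slices are synthetic.
def convexity_penalty(w: np.ndarray, k: np.ndarray) -> float:
    """Mean of the negative part of the finite-difference curvature of w(k)."""
    d2 = np.diff(w, 2) / np.diff(k)[:-1] ** 2   # curvature on a uniform grid
    return float(np.mean(np.maximum(0.0, -d2))) # only concave regions count

k = np.linspace(-1.0, 1.0, 21)                  # log-moneyness grid
w_convex = 0.04 + 0.10 * k**2                   # smile-shaped, convex slice
w_concave = 0.04 - 0.10 * k**2                  # violates convexity

print(convexity_penalty(w_convex, k), convexity_penalty(w_concave, k))
# In training, the term would be added to the data-fitting loss with a
# weight, e.g. loss = mse + lam * convexity_penalty(w_pred, k).
```

Because the penalty is zero on admissible surfaces and grows with the size of the violation, gradient-based training is pushed toward predictions that satisfy the condition without hard constraints.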