A retention strategy based on an enlightened lapse model is a powerful profitabilitylever for a life insurer. Some machine learning models are excellent at predicting lapse,but from the insurer's perspective, predicting which policyholder is likely to lapse is notenough to design a retention strategy. In our paper, we define a lapse managementframework with an appropriate validation metric based on Customer Lifetime Value andprofitability. We include the risk of death in the study through competing risks considerations inparametric and tree-based models and show that further individualization of theexisting approaches leads to increased performance. We show that survival tree-basedmodels outperform parametric approaches and that the actuarial literature cansignificantly benefit from them. Then, we compare, on real data, how this frameworkleads to increased predicted gains for a life insurer and discuss the benefits of ourmodel in terms of commercial and strategic decision-making.
The efficient representation of random fields on geometrically complex domains is crucial for Bayesian modelling in engineering and machine learning. Today's prevalent random field representations are either intended for unbounded domains or are too restrictive in terms of possible field properties. Because of these limitations, techniques leveraging the historically established link between stochastic PDEs (SPDEs) and random fields have been gaining interest. The SPDE representation is especially appealing for engineering applications which already have a finite element discretisation for solving the physical conservation equations. In contrast to the dense covariance matrix of a random field, its inverse, the precision matrix, is usually sparse and equal to the stiffness matrix of an elliptic SPDE. We use the SPDE representation to develop a scalable framework for large-scale statistical finite element analysis and Gaussian process (GP) regression on complex geometries. The statistical finite element method (statFEM) introduced by Girolami et al. (2022) is a novel approach for synthesising measurement data and finite element models. In both statFEM and GP regression, we use the SPDE formulation to obtain the relevant prior probability densities with a sparse precision matrix. The properties of the priors are governed by the parameters and possibly fractional order of the SPDE so that we can model on bounded domains and manifolds anisotropic, non-stationary random fields with arbitrary smoothness. The observation models for statFEM and GP regression are such that the posterior probability densities are Gaussians with a closed-form mean and precision. The respective mean vector and precision matrix and can be evaluated using only sparse matrix operations. We demonstrate the versatility of the proposed framework and its convergence properties with Poisson and thin-shell examples.
Previous researchers conducting Just-In-Time (JIT) defect prediction tasks have primarily focused on the performance of individual pre-trained models, without exploring the relationship between different pre-trained models as backbones. In this study, we build six models: RoBERTaJIT, CodeBERTJIT, BARTJIT, PLBARTJIT, GPT2JIT, and CodeGPTJIT, each with a distinct pre-trained model as its backbone. We systematically explore the differences and connections between these models. Specifically, we investigate the performance of the models when using Commit code and Commit message as inputs, as well as the relationship between training efficiency and model distribution among these six models. Additionally, we conduct an ablation experiment to explore the sensitivity of each model to inputs. Furthermore, we investigate how the models perform in zero-shot and few-shot scenarios. Our findings indicate that each model based on different backbones shows improvements, and when the backbone's pre-training model is similar, the training resources that need to be consumed are much more closer. We also observe that Commit code plays a significant role in defect detection, and different pre-trained models demonstrate better defect detection ability with a balanced dataset under few-shot scenarios. These results provide new insights for optimizing JIT defect prediction tasks using pre-trained models and highlight the factors that require more attention when constructing such models. Additionally, CodeGPTJIT and GPT2JIT achieved better performance than DeepJIT and CC2Vec on the two datasets respectively under 2000 training samples. These findings emphasize the effectiveness of transformer-based pre-trained models in JIT defect prediction tasks, especially in scenarios with limited training data.
In this paper, a multiscale constitutive framework for one-dimensional blood flow modeling is presented and discussed. By analyzing the asymptotic limits of the proposed model, it is shown that different types of blood propagation phenomena in arteries and veins can be described through an appropriate choice of scaling parameters, which are related to distinct characterizations of the fluid-structure interaction mechanism (whether elastic or viscoelastic) that exist between vessel walls and blood flow. In these asymptotic limits, well-known blood flow models from the literature are recovered. Additionally, by analyzing the perturbation of the local elastic equilibrium of the system, a new viscoelastic blood flow model is derived. The proposed approach is highly flexible and suitable for studying the human cardiovascular system, which is composed of vessels with high morphological and mechanical variability. The resulting multiscale hyperbolic model of blood flow is solved using an asymptotic-preserving Implicit-Explicit Runge-Kutta Finite Volume method, which ensures the consistency of the numerical scheme with the different asymptotic limits of the mathematical model without affecting the choice of the time step by restrictions related to the smallness of the scaling parameters. Several numerical tests confirm the validity of the proposed methodology, including a case study investigating the hemodynamics of a thoracic aorta in the presence of a stent.
We study matrix sensing, which is the problem of reconstructing a low-rank matrix from a few linear measurements. It can be formulated as an overparameterized regression problem, which can be solved by factorized gradient descent when starting from a small random initialization. Linear neural networks, and in particular matrix sensing by factorized gradient descent, serve as prototypical models of non-convex problems in modern machine learning, where complex phenomena can be disentangled and studied in detail. Much research has been devoted to studying special cases of asymmetric matrix sensing, such as asymmetric matrix factorization and symmetric positive semi-definite matrix sensing. Our key contribution is introducing a continuous differential equation that we call the $\textit{perturbed gradient flow}$. We prove that the perturbed gradient flow converges quickly to the true target matrix whenever the perturbation is sufficiently bounded. The dynamics of gradient descent for matrix sensing can be reduced to this formulation, yielding a novel proof of asymmetric matrix sensing with factorized gradient descent. Compared to directly analyzing the dynamics of gradient descent, the continuous formulation allows bounding key quantities by considering their derivatives, often simplifying the proofs. We believe the general proof technique may prove useful in other settings as well.
Electrical circuits are present in a variety of technologies, making their design an important part of computer aided engineering. The growing number of tunable parameters that affect the final design leads to a need for new approaches of quantifying their impact. Machine learning may play a key role in this regard, however current approaches often make suboptimal use of existing knowledge about the system at hand. In terms of circuits, their description via modified nodal analysis is well-understood. This particular formulation leads to systems of differential-algebraic equations (DAEs) which bring with them a number of peculiarities, e.g. hidden constraints that the solution needs to fulfill. We aim to use the recently introduced dissection concept for DAEs that can decouple a given system into ordinary differential equations, only depending on differential variables, and purely algebraic equations that describe the relations between differential and algebraic variables. The idea then is to only learn the differential variables and reconstruct the algebraic ones using the relations from the decoupling. This approach guarantees that the algebraic constraints are fulfilled up to the accuracy of the nonlinear system solver, which represents the main benefit highlighted in this article.
We provide a new sequent calculus that enjoys syntactic cut-elimination and strongly terminating backward proof search for the intuitionistic Strong L\"ob logic $\sf{iSL}$, an intuitionistic modal logic with a provability interpretation. A novel measure on sequents is used to prove both the termination of the naive backward proof search strategy, and the admissibility of cut in a syntactic and direct way, leading to a straightforward cut-elimination procedure. All proofs have been formalised in the interactive theorem prover Coq.
The proximal Galerkin finite element method is a high-order, low iteration complexity, nonlinear numerical method that preserves the geometric and algebraic structure of bound constraints in infinite-dimensional function spaces. This paper introduces the proximal Galerkin method and applies it to solve free boundary problems, enforce discrete maximum principles, and develop scalable, mesh-independent algorithms for optimal design. The paper leads to a derivation of the latent variable proximal point (LVPP) algorithm: an unconditionally stable alternative to the interior point method. LVPP is an infinite-dimensional optimization algorithm that may be viewed as having an adaptive barrier function that is updated with a new informative prior at each (outer loop) optimization iteration. One of the main benefits of this algorithm is witnessed when analyzing the classical obstacle problem. Therein, we find that the original variational inequality can be replaced by a sequence of semilinear partial differential equations (PDEs) that are readily discretized and solved with, e.g., high-order finite elements. Throughout this work, we arrive at several unexpected contributions that may be of independent interest. These include (1) a semilinear PDE we refer to as the entropic Poisson equation; (2) an algebraic/geometric connection between high-order positivity-preserving discretizations and certain infinite-dimensional Lie groups; and (3) a gradient-based, bound-preserving algorithm for two-field density-based topology optimization. The complete latent variable proximal Galerkin methodology combines ideas from nonlinear programming, functional analysis, tropical algebra, and differential geometry and can potentially lead to new synergies among these areas as well as within variational and numerical analysis.
The optimal operation of water reservoir systems is a challenging task involving multiple conflicting objectives. The main source of complexity is the presence of the water inflow, which acts as an exogenous, highly uncertain disturbance on the system. When model predictive control (MPC) is employed, the optimal water release is usually computed based on the (predicted) trajectory of the inflow. This choice may jeopardize the closed-loop performance when the actual inflow differs from its forecast. In this work, we consider - for the first time - a stochastic MPC approach for water reservoirs, in which the control is optimized based on a set of plausible future inflows directly generated from past data. Such a scenario-based MPC strategy allows the controller to be more cautious, counteracting droughty periods (e.g., the lake level going below the dry limit) while at the same time guaranteeing that the agricultural water demand is satisfied. The method's effectiveness is validated through extensive Monte Carlo tests using actual inflow data from Lake Como, Italy.
Hawkes processes are often applied to model dependence and interaction phenomena in multivariate event data sets, such as neuronal spike trains, social interactions, and financial transactions. In the nonparametric setting, learning the temporal dependence structure of Hawkes processes is generally a computationally expensive task, all the more with Bayesian estimation methods. In particular, for generalised nonlinear Hawkes processes, Monte-Carlo Markov Chain methods applied to compute the doubly intractable posterior distribution are not scalable to high-dimensional processes in practice. Recently, efficient algorithms targeting a mean-field variational approximation of the posterior distribution have been proposed. In this work, we first unify existing variational Bayes approaches under a general nonparametric inference framework, and analyse the asymptotic properties of these methods under easily verifiable conditions on the prior, the variational class, and the nonlinear model. Secondly, we propose a novel sparsity-inducing procedure, and derive an adaptive mean-field variational algorithm for the popular sigmoid Hawkes processes. Our algorithm is parallelisable and therefore computationally efficient in high-dimensional setting. Through an extensive set of numerical simulations, we also demonstrate that our procedure is able to adapt to the dimensionality of the parameter of the Hawkes process, and is partially robust to some type of model mis-specification.
To create effective data visualizations, it helps to represent data using visual features in intuitive ways. When visualization designs match observer expectations, visualizations are easier to interpret. Prior work suggests that several factors influence such expectations. For example, the dark-is-more bias leads observers to infer that darker colors map to larger quantities, and the opaque-is-more bias leads them to infer that regions appearing more opaque (given the background color) map to larger quantities. Previous work suggested that the background color only plays a role if visualizations appear to vary in opacity. The present study challenges this claim. We hypothesized that the background color modulate inferred mappings for colormaps that should not appear to vary in opacity (by previous measures) if the visualization appeared to have a "hole" that revealed the background behind the map (hole hypothesis). We found that spatial aspects of the map contributed to inferred mappings, though the effects were inconsistent with the hole hypothesis. Our work raises new questions about how spatial distributions of data influence color semantics in colormap data visualizations.