A connected matching in a graph G consists of a set of pairwise disjoint edges whose covered vertices induce a connected subgraph of G. While finding a connected matching of maximum cardinality is a well-solved problem, it is NP-hard to determine an optimal connected matching in an edge-weighted graph, even in the planar bipartite case. We present two mixed integer programming formulations and a sophisticated branch-and-cut scheme to find weighted connected matchings in general graphs. The formulations explore different polyhedra associated to this problem, including strong valid inequalities both from the matching polytope and from the connected subgraph polytope. We conjecture that one attains a tight approximation of the convex hull of connected matchings using our strongest formulation, and report encouraging computational results over DIMACS Implementation Challenge benchmark instances. The source code of the complete implementation is also made available.
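To make the definition above concrete, here is a minimal Python sketch that checks whether a given edge set is a connected matching. It only illustrates the definition and is not the mixed integer programming formulations or the branch-and-cut scheme of the paper. The function name `is_connected_matching`, the adjacency-dictionary representation, and the convention that the empty matching counts as connected are assumptions made for this example.

```python
from collections import deque


def is_connected_matching(adj, matching):
    """Return True if `matching` is a connected matching of the graph `adj`.

    adj: dict mapping each vertex to the set of its neighbours (undirected).
    matching: iterable of (u, v) vertex pairs.
    """
    covered = set()
    for u, v in matching:
        # Every chosen pair must actually be an edge of the graph.
        if v not in adj.get(u, set()):
            return False
        # Edges must be pairwise disjoint: no endpoint may be reused.
        if u == v or u in covered or v in covered:
            return False
        covered.update((u, v))

    if not covered:
        return True  # assumed convention: the empty matching is connected

    # Breadth-first search restricted to the covered vertices, i.e. a
    # connectivity test on the subgraph induced by the matched vertices.
    start = next(iter(covered))
    seen = {start}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y in covered and y not in seen:
                seen.add(y)
                queue.append(y)
    return seen == covered
```

On a path with vertices a, b, c, d in that order, the edges (a, b) and (c, d) form a connected matching because the four covered vertices induce the whole path. On a longer path, two matching edges separated by an uncovered vertex do not.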
We present a new Krylov subspace recycling method for solving a linear system of equations, or a sequence of slowly changing linear systems. Our new method, named GMRES-SDR, combines randomized sketching and deflated restarting in a way that avoids orthogonalizing a full Krylov basis. We provide new theory which characterizes sketched GMRES with and without augmentation as a projection method using a semi-inner product. We present results of numerical experiments demonstrating the effectiveness of GMRES-SDR over competitor methods such as GMRES-DR and GCRO-DR.
Common regularization algorithms for linear regression, such as LASSO and Ridge regression, rely on a regularization hyperparameter that balances the tradeoff between minimizing the fitting error and the norm of the learned model coefficients. As this hyperparameter is scalar, it can be easily selected via random or grid search optimizing a cross-validation criterion. However, using a scalar hyperparameter limits the algorithm's flexibility and potential for better generalization. In this paper, we address the problem of linear regression with l2-regularization, where a different regularization hyperparameter is associated with each input variable. We optimize these hyperparameters using a gradient-based approach, wherein the gradient of a cross-validation criterion with respect to the regularization hyperparameters is computed analytically through matrix differential calculus. Additionally, we introduce two strategies tailored for sparse model learning problems aiming at reducing the risk of overfitting to the validation data. Numerical examples demonstrate that our multi-hyperparameter regularization approach outperforms LASSO, Ridge, and Elastic Net regression. Moreover, the analytical computation of the gradient proves to be more efficient in terms of computational time compared to automatic differentiation, especially when handling a large number of input variables. Application to the identification of over-parameterized Linear Parameter-Varying models is also presented.
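The idea of differentiating a validation criterion with respect to per-variable regularization weights can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: it uses a single hold-out split instead of a full cross-validation criterion, the penalty is taken to be $\frac{1}{2}\sum_j \lambda_j w_j^2$, and the function names `fit_weighted_ridge` and `validation_loss_and_grad` are hypothetical.

```python
import numpy as np


def fit_weighted_ridge(X, y, lam):
    """Closed-form minimizer of 0.5*||X w - y||^2 + 0.5*sum_j lam[j]*w[j]**2.

    Returns the coefficients w and the matrix A = X^T X + diag(lam).
    """
    A = X.T @ X + np.diag(lam)
    w = np.linalg.solve(A, X.T @ y)
    return w, A


def validation_loss_and_grad(X_tr, y_tr, X_val, y_val, lam):
    """Hold-out loss 0.5*||X_val w(lam) - y_val||^2 and its gradient in lam."""
    w, A = fit_weighted_ridge(X_tr, y_tr, lam)
    r = X_val @ w - y_val          # validation residual
    loss = 0.5 * (r @ r)
    # Differentiating A w = X_tr^T y_tr gives dw/dlam_j = -A^{-1} e_j w_j.
    # Since A is symmetric, dL/dlam_j = -w_j * (A^{-1} X_val^T r)_j.
    grad = -w * np.linalg.solve(A, X_val.T @ r)
    return loss, grad
```

The returned gradient can be passed to any first-order optimizer over the vector `lam`, for instance projected gradient descent that keeps every entry non-negative.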
We propose and compare methods for the analysis of extreme events in complex systems governed by PDEs that involve random parameters, in situations where we are interested in quantifying the probability that a scalar function of the system's solution is above a threshold. If the threshold is large, this probability is small and its accurate estimation is challenging. To tackle this difficulty, we blend theoretical results from large deviation theory (LDT) with numerical tools from PDE-constrained optimization. Our methods first compute parameters that minimize the LDT-rate function over the set of parameters leading to extreme events, using adjoint methods to compute the gradient of this rate function. The minimizers give information about the mechanism of the extreme events as well as estimates of their probability. We then propose a series of methods to refine these estimates, either via importance sampling or geometric approximation of the extreme event sets. Results are formulated for general parameter distributions and detailed expressions are provided for Gaussian distributions. We give theoretical and numerical arguments showing that the performance of our methods is insensitive to the extremeness of the events we are interested in. We illustrate the application of our approach to quantify the probability of extreme tsunami events on shore. Tsunamis are typically caused by a sudden, unpredictable change of the ocean floor elevation during an earthquake. We model this change as a random process, which takes into account the underlying physics. We use the one-dimensional shallow water equation to model tsunamis numerically. In the context of this example, we present a comparison of our methods for extreme event probability estimation, and find which type of ocean floor elevation change leads to the largest tsunamis on shore.
Numerical methods for computing the solutions of Markov backward stochastic differential equations (BSDEs) driven by continuous-time Markov chains (CTMCs) are explored. The main contributions of this paper are as follows: (1) we observe that Euler-Maruyama temporal discretization methods for solving Markov BSDEs driven by CTMCs are equivalent to exponential integrators for solving the associated systems of ordinary differential equations (ODEs); (2) we introduce multi-stage Euler-Maruyama methods for effectively solving "stiff" Markov BSDEs driven by CTMCs; these BSDEs typically arise from the spatial discretization of Markov BSDEs driven by Brownian motion; (3) we propose a multilevel spatial discretization method on sparse grids that efficiently approximates high-dimensional Markov BSDEs driven by Brownian motion with a combination of multiple Markov BSDEs driven by CTMCs on grids with different resolutions. We also illustrate the effectiveness of the presented methods with a number of numerical experiments in which we treat nonlinear BSDEs arising from option pricing problems in finance.
We provide full theoretical guarantees for the convergence behaviour of diffusion-based generative models under the assumption of strongly logconcave data distributions while our approximating class of functions used for score estimation is made of Lipschitz continuous functions. We demonstrate via a motivating example, sampling from a Gaussian distribution with unknown mean, the power of our approach. In this case, explicit estimates are provided for the associated optimization problem, i.e. score approximation, while these are combined with the corresponding sampling estimates. As a result, we obtain the best known upper bound estimates in terms of key quantities of interest, such as the dimension and rates of convergence, for the Wasserstein-2 distance between the data distribution (Gaussian with unknown mean) and our sampling algorithm. Beyond the motivating example and in order to allow for the use of a diverse range of stochastic optimizers, we present our results using an $L^2$-accurate score estimation assumption, which crucially is formed under an expectation with respect to the stochastic optimizer and our novel auxiliary process that uses only known information. This approach yields the best known convergence rate for our sampling algorithm.
Miura surfaces are the solutions of a constrained nonlinear elliptic system of equations. This system is derived by homogenization from the Miura fold, which is a type of origami fold with multiple applications in engineering. A previous inquiry gave suboptimal conditions for the existence of solutions and proposed an $H^2$-conformal finite element method to approximate them. In this paper, the existence of Miura surfaces is studied using a mixed formulation. It is also proved that the constraints propagate from the boundary to the interior of the domain for well-chosen boundary conditions. Then, a numerical method based on a least-squares formulation, Taylor--Hood finite elements and a Newton method is introduced to approximate Miura surfaces. The numerical method is proved to converge and numerical tests are performed to demonstrate its robustness.
Consider a regression or some regression-type model for a certain response variable where the linear predictor includes an ordered factor among the explanatory variables. The inclusion of a factor of this type can take place in a few different ways, discussed in the relevant literature. The present contribution proposes a different way of tackling this problem, by constructing a numeric variable in an alternative way with respect to the current methodology. The proposed technique appears to retain the data fitting capability of the existing methodology, but with a simpler interpretation of the model components.
For an even set of points in the plane, choose a max-sum matching, that is, a perfect matching maximizing the sum of Euclidean distances of its edges. For each edge of the max-sum matching, consider the ellipse with foci at the edge's endpoints and eccentricity $\sqrt 3 / 2$. Using an optimization approach, we prove that the convex sets bounded by these ellipses intersect, answering a Tverberg-type question of Andy Fingerhut from 1995.
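The ellipses in this statement can be made explicit with a short Python sketch. Since eccentricity is the focal distance divided by the major-axis length, an ellipse with foci $a$, $b$ and eccentricity $\sqrt 3 / 2$ has major axis $2\,|a-b|/\sqrt 3$, and a point lies in the bounded convex set exactly when its distances to the two foci sum to at most that value. The function name `in_edge_ellipse` and the small numerical tolerance are assumptions made for illustration; the sketch does not reproduce the optimization argument of the proof.

```python
import math

ECCENTRICITY = math.sqrt(3) / 2


def in_edge_ellipse(p, a, b, ecc=ECCENTRICITY, tol=1e-12):
    """True if point p lies in the closed region bounded by the ellipse
    with foci a and b and eccentricity ecc (all points are (x, y) tuples)."""
    # eccentricity = focal distance / major axis, so major axis = |a - b| / ecc
    major_axis = math.dist(a, b) / ecc
    # Inside or on the ellipse: sum of distances to the foci <= major axis.
    return math.dist(p, a) + math.dist(p, b) <= major_axis + tol
```

For a matching edge from (0, 0) to (2, 0), the midpoint (1, 0) is inside, while (3, 0) is outside because its distances to the foci sum to 4, which exceeds the major axis of about 2.31.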
We give a fully polynomial-time randomized approximation scheme (FPRAS) for two-terminal reliability in directed acyclic graphs.
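For readers unfamiliar with the quantity being approximated, the sketch below computes it exactly on very small graphs. It assumes the usual model, which the abstract does not spell out: each edge survives independently with a given probability, and the reliability is the probability that the sink `t` is reachable from the source `s` through surviving edges. The enumeration takes time exponential in the number of edges, so it is only a reference definition and is unrelated to the FPRAS itself. The function name `two_terminal_reliability` and the `(u, v, p)` edge format are hypothetical.

```python
from itertools import product


def two_terminal_reliability(edges, s, t):
    """Exact s-t reliability by enumerating all edge survival patterns.

    edges: list of (u, v, p), a directed edge u -> v surviving with probability p.
    """
    total = 0.0
    for alive in product([False, True], repeat=len(edges)):
        prob = 1.0
        succ = {}
        for (u, v, p), up in zip(edges, alive):
            # Probability of this particular survival pattern.
            prob *= p if up else 1.0 - p
            if up:
                succ.setdefault(u, []).append(v)
        # Depth-first search from s over surviving edges only.
        stack, seen = [s], {s}
        while stack:
            x = stack.pop()
            for y in succ.get(x, []):
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        if t in seen:
            total += prob
    return total
```

Two edges in series, each surviving with probability 0.5, give reliability 0.25. Two parallel edges with the same survival probability give 0.75.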
We present an unsupervised 3D shape co-segmentation method which learns a set of deformable part templates from a shape collection. To accommodate structural variations in the collection, our network composes each shape by a selected subset of template parts which are affine-transformed. To maximize the expressive power of the part templates, we introduce a per-part deformation network to enable the modeling of diverse parts with substantial geometry variations, while imposing constraints on the deformation capacity to ensure fidelity to the originally represented parts. We also propose a training scheme to effectively overcome local minima. Architecturally, our network is a branched autoencoder, with a CNN encoder taking a voxel shape as input and producing per-part transformation matrices, latent codes, and part existence scores, and the decoder outputting point occupancies to define the reconstruction loss. Our network, coined DAE-Net for Deforming Auto-Encoder, can achieve unsupervised 3D shape co-segmentation that yields fine-grained, compact, and meaningful parts that are consistent across diverse shapes. We conduct extensive experiments on the ShapeNet Part dataset, DFAUST, and an animal subset of Objaverse to show superior performance over prior methods.