Accurate and efficient estimation of rare event probabilities is of significant importance, since the occurrence of such events often has widespread impacts. The focus of this work is on precisely quantifying these probabilities, which are often encountered in reliability analysis of complex engineering systems, based on an introduced framework termed Approximate Sampling Target with Post-processing Adjustment (ASTPA), which is herein integrated with and supported by gradient-based Hamiltonian Markov Chain Monte Carlo (HMCMC) methods. The techniques developed in this paper are applicable from low- to high-dimensional stochastic spaces, and the basic idea is to construct a relevant target distribution by weighting the original random variable space through a one-dimensional output likelihood model, using the limit-state function. To sample from this target distribution, we exploit HMCMC algorithms, a family of MCMC methods that adopts physical system dynamics, rather than solely a proposal probability distribution, to generate distant sequential samples, and we develop a new Quasi-Newton mass preconditioned HMCMC scheme (QNp-HMCMC), which is particularly efficient and suitable for high-dimensional spaces. To eventually compute the rare event probability, an original post-sampling step is devised, using an inverse importance sampling procedure based on the already obtained samples. The statistical properties of the estimator are analyzed as well, and the performance of the proposed methodology is examined in detail and compared against Subset Simulation in a series of challenging low- and high-dimensional problems.
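As background for readers unfamiliar with gradient-based samplers, the sketch below shows a single plain Hamiltonian Monte Carlo transition with an identity mass matrix and leapfrog integration. It is only a generic illustration, not the paper's QNp-HMCMC scheme or its ASTPA target construction; `log_prob` and `grad_log_prob` are placeholders for a user-supplied target.

```python
# A minimal sketch of one Hamiltonian Monte Carlo transition (identity mass
# matrix, leapfrog integrator). Generic illustration only, not QNp-HMCMC.
import numpy as np

def hmc_step(x, log_prob, grad_log_prob, step_size=0.1, n_leapfrog=20,
             rng=np.random.default_rng()):
    """One HMC transition targeting exp(log_prob)."""
    p = rng.standard_normal(x.shape)                 # sample auxiliary momentum
    x_new, p_new = x.copy(), p.copy()
    p_new += 0.5 * step_size * grad_log_prob(x_new)  # half-step for momentum
    for _ in range(n_leapfrog - 1):
        x_new += step_size * p_new                   # full position step
        p_new += step_size * grad_log_prob(x_new)    # full momentum step
    x_new += step_size * p_new
    p_new += 0.5 * step_size * grad_log_prob(x_new)  # final half-step
    # Metropolis accept/reject using the Hamiltonian (potential + kinetic energy).
    h_old = -log_prob(x) + 0.5 * p @ p
    h_new = -log_prob(x_new) + 0.5 * p_new @ p_new
    return x_new if np.log(rng.uniform()) < h_old - h_new else x

# Toy usage: sample from a two-dimensional standard Gaussian.
log_prob = lambda x: -0.5 * x @ x
grad_log_prob = lambda x: -x
x, samples = np.zeros(2), []
for _ in range(1000):
    x = hmc_step(x, log_prob, grad_log_prob)
    samples.append(x)
```

In the ASTPA setting, `log_prob` would correspond to the log of the weighted target distribution built from the limit-state function, and the Quasi-Newton mass preconditioning of the paper replaces the identity mass matrix used here.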
Several physical problems modeled by second-order partial differential equations can be efficiently solved using mixed finite elements of the Raviart-Thomas family for N-simplexes, introduced in the seventies. When Neumann conditions are prescribed on a curvilinear boundary, the normal component of the flux variable should preferably not take up values at nodes shifted to the boundary of the approximating polytope in the corresponding normal direction, since the method's accuracy otherwise degrades, as shown in \cite{FBRT}. In that work an order-preserving technique was studied, based on a parametric version of these elements with curved simplexes. In this paper an alternative with straight-edged triangles for two-dimensional problems is proposed. The key point of this method is a Petrov-Galerkin formulation of the mixed problem, in which the test-flux space differs slightly from the shape-flux space. After carrying out a well-posedness and stability analysis, error estimates of optimal order are proven.
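For orientation, the display below recalls the standard dual mixed formulation of the model Poisson problem that Raviart-Thomas elements discretize; the notation is generic and the Petrov-Galerkin modification of the test-flux space proposed in the paper is not reproduced here.

```latex
% Model problem -\Delta p = f in \Omega, p = 0 on \Gamma_D, flux u = -\nabla p,
% and Neumann data u\cdot n = g on \Gamma_N (generic notation, for background only).
\[
\begin{aligned}
&\text{find } (u_h, p_h) \in V_h^{g} \times Q_h,\quad
  V_h \subset H(\mathrm{div};\Omega),\ Q_h \subset L^2(\Omega):\\
&\int_\Omega u_h \cdot v \, dx \;-\; \int_\Omega p_h\, \nabla\!\cdot v \, dx \;=\; 0
  && \forall\, v \in V_h^{0},\\
&\int_\Omega (\nabla\!\cdot u_h)\, q \, dx \;=\; \int_\Omega f\, q \, dx
  && \forall\, q \in Q_h,
\end{aligned}
\]
```

Here $V_h^{g}$ (resp. $V_h^{0}$) collects Raviart-Thomas fields whose normal trace equals $g$ (resp. vanishes) on $\Gamma_N$; since the Neumann condition is thus essential on the flux space, the placement of the flux degrees of freedom near a curvilinear boundary is exactly the issue addressed above.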
Deep neural networks tend to make overconfident predictions and often require additional detectors for misclassifications, particularly for safety-critical applications. Existing detection methods usually focus only on adversarial attacks or out-of-distribution samples as causes of false predictions. However, generalization errors occur for diverse reasons, often related to poorly learned relevant invariances. We therefore propose GIT, a holistic approach for the detection of generalization errors that combines the use of gradient information and invariance transformations. The invariance transformations are designed to shift misclassified samples back into the generalization area of the neural network, while the gradient information measures the contradiction between the initial prediction and the corresponding inherent computations of the neural network on the transformed sample. Our experiments demonstrate the superior performance of GIT compared to the state of the art on a variety of network architectures, problem setups and perturbation types.
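A minimal sketch of the underlying idea, assuming a classifier `model` and an invariance transformation `transform` are given as placeholders: the sample is scored by back-propagating the disagreement between the original prediction and the network's output on the transformed input. This is an illustration only, not the authors' GIT implementation.

```python
# Toy sketch: score a prediction by how strongly the network's gradients
# "disagree" when its own prediction is used as the label for an
# invariance-transformed copy of the input. `model` and `transform` are
# placeholders supplied by the user.
import torch
import torch.nn.functional as F

def generalization_error_score(model, x, transform):
    model.eval()
    with torch.no_grad():
        pred = model(x).argmax(dim=1)            # initial prediction as pseudo-label
    x_t = transform(x)                            # shift the sample back toward the learned invariances
    model.zero_grad()
    loss = F.cross_entropy(model(x_t), pred)      # contradiction between prediction and transformed view
    loss.backward()
    # Gradient norm over all parameters as the detection score (higher = more suspicious).
    return torch.sqrt(sum((p.grad ** 2).sum()
                          for p in model.parameters() if p.grad is not None))
```

Higher scores indicate a stronger contradiction and hence a more likely generalization error; in practice a detection threshold would be calibrated on held-out data.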
The Independent Cutset problem asks whether there is a set of vertices in a given graph that is both independent and a cutset. The problem is $\textsf{NP}$-complete even when the input graph is planar and has maximum degree five. In this paper, we first present an $\mathcal{O}^*(1.4423^{n})$-time algorithm for the problem. We also show how to compute a minimum independent cutset (if one exists) in the same running time. Since the property of having an independent cutset is MSO$_1$-expressible, our main results concern structural parameterizations of the problem with parameters that are not bounded by a function of the clique-width of the input. We present $\textsf{FPT}$-time algorithms for the problem with respect to the following parameters: the dual of the maximum degree, the dual of the solution size, the size of a dominating set (where a dominating set is given as an additional input), the size of an odd cycle transversal, the distance to chordal graphs, and the distance to $P_5$-free graphs. We close by introducing the notion of $\alpha$-domination, which allows us to identify further fixed-parameter tractable and polynomial-time solvable cases.
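To make the problem statement concrete, here is a brute-force check that enumerates all vertex subsets; it runs in $\mathcal{O}^*(2^n)$ time and is meant only to illustrate the definition, not the paper's $\mathcal{O}^*(1.4423^{n})$-time algorithm or its FPT routines.

```python
# Brute-force illustration of the Independent Cutset problem definition only.
from itertools import combinations

def is_connected(vertices, edges):
    if not vertices:
        return True
    vertices = set(vertices)
    adj = {v: set() for v in vertices}
    for u, w in edges:
        if u in vertices and w in vertices:
            adj[u].add(w); adj[w].add(u)
    seen, stack = set(), [next(iter(vertices))]
    while stack:                                  # depth-first search
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(adj[v] - seen)
    return seen == vertices

def find_independent_cutset(vertices, edges):
    for k in range(1, len(vertices)):             # try small cutsets first
        for s in combinations(vertices, k):
            s_set = set(s)
            independent = not any(u in s_set and w in s_set for u, w in edges)
            cuts = not is_connected(set(vertices) - s_set, edges)
            if independent and cuts:
                return s_set
    return None

# Example: the 4-cycle a-b-c-d has the independent cutset {a, c}.
print(find_independent_cutset(['a', 'b', 'c', 'd'],
                              [('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'a')]))
```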
To improve the convergence of the randomized Kaczmarz (RK) method for solving linear systems, Bai and Wu (SIAM J. Sci. Comput., 40(1):A592--A606, 2018) originally introduced a greedy probability criterion for effectively selecting the working row from the coefficient matrix and constructed the greedy randomized Kaczmarz (GRK) method. Due to its simplicity and efficiency, this approach has inspired numerous subsequent works in recent years, such as the capped adaptive sampling rule, the greedy augmented randomized Kaczmarz method, and the greedy randomized coordinate descent method. Since the iterates of the GRK method are random variables, existing convergence analyses all concern the expectation of the error. In this note, we prove that the linear convergence of the GRK method is deterministic, i.e., it holds not merely in expectation. Moreover, Polyak's heavy-ball momentum technique is incorporated to improve the performance of the GRK method. Compared with the technique used in Loizou and Richt\'{a}rik (Comput. Optim. Appl., 77(3):653--710, 2020) for momentum variants of randomized iterative methods, we propose a refined convergence analysis, which shows that the proposed GRK method with momentum (mGRK) also enjoys deterministic linear convergence. Numerical experiments show that the mGRK method is more efficient than the GRK method.
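A sketch of how the greedy row selection of Bai and Wu can be combined with a heavy-ball momentum update, assuming a consistent system; the step size `alpha` and momentum weight `beta` below are illustrative choices rather than the parameters analyzed in the note.

```python
# Sketch of greedy randomized Kaczmarz with heavy-ball momentum (mGRK-style)
# for a consistent system Ax = b. Parameters are illustrative only.
import numpy as np

def mgrk(A, b, n_iters=2000, alpha=1.0, beta=0.3, rng=np.random.default_rng(0)):
    m, n = A.shape
    row_norms2 = np.sum(A * A, axis=1)               # squared row norms
    fro2 = row_norms2.sum()                           # squared Frobenius norm
    x_prev = x = np.zeros(n)
    for _ in range(n_iters):
        r = b - A @ x                                 # residual
        r2 = r * r
        rs = r2.sum()
        if rs == 0.0:                                 # exact solution reached
            break
        eps = 0.5 * (np.max(r2 / row_norms2) / rs + 1.0 / fro2)
        mask = r2 >= eps * rs * row_norms2            # greedy index set U_k
        probs = np.where(mask, r2, 0.0)
        probs /= probs.sum()
        i = rng.choice(m, p=probs)                    # sample a working row
        step = alpha * r[i] / row_norms2[i] * A[i]    # Kaczmarz projection step
        x, x_prev = x + step + beta * (x - x_prev), x # heavy-ball momentum
    return x

# Toy usage on a random consistent system.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 50))
x_true = rng.standard_normal(50)
print(np.linalg.norm(mgrk(A, A @ x_true) - x_true))
```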
We present methods for conditional and residual coding in the context of scalable coding for humans and machines. Our focus is on optimizing the rate-distortion performance of the reconstruction task using the information available in the computer vision task. We include an information analysis of both approaches to provide baselines, and we also propose an entropy model suitable for conditional coding, with increased modelling capacity and similar tractability to previous work. We apply these methods to image reconstruction, using, in one instance, representations created for semantic segmentation on the Cityscapes dataset and, in another, representations created for object detection on the COCO dataset. In both experiments, we obtain similar performance between the conditional and residual methods, with the resulting rate-distortion curves contained within our baselines.
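As a point of reference for the information analysis mentioned above, the standard (discrete, lossless) comparison between the two approaches can be summarized as follows, with $Y$ the base-layer representation and $f(Y)$ a prediction of the input $X$ computed from it; the notation here is illustrative and not taken from the paper.

```latex
% Entropy comparison of conditional vs. residual coding (discrete case):
% given Y, the prediction f(Y) is deterministic, and conditioning cannot
% increase entropy, hence
\[
H\big(X - f(Y)\,\big|\,Y\big) \;=\; H\big(X\,\big|\,Y\big)
\;\le\; H\big(X - f(Y)\big).
\]
```

In this idealized setting, conditional coding never requires more rate than residual coding, which is why the two serve naturally as baselines when comparing practical codecs.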
We consider the computation of free-energy-like quantities for diffusions in high dimension, when resorting to Monte Carlo simulation is necessary. Such stochastic computations typically suffer from high variance, in particular in the low-noise regime, because the expectation is dominated by rare trajectories for which the observable reaches large values. Although importance sampling, or tilting of trajectories, is now a standard technique for reducing the variance of such estimators, quantitative criteria for proving that a given control reduces variance are scarce and often do not apply to practical situations. The goal of this work is to provide a quantitative criterion for assessing whether a given bias reduces variance, and at which order. We rely for this on a recently introduced notion of stochastic solution for Hamilton-Jacobi-Bellman (HJB) equations. Based on this tool, we introduce the notion of a k-stochastic viscosity approximation (SVA) of an HJB equation. We next prove that such approximate solutions are associated with estimators having a relative variance of order k-1 at log scale. In particular, a sampling scheme built from a 1-SVA has bounded variance as the noise goes to zero. Finally, to show that our definition is relevant, we provide examples of stochastic viscosity approximations of order one and two, with a numerical illustration confirming our theoretical findings.
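For concreteness, a schematic form of the quantities involved is shown below (the notation is chosen here for illustration and may differ from the paper's): the free-energy-like quantity for a diffusion $X^{\varepsilon}$ with small noise $\varepsilon$, and its importance sampling estimator obtained by simulating a controlled (tilted) process and reweighting with the Girsanov likelihood ratio.

```latex
% Free-energy-like quantity and its tilted Monte Carlo estimator
% (schematic notation; u denotes the control/bias, \mathbb{P}^u the tilted law).
\[
\lambda_\varepsilon \;=\; \varepsilon \,\log \mathbb{E}\!\left[ e^{F(X^{\varepsilon})/\varepsilon} \right],
\qquad
\widehat{\lambda}_{\varepsilon}^{\,u} \;=\; \varepsilon \,\log\!\left( \frac{1}{N} \sum_{i=1}^{N}
  e^{F(X^{\varepsilon,u}_{i})/\varepsilon}\,
  \frac{\mathrm{d}\mathbb{P}}{\mathrm{d}\mathbb{P}^{u}}\!\big(X^{\varepsilon,u}_{i}\big) \right).
\]
```

The criterion of the paper then concerns how the relative variance of the reweighted observable behaves as $\varepsilon \to 0$, which the notion of $k$-SVA quantifies at log scale.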
Mutual coherence is a measure of similarity between two opinions. Although the notion comes from philosophy, it is essential for a wide range of technologies, e.g., the Wahl-O-Mat system. In Germany, this system helps voters find the candidates closest to their political preferences. Computing mutual coherence exactly is highly time-consuming, because it requires iterating over all subsets of an opinion and, for every subset, solving an instance of the SAT model counting problem, which is known to be a hard problem in computer science. This work is the first study to accelerate this computation. We model the distribution of the so-called confirmation values as a mixture of three Gaussians and present efficient heuristics to estimate its model parameters. The mutual coherence is then approximated by the expected value of this distribution. Some of the presented algorithms run in fully polynomial time, while others only require solving a small number of instances of the SAT model counting problem. The average squared error of our best algorithm lies below 0.0035, which is negligible given the efficiency gained. Furthermore, the approximation is accurate enough to be used in Wahl-O-Mat-like systems.
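A minimal sketch of the approximation step, assuming a sample of confirmation values is available: fit a three-component Gaussian mixture and report its mean. The helper below uses scikit-learn's generic EM fit for brevity; the paper's dedicated heuristics for estimating the mixture parameters (which avoid most SAT model-counting calls) are not reproduced.

```python
# Approximate mutual coherence as the expected value of a three-component
# Gaussian mixture fitted to sampled confirmation values (sketch only).
import numpy as np
from sklearn.mixture import GaussianMixture

def approximate_mutual_coherence(confirmation_values):
    values = np.asarray(confirmation_values).reshape(-1, 1)
    gmm = GaussianMixture(n_components=3, random_state=0).fit(values)
    # Expected value of the mixture = weight-averaged component means.
    return float(np.dot(gmm.weights_, gmm.means_.ravel()))

# Toy usage with synthetic confirmation values.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-0.4, 0.10, 300),
                         rng.normal(0.0, 0.05, 300),
                         rng.normal(0.5, 0.10, 400)])
print(approximate_mutual_coherence(sample))
```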
The accurate and efficient simulation of Partial Differential Equations (PDEs) in and around arbitrarily defined geometries is critical for many application domains. Immersed boundary methods (IBMs) alleviate the usually laborious and time-consuming process of creating body-fitted meshes around complex geometry models (described by CAD or other representations, e.g., STL, point clouds), especially when high levels of mesh adaptivity are required. In this work, we advance the field of IBMs in the context of the recently developed Shifted Boundary Method (SBM). In the SBM, the location where boundary conditions are enforced is shifted from the actual boundary of the immersed object to a nearby surrogate boundary, and boundary conditions are corrected utilizing Taylor expansions. This approach allows choosing surrogate boundaries that conform to a Cartesian mesh without losing accuracy or stability. Our contributions in this work are as follows: (a) we show that the SBM numerical error can be greatly reduced by an optimal choice of the surrogate boundary, (b) we mathematically prove the optimal convergence of the SBM for this optimal choice of the surrogate boundary, (c) we deploy the SBM on massively parallel octree meshes, including algorithmic advances to handle incomplete octrees, and (d) we showcase the applicability of these approaches with a wide variety of simulations involving complex shapes, sharp corners, and different topologies. Specific emphasis is given to Poisson's equation and the linear elasticity equations.
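The central SBM ingredient referred to above can be stated compactly. For a Dirichlet condition $u = u_D$ on the true boundary $\Gamma$, with $\tilde{x}$ a point on the surrogate boundary and $\mathbf{d}(\tilde{x})$ the vector from $\tilde{x}$ to its image on $\Gamma$, the first-order shifted condition enforced weakly reads as below (notation follows common SBM usage rather than this paper's).

```latex
% Shifted (Taylor-corrected) Dirichlet condition on the surrogate boundary:
\[
u_h(\tilde{x}) \;+\; \nabla u_h(\tilde{x}) \cdot \mathbf{d}(\tilde{x})
\;=\; u_D\big(\tilde{x} + \mathbf{d}(\tilde{x})\big).
\]
```

Choosing the surrogate boundary, and hence the map $\tilde{x} \mapsto \mathbf{d}(\tilde{x})$, is exactly the degree of freedom whose optimal selection is analyzed in contributions (a) and (b).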
High-dimensional complex multi-parameter problems are prevalent in engineering, exceeding the capabilities of traditional surrogate models designed for low/medium-dimensional problems. These models face the curse of dimensionality, resulting in decreased modeling accuracy as the design parameter space expands. Furthermore, the lack of a parameter decoupling mechanism hinders the identification of couplings between design variables, particularly in highly nonlinear cases. To address these challenges and enhance prediction accuracy while reducing sample demand, this paper proposes a PC-Kriging-HDMR approximate modeling method within the framework of Cut-HDMR. The method leverages the precision of PC-Kriging and optimizes test point placement through a multi-stage adaptive sequential sampling strategy. This strategy encompasses a first-stage adaptive proportional sampling criterion and a second-stage central-based maximum entropy criterion. Numerical tests and a practical application involving a cantilever beam demonstrate the advantages of the proposed method. Key findings include: (1) The performance of traditional single-surrogate models, such as Kriging, significantly deteriorates in high-dimensional nonlinear problems compared to combined surrogate models under the Cut-HDMR framework (e.g., Kriging-HDMR, PCE-HDMR, SVR-HDMR, MLS-HDMR, and PC-Kriging-HDMR); (2) The number of samples required for PC-Kriging-HDMR modeling increases polynomially rather than exponentially as the parameter space expands, resulting in substantial computational cost reduction; (3) Among existing Cut-HDMR methods, no single approach outperforms the others in all aspects. However, PC-Kriging-HDMR exhibits improved modeling accuracy and efficiency within the desired improvement range compared to PCE-HDMR and Kriging-HDMR, demonstrating robustness.
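For reference, the Cut-HDMR decomposition underlying the method expands the response around a cut point $\mathbf{c}$ and is typically truncated at second order; in the proposed approach each component function is approximated by PC-Kriging, with the sampling criteria above deciding where its support points are placed. The notation below is the standard one and may differ from the paper's.

```latex
% Cut-HDMR expansion around a cut point c; c_{-i} denotes the cut point with
% its i-th coordinate replaced by x_i (analogously for c_{-ij}).
\[
\begin{aligned}
f(\mathbf{x}) &= f_0 + \sum_i f_i(x_i) + \sum_{i<j} f_{ij}(x_i,x_j) + \cdots,\\
f_0 &= f(\mathbf{c}), \qquad
f_i(x_i) = f(x_i,\mathbf{c}_{-i}) - f_0,\\
f_{ij}(x_i,x_j) &= f(x_i,x_j,\mathbf{c}_{-ij}) - f_i(x_i) - f_j(x_j) - f_0.
\end{aligned}
\]
```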
When is heterogeneity in the composition of an autonomous robotic team beneficial and when is it detrimental? We investigate and answer this question in the context of a minimally viable model that examines the role of heterogeneous speeds in perimeter defense problems, where defenders share a total allocated speed budget. We consider two distinct problem settings and develop strategies based on dynamic programming and on local interaction rules. We present a theoretical analysis of both approaches and our results are extensively validated using simulations. Interestingly, our results demonstrate that the viability of heterogeneous teams depends on the amount of information available to the defenders. Moreover, our results suggest a universality property: across a wide range of problem parameters the optimal ratio of the speeds of the defenders remains nearly constant.