Lawson's iteration is a classical and effective method for solving the linear (polynomial) minimax approximation in the complex plane. Extension of Lawson's iteration for the rational minimax approximation with both computationally high efficiency and theoretical guarantee is challenging. A recent work [L.-H. Zhang, L. Yang, W. H. Yang and Y.-N. Zhang, A convex dual programming for the rational minimax approximation and Lawson's iteration, 2023, arxiv.org/pdf/2308.06991v1] reveals that Lawson's iteration can be viewed as a method for solving the dual problem of the original rational minimax approximation, and a new type of Lawson's iteration was proposed. Such a dual problem is guaranteed to obtain the original minimax solution under Ruttan's sufficient condition, and numerically, the proposed Lawson's iteration was observed to converge monotonically with respect to the dual objective function. In this paper, we perform theoretical convergence analysis for Lawson's iteration for both the linear and rational minimax approximations. In particular, we show that (i) for the linear minimax approximation, the near-optimal Lawson exponent $\beta$ in Lawson's iteration is $\beta=1$, and (ii) for the rational minimax approximation, the proposed Lawson's iteration converges monotonically with respect to the dual objective function for any sufficiently small $\beta>0$, and the convergent solution fulfills the complementary slackness: all nodes associated with positive weights achieve the maximum error.
We consider the numerical behavior of the fixed-stress splitting method for coupled poromechanics as undrained regimes are approached. We explain that pressure stability is related to the splitting error of the scheme, not the fact that the discrete saddle point matrix never appears in the fixed-stress approach. This observation reconciles previous results regarding the pressure stability of the splitting method. Using examples of compositional poromechanics with application to geological CO$_2$ sequestration, we see that solutions obtained using the fixed-stress scheme with a low order finite element-finite volume discretization which is not inherently inf-sup stable can exhibit the same pressure oscillations obtained with the corresponding fully implicit scheme. Moreover, pressure jump stabilization can effectively remove these spurious oscillations in the fixed-stress setting, while also improving the efficiency of the scheme in terms of the number of iterations required at every time step to reach convergence.
Random features models play a distinguished role in the theory of deep learning, describing the behavior of neural networks close to their infinite-width limit. In this work, we present a thorough analysis of the generalization performance of random features models for generic supervised learning problems with Gaussian data. Our approach, built with tools from the statistical mechanics of disordered systems, maps the random features model to an equivalent polynomial model, and allows us to plot average generalization curves as functions of the two main control parameters of the problem: the number of random features $N$ and the size $P$ of the training set, both assumed to scale as powers in the input dimension $D$. Our results extend the case of proportional scaling between $N$, $P$ and $D$. They are in accordance with rigorous bounds known for certain particular learning tasks and are in quantitative agreement with numerical experiments performed over many order of magnitudes of $N$ and $P$. We find good agreement also far from the asymptotic limits where $D\to \infty$ and at least one between $P/D^K$, $N/D^L$ remains finite.
In this paper, we design a new kind of high order inverse Lax-Wendroff (ILW) boundary treatment for solving hyperbolic conservation laws with finite difference method on a Cartesian mesh. This new ILW method decomposes the construction of ghost point values near inflow boundary into two steps: interpolation and extrapolation. At first, we impose values of some artificial auxiliary points through a polynomial interpolating the interior points near the boundary. Then, we will construct a Hermite extrapolation based on those auxiliary point values and the spatial derivatives at boundary obtained via the ILW procedure. This polynomial will give us the approximation to the ghost point value. By an appropriate selection of those artificial auxiliary points, high-order accuracy and stable results can be achieved. Moreover, theoretical analysis indicates that comparing with the original ILW method, especially for higher order accuracy, the new proposed one would require fewer terms using the relatively complicated ILW procedure and thus improve computational efficiency on the premise of maintaining accuracy and stability. We perform numerical experiments on several benchmarks, including one- and two-dimensional scalar equations and systems. The robustness and efficiency of the proposed scheme is numerically verified.
We consider the classical problems of interpolating a polynomial given a black box for evaluation, and of multiplying two polynomials, in the setting where the bit-lengths of the coefficients may vary widely, so-called unbalanced polynomials. Writing s for the total bit-length and D for the degree, our new algorithms have expected running time $\tilde{O}(s \log D)$, whereas previous methods for (resp.) dense or sparse arithmetic have at least $\tilde{O}(sD)$ or $\tilde{O}(s^2)$ bit complexity.
By interpreting planar polynomial curves as complex-valued functions of a real parameter, an inner product, norm, metric function, and the notion of orthogonality may be defined for such curves. This approach is applied to the complex pre-image polynomials that generate planar Pythagorean-hodograph (PH) curves, to facilitate the implementation of bounded modifications of them that preserve their PH nature. The problems of bounded modifications under the constraint of fixed curve end points and end tangent directions, and of increasing the arc length of a PH curve by a prescribed amount, are also addressed.
The broad class of multivariate unified skew-normal (SUN) distributions has been recently shown to possess fundamental conjugacy properties. When used as priors for the vector of parameters in general probit, tobit, and multinomial probit models, these distributions yield posteriors that still belong to the SUN family. Although such a core result has led to important advancements in Bayesian inference and computation, its applicability beyond likelihoods associated with fully-observed, discretized, or censored realizations from multivariate Gaussian models remains yet unexplored. This article covers such an important gap by proving that the wider family of multivariate unified skew-elliptical (SUE) distributions, which extends SUNs to more general perturbations of elliptical densities, guarantees conjugacy for broader classes of models, beyond those relying on fully-observed, discretized or censored Gaussians. Such a result leverages the closure under linear combinations, conditioning and marginalization of SUE to prove that such a family is conjugate to the likelihood induced by general multivariate regression models for fully-observed, censored or dichotomized realizations from skew-elliptical distributions. This advancement substantially enlarges the set of models that enable conjugate Bayesian inference to general formulations arising from elliptical and skew-elliptical families, including the multivariate Student's t and skew-t, among others.
We give a fully polynomial-time randomized approximation scheme (FPRAS) for two terminal reliability in directed acyclic graphs (DAGs). In contrast, we also show the complementing problem of approximating two terminal unreliability in DAGs is #BIS-hard.
Quantization for a Borel probability measure refers to the idea of estimating a given probability by a discrete probability with support containing a finite number of elements. In this paper, we have considered a Borel probability measure $P$ on $\mathbb R^2$, which has support a nonuniform stretched Sierpi\'{n}ski triangle generated by a set of three contractive similarity mappings on $\mathbb R^2$. For this probability measure, we investigate the optimal sets of $n$-means and the $n$th quantization errors for all positive integers $n$.
Mesh-based Graph Neural Networks (GNNs) have recently shown capabilities to simulate complex multiphysics problems with accelerated performance times. However, mesh-based GNNs require a large number of message-passing (MP) steps and suffer from over-smoothing for problems involving very fine mesh. In this work, we develop a multiscale mesh-based GNN framework mimicking a conventional iterative multigrid solver, coupled with adaptive mesh refinement (AMR), to mitigate challenges with conventional mesh-based GNNs. We use the framework to accelerate phase field (PF) fracture problems involving coupled partial differential equations with a near-singular operator due to near-zero modulus inside the crack. We define the initial graph representation using all mesh resolution levels. We perform a series of downsampling steps using Transformer MP GNNs to reach the coarsest graph followed by upsampling steps to reach the original graph. We use skip connectors from the generated embedding during coarsening to prevent over-smoothing. We use Transfer Learning (TL) to significantly reduce the size of training datasets needed to simulate different crack configurations and loading conditions. The trained framework showed accelerated simulation times, while maintaining high accuracy for all cases compared to physics-based PF fracture model. Finally, this work provides a new approach to accelerate a variety of mesh-based engineering multiphysics problems
Inner products of neural network feature maps arises in a wide variety of machine learning frameworks as a method of modeling relations between inputs. This work studies the approximation properties of inner products of neural networks. It is shown that the inner product of a multi-layer perceptron with itself is a universal approximator for symmetric positive-definite relation functions. In the case of asymmetric relation functions, it is shown that the inner product of two different multi-layer perceptrons is a universal approximator. In both cases, a bound is obtained on the number of neurons required to achieve a given accuracy of approximation. In the symmetric case, the function class can be identified with kernels of reproducing kernel Hilbert spaces, whereas in the asymmetric case the function class can be identified with kernels of reproducing kernel Banach spaces. Finally, these approximation results are applied to analyzing the attention mechanism underlying Transformers, showing that any retrieval mechanism defined by an abstract preorder can be approximated by attention through its inner product relations. This result uses the Debreu representation theorem in economics to represent preference relations in terms of utility functions.