We show that highly accurate approximations can often be obtained by constructing Thiele interpolating continued fractions through a greedy selection of the interpolation points together with an early termination condition. The results obtained are comparable with those of state-of-the-art rational interpolation techniques based on the barycentric form.
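As a rough illustration of the construction described above, the following Python sketch builds a Thiele continued fraction from inverse differences and greedily adds the candidate point where the current interpolant errs the most, stopping early once a tolerance is met. The seed nodes, tolerance, and lack of handling for degenerate (pole-like) cases are simplifying assumptions of this sketch, not the paper's algorithm, and the target function f is assumed to be vectorized over NumPy arrays.

```python
import numpy as np

def thiele_coefficients(xs, fs):
    """Inverse-difference coefficients a_k of the Thiele continued fraction
    through the nodes (xs[k], fs[k])."""
    xs = np.asarray(xs, dtype=float)
    rho = np.asarray(fs, dtype=float)      # current column of inverse differences
    a = [rho[0]]
    for k in range(1, len(xs)):
        # may divide by ~0 in degenerate configurations; ignored in this sketch
        rho = (xs[k:] - xs[k - 1]) / (rho[1:] - rho[0])
        a.append(rho[0])
    return xs, np.array(a)

def thiele_eval(xs, a, x):
    """Evaluate the continued fraction a_0 + (x - x_0)/(a_1 + (x - x_1)/(...))."""
    val = a[-1]
    for k in range(len(a) - 2, -1, -1):
        val = a[k] + (x - xs[k]) / val
    return val

def greedy_thiele(f, candidates, tol=1e-10, max_nodes=40):
    """Greedy node selection with early termination (illustrative only)."""
    nodes = [candidates[0], candidates[-1]]            # assumed seed nodes
    while True:
        xs, a = thiele_coefficients(nodes, f(np.asarray(nodes)))
        err = np.abs(f(candidates) - thiele_eval(xs, a, candidates))
        worst = int(np.argmax(err))
        if err[worst] < tol or len(nodes) >= max_nodes:
            return xs, a
        nodes.append(candidates[worst])
```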
This paper studies the expressive power of artificial neural networks (NNs) with rectified linear units. To study them as a model of real-valued computation, we introduce the concept of Max-Affine Arithmetic Programs and show that they are equivalent to NNs with respect to natural complexity measures. We then use this result to show that two fundamental combinatorial optimization problems can be solved with polynomial-size NNs, which is equivalent to the existence of very special strongly polynomial time algorithms. First, we show that for any undirected graph with $n$ nodes, there is an NN of size $\mathcal{O}(n^3)$ that takes the edge weights as input and computes the value of a minimum spanning tree of the graph. Second, we show that for any directed graph with $n$ nodes and $m$ arcs, there is an NN of size $\mathcal{O}(m^2n^2)$ that takes the arc capacities as input and computes a maximum flow. These results imply in particular that the solutions of the corresponding parametric optimization problems, where all edge weights or arc capacities are free parameters, can be encoded in polynomial space and evaluated in polynomial time, and that such an encoding is provided by an NN.
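As a toy illustration (not the paper's construction) of why ReLU networks can express the min/max comparisons that such combinatorial algorithms rely on: the minimum and the maximum of two inputs are each exactly computable with a single ReLU unit sandwiched between affine layers.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

# min(a, b) = a - ReLU(a - b) and max(a, b) = a + ReLU(b - a):
# an affine map, one ReLU unit, and another affine map in each case.
def nn_min(a, b):
    return a - relu(a - b)

def nn_max(a, b):
    return a + relu(b - a)

assert nn_min(3.0, 5.0) == 3.0 and nn_max(3.0, 5.0) == 5.0
```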
In this paper, we study an initial-boundary value problem of Kirchhoff type involving a memory term for non-homogeneous materials. The purpose of this research is threefold. First, we prove the existence and uniqueness of weak solutions to the problem using the Galerkin method. Second, to obtain numerical solutions efficiently, we develop an L1-type backward Euler-Galerkin FEM, which is $O(h+k^{2-\alpha})$ accurate, where $\alpha\,(0<\alpha<1)$ is the order of the fractional time derivative, and $h$ and $k$ are the discretization parameters in the space and time directions, respectively. Third, to achieve the optimal rate of convergence in time, we propose a fractional Crank-Nicolson-Galerkin FEM based on the L2-1$_{\sigma}$ scheme. We prove that the numerical solutions of this scheme converge to the exact solution with accuracy $O(h+k^{2})$. We also derive a priori bounds on the numerical solutions for the proposed schemes. Finally, some numerical experiments are conducted to validate our theoretical claims.
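For reference, one standard form of the L1 approximation of a Caputo derivative of order $\alpha$ on a uniform time grid $t_n = nk$ (notation ours, not necessarily the paper's) is
$$\partial_t^{\alpha} u(t_n) \approx \frac{k^{-\alpha}}{\Gamma(2-\alpha)} \sum_{j=0}^{n-1} b_j \bigl(u(t_{n-j}) - u(t_{n-j-1})\bigr), \qquad b_j = (j+1)^{1-\alpha} - j^{1-\alpha},$$
whose $O(k^{2-\alpha})$ truncation error for smooth solutions matches the temporal accuracy stated above.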
State-of-the-art probabilistic model checkers perform verification on explicit-state Markov models defined in a high-level programming formalism like the PRISM modeling language. Typically, the low-level models resulting from such program-like specifications exhibit a lot of structure, such as repeating subpatterns. Established techniques like probabilistic bisimulation minimization are able to exploit these structures; however, they operate directly on the explicit-state model. Methods for reducing structured state spaces by reasoning about the high-level program, on the other hand, have received far less attention. In this paper, we present a new, simple, and fully automatic program-level technique to reduce the underlying Markov model. Our approach computes the summary behavior of adjacent locations in the program's control-flow graph, thereby obtaining a program with fewer "control states". This reduction is immediately reflected in the program's operational semantics, enabling more efficient model checking. A key insight is that, in principle, each (combination of) program variable(s) with finite domain can play the role of the program counter that defines the flow structure. Unlike most other reduction techniques, our approach is property-directed and naturally supports unspecified model parameters. Experiments demonstrate that our simple method yields state-space reductions of up to 80% on practically relevant benchmarks.
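The following Python fragment is a hypothetical toy illustration, not the paper's technique, of the underlying idea of summarizing behavior across intermediate states: transitions through a set of eliminated states are folded into direct transition probabilities between the states that are kept.

```python
import numpy as np

def eliminate(P, keep):
    """Summarize a DTMC transition matrix P onto the states in `keep`,
    folding paths through the remaining (assumed transient) states into
    direct transition probabilities between kept states."""
    keep = np.asarray(keep)
    elim = np.setdiff1d(np.arange(P.shape[0]), keep)
    P_KK = P[np.ix_(keep, keep)]
    P_KE = P[np.ix_(keep, elim)]
    P_EE = P[np.ix_(elim, elim)]
    P_EK = P[np.ix_(elim, keep)]
    # probability of reaching a kept state directly or via eliminated states
    return P_KK + P_KE @ np.linalg.solve(np.eye(len(elim)) - P_EE, P_EK)
```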
We propose an efficient method for the numerical approximation of a general class of two-dimensional semilinear parabolic problems on polygonal meshes. The proposed approach takes advantage of the properties of the serendipity version of the Virtual Element Method (VEM), which not only significantly reduces the number of degrees of freedom compared to the classical VEM but also, under certain conditions on the mesh, makes it possible to approximate the nonlinear term with an interpolant in the serendipity VEM space, which substantially improves the efficiency of the method. An error analysis for the semi-discrete formulation is carried out, and an optimal estimate for the error in the $L_2$-norm is obtained. The accuracy and efficiency of the proposed method, when combined with a second-order Strang operator splitting time discretization, are illustrated in our numerical experiments, with approximations up to order $6$.
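For context, writing the semilinear problem as $u_t = \Delta u + f(u)$, one common form of a second-order Strang operator splitting advances a time step of size $k$ as
$$u^{n+1} = \mathcal{S}^{f}_{k/2} \circ \mathcal{S}^{\Delta}_{k} \circ \mathcal{S}^{f}_{k/2}\,(u^{n}),$$
where $\mathcal{S}^{\Delta}$ propagates the linear diffusive part and $\mathcal{S}^{f}$ the nonlinear reaction part; the ordering of the substeps shown here is one standard choice and not necessarily the one used in the paper.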
A novel numerical approach to solving the shallow-water equations on the sphere using high-order numerical discretizations in both space and time is proposed. A space-time tensor formalism is used to express the equations of motion covariantly and to describe the geometry of the rotated cubed-sphere grid. The spatial discretization is done with the direct flux reconstruction method, which is an alternative formulation to the discontinuous Galerkin approach. The equations of motion are solved in differential form and the resulting discretization is free from quadrature rules. It is well known that the time step of traditional explicit methods is limited by the phase velocity of the fastest waves. Exponential integration is employed to enable integrations with significantly larger time step sizes and improve the efficiency of the overall time integration. New multistep-type exponential propagation iterative methods of orders 4, 5 and 6 are constructed and applied to integrate the shallow-water equations in time. These new schemes enable time integration with high-order accuracy but without significant increases in computational time compared to low-order methods. The exponential matrix functions-vector products used in the exponential schemes are approximated using the complex-step approximation of the Jacobian in the Krylov-based KIOPS (Krylov with incomplete orthogonalization procedure solver) algorithm. Performance of the new numerical methods is evaluated using a set of standard benchmark tests.
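As a small illustration of the complex-step Jacobian approximation mentioned above, a Jacobian-vector product can be computed without subtractive cancellation as follows; this is a generic sketch, and the right-hand-side function F is assumed to accept complex-valued input.

```python
import numpy as np

def jvp_complex_step(F, u, v, eps=1e-30):
    """Approximate J(u) @ v, where J is the Jacobian of F, via a complex step:
    J(u) v ~= Im(F(u + i*eps*v)) / eps.  The O(eps^2) error involves no
    subtractive cancellation, so eps can be taken extremely small."""
    return np.imag(F(u + 1j * eps * v)) / eps
```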
We provide matching upper and lower bounds of order $\sigma^2/\log(d/n)$ for the prediction error of the minimum $\ell_1$-norm interpolator, a.k.a. basis pursuit. Our result is tight up to negligible terms when $d \gg n$, and is the first to imply asymptotic consistency of noisy minimum-norm interpolation for isotropic features and sparse ground truths. Our work complements the literature on "benign overfitting" for minimum $\ell_2$-norm interpolation, where asymptotic consistency can be achieved only when the features are effectively low-dimensional.
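For reference, with a design matrix $X \in \mathbb{R}^{n \times d}$ and responses $y \in \mathbb{R}^{n}$ (notation ours), the minimum $\ell_1$-norm interpolator, i.e. the basis pursuit solution, is
$$\hat{\beta} \in \operatorname*{arg\,min}_{\beta \in \mathbb{R}^{d}} \ \|\beta\|_{1} \quad \text{subject to} \quad X\beta = y,$$
and the bounds above concern its prediction error relative to a sparse ground truth.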
Series expansions have been a cornerstone of applied mathematics and engineering for centuries. In this paper, we revisit the Taylor series expansion from a modern Machine Learning perspective. Specifically, we introduce the Fast Continuous Convolutional Taylor Transform (FC2T2), a variant of the Fast Multipole Method (FMM), that allows for the efficient approximation of low-dimensional convolutional operators in continuous space. We build upon the FMM, an approximate algorithm that reduces the computational complexity of N-body problems from O(NM) to O(N+M) and finds application in, e.g., particle simulations. As an intermediary step, the FMM produces a series expansion for every cell of a grid, and we introduce algorithms that act directly upon this representation. These algorithms analytically but approximately compute the quantities required for the forward and backward pass of the backpropagation algorithm and can therefore be employed as (implicit) layers in Neural Networks. Specifically, we introduce a root-implicit layer that outputs surface normals and object distances as well as an integral-implicit layer that outputs a rendering of a radiance field given a 3D pose. In the context of Machine Learning, $N$ and $M$ can be understood as the number of model parameters and the number of model evaluations, respectively. Consequently, for applications that require repeated function evaluations, which are prevalent in Computer Vision and Graphics, the techniques introduced in this paper scale gracefully with the number of parameters, unlike regular Neural Networks. For some applications, this results in a 200x reduction in FLOPs compared to state-of-the-art approaches with little or no loss in accuracy.
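A hypothetical toy sketch (one spatial dimension; names and layout are ours) of the kind of representation such layers act on: a field stored as per-cell Taylor coefficients can be evaluated and differentiated analytically at query points, which is what makes the required forward- and backward-pass quantities cheap to obtain.

```python
import numpy as np

def eval_taylor_field(edges, centers, coeffs, x):
    """Evaluate a field stored as per-cell Taylor coefficients, and its
    spatial derivative, at query points x.
    coeffs[i, k] is the k-th coefficient of cell i expanded about centers[i];
    edges are the cell boundaries used to locate each query point."""
    i = np.clip(np.searchsorted(edges, x) - 1, 0, len(centers) - 1)
    dx = x - centers[i]
    k = np.arange(coeffs.shape[1])
    powers = dx[:, None] ** k                              # (x - c)^k
    value = (coeffs[i] * powers).sum(axis=1)
    dpowers = k * dx[:, None] ** np.maximum(k - 1, 0)      # d/dx (x - c)^k
    grad = (coeffs[i] * dpowers).sum(axis=1)
    return value, grad
```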
Federated learning is a distributed machine learning method that aims to preserve the privacy of sample features and labels. In a federated learning system, ID-based sample alignment approaches are usually applied, with little effort devoted to protecting ID privacy. In real-life applications, however, the confidentiality of sample IDs, which are the strongest row identifiers, is also drawing much attention from many participants. To address these ID privacy concerns, this paper formally proposes the notion of asymmetrical vertical federated learning and shows how sample IDs can be protected. The standard private set intersection protocol is adapted to achieve the asymmetrical ID alignment phase in an asymmetrical vertical federated learning system, and a Pohlig-Hellman realization of the adapted protocol is provided. This paper also presents a genuine-with-dummy approach to achieving asymmetrical federated model training. To illustrate its application, a federated logistic regression algorithm is provided as an example. Experiments are also conducted to validate the feasibility of this approach.
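A minimal sketch of the commutative-encryption idea behind a Pohlig-Hellman style private set intersection (toy parameters, no hashing-to-group subtleties, and not the adapted asymmetrical protocol from the paper): because $(h^{a})^{b} \equiv (h^{b})^{a} \pmod p$, the two parties can compare doubly-blinded IDs without revealing the IDs themselves.

```python
import hashlib
import secrets

# Toy modulus: the Mersenne prime 2^127 - 1. A real deployment would use a far
# larger, standardized group and exponents chosen coprime to P - 1.
P = (1 << 127) - 1

def h(sample_id: str) -> int:
    return int.from_bytes(hashlib.sha256(sample_id.encode()).digest(), "big") % P

def blind(values, key):
    return [pow(v, key, P) for v in values]

# Two parties with overlapping sample IDs and independent secret exponents.
ids_a, ids_b = ["u1", "u2", "u3"], ["u2", "u3", "u4"]
ka, kb = secrets.randbelow(P - 3) + 2, secrets.randbelow(P - 3) + 2

once_a = blind([h(i) for i in ids_a], ka)        # A -> B: H(id)^ka
twice_a = blind(once_a, kb)                      # B -> A: H(id)^(ka*kb)
once_b = blind([h(i) for i in ids_b], kb)        # B -> A: H(id)^kb
twice_b = set(blind(once_b, ka))                 # A computes H(id)^(kb*ka)

# A learns which of its own IDs lie in the intersection and nothing more.
intersection = [ids_a[j] for j, v in enumerate(twice_a) if v in twice_b]
assert intersection == ["u2", "u3"]
```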
The piecewise constant Mumford-Shah (PCMS) model and the Rudin-Osher-Fatemi (ROF) model are two of the most famous variational models in image segmentation and image restoration, respectively, and have ubiquitous applications in image processing. In this paper, we explore the link between these two important models. We prove that, for the two-phase segmentation problem, the optimal solution of the PCMS model can be obtained by thresholding the minimizer of the ROF model. This link remains valid for multiphase segmentation under mild assumptions. It thus opens a new segmentation paradigm: image segmentation can be done via image restoration plus thresholding. This new paradigm, which circumvents the innate non-convexity of the PCMS model, improves the segmentation performance in both efficiency (much faster than state-of-the-art methods based on the PCMS model, particularly when the number of phases is large) and effectiveness (producing segmentation results of better quality), owing to the flexibility of the ROF model in tackling degraded images such as noisy images, blurry images, or images with information loss. As a by-product of the new paradigm, we derive a novel segmentation method, termed the thresholded-ROF (T-ROF) method, to illustrate the virtue of approaching image segmentation through image restoration techniques. The convergence of the T-ROF method under certain conditions is proved, and elaborate experimental results and comparisons are presented.
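A minimal sketch of the "restoration plus thresholding" idea for the two-phase case, using scikit-image's TV denoiser (a Chambolle-type ROF solver) for the restoration step and a crude fixed threshold; the paper derives and iterates the appropriate thresholds, so the weight and threshold below are illustrative assumptions only.

```python
import numpy as np
from skimage.restoration import denoise_tv_chambolle

def two_phase_segment(image, weight=0.1, threshold=None):
    """Segment a grayscale image into two phases by thresholding an
    ROF/TV-restored version of it (toy version of the T-ROF idea)."""
    restored = denoise_tv_chambolle(image, weight=weight)
    if threshold is None:
        threshold = restored.mean()        # crude choice for illustration
    return restored > threshold
```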
Autoencoders provide a powerful framework for learning compressed representations by encoding all of the information needed to reconstruct a data point in a latent code. In some cases, autoencoders can "interpolate": By decoding the convex combination of the latent codes for two datapoints, the autoencoder can produce an output which semantically mixes characteristics from the datapoints. In this paper, we propose a regularization procedure which encourages interpolated outputs to appear more realistic by fooling a critic network which has been trained to recover the mixing coefficient from interpolated data. We then develop a simple benchmark task where we can quantitatively measure the extent to which various autoencoders can interpolate and show that our regularizer dramatically improves interpolation in this setting. We also demonstrate empirically that our regularizer produces latent codes which are more effective on downstream tasks, suggesting a possible link between interpolation abilities and learning useful representations.
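A condensed PyTorch-style sketch of the kind of training objective described above (module names, loss weights, the restriction of the mixing coefficient, and the flat latent shape are placeholders, not the paper's exact setup): the critic is trained to regress the mixing coefficient from decoded interpolants, while the autoencoder is additionally penalized whenever the critic succeeds.

```python
import torch
import torch.nn.functional as F

def training_step(encoder, decoder, critic, x1, x2, lam=0.5):
    """One joint step of an interpolation-regularized autoencoder (sketch).
    Assumes flat latent codes of shape (batch, dim); in practice the
    autoencoder and critic would be updated by separate optimizers."""
    z1, z2 = encoder(x1), encoder(x2)
    alpha = torch.rand(x1.size(0), 1, device=x1.device) * 0.5   # mixing coefficient
    z_mix = alpha * z1 + (1 - alpha) * z2
    x_mix = decoder(z_mix)

    recon = F.mse_loss(decoder(z1), x1) + F.mse_loss(decoder(z2), x2)
    ae_loss = recon + lam * critic(x_mix).pow(2).mean()         # push critic toward 0

    critic_loss = F.mse_loss(critic(x_mix.detach()).squeeze(-1),
                             alpha.squeeze(-1))                 # critic recovers alpha
    return ae_loss, critic_loss
```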