Sleeve functions are generalizations of the well-established ridge functions that play a major role in the theory of partial differential equations, medical imaging, statistics, and neural networks. Whereas ridge functions are non-linear, univariate functions of the distance to hyperplanes, sleeve functions are based on the squared distance to lower-dimensional manifolds. The present work is a first step towards the study of general sleeve functions, starting with sleeve functions based on finite-length curves. To capture these curve-based sleeve functions, we propose and study a two-step method, in which first the outer univariate function (the profile) is recovered, and second the underlying curve is represented by a polygonal chain. Introducing a concept of well-separation, we ensure that the proposed method always terminates and approximates the true sleeve function with guaranteed quality. Investigating the local geometry, we also study an inexact version of our method and show that it succeeds under certain conditions.
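To fix notation, a curve-based sleeve function of the kind described above can be written (with symbols chosen here purely for illustration) as
\[
f(x) \;=\; g\bigl(\operatorname{dist}(x,\gamma)^{2}\bigr) \;=\; g\Bigl(\min_{t\in[0,L]} \lVert x-\gamma(t)\rVert_{2}^{2}\Bigr),
\]
where $\gamma\colon[0,L]\to\mathbb{R}^{d}$ is a curve of finite length and $g$ is the univariate profile recovered in the first step of the two-step method, while the second step approximates $\gamma$ by a polygonal chain.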
We consider the Dynamical Low Rank (DLR) approximation of random parabolic equations and propose a class of fully discrete numerical schemes. Similarly to the continuous DLR approximation, our schemes are shown to satisfy a discrete variational formulation. By exploiting this property, we establish stability of our schemes: we show that our explicit and semi-implicit versions are conditionally stable under a parabolic-type CFL condition which does not depend on the smallest singular value of the DLR solution, whereas our implicit scheme is unconditionally stable. Moreover, we show that, in certain cases, the semi-implicit scheme can be unconditionally stable if the randomness in the system is sufficiently small. Furthermore, we show that these schemes can be interpreted as projector-splitting integrators and are strongly related to the scheme proposed by Lubich et al. [BIT Numer. Math., 54:171-188, 2014; SIAM J. Numer. Anal., 53:917-941, 2015], to which our stability analysis applies as well. The analysis is supported by numerical results showing the sharpness of the obtained stability conditions.
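For orientation, the sketch below shows the first-order projector-splitting (KSL) integrator of Lubich et al. referred to above, applied to a generic matrix ODE dY/dt = F(Y) with low-rank factors Y ~ U S V^T and crude explicit Euler substeps. It is only meant to illustrate what a projector-splitting integrator looks like; the fully discrete schemes proposed and analyzed here differ in their time discretization and stability properties.

import numpy as np

def ksl_step(U, S, V, F, h):
    """One step of the first-order projector-splitting (KSL) integrator of
    Lubich and Oseledets for dY/dt = F(Y), Y ~ U @ S @ V.T.
    Each substep is integrated with a single explicit Euler step here,
    purely for illustration."""
    # K-step: evolve K = U S with V frozen, then re-orthogonalize.
    K = U @ S + h * F(U @ S @ V.T) @ V
    U, S = np.linalg.qr(K)
    # S-step: evolve S backwards in time with U and V frozen.
    S = S - h * U.T @ F(U @ S @ V.T) @ V
    # L-step: evolve L = V S^T with U frozen, then re-orthogonalize.
    L = V @ S.T + h * F(U @ S @ V.T).T @ U
    V, St = np.linalg.qr(L)
    return U, St.T, V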
We study the approximation of periodic functions by arbitrary linear combinations of $n$ translates of a single function. We construct linear methods of this approximation for univariate functions in the class induced by the convolution with a single function, and prove upper bounds for the $L^p$-approximation convergence rate of these methods as $n \to \infty$, for $1 \leq p \leq \infty$. We also generalize these results to classes of multivariate functions defined by the convolution with the tensor product of a single function. In the case $p=2$, for this class, we also prove a lower bound on the quantity characterizing the best approximation by arbitrary linear combinations of $n$ translates of an arbitrary function.
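Concretely, the approximants in this setting have the form (notation illustrative)
\[
S_{n}f \;=\; \sum_{j=1}^{n} c_{j}\,\varphi(\cdot - x_{j}),
\]
where $\varphi$ is the fixed single function and the nodes $x_{j}$ and coefficients $c_{j}$ may be chosen by the linear method; the function classes consist of periodic functions representable as a convolution $f=\varphi\ast\psi$ with $\psi$ in a suitable unit ball, and in the multivariate case $\varphi$ is replaced by a tensor product of univariate functions.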
In this paper we discuss the numerical solution, on a simple 2D domain, of the Helmholtz equation with mixed boundary conditions. The so-called radiation problem depends on a constant wavenumber parameter k and is inspired here by medical applications, where a transducer emits a pulse at a given frequency. This problem has been successfully solved in the past with the classical Finite Element Method (FEM) for relatively small values of k. In modern applications, however, the values of k can be on the order of thousands, and FEM then faces several numerical difficulties. To overcome these difficulties we solve the radiation problem using Isogeometric Analysis (IgA), a generalization of FEM. Starting from the variational formulation of the radiation problem, we show in detail how to apply the isogeometric approach to compute the coefficients of the approximate solution of the radiation problem in terms of B-spline basis functions. Our implementation of IgA using the GeoPDEs software shows that the isogeometric approach is superior to FEM, since it substantially reduces the pollution error, especially for high values of k, and additionally produces smoother solutions that require fewer degrees of freedom.
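As a point of reference, a standard weak formulation of such a radiation problem (assuming, for illustration, an impedance condition on a boundary part $\Gamma_R$ and a Neumann datum $g$ on the emitting part $\Gamma_N$; the exact mixed conditions treated here may differ) reads: find $u$ such that
\[
\int_{\Omega}\nabla u\cdot\nabla\bar v\,dx \;-\; k^{2}\int_{\Omega} u\,\bar v\,dx \;-\; i k\int_{\Gamma_R} u\,\bar v\,ds \;=\; \int_{\Gamma_N} g\,\bar v\,ds
\qquad\text{for all test functions } v.
\]
Both FEM and IgA discretize a form of this kind; the isogeometric approach does so with B-spline basis functions.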
In 1998, Brassard, Hoyer, Mosca, and Tapp (BHMT) gave a quantum algorithm for approximate counting. Given a list of $N$ items, $K$ of them marked, their algorithm estimates $K$ to within relative error $\varepsilon$ by making only $O\left( \frac{1}{\varepsilon}\sqrt{\frac{N}{K}}\right) $ queries. Although this speedup is of "Grover" type, the BHMT algorithm has the curious feature of relying on the Quantum Fourier Transform (QFT), more commonly associated with Shor's algorithm. Is this necessary? This paper presents a simplified algorithm, which we prove achieves the same query complexity using Grover iterations only. We also generalize this to a QFT-free algorithm for amplitude estimation. Related approaches to approximate counting were sketched previously by Grover, Abrams and Williams, Suzuki et al., and Wie (the latter two as we were writing this paper), but in all cases without rigorous analysis.
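The snippet below is a purely classical simulation of the measurement statistics that QFT-free counting schemes of this kind exploit: after t Grover iterations the probability of observing a marked item is sin^2((2t+1)*theta) with sin^2(theta) = K/N, so theta (and hence K) can be estimated from samples taken at several values of t. It illustrates only this principle with a crude grid estimator; it is not the algorithm of the paper and says nothing about its query complexity.

import numpy as np

rng = np.random.default_rng(0)
N, K = 1_000_000, 2500                        # ground truth; K is what we estimate
theta = np.arcsin(np.sqrt(K / N))             # sin^2(theta) = K / N

# Simulate a few shots after t Grover iterations, for geometrically growing t.
ts, shots = [0, 1, 2, 4, 8, 16, 32, 64], 200
freqs = [rng.binomial(shots, np.sin((2 * t + 1) * theta) ** 2) / shots for t in ts]

# Crude estimator: the grid value of theta that best explains all observations.
grid = np.linspace(1e-4, np.pi / 4, 200_000)
loss = sum((np.sin((2 * t + 1) * grid) ** 2 - f) ** 2 for t, f in zip(ts, freqs))
theta_hat = grid[np.argmin(loss)]
print("estimated K:", N * np.sin(theta_hat) ** 2)   # roughly the true K = 2500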
We construct finite element approximations of the Levi-Civita connection and its curvature on triangulations of oriented two-dimensional manifolds. Our construction relies on the Regge finite elements, which are piecewise polynomial symmetric (0,2)-tensor fields possessing single-valued tangential-tangential components along element interfaces. When used to discretize the Riemannian metric tensor, these piecewise polynomial tensor fields do not possess enough regularity to define connections and curvature in the classical sense, but we show how to make sense of these quantities in a distributional sense. We then show that these distributional quantities converge in certain dual Sobolev norms to their smooth counterparts under refinement of the triangulation. We also discuss projections of the distributional curvature and distributional connection onto piecewise polynomial finite element spaces. We show that the relevant projection operators commute with certain linearized differential operators, yielding a commutative diagram of differential complexes.
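One common way to make distributional sense of the curvature in this setting, stated here only schematically (the precise definitions and duality pairings are those of the paper), is to pair a test function $v$ with elementwise curvature, jumps of the geodesic curvature across element interfaces, and angle defects at vertices:
\[
\langle K_{\mathrm{dist}}, v \rangle \;=\; \sum_{T} \int_{T} K\, v \, dA \;+\; \sum_{e} \int_{e} [\![ \kappa_g ]\!]\, v \, ds \;+\; \sum_{x} \Theta_x\, v(x),
\]
where $\Theta_x$ denotes the angle defect at vertex $x$; for piecewise flat metrics only the vertex term survives, recovering the classical Regge picture.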
We develop a dynamic trading strategy in the Linear Quadratic Regulator (LQR) framework. By including a price mean-reversion signal in the optimization program, in a trading environment where market impact is linear and stage costs are quadratic, we obtain an optimal trading curve that reacts opportunistically to price changes while retaining its ability to satisfy smooth or hard completion constraints. The optimal allocation is affine in the spot price and in the number of outstanding shares at any time, and it can be derived fully iteratively. It is also aggressive in the money, meaning that it accelerates whenever the price is favorable, with an intensity that can be calibrated by the practitioner. Since the LQR may yield locally negative participation rates (i.e., round-trip trades), which are often undesirable, we show that the aforementioned optimization problem can be improved and solved under positivity constraints following a Model Predictive Control (MPC) approach. In particular, the resulting solution is smoother and more consistent with the completion constraint than imposing a hard floor on the participation rate. We finally examine how the LQR can be simplified in the continuous-trading context, which allows us to derive a closed-form formula for the trading curve under further assumptions, and we document a two-step strategy for the case where trades can also occur in an additional dark pool.
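To illustrate how such an affine-in-state policy is obtained iteratively, here is a generic finite-horizon LQR backward recursion with a linear cost term. The dynamics and cost matrices below are placeholders chosen for the example, not the paper's calibrated execution model.

import numpy as np

def lqr_affine_policy(A, B, Q, R, q, T):
    """Backward Riccati recursion for a finite-horizon discrete-time LQR with
    stage cost x'Qx + 2q'x + u'Ru (placeholder model).  Returns gains K_t and
    offsets k_t so that the optimal trade is u_t = -K_t x_t - k_t, i.e.
    affine in the state (e.g. outstanding shares and spot price)."""
    P, p = Q.copy(), q.copy()
    gains, offsets = [], []
    for _ in range(T):
        S = R + B.T @ P @ B
        K = np.linalg.solve(S, B.T @ P @ A)      # feedback gain
        k = np.linalg.solve(S, B.T @ p)          # affine offset
        gains.append(K)
        offsets.append(k)
        Acl = A - B @ K
        P = Q + A.T @ P @ Acl                    # Riccati update
        p = q + Acl.T @ p                        # update of the linear term
    return gains[::-1], offsets[::-1]

# Toy usage: state = (outstanding shares, spot price), scalar trade size.
A = np.array([[1.0, 0.0], [0.0, 0.95]])         # shares persist, price mean-reverts
B = np.array([[-1.0], [0.0]])                   # trading reduces remaining shares
Q = np.diag([1.0, 0.0])                          # penalize unexecuted shares
R = np.array([[0.1]])                            # quadratic impact proxy
q = np.array([0.0, -0.5])                        # illustrative linear price term
gains, offsets = lqr_affine_policy(A, B, Q, R, q, T=10)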
Recently there has been increased interest in using machine learning techniques to improve classical algorithms. In this paper we study when it is possible to construct compact, composable sketches for weighted sampling and for estimating statistics given by functions of data frequencies. Such structures are now central components of large-scale data analytics and machine learning pipelines. However, many common functions, such as thresholds and p-th frequency moments with p > 2, are known to require polynomial-size sketches in the worst case. We explore performance beyond the worst case under two different types of assumptions. The first is having access to noisy advice on item frequencies. This continues the line of work of Hsu et al. (ICLR 2019), who assume predictions are provided by a machine learning model. The second is providing guaranteed performance on a restricted class of input frequency distributions that are better aligned with what is observed in practice. This extends the work on heavy hitters under Zipfian distributions in a seminal paper of Charikar et al. (ICALP 2002). Surprisingly, we show both analytically and empirically that, "in practice," small polylogarithmic-size sketches provide accurate estimates even for "hard" functions.
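As background for what "compact, composable sketch" means here, the following is a minimal CountSketch in the spirit of the Charikar et al. heavy-hitters structure cited above; the sketches developed for general frequency functions, frequency advice, and Zipfian guarantees are considerably more involved.

import numpy as np

class CountSketch:
    """Minimal CountSketch: a linear, composable sketch that estimates item
    frequencies.  Illustrative background only, not the paper's sketches."""
    def __init__(self, depth=5, width=256, seed=0):
        rng = np.random.default_rng(seed)
        self.table = np.zeros((depth, width))
        self.salts = rng.integers(0, 2**31, size=(depth, 2))  # hash/sign seeds
        self.width = width

    def _bucket_sign(self, item, row):
        h = hash((item, int(self.salts[row, 0]))) % self.width
        s = 1 if hash((item, int(self.salts[row, 1]))) % 2 else -1
        return h, s

    def update(self, item, count=1):
        for r in range(len(self.table)):
            h, s = self._bucket_sign(item, r)
            self.table[r, h] += s * count

    def estimate(self, item):
        vals = []
        for r in range(len(self.table)):
            h, s = self._bucket_sign(item, r)
            vals.append(s * self.table[r, h])
        return float(np.median(vals))

    def merge(self, other):
        """Composability: sketches of two streams built with the same seed
        combine by adding their tables coordinate-wise."""
        self.table += other.table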
Counterfactual explanations are usually generated through heuristics that are sensitive to the search's initial conditions. The absence of guarantees of performance and robustness hinders trustworthiness. In this paper, we take a disciplined approach towards counterfactual explanations for tree ensembles. We advocate for a model-based search aiming at "optimal" explanations and propose efficient mixed-integer programming approaches. We show that isolation forests can be modeled within our framework to focus the search on plausible explanations with a low outlier score. We provide comprehensive coverage of additional constraints that model important objectives, heterogeneous data types, and structural constraints on the feature space, along with resource and actionability restrictions. Our experimental analyses demonstrate that the proposed search approach requires a computational effort that is orders of magnitude smaller than that of previous mathematical programming algorithms. It scales up to large data sets and tree ensembles, where it provides, within seconds, systematic explanations grounded in well-defined models solved to optimality.
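For intuition about what "model-based search for an optimal explanation" means, here is a brute-force analogue for a single, tiny decision tree: enumerate the leaves that predict the target class and project the query point onto the cheapest one. The tree, features, and margin handling below are invented for the example; the paper instead formulates the search over full ensembles as a mixed-integer program.

import numpy as np

# One tiny tree encoded as leaves: (box lower bounds, upper bounds, prediction).
LEAVES = [
    (np.array([-np.inf, -np.inf]), np.array([0.5, np.inf]), 0),    # x1 <= 0.5
    (np.array([0.5, -np.inf]),     np.array([np.inf, 0.3]), 0),    # x1 > 0.5, x2 <= 0.3
    (np.array([0.5, 0.3]),         np.array([np.inf, np.inf]), 1), # x1 > 0.5, x2 > 0.3
]

def nearest_counterfactual(x0, target, leaves, margin=1e-6):
    """Exhaustive model-based search for the closest (L1) point whose leaf
    predicts `target`: project x0 onto each qualifying leaf box and keep the
    cheapest projection.  A toy stand-in for the MIP formulation."""
    best, best_cost = None, np.inf
    for lo, hi, pred in leaves:
        if pred != target:
            continue
        x = np.clip(x0, lo + margin, hi - margin)   # projection onto the leaf box
        cost = np.abs(x - x0).sum()
        if cost < best_cost:
            best, best_cost = x, cost
    return best, best_cost

print(nearest_counterfactual(np.array([0.2, 0.1]), target=1, leaves=LEAVES))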
The problem of Approximate Nearest Neighbor (ANN) search is fundamental in computer science and has benefited from significant progress in the past couple of decades. However, most work has been devoted to point sets, whereas complex shapes have not been sufficiently treated. Here, we focus on distance functions between discretized curves in Euclidean space: they appear in a wide range of applications, from road segments to time series in general dimension. For $\ell_p$-products of Euclidean metrics, for any $p$, we design simple and efficient data structures for ANN, based on randomized projections, which are of independent interest. They serve to solve proximity problems under a notion of distance between discretized curves which generalizes both the discrete Fr\'echet and Dynamic Time Warping distances, the most popular and practical approaches to comparing such curves. We offer the first data structures and query algorithms for ANN with an arbitrarily good approximation factor, at the expense of increased space usage and preprocessing time over existing methods. The query time complexity of our algorithms is comparable to, or significantly better than, that of existing methods; our algorithms are especially efficient when the length of the curves is bounded.
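The following is a toy illustration of the randomized-projection building block: mapping the vertices of discretized curves through a common Gaussian projection approximately preserves the underlying Euclidean distances (Johnson-Lindenstrauss). The dimensions and curves below are arbitrary, and the actual data structures, curve distances, and guarantees of the paper are more involved.

import numpy as np

rng = np.random.default_rng(1)

def project_curve(curve, G):
    """Project every vertex of a discretized curve (m x d array) with the same
    Gaussian map G (d x k); vertexwise distances are preserved up to a small
    multiplicative error with high probability."""
    return curve @ G / np.sqrt(G.shape[1])

d, k = 64, 16                        # ambient and projected dimensions (toy values)
G = rng.standard_normal((d, k))
P = rng.standard_normal((20, d))     # a random curve with 20 vertices
Q = P + 0.1 * rng.standard_normal((20, d))
orig = np.linalg.norm(P - Q, axis=1)
proj = np.linalg.norm(project_curve(P, G) - project_curve(Q, G), axis=1)
print(np.mean(proj / orig))          # close to 1: vertexwise distances preserved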
Robust estimation is much more challenging in high dimensions than it is in one dimension: most techniques either lead to intractable optimization problems or to estimators that can tolerate only a tiny fraction of errors. Recent work in theoretical computer science has shown that, in appropriate distributional models, it is possible to robustly estimate the mean and covariance with polynomial-time algorithms that can tolerate a constant fraction of corruptions, independent of the dimension. However, the sample and time complexity of these algorithms are prohibitively large for high-dimensional applications. In this work, we address both of these issues by establishing sample complexity bounds that are optimal, up to logarithmic factors, as well as giving various refinements that allow the algorithms to tolerate a much larger fraction of corruptions. Finally, we show on both synthetic and real data that our algorithms have state-of-the-art performance and make high-dimensional robust estimation a realistic possibility.
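For intuition, here is a bare-bones "filtering"-style robust mean estimator in the spirit of this line of work: while the empirical covariance has an unusually large eigenvalue, drop the points that deviate most along the top principal direction. The thresholds and the toy corruption model below are made up for the example; the algorithms, filters, and guarantees analyzed in the paper are considerably more refined.

import numpy as np

def filter_mean(X, eps, sigma_bound=1.5, max_iter=50):
    """Naive filtering sketch for robust mean estimation: while the top
    eigenvalue of the empirical covariance is suspiciously large, remove the
    eps-fraction of points deviating most along the top eigenvector.
    Illustrative only; not the algorithm analyzed in the paper."""
    X = X.copy()
    for _ in range(max_iter):
        mu = X.mean(axis=0)
        cov = np.cov(X, rowvar=False)
        vals, vecs = np.linalg.eigh(cov)
        if vals[-1] <= sigma_bound:              # covariance looks clean: stop
            break
        scores = np.abs((X - mu) @ vecs[:, -1])
        keep = scores <= np.quantile(scores, 1 - eps)
        X = X[keep]
    return X.mean(axis=0)

# Toy check: 5% of the points are replaced by a far-away cluster.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 50))
X[:100] = 10.0
print(np.linalg.norm(filter_mean(X, eps=0.05)))  # far smaller than the corrupted mean's norm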