In this work, following the discrete de Rham (DDR) approach, we develop a discrete counterpart of a two-dimensional de Rham complex with enhanced regularity. The proposed construction supports general polygonal meshes and arbitrary approximation orders. We establish exactness on a contractible domain for both the versions of the complex with and without boundary conditions and, for the former, prove a complete set of Poincar\'e-type inequalities. The discrete complex is then used to derive a novel discretisation method for a quad-rot problem which, unlike other schemes in the literature, does not require the forcing term to be prepared. We carry out complete stability and convergence analyses for the proposed scheme and provide numerical validation of the results.
We propose a new paradigm for designing efficient p-adaptive arbitrary high order methods. We consider arbitrary high order iterative schemes that gain one order of accuracy at each iteration and we modify them in order to match the accuracy achieved in a specific iteration with the discretization accuracy of the same iteration. Apart from the computational advantage, the new modified methods allow to naturally perform p-adaptivity, stopping the iterations when appropriate conditions are met. Moreover, the modification is very easy to be included in an existing implementation of an arbitrary high order iterative scheme and it does not ruin the possibility of parallelization, if this was achievable by the original method. An application to the ADER method for hyperbolic Partial Differential Equations (PDEs) is presented here. We explain how such framework can be interpreted as an arbitrary high order iterative scheme, by recasting it as a Deferred Correction (DeC) method, and how to easily modify it to obtain a more efficient formulation, in which a local a posteriori limiter can be naturally integrated leading to p-adaptivity and structure preserving properties. Finally, the novel approach is extensively tested against classical benchmarks for compressible gas dynamics to show the robustness and the computational efficiency.
A path query extracts vertex tuples from a labeled graph, based on the words that are formed by the paths connecting the vertices. We study the computational complexity of measuring the contribution of edges and vertices to an answer to a path query, focusing on the class of conjunctive regular path queries. To measure this contribution, we adopt the traditional Shapley value from cooperative game theory. This value has been recently proposed and studied in the context of relational database queries and has uses in a plethora of other domains. We first study the contribution of edges and show that the exact Shapley value is almost always hard to compute. Specifically, it is #P-hard to calculate the contribution of an edge whenever at least one (non-redundant) conjunct allows for a word of length three or more. In the case of regular path queries (i.e., no conjunction), the problem is tractable if the query has only words of length at most two; hence, this property fully characterizes the tractability of the problem. On the other hand, if we allow for an approximation error, then it is straightforward to obtain an efficient scheme (FPRAS) for an additive approximation. Yet, a multiplicative approximation is harder to obtain. We establish that in the case of conjunctive regular path queries, a multiplicative approximation of the Shapley value of an edge can be computed in polynomial time if and only if all query atoms are finite languages (assuming non-redundancy and conventional complexity limitations). We also study the analogous situation where we wish to determine the contribution of a vertex, rather than an edge, and establish complexity results of similar nature.
Deep Neural Networks (DNN) are increasingly used as components of larger software systems that need to process complex data, such as images, written texts, audio/video signals. DNN predictions cannot be assumed to be always correct for several reasons, among which the huge input space that is dealt with, the ambiguity of some inputs data, as well as the intrinsic properties of learning algorithms, which can provide only statistical warranties. Hence, developers have to cope with some residual error probability. An architectural pattern commonly adopted to manage failure-prone components is the supervisor, an additional component that can estimate the reliability of the predictions made by untrusted (e.g., DNN) components and can activate an automated healing procedure when these are likely to fail, ensuring that the Deep Learning based System (DLS) does not cause damages, despite its main functionality being suspended. In this paper, we consider DLS that implement a supervisor by means of uncertainty estimation. After overviewing the main approaches to uncertainty estimation and discussing their pros and cons, we motivate the need for a specific empirical assessment method that can deal with the experimental setting in which supervisors are used, where the accuracy of the DNN matters only as long as the supervisor lets the DLS continue to operate. Then we present a large empirical study conducted to compare the alternative approaches to uncertainty estimation. We distilled a set of guidelines for developers that are useful to incorporate a supervisor based on uncertainty monitoring into a DLS.
In this paper, we consider a fully-discrete approximation of an abstract evolution equation deploying a non-conforming spatial approximation and finite differences in time (Rothe-Galerkin method). The main result is the convergence of the discrete solutions to a weak solution of the continuous problem. Therefore, the result can be interpreted either as a justification of the numerical method or as an alternative way of constructing weak solutions. We formulate the problem in the very general and abstract setting of so-called non-conforming Bochner pseudo-monotone operators, which allows for a unified treatment of several evolution problems. Our abstract results for non-conforming Bochner pseudo-monotone operators allow to establish (weak) convergence just by verifying a few natural assumptions on the operators time-by-time and on the discretization spaces. Hence, applications and extensions to several other evolution problems can be performed easily. We exemplify the applicability of our approach on several DG schemes for the unsteady $p$-Navier-Stokes problem. The results of some numerical experiments are reported in the final section.
The increasing interest in autonomous robots with a high number of degrees of freedom for industrial applications and service robotics demands control algorithms to handle multiple tasks as well as hard constraints efficiently. This paper presents a general framework in which both kinematic (velocity- or acceleration-based) and dynamic (torque-based) control of redundant robots are handled in a unified fashion. The framework allows for the specification of redundancy resolution problems featuring a hierarchy of arbitrary (equality and inequality) constraints, arbitrary weighting of the control effort in the cost function and an additional input used to optimize possibly remaining redundancy. To solve such problems, a generalization of the Saturation in the Null Space (SNS) algorithm is introduced, which extends the original method according to the features required by our general control framework. Variants of the developed algorithm are presented, which ensure both efficient computation and optimality of the solution. Experiments on a KUKA LBRiiwa robotic arm, as well as simulations with a highly redundant mobile manipulator are reported.
For solving a broad class of nonconvex programming problems on an unbounded constraint set, we provide a self-adaptive step-size strategy that does not include line-search techniques and establishes the convergence of a generic approach under mild assumptions. Specifically, the objective function may not satisfy the convexity condition. Unlike descent line-search algorithms, it does not need a known Lipschitz constant to figure out how big the first step should be. The crucial feature of this process is the steady reduction of the step size until a certain condition is fulfilled. In particular, it can provide a new gradient projection approach to optimization problems with an unbounded constrained set. The correctness of the proposed method is verified by preliminary results from some computational examples. To demonstrate the effectiveness of the proposed technique for large-scale problems, we apply it to some experiments on machine learning, such as supervised feature selection, multi-variable logistic regressions and neural networks for classification.
We consider the reliable implementation of high-order unfitted finite element methods on Cartesian meshes with hanging nodes for elliptic interface problems. We construct a reliable algorithm to merge small interface elements with their surrounding elements to automatically generate the finite element mesh whose elements are large with respect to both domains. We propose new basis functions for the interface elements to control the growth of the condition number of the stiffness matrix in terms of the finite element approximation order, the number of elements of the mesh, and the interface deviation which quantifies the mesh resolution of the geometry of the interface. Numerical examples are presented to illustrate the competitive performance of the method.
A critical problem in post hoc explainability is the lack of a common foundational goal among methods. For example, some methods are motivated by function approximation, some by game theoretic notions, and some by obtaining clean visualizations. This fragmentation of goals causes not only an inconsistent conceptual understanding of explanations but also the practical challenge of not knowing which method to use when. In this work, we begin to address these challenges by unifying eight popular post hoc explanation methods (LIME, C-LIME, SHAP, Occlusion, Vanilla Gradients, Gradients x Input, SmoothGrad, and Integrated Gradients). We show that these methods all perform local function approximation of the black-box model, differing only in the neighbourhood and loss function used to perform the approximation. This unification enables us to (1) state a no free lunch theorem for explanation methods which demonstrates that no single method can perform optimally across all neighbourhoods, and (2) provide a guiding principle to choose among methods based on faithfulness to the black-box model. We empirically validate these theoretical results using various real-world datasets, model classes, and prediction tasks. By bringing diverse explanation methods into a common framework, this work (1) advances the conceptual understanding of these methods, revealing their shared local function approximation objective, properties, and relation to one another, and (2) guides the use of these methods in practice, providing a principled approach to choose among methods and paving the way for the creation of new ones.
The conjoining of dynamical systems and deep learning has become a topic of great interest. In particular, neural differential equations (NDEs) demonstrate that neural networks and differential equation are two sides of the same coin. Traditional parameterised differential equations are a special case. Many popular neural network architectures, such as residual networks and recurrent networks, are discretisations. NDEs are suitable for tackling generative problems, dynamical systems, and time series (particularly in physics, finance, ...) and are thus of interest to both modern machine learning and traditional mathematical modelling. NDEs offer high-capacity function approximation, strong priors on model space, the ability to handle irregular data, memory efficiency, and a wealth of available theory on both sides. This doctoral thesis provides an in-depth survey of the field. Topics include: neural ordinary differential equations (e.g. for hybrid neural/mechanistic modelling of physical systems); neural controlled differential equations (e.g. for learning functions of irregular time series); and neural stochastic differential equations (e.g. to produce generative models capable of representing complex stochastic dynamics, or sampling from complex high-dimensional distributions). Further topics include: numerical methods for NDEs (e.g. reversible differential equations solvers, backpropagation through differential equations, Brownian reconstruction); symbolic regression for dynamical systems (e.g. via regularised evolution); and deep implicit models (e.g. deep equilibrium models, differentiable optimisation). We anticipate this thesis will be of interest to anyone interested in the marriage of deep learning with dynamical systems, and hope it will provide a useful reference for the current state of the art.
Since deep neural networks were developed, they have made huge contributions to everyday lives. Machine learning provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this achievement, the design and training of neural networks are still challenging and unpredictable procedures. To lower the technical thresholds for common users, automated hyper-parameter optimization (HPO) has become a popular topic in both academic and industrial areas. This paper provides a review of the most essential topics on HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods to define the value range. Then, the research focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy especially for deep learning networks. This study next reviews major services and toolkits for HPO, comparing their support for state-of-the-art searching algorithms, feasibility with major deep learning frameworks, and extensibility for new modules designed by users. The paper concludes with problems that exist when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.