Invariant descriptors of point and rigid-body motion trajectories have been proposed in the past as representative task models for motion recognition and generalization. Currently, however, no invariant descriptor exists for representing force trajectories, which arise in contact tasks. This paper introduces invariant descriptors for force trajectories by exploiting the duality between motion and force. Two types of invariant descriptors are presented, depending on whether the trajectories consist of screw or vector coordinates. Methods and software are provided for robustly calculating the invariant descriptors from noisy measurements using optimal control. Using experimental human demonstrations of 3D contour following and peg-on-hole alignment tasks, the invariant descriptors are shown to yield task representations that do not depend on the calibration of reference frames or sensor locations. The tuning process for the optimal control problems is shown to be fast and intuitive. Just as for motions in free space, the proposed invariant descriptors for motion and force trajectories may prove useful for the recognition and generalization of constrained motions, such as those arising during object manipulation in contact.
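To make the frame-invariance property above concrete, here is a minimal numerical sketch (not the paper's descriptors; all names are illustrative): the norm of a measured force vector is one elementary invariant, unchanged by any re-orientation of the sensor or reference frame, whereas the raw force components are not.

```python
# Minimal sketch: frame invariance of an elementary force feature.
import numpy as np

rng = np.random.default_rng(0)
force_traj = rng.normal(size=(100, 3))      # hypothetical 3D force samples

# Build a random orthonormal matrix via QR (column signs fixed for variety).
q, r = np.linalg.qr(rng.normal(size=(3, 3)))
R = q * np.sign(np.diag(r))
rotated = force_traj @ R.T                  # same forces in a re-oriented frame

assert not np.allclose(force_traj, rotated)             # components change
assert np.allclose(np.linalg.norm(force_traj, axis=1),
                   np.linalg.norm(rotated, axis=1))     # the invariant does not
```

The descriptors in the paper are richer than a single norm, but they target exactly this kind of independence from frame calibration.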
We present a numerical discretisation of the coupled moment systems, previously introduced by Dahm and Helzel, which approximate the kinetic multi-scale model of Helzel and Tzavaras for sedimentation in suspensions of rod-like particles, for a two-dimensional flow problem and a shear flow problem. We use a splitting ansatz that, during each time step, separately computes the update of the macroscopic flow equation and of the moment system. The proof of the hyperbolicity of the moment systems in \cite{Dahm} suggests solving the moment systems with standard numerical methods for hyperbolic problems, such as LeVeque's Wave Propagation Algorithm \cite{LeV}. The number of moment equations used in the hyperbolic moment system can be adapted to locally varying flow features. An error analysis is proposed, which compares the approximation with $2N+1$ moment equations to an approximation with $2N+3$ moment equations. This analysis suggests an error indicator that can be computed from the numerical approximation of the moment system with $2N+1$ moment equations. In order to use moment approximations with different numbers of moment equations in different parts of the computational domain, we consider an interface coupling of moment systems with different resolutions. Finally, we derive a conservative high-resolution Wave Propagation Algorithm for solving moment systems with different numbers of moment equations.
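As a structural illustration of the splitting ansatz described above, here is a minimal sketch (assumed structure, not the authors' code; `flow_step` and `moment_step` are hypothetical placeholders for the macroscopic flow update and the hyperbolic moment-system update):

```python
# Minimal sketch: first-order (Godunov/Lie) operator splitting per time step.
def split_step(flow_state, moments, dt, flow_step, moment_step):
    """Advance the coupled system by dt, updating each subsystem in turn."""
    flow_state = flow_step(flow_state, moments, dt)  # flow update, moments frozen
    moments = moment_step(moments, flow_state, dt)   # moment update with new flow
    return flow_state, moments
```

Strang splitting (a half step of one subsystem before and after a full step of the other) would be the usual choice when second-order accuracy in time is required.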
We use Markov categories to develop generalizations of the theory of Markov chains and hidden Markov models in an abstract setting. This comprises characterizations of hidden Markov models in terms of local and global conditional independences as well as existing algorithms for Bayesian filtering and smoothing applicable in all Markov categories with conditionals. We show that these algorithms specialize to existing ones such as the Kalman filter, forward-backward algorithm, and the Rauch-Tung-Striebel smoother when instantiated in appropriate Markov categories. Under slightly stronger assumptions, we also prove that the sequence of outputs of the Bayes filter is itself a Markov chain with a concrete formula for its transition maps. There are two main features of this categorical framework. The first is its generality, as it can be used in any Markov category with conditionals. In particular, it provides a systematic unified account of hidden Markov models and algorithms for filtering and smoothing in discrete probability, Gaussian probability, measure-theoretic probability, possibilistic nondeterminism and others at the same time. The second feature is the intuitive visual representation of information flow in these algorithms in terms of string diagrams.
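For concreteness, here is the discrete instantiation that the categorical filtering algorithm specializes to in the Markov category of finite stochastic matrices, i.e., the classical forward/Bayes filter (shapes and names are illustrative assumptions):

```python
# Minimal sketch: discrete Bayes filter (predict-update recursion).
# T[i, j] = P(x_{t+1} = j | x_t = i); O[i, k] = P(y_t = k | x_t = i).
import numpy as np

def bayes_filter(prior, T, O, observations):
    """Return the filtered distributions P(x_t | y_1, ..., y_t)."""
    belief = np.asarray(prior, dtype=float).copy()   # distribution of x_0
    filtered = []
    for y in observations:
        belief = belief @ T          # predict: push forward along the transition
        belief = belief * O[:, y]    # update: reweight by the likelihood of y
        belief /= belief.sum()       # normalize (the Bayesian inversion step)
        filtered.append(belief.copy())
    return filtered
```

Replacing the stochastic matrices by Gaussian conditionals turns the same recursion into the Kalman filter, which is precisely the kind of uniformity the categorical framework captures.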
We present an analysis of total variation (TV) on non-Euclidean parameterized surfaces, a natural representation of the shapes used in 3D graphics. Our work explains recent experimental findings in shape spectral TV [Fumero et al., 2020] and adaptive anisotropic spectral TV [Biton and Gilboa, 2022]. A new way to generalize set convexity from the plane to surfaces is derived by characterizing the TV eigenfunctions on surfaces. Relationships between TV, area, eigenvalues, eigenfunctions, and their discontinuities are established. Further, we expand the shape spectral TV toolkit to include versatile zero-homogeneous flows, demonstrated through smoothing and exaggerating filters. Finally, we propose the first TV-based method for shape deformation, characterized by deformations along geometrical bottlenecks, which we show to be aligned with eigenfunction discontinuities. This research advances the field of spectral TV on surfaces and its application in 3D graphics, offering new perspectives for shape filtering and deformation.
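To give a concrete feel for the flows discussed above, here is a minimal sketch of explicit TV-flow steps on a 1D signal, the simplest Euclidean special case of the surface setting (illustrative only, not the paper's solver; `eps` regularizes the non-smooth subgradient):

```python
# Minimal sketch: explicit total-variation flow, u_t = div(grad u / |grad u|).
import numpy as np

def tv_flow(u, steps=100, dt=1e-3, eps=1e-8):
    u = np.asarray(u, dtype=float).copy()
    for _ in range(steps):
        g = np.diff(u)                                # forward differences
        flux = g / (np.abs(g) + eps)                  # normalized gradient
        u += dt * np.diff(flux, prepend=0, append=0)  # discrete divergence
    return u
```

TV flow is zero-homogeneous in the sense used above: scaling the input by a positive constant leaves the normalized flux unchanged, which is why such flows act on geometric structure rather than on amplitude.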
In black-box optimization, noise in the objective function is inevitable. Noise disrupts the ranking of candidate solutions in comparison-based optimization, possibly deteriorating search performance compared with the noiseless scenario. Explicit averaging, which takes the sample average of noisy objective function values, is widely used as a simple and versatile noise-handling technique. Although it is suitable for various applications, it is ineffective if the mean is not finite. We theoretically show that explicit averaging has a negative effect on the estimation of ground-truth rankings under stably distributed noise without a finite mean. As an alternative, we propose sign averaging, a simple but robust noise-handling technique. We theoretically prove that sign averaging estimates the order of the medians of the noisy objective function values of a pair of points with arbitrarily high probability as the number of samples increases. Its advantages over explicit averaging and its robustness are also confirmed through numerical experiments.
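As we read the abstract, sign averaging compares two candidate solutions by averaging the signs of paired sample differences rather than the values themselves; here is a minimal sketch under that reading (names are illustrative):

```python
# Minimal sketch: sign averaging as a pairwise comparison rule.
import numpy as np

def sign_average_compare(fx_samples, fy_samples):
    """Return -1 if x is estimated better (smaller median), +1 if y, 0 if tied."""
    s = np.mean(np.sign(np.asarray(fx_samples) - np.asarray(fy_samples)))
    return int(np.sign(s))

# Cauchy noise has no finite mean, so sample averages are unreliable,
# but the sign comparison still concentrates as the sample size grows.
rng = np.random.default_rng(1)
fx = 0.0 + rng.standard_cauchy(1000)   # ground-truth value 0
fy = 1.0 + rng.standard_cauchy(1000)   # ground-truth value 1
print(sign_average_compare(fx, fy))    # expected: -1
```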
Biped robots usually adopt feet with a rigid structure, which simplifies walking on flat ground yet hinders ground adaptation in unstructured environments, thus jeopardizing stability. With the SoftFoot, we recently explored the idea of adapting a robotic foot to ground irregularities along the sagittal plane. Building on those results, in this paper we propose a novel robotic foot able to adapt in both the sagittal and frontal planes, similarly to the human foot. It features five parallel modules with intrinsic longitudinal adaptability that can be combined into many possible designs through optional rigid or elastic connections. Following a methodological design approach, we narrow the design space down to five candidate foot designs and implement them on a modular system. Prototypes are tested experimentally via controlled application of force, through a robotic arm, onto a sensorized plate endowed with different obstacles. Their performance is compared, using a rigid foot and the previous SoftFoot as baselines. Analysis of footprint stability shows that introducing a transverse arch, by elastically connecting the five parallel modules, is advantageous for obstacle negotiation, especially when obstacles are located under the forefoot. Beyond biped robot locomotion, this finding might also benefit the design of lower-limb prostheses.
Symbolic computation algorithms and their implementations in computer algebra systems often contain choices that do not affect the correctness of the output but can significantly impact the resources required; such choices can benefit from being made separately for each problem via a machine learning model. This study reports lessons on such use of machine learning in symbolic computation, in particular on the importance of analysing datasets prior to machine learning and on the different machine learning paradigms that may be utilised. We present results for a particular case study, the selection of variable ordering for cylindrical algebraic decomposition, but expect that the lessons learned are applicable to other decisions in symbolic computation. We utilise an existing dataset of examples derived from applications, which was found to be imbalanced with respect to the variable ordering decision. We introduce an augmentation technique for polynomial systems problems that allows us to balance and further augment the dataset, improving the machine learning results by 28\% and 38\% on average, respectively. We then demonstrate how the existing machine learning methodology used for the problem $-$ classification $-$ might be recast into the regression paradigm. While this does not radically change performance, it does widen the scope in which the methodology can be applied to make choices.
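One natural augmentation consistent with the description above (a sketch under assumptions, not necessarily the paper's exact procedure) is to permute variable names: each permutation of an existing example is an equally valid problem whose best ordering is the correspondingly permuted label, which lets one balance the ordering classes. The sketch below assumes single-character variable names in a string representation:

```python
# Minimal sketch: balancing a variable-ordering dataset by renaming variables.
from itertools import permutations

def augment(poly_system, best_order, variables):
    """poly_system: polynomials as strings; best_order: tuple of variable names."""
    examples = []
    for perm in permutations(variables):
        rename = dict(zip(variables, perm))
        system = ["".join(rename.get(ch, ch) for ch in p) for p in poly_system]
        label = tuple(rename[v] for v in best_order)
        examples.append((system, label))
    return examples

# e.g. augment(["x**2 + y", "y*z - 1"], ("x", "y", "z"), ("x", "y", "z"))
# yields 3! = 6 examples, exactly one per ordering class.
```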
Large-scale Pre-Trained Models (PTMs) are increasingly ubiquitous, sparking interest in model hubs, dedicated platforms for hosting PTMs. Despite this trend, a comprehensive exploration of the challenges users encounter and of how the community leverages PTMs remains lacking. To address this gap, we conducted an extensive mixed-methods empirical study focusing on the discussion forums and the model hub of HuggingFace, the largest public model hub. Based on our qualitative analysis, we present a taxonomy of the challenges and benefits associated with PTM reuse within this community. We then conduct a quantitative study to track model-type trends and the evolution of model documentation over time. Our findings highlight prevalent challenges such as limited guidance for beginner users, struggles with comprehending model outputs during training or inference, and a lack of model understanding. We also identified an interesting trend in which some models maintain high upload rates despite a decline in the topics related to them. Additionally, we found that despite the introduction of model documentation tools, the amount of documentation has not increased over time, leading to difficulties in model comprehension and selection among users. Our study sheds light on previously unreported challenges in reusing PTMs, and we provide recommendations for the various stakeholders involved in PTM reuse.
The dominant NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction-based question answering, or machine translation). However, it builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and to propose benchmarks and evaluation metrics to measure its effect on current deep learning architectures. We then proceed to take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we demonstrate on a selection of realistic problems that these approaches yield more robust models. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule which alleviates catastrophic forgetting issues during adaptation.
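To make the distributionally robust optimization idea concrete, here is one standard surrogate (illustrative; not necessarily the thesis' parametric reformulation): a KL-ball worst case over data distributions yields an exponentially tilted loss, so harder examples dominate the gradient.

```python
# Minimal sketch: soft worst-case (KL-DRO) objective over per-example losses.
import numpy as np

def dro_loss(losses, tau=1.0):
    """tau * log E[exp(loss / tau)]; tau -> inf recovers the average loss,
    tau -> 0 recovers the maximum loss."""
    losses = np.asarray(losses, dtype=float)
    m = losses.max()                     # stabilize the log-sum-exp
    return m + tau * np.log(np.mean(np.exp((losses - m) / tau)))

print(dro_loss([0.1, 0.2, 3.0], tau=0.5))  # dominated by the hardest example
```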
Deep neural networks have revolutionized many machine learning tasks in power systems, ranging from pattern recognition to signal processing. The data in these tasks is typically represented in Euclidean domains. Nevertheless, there is an increasing number of applications in power systems where data are collected from non-Euclidean domains and represented as graph-structured data with high-dimensional features and interdependency among nodes. The complexity of graph-structured data has brought significant challenges to existing deep neural networks defined in Euclidean domains. Recently, many studies on extending deep neural networks to graph-structured data in power systems have emerged. In this paper, a comprehensive overview of graph neural networks (GNNs) in power systems is presented. Specifically, several classical GNN architectures (e.g., graph convolutional networks, graph recurrent neural networks, graph attention networks, graph generative networks, spatial-temporal graph convolutional networks, and hybrid forms of GNNs) are summarized, and key applications in power systems, such as fault diagnosis, power prediction, power flow calculation, and data generation, are reviewed in detail. Furthermore, the main issues and some research trends concerning the application of GNNs in power systems are discussed.
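As background for the architectures surveyed above, here is the basic graph convolutional layer of Kipf and Welling, $H' = \sigma(\hat{D}^{-1/2}(A+I)\hat{D}^{-1/2} H W)$, on which graph convolutional networks build (a dense NumPy sketch; shapes and names are illustrative):

```python
# Minimal sketch: one graph convolutional layer on a dense adjacency matrix.
import numpy as np

def gcn_layer(A, H, W, act=np.tanh):
    """A: (n, n) adjacency; H: (n, d) node features; W: (d, d') weights."""
    A_hat = A + np.eye(A.shape[0])                    # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))     # inverse sqrt degrees
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return act(A_norm @ H @ W)                        # aggregate, then transform
```

In a power-system application, A would typically encode the grid topology (buses as nodes, lines as edges) and H the per-bus measurements.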
Sampling methods (e.g., node-wise, layer-wise, or subgraph sampling) have become an indispensable strategy for speeding up the training of large-scale Graph Neural Networks (GNNs). However, existing sampling methods are mostly based on graph structural information and ignore the dynamics of optimization, which leads to high variance in estimating the stochastic gradients. The high-variance issue can be very pronounced in extremely large graphs, where it results in slow convergence and poor generalization. In this paper, we theoretically analyze the variance of sampling methods and show that, due to the composite structure of the empirical risk, the variance of any sampling method can be decomposed into \textit{embedding approximation variance} in the forward stage and \textit{stochastic gradient variance} in the backward stage, and that both types of variance must be mitigated to obtain a faster convergence rate. We propose a decoupled variance reduction strategy that employs (approximate) gradient information to adaptively sample nodes with minimal variance, and that explicitly reduces the variance introduced by embedding approximation. We show theoretically and empirically that the proposed method, even with smaller mini-batch sizes, enjoys a faster convergence rate and achieves better generalization compared to existing methods.
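As a simple illustration of gradient-aware sampling (not the paper's algorithm; the gradient norms below are hypothetical stand-ins for its "(approximate) gradient information"), classical importance-sampling theory says that sampling nodes with probability proportional to their gradient norms minimizes the variance of the resulting unbiased gradient estimator:

```python
# Minimal sketch: minimal-variance importance sampling of nodes.
import numpy as np

def sample_nodes(grad_norms, k, rng=None):
    """Sample k node indices (with replacement) plus unbiasing weights."""
    if rng is None:
        rng = np.random.default_rng()
    p = grad_norms / grad_norms.sum()            # proposal proportional to norms
    idx = rng.choice(len(grad_norms), size=k, replace=True, p=p)
    weights = 1.0 / (len(grad_norms) * p[idx])   # makes the estimator unbiased
    return idx, weights
```

Averaging `weights[i] * grad[idx[i]]` over the mini-batch then gives an unbiased estimate of the full-batch mean gradient, with variance minimized by this choice of `p`.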