Can robots mold soft plastic materials by shaping depth images? The short answer is no: current-day robots cannot. In this article, we address the problem of shaping plastic material with an anthropomorphic arm/hand robot, which observes the material with a fixed depth camera. Robots capable of molding could assist humans in many tasks, such as cooking, scooping, or gardening. Yet the problem is complex, due to its high dimensionality at both the perception and control levels. To address it, we design three alternative data-based methods for predicting the effect of robot actions on the material. The robot can then plan the sequence of actions, and their positions, to mold the material into a desired shape. To make the prediction problem tractable, we rely on two original ideas. First, we prove that under reasonable assumptions, the shaping problem can be mapped from point cloud to depth image space, with many benefits (simpler processing, no need for registration, lower computation time and memory requirements). Second, we design a novel, simple metric for quickly measuring the distance between two depth images. The metric is based on the inherent point cloud representation of depth images, which enables direct and consistent comparison of image pairs through a non-uniform scaling approach, and therefore opens promising perspectives for designing \textit{depth image--based} robot controllers. We assess our approach in a series of unprecedented experiments, where a robotic arm/hand molds flour from initial to final shapes, either with its own dataset or by transfer learning from a human dataset. We conclude the article by discussing the limitations of our framework and those of current-day hardware, which make human-like robot molding a challenging open research problem.
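The non-uniform scaling idea can be illustrated with a toy sketch: rescale each depth image along its depth axis before a pixelwise comparison, so that two images of the same shape at different absolute depths compare as equal. This is an illustrative stand-in under assumed conventions (the function names and the mean-absolute-difference aggregation are not the authors' exact metric):

```python
def normalize(depth, eps=1e-9):
    """Non-uniform scaling: map a depth image (list of rows) onto [0, 1] along the depth axis."""
    lo = min(min(row) for row in depth)
    hi = max(max(row) for row in depth)
    span = max(hi - lo, eps)  # guard against flat images
    return [[(d - lo) / span for d in row] for row in depth]

def depth_distance(a, b):
    """Mean absolute difference between two depth images of equal size, after normalization."""
    na, nb = normalize(a), normalize(b)
    n = sum(len(row) for row in na)
    return sum(abs(x - y) for ra, rb in zip(na, nb) for x, y in zip(ra, rb)) / n
```

Because both images are rescaled first, the same surface observed closer to or farther from the camera yields a (near-)zero distance.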
This paper proposes a new cognitive model, acting as the main component of an AGI agent. The model is introduced in its mature intelligence state, as an extension of previous models, DENN and especially AKREM, by including operational models (frames/classes) and will. The model's core assumption is that cognition is about operating on accumulated knowledge, under the guidance of an appropriate will. We also assume that actions, as part of knowledge, learn to align with will during the evolution phase that precedes the mature intelligence state. In addition, the model is mainly based on the duality principle found in every known intelligent aspect, such as exhibiting both top-down and bottom-up model learning, generalization versus specialization, and more. Furthermore, a holistic approach is advocated for AGI design, and cognition under constraints, i.e., efficiency, is proposed in the form of reusability and simplicity. Finally, reaching this mature state is described via a cognitive evolution from infancy to adulthood, utilizing a consolidation principle. The final product of this cognitive model is a dynamic operational memory of models and instances. Lastly, some examples and preliminary ideas for the evolution phase needed to reach the mature state are presented.
Selecting the number of regimes in hidden Markov models is an important problem, and many criteria are used to select this number, such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the integrated completed likelihood (ICL), the deviance information criterion (DIC), and the Watanabe-Akaike information criterion (WAIC), to name a few. In this article, we introduce goodness-of-fit tests for general hidden Markov models with covariates, where the distribution of the observations is arbitrary, i.e., continuous, discrete, or a mixture of both. A selection procedure is then proposed based on these goodness-of-fit tests. The main aim of this article is to compare the classical information criteria with the new criterion when the outcome is continuous, discrete, or zero-inflated. Numerical experiments assess the finite-sample performance of the goodness-of-fit tests, and comparisons between the different criteria are made.
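For the classical criteria, the comparison baseline is straightforward to compute once each candidate model has been fitted; a minimal sketch of AIC/BIC-based selection of the number of regimes (the `fits` dictionary and its values are hypothetical, standing in for maximized log-likelihoods and parameter counts):

```python
import math

def aic(log_lik, n_params):
    """Akaike information criterion (lower is better)."""
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_lik

def select_n_regimes(fits, n_obs):
    """fits maps n_regimes -> (max log-likelihood, n_params); pick the count minimizing BIC."""
    return min(fits, key=lambda m: bic(fits[m][0], fits[m][1], n_obs))
```

The goodness-of-fit-based criterion proposed in the article replaces this penalized-likelihood argmin with a formal test of model adequacy.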
Pre-trained models of source code have gained widespread popularity in many code intelligence tasks. Recently, with the scaling of model and corpus size, large language models have shown the ability of in-context learning (ICL). ICL employs task instructions and a few examples as demonstrations, which are then input to the language model for making predictions. This new learning paradigm is training-free and has shown impressive performance on various natural language processing and code intelligence tasks. However, the performance of ICL heavily relies on the quality of the demonstrations, e.g., the selected examples, so it is important to systematically investigate how to construct good demonstrations for code-related tasks. In this paper, we empirically explore the impact of three key factors on the performance of ICL in code intelligence tasks: the selection, order, and number of demonstration examples. We conduct extensive experiments on three code intelligence tasks: code summarization, bug fixing, and program synthesis. Our experimental results demonstrate that all three factors dramatically impact the performance of ICL in code intelligence tasks. Additionally, we summarize our findings and provide takeaway suggestions on how to construct effective demonstrations from these three perspectives. We also show that a carefully designed demonstration based on our findings can lead to substantial improvements over widely used demonstration construction methods, e.g., improving BLEU-4, EM, and EM by at least 9.90%, 175.96%, and 50.81% on code summarization, bug fixing, and program synthesis, respectively.
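The three studied factors map directly onto how a prompt is assembled: which demonstrations to pick (selection), how to arrange them (order), and how many to include (number). A minimal sketch, where token-overlap ranking is one simple similarity proxy and not the specific strategies evaluated in the paper:

```python
def select_demos(pool, query, k):
    """Rank candidate (input, output) pairs by token overlap with the query (a crude similarity proxy)."""
    q = set(query.split())
    return sorted(pool, key=lambda d: -len(q & set(d[0].split())))[:k]

def build_icl_prompt(instruction, demos, query):
    """Assemble an ICL prompt: task instruction, then demonstrations, then the unanswered query."""
    parts = [instruction]
    for inp, out in demos:  # order of iteration is one of the three studied factors
        parts.append(f"Input: {inp}\nOutput: {out}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)
```

The model's completion after the final "Output:" is then taken as its prediction.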
Multiphysics processes in fractured porous media constitute a research field of importance for several subsurface applications, and one that has received considerable attention over the last decade. The dynamics are characterised by strong couplings between processes, as well as interaction between the processes and the structure of the fractured medium itself. The rich range of behavior calls for explorative mathematical modelling, such as experimentation with constitutive laws and novel coupling concepts between physical processes. Moreover, efficient simulation of the strong couplings between multiphysics processes and geological structures requires the development of tailored numerical methods. We present a modelling framework and its implementation in the open-source simulation toolbox PorePy, which is designed for rapid prototyping of multiphysics processes in fractured porous media. PorePy uses a mixed-dimensional representation of the fracture geometry and generally applies fully implicit couplings between processes. The code design follows the paradigms of modularity and differentiable programming, which together allow for extreme flexibility in experimentation with governing equations with minimal changes to the code base. Code integrity is supported by a multilevel testing framework ensuring the reliability of the code. We present our modelling framework in the context of thermo-poroelasticity in deformable fractured porous media, illustrating the close relation between the governing equations and the source code. We furthermore discuss the design of the testing framework and present simulations showcasing the extendibility of PorePy, as well as the type of results that mixed-dimensional simulation tools can produce.
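The differentiable-programming paradigm mentioned here can be illustrated, independently of PorePy's actual implementation, by a minimal forward-mode automatic-differentiation sketch in which the residual of a toy constitutive law carries its derivative along, so that Jacobians needed by a fully implicit solver come for free:

```python
class Dual:
    """Minimal forward-mode AD value: tracks a value and its derivative together."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)
    __rmul__ = __mul__

def residual(p):
    # toy constitutive law, hypothetical: q(p) = 3*p^2 + 2*p
    return 3 * p * p + 2 * p

x = Dual(2.0, 1.0)  # seed d/dp = 1 at p = 2
r = residual(x)     # r carries both q(2) and q'(2) = 6*2 + 2
```

Swapping `residual` for a different constitutive law changes the Jacobian automatically, which is the flexibility the paradigm buys.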
Human-robot co-manipulation of soft materials, such as fabrics, composites, and sheets of paper/cardboard, is a challenging operation with several relevant industrial applications. Estimating the deformation state of the co-manipulated material is one of the main challenges; viable methods provide only an indirect measure, by calculating the human-robot relative distance. In this paper, we develop a data-driven model to estimate the deformation state of the material from a depth image through a Convolutional Neural Network (CNN). First, we define the deformation state of the material as the relative roto-translation between the current robot pose and the human grasping position. The model estimates the current deformation state through a CNN, specifically a DenseNet-121 pretrained on ImageNet. The delta between the current and the desired deformation state is fed to the robot controller, which outputs twist commands. The paper describes the developed approach to acquire and preprocess the dataset and to train the model. The model is compared with the current state-of-the-art method, based on a camera skeletal tracker. Results show that our approach achieves better performance and avoids the various drawbacks of using a skeletal tracker. Finally, we also study model performance across different architectures and dataset sizes, to minimize the time required for dataset acquisition.
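The delta between deformation states can be sketched as the relative rigid transform expressed in the current robot frame; a minimal version with rotation matrices and translation vectors as plain lists (the pose representation and function names are assumptions, not the paper's implementation):

```python
def transpose(R):
    return [[R[j][i] for j in range(3)] for i in range(3)]

def mat_vec(R, v):
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def relative_pose(R_cur, t_cur, R_des, t_des):
    """Delta = desired pose expressed in the current frame:
    (R_cur^T R_des, R_cur^T (t_des - t_cur))."""
    Rt = transpose(R_cur)
    dR = mat_mul(Rt, R_des)
    dt = mat_vec(Rt, [t_des[i] - t_cur[i] for i in range(3)])
    return dR, dt
```

A controller would then map this delta (e.g., via a log map and a gain) to the twist command.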
We focus on the control of unknown Partial Differential Equations (PDEs). The system dynamics are unknown, but we assume we are able to observe the system's evolution for a given control input, as is typical in a Reinforcement Learning framework. We propose an algorithm based on the idea of controlling and identifying the unknown system configuration on the fly. In this work, the control is based on the State-Dependent Riccati approach, whereas the model identification relies on Bayesian linear regression. At each iteration, based on the observed data, we obtain an estimate of the a priori unknown parameter configuration of the PDE and then compute the control for the corresponding model. We show numerical evidence of the convergence of the method for infinite horizon control problems.
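The Bayesian linear regression step admits a closed-form conjugate update; a scalar sketch for a single unknown parameter w in y = w·x + noise, with a Gaussian prior (prior and noise values below are illustrative, not the article's setup):

```python
def blr_update(xs, ys, m0=0.0, v0=10.0, noise_var=1.0):
    """Conjugate Bayesian update for scalar w in y = w*x + eps, eps ~ N(0, noise_var).
    Prior w ~ N(m0, v0); returns the posterior mean and variance."""
    prec = 1.0 / v0 + sum(x * x for x in xs) / noise_var
    mean = (m0 / v0 + sum(x * y for x, y in zip(xs, ys)) / noise_var) / prec
    return mean, 1.0 / prec
```

Each iteration of the algorithm would feed the newly observed data into such an update and hand the posterior estimate to the Riccati-based controller.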
Large language models (LLMs) are routinely pre-trained on billions of tokens, only for the process to be restarted once new data becomes available. A much cheaper and more efficient solution would be to enable the continual pre-training of these models, i.e., updating pre-trained models with new data instead of re-training them from scratch. However, the distribution shift induced by novel data typically results in degraded performance on past data. Taking a step towards efficient continual pre-training, in this work we examine the effect of different warmup strategies. Our hypothesis is that the learning rate must be re-increased to improve compute efficiency when training on a new dataset. We study the warmup phase of models pre-trained on the Pile (upstream data, 300B tokens) as we continue to pre-train on SlimPajama (downstream data, 297B tokens), following a linear warmup and cosine decay schedule. We conduct all experiments on the Pythia 410M language model architecture and evaluate performance through validation perplexity. We experiment with different pre-training checkpoints, various maximum learning rates, and various warmup lengths. Our results show that while rewarming models first increases the loss on upstream and downstream data, in the longer run it improves downstream performance, outperforming models trained from scratch$\unicode{x2013}$even for a large downstream dataset.
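The linear warmup and cosine decay schedule has a standard closed form; a minimal sketch (the hyperparameter values used below are illustrative, not the paper's):

```python
import math

def lr_schedule(step, max_lr, min_lr, warmup_steps, total_steps):
    """Linear warmup from ~0 to max_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(total_steps - warmup_steps, 1)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * min(progress, 1.0)))
```

Rewarming a checkpoint amounts to restarting this schedule on the downstream data, with the maximum learning rate and warmup length as the variables under study.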
Underground gas storage (UGS) is a well-established technology worldwide that is becoming even more important for coping with seasonal peaks of gas consumption, due to the growing uncertainties of the energy market. Safety issues concerning the reactivation of pre-existing faults may arise if the target reservoir is located in a faulted basin, where human activities can trigger (micro-)seismic events. In the Netherlands, it has been observed that fault activation can occur somewhat "unexpectedly" after primary production (PP), i.e., during cushion gas injection (CGI) and UGS cycles, when the stress regime should be on the unloading/reloading path. To understand the physical mechanisms responsible for such occurrences, a 3D mathematical model coupling frictional contact mechanics in faulted porous rocks with fluid flow is developed, implemented, and tested. The final aim of this two-part work is to define a safe operational bandwidth for the pore pressure range of UGS activities in the faulted reservoirs of the Rotliegend formation. Part I of this work concerns the development of the mathematical and numerical model of frictional contact mechanics and flow in faulted porous rocks. A mixed discretization of the governing PDEs under frictional contact constraints along the faults is used, and a slip-weakening constitutive law governing the macroscopic fault behavior is presented. The model is tested in the setting of an ideal reservoir located in the Rotliegend formation. The analyses point out how fault reactivation during PP can lead to a stress redistribution, giving rise to a new equilibrium configuration. When the fault is reloaded in the opposite direction during the CGI and/or UGS stages, further activation events can occur even if the stress range exceeds neither the undisturbed initial value nor the maximum strength ever experienced by the formation.
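A linear slip-weakening law, one common form of the constitutive law described here, reduces the friction coefficient from its static to its dynamic value over a critical slip distance; a sketch (parameter values are illustrative, not those calibrated for the Rotliegend setting):

```python
def slip_weakening_mu(slip, mu_static=0.6, mu_dynamic=0.4, d_c=0.01):
    """Friction coefficient decaying linearly from mu_static to mu_dynamic
    over the critical slip distance d_c (slip and d_c in the same units)."""
    if slip >= d_c:
        return mu_dynamic
    return mu_static - (mu_static - mu_dynamic) * slip / d_c
```

Once a fault patch has slipped past d_c, its strength stays at the dynamic level, which is one mechanism by which reloading in the opposite direction can reactivate the fault below the previously sustained stress.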
We introduce CartiMorph, a framework for automated knee articular cartilage morphometrics. It takes an image as input and generates quantitative metrics for cartilage subregions, including the percentage of full-thickness cartilage loss (FCL), mean thickness, surface area, and volume. CartiMorph leverages the power of deep learning models for hierarchical image feature representation. Deep learning models were trained and validated for tissue segmentation, template construction, and template-to-image registration. We established methods for surface-normal-based cartilage thickness mapping, FCL estimation, and rule-based cartilage parcellation. Our cartilage thickness map shows less error in thin and peripheral regions. We evaluated the effectiveness of the adopted segmentation model by comparing the quantitative metrics obtained from model segmentation with those from manual segmentation. The root-mean-squared deviation of the FCL measurements was less than 8%, and strong correlations were observed for the mean thickness (Pearson's correlation coefficient $\rho \in [0.82,0.97]$), surface area ($\rho \in [0.82,0.98]$), and volume ($\rho \in [0.89,0.98]$) measurements. We compared our FCL measurements with those from a previous study and found that our measurements deviated less from the ground truth. We observed superior performance of the proposed rule-based cartilage parcellation method compared with the atlas-based approach. CartiMorph has the potential to promote imaging biomarker discovery for knee osteoarthritis.
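FCL, as defined here, is the percentage of a subregion where cartilage thickness has dropped to (near) zero; a toy sketch of that definition on a thickness map, not CartiMorph's surface-normal-based estimator:

```python
def full_thickness_loss_pct(thickness_map, eps=1e-6):
    """Percentage of map entries whose cartilage thickness is (near) zero,
    i.e., the illustrative definition of full-thickness cartilage loss (FCL)."""
    flat = [t for row in thickness_map for t in row]
    return 100.0 * sum(t <= eps for t in flat) / len(flat)
```

In the actual pipeline the thickness values come from the surface-normal-based mapping, and the subregions from the rule-based parcellation.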
Deep Learning (DL) is the most widely used tool in the contemporary field of computer vision. Its ability to accurately solve complex problems is employed in vision research to learn deep neural models for a variety of tasks, including security-critical applications. However, it is now known that DL is vulnerable to adversarial attacks that can manipulate its predictions by introducing visually imperceptible perturbations in images and videos. Since the discovery of this phenomenon in 2013~[1], it has attracted significant attention from researchers in multiple sub-fields of machine intelligence. In [2], we reviewed the contributions made by the computer vision community to adversarial attacks on deep learning (and their defenses) up to the year 2018. Many of those contributions have inspired new directions in this area, which has matured significantly since the first-generation methods. Hence, as a sequel to [2], this literature review focuses on the advances in this area since 2018. To ensure authenticity, we mainly consider peer-reviewed contributions published in the prestigious venues of computer vision and machine learning research. Besides a comprehensive literature review, the article also provides concise definitions of technical terminology for non-experts in the domain. Finally, the article discusses the challenges and future outlook of this direction, based on the literature reviewed herein and in [2].
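The imperceptible perturbations referred to here trace back to first-generation gradient-based attacks such as the Fast Gradient Sign Method (FGSM); a minimal sketch of the FGSM step on a flattened image, with the loss gradient assumed given (in practice it comes from backpropagation through the attacked model):

```python
def fgsm_perturb(x, grad, eps=0.03):
    """One FGSM step: move each pixel by eps in the sign of the loss gradient,
    producing a bounded (||delta||_inf <= eps) adversarial perturbation."""
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]
```

Many of the post-2018 attacks surveyed in the review iterate or constrain variants of exactly this step.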