Twitter bots are a controversial element of the platform, and their negative impact is well known. In scientific communication, however, they have been perceived in a more positive light, and accounts that serve as feeds alerting readers to new scientific publications are quite common. Yet, despite this awareness of the presence of bots in the dissemination of science, no large-scale estimates have been made, nor has it been evaluated whether bots can truly interfere with altmetrics. Analyzing a dataset of 3,744,231 papers published between 2017 and 2021 and their 51,230,936 associated Twitter mentions, our goal was to determine the volume of publications mentioned by bots and whether they skew altmetric indicators. Using the BotometerLite API, we categorized Twitter accounts according to their likelihood of being bots. The results showed that 11,073 accounts (0.23% of all users) exhibited automated behavior, contributing 4.72% of all mentions. A significant bias was observed in bot activity: their presence was particularly pronounced in disciplines such as Mathematics, Physics, and Space Sciences, with some specialties exceeding 70% of tweets. These are extreme cases, however, and the impact of bot activity on altmetrics varies by specialty, with minimal influence in the Arts & Humanities and the Social Sciences. This research emphasizes the importance of distinguishing between specialties and disciplines when using Twitter as an altmetric source.
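The kind of threshold-based accounting involved can be illustrated with a toy sketch (the score field, the 0.5 cutoff, and the data layout are illustrative assumptions, not the paper's actual BotometerLite pipeline):

```python
# Hypothetical sketch: flag accounts whose bot score exceeds a threshold
# and measure what share of the mention volume they account for.

def bot_share_of_mentions(accounts, threshold=0.5):
    """accounts: list of (bot_score, n_mentions) tuples.
    Returns (fraction of accounts flagged as bots,
             fraction of mentions produced by those accounts)."""
    total_accounts = len(accounts)
    total_mentions = sum(n for _, n in accounts)
    bots = [(s, n) for s, n in accounts if s >= threshold]
    bot_mentions = sum(n for _, n in bots)
    return len(bots) / total_accounts, bot_mentions / total_mentions

# A heavily skewed toy population: a few bot-like accounts, many mentions.
population = [(0.9, 500), (0.8, 300), (0.1, 5), (0.2, 3), (0.05, 2)]
acc_frac, mention_frac = bot_share_of_mentions(population)
```

The toy numbers exaggerate the paper's finding for illustration: a small fraction of flagged accounts can dominate the mention counts.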
We present a rigorous and precise analysis of the maximum degree and the average degree in the dynamic duplication-divergence graph model introduced by Sol\'e, Pastor-Satorras et al., in which the graph grows by a duplication-divergence mechanism, i.e., by iteratively creating a copy of a random node and then randomly altering the neighborhood of the new node with probability $p$. This model captures the growth of some real-world networks, e.g., biological and social networks. In this paper, we prove that for some $0 < p < 1$ the maximum degree and the average degree of a duplication-divergence graph on $t$ vertices are asymptotically concentrated with high probability around $t^p$ and $\max\{t^{2 p - 1}, 1\}$, respectively, i.e., they are within at most a polylogarithmic factor of these values with probability at least $1 - t^{-A}$ for any constant $A > 0$.
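A minimal simulation of this mechanism can be sketched as follows (here each copied edge is kept independently with probability $p$; the published model variants differ in details such as extra parent links):

```python
import random

def duplication_divergence(t, p, seed=0):
    """Grow a graph to t vertices by a duplication-divergence rule:
    duplicate a uniformly random existing node, then keep each
    copied edge independently with probability p."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # start from a single edge
    for new in range(2, t):
        parent = rng.randrange(new)
        kept = {u for u in adj[parent] if rng.random() < p}
        adj[new] = kept
        for u in kept:
            adj[u].add(new)
    return adj

g = duplication_divergence(200, 0.5)
max_deg = max(len(nb) for nb in g.values())
avg_deg = sum(len(nb) for nb in g.values()) / len(g)
```

Running this for growing $t$ and comparing `max_deg` and `avg_deg` against $t^p$ and $\max\{t^{2p-1}, 1\}$ gives an empirical view of the concentration result.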
While many phenomena in physics and engineering are formally high-dimensional, their long-time dynamics often live on a lower-dimensional manifold. The present work introduces an autoencoder framework that combines implicit regularization with internal linear layers and $L_2$ regularization (weight decay) to automatically estimate the underlying dimensionality of a data set, produce an orthogonal manifold coordinate system, and provide the mapping functions between the ambient space and manifold space, allowing for out-of-sample projections. We validate our framework's ability to estimate the manifold dimension for a series of datasets from dynamical systems of varying complexity and compare it to other state-of-the-art estimators. We analyze the training dynamics of the network to gain insight into the mechanism of low-rank learning and find that the implicit regularizing layers collectively compound the low-rank representation and even self-correct during training. Analysis of the gradient descent dynamics for this architecture in the linear case reveals the role of the internal linear layers in producing faster decay of a "collective weight variable" incorporating all layers, and the role of weight decay in breaking degeneracies and thus driving convergence along directions in which no decay would occur in its absence. We show that this framework can be naturally extended to applications in state-space modeling and forecasting by generating a data-driven dynamic model of a spatiotemporally chaotic partial differential equation using only the manifold coordinates. Finally, we demonstrate that our framework is robust to hyperparameter choices.
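The linear rank-counting step behind such a dimension estimate can be illustrated on its own (a sketch on toy linear data; the paper's estimator operates on the autoencoder's learned manifold coordinates, and the tolerance is an illustrative choice):

```python
import numpy as np

def estimate_dimension(Z, tol=1e-6):
    """Estimate the intrinsic (linear) dimension as the number of
    singular values of the centered data matrix above a relative
    tolerance. Z has shape (n_samples, n_features)."""
    Zc = Z - Z.mean(axis=0)
    s = np.linalg.svd(Zc, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

# Toy data: a 3-dimensional subspace embedded in 20 ambient dimensions.
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 20))
```

For curved manifolds this linear count overestimates the dimension, which is precisely where the nonlinear autoencoder coordinates become necessary.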
Physics-informed neural networks (PINNs) have recently been applied very successfully to efficiently approximate inverse problems for PDEs. We focus on a particular class of inverse problems, the so-called data assimilation or unique continuation problems, and prove rigorous estimates on the generalization error of PINNs approximating them. An abstract framework is presented, and conditional stability estimates for the underlying inverse problem are employed to derive the estimate on the PINN generalization error, providing rigorous justification for the use of PINNs in this context. The abstract framework is illustrated with examples of four prototypical linear PDEs. Numerical experiments validating the proposed theory are also presented.
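Schematically, and with illustrative notation rather than the paper's exact statement, the argument combines a conditional stability estimate for the inverse problem with the PINN training residual:

```latex
% Conditional stability for the data assimilation problem: closeness of
% residuals (measured in Y) controls closeness of solutions (in X),
% up to a modulus of continuity \omega:
\|u - v\|_{X} \le C\,\omega\!\left(\|\mathcal{L}u - \mathcal{L}v\|_{Y}\right).

% Applied with v = u^{*} the trained PINN, whose PDE and data residuals
% are controlled by the training error \mathcal{E}_{T} plus a quadrature
% term, this yields a bound on the generalization error:
\mathcal{E}_{G} = \|u - u^{*}\|_{X}
  \le C\,\omega\!\left(\mathcal{E}_{T} + \text{(quadrature error)}\right).
```

The symbols $\mathcal{L}$, $X$, $Y$, $\omega$, $\mathcal{E}_T$, and $\mathcal{E}_G$ here are placeholders for the operator, norms, stability modulus, and error quantities defined in the abstract framework.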
We present HoVer-UNet, an approach to distilling the knowledge of the multi-branch HoVerNet framework for nuclei instance segmentation and classification in histopathology. We propose a compact, streamlined single-UNet network with a Mix Vision Transformer backbone, equipped with a custom loss function that optimally encodes the distilled knowledge of HoVerNet, reducing computational requirements without compromising performance. We show that our model achieves results comparable to HoVerNet on the public PanNuke and CoNSeP datasets with a three-fold reduction in inference time. We make the code of our model publicly available at https://github.com/DIAGNijmegen/HoVer-UNet.
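The general shape of a distillation objective of this kind can be sketched as follows (the map names `np`/`hv`, the MSE terms, and the weighting `alpha` are illustrative assumptions, not HoVer-UNet's actual custom loss):

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays."""
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))

def hover_distill_loss(student, teacher, gt_mask, alpha=0.5):
    """Weighted sum of a supervised term (student vs. ground truth) and
    a distillation term (student vs. teacher soft outputs).
    'np' = nuclei-pixel map, 'hv' = horizontal/vertical distance maps."""
    supervised = mse(student["np"], gt_mask)
    distill = mse(student["np"], teacher["np"]) + mse(student["hv"], teacher["hv"])
    return alpha * supervised + (1.0 - alpha) * distill

# Sanity check: a student that matches both the teacher and the ground
# truth incurs zero loss.
s = {"np": np.ones((4, 4)), "hv": np.zeros((2, 4, 4))}
loss0 = hover_distill_loss(s, s, np.ones((4, 4)))
```

The key design idea is that the single student network absorbs the supervision signal of all of HoVerNet's branches through the teacher's output maps.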
Interpretability methods aim to understand the algorithm implemented by a trained model (e.g., a Transformer) by examining various aspects of the model, such as the weight matrices or the attention patterns. In this work, through a combination of theoretical results and carefully controlled experiments on synthetic data, we take a critical view of methods that exclusively focus on individual parts of the model, rather than considering the network as a whole. We consider a simple synthetic setup of learning a (bounded) Dyck language. Theoretically, we show that the set of models that (exactly or approximately) solve this task satisfies a structural characterization derived from ideas in formal languages (the pumping lemma). We use this characterization to show that the set of optima is qualitatively rich; in particular, the attention pattern of a single layer can be ``nearly randomized'', while preserving the functionality of the network. We also show via extensive experiments that these constructions are not merely a theoretical artifact: even after severely constraining the architecture of the model, vastly different solutions can be reached via standard training. Thus, interpretability claims based on inspecting individual heads or weight matrices in the Transformer can be misleading.
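For concreteness, membership in a bounded Dyck language with a single bracket type can be checked as follows (the paper's setup may use several bracket types; this sketch fixes one):

```python
def in_bounded_dyck(s, depth_bound):
    """Membership test for the Dyck language of balanced parentheses
    whose nesting depth never exceeds depth_bound."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
            if depth > depth_bound:
                return False  # exceeded the allowed nesting depth
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # closing bracket with no match
        else:
            return False      # not a bracket symbol
    return depth == 0         # all brackets matched
```

The depth bound is what makes the language learnable by a fixed-size model: a finite automaton tracking `depth` suffices, which is why many qualitatively different internal solutions can realize the same function.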
This work is concerned with the analysis of a space-time finite element discontinuous Galerkin method on polytopal meshes (XT-PolydG) for the numerical discretization of wave propagation in coupled poroelastic-elastic media. The mathematical model consists of the low-frequency Biot's equations in the poroelastic medium and the elastodynamics equation for the elastic one. To realize the coupling, suitable transmission conditions on the interface between the two domains are (weakly) embedded in the formulation. The proposed PolydG discretization in space is then coupled with a dG time integration scheme, resulting in a full space-time dG discretization. We present the stability analysis for both the continuous and the semidiscrete formulations, and we derive error estimates for the semidiscrete formulation in a suitable energy norm. The method is applied to a wide set of numerical test cases to verify the theoretical bounds. Examples of physical interest are also presented to investigate the capability of the proposed method in relevant geophysical scenarios.
We propose a geometric integrator to numerically approximate the flow of Lie systems. The key is a novel procedure that relates the Lie system on a general manifold to a Lie system on a Lie group intrinsically associated with it via a Lie group action, integrates the latter, and then generates the discrete solution of the Lie system on the manifold from the solution on the Lie group. One major advantage of integrating a Lie system on a Lie group is that all associated Lie systems on manifolds can be solved at the same time, and that Lie systems on Lie groups can be described through first-order systems of linear homogeneous ordinary differential equations (ODEs) in normal form. This brings many advantages, since solving a linear system of ODEs involves less numerical cost. Specifically, we use two families of numerical schemes on the Lie group, designed to preserve its geometric structure: the first based on the Magnus expansion, the second on Runge-Kutta-Munthe-Kaas (RKMK) methods. Moreover, since the aforementioned action relates the Lie group to the manifold where the Lie system evolves, the resulting integrator preserves any geometric structure of the latter. We compare both methods for Lie systems with geometric invariants, in particular a class of Lie systems on curved spaces. We also illustrate the superiority of our method in describing long-term behavior and for differential equations admitting solutions whose geometric features depend heavily on the initial conditions. As already mentioned, our main goal is to show that the proposed method preserves all the geometric invariants very faithfully, in comparison with nongeometric numerical methods.
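The simplest instance of such a scheme, a first-order Magnus (Lie-Euler) step for $\dot{x} = A(t)x$ on the rotation group $SO(2)$, illustrates why structure preservation comes for free: each step exponentiates the frozen generator, which here is an exact rotation, so the invariant $\|x\|$ is conserved to rounding error (a toy sketch, not the paper's general construction):

```python
import math

def lie_euler_rotation(omega, x0, h, steps):
    """Lie-Euler / first-order Magnus integration of x' = A(t) x with
    A(t) = [[0, -omega(t)], [omega(t), 0]]. Each step applies
    expm(h*A(t_k)), i.e., an exact plane rotation by h*omega(t_k)."""
    x, y = x0
    t = 0.0
    for _ in range(steps):
        a = h * omega(t)
        c, s = math.cos(a), math.sin(a)
        x, y = c * x - s * y, s * x + c * y
        t += h
    return x, y

# Time-dependent frequency; the trajectory stays on the unit circle.
x, y = lie_euler_rotation(lambda t: 1.0 + t, (1.0, 0.0), 0.01, 1000)
```

A standard explicit Euler step $x_{k+1} = x_k + hA(t_k)x_k$ applied to the same problem inflates $\|x\|$ at every step, which is the qualitative difference the comparison with nongeometric methods highlights.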
This paper elucidates the challenges and opportunities inherent in integrating data-driven methodologies into geotechnics, drawing inspiration from the success of materials informatics. Highlighting the intricacies of soil complexity, heterogeneity, and the lack of comprehensive data, the discussion underscores the pressing need for community-driven database initiatives and open science movements. By leveraging the transformative power of deep learning, particularly in feature extraction from high-dimensional data and the potential of transfer learning, we envision a paradigm shift towards a more collaborative and innovative geotechnics field. The paper concludes with a forward-looking stance, emphasizing the revolutionary potential brought about by advanced computational tools like large language models in reshaping geotechnics informatics.
The classification of galaxies as spirals or ellipticals is a crucial task in understanding their formation and evolution. With the arrival of large-scale astronomical surveys, such as the Sloan Digital Sky Survey (SDSS), astronomers now have access to images of a vast number of galaxies. However, the visual inspection of these images is an impossible task for humans due to the sheer number of galaxies to be analyzed. To solve this problem, the Galaxy Zoo project was created to engage thousands of citizen scientists in classifying galaxies based on their visual features. In this paper, we present a machine learning model for galaxy classification using data from the Galaxy Zoo project. Our model uses a convolutional neural network architecture to extract features from galaxy images and classify them as spirals or ellipticals. We demonstrate the effectiveness of our model by comparing its performance with that of human classifiers on a subset of the Galaxy Zoo dataset. Our results show that our model achieves high accuracy in classifying galaxies and has the potential to significantly enhance our understanding of the formation and evolution of galaxies.
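The core feature-extraction operation of such a network can be sketched on a toy single-channel "image" (a minimal illustration of one convolution, not the paper's architecture):

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Naive valid-mode 2-D convolution (cross-correlation, as used in
    CNN layers) of a single-channel image with one kernel."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# A horizontal-gradient kernel responds to the brightness transitions
# that help separate structured spiral arms from the smooth light
# profile of an elliptical.
edge = np.array([[1.0, -1.0]])
img = np.zeros((4, 4))
img[:, 2:] = 1.0  # bright right half
fmap = conv2d_valid(img, edge)
```

In a full CNN, many such learned kernels are stacked with nonlinearities and pooling, and the resulting feature maps feed a classifier head that outputs the spiral/elliptical label.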
We hypothesize that, due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the others. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain in accuracy when the model has access to that modality in addition to another one. We refer to this gain as the conditional utilization rate. In our experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
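Under the definition above, the conditional utilization rate is simply an accuracy difference; the toy numbers below are illustrative, not the paper's results:

```python
def conditional_utilization(acc_both, acc_single):
    """Conditional utilization rate of modality m1 given m2: the
    accuracy gained by adding m1 on top of m2 alone."""
    return acc_both - acc_single

# Hypothetical accuracies for a bimodal (RGB + audio) classifier that
# has learned greedily and barely uses the audio stream.
acc_rgb_audio = 0.91   # both modalities available
acc_rgb_only = 0.90    # dropping audio barely hurts
acc_audio_only = 0.55  # audio alone is weak

u_audio_given_rgb = conditional_utilization(acc_rgb_audio, acc_rgb_only)
u_rgb_given_audio = conditional_utilization(acc_rgb_audio, acc_audio_only)
```

The imbalance between the two rates (0.01 vs. 0.36 here) is the signature of greedy learning that the proposed balancing algorithm targets via the conditional-learning-speed proxy.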