Bibliometric studies based on the Web of Science (WOS) database have become an increasingly popular method for analysing the structure of scientific research. So do network approaches, which, based on empirical data, make it possible to characterize the emergence of topological structures over time and across multiple research areas. Our paper is a contribution to interweaving these two lines of research that have progressed in separate ways but whose common applications have been increasingly more frequent. Among other attributes, Author Keywords and Keywords Plus are used as units of analysis that enable us to identify changes in the topics of interest and related bibliography. By considering the co-occurrence of those keywords with the Author Keyword \texttt{Complexity}, we provide an overview of the evolution of studies on Complexity Sciences, and compare this evolution in seven scientific fields. The results show a considerable increase in the number of papers dealing with complexity, as well as a general tendency across different disciplines for this literature to move from a more foundational, general and conceptual to a more applied and specific set of co-occurring keywords. Moreover, we provide evidence of changing topologies of networks of co-occurring keywords, which are described through the computation of some topological coefficients. In so doing, we emphasize the distinguishing structures that characterize the networks of the seven research areas.
Existing research has shown the potential of classifying Alzheimers Disease (AD) from eye-tracking (ET) data with classifiers that rely on task-specific engineered features. In this paper, we investigate whether we can improve on existing results by using a Deep-Learning classifier trained end-to-end on raw ET data. This classifier (VTNet) uses a GRU and a CNN in parallel to leverage both visual (V) and temporal (T) representations of ET data and was previously used to detect user confusion while processing visual displays. A main challenge in applying VTNet to our target AD classification task is that the available ET data sequences are much longer than those used in the previous confusion detection task, pushing the limits of what is manageable by LSTM-based models. We discuss how we address this challenge and show that VTNet outperforms the state-of-the-art approaches in AD classification, providing encouraging evidence on the generality of this model to make predictions from ET data.
A rigidity circuit (in 2D) is a minimal dependent set in the rigidity matroid, i.e. a minimal graph supporting a non-trivial stress in any generic placement of its vertices in $\mathbb R^2$. Any rigidity circuit on $n\geq 5$ vertices can be obtained from rigidity circuits on a fewer number of vertices by applying the combinatorial resultant (CR) operation. The inverse operation is called a combinatorial resultant decomposition (CR-decomp). Any rigidity circuit on $n\geq 5$ vertices can be successively decomposed into smaller circuits, until the complete graphs $K_4$ are reached. This sequence of CR-decomps has the structure of a rooted binary tree called the combinatorial resultant tree (CR-tree). A CR-tree encodes an elimination strategy for computing circuit polynomials via Sylvester resultants. Different CR-trees lead to elimination strategies that can vary greatly in time and memory consumption. It is an open problem to establish criteria for optimal CR-trees, or at least to characterize those CR-trees that lead to good elimination strategies. In [12] we presented an algorithm for enumerating CR-trees where we give the algorithms for decomposing 3-connected rigidity circuits in polynomial time. In this paper we focus on those circuits that are not 3-connected, which we simply call 2-connected. In order to enumerate CR-decomps of 2-connected circuits $G$, a brute force exp-time search has to be performed among the subgraphs induced by the subsets of $V(G)$. This exp-time bottleneck is not present in the 3-connected case. In this paper we will argue that we do not have to account for all possible CR-decomps of 2-connected rigidity circuits to find a good elimination strategy; we only have to account for those CR-decomps that are a 2-split, all of which can be enumerated in polynomial time. We present algorithms and computational evidence in support of this heuristic.
In this era of large language models (LLMs), the traditional training of models has become increasingly unimaginable for regular users and institutions. The exploration of efficient fine-tuning for high-resource languages on these models is an undeniable trend that is gradually gaining popularity. However, there has been very little exploration for various low-resource languages, such as Tibetan. Research in Tibetan NLP is inherently scarce and limited. While there is currently no existing large language model for Tibetan due to its low-resource nature, that day will undoubtedly arrive. Therefore, research on efficient fine-tuning for low-resource language models like Tibetan is highly necessary. Our research can serve as a reference to fill this crucial gap. Efficient fine-tuning strategies for pre-trained language models (PLMs) in Tibetan have seen minimal exploration. We conducted three types of efficient fine-tuning experiments on the publicly available TNCC-title dataset: "prompt-tuning," "Adapter lightweight fine-tuning," and "prompt-tuning + Adapter fine-tuning." The experimental results demonstrate significant improvements using these methods, providing valuable insights for advancing Tibetan language applications in the context of pre-trained models.
Point source localisation is generally modelled as a Lasso-type problem on measures. However, optimisation methods in non-Hilbert spaces, such as the space of Radon measures, are much less developed than in Hilbert spaces. Most numerical algorithms for point source localisation are based on the Frank-Wolfe conditional gradient method, for which ad hoc convergence theory is developed. We develop extensions of proximal-type methods to spaces of measures. This includes forward-backward splitting, its inertial version, and primal-dual proximal splitting. Their convergence proofs follow standard patterns. We demonstrate their numerical efficacy.
Geographical and Temporal Weighted Regression (GTWR) model is an important local technique for exploring spatial heterogeneity in data relationships, as well as temporal dependence due to its high fitting capacity when it comes to real data. In this article, we consider a GTWR model driven by a spatio-temporal noise, colored in space and fractional in time. Concerning the covariates, we consider that they are correlated, taking into account two interaction types between covariates, weak and strong interaction. Under these assumptions, Weighted Least Squares Estimator (WLS) is obtained, as well as its rate of convergence. In order to evidence the good performance of the estimator studied, it is provided a simulation study of four different scenarios, where it is observed that the residuals oscillate with small variation around zero. The STARMA package of the R software allows obtaining a variant of the $R^{2}$ coefficient, with values very close to 1, which means that most of the variability is explained by the model.
The Mixup method has proven to be a powerful data augmentation technique in Computer Vision, with many successors that perform image mixing in a guided manner. One of the interesting research directions is transferring the underlying Mixup idea to other domains, e.g. Natural Language Processing (NLP). Even though there already exist several methods that apply Mixup to textual data, there is still room for new, improved approaches. In this work, we introduce AttentionMix, a novel mixing method that relies on attention-based information. While the paper focuses on the BERT attention mechanism, the proposed approach can be applied to generally any attention-based model. AttentionMix is evaluated on 3 standard sentiment classification datasets and in all three cases outperforms two benchmark approaches that utilize Mixup mechanism, as well as the vanilla BERT method. The results confirm that the attention-based information can be effectively used for data augmentation in the NLP domain.
The study further explores randomized QMC (RQMC), which maintains the QMC convergence rate and facilitates computational efficiency analysis. Emphasis is laid on integrating randomly shifted lattice rules, a distinct RQMC quadrature, with IS,a classic variance reduction technique. The study underscores the intricacies of establishing a theoretical convergence rate for IS in QMC compared to MC, given the influence of problem dimensions and smoothness on QMC. The research also touches on the significance of IS density selection and its potential implications. The study culminates in examining the error bound of IS with a randomly shifted lattice rule, drawing inspiration from the reproducing kernel Hilbert space (RKHS). In the realm of finance and statistics, many problems boil down to computing expectations, predominantly integrals concerning a Gaussian measure. This study considers optimal drift importance sampling (ODIS) and Laplace importance sampling (LapIS) as common importance densities. Conclusively, the paper establishes that under certain conditions, the IS-randomly shifted lattice rule can achieve a near $O(N^{-1})$ error bound.
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.
Hashing has been widely used in approximate nearest search for large-scale database retrieval for its computation and storage efficiency. Deep hashing, which devises convolutional neural network architecture to exploit and extract the semantic information or feature of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated and I conclude three main different directions for deep supervised hashing methods. Several comments are made at the end. Moreover, to break through the bottleneck of the existing hashing methods, I propose a Shadow Recurrent Hashing(SRH) method as a try. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function to encourage similar images projected close. To this end, I propose a concept: shadow of the CNN output. During optimization process, the CNN output and its shadow are guiding each other so as to achieve the optimal solution as much as possible. Several experiments on dataset CIFAR-10 show the satisfying performance of SRH.
Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic for various learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.