The initial motivation for this work was the linguistic case of the spread of Germanic syntactic features into Romance dialects of North-Eastern Italy, which occurred after the immigration of German people to Tyrol during the High Middle Ages. To obtain a representation of the data over the territory suitable for a mathematical formulation, an interactive map is produced as a first step, using tools of what is called Geographic Data Science. A smooth two-dimensional surface G is introduced, expressing locally which fraction of territory uses a given German language feature: it is obtained by a piecewise cubic curvature minimizing interpolant of the discrete function that says if at any surveyed locality that feature is used or not. This surface G is thought of as the value at the present time of a function describing a diffusion-convection phenomenon in two dimensions (here said tidal mode), which is subjected in a very natural way to the same equation used in physics, introducing a contextual diffusivity concept: it is shown that with two different assumptions about diffusivity, solutions of this equation, evaluated at the present time, fit well with the data interpolated by G, thus providing two convincing different pictures of diffusion-convection in the case under study, albeit simplifications and approximations. Very importantly, it is shown that the linguistic diffusion model known to linguists as Schmidt waves can be counted among the solutions of the diffusion equation
We establish a general convergence theory of the Rayleigh--Ritz method and the refined Rayleigh--Ritz method for computing some simple eigenpair ($\lambda_{*},x_{*}$) of a given analytic nonlinear eigenvalue problem (NEP). In terms of the deviation $\varepsilon$ of $x_{*}$ from a given subspace $\mathcal{W}$, we establish a priori convergence results on the Ritz value, the Ritz vector and the refined Ritz vector, and present sufficient convergence conditions for them. The results show that, as $\varepsilon\rightarrow 0$, there is a Ritz value that unconditionally converges to $\lambda_*$ and the corresponding refined Ritz vector does so too but the Ritz vector may fail to converge and even may not be unique. We also present an error bound for the approximate eigenvector in terms of the computable residual norm of a given approximate eigenpair, and give lower and upper bounds for the error of the refined Ritz vector and the Ritz vector as well as for that of the corresponding residual norms. These results nontrivially extend some convergence results on these two methods for the linear eigenvalue problem to the NEP. Examples are constructed to illustrate some of the results.
The previous state-of-the-art (SOTA) method achieved a remarkable execution accuracy on the Spider dataset, which is one of the largest and most diverse datasets in the Text-to-SQL domain. However, during our reproduction of the business dataset, we observed a significant drop in performance. We examined the differences in dataset complexity, as well as the clarity of questions' intentions, and assessed how those differences could impact the performance of prompting methods. Subsequently, We develop a more adaptable and more general prompting method, involving mainly query rewriting and SQL boosting, which respectively transform vague information into exact and precise information and enhance the SQL itself by incorporating execution feedback and the query results from the database content. In order to prevent information gaps, we include the comments, value types, and value samples for columns as part of the database description in the prompt. Our experiments with Large Language Models (LLMs) illustrate the significant performance improvement on the business dataset and prove the substantial potential of our method. In terms of execution accuracy on the business dataset, the SOTA method scored 21.05, while our approach scored 65.79. As a result, our approach achieved a notable performance improvement even when using a less capable pre-trained language model. Last but not least, we also explore the Text-to-Python and Text-to-Function options, and we deeply analyze the pros and cons among them, offering valuable insights to the community.
We present a hierarchical Bayesian pipeline, BP3M, that measures positions, parallaxes, and proper motions (PMs) for cross-matched sources between Hubble~Space~Telescope (HST) images and Gaia -- even for sparse fields ($N_*<10$ per image) -- expanding from the recent GaiaHub tool. This technique uses Gaia-measured astrometry as priors to predict the locations of sources in HST images, and is therefore able to put the HST images onto a global reference frame without the use of background galaxies/QSOs. Testing our publicly-available code in the Fornax and Draco dSphs, we measure accurate PMs that are a median of 8-13 times more precise than Gaia DR3 alone for $20.5<G<21~\mathrm{mag}$. We are able to explore the effect of observation strategies on BP3M astrometry using synthetic data, finding an optimal strategy to improve parallax and position precision at no cost to the PM uncertainty. Using 1619 HST images in the sparse COSMOS field (median 9 Gaia sources per HST image), we measure BP3M PMs for 2640 unique sources in the $16<G<21.5~\mathrm{mag}$ range, 25% of which have no Gaia PMs; the median BP3M PM uncertainty for $20.25<G<20.75~\mathrm{mag}$ sources is $0.44~$mas/yr compared to $1.03~$mas/yr from Gaia, while the median BP3M PM uncertainty for sources without Gaia-measured PMs ($20.75<G<21.5~\mathrm{mag}$) is $1.16~$mas/yr. The statistics that underpin the BP3M pipeline are a generalized way of combining position measurements from different images, epochs, and telescopes, which allows information to be shared between surveys and archives to achieve higher astrometric precision than that from each catalog alone.
Pre-trained vision and language models such as CLIP have witnessed remarkable success in connecting images and texts with a primary focus on English texts. Despite recent efforts to extend CLIP to support other languages, disparities in performance among different languages have been observed due to uneven resource availability. Additionally, current cross-lingual transfer methods of those pre-trained models would consume excessive resources for a large number of languages. Therefore, we propose a new parameter-efficient cross-lingual transfer learning framework that utilizes a translation-based alignment method to mitigate multilingual disparities and explores parameter-efficient fine-tuning methods for parameter-efficient cross-lingual transfer. Extensive experiments on XTD and Multi30K datasets, covering 11 languages under zero-shot, few-shot, and full-dataset learning scenarios, show that our framework significantly reduces the multilingual disparities among languages and improves cross-lingual transfer results, especially in low-resource scenarios, while only keeping and fine-tuning an extremely small number of parameters compared to the full model (e.g., Our framework only requires 0.16\% additional parameters of a full-model for each language in the few-shot learning scenario). The codes are available at \url{//github.com/eric-ai-lab/PECTVLM}. The codes are available at \url{//github.com/eric-ai-lab/PECTVLM}.
Equipped with Chain-of-Thought (CoT), Large language models (LLMs) have shown impressive reasoning ability in various downstream tasks. Even so, suffering from hallucinations and the inability to access external knowledge, LLMs often come with incorrect or unfaithful intermediate reasoning steps, especially in the context of answering knowledge-intensive tasks such as KBQA. To alleviate this issue, we propose a framework called Knowledge-Driven Chain-of-Thought (KD-CoT) to verify and modify reasoning traces in CoT via interaction with external knowledge, and thus overcome the hallucinations and error propagation. Concretely, we formulate the CoT rationale process of LLMs into a structured multi-round QA format. In each round, LLMs interact with a QA system that retrieves external knowledge and produce faithful reasoning traces based on retrieved precise answers. The structured CoT reasoning of LLMs is facilitated by our developed KBQA CoT collection, which serves as in-context learning demonstrations and can also be utilized as feedback augmentation to train a robust retriever. Extensive experiments on WebQSP and ComplexWebQuestion datasets demonstrate the effectiveness of proposed KD-CoT in task-solving reasoning generation, which outperforms the vanilla CoT ICL with an absolute success rate of 8.0% and 5.1%. Furthermore, our proposed feedback-augmented retriever outperforms the state-of-the-art baselines for retrieving knowledge, achieving significant improvement in Hit and recall performance. Our code and data are released on //github.com/AdelWang/KD-CoT/tree/main.
Next Point-of-Interest (POI) recommendation is a critical task in location-based services that aim to provide personalized suggestions for the user's next destination. Previous works on POI recommendation have laid focused on modeling the user's spatial preference. However, existing works that leverage spatial information are only based on the aggregation of users' previous visited positions, which discourages the model from recommending POIs in novel areas. This trait of position-based methods will harm the model's performance in many situations. Additionally, incorporating sequential information into the user's spatial preference remains a challenge. In this paper, we propose Diff-POI: a Diffusion-based model that samples the user's spatial preference for the next POI recommendation. Inspired by the wide application of diffusion algorithm in sampling from distributions, Diff-POI encodes the user's visiting sequence and spatial character with two tailor-designed graph encoding modules, followed by a diffusion-based sampling strategy to explore the user's spatial visiting trends. We leverage the diffusion process and its reversed form to sample from the posterior distribution and optimized the corresponding score function. We design a joint training and inference framework to optimize and evaluate the proposed Diff-POI. Extensive experiments on four real-world POI recommendation datasets demonstrate the superiority of our Diff-POI over state-of-the-art baseline methods. Further ablation and parameter studies on Diff-POI reveal the functionality and effectiveness of the proposed diffusion-based sampling strategy for addressing the limitations of existing methods.
The processing and analysis of computed tomography (CT) imaging is important for both basic scientific development and clinical applications. In AutoCT, we provide a comprehensive pipeline that integrates an end-to-end automatic preprocessing, registration, segmentation, and quantitative analysis of 3D CT scans. The engineered pipeline enables atlas-based CT segmentation and quantification leveraging diffeomorphic transformations through efficient forward and inverse mappings. The extracted localized features from the deformation field allow for downstream statistical learning that may facilitate medical diagnostics. On a lightweight and portable software platform, AutoCT provides a new toolkit for the CT imaging community to underpin the deployment of artificial intelligence-driven applications.
The rapid development of deep learning has made a great progress in segmentation, one of the fundamental tasks of computer vision. However, the current segmentation algorithms mostly rely on the availability of pixel-level annotations, which are often expensive, tedious, and laborious. To alleviate this burden, the past years have witnessed an increasing attention in building label-efficient, deep-learning-based segmentation algorithms. This paper offers a comprehensive review on label-efficient segmentation methods. To this end, we first develop a taxonomy to organize these methods according to the supervision provided by different types of weak labels (including no supervision, coarse supervision, incomplete supervision and noisy supervision) and supplemented by the types of segmentation problems (including semantic segmentation, instance segmentation and panoptic segmentation). Next, we summarize the existing label-efficient segmentation methods from a unified perspective that discusses an important question: how to bridge the gap between weak supervision and dense prediction -- the current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraint, cross-view consistency, cross-image relation, etc. Finally, we share our opinions about the future research directions for label-efficient deep segmentation.
For languages with no annotated resources, transferring knowledge from rich-resource languages is an effective solution for named entity recognition (NER). While all existing methods directly transfer from source-learned model to a target language, in this paper, we propose to fine-tune the learned model with a few similar examples given a test case, which could benefit the prediction by leveraging the structural and semantic information conveyed in such similar examples. To this end, we present a meta-learning algorithm to find a good model parameter initialization that could fast adapt to the given test case and propose to construct multiple pseudo-NER tasks for meta-training by computing sentence similarities. To further improve the model's generalization ability across different languages, we introduce a masking scheme and augment the loss function with an additional maximum term during meta-training. We conduct extensive experiments on cross-lingual named entity recognition with minimal resources over five target languages. The results show that our approach significantly outperforms existing state-of-the-art methods across the board.
We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks and develop a unified framework called Scientific Information Extractor (SciIE) for with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.