The Research Excellence Framework (REF) is a periodic UK-wide assessment of the quality of published research in universities. The most recent REF was in 2014, and the next will be in 2021. The published results of REF2014 include a categorical `quality profile' for each unit of assessment (typically a university department), reporting what percentage of the unit's REF-submitted research outputs were assessed as being at each of four quality levels (labelled 4*, 3*, 2* and 1*). Also in the public domain are the original submissions made to REF2014, which include -- for each unit of assessment -- publication details of the REF-submitted research outputs. In this work, we address the question: to what extent can a REF quality profile for research outputs be attributed to the journals in which (most of) those outputs were published? The data are the published submissions and results from REF2014. The main statistical challenge comes from the fact that REF quality profiles are available only at the aggregated level of whole units of assessment: the REF panel's assessment of each individual research output is not made public. Our research question is thus an `ecological inference' problem, which demands special care in model formulation and methodology. The analysis is based on logit models in which journal-specific parameters are regularized via prior `pseudo-data'. We develop a lack-of-fit measure for the extent to which REF scores appear to depend on publication venues rather than research quality or institution-level differences. Results are presented for several research fields.
Graph Neural Networks (GNNs) are the state-of-the-art model for machine learning on graph-structured data. The most popular class of GNNs operate by exchanging information between adjacent nodes, and are known as Message Passing Neural Networks (MPNNs). Given their widespread use, understanding the expressive power of MPNNs is a key question. However, existing results typically consider settings with uninformative node features. In this paper, we provide a rigorous analysis to determine which function classes of node features can be learned by an MPNN of a given capacity. We do so by measuring the level of pairwise interactions between nodes that MPNNs allow for. This measure provides a novel quantitative characterization of the so-called over-squashing effect, which is observed to occur when a large volume of messages is aggregated into fixed-size vectors. Using our measure, we prove that, to guarantee sufficient communication between pairs of nodes, the capacity of the MPNN must be large enough, depending on properties of the input graph structure, such as commute times. For many relevant scenarios, our analysis results in impossibility statements in practice, showing that over-squashing hinders the expressive power of MPNNs. We validate our theoretical findings through extensive controlled experiments and ablation studies.
Instrumental Variables provide a way of addressing bias due to unmeasured confounding when estimating treatment effects using observational data. As instrument prescription preference of individual healthcare providers has been proposed. Because prescription preference is hard to measure and often unobserved, a surrogate measure constructed from available data is often required for the analysis. Different construction methods for this surrogate measure are possible, such as simple rule-based methods which make use of the observed treatment patterns, or more complex model-based methods that employ formal statistical models to explain the treatment behaviour whilst considering measured confounders. The choice of construction method relies on aspects like data availability within provider, missing data in measured confounders, and possible changes in prescription preference over time. In this paper we conduct a comprehensive simulation study to evaluate different construction methods for surrogates of prescription preference under different data conditions, including: different provider sizes, missing covariate data, and change in preference. We also propose a novel model-based construction method to address between provider differences and change in prescription preference. All presented construction methods are exemplified in a case study of the relative glucose lowering effect of two type 2 diabetes treatments in observational data. Our study shows that preference-based Instrumental Variable methods can be a useful tool for causal inference from observational health data. The choice of construction method should be driven by the data condition at hand. Our proposed method is capable of estimating the causal treatment effect without bias in case of sufficient prescription data per provider, changing prescription preference over time and non-ignorable missingness in measured confounders.
In Bayesian inference, a simple and popular approach to reduce the burden of computing high dimensional integrals against a posterior $\pi$ is to make the Laplace approximation $\hat\gamma$. This is a Gaussian distribution, so computing $\int fd\pi$ via the approximation $\int fd\hat\gamma$ is significantly less expensive. In this paper, we make two general contributions to the topic of high-dimensional Laplace approximations, as well as a third contribution specific to a logistic regression model. First, we tighten the dimension dependence of the error $|\int fd\pi - \int fd\hat\gamma|$ for a broad class of functions $f$. Second, we derive a higher-accuracy approximation $\hat\gamma_S$ to $\pi$, which is a skew-adjusted modification to $\hat\gamma$. Our third contribution - in the setting of Bayesian inference for logistic regression with Gaussian design - is to use the first two results to derive upper bounds which hold uniformly over different sample realizations, and lower bounds on the Laplace mean approximation error. In particular, we prove a skewed Bernstein-von Mises Theorem in this logistic regression setting.
This article presents the results of a quantitative analysis of Ukrainian Arts and Humanities (A&H) research from 2012 to 2021, as observed in Scopus. The overall publication activity and the relative share of A&H publications in relation to Ukraine's total research output, comparing them with other countries. The study analyzes the diversity and total number of sources, as well as the geographic distribution of authors and citing authors, to provide insights into the internationalization level of Ukrainian A&H research. Additionally, the topical spectrum and language usage are considered to complete the overall picture. According to our results, the publication patterns for Ukrainian A&H research exhibit dynamics comparable to those of other countries, with a gradual increase in the total number of papers and sources. However, the citedness is lower than expected, and the share of publications in top-quartile sources is lower for 2020-2021 period compared to the previous years. The impact of internationally collaborative papers, especially those in English, is higher. Nevertheless, over half of all works remain uncited, probably due to the limited readership of the journals selected for publication.
Jupyter notebooks facilitate the bundling of executable code with its documentation and output in one interactive environment, and they represent a popular mechanism to document and share computational workflows. The reproducibility of computational aspects of research is a key component of scientific reproducibility but has not yet been assessed at scale for Jupyter notebooks associated with biomedical publications. We address computational reproducibility at two levels: First, using fully automated workflows, we analyzed the computational reproducibility of Jupyter notebooks related to publications indexed in PubMed Central. We identified such notebooks by mining the articles full text, locating them on GitHub and re-running them in an environment as close to the original as possible. We documented reproduction success and exceptions and explored relationships between notebook reproducibility and variables related to the notebooks or publications. Second, this study represents a reproducibility attempt in and of itself, using essentially the same methodology twice on PubMed Central over two years. Out of 27271 notebooks from 2660 GitHub repositories associated with 3467 articles, 22578 notebooks were written in Python, including 15817 that had their dependencies declared in standard requirement files and that we attempted to re-run automatically. For 10388 of these, all declared dependencies could be installed successfully, and we re-ran them to assess reproducibility. Of these, 1203 notebooks ran through without any errors, including 879 that produced results identical to those reported in the original notebook and 324 for which our results differed from the originally reported ones. Running the other notebooks resulted in exceptions. We zoom in on common problems, highlight trends and discuss potential improvements to Jupyter-related workflows associated with biomedical publications.
In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.
In practically every industry today, artificial intelligence is one of the most effective ways for machines to assist humans. Since its inception, a large number of researchers throughout the globe have been pioneering the application of artificial intelligence in medicine. Although artificial intelligence may seem to be a 21st-century concept, Alan Turing pioneered the first foundation concept in the 1940s. Artificial intelligence in medicine has a huge variety of applications that researchers are continually exploring. The tremendous increase in computer and human resources has hastened progress in the 21st century, and it will continue to do so for many years to come. This review of the literature will highlight the emerging field of artificial intelligence in medicine and its current level of development.
Transformer-based models are now widely used in NLP, but we still do not understand a lot about their inner workings. This paper describes what is known to date about the famous BERT model (Devlin et al. 2019), synthesizing over 40 analysis studies. We also provide an overview of the proposed modifications to the model and its training regime. We then outline the directions for further research.
Deep learning constitutes a recent, modern technique for image processing and data analysis, with promising results and large potential. As deep learning has been successfully applied in various domains, it has recently entered also the domain of agriculture. In this paper, we perform a survey of 40 research efforts that employ deep learning techniques, applied to various agricultural and food production challenges. We examine the particular agricultural problems under study, the specific models and frameworks employed, the sources, nature and pre-processing of data used, and the overall performance achieved according to the metrics used at each work under study. Moreover, we study comparisons of deep learning with other existing popular techniques, in respect to differences in classification or regression performance. Our findings indicate that deep learning provides high accuracy, outperforming existing commonly used image processing techniques.
Machine Learning has been the quintessential solution for many AI problems, but learning is still heavily dependent on the specific training data. Some learning models can be incorporated with a prior knowledge in the Bayesian set up, but these learning models do not have the ability to access any organised world knowledge on demand. In this work, we propose to enhance learning models with world knowledge in the form of Knowledge Graph (KG) fact triples for Natural Language Processing (NLP) tasks. Our aim is to develop a deep learning model that can extract relevant prior support facts from knowledge graphs depending on the task using attention mechanism. We introduce a convolution-based model for learning representations of knowledge graph entity and relation clusters in order to reduce the attention space. We show that the proposed method is highly scalable to the amount of prior information that has to be processed and can be applied to any generic NLP task. Using this method we show significant improvement in performance for text classification with News20, DBPedia datasets and natural language inference with Stanford Natural Language Inference (SNLI) dataset. We also demonstrate that a deep learning model can be trained well with substantially less amount of labeled training data, when it has access to organised world knowledge in the form of knowledge graph.