In the present academic landscape, data collection is slow, and weak infrastructure for data collaboration delays the production and dissemination of conclusive findings. There is therefore a growing need for a secure, scalable, and trustworthy data-sharing ecosystem that promotes and rewards collaborative data sharing among researchers, and a robust incentive mechanism is required to achieve this. Reputation-based incentives, such as the h-index, have historically played a pivotal role in the academic community, but the h-index suffers from several limitations. This paper introduces the SCIENCE-index, a blockchain-based metric of a researcher's scientific contributions. Using the Microsoft Academic Graph and machine learning techniques, the SCIENCE-index predicts the progress a researcher makes over their career and provides a soft incentive for sharing datasets with peer researchers. To incentivize data sharing, the SCIENCE-index is augmented with a data-sharing parameter, proxied by DataCite, a database of openly available datasets, and reflecting a researcher's data-sharing activity. We evaluate our model by comparing the distribution of its output for geographically diverse researchers with that of the h-index and observe that it yields a considerably more even spread of evaluations. The SCIENCE-index is a crucial component in constructing a decentralized protocol that promotes trust-based data sharing and addresses the current inequity in dataset sharing. The work outlined in this paper provides a foundation for assessing scientific contributions in future data-sharing spaces powered by decentralized applications.
Ontologies play a critical role in Semantic Web technologies by providing a structured and standardized way to represent knowledge and enabling machines to understand the meaning of data. Several taxonomies and ontologies have been generated, but each targets a single domain, and many have proven expensive in time and manual effort. They also lack coverage of unconventional topics that would represent a more holistic and comprehensive view of the knowledge landscape and support interdisciplinary collaboration. There is thus a need for an ontology that covers Science and Technology and facilitates multidisciplinary research by connecting topics from different fields and domains that may be related or share commonalities. To address these issues, we present an automatically constructed Science and Technology Ontology (S&TO) that covers unconventional topics across science and technology domains. The proposed S&TO can promote the discovery of new research areas and collaborations across disciplines. The ontology is constructed by applying BERTopic to a dataset of 393,991 scientific articles collected from Semantic Scholar between October 2021 and August 2022, covering four fields of science. Currently, S&TO includes 5,153 topics and 13,155 semantic relations. The S&TO model can be updated by running BERTopic on more recent datasets.
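To make the construction step concrete, the sketch below applies BERTopic to a corpus of abstracts in the way the paragraph above describes; the file path, column name, and parameter values are illustrative assumptions rather than the paper's actual configuration.

```python
# Minimal sketch: extracting candidate ontology topics from scientific
# abstracts with BERTopic. The path, column name, and min_topic_size are
# hypothetical choices, not the S&TO pipeline's real settings.
import pandas as pd
from bertopic import BERTopic

docs = pd.read_csv("semantic_scholar_abstracts.csv")["abstract"].dropna().tolist()

topic_model = BERTopic(min_topic_size=30, verbose=True)
topics, probs = topic_model.fit_transform(docs)

# Each discovered topic is a candidate concept for the ontology.
print(topic_model.get_topic_info().head(10))

# A topic hierarchy can suggest broader/narrower semantic relations.
hierarchy = topic_model.hierarchical_topics(docs)
print(hierarchy.head())
```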
Pooling and sharing data increases and distributes its value. But since data cannot be revoked once shared, scenarios that require controlled release of data for regulatory, privacy, and legal reasons default to not sharing. Because selectively controlling what data to release is difficult, the few data-sharing consortia that exist are often built around data-sharing agreements resulting from long and tedious one-off negotiations. We introduce Data Station, a data escrow designed to enable the formation of data-sharing consortia. Data owners share data with the escrow knowing it will not be released without their consent. Data users delegate their computation to the escrow. The data escrow relies on delegated computation to execute queries without releasing the data first. Data Station leverages hardware enclaves to generate trust among participants, and exploits the centralization of data and computation to generate an audit log. We evaluate Data Station on machine learning and data-sharing applications while running on an untrusted intermediary. In addition to important qualitative advantages, we show that Data Station: i) outperforms federated learning baselines in accuracy and runtime for the machine learning application; ii) is orders of magnitude faster than alternative secure data-sharing frameworks; and iii) introduces small overhead on the critical path.
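The following toy sketch illustrates the delegated-computation pattern described above in plain Python; the class, method names, and consent model are hypothetical simplifications and do not correspond to Data Station's actual interfaces, enclave support, or policy language.

```python
# Toy data-escrow sketch: owners deposit data, users delegate functions,
# and only derived results leave the escrow after owner consent, with every
# action recorded in an audit log. All names here are hypothetical.
from datetime import datetime, timezone

class DataEscrow:
    def __init__(self):
        self._datasets = {}    # dataset_id -> (owner, data)
        self._consent = set()  # (owner, function_name) pairs approved for release
        self.audit_log = []    # append-only record of every action

    def deposit(self, owner, dataset_id, data):
        self._datasets[dataset_id] = (owner, data)
        self._log(owner, f"deposit {dataset_id}")

    def grant(self, owner, function_name):
        self._consent.add((owner, function_name))
        self._log(owner, f"grant {function_name}")

    def run(self, user, function, dataset_id):
        owner, data = self._datasets[dataset_id]
        self._log(user, f"run {function.__name__} on {dataset_id}")
        result = function(data)  # the computation runs inside the escrow
        if (owner, function.__name__) not in self._consent:
            raise PermissionError("owner has not consented to release this result")
        return result            # only the derived result is released

    def _log(self, actor, action):
        self.audit_log.append((datetime.now(timezone.utc).isoformat(), actor, action))

# The raw rows never leave the escrow; only the approved aggregate does.
escrow = DataEscrow()
escrow.deposit("hospital_a", "admissions", [42, 17, 63])
escrow.grant("hospital_a", "mean")

def mean(rows):
    return sum(rows) / len(rows)

print(escrow.run("researcher_b", mean, "admissions"))
```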
Multipath QUIC is a transport protocol that allows the use of multiple network interfaces for a single connection. It thereby offers, on the one hand, the possibility of aggregating a higher throughput, while, on the other hand, multiple paths can also be used to transmit data redundantly. Selective redundancy combines these two applications and thereby offers the potential to transmit time-critical data. This paper considers scenarios in which data with real-time requirements are transmitted redundantly while non-critical data should simultaneously make use of the aggregated throughput. A new model called congestion window reservation is proposed, which enables the immediate transmission of time-critical data. The performance of this method and its combination with selective redundancy is evaluated on Emulab with real data. The results show that this technique leads to lower end-to-end latency and higher reliability for periodically generated priority data.
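A minimal sketch of the congestion-window-reservation idea follows; it only models the bookkeeping (a fixed share of the window held back for priority data), and all names and numbers are illustrative assumptions, not the paper's QUIC implementation.

```python
# Toy sketch: reserve a fraction of the congestion window so time-critical
# data can always be sent immediately, while bulk data may only use the
# unreserved share. Parameters are illustrative assumptions.
class ReservingSender:
    def __init__(self, cwnd_bytes, reserved_fraction=0.2):
        self.cwnd = cwnd_bytes
        self.reserved = int(cwnd_bytes * reserved_fraction)  # kept free for priority data
        self.in_flight = 0

    def can_send(self, size, priority):
        if priority:
            # Priority data may use the whole window, including the reserve.
            return self.in_flight + size <= self.cwnd
        # Bulk data must not eat into the reserved share.
        return self.in_flight + size <= self.cwnd - self.reserved

    def send(self, size, priority=False):
        if not self.can_send(size, priority):
            return False
        self.in_flight += size
        return True

sender = ReservingSender(cwnd_bytes=12_000)
print(sender.send(9_000))                  # bulk fits in the unreserved share -> True
print(sender.send(2_000))                  # bulk would use the reserve -> False
print(sender.send(2_000, priority=True))   # priority data uses the reserve -> True
```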
Impact and originality are two critical dimensions for evaluating scientific publications, measured by citation and disruption metrics respectively. Despite extensive effort to understand the statistical properties and evolution of each of these metrics, the relation between the two remains unclear. In this paper, we study the evolution over the last 70 years of the correlation between scientific papers' citation and disruption, surprisingly finding a decreasing trend from positive to negative correlations over the years. Consequently, over the years there are fewer and fewer disruptive works among highly cited papers. These results suggest that highly disruptive studies nowadays attract less attention from the scientific community. An analysis of papers' references supports this trend, showing that papers citing older, less popular, and more diverse references tend to receive fewer citations. A possible explanation for this declining attention is the increasing information overload in science, together with citations becoming an ever more prominent measure of impact. This is supported by the evidence that research fields with more papers show a more negative correlation between citation and disruption. Finally, we show the generality of our findings by analyzing and comparing six disciplines.
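As a concrete illustration of the yearly analysis, the short sketch below computes the citation-disruption correlation per publication year; the input file and the "year", "citations", and "disruption" column names are assumptions for illustration, not the paper's actual data schema.

```python
# Sketch: per-year correlation between citation counts and disruption scores.
# The column names and input file are hypothetical.
import pandas as pd
from scipy.stats import pearsonr

papers = pd.read_csv("papers.csv")  # one row per paper

correlations = {}
for year, group in papers.groupby("year"):
    if len(group) > 2:
        r, _ = pearsonr(group["citations"], group["disruption"])
        correlations[year] = r

# A drift from positive toward negative r across years would reproduce
# the trend described in the abstract.
for year in sorted(correlations):
    print(year, round(correlations[year], 3))
```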
Refactorings are transformations that improve code design without changing overall functionality or observable behavior. During the refactoring of smelly test code, practitioners may struggle to identify refactoring candidates and to define and apply corrective strategies. This paper reports on an empirical study aimed at understanding how test smells and test refactorings are discussed on the Stack Exchange network. Developers commonly count on Stack Exchange to pick the brains of the wise, i.e., to `look up' how others complete similar tasks. Therefore, using data from Stack Exchange discussion topics, we examine how developers understand and perceive test smells, the corrective actions they take to handle them, and the challenges they face when refactoring test code to fix test smells. We observed that developers are interested in others' perceptions and hands-on experience in handling test code issues. Moreover, there is a clear indication that developers more often ask whether test smells or anti-patterns are good or bad testing practices than they ask for code-based refactoring recommendations.
As Autonomous Systems (AS) become more ubiquitous in society, take on more responsibility for our safety, and interact with us more frequently, it is essential that they are trustworthy. Assessing the trustworthiness of AS is a mandatory challenge for the verification and development community. This will require appropriate standards and suitable metrics that can objectively and comparatively judge the trustworthiness of AS across the broad range of current and future applications. The meta-expression `trustworthiness' is examined in the context of AS, capturing the relevant qualities that comprise this term in the literature. Recent developments in standards and frameworks that support assurance of autonomous systems are reviewed. A list of key challenges for the community is identified, and we present an outline of a process that can serve as a trustworthiness assessment framework for AS.
With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks has become a popular paradigm. Researchers have achieved various outcomes in constructing BMs and applying them in many fields. At present, however, there is little research that surveys the overall progress of BMs and guides follow-up work. In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and the applications of BMs, dividing the review into four parts: Resource, Models, Key Technologies, and Application. We introduce 16 specific BM-related topics across these four parts: Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory&Interpretability, Commonsense Reasoning, Reliability&Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue, and Protein Research. For each topic, we clearly summarize current studies and propose future research directions. At the end of the paper, we discuss the further development of BMs from a more general perspective.
Classic machine learning methods are built on the $i.i.d.$ assumption that training and testing data are independent and identically distributed. In real scenarios, however, this assumption can hardly be satisfied, leading to a sharp drop in the performance of classic machine learning algorithms under distribution shift and motivating the study of the Out-of-Distribution (OOD) generalization problem, which addresses the challenging setting in which the testing distribution is unknown and different from the training distribution. This paper is the first effort to systematically and comprehensively discuss the OOD generalization problem, from definition and methodology to evaluation, implications, and future directions. First, we provide a formal definition of the OOD generalization problem. Second, existing methods are categorized into three parts based on their position in the learning pipeline, namely unsupervised representation learning, supervised model learning, and optimization, and typical methods in each category are discussed in detail. We then demonstrate the theoretical connections among the categories and introduce commonly used datasets and evaluation metrics. Finally, we summarize the literature and raise some future directions for the OOD generalization problem. A summary of the OOD generalization methods reviewed in this survey can be found at //out-of-distribution-generalization.com.
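The toy example below illustrates the motivating failure mode on synthetic data: a classifier fit under the i.i.d. assumption loses accuracy once the test distribution is shifted. It is a didactic sketch only and does not correspond to any method surveyed in the paper.

```python
# Toy illustration: an i.i.d.-trained classifier degrades under distribution shift.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample(n, shift=0.0):
    # Two Gaussian classes; `shift` moves the whole feature distribution at test time.
    x0 = rng.normal(loc=0.0 + shift, scale=1.0, size=(n, 2))
    x1 = rng.normal(loc=2.0 + shift, scale=1.0, size=(n, 2))
    X = np.vstack([x0, x1])
    y = np.array([0] * n + [1] * n)
    return X, y

X_train, y_train = sample(500)
clf = LogisticRegression().fit(X_train, y_train)

X_iid, y_iid = sample(500)              # same distribution as training
X_ood, y_ood = sample(500, shift=2.0)   # shifted test distribution

print("i.i.d. test accuracy:", clf.score(X_iid, y_iid))
print("OOD test accuracy:  ", clf.score(X_ood, y_ood))
```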
This paper proposes a recommender system that alleviates the cold-start problem by estimating user preferences from only a small number of items. To identify a user's preference in the cold state, existing recommender systems, such as Netflix, initially provide items to the user; we call those items evidence candidates. Recommendations are then made based on the items selected by the user. Previous recommendation studies have two limitations: (1) users who have consumed only a few items receive poor recommendations, and (2) inadequate evidence candidates are used to identify user preferences. We propose a meta-learning-based recommender system called MeLU to overcome these limitations. Through meta-learning, which can rapidly adapt to a new task with a few examples, MeLU can estimate a new user's preferences from a few consumed items. In addition, we provide an evidence candidate selection strategy that determines distinguishing items for customized preference estimation. We validate MeLU on two benchmark datasets, where the proposed model reduces mean absolute error by at least 5.92% compared with two competing models. We also conduct a user study to verify the evidence selection strategy.
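To show the flavor of the meta-learning step, the sketch below implements a MAML-style inner/outer loop in which a globally learned preference model is locally adapted to each new user from their few rated items; the architecture, feature size, and hyperparameters are illustrative assumptions and not MeLU's actual design.

```python
# MAML-style sketch of local user adaptation (in the spirit of MeLU, but not
# its exact architecture). Each "task" is one user: a support set of a few
# consumed items and a query set of held-out items.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
inner_lr = 0.01  # illustrative local-update learning rate

def forward_with(params, x):
    # Apply the two linear layers using (possibly adapted) parameters.
    h = torch.relu(x @ params[0].t() + params[1])
    return h @ params[2].t() + params[3]

def adapt_to_user(support_x, support_y):
    # One gradient step on the user's few consumed items ("local update").
    params = list(model.parameters())
    loss = loss_fn(forward_with(params, support_x), support_y)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    return [p - inner_lr * g for p, g in zip(params, grads)]

def meta_step(user_tasks):
    # Outer ("global") update over a batch of users.
    meta_opt.zero_grad()
    meta_loss = torch.zeros(())
    for support_x, support_y, query_x, query_y in user_tasks:
        fast_params = adapt_to_user(support_x, support_y)
        meta_loss = meta_loss + loss_fn(forward_with(fast_params, query_x), query_y)
    meta_loss.backward()
    meta_opt.step()
    return float(meta_loss)
```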