
Knowledge graphs represent facts about real-world entities, most of which are defined as positive statements. Negative statements are scarce but highly relevant under the open-world assumption, and they have been shown to improve the performance of several applications, particularly in the biomedical domain. However, no benchmark dataset supports the evaluation of methods that take these negative statements into account. We present a collection of datasets for three relation prediction tasks - protein-protein interaction prediction, gene-disease association prediction, and disease prediction - that aims to circumvent the difficulties in building benchmarks for knowledge graphs with negative statements. These datasets include data from two widely used biomedical ontologies, the Gene Ontology and the Human Phenotype Ontology, enriched with negative statements. We also generate knowledge graph embeddings for each dataset with two popular path-based methods and evaluate their performance on each task. The results show that negative statements can improve the performance of knowledge graph embeddings.
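As a rough illustration of how negative statements can sit alongside ordinary triples before path-based embeddings are computed, the sketch below keeps negative statements as separate edges whose relation name carries a negation prefix, and generates the random walks that path-based methods typically consume. The relation names, the "NOT_" convention, and the walk strategy are illustrative assumptions, not the datasets' actual schema.

```python
import random

# Toy knowledge graph: (subject, relation, object) triples.
# Negative statements are kept as separate triples whose relation is
# prefixed with "NOT_", so random walks can traverse them as distinct
# edges (an illustrative convention, not the datasets' schema).
positive = [
    ("ProteinA", "interacts_with", "ProteinB"),
    ("ProteinB", "annotated_with", "GO:0005515"),
    ("GeneX", "associated_with", "DiseaseY"),
]
negative = [
    ("ProteinA", "interacts_with", "ProteinC"),
]

triples = positive + [(s, "NOT_" + r, o) for s, r, o in negative]

# Adjacency list used to generate random walks, the raw input that
# path-based embedding methods consume.
graph = {}
for s, r, o in triples:
    graph.setdefault(s, []).append((r, o))

def random_walk(start, depth=4, rng=random):
    walk = [start]
    node = start
    for _ in range(depth):
        if node not in graph:
            break
        rel, node = rng.choice(graph[node])
        walk.extend([rel, node])
    return walk

random.seed(0)
print(random_walk("ProteinA"))
```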

Related content


We establish tight bi-Lipschitz bounds certifying quasi-universality (universality up to a constant factor) for various distances between Reeb graphs: the interleaving distance, the functional distortion distance, and the functional contortion distance. The definition of the latter distance is a novel contribution, and for the special case of contour trees we also prove strict universality of this distance. Furthermore, we prove that for the special case of merge trees the functional contortion distance coincides with the interleaving distance, yielding universality of all four distances in this case.
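For readers unfamiliar with the terminology, a bi-Lipschitz bound certifying quasi-universality of a distance $d$ relative to a reference distance $d_{\mathrm{ref}}$ between Reeb graphs $R_f, R_g$ has the schematic form below; the constants are placeholders, not the tight ones proved in the paper.

```latex
% Schematic form of a bi-Lipschitz (quasi-universality) bound between
% two Reeb-graph distances; c_1, c_2 are placeholder constants.
c_1 \, d_{\mathrm{ref}}(R_f, R_g) \;\le\; d(R_f, R_g) \;\le\; c_2 \, d_{\mathrm{ref}}(R_f, R_g),
\qquad 0 < c_1 \le c_2 < \infty .
```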

This study focuses on the use of model and data fusion for improving the Spalart-Allmaras (SA) closure model for Reynolds-averaged Navier-Stokes solutions of separated flows. In particular, our goal is to develop models that not only assimilate sparse experimental data to improve performance in computational models, but also generalize to unseen cases by recovering classical SA behavior. We achieve these goals using data assimilation, namely the ensemble Kalman filter (EnKF) approach, to calibrate the coefficients of the SA model for separated flows. A holistic calibration strategy is implemented via a parameterization of the production, diffusion, and destruction terms. This calibration relies on the assimilation of experimental data consisting of velocity profiles, skin friction, and pressure coefficients for separated flows. Despite using observational data from a single flow condition around a backward-facing step (BFS), the recalibrated SA model generalizes to other separated flows, including cases such as the 2D bump and the modified BFS. Significant improvement is observed in the quantities of interest, i.e., the skin friction coefficient ($C_f$) and the pressure coefficient ($C_p$), for each flow tested. Finally, we also demonstrate that the newly proposed model recovers SA proficiency for external, unseparated flows, such as the flow around a NACA-0012 airfoil, without any danger of extrapolation, and that the individually calibrated terms of the SA model target specific flow physics: the calibrated production term improves the recirculation zone, while the destruction term improves the recovery zone.
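The EnKF analysis step that underlies such a calibration is, in its generic stochastic (perturbed-observation) form, the textbook update sketched below; this is not the authors' specific implementation or their parameterization of the SA coefficients.

```python
import numpy as np

def enkf_update(X, y, obs_op, R, rng=np.random.default_rng(0)):
    """Stochastic EnKF analysis step (generic textbook form).

    X       : (n_params, n_ens) ensemble of SA coefficient vectors
    y       : (n_obs,) observed data (e.g. velocity profiles, C_f, C_p)
    obs_op  : function mapping a parameter vector to predicted observations
    R       : (n_obs, n_obs) observation-error covariance
    """
    n_params, n_ens = X.shape
    HX = np.column_stack([obs_op(X[:, i]) for i in range(n_ens)])  # model predictions

    Xm, HXm = X.mean(axis=1, keepdims=True), HX.mean(axis=1, keepdims=True)
    A, HA = X - Xm, HX - HXm                                 # ensemble anomalies

    P_xy = A @ HA.T / (n_ens - 1)                            # cross-covariance
    P_yy = HA @ HA.T / (n_ens - 1) + R                       # innovation covariance
    K = P_xy @ np.linalg.solve(P_yy, np.eye(len(y)))         # Kalman gain

    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, n_ens).T  # perturbed obs
    return X + K @ (Y - HX)                                  # updated ensemble
```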

Most of the literature on causality considers the structural framework of Pearl and the potential-outcome framework of Neyman and Rubin to be formally equivalent, and therefore interchangeably uses the do-notation and the potential-outcome subscript notation to write counterfactual outcomes. In this paper, we superimpose the two causal frameworks to prove that structural counterfactual outcomes and potential outcomes do not coincide in general -- not even in law. More precisely, we express the law of the potential outcomes in terms of the latent structural causal model under the fundamental assumptions of causal inference. This enables us to precisely identify when counterfactual inference is or is not equivalent between approaches, and to clarify the meaning of each kind of counterfactuals.
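The notational identification the paper scrutinizes is the commonly assumed correspondence between the two frameworks, which can be written schematically as follows; this is the conventional reading that the paper re-examines, not a claim it endorses for counterfactual (cross-world) quantities.

```latex
% Conventional identification between the two notations (interventional level):
P(Y_x = y) \;=\; P\big(Y = y \mid \mathrm{do}(X = x)\big),
% together with the consistency assumption
X = x \;\Longrightarrow\; Y_x = Y .
```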

DNA is a promising storage medium, but its stability and the occurrence of indel errors pose significant challenges. The relative occurrence of guanine (G) and cytosine (C) in DNA is crucial for its longevity, and reverse-complementary base pairs should be avoided to prevent the formation of secondary structures in DNA strands. We overcome these challenges by selecting appropriate group homomorphisms. For storing and retrieving information in DNA strings, we use kernel codes and the Varshamov-Tenengolts algorithm, which corrects single indel errors. Additionally, we construct codes of any desired length $n$ and compute their reverse-complement distance as a function of $n$.
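To make the ingredients concrete, the sketch below shows the standard binary Varshamov-Tenengolts syndrome (membership in $VT_a(n)$ is what enables single-indel correction) together with GC-content and reverse-complement checks for DNA strings; how the paper maps DNA alphabets onto such codes via group homomorphisms is not reproduced here.

```python
def vt_syndrome(bits):
    """Varshamov-Tenengolts syndrome: sum of i*x_i (1-indexed) mod (n+1).
    The code VT_a(n) = {x : vt_syndrome(x) == a} corrects a single indel."""
    n = len(bits)
    return sum(i * b for i, b in enumerate(bits, start=1)) % (n + 1)

def gc_content(strand):
    """Fraction of G/C bases; moderate values favour strand longevity."""
    return sum(base in "GC" for base in strand) / len(strand)

def reverse_complement(strand):
    """Reverse complement; long matches with it can fold into secondary structure."""
    pair = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pair[base] for base in reversed(strand))

print(vt_syndrome([1, 0, 1, 1, 0]))   # (1 + 3 + 4) % 6 = 2
print(gc_content("ATGCGC"))           # 4/6
print(reverse_complement("ATGCGC"))   # GCGCAT
```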

The Symmetric Information Bottleneck (SIB), an extension of the more familiar Information Bottleneck, is a dimensionality reduction technique that simultaneously compresses two random variables to preserve information between their compressed versions. We introduce the Generalized Symmetric Information Bottleneck (GSIB), which explores different functional forms of the cost of such simultaneous reduction. We then explore the dataset size requirements of such simultaneous compression. We do this by deriving bounds and root-mean-squared estimates of statistical fluctuations of the involved loss functions. We show that, in typical situations, the simultaneous GSIB compression requires qualitatively less data to achieve the same errors compared to compressing variables one at a time. We suggest that this is an example of a more general principle that simultaneous compression is more data efficient than independent compression of each of the input variables.
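For orientation, the classical Information Bottleneck compresses a single variable, and the symmetric variant compresses two variables jointly; one common way to write the two objectives is shown below. The generalized functional forms studied by GSIB are not reproduced here.

```latex
% Classical IB: compress X into T while preserving information about Y.
\mathcal{L}_{\mathrm{IB}} \;=\; I(X;T) \;-\; \beta\, I(T;Y).
% Symmetric IB: compress X_1 into T_1 and X_2 into T_2 simultaneously,
% preserving information between the compressed representations.
\mathcal{L}_{\mathrm{SIB}} \;=\; I(X_1;T_1) + I(X_2;T_2) \;-\; \beta\, I(T_1;T_2).
```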

A fundamental aspect of statistics is the integration of data from different sources. Classically, Fisher and others focused on how to integrate homogeneous (or only mildly heterogeneous) sets of data. More recently, as data has become more accessible, the question of whether data sets from different sources should be integrated has become more relevant. The current literature treats this as a question with only two answers: integrate or don't. Here we take a different approach, motivated by information-sharing principles from the shrinkage estimation literature. In particular, we deviate from the do/don't perspective and propose a dial parameter that controls the extent to which two data sources are integrated. How far this dial parameter should be turned is shown to depend, for example, on the informativeness of the different data sources as measured by Fisher information. In the context of generalized linear models, this more nuanced data integration framework leads to relatively simple parameter estimates and valid tests/confidence intervals. Moreover, we demonstrate both theoretically and empirically that setting the dial parameter according to our recommendation leads to more efficient estimation than other binary data integration schemes.
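Schematically, a dial-controlled integration of two data sources can be pictured as a weighted combination of their likelihood contributions, with the dial interpolating between no pooling and full pooling. This is only an illustrative form; the paper's actual construction and its Fisher-information-based tuning of the dial are more specific.

```latex
% alpha = 0 : use only the primary source;  alpha = 1 : fully pool both sources.
\hat{\theta}(\alpha) \;=\; \arg\max_{\theta}\;
\Big\{ \ell_{1}(\theta) \;+\; \alpha\, \ell_{2}(\theta) \Big\},
\qquad \alpha \in [0, 1].
```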

Missing data frequently occurs in datasets across various domains, such as medicine, sports, and finance. In many cases, to enable proper and reliable analyses of such data, the missing values are imputed, and the method used should have a low root mean square error (RMSE) between the imputed and the true values. In addition, for some critical applications it is also a requirement that the imputation method be scalable and that the logic behind the imputation be explainable, which is especially difficult for complex methods based, for example, on deep learning. Based on these considerations, we propose a new algorithm named "conditional Distribution-based Imputation of Missing Values with Regularization" (DIMV). DIMV operates by determining the conditional distribution of a feature that has missing entries, using the information from the fully observed features as a basis. As illustrated by the experiments in the paper, DIMV (i) gives a low RMSE for the imputed values compared to state-of-the-art methods; (ii) is fast and scalable; (iii) is explainable, since its imputations can be read as coefficients in a regression model, allowing reliable and trustworthy analysis and making it a suitable choice for critical domains where understanding is important, such as medicine and finance; (iv) can provide an approximate confidence region for the missing values in a given sample; (v) is suitable for both small- and large-scale data; (vi) in many scenarios, does not require a huge number of parameters, as deep learning approaches do; (vii) handles multicollinearity in imputation effectively; and (viii) is robust to violations of the normality assumption on which its theoretical grounding relies.
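A minimal sketch of the core idea, conditional-mean imputation under a multivariate normal model with a ridge-regularized covariance, is given below; the actual DIMV algorithm (its regularization scheme, confidence regions, and handling of partially observed conditioning features) is more involved than this.

```python
import numpy as np

def conditional_gaussian_impute(X, alpha=0.1):
    """Impute NaNs with the conditional mean of a fitted Gaussian,
    E[x_miss | x_obs] = mu_m + S_mo (S_oo + alpha I)^{-1} (x_obs - mu_o).
    Mean and covariance are estimated from a rough mean-fill (a simplification)."""
    X = np.asarray(X, dtype=float)
    mu = np.nanmean(X, axis=0)
    X_filled = np.where(np.isnan(X), mu, X)      # rough fill to estimate covariance
    S = np.cov(X_filled, rowvar=False)

    X_out = X.copy()
    for i, row in enumerate(X):
        m = np.isnan(row)
        if not m.any() or m.all():
            continue
        o = ~m
        S_oo = S[np.ix_(o, o)] + alpha * np.eye(o.sum())   # ridge regularization
        S_mo = S[np.ix_(m, o)]
        X_out[i, m] = mu[m] + S_mo @ np.linalg.solve(S_oo, row[o] - mu[o])
    return X_out

# Toy usage
X = np.array([[1.0, 2.0, np.nan], [2.0, np.nan, 6.0], [3.0, 6.0, 9.0], [4.0, 8.0, 12.0]])
print(conditional_gaussian_impute(X))
```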

Linearization of the dynamics of recurrent neural networks (RNNs) is often used to study their properties. The same RNN dynamics can be written in terms of the ``activations" (the net inputs to each unit, before its pointwise nonlinearity) or in terms of the ``activities" (the output of each unit, after its pointwise nonlinearity); the two corresponding linearizations are different from each other. This brief and informal technical note describes the relationship between the two linearizations, between the left and right eigenvectors of their dynamics matrices, and shows that some context-dependent effects are readily apparent under linearization of activity dynamics but not linearization of activation dynamics.
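To make the distinction concrete: writing the same continuous-time RNN in terms of activations $x$ or activities $r = \phi(x)$ gives two dynamical systems whose Jacobians at corresponding fixed points differ but are closely related, for example in the standard form below (the note's exact conventions may differ).

```latex
% Activation form and its Jacobian at a fixed point x^*:
\tau \dot{x} = -x + W\,\phi(x) + h,
\qquad J_x = -I + W D, \quad D = \mathrm{diag}\big(\phi'(x^*)\big).
% Activity form and its Jacobian at the corresponding fixed point r^* = \phi(x^*):
\tau \dot{r} = -r + \phi(W r + h),
\qquad J_r = -I + D W .
% W D and D W share their nonzero eigenvalues, but their eigenvectors differ,
% which is why the two linearizations can expose different effects.
```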

We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.
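A minimal sketch of one ResMLP block as described in the abstract (an affine per-channel rescaling, a cross-patch linear layer, and a per-patch channel MLP, each with a residual connection) is given below. This is a reading of the abstract's description, not the official implementation: hyperparameters and the LayerScale-style scaling are simplified or omitted.

```python
import torch
import torch.nn as nn

class Affine(nn.Module):
    """Per-channel learned scale and shift, used in place of normalization."""
    def __init__(self, dim):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(dim))
        self.beta = nn.Parameter(torch.zeros(dim))

    def forward(self, x):                      # x: (batch, patches, channels)
        return self.alpha * x + self.beta

class ResMLPBlock(nn.Module):
    def __init__(self, num_patches, dim, expansion=4):
        super().__init__()
        self.norm1 = Affine(dim)
        self.patch_mix = nn.Linear(num_patches, num_patches)  # patches interact, per channel
        self.norm2 = Affine(dim)
        self.channel_mlp = nn.Sequential(                     # channels interact, per patch
            nn.Linear(dim, expansion * dim), nn.GELU(), nn.Linear(expansion * dim, dim)
        )

    def forward(self, x):                      # x: (batch, patches, channels)
        x = x + self.patch_mix(self.norm1(x).transpose(1, 2)).transpose(1, 2)
        x = x + self.channel_mlp(self.norm2(x))
        return x

x = torch.randn(2, 196, 384)                   # e.g. 14x14 patches, 384 channels
print(ResMLPBlock(196, 384)(x).shape)          # torch.Size([2, 196, 384])
```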

Knowledge graphs (KGs) of real-world facts about entities and their relationships are useful resources for a variety of natural language processing tasks. However, because knowledge graphs are typically incomplete, it is useful to perform knowledge graph completion or link prediction, i.e. predict whether a relationship not in the knowledge graph is likely to be true. This paper serves as a comprehensive survey of embedding models of entities and relationships for knowledge graph completion, summarizing up-to-date experimental results on standard benchmark datasets and pointing out potential future research directions.
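As a pointer to what such embedding models look like, two representative scoring functions commonly covered by surveys of this kind are shown below, where $\mathbf{h}, \mathbf{r}, \mathbf{t}$ denote the embeddings of the head entity, relation, and tail entity of a triple.

```latex
% TransE: relations act as translations in the embedding space.
f_{\mathrm{TransE}}(h, r, t) \;=\; -\,\lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert ,
% DistMult: a bilinear scoring function with a diagonal relation matrix.
f_{\mathrm{DistMult}}(h, r, t) \;=\; \sum_{i} \mathbf{h}_i \, \mathbf{r}_i \, \mathbf{t}_i .
```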
