This preprint specifies quality requirements for a core ontology whose ontological elements, such as terms and non-taxonomic relationships, among others, are based on a foundational ontology. The quality requirements are represented in a quality model structured as a requirements tree composed of characteristics and attributes to be measured and evaluated. An attribute represents an atomic aspect of an entity, that is, an elementary non-functional requirement that can be measured by a direct or indirect metric and evaluated by an elementary indicator. In contrast, characteristics model less atomic aspects of an entity and cannot be measured by metrics; rather, they are evaluated by derived indicators, generally modeled by an aggregation function. Therefore, this preprint presents the design of direct and indirect metrics, in addition to the design of elementary indicators, which are used to implement measurement and evaluation activities that yield the results of a quality requirements tree. In particular, this document shows the applicability of the designed metrics and indicators within an evaluation and comparison strategy. Two process core ontologies were preselected, evaluated, and compared in order to adopt their strengths in the target entity, named ProcessCO. The data and information resulting from this study are also recorded, as well as the outcomes of the re-evaluation of the target entity after improvement.
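As an illustration of how such a requirements tree might be evaluated, the following minimal Python sketch aggregates elementary indicator values into a characteristic-level derived indicator via a weighted linear function; the attribute names, targets, weights, and the linear aggregation itself are illustrative assumptions, not the preprint's actual design.

```python
# Minimal sketch of evaluating a quality requirements tree.
# Names, targets, weights, and the linear aggregation are illustrative assumptions.

def elementary_indicator(measured_value: float, target: float) -> float:
    """Map a metric's measured value onto a 0-100 satisfaction scale."""
    return min(measured_value / target, 1.0) * 100.0

def derived_indicator(children: dict[str, float], weights: dict[str, float]) -> float:
    """Aggregate child indicator values with a weighted linear function."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return sum(weights[name] * value for name, value in children.items())

# Hypothetical attributes of a 'Structural Quality' characteristic.
attributes = {
    "terms_defined": elementary_indicator(measured_value=45, target=50),
    "relations_typed": elementary_indicator(measured_value=30, target=30),
}
weights = {"terms_defined": 0.6, "relations_typed": 0.4}
print(derived_indicator(attributes, weights))  # characteristic-level score: 94.0
```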
Modern car technologies are evolving quickly. They collect a variety of personal data and process it on behalf of the car manufacturer to improve the drivers' experience. The precise terms of such processing are stated in the privacy policies that the user accepts when buying a car, or through the infotainment system when it is first started. This paper uses a double lens to assess people's privacy while they drive a car. The first approach is objective and studies the readability of the privacy policies that come with cars. We analyse the privacy policies of twelve car brands and apply well-known readability indices to evaluate the extent to which privacy policies are comprehensible to all drivers. The second approach targets drivers' opinions to extrapolate their privacy concerns and trust perceptions. We design a questionnaire to collect the opinions of 88 participants and draw essential statistics from them. Our combined findings indicate that privacy is insufficiently understood at present as an issue deriving from driving a car; hence, future technologies should be tailored to make people more aware of the issue and to enable them to express their preferences.
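For context, readability indices of the kind applied here can be computed with off-the-shelf tooling. The sketch below uses the textstat Python library on a made-up policy excerpt; the paper does not specify its exact indices or implementation, so both are assumptions.

```python
# Illustrative computation of common readability indices on a policy excerpt.
# The textstat library and this excerpt are assumptions for demonstration.
import textstat

policy_excerpt = (
    "We collect vehicle telemetry and location data to provide connected "
    "services, and we may share this data with affiliated service providers."
)

print(textstat.flesch_reading_ease(policy_excerpt))   # higher = easier to read
print(textstat.flesch_kincaid_grade(policy_excerpt))  # US school grade level
print(textstat.gunning_fog(policy_excerpt))           # years of education needed
```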
Probabilistic linear discriminant analysis (PLDA) has broad application in open-set verification tasks, such as speaker verification. A key concern for PLDA is that the model is too simple (linear Gaussian) to deal with complicated data; however, this simplicity is itself a major advantage of PLDA, as it leads to desirable generalization. An interesting research question therefore is how to improve the modeling capacity of PLDA while retaining its simplicity. This paper presents a decoupling approach, which involves a global model that is simple and generalizable, and a local model that is complex and expressive. While the global model holds a bird's-eye view of the entire data, the local model represents the details of individual classes. We conduct a preliminary study in this direction and investigate a simple decoupling model comprising both the global and local models. The new model, which we call decoupled PLDA, is tested on a speaker verification task. Experimental results show that it consistently outperforms the vanilla PLDA when the model is based on raw speaker vectors. However, when the speaker vectors are processed by length normalization, the advantage of decoupled PLDA is largely lost, suggesting future research on non-linear local models.
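For reference, length normalization, the preprocessing step after which the advantage of decoupled PLDA is reported to largely disappear, is typically implemented by projecting each speaker vector onto a sphere. The sketch below uses one common convention (scaling to radius sqrt(d)); the paper's exact variant is not specified here.

```python
# Minimal sketch of length normalization for speaker vectors.
# The sqrt(d) scaling is one common convention, assumed here for illustration.
import numpy as np

def length_normalize(x: np.ndarray) -> np.ndarray:
    """Project a speaker vector onto a sphere of radius sqrt(dim)."""
    d = x.shape[-1]
    return x / np.linalg.norm(x, axis=-1, keepdims=True) * np.sqrt(d)

vec = np.random.randn(512)                    # a hypothetical raw speaker vector
print(np.linalg.norm(length_normalize(vec)))  # ~= sqrt(512)
```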
In [Ecological Complexity 44 (2020) Art. 100885, DOI: 10.1016/j.ecocom.2020.100885] a continuous-time compartmental mathematical model for the spread of the Coronavirus disease 2019 (COVID-19) is presented, with Portugal as a case study from 2 March to 4 May 2020, and the local stability of the Disease Free Equilibrium (DFE) is analysed. Here, we propose an analogous discrete-time model and, using a suitable Lyapunov function, we prove the global stability of the DFE point. Using real COVID-19 data, we show, through numerical simulations, the consistency of the obtained theoretical results.
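To make the discrete-time idea concrete, the following toy sketch iterates a generic discrete-time SIR-type model; the cited paper uses its own compartments and estimated parameters, so the structure and values below are purely illustrative. With the transmission rate below the recovery rate, the infected compartment decays towards the disease-free equilibrium.

```python
# Illustrative discrete-time compartmental (SIR-type) iteration.
# Compartments and parameter values are made up; this only shows the
# discretization idea, not the cited paper's actual model.
def discrete_sir(S, I, R, beta, gamma, steps):
    N = S + I + R
    history = [(S, I, R)]
    for _ in range(steps):
        new_inf = beta * S * I / N   # new infections this time step
        new_rec = gamma * I          # new recoveries this time step
        S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec
        history.append((S, I, R))
    return history

# The disease-free equilibrium corresponds to I -> 0 as t grows;
# here beta < gamma, so the infected compartment shrinks each step.
for S, I, R in discrete_sir(S=9_990, I=10, R=0, beta=0.2, gamma=0.3, steps=5):
    print(f"S={S:.1f} I={I:.1f} R={R:.1f}")
```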
Scholarly Knowledge Graphs (KGs) provide a rich source of structured information representing knowledge encoded in scientific publications. Given the sheer volume of published scientific literature, comprising a plethora of inhomogeneous entities and relations that describe scientific concepts, these KGs are inherently incomplete. We present exBERT, a method for leveraging pre-trained transformer language models to perform scholarly knowledge graph completion. We model triples of a knowledge graph as text and perform triple classification (i.e., deciding whether a triple belongs to the KG or not). The evaluation shows that exBERT outperforms other baselines on three scholarly KG completion datasets in the tasks of triple classification, link prediction, and relation prediction. Furthermore, we present two scholarly datasets as resources for the research community, collected from public KGs and online resources.
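A generic version of the triple-as-text classification idea can be sketched as follows: a (head, relation, tail) triple is verbalized into a sentence and scored by a BERT sequence classifier. This is not exBERT's actual code; the checkpoint, verbalization scheme, and untrained classification head are assumptions.

```python
# Generic sketch of triple classification by verbalizing a KG triple as text
# and scoring it with a BERT classifier. Checkpoint and verbalization are
# assumptions; the classification head below is untrained.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # label 1 = triple belongs to the KG
)

triple = ("BERT", "introduced_in", "2018")
text = " [SEP] ".join(triple)  # verbalize (head, relation, tail) as text

inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs[0, 1].item())  # plausibility score (meaningless until fine-tuned)
```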
A prominent problem in knowledge representation is how to answer queries while also taking into account the implicit consequences of an ontology representing domain knowledge. While this problem has been widely studied within the realm of description logic ontologies, it has been surprisingly neglected within the context of vague or imprecise knowledge, particularly from the point of view of mathematical fuzzy logic. In this paper we study the problem of answering conjunctive queries and threshold queries w.r.t. ontologies in fuzzy DL-Lite. Specifically, we show through a rewriting approach that threshold query answering w.r.t. consistent ontologies remains in $AC^0$ in data complexity, but that conjunctive query answering is highly dependent on the selected triangular norm, which has an impact on the underlying semantics. For the idempotent Gödel t-norm, we provide an effective method based on a reduction to the classical case. This paper is under consideration in Theory and Practice of Logic Programming (TPLP).
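To illustrate why the choice of t-norm matters, the snippet below contrasts the idempotent Gödel t-norm (minimum) with the non-idempotent product t-norm; it is a plain illustration of the fuzzy-conjunction semantics, not the paper's rewriting procedure.

```python
# The Gödel t-norm (minimum) versus the product t-norm, the two fuzzy
# conjunctions whose differing behaviour drives the complexity results.

def godel_tnorm(a: float, b: float) -> float:
    return min(a, b)   # idempotent: godel_tnorm(a, a) == a

def product_tnorm(a: float, b: float) -> float:
    return a * b       # not idempotent for 0 < a < 1

print(godel_tnorm(0.7, 0.7))    # 0.7
print(product_tnorm(0.7, 0.7))  # 0.49
```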
The increasing amounts of semantic resources offer a valuable store of human knowledge; however, the probability of wrong entries grows with their size. The development of approaches that identify potentially spurious parts of a given knowledge base is thus becoming an increasingly important area of interest. In this work, we present a systematic evaluation of whether structure-only link analysis methods can already offer a scalable means of detecting possible anomalies, as well as potentially interesting novel relation candidates. Evaluating thirteen methods on eight different semantic resources, including the Gene Ontology, Food Ontology, and Marine Ontology, among others, we demonstrated that structure-only link analysis can offer scalable anomaly detection for a subset of the data sets. Further, we demonstrated that by considering symbolic node embeddings, explanations of the predictions (links) can be obtained, making this branch of methods potentially more valuable than black-box-only ones. To our knowledge, this is currently one of the most extensive systematic studies of the applicability of different types of link analysis methods across semantic resources from different domains.
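As a minimal illustration of structure-only link analysis, the snippet below scores candidate edges on a toy graph with the Adamic-Adar index from networkx; the paper evaluates thirteen methods on real semantic resources, so this single heuristic on synthetic data only demonstrates the idea.

```python
# Illustrative structure-only link analysis: scoring candidate pairs with
# the Adamic-Adar index via networkx on a toy graph (an assumption; the
# paper's own graphs are real semantic resources).
import networkx as nx

G = nx.karate_club_graph()  # stand-in for a semantic resource's graph

# Low scores on existing edges can flag possible anomalies; high scores
# on non-edges suggest novel relation candidates.
candidates = [(0, 33), (5, 6)]
for u, v, score in nx.adamic_adar_index(G, candidates):
    print(f"({u}, {v}) -> {score:.3f}")
```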
The bidirectional encoder representations from transformers (BERT) model has recently advanced the state of the art in passage re-ranking. In this paper, we analyze the results produced by a fine-tuned BERT model to better understand the reasons behind such substantial improvements. To this end, we focus on the MS MARCO passage re-ranking dataset and provide potential reasons for the successes and failures of BERT for retrieval. In more detail, we empirically study a set of hypotheses and provide additional analysis to explain the successful performance of BERT.
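For readers unfamiliar with the setup, BERT-based passage re-ranking scores each (query, passage) pair with a cross-encoder and sorts passages by score. The sketch below uses a publicly available MS MARCO cross-encoder checkpoint as a stand-in; it is not the paper's fine-tuned model.

```python
# Sketch of cross-encoder passage re-ranking on MS MARCO-style data.
# The checkpoint is a public stand-in, not the paper's model.
from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
query = "what causes tides"
passages = [
    "Tides are caused by the gravitational pull of the moon and sun.",
    "A tide pool is a rocky pool on a seashore.",
]
scores = model.predict([(query, p) for p in passages])
for p, s in sorted(zip(passages, scores), key=lambda x: -x[1]):
    print(f"{s:.3f}  {p}")  # higher score = more relevant passage
```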
In many applications, it is important to characterize the way in which two concepts are semantically related. Knowledge graphs such as ConceptNet provide a rich source of information for such characterizations by encoding relations between concepts as edges in a graph. When two concepts are not directly connected by an edge, their relationship can still be described in terms of the paths that connect them. Unfortunately, many of these paths are uninformative and noisy, which means that the success of applications that use such path features crucially relies on their ability to select high-quality paths. In existing applications, this path selection process is based on relatively simple heuristics. In this paper we instead propose to learn to predict path quality from crowdsourced human assessments. Since we are interested in a generic, task-independent notion of quality, we simply ask human participants to rank paths according to their subjective assessment of the paths' naturalness, without attempting to define naturalness or steering the participants towards particular indicators of quality. We show that a neural network model trained on these assessments is able to predict human judgments on unseen paths with near-optimal performance. Most notably, we find that the resulting path selection method is substantially better than the current heuristic approaches at identifying meaningful paths.
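One standard way to train a scorer from pairwise human rankings of this kind is a margin ranking loss, sketched below in PyTorch; the feature dimension, architecture, and random data are assumptions, not the paper's model.

```python
# Minimal sketch of learning path quality from pairwise human rankings:
# a scorer is trained so that the path judged more natural scores higher.
# Feature dimension, architecture, and data are illustrative assumptions.
import torch
import torch.nn as nn

scorer = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MarginRankingLoss(margin=0.5)
opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)

# Hypothetical feature vectors for a preferred and a dispreferred path.
better = torch.randn(8, 64)
worse = torch.randn(8, 64)
target = torch.ones(8)  # +1: the first argument should score higher

loss = loss_fn(scorer(better).squeeze(-1), scorer(worse).squeeze(-1), target)
opt.zero_grad()
loss.backward()
opt.step()
print(loss.item())
```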
We study the use of the Wave-U-Net architecture for speech enhancement, a model introduced by Stoller et al. for the separation of music vocals and accompaniment. This end-to-end learning method for audio source separation operates directly in the time domain, permitting the integrated modelling of phase information and the ability to take large temporal contexts into account. Our experiments show that the proposed method improves several metrics, namely PESQ, CSIG, CBAK, COVL, and SSNR, over the state of the art on the speech enhancement task on the Voice Bank (VCTK) corpus. We find that a reduced number of hidden layers is sufficient for speech enhancement in comparison to the original system designed for singing voice separation in music. We see this initial result as an encouraging signal to further explore speech enhancement in the time domain, both as an end in itself and as a pre-processing step for speech recognition systems.
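Of the reported metrics, segmental SNR is simple enough to sketch directly; the frame length and clamping bounds below follow common practice and are assumptions, while PESQ, CSIG, CBAK, and COVL require dedicated tooling.

```python
# Illustrative segmental SNR (SSNR) computation, one of the reported metrics.
# Frame length and clamping bounds follow common practice and are assumptions.
import numpy as np

def ssnr(clean: np.ndarray, enhanced: np.ndarray, frame: int = 256) -> float:
    snrs = []
    for i in range(0, len(clean) - frame + 1, frame):
        c, e = clean[i:i + frame], enhanced[i:i + frame]
        noise = c - e
        snr = 10 * np.log10(np.sum(c ** 2) / (np.sum(noise ** 2) + 1e-10))
        snrs.append(np.clip(snr, -10, 35))  # clamp per-frame SNR to [-10, 35] dB
    return float(np.mean(snrs))

clean = np.random.randn(16000)                  # one second at 16 kHz (synthetic)
enhanced = clean + 0.05 * np.random.randn(16000)
print(ssnr(clean, enhanced))
```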
Can an algorithm create original and compelling fashion designs to serve as an inspirational assistant? To help answer this question, we design and investigate different image generation models associated with different loss functions to boost creativity in fashion generation. The dimensions of our exploration include: (i) different Generative Adversarial Network architectures that start from noise vectors to generate fashion items, (ii) a new loss function that encourages creativity, and (iii) a generation process following the key elements of fashion design (disentangling shape and texture makers). A key challenge of this study is the evaluation of generated designs and the retrieval of the best ones, hence we put together an evaluation protocol combining automatic metrics and human experimental studies that we hope will help ease future research. We show that our proposed creativity loss yields better overall appreciation than the one employed in Creative Adversarial Networks. In the end, about 61% of our images are thought to be created by human designers rather than by a computer while also being considered original, according to our human subject experiments, and our proposed loss scores highest among existing losses in both novelty and likability.
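For comparison, the Creative Adversarial Networks baseline mentioned above encourages the generator to confuse a style classifier; a minimal PyTorch sketch of that classic creativity term is shown below. The paper's own proposed loss differs and is not reproduced here.

```python
# Sketch of the CAN-style creativity term used as the baseline comparison:
# cross-entropy between the uniform distribution and the style classifier's
# prediction, minimized when the classifier is maximally uncertain.
import torch
import torch.nn.functional as F

def can_creativity_loss(style_logits: torch.Tensor) -> torch.Tensor:
    """Cross-entropy against a uniform distribution over style classes.

    Pushes generated items away from all existing style classes.
    """
    log_probs = F.log_softmax(style_logits, dim=-1)
    return -log_probs.mean(dim=-1).mean()

logits = torch.randn(4, 10)  # hypothetical style-classifier outputs (batch, classes)
print(can_creativity_loss(logits).item())
```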