
This paper introduces H-STREAM, a big stream/data processing pipeline evaluation engine that proposes stream processing operators as micro-services to support the analysis and visualisation of Big Data streams stemming from IoT (Internet of Things) environments. H-STREAM micro-services combine stream processing and data storage techniques tuned depending on the number of things producing streams, the pace at which they produce them, and the physical computing resources available for processing them online and delivering them to consumers. H-STREAM delivers stream processing and visualisation micro-services installed in a cloud environment. Micro-services can be composed into pipelines that implement specific stream aggregation analyses as queries. The paper presents an experimental validation using Microsoft Azure as a deployment environment, testing the capacity of H-STREAM to deal with velocity and volume challenges in (i) a neuroscience experiment and (ii) a social connectivity analysis scenario running on IoT farms.
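
To make the composition idea concrete, here is a minimal Python sketch of the kind of pipeline the abstract describes: an IoT source feeding a windowing micro-service and an aggregation micro-service, composed as a query. The names and the generator-based encoding are our own illustration, not H-STREAM's actual API.

```python
# Hypothetical sketch (not H-STREAM's actual API): composing windowed
# stream-processing micro-services into an aggregation pipeline ("query").
import random

def sensor_stream(n_readings):
    """Simulate one IoT 'thing' emitting temperature readings."""
    for i in range(n_readings):
        yield {"ts": i, "value": 20.0 + random.uniform(-2.0, 2.0)}

def tumbling_window(stream, size):
    """Micro-service 1: group readings into fixed-size (tumbling) windows."""
    window = []
    for reading in stream:
        window.append(reading)
        if len(window) == size:
            yield window
            window = []

def average(windows):
    """Micro-service 2: reduce each window to a single average value."""
    for w in windows:
        yield {"ts": w[-1]["ts"], "avg": sum(r["value"] for r in w) / len(w)}

# The composed pipeline: source -> window -> aggregate -> consumer.
for result in average(tumbling_window(sensor_stream(30), size=10)):
    print(result)
```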

Related Content

The Second Diagnosis of COVID-19 using Acoustics (DiCOVA) Challenge aimed at accelerating research in acoustics-based detection of COVID-19, a topic at the intersection of acoustics, signal processing, machine learning, and healthcare. This paper presents the details of the challenge, which was an open call for researchers to analyze a dataset of audio recordings consisting of breathing, cough, and speech signals. This data was collected from individuals with and without COVID-19 infection, and the task in the challenge was two-class classification. The development set audio recordings were collected from 965 individuals (172 COVID-19 positive), while the evaluation set contained data from 471 individuals (71 COVID-19 positive). The challenge featured four tracks, one associated with each sound category of cough, speech, and breathing, and a fourth fusion track. A baseline system was also released to benchmark the participants. In this paper, we present an overview of the challenge, the rationale for the data collection, and the baseline system. Further, we present a performance analysis of the systems submitted by the 16 teams on the leaderboard.
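
For readers unfamiliar with the task, the following is an illustrative sketch of a minimal acoustics-based two-class classifier (mean MFCC features plus logistic regression). It is not the official DiCOVA baseline, and it runs on synthetic noise to stay self-contained; real systems would train on the challenge recordings.

```python
# Illustrative sketch only (not the official DiCOVA baseline): a minimal
# two-class audio classifier using mean MFCC features and logistic regression.
import numpy as np
import librosa
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

def mfcc_features(signal, sr=16000, n_mfcc=13):
    """Summarise a recording as the per-coefficient mean of its MFCCs."""
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# Synthetic stand-ins for breathing/cough/speech recordings (random noise
# with a class-dependent spectral tilt); real data comes from the challenge.
rng = np.random.default_rng(0)
X, y = [], []
for label in (0, 1):
    for _ in range(50):
        noise = rng.normal(size=16000).astype(np.float32)
        if label == 1:
            noise = np.cumsum(noise)       # crude low-frequency emphasis
            noise /= np.abs(noise).max()
        X.append(mfcc_features(noise))
        y.append(label)

X_tr, X_te, y_tr, y_te = train_test_split(np.array(X), np.array(y), random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```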

The identification of code smells is widely recognized as a subjective task. Consequently, the available automated detection tools cannot cope with all the subjectivity involved and require human validation. However, developers may follow different but complementary perspectives when manually validating the same code smell. Based on this scenario, our research aims at characterizing a comprehensive and optimized set of heuristics for guiding developers in validating the incidence of code smells reported by automated detection tools. For this purpose, we conducted an empirical study with 12 experienced software developers. In this study, we invited developers to individually validate the incidence of code smells in 24 code snippets from open-source Java projects. For each validation, developers were asked to provide arguments supporting their decisions. The study findings revealed that developers tend to look from different perspectives even when they agree about the incidence of a code smell. After coding the 303 given arguments into heuristics and refining them, we composed an optimized set of validation items for guiding developers in manually validating the incidence of eight types of code smells: data class, god class, speculative generality, middle man, refused bequest, primitive obsession, long parameter list, and feature envy. We are currently planning a survey with specialists to identify opportunities for evolving the proposed set of validation items.
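
As a hypothetical illustration of the kind of automated detector whose reports the proposed validation items would help developers double-check, here is a crude Python check for one of the eight smells, long parameter list; the threshold is an arbitrary assumption.

```python
# Hypothetical "long parameter list" detector; the threshold of 5 is an
# arbitrary assumption, which is exactly why human validation is needed.
import ast

SOURCE = """
def configure(host, port, user, password, timeout, retries, verbose):
    pass

def ping(host):
    pass
"""

THRESHOLD = 5  # flag functions with more than 5 parameters

tree = ast.parse(SOURCE)
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        n_params = len(node.args.args)
        if n_params > THRESHOLD:
            print(f"possible long parameter list: {node.name} ({n_params} params)")
```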

We devise the first coreset for kernel $k$-Means, and use it to obtain new, more efficient algorithms. Kernel $k$-Means has superior clustering capability compared to classical $k$-Means, particularly when clusters are non-linearly separable, but it also introduces significant computational challenges. We address this computational issue by constructing a coreset, which is a reduced dataset that accurately preserves the clustering costs. Our main result is the first coreset for kernel $k$-Means whose size is independent of the number of input points $n$ and which, moreover, is constructed in time near-linear in $n$. This result immediately implies new algorithms for kernel $k$-Means, such as a $(1+\epsilon)$-approximation in time near-linear in $n$ and a streaming algorithm using space and update time $\mathrm{poly}(k \epsilon^{-1} \log n)$. We validate our coreset on various datasets with different kernels. Our coreset performs consistently well, achieving small errors while using very few points. We show that our coresets can speed up kernel $k$-Means++ (the kernelized version of the widely used $k$-Means++ algorithm), and we further use this faster kernel $k$-Means++ for spectral clustering. In both applications, we achieve up to 1000x speedup while the error is comparable to baselines that do not use coresets.
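
The construction in the paper is more sophisticated, but the way a coreset is consumed can be sketched in a few lines: cluster a small weighted subset instead of the full dataset and compare the resulting cost. The uniform sampling below is a naive stand-in for the actual coreset construction, shown for the classical (non-kernel) case with scikit-learn.

```python
# Sketch of how a weighted coreset is consumed; uniform sampling here is a
# naive stand-in for the paper's actual coreset construction.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=20000, centers=5, random_state=0)

# "Coreset": m points sampled uniformly, each weighted n/m so the total
# mass matches the full dataset.
n, m = len(X), 500
rng = np.random.default_rng(0)
idx = rng.choice(n, size=m, replace=False)
coreset, weights = X[idx], np.full(m, n / m)

# Cluster the small weighted set instead of all n points.
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(coreset, sample_weight=weights)

# Evaluate the centers on the full data: the cost should be close to
# clustering X directly.
full_cost = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X).inertia_
coreset_cost = ((X - km.cluster_centers_[km.predict(X)]) ** 2).sum()
print(f"cost with coreset centers: {coreset_cost:.1f}  vs  full k-means: {full_cost:.1f}")
```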

This paper introduces, for the first time to our knowledge, physics-informed neural networks that accurately estimate the AC-OPF result and deliver rigorous guarantees about their performance. Power system operators, along with several other actors, are increasingly using Optimal Power Flow (OPF) algorithms for a wide range of applications, including planning and real-time operations. However, in its original form, the AC Optimal Power Flow problem is often challenging to solve, as it is non-linear and non-convex. Besides the large number of approximations and relaxations, recent efforts have also focused on Machine Learning approaches, especially neural networks. So far, however, these approaches have only partially considered the wealth of physical models available during training. And, more importantly, they have offered no guarantees about potential constraint violations of their output. Our approach (i) introduces the AC power flow equations inside neural network training and (ii) integrates methods that rigorously determine and reduce the worst-case constraint violations across the entire input domain, while maintaining the optimality of the prediction. We demonstrate how physics-informed neural networks achieve higher accuracy and lower constraint violations than standard neural networks, and show how we can further reduce the worst-case violations for all neural networks.
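
A toy sketch of the physics-informed training idea follows: the loss combines a supervised term with a penalty on the violation of a physical constraint. Here a linear power-balance equation stands in for the full AC power flow equations, an assumption made purely to keep the example self-contained.

```python
# Minimal sketch of the physics-informed idea, with a toy linear "power
# balance" (g1 + g2 = d) standing in for the full AC power flow equations.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy setting: predict generator dispatch g (2 units) from load d.
d = torch.rand(256, 1) * 10.0
g_opt = torch.cat([0.3 * d, 0.7 * d], dim=1)  # pretend "optimal" labels

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
lam = 10.0  # weight of the physics penalty

for step in range(2000):
    g = net(d)
    supervised = ((g - g_opt) ** 2).mean()                      # match known solutions
    residual = ((g.sum(dim=1, keepdim=True) - d) ** 2).mean()   # balance violation
    loss = supervised + lam * residual
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final balance violation: {residual.item():.2e}")
```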

In this position paper, we would like to offer and defend a new template to study equivalences between programs, in the particular framework of process algebras for concurrent computation. We believe that our layered model of development will clarify the distinction, too often left implicit, between the tasks and duties of the programmer and of the tester. It will also shed light on pre-existing issues that have been running across process algebras as diverse as the calculus of communicating systems, the $\pi$-calculus (also in its distributed version), and mobile ambients. Our distinction starts by subdividing the notion of process itself into three conceptually separated entities, which we call Processes, Systems, and Tests. While the role of what can be observed and the subtleties in the definitions of congruences have been intensively studied, the facts that not every process can be tested, and that the tester should have access to a different set of tools than the programmer, are curiously left out, or at least not often formally discussed. We argue that this blind spot comes from the under-specification of contexts, the environments in which comparisons take place, which play multiple distinct roles but supposedly always "stay the same". We illustrate our statement with a simple Java example and the "usual" concurrent languages, but also back it up with the $\lambda$-calculus and existing implementations of concurrent languages.
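
As an informal Python analogue of the point (the paper's own example is in Java, and this encoding is entirely ours), two processes can be equivalent for a tester restricted to running them, yet distinguishable by a tester with richer tools, here a stopwatch:

```python
# Informal, hypothetical encoding of the Process / Test distinction: whether
# two processes are "equivalent" depends on the tester's set of tools.
import time

def p1():  # Process 1
    return 42

def p2():  # Process 2: same answer, different internal behaviour
    time.sleep(0.05)
    return 42

def black_box_test(proc):
    """A tester allowed only to run the process and observe its result."""
    return proc()

def timing_test(proc):
    """A tester with a stopwatch: a strictly richer set of tools."""
    t0 = time.perf_counter()
    proc()
    return time.perf_counter() - t0 > 0.01

print(black_box_test(p1) == black_box_test(p2))  # True: equivalent here
print(timing_test(p1) == timing_test(p2))        # False: distinguished here
```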

Data integration of heterogeneous data sources relies either on periodically transferring large amounts of data to a physical Data Warehouse or on retrieving data from the sources on request only. The latter results in the creation of what is referred to as a virtual Data Warehouse, which is preferable when the use of the latest data is paramount. However, the downside is that it adds network traffic and suffers from performance degradation when the amount of data is high. In this paper, we propose the use of a readCheck validator to ensure the timeliness of the queried data and to reduce data traffic. It is further shown that the readCheck allows transactions to update data in the data sources while obeying full Atomicity, Consistency, Isolation, and Durability (ACID) properties.
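
A hedged sketch of how such a readCheck validator might work follows; the class and method names are hypothetical. Cached query results are reused after a cheap validation round trip, and a full transfer happens only when the source reports the cache stale.

```python
# Hypothetical sketch of the readCheck idea as described in the abstract:
# validate cached data cheaply, refetch in bulk only when it is stale.
class Source:
    def __init__(self):
        self._version = 0
        self._rows = [("alice", 1), ("bob", 2)]

    def read_check(self, version):
        """Cheap validator call: is the cached version still current?"""
        return version == self._version

    def fetch(self):
        """Expensive full read: returns data plus its version."""
        return list(self._rows), self._version

    def update(self, rows):
        self._rows = rows
        self._version += 1

class VirtualWarehouse:
    def __init__(self, source):
        self.source = source
        self.cache, self.version = None, None

    def query(self):
        if self.cache is not None and self.source.read_check(self.version):
            return self.cache  # timely data, no bulk transfer
        self.cache, self.version = self.source.fetch()
        return self.cache

src = Source()
vw = VirtualWarehouse(src)
print(vw.query())            # full fetch
print(vw.query())            # served from cache after a cheap readCheck
src.update([("carol", 3)])
print(vw.query())            # stale -> refetch
```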

Stream processing has been an active research field for more than 20 years, but it is now witnessing its prime time due to recent successful efforts by the research community and numerous worldwide open-source communities. This survey provides a comprehensive overview of fundamental aspects of stream processing systems and their evolution in the functional areas of out-of-order data management, state management, fault tolerance, high availability, load management, elasticity, and reconfiguration. We review noteworthy past research findings, outline the similarities and differences between early ('00-'10) and modern ('11-'18) streaming systems, and discuss recent trends and open problems.
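
To give a flavour of one of the surveyed functional areas, out-of-order data management, here is a small illustration of our own (not taken from the survey) of event-time windows that are emitted only once a watermark passes them:

```python
# Event-time tumbling windows with a watermark: late events are still
# assigned to their window, which closes only when the watermark passes it.
from collections import defaultdict

WINDOW = 10    # event-time window length
LATENESS = 5   # watermark lags the max event time seen by this much

events = [(1, "a"), (12, "b"), (3, "c"), (15, "d"), (28, "e")]  # (event_time, payload)

windows = defaultdict(list)
max_ts = 0
for ts, payload in events:
    windows[ts // WINDOW].append(payload)
    max_ts = max(max_ts, ts)
    watermark = max_ts - LATENESS
    # Emit every window whose end lies at or before the watermark.
    for w in sorted(list(windows)):
        if (w + 1) * WINDOW <= watermark:
            print(f"window [{w * WINDOW},{(w + 1) * WINDOW}) -> {windows.pop(w)}")
```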

Nowadays, recommender systems are present in many daily activities such as online shopping and browsing social networks. Given the rising demand for reinvigoration of the tourism industry through information technology, recommenders have been incorporated into tourism websites such as Expedia, Booking, and Tripadvisor, among others. Furthermore, the number of scientific papers related to recommender systems for tourism has grown solidly and continuously since 2004. Much of this growth is due to social networks which, besides offering researchers a great mass of available and constantly updated data, also enable recommender systems to become more personalised, effective, and natural. This paper reviews and analyses research publications focusing on tourism recommender systems that use social networks. We detail their main characteristics, such as which social networks are exploited, which data is extracted, the applied recommendation techniques, and the methods of evaluation. Through a comprehensive literature review, we aim to contribute to future recommender systems by giving clear classifications and descriptions of the current tourism recommender systems.

In recent years, with the rise of Cloud Computing (CC), many companies providing services in the cloud have added a new series of services to their catalogs, such as data mining (DM) and data processing, taking advantage of the vast computing resources available to them. Several service definition proposals have been put forward to address the problem of describing services in CC in a comprehensive way. Bearing in mind that each provider has its own definition of the logic of its services, and specifically of DM services, the possibility of describing services in a flexible way across providers is fundamental to maintaining the usability and portability of this type of CC service. The use of semantic technologies based on the Linked Data (LD) proposal for the definition of services allows the design and modelling of DM services, achieving a high degree of interoperability. In this article, a schema for the definition of DM services in CC is presented, considering all key aspects of a CC service, such as prices, interfaces, Service Level Agreements, instances, and experimentation workflows, among others. The proposed schema is based on LD, so it reuses other schemata to obtain a better definition of the service. To validate the schema, a series of DM services have been created in which some of the best-known algorithms, such as Random Forest and KMeans, are modeled as services.
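
The following sketch shows how such a DM service description could be assembled as Linked Data with the rdflib Python library. The vocabulary (ex:DMService, ex:pricePerHour, and so on) is invented for illustration and is not the schema proposed in the article.

```python
# Hedged sketch: describing a KMeans DM service as Linked Data with rdflib.
# The ex: vocabulary is invented for illustration, not the article's schema.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/dm-services#")

g = Graph()
g.bind("ex", EX)

svc = EX["kmeans-service"]
g.add((svc, RDF.type, EX.DMService))
g.add((svc, EX.algorithm, Literal("KMeans")))
g.add((svc, EX.interface, Literal("REST")))
g.add((svc, EX.pricePerHour, Literal(0.12)))
g.add((svc, EX.sla, Literal("99.9% availability")))

print(g.serialize(format="turtle"))
```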

Internet of Things (IoT) infrastructure within the physical library environment is the basis for an integrative, hybrid approach to digital resource recommenders. The IoT infrastructure provides mobile, dynamic wayfinding support for items in the collection, which includes features for location-based recommendations. The evaluation and analysis herein clarified the nature of users' requests for recommendations based on their location, and described the subject areas of the library for which users request recommendations. The results indicated that users of IoT-based recommendations are interested in a broad distribution of subjects, with the short head of the distribution in this collection falling in American and English Literature. A long-tail finding showed a diversity of topics recommended to users in the library book stacks through IoT-powered recommendations.
