Digitalization in seaports dovetails the IT infrastructure of various actors (e.g., shipping companies, terminals, customs, port authorities) to process complex workflows for shipping containers. The security of these workflows relies not only on the security of each individual actor but actors must also provide additional guarantees to other actors like, for instance, respecting obligations related to received data or checking the integrity of workflows observed so far. This paper analyses global security requirements (e.g., accountability, confidentiality) of the workflows and decomposes them - according to the way workflow data is stored and distributed - into requirements and obligations for the individual actors. Security mechanisms are presented to satisfy the resulting requirements, which together with the guarantees of all individual actors will guarantee the security of the overall workflow.
We consider the problem of finding a compromise between the opinions of a group of individuals on a number of mutually independent, binary topics. In this paper, we quantify the loss in representativeness that results from requiring the outcome to have majority support, in other words, the "price of majority support". Each individual is assumed to support an outcome if they agree with the outcome on at least as many topics as they disagree on. Our results can also be seen as quantifying Anscombes paradox which states that topic-wise majority outcome may not be supported by a majority. To measure the representativeness of an outcome, we consider two metrics. First, we look for an outcome that agrees with a majority on as many topics as possible. We prove that the maximum number such that there is guaranteed to exist an outcome that agrees with a majority on this number of topics and has majority support, equals $\ceil{(t+1)/2}$ where $t$ is the total number of topics. Second, we count the number of times a voter opinion on a topic matches the outcome on that topic. The goal is to find the outcome with majority support with the largest number of matches. We consider the ratio between this number and the number of matches of the overall best outcome which may not have majority support. We try to find the maximum ratio such that an outcome with majority support and this ratio of matches compared to the overall best is guaranteed to exist. For 3 topics, we show this ratio to be $5/6\approx 0.83$. In general, we prove an upper bound that comes arbitrarily close to $2\sqrt{6}-4\approx 0.90$ as $t$ tends to infinity. Furthermore, we numerically compute a better upper and a non-matching lower bound in the relevant range for $t$.
One of the main features of interest in analysing the light curves of stars is the underlying periodic behaviour. The corresponding observations are a complex type of time series with unequally spaced time points and are sometimes accompanied by varying measures of accuracy. The main tools for analysing these type of data rely on the periodogram-like functions, constructed with a desired feature so that the peaks indicate the presence of a potential period. In this paper, we explore a particular periodogram for the irregularly observed time series data, similar to Thieler et. al. (2013). We identify the potential periods at the appropriate peaks and more importantly with a quantifiable uncertainty. Our approach is shown to easily generalise to non-parametric methods including a weighted Gaussian process regression periodogram. We also extend this approach to correlated background noise. The proposed method for period detection relies on a test based on quadratic forms with normally distributed components. We implement the saddlepoint approximation, as a faster and more accurate alternative to the simulation-based methods that are currently used. The power analysis of the testing methodology is reported together with applications using light curves from the Hunting Outbursting Young Stars citizen science project.
Serverless Computing is a virtualisation-related paradigm that promises to simplify application management and to solve the last challenges in the field: scale down and easy to use. The implied cost reduction, coupled with a simplified management of underlying applications, are expected to further push the adoption of virtualisation-based solutions, including cloud-computing or telco-cloud solutions. However, in this quest for efficiency, security is not ranked among the top priorities, also because of the (misleading) belief that current solutions developed for virtualised environments could be applied (as is) to this new paradigm. Unfortunately, this is not the case, due to the highlighted idiosyncratic features of serverless computing. In this paper, we review the current serverless architectures, abstract and categorise their founding principles, and provide an in depth analyse of them from the point of view of security, referring to principles and practices of the cybersecurity domain. In particular, we show the security shortcomings of the analysed serverless architectural paradigms, point to possible countermeasures, and highlight a few research directions.
Federated learning (FL) is a distributed machine learning (ML) technique that enables collaborative training in which devices perform learning using a local dataset while preserving their privacy. This technique ensures privacy, communication efficiency, and resource conservation. Despite these advantages, FL still suffers from several challenges related to reliability (i.e., unreliable participating devices in training), tractability (i.e., a large number of trained models), and anonymity. To address these issues, we propose a secure and trustworthy blockchain framework (SRB-FL) tailored to FL, which uses blockchain features to enable collaborative model training in a fully distributed and trustworthy manner. In particular, we design a secure FL based on the blockchain sharding that ensures data reliability, scalability, and trustworthiness. In addition, we introduce an incentive mechanism to improve the reliability of FL devices using subjective multi-weight logic. The results show that our proposed SRB-FL framework is efficient and scalable, making it a promising and suitable solution for federated learning.
Wireless energy sharing is a novel convenient alternative to charge IoT devices. In this demo paper, we present a peer-to-peer wireless energy sharing platform. The platform enables users to exchange energy wirelessly with nearby IoT devices. The energy sharing platform allows IoT users to send and receive energy wirelessly. The platform consists of (i) a mobile application that monitors and synchronizes the energy transfer among two IoT devices and (ii) and a backend to register energy providers and consumers and store their energy transfer transactions. The eveloped framework allows the collection of a real wireless energy sharing dataset. A set of preliminary experiments has been conducted on the collected dataset to analyze and demonstrate the behavior of the current wireless energy sharing technology.
Utilizing Visualization-oriented Natural Language Interfaces (V-NLI) as a complementary input modality to direct manipulation for visual analytics can provide an engaging user experience. It enables users to focus on their tasks rather than worrying about operating the interface to visualization tools. In the past two decades, leveraging advanced natural language processing technologies, numerous V-NLI systems have been developed both within academic research and commercial software, especially in recent years. In this article, we conduct a comprehensive review of the existing V-NLIs. In order to classify each paper, we develop categorical dimensions based on a classic information visualization pipeline with the extension of a V-NLI layer. The following seven stages are used: query understanding, data transformation, visual mapping, view transformation, human interaction, context management, and presentation. Finally, we also shed light on several promising directions for future work in the community.
We study the problem of learning in the stochastic shortest path (SSP) setting, where an agent seeks to minimize the expected cost accumulated before reaching a goal state. We design a novel model-based algorithm EB-SSP that carefully skews the empirical transitions and perturbs the empirical costs with an exploration bonus to guarantee both optimism and convergence of the associated value iteration scheme. We prove that EB-SSP achieves the minimax regret rate $\widetilde{O}(B_{\star} \sqrt{S A K})$, where $K$ is the number of episodes, $S$ is the number of states, $A$ is the number of actions and $B_{\star}$ bounds the expected cumulative cost of the optimal policy from any state, thus closing the gap with the lower bound. Interestingly, EB-SSP obtains this result while being parameter-free, i.e., it does not require any prior knowledge of $B_{\star}$, nor of $T_{\star}$ which bounds the expected time-to-goal of the optimal policy from any state. Furthermore, we illustrate various cases (e.g., positive costs, or general costs when an order-accurate estimate of $T_{\star}$ is available) where the regret only contains a logarithmic dependence on $T_{\star}$, thus yielding the first horizon-free regret bound beyond the finite-horizon MDP setting.
As data are increasingly being stored in different silos and societies becoming more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continue to thrive in this new reality. Existing FL protocol design has been shown to be vulnerable to adversaries within or outside of the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this paper, we conduct the first comprehensive survey on this topic. Through a concise introduction to the concept of FL, and a unique taxonomy covering: 1) threat models; 2) poisoning attacks and defenses against robustness; 3) inference attacks and defenses against privacy, we provide an accessible review of this important topic. We highlight the intuitions, key techniques as well as fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions towards robust and privacy-preserving federated learning.
In recent years with the rise of Cloud Computing (CC), many companies providing services in the cloud, are empowered a new series of services to their catalog, such as data mining (DM) and data processing, taking advantage of the vast computing resources available to them. Different service definition proposals have been proposed to address the problem of describing services in CC in a comprehensive way. Bearing in mind that each provider has its own definition of the logic of its services, and specifically of DM services, it should be pointed out that the possibility of describing services in a flexible way between providers is fundamental in order to maintain the usability and portability of this type of CC services. The use of semantic technologies based on the proposal offered by Linked Data (LD) for the definition of services, allows the design and modelling of DM services, achieving a high degree of interoperability. In this article a schema for the definition of DM services on CC is presented, in addition are considered all key aspects of service in CC, such as prices, interfaces, Software Level Agreement, instances or workflow of experimentation, among others. The proposal presented is based on LD, so that it reuses other schemata obtaining a best definition of the service. For the validation of the schema, a series of DM services have been created where some of the best known algorithms such as \textit{Random Forest} or \textit{KMeans} are modeled as services.
We consider the task of learning the parameters of a {\em single} component of a mixture model, for the case when we are given {\em side information} about that component, we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than solving the overall original problem, where one learns parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each one of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy, and also improved computation complexity than existing moment based mixture model algorithms (e.g. tensor methods). We also illustrate several natural ways one can obtain such side information, for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms showing significant improvement in runtime and accuracy.