Event-driven architecture has been widely adopted in the software industry, emerging as an alternative to the development of enterprise applications based on the REST architectural style. However, little is known about the effects of event-driven architecture on modularity while enterprise applications evolve. Consequently, practitioners end up adopting it without any empirical evidence about its impacts on essential indicators, including separation of concerns, coupling, cohesion, complexity and size. This article, therefore, reports an exploratory study comparing event-driven architecture and REST style in terms of modularity. A real-world application was developed using an event-driven architecture and REST through five evolution scenarios. In each scenario, a feature was added. The generated versions were compared using ten metrics. The initial results suggest that the event-driven architecture improved the separation of concerns, but was outperformed considering the metrics of coupling, cohesion, complexity and size. The findings are encouraging and can be seen as a first step in a more ambitious agenda to empirically evaluate the benefits of event-driven architecture against the REST style.
We consider large-scale Markov decision processes with an unknown cost function and address the problem of learning a policy from a finite set of expert demonstrations. We assume that the learner is not allowed to interact with the expert and has no access to reinforcement signal of any kind. Existing inverse reinforcement learning methods come with strong theoretical guarantees, but are computationally expensive, while state-of-the-art policy optimization algorithms achieve significant empirical success, but are hampered by limited theoretical understanding. To bridge the gap between theory and practice, we introduce a novel bilinear saddle-point framework using Lagrangian duality. The proposed primal-dual viewpoint allows us to develop a model-free provably efficient algorithm through the lens of stochastic convex optimization. The method enjoys the advantages of simplicity of implementation, low memory requirements, and computational and sample complexities independent of the number of states. We further present an equivalent no-regret online-learning interpretation.
With growing deployment of Internet of Things (IoT) and machine learning (ML) applications, which need to leverage computation on edge and cloud resources, it is important to develop algorithms and tools to place these distributed computations to optimize their performance. We address the problem of optimally placing computations (described as directed acyclic graphs (DAGs)) on a set of machines to maximize the steady-state throughput for pipelined inputs. Traditionally, such optimization has focused on a different metric, minimizing single-shot makespan, and a well-known algorithm is the Heterogeneous Earliest Finish Time (HEFT) algorithm. Maximizing throughput however, is more suitable for many real-time, edge, cloud and IoT applications, we present a different scheduling algorithm, namely Throughput HEFT (TPHEFT). Further, we present two throughput-oriented enhancements which can be applied to any baseline schedule, that we refer to as "node splitting" (SPLIT) and "task duplication" (DUP). In order to implement and evaluate these algorithms, we built new subsystems and plugins for an open-source dispersed computing framework called Jupiter. Experiments with varying DAG structures indicate that: 1) TPHEFT can significantly improve throughput performance compared to HEFT (up to 2.3 times in our experiments), with greater gains when there is less degree of parallelism in the DAG, 2) Node splitting can potentially improve performance over a baseline schedule, with greater gains when there's an imbalanced allocation of computation or inter-task communication, and 3) Task duplication generally gives improvements only when running upon a baseline that places communication over slow links. To our knowledge, this is the first study to present a systematic experimental implementation and exploration of throughput-enhancing techniques for dispersed computing on real testbeds.
To ensure an efficient and environmentally friendly water resource management, water management associations need means for efficient water monitoring as well as novel strategies to reduce the pollution of surface and ground water. Traditionally, water management associations operate large sensor networks to suffice their needs for hydrological and meteorological measurement data to monitor and model physical processes within catchments. Implementing a comprehensive monitoring system often suffers from sparse coverage of in-situ data. Due to the evolvement of the Copernicus satellite platforms, the broader availability of satellite data provides a great potential for deriving complementary information from Earth Observation data. Although the number of satellite data platforms that provide online processing environments is growing, it is still a big challenge to integrate those platforms into traditional workflows of users from environmental domains such as hydrology. Thus, in this paper, we introduce a software architecture to facilitate the generation of Earth Observation information targeted towards hydrology. The presented WaCoDiS System comprises several microservices as well standardized interfaces that enable a platform-independent processing of satellite data. First, we discuss the contribution of Earth Observation data to water monitoring and derive several challenges regarding the facilitation of satellite data processing. We then describe our system design with a brief overview about the different system components which form an automated processing pipeline. The suitability of our system is proven as part of a pre-operational deployment for a German water management association. In addition, we demonstrate how our system is capable of integrating satellite data platforms, using the Copernicus Data and Exploitation Platform - Deutschland (CODE-DE) as a reference example.
In real-world applications, data often come in a growing manner, where the data volume and the number of classes may increase dynamically. This will bring a critical challenge for learning: given the increasing data volume or the number of classes, one has to instantaneously adjust the neural model capacity to obtain promising performance. Existing methods either ignore the growing nature of data or seek to independently search an optimal architecture for a given dataset, and thus are incapable of promptly adjusting the architectures for the changed data. To address this, we present a neural architecture adaptation method, namely Adaptation eXpert (AdaXpert), to efficiently adjust previous architectures on the growing data. Specifically, we introduce an architecture adjuster to generate a suitable architecture for each data snapshot, based on the previous architecture and the different extent between current and previous data distributions. Furthermore, we propose an adaptation condition to determine the necessity of adjustment, thereby avoiding unnecessary and time-consuming adjustments. Extensive experiments on two growth scenarios (increasing data volume and number of classes) demonstrate the effectiveness of the proposed method.
The Transformer is widely used in natural language processing tasks. To train a Transformer however, one usually needs a carefully designed learning rate warm-up stage, which is shown to be crucial to the final performance but will slow down the optimization and bring more hyper-parameter tunings. In this paper, we first study theoretically why the learning rate warm-up stage is essential and show that the location of layer normalization matters. Specifically, we prove with mean field theory that at initialization, for the original-designed Post-LN Transformer, which places the layer normalization between the residual blocks, the expected gradients of the parameters near the output layer are large. Therefore, using a large learning rate on those gradients makes the training unstable. The warm-up stage is practically helpful for avoiding this problem. On the other hand, our theory also shows that if the layer normalization is put inside the residual blocks (recently proposed as Pre-LN Transformer), the gradients are well-behaved at initialization. This motivates us to remove the warm-up stage for the training of Pre-LN Transformers. We show in our experiments that Pre-LN Transformers without the warm-up stage can reach comparable results with baselines while requiring significantly less training time and hyper-parameter tuning on a wide range of applications.
Time Series Classification (TSC) is an important and challenging problem in data mining. With the increase of time series data availability, hundreds of TSC algorithms have been proposed. Among these methods, only a few have considered Deep Neural Networks (DNNs) to perform this task. This is surprising as deep learning has seen very successful applications in the last years. DNNs have indeed revolutionized the field of computer vision especially with the advent of novel deeper architectures such as Residual and Convolutional Neural Networks. Apart from images, sequential data such as text and audio can also be processed with DNNs to reach state-of-the-art performance for document classification and speech recognition. In this article, we study the current state-of-the-art performance of deep learning algorithms for TSC by presenting an empirical study of the most recent DNN architectures for TSC. We give an overview of the most successful deep learning applications in various time series domains under a unified taxonomy of DNNs for TSC. We also provide an open source deep learning framework to the TSC community where we implemented each of the compared approaches and evaluated them on a univariate TSC benchmark (the UCR/UEA archive) and 12 multivariate time series datasets. By training 8,730 deep learning models on 97 time series datasets, we propose the most exhaustive study of DNNs for TSC to date.
In recent years Deep Learning has brought about a breakthrough in Medical Image Segmentation. U-Net is the most prominent deep network in this regard, which has been the most popular architecture in the medical imaging community. Despite outstanding overall performance in segmenting multimodal medical images, from extensive experimentations on challenging datasets, we found out that the classical U-Net architecture seems to be lacking in certain aspects. Therefore, we propose some modifications to improve upon the already state-of-the-art U-Net model. Hence, following the modifications we develop a novel architecture MultiResUNet as the potential successor to the successful U-Net architecture. We have compared our proposed architecture MultiResUNet with the classical U-Net on a vast repertoire of multimodal medical images. Albeit slight improvements in the cases of ideal images, a remarkable gain in performance has been attained for challenging images. We have evaluated our model on five different datasets, each with their own unique challenges, and have obtained a relative improvement in performance of 10.15%, 5.07%, 2.63%, 1.41%, and 0.62% respectively.
Deep neural network architectures have traditionally been designed and explored with human expertise in a long-lasting trial-and-error process. This process requires huge amount of time, expertise, and resources. To address this tedious problem, we propose a novel algorithm to optimally find hyperparameters of a deep network architecture automatically. We specifically focus on designing neural architectures for medical image segmentation task. Our proposed method is based on a policy gradient reinforcement learning for which the reward function is assigned a segmentation evaluation utility (i.e., dice index). We show the efficacy of the proposed method with its low computational cost in comparison with the state-of-the-art medical image segmentation networks. We also present a new architecture design, a densely connected encoder-decoder CNN, as a strong baseline architecture to apply the proposed hyperparameter search algorithm. We apply the proposed algorithm to each layer of the baseline architectures. As an application, we train the proposed system on cine cardiac MR images from Automated Cardiac Diagnosis Challenge (ACDC) MICCAI 2017. Starting from a baseline segmentation architecture, the resulting network architecture obtains the state-of-the-art results in accuracy without performing any trial-and-error based architecture design approaches or close supervision of the hyperparameters changes.
In this work, we propose a special cascade network for image segmentation, which is based on the U-Net networks as building blocks and the idea of the iterative refinement. The model was mainly applied to achieve higher recognition quality for the task of finding borders of the optic disc and cup, which are relevant to the presence of glaucoma. Compared to a single U-Net and the state-of-the-art methods for the investigated tasks, very high segmentation quality has been achieved without a need for increasing the volume of datasets. Our experiments include comparison with the best-known methods on publicly available databases DRIONS-DB, RIM-ONE v.3, DRISHTI-GS, and evaluation on a private data set collected in collaboration with University of California San Francisco Medical School. The analysis of the architecture details is presented, and it is argued that the model can be employed for a broad scope of image segmentation problems of similar nature.
A recent research trend has emerged to identify developers' emotions, by applying sentiment analysis to the content of communication traces left in collaborative development environments. Trying to overcome the limitations posed by using off-the-shelf sentiment analysis tools, researchers recently started to develop their own tools for the software engineering domain. In this paper, we report a benchmark study to assess the performance and reliability of three sentiment analysis tools specifically customized for software engineering. Furthermore, we offer a reflection on the open challenges, as they emerge from a qualitative analysis of misclassified texts.