The continuously advancing digitization has provided answers to the bureaucratic problems faced by eGovernance services. While this innovation has ushered these services into an era of automation, it has also broadened the attack surface and made them a popular target for cyber attacks. eGovernance services utilize the internet, which is currently a location-addressed system in which whoever controls a location controls not only the content itself, but also the integrity of and access to that content. We propose GLASS, a decentralised solution which combines the InterPlanetary File System (IPFS) with Distributed Ledger technology and Smart Contracts to secure eGovernance services. We also create a testbed environment in which we measure IPFS performance.
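To illustrate the content-addressing property GLASS builds on, here is a minimal Python sketch. IPFS actually derives a content identifier (CID) from a multihash of the data; plain SHA-256 stands in for that here, and the record bytes are hypothetical.

```python
import hashlib

def content_address(data: bytes) -> str:
    """Derive an address from the content itself (IPFS uses a
    multihash-encoded CID; plain SHA-256 is used here for brevity)."""
    return hashlib.sha256(data).hexdigest()

record = b"citizen-registry-entry-0042"  # hypothetical eGovernance record
cid = content_address(record)

# Anyone retrieving the content can verify its integrity by re-hashing:
assert content_address(record) == cid

# Tampering changes the address, so the original reference no longer
# resolves to the altered content:
assert content_address(b"tampered-entry") != cid
```

Unlike a location-addressed URL, the reference itself commits to the content, which is why controlling a storage location no longer implies controlling integrity.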
Society is characterized by the presence of a variety of social norms: collective patterns of sanctioning that can prevent miscoordination and free-riding. Inspired by this, we aim to construct learning dynamics where potentially beneficial social norms can emerge. Since social norms are underpinned by sanctioning, we introduce a training regime where agents can access all sanctioning events but learning is otherwise decentralized. This setting is technologically interesting because sanctioning events may be the only available public signal in decentralized multi-agent systems where reward or policy-sharing is infeasible or undesirable. To achieve collective action in this setting we construct an agent architecture containing a classifier module that categorizes observed behaviors as approved or disapproved, and a motivation to punish in accord with the group. We show that social norms emerge in multi-agent systems containing this agent and investigate the conditions under which this helps them achieve socially beneficial outcomes.
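As a rough illustration of this agent architecture, the sketch below pairs a classifier trained on public sanctioning events with an intrinsic motive to punish in accord with the group. The feature encoding, the logistic learning rule, and the reward scaling are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

class NormClassifier:
    """Logistic classifier trained on public sanctioning events:
    maps features of an observed behaviour to P(group disapproves)."""

    def __init__(self, n_features: int, lr: float = 0.1):
        self.w = np.zeros(n_features)
        self.lr = lr

    def prob_disapproved(self, x: np.ndarray) -> float:
        return 1.0 / (1.0 + np.exp(-self.w @ x))

    def update(self, x: np.ndarray, sanctioned: bool) -> None:
        # One gradient step on the log-loss; the label is simply whether
        # the group actually punished this behaviour.
        self.w += self.lr * (float(sanctioned) - self.prob_disapproved(x)) * x

def punishment_motive(clf: NormClassifier, behaviour: np.ndarray,
                      punished: bool, alpha: float = 1.0) -> float:
    """Intrinsic reward for punishing in accord with the group: positive
    when the agent punishes behaviour the classifier deems disapproved,
    negative when it punishes approved behaviour, zero otherwise."""
    return alpha * (clf.prob_disapproved(behaviour) - 0.5) if punished else 0.0
```

The key property is that only sanctioning events, the one public signal in this setting, drive the classifier's labels; everything else remains decentralized.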
This paper is about shipping runtime verification to the masses. It presents the crucial technology enabling everyday car owners to monitor the behaviour of their cars in the wild. Concretely, we present an Android app that deploys RTLola runtime monitors for the purpose of diagnosing automotive exhaust emissions. For this, it leverages the availability of cheap Bluetooth adapters for the On-Board-Diagnostics (OBD) ports that are ubiquitous in cars nowadays. We detail its use in the context of Real Driving Emissions (RDE) tests and report on sample runs that helped identify violations of the regulatory framework currently in force in the European Union.
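The app itself runs on Android, but the kind of OBD data stream such a monitor consumes can be sketched in Python with the python-OBD library. The port string is an assumption (a paired ELM327 Bluetooth adapter commonly appears as /dev/rfcomm0 on Linux), and a real RDE monitor reads far more signals than speed and RPM.

```python
import time
import obd  # python-OBD, speaks to an ELM327 OBD-II adapter

# Port string is an assumption: a paired ELM327 Bluetooth adapter
# commonly appears as /dev/rfcomm0 on Linux.
connection = obd.OBD("/dev/rfcomm0")

for _ in range(10):
    speed = connection.query(obd.commands.SPEED)  # vehicle speed
    rpm = connection.query(obd.commands.RPM)      # engine speed
    if not speed.is_null() and not rpm.is_null():
        # A runtime monitor (RTLola in the paper) would consume such a
        # stream and evaluate RDE trip constraints over it online.
        print(speed.value, rpm.value)
    time.sleep(1)
```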
The API economy refers to the widespread integration of API (application programming interface) microservices, through which software applications communicate with each other, as a crucial element in business models and functions. The number of possible ways in which such a system could be used is huge. It is thus desirable to monitor usage patterns and identify when the system is used in a way it never was before, providing a warning to system analysts so they can ensure uninterrupted operation of the system. In this work we analyze both histograms and call graphs of API usage to determine whether the usage patterns of the system have shifted. We compare the application of nonparametric statistical and Bayesian sequential analysis to the problem. This is done in a way that overcomes the issue of repeated statistical tests and ensures the statistical significance of the alerts. The technique was simulated and tested, and proved effective in detecting drift in various scenarios. We also describe modifications to the technique that shorten its memory of past observations so that it can respond more quickly when the distribution drift occurs at a delay from when monitoring begins.
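As a hedged illustration of the repeated-testing issue, the sketch below compares a window's API-call histogram against a baseline with a chi-square test plus a Bonferroni correction. The paper's actual method uses nonparametric and Bayesian sequential analysis; the correction scheme and the counts shown here are illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2_contingency

def drift_alert(baseline_counts, window_counts, n_tests, alpha=0.01):
    """Chi-square test of a window's API-call histogram against the
    baseline, Bonferroni-corrected for the number of repeated tests so
    that continuous monitoring does not inflate the false-alarm rate."""
    table = np.array([baseline_counts, window_counts])
    _, p_value, _, _ = chi2_contingency(table)
    return p_value < alpha / n_tests

baseline = [500, 300, 150, 50]   # hypothetical calls per endpoint (training)
window   = [280, 310, 300, 110]  # hypothetical counts in the current window
print(drift_alert(baseline, window, n_tests=100))  # True -> raise an alert
```

Without dividing alpha by the number of tests performed, running this check on every window would all but guarantee spurious alerts over time.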
Organisations increasingly use automated decision-making systems (ADMS) to inform decisions that affect humans and their environment. While the use of ADMS can improve the accuracy and efficiency of decision-making processes, it is also coupled with ethical challenges. Unfortunately, the governance mechanisms currently used to oversee human decision-making often fail when applied to ADMS. In previous work, we proposed that ethics-based auditing (EBA), i.e. a structured process by which ADMS are assessed for consistency with relevant principles or norms, can (a) help organisations verify claims about their ADMS and (b) provide decision-subjects with justifications for the outputs produced by ADMS. In this article, we outline the conditions under which EBA procedures can be feasible and effective in practice. First, we argue that EBA is best understood as a 'soft' yet 'formal' governance mechanism. This implies that the main responsibility of auditors should be to spark ethical deliberation at key intervention points throughout the software development process and ensure that there is sufficient documentation to respond to potential inquiries. Second, we frame ADMS as parts of larger socio-technical systems to demonstrate that to be feasible and effective, EBA procedures must link to intervention points that span all levels of organisational governance and all phases of the software lifecycle. The main function of EBA should therefore be to inform, formalise, assess, and interlink existing governance structures. Finally, we discuss the policy implications of our findings. To support the emergence of feasible and effective EBA procedures, policymakers and regulators could provide standardised reporting formats, facilitate knowledge exchange, provide guidance on how to resolve normative tensions, and create an independent body to oversee EBA of ADMS.
In this paper, we present the findings from a survey study investigating how developers and managers define and trade off developer productivity and software quality (two related lenses into software development). We found that developers and managers, as cohorts, are not well aligned in their views of what it means to be productive (developers think of productivity in terms of activity, while more managers think of productivity in terms of performance). We also found that developers are not accurate at predicting their managers' views of productivity. In terms of quality, we found that individual developers and managers have quite varied views of what quality means to them, but that the two cohorts are closely aligned in the spread of those views, with the majority in both groups defining quality in terms of robustness. Over half of the developers and managers reported that quality can be traded for higher productivity and explained why this trade-off can be justified, while one third considered quality a necessary part of productivity that cannot be traded. We also present a new descriptive framework for quality, TRUCE, that we synthesize from the survey responses. We call for more discussion between developers and managers about what they each consider important software quality attributes, and for open debate about how software quality relates to developer productivity and what trade-offs should or should not be made.
A cloud service provider strives to provide a high Quality of Service (QoS) to client jobs. Such jobs vary in their computational and Service-Level-Agreement (SLA) obligations, and differ in how well they tolerate delays and SLA violations. Job scheduling plays a critical role in servicing cloud demands by allocating appropriate resources to execute client jobs. The cloud provider optimizes the response to such jobs in a multi-tier cloud computing environment. Typically, the complex and dynamic nature of multi-tier environments makes such demands difficult to meet, because tiers depend on one another, so bottlenecks in one tier shift to and escalate in subsequent tiers. Moreover, the optimization process of existing approaches produces single-tier-driven schedules that do not account for the differential impact of SLA violations when executing client jobs. Furthermore, the impact of schedules optimized at the tier level on the performance of schedules formulated in subsequent tiers tends to be ignored, resulting in less than optimal performance when measured at the multi-tier level. Failing to meet job obligations incurs SLA penalties that often take the form of financial compensation, or of losing the future interest and motivation of unsatisfied clients in the service provided. In this paper, a scheduling and allocation approach is proposed to formulate schedules that account for the differential impacts of SLA violation penalties and thus are optimal in financial performance. A queue virtualization scheme is designed to facilitate the formulation of optimal schedules at the tier and multi-tier levels of the cloud environment. Because the scheduling problem is NP-hard, a biologically inspired approach is proposed to mitigate the complexity of finding optimal schedules.
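To make the optimization target concrete, here is a toy sketch assuming a genetic algorithm, one common biologically inspired approach (not necessarily the one the paper uses), minimizing differential SLA penalties on a single tier. The job data are hypothetical, and the multi-tier queue virtualization is omitted entirely.

```python
import random

jobs = [  # (duration, deadline, penalty_per_unit_late) -- hypothetical
    (4, 4, 10.0), (2, 6, 3.0), (6, 14, 8.0), (3, 9, 1.0), (5, 12, 5.0),
]

def penalty(order):
    """Total SLA penalty of a schedule: jobs run back-to-back, and each
    pays its own (differential) rate for every time unit past its deadline."""
    t, cost = 0, 0.0
    for j in order:
        duration, due, rate = jobs[j]
        t += duration
        cost += max(0, t - due) * rate
    return cost

def evolve(generations=200, pop_size=30):
    """Evolve job orderings: keep the cheapest half, refill the population
    with swap-mutated copies of the survivors."""
    pop = [random.sample(range(len(jobs)), len(jobs)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=penalty)
        parents = pop[: pop_size // 2]
        children = []
        for p in parents:
            child = p[:]
            i, k = random.sample(range(len(child)), 2)  # swap mutation
            child[i], child[k] = child[k], child[i]
            children.append(child)
        pop = parents + children
    return min(pop, key=penalty)

best = evolve()
print(best, penalty(best))
```

The fitness function is where the differential penalties enter: two schedules with the same total lateness can have very different financial costs.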
Digital contact tracing is being used by many countries to help contain COVID-19's spread in a post-lockdown world. Among the various available techniques, decentralized contact tracing that uses Bluetooth received signal strength indication (RSSI) to detect proximity is considered less of a privacy risk than approaches that rely on collecting absolute locations via GPS, cellular-tower history, or QR-code scanning. As of October 2020, there have been millions of downloads of such Bluetooth-based contact-tracing apps, as more and more countries officially adopt them. However, the effectiveness of these apps in the real world remains unclear due to a lack of empirical research that includes realistic crowd sizes and densities. This study aims to fill that gap by empirically investigating the effectiveness of Bluetooth-based contact tracing in crowd environments with a total of 80 participants, emulating classrooms, moving lines, and other types of real-world gatherings. The results confirm that Bluetooth RSSI is unreliable for detecting proximity, and that this inaccuracy worsens in environments that are especially crowded. In other words, this technique may be least useful when it is most needed, and it is fragile when confronted with low-cost jamming. Moreover, technical problems such as high energy consumption and phone overheating caused by the contact-tracing app were found to negatively influence users' willingness to adopt it. On the bright side, however, Bluetooth RSSI may still be useful for detecting coarse-grained contact events, for example, proximity of up to 20m lasting for an hour. Based on our findings, we recommend that existing contact-tracing apps be re-purposed to focus on coarse-grained proximity detection, and that future ones calibrate distance estimates and adjust broadcast frequencies based on auxiliary information.
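For context on why RSSI-based proximity is fragile, the sketch below applies the standard log-distance path-loss model typically used to convert RSSI into distance. The calibration constant and path-loss exponents are assumed values; real crowds, with bodies and reflections, deviate from the model even further.

```python
def estimate_distance(rssi_dbm: float, tx_power_dbm: float = -59.0,
                      path_loss_exp: float = 2.0) -> float:
    """Log-distance path-loss model: d = 10 ** ((P_tx - RSSI) / (10 * n)).
    tx_power_dbm is the RSSI measured at 1 m (device-specific) and n is
    the path-loss exponent (~2 in free space, higher indoors/in crowds)."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * path_loss_exp))

# The same reading maps to very different distances as n varies, which is
# one reason RSSI-based proximity detection is unreliable in crowds:
for n in (1.8, 2.0, 2.5, 3.0):
    print(n, round(estimate_distance(-75.0, path_loss_exp=n), 2), "m")
```

A single -75 dBm reading spans everything from a few metres to well over ten depending on the assumed environment, consistent with the coarse-grained-only recommendation above.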
Red light running at signalised intersections is a growing road safety issue worldwide, prompting the rapid development of advanced intelligent transportation technologies and countermeasures. However, existing studies have yet to summarise the effect of these technology-based innovations on improving safety. This paper presents a comprehensive review of red-light running behaviour prediction methodologies and technology-based countermeasures. Specifically, the major focus of this study is a comprehensive review of two streams of literature targeting red-light running and stop-and-go behaviour at signalised intersections: (1) studies focusing on modelling and predicting red-light running and stop-and-go related driver behaviour, and (2) studies focusing on the effectiveness of different technology-based countermeasures that combat such unsafe behaviour. The study provides a systematic guide to assist researchers and stakeholders in understanding how best to identify red-light running and stop-and-go associated driving behaviour, and subsequently to implement countermeasures that combat such risky behaviour and improve the associated safety.
Visual information extraction (VIE) has attracted considerable attention recently owing to its various advanced applications such as document understanding, automatic marking and intelligent education. Most existing works decouple this problem into several independent sub-tasks of text spotting (text detection and recognition) and information extraction, which completely ignores the high correlation among them during optimization. In this paper, we propose a robust visual information extraction system (VIES) towards real-world scenarios, which is a unified end-to-end trainable framework for simultaneous text detection, recognition and information extraction, taking a single document image as input and outputting the structured information. Specifically, the information extraction branch collects abundant visual and semantic representations from text spotting for multimodal feature fusion and, conversely, provides higher-level semantic clues that contribute to the optimization of text spotting. Moreover, given the shortage of public benchmarks, we construct a fully-annotated dataset called EPHOIE (//github.com/HCIILAB/EPHOIE), the first Chinese benchmark for both text spotting and visual information extraction. EPHOIE consists of 1,494 images of examination paper heads with complex layouts and backgrounds, including a total of 15,771 Chinese handwritten or printed text instances. Compared with the state-of-the-art methods, our VIES shows significantly superior performance on the EPHOIE dataset and achieves a 9.01% F-score gain on the widely used SROIE dataset under the end-to-end scenario.
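As an illustrative sketch (not the VIES architecture itself), the PyTorch snippet below shows the general shape of multimodal fusion in an information-extraction branch: per-text-instance visual and semantic features are concatenated, projected, and classified into entity categories. All dimensions and the fusion scheme are assumptions.

```python
import torch
import torch.nn as nn

class FusionBranch(nn.Module):
    """Toy multimodal fusion: concatenate per-text-instance visual and
    semantic features, project them, and classify each instance into an
    entity category. Dimensions and fusion scheme are illustrative."""

    def __init__(self, d_vis: int = 256, d_sem: int = 256, n_classes: int = 10):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(d_vis + d_sem, 256),
            nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, vis_feats: torch.Tensor, sem_feats: torch.Tensor):
        # vis_feats, sem_feats: (num_text_instances, feature_dim)
        return self.fuse(torch.cat([vis_feats, sem_feats], dim=-1))

logits = FusionBranch()(torch.randn(8, 256), torch.randn(8, 256))
print(logits.shape)  # torch.Size([8, 10])
```

In an end-to-end system the gradients from this branch also flow back into the text-spotting features, which is precisely the coupling that decoupled pipelines lose.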
To make deliberate progress towards more intelligent and more human-like artificial systems, we need to be following an appropriate feedback signal: we need to be able to define and evaluate intelligence in a way that enables comparisons between two systems, as well as comparisons with humans. Over the past hundred years, there has been an abundance of attempts to define and measure intelligence, across both the fields of psychology and AI. We summarize and critically assess these definitions and evaluation approaches, while making apparent the two historical conceptions of intelligence that have implicitly guided them. We note that in practice, the contemporary AI community still gravitates towards benchmarking intelligence by comparing the skill exhibited by AIs and humans at specific tasks such as board games and video games. We argue that solely measuring skill at any given task falls short of measuring intelligence, because skill is heavily modulated by prior knowledge and experience: unlimited priors or unlimited training data allow experimenters to "buy" arbitrary levels of skills for a system, in a way that masks the system's own generalization power. We then articulate a new formal definition of intelligence based on Algorithmic Information Theory, describing intelligence as skill-acquisition efficiency and highlighting the concepts of scope, generalization difficulty, priors, and experience. Using this definition, we propose a set of guidelines for what a general AI benchmark should look like. Finally, we present a benchmark closely following these guidelines, the Abstraction and Reasoning Corpus (ARC), built upon an explicit set of priors designed to be as close as possible to innate human priors. We argue that ARC can be used to measure a human-like form of general fluid intelligence and that it enables fair general intelligence comparisons between AI systems and humans.
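As a deliberately simplified schematic (not the paper's exact formula, which also weights tasks and sums over training curricula), the skill-acquisition-efficiency reading can be written as:

```latex
% GD_T = generalization difficulty of task T, P_T = priors brought to T,
% E_T = experience consumed on T, all quantified in Algorithmic
% Information Theory terms:
I_{\text{scope}} \;\propto\; \operatorname*{avg}_{T \in \text{scope}}
  \frac{GD_T}{P_T + E_T}
```

The denominator captures the abstract's "buying" argument: pouring in priors or experience can raise skill without raising intelligence, because efficiency is what is measured.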