The Latin American Giant Observatory (LAGO) is a distributed cosmic ray observatory operating at a regional scale in Latin America, deploying a large network of water Cherenkov detectors (WCD) and other astroparticle detectors across a wide range of latitudes, from Antarctica to M\'exico, and altitudes, from sea level to more than 5500 m a.s.l. Detector telemetry, atmospheric conditions and the flux of secondary particles at ground level are measured in great detail at each LAGO site using our custom-designed hardware and firmware (ACQUA). To combine and analyse all these data, LAGO developed ANNA, our data analysis framework. Additionally, ARTI is a complete simulation framework designed to reproduce the signals expected at our detectors from primary cosmic rays entering the Earth's atmosphere, allowing a precise characterization of each site under realistic atmospheric, geomagnetic and detector conditions. As measured and synthetic data start to flow, we face challenging scenarios: a large volume of data, produced on a diversity of detectors, computing architectures and e-infrastructures, needs to be transferred, analysed, catalogued, preserved and made available for internal and public access and data mining within an open e-science environment. In this work, we present the implementation of ARTI on the EOSC-Synergy cloud-based services as the first of LAGO's frameworks to follow the FAIR principles for provenance, data curation and data re-use. To this end, we calculate the flux of secondary particles expected over up to one week at detector level for all 26 LAGO sites, as well as the one-year flux of high-energy secondaries expected at the ANDES Underground Laboratory and other sites. We thus show how this development can help not only LAGO but also other data-intensive cosmic ray observatories, muography experiments and underground laboratories.
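As a back-of-the-envelope illustration of the kind of quantity computed above (and not of ARTI itself), the Python sketch below integrates an assumed power-law flux of secondary particles over an energy range to estimate how many particles cross a detector of a given area in one week; the normalization, spectral index, energy bounds and detector area are all placeholders.

```python
# Minimal sketch, NOT part of ARTI: estimate how many secondaries cross a
# detector over one week under an assumed power-law differential flux
# J(E) = flux_norm * E**(-gamma)  [particles m^-2 s^-1 GeV^-1].
# Every numerical value below is an illustrative placeholder.
def expected_counts(area_m2, days, flux_norm=100.0, gamma=2.7,
                    e_min=0.5, e_max=1e3):
    # Closed-form integral of the power law between e_min and e_max (GeV)
    integral_flux = flux_norm / (gamma - 1.0) * (
        e_min ** (1.0 - gamma) - e_max ** (1.0 - gamma))
    return integral_flux * area_m2 * days * 86400.0

# One week on a hypothetical 4 m^2 water Cherenkov detector
print(f"expected secondaries: {expected_counts(4.0, 7):.3e}")
```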
The COVID-19 pandemic has impacted our society by forcing shutdowns and shifting the way people interact worldwide. With regard to the electric grid, it caused a significant decrease in energy demand across the globe. Recent studies have shown that the low load demand conditions caused by COVID-19 lockdowns, combined with large shares of renewable generation, have resulted in extremely low-inertia grid conditions. In this work, we examine how an attacker could exploit these conditions to cause unsafe grid operating states by executing load-altering attacks (LAAs) that compromise hundreds of thousands of IoT-connected high-wattage loads in low-inertia power systems. Our study focuses on analyzing the impact of the COVID-19 mitigation measures on U.S. regional transmission operators (RTOs), formulating a plausible and realistic least-effort LAA targeted at transmission systems under low-inertia conditions, and evaluating the probability of such large-scale LAAs. Theoretical and simulation results are presented based on the WSCC 9-bus test system. Results demonstrate how adversaries could provoke major frequency disturbances by targeting vulnerable load buses in low-inertia systems.
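To make the role of inertia concrete, the following toy sketch (not the paper's model or data) integrates a single-machine linearized swing equation and compares the initial rate of change of frequency and the short-term frequency dip produced by the same sudden load increase under a normal and a low inertia constant; all parameters are hypothetical.

```python
# Illustrative sketch (not the paper's model): a single-machine swing-equation
# toy showing why the same load-altering attack is more dangerous under
# low-inertia conditions. All parameters are hypothetical.
def rocof_and_dip(h_sec, delta_p_pu, d_pu=1.0, f0=60.0, t_obs=1.0, dt=1e-3):
    """Return initial RoCoF [Hz/s] and frequency deviation after t_obs seconds
    for the linearized swing equation 2H/f0 * d(df)/dt = -dP - D*df/f0."""
    rocof = -delta_p_pu * f0 / (2.0 * h_sec)
    df = 0.0
    for _ in range(int(t_obs / dt)):
        df += dt * (f0 / (2.0 * h_sec)) * (-delta_p_pu - d_pu * df / f0)
    return rocof, df

for h in (5.0, 2.0):  # normal vs. low (pandemic-like) inertia constant
    rocof, df = rocof_and_dip(h, 0.05)  # same 5% load-altering step
    print(f"H={h} s: RoCoF={rocof:.2f} Hz/s, df(1 s)={df:.2f} Hz")
```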
Power estimation is the basis of many hardware optimization strategies. However, it is still challenging to offer accurate power estimation at an early stage such as high-level synthesis (HLS). In this paper, we propose PowerGear, a graph-learning-assisted power estimation approach for FPGA HLS, which features high accuracy, efficiency and transferability. PowerGear comprises two main components: a graph construction flow and a customized graph neural network (GNN) model. Specifically, in the graph construction flow, we introduce buffer insertion, datapath merging, graph trimming and feature annotation techniques to transform HLS designs into graph-structured data, which encode both intra-operation micro-architectures and inter-operation interconnects annotated with switching activities. Furthermore, we propose a novel power-aware heterogeneous edge-centric GNN model which effectively learns heterogeneous edge semantics and structural properties of the constructed graphs via edge-centric neighborhood aggregation, and fits the formulation of dynamic power. Compared with on-board measurement, PowerGear estimates total and dynamic power for new HLS designs with errors of 3.60% and 8.81%, respectively, outperforming prior research approaches as well as the commercial tool Vivado. In addition, PowerGear demonstrates a speedup of 4x over the Vivado power estimator. Finally, we present a case study in which PowerGear is exploited to facilitate design space exploration for FPGA HLS, leading to a performance gain of up to 11.2% compared with methods using state-of-the-art predictive models.
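The numpy sketch below illustrates the flavor of edge-centric neighborhood aggregation, in which each edge gathers information from edges sharing an endpoint; the toy graph, feature dimensions and mean aggregator are illustrative assumptions and do not reproduce PowerGear's actual layer.

```python
# Toy edge-centric aggregation step, loosely in the spirit of the described
# GNN layer; graph, feature sizes and weights below are made up.
import numpy as np

def edge_centric_layer(edge_index, edge_feat, weight, bias=0.0):
    """Every edge gathers the mean feature of edges sharing an endpoint with
    it, concatenates that with its own feature, then applies a linear map
    followed by ReLU."""
    n_edges, d = edge_feat.shape
    out = np.zeros((n_edges, weight.shape[1]))
    for e, (u, v) in enumerate(edge_index):
        nbrs = [k for k, (a, b) in enumerate(edge_index)
                if k != e and {a, b} & {u, v}]
        agg = edge_feat[nbrs].mean(axis=0) if nbrs else np.zeros(d)
        out[e] = np.maximum(0.0,
                            np.concatenate([edge_feat[e], agg]) @ weight + bias)
    return out

rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]          # toy dataflow graph
feats = rng.normal(size=(len(edges), 4))          # e.g. per-edge switching activity
w = rng.normal(size=(8, 16))                      # (2 * d_in, d_out)
print(edge_centric_layer(edges, feats, w).shape)  # -> (4, 16)
```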
The paper provides a novel framework to study the accuracy and stability of numerical integration schemes when employed for the time domain simulation of power systems. A matrix pencil-based approach is adopted to evaluate the error between the dynamic modes of the power system and the modes of the approximated discrete-time system arising from the application of the numerical method. The proposed approach can provide meaningful insights on how different methods compare to each other when applied to a power system, while being general enough to be systematically utilized for, in principle, any numerical method. The framework is illustrated for a handful of well-known explicit and implicit methods, while simulation results are presented based on the WSCC 9-bus system, as well as on a 1,479-bus dynamic model of the All-Island Irish Transmission System.
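As a simplified illustration of the underlying idea (not the paper's matrix-pencil machinery), the sketch below compares the exact discrete mode exp(hλ) associated with a continuous-time eigenvalue λ against the discrete modes induced by a few classical integration schemes; the eigenvalue and step size are made up.

```python
# Hedged sketch: how far the discrete mode of a numerical method lands from
# the exact discrete mode exp(h*lambda) of a system eigenvalue. The eigenvalue
# and step size below are illustrative, not taken from the paper.
import numpy as np

def discrete_mode(lam, h, method):
    if method == "forward_euler":
        return 1.0 + h * lam
    if method == "backward_euler":
        return 1.0 / (1.0 - h * lam)
    if method == "trapezoidal":
        return (1.0 + 0.5 * h * lam) / (1.0 - 0.5 * h * lam)
    raise ValueError(method)

lam = complex(-0.5, 8.0)   # a lightly damped electromechanical mode [1/s, rad/s]
h = 0.01                   # integration step [s]
exact = np.exp(h * lam)
for m in ("forward_euler", "backward_euler", "trapezoidal"):
    z = discrete_mode(lam, h, m)
    print(f"{m:15s} |error| = {abs(z - exact):.2e}")
```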
With the wide adoption of smart robots in diverse fields, Simultaneous Localization and Mapping (SLAM) has attracted growing attention in the robotics community. Yet collaborative SLAM across multiple robots remains challenging due to the tension between the intensive computation required by SLAM and the limited computing capability of robots. While traditional solutions resort to powerful cloud servers acting as external computation providers, we show through real-world measurements that the significant communication overhead of data offloading prevents their practicality in real deployments. To tackle these challenges, this paper brings the emerging edge computing paradigm into multi-robot SLAM and proposes RecSLAM, a multi-robot laser SLAM system that focuses on accelerating the map construction process under a robot-edge-cloud architecture. In contrast to conventional multi-robot SLAM, which generates graph maps on robots and merges them entirely on the cloud, RecSLAM develops a hierarchical map fusion technique that directs robots' raw data to edge servers for real-time fusion and then sends the results to the cloud for global merging. To optimize the overall pipeline, an efficient collaborative processing framework is introduced that adaptively optimizes robot-to-edge offloading tailored to heterogeneous edge resource conditions while ensuring workload balancing among the edge servers. Extensive evaluations show that RecSLAM can achieve up to a 39% reduction in processing latency over the state-of-the-art. In addition, a proof-of-concept prototype is developed and deployed in real scenes to demonstrate its effectiveness.
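The following toy sketch conveys the hierarchical fusion idea in its simplest possible form: robots' local maps are fused at their edge server, and the edge results are then merged on the cloud. The log-odds occupancy grids, perfect alignment and evidence-summing fusion rule are simplifying assumptions, not RecSLAM's actual pipeline.

```python
# Conceptual robot-edge-cloud fusion sketch, loosely inspired by (not taken
# from) RecSLAM. Maps are toy occupancy grids in log-odds form, assumed to be
# already aligned to a common frame.
import numpy as np

def fuse(maps):
    """Fuse aligned log-odds occupancy grids by summing their evidence."""
    return np.sum(maps, axis=0)

rng = np.random.default_rng(1)
robots_edge_a = [rng.normal(size=(8, 8)) for _ in range(2)]  # robots -> edge A
robots_edge_b = [rng.normal(size=(8, 8)) for _ in range(3)]  # robots -> edge B

edge_maps = [fuse(robots_edge_a), fuse(robots_edge_b)]  # real-time fusion at edges
global_map = fuse(edge_maps)                            # global merging on the cloud
print(global_map.shape)
```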
Testing with simulation environments helps to identify critical failing scenarios for self-driving cars (SDCs). Simulation-based tests are safer than in-field operational tests and allow software defects to be detected before deployment. However, these tests are very expensive, and there are too many of them to run frequently within limited time constraints. In this paper, we investigate test case prioritization techniques to increase the ability to detect SDC regression faults with virtual tests earlier. Our approach, called SDC-Prioritizer, prioritizes virtual tests for SDCs according to static features of the roads we designed for use within the driving scenarios. These features can be collected without running the tests, meaning that no past execution results are required. We introduce two evolutionary approaches that prioritize the test cases using diversity metrics (black-box heuristics) computed on these static features. These two approaches, called SO-SDC-Prioritizer and MO-SDC-Prioritizer, use single-objective and multi-objective genetic algorithms, respectively, to find trade-offs between executing less expensive tests and executing the most diverse test cases earlier. Our empirical study conducted in the SDC domain shows that MO-SDC-Prioritizer significantly improves the ability to detect safety-critical failures at the same level of execution time compared to baselines: random and greedy-based test case orderings. Moreover, our study indicates that multi-objective meta-heuristics outperform single-objective approaches when prioritizing simulation-based tests for SDCs. MO-SDC-Prioritizer prioritizes test cases with a large improvement in fault detection while its overhead (up to 0.45% of the test execution cost) remains negligible.
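As a rough illustration of diversity-driven, execution-history-free prioritization (using a simple greedy heuristic rather than SDC-Prioritizer's genetic algorithms), the sketch below orders test cases by a cheap-but-diverse criterion computed on static road features; the features and costs are invented.

```python
# Greedy stand-in (NOT SDC-Prioritizer's GA): order tests so that cheap and
# diverse road scenarios come first, using only static road features.
import numpy as np

def prioritize(features, costs):
    """Pick next the test maximizing (min distance to already selected) / cost."""
    remaining = list(range(len(features)))
    order = [int(np.argmin(costs))]          # start with the cheapest test
    remaining.remove(order[0])
    while remaining:
        def score(i):
            d = min(np.linalg.norm(features[i] - features[j]) for j in order)
            return d / costs[i]
        best = max(remaining, key=score)
        order.append(best)
        remaining.remove(best)
    return order

rng = np.random.default_rng(0)
road_features = rng.random((6, 4))   # e.g. #turns, curvature stats, length ...
exec_costs = rng.uniform(1, 5, 6)    # estimated simulation time per test
print(prioritize(road_features, exec_costs))
```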
We propose throughput- and cost-optimal job scheduling algorithms for cloud computing platforms offering Infrastructure as a Service. We first consider online migration and propose job scheduling algorithms that minimize job migration and server running costs. We begin with algorithms that assume knowledge of job sizes upon arrival and characterize the optimal cost subject to system stability. We develop an algorithm based on the drift-plus-penalty framework that can approach the optimal cost arbitrarily closely; specifically, this algorithm yields a trade-off between delay and cost. We then relax the job-size knowledge assumption and give an algorithm that relies only on the service already offered to the jobs, showing that it achieves order-wise the same cost as the job-size-based algorithm. Finally, we consider offline job migration that incurs migration delays and again present throughput-optimal algorithms that minimize server running cost. We illustrate the performance of the proposed algorithms and compare them to existing algorithms via simulation.
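A generic, textbook-style drift-plus-penalty sketch is shown below (it is not the paper's algorithm): a single queue decides each slot how many servers to run under a random running cost, and the parameter V trades average cost against backlog, i.e., delay. The arrival rate, price range and action set are illustrative.

```python
# Toy drift-plus-penalty scheduler for one queue: each slot choose s servers
# (0-2), each serving one job at running cost `price`, by minimizing
# V*price*s - Q(t)*s. Larger V -> wait for cheap slots -> lower cost, longer
# delay. All parameters below are illustrative.
import numpy as np

def simulate(v, t_slots=50000, arrival_rate=0.4, seed=0):
    rng = np.random.default_rng(seed)
    q = 0
    total_cost = 0.0
    for _ in range(t_slots):
        q += rng.poisson(arrival_rate)
        price = rng.uniform(0.5, 1.5)        # per-server running cost this slot
        s = 2 if q > v * price else 0        # minimizer of V*price*s - Q*s
        s = min(s, q)                        # do not run idle servers
        total_cost += price * s
        q -= s
    return total_cost / t_slots, q

for v in (1, 20, 200):
    cost, backlog = simulate(v)
    print(f"V={v:3d}: avg cost/slot={cost:.3f}, final backlog={backlog}")
```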
Public clouds are one of the most thriving technologies of the past decade. Major applications running over public clouds require world-wide distribution and large amounts of data exchange between their distributed servers. To that end, major cloud providers have invested tens of billions of dollars in building world-wide inter-region networking infrastructure that can support high-performance communication into, out of, and across public cloud geographic regions. In this paper, we lay the foundation for a comprehensive study and real-time monitoring of various characteristics of networking within and between public clouds. We start by presenting CloudCast, a world-wide and expandable measurement and analysis system, currently (January 2019) collecting data from three major public clouds (AWS, GCP and Azure), covering 59 regions, 1184 intra-cloud and 2238 cross-cloud links (each link representing a direct connection between a pair of regions), for a total of 3422 continuously monitored links with active measurements every minute. CloudCast is composed of measurement agents automatically installed in each public cloud region, centralized control, a measurement database, an analysis engine and visualization tools. We then analyze the latency measurements collected over almost a year. Our analysis yields surprising results. First, each public cloud exhibits a unique set of link latency behaviors over time. Second, using a novel, fair evaluation methodology, termed similar links, we compare the three clouds. Third, using the triangle methodology, we show that more than 50% of all links do not provide the optimal RTT. Triangles also provide a framework for routing around bottlenecks, benefiting not only the majority (53%-70%) of the cross-cloud links by 30% to 70%, but also a significant portion (29%-45%) of intra-cloud links by 14%-33%.
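The triangle idea can be illustrated in a few lines: for each pair of regions, compare the direct RTT with the best two-hop detour through a relay region. The RTT values and region names below are invented and do not come from CloudCast.

```python
# Sketch of the "triangles" check: flag direct links whose measured RTT
# exceeds the best two-hop detour through a relay region. RTTs are toy values.
import itertools

rtt = {  # milliseconds, symmetric toy matrix
    ("us-east", "eu-west"): 80.0,
    ("us-east", "ap-south"): 210.0,
    ("eu-west", "ap-south"): 110.0,
}
def get(a, b):
    return rtt.get((a, b)) or rtt.get((b, a))

regions = {"us-east", "eu-west", "ap-south"}
for a, b in itertools.combinations(regions, 2):
    direct = get(a, b)
    best_relay = min((get(a, c) + get(c, b), c) for c in regions - {a, b})
    if best_relay[0] < direct:
        print(f"{a} <-> {b}: direct {direct} ms, via {best_relay[1]} "
              f"{best_relay[0]} ms ({(1 - best_relay[0] / direct):.0%} better)")
```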
Non-terrestrial networks (NTNs), which integrate space and aerial networks with terrestrial systems, are a key area in the emerging sixth-generation (6G) wireless networks. As part of 6G, NTNs must provide pervasive connectivity to a wide range of devices, including smartphones, vehicles, sensors, robots, and maritime users. However, due to the high mobility and dynamic deployment of NTNs, managing the space-air-sea (SAS) NTN resources, i.e., energy, power, and channel allocation, is a major challenge. This study investigates the design of a SAS-NTN for energy-efficient resource allocation. The goal is to maximize system energy efficiency (EE) by jointly optimizing user equipment (UE) association, power control, and unmanned aerial vehicle (UAV) deployment. Given the limited payloads of UAVs, this work focuses on minimizing the total energy cost of UAVs (trajectory and transmission) while meeting EE requirements. A mixed-integer nonlinear programming problem is formulated, followed by the development of an algorithm that decomposes the problem and solves each subproblem in a distributed manner. The binary (UE association) and continuous (power, deployment) variables are separated using Benders decomposition (BD), and the Dinkelbach algorithm (DA) is then used to convert the fractional programming in the subproblem into an equivalent solvable form. A standard optimization solver is used to handle the complexity of the master problem over the binary variables, while the alternating direction method of multipliers (ADMM) is used to solve the subproblem over the continuous variables. Our proposed algorithm provides a suboptimal solution, and simulation results demonstrate that it achieves better EE than the baselines.
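The sketch below illustrates only the Dinkelbach step mentioned above, applied to a toy single-link energy-efficiency maximization; the channel gain, circuit power, power cap and grid-search inner solver are illustrative assumptions, and the BD/ADMM decomposition is not reproduced.

```python
# Toy Dinkelbach iteration for  max_p  log2(1 + g*p) / (p + p_c),  0 <= p <= p_max.
# Only the fractional-programming step is illustrated; all numbers are made up.
import numpy as np

g, p_c, p_max = 4.0, 0.5, 2.0   # illustrative channel gain, circuit power, cap

def rate(p):
    return np.log2(1.0 + g * p)

def dinkelbach(tol=1e-6, max_iter=50):
    lam = 0.0                                    # current EE estimate
    grid = np.linspace(0.0, p_max, 10001)        # crude inner solver
    for _ in range(max_iter):
        obj = rate(grid) - lam * (grid + p_c)    # parametric subproblem
        p_star = grid[np.argmax(obj)]
        f = rate(p_star) - lam * (p_star + p_c)
        lam = rate(p_star) / (p_star + p_c)      # Dinkelbach update
        if abs(f) < tol:
            break
    return p_star, lam

p_opt, ee = dinkelbach()
print(f"optimal power ~= {p_opt:.3f}, energy efficiency ~= {ee:.3f}")
```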
Man-made scenes can be densely packed, containing numerous objects, often identical, positioned in close proximity. We show that precise object detection in such scenes remains a challenging frontier even for state-of-the-art object detectors. We propose a novel, deep-learning-based method for precise object detection, designed for such challenging settings. Our contributions include: (1) a layer for estimating the Jaccard index as a detection quality score; (2) a novel EM merging unit, which uses our quality scores to resolve detection overlap ambiguities; and (3) an extensive, annotated data set, SKU-110K, representing packed retail environments, released for training and testing under such extreme settings. Detection tests on SKU-110K and counting tests on the CARPK and PUCPR+ benchmarks show that our method outperforms existing state-of-the-art detectors by substantial margins. The code and data will be made available at \url{www.github.com/eg4000/SKU110K_CVPR19}.
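The sketch below shows the two ingredients in their simplest form: the Jaccard (IoU) score between two boxes and a naive quality-weighted merge of strongly overlapping detections. The paper's EM merging unit is considerably more elaborate; the thresholds and boxes here are illustrative.

```python
# Jaccard (IoU) score plus a naive quality-weighted merge of near-duplicate
# boxes; a simplification, not the paper's EM merging unit.
import numpy as np

def iou(a, b):
    """Boxes as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def merge_if_duplicate(a, b, qa, qb, thr=0.5):
    """If two detections overlap strongly, fuse them weighted by quality."""
    if iou(a, b) < thr:
        return [a, b]
    w = qa / (qa + qb)
    return [tuple(w * np.array(a) + (1 - w) * np.array(b))]

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))                       # ~0.333
print(merge_if_duplicate((0, 0, 10, 10), (1, 0, 11, 10), 0.9, 0.6))
```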
In recent years, with the rise of Cloud Computing (CC), many companies providing services in the cloud have added a new series of services to their catalogues, such as data mining (DM) and data processing, taking advantage of the vast computing resources available to them. Several service definition proposals have been put forward to address the problem of describing CC services in a comprehensive way. Bearing in mind that each provider has its own definition of the logic of its services, and specifically of its DM services, the ability to describe services in a flexible way across providers is fundamental to maintaining the usability and portability of this type of CC service. The use of semantic technologies based on Linked Data (LD) for the definition of services allows DM services to be designed and modelled with a high degree of interoperability. In this article, a schema for the definition of DM services on CC is presented that also covers all key aspects of a CC service, such as prices, interfaces, Service Level Agreements, instances and experimentation workflows, among others. The proposed schema is based on LD, so it reuses other schemata to obtain a better definition of the service. To validate the schema, a series of DM services has been created in which some of the best-known algorithms, such as \textit{Random Forest} or \textit{KMeans}, are modelled as services.
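A minimal example of the Linked Data style of service description, written with rdflib, is sketched below; the namespace, class and property names (and the example values) are placeholders rather than the schema proposed in the article.

```python
# Hedged example of describing a DM service as Linked Data with rdflib.
# The dms: namespace and its terms are placeholders, not the article's schema.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

DMS = Namespace("http://example.org/dm-services#")
g = Graph()
g.bind("dms", DMS)

svc = URIRef("http://example.org/services/random-forest")
g.add((svc, RDF.type, DMS.DataMiningService))
g.add((svc, DMS.algorithm, Literal("Random Forest")))
g.add((svc, DMS.pricePerHour, Literal("0.12", datatype=XSD.decimal)))
g.add((svc, DMS.interface, Literal("REST")))
g.add((svc, DMS.slaUptime, Literal("99.9", datatype=XSD.decimal)))

print(g.serialize(format="turtle"))
```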