Non-volatile memory (NVM), also known as persistent memory, is an emerging paradigm for memory that preserves its contents even after power loss. NVM is widely expected to become ubiquitous, and hardware architectures are already providing support for NVM programming. This has stimulated interest in the design of novel concepts ensuring correctness of concurrent programming abstractions in the face of persistency and in the development of associated verification approaches. Software transactional memory (STM) is a key programming abstraction that supports concurrent access to shared state. In a fashion similar to linearizability as the correctness condition for concurrent data structures, there is an established notion of correctness for STMs known as opacity. We have recently proposed {\em durable opacity} as the natural extension of opacity to a setting with non-volatile memory. Together with this novel correctness condition, we designed a verification technique based on refinement. In this paper, we extend this work in two directions. First, we develop a durably opaque version of NOrec (no ownership records), an existing STM algorithm proven to be opaque. Second, we modularise our existing verification approach by separating the proof of durability of memory accesses from the proof of opacity. For NOrec, this allows us to re-use an existing opacity proof and complement it with a proof of the durability of accesses to shared state.
This paper develops simple feed-forward neural networks that achieve the universal approximation property for all continuous functions with a fixed finite number of neurons. These neural networks are simple because they are designed with a simple and computable continuous activation function $\sigma$ leveraging a triangular-wave function and a softsign function. We prove that $\sigma$-activated networks with width $36d(2d+1)$ and depth $11$ can approximate any continuous function on a $d$-dimensioanl hypercube within an arbitrarily small error. Hence, for supervised learning and its related regression problems, the hypothesis space generated by these networks with a size not smaller than $36d(2d+1)\times 11$ is dense in the space of continuous functions. Furthermore, classification functions arising from image and signal classification are in the hypothesis space generated by $\sigma$-activated networks with width $36d(2d+1)$ and depth $12$, when there exist pairwise disjoint closed bounded subsets of $\mathbb{R}^d$ such that the samples of the same class are located in the same subset.
A polynomial indicator function of designs is first introduced by Fontana {\it et al}. (2000) for two-level cases. They give the structure of the indicator functions, especially the relation to the orthogonality of designs. These results are generalized by Aoki (2019) for general multi-level cases. As an application of these results, we can enumerate all orthogonal fractional factorial designs with given size and orthogonality using computational algebraic software. For example, Aoki (2019) gives classifications of orthogonal fractions of 2x2x2x2x3 designs with strength 3, which is derived by simple eliminations of variables. However, the computational feasibility of this naive approach depends on the size of the problems. In fact, it is reported that the computation of orthogonal fractions of 2x2x2x2x3 designs with strength 2 fails to carry out in Aoki (2019). In this paper, using the theory of primary decomposition, we enumerate and classify orthogonal fractions of 2x2x2x2x3 designs with strength 2. We show there are 35200 orthogonal half fractions of 2x2x2x2x3 designs with strength 2, classified into 63 equivalent classes.
The deep learning revolution is touching all scientific disciplines and corners of our lives as a means of harnessing the power of big data. Marine ecology is no exception. These new methods provide analysis of data from sensors, cameras, and acoustic recorders, even in real time, in ways that are reproducible and rapid. Off-the-shelf algorithms can find, count, and classify species from digital images or video and detect cryptic patterns in noisy data. Using these opportunities requires collaboration across ecological and data science disciplines, which can be challenging to initiate. To facilitate these collaborations and promote the use of deep learning towards ecosystem-based management of the sea, this paper aims to bridge the gap between marine ecologists and computer scientists. We provide insight into popular deep learning approaches for ecological data analysis in plain language, focusing on the techniques of supervised learning with deep neural networks, and illustrate challenges and opportunities through established and emerging applications of deep learning to marine ecology. We use established and future-looking case studies on plankton, fishes, marine mammals, pollution, and nutrient cycling that involve object detection, classification, tracking, and segmentation of visualized data. We conclude with a broad outlook of the field's opportunities and challenges, including potential technological advances and issues with managing complex data sets.
Emerging application scenarios, such as cyber-physical systems (CPSs), the Internet of Things (IoT), and edge computing, call for coordination approaches addressing openness, self-adaptation, heterogeneity, and deployment agnosticism. Field-based coordination is one such approach, promoting the idea of programming system coordination declaratively from a global perspective, in terms of functional manipulation and evolution in "space and time" of distributed data structures called fields. More specifically regarding time, in field-based coordination (as in many other distributed approaches to coordination) it is assumed that local activities in each device are regulated by a fair and unsynchronised fixed clock working at the platform level. In this work, we challenge this assumption, and propose an alternative approach where scheduling is programmed in a natural way (along with usual field-based coordination) in terms of causality fields, each enacting a programmable distributed notion of a computation "cause" (why and when a field computation has to be locally computed) and how it should change across time and space. Starting from low-level platform triggers, such causality fields can be organised into multiple layers, up to high-level, collectively-computed time abstractions, to be used at the application level. This reinterpretation of time in terms of articulated causality relations allows us to express what we call "time-fluid" coordination, where scheduling can be finely tuned so as to select the triggers to react to, generally allowing to adaptively balance performance (system reactivity) and cost (resource usage) of computations. We formalise the proposed scheduling framework for field-based coordination in the context of the field calculus, discuss an implementation in the aggregate computing framework, and finally evaluate the approach via simulation on several case studies.
Airborne electromagnetic surveys may consist of hundreds of thousands of soundings. In most cases, this makes 3D inversions unfeasible even when the subsurface is characterized by a high level of heterogeneity. Instead, approaches based on 1D forwards are routinely used because of their computational efficiency. However, it is relatively easy to fit 3D responses with 1D forward modelling and retrieve apparently well-resolved conductivity models. However, those detailed features may simply be caused by fitting the modelling error connected to the approximate forward. In addition, it is, in practice, difficult to identify this kind of artifacts as the modeling error is correlated. The present study demonstrates how to assess the modelling error introduced by the 1D approximation and how to include this additional piece of information into a probabilistic inversion. Not surprisingly, it turns out that this simple modification provides not only much better reconstructions of the targets but, maybe, more importantly, guarantees a correct estimation of the corresponding reliability.
Many applications such as recommendation systems or sports tournaments involve pairwise comparisons within a collection of $n$ items, the goal being to aggregate the binary outcomes of the comparisons in order to recover the latent strength and/or global ranking of the items. In recent years, this problem has received significant interest from a theoretical perspective with a number of methods being proposed, along with associated statistical guarantees under the assumption of a suitable generative model. While these results typically collect the pairwise comparisons as one comparison graph $G$, however in many applications - such as the outcomes of soccer matches during a tournament - the nature of pairwise outcomes can evolve with time. Theoretical results for such a dynamic setting are relatively limited compared to the aforementioned static setting. We study in this paper an extension of the classic BTL (Bradley-Terry-Luce) model for the static setting to our dynamic setup under the assumption that the probabilities of the pairwise outcomes evolve smoothly over the time domain $[0,1]$. Given a sequence of comparison graphs $(G_{t'})_{t' \in \mathcal{T}}$ on a regular grid $\mathcal{T} \subset [0,1]$, we aim at recovering the latent strengths of the items $w_t \in \mathbb{R}^n$ at any time $t \in [0,1]$. To this end, we adapt the Rank Centrality method - a popular spectral approach for ranking in the static case - by locally averaging the available data on a suitable neighborhood of $t$. When $(G_{t'})_{t' \in \mathcal{T}}$ is a sequence of Erd\"os-Renyi graphs, we provide non-asymptotic $\ell_2$ and $\ell_{\infty}$ error bounds for estimating $w_t^*$ which in particular establishes the consistency of this method in terms of $n$, and the grid size $\lvert\mathcal{T}\rvert$. We also complement our theoretical analysis with experiments on real and synthetic data.
Hardware Security Modules (HSMs) are trusted machines that perform sensitive operations in critical ecosystems. They are usually required by law in financial and government digital services. The most important feature of an HSM is its ability to store sensitive credentials and cryptographic keys inside a tamper-resistant hardware, so that every operation is done internally through a suitable API, and such sensitive data are never exposed outside the device. HSMs are now conveniently provided in the cloud, meaning that the physical machines are remotely hosted by some provider and customers can access them through a standard API. The property of keeping sensitive data inside the device is even more important in this setting as a vulnerable application might expose the full API to an attacker. Unfortunately, in the last 20+ years a multitude of practical API-level attacks have been found and proved feasible in real devices. The latest version of PKCS#11, the most popular standard API for HSMs, does not address these issues leaving all the flaws possible. In this paper, we propose the first secure HSM configuration that does not require any restriction or modification of the PKCS#11 API and is suitable to cloud HSM solutions, where compliance to the standard API is of paramount importance. The configuration relies on a careful separation of roles among the different HSM users so that known API flaws are not exploitable by any attacker taking control of the application. We prove the correctness of the configuration by providing a formal model in the state-of-the-art Tamarin prover and we show how to implement the configuration in a real cloud HSM solution.
The problem of Approximate Nearest Neighbor (ANN) search is fundamental in computer science and has benefited from significant progress in the past couple of decades. However, most work has been devoted to pointsets whereas complex shapes have not been sufficiently treated. Here, we focus on distance functions between discretized curves in Euclidean space: they appear in a wide range of applications, from road segments to time-series in general dimension. For $\ell_p$-products of Euclidean metrics, for any $p$, we design simple and efficient data structures for ANN, based on randomized projections, which are of independent interest. They serve to solve proximity problems under a notion of distance between discretized curves, which generalizes both discrete Fr\'echet and Dynamic Time Warping distances. These are the most popular and practical approaches to comparing such curves. We offer the first data structures and query algorithms for ANN with arbitrarily good approximation factor, at the expense of increasing space usage and preprocessing time over existing methods. Query time complexity is comparable or significantly improved by our algorithms, our algorithm is especially efficient when the length of the curves is bounded.
The Normalized Cut (NCut) objective function, widely used in data clustering and image segmentation, quantifies the cost of graph partitioning in a way that biases clusters or segments that are balanced towards having lower values than unbalanced partitionings. However, this bias is so strong that it avoids any singleton partitions, even when vertices are very weakly connected to the rest of the graph. Motivated by the B\"uhler-Hein family of balanced cut costs, we propose the family of Compassionately Conservative Balanced (CCB) Cut costs, which are indexed by a parameter that can be used to strike a compromise between the desire to avoid too many singleton partitions and the notion that all partitions should be balanced. We show that CCB-Cut minimization can be relaxed into an orthogonally constrained $\ell_{\tau}$-minimization problem that coincides with the problem of computing Piecewise Flat Embeddings (PFE) for one particular index value, and we present an algorithm for solving the relaxed problem by iteratively minimizing a sequence of reweighted Rayleigh quotients (IRRQ). Using images from the BSDS500 database, we show that image segmentation based on CCB-Cut minimization provides better accuracy with respect to ground truth and greater variability in region size than NCut-based image segmentation.
Network Virtualization is one of the most promising technologies for future networking and considered as a critical IT resource that connects distributed, virtualized Cloud Computing services and different components such as storage, servers and application. Network Virtualization allows multiple virtual networks to coexist on same shared physical infrastructure simultaneously. One of the crucial keys in Network Virtualization is Virtual Network Embedding, which provides a method to allocate physical substrate resources to virtual network requests. In this paper, we investigate Virtual Network Embedding strategies and related issues for resource allocation of an Internet Provider(InP) to efficiently embed virtual networks that are requested by Virtual Network Operators(VNOs) who share the same infrastructure provided by the InP. In order to achieve that goal, we design a heuristic Virtual Network Embedding algorithm that simultaneously embeds virtual nodes and virtual links of each virtual network request onto physic infrastructure. Through extensive simulations, we demonstrate that our proposed scheme improves significantly the performance of Virtual Network Embedding by enhancing the long-term average revenue as well as acceptance ratio and resource utilization of virtual network requests compared to prior algorithms.