Ensuring the correctness of communication-centric programs is important but challenging. Previous approaches, based on session types, have been intensively investigated over the past decade. They provide a concise way to express protocol specifications and a lightweight approach for checking their implementation. Current solutions rely only on implicit synchronization and on types, which are less precise than logical formulae. In this paper, we propose a more expressive session logic to capture multiparty protocols. By using two kinds of ordering constraints, namely "happens-before" <HB and "communicates-before" <CB, we show how to ensure, from first principles, race-freedom over shared channels. Our approach refines each specification with both assumptions and proof obligations to ensure compliance with a given global protocol. Each specification is then projected onto each party and then onto each channel, allowing cooperative proving through localized automated verification. Our primary goal in automated verification is to ensure race-freedom and communication-safety, but the approach extends to deadlock-freedom as well. We also describe how modular protocols can be captured and handled by our approach.
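The happens-before check underlying race-freedom can be illustrated with a small sketch (not the paper's session logic): treat a protocol as a set of transmission events with hypothetical <HB edges and flag any two transmissions over the same channel that are unordered.

```python
# Minimal sketch (not the paper's logic): report a potential race as two
# transmissions over the same channel that are not related by happens-before.
from itertools import combinations

def happens_before(order, a, b):
    """True if event a reaches event b through the (hypothetical) <HB edges."""
    stack, seen = [a], set()
    while stack:
        x = stack.pop()
        if x == b:
            return True
        if x in seen:
            continue
        seen.add(x)
        stack.extend(order.get(x, ()))
    return False

def find_races(events, order):
    """events: {event_id: channel}; order: {event_id: [events that must come later]}."""
    races = []
    for a, b in combinations(events, 2):
        if events[a] == events[b] and not (happens_before(order, a, b)
                                           or happens_before(order, b, a)):
            races.append((a, b))
    return races

# Two unordered sends on channel "c" form a race; adding an <HB edge removes it.
events = {"s1": "c", "s2": "c"}
print(find_races(events, {}))              # [('s1', 's2')]
print(find_races(events, {"s1": ["s2"]}))  # []
```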
Channel estimation in mmWave and THz-range wireless communications, which produce data in the Gb-to-Tb range, is critical to configuring system parameters related to transmission signal quality, and it remains a daunting challenge in both software and hardware. Current channel estimation methods, whether model-based or data-based (machine learning, ML), both use and create big data. This in turn requires a large amount of computational resources: read operations to check whether predefined channel configurations, e.g., QoS requirements, already exist in the database, as well as write operations to store new combinations of QoS parameters in the database. The ML-based approach in particular requires high computational and storage resources, low latency, and greater hardware flexibility. In this paper, we engineer and study the offloading of the above operations to edge and cloud computing systems in order to understand their suitability for providing rapid responses with channel and link configuration parameters, using THz channel modeling as an example. We evaluate the performance of the engineered system when the computational and storage resources are orchestrated as 1) a monolithic architecture and 2) a microservices architecture, both in an edge-cloud setting. For the microservices approach, we engineer both Docker Swarm and Kubernetes systems. The measurements show great promise for edge computing and microservices, which can respond quickly with properly configured parameters and improve transmission distance and signal quality in ultra-high-speed wireless communications.
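The database read/write pattern described above can be sketched as follows; the key structure, the stored parameters, and the in-memory dictionary are illustrative placeholders for whatever store the edge or cloud service actually uses.

```python
# Sketch of the lookup-then-store pattern for channel/QoS configurations.
# Keys, parameters, and the in-memory dict are illustrative only.

config_db = {}  # stand-in for the edge/cloud configuration database

def config_key(frequency_ghz, distance_m, qos_class):
    return (round(frequency_ghz, 1), round(distance_m), qos_class)

def get_or_compute_config(frequency_ghz, distance_m, qos_class, estimator):
    """Read: reuse a stored configuration matching the QoS requirements.
    Write: otherwise run the (expensive) channel estimator and cache the result."""
    key = config_key(frequency_ghz, distance_m, qos_class)
    if key in config_db:                 # read operation
        return config_db[key]
    params = estimator(frequency_ghz, distance_m, qos_class)
    config_db[key] = params              # write operation
    return params

# Example with a dummy estimator standing in for THz channel modeling.
dummy = lambda f, d, q: {"tx_power_dbm": 10, "modulation": "16QAM"}
print(get_or_compute_config(300.0, 12.0, "low_latency", dummy))
```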
The design of heat exchanger fields is a key step in ensuring the long-term sustainability of such renewable energy systems. This task requires modelling the relevant processes in the complex system made up of different exchangers, where heat transfer must be considered both within and outside the exchangers. We propose a mathematical model for the study of heat conduction into the soil as a consequence of the presence of the exchangers. The problem is formulated and solved with an analytical approach. On the basis of this analytical solution, we propose an optimisation procedure that computes the best positions of the exchangers by minimising the adverse effects of neighbouring devices. Numerical experiments show the effectiveness of the proposed method, also in comparison with a reference approximation of the problem based on a finite difference method.
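A minimal sketch of the position-optimisation step, assuming a placeholder distance-decay interference kernel rather than the paper's analytical solution:

```python
# Sketch: choose exchanger positions in a rectangular field so that mutual
# thermal interference (modelled here by a placeholder distance-decay kernel,
# not the paper's analytical solution) is minimized.
import numpy as np
from scipy.optimize import minimize

N, LX, LY = 4, 20.0, 20.0          # number of exchangers, field size in metres

def interference(flat_xy):
    xy = flat_xy.reshape(N, 2)
    total = 0.0
    for i in range(N):
        for j in range(i + 1, N):
            d = np.linalg.norm(xy[i] - xy[j])
            total += np.exp(-d)     # placeholder for the analytical influence term
    return total

x0 = np.random.default_rng(0).uniform(0, [LX, LY], size=(N, 2)).ravel()
bounds = [(0, LX), (0, LY)] * N
res = minimize(interference, x0, bounds=bounds)
print(res.x.reshape(N, 2))          # positions pushed apart to reduce interference
```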
We present an analytical framework for channel estimation and data detection in massive multiple-input multiple-output uplink systems with 1-bit analog-to-digital converters (ADCs) and i.i.d. Rayleigh fading. First, we provide closed-form expressions for the mean squared error (MSE) of the channel estimation, considering the state-of-the-art linear minimum MSE estimator and the class of scaled least-squares estimators. For the data detection, we provide closed-form expressions for the expected value and the variance of the estimated symbols when maximum ratio combining is adopted, which can be exploited to efficiently implement minimum distance detection and, potentially, to design the set of transmit symbols. Our analytical findings depend explicitly on key system parameters such as the signal-to-noise ratio (SNR), the number of user equipments, and the pilot length, thus enabling a precise characterization of the performance of channel estimation and data detection with 1-bit ADCs. The proposed analysis highlights a fundamental SNR trade-off, according to which operating at the right noise level significantly enhances the system performance.
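For readers who want to reproduce the setting numerically, the following sketch estimates the per-entry MSE of a scaled least-squares channel estimate under 1-bit quantization by Monte Carlo; the pilot design and the oracle scaling factor below are illustrative choices, not the closed-form expressions derived in the paper.

```python
# Sketch: empirical MSE of a scaled least-squares channel estimate with 1-bit
# ADCs and i.i.d. Rayleigh fading. Pilots and scaling are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
M, K, tau, snr = 64, 4, 16, 1.0          # antennas, users, pilot length, linear SNR

def one_bit(x):                          # per-component sign quantization
    return (np.sign(x.real) + 1j * np.sign(x.imag)) / np.sqrt(2)

# Orthogonal pilots: first K rows of the tau-point DFT matrix (P @ P^H = tau * I).
P = np.exp(-2j * np.pi * np.outer(np.arange(K), np.arange(tau)) / tau)

mse, trials = 0.0, 200
for _ in range(trials):
    H = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
    N = (rng.standard_normal((M, tau)) + 1j * rng.standard_normal((M, tau))) / np.sqrt(2)
    Y = one_bit(np.sqrt(snr) * H @ P + N)            # quantized received pilots
    H_ls = Y @ P.conj().T / tau                      # least-squares estimate
    scale = np.vdot(H_ls, H).real / np.vdot(H_ls, H_ls).real   # oracle scaling (illustrative)
    mse += np.mean(np.abs(scale * H_ls - H) ** 2) / trials
print(f"empirical per-entry MSE: {mse:.3f}")
```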
The ever-rising computation demand is forcing the move from the CPU to heterogeneous specialized hardware, which is readily available across modern datacenters through disaggregated infrastructure. On the other hand, trusted execution environments (TEEs), one of the most promising recent developments in hardware security, can only protect code confined in the CPU, limiting TEEs' potential and applicability to a handful of applications. We observe that the TEEs' hardware trusted computing base (TCB) is fixed at design time, which in practice leads to using untrusted software to employ peripherals in TEEs. Based on this observation, we propose \emph{composite enclaves} with a configurable hardware and software TCB, allowing enclaves access to multiple computing and IO resources. Finally, we present two case studies of composite enclaves: i) an FPGA platform based on RISC-V Keystone connected to emulated peripherals and sensors, and ii) a large-scale accelerator. These case studies showcase a flexible yet small TCB (2.5 KLoC for IO peripherals and drivers) with low performance overhead (only around 220 additional cycles for a context switch), thus demonstrating the feasibility of our approach and showing that it can work with a wide range of specialized hardware.
Federated learning allows a set of users to train a deep neural network over their private training datasets. During the protocol, datasets never leave the devices of the respective users. This is achieved by requiring each user to send "only" model updates to a central server that, in turn, aggregates them to update the parameters of the deep neural network. However, it has been shown that each model update carries sensitive information about the user's dataset (e.g., gradient inversion attacks). The state-of-the-art implementations of federated learning protect these model updates by leveraging secure aggregation: a cryptographic protocol that securely computes the aggregation of the model updates of the users. Secure aggregation is pivotal to protecting users' privacy since it hinders the server from learning the value and the source of the individual model updates provided by the users, preventing inference and data attribution attacks. In this work, we show that a malicious server can easily elude secure aggregation as if the latter were not in place. We devise two different attacks capable of inferring information about individual private training datasets, independently of the number of users participating in the secure aggregation. This makes them concrete threats in large-scale, real-world federated learning applications. The attacks are generic and do not target any specific secure aggregation protocol. They are equally effective even if the secure aggregation protocol is replaced by its ideal functionality that provides the perfect level of security. Our work demonstrates that secure aggregation has been incorrectly combined with federated learning and that current implementations offer only a "false sense of security".
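For context, the sketch below shows the pairwise-masking idea behind many secure aggregation protocols: masks cancel in the sum, so the server recovers only the aggregate of the updates. It is a toy illustration that omits key agreement, dropout handling, and all cryptographic machinery, and it is not tied to any specific protocol discussed here.

```python
# Toy pairwise-masking aggregation: masks cancel in the sum, so the server
# learns only sum(updates). No cryptography, dropout handling, or integrity.
import numpy as np

rng = np.random.default_rng(0)
n_users, dim = 5, 8
updates = [rng.standard_normal(dim) for _ in range(n_users)]

# Pairwise masks: user i adds +m_ij for j > i and subtracts m_ji for j < i.
masks = {(i, j): rng.standard_normal(dim)
         for i in range(n_users) for j in range(i + 1, n_users)}

def masked_update(i):
    out = updates[i].copy()
    for j in range(n_users):
        if i < j:
            out += masks[(i, j)]
        elif j < i:
            out -= masks[(j, i)]
    return out

server_view = [masked_update(i) for i in range(n_users)]   # individual updates hidden
aggregate = np.sum(server_view, axis=0)                    # masks cancel pairwise
assert np.allclose(aggregate, np.sum(updates, axis=0))
```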
Analytic combinatorics in several variables is a powerful tool for deriving the asymptotic behavior of combinatorial quantities by analyzing multivariate generating functions. We study information-theoretic questions about sequences in a discrete noiseless channel under cost and forbidden substring constraints. Our main contributions involve the relationship between the graph structure of the channel and the singularities of the bivariate generating function whose coefficients are the number of sequences satisfying the constraints. We combine these new results with methods from multivariate analytic combinatorics to answer questions in many application areas. For example, we determine the optimal coded synthesis rate for DNA data storage when the synthesis supersequence is any periodic string. This follows from a precise characterization of the number of subsequences of an arbitrary periodic string. Along the way, we provide a new proof of the equivalence of the combinatorial and probabilistic definitions of the cost-constrained capacity, and we show that the cost-constrained channel capacity is determined by a cost-dependent singularity, generalizing Shannon's classical result for unconstrained capacity.
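As background for the generating-function viewpoint, Shannon's classical capacity of a discrete noiseless channel with positive integer symbol costs $c_1,\dots,c_n$ can be phrased in terms of a dominant singularity. If $N(t)$ counts the sequences of total cost $t$, then
\[
F(z)=\sum_{t\ge 0} N(t)\,z^{t}=\frac{1}{1-(z^{c_1}+\cdots+z^{c_n})},
\qquad
C=\lim_{t\to\infty}\frac{\log N(t)}{t}=\log\frac{1}{\rho},
\]
where $\rho$, the smallest positive root of $z^{c_1}+\cdots+z^{c_n}=1$, is the dominant singularity of $F$. The cost-constrained and forbidden-substring results above generalize this singularity-based picture.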
We present a flexible public transit network design model which optimizes a social access objective while guaranteeing that the system's costs and transit times remain within a preset margin of their current levels. The purpose of the model is to find a set of minor, immediate modifications to an existing bus network that can give more communities access to the chosen services while having a minimal impact on the current network's operator costs and user costs. Design decisions consist of reallocation of existing resources in order to adjust line frequencies and capacities. We present a hybrid tabu search/simulated annealing algorithm for the solution of this optimization-based model. As a case study we apply the model to the problem of improving equity of access to primary health care facilities in the Chicago metropolitan area. The results of the model suggest that it is possible to achieve better primary care access equity through reassignment of existing buses and implementation of express runs, while leaving overall service levels relatively unaffected.
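A generic skeleton of a hybrid tabu search/simulated annealing loop is sketched below; it conveys the combination of SA-style acceptance with a tabu list, but it is not the paper's exact algorithm, whose moves operate on line frequencies and capacities.

```python
# Generic hybrid tabu-search/simulated-annealing skeleton (illustrative only):
# SA-style acceptance plus a tabu list that forbids recently visited solutions.
import math, random

def hybrid_search(initial, neighbours, cost, iters=10_000, temp=1.0,
                  cooling=0.999, tabu_size=50, seed=0):
    rng = random.Random(seed)
    current = best = initial
    tabu = []
    for _ in range(iters):
        candidates = [s for s in neighbours(current) if s not in tabu]
        if not candidates:
            break
        cand = rng.choice(candidates)
        delta = cost(cand) - cost(current)
        if delta <= 0 or rng.random() < math.exp(-delta / max(temp, 1e-9)):
            current = cand
            tabu = (tabu + [cand])[-tabu_size:]
            if cost(current) < cost(best):
                best = current
        temp *= cooling
    return best

# Tiny example: minimize x**2 over the integers by moving +/- 1.
print(hybrid_search(10, lambda x: [x - 1, x + 1], lambda x: x * x))
```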
Kleene algebra with tests (KAT) is a foundational equational framework for reasoning about programs, which has found applications in program transformations, networking, and compiler optimizations, among many other areas. In his seminal work, Kozen proved that KAT subsumes propositional Hoare logic, showing that one can reason about the (partial) correctness of while programs by means of the equational theory of KAT. In this work, we investigate the support that KAT provides for reasoning about incorrectness instead, as embodied by O'Hearn's recently proposed incorrectness logic. We show that KAT cannot directly express incorrectness logic. The main reason for this limitation can be traced to the fact that KAT cannot explicitly express the notion of codomain, which is essential for expressing incorrectness triples. To address this issue, we study Kleene Algebra with Top and Tests (TopKAT), an extension of KAT with a top element. We show that TopKAT is powerful enough to express a codomain operation, to express incorrectness triples, and to prove all the rules of incorrectness logic sound. This shows that one can reason about the incorrectness of while-like programs by means of the equational theory of TopKAT.
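To make the contrast concrete, recall Kozen's KAT encoding of Hoare triples and the relational intuition behind incorrectness triples (the TopKAT form below follows the relational reading and may differ cosmetically from the paper's notation):
\[
\{b\}\,p\,\{c\} \iff b\,p\,\overline{c}=0 ,
\]
whereas an incorrectness triple $[b]\,p\,[c]$ asserts that every state satisfying $c$ is reachable by running $p$ from some state satisfying $b$, i.e., $c$ lies below the codomain of $b\,p$. In the relational model, $\top q$ relates every state to every state in the codomain of $q$, so the triple becomes $\top c \le \top\, b\, p$, which uses the top element and thus goes beyond plain KAT.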
Motivated by the unconsolidated data situation and the lack of a standard benchmark in the field, we complement our previous efforts and present a comprehensive corpus designed for training and evaluating text-independent multi-channel speaker verification systems. It can also be readily used for experiments with dereverberation, denoising, and speech enhancement. We tackle the ever-present problem of the lack of multi-channel training data by utilizing data simulation on top of clean parts of the VoxCeleb dataset. The development and evaluation trials are based on a retransmitted Voices Obscured in Complex Environmental Settings (VOiCES) corpus, which we modified to provide multi-channel trials. We publish full recipes that create the dataset, the MultiSV corpus, from public sources, and we provide results with two of our multi-channel speaker verification systems with neural-network-based beamforming based either on predicting ideal binary masks or on the more recent Conv-TasNet.
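A minimal sketch of the multi-channel data simulation idea: convolve a clean utterance with per-microphone room impulse responses and add noise. The impulse responses and noise below are synthetic placeholders for whatever the corpus recipes actually use.

```python
# Sketch of multi-channel data simulation: per-microphone RIR convolution plus
# additive noise. RIRs and noise here are synthetic placeholders.
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
fs = 16000
clean = rng.standard_normal(3 * fs)                  # stand-in for a clean utterance
n_mics = 4
rirs = [np.exp(-np.arange(2000) / 300.0) * rng.standard_normal(2000)
        for _ in range(n_mics)]                      # exponentially decaying fake RIRs

channels = []
for rir in rirs:
    reverberant = fftconvolve(clean, rir)[: len(clean)]
    noise = rng.standard_normal(len(clean)) * 0.05   # additive noise at a fixed level
    channels.append(reverberant + noise)
multichannel = np.stack(channels)                    # shape: (n_mics, n_samples)
print(multichannel.shape)
```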
We consider the task of learning the parameters of a {\em single} component of a mixture model when we are given {\em side information} about that component; we call this the "search problem" in mixture models. We would like to solve this with lower computational and sample complexity than solving the overall original problem, where one learns the parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy and lower computational complexity than existing moment-based mixture model algorithms (e.g., tensor methods). We also illustrate several natural ways one can obtain such side information for specific problem instances. Our experiments on real datasets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms, showing significant improvements in runtime and accuracy.
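As a toy illustration of how informative side information can isolate a single component (a generic weighted-moment estimate, not the paper's matrix-based algorithm): if a per-sample score correlates with membership in the target component, the score-weighted sample mean concentrates on that component's mean.

```python
# Toy example: side information as a per-sample score correlated with membership
# in the target component; the score-weighted mean approximates that component's mean.
import numpy as np

rng = np.random.default_rng(0)
n = 20000
labels = rng.integers(0, 2, n)                 # two-component Gaussian mixture
means = np.array([[-2.0, 0.0], [3.0, 1.0]])
x = means[labels] + rng.standard_normal((n, 2))

# Noisy side information about membership in component 1 (the "target").
side = labels + 0.3 * rng.standard_normal(n)

estimate = (side[:, None] * x).sum(0) / side.sum()
print(estimate, means[1])                      # weighted mean is close to the target mean
```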