StreamBed is a capacity planning system for stream processing. It predicts, ahead of any production deployment, the resources that a query will require to sustainably process an incoming data rate, as well as the appropriate configuration of these resources. StreamBed builds a capacity planning model by piloting a series of runs of the target query in a small-scale, controlled testbed. We implement StreamBed for the popular Flink data stream processing (DSP) engine. Our evaluation with large-scale queries of the Nexmark benchmark demonstrates that StreamBed can effectively and accurately predict capacity requirements for jobs spanning more than 1,000 cores using a testbed of only 48 cores.
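To make the extrapolation idea concrete, here is a minimal, hypothetical sketch of capacity planning from small-scale pilot runs: fit a simple scaling law to the maximum sustainable rates observed at a few testbed parallelism levels, then invert it to size a production deployment. The power-law form, the function names, and all numbers are our illustrative assumptions, not StreamBed's actual model.

```python
import numpy as np

def fit_capacity_model(cores, max_rates):
    """Fit a toy scaling law, rate ~ a * cores**b, by least squares
    in log-log space over small-scale pilot runs."""
    b, log_a = np.polyfit(np.log(cores), np.log(max_rates), 1)
    return np.exp(log_a), b

def cores_for_rate(target_rate, a, b):
    """Invert the fitted law to predict how many cores are needed
    to sustain a target production input rate."""
    return int(np.ceil((target_rate / a) ** (1.0 / b)))

# Pilot runs on a 48-core testbed (illustrative numbers):
a, b = fit_capacity_model([4, 8, 16, 48], [2.1e5, 4.0e5, 7.6e5, 2.0e6])
print(cores_for_rate(4.0e7, a, b))  # predicted size of a much larger job
```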
A significant limitation of the LTE-V2X and NR-V2X sidelink scheduling mechanisms is their difficulty in coping with variations in inter-packet arrival times, i.e., aperiodic packets. This conflicts with the fundamental characteristics of most V2X services, which are event-triggered: e.g., ETSI Cooperative Awareness Messages (CAMs), triggered by vehicle kinematics; Cooperative Perception Messages (CPMs), by object sensing; and Decentralised Environmental Notification Messages (DENMs), by event occurrences. Furthermore, network management techniques such as congestion control mechanisms can also produce varied inter-packet arrival times. To address this, NR-V2X introduced a dynamic grant mechanism, which we show is ineffective unless there is background periodic traffic to stabilise the sensing history upon which the scheduler makes its decisions. The characteristics of V2X services make it implausible that such periodic application traffic will exist. To overcome this significant drawback, we demonstrate that the standardised scheduling algorithms can be made effective if the event-triggered packet arrival rate can be accurately predicted. These predictions can be used to tune the Resource Reservation Interval (RRI) parameter of the MAC scheduler to mitigate the impact of aperiodicity, allowing the scheduler to achieve performance comparable to a scenario where packets arrive periodically. To demonstrate the effectiveness of our approach, we devise an ML model for the prediction of Cooperative Awareness Messages; the same principle can be generalised to other V2X service types.
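As a rough sketch of this idea (the regressor choice, feature set, and candidate RRI values below are our assumptions; the RRIs actually allowed depend on the configured sidelink resource pool), the predicted inter-arrival time can simply be snapped to the nearest reservable interval:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical candidate RRI values in milliseconds.
CANDIDATE_RRIS = np.array([20, 50, 100, 200, 300, 500, 1000])

def fit_arrival_model(kinematics, inter_arrival_ms):
    """Fit a regressor from vehicle kinematics features (e.g., speed,
    heading change) to the next CAM inter-arrival time."""
    return GradientBoostingRegressor().fit(kinematics, inter_arrival_ms)

def select_rri(model, current_kinematics):
    """Predict the next inter-arrival time and choose the closest
    candidate RRI, so reservations line up with aperiodic arrivals."""
    predicted_ms = model.predict(current_kinematics.reshape(1, -1))[0]
    return CANDIDATE_RRIS[np.argmin(np.abs(CANDIDATE_RRIS - predicted_ms))]
```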
Triggerless Data Acquisition Systems (DAQs) require transmitting the data stream from multiple links to the processing node. The short input data words must be concentrated and packed into the longer bit vectors used by the output interface (e.g., PCI Express). In that process, unneeded data must be eliminated and a dense stream of useful DAQ data created, while preserving the time order of the data. This paper presents a new solution for high-speed data routing based on the Baseline Network with Reversed Outputs (BNRO). A thorough analysis of the network's operation enabled increased scalability compared to the previously published concentrator based on an 8x8 network. The solution can be scaled by adding layers to the BNRO network while minimizing resource consumption. Simulations were performed for 4 and 5 layers (16 and 32 inputs), and the FPGA implementation was successfully tested in actual hardware for 16 inputs. Pipeline registers may be added to each layer independently, shortening the critical path and increasing the maximum achievable clock frequency.
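Functionally, the concentrator's task reduces to order-preserving stream compaction followed by packing. The toy sketch below only models what must be computed, not how the BNRO network routes the words in parallel hardware:

```python
def concentrate(words, valid):
    """Drop unneeded words while preserving the time order of the
    useful DAQ data (the concentration step)."""
    return [w for w, v in zip(words, valid) if v]

def pack(dense_words, width):
    """Group the dense stream into fixed-width output vectors, as
    required by a wide output interface such as PCI Express."""
    return [dense_words[i:i + width] for i in range(0, len(dense_words), width)]

# e.g., 16 input links, of which only some carry useful data:
vectors = pack(concentrate(range(16), [i % 3 == 0 for i in range(16)]), 4)
```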
We address the problem of testing the conditional mean and conditional variance of non-stationary data. We build e-values and p-values for four types of non-parametric composite hypotheses with specified mean and variance, as well as other conditions on the shape of the data-generating distribution; these shape conditions include symmetry, unimodality, and their combination. Using the obtained e-values and p-values, we construct tests via e-processes, an approach also known as testing by betting, as well as tests based on combining p-values. Simulation and empirical studies are conducted for several settings of the null hypotheses, and they show that the methods based on e-processes are efficient.
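For intuition, one standard betting construction for a conditional-mean null is the following (illustrative only; the paper's e-processes also encode variance and shape constraints). Under $H_0: \mathbb{E}[X_t \mid \mathcal{F}_{t-1}] = \mu$, define the wealth process
$$M_0 = 1, \qquad M_t = \prod_{i=1}^{t} \bigl(1 + \lambda_i (X_i - \mu)\bigr),$$
where each bet $\lambda_i$ is predictable (chosen before observing $X_i$) and constrained so that every factor is nonnegative. Under $H_0$, $(M_t)$ is a nonnegative supermartingale with $\mathbb{E}[M_t] \le 1$, so Ville's inequality gives $\mathbb{P}(\sup_t M_t \ge 1/\alpha) \le \alpha$: rejecting once $M_t$ exceeds $1/\alpha$ is valid at any data-dependent stopping time, which is what makes e-processes attractive for non-stationary data.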
Ground settlement prediction during mechanized tunneling is of paramount importance and remains a challenging research topic. Typically, two paradigms exist: a physics-driven approach utilizing process-oriented computational simulation models of the tunnel-soil interaction for settlement prediction, and a data-driven approach employing machine learning techniques to establish mappings between influencing factors and the ground settlement. To integrate the advantages of both approaches and to assimilate data from different sources, we propose a multi-fidelity deep operator network (DeepONet) framework, leveraging recently developed operator learning methods. The presented framework comprises two components: a low-fidelity subnet that captures the fundamental ground settlement patterns obtained from finite element simulations, and a high-fidelity subnet that learns the nonlinear correlation between the numerical models and real engineering monitoring data. A pre-processing strategy is adopted to respect causality and the spatio-temporal characteristics of the settlement during tunnel excavation. Transfer learning is utilized to reduce the training cost of the low-fidelity subnet. The results show that the proposed method effectively captures the physical information provided by the numerical simulations while also accurately fitting measured data. Remarkably, even with very limited, noisy monitoring data, the proposed model achieves rapid, accurate, and robust predictions of the full-field ground settlement in real time during mechanized tunnel excavation.
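To make the two-component architecture concrete, here is a minimal PyTorch sketch with illustrative layer sizes; the residual coupling between the low-fidelity subnet and the high-fidelity correction is our assumption and may differ from the paper's exact construction.

```python
import torch
import torch.nn as nn

def mlp(sizes):
    layers = []
    for i in range(len(sizes) - 1):
        layers += [nn.Linear(sizes[i], sizes[i + 1]), nn.Tanh()]
    return nn.Sequential(*layers[:-1])  # no activation on the output

class DeepONet(nn.Module):
    """Vanilla DeepONet: a branch net encodes the input function
    (e.g., excavation parameters sampled at sensor locations), a trunk
    net encodes the query coordinate (x, t); the output is their
    inner product."""
    def __init__(self, n_sensors, coord_dim, p=64):
        super().__init__()
        self.branch = mlp([n_sensors, 128, 128, p])
        self.trunk = mlp([coord_dim, 128, 128, p])

    def forward(self, u, y):
        return (self.branch(u) * self.trunk(y)).sum(-1, keepdim=True)

class MultiFidelityDeepONet(nn.Module):
    """Low-fidelity subnet learns the FE-simulated settlement field;
    a small correction net learns the residual between simulation and
    monitoring data from the coordinate and the low-fidelity output."""
    def __init__(self, n_sensors, coord_dim):
        super().__init__()
        self.low = DeepONet(n_sensors, coord_dim)   # pre-trained on FEM data
        self.correction = mlp([coord_dim + 1, 64, 64, 1])

    def forward(self, u, y):
        s_low = self.low(u, y)
        return s_low + self.correction(torch.cat([y, s_low], dim=-1))
```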
We derive the asymptotic properties of a class of binary network models with a general link function under differential privacy with sub-Gamma noise. We release the degree sequences of the binary networks via a general noisy mechanism, which includes the discrete Laplace mechanism as a special case. We establish both the consistency and the asymptotic normality of the parameter estimator as the number of parameters goes to infinity in this class of network models. Simulations and a real data example are provided to illustrate the asymptotic results.
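For concreteness, here is a small sketch of the special case mentioned above: releasing a degree sequence under discrete Laplace noise. The calibration of epsilon to the sensitivity of the degree sequence, which the privacy guarantee requires, is omitted.

```python
import numpy as np

def discrete_laplace(size, epsilon, rng):
    """Sample discrete Laplace noise as the difference of two i.i.d.
    geometric variables, so that P(Z = z) is proportional to
    exp(-epsilon * |z|)."""
    p = 1.0 - np.exp(-epsilon)
    return rng.geometric(p, size) - rng.geometric(p, size)

def release_degrees(adjacency, epsilon, rng=None):
    """Release a noisy degree sequence of a binary network."""
    rng = rng or np.random.default_rng()
    degrees = adjacency.sum(axis=1)
    return degrees + discrete_laplace(len(degrees), epsilon, rng)
```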
The joint modeling of multiple longitudinal biomarkers together with a time-to-event outcome is a challenging modeling task of continued scientific interest. In particular, the computational complexity of high-dimensional (generalized) mixed effects models often restricts the flexibility of shared parameter joint models, even when the subject-specific marker trajectories follow highly nonlinear courses. We propose a parsimonious multivariate functional principal components representation of the shared random effects. This allows for better scalability, as the dimension of the random effects does not directly increase with the number of markers, but only with the chosen number of principal component basis functions used in the approximation of the random effects. The functional principal component representation additionally allows the estimation of highly flexible subject-specific random trajectories without parametric assumptions, so the modeled trajectories can be distinctly different for each biomarker. We build on the framework of flexible Bayesian additive joint models implemented in the R package 'bamlss', which also supports the estimation of nonlinear covariate effects via Bayesian P-splines. The flexible yet parsimonious functional principal component basis used in the estimation of the joint model is obtained in a preliminary step. We validate our approach in a simulation study and illustrate its advantages by analyzing a study on primary biliary cholangitis.
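Schematically, in our own illustrative notation, the shared functional principal components representation for marker $k$ of subject $i$ is
$$\eta_{ik}(t) = \mu_k(t, x_i) + \sum_{m=1}^{M} \rho_{im}\, \psi_{km}(t),$$
with the same subject-specific scores $\rho_{i1}, \dots, \rho_{iM}$ entering all $K$ markers through marker-specific eigenfunctions $\psi_{km}$, and a hazard of the shared-parameter form
$$h_i(t) = h_0(t) \exp\!\Bigl(\gamma^\top w_i + \sum_{k=1}^{K} \alpha_k\, \eta_{ik}(t)\Bigr).$$
The random-effect dimension is thus $M$, the number of basis functions, rather than growing with $K$.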
With the remarkable progress that technology has made, the need for processing data near the sensors at the edge has increased dramatically. The electronic systems used in these applications must process data continuously, in real time, and extract relevant information using the smallest possible energy budgets. A promising approach for implementing always-on processing of sensory signals that supports on-demand, sparse edge computing is to take inspiration from biological nervous systems. Following this approach, we present a brain-inspired platform for prototyping real-time, event-based Spiking Neural Networks (SNNs). The proposed system supports the direct emulation of dynamic and realistic neural processing phenomena such as short-term plasticity, NMDA gating, AMPA diffusion, homeostasis, spike frequency adaptation, conductance-based dendritic compartments, and spike transmission delays. The analog circuits that implement these primitives are paired with low-latency asynchronous digital circuits for routing and mapping events. This asynchronous infrastructure enables the definition of different network architectures and provides direct event-based interfaces to convert and encode data from event-based and continuous-signal sensors. Here we describe the overall system architecture, characterize the mixed-signal analog-digital circuits that emulate neural dynamics, demonstrate their features with experimental measurements, and present a low- and high-level software ecosystem that can be used to configure the system. The flexibility to emulate different biologically plausible neural networks, and the chip's ability to monitor both population and single-neuron signals in real time, make it possible to develop and validate complex models of neural processing for both basic research and edge-computing applications.
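As one canonical example of the dynamics such platforms emulate, an adaptive leaky integrate-and-fire model (illustrative; not the chip's exact circuit equations) captures spike frequency adaptation:
$$\tau \frac{du}{dt} = -u + R\, I_{\mathrm{syn}}(t) - R\, w(t), \qquad \tau_w \frac{dw}{dt} = -w,$$
with $u \leftarrow u_{\mathrm{reset}}$ and $w \leftarrow w + b$ whenever $u$ crosses the threshold $\vartheta$; the slowly decaying adaptation current $w$ makes the firing rate decrease under sustained input, as the analog neuron circuits do.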
Recent advances have shown that Gaussian process (GP) priors, or their finite realisations, can be encoded using deep generative models such as variational autoencoders (VAEs). These learned generators can serve as drop-in replacements for the original priors during MCMC inference. While this approach enables efficient inference, it loses information about the hyperparameters of the original models, consequently making inference over hyperparameters impossible and the learned priors indistinct. To overcome this limitation, we condition the VAE on stochastic process hyperparameters. This allows the joint encoding of hyperparameters with GP realisations and their subsequent estimation during inference. Further, we demonstrate that our proposed method, PriorCVAE, is agnostic to the nature of the models it approximates and can be used, for instance, to encode solutions of ODEs. It provides a practical tool for approximate inference and shows potential in real-life spatial and spatiotemporal applications.
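Conceptually, the conditioning amounts to feeding the hyperparameters to both the encoder and the decoder, so that a latent draw together with, say, a lengthscale decodes into a realisation of the corresponding GP. A minimal PyTorch sketch with illustrative sizes and a single lengthscale hyperparameter (the published implementation may differ):

```python
import torch
import torch.nn as nn

class PriorCVAE(nn.Module):
    """Sketch of a conditional VAE over GP draws: encoder and decoder
    both receive the kernel hyperparameter, so sampling z ~ N(0, I)
    plus a lengthscale reproduces draws from the matching GP prior."""
    def __init__(self, n_points, latent_dim=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_points + 1, 128), nn.ReLU(),
                                 nn.Linear(128, 2 * latent_dim))
        self.dec = nn.Sequential(nn.Linear(latent_dim + 1, 128), nn.ReLU(),
                                 nn.Linear(128, n_points))

    def forward(self, f, lengthscale):
        h = self.enc(torch.cat([f, lengthscale], dim=-1))
        mu, logvar = h.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparam.
        f_hat = self.dec(torch.cat([z, lengthscale], dim=-1))
        return f_hat, mu, logvar
```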
In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation: the number of interactions is limited by resource constraints such as limits on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents; however, this also increases the resource cost of communication and synchronisation, and remains difficult to scale. In this paper we present four algorithms to solve these problems. Their combination enables each agent to improve its task allocation strategy through reinforcement learning, while varying how much it explores the system in response to how optimal it believes its current strategy to be, given its past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, restricting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, which then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to within 6.7% of the theoretical optimum within the system configurations considered. It provides 5x better performance recovery than approaches without knowledge retention when system connectivity is impacted, and is tested on systems of up to 100 agents with less than a 9% impact on the algorithms' performance.
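To illustrate the exploration-adaptation idea, here is a minimal sketch of one agent; the Q-learning formulation and the confidence heuristic are our illustrative assumptions, not the four algorithms of the paper.

```python
import random
from collections import defaultdict

class AllocatorAgent:
    """One agent: Q-learning over (subtask type, peer) pairs, with an
    exploration rate scaled by how accurate the agent's recent value
    predictions have been (assumes rewards roughly in [0, 1])."""
    def __init__(self, peers, alpha=0.1, eps_max=0.5):
        self.q = defaultdict(float)
        self.peers, self.alpha, self.eps_max = peers, alpha, eps_max
        self.recent_errors = []  # |observed reward - predicted value|

    def epsilon(self):
        """Explore more while value predictions are still inaccurate."""
        window = self.recent_errors[-50:]
        if not window:
            return self.eps_max
        return min(self.eps_max, sum(window) / len(window))

    def choose(self, subtask):
        if random.random() < self.epsilon():
            return random.choice(self.peers)
        return max(self.peers, key=lambda p: self.q[(subtask, p)])

    def update(self, subtask, peer, reward):
        key = (subtask, peer)
        self.recent_errors.append(abs(reward - self.q[key]))
        self.q[key] += self.alpha * (reward - self.q[key])
```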
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data augmentation and optional distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library, together with pre-trained models.
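The core block is easy to state in code; the following PyTorch sketch follows the description above (the published model additionally uses per-layer scaling in the residual branches, which we omit for brevity):

```python
import torch
import torch.nn as nn

class Affine(nn.Module):
    """ResMLP replaces normalization layers with a learnable
    per-channel affine transform (no batch statistics)."""
    def __init__(self, dim):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(dim))
        self.beta = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        return self.alpha * x + self.beta

class ResMLPBlock(nn.Module):
    """One residual block: (i) a linear layer mixing the N patches,
    applied identically to every channel, then (ii) a two-layer MLP
    mixing the C channels, applied independently to every patch."""
    def __init__(self, n_patches, dim, expansion=4):
        super().__init__()
        self.affine1, self.affine2 = Affine(dim), Affine(dim)
        self.patch_mix = nn.Linear(n_patches, n_patches)
        self.channel_mix = nn.Sequential(
            nn.Linear(dim, expansion * dim), nn.GELU(),
            nn.Linear(expansion * dim, dim))

    def forward(self, x):  # x: (batch, n_patches, dim)
        x = x + self.patch_mix(self.affine1(x).transpose(1, 2)).transpose(1, 2)
        x = x + self.channel_mix(self.affine2(x))
        return x
```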