Estimating parameter standard errors for semi-variogram models is challenging, given the two-step process required to fit a parametric model to spatially correlated data. Motivated by an application in social epidemiology, we focus on exponential semi-variogram models fitted to data sets of 500 to 2000 observations with little control over the sampling design. Previously proposed methods for estimating standard errors cannot be applied in this context. Approximate closed-form solutions based on generalized least squares are prohibitively expensive in terms of memory. The generalized bootstrap proposed by Olea and Pardo-Igúzquiza remains applicable when weighted least squares is used in place of generalized least squares; however, the resulting standard error estimates are severely biased and imprecise. We therefore propose a filtering method added to the generalized bootstrap. The new development is presented and evaluated in a simulation study, which shows that the generalized bootstrap with check-based filtering yields substantially improved results compared to the quantile-based filter method and previously developed approaches. We provide a case study using birthweight data.
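To make the procedure concrete, the following Python sketch implements a weighted-least-squares exponential variogram fit inside a generalized bootstrap of the Olea and Pardo-Igúzquiza type, with a simple filter applied to the bootstrap replicates. This is a minimal illustration under our own assumptions: the binning choices, the simulation of bootstrap fields from the fitted covariance, and in particular the filtering rule are illustrative stand-ins, not the check-based filter proposed in the paper.

    # Illustrative sketch only: WLS exponential variogram fit inside a
    # generalized bootstrap, with a hypothetical filter on the replicates.
    import numpy as np
    from scipy.optimize import curve_fit
    from scipy.spatial.distance import pdist, squareform

    def exp_variogram(h, nugget, psill, a):
        return nugget + psill * (1.0 - np.exp(-h / a))

    def fit_wls(coords, values, n_bins=15):
        d = pdist(coords)
        g = 0.5 * pdist(values[:, None], metric="sqeuclidean")  # semivariances
        bins = np.linspace(0.0, d.max() / 2.0, n_bins + 1)
        idx = np.digitize(d, bins) - 1
        h, gamma, w = [], [], []
        for b in range(n_bins):
            m = idx == b
            if m.any():
                h.append(d[m].mean()); gamma.append(g[m].mean()); w.append(m.sum())
        h, gamma, w = map(np.asarray, (h, gamma, w))
        # WLS: weight each bin by its pair count
        p, _ = curve_fit(exp_variogram, h, gamma,
                         p0=[0.1, gamma.max(), h.max() / 3.0],
                         sigma=1.0 / np.sqrt(w), bounds=(0.0, np.inf))
        return p

    def generalized_bootstrap_se(coords, values, B=200, seed=0):
        nugget, psill, a = fit_wls(coords, values)
        dmat = squareform(pdist(coords))
        # covariance implied by the fitted exponential model
        cov = psill * np.exp(-dmat / a) + nugget * np.eye(len(values))
        L = np.linalg.cholesky(cov + 1e-10 * np.eye(len(values)))
        rng = np.random.default_rng(seed)
        boot = []
        for _ in range(B):
            z = values.mean() + L @ rng.standard_normal(len(values))
            try:
                boot.append(fit_wls(coords, z))
            except RuntimeError:  # discard non-converged fits
                continue
        boot = np.asarray(boot)
        # hypothetical filter: drop replicates with an implausible range
        keep = boot[:, 2] < dmat.max()
        return boot[keep].std(axis=0)  # standard errors of (nugget, psill, a)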
Training deep learning models for video classification from audio-visual data commonly requires immense amounts of labeled training data collected via a costly process. A challenging and underexplored, yet much cheaper, setup is few-shot learning from video data. In particular, the inherently multi-modal nature of video data, with sound and visual information, has not been leveraged extensively for the few-shot video classification task. Therefore, we introduce a unified audio-visual few-shot video classification benchmark on three datasets, i.e. VGGSound-FSL, UCF-FSL, and ActivityNet-FSL, where we adapt and compare ten methods. In addition, we propose AV-DIFF, a text-to-feature diffusion framework, which first fuses the temporal and audio-visual features via cross-modal attention and then generates multi-modal features for the novel classes. We show that AV-DIFF obtains state-of-the-art performance on our proposed benchmark for audio-visual (generalised) few-shot learning. Our benchmark paves the way for effective audio-visual classification when only limited labeled data is available. Code and data are available at https://github.com/ExplainableML/AVDIFF-GFSL.
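As a rough sketch of what cross-modal attention fusion of temporal audio and visual features can look like, consider the generic PyTorch module below; the module names, dimensions, and pooling scheme are our assumptions, not AV-DIFF's actual architecture.

    # Generic sketch of cross-modal attention fusion; not AV-DIFF's design.
    import torch
    import torch.nn as nn

    class CrossModalFusion(nn.Module):
        def __init__(self, dim: int, n_heads: int = 4):
            super().__init__()
            self.a2v = nn.MultiheadAttention(dim, n_heads, batch_first=True)
            self.v2a = nn.MultiheadAttention(dim, n_heads, batch_first=True)

        def forward(self, audio: torch.Tensor, video: torch.Tensor) -> torch.Tensor:
            # audio: (B, T_a, dim), video: (B, T_v, dim) temporal features
            a, _ = self.a2v(audio, video, video)  # audio queries attend to video
            v, _ = self.v2a(video, audio, audio)  # video queries attend to audio
            # pool over time and concatenate into one fused representation
            return torch.cat([a.mean(dim=1), v.mean(dim=1)], dim=-1)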
Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data. Paradoxically, empirical evidence indicates that even systems which are robust to large random perturbations of the input data remain susceptible to small, easily constructed, adversarial perturbations of their inputs. Here, we show that this may be seen as a fundamental feature of classifiers working with high-dimensional input data. We introduce a simple, generic, and generalisable framework in which the key behaviours observed in practical systems arise with high probability -- notably the simultaneous susceptibility of the (otherwise accurate) model to easily constructed adversarial attacks, and robustness to random perturbations of the input data. We confirm that the same phenomena are directly observed in practical neural networks trained on standard image classification problems, where even large additive random noise fails to trigger the adversarial instability of the network. A surprising takeaway is that even small margins separating a classifier's decision surface from training and testing data can hide adversarial susceptibility from being detected using randomly sampled perturbations. Counterintuitively, using additive noise during training or testing is therefore inefficient for eradicating or detecting adversarial examples, and more demanding adversarial training is required.
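The dimension effect behind this observation can be illustrated with a textbook concentration argument (our illustration, not the paper's formal framework). Suppose a point $x \in \mathbb{R}^d$ is correctly classified with margin $m$ to a locally linear decision boundary with unit normal $w$. An adversarial perturbation of norm just above $m$ aligned with $w$ always flips the decision, whereas a random perturbation $\eta$ drawn uniformly from the sphere of radius $\varepsilon$ has a component along $w$ of typical size $\varepsilon/\sqrt{d}$, and for $m \le \varepsilon$ the spherical cap bound gives

$$\Pr\big(\langle \eta, w \rangle > m\big) \le \exp\!\left(-\frac{d\, m^2}{2\varepsilon^2}\right),$$

so random noise needs norm of order $m\sqrt{d}$ before boundary crossings become likely. In high dimension, even a small margin $m$ thus makes random perturbation testing exponentially unlikely to expose the instability.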
A novel, fully distributed optimization method is proposed for the distributed robust convex program (DRCP) over a time-varying unbalanced directed network, without imposing any differentiability assumptions. Firstly, a tractable approximated DRCP (ADRCP) is introduced by discretizing the semi-infinite constraints into a finite number of inequality constraints and restricting the right-hand side of the constraints by a proper positive parameter; this problem is solved iteratively by a random-fixed projection algorithm. Secondly, a cutting-surface consensus approach is proposed for locating an approximately optimal consensus solution of the DRCP with guaranteed feasibility. This approach iteratively approximates the DRCP by successively reducing the restriction parameter of the right-hand constraints and adding the cutting-surfaces to the existing finite set of constraints. Thirdly, to ensure finite-time convergence of the distributed optimization, a distributed termination algorithm is developed based on uniformly local consensus and zeroth-order optimality under uniformly strongly connected graphs. Fourthly, it is proved that the cutting-surface consensus approach converges within a finite number of iterations. Finally, the effectiveness of the approach is illustrated through a numerical example.
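Schematically, and only as a sketch of how we read the abstract (the three callables below are hypothetical placeholders for the local ADRCP solver, the separation oracle over the semi-infinite constraint set, and the distributed termination test, not the paper's algorithm or notation), the outer loop interleaves solving the restricted problem, adding cutting-surfaces, shrinking the restriction parameter, and testing termination:

    # Schematic outer loop; all three callables are hypothetical placeholders.
    def cutting_surface_consensus(solve_adrcp, violated_scenarios,
                                  consensus_reached, eps0=1.0,
                                  shrink=0.5, tol=1e-6):
        cuts = set()  # finite set of discretized constraints accumulated so far
        eps = eps0    # restriction parameter on the constraint right-hand side
        x = None
        while eps > tol:
            # solve the restricted, discretized problem (e.g. by a random-fixed
            # projection algorithm, exchanging iterates with neighbours)
            x = solve_adrcp(cuts, eps)
            new_cuts = violated_scenarios(x)  # returns a set of cutting-surfaces
            if new_cuts:
                cuts |= new_cuts              # populate the constraint set
            else:
                eps *= shrink                 # tighten toward original feasibility
            if consensus_reached(x):          # finite-time distributed termination
                break
        return x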
StreamBed is a capacity planning system for stream processing. It predicts, ahead of any production deployment, the resources that a query will require to process an incoming data rate sustainably, and the appropriate configuration of these resources. StreamBed builds a capacity planning model by piloting a series of runs of the target query in a small-scale, controlled testbed. We implement StreamBed for the popular Flink DSP engine. Our evaluation with large-scale queries of the Nexmark benchmark demonstrates that StreamBed can effectively and accurately predict capacity requirements for jobs spanning more than 1,000 cores using a testbed of only 48 cores.
We consider disjoint and sliding blocks estimators of cluster indices for multivariate, regularly varying time series in the Peaks-over-Threshold framework. We aim to provide a complete description of the limiting behaviour of these estimators. This is achieved via a precise expansion of the difference between the sliding and the disjoint blocks statistics. The rates in the expansion stem from internal clusters and boundary clusters. To obtain these rates, we extend the existing results on vague convergence of cluster measures. We reveal a dichotomy between the small-blocks and large-blocks scenarios.
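For orientation, the two estimators have the generic form (standard notation, not the paper's specific cluster functionals): for a block length $b_n$ and a statistic $f$ acting on blocks of observations,

$$\hat{c}_n^{\mathrm{dj}} = \frac{1}{\lfloor n/b_n \rfloor}\sum_{i=1}^{\lfloor n/b_n \rfloor} f\big(X_{(i-1)b_n+1},\dots,X_{i b_n}\big), \qquad \hat{c}_n^{\mathrm{sl}} = \frac{1}{n-b_n+1}\sum_{i=1}^{n-b_n+1} f\big(X_i,\dots,X_{i+b_n-1}\big),$$

and the expansion referred to above concerns the difference $\hat{c}_n^{\mathrm{sl}} - \hat{c}_n^{\mathrm{dj}}$, with contributions separated into clusters lying inside a single disjoint block (internal) and clusters straddling block boundaries (boundary).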
Bayesian graphical models are powerful tools to infer complex relationships in high dimension, yet are often fraught with computational and statistical challenges. If exploited in a principled way, the increasing information collected alongside the data of primary interest constitutes an opportunity to mitigate these difficulties by guiding the detection of dependence structures. For instance, gene network inference may be informed by the use of publicly available summary statistics on the regulation of genes by genetic variants. Here we present a novel Gaussian graphical modelling framework to identify and leverage information on the centrality of nodes in conditional independence graphs. Specifically, we consider a fully joint hierarchical model to simultaneously infer (i) sparse precision matrices and (ii) the relevance of node-level information for uncovering the sought-after network structure. We encode such information as candidate auxiliary variables using a spike-and-slab submodel on the propensity of nodes to be hubs, which allows hypothesis-free selection and interpretation of a sparse subset of relevant variables. As efficient exploration of large posterior spaces is needed for real-world applications, we develop a variational expectation conditional maximisation algorithm that scales inference to hundreds of samples, nodes and auxiliary variables. We illustrate and exploit the advantages of our approach in simulations and in a gene network study which identifies hub genes involved in biological pathways relevant to immune-mediated diseases.
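In generic notation, one plausible reading of the hierarchy (our sketch; the paper's exact parameterization may differ) couples a spike-and-slab prior on the off-diagonal precision entries with a node-level hub-propensity model informed by auxiliary variables:

$$\omega_{jk}\mid \delta_{jk} \sim \delta_{jk}\,\mathcal{N}(0,v_1) + (1-\delta_{jk})\,\mathcal{N}(0,v_0), \qquad \delta_{jk}\mid \rho_{jk} \sim \mathrm{Bernoulli}(\rho_{jk}),$$
$$\Phi^{-1}(\rho_{jk}) = \zeta + \theta_j + \theta_k, \qquad \theta_j = \sum_{l} x_{jl}\,\beta_l, \qquad \beta_l \mid \gamma_l \sim \gamma_l\,\mathcal{N}(0,\sigma^2) + (1-\gamma_l)\,\delta_0,$$

where $\theta_j$ captures the propensity of node $j$ to be a hub, the $x_{jl}$ are candidate auxiliary variables, and the binary indicators $\gamma_l$ perform hypothesis-free selection of the relevant ones.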
The objective of the multi-condition human motion synthesis task is to incorporate diverse conditional inputs, encompassing various forms such as text, music, speech, and more. This endows the task with the capability to adapt across multiple scenarios, ranging from text-to-motion to music-to-dance, among others. While existing research has primarily focused on single conditions, multi-condition human motion generation remains underexplored. In this paper, we address these challenges by introducing MCM, a novel paradigm for motion synthesis that spans multiple scenarios under diverse conditions. The MCM framework can be integrated with any DDPM-like diffusion model to accommodate multi-conditional inputs while preserving its generative capabilities. Specifically, MCM employs a two-branch architecture consisting of a main branch and a control branch. The control branch shares the same structure as the main branch and is initialized with the parameters of the main branch, effectively maintaining the generation ability of the main branch while supporting multi-condition input. We also introduce MWNet, a Transformer-based (DDPM-like) diffusion model used as our main branch, which captures the spatial complexity and inter-joint correlations in motion sequences through a channel-dimension self-attention module. Quantitative comparisons demonstrate that our approach achieves state-of-the-art results on text-to-motion and competitive results on music-to-dance, comparable to task-specific methods. Furthermore, the qualitative evaluation shows that MCM not only streamlines the adaptation of methodologies originally designed for text-to-motion to domains like music-to-dance and speech-to-gesture, eliminating the need for extensive network re-configurations, but also enables effective multi-condition modal control, realizing "once trained is motion need".
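The control-branch initialization described above (a structural copy of the main branch, initialized with its weights) is reminiscent of ControlNet-style conditioning. Below is a minimal PyTorch sketch under our own assumptions: the call signature of the main branch, the conditioning pathway, and the zero-initialized output projection (a common trick in such designs, not stated in the abstract) are illustrative, not MCM's exact design.

    # Minimal sketch of the two-branch idea; names and signatures assumed.
    import copy
    import torch
    import torch.nn as nn

    class TwoBranchDiffusion(nn.Module):
        def __init__(self, main_branch: nn.Module, cond_dim: int, model_dim: int):
            super().__init__()
            self.main = main_branch
            self.control = copy.deepcopy(main_branch)  # same structure and weights
            self.cond_proj = nn.Linear(cond_dim, model_dim)
            self.zero_proj = nn.Linear(model_dim, model_dim)
            nn.init.zeros_(self.zero_proj.weight)  # control output starts at zero,
            nn.init.zeros_(self.zero_proj.bias)    # preserving the main branch

        def forward(self, x, t, cond):
            # x: (B, T, model_dim) noisy motion features, t: diffusion step,
            # cond: (B, T, cond_dim) extra condition (e.g. music, speech)
            h = self.control(x + self.cond_proj(cond), t)
            return self.main(x, t) + self.zero_proj(h)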
We consider network-based decentralized optimization problems, where each node in the network possesses a local function and the objective is to collectively attain a consensus solution that minimizes the sum of all the local functions. A major challenge in decentralized optimization is the reliance on communication, which remains a considerable bottleneck in many applications. To address this challenge, we propose an adaptive, randomized, communication-efficient algorithmic framework that reduces the volume of communication by periodically tracking the disagreement error and judiciously selecting the most influential and effective edges at each node for communication. Within this framework, we present two algorithms: Adaptive Consensus (AC), to solve the consensus problem, and Adaptive Consensus based Gradient Tracking (AC-GT), to solve smooth strongly convex decentralized optimization problems. We establish strong theoretical convergence guarantees for the proposed algorithms and quantify their performance in terms of various algorithmic parameters under standard assumptions. Finally, numerical experiments showcase the effectiveness of the framework in significantly reducing the information exchange required to achieve a consensus solution.
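As a schematic illustration of the framework's core idea, the sketch below periodically measures each node's disagreement with its neighbours and keeps only the most "influential" edges active; the selection rule shown (largest current disagreement) and all parameters are our illustrative stand-ins, not the paper's criterion.

    # Schematic sketch of adaptive, disagreement-driven edge selection.
    import numpy as np

    def adaptive_consensus(x, W, n_iters=100, period=5, k_edges=2):
        """x: (n_nodes, dim) local states; W: (n, n) symmetric mixing weights."""
        n = x.shape[0]
        active = {i: np.nonzero(W[i])[0] for i in range(n)}
        for t in range(n_iters):
            if t % period == 0:  # periodically re-select edges
                for i in range(n):
                    nbrs = np.nonzero(W[i])[0]
                    nbrs = nbrs[nbrs != i]
                    gaps = np.linalg.norm(x[nbrs] - x[i], axis=1)
                    # keep the k edges with the largest disagreement
                    active[i] = nbrs[np.argsort(gaps)[-k_edges:]]
            x_new = x.copy()
            for i in range(n):
                for j in active[i]:
                    x_new[i] += W[i, j] * (x[j] - x[i])  # consensus step
            x = x_new
        return x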
Binary responses arise in a multitude of statistical problems, including binary classification, bioassay, current status data problems, and sensitivity estimation. There has been interest in such problems in the Bayesian nonparametrics community since the early 1970s, but inference given binary data is intractable for a wide range of modern simulation-based models, even when MCMC methods are employed. Recently, Christensen (2023) introduced a novel simulation technique based on counting permutations, which can estimate both posterior distributions and marginal likelihoods for any model from which a random sample can be generated. However, the accompanying implementation of this technique struggles when the sample size is large (n > 250). Here we present perms, a new implementation of the technique that is substantially faster and able to handle larger data problems than the original implementation. It is available both as an R package and as a Python library. The basic usage of perms is illustrated via two simple examples: a tractable toy problem and a bioassay problem. A more complex example involving changepoint analysis is also considered. We also cover the details of the implementation and illustrate the computational speed gain of perms via a simple simulation study.
Within the applications of spatial point processes, it is increasingly common for events to be labeled by marks, prompting analyses that go beyond the spatial distribution of events by incorporating the marks. In this paper, we first consider marked spatial point processes in $\mathbb{R}^2$, where marks are either integer-valued, real-valued, or object-valued, and review the state of the art for analyzing the spatial structure and the type of interaction/correlation between marks. More specifically, we review cross/dot-type summary characteristics, mark-weighted summary characteristics, various mark correlation functions, and frequency domain approaches. Second, we propose novel cross/dot-type higher-order summary characteristics, mark-weighted summary characteristics, and mark correlation functions for marked point processes on linear networks. Through a simulation study, we show that ignoring the underlying network gives rise to erroneous conclusions about the interaction/correlation between marks. Finally, we consider two applications: the locations of two types of butterflies in Melbourne, Australia, and the locations of public trees along the street network of Vancouver, Canada, where trees are labeled by their diameters at breast height.
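For real-valued marks, a central example among the characteristics reviewed above is the classical mark correlation function,

$$k_{mm}(r) = \frac{\mathbb{E}_{x,y}\big[m(x)\,m(y)\big]}{\mu^2}, \qquad \mu = \mathbb{E}\big[m(x)\big],$$

where the expectation in the numerator is taken over pairs of points a distance $r$ apart and $k_{mm}(r) = 1$ indicates the absence of mark correlation at that scale; in the network setting, the Euclidean interpoint distance $r$ is replaced by a suitable network metric (e.g. the shortest-path distance).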