亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We provide a unified operational framework for the study of causality, non-locality and contextuality, in a fully device-independent and theory-independent setting. Our work has its roots in the sheaf-theoretic framework for contextuality by Abramsky and Brandenburger, which it extends to include arbitrary causal orders (be they definite, dynamical or indefinite). We define a notion of causal function for arbitrary spaces of input histories, and we show that the explicit imposition of causal constraints on joint outputs is equivalent to the free assignment of local outputs to the tip events of input histories. We prove factorisation results for causal functions over parallel, sequential, and conditional sequential compositions of the underlying spaces. We prove that causality is equivalent to continuity with respect to the lowerset topology on the underlying spaces, and we show that partial causal functions defined on open sub-spaces can be bundled into a presheaf. In a striking departure from the Abramsky-Brandenburger setting, however, we show that causal functions fail, under certain circumstances, to form a sheaf. We define empirical models as compatible families in the presheaf of probability distributions on causal functions, for arbitrary open covers of the underlying space of input histories. We show the existence of causally-induced contextuality, a phenomenon arising when the causal constraints themselves become context-dependent, and we prove a no-go result for non-locality on total orders, both static and dynamical.

相關內容

The rise of datathons, also known as data or data science hackathons, has provided a platform to collaborate, learn, and innovate in a short timeframe. Despite their significant potential benefits, organizations often struggle to effectively work with data due to a lack of clear guidelines and best practices for potential issues that might arise. Drawing on our own experiences and insights from organizing >80 datathon challenges with >60 partnership organizations since 2016, we provide guidelines and recommendations that serve as a resource for organizers to navigate the data-related complexities of datathons. We apply our proposed framework to 10 case studies.

Inversion of operators is a fundamental concept in data processing. Inversion of linear operators is well studied, supported by established theory. When an inverse either does not exist or is not unique, generalized inverses are used. Most notable is the Moore-Penrose inverse, widely used in physics, statistics, and various fields of engineering. This work investigates generalized inversion of nonlinear operators. We first address broadly the desired properties of generalized inverses, guided by the Moore-Penrose axioms. We define the notion for general sets, and then a refinement, termed pseudo-inverse, for normed spaces. We present conditions for existence and uniqueness of a pseudo-inverse and establish theoretical results investigating its properties, such as continuity, its value for operator compositions and projection operators, and others. Analytic expressions are given for the pseudo-inverse of some well-known, non-invertible, nonlinear operators, such as hard- or soft-thresholding and ReLU. We analyze a neural layer and discuss relations to wavelet thresholding. Next, the Drazin inverse, and a relaxation, are investigated for operators with equal domain and range. We present scenarios where inversion is expressible as a linear combination of forward applications of the operator. Such scenarios arise for classes of nonlinear operators with vanishing polynomials, similar to the minimal or characteristic polynomials for matrices. Inversion using forward applications may facilitate the development of new efficient algorithms for approximating generalized inversion of complex nonlinear operators.

Classical simulations are essential for the development of quantum computing, and their exponential scaling can easily fill any modern supercomputer. In this paper we consider the performance and energy consumption of large Quantum Fourier Transform (QFT) simulations run on ARCHER2, the UK's National Supercomputing Service, with QuEST toolkit. We take into account CPU clock frequency and node memory size, and use cache-blocking to rearrange the circuit, which minimises communications. We find that using 2.00GHz instead of 2.25GHz can save as much as 25% of energy at 5% increase in runtime. Higher node memory also has the potential to be more efficient, and cost the user fewer CUs, but at higher runtime penalty. Finally, we present a cache-blocking QFT circuit, which halves the required communication. All our optimisations combined result in 40% faster simulations and 35% energy savings in 44 qubit simulations on 4,096 ARCHER2 nodes.

We consider a random geometric hypergraph model based on an underlying bipartite graph. Nodes and hyperedges are sampled uniformly in a domain, and a node is assigned to those hyperedges that lie with a certain radius. From a modelling perspective, we explain how the model captures higher order connections that arise in real data sets. Our main contribution is to study the connectivity properties of the model. In an asymptotic limit where the number of nodes and hyperedges grow in tandem we give a condition on the radius that guarantees connectivity.

This paper presents a computational framework for the concise encoding of an ensemble of persistence diagrams, in the form of weighted Wasserstein barycenters [100], [102] of a dictionary of atom diagrams. We introduce a multi-scale gradient descent approach for the efficient resolution of the corresponding minimization problem, which interleaves the optimization of the barycenter weights with the optimization of the atom diagrams. Our approach leverages the analytic expressions for the gradient of both sub-problems to ensure fast iterations and it additionally exploits shared-memory parallelism. Extensive experiments on public ensembles demonstrate the efficiency of our approach, with Wasserstein dictionary computations in the orders of minutes for the largest examples. We show the utility of our contributions in two applications. First, we apply Wassserstein dictionaries to data reduction and reliably compress persistence diagrams by concisely representing them with their weights in the dictionary. Second, we present a dimensionality reduction framework based on a Wasserstein dictionary defined with a small number of atoms (typically three) and encode the dictionary as a low dimensional simplex embedded in a visual space (typically in 2D). In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a C++ implementation that can be used to reproduce our results.

We study automated intrusion response for an IT infrastructure and formulate the interaction between an attacker and a defender as a partially observed stochastic game. To solve the game we follow an approach where attack and defense strategies co-evolve through reinforcement learning and self-play toward an equilibrium. Solutions proposed in previous work prove the feasibility of this approach for small infrastructures but do not scale to realistic scenarios due to the exponential growth in computational complexity with the infrastructure size. We address this problem by introducing a method that recursively decomposes the game into subgames which can be solved in parallel. Applying optimal stopping theory we show that the best response strategies in these subgames exhibit threshold structures, which allows us to compute them efficiently. To solve the decomposed game we introduce an algorithm called Decompositional Fictitious Self-Play (DFSP), which learns Nash equilibria through stochastic approximation. We evaluate the learned strategies in an emulation environment where real intrusions and response actions can be executed. The results show that the learned strategies approximate an equilibrium and that DFSP significantly outperforms a state-of-the-art algorithm for a realistic infrastructure configuration.

Since its introduction, the partial information decomposition (PID) has emerged as a powerful, information-theoretic technique useful for studying the structure of (potentially higher-order) interactions in complex systems. Despite its utility, the applicability of the PID is restricted by the need to assign elements as either inputs or targets, as well as the specific structure of the mutual information itself. Here, we introduce a generalized information decomposition that relaxes the source/target distinction while still satisfying the basic intuitions about information. This approach is based on the decomposition of the Kullback-Leibler divergence, and consequently allows for the analysis of any information gained when updating from an arbitrary prior to an arbitrary posterior. Consequently, any information-theoretic measure that can be written in as a Kullback-Leibler divergence admits a decomposition in the style of Williams and Beer, including the total correlation, the negentropy, and the mutual information as special cases. In this paper, we explore how the generalized information decomposition can reveal novel insights into existing measures, as well as the nature of higher-order synergies. We show that synergistic information is intimately related to the well-known Tononi-Sporns-Edelman (TSE) complexity, and that synergistic information requires a similar integration/segregation balance as a high TSE complexity. Finally, we end with a discussion of how this approach fits into other attempts to generalize the PID and the possibilities for empirical applications.

The adaptive processing of structured data is a long-standing research topic in machine learning that investigates how to automatically learn a mapping from a structured input to outputs of various nature. Recently, there has been an increasing interest in the adaptive processing of graphs, which led to the development of different neural network-based methodologies. In this thesis, we take a different route and develop a Bayesian Deep Learning framework for graph learning. The dissertation begins with a review of the principles over which most of the methods in the field are built, followed by a study on graph classification reproducibility issues. We then proceed to bridge the basic ideas of deep learning for graphs with the Bayesian world, by building our deep architectures in an incremental fashion. This framework allows us to consider graphs with discrete and continuous edge features, producing unsupervised embeddings rich enough to reach the state of the art on several classification tasks. Our approach is also amenable to a Bayesian nonparametric extension that automatizes the choice of almost all model's hyper-parameters. Two real-world applications demonstrate the efficacy of deep learning for graphs. The first concerns the prediction of information-theoretic quantities for molecular simulations with supervised neural models. After that, we exploit our Bayesian models to solve a malware-classification task while being robust to intra-procedural code obfuscation techniques. We conclude the dissertation with an attempt to blend the best of the neural and Bayesian worlds together. The resulting hybrid model is able to predict multimodal distributions conditioned on input graphs, with the consequent ability to model stochasticity and uncertainty better than most works. Overall, we aim to provide a Bayesian perspective into the articulated research field of deep learning for graphs.

Transformer, an attention-based encoder-decoder architecture, has revolutionized the field of natural language processing. Inspired by this significant achievement, some pioneering works have recently been done on adapting Transformerliked architectures to Computer Vision (CV) fields, which have demonstrated their effectiveness on various CV tasks. Relying on competitive modeling capability, visual Transformers have achieved impressive performance on multiple benchmarks such as ImageNet, COCO, and ADE20k as compared with modern Convolution Neural Networks (CNN). In this paper, we have provided a comprehensive review of over one hundred different visual Transformers for three fundamental CV tasks (classification, detection, and segmentation), where a taxonomy is proposed to organize these methods according to their motivations, structures, and usage scenarios. Because of the differences in training settings and oriented tasks, we have also evaluated these methods on different configurations for easy and intuitive comparison instead of only various benchmarks. Furthermore, we have revealed a series of essential but unexploited aspects that may empower Transformer to stand out from numerous architectures, e.g., slack high-level semantic embeddings to bridge the gap between visual and sequential Transformers. Finally, three promising future research directions are suggested for further investment.

Co-evolving time series appears in a multitude of applications such as environmental monitoring, financial analysis, and smart transportation. This paper aims to address the following challenges, including (C1) how to incorporate explicit relationship networks of the time series; (C2) how to model the implicit relationship of the temporal dynamics. We propose a novel model called Network of Tensor Time Series, which is comprised of two modules, including Tensor Graph Convolutional Network (TGCN) and Tensor Recurrent Neural Network (TRNN). TGCN tackles the first challenge by generalizing Graph Convolutional Network (GCN) for flat graphs to tensor graphs, which captures the synergy between multiple graphs associated with the tensors. TRNN leverages tensor decomposition to model the implicit relationships among co-evolving time series. The experimental results on five real-world datasets demonstrate the efficacy of the proposed method.

北京阿比特科技有限公司