美国式禁忌电影在线观看免费观看_天天躁夜夜躁狠狠躁综合_A级毛片免费高清毛片视频_国产欧美日韩综合视频一区二区_天天综合网网欲色2020_国产有码一区二区三区在线观看_美女视频永久黄网站免费观看国产

Dynamic techniques are a scalable and effective way to analyze concurrent programs. Instead of analyzing all behaviors of a program, these techniques detect errors by focusing on a single program execution. Often a crucial step in these techniques is to define a causal ordering between events in the execution, which is then computed using vector clocks, a simple data structure that stores logical times of threads. The two basic operations of vector clocks, namely join and copy, require $\Theta(k)$ time, where $k$ is the number of threads. Thus they are a computational bottleneck when $k$ is large. In this work, we introduce tree clocks, a new data structure that replaces vector clocks for computing causal orderings in program executions. Joining and copying tree clocks takes time that is roughly proportional to the number of entries being modified, and hence the two operations do not suffer the a-priori $\Theta(k)$ cost per application. We show that when used to compute the classic happens-before (HB) partial order, tree clocks are optimal, in the sense that no other data structure can lead to smaller asymptotic running time. Moreover, we demonstrate that tree clocks can be used to compute other partial orders, such as schedulable-happens-before (SHB) and the standard Mazurkiewicz (MAZ) partial order, and thus are a versatile data structure. Our experiments show that just by replacing vector clocks with tree clocks, the computation becomes from $2.02 \times$ faster (MAZ) to $2.66 \times$ (SHB) and $2.97 \times$ (HB) on average per benchmark. These results illustrate that tree clocks have the potential to become a standard data structure with wide applications in concurrent analyses.

相關內容

向(xiang)量化

關注 1

Processing（編程語言） · MoDELS · 講稿 · INTERACT · 相互獨立的 ·

2022 年 4 月 20 日

Modeling and Executing Production Processes with Capabilities and Skills using Ontologies and BPMN

Aljosha K?cher,Luis Miguel Vieira da Silva,Alexander Fay

from arxiv, To be submitted to ETFA 2022

Current challenges of the manufacturing industry require modular and changeable manufacturing systems that can be adapted to variable conditions with little effort. At the same time, production recipes typically represent important company know-how that should not be directly tied to changing plant configurations. Thus, there is a need to model general production recipes independent of specific plant layouts. For execution of such a recipe however, a binding to then available production resources needs to be made. In this contribution, select a suitable modeling language to model and execute such recipes. Furthermore, we present an approach to solve the issue of recipe modeling and execution in modular plants using semantically modeled capabilities and skills as well as BPMN. We make use of BPMN to model \emph{capability processes}, i.e. production processes referencing abstract descriptions of resource functions. These capability processes are not bound to a certain plant layout, as there can be multiple resources fulfilling the same capability. For execution, every capability in a capability process is replaced by a skill realizing it, effectively creating a \emph{skill process} consisting of various skill invocations. The presented solution is capable of orchestrating and executing complex processes that integrate production steps with typical IT functionalities such as error handling, user interactions and notifications. Benefits of the approach are demonstrated using a flexible manufacturing system.

聯合分布 · 情景 · 隨機漫步 · 泛化理論 · 正則化項 ·

2022 年 4 月 20 日

Private measures, random walks, and synthetic data

March Boedihardjo,Thomas Strohmer,Roman Vershynin

Differential privacy is a mathematical concept that provides an information-theoretic security guarantee. While differential privacy has emerged as a de facto standard for guaranteeing privacy in data sharing, the known mechanisms to achieve it come with some serious limitations. Utility guarantees are usually provided only for a fixed, a priori specified set of queries. Moreover, there are no utility guarantees for more complex - but very common - machine learning tasks such as clustering or classification. In this paper we overcome some of these limitations. Working with metric privacy, a powerful generalization of differential privacy, we develop a polynomial-time algorithm that creates a private measure from a data set. This private measure allows us to efficiently construct private synthetic data that are accurate for a wide range of statistical analysis tools. Moreover, we prove an asymptotically sharp min-max result for private measures and synthetic data for general compact metric spaces. A key ingredient in our construction is a new superregular random walk, whose joint distribution of steps is as regular as that of independent random variables, yet which deviates from the origin logarithmicaly slowly.

Extensibility · Performer · 統計量 · Processing（編程語言） · 進化計算 ·

2022 年 4 月 19 日

Co-evolutionary Probabilistic Structured Grammatical Evolution

Jessica Mégane,Nuno Louren?o,Penousal Machado

from arxiv, 9 pages, 10 figures, The Genetic and Evolutionary Computation Conference

This work proposes an extension to Structured Grammatical Evolution (SGE) called Co-evolutionary Probabilistic Structured Grammatical Evolution (Co-PSGE). In Co-PSGE each individual in the population is composed by a grammar and a genotype, which is a list of dynamic lists, each corresponding to a non-terminal of the grammar containing real numbers that correspond to the probability of choosing a derivation rule. Each individual uses its own grammar to map the genotype into a program. During the evolutionary process, both the grammar and the genotype are subject to variation operators. The performance of the proposed approach is compared to 3 different methods, namely, Grammatical Evolution (GE), Probabilistic Grammatical Evolution (PGE), and SGE on four different benchmark problems. The results show the effectiveness of the approach since Co-PSGE is able to outperform all the methods with statistically significant differences in the majority of the problems.

凸集 · 核嶺回歸 · 再生核希爾伯特空間 · 核化 · 近似 ·

2022 年 4 月 19 日

Compressed Empirical Measures (in finite dimensions)

Steffen Grünew?lder

We study approaches for compressing the empirical measure in the context of finite dimensional reproducing kernel Hilbert spaces (RKHSs).In this context, the empirical measure is contained within a natural convex set and can be approximated using convex optimization methods. Such an approximation gives under certain conditions rise to a coreset of data points. A key quantity that controls how large such a coreset has to be is the size of the largest ball around the empirical measure that is contained within the empirical convex set. The bulk of our work is concerned with deriving high probability lower bounds on the size of such a ball under various conditions. We complement this derivation of the lower bound by developing techniques that allow us to apply the compression approach to concrete inference problems such as kernel ridge regression. We conclude with a construction of an infinite dimensional RKHS for which the compression is poor, highlighting some of the difficulties one faces when trying to move to infinite dimensional RKHSs.

條件獨立的 · 相互獨立的 · 評論員 · 優化器 · 可辨認的 ·

2022 年 4 月 19 日

Prespecification of Structure for Optimizing Data Collection and Research Transparency by Leveraging Conditional Independencies

Matthew J. Vowels

Data collection and research methodology represents a critical part of the research pipeline. On the one hand, it is important that we collect data in a way that maximises the validity of what we are measuring, which may involve the use of long scales with many items. On the other hand, collecting a large number of items across multiple scales results in participant fatigue, and expensive and time consuming data collection. It is therefore important that we use the available resources optimally. In this work, we consider how a consideration for theory and the associated causal/structural model can help us to streamline data collection procedures by not wasting time collecting data for variables which are not causally critical for subsequent analysis. This not only saves time and enables us to redirect resources to attend to other variables which are more important, but also increases research transparency and the reliability of theory testing. In order to achieve this streamlined data collection, we leverage structural models, and Markov conditional independency structures implicit in these models to identify the substructures which are critical for answering a particular research question. In this work, we review the relevant concepts and present a number of didactic examples with the hope that psychologists can use these techniques to streamline their data collection process without invalidating the subsequent analysis. We provide a number of simulation results to demonstrate the limited analytical impact of this streamlining.

線性的 · 子集搜索 · MoDELS · 損失函數（機器學習） · INFORMS ·

2022 年 4 月 18 日

Subset selection for linear mixed models

Daniel R. Kowal

Linear mixed models (LMMs) are instrumental for regression analysis with structured dependence, such as grouped, clustered, or multilevel data. However, selection among the covariates--while accounting for this structured dependence--remains a challenge. We introduce a Bayesian decision analysis for subset selection with LMMs. Using a Mahalanobis loss function that incorporates the structured dependence, we derive optimal linear coefficients for (i) any given subset of variables and (ii) all subsets of variables that satisfy a cardinality constraint. Crucially, these estimates inherit shrinkage or regularization and uncertainty quantification from the underlying Bayesian model, and apply for any well-specified Bayesian LMM. More broadly, our decision analysis strategy deemphasizes the role of a single "best" subset, which is often unstable and limited in its information content, and instead favors a collection of near-optimal subsets. This collection is summarized by key member subsets and variable-specific importance metrics. Customized subset search and out-of-sample approximation algorithms are provided for more scalable computing. These tools are applied to simulated data and a longitudinal physical activity dataset, and demonstrate excellent prediction, estimation, and selection ability.

流 · 子空間 · 分解的 · 優化器 · 線性的 ·

2022 年 4 月 18 日

High-Dimensional Geometric Streaming in Polynomial Space

David P. Woodruff,Taisuke Yasuda

from arxiv, Abstract shortened to meet arXiv limits; v2 fix statements concerning online condition number

Many existing algorithms for streaming geometric data analysis have been plagued by exponential dependencies in the space complexity, which are undesirable for processing high-dimensional data sets. In particular, once $d\geq\log n$, there are no known non-trivial streaming algorithms for problems such as maintaining convex hulls and L\"owner-John ellipsoids of $n$ points, despite a long line of work in streaming computational geometry since [AHV04]. We simultaneously improve these results to $\mathrm{poly}(d,\log n)$ bits of space by trading off with a $\mathrm{poly}(d,\log n)$ factor distortion. We achieve these results in a unified manner, by designing the first streaming algorithm for maintaining a coreset for $\ell_\infty$ subspace embeddings with $\mathrm{poly}(d,\log n)$ space and $\mathrm{poly}(d,\log n)$ distortion. Our algorithm also gives similar guarantees in the \emph{online coreset} model. Along the way, we sharpen results for online numerical linear algebra by replacing a log condition number dependence with a $\log n$ dependence, answering a question of [BDM+20]. Our techniques provide a novel connection between leverage scores, a fundamental object in numerical linear algebra, and computational geometry. For $\ell_p$ subspace embeddings, we give nearly optimal trade-offs between space and distortion for one-pass streaming algorithms. For instance, we give a deterministic coreset using $O(d^2\log n)$ space and $O((d\log n)^{1/2-1/p})$ distortion for $p>2$, whereas previous deterministic algorithms incurred a $\mathrm{poly}(n)$ factor in the space or the distortion [CDW18]. Our techniques have implications in the offline setting, where we give optimal trade-offs between the space complexity and distortion of subspace sketch data structures. To do this, we give an elementary proof of a "change of density" theorem of [LT80] and make it algorithmic.

分類數據 · Extensibility · 圖 · INFORMS · 情景 ·

2022 年 4 月 16 日

Categorical Data Structures for Technical Computing

Evan Patterson,Owen Lynch,James Fairbanks

from arxiv, 27 pages, 7 figures

Many mathematical objects can be represented as functors from finitely-presented categories $\mathsf{C}$ to $\mathsf{Set}$. For instance, graphs are functors to $\mathsf{Set}$ from the category with two parallel arrows. Such functors are known informally as $\mathsf{C}$-sets. In this paper, we describe and implement an extension of $\mathsf{C}$-sets having data attributes with fixed types, such as graphs with labeled vertices or real-valued edge weights. We call such structures "acsets," short for "attributed $\mathsf{C}$-sets." Derived from previous work on algebraic databases, acsets are a joint generalization of graphs and data frames. They also encompass more elaborate graph-like objects such as wiring diagrams and Petri nets with rate constants. We develop the mathematical theory of acsets and then describe a generic implementation in the Julia programming language, which uses advanced language features to achieve performance comparable with specialized data structures.

分解的 · 因子分析 · 相互獨立的 · MoDELS · 分離的 ·

2022 年 4 月 15 日

On the dimensional indeterminacy of one-wave factor analysis under causal effects

Tyler J. VanderWeele,Charles J. K. Batty

It is shown, with two sets of indicators that separately load on two distinct factors, independent of one another conditional on the past, that if it is the case that at least one of the factors causally affects the other, then, in many settings, the process will converge to a factor model in which a single factor will suffice to capture the covariance structure among the indicators. Factor analysis with one wave of data can then not distinguish between factor models with a single factor versus those with two factors that are causally related. Therefore, unless causal relations between factors can be ruled out a priori, alleged empirical evidence from one-wave factor analysis for a single factor still leaves open the possibilities of a single factor or of two factors that causally affect one another. The implications for interpreting the factor structure of psychological scales, such as self-report scales for anxiety and depression, or for happiness and purpose, are discussed. The results are further illustrated through simulations to gain insight into the practical implications of the results in more realistic settings prior to the convergence of the processes. Some further generalizations to an arbitrary number of underlying factors are noted.

講稿 · 匯聚 · MoDELS · 噪聲 · 情景 ·

2022 年 4 月 14 日

Distributed Reconstruction of Noisy Pooled Data

Max Hahn-Klimroth,Dominik Kaaser

from arxiv, Accepted at 42nd IEEE International Conference on Distributed Computing Systems (ICDCS)

In the pooled data problem we are given a set of $n$ agents, each of which holds a hidden state bit, either $0$ or $1$. A querying procedure returns for a query set the sum of the states of the queried agents. The goal is to reconstruct the states using as few queries as possible. In this paper we consider two noise models for the pooled data problem. In the noisy channel model, the result for each agent flips with a certain probability. In the noisy query model, each query result is subject to random Gaussian noise. Our results are twofold. First, we present and analyze for both error models a simple and efficient distributed algorithm that reconstructs the initial states in a greedy fashion. Our novel analysis pins down the range of error probabilities and distributions for which our algorithm reconstructs the exact initial states with high probability. Secondly, we present simulation results of our algorithm and compare its performance with approximate message passing (AMP) algorithms that are conjectured to be optimal in a number of related problems.