
Grey-box fuzzing is the lightweight approach of choice for finding bugs in sequential programs. It provides a balance between efficiency and effectiveness by conducting a biased random search over the domain of program inputs using a feedback function from observed test executions. For distributed system testing, however, the state of practice is represented today only by black-box tools that do not attempt to infer and exploit any knowledge of the system's past behaviours to guide the search for bugs. In this work, we present Mallory: the first framework for grey-box fuzz-testing of distributed systems. Unlike popular black-box distributed system fuzzers, such as Jepsen, that search for bugs by randomly injecting network partitions and node faults or by following human-defined schedules, Mallory is adaptive. It uses a novel metric to learn how to maximize the number of observed system behaviours by choosing different sequences of faults, thus increasing the likelihood of finding new bugs. The key enablers for our approach are the new ideas of timeline-driven testing and timeline abstraction that provide the feedback function guiding a biased random search for failures. Mallory dynamically constructs Lamport timelines of the system behaviour, abstracts these timelines into happens-before summaries, and introduces faults guided by its real-time observation of the summaries. We have evaluated Mallory on a diverse set of widely-used industrial distributed systems. Compared to the state-of-the-art black-box fuzzer Jepsen, Mallory explores more behaviours and takes less time to find bugs. Mallory discovered 22 zero-day bugs (of which 18 were confirmed by developers), including 10 new vulnerabilities, in rigorously-tested distributed systems such as Braft, Dqlite, and Redis. 6 new CVEs have been assigned.
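To make the feedback loop concrete, here is a minimal sketch of a coverage-guided fault-injection loop in the spirit of the abstract's timeline-driven testing. All names (run_with, the event fields, the bias heuristic) are hypothetical illustrations of the idea, not Mallory's actual API or algorithm.

```python
# Hypothetical sketch: treat distinct happens-before summaries as "coverage"
# and bias future fault choices toward faults that revealed new summaries.
import random

def happens_before_summary(events):
    """Abstract a Lamport timeline into a hashable happens-before summary.
    Here: the set of ordered (earlier-type, later-type) event pairs."""
    return frozenset((a["type"], b["type"])
                     for a in events for b in events
                     if a["clock"] < b["clock"])

def fuzz(system, faults, budget=1000):
    seen = set()                        # summaries observed so far
    for _ in range(budget):
        fault = random.choice(faults)   # e.g., a partition or node crash
        events = system.run_with(fault) # hypothetical test-harness call
        summary = happens_before_summary(events)
        if summary not in seen:         # new behaviour: reward this fault
            seen.add(summary)
            faults.append(fault)        # bias future sampling toward it
    return seen
```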

Related Content

In science, computing, and engineering, a black box is a device, system, or object that can be viewed in terms of its inputs and outputs (or transfer characteristics), without any knowledge of its internal workings. Its implementation is "opaque" (black). Almost anything can be referred to as a black box: a transistor, an engine, an algorithm, the human brain, an institution, or a government. To analyse something modelled as an open system using the typical "black-box approach", only its stimulus/response behaviour is considered, in order to make inferences about the (unknown) box. The usual representation of such a black-box system is a data-flow diagram centred on the box. The opposite of a black box is a system whose internal components or logic are available for inspection, usually called a white box (sometimes also a "clear box" or a "glass box").
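A toy illustration of the stimulus/response view: the internals of `box` below are treated as opaque, and its behaviour is inferred purely from observed input/output pairs. This example is ours, not from the text above.

```python
# Probe an opaque component and hypothesize its transfer characteristic
# from stimulus/response pairs alone.
def box(x):          # stands in for any opaque component
    return 3 * x + 1

observations = {x: box(x) for x in range(5)}
print(observations)  # {0: 1, 1: 4, 2: 7, 3: 10, 4: 13} -> hypothesis: 3x + 1
```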

A sequence of random variables is called exchangeable if its joint distribution is invariant under permutations. The original formulation of de Finetti's theorem says that any exchangeable sequence of $\{0,1\}$-valued random variables can be thought of as a mixture of independent and identically distributed sequences in a certain precise mathematical sense. Interpreting this statement from a convex analytic perspective, Hewitt and Savage obtained the same conclusion for more general state spaces under some topological conditions. The main contribution of this paper is in providing a new framework that explains the theorem purely as a consequence of the underlying distribution of the random variables, with no topological conditions (beyond Hausdorffness) on the state space being necessary if the distribution is Radon. We also show that it is consistent with the axioms of ZFC that de Finetti's theorem holds for all sequences of exchangeable random variables taking values in any complete metric space. The framework we use is based on nonstandard analysis. We have provided a self-contained introduction to nonstandard analysis as an appendix, thus rendering measure theoretic probability and point-set topology as the only prerequisites for this paper. Our introduction aims to develop some new ideologies that might be of interest to mathematicians, philosophers, and mathematics educators alike. Our technical tools come from nonstandard topological measure theory, in which a highlight is a new generalization of Prokhorov's theorem. Modulo such technical tools, our proof relies on properties of the empirical measures induced by hyperfinitely many identically distributed random variables -- a feature that allows us to establish de Finetti's theorem in the generality that we seek while still retaining the combinatorial intuition of proofs of simpler versions of de Finetti's theorem.
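For reference, the classical binary form of de Finetti's theorem that the abstract generalizes can be stated as follows; this is the standard textbook statement, not a formula taken from the paper.

```latex
% If $(X_n)_{n\ge 1}$ is an exchangeable sequence of $\{0,1\}$-valued random
% variables, there is a unique probability measure $\mu$ on $[0,1]$ with
\[
  \Pr(X_1 = x_1, \dots, X_n = x_n)
  = \int_0^1 p^{\sum_{i=1}^n x_i}\,(1-p)^{\,n - \sum_{i=1}^n x_i}\, d\mu(p)
\]
% for all $n$ and all $x_1,\dots,x_n \in \{0,1\}$: the sequence is a mixture
% of i.i.d.\ Bernoulli sequences.
```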

A distribution shift can have fundamental consequences such as signaling a change in the operating environment or significantly reducing the accuracy of downstream models. Thus, understanding distribution shifts is critical for examining and hopefully mitigating the effect of such a shift. Most prior work focuses on merely detecting if a shift has occurred and assumes any detected shift can be understood and handled appropriately by a human operator. We hope to aid in these manual mitigation tasks by explaining the distribution shift using interpretable transportation maps from the original distribution to the shifted one. We derive our interpretable mappings from a relaxation of optimal transport, where the candidate mappings are restricted to a set of interpretable mappings. We then inspect multiple quintessential use-cases of distribution shift in real-world tabular, text, and image datasets to showcase how our explanatory mappings provide a better balance between detail and interpretability than baseline explanations by both visual inspection and our PercentExplained metric.
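As a minimal sketch of the "restricted optimal transport" idea, consider explaining a shift with the best translation map T(x) = x + delta, one especially interpretable candidate family. The paper's actual candidate sets and its PercentExplained metric are richer; this example and its data are ours.

```python
# Under squared transport cost, the optimal *translation* between two
# samples is the difference of their means, which reads directly as a
# per-feature explanation of the shift.
import numpy as np

rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(500, 3))      # original distribution
target = source + np.array([2.0, 0.0, -1.0])      # shifted distribution

delta = target.mean(axis=0) - source.mean(axis=0)
print("interpretable shift explanation:", np.round(delta, 2))
# ~[ 2.  0. -1.]: "feature 0 grew by ~2, feature 2 fell by ~1"
```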

Large-scale administrative or observational datasets are increasingly used to inform decision making. While this effort aims to ground policy in real-world evidence, challenges have arisen, as selection bias and other forms of distribution shift often plague observational data. Previous attempts to provide robust inferences have given guarantees depending on a user-specified amount of possible distribution shift (e.g., the maximum KL divergence between the observed and target distributions). However, decision makers will often have additional knowledge about the target distribution which constrains the kind of shifts which are possible. To leverage such information, we propose a framework that enables statistical inference in the presence of distribution shifts which obey user-specified constraints in the form of functions whose expectation is known under the target distribution. The output is high-probability bounds on the value an estimand takes on the target distribution. Hence, our method leverages domain knowledge in order to partially identify a wide class of estimands. We analyze the computational and statistical properties of methods to estimate these bounds, and show that our method can produce informative bounds on a variety of simulated and semisynthetic tasks.
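A minimal sketch of the partial-identification idea, assuming the simplest possible setting: bound the mean of an estimand g(X) over all reweightings of the observed sample whose implied expectation of a constraint function f(X) matches its known value under the target distribution. This reduces to a linear program; the paper's estimators and high-probability guarantees are more involved, and all numbers below are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
x = rng.normal(size=200)
g = x          # estimand: E[X] under the target distribution
f = x**2       # domain knowledge: E[X^2] = 2.0 under the target

# Weights w >= 0 with sum(w) = 1 and sum(w * f) = 2.0; extremize sum(w * g).
A_eq = np.vstack([np.ones_like(x), f])
b_eq = np.array([1.0, 2.0])
lo = linprog(c=g,  A_eq=A_eq, b_eq=b_eq, bounds=(0, None)).fun
hi = -linprog(c=-g, A_eq=A_eq, b_eq=b_eq, bounds=(0, None)).fun
print(f"E[g] is partially identified within [{lo:.2f}, {hi:.2f}]")
```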

Reed-Solomon (RS) codes have been increasingly adopted by distributed storage systems in place of replication, because they provide the same level of availability with much lower storage overhead. However, a key drawback of RS-coded distributed storage systems is the poor latency of degraded reads, which can be incurred by data failures or hot spots, and which are not rare in production environments. To address this issue, we propose a novel parallel reconstruction solution called APLS. APLS leverages all surviving source nodes to send the data needed by degraded reads and chooses lightly-loaded starter nodes to receive the reconstructed data of those degraded reads. Hence, the latency of the degraded reads can be improved. Prototyping-based experiments are conducted to compare APLS with ECPipe, the state-of-the-art solution for improving the latency of degraded reads. The experimental results demonstrate that APLS effectively reduces the latency, particularly under heavy or medium workloads.
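The scheduling idea can be illustrated with a short sketch; the function and load metric below are hypothetical stand-ins for APLS's actual placement logic, which the abstract only outlines.

```python
# Hypothetical sketch: every surviving source node sends its repair
# contribution in parallel, and reconstruction is placed on the least-loaded
# candidate "starter" node so the degraded read is not queued behind a busy
# server. In RS codes the lost block is a linear combination of k surviving
# blocks, so each source can compute its term locally and send only that.
def schedule_degraded_read(surviving_sources, candidate_starters, load):
    starter = min(candidate_starters, key=lambda n: load[n])
    transfers = [(src, starter) for src in surviving_sources]
    return starter, transfers

load = {"n1": 0.9, "n2": 0.2, "n3": 0.5}
print(schedule_degraded_read(["n4", "n5"], ["n1", "n2", "n3"], load))
# ('n2', [('n4', 'n2'), ('n5', 'n2')])
```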

Reliability sensitivity analysis is concerned with measuring the influence of a system's uncertain input parameters on its probability of failure. Statistically dependent inputs present a challenge in both computing and interpreting these sensitivity indices; such dependencies require discerning between variable interactions produced by the probabilistic model describing the system inputs and the computational model describing the system itself. To accomplish such a separation of effects in the context of reliability sensitivity analysis, we build on an idea originally proposed by Mara and Tarantola (2012) for model outputs unrelated to rare events. We compute the independent (influence via computational model) and full (influence via both computational and probabilistic model) contributions of all inputs to the variance of the indicator function of the rare event. We compute this full set of variance-based sensitivity indices of the rare event indicator using a single set of failure samples. This is possible by considering $d$ different hierarchically structured isoprobabilistic transformations of this set of failure samples from the original $d$-dimensional space of dependent inputs to standard-normal space. The approach facilitates computing the full set of variance-based reliability sensitivity indices with a single set of failure samples obtained as the byproduct of a single run of a sample-based rare event estimation method. That is, no additional evaluations of the computational model are required. We demonstrate the approach on a test function and two engineering problems.
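For orientation, the first-order variance-based index of the failure indicator takes a particularly simple form, since an indicator is Bernoulli-distributed. This is the standard Sobol'-type definition the abstract builds on; the notation ($S_i$, $p_F$) is ours.

```latex
% First-order variance-based sensitivity of the failure indicator
% $\mathbf{1}_F(\mathbf{X})$: the indicator is Bernoulli with success
% probability $p_F$, so $\operatorname{Var}(\mathbf{1}_F) = p_F(1-p_F)$ and
\[
  S_i \;=\; \frac{\operatorname{Var}\!\big(\mathbb{E}[\mathbf{1}_F(\mathbf{X}) \mid X_i]\big)}
                 {p_F\,(1-p_F)},
  \qquad p_F = \Pr(\mathbf{X} \in F).
\]
```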

Deep neural networks are known to be vulnerable to adversarial attacks (AA). For an image recognition task, this means that a small perturbation of the original can result in the image being misclassified. The design of such attacks, as well as methods of adversarial training against them, are the subject of intense research. We re-cast the problem using techniques of Wasserstein distributionally robust optimization (DRO) and obtain novel contributions leveraging recent insights from DRO sensitivity analysis. We consider a set of distributional threat models. Unlike the traditional pointwise attacks, which assume a uniform bound on the perturbation of each input data point, distributional threat models allow attackers to perturb inputs in a non-uniform way. We link these more general attacks with questions of out-of-sample performance and Knightian uncertainty. To evaluate the distributional robustness of neural networks, we propose a first-order AA algorithm and its multi-step version. Our attack algorithms include Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) as special cases. Furthermore, we provide a new asymptotic estimate of the adversarial accuracy against distributional threat models. The bound is fast to compute and first-order accurate, offering new insights even for the pointwise AA. It also naturally yields out-of-sample performance guarantees. We conduct numerical experiments on the CIFAR-10 dataset using DNNs on RobustBench to illustrate our theoretical results. Our code is available at //github.com/JanObloj/W-DRO-Adversarial-Methods.
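For readers unfamiliar with the pointwise special case the abstract mentions, here is a minimal FGSM sketch; `model` and `loss_fn` are placeholders for any differentiable classifier and loss, and this is the classical attack, not the paper's distributional algorithm.

```python
import torch

def fgsm(model, loss_fn, x, y, eps):
    """One-step Fast Gradient Sign Method attack on a batch (x, y)."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    loss.backward()
    # One signed-gradient ascent step on the loss, clamped to a valid
    # image range; each pixel moves by exactly +/- eps.
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```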

Graph neural networks (GNNs) are a class of deep learning models that learn over graphs, and they have been successfully applied in many domains. Despite the effectiveness of GNNs, it is still challenging for GNNs to efficiently scale to large graphs. As a remedy, distributed computing has become a promising solution for training large-scale GNNs, since it is able to provide abundant computing resources. However, the dependencies induced by the graph structure increase the difficulty of achieving high-efficiency distributed GNN training, which suffers from massive communication and workload imbalance. In recent years, many efforts have been made on distributed GNN training, and an array of training algorithms and systems have been proposed. Yet, there is a lack of systematic review of the optimization techniques from graph processing to distributed execution. In this survey, we analyze three major challenges in distributed GNN training: massive feature communication, loss of model accuracy, and workload imbalance. We then introduce a new taxonomy for the optimization techniques in distributed GNN training that address the above challenges. The new taxonomy classifies existing techniques into four categories: GNN data partition, GNN batch generation, GNN execution model, and GNN communication protocol. We carefully discuss the techniques in each category. In the end, we summarize existing distributed GNN systems for multi-GPUs, GPU-clusters, and CPU-clusters, respectively, and give a discussion about future directions on scalable GNNs.
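A toy sketch of the first taxonomy category, GNN data partition, shows why the graph structure creates communication: every edge whose endpoints land on different workers turns into cross-worker feature traffic. The naive modulo split below is ours for illustration; real systems use smarter partitioners (e.g., METIS-style) precisely to shrink this cut.

```python
# Assign vertices to workers and count the cut edges that will require
# feature communication during neighborhood aggregation.
def partition(num_nodes, edges, num_workers):
    owner = {v: v % num_workers for v in range(num_nodes)}  # naive split
    cut = sum(1 for u, v in edges if owner[u] != owner[v])
    return owner, cut

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
owner, cut = partition(4, edges, 2)
print(owner, "cut edges needing communication:", cut)
```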

We present prompt distribution learning for effectively adapting a pre-trained vision-language model to address downstream recognition tasks. Our method not only learns low-bias prompts from a few samples but also captures the distribution of diverse prompts to handle the varying visual representations. In this way, we provide high-quality task-related content for facilitating recognition. This prompt distribution learning is realized by an efficient approach that learns the output embeddings of prompts instead of the input embeddings. Thus, we can employ a Gaussian distribution to model them effectively and derive a surrogate loss for efficient training. Extensive experiments on 12 datasets demonstrate that our method consistently and significantly outperforms existing methods. For example, with 1 sample per category, it improves the average result by a relative 9.1% over human-crafted prompts.
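The core modelling idea can be sketched in a few lines: fit a Gaussian over several learned prompt output embeddings per class and score images against draws from it. The shapes, scoring rule, and surrogate loss in the paper differ; this only illustrates "a distribution over prompts rather than a single prompt", and every name below is ours.

```python
import torch

K, C, D = 4, 10, 512                  # prompts per class, classes, dim
prompt_emb = torch.randn(K, C, D)     # learned prompt *output* embeddings
mu, sigma = prompt_emb.mean(0), prompt_emb.std(0)   # Gaussian per class

def class_logits(image_emb):          # image_emb: (D,)
    # Sample one prompt embedding per class from the learned Gaussian,
    # then score by cosine similarity with the image embedding.
    sampled = mu + sigma * torch.randn_like(sigma)
    sampled = torch.nn.functional.normalize(sampled, dim=-1)
    return sampled @ torch.nn.functional.normalize(image_emb, dim=-1)
```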

Human-in-the-loop aims to train an accurate prediction model with minimum cost by integrating human knowledge and experience. Humans can provide training data for machine learning applications and, with the help of machine-based approaches, directly accomplish tasks in the pipeline that are hard for computers. In this paper, we survey existing works on human-in-the-loop from a data perspective and classify them into three categories with a progressive relationship: (1) work that improves model performance through data processing, (2) work that improves model performance through interventional model training, and (3) the design of independent human-in-the-loop systems. Using the above categorization, we summarize the major approaches in the field along with their technical strengths and weaknesses, and we briefly classify and discuss applications in natural language processing, computer vision, and other areas. Besides, we provide some open challenges and opportunities. This survey intends to provide a high-level summarization of human-in-the-loop and to motivate interested readers to consider approaches for designing effective human-in-the-loop solutions.

In recent years, mobile devices have developed rapidly, gaining stronger computation capabilities and larger storage. Some computation-intensive machine learning and deep learning tasks can now be run on mobile devices. To take advantage of the resources available on mobile devices and preserve users' privacy, the idea of mobile distributed machine learning has been proposed. It uses local hardware resources and local data to solve machine learning sub-problems on mobile devices, and only uploads computation results instead of original data to contribute to the optimization of the global model. This architecture can not only relieve the computation and storage burden on servers, but also protect users' sensitive information. Another benefit is bandwidth reduction, as various kinds of local data can now participate in the training process without being uploaded to the server. In this paper, we provide a comprehensive survey of recent studies on mobile distributed machine learning. We survey a number of widely-used mobile distributed machine learning methods. We also present an in-depth discussion of the challenges and future directions in this area. We believe that this survey can provide a clear overview of mobile distributed machine learning and offer guidelines for applying it to real applications.
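The "upload results, not data" pattern described above can be sketched in the style of federated averaging: each device trains on its local data and only model updates reach the server. This toy least-squares example is ours; production systems add sampling, weighting, compression, and privacy mechanisms.

```python
import numpy as np

def local_step(weights, X, y, lr=0.1):
    """One gradient step on a device's private least-squares data."""
    grad = X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(global_w, devices):
    # Each device trains locally; only the resulting weights are uploaded,
    # and the server averages them into the next global model.
    updates = [local_step(global_w.copy(), X, y) for X, y in devices]
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
devices = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(5)]
w = np.zeros(3)
for _ in range(10):
    w = federated_round(w, devices)
print("global model after 10 rounds:", np.round(w, 2))
```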
