
Goodness-of-fit tests are often used in data analysis to test the agreement of a distribution to a set of data. These tests can be used to detect an unknown signal against a known background or to set limits on a proposed signal distribution in experiments contaminated by poorly understood backgrounds. Out-of-the-box non-parametric tests that can target any proposed distribution are only available in the univariate case. In this paper, we discuss how to build goodness-of-fit tests for arbitrary multivariate distributions or multivariate data generation models.
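
As a point of reference (not the paper's construction), the univariate tests the abstract alludes to are readily available off the shelf. The short Python sketch below runs a Kolmogorov-Smirnov test against a fully specified distribution, something that has no direct analogue for arbitrary multivariate models; the data and null distribution are illustrative.

```python
# Minimal illustration: off-the-shelf non-parametric goodness-of-fit tests such as
# Kolmogorov-Smirnov are univariate -- they compare a sample against a proposed 1-d CDF.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=0.1, scale=1.0, size=500)      # univariate data (illustrative)

# Univariate case: test agreement with a fully specified N(0, 1) null distribution.
stat, p_value = stats.kstest(x, stats.norm(loc=0.0, scale=1.0).cdf)
print(f"KS statistic = {stat:.3f}, p-value = {p_value:.3f}")

# For d > 1 there is no comparable "plug in any proposed distribution" test; a generic
# fallback is to sample from the proposed model and run a two-sample comparison instead.
```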

Related content

The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Its participants come from diverse backgrounds, including researchers, academics, engineers, and industry practitioners. MODELS 2019 is a forum in which participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to present innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability.

Multivariate distributional forecasts have become widespread in recent years. To assess the quality of such forecasts, suitable evaluation methods are needed. In the univariate case, calibration tests based on the probability integral transform (PIT) are routinely used. However, multivariate extensions of PIT-based calibration tests face various challenges. We therefore introduce a general framework for calibration testing in the multivariate case and propose two new tests that arise from it. Both approaches use proper scoring rules and are simple to implement even in large dimensions. The first employs the PIT of the score. The second is based on comparing the expected performance of the forecast distribution (i.e., the expected score) to its actual performance based on realized observations (i.e., the realized score). The tests have good size and power properties in simulations and solve various problems of existing tests. We apply the new tests to forecast distributions for macroeconomic and financial time series data.
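
As an illustration of the routine univariate practice the abstract refers to (not of the two new multivariate tests), the sketch below computes PIT values under a candidate forecast distribution and checks them for uniformity; the forecast distribution and the choice of uniformity test are illustrative assumptions.

```python
# Routine univariate PIT-based calibration check: under a calibrated forecast,
# PIT values are i.i.d. uniform on [0, 1].
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
y = rng.normal(size=1000)                        # realized observations
forecast = stats.norm(loc=0.0, scale=1.0)        # forecast distribution (here: correct)

pit = forecast.cdf(y)                            # probability integral transform
stat, p_value = stats.kstest(pit, "uniform")     # test PIT values for uniformity
print(f"uniformity p-value: {p_value:.3f}")      # large p-value -> no evidence of miscalibration
```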

One possible representation of three-valued instantaneous noise-based logic is proposed. The third value is an uncertain bit value, which can be useful in artificial intelligence applications. There is also a fourth value that can represent a non-existing bit (vacuum state); it has the same numeric value (1) for all bits, but it is a squeezed state common to all bits. Some logic gates are explored. A ternary Universe has a significant advantage over the standard binary one: its amplitude is never zero during any clock period. All known binary logic gates act on the binary bit values in the same way as before, so existing binary algorithms can be run in the ternary system without change and without the problems posed by zero values of the Universe.

Visualization is an essential operation when assessing the risk of rare events such as coastal or river floodings. The goal is to display a few prototype events that best represent the probability law of the observed phenomenon, a task known as quantization. It becomes a challenge when data are expensive to generate and critical events are scarce, as with extreme natural hazards. In the case of floodings, each event relies on an expensive-to-evaluate hydraulic simulator which takes offshore meteo-oceanic conditions and dyke breach parameters as inputs and computes the water level map. In this article, Lloyd's algorithm, which classically serves to quantize data, is adapted to the context of rare and costly-to-observe events. Low probability is treated through importance sampling, while Functional Principal Component Analysis combined with a Gaussian process handles the costly hydraulic simulations. The calculated prototype maps represent the probability distribution of the flooding events in a minimal-expected-distance sense, and each is associated with a probability mass. The method is first validated on a 2D analytical model and then applied to a real coastal flooding scenario. The two sources of error, the metamodel and the importance sampling, are evaluated to quantify the precision of the method.
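
A hedged sketch of the core adaptation described above: Lloyd's algorithm run on importance-weighted samples, with each prototype's probability mass given by the total weight of its cell. The FPCA reduction and Gaussian-process metamodel used in the article are omitted, and plain Euclidean distance on pre-reduced feature vectors is assumed.

```python
# Weighted Lloyd iteration on importance-weighted samples (a simplified sketch).
import numpy as np

def weighted_lloyd(X, weights, k, n_iter=50, seed=0):
    """X: (n, d) feature vectors; weights: importance-sampling weights.
    Returns k prototypes and their probability masses."""
    rng = np.random.default_rng(seed)
    w = weights / weights.sum()                          # normalize importance weights
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)                        # assign each sample to its nearest prototype
        for j in range(k):
            mask = labels == j
            if w[mask].sum() > 0:                        # weighted centroid update
                centers[j] = np.average(X[mask], axis=0, weights=w[mask])
    masses = np.array([w[labels == j].sum() for j in range(k)])
    return centers, masses
```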

Discovering causal relations from observational data is important. The existence of unobserved variables (e.g., latent confounding or mediation) can mislead causal identification. To overcome this problem, proximal causal discovery methods attempt to adjust for the bias via a proxy of the unobserved variable. In particular, hypothesis-test-based methods identify the causal edge by testing the induced violation of linearity. However, these methods only apply to discrete data with strict level constraints, which limits their applicability in practice. In this paper, we fix this problem by extending the proximal hypothesis test to systems of continuous variables. Our strategy is to impose regularity conditions on the conditional distributions of the observed variables given the hidden factor, such that if we discretize the observed proxy into sufficiently fine, finite bins, the resulting discretization error can be effectively controlled. Based on this, we can convert the problem of testing continuous causal relations into that of testing discrete causal relations within each bin, which can be solved with existing methods. The non-parametric regularity conditions we present are mild and are satisfied by a wide range of structural causal models. Using both simulated and real-world data, we show the effectiveness of our method in recovering causal relations when unobserved variables exist.
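
The sketch below illustrates only the discretization strategy described above: the continuous proxy is cut into fine quantile bins and an existing discrete test is applied within each bin. The per-bin chi-square test is a generic stand-in, not the paper's proximal hypothesis test, and the variable names are illustrative.

```python
# Discretize a continuous proxy into fine bins, then run a discrete test per bin.
import numpy as np
from scipy import stats

def binned_discrete_tests(x, y, proxy, n_bins=10, n_levels=3):
    """Cut the continuous proxy into quantile bins; test X-Y association within each bin."""
    edges = np.quantile(proxy, np.linspace(0, 1, n_bins + 1))[1:-1]
    bins = np.digitize(proxy, edges)                     # bin index in 0 .. n_bins-1
    p_values = []
    for b in range(n_bins):
        m = bins == b
        if m.sum() < 30:                                 # skip under-populated bins
            continue
        xb = np.digitize(x[m], np.quantile(x[m], np.linspace(0, 1, n_levels + 1))[1:-1])
        yb = np.digitize(y[m], np.quantile(y[m], np.linspace(0, 1, n_levels + 1))[1:-1])
        table = np.zeros((n_levels, n_levels))
        np.add.at(table, (xb, yb), 1)                    # contingency table within the bin
        p_values.append(stats.chi2_contingency(table + 1e-9)[1])
    return p_values
```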

Inferring causal structures from time series data is the central interest of many scientific inquiries. A major barrier to such inference is the problem of subsampling, i.e., the frequency of measurements is much lower than that of the causal influence. To overcome this problem, numerous model-based and model-free methods have been proposed, yet these are either limited to the linear case or fail to establish identifiability. In this work, we propose a model-free algorithm that can identify the entire causal structure from subsampled time series, without any parametric constraint. The idea is that the challenge of subsampling arises mainly from \emph{unobserved} time steps and should therefore be handled with tools designed for unobserved variables. Among these tools, the proxy variable approach fits particularly well, in the sense that the proxy of a variable at an unobserved time step is naturally the same variable at an observed time step. Following this intuition, we establish comprehensive structural identifiability results. Our method is constraint-based and requires no regularities beyond common continuity and differentiability. These theoretical advantages are reflected in the experimental results.

Stochastic partial differential equations have been used in a variety of contexts to model the evolution of uncertain dynamical systems. In recent years, their application to geophysical fluid dynamics has increased massively. For judicious use in modelling fluid evolution, one needs to calibrate the amplitude of the noise to data. In this paper we address this requirement for the stochastic rotating shallow water (SRSW) model. This work is a continuation of [LvLCP23], where a data assimilation methodology was introduced for the SRSW model. The noise used in [LvLCP23] was introduced as an arbitrary random phase shift in Fourier space, which is not necessarily consistent with the uncertainty induced by a model reduction procedure. In this paper, we introduce a new method of noise calibration for the SRSW model which is compatible with the model reduction technique. The method is generic and can be applied to arbitrary stochastic parametrizations; it is also agnostic as to the source of data (real or synthetic). It is based on a principal component analysis that generates the eigenvectors and eigenvalues of the covariance matrix of the stochastic parametrization. For the SRSW model covered in this paper, we calibrate the noise using the elevation variable of the model, as this is an observable easily obtainable in practical applications, and use synthetic data as input for the calibration procedure.
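
A minimal sketch of the generic PCA step described above, assuming the calibration data are snapshots of the elevation increment stacked row-wise; the eigenvectors and eigenvalues of the empirical covariance then provide a basis and amplitudes for the stochastic parametrization. Function names and the sampling step are illustrative.

```python
# PCA of the empirical covariance of calibration snapshots, via SVD of the anomalies.
import numpy as np

def calibrate_noise_basis(increments, n_modes):
    """increments: (n_snapshots, n_gridpoints) array of elevation increments."""
    anomalies = increments - increments.mean(axis=0)       # remove the mean increment
    # SVD of the anomaly matrix avoids forming the full covariance explicitly.
    _, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eigvals = s**2 / (increments.shape[0] - 1)             # covariance eigenvalues
    eofs = vt[:n_modes]                                     # leading eigenvectors (EOFs)
    return eofs, eigvals[:n_modes]

def sample_noise(eofs, eigvals, rng):
    """Draw one realization of the calibrated noise field."""
    xi = rng.standard_normal(len(eigvals))
    return (np.sqrt(eigvals) * xi) @ eofs
```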

We study the basic statistical problem of testing whether normally distributed $n$-dimensional data has been truncated, i.e. altered by only retaining points that lie in some unknown truncation set $S \subseteq \mathbb{R}^n$. As our main algorithmic results, (1) We give a computationally efficient $O(n)$-sample algorithm that can distinguish the standard normal distribution $N(0,I_n)$ from $N(0,I_n)$ conditioned on an unknown and arbitrary convex set $S$. (2) We give a different computationally efficient $O(n)$-sample algorithm that can distinguish $N(0,I_n)$ from $N(0,I_n)$ conditioned on an unknown and arbitrary mixture of symmetric convex sets. These results stand in sharp contrast with known results for learning or testing convex bodies with respect to the normal distribution or learning convex-truncated normal distributions, where state-of-the-art algorithms require essentially $n^{\sqrt{n}}$ samples. An easy argument shows that no finite number of samples suffices to distinguish $N(0,I_n)$ from an unknown and arbitrary mixture of general (not necessarily symmetric) convex sets, so no common generalization of results (1) and (2) above is possible. We also prove that any algorithm (computationally efficient or otherwise) that can distinguish $N(0,I_n)$ from $N(0,I_n)$ conditioned on an unknown symmetric convex set must use $\Omega(n)$ samples. This shows that the sample complexity of each of our algorithms is optimal up to a constant factor.

The prevalence of networked sensors and actuators in many real-world systems such as smart buildings, factories, power plants, and data centers means that these systems generate substantial amounts of multivariate time series data. The rich sensor data can be continuously monitored for intrusion events through anomaly detection. However, conventional threshold-based anomaly detection methods are inadequate due to the dynamic complexities of these systems, while supervised machine learning methods are unable to exploit the large amounts of data due to the lack of labeled data. On the other hand, current unsupervised machine learning approaches have not fully exploited the spatial-temporal correlations and other dependencies amongst the multiple variables (sensors/actuators) in the system for detecting anomalies. In this work, we propose an unsupervised multivariate anomaly detection method based on Generative Adversarial Networks (GANs). Instead of treating each data stream independently, our proposed MAD-GAN framework considers the entire variable set concurrently to capture the latent interactions amongst the variables. We also fully exploit both the generator and the discriminator produced by the GAN, using a novel anomaly score called the DR-score to detect anomalies through discrimination and reconstruction. We have tested MAD-GAN on two recent datasets collected from real-world cyber-physical systems (CPS): the Secure Water Treatment (SWaT) and the Water Distribution (WADI) datasets. Our experimental results show that the proposed MAD-GAN is effective in detecting anomalies caused by various cyber-intrusions in these complex real-world systems.
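
A hedged sketch of a discrimination-plus-reconstruction anomaly score in the spirit of the DR-score described above; the exact scoring rule and the GAN training loop are not reproduced, `generator` and `discriminator` are assumed pre-trained callables, and the latent inversion is a crude random search kept short for illustration.

```python
# Anomaly score combining a reconstruction term (via latent inversion of the generator)
# with a discrimination term (how "fake" the window looks to the discriminator).
import numpy as np

def dr_style_score(x, generator, discriminator, latent_dim, lam=0.5,
                   n_candidates=512, rng=None):
    """Higher score -> more anomalous window x."""
    rng = np.random.default_rng() if rng is None else rng
    # Reconstruction term: find the latent code whose generation is closest to x.
    z = rng.standard_normal((n_candidates, latent_dim))
    recon_errors = np.array([np.mean((generator(zi) - x) ** 2) for zi in z])
    reconstruction = recon_errors.min()
    # Discrimination term: low discriminator output ("fake-looking") is suspicious.
    discrimination = 1.0 - discriminator(x)
    return lam * reconstruction + (1.0 - lam) * discrimination
```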

There is large and growing interest in generative adversarial networks (GANs), which offer powerful features for generative modeling, density estimation, and energy function learning. GANs are difficult to train and evaluate but are capable of creating amazingly realistic, though synthetic, image data. Ideas stemming from GANs, such as adversarial losses, are creating research opportunities for other challenges such as domain adaptation. In this paper, we look at the field of GANs with emphasis on these areas of emerging research. To provide background for adversarial techniques, we survey the field of GANs, looking at the original formulation, training variants, evaluation methods, and extensions. We then survey recent work on transfer learning, focusing on comparing different adversarial domain adaptation methods. Finally, we look ahead to identify open research directions for GANs and domain adaptation, including some promising applications such as sensor-based human behavior modeling.
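
For context, the original GAN formulation the survey starts from is the standard two-player minimax game between a generator $G$ and a discriminator $D$ (a textbook statement, included here only for reference):

$$\min_G \max_D \; V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]$$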

Convolutional networks (ConvNets) have achieved great success in various challenging vision tasks. However, the performance of ConvNets degrades when encountering domain shift. Domain adaptation is especially significant, yet challenging, in the field of biomedical image analysis, where cross-modality data have largely different distributions. Given that annotating medical data is especially expensive, supervised transfer learning approaches are not optimal. In this paper, we propose an unsupervised domain adaptation framework with adversarial learning for cross-modality biomedical image segmentation. Specifically, our model is based on a dilated fully convolutional network for pixel-wise prediction. Moreover, we build a plug-and-play domain adaptation module (DAM) to map the target input to features aligned with the source-domain feature space. A domain critic module (DCM) is set up to discriminate between the feature spaces of the two domains. We optimize the DAM and DCM via an adversarial loss without using any target-domain labels. Our proposed method is validated by adapting a ConvNet trained on MRI images to unpaired CT data for cardiac structure segmentation, achieving very promising results.
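
A minimal, hedged sketch of the adversarial feature-alignment idea described above (not the paper's dilated fully convolutional network): an adaptation module maps target inputs to features while a critic is trained to tell source features from target features, and the adapter is updated to fool it. Layer sizes, optimizers, and input shapes are illustrative placeholders.

```python
# Adversarial alignment of target features with a source-domain feature space.
import torch
import torch.nn as nn

feat_dim = 64
dam = nn.Sequential(nn.Linear(32, feat_dim), nn.ReLU(), nn.Linear(feat_dim, feat_dim))  # adapter
dcm = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(), nn.Linear(feat_dim, 1))   # critic
bce = nn.BCEWithLogitsLoss()
opt_dam = torch.optim.Adam(dam.parameters(), lr=1e-4)
opt_dcm = torch.optim.Adam(dcm.parameters(), lr=1e-4)

def adaptation_step(source_feat, target_x):
    # 1) Train the critic to separate source features (label 1) from target features (label 0).
    target_feat = dam(target_x)
    critic_loss = (bce(dcm(source_feat), torch.ones(len(source_feat), 1)) +
                   bce(dcm(target_feat.detach()), torch.zeros(len(target_x), 1)))
    opt_dcm.zero_grad(); critic_loss.backward(); opt_dcm.step()
    # 2) Train the adapter so target features look like source features to the critic.
    adv_loss = bce(dcm(dam(target_x)), torch.ones(len(target_x), 1))
    opt_dam.zero_grad(); adv_loss.backward(); opt_dam.step()
    return critic_loss.item(), adv_loss.item()
```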
