
In this article, a novel identification test is proposed which can be applied to parametric models such as Mixture of Normal (MN) distributions, Markov Switching (MS) models, or Structural Vector Autoregressive (SVAR) models. In this approach, it is assumed that the model parameters are identified under the null hypothesis, whereas under the alternative they are not. Thanks to this setting, the Maximum Likelihood (ML) estimator preserves its properties under the null hypothesis. The proposed test is based on a comparison of two consistent estimators obtained from independent subsamples of the data set. A Wald-type statistic is proposed which has a standard $\chi^2$ distribution. Finally, the method is adapted to test whether the heteroscedasticity assumption is sufficient to identify the parameters of an SVAR model. Its properties are evaluated with a Monte Carlo experiment which allows for non-Gaussian error distributions and a mis-specified VAR order. The results indicate that the test has an asymptotically correct size. Moreover, the outcomes show that the power of the test makes it suitable for empirical applications.
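A minimal sketch of the split-sample Wald comparison described above, illustrated on a deliberately simple parametric model (a Gaussian with unknown mean and variance) rather than the MN, MS, or SVAR models targeted in the paper. All function names are hypothetical.

```python
# Split-sample Wald-type identification test: estimate the parameters by ML
# on two independent subsamples and compare the estimates.  Under the null
# (identified parameters) the statistic is asymptotically chi-squared.
import numpy as np
from scipy import stats


def ml_estimate(y):
    """ML estimates and their asymptotic covariance for N(mu, sigma^2)."""
    n = len(y)
    mu, sigma2 = y.mean(), y.var()
    theta = np.array([mu, sigma2])
    cov = np.diag([sigma2 / n, 2.0 * sigma2**2 / n])
    return theta, cov


def split_sample_wald_test(y, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    half = len(y) // 2
    theta1, cov1 = ml_estimate(y[idx[:half]])
    theta2, cov2 = ml_estimate(y[idx[half:]])
    diff = theta1 - theta2
    wald = diff @ np.linalg.solve(cov1 + cov2, diff)
    p_value = stats.chi2.sf(wald, df=len(diff))
    return wald, p_value


y = np.random.default_rng(1).normal(loc=0.5, scale=1.2, size=2000)
print(split_sample_wald_test(y))
```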

Related Content

We consider an analysis of variance type problem where the sample observations are random elements in an infinite dimensional space. This scenario covers the case where the observations are random functions. For such a problem, we propose a test based on spatial signs. We develop an asymptotic implementation as well as a bootstrap implementation and a permutation implementation of this test and investigate their size and power properties. We compare the performance of our test with that of several mean-based tests of analysis of variance for functional data studied in the literature. Interestingly, our test not only outperforms the mean-based tests in several non-Gaussian models with heavy tails or skewed distributions, but also in some Gaussian models. Further, we compare the performance of our test with the mean-based tests in several models involving contaminated probability distributions. Finally, we demonstrate the performance of these tests on three real datasets: a Canadian weather dataset, a spectrometric dataset on chemical analysis of meat samples, and a dataset on orthotic measurements on volunteers.
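A schematic sketch of the permutation implementation of a spatial-sign based ANOVA test for discretized curves. The statistic used here, a between-group dispersion of average spatial signs, is a simplified stand-in for the statistic studied in the paper.

```python
# Spatial-sign permutation test for functional ANOVA (schematic version).
import numpy as np


def spatial_sign(x, eps=1e-12):
    """Spatial sign of each (discretized) functional observation."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return np.where(norm > eps, x / norm, 0.0)


def sign_anova_statistic(curves, labels):
    signs = spatial_sign(curves)
    overall = signs.mean(axis=0)
    stat = 0.0
    for g in np.unique(labels):
        grp = signs[labels == g]
        stat += len(grp) * np.sum((grp.mean(axis=0) - overall) ** 2)
    return stat


def permutation_test(curves, labels, n_perm=999, seed=0):
    rng = np.random.default_rng(seed)
    observed = sign_anova_statistic(curves, labels)
    count = sum(
        sign_anova_statistic(curves, rng.permutation(labels)) >= observed
        for _ in range(n_perm)
    )
    return observed, (count + 1) / (n_perm + 1)


# Two groups of noisy curves on a grid of 50 points, second group shifted.
rng = np.random.default_rng(2)
grid = np.linspace(0, 1, 50)
g1 = np.sin(2 * np.pi * grid) + rng.normal(0, 1, (30, 50))
g2 = np.sin(2 * np.pi * grid) + 0.5 + rng.normal(0, 1, (30, 50))
curves = np.vstack([g1, g2])
labels = np.array([0] * 30 + [1] * 30)
print(permutation_test(curves, labels))
```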

Context: Differential testing is a useful approach that uses different implementations of the same algorithms and compares the results for software testing. In recent years, this approach was successfully used for test campaigns of deep learning frameworks. Objective: There is little knowledge on the application of differential testing beyond deep learning. Within this article, we want to close this gap for classification algorithms. Method: We conduct a case study using Scikit-learn, Weka, Spark MLlib, and Caret in which we identify the potential of differential testing by considering which algorithms are available in multiple frameworks, the feasibility by identifying pairs of algorithms that should exhibit the same behavior, and the effectiveness by executing tests for the identified pairs and analyzing the deviations. Results: While we found a large potential for popular algorithms, the feasibility seems limited because it is often not possible to determine configurations that are the same across frameworks. The execution of the feasible tests revealed a large number of deviations in the scores and predicted classes. Only a lenient oracle based on the statistical significance of class differences avoids a huge number of test failures. Conclusions: The potential of differential testing beyond deep learning seems limited for research into the quality of machine learning libraries. Practitioners may still use the approach if they have deep knowledge about the implementations, especially if a coarse oracle that only considers significant differences between classes is sufficient.
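A minimal sketch of a differential-testing harness for classifiers, with both a strict oracle (exact class agreement) and a lenient oracle (a significance test on correctness differences). Two scikit-learn solvers for logistic regression stand in for "the same algorithm in two frameworks"; in the study the pairs come from Scikit-learn, Weka, Spark MLlib, and Caret.

```python
import numpy as np
from scipy.stats import binomtest
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def differential_test(clf_a, clf_b, X_train, y_train, X_test, y_test):
    clf_a.fit(X_train, y_train)
    clf_b.fit(X_train, y_train)
    pred_a, pred_b = clf_a.predict(X_test), clf_b.predict(X_test)

    # Strict oracle: every predicted class must match.
    exact_agreement = np.mean(pred_a == pred_b)

    # Lenient oracle: only flag statistically significant differences in
    # correctness (exact McNemar-style test on discordant predictions).
    a_ok, b_ok = pred_a == y_test, pred_b == y_test
    only_a, only_b = np.sum(a_ok & ~b_ok), np.sum(b_ok & ~a_ok)
    n_discordant = only_a + only_b
    p_value = (binomtest(int(only_a), int(n_discordant)).pvalue
               if n_discordant > 0 else 1.0)
    return exact_agreement, p_value


X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
pair = (LogisticRegression(solver="lbfgs", max_iter=1000),
        LogisticRegression(solver="liblinear", max_iter=1000))
print(differential_test(*pair, X_tr, y_tr, X_te, y_te))
```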

We consider studies where multiple measures on an outcome variable are collected over time, but some subjects drop out before the end of follow-up. Analyses of such data often proceed under either a 'last observation carried forward' or a 'missing at random' assumption. We consider two alternative strategies for identification; the first is closely related to the difference-in-differences methodology in the causal inference literature. The second enables correction for violations of the parallel trend assumption, so long as one has access to a valid 'bespoke instrumental variable'. These are compared with existing approaches, first conceptually and then in an analysis of data from the Framingham Heart Study.
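A schematic illustration of the difference-in-differences style identification idea: under a parallel-trend assumption, the unobserved change in the outcome among dropouts is imputed from the change observed among completers. The simulated data and column names are hypothetical and not taken from the paper.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 1000
baseline = rng.normal(50, 10, n)
true_change = rng.normal(5, 2, n)
dropout = rng.random(n) < 0.3          # 30% drop out before follow-up
followup = np.where(dropout, np.nan, baseline + true_change)
df = pd.DataFrame({"y0": baseline, "y1": followup, "dropout": dropout})

completers = df[~df["dropout"]]
dropouts = df[df["dropout"]]

# Parallel trends: dropouts' expected change equals completers' change.
trend = (completers["y1"] - completers["y0"]).mean()
mean_y1_dropouts = dropouts["y0"].mean() + trend

# Identified overall follow-up mean, combining observed and imputed parts.
p_drop = df["dropout"].mean()
overall_mean_y1 = (1 - p_drop) * completers["y1"].mean() + p_drop * mean_y1_dropouts
print(round(overall_mean_y1, 2))
```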

Causal effect estimation from observational data is a challenging problem, especially with high-dimensional data and in the presence of unobserved variables. The available data-driven methods for tackling the problem either provide only bounds on a causal effect (i.e., a non-unique estimate) or have low efficiency. The major hurdle to achieving high efficiency while obtaining a unique and unbiased causal effect estimate is finding a proper adjustment set for confounding control quickly, given the huge covariate space and the presence of unobserved variables. In this paper, we approach the problem as a local search task for finding valid adjustment sets in data. We establish theorems to support the local search for adjustment sets, and we show that unique and unbiased estimation can be achieved from observational data even when unobserved variables exist. We then propose a data-driven algorithm that is fast and consistent under mild assumptions. We also make use of a frequent pattern mining method to further speed up the search for minimal adjustment sets for causal effect estimation. Experiments conducted on extensive synthetic and real-world datasets demonstrate that the proposed algorithm outperforms state-of-the-art criteria/estimators in both accuracy and time-efficiency.
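A minimal sketch of causal effect estimation via covariate adjustment once a valid adjustment set has been found. The paper's contribution is the fast local search that finds such a set; this snippet only shows, on synthetic data, how an identified set is used and why the unadjusted estimate is biased.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)
n = 5000
z = rng.normal(size=n)                      # observed confounder (adjustment set)
x = 0.8 * z + rng.normal(size=n)            # treatment
y = 1.5 * x + 2.0 * z + rng.normal(size=n)  # outcome, true effect of x is 1.5

# Unadjusted estimate is biased by the confounder z.
naive = LinearRegression().fit(x[:, None], y).coef_[0]

# Adjusting for the valid set {z} recovers the causal effect.
adjusted = LinearRegression().fit(np.column_stack([x, z]), y).coef_[0]
print(round(naive, 2), round(adjusted, 2))
```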

In this paper we propose a new time-varying econometric model, called Time-Varying Poisson AutoRegressive with eXogenous covariates (TV-PARX), suited to model and forecast time series of counts. We show that the score-driven framework is particularly suitable to recover the evolution of time-varying parameters and provides the required flexibility to model and forecast time series of counts characterized by convoluted nonlinear dynamics and structural breaks. We study the asymptotic properties of the TV-PARX model and prove that, under mild conditions, maximum likelihood estimation (MLE) yields strongly consistent and asymptotically normal parameter estimates. Finite-sample performance and forecasting accuracy are evaluated through Monte Carlo simulations. The empirical usefulness of the time-varying specification of the proposed TV-PARX model is shown by analyzing the number of new daily COVID-19 infections in Italy and the number of corporate defaults in the US.
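A minimal simulation sketch of the constant-parameter PARX recursion that TV-PARX extends: the conditional intensity follows lambda_t = omega + alpha*y_{t-1} + beta*lambda_{t-1} + gamma*x_t; in TV-PARX the parameters themselves evolve through a score-driven update, which is not reproduced here.

```python
import numpy as np


def simulate_parx(T, omega=0.5, alpha=0.3, beta=0.4, gamma=0.2, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=T) ** 2           # a positive exogenous covariate
    y = np.zeros(T, dtype=int)
    lam = np.zeros(T)
    lam[0] = omega / (1 - alpha - beta)   # start at the unconditional level
    y[0] = rng.poisson(lam[0])
    for t in range(1, T):
        lam[t] = omega + alpha * y[t - 1] + beta * lam[t - 1] + gamma * x[t]
        y[t] = rng.poisson(lam[t])
    return y, lam, x


counts, intensity, covariate = simulate_parx(500)
print(counts[:10], intensity[:5].round(2))
```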

We study the performance -- and specifically the rate at which the error probability converges to zero -- of Machine Learning (ML) classification techniques. Leveraging the theory of large deviations, we provide the mathematical conditions for a ML classifier to exhibit error probabilities that vanish exponentially, say $\sim \exp\left(-n\,I + o(n) \right)$, where $n$ is the number of informative observations available for testing (or another relevant parameter, such as the size of the target in an image) and $I$ is the error rate. Such conditions depend on the Fenchel-Legendre transform of the cumulant-generating function of the Data-Driven Decision Function (D3F, i.e., what is thresholded before the final binary decision is made) learned in the training phase. As such, the D3F and, consequently, the related error rate $I$ depend on the given training set, which is assumed to be of finite size. Interestingly, these conditions can be verified and tested numerically by exploiting the available dataset, or a synthetic dataset generated according to the available information on the underlying statistical model. In other words, the convergence of the classification error probability to zero and its rate can be computed on a portion of the dataset available for training. Consistently with large deviations theory, we can also establish the convergence, for $n$ large enough, of the normalized D3F statistic to a Gaussian distribution. This property is exploited to set a desired asymptotic false alarm probability, which empirically turns out to be accurate even for quite realistic values of $n$. Furthermore, approximate error probability curves $\sim \zeta_n \exp\left(-n\,I \right)$ are provided, thanks to a refined asymptotic derivation (often referred to as exact asymptotics), where $\zeta_n$ captures the most representative sub-exponential terms of the error probabilities.
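A numerical sketch of the error-exponent computation described above: the cumulant-generating function (CGF) of the D3F statistic is estimated from samples, and the rate $I$ is its Fenchel-Legendre transform, $I = \sup_s \{ s\gamma - \kappa(s) \}$, evaluated at the decision threshold $\gamma$. The Gaussian D3F used here is a toy stand-in for an actual trained decision function.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import logsumexp

rng = np.random.default_rng(5)
d3f_samples = rng.normal(loc=0.0, scale=1.0, size=100_000)  # D3F under H0
threshold = 1.0                                              # decision threshold


def empirical_cgf(s, samples):
    """log E[exp(s * T)] estimated from samples of the D3F statistic T."""
    return logsumexp(s * samples) - np.log(len(samples))


def rate_function(gamma, samples):
    """Fenchel-Legendre transform sup_s { s*gamma - CGF(s) }."""
    res = minimize_scalar(lambda s: -(s * gamma - empirical_cgf(s, samples)),
                          bounds=(0.0, 50.0), method="bounded")
    return -res.fun


# For a standard Gaussian D3F the exact exponent is gamma^2 / 2 = 0.5.
print(round(rate_function(threshold, d3f_samples), 3))
```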

Finding meaningful concepts in engineering application datasets that allow for a sensible grouping of designs is very helpful in many contexts. It makes it possible to determine groups of designs with similar properties and provides useful knowledge for the engineering decision-making process. It also opens the route for further refinement of specific design candidates which exhibit certain characteristic features. In this work, an approach to define meaningful and consistent concepts in an existing engineering dataset is presented. The designs in the dataset are characterized by a multitude of features such as design parameters, geometrical properties, or performance values of the design for various boundary conditions. In the proposed approach, the complete feature set is partitioned into several subsets called description spaces. The definition of the concepts respects this partitioning, which leads to several desired properties of the identified concepts that cannot be achieved with state-of-the-art clustering or concept identification approaches. A novel concept quality measure is proposed, which provides an objective value for a given definition of concepts in a dataset. The usefulness of the measure is demonstrated on a realistic engineering dataset consisting of about 2500 airfoil profiles, for which the performance values (lift and drag) for three different operating conditions were obtained by computational fluid dynamics simulations. A numerical optimization procedure is employed which maximizes the concept quality measure and finds meaningful concepts for different setups of the description spaces, while also incorporating user preferences. It is demonstrated how these concepts can be used to select archetypal representatives of the dataset which exhibit the characteristic features of each concept.
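A heavily simplified sketch of the description-space idea: features are partitioned into two description spaces (here hypothetical "geometry" and "performance" features), designs are clustered within each space, and the agreement of the resulting groupings is scored. The adjusted Rand index used below is only a simple stand-in for a cross-space consistency check, not the concept quality measure proposed in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(6)
# Hypothetical dataset: geometric parameters and performance values
# for 300 designs, generated with a shared three-group structure.
geometry = np.vstack([rng.normal(c, 0.3, (100, 4)) for c in (0, 2, 4)])
performance = np.vstack([rng.normal(c, 0.3, (100, 2)) for c in (0, 2, 4)])

# Cluster separately in each description space.
labels_geo = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(geometry)
labels_perf = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(performance)

# Concepts are only meaningful if the groupings agree across spaces.
print(round(adjusted_rand_score(labels_geo, labels_perf), 2))
```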

A time-varying zero-inflated serially dependent Poisson process is proposed. The model assumes that the intensity of the Poisson process evolves according to a generalized autoregressive conditional heteroscedastic (GARCH) formulation. The proposed model is a generalization of the zero-inflated Poisson Integer GARCH model proposed by Fukang Zhu in 2012, which in turn is a generalization of the Integer GARCH (INGARCH) model introduced by Ferland, Latour, and Oraichi in 2006. The proposed model builds on previous work by allowing the zero-inflation parameter to vary over time, governed by a deterministic function or by an exogenous variable. Both the Expectation Maximization (EM) and the Maximum Likelihood Estimation (MLE) approaches are presented as possible estimation methods. A simulation study shows that both parameter estimation methods provide good estimates. Applications to two real-life data sets show that the proposed INGARCH model provides a better fit than the traditional zero-inflated INGARCH model in the cases considered.
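A minimal simulation sketch of the model described above: an INGARCH intensity recursion combined with a zero-inflation probability that varies over time through an exogenous covariate. The logistic link and parameter values are illustrative assumptions, not the paper's specification.

```python
import numpy as np


def simulate_tv_zip_ingarch(T, omega=1.0, alpha=0.3, beta=0.4,
                            a=-1.0, b=1.5, seed=0):
    rng = np.random.default_rng(seed)
    x = np.sin(2 * np.pi * np.arange(T) / 50)     # exogenous covariate
    pi = 1.0 / (1.0 + np.exp(-(a + b * x)))       # time-varying zero-inflation
    y = np.zeros(T, dtype=int)
    lam = np.full(T, omega / (1 - alpha - beta))  # unconditional starting level
    for t in range(1, T):
        lam[t] = omega + alpha * y[t - 1] + beta * lam[t - 1]
        if rng.random() < pi[t]:
            y[t] = 0                              # structural zero
        else:
            y[t] = rng.poisson(lam[t])
    return y, lam, pi


counts, intensity, zero_prob = simulate_tv_zip_ingarch(500)
print(counts[:15])
```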

Many modern data analytics applications on graphs operate on domains where the graph topology is not known a priori, and hence its determination becomes part of the problem definition rather than serving as prior knowledge which aids the problem solution. Part III of this monograph starts by addressing ways to learn graph topology, from the case where the physics of the problem already suggest a possible topology, through to the most general cases where the graph topology is learned from the data. A particular emphasis is on graph topology definition based on the correlation and precision matrices of the observed data, combined with additional prior knowledge and structural conditions, such as the smoothness or sparsity of graph connections. For learning sparse graphs (with a small number of edges), the least absolute shrinkage and selection operator, known as LASSO, is employed, along with its graph-specific variant, graphical LASSO. For completeness, both variants of LASSO are derived in an intuitive way and explained. An in-depth elaboration of the graph topology learning paradigm is provided through several examples on physically well defined graphs, such as electric circuits, linear heat transfer, social and computer networks, and spring-mass systems. As many graph neural networks (GNNs) and graph convolutional networks (GCNs) are emerging, we also review the main trends in GNNs and GCNs from the perspective of graph signal filtering. Tensor representation of lattice-structured graphs is considered next, and it is shown that tensors (multidimensional data arrays) are a special class of graph signals, whereby the graph vertices reside on a high-dimensional regular lattice structure. This part of the monograph concludes with two emerging applications in financial data processing and underground transportation network modeling.
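A minimal sketch of learning a sparse graph topology from data with graphical LASSO, as mentioned above: the estimated precision matrix is thresholded to obtain the graph's edge structure. The chain-structured toy data and the threshold value are illustrative choices.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(7)
# Toy chain-structured data: each variable depends on its predecessor,
# so the true conditional-independence graph is a path.
n_samples, n_nodes = 2000, 5
X = np.zeros((n_samples, n_nodes))
X[:, 0] = rng.normal(size=n_samples)
for k in range(1, n_nodes):
    X[:, k] = 0.8 * X[:, k - 1] + rng.normal(size=n_samples)

model = GraphicalLasso(alpha=0.05).fit(X)
precision = model.precision_

# Off-diagonal non-zeros of the precision matrix define graph edges.
adjacency = (np.abs(precision) > 1e-3).astype(int)
np.fill_diagonal(adjacency, 0)
print(adjacency)
```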

The focus of Part I of this monograph was on the fundamental properties, graph topologies, and spectral representations of graphs. Part II builds on these concepts to address the algorithmic and practical issues centered around data/signal processing on graphs, that is, the focus is on the analysis and estimation of both deterministic and random data on graphs. The fundamental ideas related to graph signals are introduced through a simple and intuitive, yet illustrative and general enough, case study of multisensor temperature field estimation. The concept of systems on graphs is defined using graph signal shift operators, which generalize the corresponding principles from traditional learning systems. At the core of the spectral domain representation of graph signals and systems is the Graph Discrete Fourier Transform (GDFT). The spectral domain representations are then used as the basis to introduce graph signal filtering concepts and address their design, including Chebyshev polynomial approximation series. Ideas related to the sampling of graph signals are presented and further linked with compressive sensing. Localized graph signal analysis in the joint vertex-spectral domain is referred to as vertex-frequency analysis, since it can be considered an extension of classical time-frequency analysis to the graph domain of a signal. Important topics related to the local graph Fourier transform (LGFT) are covered, together with its various forms, including the graph spectral and vertex domain windows and the inversion conditions and relations. A link between the LGFT with a spectrally varying window and the spectral graph wavelet transform (SGWT) is also established. Realizations of the LGFT and SGWT using polynomial (Chebyshev) approximations of the spectral functions are further considered. Finally, energy versions of the vertex-frequency representations are introduced.
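A minimal sketch of the Graph Discrete Fourier Transform (GDFT) and a spectral low-pass filter mentioned above: the graph Laplacian is eigendecomposed, the signal is transformed, filtered in the spectral domain, and transformed back. The small path graph and the cutoff value are purely for illustration.

```python
import numpy as np

# Adjacency and Laplacian of a 6-vertex path graph.
N = 6
A = np.zeros((N, N))
for i in range(N - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# GDFT basis: eigenvectors of the Laplacian (eigenvalues = graph frequencies).
eigvals, U = np.linalg.eigh(L)

rng = np.random.default_rng(8)
x = np.linspace(0, 1, N) + rng.normal(0, 0.2, N)   # smooth signal + noise

X_hat = U.T @ x                                    # GDFT of the signal
H = (eigvals < 1.0).astype(float)                  # ideal low-pass response
x_filtered = U @ (H * X_hat)                       # inverse GDFT of filtered signal
print(np.round(x_filtered, 3))
```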
