
The homogeneity problem of testing whether more than two samples come from the same population is considered for the case of functional data. The methodological results are motivated by the study of the homogeneity of electronic devices fabricated with different materials and active layer thicknesses. When the stochastic processes associated with each sample are normally distributed, this problem is known as the functional ANOVA (FANOVA) problem and reduces to testing the equality of the group mean functions. However, the current/voltage curves associated with Resistive Random Access Memories (RRAM) are not generated by a Gaussian process, so a different approach is necessary for testing homogeneity. To solve this problem, two approaches based on basis expansion of the sample curves are proposed, each in a parametric and a nonparametric version. The first consists of applying multivariate homogeneity tests to a vector of basis coefficients of the sample curves. The second is based on dimension reduction by functional principal component analysis (FPCA) of the sample curves and testing multivariate homogeneity on a vector of principal component scores. Different numerical approximation techniques are employed to adapt the experimental data for the statistical study. An extensive simulation study is developed to analyze the performance of both approaches in the parametric and nonparametric cases. Finally, the proposed methodologies are applied to three samples of experimental reset curves measured in three different RRAM technologies.
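The first approach in the abstract above (a homogeneity test on basis coefficients) might be sketched roughly as follows. This is only an illustration under stated assumptions: the cosine basis, the between-group statistic, the permutation test, and the simulated curves with uniform (non-Gaussian) noise are all choices made here, not the basis, tests, or data used in the paper.

```python
import math
import random


def cosine_basis_coefficients(curve, n_basis):
    """Project one curve, sampled on a uniform midpoint grid, onto the first
    n_basis cosine functions using a discrete inner product."""
    m = len(curve)
    coefs = []
    for k in range(n_basis):
        basis = [math.cos(math.pi * k * (j + 0.5) / m) for j in range(m)]
        norm = sum(b * b for b in basis)
        coefs.append(sum(c * b for c, b in zip(curve, basis)) / norm)
    return coefs


def between_group_statistic(coefs, labels):
    """Sum over groups of n_g times the squared distance between the group
    mean coefficient vector and the overall mean coefficient vector."""
    dim = len(coefs[0])
    overall = [sum(c[d] for c in coefs) / len(coefs) for d in range(dim)]
    stat = 0.0
    for g in set(labels):
        members = [c for c, lab in zip(coefs, labels) if lab == g]
        mean_g = [sum(c[d] for c in members) / len(members) for d in range(dim)]
        stat += len(members) * sum((a - b) ** 2 for a, b in zip(mean_g, overall))
    return stat


def permutation_homogeneity_test(curves, labels, n_basis=4, n_perm=499, seed=0):
    """Nonparametric test: compare the observed statistic with its
    distribution under random relabelling of the curves."""
    rng = random.Random(seed)
    coefs = [cosine_basis_coefficients(c, n_basis) for c in curves]
    observed = between_group_statistic(coefs, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if between_group_statistic(coefs, shuffled) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


def simulate_group(rng, n_curves, shift, m=30):
    """Hypothetical curves: a sine wave plus a linear trend of slope `shift`
    plus uniform (non-Gaussian) noise."""
    grid = [(j + 0.5) / m for j in range(m)]
    return [
        [shift * t + math.sin(2 * math.pi * t) + rng.uniform(-0.5, 0.5) for t in grid]
        for _ in range(n_curves)
    ]


rng = random.Random(1)
curves = simulate_group(rng, 10, 0.0) + simulate_group(rng, 10, 0.0) + simulate_group(rng, 10, 3.0)
labels = [0] * 10 + [1] * 10 + [2] * 10
p_value = permutation_homogeneity_test(curves, labels)
print("permutation p-value:", p_value)
```

Because the third simulated group has a clearly different mean curve, the p-value should come out small. The FPCA-based variant would replace the basis coefficients with principal component scores before applying the same kind of test.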

Related content

Functionals


Category · Normalized · MoDELS · Bootstrap · Standard normal distribution

March 20, 2024

A class of bootstrap based residuals for compositional data

Gustavo H. A. Pereira,Jianwen Cai

Regression models for compositional data are common in several areas of knowledge. As in other classes of regression models, it is desirable to perform diagnostic analysis in these models using residuals that are approximately standard normally distributed. However, for regression models for compositional data, no multivariate residual has so far met this requirement. In this work, we introduce a class of asymptotically standard normally distributed residuals for compositional data based on the bootstrap. Monte Carlo simulation studies indicate that the distributions of the residuals of this class are well approximated by the standard normal distribution in small samples. An application to simulated data also suggests that one of the residuals of the proposed class is better at identifying model misspecification than its competitors. Finally, the usefulness of the best residual of the proposed class is illustrated through an application to sleep stages. The class of residuals proposed here can also be used in other classes of multivariate regression models.

Approximation · Spectral radius · Irreducible · UniFormer · Tensor

March 19, 2024

Complexity of Geometric programming in the Turing model and application to nonnegative tensors

Shmuel Friedland,Stéphane Gaubert

from arxiv, 38 pages

We consider a geometric programming problem consisting in minimizing a function given by the supremum of finitely many log-Laplace transforms of discrete nonnegative measures on a Euclidean space. Under a coerciveness assumption, we show that an $\varepsilon$-minimizer can be computed in a time that is polynomial in the input size and in $|\log\varepsilon|$. This is obtained by establishing bit-size estimates on approximate minimizers and by applying the ellipsoid method. We also derive polynomial iteration complexity bounds for the interior point method applied to the same class of problems. We deduce that the spectral radius of a partially symmetric, weakly irreducible nonnegative tensor can be approximated within $\varepsilon$ error in poly-time. For strongly irreducible tensors, we also show that the logarithm of the positive eigenvector is poly-time computable. Our results also yield that the maximum of a nonnegative homogeneous $d$-form in the unit ball with respect to the $d$-Hölder norm can be approximated in poly-time. In particular, the spectral radius of uniform weighted hypergraphs and some known upper bounds for the clique number of uniform hypergraphs are poly-time computable.
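In symbols, the objective described in this abstract can be written as below. The notation is assumed here for illustration only: $\mu_1, \dots, \mu_m$ stand for the discrete nonnegative measures, $A_i \subset \mathbb{R}^n$ for their finite supports, and $x_\varepsilon$ for an $\varepsilon$-minimizer.

```latex
\[
  f(x) \;=\; \max_{1 \le i \le m} \, \log \sum_{a \in A_i} \mu_i(a)\, e^{\langle a, x \rangle},
  \qquad x \in \mathbb{R}^n,
\]
\[
  f(x_\varepsilon) \;\le\; \inf_{x \in \mathbb{R}^n} f(x) + \varepsilon .
\]
```

Each inner term is the log-Laplace transform of $\mu_i$, which is convex in $x$, so $f$ is a maximum of finitely many convex functions and therefore convex.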

Integration · Estimation/Estimator · data integrity · Inference · Principle

March 19, 2024

Turning the information-sharing dial: efficient inference from different data sources

Emily C. Hector,Ryan Martin

from arxiv, 46 pages, 10 figures, 15 tables

A fundamental aspect of statistics is the integration of data from different sources. Classically, Fisher and others were focused on how to integrate homogeneous (or only mildly heterogeneous) sets of data. More recently, as data are becoming more accessible, the question of whether data sets from different sources should be integrated is becoming more relevant. The current literature treats this as a question with only two answers: integrate or don't. Here we take a different approach, motivated by information-sharing principles coming from the shrinkage estimation literature. In particular, we deviate from the do/don't perspective and propose a dial parameter that controls the extent to which two data sources are integrated. How far this dial parameter should be turned is shown to depend, for example, on the informativeness of the different data sources as measured by Fisher information. In the context of generalized linear models, this more nuanced data integration framework leads to relatively simple parameter estimates and valid tests/confidence intervals. Moreover, we demonstrate both theoretically and empirically that setting the dial parameter according to our recommendation leads to more efficient estimation compared to other binary data integration schemes.
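The idea of a dial between "don't integrate" and "fully integrate" can be illustrated with a toy estimator of a mean. This is not the estimator from the paper: the function `dial_estimate`, the down-weighting rule, and the data are assumptions made here purely to show how a single parameter in [0, 1] interpolates between the two binary choices.

```python
def dial_estimate(primary, secondary, dial):
    """Toy weighted mean in which every secondary observation counts as
    `dial` of a primary observation (hypothetical rule, for illustration).

    dial = 0.0 ignores the secondary source; dial = 1.0 pools both sources.
    """
    if not 0.0 <= dial <= 1.0:
        raise ValueError("dial must lie in [0, 1]")
    weight_total = len(primary) + dial * len(secondary)
    return (sum(primary) + dial * sum(secondary)) / weight_total


# Hypothetical data: the secondary source is shifted relative to the primary one.
primary = [1.0, 2.0, 3.0]
secondary = [5.0, 7.0]

for dial in (0.0, 0.5, 1.0):
    print(dial, dial_estimate(primary, secondary, dial))
```

In the paper the recommended dial setting depends on quantities such as the Fisher information of each source; here the dial is simply supplied by hand.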

INFORMS · Machine Learning · Identifiable · Scaling · Learning

March 19, 2024

Information decomposition in complex systems via machine learning

Kieran A. Murphy,Dani S. Bassett

from arxiv, Project page: //distributed-information-bottleneck.github.io/

One of the fundamental steps toward understanding a complex system is identifying variation at the scale of the system's components that is most relevant to behavior on a macroscopic scale. Mutual information provides a natural means of linking variation across scales of a system due to its independence of functional relationship between observables. However, characterizing the manner in which information is distributed across a set of observables is computationally challenging and generally infeasible beyond a handful of measurements. Here we propose a practical and general methodology that uses machine learning to decompose the information contained in a set of measurements by jointly optimizing a lossy compression of each measurement. Guided by the distributed information bottleneck as a learning objective, the information decomposition identifies the variation in the measurements of the system state most relevant to specified macroscale behavior. We focus our analysis on two paradigmatic complex systems: a Boolean circuit and an amorphous material undergoing plastic deformation. In both examples, the large amount of entropy of the system state is decomposed, bit by bit, in terms of what is most related to macroscale behavior. The identification of meaningful variation in data, with the full generality brought by information theory, is made practical for studying the connection between micro- and macroscale structure in complex systems.

Continuity · Predictor/Decision function · MoDELS · Extensibility · Estimation/Estimator

March 18, 2024

Proposal of a general framework to categorize continuous predictor variables

Irantzu Barrio,Javier Roca-Pardiñas,Cristobal Esteban,Maria Durban

The use of discretized variables in the development of prediction models is a common practice, in part because the decision-making process is more natural when it is based on rules created from segmented models. Although this practice is perhaps more common in medicine, it extends to any area of knowledge where a predictive model helps in decision-making. Therefore, providing researchers with a useful and valid categorization method could be a relevant issue when developing prediction models. In this paper, we propose a new general methodology that can be applied to categorize a predictor variable in any regression model where the response variable has a distribution in the exponential family. Furthermore, it can be applied in any multivariate context, allowing more than one continuous covariate to be categorized simultaneously. In addition, a computationally very efficient method is proposed to obtain the optimal number of categories, based on a pseudo-BIC proposal. Several simulation studies have been conducted in which the efficiency of the method with respect to both the location and the number of estimated cut-off points is shown. Finally, the categorization proposal has been applied to a real data set of 543 patients with chronic obstructive pulmonary disease from Galdakao Hospital's five outpatient respiratory clinics, who were followed up for 10 years. We applied the proposed methodology to jointly categorize the continuous variables six-minute walking test and forced expiratory volume in one second in a multiple Poisson generalized additive model for the response variable rate of the number of hospital admissions by years of follow-up. The location and number of cut-off points obtained were clinically validated as being in line with the categorizations used in the literature.

Predictor/Decision function · SCAM · Smoothing · MoDELS · Functional

March 17, 2024

On some extensions of shape-constrained generalized additive modelling in R

Natalya Pya Arnqvist

Regression models that incorporate smooth functions of predictor variables to explain the relationships with a response variable have gained widespread usage and proved successful in various applications. By incorporating smooth functions of predictor variables, these models can capture complex relationships between the response and predictors while still allowing for interpretation of the results. In situations where the relationships between a response variable and predictors are explored, it is not uncommon to assume that these relationships adhere to certain shape constraints. Examples of such constraints include monotonicity and convexity. The scam package for R has become a popular package to carry out the full fitting of exponential family generalized additive modelling with shape restrictions on smooths. The paper aims to extend the existing framework of shape-constrained generalized additive models (SCAM) to accommodate smooth interactions of covariates, linear functionals of shape-constrained smooths and incorporation of residual autocorrelation. The methods described in this paper are implemented in the recent version of the package scam, available on the Comprehensive R Archive Network (CRAN).

INFORMS · Informative prior · MoDELS · Marginalization · Example

March 17, 2024

Translating predictive distributions into informative priors

Andrew A. Manderson,Robert J. B. Goudie

from arxiv, Revised to shorten the main text considerably

When complex Bayesian models exhibit implausible behaviour, one solution is to assemble available information into an informative prior. Challenges arise as prior information is often only available for the observable quantity, or some model-derived marginal quantity, rather than directly pertaining to the natural parameters in our model. We propose a method for translating available prior information, in the form of an elicited distribution for the observable or model-derived marginal quantity, into an informative joint prior. Our approach proceeds given a parametric class of prior distributions with as yet undetermined hyperparameters, and minimises the difference between the supplied elicited distribution and corresponding prior predictive distribution. We employ a global, multi-stage Bayesian optimisation procedure to locate optimal values for the hyperparameters. Three examples illustrate our approach: a cure-fraction survival model, where censoring implies that the observable quantity is a priori a mixed discrete/continuous quantity; a setting in which prior information pertains to $R^{2}$ -- a model-derived quantity; and a nonlinear regression model.

Variance · Minimizer · Random variable · Noise · Differential entropy

March 15, 2024

Minimum entropy of a log-concave variable for fixed variance

James Melbourne,Piotr Nayar,Cyril Roberto

from arxiv, A simpler proof of the "Three-point inequality" is given

We show that for log-concave real random variables with fixed variance the Shannon differential entropy is minimized for an exponential random variable. We apply this result to derive upper bounds on capacities of additive noise channels with log-concave noise. We also improve constants in the reverse entropy power inequalities for log-concave random variables.
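A quick numerical check of the main claim compares closed-form differential entropies (in nats) of a few familiar log-concave distributions, all scaled to the same variance. The choice of comparison distributions is made here for illustration; the formulas are the standard textbook ones.

```python
import math


def entropy_exponential(variance):
    # Exponential with rate lam has variance 1 / lam**2 and entropy 1 - ln(lam).
    return 1.0 + 0.5 * math.log(variance)


def entropy_gaussian(variance):
    # Normal distribution: 0.5 * ln(2 * pi * e * variance).
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def entropy_uniform(variance):
    # Uniform on an interval of width w has variance w**2 / 12 and entropy ln(w).
    return math.log(math.sqrt(12.0 * variance))


def entropy_laplace(variance):
    # Laplace with scale b has variance 2 * b**2 and entropy 1 + ln(2 * b).
    scale = math.sqrt(variance / 2.0)
    return 1.0 + math.log(2.0 * scale)


entropies = {
    "exponential": entropy_exponential(1.0),
    "gaussian": entropy_gaussian(1.0),
    "uniform": entropy_uniform(1.0),
    "laplace": entropy_laplace(1.0),
}
for name, value in sorted(entropies.items(), key=lambda item: item[1]):
    print(f"{name:12s} {value:.4f}")
```

Among these four examples the exponential distribution has the smallest entropy, consistent with the stated result; the Gaussian, as the maximum-entropy distribution for fixed variance, has the largest.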

Estimation/Estimator · Inference · Statistic · Sample · Robustness

March 12, 2024

Inference for non-probability samples using the calibration approach for quantiles

Maciej Beręsewicz,Marcin Szymkowiak

Non-probability survey samples are examples of data sources that have become increasingly popular in recent years, including in official statistics. However, statistical inference based on non-probability samples is much more difficult because they are biased and are not representative of the target population (Wu, 2022). In this paper we consider a method of joint calibration for totals (Deville & Särndal, 1992) and quantiles (Harms & Duchesne, 2006) and use the proposed approach to extend existing inference methods for non-probability samples, such as inverse probability weighting, mass imputation and doubly robust estimators. By including quantile information in the estimation process, non-linear relationships between the target and auxiliary variables can be approximated in the way it is done in step-wise (constant) regression. Our simulation study has demonstrated that the estimators in question are more robust against model mis-specification and, as a result, help to reduce bias and improve estimation efficiency. Variance estimation for our proposed approach is also discussed. We show that existing inference methods can be used and that the resulting confidence intervals are at nominal levels. Finally, we applied the proposed methods to estimate the share of vacancies aimed at Ukrainian workers in Poland using an integrated set of administrative and survey data about job vacancies. The proposed approaches have been implemented in two R packages (nonprobsvy and jointCalib), which were used to conduct the simulation and empirical study.
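The calibration-for-totals building block can be sketched as follows. This covers only linear calibration on a population size and one auxiliary total; the quantile constraints of the joint approach are not implemented. The function name `linear_calibration` and the data are hypothetical, and the code does not use the API of nonprobsvy or jointCalib.

```python
def linear_calibration(design_weights, x, pop_size, pop_total_x):
    """Linear (chi-square distance) calibration on the auxiliary vector (1, x).

    Returns weights w_i = d_i * (1 + lam0 + lam1 * x_i) such that
    sum(w_i) = pop_size and sum(w_i * x_i) = pop_total_x.
    """
    # Weighted sample moments of the auxiliary vector.
    s0 = sum(design_weights)
    s1 = sum(d * xi for d, xi in zip(design_weights, x))
    s2 = sum(d * xi * xi for d, xi in zip(design_weights, x))
    # Gap between the known population totals and the current weighted totals.
    r0 = pop_size - s0
    r1 = pop_total_x - s1
    # Solve the 2x2 system [[s0, s1], [s1, s2]] @ lam = [r0, r1] in closed form.
    det = s0 * s2 - s1 * s1
    lam0 = (s2 * r0 - s1 * r1) / det
    lam1 = (s0 * r1 - s1 * r0) / det
    return [d * (1.0 + lam0 + lam1 * xi) for d, xi in zip(design_weights, x)]


# Hypothetical sample: four units with equal starting weights.
design_weights = [10.0, 10.0, 10.0, 10.0]
x = [1.0, 2.0, 3.0, 4.0]
weights = linear_calibration(design_weights, x, pop_size=50.0, pop_total_x=140.0)

print(weights)
print(sum(weights), sum(w * xi for w, xi in zip(weights, x)))
```

After calibration the weighted count and weighted total of x reproduce the supplied population values. Roughly speaking, a quantile constraint would enter the same linear system as an additional auxiliary column.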

entity · Graph · Knowledge graph · MoDELS · Link prediction

August 10, 2020

A survey of embedding models of entities and relationships for knowledge graph completion

Dat Quoc Nguyen

from arxiv, 13 pages, 2 figures and 6 tables

Knowledge graphs (KGs) of real-world facts about entities and their relationships are useful resources for a variety of natural language processing tasks. However, because knowledge graphs are typically incomplete, it is useful to perform knowledge graph completion or link prediction, i.e. predict whether a relationship not in the knowledge graph is likely to be true. This paper serves as a comprehensive survey of embedding models of entities and relationships for knowledge graph completion, summarizing up-to-date experimental results on standard benchmark datasets and pointing out potential future research directions.
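As a concrete picture of link prediction with embeddings, the sketch below scores candidate triples with a TransE-style translation score, one well-known model in this family. The abstract does not single out any model, so this choice is an assumption; the two-dimensional embeddings are hand-picked for illustration rather than learned.

```python
import math


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative Euclidean distance between
    head + relation and tail (higher means more plausible)."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_tails(head_name, relation_vec, entity_embeddings):
    """Rank every entity as a candidate tail for (head, relation, ?)."""
    head_vec = entity_embeddings[head_name]
    scored = [
        (transe_score(head_vec, relation_vec, vec), name)
        for name, vec in entity_embeddings.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical, hand-picked embeddings (not trained on any data set).
entity_embeddings = {
    "Paris": [1.0, 0.0],
    "France": [1.0, 1.0],
    "Berlin": [3.0, 0.0],
    "Germany": [3.0, 1.0],
}
capital_of = [0.0, 1.0]

ranking = rank_tails("Paris", capital_of, entity_embeddings)
print(ranking)
```

In practice the embeddings would be learned so that observed triples score higher than corrupted ones, and a missing link is predicted by taking the top-ranked candidates.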