
Asymptotic methods for hypothesis testing in high-dimensional data usually require the dimension of the observations to increase to infinity, often with an additional condition on its rate of increase relative to the sample size. Multivariate asymptotic methods, on the other hand, are valid for fixed dimension only, and their practical implementations in hypothesis testing typically require the sample size to be large compared to the dimension to yield desirable results. In practice, however, it is usually not possible to determine whether the dimension of the data at hand conforms to the conditions required for the validity of high-dimensional asymptotic methods, or whether the sample size is large enough relative to the dimension. In this work, a theory of asymptotic convergence is proposed that holds uniformly over the dimension of the random vectors. This theory seeks to unify the asymptotic results for fixed-dimensional multivariate data and for high-dimensional data, and accounts for the effect of the dimension of the data on the performance of hypothesis testing procedures. The methodology developed from this asymptotic theory can be applied to data of any dimension. An application of the theory is demonstrated in the two-sample test for equality of locations. The proposed test statistic is not scaled by the sample covariance, in line with the usual tests for high-dimensional data. Using simulated examples, it is demonstrated that the proposed test performs better than several popular tests in the literature for high-dimensional data. It is further demonstrated in simulated models that the proposed unscaled test outperforms the usual scaled two-sample tests for multivariate data, including Hotelling's $T^2$ test for multivariate Gaussian data.
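As a rough illustration of what "unscaled" means here, the sketch below tests equality of locations with the statistic $\|\bar{X}-\bar{Y}\|^2$, which avoids inverting a sample covariance matrix and so remains computable when the dimension exceeds the sample size. The permutation calibration and the function name are stand-ins of ours, not the paper's uniform-over-dimension asymptotic theory.

```python
import numpy as np

def unscaled_two_sample_test(X, Y, n_perm=2000, seed=0):
    """X: (m, p) sample 1, Y: (n, p) sample 2. Returns (statistic, p-value)."""
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    # Unscaled location statistic: squared norm of the mean difference.
    stat = np.sum((X.mean(axis=0) - Y.mean(axis=0)) ** 2)
    Z = np.vstack([X, Y])
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(Z.shape[0])          # relabel the pooled sample
        Xp, Yp = Z[idx[:m]], Z[idx[m:]]
        if np.sum((Xp.mean(axis=0) - Yp.mean(axis=0)) ** 2) >= stat:
            count += 1
    return stat, (count + 1) / (n_perm + 1)

# Example with p = 500 far above the sample sizes, where Hotelling's T^2
# is undefined because the sample covariance is singular.
X = np.random.default_rng(1).normal(0.0, 1.0, size=(30, 500))
Y = np.random.default_rng(2).normal(0.3, 1.0, size=(30, 500))
print(unscaled_two_sample_test(X, Y))
```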

Related Content

We report on the development of an implementable physics-data hybrid dynamic model for an articulated manipulator, intended for planning and operation in various scenarios. Both physics-based and data-driven dynamic models are studied in this research in order to select the best model for planning. The physics-based model is constructed using the Lagrangian method, with loss terms covering inertia loss, viscous loss, and friction loss. For the data-driven model, three methods are explored: DNN, LSTM, and XGBoost. Our modeling results demonstrate that, after comprehensive hyperparameter optimization, the XGBoost architecture outperforms DNN and LSTM in accurately representing manipulator dynamics. The hybrid model combining physics-based and data-driven terms has the best performance among all models by the RMSE criterion, and it requires only about 24k training samples. In addition, we develop a virtual force sensor for the manipulator using the external torque observed from the dynamic model, and design a motion planner based on the physics-data hybrid dynamic model. The external torque contributes to forces and torque on the end effector, facilitating interaction with the surroundings, while the internal torque governs manipulator motion dynamics and compensates for internal losses. By estimating the external torque as the difference between the measured joint torque and internal losses, we implement a sensorless control strategy, which we demonstrate through a peg-in-hole task. Lastly, a learning-based motion planner built on the hybrid dynamic model assists in planning time-efficient trajectories for the manipulator. This comprehensive approach underscores the efficacy of integrating physics-based and data-driven models for advanced manipulator control and planning in industrial environments.
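A minimal sketch of the hybrid idea, under our own simplifying assumptions: an XGBoost regressor learns the residual between measured joint torque and a physics-based prediction, and the external torque is then estimated as measurement minus the full internal model. The `physics_torque` function is a hypothetical one-joint stand-in for the paper's Lagrangian model with inertia, viscous, and friction terms.

```python
import numpy as np
from xgboost import XGBRegressor

def physics_torque(q, qd, qdd):
    # Placeholder 1-DoF model: inertia + viscous + Coulomb friction terms.
    return 2.0 * qdd + 0.5 * qd + 0.1 * np.sign(qd)

rng = np.random.default_rng(0)
q, qd, qdd = (rng.normal(size=(5000,)) for _ in range(3))
# Simulated measurements with an unmodeled effect the data-driven term can absorb.
tau_meas = physics_torque(q, qd, qdd) + 0.2 * np.sin(q) + rng.normal(0, 0.01, 5000)

X = np.column_stack([q, qd, qdd])
residual = tau_meas - physics_torque(q, qd, qdd)
model = XGBRegressor(n_estimators=200, max_depth=4).fit(X, residual)

# Virtual force sensing: external torque = measured - (physics + learned residual).
tau_internal = physics_torque(q, qd, qdd) + model.predict(X)
tau_external_estimate = tau_meas - tau_internal
```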

This paper presents a regularized recursive identification algorithm with simultaneous on-line estimation of both the model parameters and the algorithm's hyperparameters. A new kernel is proposed to facilitate the algorithm's development. The performance of this novel scheme is compared in simulation with that of the recursive least squares algorithm.
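For reference, here is the baseline comparator in compact form: recursive least squares with a ridge-style initialization, where a large initial covariance corresponds to a weak regularizer. The paper's new kernel and on-line hyperparameter updates are not reproduced; this is only the standard recursion.

```python
import numpy as np

def rls(phis, ys, n_params, delta=1e2, lam=0.99):
    """phis: iterable of regressor vectors, ys: scalar observations,
    delta: inverse strength of the initial regularization,
    lam: forgetting factor in (0, 1]."""
    theta = np.zeros(n_params)
    P = delta * np.eye(n_params)              # large P <=> weak prior / regularizer
    for phi, y in zip(phis, ys):
        k = P @ phi / (lam + phi @ P @ phi)   # gain vector
        theta = theta + k * (y - phi @ theta) # innovation update
        P = (P - np.outer(k, phi @ P)) / lam  # covariance update with forgetting
    return theta
```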

Nonparametric tests for functional data are a challenging class of tests to work with because of the potentially high-dimensional nature of functional data. One of the main challenges in considering rank-based tests, such as the Mann-Whitney or Wilcoxon Rank Sum tests (MWW), is that the unit of observation is a curve; any rank-based test must therefore consider ways of ranking curves. While several procedures, including depth-based methods, have recently been used to create scores for rank-based tests, these scores are not constructed under the null and often introduce additional, uncontrolled-for variability. We therefore reconsider the problem of rank-based tests for functional data and develop an alternative approach that incorporates the null hypothesis throughout. Our approach first ranks realizations from the curves at each time point, then summarizes the ranks for each subject using a sufficient statistic we derive, and finally re-ranks these sufficient statistics in a procedure we refer to as a doubly ranked test. As we demonstrate, doubly ranked tests are more powerful while maintaining ideal type I error in the two-sample, MWW setting. We also extend our framework to more than two samples, developing a Kruskal-Wallis test for functional data that likewise exhibits good test characteristics. Finally, we illustrate the use of doubly ranked tests in functional data contexts from material science, climatology, and public health policy.
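A sketch of the rank-then-summarize-then-rerank pipeline in the two-sample case. We assume, purely for illustration, that the per-subject summary is the mean of its time-point ranks; the paper derives the actual sufficient statistic, which this simple average stands in for.

```python
import numpy as np
from scipy.stats import rankdata, mannwhitneyu

def doubly_ranked_test(A, B):
    """A: (n1, T) curves from group 1, B: (n2, T) curves from group 2."""
    pooled = np.vstack([A, B])           # (n1 + n2, T)
    ranks = rankdata(pooled, axis=0)     # rank subjects at each time point
    summary = ranks.mean(axis=1)         # one summary statistic per subject
    n1 = A.shape[0]
    # Re-ranking the summaries is exactly what MWW does internally.
    return mannwhitneyu(summary[:n1], summary[n1:], alternative="two-sided")

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)
A = np.sin(2 * np.pi * t) + rng.normal(0, 1, (20, 100))
B = np.sin(2 * np.pi * t) + 0.4 + rng.normal(0, 1, (20, 100))
print(doubly_ranked_test(A, B))
```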

Time irreversibility (TIR) refers to the manifestation of nonequilibrium brain activity influenced by various physiological conditions; however, the influence of sleep on electroencephalogram (EEG) TIR has not been sufficiently investigated. In this paper, a comprehensive study of the permutation TIR (pTIR) of EEG data under different sleep stages is conducted. Two basic ordinal patterns (i.e., the original and amplitude permutations) are distinguished to simplify sleep EEGs, and the influences of equal values and forbidden permutations on pTIR are then elucidated. To detect the pTIR of brain electric signals, five groups of EEGs in the awake, stage I, stage II, stage III, and rapid eye movement (REM) stages are collected from the public Polysomnographic Database in PhysioNet. Test results suggest that the pTIR of sleep EEGs significantly decreases as the sleep stage deepens (p<0.001), with the awake and REM EEGs demonstrating greater differences than the others. Comparative analysis and numerical simulations support the importance of equal values. The distribution of equal states, a simple quantification of amplitude fluctuations, significantly increases with the sleep stage (p<0.001). If these equalities are ignored, incorrect probabilistic differences may arise in the forward-backward and symmetric permutations of TIR, leading to contradictory results; moreover, the ascending and descending orders for symmetric permutations also lead to different outcomes in sleep EEGs. Overall, pTIR in sleep EEGs contributes to our understanding of quantitative TIR and to the classification of sleep EEGs.
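A minimal sketch of the pTIR idea: compare the distribution of ordinal patterns in the forward series with that of the time-reversed series, here using a Kullback-Leibler divergence as the asymmetry measure. The treatment of equal values, which the paper shows to be decisive, is reduced below to whatever tie-breaking `argsort` applies, so this is an illustration of the forward-backward comparison only.

```python
import numpy as np
from collections import Counter

def pattern_distribution(x, m=3):
    """Relative frequencies of order-m ordinal patterns in series x."""
    pats = Counter(tuple(np.argsort(x[i:i + m])) for i in range(len(x) - m + 1))
    total = sum(pats.values())
    return {p: c / total for p, c in pats.items()}

def ptir_kl(x, m=3, eps=1e-12):
    """KL divergence between forward and time-reversed pattern distributions."""
    fwd = pattern_distribution(x, m)
    bwd = pattern_distribution(x[::-1], m)
    return sum(p * np.log(p / bwd.get(k, eps)) for k, p in fwd.items())

# A Gaussian random walk is statistically time reversible, so pTIR is near 0.
x = np.cumsum(np.random.default_rng(0).normal(size=5000))
print(ptir_kl(x))
```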

In some causal inference scenarios, the treatment variable is measured inaccurately, for instance in epidemiology or econometrics. Failure to correct for the effect of this measurement error can lead to biased causal effect estimates. Previous research has not studied methods that address this issue from a causal viewpoint while allowing for complex nonlinear dependencies and without assuming access to side information. For such a scenario, this study proposes a model in which a continuous treatment variable is measured inaccurately. Building on existing results for measurement error models, we prove that our model's causal effect estimates are identifiable, even without knowledge of the measurement error variance or other side information. Our method relies on a deep latent variable model in which Gaussian conditionals are parameterized by neural networks, and we develop an amortized importance-weighted variational objective for training the model. Empirical results demonstrate the method's good performance with unknown measurement error. More broadly, our work extends the range of applications in which reliable causal inference can be conducted.
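For readers unfamiliar with importance-weighted variational objectives, here is the generic K-sample bound in schematic form. `log_p_joint` and `q_dist` are hypothetical stand-ins for the paper's neural-network-parameterized Gaussian conditionals and amortized encoder; only the objective's shape is shown.

```python
import math
import torch

def iw_elbo(x, log_p_joint, q_dist, K=16):
    """Importance-weighted bound on log p(x), tightening as K grows.
    q_dist: an amortized posterior with rsample/log_prob (e.g. a torch
    distribution conditioned on x); log_p_joint(x, z): log p(x, z)."""
    z = q_dist.rsample((K,))                        # (K, batch, latent_dim)
    log_w = log_p_joint(x, z) - q_dist.log_prob(z)  # log importance weights
    # log-mean-exp over the K samples, averaged over the batch.
    return (torch.logsumexp(log_w, dim=0) - math.log(K)).mean()
```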

We consider the computation of statistical moments of solutions to operator equations with stochastic data. We remark that applying PINNs -- referred to here as TPINNs -- allows one to solve the induced tensor operator equations with minimal changes to existing PINN code, and enables the handling of non-linear and time-dependent operators. We propose two types of architectures, referred to as vanilla and multi-output TPINNs, and investigate their benefits and limitations. Exhaustive numerical experiments demonstrate the applicability and performance of the approach and raise a variety of promising new research avenues.
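To make the "tensor operator equation" concrete, consider the toy problem $-u''(x) = f(x)$ with random $f$: the two-point correlation $C_u(x,y)$ then satisfies the tensorized equation $\partial_x^2 \partial_y^2 C_u(x,y) = C_f(x,y)$. The sketch below, our own illustration rather than the paper's code, shows a vanilla-TPINN-style residual for this equation: an ordinary PINN over the doubled input $(x, y)$, with `corr_f` a hypothetical known data correlation.

```python
import torch

# Network over (x, y) representing the two-point correlation C_u(x, y).
net = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))

def tensor_residual(x, y, corr_f):
    """Residual of (d^2/dx^2)(d^2/dy^2) C_u = C_f at collocation points."""
    u = net(torch.cat([x, y], dim=1))
    ux = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    uxx = torch.autograd.grad(ux.sum(), x, create_graph=True)[0]
    uxxy = torch.autograd.grad(uxx.sum(), y, create_graph=True)[0]
    uxxyy = torch.autograd.grad(uxxy.sum(), y, create_graph=True)[0]
    return uxxyy - corr_f(x, y)

x = torch.rand(64, 1, requires_grad=True)
y = torch.rand(64, 1, requires_grad=True)
loss = tensor_residual(x, y, lambda a, b: torch.exp(-(a - b) ** 2)).pow(2).mean()
```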

Challenges to reproducibility and replicability have gained widespread attention, driven by large replication projects with lukewarm success rates. A nascent body of work has emerged developing algorithms to estimate the replicability of published findings. The current study explores ways in which AI-enabled signals of confidence in research might be integrated into the literature search. We interview 17 PhD researchers about their current processes for literature search and ask them to provide feedback on a replicability estimation tool. Our findings suggest that participants tend to confuse replicability with generalizability and related concepts. Information about replicability can support researchers throughout the research design process. However, the use of AI estimation is debatable due to the lack of explainability and transparency. The ethical implications of AI-enabled confidence assessment must be studied further before such tools can be widely accepted. We discuss implications for the design of technological tools that support scholarly activities and advance replicability.

A change point detection (CPD) framework assisted by a predictive machine learning model, called "Predict and Compare", is introduced and characterised in relation to other state-of-the-art online CPD routines, which it outperforms in terms of false positive rate and out-of-control average run length. The method focuses on improving standard methods from sequential analysis, such as the CUSUM rule, in terms of these quality measures. This is achieved by replacing typically used trend estimation functionals, such as the running mean, with more sophisticated predictive models (Predict step), and comparing their prognosis with the actual data (Compare step). The two models used in the Predict step are the ARIMA model and the LSTM recurrent neural network. The framework is formulated in general terms, however, so as to allow the use of prediction or comparison methods other than those tested here. The power of the method is demonstrated in a tribological case study, in which change points separating the run-in, steady-state, and divergent wear phases are detected with very few false positives.
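A sketch of the Predict-and-Compare loop with an ARIMA predictor: one-step forecasts replace the running mean, and a one-sided CUSUM accumulates the standardized forecast errors. The model order, reference value `k`, and threshold `h` are illustrative choices, not the paper's tuned values.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def predict_and_compare(series, train_len=200, k=0.5, h=5.0):
    """Returns the index of the first alarm, or None if no change is flagged."""
    fit = ARIMA(series[:train_len], order=(1, 0, 1)).fit()
    sigma = np.std(series[:train_len] - fit.fittedvalues)
    s = 0.0
    for t in range(train_len, len(series)):
        pred = fit.forecast(1)[0]             # Predict step: one-step prognosis
        err = (series[t] - pred) / sigma      # Compare step: standardized error
        s = max(0.0, s + err - k)             # one-sided CUSUM recursion
        if s > h:
            return t                          # change point alarm
        fit = fit.append(series[t:t + 1])     # absorb the new observation
    return None
```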

Matching on a low-dimensional vector of scalar covariates consists of constructing groups of individuals in which each individual in a group is within a pre-specified distance of an individual in another group. Matching in high-dimensional spaces is more challenging, however, because the distance can be sensitive to implementation details, caliper width, and measurement error in the observations. To partially address these problems, we propose extensive sensitivity analyses that identify the main sources of variation and bias. We illustrate these concepts by examining the racial disparity in all-cause mortality in the US using the National Health and Nutrition Examination Survey (NHANES 2003-2006). In particular, we match African Americans to Caucasian Americans on age, gender, BMI, and objectively measured physical activity (PA). PA is measured every minute using accelerometers for up to seven days and then transformed into an empirical distribution of all of the minute-level observations. The Wasserstein metric is used as the measure of distance between these participant-specific distributions.
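A sketch of the distributional matching step: each participant's minute-level PA readings form an empirical distribution, and pairs are matched greedily by one-dimensional Wasserstein distance within a caliper. The greedy scheme, caliper value, and variable names are our illustrative choices.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def match_on_distributions(group_a, group_b, caliper=50.0):
    """group_a, group_b: lists of 1-D arrays of minute-level PA readings.
    Returns (i, j, distance) triples pairing group_a[i] with group_b[j]."""
    pairs, used = [], set()
    for i, a in enumerate(group_a):
        dists = [(wasserstein_distance(a, b), j)
                 for j, b in enumerate(group_b) if j not in used]
        if not dists:
            break
        d, j = min(dists)
        if d <= caliper:            # caliper width: a key sensitivity parameter
            pairs.append((i, j, d))
            used.add(j)
    return pairs
```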

Out-of-distribution (OOD) detection, crucial for reliable pattern classification, discerns whether a sample originates outside the training distribution. This paper concentrates on the high-dimensional features output by the final convolutional layer, which contain rich image features. Our key idea is to project these high-dimensional features into two specific feature subspaces, leveraging the dimensionality reduction capacity of the network's linear layers, trained with Predefined Evenly-Distributed Class Centroids (PEDCC)-Loss. This involves calculating the cosines of three projection angles and the norm values of the features, thereby identifying distinctive information for in-distribution (ID) and OOD data, which assists in OOD detection. Building upon this, we modify the batch normalization (BN) and ReLU layers preceding the fully connected layer, diminishing their impact on the output feature distributions and thereby widening the distribution gap between ID and OOD data features. Our method requires only the training of the classification network model, eschewing any need for input pre-processing or specific OOD data pre-tuning. Extensive experiments on several benchmark datasets demonstrate that our approach delivers state-of-the-art performance. Our code is available at //github.com/Hewell0/ProjOOD.
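A much-simplified sketch of angle-and-norm scoring against predefined class centroids: the cosine between a feature and its nearest centroid, combined with the feature norm, yields an OOD score. The paper's three projection angles, PEDCC-Loss training, and BN/ReLU modifications are not reproduced here; see the repository above for the actual method.

```python
import numpy as np

def ood_score(features, centroids):
    """features: (n, d) penultimate-layer outputs;
    centroids: (C, d) predefined, unit-norm class centroids.
    Lower scores suggest OOD samples."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    cos = (features / norms) @ centroids.T   # cosine to each class centroid
    max_cos = cos.max(axis=1)                # alignment with the nearest class
    return max_cos * norms.ravel()           # angle and norm jointly score ID-ness
```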
