
Categorization is one of the basic tasks in machine learning and data analysis. Building on formal concept analysis (FCA), the starting point of the present work is that different ways to categorize a given set of objects exist, depending on the choice of the sets of features used to classify them, and different such sets of features may yield better or worse categorizations, relative to the task at hand. In turn, the (a priori) choice of a particular set of features over another might be subjective and express a certain epistemic stance (e.g. interests, relevance, preferences) of an agent or a group of agents, namely, their interrogative agenda. In the present paper, we represent interrogative agendas as sets of features, and explore and compare different ways to categorize objects w.r.t. different sets of features (agendas). We first develop a simple unsupervised FCA-based algorithm for outlier detection which uses categorizations arising from different agendas. We then present a supervised meta-learning algorithm to learn suitable (fuzzy) agendas for categorization as sets of features with different weights or masses. We combine this meta-learning algorithm with the unsupervised outlier detection algorithm to obtain a supervised outlier detection algorithm. We show that these algorithms perform on par with commonly used outlier detection algorithms on standard benchmark datasets for outlier detection. These algorithms provide both local and global explanations of their results.
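As a rough illustration of the agenda-relative, FCA-style outlier scoring described above, the sketch below scores an object by the size of the extent of the formal concept generated by its agenda-restricted feature set: objects that share their agenda-relevant features with few other objects score high. The function name `agenda_outlier_scores`, the binary feature encoding, and the inverse-extent-size heuristic are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def agenda_outlier_scores(X, agenda):
    """Score objects as outliers relative to an interrogative agenda.

    X      : (n_objects, n_features) binary array (1 = object has feature).
    agenda : list of feature indices the agenda deems relevant.
    """
    A = X[:, agenda]                      # restrict the context to the agenda
    n = A.shape[0]
    scores = np.empty(n)
    for i in range(n):
        intent = A[i] == 1                # features of object i within the agenda
        # extent of the generated concept: objects having all those features
        extent = np.all(A[:, intent] == 1, axis=1)
        scores[i] = 1.0 / extent.sum()    # small extent -> more outlying
    return scores

# toy usage: 5 objects, 4 features; the agenda focuses on the first 3 features
X = np.array([[1, 1, 0, 1],
              [1, 1, 0, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 1],
              [1, 1, 0, 1]])
print(agenda_outlier_scores(X, agenda=[0, 1, 2]))   # object 3 stands out
```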

Related content

The logistic regression model is one of the most popular data generation models in noisy binary classification problems. In this work, we study the sample complexity of estimating the parameters of the logistic regression model up to a given $\ell_2$ error, in terms of the dimension and the inverse temperature, with standard normal covariates. The inverse temperature controls the signal-to-noise ratio of the data generation process. While both generalization bounds and the asymptotic performance of the maximum-likelihood estimator for logistic regression are well-studied, the non-asymptotic sample complexity that shows the dependence on the error and the inverse temperature for parameter estimation is absent from previous analyses. We show that the sample complexity curve has two change-points in terms of the inverse temperature, clearly separating the low, moderate, and high temperature regimes.
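For concreteness, the data-generation model referred to above can be written as follows; the notation ($\theta^{\ast}$ for the unknown parameter, $\beta$ for the inverse temperature) is ours.

```latex
% Logistic regression with standard-normal covariates and inverse temperature \beta:
x_i \sim \mathcal{N}(0, I_d), \qquad
\Pr\left(y_i = 1 \mid x_i\right) = \sigma\!\left(\beta \,\langle \theta^{\ast}, x_i \rangle\right), \qquad
\sigma(t) = \frac{1}{1 + e^{-t}} .
```

The task is to recover $\theta^{\ast}$ up to $\ell_2$ error $\epsilon$ from $n$ i.i.d. pairs $(x_i, y_i)$; a larger $\beta$ (lower temperature) means a higher signal-to-noise ratio.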

Objective: Prediction models are popular in medical research and practice. By predicting an outcome of interest for specific patients, these models may help inform difficult treatment decisions, and they are often hailed as the poster children for personalized, data-driven healthcare. Many prediction models are deployed for decision support based on their prediction accuracy in validation studies. We investigate whether this is a safe and valid approach. Materials and Methods: We show that using prediction models for decision making can lead to harmful decisions, even when the predictions exhibit good discrimination after deployment. These models are harmful self-fulfilling prophecies: their deployment harms a group of patients, but the worse outcomes of these patients do not invalidate the predictive power of the model. Results: Our main result is a formal characterization of a set of such prediction models. Next, we show that models that are well calibrated before and after deployment are useless for decision making, as their deployment induces no change in the data distribution. Discussion: Our results point to the need to revise standard practices for the validation, deployment and evaluation of prediction models used in medical decisions. Conclusion: Outcome prediction models can yield harmful self-fulfilling prophecies when used for decision making; a new perspective on prediction model development, deployment and monitoring is needed.
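The mechanism is easy to reproduce in a toy simulation, sketched below. All numbers (the outcome rates, the 0.3 treatment-withholding threshold) are invented for illustration and are not from the paper: a model fitted under a treat-everyone regime guides withholding of treatment from predicted high-risk patients, their outcomes worsen, yet the model's post-deployment discrimination stays good.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
risk = rng.binomial(1, 0.5, n)                      # binary risk factor

# Pre-deployment: everyone is treated; bad-outcome rates under treatment
p_bad_treated = np.where(risk == 1, 0.5, 0.1)
y_pre = rng.binomial(1, p_bad_treated)

# The model learns the pre-deployment risks (0.5 for high risk, 0.1 for low risk)
pred = np.where(risk == 1, 0.5, 0.1)

# Post-deployment policy: treatment withheld when predicted risk > 0.3,
# and untreated high-risk patients do much worse (rate 0.8 instead of 0.5)
withheld = pred > 0.3
y_post = rng.binomial(1, np.where(withheld, 0.8, p_bad_treated))

def auc(y, s):
    """Mann-Whitney estimate of the area under the ROC curve."""
    pos, neg = s[y == 1][:, None], s[y == 0][None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

print("bad-outcome rate, high-risk group, before:", y_pre[risk == 1].mean())
print("bad-outcome rate, high-risk group, after :", y_post[risk == 1].mean())
print("post-deployment AUC of the model         :", auc(y_post, pred))
```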

The Discrete Event System Specification formalism (DEVS), which supports hierarchical and modular model composition, has been widely used to understand, analyze and develop a variety of systems. DEVS has been implemented in various languages and platforms over the years. The DEVStone benchmark was conceived to generate a set of models with varied structure and behavior, and to automate the evaluation of the performance of DEVS-based simulators. However, DEVStone is still in a preliminary phase and more model analysis is required. In this paper, we revisit DEVStone, introducing new equations to compute the number of events triggered. We also introduce a new benchmark, called HOmem, designed as an alternative version of HOmod, with similar CPU and memory requirements but an easier implementation that is also analytically more manageable. Finally, we compare both the performance and memory footprint of five different DEVS simulators on two different hardware platforms.
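For readers unfamiliar with the formalism being benchmarked, here is a minimal sketch of a DEVS atomic model: a state, a time-advance function, internal and external transition functions, and an output function. This is a generic illustration of the interface DEVStone-style models exercise, not part of DEVStone itself; class and method names are ours, and a real run requires a DEVS simulator/coordinator to drive the transitions.

```python
import math

class AtomicDEVS:
    """Minimal DEVS atomic-model interface."""
    def ta(self):              return math.inf   # time until next internal event
    def delta_int(self):       pass              # internal transition
    def delta_ext(self, e, x): pass              # external transition (elapsed e, input x)
    def output(self):          return None       # output emitted just before delta_int

class Delay(AtomicDEVS):
    """Forwards every received value after a fixed processing delay."""
    def __init__(self, delay):
        self.delay, self.queue, self.sigma = delay, [], math.inf
    def ta(self):
        return self.sigma
    def delta_ext(self, e, x):
        self.queue.append(x)
        self.sigma = self.delay if len(self.queue) == 1 else self.sigma - e
    def delta_int(self):
        self.queue.pop(0)
        self.sigma = self.delay if self.queue else math.inf
    def output(self):
        return self.queue[0]

# manual drive (what a DEVS coordinator would do):
m = Delay(2.0)
m.delta_ext(0.0, "job-1")          # input received at t = 0
print(m.ta())                      # -> 2.0: internal event scheduled at t = 2
print(m.output()); m.delta_int()   # at t = 2: emit "job-1", then transition
```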

An algorithm for three-dimensional dynamic vehicle-track-structure interaction (VTSI) analysis is described in this paper. The algorithm is described in terms of bridges and high-speed trains, but more generally applies to multibody systems coupled to deformable structures by time-varying kinematic constraints. Coupling is accomplished by a kinematic constraint/Lagrange multiplier approach, resulting in a system of index-3 Differential Algebraic Equations (DAE). Three main new concepts are developed. (i) A corotational approach is used to represent the vehicle (train) dynamics. Reference coordinate frames are fitted to the undeformed geometry of the bridge. While the displacements of the train can be large, deformations are taken to be small within these frames, resulting in linear (time-varying) rather than nonlinear dynamics. (ii) If conventional finite elements are used to discretize the track, the curvature is discontinuous across elements (and possibly rotation, too, for curved tracks). This results in spurious numerical oscillations in computed contact forces and accelerations, quantities of key interest in VTSI. A NURBS-based discretization is employed for the track to mitigate such oscillations. (iii) The higher order continuity due to using NURBS allows for alternative techniques for solving the VTSI system. First, enforcing constraints at the acceleration level reduces an index-3 DAE to an index-1 system that can be solved without numerical dissipation. Second, a constraint projection method is proposed to solve an index-3 DAE system without numerical dissipation by correcting wheel velocities and accelerations. Moreover, the modularity of the presented algorithm, resulting from a kinematic constraint/Lagrange multiplier formulation, enables ready integration of this VTSI approach in existing structural analysis and finite element software.
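Schematically, the constrained semi-discrete equations behind the index-3/index-1 discussion take the familiar multibody form below. The operators are shown generically (the paper's assembled matrices and constraint terms are more detailed), and $r$ simply collects the lower-order terms produced by differentiating the constraint.

```latex
% Coupled train--track--bridge equations with kinematic (wheel--rail) constraints,
% written generically with Lagrange multipliers \lambda:
M\ddot{u} + C\dot{u} + Ku = f(t) + G(t)^{\mathsf{T}}\lambda, \qquad
g(u, t) = 0, \qquad G = \frac{\partial g}{\partial u} \qquad \text{(index-3 DAE)}.
% Enforcing the constraint at the acceleration level,
\ddot{g} = G\ddot{u} + r(\dot{u}, u, t) = 0,
% yields an index-1 system in (\ddot{u}, \lambda) that can be integrated
% without numerical dissipation.
```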

Block Principal Component Analysis (Block PCA) of a data matrix A, where the loadings Z are determined by maximizing $\|AZ\|^2$ over unit-norm orthogonal loadings, is difficult to use for the design of sparse PCA by $\ell_1$ regularization, due to the difficulty of handling both the orthogonality constraint on the loadings and the non-differentiable $\ell_1$ penalty. Our objective in this paper is to relax the orthogonality constraint on the loadings by introducing new objective functions expvar(Y) which measure the part of the variance of the data matrix A explained by correlated components Y = AZ. We first propose a comprehensive study of the mathematical and numerical properties of expvar(Y) for two existing definitions (Zou et al. [2006], Shen and Huang [2008]) and four new definitions. We then show that only two of these explained variances are fit for use as objective functions in block PCA formulations free of orthogonality constraints.
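For reference, the block PCA problem and one natural way to write a relaxed, $\ell_1$-penalized formulation are sketched below; the penalty parameter $\mu$ and the exact penalized form are illustrative, while the paper's focus is on the expvar objectives themselves.

```latex
% Block PCA for A \in \mathbb{R}^{n \times p} with k loadings Z = [z_1, \dots, z_k]:
\max_{Z \in \mathbb{R}^{p \times k}} \; \lVert AZ \rVert_F^{2}
\quad \text{s.t.} \quad Z^{\mathsf{T}}Z = I_k .
% One natural sparse relaxation (illustrative): drop orthogonality, score the
% correlated components Y = AZ by an explained-variance function and penalize:
\max_{Z} \; \operatorname{expvar}(AZ) \;-\; \mu \sum_{j=1}^{k} \lVert z_j \rVert_1 .
```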

We consider nonparametric Bayesian inference in a multidimensional diffusion model with reflecting boundary conditions based on discrete high-frequency observations. We prove a general posterior contraction rate theorem in $L^2$-loss, which is applied to Gaussian priors. The resulting posteriors, as well as their posterior means, are shown to converge to the ground truth at the minimax optimal rate over H\"older smoothness classes in any dimension. Of independent interest and as part of our proofs, we show that certain frequentist penalized least squares estimators are also minimax optimal.
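In symbols, and only schematically (the exact assumptions on the drift, the diffusivity and the domain are the paper's), the observation model is a reflected diffusion observed on a high-frequency grid.

```latex
% Reflected diffusion on a bounded domain \mathcal{O} \subset \mathbb{R}^d, observed
% at times 0, \Delta, 2\Delta, \dots, n\Delta with \Delta \to 0:
dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t + \nu(X_t)\,dL_t, \qquad X_t \in \bar{\mathcal{O}},
% where L is the local time of X at the boundary and \nu the direction of reflection;
% the unknown nonparametric component is given a (Gaussian) prior, and the posterior
% contracts at the minimax rate over H\"older classes.
```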

Graph representation learning (GRL) is critical for extracting insights from complex network structures, but it also raises security concerns due to potential privacy vulnerabilities in these representations. This paper investigates the structural vulnerabilities in graph neural models where sensitive topological information can be inferred through edge reconstruction attacks. Our research primarily addresses the theoretical underpinnings of cosine-similarity-based edge reconstruction attacks (COSERA), providing theoretical and empirical evidence that such attacks can perfectly reconstruct sparse Erdos Renyi graphs with independent random features as the graph size increases. Conversely, we establish that sparsity is a critical factor for COSERA's effectiveness, as demonstrated through analysis and experiments on stochastic block models. Finally, we explore the resilience of (provably) private graph representations produced via the noisy aggregation (NAG) mechanism against COSERA. We empirically delineate instances in which COSERA is effective and instances in which it fails, using it as an instrument for elucidating the trade-off between privacy and utility.
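Below is a minimal sketch of a cosine-similarity edge reconstruction attack of the kind analyzed above, run against a toy sparse Erdos Renyi graph with independent random features and a one-round mean-aggregation representation. The top-m decision rule, the function name `cosera_reconstruct`, and the toy embedding are our illustrative choices; the paper's contribution is the analysis of when such a rule succeeds or fails.

```python
import numpy as np

def cosera_reconstruct(Z, top_m):
    """Predict an edge for the top_m most cosine-similar node pairs.

    Z     : (n, d) released node representations.
    top_m : number of edges the attacker claims (e.g., a guessed edge count).
    """
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)   # row-normalize
    S = Zn @ Zn.T                                       # pairwise cosine similarities
    iu = np.triu_indices(len(Z), k=1)                   # unordered pairs i < j
    order = np.argsort(S[iu])[::-1][:top_m]             # most similar pairs first
    return set(zip(iu[0][order].tolist(), iu[1][order].tolist()))

# toy experiment: sparse Erdos-Renyi graph, independent features,
# one round of mean aggregation (with self-loops) as the released representation
rng = np.random.default_rng(4)
n, d, p = 300, 256, 0.02
A = np.triu(rng.random((n, n)) < p, k=1)
A = A | A.T                                             # symmetric adjacency
X = rng.normal(size=(n, d))
A_hat = A.astype(float) + np.eye(n)                     # add self-loops
Z = (A_hat @ X) / A_hat.sum(1, keepdims=True)           # mean-aggregated features

ti, tj = np.nonzero(np.triu(A, k=1))
true_edges = set(zip(ti.tolist(), tj.tolist()))
pred_edges = cosera_reconstruct(Z, top_m=len(true_edges))
print("fraction of edges recovered:", len(true_edges & pred_edges) / len(true_edges))
```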

We consider the estimation of the cumulative hazard function, and equivalently the distribution function, with censored data under a setup that preserves the privacy of the survival database. This is done through an $\alpha$-locally differentially private mechanism for the failure indicators and by proposing a non-parametric kernel estimator for the cumulative hazard function that remains consistent under the privatization. Under mild conditions, we also prove lower bounds for the minimax rates of convergence and show that the estimator is minimax optimal under a well-chosen bandwidth.
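One natural construction along these lines is sketched below under assumptions of our own: randomized response on the failure indicators as the $\alpha$-LDP mechanism, and an unsmoothed Nelson-Aalen-type plug-in (rather than the paper's kernel estimator) built from debiased indicators. It is meant only to show how privatized indicators can be corrected before estimating the cumulative hazard.

```python
import numpy as np

def privatize_indicators(delta, alpha, rng):
    """alpha-LDP randomized response on failure indicators delta in {0, 1}."""
    p_keep = np.exp(alpha) / (1.0 + np.exp(alpha))       # report true value w.p. p_keep
    flip = rng.random(len(delta)) > p_keep
    return np.where(flip, 1 - delta, delta)

def cumulative_hazard_ldp(times, delta_priv, alpha, grid):
    """Nelson-Aalen-style estimate from privatized indicators, debiased so that
    the corrected indicator has the true indicator as its expectation."""
    p_keep = np.exp(alpha) / (1.0 + np.exp(alpha))
    delta_hat = (delta_priv - (1 - p_keep)) / (2 * p_keep - 1)   # unbiased correction
    order = np.argsort(times)
    t_sorted, d_sorted = times[order], delta_hat[order]
    at_risk = len(times) - np.arange(len(times))                 # Y at each ordered time
    cum = np.cumsum(d_sorted / at_risk)
    return np.array([cum[t_sorted <= t][-1] if np.any(t_sorted <= t) else 0.0
                     for t in grid])

# usage on simulated exponential lifetimes with independent censoring
rng = np.random.default_rng(1)
n = 5000
T, C = rng.exponential(1.0, n), rng.exponential(2.0, n)
times, delta = np.minimum(T, C), (T <= C).astype(int)
delta_priv = privatize_indicators(delta, alpha=1.0, rng=rng)
grid = np.linspace(0.1, 1.5, 5)
print(cumulative_hazard_ldp(times, delta_priv, alpha=1.0, grid=grid))
print("true cumulative hazard on the grid:", grid)   # Lambda(t) = t for Exp(1)
```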

Gaussian processes are a widely embraced technique for regression and classification due to their good prediction accuracy, analytical tractability and built-in capabilities for uncertainty quantification. However, they suffer from the curse of dimensionality whenever the number of variables increases. This challenge is generally addressed by assuming additional structure in the problem, the preferred options being either additivity or low intrinsic dimensionality. Our contribution to high-dimensional Gaussian process modeling is to combine these structural assumptions with a multi-fidelity strategy, showcasing the advantages through experiments on synthetic functions and datasets.
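A minimal two-fidelity sketch in the Kennedy-O'Hagan AR(1) style is shown below, using scikit-learn with an ARD RBF kernel. It illustrates only the multi-fidelity layer; the paper additionally combines it with additive or low-dimensional structure, which this sketch omits, and the toy functions are our own.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def fit_two_fidelity_gp(X_lo, y_lo, X_hi, y_hi):
    """AR(1)-style two-fidelity GP: f_hi(x) ~ rho * f_lo(x) + delta(x)."""
    kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(X_lo.shape[1]))
    gp_lo = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_lo, y_lo)

    m_lo_at_hi = gp_lo.predict(X_hi)
    rho = float(m_lo_at_hi @ y_hi / (m_lo_at_hi @ m_lo_at_hi))   # least-squares scale
    gp_delta = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(
        X_hi, y_hi - rho * m_lo_at_hi)

    def predict(X):
        return rho * gp_lo.predict(X) + gp_delta.predict(X)
    return predict

# usage on a toy pair of cheap/expensive functions in d = 8 dimensions
def f_hi(X): return np.sin(3 * X[:, 0]) + 0.2 * X[:, 1] ** 2     # expensive truth
def f_lo(X): return 0.8 * f_hi(X) + 0.1 * X[:, 0]                # cheap, biased proxy

rng = np.random.default_rng(2)
d = 8
X_lo, X_hi = rng.uniform(-1, 1, (200, d)), rng.uniform(-1, 1, (25, d))
predict = fit_two_fidelity_gp(X_lo, f_lo(X_lo), X_hi, f_hi(X_hi))
X_test = rng.uniform(-1, 1, (5, d))
print(np.c_[predict(X_test), f_hi(X_test)])                      # prediction vs truth
```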

Most algorithms for representation learning and link prediction in relational data have been designed for static data. However, the data they are applied to usually evolves with time, such as friend graphs in social networks or user interactions with items in recommender systems. This is also the case for knowledge bases, which contain facts such as (US, has president, B. Obama, [2009-2017]) that are valid only at certain points in time. For the problem of link prediction under temporal constraints, i.e., answering queries such as (US, has president, ?, 2012), we propose a solution inspired by the canonical decomposition of tensors of order 4. We introduce new regularization schemes and present an extension of ComplEx (Trouillon et al., 2016) that achieves state-of-the-art performance. Additionally, we propose a new dataset for knowledge base completion constructed from Wikidata, larger than previous benchmarks by an order of magnitude, as a new reference for evaluating temporal and non-temporal link prediction methods.
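The order-4, ComplEx-style scoring such temporal models rely on has the general shape sketched below (random embeddings, numpy only); the paper's exact parameterization of timestamps and its regularization schemes differ, so this is only a schematic illustration of how a query like (US, has president, ?, 2012) is scored and ranked.

```python
import numpy as np

def tcomplex_score(e_s, w_r, e_o, w_t):
    """ComplEx-style order-4 score: Re( < e_s, w_r * w_t, conj(e_o) > ).

    e_s, e_o : complex subject/object embeddings, shape (d,)
    w_r, w_t : complex relation and timestamp embeddings, shape (d,)
    """
    return np.real(np.sum(e_s * w_r * w_t * np.conj(e_o)))

# answering (subject, relation, ?, timestamp): score and rank all candidate objects
d, n_entities = 64, 1000
rng = np.random.default_rng(3)
E = rng.normal(size=(n_entities, d)) + 1j * rng.normal(size=(n_entities, d))
w_r = rng.normal(size=d) + 1j * rng.normal(size=d)      # relation embedding
w_t = rng.normal(size=d) + 1j * rng.normal(size=d)      # timestamp embedding
s = 42                                                  # index of the subject entity

# vectorized version of tcomplex_score over every candidate object
scores = np.real((E[s] * w_r * w_t) @ np.conj(E.T))
print(np.argsort(scores)[::-1][:5])                     # top-5 predicted objects
```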
