
Inter-rater reliability (IRR) has been the prevalent measure of quality and precision for ratings from multiple raters. However, applicant selection procedures based on ratings from multiple raters usually result in a binary outcome. This final outcome is not considered in IRR, which instead focuses on the ratings of the individual subjects or objects. In this work, we outline how to transform selection procedures into a binary classification framework and develop a quantile approximation that connects a measurement model for the ratings with the binary classification framework. The quantile approximation allows us to estimate the probability of correctly selecting the best applicants and to assess error probabilities when evaluating the quality of selection procedures that use ratings from multiple raters. We draw connections between inter-rater reliability and binary classification metrics, showing that the binary classification metrics depend solely on the IRR coefficient and the proportion of selected applicants. We assess the performance of the quantile approximation in a simulation study and apply it in an example comparing the reliability of multiple grant peer review selection procedures.
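
As a rough illustration of how such binary classification metrics can follow from the IRR coefficient and the selection proportion alone, the sketch below assumes a simple latent bivariate-normal model in which an applicant's true quality and the observed mean rating correlate at the square root of the IRR; the function and model choices are illustrative assumptions, not the paper's exact quantile approximation.

```python
# Illustrative sketch (assumed latent bivariate-normal model, not the paper's
# exact quantile approximation): probability that an applicant who truly
# belongs to the top fraction p is also selected by the observed ratings.
import numpy as np
from scipy.stats import norm, multivariate_normal

def selection_metrics(irr, p):
    """Sensitivity and specificity of selecting the top fraction `p`
    when observed mean ratings correlate sqrt(irr) with true quality."""
    rho = np.sqrt(irr)                 # corr(true quality, observed rating)
    z = norm.ppf(1.0 - p)              # common selection cut-off
    cov = [[1.0, rho], [rho, 1.0]]
    # P(true > z, observed > z) via inclusion-exclusion on the joint CDF
    p_both = 1.0 - 2.0 * norm.cdf(z) + multivariate_normal.cdf(
        [z, z], mean=[0.0, 0.0], cov=cov)
    sensitivity = p_both / p                              # P(selected | truly top p)
    specificity = (1.0 - 2.0 * p + p_both) / (1.0 - p)    # P(rejected | not top p)
    return sensitivity, specificity

print(selection_metrics(irr=0.6, p=0.2))
```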

Related content

In this paper we study estimating Generalized Linear Models (GLMs) in the setting where the agents (individuals) are strategic or self-interested and are concerned about their privacy when reporting data. Compared with the classical setting, here we aim to design mechanisms that both incentivize most agents to truthfully report their data and preserve the privacy of individuals' reports, while the output should also be close to the underlying parameter. In the first part of the paper, we consider the case where the covariates are sub-Gaussian and the responses are heavy-tailed, having only finite fourth moments. First, motivated by the stationarity condition of the maximizer of the likelihood function, we derive a novel private and closed-form estimator. Based on this estimator and an appropriate design of the computation and payment scheme, we propose a mechanism for several canonical models, such as linear regression, logistic regression and Poisson regression, with the following properties: (1) the mechanism is $o(1)$-jointly differentially private (with probability at least $1-o(1)$); (2) it is an $o(\frac{1}{n})$-approximate Bayes Nash equilibrium for a $(1-o(1))$-fraction of agents to truthfully report their data, where $n$ is the number of agents; (3) the output achieves an error of $o(1)$ with respect to the underlying parameter; (4) the mechanism is individually rational for a $(1-o(1))$-fraction of agents; (5) the payment budget required by the analyst to run the mechanism is $o(1)$. In the second part, we consider the linear regression model under a more general setting where both covariates and responses are heavy-tailed and only have finite fourth moments. By using an $\ell_4$-norm shrinkage operator, we propose a private estimator and payment scheme with properties similar to those in the sub-Gaussian case.
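
As a rough, hedged sketch of what a closed-form noise-perturbed estimator can look like, the snippet below applies plain output perturbation to the ordinary least-squares solution for linear regression; the noise scale `sigma` is a placeholder, and the mechanism's actual privacy calibration, truthfulness analysis and payment scheme are not reproduced here.

```python
# Minimal sketch of a closed-form, noise-perturbed linear-regression estimator.
# This is a simplified stand-in for the paper's private estimator: the noise
# scale `sigma` is a placeholder and would need to be calibrated to the actual
# privacy parameters and the bounds on covariates and responses.
import numpy as np

def private_ols(X, y, sigma=0.1, rng=None):
    rng = np.random.default_rng(rng)
    n, d = X.shape
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)           # ordinary least squares
    noise = rng.normal(scale=sigma / np.sqrt(n), size=d)    # output perturbation
    return beta_hat + noise

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.standard_t(df=5, size=1000)         # heavier-tailed noise
print(private_ols(X, y, sigma=0.5, rng=1))
```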

Multivariate time series (MTS) is a universal data type involved in many practical applications. However, MTS suffers from missing-data problems, which lead to degradation or even collapse of downstream tasks such as prediction and classification. Existing missing-data handling procedures inevitably introduce biased estimation and redundant training when multiple downstream tasks are involved. This paper presents a universally applicable MTS pre-training model, DBT-DMAE, to overcome this obstacle. First, a missing-representation module is designed, introducing dynamic positional embedding and random masking to characterize the missing patterns. Second, we propose an auto-encoder structure that obtains a generalized encoded representation of the MTS, using an improved TCN structure called the dynamic bidirectional TCN as its basic unit, which integrates a dynamic kernel and a time-flipping trick to extract temporal features effectively. Finally, an overall feed-in and loss strategy is established to ensure adequate training of the whole model. Comparative experiments show that DBT-DMAE outperforms other state-of-the-art methods on six real-world datasets and two downstream tasks. Moreover, ablation and interpretability experiments are conducted to verify the validity of DBT-DMAE's substructures.
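
The masking-and-reconstruction idea can be sketched compactly; in the toy model below a plain Conv1d autoencoder stands in for the dynamic bidirectional TCN, and the learnable mask token and masking ratio are illustrative assumptions rather than the paper's exact design.

```python
# Sketch of masked-autoencoder pre-training for multivariate time series.
# A plain Conv1d autoencoder stands in for the dynamic bidirectional TCN;
# the masking ratio and learnable mask token mirror the general idea only.
import torch
import torch.nn as nn

class TinyMTSMAE(nn.Module):
    def __init__(self, n_features, hidden=64, mask_ratio=0.3):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.mask_token = nn.Parameter(torch.zeros(n_features))
        self.encoder = nn.Conv1d(n_features, hidden, kernel_size=3, padding=1)
        self.decoder = nn.Conv1d(hidden, n_features, kernel_size=3, padding=1)

    def forward(self, x):                     # x: (batch, time, features)
        mask = torch.rand(x.shape[:2], device=x.device) < self.mask_ratio
        x_masked = torch.where(mask.unsqueeze(-1), self.mask_token, x)
        h = torch.relu(self.encoder(x_masked.transpose(1, 2)))
        recon = self.decoder(h).transpose(1, 2)
        loss = ((recon - x) ** 2)[mask].mean()  # loss only on masked positions
        return loss, recon

model = TinyMTSMAE(n_features=8)
loss, _ = model(torch.randn(4, 100, 8))
loss.backward()
```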

Reconstructing 3D objects from 2D images is challenging for both our brains and machine learning algorithms. To support this spatial reasoning task, contextual information about the overall shape of an object is critical. However, such information is not captured by established loss terms (e.g. the Dice loss). We propose to complement geometrical shape information by including multi-scale topological features, such as connected components, cycles, and voids, in the reconstruction loss. Our method uses cubical complexes to calculate topological features of 3D volume data and employs an optimal transport distance to guide the reconstruction process. This topology-aware loss is fully differentiable, computationally efficient, and can be added to any neural network. We demonstrate the utility of our loss by incorporating it into SHAPR, a model for predicting the 3D cell shape of individual cells based on 2D microscopy images. Using a hybrid loss that leverages both the geometrical and topological information of single objects to assess their shape, we find that topological information substantially improves the quality of reconstructions, thus highlighting its ability to extract more relevant features from image datasets.
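
The basic ingredients of such a topological comparison, cubical persistence diagrams and an optimal-transport distance between them, can be illustrated with the `gudhi` library (assuming it is installed together with POT); unlike the paper's loss, this plain version is not differentiable and is meant only as a sketch.

```python
# Illustration of the ingredients of a topology-aware comparison between two
# 3D volumes: cubical persistence diagrams plus an optimal-transport distance.
# Assumes the `gudhi` library (with POT installed for Wasserstein distances);
# unlike the paper's loss, this plain version is not differentiable.
import numpy as np
import gudhi
from gudhi.wasserstein import wasserstein_distance

def topo_distance(vol_a, vol_b, max_dim=2):
    total = 0.0
    for dim in range(max_dim + 1):            # components, cycles, voids
        diags = []
        for vol in (vol_a, vol_b):
            cc = gudhi.CubicalComplex(top_dimensional_cells=vol)
            cc.persistence()
            d = cc.persistence_intervals_in_dimension(dim).reshape(-1, 2)
            diags.append(d[np.isfinite(d).all(axis=1)])  # drop essential bars
        total += wasserstein_distance(diags[0], diags[1], order=1.0)
    return total

rng = np.random.default_rng(0)
print(topo_distance(rng.random((16, 16, 16)), rng.random((16, 16, 16))))
```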

Variational Bayesian posterior inference often requires simplifying approximations such as mean-field parametrisation to ensure tractability. However, prior work has associated the variational mean-field approximation for Bayesian neural networks with underfitting in the case of small datasets or large model sizes. In this work, we show that invariances in the likelihood function of over-parametrised models contribute to this phenomenon because these invariances complicate the structure of the posterior by introducing discrete and/or continuous modes which cannot be well approximated by Gaussian mean-field distributions. In particular, we show that the mean-field approximation has an additional gap in the evidence lower bound compared to a purpose-built posterior that takes into account the known invariances. Importantly, this invariance gap is not constant; it vanishes as the approximation reverts to the prior. We proceed by first considering translation invariances in a linear model with a single data point in detail. We show that, while the true posterior can be constructed from a mean-field parametrisation, this is achieved only if the objective function takes into account the invariance gap. Then, we transfer our analysis of the linear model to neural networks. Our analysis provides a framework for future work to explore solutions to the invariance problem.
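
For concreteness, the simplest translation invariance of this kind can be written down for a two-parameter linear model; the display below is an illustrative reconstruction of the standard argument rather than the paper's exact setup.

```latex
% Toy translation invariance in an over-parametrised linear model:
% the likelihood depends on (w_1, w_2) only through their sum.
\[
  f_{w}(x) = (w_1 + w_2)\,x,
  \qquad
  p\bigl(y \mid x,\, w_1 + c,\, w_2 - c\bigr) = p\bigl(y \mid x,\, w_1, w_2\bigr)
  \quad \text{for all } c \in \mathbb{R}.
\]
```

The posterior is therefore constant along the ridge $\{(w_1 + c,\, w_2 - c) : c \in \mathbb{R}\}$, a continuous mode that a factorised Gaussian $q(w_1)\,q(w_2)$ cannot represent exactly; this is the structure behind the invariance gap discussed above.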

Time series classification is an important problem in the real world. Because time series are non-stationary, i.e., their distribution changes over time, it remains challenging to build models that generalize to unseen distributions. In this paper, we propose to view the time series classification problem from a distribution perspective. We argue that the temporal complexity is attributable to unknown latent distributions within the data. To this end, we propose DIVERSIFY to learn generalized representations for time series classification. DIVERSIFY follows an iterative process: it first obtains the worst-case distribution scenario via adversarial training, then matches the distributions of the obtained sub-domains. We also present some theoretical insights. We conduct experiments on gesture recognition, speech command recognition, wearable stress and affect detection, and sensor-based human activity recognition, with a total of seven datasets in different settings. Results demonstrate that DIVERSIFY significantly outperforms other baselines and effectively characterizes the latent distributions, as shown by qualitative and quantitative analysis.
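
A heavily simplified sketch of the iterate-then-align idea is given below: k-means on intermediate features stands in for the paper's adversarial worst-case sub-domain characterization, and a mean-feature alignment penalty stands in for its distribution matching; all class and function names are made up for illustration.

```python
# Highly simplified sketch of the iterate-then-align idea: (1) characterize
# latent sub-domains (here by k-means on features, standing in for adversarial
# worst-case assignment) and (2) penalize the mismatch between sub-domain
# feature means while training the classifier.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def train_step(model, x, y, n_domains=3, align_weight=0.1):
    feats = model.features(x)                              # (batch, dim)
    # Step 1: pseudo sub-domain labels (proxy for worst-case characterization)
    domains = KMeans(n_clusters=n_domains, n_init=10).fit_predict(
        feats.detach().cpu().numpy())
    domains = torch.as_tensor(domains, device=x.device)
    # Step 2: classification loss + alignment of sub-domain feature means
    logits = model.classifier(feats)
    loss = nn.functional.cross_entropy(logits, y)
    means = [feats[domains == d].mean(dim=0)
             for d in range(n_domains) if (domains == d).any()]
    for i in range(len(means)):
        for j in range(i + 1, len(means)):
            loss = loss + align_weight * (means[i] - means[j]).pow(2).mean()
    return loss

class TinyNet(nn.Module):
    def __init__(self, in_dim=16, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU())
        self.classifier = nn.Linear(32, n_classes)

loss = train_step(TinyNet(), torch.randn(64, 16), torch.randint(0, 4, (64,)))
loss.backward()
```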

To efficiently analyse system reliability, graphical tools such as fault trees and Bayesian networks are widely adopted. In this article, instead of conventional graphical tools, we apply a probabilistic graphical model called the chain event graph (CEG) to represent the failure and deterioration processes of a system. The CEG is derived from an event tree and can flexibly represent the unfolding of asymmetric processes. We customise a domain-specific intervention on the CEG, called the remedial intervention, for maintenance. This fixes the root causes of a failure and returns the status of the system to as good as new: a novel type of intervention designed specifically for reliability applications. The semantics of the CEG are expressive enough to capture the necessary intervention calculus. Furthermore, through its bespoke causal algebras, the CEG provides a transparent framework to guide and express the rationale behind predictive inferences about the effects of various types of remedial intervention. A back-door theorem is adapted to apply to these interventions to help discover when causal effects can be identified from a partially observed system.
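
As a toy illustration only (the CEG semantics and causal algebras are not reproduced here), the snippet below represents a small event tree with edge probabilities and models a remedial intervention as moving the root-cause probability mass back to a healthy, as-good-as-new branch; all stage names and numbers are invented.

```python
# Toy illustration (not the CEG calculus itself): an event tree with edge
# probabilities, and a "remedial intervention" modelled as returning the
# root-cause probability mass to the healthy (as-good-as-new) branch.
# All stage and edge names are made up for illustration.
from copy import deepcopy

tree = {
    "start":     {"wear": 0.3, "corrosion": 0.1, "healthy": 0.6},
    "wear":      {"failure": 0.5, "degraded": 0.5},
    "corrosion": {"failure": 0.8, "degraded": 0.2},
    "healthy":   {"failure": 0.05, "ok": 0.95},      # residual failure rate
}

def p_failure(tree):
    """Probability of failure after two stages of the toy event tree."""
    return sum(p * tree.get(child, {}).get("failure", 0.0)
               for child, p in tree["start"].items())

def remedial_intervention(tree, root_cause):
    """Fix the root cause: its probability mass moves to the healthy branch,
    i.e. the system is returned to an as-good-as-new state."""
    fixed = deepcopy(tree)
    mass = fixed["start"].pop(root_cause, 0.0)
    fixed["start"]["healthy"] = fixed["start"].get("healthy", 0.0) + mass
    return fixed

print("P(failure) before:", p_failure(tree))
print("P(failure) after fixing corrosion:",
      p_failure(remedial_intervention(tree, "corrosion")))
```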

Continuous authentication utilizes automatic recognition of certain user features for seamless and passive authentication without requiring user attention. Such features can be divided into the categories of physiological biometrics and behavioral biometrics. Keystroke dynamics has been proposed for behavioral biometrics-based authentication, recognizing users by means of their typing patterns. However, it has been pointed out that continuous authentication using physiological and behavioral biometrics incurs privacy risks, revealing personal characteristics and activities. In this paper, we consider a previously proposed keystroke dynamics-based authentication scheme that has no privacy-preserving properties. In this regard, we propose a generic privacy-preserving version of this authentication scheme in which all user features are encrypted, preventing their disclosure to the authentication server. Our scheme is generic in the sense that it assumes homomorphic cryptographic primitives. Authentication is conducted on the basis of encrypted data, owing to the homomorphic cryptographic properties of our protocol.
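
A minimal sketch of computing an authentication score over encrypted keystroke features is given below, using the additively homomorphic Paillier scheme from the `phe` library; it is deliberately simpler than the paper's protocol, as the server here holds a plaintext reference template and only the fresh features stay encrypted.

```python
# Simplified sketch of scoring keystroke features under homomorphic encryption
# using the additively homomorphic Paillier scheme from the `phe` library.
# Unlike the paper's scheme, the server here holds a *plaintext* reference
# template; only the user's fresh features are encrypted. The server computes
# an encrypted squared distance that only the key holder can decrypt.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# Client side: encrypt fresh keystroke timing features x_i and their squares.
x = [0.12, 0.30, 0.21]                      # e.g. inter-key latencies (seconds)
enc_x = [public_key.encrypt(v) for v in x]
enc_x2 = [public_key.encrypt(v * v) for v in x]

# Server side: plaintext template t_i; compute Enc(sum_i (x_i - t_i)^2)
# using only homomorphic additions and plaintext multiplications.
template = [0.10, 0.28, 0.25]
enc_score = sum(ex2 - 2 * t * ex + t * t
                for ex, ex2, t in zip(enc_x, enc_x2, template))

# Key holder decrypts the score and applies a threshold.
print("squared distance:", private_key.decrypt(enc_score))
```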

Modern synthetic data generators consist of model-based methods where the focus is primarily on tuning the parameters of the model rather than on specifying the structure of the data itself. Scagnostics is an exploratory graphical method capable of encapsulating the structure of bivariate data through graph-theoretic measures. An inverse scagnostic measure would therefore provide an entry point for generating datasets based on the characteristics of the instance space rather than on a model-based simulation approach. scatteR is a novel data generation method with controllable characteristics based on scagnostic measurements. We use a Generalized Simulated Annealing optimizer iteratively to discover, at each iteration, the arrangement of data points that minimizes the distance between the current and target measurements. As a pedagogical tool, scatteR can be used to generate datasets for teaching statistical methods. Based on the results of this study, scatteR is capable of generating 50 data points in under 30 seconds with an average Root Mean Squared Error of 0.05.
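
The optimize-points-to-match-targets idea can be sketched with scipy's `dual_annealing` (a generalized simulated annealing variant); simple proxy statistics replace the full scagnostics suite here, since a standard Python scagnostics implementation is not assumed, so the snippet illustrates the mechanism rather than scatteR itself.

```python
# Sketch of the "optimize points to match target measurements" idea using
# scipy's dual_annealing (a generalized simulated annealing variant). Simple
# proxy statistics (correlation, spread) stand in for the full scagnostics
# suite, which is not assumed to be available as a Python library here.
import numpy as np
from scipy.optimize import dual_annealing

N_POINTS = 50
TARGET = np.array([0.8, 0.25])             # target correlation and spread

def measurements(flat_xy):
    pts = flat_xy.reshape(N_POINTS, 2)
    corr = np.corrcoef(pts[:, 0], pts[:, 1])[0, 1]
    spread = pts.std(axis=0).mean()
    return np.array([corr, spread])

def objective(flat_xy):
    return np.sqrt(np.mean((measurements(flat_xy) - TARGET) ** 2))  # RMSE

bounds = [(0.0, 1.0)] * (N_POINTS * 2)
result = dual_annealing(objective, bounds, maxiter=100)
print("achieved RMSE:", result.fun)
points = result.x.reshape(N_POINTS, 2)      # generated scatterplot
```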

Substantial progress has been made recently on developing provably accurate and efficient algorithms for low-rank matrix factorization via nonconvex optimization. While conventional wisdom often takes a dim view of nonconvex optimization algorithms due to their susceptibility to spurious local minima, simple iterative methods such as gradient descent have been remarkably successful in practice. The theoretical footings, however, had been largely lacking until recently. In this tutorial-style overview, we highlight the important role of statistical models in enabling efficient nonconvex optimization with performance guarantees. We review two contrasting approaches: (1) two-stage algorithms, which consist of a tailored initialization step followed by successive refinement; and (2) global landscape analysis and initialization-free algorithms. Several canonical matrix factorization problems are discussed, including but not limited to matrix sensing, phase retrieval, matrix completion, blind deconvolution, robust principal component analysis, phase synchronization, and joint alignment. Special care is taken to illustrate the key technical insights underlying their analyses. This article serves as a testament that the integrated consideration of optimization and statistics leads to fruitful research findings.
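
A minimal instance of the two-stage recipe described above, spectral initialization followed by gradient descent on a factorized least-squares objective, is sketched below for low-rank matrix completion; the step size and iteration count are illustrative choices rather than the tuned constants from the literature.

```python
# Minimal instance of the two-stage recipe the overview describes: spectral
# initialization followed by gradient descent on the factorized objective
# f(U, V) = ||P_Omega(U V^T - M)||_F^2 for low-rank matrix completion.
# Step size and iteration count are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n, r, p_obs = 100, 3, 0.3
M = rng.normal(size=(n, r)) @ rng.normal(size=(r, n))    # rank-r ground truth
mask = rng.random((n, n)) < p_obs                        # observed entries

# Stage 1: spectral initialization from the rescaled observed matrix
U0, s, Vt = np.linalg.svd(M * mask / p_obs)
U = U0[:, :r] * np.sqrt(s[:r])
V = Vt[:r, :].T * np.sqrt(s[:r])

# Stage 2: gradient descent on the factorized least-squares objective
eta = 0.002
for _ in range(500):
    R = (U @ V.T - M) * mask                             # masked residual
    U, V = U - eta * R @ V, V - eta * R.T @ U
print("relative error:", np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))
```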

Graph convolutional networks (GCNs) have been successfully applied to node classification tasks in network mining. However, most of these models, being based on neighborhood aggregation, are usually shallow and lack a "graph pooling" mechanism, which prevents them from obtaining adequate global information. In order to increase the receptive field, we propose a novel deep Hierarchical Graph Convolutional Network (H-GCN) for semi-supervised node classification. H-GCN first repeatedly aggregates structurally similar nodes into hyper-nodes and then refines the coarsened graph back to the original one to restore the representation of each node. Instead of merely aggregating one- or two-hop neighborhood information, the proposed coarsening procedure enlarges the receptive field of each node, so that more global information can be learned. Comprehensive experiments conducted on public datasets demonstrate the effectiveness of the proposed method over state-of-the-art methods. Notably, our model gains substantial improvements when only a few labeled samples are provided.
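
The coarsen-then-refine idea can be sketched with a fixed assignment matrix that groups structurally similar nodes into hyper-nodes; the snippet below shows only the pooling algebra (A_c = S^T A S, X_c = S^T X, refinement X ≈ S X_c), not the learned H-GCN layers.

```python
# Minimal sketch of the coarsen-then-refine idea: a (here hand-fixed)
# assignment matrix S groups structurally similar nodes into hyper-nodes,
# the graph and features are coarsened, and features are later projected
# back to the original nodes. The real H-GCN learns this within GCN layers.
import numpy as np

A = np.array([[0, 1, 1, 0, 0],              # toy 5-node adjacency matrix
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
X = np.eye(5)                                # one-hot node features

# Assignment of the 5 nodes to 3 hyper-nodes (nodes 0,1 merge; nodes 3,4 merge)
S = np.array([[1, 0, 0],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1],
              [0, 0, 1]], dtype=float)

A_coarse = S.T @ A @ S                       # hyper-node adjacency (weighted)
X_coarse = S.T @ X                           # pooled hyper-node features
X_refined = S @ X_coarse                     # project back to original nodes
print(A_coarse)
```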
