Currently, there is limited research investigating the phenomenon of research data repositories being shut down and the impact this has on the long-term availability of data. This paper takes an infrastructure perspective on the preservation of research data by using a registry to identify 191 research data repositories that have been closed and presenting information on the shutdown process. The results show that 6.2% of the research data repositories indexed in the registry have been shut down. The risks resulting in repository shutdown are varied. The median age of a repository at shutdown is 12 years. Strategies to prevent data loss at the infrastructure level are pursued to varying extents: 44% of the repositories in the sample migrated data to another repository, and 12% maintain limited access to their data collection. However, neither strategy is a permanent solution. Finally, the general lack of information on repository shutdown events, as well as its effects on the findability of data and the permanence of the scholarly record, is discussed.
Sequential neural posterior estimation (SNPE) techniques have recently been proposed for dealing with simulation-based models with intractable likelihoods. Unlike approximate Bayesian computation, SNPE techniques learn the posterior from sequential simulations by minimizing a specific loss function over a neural network-based conditional density estimator. The SNPE method proposed by Lueckmann et al. (2017) uses a calibration kernel to boost the sample weights around the observed data, resulting in a more concentrated loss function. However, the calibration kernel may increase the variances of both the empirical loss and its gradient, making training inefficient. To improve the stability of SNPE, this paper proposes an adaptive calibration kernel together with several variance reduction techniques. The proposed method greatly speeds up training and yields a better approximation of the posterior than the original SNPE method and some existing competitors, as confirmed by numerical experiments.
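To make the weighting scheme concrete, here is a minimal sketch of a calibration-kernel-weighted, SNPE-B-style loss. The Gaussian kernel, the quantile-based bandwidth heuristic, and the weight normalization (a standard self-normalized importance-sampling trick for variance reduction) are illustrative assumptions, not the paper's exact adaptive rule.

```python
import torch

def calibration_weights(x_sim, x_obs, quantile=0.5):
    """Gaussian calibration kernel K(x_n, x_o) with a simple adaptive bandwidth."""
    d = torch.linalg.norm(x_sim - x_obs, dim=-1)        # distances to observed data
    tau = torch.quantile(d, quantile).clamp(min=1e-8)   # bandwidth from a distance quantile
    return torch.exp(-0.5 * (d / tau) ** 2)

def snpe_b_loss(log_q, x_sim, x_obs, importance_w):
    """Weighted negative log posterior density; log_q = log q_phi(theta_n | x_n)."""
    k = calibration_weights(x_sim, x_obs)
    w = importance_w * k
    w = w / w.sum()                                     # self-normalize to reduce variance
    return -(w * log_q).sum()
```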
Machine learning models are deployed as central components in decision making and policy operations with direct impact on individuals' lives. In order to act ethically and comply with government regulations, these models need to make fair decisions and protect the users' privacy. However, such requirements can come with a decrease in model performance compared to potentially biased, privacy-leaking counterparts. A trade-off between fairness, privacy and performance of ML models therefore emerges, and practitioners need a way of quantifying this trade-off to make deployment decisions. In this work we interpret this trade-off as a multi-objective optimization problem and propose PFairDP, a pipeline that uses Bayesian optimization to discover Pareto-optimal points between fairness, privacy and utility of ML models. We show how PFairDP can be used to replicate known results that were originally achieved through a manual constraint-setting process. We further demonstrate the effectiveness of PFairDP with experiments on multiple models and datasets.
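The Pareto-dominance filter at the heart of such a pipeline can be sketched in a few lines. Each point is a hypothetical objective vector (fairness violation, privacy cost, error), all treated as minimized; the Bayesian-optimization loop that PFairDP uses to propose configurations is omitted here.

```python
import numpy as np

def pareto_front(points):
    """Return the subset of objective vectors not dominated by any other
    (dominated = another point is no worse everywhere and better somewhere)."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        others = np.delete(pts, i, axis=0)
        dominated = np.any(np.all(others <= p, axis=1) & np.any(others < p, axis=1))
        if not dominated:
            keep.append(i)
    return pts[keep]

# e.g. pareto_front([[0.1, 0.5, 0.2], [0.2, 0.6, 0.3], [0.05, 0.9, 0.25]])
# drops the second point, which the first dominates.
```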
Adjustment of statistical significance levels for repeated analysis in group sequential trials has been understood for some time. Similarly, methods for adjustment accounting for testing multiple hypotheses are common. There is limited research, however, on simultaneously adjusting for both multiple hypothesis testing and multiple analyses of one or more hypotheses. We address this gap by proposing adjusted sequential p-values in a group sequential design: an elementary hypothesis is rejected when its adjusted sequential p-value is less than or equal to the family-wise Type I error rate (FWER). We also propose sequential p-values for intersection hypotheses as a tool to compute adjusted sequential p-values for elementary hypotheses. We demonstrate the application using weighted Bonferroni tests and weighted parametric tests, comparing adjusted sequential p-values to a desired FWER for inference on each elementary hypothesis tested.
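For intuition, the sketch below shows closed testing with weighted Bonferroni intersection tests, the non-sequential core of this construction; the sequential aspect (combining evidence across interim analyses) is not implemented here, and the within-subset weight renormalization is one common convention.

```python
from itertools import combinations

def weighted_bonferroni_p(p, w):
    """Intersection-hypothesis p-value: min_i p_i / w_i, capped at 1
    (weights assumed positive and summing to 1)."""
    return min(1.0, min(pi / wi for pi, wi in zip(p, w)))

def adjusted_p(p_values, weights):
    """Closed test: adjusted p for H_j = max p_I over all intersections I containing j.
    Rejecting H_j iff adjusted_p[j] <= alpha controls the FWER at alpha."""
    m = len(p_values)
    adj = [0.0] * m
    for r in range(1, m + 1):
        for I in combinations(range(m), r):
            w = [weights[i] for i in I]
            w = [wi / sum(w) for wi in w]          # renormalize weights within I
            pI = weighted_bonferroni_p([p_values[i] for i in I], w)
            for j in I:
                adj[j] = max(adj[j], pI)
    return adj
```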
Currently, the cloud computing paradigm is experiencing rapid growth as workloads shift towards it from other distributed computing methods and traditional IT infrastructure. Consequently, optimised task scheduling techniques have become crucial in managing the expanding cloud computing environment. In cloud computing, numerous tasks need to be scheduled on a limited number of diverse virtual machines to minimise the degree of imbalance and optimise system utilisation. Task scheduling is NP-complete, meaning that no polynomial-time exact algorithm is known and only near-optimal results are achievable in practice, particularly for large-scale task sets in the cloud computing context. This paper proposes an optimised strategy, Cuckoo-based Discrete Symbiotic Organisms Search (C-DSOS), which incorporates Levy flight for optimal task scheduling in the cloud computing environment to minimise the degree of imbalance. The strategy is based on the standard Symbiotic Organisms Search (SOS), a nature-inspired metaheuristic optimisation algorithm designed for numerical optimisation problems. SOS simulates the symbiotic relationships observed in ecosystems, such as mutualism, commensalism, and parasitism. To evaluate the proposed technique, experiments were conducted with the CloudSim toolkit simulator. The results demonstrate that C-DSOS outperforms the Simulated Annealing Symbiotic Organism Search (SASOS) algorithm, a benchmark algorithm commonly used for task scheduling problems. C-DSOS exhibits a favourable convergence rate, especially for larger search spaces, making it suitable for task scheduling problems in the cloud. A t-test reveals that the improvement of C-DSOS over the benchmarked SASOS algorithm is statistically significant, particularly for scenarios involving a large search space.
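As an illustration of two ingredients named above, this sketch draws a Levy-flight step via Mantegna's algorithm and computes the degree of imbalance from per-VM completion times; the discrete task-to-VM encoding of C-DSOS itself is not reproduced, and the function names are illustrative.

```python
import math
import numpy as np

def levy_step(dim, beta=1.5, rng=None):
    """Draw a Levy-distributed step (Mantegna's algorithm), as in Cuckoo Search."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)

def degree_of_imbalance(completion_times):
    """DI = (T_max - T_min) / T_avg over per-VM completion times; lower is better."""
    t = np.asarray(completion_times, dtype=float)
    return (t.max() - t.min()) / t.mean()
```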
In a number of information retrieval applications (e.g., patent search, literature review, due diligence), preventing false negatives is more important than preventing false positives. However, approaches designed to reduce review effort (like "technology assisted review") can create false negatives, since they are often based on active learning systems that exclude documents automatically based on user feedback. This research therefore proposes a more recall-oriented approach to reducing review effort: iteratively re-ranking the relevance rankings based on user feedback, also referred to as relevance feedback. In our proposed method, the relevance rankings are produced by a BERT-based dense-vector search, and the relevance feedback is based on cumulatively summing the queried and selected embeddings. Our results show that this method can reduce review effort by between 17.85% and 59.04% compared to a baseline approach with no feedback, given a fixed recall target.
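A minimal sketch of the described feedback loop, assuming unit-normalized dense embeddings; the function and variable names are illustrative, not taken from the paper's code.

```python
import numpy as np

def rerank(query_vec, doc_vecs, selected_idx):
    """Cumulatively sum the query and selected-document embeddings, then re-rank.

    query_vec: (d,) unit-normalized query embedding
    doc_vecs:  (n, d) unit-normalized document embeddings
    selected_idx: indices of documents the reviewer marked relevant so far
    """
    feedback = query_vec + doc_vecs[selected_idx].sum(axis=0)
    feedback /= np.linalg.norm(feedback)        # keep scores on a comparable scale
    scores = doc_vecs @ feedback                # cosine similarity for unit vectors
    return np.argsort(-scores)                  # best-first review order

# Each review round: show top-ranked unseen docs, add confirmed-relevant
# indices to selected_idx, and call rerank again.
```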
Nonlinear systems arising from time integrators like backward Euler can sometimes be reformulated as optimization problems, known as incremental potentials. We show through a comprehensive experimental analysis that the widely used Projected Newton method, which relies on unconditional semidefinite projection of Hessian contributions, typically exhibits a reduced convergence rate compared to classical Newton's method. We demonstrate how factors like resolution, element order, projection method, material model and boundary handling impact the convergence of Projected Newton and Newton. Drawing on these findings, we propose the hybrid method Project-on-Demand Newton, which projects only conditionally, and show that it enjoys both the robustness of Projected Newton and the convergence rate of Newton. We additionally introduce Kinetic Newton, a regularization-based method that takes advantage of the structure of incremental potentials and avoids projection altogether. We compare the four solvers on hyperelasticity and contact problems. We also present a nuanced discussion of convergence criteria and propose a new acceleration-based criterion that avoids the problems associated with existing residual-norm criteria and is easier to interpret. Finally, we address a fundamental limitation of the Armijo backtracking line search that occasionally blocks convergence, especially for stiff problems, and propose a novel parameter-free, robust line search technique to eliminate this issue.
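The projection at issue can be written in a few lines. Projected Newton applies it to every element Hessian unconditionally, while a project-on-demand scheme skips the already-definite case, as the early return below does. Eigenvalue clamping is one standard way to project onto the semidefinite cone; the paper's exact variant may differ.

```python
import numpy as np

def project_spd(H, eps=0.0):
    """Project a symmetric matrix onto the PSD cone by clamping eigenvalues."""
    lam, Q = np.linalg.eigh(H)      # eigenvalues in ascending order
    if lam[0] >= eps:               # already (semi)definite: projection not needed
        return H
    return (Q * np.maximum(lam, eps)) @ Q.T   # Q diag(clamped) Q^T
```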
Personalized adaptive interventions offer the opportunity to increase patient benefits; however, there are challenges in their planning and implementation. Once implemented, it is an important question whether personalized adaptive interventions are indeed clinically more effective than a fixed gold-standard intervention. In this paper, we present an innovative N-of-1 trial study design that tests whether implementing a personalized intervention via an online reinforcement learning agent is feasible and effective. Throughout, we use a new study on physical exercise recommendations to reduce pain in endometriosis for illustration. We describe the design of a contextual bandit recommendation agent and evaluate the agent in simulation studies. The results show that, first, implementing a personalized intervention by an online reinforcement learning agent is feasible. Second, such adaptive interventions have the potential to improve patients' benefits even if only a few observations are available. As one challenge, they add complexity to the design and implementation process, and data from previous interventional studies are required to quantify the expected benefit. We expect our approach to be transferable to other interventions and clinical settings.
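As one plausible instantiation of a contextual bandit recommendation agent, here is a compact LinUCB sketch; the paper's agent design, context features, and reward definition may differ.

```python
import numpy as np

class LinUCB:
    """Linear contextual bandit with upper-confidence-bound exploration."""

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_arms)]   # per-arm Gram matrices
        self.b = [np.zeros(dim) for _ in range(n_arms)] # per-arm reward sums

    def choose(self, x):
        """Pick the arm with the highest UCB for context vector x."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                            # ridge-regression estimate
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        """Incorporate the observed reward (e.g., change in reported pain)."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```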
Despite much attention, the comparison of reduced-dimension representations of high-dimensional data remains a challenging problem in multiple fields, especially when representations remain high-dimensional compared to sample size. We offer a framework for evaluating the topological similarity of high-dimensional representations of very high-dimensional data, a regime where topological structure is more likely captured in the distribution of topological "noise" than in a few prominent generators. Treating each representational map as a metric embedding, we compute the Vietoris-Rips persistence of its image. We then use the topological bootstrap to analyze the re-sampling stability of each representation, assigning a "prevalence score" to each nontrivial basis element of its persistence module. Finally, we compare the persistent homology of representations using a prevalence-weighted variant of the Wasserstein distance. Notably, our method is able to compare representations derived from different samples of the same distribution and, in particular, is not restricted to comparisons of graphs on the same vertex set. In addition, representations need not lie in the same metric space. We apply this analysis to a cross-sectional sample of representations of functional neuroimaging data in a large cohort and hierarchically cluster the representations under the prevalence-weighted Wasserstein distance. We find that the ambient dimension of a representation is a stronger predictor of the number and stability of topological features than its decomposition rank. Our findings suggest that important topological information lies in repeatable, low-persistence homology generators, whose distributions capture important and interpretable differences between high-dimensional data representations.
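A simplified stand-in for the first two steps (Vietoris-Rips persistence plus bootstrap re-sampling), using the ripser and persim packages; the prevalence scoring and the prevalence-weighted Wasserstein distance, which are the paper's contribution, are not implemented here, and the unweighted Wasserstein shown at the end is only a placeholder.

```python
import numpy as np
from ripser import ripser
from persim import wasserstein

def vr_diagram(X, maxdim=1):
    """Persistence diagram (dimension maxdim) of the Vietoris-Rips filtration on X."""
    return ripser(X, maxdim=maxdim)['dgms'][maxdim]

def bootstrap_diagrams(X, n_boot=20, rng=None):
    """Diagrams of bootstrap re-samples of the point cloud X (an (n, d) array)."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(X)
    return [vr_diagram(X[rng.integers(0, n, n)]) for _ in range(n_boot)]

# Placeholder comparison of two representations' diagrams (unweighted):
# d = wasserstein(vr_diagram(X_a), vr_diagram(X_b))
```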
Learning from human feedback (LHF) -- and in particular learning from pairwise preferences -- has recently become a crucial ingredient in training large language models (LLMs), and has been the subject of much research. Most recent works frame it as a reinforcement learning problem, where a reward function is learned from pairwise preference data and the LLM is treated as a policy which is adapted to maximize the rewards, often under additional regularization constraints. We propose an alternative interpretation which centers on the generative process for pairwise preferences and treats LHF as a density estimation problem. We provide theoretical and empirical results showing that for a family of generative processes defined via preference behavior distribution equations, training a reward function on pairwise preferences effectively models an annotator's implicit preference distribution. Finally, we discuss and present findings on "annotator misspecification" -- failure cases where wrong modeling assumptions are made about annotator behavior, resulting in poorly-adapted models -- suggesting that approaches that learn from pairwise human preferences could have trouble learning from a population of annotators with diverse viewpoints.
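Under a Bradley-Terry generative model, one common instance of such preference behavior distributions, training the reward function reduces to maximum-likelihood density estimation over the observed pairwise choices; the sketch below makes that reading explicit.

```python
import torch
import torch.nn.functional as F

def preference_loss(r_chosen, r_rejected):
    """Negative log-likelihood of observed preferences under Bradley-Terry:
    the annotator prefers `chosen` with probability sigmoid(r_chosen - r_rejected),
    so minimizing -log sigmoid(r_chosen - r_rejected) fits that distribution."""
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# r_chosen / r_rejected are reward-model scores for the preferred and
# dispreferred responses in each annotated pair, e.g. shape (batch,).
```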
Knowledge graphs (KGs) of real-world facts about entities and their relationships are useful resources for a variety of natural language processing tasks. However, because knowledge graphs are typically incomplete, it is useful to perform knowledge graph completion or link prediction, i.e., to predict whether a relationship not in the knowledge graph is likely to be true. This paper serves as a comprehensive survey of embedding models of entities and relationships for knowledge graph completion, summarizing up-to-date experimental results on standard benchmark datasets and pointing out potential future research directions.
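For context, one widely covered embedding model in such surveys is TransE, which scores a triple (h, r, t) by how closely the head embedding translated by the relation embedding approximates the tail embedding; this sketch is illustrative only.

```python
import numpy as np

def transe_score(h, r, t, norm=1):
    """TransE plausibility score for a triple: ||h + r - t||_p (lower is better)."""
    return np.linalg.norm(h + r - t, ord=norm)

# Link prediction ranks candidate tails t' by transe_score(h, r, t'):
# the true tail of a held-out triple should rank near the top.
```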