
As advancements in artificial intelligence (AI) propel progress in the life sciences, they may also enable the weaponisation and misuse of biological agents. This article differentiates two classes of AI tools that could pose such biosecurity risks: large language models (LLMs) and biological design tools (BDTs). LLMs, such as GPT-4 and its successors, might provide dual-use information and thus remove some barriers encountered by historical biological weapons efforts. As LLMs are turned into multi-modal lab assistants and autonomous science tools, their ability to support non-experts in performing laboratory work will grow. Thus, LLMs may in particular lower barriers to biological misuse. In contrast, BDTs will expand the capabilities of sophisticated actors: concretely, BDTs may enable the creation of pandemic pathogens substantially worse than anything seen to date and could enable more predictable and targeted forms of biological weapons. In combination, the convergence of LLMs and BDTs could raise the ceiling of harm from biological agents and could make that harm broadly accessible. A range of interventions would help to manage these risks. Independent pre-release evaluations could help to establish the capabilities of models and the effectiveness of safeguards. Options for differentiated access to such tools should be carefully weighed against the benefits of openly releasing systems. Lastly, universal and enhanced screening of gene synthesis products will be essential for mitigating risks.

Related content

This new edition of the TOOLS conference series revives the tradition of the 50 conferences held from 1989 to 2012. TOOLS began as "Technology of Object-Oriented Languages and Systems" and later expanded to cover all innovative aspects of software technology. Many of today's most important software concepts were first introduced there. TOOLS 50+1, held in 2019 near Kazan, Russia, continued the series in the same innovative spirit, with a passion for everything related to software, a combination of scientific rigour and industrial applicability, and an openness to all trends and communities in the field.
February 9, 2024

Explainable artificial intelligence (XAI) is one of the most intensively developed areas of AI in recent years. It is also one of the most fragmented, with multiple methods focusing on different aspects of explanations. This makes it difficult to obtain the full spectrum of explanations at once in a compact and consistent way. To address this issue, we present the Local Universal Explainer (LUX), a rule-based explainer that can generate factual, counterfactual and visual explanations. It is based on a modified version of decision tree algorithms that allows for oblique splits and integration with feature-importance XAI methods such as SHAP or LIME. Unlike other algorithms, it does not rely on data generation; instead, it focuses on selecting local concepts in the form of high-density clusters of real data that have the highest impact on forming the decision boundary of the explained model. We tested our method on real and synthetic datasets and compared it with state-of-the-art rule-based explainers such as LORE, EXPLAN and Anchor. Our method outperforms the existing approaches in terms of simplicity, global fidelity, representativeness, and consistency.
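
The following is a minimal, illustrative sketch of the core idea as we read it from the abstract (not the authors' implementation): approximate the local concept with a high-density cluster of real data around the instance, then fit a shallow surrogate decision tree on the black-box predictions inside that cluster and read off the covering rule. The names `black_box`, `X`, and `x0`, the Gaussian-mixture clustering, and all hyperparameters are assumptions made for the example; the oblique splits and SHAP/LIME integration described in the abstract are omitted here.

```python
# Illustrative LUX-style local rule explanation (a sketch, not the authors' code).
# Assumes a fitted classifier `black_box` with a .predict method, a data matrix
# `X` of real samples, and an instance `x0` to explain.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.tree import DecisionTreeClassifier, export_text

def local_rule_explanation(black_box, X, x0, n_components=8, max_depth=3):
    # 1. Approximate high-density "local concepts" with a mixture model fitted
    #    on real data only (no synthetic data generation).
    gmm = GaussianMixture(n_components=n_components, random_state=0).fit(X)
    local_mask = gmm.predict(X) == gmm.predict(x0.reshape(1, -1))[0]
    X_local = X[local_mask]

    # 2. Fit a shallow surrogate tree on the black-box predictions in that region.
    y_local = black_box.predict(X_local)
    surrogate = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    surrogate.fit(X_local, y_local)

    # 3. The root-to-leaf path covering x0 gives a factual rule; leaves with a
    #    different predicted class suggest counterfactual directions.
    return export_text(surrogate), surrogate.predict(x0.reshape(1, -1))[0]
```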

Objective: Prediction models are popular in medical research and practice. By predicting an outcome of interest for specific patients, these models may help inform difficult treatment decisions, and they are often hailed as the poster children for personalized, data-driven healthcare. Many prediction models are deployed for decision support on the basis of their prediction accuracy in validation studies. We investigate whether this is a safe and valid approach. Materials and Methods: We show that using prediction models for decision making can lead to harmful decisions, even when the predictions exhibit good discrimination after deployment. These models are harmful self-fulfilling prophecies: their deployment harms a group of patients, but the worse outcomes of these patients do not invalidate the predictive power of the model. Results: Our main result is a formal characterization of a set of such prediction models. Next, we show that models that are well calibrated before and after deployment are useless for decision making, as their deployment does not change the data distribution. Discussion: Our results point to the need to revise standard practices for the validation, deployment and evaluation of prediction models used in medical decisions. Conclusion: Outcome prediction models can yield harmful self-fulfilling prophecies when used for decision making; a new perspective on prediction model development, deployment and monitoring is needed.
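
To make the mechanism concrete, here is a hedged toy simulation of our own construction (not the paper's formal characterization): a risk model is accurate, but deploying it to withhold an effective treatment from predicted-high-risk patients worsens their outcomes while leaving the model's apparent discrimination intact. All effect sizes and thresholds are illustrative assumptions.

```python
# Toy self-fulfilling-prophecy simulation (illustrative assumptions only).
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 100_000
severity = rng.normal(size=n)                            # latent severity
risk_score = severity + rng.normal(scale=0.5, size=n)    # model's risk prediction

def bad_outcome(sev, treated):
    # Treatment lowers the probability of a bad outcome (assumed effect size).
    logit = sev - (1.5 if treated else 0.0)
    return rng.random(len(sev)) < 1.0 / (1.0 + np.exp(-logit))

# Before deployment: every patient receives the treatment.
y_pre = bad_outcome(severity, treated=True)
# After deployment: treatment is withheld from predicted-high-risk patients.
withheld = risk_score > 1.0
y_post = np.where(withheld, bad_outcome(severity, treated=False),
                            bad_outcome(severity, treated=True))

print("AUC before deployment:", round(roc_auc_score(y_pre, risk_score), 3))
print("AUC after deployment: ", round(roc_auc_score(y_post, risk_score), 3))
print("Bad-outcome rate among withheld patients:",
      round(y_pre[withheld].mean(), 3), "->", round(y_post[withheld].mean(), 3))
```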

The Discrete Event System Specification (DEVS) formalism, which supports hierarchical and modular model composition, has been widely used to understand, analyze and develop a variety of systems. DEVS has been implemented in various languages and platforms over the years. The DEVStone benchmark was conceived to generate a set of models with varied structure and behavior, and to automate the evaluation of the performance of DEVS-based simulators. However, DEVStone is still in a preliminary phase and further model analysis is required. In this paper, we revisit DEVStone and introduce new equations to compute the number of events triggered. We also introduce a new benchmark, called HOmem, designed as an alternative to HOmod with similar CPU and memory requirements, but one that is easier to implement and analytically more manageable. Finally, we compare both the performance and the memory footprint of five different DEVS simulators on two different hardware platforms.

In the ever-expanding landscape of Artificial Intelligence (AI), where innovation thrives and new products and services are continuously being delivered, ensuring that AI systems are designed and developed responsibly throughout their entire lifecycle is crucial. To this end, several AI ethics principles and guidelines have been issued to which AI systems should conform. Nevertheless, relying solely on high-level AI ethics principles is far from sufficient to ensure the responsible engineering of AI systems. In this field, AI professionals often navigate by sight: while recommendations promoting Trustworthy AI (TAI) exist, these are often high-level statements that are difficult to translate into concrete implementation strategies. There is a significant gap between high-level AI ethics principles and low-level concrete practices for AI professionals. To address this challenge, our work presents an experience report in which we develop a novel holistic framework for Trustworthy AI, designed to bridge the gap between theory and practice, and report insights from its application in an industrial case study. The framework builds on the results of a systematic review of the state of the practice, a survey, and think-aloud interviews with 34 AI practitioners. Unlike most frameworks already in the literature, it is designed to provide actionable guidelines and tools to support different types of stakeholders throughout the entire Software Development Life Cycle (SDLC). Our goal is to empower AI professionals to confidently navigate the ethical dimensions of TAI through practical insights, ensuring that the vast potential of AI is exploited responsibly for the benefit of society as a whole.

[Background] Hackathons are increasingly gaining prominence in Software Engineering (SE) education, lauded for their ability to elevate students' skill sets. [Objective] This paper investigates whether hackathons can impact the motivation of SE students. [Method] We conducted an evaluative case study assessing students' motivations before and after a hackathon, combining quantitative analysis using the Academic Motivation Scale (AMS) with qualitative coding of open-ended responses. [Results] Pre-hackathon findings reveal a diverse range of motivations alongside overall acceptance, while post-hackathon responses show no statistically significant shift in participants' perceptions. Qualitative findings uncovered themes related to networking, team dynamics, and skill development. From a practical perspective, our findings highlight the potential of hackathons to impact participants' motivation. [Conclusion] While our study enhances the comprehension of hackathons as a motivational tool, it also underscores the need for further exploration of psychometric dimensions in SE educational research.

Advances in AI invite misuse of language models as replacements for human participants. We argue that treating their responses as glimpses into an average human mind fundamentally mischaracterizes these statistical algorithms and that language models should be embraced as flexible simulation tools, able to mimic diverse behaviors without possessing human traits themselves.

Users want to know the reliability of recommendations; they do not accept high predictions if there is no evidence of reliability. Recommender systems should therefore provide reliability values associated with their predictions. Research into reliability measures requires the existence of simple, plausible and universal reliability quality measures. Research into recommender system quality measures has focused on accuracy; novelty, serendipity and diversity have also been studied, yet there is a notable lack of research into reliability/confidence quality measures. This paper proposes a reliability quality prediction measure (RPI) and a reliability quality recommendation measure (RRI). Both quality measures are based on the hypothesis that the more suitable a reliability measure is, the better accuracy results it will provide when applied. These reliability quality measures show accuracy improvements when appropriate reliability values are associated with the predictions (i.e. high reliability values associated with correct predictions or low reliability values associated with incorrect predictions). The proposed reliability quality metrics will support the design of new recommender system reliability measures. These measures could be applied to different matrix factorization techniques and to content-based, context-aware and social recommendation approaches. The recommender system reliability measures designed in this way could then be tested, compared and improved using the proposed reliability quality metrics.
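
The exact definitions of RPI and RRI are left to the paper; the sketch below only illustrates the hypothesis behind them, under the assumption that a suitable reliability measure should concentrate low prediction errors among high-reliability predictions, so that accuracy improves as low-reliability predictions are filtered out. The function name and toy data are ours.

```python
# Illustrative check of the reliability-quality hypothesis (not the RPI/RRI formulas).
import numpy as np

def error_by_reliability(errors, reliabilities, thresholds):
    """Mean absolute error restricted to predictions whose reliability exceeds
    each threshold; a suitable reliability measure yields a decreasing curve."""
    errors = np.asarray(errors, dtype=float)
    reliabilities = np.asarray(reliabilities, dtype=float)
    curve = []
    for t in thresholds:
        kept = reliabilities >= t
        curve.append(errors[kept].mean() if kept.any() else np.nan)
    return np.array(curve)

# Toy usage: a reliability value that is (noisily) inversely related to the
# absolute error produces an improving (decreasing) error curve.
rng = np.random.default_rng(1)
err = np.abs(rng.normal(size=1000))
rel = 1.0 / (1.0 + err) + rng.normal(scale=0.1, size=1000)
print(error_by_reliability(err, rel, thresholds=[0.0, 0.4, 0.6, 0.8]))
```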

As machine learning applications proliferate, we need an understanding of their potential for harm. However, current fairness metrics are rarely grounded in human psychological experiences of harm. Drawing on the social psychology of stereotypes, we use a case study of gender stereotypes in image search to examine how people react to machine learning errors. First, we use survey studies to show that not all machine learning errors reflect stereotypes, nor are they all equally harmful. Then, in experimental studies, we randomly expose participants to stereotype-reinforcing, -violating, and -neutral machine learning errors. We find that stereotype-reinforcing errors induce more experientially (i.e., subjectively) harmful experiences, while producing minimal changes to cognitive beliefs, attitudes, or behaviors. This experiential harm affects women more than men. However, certain stereotype-violating errors are more experientially harmful for men, potentially due to perceived threats to masculinity. We conclude that harm cannot be the sole guide in fairness mitigation, and we propose a nuanced perspective that depends on who is experiencing what harm and why.

An emerging trend in approximate counting is to show that certain `low-temperature' problems are easy on typical instances, despite worst-case hardness results. For the class of regular graphs one usually shows that expansion can be exploited algorithmically, and since random regular graphs are good expanders with high probability the problem is typically tractable. Inspired by approaches used in subexponential-time algorithms for Unique Games, we develop an approximation algorithm for the partition function of the ferromagnetic Potts model on graphs with a small-set expansion condition. In such graphs it may not suffice to explore the state space of the model close to ground states, and a novel feature of our method is to efficiently find a larger set of `pseudo-ground states' such that it is enough to explore the model around each pseudo-ground state.
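
For reference, the object being approximated is the standard q-state ferromagnetic Potts partition function, stated here in its usual form (the paper's normalisation and parameterisation may differ):

```latex
% Standard q-state ferromagnetic Potts partition function on G = (V, E)
% at inverse temperature \beta > 0.
Z_G(q, \beta) \;=\; \sum_{\sigma : V \to \{1, \dots, q\}}
  \exp\!\Big( \beta \sum_{\{u, v\} \in E} \mathbf{1}\big[\sigma(u) = \sigma(v)\big] \Big)
```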

Speech intelligibility can be degraded by multiple factors, such as noisy environments, technical difficulties or biological conditions. This work focuses on the development of an automatic non-intrusive system for predicting the speech intelligibility level in the latter case. The main contribution of our research on this topic is the use of Long Short-Term Memory (LSTM) networks with log-mel spectrograms as input features. In addition, this LSTM-based system is further enhanced by the incorporation of a simple attention mechanism that is able to determine the frames most relevant to the task. The proposed models are evaluated with the UA-Speech database, which contains dysarthric speech with different degrees of severity. Results show that the attention LSTM architecture outperforms both a reference Support Vector Machine (SVM)-based system with hand-crafted features and an LSTM-based system with mean pooling.
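
A minimal PyTorch sketch of this type of architecture follows, with hypothetical layer sizes and class count (the authors' exact configuration is not given here): an LSTM runs over the log-mel spectrogram frames, and a simple additive attention layer weights the most relevant frames before classification.

```python
# Sketch of an attention-pooled LSTM for intelligibility-level classification
# (illustrative hyperparameters; not the authors' exact configuration).
import torch
import torch.nn as nn

class AttentionLSTM(nn.Module):
    def __init__(self, n_mels=64, hidden=128, n_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(n_mels, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)          # per-frame attention score
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, logmel):                    # logmel: (batch, frames, n_mels)
        h, _ = self.lstm(logmel)                  # (batch, frames, hidden)
        w = torch.softmax(self.attn(h).squeeze(-1), dim=1)   # frame weights
        pooled = (w.unsqueeze(-1) * h).sum(dim=1)            # attention pooling
        return self.classifier(pooled)            # intelligibility-level logits

# Example forward pass: 8 utterances, 300 frames, 64 log-mel bands.
logits = AttentionLSTM()(torch.randn(8, 300, 64))
```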
