Pedro R. A. S. Bassi,Wenxuan Li,Yucheng Tang,Fabian Isensee,Zifu Wang,Jieneng Chen,Yu-Cheng Chou,Yannick Kirchhoff,Maximilian Rokuss,Ziyan Huang,Jin Ye,Junjun He,Tassilo Wald,Constantin Ulrich,Michael Baumgartner,Saikat Roy,Klaus H. Maier-Hein,Paul Jaeger,Yiwen Ye,Yutong Xie,Jianpeng Zhang,Ziyang Chen,Yong Xia,Zhaohu Xing,Lei Zhu,Yousef Sadegheih,Afshin Bozorgpour,Pratibha Kumari,Reza Azad,Dorit Merhof,Pengcheng Shi,Ting Ma,Yuxin Du,Fan Bai,Tiejun Huang,Bo Zhao,Haonan Wang,Xiaomeng Li,Hanxue Gu,Haoyu Dong,Jichen Yang,Maciej A. Mazurowski,Saumya Gupta,Linshan Wu,Jiaxin Zhuang,Hao Chen,Holger Roth,Daguang Xu,Matthew B. Blaschko,Sergio Decherchi,Andrea Cavalli,Alan L. Yuille,Zongwei Zhou

from arxiv, Accepted to NeurIPS-2024

How can we test AI performance? This question seems trivial, but it isn't. Standard benchmarks often suffer from in-distribution and small test sets, oversimplified metrics, unfair comparisons, and short-term outcome pressure. As a consequence, good performance on standard benchmarks does not guarantee success in real-world scenarios. To address these problems, we present Touchstone, a large-scale collaborative segmentation benchmark of 9 types of abdominal organs. This benchmark is based on 5,195 training CT scans from 76 hospitals around the world and 5,903 testing CT scans from 11 additional hospitals. This diverse test set enhances the statistical significance of benchmark results and rigorously evaluates AI algorithms across various out-of-distribution scenarios. We invited 14 inventors of 19 AI algorithms to train their algorithms, while our team, as a third party, independently evaluated these algorithms on three test sets. We also evaluated pre-existing AI frameworks (which, unlike algorithms, are more flexible and can support different algorithms), including MONAI from NVIDIA, nnU-Net from DKFZ, and numerous other open-source frameworks. We are committed to expanding this benchmark to encourage more innovation in AI algorithms for the medical domain.

Related content

The journal Artificial Intelligence (AI) is widely recognized as the premier international forum for publishing the latest research in the field. The journal welcomes papers on broad aspects of AI that constitute advances in the overall field, as well as papers on applications of AI, but the focus should be on how new and novel AI methods improve performance in application areas, rather than on presenting yet another application of conventional AI methods. Papers on applications should describe a principled solution, emphasize its novelty, and present an in-depth evaluation of the AI techniques being developed. Official website:

CAD · Python · API · Design · Language modeling

December 18, 2024

CAD-Assistant: Tool-Augmented VLLMs as Generic CAD Task Solvers?

Dimitrios Mallis,Ahmet Serdar Karadeniz,Sebastian Cavada,Danila Rukhovich,Niki Foteinopoulou,Kseniya Cherenkova,Anis Kacem,Djamila Aouada

We propose CAD-Assistant, a general-purpose CAD agent for AI-assisted design. Our approach is based on a powerful Vision and Large Language Model (VLLM) as a planner and a tool-augmentation paradigm using CAD-specific modules. CAD-Assistant addresses multimodal user queries by generating actions that are iteratively executed on a Python interpreter equipped with the FreeCAD software, accessed via its Python API. Our framework is able to assess the impact of generated CAD commands on geometry and adapts subsequent actions based on the evolving state of the CAD design. We consider a wide range of CAD-specific tools including Python libraries, modules of the FreeCAD Python API, helpful routines, rendering functions and other specialized modules. We evaluate our method on multiple CAD benchmarks and qualitatively demonstrate the potential of tool-augmented VLLMs as generic CAD task solvers across diverse CAD workflows.

WEB · Integration · Machine Learning · Identifiable · Optimizer

December 18, 2024

Enhancing Rhetorical Figure Annotation: An Ontology-Based Web Application with RAG Integration

Ramona Kühn,Jelena Mitrović,Michael Granitzer

from arxiv, The 31st International Conference on Computational Linguistics (COLING 2025)

Rhetorical figures play an important role in our communication. They are used to convey subtle, implicit meaning, or to emphasize statements. We notice them in hate speech, fake news, and propaganda. By improving the systems for computational detection of rhetorical figures, we can also improve tasks such as hate speech and fake news detection, sentiment analysis, opinion mining, or argument mining. Unfortunately, there is a lack of annotated data, as well as qualified annotators that would help us build large corpora to train machine learning models for the detection of rhetorical figures. The situation is particularly difficult in languages other than English, and for rhetorical figures other than metaphor, sarcasm, and irony. To overcome this issue, we develop a web application called "Find your Figure" that facilitates the identification and annotation of German rhetorical figures. The application is based on the German Rhetorical ontology GRhOOT which we have specially adapted for this purpose. In addition, we improve the user experience with Retrieval Augmented Generation (RAG). In this paper, we present the restructuring of the ontology, the development of the web application, and the built-in RAG pipeline. We also identify the optimal RAG settings for our application. Our approach is one of the first to practically use rhetorical ontologies in combination with RAG and shows promising results.

Prompt · Learner · Code · Obvious · Full

December 17, 2024

Breaking the Programming Language Barrier: Multilingual Prompting to Empower Non-Native English Learners

James Prather,Brent N. Reeves,Paul Denny,Juho Leinonen,Stephen MacNeil,Andrew Luxton-Reilly,João Orvalho,Amin Alipour,Ali Alfageeh,Thezyrie Amarouche,Bailey Kimmel,Jared Wright,Musa Blake,Gweneth Barbre

from arxiv, 10 pages, 3 tables. Accepted for publication at the 27th Australasian Computing Education Conference (ACE 2025)

Non-native English speakers (NNES) face multiple barriers to learning programming. These barriers can be obvious, such as the fact that programming language syntax and instruction are often in English, or more subtle, such as being afraid to ask for help in a classroom full of native English speakers. However, these barriers are frustrating because many NNES students know more about programming than they can articulate in English. Advances in generative AI (GenAI) have the potential to break down these barriers because state of the art models can support interactions in multiple languages. Moreover, recent work has shown that GenAI can be highly accurate at code generation and explanation. In this paper, we provide the first exploration of NNES students prompting in their native languages (Arabic, Chinese, and Portuguese) to generate code to solve programming problems. Our results show that students are able to successfully use their native language to solve programming problems, but not without some difficulty specifying programming terminology and concepts. We discuss the challenges they faced, the implications for practice in the short term, and how this might transform computing education globally in the long term.

Question answering · Diversity · Benchmark · Processing (programming language) · Same

December 16, 2024

SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types

Xuanliang Zhang,Dingzirui Wang,Baoxin Wang,Longxu Dou,Xinyuan Lu,Keyan Xu,Dayong Wu,Qingfu Zhu,Wanxiang Che

Scientific question answering (SQA) is an important task aimed at answering questions based on papers. However, current SQA datasets have limited reasoning types and neglect the relevance between tables and text, creating a significant gap with real scenarios. To address these challenges, we propose a QA benchmark for scientific tables and text with diverse reasoning types (SciTaT). To cover more reasoning types, we summarize various reasoning types from real-world questions. To involve both tables and text, we require the questions to incorporate tables and text as much as possible. Based on SciTaT, we propose a strong baseline (CaR), which combines various reasoning methods to address different reasoning types and process tables and text at the same time. CaR brings average improvements of 12.9% over other baselines on SciTaT, validating its effectiveness. Error analysis reveals the challenges of SciTaT, such as complex numerical calculations and domain knowledge.

Vision · INTERACT · Transformation · MoDELS · Dataset

December 13, 2024

ManipGPT: Is Affordance Segmentation by Large Vision Models Enough for Articulated Object Manipulation?

Taewhan Kim,Hojin Bae,Zeming Li,Xiaoqi Li,Iaroslav Ponomarenko,Ruihai Wu,Hao Dong

from arxiv, 8 pages, 6 figures

Visual actionable affordance has emerged as a transformative approach in robotics, focusing on perceiving interaction areas prior to manipulation. Traditional methods rely on pixel sampling to identify successful interaction samples or processing pointclouds for affordance mapping. However, these approaches are computationally intensive and struggle to adapt to diverse and dynamic environments. This paper introduces ManipGPT, a framework designed to predict optimal interaction areas for articulated objects using a large pre-trained vision transformer (ViT). We created a dataset of 9.9k simulated and real images to bridge the sim-to-real gap and enhance real-world applicability. By fine-tuning the vision transformer on this small dataset, we significantly improved part-level affordance segmentation, adapting the model's in-context segmentation capabilities to robot manipulation scenarios. This enables effective manipulation across simulated and real-world environments by generating part-level affordance masks, paired with an impedance adaptation policy, sufficiently eliminating the need for complex datasets or perception systems.

Dataset · Stream · Annotation · Benchmark · Online

December 13, 2024

ViTHSD: Exploiting Hatred by Targets for Hate Speech Detection on Vietnamese Social Media Texts

Cuong Nhat Vo,Khanh Bao Huynh,Son T. Luu,Trong-Hop Do

from arxiv, Accepted for publication at Journal of Computational Social Science

The growth of social networks makes toxic content spread rapidly. Hate speech detection is a task that helps reduce the number of harmful comments. Given the diversity of hate speech created by users, it is necessary to interpret hate speech in addition to detecting it. Hence, we propose a methodology to construct a system for targeted hate speech detection from online streaming texts from social media. We first introduce ViTHSD, a targeted hate speech detection dataset for Vietnamese social media texts. The dataset contains 10K comments; each comment is labeled for specific targets with three levels: clean, offensive, and hate. There are 5 targets in the dataset, and each target is labeled with the corresponding level manually by humans following strict annotation guidelines. The inter-annotator agreement obtained on the dataset is 0.45 by Cohen's Kappa index, which indicates a moderate level of agreement. Then, we construct a baseline for this task by combining Bi-GRU-LSTM-CNN with a pre-trained language model to leverage the text representation power of BERTology. Finally, we suggest a methodology to integrate the baseline model for targeted hate speech detection into an online streaming system for practical application in preventing hateful and offensive content on social media.
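
The inter-annotator agreement figure quoted in this abstract is a Cohen's Kappa value. As a minimal sketch of how that statistic is computed for two annotators, the snippet below compares observed agreement with the agreement expected by chance. The function name and the toy labels are invented for illustration and are not taken from the ViTHSD dataset.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy labels (hypothetical), using the three levels named in the abstract.
annotator_1 = ["clean", "clean", "offensive", "hate", "clean", "offensive"]
annotator_2 = ["clean", "offensive", "offensive", "hate", "clean", "clean"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Kappa is 1 for perfect agreement and 0 when agreement is no better than chance, which is why it is usually preferred over raw percent agreement for annotation studies.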

Dataset · Model evaluation · MoDELS · Few-shot learning · Inference

December 13, 2024

First Train to Generate, then Generate to Train: UnitedSynT5 for Few-Shot NLI

Sourav Banerjee,Anush Mahajan,Ayushi Agarwal,Eishkaran Singh

from arxiv, 14 pages

Natural Language Inference (NLI) tasks require identifying the relationship between sentence pairs, typically classified as entailment, contradiction, or neutrality. While the current state-of-the-art (SOTA) model, Entailment Few-Shot Learning (EFL), achieves a 93.1% accuracy on the Stanford Natural Language Inference (SNLI) dataset, further advancements are constrained by the dataset's limitations. To address this, we propose a novel approach leveraging synthetic data augmentation to enhance dataset diversity and complexity. We present UnitedSynT5, an advanced extension of EFL that leverages a T5-based generator to synthesize additional premise-hypothesis pairs, which are rigorously cleaned and integrated into the training data. These augmented examples are processed within the EFL framework, embedding labels directly into hypotheses for consistency. We train a GTR-T5-XL model on this expanded dataset, achieving a new benchmark of 94.7% accuracy on the SNLI dataset, 94.0% accuracy on the E-SNLI dataset, and 92.6% accuracy on the MultiNLI dataset, surpassing the previous SOTA models. This research demonstrates the potential of synthetic data augmentation in improving NLI models, offering a path forward for further advancements in natural language understanding tasks.
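
The abstract says labels are embedded directly into hypotheses but does not give the template. The following is a minimal sketch of one way such a reformulation could look: each three-way NLI example becomes three binary examples, one per candidate label. The template string, the class name, and the function name are all hypothetical and should not be read as the authors' format.

```python
from dataclasses import dataclass

NLI_LABELS = ("entailment", "contradiction", "neutral")


@dataclass(frozen=True)
class BinaryExample:
    premise: str
    hypothesis_with_label: str
    is_correct: bool  # True only when the embedded label is the gold label


def embed_label(premise, hypothesis, gold_label):
    """Expand one 3-way NLI example into three label-embedded binary examples."""
    if gold_label not in NLI_LABELS:
        raise ValueError(f"unknown label: {gold_label!r}")
    return [
        # Hypothetical template: append the candidate label to the hypothesis.
        BinaryExample(premise, f"{hypothesis} Relation: {candidate}.", candidate == gold_label)
        for candidate in NLI_LABELS
    ]


examples = embed_label(
    "A man is playing a guitar on stage.",
    "A person is performing music.",
    "entailment",
)
for example in examples:
    print(example.is_correct, "|", example.hypothesis_with_label)
```

Under this assumed scheme, a classifier only has to decide whether the label-carrying hypothesis holds for the premise, so real and synthetic pairs share one input format.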

Operation · Performer · CASE · Online · Full

December 12, 2024

Spectrum and RAN Sharing: How to Avoid Cross-Subsidization While Taking Full Advantage of Massive MU-MIMO?

Abdalla Hussein,Patrick Mitran,Catherine Rosenberg

Motivated by the need to use spectrum more efficiently, this paper investigates fine-grained spectrum sharing (FGSS) in Multi-User massive MIMO (MU-mMIMO) systems where a neutral host enables users from different operators to share the same resource blocks. To be accepted by operators, FGSS must i) guarantee isolation so that the load of one operator does not impact the performance of another, and ii) avoid cross-subsidization whereby one operator gains more from sharing than another. We first formulate and solve an offline problem to assess the potential performance gains of FGSS with respect to the static spectrum sharing case, where operators have fixed separate sub-bands, and find that the gains can be significant, motivating the development of online solutions for FGSS. Transitioning from an offline to an online study presents unique challenges, including the lack of a priori knowledge regarding the performance of the fixed sharing case that is required to ensure isolation and cross-subsidization avoidance. We overcome these challenges and propose an online algorithm that is fast and significantly outperforms the static case. The main finding is that FGSS for a MU-mMIMO downlink system is doable in a way that is "safe" to operators and brings large gains in spectrum efficiency (e.g., for 4 operators, a gain above 60% is seen in many cases).

Learning · Optimizer · motivation · Diversity · Training data

December 12, 2024

Should We Learn Contact-Rich Manipulation Policies from Sampling-Based Planners?

Huaijiang Zhu,Tong Zhao,Xinpei Ni,Jiuguang Wang,Kuan Fang,Ludovic Righetti,Tao Pang

The tremendous success of behavior cloning (BC) in robotic manipulation has been largely confined to tasks where demonstrations can be effectively collected through human teleoperation. However, demonstrations for contact-rich manipulation tasks that require complex coordination of multiple contacts are difficult to collect due to the limitations of current teleoperation interfaces. We investigate how to leverage model-based planning and optimization to generate training data for contact-rich dexterous manipulation tasks. Our analysis reveals that popular sampling-based planners like rapidly exploring random tree (RRT), while efficient for motion planning, produce demonstrations with unfavorably high entropy. This motivates modifications to our data generation pipeline that prioritizes demonstration consistency while maintaining solution diversity. Combined with a diffusion-based goal-conditioned BC approach, our method enables effective policy learning and zero-shot transfer to hardware for two challenging contact-rich manipulation tasks.

AdderNet · Neural Networks · Networking · Convolution · Model evaluation

December 31, 2019

AdderNet: Do We Really Need Multiplications in Deep Learning?

Hanting Chen,Yunhe Wang,Chunjing Xu,Boxin Shi,Chao Xu,Qi Tian,Chang Xu

Compared with the cheap addition operation, multiplication has much higher computational complexity. The widely used convolutions in deep neural networks are exactly cross-correlations that measure the similarity between input features and convolution filters, which involves massive multiplications between float values. In this paper, we present adder networks (AdderNets) to trade these massive multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs. In AdderNets, we take the $\ell_1$-norm distance between filters and input features as the output response. The influence of this new similarity measure on the optimization of neural networks has been thoroughly analyzed. To achieve better performance, we develop a special back-propagation approach for AdderNets by investigating the full-precision gradient. We then propose an adaptive learning rate strategy to enhance the training procedure of AdderNets according to the magnitude of each neuron's gradient. As a result, the proposed AdderNets can achieve 74.9% Top-1 accuracy and 91.7% Top-5 accuracy using ResNet-50 on the ImageNet dataset without any multiplication in the convolution layers.
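
The $\ell_1$-distance response described in this abstract can be illustrated with a small NumPy sketch. It assumes stride 1, no padding, and that the response is the negative $\ell_1$ distance, so that a larger value means a closer match between patch and filter. The function name and array layout are chosen for illustration, and the special back-propagation and adaptive learning rate mentioned in the abstract are not covered.

```python
import numpy as np


def adder_response(x, filters):
    """Forward pass of an L1-distance ("adder") layer.

    x:       input feature map, shape (H, W, C_in)
    filters: filter bank, shape (k, k, C_in, C_out)
    returns: shape (H - k + 1, W - k + 1, C_out), each entry the negative
             L1 distance between an input patch and one filter
    """
    h, w, c_in = x.shape
    k, k2, c_in_f, c_out = filters.shape
    if k != k2 or c_in != c_in_f:
        raise ValueError("filters must be (k, k, C_in, C_out) matching the input channels")
    out = np.empty((h - k + 1, w - k + 1, c_out), dtype=np.float64)
    for m in range(h - k + 1):
        for n in range(w - k + 1):
            patch = x[m:m + k, n:n + k, :]             # (k, k, C_in)
            diff = np.abs(patch[..., None] - filters)  # (k, k, C_in, C_out)
            # Only subtractions, absolute values, and additions are needed here.
            out[m, n, :] = -diff.sum(axis=(0, 1, 2))
    return out


# Toy input: the single filter is a copy of the top-left 2x2 patch,
# so the response there is 0, the largest possible value.
x = np.arange(16, dtype=np.float64).reshape(4, 4, 1)
filt = x[0:2, 0:2, :, None].copy()
out = adder_response(x, filt)
print(out[..., 0])
```

The explicit loops keep the correspondence with the patch-versus-filter comparison visible; a practical implementation would vectorize over spatial positions.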