
3D · Attention · MoDELS · Decoding · Transformation

October 30, 2024

Neural Attention Field: Emerging Point Relevance in 3D Scenes for One-Shot Dexterous Grasping

Qianxu Wang, Congyue Deng, Tyler Ga Wei Lum, Yuanpei Chen, Yaodong Yang, Jeannette Bohg, Yixin Zhu, Leonidas Guibas

One-shot transfer of dexterous grasps to novel scenes with object and context variations has been a challenging problem. While distilled feature fields from large vision models have enabled semantic correspondences across 3D scenes, their features are point-based and restricted to object surfaces, limiting their capability of modeling complex semantic feature distributions for hand-object interactions. In this work, we propose the neural attention field for representing semantic-aware dense feature fields in the 3D space by modeling inter-point relevance instead of individual point features. Core to it is a transformer decoder that computes the cross-attention between any 3D query point and all the scene points, and provides the query point feature with an attention-based aggregation. We further propose a self-supervised framework for training the transformer decoder from only a few 3D point clouds without hand demonstrations. Post-training, the attention field can be applied to novel scenes for semantics-aware dexterous grasping from one-shot demonstration. Experiments show that our method provides better optimization landscapes by encouraging the end-effector to focus on task-relevant scene regions, resulting in significant improvements in success rates on real robots compared with the feature-field-based methods.
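The attention-based aggregation described in this abstract can be illustrated with a minimal sketch of single-head scaled dot-product cross-attention. This is not the authors' model: the learned projections, positional encodings, and decoder layers are omitted, and the query, key, and value vectors below are toy values assumed to be already computed. The function name `cross_attention` is hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(query, keys, values):
    """Attend from one query vector to all scene points.

    Returns the attention weights over the scene points and the
    weighted sum of their value vectors (the query point feature).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    feature = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, feature

# Toy data (assumed): one query embedding and three scene-point embeddings.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights, feature = cross_attention(query, keys, values)
```

Because the value vectors here are one-hot, the aggregated feature equals the attention weights, which makes it easy to see that the scene point most aligned with the query contributes most.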

Related content

3D is short for "Three Dimensions". It refers to three dimensions or three coordinates: length, width, and height. In other words, it describes something solid, in contrast to a plane (2D), which has only length and width.

MoDELS · Canvas · Knowledge · Graph · INFORMS

December 12, 2024

Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG

Kavana Venkatesh, Yusuf Dalva, Ismini Lourentzou, Pinar Yanardag

from arxiv, Project Page: //context-canvas.github.io/

We introduce a novel approach to enhance the capabilities of text-to-image models by incorporating a graph-based RAG. Our system dynamically retrieves detailed character information and relational data from the knowledge graph, enabling the generation of visually accurate and contextually rich images. This capability significantly improves upon the limitations of existing T2I models, which often struggle with the accurate depiction of complex or culturally specific subjects due to dataset constraints. Furthermore, we propose a novel self-correcting mechanism for text-to-image models to ensure consistency and fidelity in visual outputs, leveraging the rich context from the graph to guide corrections. Our qualitative and quantitative experiments demonstrate that Context Canvas significantly enhances the capabilities of popular models such as Flux, Stable Diffusion, and DALL-E, and improves the functionality of ControlNet for fine-grained image editing tasks. To our knowledge, Context Canvas represents the first application of graph-based RAG in enhancing T2I models, representing a significant advancement for producing high-fidelity, context-aware multi-faceted images.
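The retrieval step described in this abstract can be pictured with a small sketch of graph-based prompt augmentation. Everything here is an assumption made for illustration: the `Triple` type, the substring-based entity matching, the prompt template, and the toy graph (whose characters are invented) are not taken from Context Canvas. The self-correcting mechanism is not covered.

```python
from typing import List, NamedTuple

class Triple(NamedTuple):
    """One (subject, relation, object) edge of a toy knowledge graph."""
    subject: str
    relation: str
    obj: str

def retrieve_context(graph: List[Triple], prompt: str) -> List[Triple]:
    """Return triples whose subject is mentioned in the prompt (naive substring match)."""
    text = prompt.lower()
    return [t for t in graph if t.subject.lower() in text]

def augment_prompt(prompt: str, triples: List[Triple]) -> str:
    """Append retrieved facts to the prompt; leave it unchanged if nothing was found."""
    if not triples:
        return prompt
    facts = "; ".join(f"{t.subject} {t.relation} {t.obj}" for t in triples)
    return f"{prompt}. Context: {facts}"

# Invented toy graph and prompts.
graph = [
    Triple("Mira", "wears", "a blue cloak"),
    Triple("Mira", "carries", "a brass lantern"),
    Triple("Oren", "rides", "a grey horse"),
]
prompt = "A portrait of Mira at dusk"
augmented = augment_prompt(prompt, retrieve_context(graph, prompt))
unchanged = augment_prompt("A quiet harbor", retrieve_context(graph, "A quiet harbor"))
```

A real system would likely use entity linking and multi-hop traversal rather than substring matching; the sketch only shows where graph facts enter the text prompt.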

Biased · MoDELS · Reducible · Diversity · Robustness

December 12, 2024

Make Satire Boring Again: Reducing Stylistic Bias of Satirical Corpus by Utilizing Generative LLMs

Asli Umay Ozturk, Recep Firat Cekinel, Asli Umay Ozturk

from arxiv, Accepted to BUCC2025 Workshop @COLING2025

Satire detection is essential for accurately extracting opinions from textual data and combating misinformation online. However, the lack of diverse corpora for satire leads to the problem of stylistic bias which impacts the models' detection performances. This study proposes a debiasing approach for satire detection, focusing on reducing biases in training data by utilizing generative large language models. The approach is evaluated in both cross-domain (irony detection) and cross-lingual (English) settings. Results show that the debiasing method enhances the robustness and generalizability of the models for satire and irony detection tasks in Turkish and English. However, its impact on causal language models, such as Llama-3.1, is limited. Additionally, this work curates and presents the Turkish Satirical News Dataset with detailed human annotations, with case studies on classification, debiasing, and explainability.

Analysis · Dataset · MoDELS · Extensibility · AIM

December 12, 2024

Speech-Forensics: Towards Comprehensive Synthetic Speech Dataset Establishment and Analysis

Zhoulin Ji, Chenhao Lin, Hang Wang, Chao Shen

Detecting synthetic from real speech is increasingly crucial due to the risks of misinformation and identity impersonation. While various datasets for synthetic speech analysis have been developed, they often focus on specific areas, limiting their utility for comprehensive research. To fill this gap, we propose the Speech-Forensics dataset by extensively covering authentic, synthetic, and partially forged speech samples that include multiple segments synthesized by different high-quality algorithms. Moreover, we propose a TEmporal Speech LocalizaTion network, called TEST, aiming at simultaneously performing authenticity detection, multiple fake segments localization, and synthesis algorithms recognition, without any complex post-processing. TEST effectively integrates LSTM and Transformer to extract more powerful temporal speech representations and utilizes dense prediction on multi-scale pyramid features to estimate the synthetic spans. Our model achieves an average mAP of 83.55% and an EER of 5.25% at the utterance level. At the segment level, it attains an EER of 1.07% and a 92.19% F1 score. These results highlight the model's robust capability for a comprehensive analysis of synthetic speech, offering a promising avenue for future research and practical applications in this field.
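The equal error rate (EER) reported in this abstract is the operating point where the false acceptance rate equals the false rejection rate. The sketch below computes it with a simple threshold sweep under the common definition; the abstract does not state the paper's exact evaluation protocol, so the function name, the score convention (higher means "more likely real"), and the toy scores are assumptions.

```python
def equal_error_rate(real_scores, fake_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    A sample is accepted as real when its score is >= the threshold.
    FAR: fraction of fake samples accepted. FRR: fraction of real samples rejected.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    thresholds = sorted(set(real_scores) | set(fake_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        far = sum(s >= t for s in fake_scores) / len(fake_scores)
        frr = sum(s < t for s in real_scores) / len(real_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Toy scores (assumed): one real sample scores below one fake sample.
real = [0.9, 0.8, 0.7, 0.4]
fake = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(real, fake)  # 0.25: one of four misclassified on each side
```

With perfectly separated scores the same function returns 0.0, since some threshold rejects every fake sample while accepting every real one.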

MoDELS · Dataset · Performer · Precision/Accuracy · INFORMS

December 11, 2024

LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations

Zejian Li, Chenye Meng, Yize Li, Ling Yang, Shengyuan Zhang, Jiarui Ma, Jiayi Li, Guang Yang, Changyuan Yang, Zhiyuan Yang, Jinxiong Chang, Lingyun Sun

Recent advances in text-to-image (T2I) generation have shown remarkable success in producing high-quality images from text. However, existing T2I models show decayed performance in compositional image generation involving multiple objects and intricate relationships. We attribute this problem to limitations in existing datasets of image-text pairs, which lack precise inter-object relationship annotations with prompts only. To address this problem, we construct LAION-SG, a large-scale dataset with high-quality structural annotations of scene graphs (SG), which precisely describe attributes and relationships of multiple objects, effectively representing the semantic structure in complex scenes. Based on LAION-SG, we train a new foundation model SDXL-SG to incorporate structural annotation information into the generation process. Extensive experiments show advanced models trained on our LAION-SG boast significant performance improvements in complex scene generation over models on existing datasets. We also introduce CompSG-Bench, a benchmark that evaluates models on compositional image generation, establishing a new standard for this domain.
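A scene graph of the kind described in this abstract records objects, their attributes, and the relationships between them. The sketch below shows one plausible in-memory representation. The class and field names are hypothetical and do not reflect the actual LAION-SG annotation schema, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SceneObject:
    """A single object with optional attribute words (e.g. colour, material)."""
    name: str
    attributes: List[str] = field(default_factory=list)

@dataclass
class Relation:
    """A directed edge between two objects, referenced by index."""
    subject: int
    predicate: str
    obj: int

@dataclass
class SceneGraph:
    objects: List[SceneObject]
    relations: List[Relation]

    def describe(self) -> List[str]:
        """Render each relation as a short phrase including object attributes."""
        def phrase(index: int) -> str:
            item = self.objects[index]
            return " ".join(item.attributes + [item.name])
        return [
            f"{phrase(r.subject)} {r.predicate} {phrase(r.obj)}"
            for r in self.relations
        ]

# Toy annotation (assumed) for an image with three objects.
graph = SceneGraph(
    objects=[
        SceneObject("cat", ["black"]),
        SceneObject("sofa", ["red"]),
        SceneObject("lamp"),
    ],
    relations=[Relation(0, "sitting on", 1), Relation(2, "next to", 1)],
)
phrases = graph.describe()
```

Unlike a free-text caption, this structure makes explicit which attribute belongs to which object and which objects each relation connects.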

Prompt · Black box · MoDELS · Language modeling · Dataset

December 11, 2024

Fundamental Limits of Prompt Compression: A Rate-Distortion Framework for Black-Box Language Models

Alliot Nagle, Adway Girish, Marco Bondaschi, Michael Gastpar, Ashok Vardhan Makkuva, Hyeji Kim

from arxiv, 42 pages, 17 figures. Accepted to NeurIPS 2024

We formalize the problem of prompt compression for large language models (LLMs) and present a framework to unify token-level prompt compression methods which create hard prompts for black-box models. We derive the distortion-rate function for this setup as a linear program, and provide an efficient algorithm to compute this fundamental limit via the dual of the linear program. Using the distortion-rate function as the baseline, we study the performance of existing compression schemes on a synthetic dataset consisting of prompts generated from a Markov chain, natural language queries, and their respective answers. Our empirical analysis demonstrates the criticality of query-aware prompt compression, where the compressor has knowledge of the downstream task/query for the black-box LLM. We show that there is a large gap between the performance of current prompt compression methods and the optimal strategy, and propose Adaptive QuerySelect, a query-aware, variable-rate adaptation of a prior work to close the gap. We extend our experiments to a small natural language dataset to further confirm our findings on our synthetic dataset.

Neural Networks · Graphics processing unit · Graph · Networking · MoDELS

January 27, 2021

Graph Neural Network for Traffic Forecasting: A Survey

Weiwei Jiang, Jiayun Luo

Traffic forecasting is an important factor for the success of intelligent transportation systems. Deep learning models, including convolutional neural networks and recurrent neural networks, have been applied to traffic forecasting problems to model the spatial and temporal dependencies. In recent years, to model the graph structures in transportation systems as well as the contextual information, graph neural networks (GNNs) have been introduced as new tools and have achieved state-of-the-art performance in a series of traffic forecasting problems. In this survey, we review the rapidly growing body of recent research using different GNNs, e.g., graph convolutional and graph attention networks, in various traffic forecasting problems, e.g., road traffic flow and speed forecasting, passenger flow forecasting in urban rail transit systems, demand forecasting in ride-hailing platforms, etc. We also present a collection of open data and source resources for each problem, as well as future research directions. To the best of our knowledge, this paper is the first comprehensive survey that explores the application of graph neural networks for traffic forecasting problems. We have also created a public GitHub repository to update the latest papers, open data and source resources.

Task-oriented dialogue system · INFORMS · Graph · Networking · entity

August 11, 2020

KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue

Xiaoze Jiang, Siyi Du, Zengchang Qin, Yajing Sun, Jing Yu

from arxiv, Accepted by the 28th ACM International Conference on Multimedia (ACM MM 2020)

Visual dialogue is a challenging task that needs to extract implicit information from both visual (image) and textual (dialogue history) contexts. Classical approaches pay more attention to the integration of the current question, vision knowledge and text knowledge, overlooking the heterogeneous semantic gaps between the cross-modal information. Meanwhile, the concatenation operation has become the de facto standard for cross-modal information fusion, although it has limited ability in information retrieval. In this paper, we propose a novel Knowledge-Bridge Graph Network (KBGN) model that uses a graph to bridge the cross-modal semantic relations between vision and text knowledge at fine granularity, and retrieves the required knowledge via an adaptive information selection mode. Moreover, the reasoning clues for visual dialogue can be clearly drawn from intra-modal entities and inter-modal bridges. Experimental results on VisDial v1.0 and VisDial-Q datasets demonstrate that our model outperforms existing models with state-of-the-art results.

Graph attention network · Sentiment classification · Graph · Networking · Attention mechanism

September 5, 2019

Syntax-Aware Aspect Level Sentiment Classification with Graph Attention Networks

Binxuan Huang, Kathleen M. Carley

from arxiv, Accepted by EMNLP 2019

Aspect level sentiment classification aims to identify the sentiment expressed towards an aspect given a context sentence. Previous neural network based methods largely ignore the syntax structure in one sentence. In this paper, we propose a novel target-dependent graph attention network (TD-GAT) for aspect level sentiment classification, which explicitly utilizes the dependency relationship among words. Using the dependency graph, it propagates sentiment features directly from the syntactic context of an aspect target. In our experiments, we show our method outperforms multiple baselines with GloVe embeddings. We also demonstrate that using BERT representations further substantially boosts the performance.

Performer · Discriminator · Positive example · False positive · Supervision

May 24, 2018

DSGAN: Generative Adversarial Training for Distant Supervision Relation Extraction

Pengda Qin, Weiran Xu, William Yang Wang

Distant supervision can effectively label data for relation extraction, but suffers from the noisy labeling problem. Recent works mainly apply soft bag-level noise reduction strategies to find the relatively better samples in a sentence bag, which is suboptimal compared with making a hard decision on false positive samples at the sentence level. In this paper, we introduce an adversarial learning framework, which we name DSGAN, to learn a sentence-level true-positive generator. Inspired by Generative Adversarial Networks, we regard the positive samples generated by the generator as the negative samples to train the discriminator. The optimal generator is obtained when the discrimination ability of the discriminator has declined the most. We adopt the generator to filter the distant supervision training dataset and redistribute the false positive instances into the negative set, thereby providing a cleaned dataset for relation classification. The experimental results show that the proposed strategy significantly improves the performance of distant supervision relation extraction compared with state-of-the-art systems.

Neural Networks · Object detection · Networking · Extensibility · MoDELS

January 12, 2018

MSDNN: Multi-Scale Deep Neural Network for Salient Object Detection

Fen Xiao, Wenzheng Deng, Liangchan Peng, Chunhong Cao, Kai Hu, Xieping Gao

from arxiv, 10 pages, 12 figures

Salient object detection is a fundamental problem and has received a great deal of attention in computer vision. Recently, deep learning models have become a powerful tool for image feature extraction. In this paper, we propose a multi-scale deep neural network (MSDNN) for salient object detection. The proposed model first extracts global high-level features and context information over the whole source image with a recurrent convolutional neural network (RCNN). Then several stacked deconvolutional layers are adopted to get the multi-scale feature representation and obtain a series of saliency maps. Finally, we investigate a fusion convolution module (FCM) to build a final pixel-level saliency map. The proposed model is extensively evaluated on four salient object detection benchmark datasets. Results show that our deep model significantly outperforms 12 other state-of-the-art approaches.
