
TAP · Performer · Bootstrapping · Microsoft Surface · MoDELS ·

May 23, 2024

BootsTAP: Bootstrapped Training for Tracking-Any-Point

Carl Doersch, Pauline Luc, Yi Yang, Dilara Gokay, Skanda Koppula, Ankush Gupta, Joseph Heyward, Ignacio Rocco, Ross Goroshin, João Carreira, Andrew Zisserman

To endow models with greater understanding of physics and motion, it is useful to enable them to perceive how solid surfaces move and deform in real scenes. This can be formalized as Tracking-Any-Point (TAP), which requires the algorithm to track any point on solid surfaces in a video, potentially densely in space and time. Large-scale ground-truth training data for TAP is only available in simulation, which currently has a limited variety of objects and motion. In this work, we demonstrate how large-scale, unlabeled, uncurated real-world data can improve a TAP model with minimal architectural changes, using a self-supervised student-teacher setup. We demonstrate state-of-the-art performance on the TAP-Vid benchmark, surpassing previous results by a wide margin: for example, TAP-Vid-DAVIS performance improves from 61.3% to 67.4%, and TAP-Vid-Kinetics from 57.2% to 62.5%. For visualizations, see our project webpage at //bootstap.github.io/
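A student-teacher setup like the one mentioned above is often built around a teacher whose weights are an exponential moving average (EMA) of the student's weights. The abstract does not state the update rule, so the sketch below is only an assumption for illustration: `ema_update` and the `decay` value are hypothetical, and the weights are plain lists of floats rather than real network parameters.

```python
def ema_update(teacher, student, decay=0.99):
    """Return new teacher weights as an EMA of the student weights.

    Assumption: this is a generic self-training recipe, not necessarily
    the update rule used in the paper described above.
    """
    if len(teacher) != len(student):
        raise ValueError("teacher and student must have the same number of weights")
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


# Toy example: the teacher drifts slowly toward a fixed student.
teacher = [0.0, 1.0]
student = [1.0, 1.0]
for _ in range(3):
    teacher = ema_update(teacher, student, decay=0.5)
print(teacher)
```

A higher `decay` makes the teacher change more slowly, which is the usual reason for using an averaged teacher to produce pseudo-labels on unlabeled data.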

Related content

TAP

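ACM Transactions on Applied Perception (TAP) aims to strengthen the synergy between computer science and psychology/perception by publishing high-quality papers that help unify research in these fields. The journal publishes interdisciplinary research of significant and lasting value in any topic area that spans computer science and perceptual psychology. All papers must include both a perception component and a computer science component. Topics include, but are not limited to: Visual perception: computer graphics, scientific/data/information visualization, digital imaging, computer vision, and stereo and 3D display technology. Auditory perception: auditory display and interfaces, auditory coding, spatial sound, and speech synthesis and recognition. Haptics: haptic rendering, and haptic input and perception. Sensorimotor perception: gesture input and body movement input. Sensory perception: sensory integration, and multimodal rendering and interaction. Official website: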

Learning · Performer · Singular · Reducible · Performance ·

July 3, 2024

Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning

Runyu Ding, Yuzhe Qin, Jiyue Zhu, Chengzhe Jia, Shiqi Yang, Ruihan Yang, Xiaojuan Qi, Xiaolong Wang

from arxiv, project page: //dingry.github.io/projects/bunny_visionpro.html

Teleoperation is a crucial tool for collecting human demonstrations, but controlling robots with bimanual dexterous hands remains a challenge. Existing teleoperation systems struggle to handle the complexity of coordinating two hands for intricate manipulations. We introduce Bunny-VisionPro, a real-time bimanual dexterous teleoperation system that leverages a VR headset. Unlike previous vision-based teleoperation systems, we design novel low-cost devices to provide haptic feedback to the operator, enhancing immersion. Our system prioritizes safety by incorporating collision and singularity avoidance while maintaining real-time performance through innovative designs. Bunny-VisionPro outperforms prior systems on a standard task suite, achieving higher success rates and reduced task completion times. Moreover, the high-quality teleoperation demonstrations improve downstream imitation learning performance, leading to better generalizability. Notably, Bunny-VisionPro enables imitation learning with challenging multi-stage, long-horizon dexterous manipulation tasks, which have rarely been addressed in previous work. Our system's ability to handle bimanual manipulations while prioritizing safety and real-time performance makes it a powerful tool for advancing dexterous manipulation and imitation learning.

Biased · Performer · MoDELS · GROUP · Identifiable ·

July 3, 2024

ViG-Bias: Visually Grounded Bias Discovery and Mitigation

Badr-Eddine Marani, Mohamed Hanini, Nihitha Malayarukil, Stergios Christodoulidis, Maria Vakalopoulou, Enzo Ferrante

from arxiv, Accepted to ECCV 2024

The proliferation of machine learning models in critical decision-making processes has underscored the need for bias discovery and mitigation strategies. Identifying the reasons behind a biased system is not straightforward, since on many occasions they are associated with hidden spurious correlations which are not easy to spot. Standard approaches rely on bias audits performed by analyzing model performance in pre-defined subgroups of data samples, usually characterized by common attributes like gender or ethnicity when it comes to people, or other specific attributes defining semantically coherent groups of images. However, it is not always possible to know a priori the specific attributes defining the failure modes of visual recognition systems. Recent approaches propose to discover these groups by leveraging large vision language models, which enable the extraction of cross-modal embeddings and the generation of textual descriptions to characterize the subgroups where a certain model is underperforming. In this work, we argue that incorporating visual explanations (e.g. heatmaps generated via GradCAM or other approaches) can boost the performance of such bias discovery and mitigation frameworks. To this end, we introduce Visually Grounded Bias Discovery and Mitigation (ViG-Bias), a simple yet effective technique which can be integrated into a variety of existing frameworks to improve both discovery and mitigation performance. Our comprehensive evaluation shows that incorporating visual explanations enhances existing techniques like DOMINO, FACTS and Bias-to-Text, across several challenging datasets, including CelebA, Waterbirds, and NICO++.

Inference · Rank · Automator · Python · Model evaluation ·

July 2, 2024

TIGER: A Generating-Then-Ranking Framework for Practical Python Type Inference

Chong Wang, Jian Zhang, Yiling Lou, Mingwei Liu, Weisong Sun, Yang Liu, Xin Peng

Python's dynamic typing system offers flexibility and expressiveness but can lead to type-related errors, prompting the need for automated type inference to enhance type hinting. While existing learning-based approaches show promising inference accuracy, they struggle with practical challenges in comprehensively handling various types, including complex generic types and (unseen) user-defined types. In this paper, we introduce TIGER, a two-stage generating-then-ranking (GTR) framework, designed to effectively handle Python's diverse type categories. TIGER leverages fine-tuned pre-trained code models to train a generative model with a span masking objective and a similarity model with a contrastive training objective. This approach allows TIGER to generate a wide range of type candidates, including complex generics in the generating stage, and accurately rank them with user-defined types in the ranking stage. Our evaluation on the ManyTypes4Py dataset shows TIGER's advantage over existing methods in various type categories, notably improving accuracy in inferring user-defined and unseen types by 11.2% and 20.1% respectively in Top-5 Exact Match. Moreover, the experimental results not only demonstrate TIGER's superior performance and efficiency, but also underscore the significance of its generating and ranking stages in enhancing automated type inference.
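The two-stage idea described above (generate type candidates, then rank them together with user-defined types) can be illustrated with a toy ranking step. This is only a sketch under stated assumptions: the candidate list stands in for the output of a generative model, the three-dimensional embeddings are hand-written, and `cosine`, `rank_candidates`, and `UserConfig` are hypothetical names rather than anything from the TIGER code base.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(context_vec, candidates):
    """Sort (type_name, embedding) pairs by similarity to the context.

    Assumption: embeddings would come from a trained similarity model;
    here they are toy vectors chosen by hand.
    """
    scored = [(name, cosine(context_vec, vec)) for name, vec in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Stage 1 (stand-in): candidates a generative model might propose,
# plus a user-defined type collected from the project.
candidates = [
    ("int", [1.0, 0.0, 0.0]),
    ("List[str]", [0.0, 1.0, 0.0]),
    ("UserConfig", [0.0, 0.6, 0.8]),  # hypothetical user-defined type
]

# Stage 2: rank every candidate against the embedding of the variable's context.
context_vec = [0.0, 0.5, 0.9]
ranking = rank_candidates(context_vec, candidates)
print([name for name, _ in ranking])
```

Separating generation from ranking lets types that the generator never produced, such as project-specific classes, still compete in the final ordering.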

Pyramid · MoDELS · Extensibility · state-of-the-art · Performer ·

December 1, 2022

Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis

Wan-Cyuan Fan, Yen-Chun Chen, Dongdong Chen, Yu Cheng, Lu Yuan, Yu-Chiang Frank Wang

from arxiv, AAAI 2023

Diffusion models (DMs) have shown great potential for high-quality image synthesis. However, when it comes to producing images with complex scenes, how to properly describe both image global structures and object details remains a challenging task. In this paper, we present Frido, a Feature Pyramid Diffusion model performing a multi-scale coarse-to-fine denoising process for image synthesis. Our model decomposes an input image into scale-dependent vector quantized features, followed by a coarse-to-fine gating for producing image output. During the above multi-scale representation learning stage, additional input conditions like text, scene graph, or image layout can be further exploited. Thus, Frido can also be applied to conditional or cross-modality image synthesis. We conduct extensive experiments over various unconditioned and conditional image generation tasks, ranging from text-to-image synthesis, layout-to-image, scene-graph-to-image, to label-to-image. More specifically, we achieved state-of-the-art FID scores on five benchmarks, namely layout-to-image on COCO and OpenImages, scene-graph-to-image on COCO and Visual Genome, and label-to-image on COCO. Code is available at //github.com/davidhalladay/Frido.

Understandability · MoDELS · INFORMS · Robustness · Black box ·

April 30, 2022

ExSum: From Local Explanations to Model Understanding

Yilun Zhou, Marco Tulio Ribeiro, Julie Shah

from arxiv, NAACL 2022. The project website is at //yilunzhou.github.io/exsum/

Interpretability methods are developed to understand the working mechanisms of black-box models, which is crucial to their responsible deployment. Fulfilling this goal requires both that the explanations generated by these methods are correct and that people can easily and reliably understand them. While the former has been addressed in prior work, the latter is often overlooked, resulting in informal model understanding derived from a handful of local explanations. In this paper, we introduce explanation summary (ExSum), a mathematical framework for quantifying model understanding, and propose metrics for its quality assessment. On two domains, ExSum highlights various limitations in the current practice, helps develop accurate model understanding, and reveals easily overlooked properties of the model. We also connect understandability to other properties of explanations such as human alignment, robustness, and counterfactual minimality and plausibility.

BART · Graph · MoDELS · Knowledge graph · Generative model ·

January 21, 2021

KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning

Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu

from arxiv, 10 pages, 7 figures, to appear in AAAI 2021

Generative commonsense reasoning, which aims to empower machines to generate sentences with the capacity of reasoning over a set of concepts, is a critical bottleneck for text generation. Even the state-of-the-art pre-trained language generation models struggle at this task and often produce implausible and anomalous sentences. One reason is that they rarely consider incorporating the knowledge graph, which can provide rich relational information among the commonsense concepts. To promote the ability of commonsense reasoning for text generation, we propose a novel knowledge graph augmented pre-trained language generation model, KG-BART, which encompasses the complex relations of concepts through the knowledge graph and produces more logical and natural sentences as output. Moreover, KG-BART can leverage graph attention to aggregate the rich concept semantics, which enhances model generalization on unseen concept sets. Experiments on the benchmark CommonGen dataset verify the effectiveness of our proposed approach by comparing with several strong pre-trained language generation models; in particular, KG-BART outperforms BART by 5.80 and 4.60 in terms of BLEU-3 and BLEU-4, respectively. Moreover, we also show that the context generated by our model can work as background scenarios to benefit downstream commonsense QA tasks.

Distillation · MoDELS · Learning · Student-Teacher · Vision ·

April 13, 2020

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

Lin Wang, Kuk-Jin Yoon

from arxiv, 30 pages, paper in submission

Deep neural models in recent years have been successful in almost every field, including extremely complex problem statements. However, these models are huge in size, with millions (and even billions) of parameters, thus demanding heavier computation power and failing to be deployed on edge devices. Besides, the performance boost is highly dependent on redundant labeled data. To achieve faster speeds and to handle the problems caused by the lack of data, knowledge distillation (KD) has been proposed to transfer information learned from one model to another. KD is often characterized by the so-called 'Student-Teacher' (S-T) learning framework and has been broadly applied in model compression and knowledge transfer. This paper is about KD and S-T learning, which have been actively studied in recent years. First, we aim to provide explanations of what KD is and how/why it works. Then, we provide a comprehensive survey on the recent progress of KD methods together with S-T frameworks typically for vision tasks. In general, we consider some fundamental questions that have been driving this research area and thoroughly generalize the research progress and technical details. Additionally, we systematically analyze the research status of KD in vision applications. Finally, we discuss the potentials and open challenges of existing methods and outline the future directions of KD and S-T learning.
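As a concrete illustration of transferring information from one model to another, the sketch below computes a soft-target distillation loss: the KL divergence between temperature-softened teacher and student distributions, scaled by the squared temperature. This is the commonly used classic formulation and is offered as an assumption; the abstract above does not single out any one loss, and `softmax` and `distillation_loss` are illustrative helpers written in plain Python.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: a single example with small logits, so no probability
    underflows to zero in the student distribution.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl


# The loss is zero when the student matches the teacher and positive otherwise.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))
print(distillation_loss([1.0, 1.0, 1.0], [2.0, 1.0, 0.1]))
```

Raising the temperature flattens both distributions, so the student also learns from the relative scores the teacher assigns to incorrect classes.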

Graph · Knowledge graph · Language modeling · entity · BERT ·

September 7, 2019

KG-BERT: BERT for Knowledge Graph Completion

Liang Yao, Chengsheng Mao, Yuan Luo

Knowledge graphs are important resources for many artificial intelligence tasks but often suffer from incompleteness. In this work, we propose to use pre-trained language models for knowledge graph completion. We treat triples in knowledge graphs as textual sequences and propose a novel framework named Knowledge Graph Bidirectional Encoder Representations from Transformer (KG-BERT) to model these triples. Our method takes entity and relation descriptions of a triple as input and computes the scoring function of the triple with the KG-BERT language model. Experimental results on multiple benchmark knowledge graphs show that our method can achieve state-of-the-art performance in triple classification, link prediction and relation prediction tasks.
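Treating a triple as a textual sequence can be pictured as joining the head, relation, and tail descriptions with BERT-style special tokens before scoring. The layout below is an assumption for illustration; the abstract does not give the exact input format, and `triple_to_sequence` is a hypothetical helper that only builds the string, without calling any model.

```python
def triple_to_sequence(head, relation, tail):
    """Join the three descriptions of a triple into one token-delimited string.

    Assumption: "[CLS]" and "[SEP]" follow the usual BERT convention; a
    real pipeline would pass this text through a tokenizer and a model.
    """
    parts = [head.strip(), relation.strip(), tail.strip()]
    if not all(parts):
        raise ValueError("head, relation, and tail must all be non-empty")
    return "[CLS] " + " [SEP] ".join(parts) + " [SEP]"


# Example triple with short textual descriptions.
print(triple_to_sequence("Steve Jobs", "founded", "Apple Inc."))
```

A classifier on top of the encoded sequence could then output a plausibility score for the triple, which is what link prediction and triple classification need.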

Graph convolutional network · AdaBoost · Graph convolution · Graph · Networking ·

August 14, 2019

AdaGCN: Adaboosting Graph Convolutional Networks into Deep Models

Ke Sun, Zhouchen Lin, Zhanxing Zhu

The design of deep graph models still remains to be investigated, and the crucial part is how to explore and exploit the knowledge from different hops of neighbors in an efficient way. In this paper, we propose a novel RNN-like deep graph neural network architecture by incorporating AdaBoost into the computation of the network; the proposed graph convolutional network, called AdaGCN (AdaBoosting Graph Convolutional Network), has the ability to efficiently extract knowledge from high-order neighbors and integrate knowledge from different hops of neighbors into the network in an AdaBoost way. We also present the architectural difference between AdaGCN and existing graph convolutional methods to show the benefits of our proposal. Finally, extensive experiments demonstrate the state-of-the-art prediction performance and the computational advantage of our approach AdaGCN.

Transformer-XL · Language modeling · MoDELS · Learning · Treebank ·

June 2, 2019

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov

from arxiv, ACL 2019 long paper. Code and pretrained models are available at //github.com/kimiyoung/transformer-xl

Transformers have a potential of learning longer-term dependency, but are limited by a fixed-length context in the setting of language modeling. We propose a novel neural architecture Transformer-XL that enables learning dependency beyond a fixed length without disrupting temporal coherence. It consists of a segment-level recurrence mechanism and a novel positional encoding scheme. Our method not only enables capturing longer-term dependency, but also resolves the context fragmentation problem. As a result, Transformer-XL learns dependency that is 80% longer than RNNs and 450% longer than vanilla Transformers, achieves better performance on both short and long sequences, and is up to 1,800+ times faster than vanilla Transformers during evaluation. Notably, we improve the state-of-the-art results of bpc/perplexity to 0.99 on enwiki8, 1.08 on text8, 18.3 on WikiText-103, 21.8 on One Billion Word, and 54.5 on Penn Treebank (without finetuning). When trained only on WikiText-103, Transformer-XL manages to generate reasonably coherent, novel text articles with thousands of tokens. Our code, pretrained models, and hyperparameters are available in both Tensorflow and PyTorch.
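The segment-level recurrence mentioned above can be pictured as processing a long sequence in fixed-length segments while carrying a bounded cache from earlier segments forward as extra context. The sketch below is a simplification under stated assumptions: it caches raw tokens instead of per-layer hidden states, performs no attention, and `segment_contexts`, `seg_len`, and `mem_len` are illustrative names.

```python
def segment_contexts(tokens, seg_len, mem_len):
    """Yield (memory, segment) pairs for consecutive fixed-length segments.

    memory holds up to mem_len items cached from earlier segments.
    Assumption: a real model caches hidden states per layer and does not
    backpropagate into them; raw tokens are used here to keep it simple.
    """
    if seg_len <= 0 or mem_len < 0:
        raise ValueError("seg_len must be positive and mem_len non-negative")
    memory = []
    for start in range(0, len(tokens), seg_len):
        segment = tokens[start:start + seg_len]
        yield list(memory), segment
        # Keep only the most recent mem_len items for the next segment.
        memory = (memory + segment)[-mem_len:] if mem_len else []


# Each segment sees the tail of what came before it.
for memory, segment in segment_contexts(list("abcdefg"), seg_len=3, mem_len=2):
    print(memory, segment)
```

With `mem_len=0` every segment is processed in isolation, which corresponds to the fixed-length context setting that the abstract describes as causing context fragmentation.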


Related topics

Bootstrapping

Microsoft Surface

