In this paper, we present a novel dataset captured using a VR headset to record conversations between participants within a physics simulator (AI2-THOR). Our primary objective is to extend the field of co-speech gesture generation by incorporating rich contextual information within referential settings. Participants engaged in various conversational scenarios, all based on referential communication tasks. The dataset provides a rich set of multimodal recordings such as motion capture, speech, gaze, and scene graphs. This comprehensive dataset aims to enhance the understanding and development of gesture generation models in 3D scenes by providing diverse and contextually rich data.

Related content

Dataset

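A dataset, also called a data set or data collection, is a collection of data.
A data set (or dataset) is a collection of data, usually presented in tabular form. Each column represents a particular variable, and each row corresponds to a given member of the data set in question. It lists values for each of the variables, such as the height and weight of an object or the values of random numbers. Each value is known as a datum. Depending on the number of rows, the data set may comprise data for one or more members.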

Facebook AI Research · Similarity · MoDELS · Dataset · Model evaluation ·

November 8, 2024

Enhancing Model Fairness and Accuracy with Similarity Networks: A Methodological Approach

Samira Maghool,Paolo Ceravolo

from arxiv, 7 pages, 4 figures

In this paper, we propose an innovative approach to thoroughly explore dataset features that introduce bias in downstream machine-learning tasks. Depending on the data format, we use different techniques to map instances into a similarity feature space. Our method's ability to adjust the resolution of pairwise similarity provides clear insights into the relationship between the dataset classification complexity and model fairness. Experimental results confirm the promising applicability of the similarity network in promoting fair models. Moreover, our methodology not only seems promising for providing a fair downstream task such as classification, but also performs well in imputation and augmentation of the dataset while satisfying fairness criteria such as demographic parity and accounting for imbalanced classes.
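The abstract names demographic parity as one fairness criterion. As a minimal sketch of the standard definition only (not the authors' similarity-network method), the gap in positive-prediction rate between groups can be computed as follows. The function name, the toy predictions, and the group labels are hypothetical.

```python
from collections import defaultdict


def demographic_parity_difference(predictions, groups):
    """Return the largest gap in positive-prediction rate across groups,
    together with the per-group rates."""
    positives = defaultdict(int)
    totals = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values()), rates


# Hypothetical binary predictions for two demographic groups "a" and "b".
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap, rates = demographic_parity_difference(preds, groups)
print(rates, gap)
```

A gap of zero means every group receives positive predictions at the same rate; larger values indicate a stronger violation of demographic parity.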

Dataset · Topic · Task-oriented dialogue systems · MoDELS · Smoothing ·

November 7, 2024

NaturalConv: A Chinese Dialogue Dataset Towards Multi-turn Topic-driven Conversation

Xiaoyang Wang,Chen Li,Jianqiao Zhao,Dong Yu

from arxiv, Accepted as a main track paper at AAAI 2021

In this paper, we propose a Chinese multi-turn topic-driven conversation dataset, NaturalConv, which allows the participants to chat about anything they want as long as any element from the topic is mentioned and the topic shift is smooth. Our corpus contains 19.9K conversations from six domains, and 400K utterances with an average turn number of 20.1. These conversations contain in-depth discussions on related topics or wide, natural transitions between multiple topics. We believe either way is normal for human conversation. To facilitate research on this corpus, we provide results of several benchmark models. Comparative results show that for this dataset, our current models are not able to provide significant improvement by introducing background knowledge/topic. Therefore, the proposed dataset should be a good benchmark for further research to evaluate the validity and naturalness of multi-turn conversation systems. Our dataset is available at //ailab.tencent.com/ailab/nlp/dialogue/#datasets.

Block · binary · Channel · Paper ·

November 6, 2024

Partial Orders in Rate-Matched Polar Codes

Zhichao Liu,Liuquan Yao,Yuan Li,Huazi Zhang,Jun Wang,Guiying Yan,Zhiming Ma

from arxiv, 8 pages, 2 figures, 1 table

In this paper, we establish partial orders (POs) for both the binary erasure channel (BEC) and the binary memoryless symmetric channel (BMSC) under any block rate-matched polar codes. First, we define the POs in the sense of rate-matched polar codes as a sequential block version. Furthermore, we demonstrate the persistence of POs after block rate matching in the BEC. Finally, leveraging the existing POs in the BEC, we obtain more POs in the BMSC under block rate matching. Simulations show that the PW sequence constructed from $\beta$-expansion can be improved by the tool of POs. In fact, any fixed reliable sequence in the mother polar codes can be improved by POs for rate matching.
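For orientation, here is a minimal sketch of how a polarization-weight (PW) sequence is built from a β-expansion for the mother code, without any rate matching. The value β = 2^(1/4) is the one commonly used with this construction and is an assumption here, since the abstract does not state it. The function names are hypothetical, and the bitwise-domination check shows only one well-known partial order, not the block rate-matched POs established in the paper.

```python
def polarization_weight(index, n_bits, beta=2 ** 0.25):
    """PW of a synthetic channel: sum of beta**j over the set bits j of index."""
    return sum(((index >> j) & 1) * beta ** j for j in range(n_bits))


def pw_sequence(n_bits, beta=2 ** 0.25):
    """Channel indices sorted from least to most reliable according to PW."""
    size = 1 << n_bits
    return sorted(range(size), key=lambda i: polarization_weight(i, n_bits, beta))


def dominated_by(i, j):
    """Bitwise-domination partial order: every 1-bit of i is also set in j."""
    return (i & j) == i


n_bits = 3  # mother code length N = 2**3 = 8
sequence = pw_sequence(n_bits)
print(sequence)

# The PW ordering never contradicts the bitwise-domination partial order.
position = {index: rank for rank, index in enumerate(sequence)}
consistent = all(
    position[i] <= position[j]
    for i in range(1 << n_bits)
    for j in range(1 << n_bits)
    if dominated_by(i, j)
)
print(consistent)
```

Partial orders only constrain pairs of channels that are comparable; a sequence such as PW fixes the remaining pairs heuristically, which is where the abstract reports room for improvement under rate matching.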

MoDELS · Performer · Learning rate · HTTPS · Learning ·

November 6, 2024

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Xingwu Sun,Yanfeng Chen,Yiqing Huang,Ruobing Xie,Jiaqi Zhu,Kai Zhang,Shuaipeng Li,Zhen Yang,Jonny Han,Xiaobo Shu,Jiahao Bu,Zhongzhi Chen,Xuemeng Huang,Fengzong Lian,Saiyong Yang,Jianfeng Yan,Yuyuan Zeng,Xiaoqin Ren,Chao Yu,Lulu Wu,Yue Mao,Jun Xia,Tao Yang,Suncong Zheng,Kan Wu,Dian Jiao,Jinbao Xue,Xipeng Zhang,Decheng Wu,Kai Liu,Dengpeng Wu,Guanghui Xu,Shaohua Chen,Shuang Chen,Xiao Feng,Yigeng Hong,Junqiang Zheng,Chengcheng Xu,Zongwei Li,Xiong Kuang,Jianglu Hu,Yiqi Chen,Yuchi Deng,Guiyang Li,Ao Liu,Chenchen Zhang,Shihui Hu,Zilong Zhao,Zifan Wu,Yao Ding,Weichao Wang,Han Liu,Roberts Wang,Hao Fei,Peijie Yu,Ze Zhao,Xun Cao,Hai Wang,Fusheng Xiang,Mengyuan Huang,Zhiyuan Xiong,Bin Hu,Xuebin Hou,Lei Jiang,Jianqiang Ma,Jiajia Wu,Yaping Deng,Yi Shen,Qian Wang,Weijie Liu,Jie Liu,Meng Chen,Liang Dong,Weiwen Jia,Hu Chen,Feifei Liu,Rui Yuan,Huilin Xu,Zhenxiang Yan,Tengfei Cao,Zhichao Hu,Xinhua Feng,Dong Du,Tinghao Yu,Yangyu Tao,Feng Zhang,Jianchen Zhu,Chengzhong Xu,Xirui Li,Chong Zha,Wen Ouyang,Yinben Xia,Xiang Li,Zekun He,Rongpeng Chen,Jiawei Song,Ruibin Chen,Fan Jiang,Chongqing Zhao,Bo Wang,Hao Gong,Rong Gan,Winston Hu,Zhanhui Kang,Yong Yang,Yuhong Liu,Di Wang,Jie Jiang

from arxiv, 17 pages, 4 Figures

In this paper, we introduce Hunyuan-Large, which is currently the largest open-source Transformer-based mixture of experts model, with a total of 389 billion parameters and 52 billion activated parameters, capable of handling up to 256K tokens. We conduct a thorough evaluation of Hunyuan-Large's superior performance across various benchmarks including language understanding and generation, logical reasoning, mathematical problem-solving, coding, long-context, and aggregated tasks, where it outperforms LLama3.1-70B and exhibits comparable performance when compared to the significantly larger LLama3.1-405B model. Key practices of Hunyuan-Large include large-scale synthetic data that is orders of magnitude larger than in previous literature, a mixed expert routing strategy, a key-value cache compression technique, and an expert-specific learning rate strategy. We also investigate the scaling laws and learning rate schedule of mixture of experts models, providing valuable insights and guidance for future model development and optimization. The code and checkpoints of Hunyuan-Large are released to facilitate future innovations and applications. Codes: //github.com/Tencent/Hunyuan-Large Models: //huggingface.co/tencent/Tencent-Hunyuan-Large

Inference · MoDELS · Language modeling · Large language models · Dataset ·

November 5, 2024

LiveMind: Low-latency Large Language Models with Simultaneous Inference

Chuangtao Chen,Grace Li Zhang,Xunzhao Yin,Cheng Zhuo,Ulf Schlichtmann,Bing Li

In this paper, we introduce LiveMind, a novel low-latency inference framework for large language models (LLMs) that enables LLMs to perform inference with incomplete user input. By reallocating computational processes to the input phase, a substantial reduction in latency is achieved, thereby significantly enhancing the interactive experience for users of LLMs. The framework adeptly manages the visibility of the streaming input to the model, allowing it to infer from incomplete user input or await additional content. Compared with traditional inference methods on complete user input, our approach demonstrates an average reduction in response latency of 84.0% on the MMLU dataset and 71.6% on the MMLU-Pro dataset, while maintaining comparable accuracy. Additionally, our framework facilitates collaborative inference and output across different models. By employing a large LLM for inference and a small LLM for output, we achieve an average 37% reduction in response latency, alongside a 4.30% improvement in accuracy on the MMLU-Pro dataset compared with the baseline. The proposed LiveMind framework advances the field of human-AI interaction by enabling more responsive and efficient communication between users and AI systems.

Link prediction · Graph · GPU · Neural Networks · Networking ·

June 16, 2021

Neural Bellman-Ford Networks: A General Graph Neural Network Framework for Link Prediction

Zhaocheng Zhu,Zuobai Zhang,Louis-Pascal Xhonneux,Jian Tang

Link prediction is a fundamental task on graphs. Inspired by traditional path-based methods, in this paper we propose a general and flexible representation learning framework based on paths for link prediction. Specifically, we define the representation of a pair of nodes as the generalized sum of all path representations, with each path representation as the generalized product of the edge representations in the path. Motivated by the Bellman-Ford algorithm for solving the shortest path problem, we show that the proposed path formulation can be efficiently solved by the generalized Bellman-Ford algorithm. To further improve the capacity of the path formulation, we propose the Neural Bellman-Ford Network (NBFNet), a general graph neural network framework that solves the path formulation with learned operators in the generalized Bellman-Ford algorithm. The NBFNet parameterizes the generalized Bellman-Ford algorithm with 3 neural components, namely INDICATOR, MESSAGE and AGGREGATE functions, which correspond to the boundary condition, multiplication operator, and summation operator respectively. The NBFNet is very general, covers many traditional path-based methods, and can be applied to both homogeneous graphs and multi-relational graphs (e.g., knowledge graphs) in both transductive and inductive settings. Experiments on both homogeneous graphs and knowledge graphs show that the proposed NBFNet outperforms existing methods by a large margin in both transductive and inductive settings, achieving new state-of-the-art results.
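The generalized Bellman-Ford iteration described above can be sketched with fixed, hand-written operators in place of the learned INDICATOR, MESSAGE and AGGREGATE networks. This is an illustrative sketch under that assumption, not NBFNet itself: the function name, the toy graph, and the scalar node states are hypothetical. Choosing (min, +) recovers shortest paths, and choosing (+, ×) with unit weights counts paths.

```python
import math


def generalized_bellman_ford(num_nodes, edges, source,
                             indicator, message, aggregate, num_iters):
    """Iterate h(v) = AGGREGATE({MESSAGE(h(u), w) for edges u->v} + {boundary(v)}).

    edges is a list of (u, v, w) triples for directed edges u -> v.
    """
    boundary = [indicator(source, v) for v in range(num_nodes)]
    state = list(boundary)
    for _ in range(num_iters):
        new_state = []
        for v in range(num_nodes):
            incoming = [message(state[u], w) for (u, x, w) in edges if x == v]
            new_state.append(aggregate(incoming + [boundary[v]]))
        state = new_state
    return state


# Toy directed graph: 0 -> 1 (1), 1 -> 2 (2), 0 -> 2 (5).
weighted_edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 5.0)]

# (min, +): boundary is 0 at the source and infinity elsewhere.
shortest = generalized_bellman_ford(
    3, weighted_edges, source=0,
    indicator=lambda s, v: 0.0 if s == v else math.inf,
    message=lambda h, w: h + w,
    aggregate=min,
    num_iters=2,
)

# (+, x) with unit weights: boundary is 1 at the source and 0 elsewhere.
unit_edges = [(u, v, 1) for (u, v, _) in weighted_edges]
path_counts = generalized_bellman_ford(
    3, unit_edges, source=0,
    indicator=lambda s, v: 1 if s == v else 0,
    message=lambda h, w: h * w,
    aggregate=sum,
    num_iters=2,
)
print(shortest, path_counts)
```

Swapping the three callables changes which path-based quantity is computed while the iteration itself stays the same, which is the property the abstract exploits by learning those operators.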

Question answering · Attention mechanism · Reducible · MoDELS · Pooling ·

May 10, 2021

Poolingformer: Long Document Modeling with Pooling Attention

Hang Zhang,Yeyun Gong,Yelong Shen,Weisheng Li,Jiancheng Lv,Nan Duan,Weizhu Chen

from arxiv, Accepted by ICML 2021

In this paper, we introduce a two-level attention schema, Poolingformer, for long document modeling. Its first level uses a smaller sliding window pattern to aggregate information from neighbors. Its second level employs a larger window to increase receptive fields with pooling attention to reduce both computational cost and memory consumption. We first evaluate Poolingformer on two long sequence QA tasks: the monolingual NQ and the multilingual TyDi QA. Experimental results show that Poolingformer sits atop three official leaderboards measured by F1, outperforming previous state-of-the-art models by 1.9 points (79.8 vs. 77.9) on NQ long answer, 1.9 points (79.5 vs. 77.6) on TyDi QA passage answer, and 1.6 points (67.6 vs. 66.0) on TyDi QA minimal answer. We further evaluate Poolingformer on a long sequence summarization task. Experimental results on the arXiv benchmark continue to demonstrate its superior performance.

Taxonomy · Object detection · Identifiable · Reviewer · HTTPS ·

March 11, 2020

Imbalance Problems in Object Detection: A Review

Kemal Oksuz,Baris Can Cam,Sinan Kalkan,Emre Akbas

from arxiv, Accepted to IEEE TPAMI; currently in press

In this paper, we present a comprehensive review of the imbalance problems in object detection. To analyze the problems in a systematic manner, we introduce a problem-based taxonomy. Following this taxonomy, we discuss each problem in depth and present a unifying yet critical perspective on the solutions in the literature. In addition, we identify major open issues regarding the existing imbalance problems as well as imbalance problems that have not been discussed before. Moreover, in order to keep our review up to date, we provide an accompanying webpage which catalogs papers addressing imbalance problems, according to our problem-based taxonomy. Researchers can track newer studies on this webpage available at: //github.com/kemaloksuz/ObjectDetectionImbalance .

Meta-learning · Speech recognition · MAML · Learned · End-to-end ·

October 26, 2019

Meta Learning for End-to-End Low-Resource Speech Recognition

Jui-Yang Hsu,Yuan-Jui Chen,Hung-yi Lee

from arxiv, 5 pages, submitted to ICASSP 2020

In this paper, we propose applying a meta-learning approach to low-resource automatic speech recognition (ASR). We formulate ASR for different languages as different tasks and meta-learn the initialization parameters from many pretraining languages to achieve fast adaptation to unseen target languages via the recently proposed model-agnostic meta-learning algorithm (MAML). We evaluate the proposed approach using six languages as pretraining tasks and four languages as target tasks. Preliminary results show that the proposed method, MetaASR, significantly outperforms the state-of-the-art multitask pretraining approach on all target languages with different combinations of pretraining languages. In addition, owing to MAML's model-agnostic property, this paper also opens a new research direction of applying meta learning to more speech-related applications.
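The MAML idea of learning an initialization that adapts quickly can be illustrated on a toy problem. The sketch below is not MetaASR: each hypothetical "task" is a scalar quadratic loss (theta - center)^2 rather than a speech recognizer, and the learning rates and step count are arbitrary assumptions. The inner loop takes one gradient step per task, and the outer loop differentiates the post-adaptation loss through that step.

```python
def inner_update(theta, center, inner_lr):
    """One task-specific gradient step on the loss (theta - center)**2."""
    grad = 2.0 * (theta - center)
    return theta - inner_lr * grad


def meta_gradient(theta, center, inner_lr):
    """Gradient of the post-adaptation loss with respect to the initialization.

    Chain rule: dL(adapted)/dtheta = 2 * (adapted - center) * d(adapted)/dtheta,
    and d(adapted)/dtheta = 1 - 2 * inner_lr for this quadratic loss.
    """
    adapted = inner_update(theta, center, inner_lr)
    return 2.0 * (adapted - center) * (1.0 - 2.0 * inner_lr)


def maml(centers, theta=0.0, inner_lr=0.1, meta_lr=0.1, steps=200):
    """Meta-learn an initialization that adapts well to every task."""
    for _ in range(steps):
        grad = sum(meta_gradient(theta, c, inner_lr) for c in centers) / len(centers)
        theta -= meta_lr * grad
    return theta


# Three hypothetical pretraining tasks with optima at 1, 2 and 3.
theta_star = maml([1.0, 2.0, 3.0])
print(theta_star)
```

For these symmetric toy tasks the learned initialization settles at the mean of the task optima, from which a single inner step moves equally well toward any of them.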

Branch · Networking · Example · Better · Comprehensibility ·

April 10, 2019

S4Net: Single Stage Salient-Instance Segmentation

Ruochen Fan,Ming-Ming Cheng,Qibin Hou,Tai-Jiang Mu,Jingdong Wang,Shi-Min Hu

We consider an interesting problem, salient instance segmentation, in this paper. Besides producing bounding boxes, our network also outputs high-quality instance-level segments. Taking into account the category-independent property of each target, we design a single stage salient instance segmentation framework, with a novel segmentation branch. Our new branch regards not only local context inside each detection window but also its surrounding context, enabling us to distinguish the instances in the same scope even with obstruction. Our network is end-to-end trainable and runs at a fast speed (40 fps when processing an image with resolution 320x320). We evaluate our approach on a publicly available benchmark and show that it outperforms other alternative solutions. We also provide a thorough analysis of the design choices to help readers better understand the functions of each part of our network. The source code can be found at //github.com/RuochenFan/S4Net.