
INTERACT · Collaborative filtering · Graph · SimPLe · Feature extraction

August 11, 2024

GraphTransfer: A Generic Feature Fusion Framework for Collaborative Filtering

Jiafeng Xia,Dongsheng Li,Hansu Gu,Tun Lu,Ning Gu

Graph Neural Networks (GNNs) have demonstrated effectiveness in collaborative filtering tasks due to their ability to extract powerful structural features. However, combining the graph features extracted from user-item interactions and auxiliary features extracted from user genres and item properties remains a challenge. Currently available fusion methods face two major issues: 1) simple methods such as concatenation and summation are generic, but not accurate in capturing feature relationships; 2) task-specific methods like attention mechanisms and meta paths may not be suitable for general feature fusion. To address these challenges, we present GraphTransfer, a simple but universal feature fusion framework for GNN-based collaborative filtering. Our method accurately fuses different types of features by first extracting graph features from the user-item interaction graph and auxiliary features from users and items using GCN. The proposed cross fusion module then effectively bridges the semantic gaps between the interaction scores of different features. Theoretical analysis and experiments on public datasets show that GraphTransfer outperforms other feature fusion methods in CF tasks. Additionally, we demonstrate the universality of our framework via empirical studies in three other scenarios, showing that GraphTransfer leads to significant improvements in the performance of CF algorithms.

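Related content

INTERACT

The IFIP TC13 Conference on Human-Computer Interaction is an important platform for researchers and practitioners in human-computer interaction to present their work. Over the years, these conferences have attracted researchers from several countries and cultures. Official website:

Prompt · MoDELS · Score · Scenario

September 27, 2024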

IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation

Fan Lin,Shuyi Xie,Yong Dai,Wenlin Yao,Tianjiao Lang,Zishan Xu,Zhichao Hu,Xiao Xiao,Yuhong Liu,Yu Zhang

from arxiv, NeurIPS 2024

As Large Language Models (LLMs) grow increasingly adept at managing complex tasks, the evaluation set must keep pace with these advancements to ensure it remains sufficiently discriminative. Item Discrimination (ID) theory, which is widely used in educational assessment, measures the ability of individual test items to differentiate between high and low performers. Inspired by this theory, we propose an ID-induced prompt synthesis framework for evaluating LLMs to ensure the evaluation set can continually update and refine according to model abilities. Our data synthesis framework prioritizes both breadth and specificity. It can generate prompts that comprehensively evaluate the capabilities of LLMs while revealing meaningful performance differences between models, allowing for effective discrimination of their relative strengths and weaknesses across various tasks and domains. To produce high-quality data, we incorporate a self-correct mechanism into our generalization framework, and develop two models to predict prompt discrimination and difficulty score to facilitate our data synthesis framework, contributing valuable tools to evaluation data synthesis research. We apply our generated data to evaluate five SOTA models. Our data achieves an average score of 51.92, accompanied by a variance of 10.06. By contrast, previous works (i.e., SELF-INSTRUCT and WizardLM) obtain an average score exceeding 67, with a variance below 3.2. The results demonstrate that the data generated by our framework is more challenging and discriminative compared to previous works. We will release a dataset of over 3,000 carefully crafted prompts to facilitate evaluation research of LLMs.
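As a rough illustration of the Item Discrimination idea the abstract borrows from educational assessment, the sketch below computes the classical upper-lower discrimination index: the fraction of top-scoring test takers who answer an item correctly minus the fraction of bottom-scoring ones. This is the textbook index, not the paper's learned discrimination predictor; the function name, the 27% group fraction, and the toy data are assumptions made for illustration.

```python
def discrimination_index(total_scores, item_correct, group_fraction=0.27):
    """Classical upper-lower item discrimination index (illustrative only).

    total_scores: overall score of each test taker (here, each model).
    item_correct: 1 if that test taker answered this item correctly, else 0.
    group_fraction: assumed share of test takers in each comparison group.
    """
    if len(total_scores) != len(item_correct):
        raise ValueError("inputs must have the same length")
    n = len(total_scores)
    k = max(1, round(n * group_fraction))  # size of the upper and lower groups
    # Rank test takers from highest to lowest overall score.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower


# Toy data (hypothetical): six test takers answering two items.
scores = [90, 85, 70, 55, 40, 30]
sharp_item = [1, 1, 1, 0, 0, 0]   # only strong test takers succeed
flat_item = [1, 1, 1, 1, 1, 1]    # everyone succeeds
d_sharp = discrimination_index(scores, sharp_item)
d_flat = discrimination_index(scores, flat_item)
```

An index near 1 means the item separates strong from weak test takers, while an index near 0 means it does not, which is the property an evaluation set loses as models improve.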

Language modeling · MoDELS · Performer · Large language models · Operations

September 27, 2024

OWL: A Large Language Model for IT Operations

Hongcheng Guo,Jian Yang,Jiaheng Liu,Liqun Yang,Linzheng Chai,Jiaqi Bai,Junran Peng,Xiaorong Hu,Chao Chen,Dongfeng Zhang,Xu Shi,Tieqiao Zheng,Liangfan Zheng,Bo Zhang,Ke Xu,Zhoujun Li

from arxiv, ICLR 2024

With the rapid development of IT operations, it has become increasingly crucial to efficiently manage and analyze large volumes of data for practical applications. The techniques of Natural Language Processing (NLP) have shown remarkable capabilities for various tasks, including named entity recognition, machine translation and dialogue systems. Recently, Large Language Models (LLMs) have achieved significant improvements across various NLP downstream tasks. However, there is a lack of specialized LLMs for IT operations. In this paper, we introduce OWL, a large language model trained on our collected OWL-Instruct dataset with a wide range of IT-related information, where a mixture-of-adapter strategy is proposed to improve parameter-efficient tuning across different domains or tasks. Furthermore, we evaluate the performance of OWL on OWL-Bench, which we established, and on open IT-related benchmarks. OWL demonstrates superior performance on IT tasks, outperforming existing models by significant margins. Moreover, we hope that the findings of our work will provide more insights to revolutionize the techniques of IT operations with specialized LLMs.

Learning · Robots · Performer · Range · Round

September 27, 2024

iWalker: Imperative Visual Planning for Walking Humanoid Robot

Xiao Lin,Yuhao Huang,Taimeng Fu,Xiaobin Xiong,Chen Wang

Humanoid robots, with the potential to perform a broad range of tasks in environments designed for humans, have been deemed crucial as the basis of general AI agents. For planning and control, although traditional models and task-specific methods have been extensively studied over the past few decades, they are inadequate for achieving the flexibility and versatility needed for general autonomy. Learning approaches, especially reinforcement learning, are powerful and popular nowadays, but they are inherently "blind" during training, relying heavily on trials in simulation without proper guidance from physical principles or underlying dynamics. In response, we propose a novel end-to-end pipeline that seamlessly integrates perception, planning, and model-based control for humanoid robot walking. We refer to our method as iWalker, which is driven by imperative learning (IL), a self-supervising neuro-symbolic learning framework. This enables the robot to learn from arbitrary unlabeled data, significantly improving its adaptability and generalization capabilities. In experiments, iWalker demonstrates effectiveness in both simulated and real-world environments, representing a significant advancement toward versatile and autonomous humanoid robots.

MoDELS · MATLAB · Square matrix · SimPLe · Mean

September 26, 2024

V-VTEAM: A Compact Behavioral Model for Volatile Memristors

Tanay Patni,Rishona Daniels,Shahar Kvatinsky

from arxiv, 4 pages, 4 figures, 1 table, to be published in proceedings of 2024 International Flexible Electronics Technology Conference (IFETC 2024)

Volatile memristors have recently gained popularity as promising devices for neuromorphic circuits, capable of mimicking the leaky function of neurons and offering advantages over capacitor-based circuits in terms of power dissipation and area. Additionally, volatile memristors are useful as selector devices and for hardware security circuits such as physical unclonable functions. To facilitate the design and simulation of circuits, a compact behavioral model is essential. This paper proposes V-VTEAM, a compact, simple, general, and flexible behavioral model for volatile memristors, inspired by the VTEAM nonvolatile memristor model and developed in MATLAB. The validity of the model is demonstrated by fitting it to an ion drift/diffusion-based Ag/SiOx/C/W volatile memristor, achieving a relative root mean square error of 4.5%.
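To make the fit metric concrete, here is a minimal sketch of one common way to compute a relative root mean square error between measured and simulated samples. The abstract does not give the exact normalization, so dividing by the RMS of the measured signal is an assumption, as are the function name and the sample values.

```python
import math


def relative_rmse(measured, simulated):
    """RMSE between two sample lists, normalized by the RMS of the measured one.

    The normalization is an assumed convention; other definitions divide by
    the mean or by the range of the measured data instead.
    """
    if len(measured) != len(simulated) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    rmse = math.sqrt(sum((m - s) ** 2 for m, s in zip(measured, simulated)) / n)
    rms_measured = math.sqrt(sum(m * m for m in measured) / n)
    if rms_measured == 0:
        raise ValueError("measured signal is identically zero")
    return rmse / rms_measured


# Hypothetical samples, not data from the paper.
measured_current = [1.0, 2.0, 3.0, 4.0]
model_current = [1.1, 1.9, 3.2, 3.9]
error = relative_rmse(measured_current, model_current)
```

Under this convention, a value of 0.045 would correspond to the 4.5% figure quoted above.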

Specialization · MoDELS · Language modeling · Learning · Large language models

September 26, 2024

MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models

Gongfan Fang,Hongxu Yin,Saurav Muralidharan,Greg Heinrich,Jeff Pool,Jan Kautz,Pavlo Molchanov,Xinchao Wang

from arxiv, NeurIPS 2024 Spotlight

Large Language Models (LLMs) are distinguished by their massive parameter counts, which typically result in significant redundancy. This work introduces MaskLLM, a learnable pruning method that establishes Semi-structured (or "N:M") Sparsity in LLMs, aimed at reducing computational overhead during inference. Instead of developing a new importance criterion, MaskLLM explicitly models N:M patterns as a learnable distribution through Gumbel Softmax sampling. This approach facilitates end-to-end training on large-scale datasets and offers two notable advantages: 1) High-quality Masks - our method effectively scales to large datasets and learns accurate masks; 2) Transferability - the probabilistic modeling of mask distribution enables the transfer learning of sparsity across domains or tasks. We assessed MaskLLM using 2:4 sparsity on various LLMs, including LLaMA-2, Nemotron-4, and GPT-3, with sizes ranging from 843M to 15B parameters, and our empirical results show substantial improvements over state-of-the-art methods. For instance, leading approaches achieve a perplexity (PPL) of 10 or greater on Wikitext compared to the dense model's 5.12 PPL, but MaskLLM achieves a significantly lower 6.72 PPL solely by learning the masks with frozen weights. Furthermore, MaskLLM's learnable nature allows customized masks for lossless application of 2:4 sparsity to downstream tasks or domains. Code is available at //github.com/NVlabs/MaskLLM.
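For readers unfamiliar with the 2:4 pattern, the sketch below builds such a mask with a simple magnitude heuristic: in every group of four consecutive weights, the two with the largest absolute value are kept. This is only a baseline to show what the constraint looks like, not MaskLLM itself, which learns a distribution over the six possible 2-of-4 masks per group instead of using a fixed criterion. The function name and the weights are hypothetical.

```python
def two_four_mask(weights):
    """Return a 0/1 mask keeping the 2 largest-magnitude values per group of 4.

    Magnitude-based selection is a stand-in heuristic for illustration; the
    paper learns the mask choice rather than fixing it this way.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    mask = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two entries with the largest absolute value.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        mask.extend(1 if i in keep else 0 for i in range(4))
    return mask


# Hypothetical weights: two groups of four.
weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.0]
mask = two_four_mask(weights)
pruned = [w * m for w, m in zip(weights, mask)]
```

Every group ends up with exactly two nonzero entries, which is the structure that sparse hardware kernels rely on.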

Greedy layer-wise pretraining · Greedy · Neural Networks · Networking · Discretization

September 25, 2024

fOGA: Orthogonal Greedy Algorithm for Fractional Laplace Equations

Ruitong Shan,Young Ju Lee,Jiwei Jia

from arxiv, 15 pages

In this paper, we explore the finite difference approximation of the fractional Laplace operator in conjunction with a neural network method for solving it. We discretized the fractional Laplace operator using the Riemann-Liouville formula relevant to fractional equations. A shallow neural network was constructed to address the discrete fractional operator, coupled with the OGA algorithm. To validate the feasibility of our approach, we conducted numerical experiments, testing both the Laplace operator and the fractional Laplace operator, yielding favorable convergence results.

Robots · Robustness · Controller · Sensor · INFORMS

September 24, 2024

MBC: Multi-Brain Collaborative Control for Quadruped Robots

Hang Liu,Yi Cheng,Rankun Li,Xiaowen Hu,Linqi Ye,Houde Liu

from arxiv, 18 pages, 9 figures, Website and Videos: //quad-mbc.github.io/

In quadruped robot locomotion, the Blind Policy and the Perceptive Policy each have their own advantages and limitations. The Blind Policy relies on preset sensor information and algorithms and is suitable for known and structured environments, but it lacks adaptability in complex or unknown environments. The Perceptive Policy uses visual sensors to obtain detailed environmental information, allowing it to adapt to complex terrains, but its effectiveness is limited under occluded conditions, especially when perception fails, where it is less robust than the Blind Policy. To address these challenges, we propose MBC, a Multi-Brain Collaborative system that incorporates the concepts of Multi-Agent Reinforcement Learning and introduces collaboration between the Blind Policy and the Perceptive Policy. By applying this multi-policy collaborative model to a quadruped robot, the robot can maintain stable locomotion even when the perceptual system is impaired or observational data is incomplete. Our simulations and real-world experiments demonstrate that this system significantly improves the robot's passability and robustness against perception failures in complex environments, validating the effectiveness of multi-policy collaboration in enhancing robotic motion performance.

state-of-the-art · Comprehensibility · BERT · Denoising autoencoder · Performer

June 19, 2019

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Zhilin Yang,Zihang Dai,Yiming Yang,Jaime Carbonell,Ruslan Salakhutdinov,Quoc V. Le

from arxiv, Pretrained models and code are available at //github.com/zihangdai/xlnet

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, XLNet outperforms BERT on 20 tasks, often by a large margin, and achieves state-of-the-art results on 18 tasks including question answering, natural language inference, sentiment analysis, and document ranking.
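Point (1) above, maximizing the expected likelihood over all permutations of the factorization order, can be written out as below. The notation is a sketch: $\mathcal{Z}_T$ is taken to be the set of all permutations of the index sequence $[1, \dots, T]$, $z_t$ is the $t$-th element of a permutation $\mathbf{z}$, and $\mathbf{z}_{<t}$ denotes its first $t-1$ elements.

```latex
\max_{\theta} \;
\mathbb{E}_{\mathbf{z} \sim \mathcal{Z}_T}
\left[ \sum_{t=1}^{T} \log p_{\theta}\!\left( x_{z_t} \mid \mathbf{x}_{\mathbf{z}_{<t}} \right) \right]
```

Because each position is predicted in some orders after its left neighbors and in others after its right neighbors, the model sees context from both sides in expectation while remaining autoregressive.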

Performer · Discriminator · Positive example · False positive · Supervision

May 24, 2018

DSGAN: Generative Adversarial Training for Distant Supervision Relation Extraction

Pengda Qin,Weiran Xu,William Yang Wang

Distant supervision can effectively label data for relation extraction, but suffers from the noisy labeling problem. Recent works mainly apply soft bag-level noise reduction strategies to find the relatively better samples in a sentence bag, which is suboptimal compared with making a hard decision about false positive samples at the sentence level. In this paper, we introduce an adversarial learning framework, which we name DSGAN, to learn a sentence-level true-positive generator. Inspired by Generative Adversarial Networks, we regard the positive samples generated by the generator as the negative samples to train the discriminator. The optimal generator is obtained when the discrimination ability of the discriminator has its greatest decline. We adopt the generator to filter the distant supervision training dataset and redistribute the false positive instances into the negative set, thereby providing a cleaned dataset for relation classification. The experimental results show that the proposed strategy significantly improves the performance of distant supervision relation extraction compared to state-of-the-art systems.

FPGA · Convolutional neural network · Neural Networks · Convolution · Layer

September 30, 2016

Caffeinated FPGAs: FPGA Framework For Convolutional Neural Networks

Roberto DiCecco,Griffin Lacey,Jasmina Vasiljevic,Paul Chow,Graham Taylor,Shawki Areibi

Convolutional Neural Networks (CNNs) have gained significant traction in the field of machine learning, particularly due to their high accuracy in visual recognition. Recent works have pushed the performance of GPU implementations of CNNs to significantly improve their classification and training times. With these improvements, many frameworks have become available for implementing CNNs on both CPUs and GPUs, with no support for FPGA implementations. In this work we present a modified version of the popular CNN framework Caffe, with FPGA support. This allows for classification using CNN models and specialized FPGA implementations with the flexibility of reprogramming the device when necessary, seamless memory transactions between host and device, simple-to-use test benches, and the ability to create pipelined layer implementations. To validate the framework, we use the Xilinx SDAccel environment to implement an FPGA-based Winograd convolution engine and show that the FPGA layer can be used alongside other layers running on a host processor to run several popular CNNs (AlexNet, GoogleNet, VGG A, Overfeat). The results show that our framework achieves 50 GFLOPS across 3x3 convolutions in the benchmarks. This is achieved within a practical framework, which will aid in future development of FPGA-based CNNs.
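The Winograd idea behind the convolution engine mentioned above can be shown with the smallest case, the one-dimensional F(2,3) transform, which produces two outputs of a 3-tap filter with four multiplications instead of six. The sketch below is illustrative Python and makes no claim about how the paper's FPGA engine is structured; the function names are hypothetical.

```python
def direct_fir(d, g):
    """Two outputs of a 3-tap filter over 4 inputs, using 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Same two outputs via the Winograd F(2,3) identities."""
    # Filter-side terms depend only on g, so they can be precomputed once.
    g_sum = (g[0] + g[1] + g[2]) / 2
    g_alt = (g[0] - g[1] + g[2]) / 2
    # Only four multiplications involve the input data.
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * g_sum
    m3 = (d[2] - d[1]) * g_alt
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]


# Hypothetical input tile and filter taps.
tile = [1, 2, 3, 4]
taps = [2, 4, 6]
expected = direct_fir(tile, taps)
fast = winograd_f23(tile, taps)
```

Nesting the same transform in two dimensions gives the 3x3 case, where a 2x2 output tile needs 16 multiplications rather than 36.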


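北京阿比特科技有限公司 (Beijing Abit Technology Co., Ltd.)

Registered address: Room 301-191, 3rd Floor, Building 2, No. 18 Yangfangdian Road, Haidian District, Beijing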
