亚洲乱色熟女一区二区三区麻豆,蝴蝶谷成人网

Speech enhancement (SE) improves communication in noisy environments, affecting areas such as automatic speech recognition, hearing aids, and telecommunications. With these domains typically being power-constrained and event-based while requiring low latency, neuromorphic algorithms in the form of spiking neural networks (SNNs) have great potential. Yet, current effective SNN solutions require a contextual sampling window imposing substantial latency, typically around 32ms, too long for many applications. Inspired by Dual-Path Spiking Neural Networks (DPSNNs) in classical neural networks, we develop a two-phase time-domain streaming SNN framework -- the Dual-Path Spiking Neural Network (DPSNN). In the DPSNN, the first phase uses Spiking Convolutional Neural Networks (SCNNs) to capture global contextual information, while the second phase uses Spiking Recurrent Neural Networks (SRNNs) to focus on frequency-related features. In addition, the regularizer suppresses activation to further enhance energy efficiency of our DPSNNs. Evaluating on the VCTK and Intel DNS Datasets, we demonstrate that our approach achieves the very low latency (approximately 5ms) required for applications like hearing aids, while demonstrating excellent signal-to-noise ratio (SNR), perceptual quality, and energy efficiency.

相關內容

Neural Networks

關注 1648

神經網絡（Neural Networks）是世界上三個最古老的神經建模學會的檔案期刊:國際神經網絡學會(INNS)、歐洲神經網絡學會(ENNS)和日本神經網絡學會(JNNS)。神經網絡提供了一個論壇，以發展和培育一個國際社會的學者和實踐者感興趣的所有方面的神經網絡和相關方法的計算智能。神經網絡歡迎高質量論文的提交，有助于全面的神經網絡研究，從行為和大腦建模，學習算法，通過數學和計算分析，系統的工程和技術應用，大量使用神經網絡的概念和技術。這一獨特而廣泛的范圍促進了生物和技術研究之間的思想交流，并有助于促進對生物啟發的計算智能感興趣的跨學科社區的發展。因此，神經網絡編委會代表的專家領域包括心理學，神經生物學，計算機科學，工程，數學，物理。該雜志發表文章、信件和評論以及給編輯的信件、社論、時事、軟件調查和專利信息。文章發表在五個部分之一:認知科學，神經科學，學習系統，數學和計算分析、工程和應用。官網地址：

數據集 · Performer · Less · Extensibility · 多樣性 ·

2024 年 10 月 2 日

InstaTrans: An Instruction-Aware Translation Framework for Non-English Instruction Datasets

Yungi Kim,Chanjun Park

It is challenging to generate high-quality instruction datasets for non-English languages due to tail phenomena, which limit performance on less frequently observed data. To mitigate this issue, we propose translating existing high-quality English instruction datasets as a solution, emphasizing the need for complete and instruction-aware translations to maintain the inherent attributes of these datasets. We claim that fine-tuning LLMs with datasets translated in this way can improve their performance in the target language. To this end, we introduces a new translation framework tailored for instruction datasets, named InstaTrans (INSTruction-Aware TRANSlation). Through extensive experiments, we demonstrate the superiority of InstaTrans over other competitors in terms of completeness and instruction-awareness of translation, highlighting its potential to broaden the accessibility of LLMs across diverse languages at a relatively low cost. Furthermore, we have validated that fine-tuning LLMs with datasets translated by InstaTrans can effectively improve their performance in the target language.

Integration · 大語言模型 · 語言模型化 · Performer · 分離的 ·

2024 年 10 月 2 日

OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs

Jintian Zhang,Cheng Peng,Mengshu Sun,Xiang Chen,Lei Liang,Zhiqiang Zhang,Jun Zhou,Huajun Chen,Ningyu Zhang

from arxiv, EMNLP 2024 Findings; code is available at //github.com/zjunlp/OneGen

Despite the recent advancements in Large Language Models (LLMs), which have significantly enhanced the generative capabilities for various NLP tasks, LLMs still face limitations in directly handling retrieval tasks. However, many practical applications demand the seamless integration of both retrieval and generation. This paper introduces a novel and efficient One-pass Generation and retrieval framework (OneGen), designed to improve LLMs' performance on tasks that require both generation and retrieval. The proposed framework bridges the traditionally separate training approaches for generation and retrieval by incorporating retrieval tokens generated autoregressively. This enables a single LLM to handle both tasks simultaneously in a unified forward pass. We conduct experiments on two distinct types of composite tasks, RAG and Entity Linking, to validate the pluggability, effectiveness, and efficiency of OneGen in training and inference. Furthermore, our results show that integrating generation and retrieval within the same context preserves the generative capabilities of LLMs while improving retrieval performance. To the best of our knowledge, OneGen is the first to enable LLMs to conduct vector retrieval during the generation.

entity · MoDELS · 知識 (knowledge) · 正交 · 圖 ·

2024 年 10 月 2 日

Block-Diagonal Orthogonal Relation and Matrix Entity for Knowledge Graph Embedding

Yihua Zhu,Hidetoshi Shimodaira

from arxiv, EMNLP2024 findings (Long)

The primary aim of Knowledge Graph embeddings (KGE) is to learn low-dimensional representations of entities and relations for predicting missing facts. While rotation-based methods like RotatE and QuatE perform well in KGE, they face two challenges: limited model flexibility requiring proportional increases in relation size with entity dimension, and difficulties in generalizing the model for higher-dimensional rotations. To address these issues, we introduce OrthogonalE, a novel KGE model employing matrices for entities and block-diagonal orthogonal matrices with Riemannian optimization for relations. This approach enhances the generality and flexibility of KGE models. The experimental results indicate that our new KGE model, OrthogonalE, is both general and flexible, significantly outperforming state-of-the-art KGE models while substantially reducing the number of relation parameters.

語音增強 · Performance · 輸出 · 噪聲 · 回合 ·

2024 年 10 月 2 日

Restorative Speech Enhancement: A Progressive Approach Using SE and Codec Modules

Hsin-Tien Chiang,Hao Zhang,Yong Xu,Meng Yu,Dong Yu

from arxiv, Paper in submission

In challenging environments with significant noise and reverberation, traditional speech enhancement (SE) methods often lead to over-suppressed speech, creating artifacts during listening and harming downstream tasks performance. To overcome these limitations, we propose a novel approach called Restorative SE (RestSE), which combines a lightweight SE module with a generative codec module to progressively enhance and restore speech quality. The SE module initially reduces noise, while the codec module subsequently performs dereverberation and restores speech using generative capabilities. We systematically explore various quantization techniques within the codec module to optimize performance. Additionally, we introduce a weighted loss function and feature fusion that merges the SE output with the original mixture, particularly at segments where the SE output is heavily distorted. Experimental results demonstrate the effectiveness of our proposed method in enhancing speech quality under adverse conditions. Audio demos are available at: //sophie091524.github.io/RestorativeSE/.

示例 · GPT-4 · 大語言模型 · Performer · 語言模型化 ·

2024 年 10 月 1 日

See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses

Yulong Chen,Yang Liu,Jianhao Yan,Xuefeng Bai,Ming Zhong,Yinghao Yang,Ziyi Yang,Chenguang Zhu,Yue Zhang

from arxiv, COLM 2024

The impressive performance of Large Language Models (LLMs) has consistently surpassed numerous human-designed benchmarks, presenting new challenges in assessing the shortcomings of LLMs. Designing tasks and finding LLMs' limitations are becoming increasingly important. In this paper, we investigate the question of whether an LLM can discover its own limitations from the errors it makes. To this end, we propose a Self-Challenge evaluation framework with human-in-the-loop. Starting from seed instances that GPT-4 fails to answer, we prompt GPT-4 to summarize error patterns that can be used to generate new instances and incorporate human feedback on them to refine these patterns for generating more challenging data, iteratively. We end up with 8 diverse patterns, such as text manipulation and questions with assumptions. We then build a benchmark, SC-G4, consisting of 1,835 instances generated by GPT-4 using these patterns, with human-annotated gold responses. The SC-G4 serves as a challenging benchmark that allows for a detailed assessment of LLMs' abilities. Our results show that only 44.96\% of instances in SC-G4 can be answered correctly by GPT-4. Interestingly, our pilot study indicates that these error patterns also challenge other LLMs, such as Claude-3 and Llama-3, and cannot be fully resolved through fine-tuning. Our work takes the first step to demonstrate that LLMs can autonomously identify their inherent flaws and provide insights for future dynamic and automatic evaluation.

無監督 · 異常檢測 · MoDELS · 異常點 · 規范化的 ·

2024 年 9 月 30 日

UMAD: Unsupervised Mask-Level Anomaly Detection for Autonomous Driving

Daniel Bogdoll,No?l Ollick,Tim Joseph,Svetlana Pavlitska,J. Marius Z?llner

from arxiv, Daniel Bogdoll and No\"el Ollick contributed equally. Accepted for publication at BMVC 2024 RROW workshop

Dealing with atypical traffic scenarios remains a challenging task in autonomous driving. However, most anomaly detection approaches cannot be trained on raw sensor data but require exposure to outlier data and powerful semantic segmentation models trained in a supervised fashion. This limits the representation of normality to labeled data, which does not scale well. In this work, we revisit unsupervised anomaly detection and present UMAD, leveraging generative world models and unsupervised image segmentation. Our method outperforms state-of-the-art unsupervised anomaly detection.

優化器 · MoDELS · 集成 · 模型評估 · Learning ·

2024 年 9 月 29 日

PhishGuard: A Multi-Layered Ensemble Model for Optimal Phishing Website Detection

Md Sultanul Islam Ovi,Md. Hasibur Rahman,Mohammad Arif Hossain

Phishing attacks are a growing cybersecurity threat, leveraging deceptive techniques to steal sensitive information through malicious websites. To combat these attacks, this paper introduces PhishGuard, an optimal custom ensemble model designed to improve phishing site detection. The model combines multiple machine learning classifiers, including Random Forest, Gradient Boosting, CatBoost, and XGBoost, to enhance detection accuracy. Through advanced feature selection methods such as SelectKBest and RFECV, and optimizations like hyperparameter tuning and data balancing, the model was trained and evaluated on four publicly available datasets. PhishGuard outperformed state-of-the-art models, achieving a detection accuracy of 99.05% on one of the datasets, with similarly high results across other datasets. This research demonstrates that optimization methods in conjunction with ensemble learning greatly improve phishing detection performance.

多樣性 · 機器人 · 優化器 · 值域 · 穩健性 ·

2024 年 9 月 29 日

OptiGrasp: Optimized Grasp Pose Detection Using RGB Images for Warehouse Picking Robots

Soofiyan Atar,Yi Li,Markus Grotz,Michael Wolf,Dieter Fox,Joshua Smith

from arxiv, 8 pages, 6 figures

In warehouse environments, robots require robust picking capabilities to manage a wide variety of objects. Effective deployment demands minimal hardware, strong generalization to new products, and resilience in diverse settings. Current methods often rely on depth sensors for structural information, which suffer from high costs, complex setups, and technical limitations. Inspired by recent advancements in computer vision, we propose an innovative approach that leverages foundation models to enhance suction grasping using only RGB images. Trained solely on a synthetic dataset, our method generalizes its grasp prediction capabilities to real-world robots and a diverse range of novel objects not included in the training set. Our network achieves an 82.3\% success rate in real-world applications. The project website with code and data will be available at //optigrasp.github.io.

知識 (knowledge) · MoDELS · 語言模型化 · PromptKD · 蒸餾 ·

2024 年 9 月 27 日

PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning

Gyeongman Kim,Doohyuk Jang,Eunho Yang

from arxiv, EMNLP 2024 Findings. Our project page: //promptkd.github.io

Recent advancements in large language models (LLMs) have raised concerns about inference costs, increasing the need for research into model compression. While knowledge distillation (KD) is a prominent method for this, research on KD for generative language models like LLMs is relatively sparse, and the approach of distilling student-friendly knowledge, which has shown promising performance in KD for classification models, remains unexplored in generative language models. To explore this approach, we propose PromptKD, a simple yet effective method that utilizes prompt tuning - for the first time in KD - to enable generative language models to transfer student-friendly knowledge. Unlike previous works in classification that require fine-tuning the entire teacher model for extracting student-friendly knowledge, PromptKD achieves similar effects by adding a small number of prompt tokens and tuning only the prompt with student guidance. Extensive experiments on instruction-following datasets show that PromptKD achieves state-of-the-art performance while adding only 0.0007% of the teacher's parameters as prompts. Further analysis suggests that distilling student-friendly knowledge alleviates exposure bias effectively throughout the entire training process, leading to performance enhancements.

語言模型化 · Performer · Agent · MoDELS · Learning ·

2023 年 5 月 19 日

Introspective Tips: Large Language Model for In-Context Decision Making

Liting Chen,Lu Wang,Hang Dong,Yali Du,Jie Yan,Fangkai Yang,Shuang Li,Pu Zhao,Si Qin,Saravan Rajmohan,Qingwei Lin,Dongmei Zhang

from arxiv, 22 pages, 4 figures

The emergence of large language models (LLMs) has substantially influenced natural language processing, demonstrating exceptional results across various tasks. In this study, we employ ``Introspective Tips" to facilitate LLMs in self-optimizing their decision-making. By introspectively examining trajectories, LLM refines its policy by generating succinct and valuable tips. Our method enhances the agent's performance in both few-shot and zero-shot learning situations by considering three essential scenarios: learning from the agent's past experiences, integrating expert demonstrations, and generalizing across diverse games. Importantly, we accomplish these improvements without fine-tuning the LLM parameters; rather, we adjust the prompt to generalize insights from the three aforementioned situations. Our framework not only supports but also emphasizes the advantage of employing LLM in in-contxt decision-making. Experiments involving over 100 games in TextWorld illustrate the superior performance of our approach.