In this paper, we build upon the topic of loss function learning, an emergent meta-learning paradigm that aims to learn loss functions that significantly improve the performance of the models trained under them. Specifically, we propose a new meta-learning framework for task- and model-agnostic loss function learning via a hybrid search approach. The framework first uses genetic programming to find a set of symbolic loss functions. Second, the set of learned loss functions is parameterized and optimized via unrolled differentiation. The versatility and performance of the proposed framework are empirically validated on a diverse set of supervised learning tasks. Results show that the learned loss functions bring improved convergence, sample efficiency, and inference performance on tabular, computer vision, and natural language processing problems, using a variety of task-specific neural network architectures.

Related content

Loss function (machine learning)

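A loss function, also called a distance function or metric function in AI, measures distance in an abstract sense: the error between the true data and the predicted data. The loss function estimates the degree of disagreement between a model's prediction f(x) and the true value Y. It is a non-negative real-valued function, usually written L(Y, f(x)); in general, the smaller the loss, the better the model fits the data. The loss function is the core of the empirical risk function and an important component of the structural risk function.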
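
As a concrete illustration of the definition above, the following minimal sketch implements one common choice of L(Y, f(x)), the squared loss, and averages it over a sample to obtain the empirical risk. The function names `squared_loss` and `empirical_risk` are illustrative assumptions and do not come from the page.

```python
from typing import Callable, Sequence


def squared_loss(y: float, y_hat: float) -> float:
    """Squared loss L(Y, f(x)) = (Y - f(x))^2; always non-negative."""
    return (y - y_hat) ** 2


def empirical_risk(
    loss: Callable[[float, float], float],
    ys: Sequence[float],
    y_hats: Sequence[float],
) -> float:
    """Average the per-example loss over a sample (the empirical risk)."""
    if len(ys) != len(y_hats) or not ys:
        raise ValueError("ys and y_hats must be non-empty and of equal length")
    return sum(loss(y, p) for y, p in zip(ys, y_hats)) / len(ys)


if __name__ == "__main__":
    # True values and hypothetical model predictions.
    print(empirical_risk(squared_loss, [1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Passing the loss as an argument keeps the risk computation independent of the particular loss, so an absolute-error or learned loss could be substituted without changing `empirical_risk`.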

Neural Networks · Convolution · Networking · Layer · Convolutional Neural Network

April 15, 2024

Layered Uploading for Quantum Convolutional Neural Networks

Grégoire Barrué,Tony Quertier

Continuing our analysis of quantum machine learning applied to our use case of malware detection, we investigate the potential of quantum convolutional neural networks. More precisely, we propose a new architecture where data is uploaded all along the quantum circuit. This allows us to use more features from the data, giving the algorithm more information without increasing the number of qubits used in the quantum circuit. This approach is motivated by the fact that we do not always have large amounts of data, and that quantum computers are currently restricted in their number of logical qubits.

CASES · Knowledge · CASE · Graph · Entity

April 15, 2024

Automatic Knowledge Graph Construction for Judicial Cases

Jie Zhou,Xin Chen,Hang Zhang,Zhe Li

In this paper, we explore the application of cognitive intelligence in legal knowledge, focusing on the development of judicial artificial intelligence. Utilizing natural language processing (NLP) as the core technology, we propose a method for the automatic construction of case knowledge graphs for judicial cases. Our approach centers on two fundamental NLP tasks: entity recognition and relationship extraction. We compare two pre-trained models for entity recognition to establish their efficacy. Additionally, we introduce a multi-task semantic relationship extraction model that incorporates translational embedding, leading to a nuanced contextualized case knowledge representation. Specifically, in a case study involving a "Motor Vehicle Traffic Accident Liability Dispute," our approach significantly outperforms the baseline model. The entity recognition F1 score improved by 0.36, while the relationship extraction F1 score increased by 2.37. Building on these results, we detail the automatic construction process of case knowledge graphs for judicial cases, enabling the assembly of knowledge graphs for hundreds of thousands of judgments. This framework provides robust semantic support for applications of judicial AI, including the precise categorization and recommendation of related cases.
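
The F1 scores reported above combine precision and recall. As a minimal sketch of how an entity-level F1 can be computed, the example below compares a set of predicted (text, type) pairs with a gold set under exact matching. The function name, the exact-match criterion, and the sample entities are assumptions for illustration; the abstract does not describe the authors' evaluation code.

```python
from typing import Hashable, Set


def f1_score(gold: Set[Hashable], predicted: Set[Hashable]) -> float:
    """Exact-match F1 between gold and predicted item sets."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical entities: (surface text, entity type).
    gold = {("Zhang San", "PERSON"), ("Beijing", "LOCATION"), ("2020", "DATE")}
    predicted = {("Zhang San", "PERSON"), ("Beijing", "ORGANIZATION")}
    print(f1_score(gold, predicted))
```

Here one of two predictions is correct (precision 0.5) and one of three gold entities is found (recall 1/3), so the harmonic mean is 0.4.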

Integration · Optimizer · Performer · Approximation · Projection

April 12, 2024

Optimized Detection with Analog Beamforming for Monostatic Integrated Sensing and Communication

Rodrigo Hernangómez,Jochen Fink,Renato L. G. Cavalcante,Zoran Utkovski,Sławomir Stańczak

from arxiv, 7 pages, 4 figures. Published at IEEE International Conference on Communications (ICC) 2024. IEEE Copyright protected

In this paper, we formalize an optimization framework for analog beamforming in the context of monostatic integrated sensing and communication (ISAC), where we also address the problem of self-interference in the analog domain. As a result, we derive semidefinite programs to approach detection-optimal transmit and receive beamformers, and we devise a superiorized iterative projection algorithm to approximate them. Our simulations show that this approach outperforms the detection performance of well-known design techniques for ISAC beamforming, while it achieves satisfactory self-interference suppression.

Similarity · Performer · Agent · Robustness · INFORMS

April 12, 2024

A Transferability Metric Using Scene Similarity and Local Map Observation for DRL Navigation

Shiwei Lian,Feitian Zhang

While deep reinforcement learning (DRL) has attracted rapidly growing interest in solving the problem of navigation without global maps, DRL typically leads to mediocre navigation performance in practice due to the gap between the training scene and the actual test scene. To quantify the transferability of a DRL agent between the training and test scenes, this paper proposes a new transferability metric -- the scene similarity calculated using an improved image template matching algorithm. Specifically, two transferability performance indicators are designed: the global scene similarity, which evaluates the overall robustness of a DRL algorithm, and the local scene similarity, which serves as a safety measure when a DRL agent is deployed without a global map. In addition, this paper proposes the use of a local map that fuses 2D LiDAR data with spatial information of both the agent and the destination as the DRL observation, aiming to improve the transferability of DRL navigation algorithms. With a wheeled robot as the case study platform, both simulation and real-world experiments are conducted in a total of 26 different scenes. The experimental results affirm the robustness of the local map observation design and demonstrate the strong correlation between the scene similarity metric and the success rate of DRL navigation algorithms.

Reducible · MoDELS · Sparsity · CC · Inference

April 11, 2024

Bayesian Federated Model Compression for Communication and Computation Efficiency

Chengyu Xia,Danny H. K. Tsang,Vincent K. N. Lau

In this paper, we investigate Bayesian model compression in federated learning (FL) to construct sparse models that can achieve both communication and computation efficiencies. We propose a decentralized Turbo variational Bayesian inference (D-Turbo-VBI) FL framework in which we first introduce a hierarchical sparse prior to promote a clustered sparse structure in the weight matrix. Then, by carefully integrating message passing and VBI with a decentralized turbo framework, we propose the D-Turbo-VBI algorithm, which can (i) reduce both upstream and downstream communication overhead during federated training, and (ii) reduce the computational complexity during local inference. Additionally, we establish the convergence property for the proposed D-Turbo-VBI algorithm. Simulation results show the significant gain of our proposed algorithm over the baselines in reducing communication overhead during federated training and the computational complexity of the final model.

Functional · Optimizer · Learning · Machine Learning · Notability

March 29, 2024

Functional Bilevel Optimization for Machine Learning

Ieva Petrulionyte,Julien Mairal,Michael Arbel

In this paper, we introduce a new functional point of view on bilevel optimization problems for machine learning, where the inner objective is minimized over a function space. These types of problems are most often solved by using methods developed in the parametric setting, where the inner objective is strongly convex with respect to the parameters of the prediction function. The functional point of view does not rely on this assumption and notably allows using over-parameterized neural networks as the inner prediction function. We propose scalable and efficient algorithms for the functional bilevel optimization problem and illustrate the benefits of our approach on instrumental regression and reinforcement learning tasks, which admit natural functional bilevel structures.

Performer · Data Augmentation · Error Rate · Robustness · Phoneme

March 29, 2024

A Comparison of Speech Data Augmentation Methods Using S3PRL Toolkit

Mina Huh,Ruchira Ray,Corey Karnei

Data augmentations are known to improve robustness in speech-processing tasks. In this study, we summarize and compare different data augmentation strategies using the S3PRL toolkit. We explore how HuBERT and wav2vec perform using different augmentation techniques (SpecAugment, Gaussian Noise, Speed Perturbation) for Phoneme Recognition (PR) and Automatic Speech Recognition (ASR) tasks. We evaluate model performance in terms of phoneme error rate (PER) and word error rate (WER). From the experiments, we observe that SpecAugment slightly improves the performance of HuBERT and wav2vec on the original dataset. We also show that models trained on the Gaussian Noise and Speed Perturbation datasets are more robust when tested with augmented test sets.
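
Both metrics named above are conventionally defined as the edit distance between the reference and the hypothesis, divided by the reference length; WER counts words and PER counts phonemes. The sketch below shows that standard definition in plain Python. It is not the S3PRL implementation, and the function names and sample sentences are assumptions for illustration.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev: List[int] = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference tokens
        for j, hyp_tok in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_tok == hyp_tok else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Token error rate: WER for word tokens, PER for phoneme tokens."""
    if not ref:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat".split()
    hypothesis = "the cat sit on mat".split()
    print(error_rate(reference, hypothesis))
```

In this example the hypothesis has one substituted word and one deleted word against a six-word reference, so the error rate is 2/6. Because insertions are counted, the rate can exceed 1.0 for very long hypotheses.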

Learning · Networking · Processing (programming language) · Loss function (machine learning) · Computational Cost

March 29, 2024

Temporal Difference Learning for High-Dimensional PIDEs with Jumps

Liwei Lu,Hailong Guo,Xu Yang,Yi Zhu

In this paper, we propose a deep learning framework for solving high-dimensional partial integro-differential equations (PIDEs) based on temporal difference learning. We introduce a set of Lévy processes and construct a corresponding reinforcement learning model. To simulate the entire process, we use deep neural networks to represent the solutions and non-local terms of the equations. Subsequently, we train the networks using the temporal difference error, termination condition, and properties of the non-local terms as the loss function. The relative error of the method reaches O(10^{-3}) in 100-dimensional experiments and O(10^{-4}) in one-dimensional pure jump problems. Additionally, our method demonstrates the advantages of low computational cost and robustness, making it well-suited for addressing problems with different forms and intensities of jumps.

Data Augmentation · Taxonomy · Text Classification · Machine Learning · Training Data

July 7, 2021

A Survey on Data Augmentation for Text Classification

Markus Bayer,Marc-André Kaufhold,Christian Reuter

from arxiv, 35 pages, 6 figures, 8 tables

Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing the generalization capabilities of a model, it can also address many other challenges and problems, from overcoming a limited amount of training data, to regularizing the objective, to limiting the amount of data used in order to protect privacy. Based on a precise description of the goals and applications of data augmentation (C1) and a taxonomy for existing works (C2), this survey is concerned with data augmentation methods for textual classification and aims to achieve a concise and comprehensive overview for researchers and practitioners (C3). Derived from the taxonomy, we divided more than 100 methods into 12 different groupings and provide state-of-the-art references expounding which methods are highly promising (C4). Finally, research perspectives that may constitute a building block for future work are given (C5).

Greedy Layer-wise Pretraining · Learned · Greedy · Deep Reinforcement Learning · Extensibility

March 8, 2019

Learning Heuristics over Large Graphs via Deep Reinforcement Learning

Akash Mittal,Anuj Dhawan,Sourav Medya,Sayan Ranu,Ambuj Singh

In this paper, we propose a deep reinforcement learning framework called GCOMB to learn algorithms that can solve combinatorial problems over large graphs. GCOMB mimics the greedy algorithm in the original problem and incrementally constructs a solution. The proposed framework uses a Graph Convolutional Network (GCN) to generate node embeddings that predict the potential nodes in the solution set from the entire node set. These embeddings enable an efficient training process to learn the greedy policy via Q-learning. Through extensive evaluation on several real and synthetic datasets containing up to a million nodes, we establish that GCOMB is up to 41% better than the state of the art, up to seven times faster than the greedy algorithm, and robust and scalable to large dynamic networks.
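
To make "incrementally constructs a solution" concrete, the sketch below shows the kind of classical greedy baseline that such a learned policy would imitate. It is not GCOMB. It assumes, purely for illustration, a maximum-coverage objective in which choosing a node covers that node and its neighbours; the function name, the objective, and the sample graph are all hypothetical.

```python
from typing import Dict, Hashable, List, Set


def greedy_max_cover(adj: Dict[Hashable, List[Hashable]], k: int) -> List[Hashable]:
    """Pick up to k nodes, each time taking the one with the largest marginal gain."""
    covered: Set[Hashable] = set()
    chosen: List[Hashable] = []
    for _ in range(k):
        best_node, best_gain = None, 0
        for node, neighbours in adj.items():
            if node in chosen:
                continue
            # Marginal gain: nodes newly covered if this node is added.
            gain = len((set(neighbours) | {node}) - covered)
            if gain > best_gain:  # strict '>' keeps the first node on ties
                best_node, best_gain = node, gain
        if best_node is None:  # nothing left to gain
            break
        chosen.append(best_node)
        covered |= set(adj[best_node]) | {best_node}
    return chosen


if __name__ == "__main__":
    # Small hypothetical undirected graph given as adjacency lists.
    graph = {
        "a": ["b", "c", "d"],
        "b": ["a"],
        "c": ["a"],
        "d": ["a", "e"],
        "e": ["d", "f"],
        "f": ["e"],
    }
    print(greedy_max_cover(graph, k=2))
```

Each round rescans every remaining node to compute its marginal gain, which is the per-step cost that becomes expensive on graphs with millions of nodes.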