
We address the challenging problem of Long-Tailed Semi-Supervised Learning (LTSSL), where labeled data exhibit an imbalanced class distribution and unlabeled data follow an unknown distribution. Unlike in balanced SSL, the generated pseudo-labels are skewed towards head classes, intensifying the training bias. This phenomenon is further amplified when the class distributions of the labeled and unlabeled datasets are mismatched, since more unlabeled data will be mislabeled as head classes. To solve this problem, we propose a novel method named ComPlementary Experts (CPE). Specifically, we train multiple experts to model various class distributions, each of them yielding high-quality pseudo-labels within one form of class distribution. Besides, we introduce Classwise Batch Normalization for CPE to avoid performance degradation caused by feature distribution mismatch between head and non-head classes. CPE achieves state-of-the-art performance on the CIFAR-10-LT, CIFAR-100-LT, and STL-10-LT benchmarks. For instance, on CIFAR-10-LT, CPE improves test accuracy by over 2.22% compared to baselines. Code is available at //github.com/machengcheng2016/CPE-LTSSL.
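As a rough illustration of how one classifier's outputs can be steered towards different assumed class distributions, the sketch below applies post-hoc logit adjustment before taking pseudo-labels. This is a generic technique, not necessarily what CPE does internally; the function name, the `tau` parameter, and the example prior are all assumptions made for illustration.

```python
import numpy as np

def adjusted_pseudo_labels(logits, class_prior, tau):
    """Pseudo-labels after post-hoc logit adjustment (illustrative only).

    logits: (n, C) raw scores from a classifier trained on long-tailed data.
    class_prior: (C,) labeled-set class frequencies, summing to 1.
    tau: adjustment strength (hypothetical knob). tau = 0 keeps the
         head-class bias, larger tau shifts predictions towards tail classes.
    """
    adjusted = logits - tau * np.log(class_prior)
    return adjusted.argmax(axis=1)

# Hypothetical two-class example: class 0 is the head class.
logits = np.array([[2.0, 1.5]])
prior = np.array([0.9, 0.1])
labels_by_tau = {tau: adjusted_pseudo_labels(logits, prior, tau) for tau in (0.0, 1.0, 2.0)}
```

Each value of `tau` plays the role of one "expert" in this toy setting: the same sample can receive a head-class label under one assumed distribution and a tail-class label under another.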

Related content

Unlabeled


Performer · Shaping · Loss · Constraints · Learning

February 14, 2024

Loss Shaping Constraints for Long-Term Time Series Forecasting

Ignacio Hounie, Javier Porras-Valenzuela, Alejandro Ribeiro

Several applications in time series forecasting require predicting multiple steps ahead. Despite the vast amount of literature on the topic, both classical and recent deep-learning-based approaches have mostly focused on minimising performance averaged over the predicted window. We observe that this can lead to disparate distributions of errors across forecasting steps, especially for recent transformer architectures trained on popular forecasting benchmarks. That is, optimising performance on average can lead to undesirably large errors at specific time-steps. In this work, we present a Constrained Learning approach for long-term time series forecasting that aims to find the best model in terms of average performance that respects a user-defined upper bound on the loss at each time-step. We call our approach loss shaping constraints because it imposes constraints on the loss at each time step, and we leverage recent duality results to show that despite its non-convexity, the resulting problem has a bounded duality gap. We propose a practical Primal-Dual algorithm to tackle it, and demonstrate that the proposed approach exhibits competitive average performance in time series forecasting benchmarks, while shaping the distribution of errors across the predicted window.
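The idea of bounding the loss at every forecast step can be sketched with a generic primal-dual loop. The code below is a minimal sketch under strong simplifying assumptions, not the authors' algorithm: it uses a linear forecaster, mean squared error, one dual variable per step, and hypothetical names (`primal_dual_fit`, `eps`, `dual_lr`).

```python
import numpy as np

def primal_dual_fit(X, Y, eps, steps=500, lr=0.05, dual_lr=0.1, seed=0):
    """Fit a linear multi-step forecaster under per-step loss bounds.

    X: (n, d) input windows; Y: (n, T) targets for T forecast steps.
    eps: (T,) user-chosen upper bounds on the per-step MSE (assumed values).
    Returns the weights, the dual variables, and the last per-step losses.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    T = Y.shape[1]
    W = rng.normal(scale=0.01, size=(d, T))
    lam = np.zeros(T)  # one dual variable per forecast step
    for _ in range(steps):
        resid = X @ W - Y                      # (n, T) residuals
        step_loss = (resid ** 2).mean(axis=0)  # per-step MSE, shape (T,)
        # Lagrangian: mean_t loss_t + sum_t lam_t * (loss_t - eps_t),
        # so step t is weighted by (1/T + lam_t) in the primal gradient.
        weights = 1.0 / T + lam
        grad = (2.0 / n) * X.T @ (resid * weights)
        W -= lr * grad                         # primal descent step
        # Dual ascent: raise lam_t while step t violates its bound.
        lam = np.maximum(0.0, lam + dual_lr * (step_loss - eps))
    return W, lam, step_loss
```

Steps whose loss exceeds the bound accumulate a larger multiplier and therefore receive more weight in the next primal update, which is what reshapes the error profile across the window.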

Mutually independent

February 14, 2024

Entropy Jump and Entropic Central Limit Theorem for Independent Sum

Liuquan Yao, Shuai Yuan

from arxiv, 12 pages

This manuscript presents results on the entropic central limit theorem for independent sums under finite Poincaré constant conditions.

Channel · Estimation/Estimator · massive MIMO · Performer · MIMO

February 14, 2024

Lightweight Deep Learning Based Channel Estimation for Extremely Large-Scale Massive MIMO Systems

Shen Gao, Peihao Dong, Zhiwen Pan, Xiaohu You

from arxiv, Accepted by IEEE Transactions on Vehicular Technology

Extremely large-scale massive multiple-input multiple-output (XL-MIMO) systems introduce much higher channel dimensionality and incur an additional near-field propagation effect, aggravating the computation load and the difficulty of acquiring prior knowledge for channel estimation. In this article, an XL-MIMO channel network (XLCNet) is developed to estimate the high-dimensional channel, which is a universal solution for both near-field and far-field users with different channel statistics. Furthermore, a compressed XLCNet (C-XLCNet) is designed via weight pruning and quantization to accelerate model inference as well as to facilitate model storage and transmission. Simulation results show the performance superiority and universality of XLCNet. Compared to XLCNet, C-XLCNet incurs limited performance loss while reducing the computational complexity and model size by about $10 \times$ and $36 \times$, respectively.
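Weight pruning and quantization can be illustrated on a single weight array. The abstract does not say which pruning criterion or bit width C-XLCNet uses, so the sketch below assumes magnitude pruning and symmetric 8-bit uniform quantization; the function names and parameters are hypothetical.

```python
import numpy as np

def prune_by_magnitude(w, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    # Magnitude of the k-th smallest weight; everything at or below it is dropped.
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    return np.where(np.abs(w) > threshold, w, 0.0)

def quantize_uniform(w, bits=8):
    """Symmetric uniform quantization to int8 codes plus one float scale."""
    if not 2 <= bits <= 8:
        raise ValueError("this sketch stores codes as int8, so bits must be in [2, 8]")
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.abs(w).max()
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

# Hypothetical 2x2 weight matrix: prune half the entries, then quantize.
w = np.array([[0.1, -0.5], [0.03, 0.9]])
pruned = prune_by_magnitude(w, sparsity=0.5)
codes, scale = quantize_uniform(pruned)
restored = codes * scale  # dequantized approximation of `pruned`
```

Pruned entries stay exactly zero after quantization, so the zeros can be skipped at inference time while the remaining weights are stored as 8-bit codes.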

MAC · Channel · Criterion · CASE · Example

February 13, 2024

The Multiple-Access Channel with Entangled Transmitters

Uzi Pereg, Christian Deppe, Holger Boche

Communication over a classical multiple-access channel (MAC) with entanglement resources is considered, whereby two transmitters share entanglement resources a priori before communication begins. Leditzky et al. (2020) presented an example of a classical MAC, defined in terms of a pseudo telepathy game, such that the sum rate with entangled transmitters is strictly higher than the best achievable sum rate without such resources. Here, we establish inner and outer bounds on the capacity region for the general MAC with entangled transmitters, and show that the previous result can be obtained as a special case. It has long been known that the capacity region of the classical MAC under a message-average error criterion can be strictly larger than with a maximal error criterion (Dueck, 1978). We observe that given entanglement resources, the regions coincide. Furthermore, we address the combined setting of entanglement resources and conferencing, where the transmitters can also communicate with each other over rate-limited links. Using superdense coding, entanglement can double the conferencing rate.

Language modeling · Large language models · MoDELS · Extensibility · Performer

February 13, 2024

PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models

Haochen Tan, Zhijiang Guo, Zhan Shi, Lu Xu, Zhili Liu, Yunlong Feng, Xiaoguang Li, Yasheng Wang, Lifeng Shang, Qun Liu, Linqi Song

Large Language Models (LLMs) have exhibited remarkable success in long-form context comprehension tasks. However, their capacity to generate long contents, such as reports and articles, remains insufficiently explored. Current benchmarks do not adequately assess LLMs' ability to produce informative and comprehensive content, necessitating a more rigorous evaluation approach. In this study, we introduce ProxyQA, a framework for evaluating long-form text generation, comprising in-depth human-curated meta-questions spanning various domains. Each meta-question contains corresponding proxy-questions with annotated answers. LLMs are prompted to generate extensive content in response to these meta-questions. Utilizing an evaluator and incorporating generated content as background context, ProxyQA evaluates the quality of generated content based on the evaluator's performance in answering the proxy-questions. We examine multiple LLMs, emphasizing ProxyQA's demanding nature as a high-quality assessment tool. Human evaluation demonstrates that evaluating through proxy-questions is a highly self-consistent and human-criteria-correlated validation method. The dataset and leaderboard will be available at //github.com/Namco0816/ProxyQA.
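The scoring idea described above (judge generated text by how well an evaluator answers proxy-questions when given that text as context) can be sketched as follows. This is an illustrative toy, not the ProxyQA implementation: it assumes boolean annotated answers, and the keyword-matching evaluator stands in for the real evaluator model.

```python
from typing import Callable, Sequence, Tuple

def proxy_score(
    generated: str,
    proxy_questions: Sequence[Tuple[str, bool]],
    evaluator: Callable[[str, str], bool],
) -> float:
    """Fraction of proxy-questions the evaluator answers correctly.

    generated: long-form text produced for one meta-question.
    proxy_questions: (question, annotated_answer) pairs; boolean answers
        are an assumption of this sketch.
    evaluator: hypothetical callable (context, question) -> predicted answer.
    """
    if not proxy_questions:
        raise ValueError("at least one proxy-question is required")
    correct = sum(evaluator(generated, q) == gold for q, gold in proxy_questions)
    return correct / len(proxy_questions)

def keyword_evaluator(context: str, question: str) -> bool:
    """Toy stand-in evaluator: answers True if the keyword appears in the context."""
    return question.lower() in context.lower()

# Hypothetical generated text and proxy-questions (keyword, annotated answer).
text = "The transformer uses self-attention and was introduced in 2017."
questions = [("self-attention", True), ("2017", True), ("convolution", False), ("2015", True)]
score = proxy_score(text, questions, keyword_evaluator)
```

A text that omits the information needed for a proxy-question lowers the score, which is how content coverage is measured indirectly.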

Large language models · Question answering · Noise · Performer · Extensibility

February 13, 2024

CABINET: Content Relevance based Noise Reduction for Table Question Answering

Sohan Patnaik, Heril Changwal, Milan Aggarwal, Sumit Bhatia, Yaman Kumar, Balaji Krishnamurthy

from arxiv, Accepted at ICLR 2024 (spotlight)

Table understanding capability of Large Language Models (LLMs) has been extensively studied through the task of question-answering (QA) over tables. Typically, only a small part of the whole table is relevant to derive the answer for a given question. The irrelevant parts act as noise and are distracting information, resulting in sub-optimal performance due to the vulnerability of LLMs to noise. To mitigate this, we propose CABINET (Content RelevAnce-Based NoIse ReductioN for TablE QuesTion-Answering) - a framework to enable LLMs to focus on relevant tabular data by suppressing extraneous information. CABINET comprises an Unsupervised Relevance Scorer (URS), trained differentially with the QA LLM, that weighs the table content based on its relevance to the input question before feeding it to the question-answering LLM (QA LLM). To further aid the relevance scorer, CABINET employs a weakly supervised module that generates a parsing statement describing the criteria of rows and columns relevant to the question and highlights the content of corresponding table cells. CABINET significantly outperforms various tabular LLM baselines, as well as GPT3-based in-context learning methods, is more robust to noise, maintains outperformance on tables of varying sizes, and establishes new SoTA performance on WikiTQ, FeTaQA, and WikiSQL datasets. We release our code and datasets at //github.com/Sohanpatnaik106/CABINET_QA.

6G · Directed · Networking · Networks · Wireless Networks

February 12, 2024

RIS-Empowered LEO Satellite Networks for 6G: Promising Usage Scenarios and Future Directions

Mesut Toka, Byungju Lee, Jaehyup Seong, Aryan Kaushik, Juhwan Lee, Jungwoo Lee, Namyoon Lee, Wonjae Shin, H. Vincent Poor

from arxiv, 18 pages, 5 figures, Paper accepted by IEEE Communications Magazine

Low-Earth orbit (LEO) satellite systems have been deemed a promising key enabler for current 5G and the forthcoming 6G wireless networks. Such LEO satellite constellations can provide worldwide three-dimensional coverage, high data rate, and scalability, thus enabling truly ubiquitous connectivity. On the other hand, another promising technology, reconfigurable intelligent surfaces (RISs), has emerged with favorable features, such as flexible deployment, cost & power efficiency, less transmission delay, noise-free nature, and in-band full-duplex structure. LEO satellite networks have many practical imperfections and limitations; however, exploiting RISs has been shown to be a potential solution to overcome these challenges. Particularly, RISs can enhance link quality, reduce the Doppler shift effect, and mitigate inter-/intra beam interference. In this article, we delve into exploiting RISs in LEO satellite networks. First, we present a holistic overview of LEO satellite communication and RIS technology, highlighting potential benefits and challenges. Second, we describe promising usage scenarios and applications in detail. Finally, we discuss potential future directions and challenges on RIS-empowered LEO networks, offering futuristic visions of the upcoming 6G era.

Learning · MoDELS · Analysis · INTERACT · Online

February 11, 2024

A Theoretical Analysis of Nash Learning from Human Feedback under General KL-Regularized Preference

Chenlu Ye, Wei Xiong, Yuheng Zhang, Nan Jiang, Tong Zhang

from arxiv, RLHF, NLHF, Alignment for LLMs

Reinforcement Learning from Human Feedback (RLHF) learns from the preference signal provided by a probabilistic preference model, which takes a prompt and two responses as input, and produces a score indicating the preference of one response against another. So far, the most popular RLHF paradigm is reward-based, which starts with an initial step of reward modeling, and the constructed reward is then used to provide a reward signal for the subsequent reward optimization stage. However, the existence of a reward function is a strong assumption and the reward-based RLHF is limited in expressivity and cannot capture the real-world complicated human preference. In this work, we provide theoretical insights for a recently proposed learning paradigm, Nash learning from human feedback (NLHF), which considered a general preference model and formulated the alignment process as a game between two competitive LLMs. The learning objective is to find a policy that consistently generates responses preferred over any competing policy while staying close to the initial model. The objective is defined as the Nash equilibrium (NE) of the KL-regularized preference model. We aim to make the first attempt to study the theoretical learnability of the KL-regularized NLHF by considering both offline and online settings. For the offline learning from a pre-collected dataset, we propose algorithms that are efficient under suitable coverage conditions of the dataset. For batch online learning from iterative interactions with a preference oracle, our proposed algorithm enjoys a finite sample guarantee under the structural condition of the underlying preference model. Our results connect the new NLHF paradigm with traditional RL theory, and validate the potential of reward-model-free learning under general preference.
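The KL-regularized objective described above can be written out as a max-min problem. The notation is an assumption chosen for illustration and may differ from the paper's: $d_0$ is the prompt distribution, $\pi_0$ the initial policy, $\eta > 0$ the regularization parameter, and $P(a \succ a' \mid x)$ the probability that response $a$ is preferred over $a'$ for prompt $x$.

```latex
\pi^{*} \in \arg\max_{\pi} \min_{\pi'} \;
\mathbb{E}_{x \sim d_0} \Big[
  \mathbb{E}_{a \sim \pi(\cdot \mid x),\, a' \sim \pi'(\cdot \mid x)}
    \big[ P(a \succ a' \mid x) \big]
  - \eta^{-1} \, \mathrm{KL}\big( \pi(\cdot \mid x) \,\|\, \pi_0(\cdot \mid x) \big)
  + \eta^{-1} \, \mathrm{KL}\big( \pi'(\cdot \mid x) \,\|\, \pi_0(\cdot \mid x) \big)
\Big]
```

The first term rewards responses preferred over those of the competing policy, while the two KL terms keep both players close to the initial model.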

Large language models · Language modeling · MoDELS · Federated learning · Learning

February 10, 2024

OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning

Rui Ye, Wenhao Wang, Jingyi Chai, Dihan Li, Zexi Li, Yinda Xu, Yaxin Du, Yanfeng Wang, Siheng Chen

from arxiv, 28 pages, 3 figures, 16 tables

Trained on massive publicly available data, large language models (LLMs) have demonstrated tremendous success across various fields. While more data contributes to better performance, a disconcerting reality is that high-quality public data will be exhausted in a few years. In this paper, we offer a potential next step for contemporary LLMs: collaborative and privacy-preserving LLM training on the underutilized distributed private data via federated learning (FL), where multiple data owners collaboratively train a shared model without transmitting raw data. To achieve this, we build a concise, integrated, and research-friendly framework/codebase, named OpenFedLLM. It covers federated instruction tuning for enhancing instruction-following capability, federated value alignment for aligning with human values, and 7 representative FL algorithms. Besides, OpenFedLLM supports training on diverse domains, where we cover 8 training datasets; and provides comprehensive evaluations, where we cover 30+ evaluation metrics. Through extensive experiments, we observe that all FL algorithms outperform local training on training LLMs, demonstrating a clear performance improvement across a variety of settings. Notably, in a financial benchmark, Llama2-7B fine-tuned by applying any FL algorithm can outperform GPT-4 by a significant margin while the model obtained through individual training cannot, demonstrating strong motivation for clients to participate in FL. The code is available at //github.com/rui-ye/OpenFedLLM.
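The core step of federated learning, combining locally trained parameters without sharing raw data, can be sketched with weighted averaging in the style of FedAvg. The abstract does not name the seven algorithms it covers, so FedAvg is used here only as a widely known baseline; the function name and the toy arrays are assumptions.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Average client parameters, weighted by local dataset size.

    client_weights: list of arrays with identical shapes, one per client.
    client_sizes: number of local training samples per client.
    """
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()        # each client's share of the total data
    stacked = np.stack(client_weights)  # shape (num_clients, ...)
    # Contract the client axis: sum_k coeffs[k] * stacked[k].
    return np.tensordot(coeffs, stacked, axes=1)

# Hypothetical round with two clients holding 1 and 3 samples.
global_weights = fedavg(
    [np.array([1.0, 2.0]), np.array([3.0, 4.0])],
    client_sizes=[1, 3],
)
```

Only the parameter arrays leave each client; the server never sees the underlying training examples.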

Image classification · Generative adversarial networks · Networking · Unlabeled · GANs

February 10, 2018

Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

Zilong Zhong, Jonathan Li

from arxiv, Accepted by AAAI-18

High spectral dimensionality and the shortage of annotations make hyperspectral image (HSI) classification a challenging problem. Recent studies suggest that convolutional neural networks can learn discriminative spatial features, which play a paramount role in HSI interpretation. However, most of these methods ignore the distinctive spectral-spatial characteristic of hyperspectral data. In addition, a large amount of unlabeled data remains an unexploited gold mine for efficient data use. Therefore, we proposed an integration of generative adversarial networks (GANs) and probabilistic graphical models for HSI classification. Specifically, we used a spectral-spatial generator and a discriminator to identify land cover categories of hyperspectral cubes. Moreover, to take advantage of a large amount of unlabeled data, we adopted a conditional random field to refine the preliminary classification results generated by GANs. Experimental results obtained using two commonly studied datasets demonstrate that the proposed framework achieved encouraging classification accuracy using a small number of data for training.