
We introduce SOAR, a novel self-supervised pretraining algorithm for aerial footage captured by Unmanned Aerial Vehicles (UAVs). We incorporate human object knowledge throughout the pretraining process to enhance UAV video pretraining efficiency and downstream action recognition performance. This is in contrast to prior works that primarily incorporate object information during the fine-tuning stage. Specifically, we first propose a novel object-aware masking strategy designed to retain the visibility of certain patches related to objects throughout the pretraining phase. Second, we introduce an object-aware loss function that utilizes object information to adjust the reconstruction loss, preventing bias towards less informative background patches. In practice, SOAR with a vanilla ViT backbone outperforms the best UAV action recognition models, recording boosts of 9.7% and 21.4% in top-1 accuracy on the NEC-Drone and UAV-Human datasets, respectively, while delivering an inference speed of 18.7ms per video, making it 2x to 5x faster. Additionally, SOAR obtains comparable accuracy to prior self-supervised learning (SSL) methods while requiring 87.5% less pretraining time and 25% less memory usage.
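The two ideas in the abstract, object-aware masking and an object-aware reconstruction loss, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the `keep_object_frac` and `object_weight` parameters, and the simple weighting scheme are hypothetical and are not taken from the SOAR paper.

```python
import random


def object_aware_mask(num_patches, object_patches, mask_ratio=0.9,
                      keep_object_frac=0.5, seed=0):
    """Return the sorted indices of masked patches.

    Assumption: a fraction of the object patches is protected from masking,
    and the mask budget is drawn uniformly from all remaining patches.
    """
    rng = random.Random(seed)
    objects = sorted(set(object_patches))
    num_masked = round(num_patches * mask_ratio)
    num_visible = num_patches - num_masked
    # Never protect more patches than the visible budget allows.
    num_protected = min(round(len(objects) * keep_object_frac), num_visible)
    protected = set(rng.sample(objects, num_protected))
    candidates = [i for i in range(num_patches) if i not in protected]
    return sorted(rng.sample(candidates, num_masked))


def object_aware_loss(per_patch_errors, masked, object_patches,
                      object_weight=2.0):
    """Weighted mean reconstruction error over masked patches.

    Assumption: object patches simply get a larger weight than background.
    """
    objects = set(object_patches)
    weights = [object_weight if i in objects else 1.0 for i in masked]
    total = sum(w * per_patch_errors[i] for w, i in zip(weights, masked))
    return total / sum(weights)


# Usage: a 14x14 patch grid where patches 40..59 overlap a detected person.
masked = object_aware_mask(196, range(40, 60))
errors = [1.0] * 196
loss = object_aware_loss(errors, masked, range(40, 60))
```

With uniform per-patch errors the weighted mean equals the plain mean, so the weighting only matters when object and background patches reconstruct differently.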

Related content

INFORMS

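The INFORMS Journal on Computing publishes high-quality papers that expand the scope of operations research and computing. It seeks original research papers on theory, methods, experiments, systems, and applications; novel survey and tutorial papers; and papers describing new and useful software tools.

Online · Language modeling · Large language models · Optimization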

November 5, 2024

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment

Shenao Zhang,Donghan Yu,Hiteshi Sharma,Han Zhong,Zhihan Liu,Ziyi Yang,Shuohang Wang,Hany Hassan,Zhaoran Wang

Preference optimization, particularly through Reinforcement Learning from Human Feedback (RLHF), has achieved significant success in aligning Large Language Models (LLMs) to adhere to human intentions. Unlike offline alignment with a fixed dataset, online feedback collection from humans or AI on model generations typically leads to more capable reward models and better-aligned LLMs through an iterative process. However, achieving a globally accurate reward model requires systematic exploration to generate diverse responses that span the vast space of natural language. Random sampling from standard reward-maximizing LLMs alone is insufficient to fulfill this requirement. To address this issue, we propose a bilevel objective optimistically biased towards potentially high-reward responses to actively explore out-of-distribution regions. By solving the inner-level problem with the reparameterized reward function, the resulting algorithm, named Self-Exploring Language Models (SELM), eliminates the need for a separate RM and iteratively updates the LLM with a straightforward objective. Compared to Direct Preference Optimization (DPO), the SELM objective reduces indiscriminate favor of unseen extrapolations and enhances exploration efficiency. Our experimental results demonstrate that when fine-tuned on Zephyr-7B-SFT and Llama-3-8B-Instruct models, SELM significantly boosts the performance on instruction-following benchmarks such as MT-Bench and AlpacaEval 2.0, as well as various standard academic benchmarks in different settings. Our code and models are available at //github.com/shenao-zhang/SELM.
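For context, the Direct Preference Optimization (DPO) baseline that the abstract compares against can be written per preference pair as the negative log-sigmoid of a scaled log-ratio margin. The sketch below shows only that standard DPO loss; it is not the SELM objective, whose optimism term is not specified in this abstract. The function and argument names are hypothetical.

```python
import math


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected,
             beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Inputs are sequence log-probabilities under the policy and under the
    frozen reference model. This is the baseline, not SELM itself.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)) == log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))


# Usage: the policy prefers the chosen response more than the reference does,
# so the loss falls below log(2), its value at zero margin.
loss = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```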

Infinite · Minimum point · binary · Paper · IEEE

November 5, 2024

Self-Dual Cyclic Codes with Square-Root-Like Lower Bounds on Their Minimum Distances

Hao Chen,Cunsheng Ding

from arxiv, 20 pages

Binary self-dual cyclic codes have been studied since the classical work of Sloane and Thompson published in IEEE Trans. Inf. Theory, vol. 29, 1983. Twenty-five years later, an infinite family of binary self-dual cyclic codes with lengths $n_i$ and minimum distances $d_i \geq \frac{1}{2} \sqrt{n_i+2}$ was presented in a paper in IEEE Trans. Inf. Theory, vol. 55, 2009. However, no infinite family of Euclidean self-dual binary cyclic codes whose minimum distances have the square-root lower bound and no infinite family of Euclidean self-dual nonbinary cyclic codes whose minimum distances have a lower bound better than the square-root lower bound are known in the literature. In this paper, an infinite family of Euclidean self-dual cyclic codes over the fields ${\bf F}_{2^s}$ with a square-root-like lower bound is constructed. An infinite subfamily of this family consists of self-dual binary cyclic codes with the square-root lower bound. Another infinite subfamily of this family consists of self-dual cyclic codes over the fields ${\bf F}_{2^s}$ with a lower bound better than the square-root bound for $s \geq 2$. Consequently, two breakthroughs in coding theory are made in this paper. An infinite family of self-dual binary cyclic codes with a square-root-like lower bound is also presented in this paper. An infinite family of Hermitian self-dual cyclic codes over the fields ${\bf F}_{2^{2s}}$ with a square-root-like lower bound and an infinite family of Euclidean self-dual linear codes over ${\bf F}_{q}$ with $q \equiv 1 \pmod{4}$ with a square-root-like lower bound are also constructed in this paper.

INTERACT · Performer · Language modeling · Taxonomy · state-of-the-art

November 4, 2024

AmbigNLG: Addressing Task Ambiguity in Instruction for NLG

Ayana Niwa,Hayate Iso

from arxiv, EMNLP 2024 (main)

We introduce AmbigNLG, a novel task designed to tackle the challenge of task ambiguity in instructions for Natural Language Generation (NLG). Ambiguous instructions often impede the performance of Large Language Models (LLMs), especially in complex NLG tasks. To tackle this issue, we propose an ambiguity taxonomy that categorizes different types of instruction ambiguities and refines initial instructions with clearer specifications. Accompanying this task, we present AmbigSNI-NLG, a dataset comprising 2,500 instances annotated to facilitate research in AmbigNLG. Through comprehensive experiments with state-of-the-art LLMs, we demonstrate that our method significantly enhances the alignment of generated text with user expectations, achieving up to a 15.02-point increase in ROUGE scores. Our findings highlight the critical importance of addressing task ambiguity to fully harness the capabilities of LLMs in NLG tasks. Furthermore, we confirm the effectiveness of our method in practical settings involving interactive ambiguity mitigation with users, underscoring the benefits of leveraging LLMs for interactive clarification.

Named entity recognition · entity · Language modeling · Examples · Annotation

November 4, 2024

ReverseNER: A Self-Generated Example-Driven Framework for Zero-Shot Named Entity Recognition with Large Language Models

Anbang Wang

This paper presents ReverseNER, a framework aimed at overcoming the limitations of large language models (LLMs) in zero-shot Named Entity Recognition (NER) tasks, particularly in cases where certain entity types have ambiguous boundaries. ReverseNER tackles this challenge by constructing a reliable example library with the reversed process of NER. Rather than beginning with sentences, this method uses an LLM to generate entities based on their definitions and then expands them into full sentences. During sentence generation, the LLM is guided to replicate the structure of a specific 'feature sentence', extracted from the task sentences by clustering. This results in well-annotated sentences with clearly labeled entities, while preserving semantic and structural similarity to the task sentences. Once the example library is constructed, the method selects the most semantically similar example labels for each task sentence to support the LLM's inference. We also propose an entity-level self-consistency scoring mechanism to improve NER performance with LLMs. Experiments show that ReverseNER significantly outperforms traditional zero-shot NER with LLMs and surpasses several few-shot methods, marking a notable improvement in NER for domains with limited labeled data.
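The retrieval step described above, selecting the most semantically similar library examples for each task sentence, can be sketched with cosine similarity over sentence embeddings. This is an illustrative assumption: the abstract does not say which similarity measure or embedding model is used, and the names `cosine` and `top_k_examples` and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_examples(task_vec, library, k=2):
    """Return the k library sentences most similar to the task sentence.

    `library` is a list of (annotated_sentence, embedding) pairs; the
    embeddings are assumed to come from some external sentence encoder.
    """
    ranked = sorted(library, key=lambda item: cosine(task_vec, item[1]),
                    reverse=True)
    return [sentence for sentence, _ in ranked[:k]]


# Usage with toy 2-D embeddings standing in for real sentence vectors.
library = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
selected = top_k_examples([1.0, 0.1], library, k=2)
```

The selected sentences would then be placed in the prompt as few-shot demonstrations for the task sentence.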

Inference · CoT · Performer · Reducible · Better

November 4, 2024

Nash CoT: Multi-Path Inference with Preference Equilibrium

Ziqi Zhang,Cunxiang Wang,Xiong Xiao,Yue Zhang,Donglin Wang

Chain of thought (CoT) is a reasoning framework that can enhance the performance of Large Language Models (LLMs) on complex inference tasks. Among the various studies related to CoT, multi-path inference stands out as a simple yet effective improvement. However, there is no optimal setting for the number of inference paths, so obtaining better results means increasing the number of paths, which in turn increases the inference cost. To address this limitation, question-related role templates can guide LLMs into relevant roles, increasing the likelihood of a correct inference on each path and reducing dependence on the number of inference paths while improving reasoning accuracy. However, placing LLMs into specific roles may reduce their reasoning diversity and their performance on the few tasks where role dependence is low. To alleviate excessive immersion of the LLM in a specific role, we propose Nash CoT, which constructs a competitive system on each path that balances the generation of the role-specific LLM against that of the general LLM. This ensures both effective role adoption and diversity in LLM generation, maintaining the performance of multi-path inference while reducing the number of inference paths required. We evaluate Nash CoT across various inference tasks, including Arabic Reasoning, Commonsense Question Answering, and Symbolic Inference, achieving results that are comparable to or better than those of multi-path CoT with an equal number of inference paths.
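As background, the plain multi-path baseline mentioned in the abstract is commonly implemented as a majority vote over the final answers of independently sampled reasoning paths. The sketch below shows only that baseline aggregation, under the assumption that each path has already been reduced to a final answer string; it does not implement the role templates or the preference equilibrium of Nash CoT.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer and the share of paths that agree."""
    counts = Counter(answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)


# Usage: five sampled reasoning paths, three of which agree on "42".
answer, agreement = majority_vote(["42", "41", "42", "42", "17"])
```

More paths make the vote more stable but multiply the inference cost, which is the trade-off the abstract sets out to reduce.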

search engine · Engineering · Language modeling · MoDELS · Large language models

November 2, 2024

LLM4PR: Improving Post-Ranking in Search Engine with Large Language Models

Yang Yan,Yihao Wang,Chi Zhang,Wenyuan Hou,Kang Pan,Xingkai Ren,Zelun Wu,Zhixin Zhai,Enyun Yu,Wenwu Ou,Yang Song

Alongside the rapid development of Large Language Models (LLMs), there has been a notable increase in efforts to integrate LLM techniques into information retrieval (IR) and search engines (SE). Recently, an additional post-ranking stage has been suggested in SE to enhance user satisfaction in practical applications. Nevertheless, the use of LLMs to enhance the post-ranking stage remains largely unexplored. In this study, we introduce a novel paradigm named Large Language Models for Post-Ranking in search engine (LLM4PR), which leverages the capabilities of LLMs to accomplish the post-ranking task in SE. Concretely, a Query-Instructed Adapter (QIA) module is designed to derive the user/item representation vectors by incorporating their heterogeneous features. A feature adaptation step is further introduced to align the semantics of user/item representations with the LLM. Finally, LLM4PR integrates a learning to post-rank step, leveraging both a main task and an auxiliary task to fine-tune the model to adapt to the post-ranking task. Experimental studies demonstrate that the proposed framework leads to significant improvements and exhibits state-of-the-art performance compared with other alternatives.

Learning · Noise · MoDELS · Federated learning · ML

November 2, 2024

Privacy-Preserving Federated Learning with Differentially Private Hyperdimensional Computing

Fardin Jalil Piran,Zhiling Chen,Mohsen Imani,Farhad Imani

from arxiv, 28 Pages, 10 Figures

Federated Learning (FL) is essential for efficient data exchange in Internet of Things (IoT) environments, as it trains Machine Learning (ML) models locally and shares only model updates. However, FL is vulnerable to privacy threats like model inversion and membership inference attacks, which can expose sensitive training data. To address these privacy concerns, Differential Privacy (DP) mechanisms are often applied. Yet, adding DP noise to black-box ML models degrades performance, especially in dynamic IoT systems where continuous, lifelong FL accumulates excessive noise over time. To mitigate this issue, we introduce Federated HyperDimensional computing with Privacy-preserving (FedHDPrivacy), an eXplainable Artificial Intelligence (XAI) framework that combines the neuro-symbolic paradigm with DP. FedHDPrivacy carefully manages the balance between privacy and performance by theoretically tracking cumulative noise from previous rounds and adding only the necessary incremental noise to meet privacy requirements. In a real-world case study involving in-process monitoring of manufacturing machining operations, FedHDPrivacy demonstrates robust performance, outperforming standard FL frameworks, including Federated Averaging (FedAvg), Federated Stochastic Gradient Descent (FedSGD), Federated Proximal (FedProx), Federated Normalized Averaging (FedNova), and Federated Adam (FedAdam), by up to 38%. FedHDPrivacy also shows potential for future enhancements, such as multimodal data fusion.
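The idea of tracking cumulative noise and adding only the missing increment can be illustrated with a toy calculation. The sketch assumes independent zero-mean Gaussian noise, whose variances add, and a hypothetical per-round required standard deviation; the actual privacy accounting in FedHDPrivacy is not given in this abstract and may differ.

```python
import math


def incremental_noise_std(required_std, accumulated_std):
    """Std of the extra Gaussian noise needed to reach `required_std`.

    Assumption: noise terms are independent, so variances add. If the
    accumulated noise already meets the requirement, nothing is added.
    """
    gap = required_std ** 2 - accumulated_std ** 2
    return math.sqrt(gap) if gap > 0 else 0.0


def track_rounds(required_stds):
    """Return the increment added in each round for a list of requirements."""
    accumulated_var = 0.0
    increments = []
    for required in required_stds:
        extra = incremental_noise_std(required, math.sqrt(accumulated_var))
        accumulated_var += extra ** 2
        increments.append(extra)
    return increments


# Usage: round requirements of 3, 5, and 4 need increments of 3, 4, and 0.
increments = track_rounds([3.0, 5.0, 4.0])
```

Adding the full required noise in every round would instead pile up variance that the privacy target never asked for, which is the degradation the abstract describes.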

MAPO · Prompt · Optimizer · Language modeling · Extensibility

November 1, 2024

Introducing MAPO: Momentum-Aided Gradient Descent Prompt Optimization

Anthony Cui,Pranav Nandyalam,Ethan Cheung,Kevin Zhu

Momentum-Aided Prompt Optimization (MAPO) enhances the efficiency and efficacy of prompt optimization for Large Language Models (LLMs). Building on ProTeGi, MAPO uses positive natural language "gradients" and a momentum-based extension to refine prompts effectively. By tracking gradient history, MAPO avoids local minima and oscillations. It also utilizes beam search and an Upper Confidence Bound (UCB) algorithm for balanced candidate expansion and selection. Benchmark testing shows that MAPO achieves faster convergence with fewer API calls and higher F1 scores than ProTeGi, making it a robust and scalable solution for automated prompt engineering in LLMs.
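The Upper Confidence Bound selection mentioned above can be sketched as a generic UCB rule over candidate prompts. This is the textbook bandit formula, not MAPO's exact procedure: the `stats` layout, the exploration constant `c`, and the function name are assumptions for illustration.

```python
import math


def ucb_select(stats, total_evals, c=1.0):
    """Pick the candidate prompt with the highest upper confidence bound.

    `stats` maps each prompt to (mean_score, num_evals). Prompts that have
    never been evaluated are tried first.
    """
    def bound(item):
        _, (mean, n) = item
        if n == 0:
            return math.inf
        return mean + c * math.sqrt(math.log(total_evals) / n)

    return max(stats.items(), key=bound)[0]


# Usage: "p2" has a lower mean than "p1" but far fewer evaluations, so its
# exploration bonus is larger and it is selected next.
stats = {"p1": (0.8, 50), "p2": (0.7, 5)}
chosen = ucb_select(stats, total_evals=55)
```

The bonus term shrinks as a prompt accumulates evaluations, so scoring calls concentrate on candidates that are either strong or still uncertain.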

Knowledge · 3D · Vision · Guangdong-Hong Kong-Macao Greater Bay Area Digital Economy Research Institute · Robotics

November 1, 2024

ConceptFactory: Facilitate 3D Object Knowledge Annotation with Object Conceptualization

Jianhua Sun,Yuxuan Li,Longfei Xu,Nange Wang,Jiude Wei,Yining Zhang,Cewu Lu

from arxiv, NeurIPS 2024 Track on Datasets and Benchmarks

We present ConceptFactory, a novel scope to facilitate more efficient annotation of 3D object knowledge by recognizing 3D objects through generalized concepts (i.e., object conceptualization), with the aim of enabling machine intelligence to learn comprehensive object knowledge from both vision and robotics aspects. This idea originates from the findings in human cognition research that the perceptual recognition of objects can be explained as a process of arranging generalized geometric components (e.g., cuboids and cylinders). ConceptFactory consists of two critical parts: i) ConceptFactory Suite, a unified toolbox that adopts Standard Concept Template Library (STL-C) to drive a web-based platform for object conceptualization, and ii) ConceptFactory Asset, a large collection of conceptualized objects acquired using ConceptFactory Suite. Our approach enables researchers to effortlessly acquire or customize extensive varieties of object knowledge to comprehensively study different object understanding tasks. We validate our idea on a wide range of benchmark tasks from both vision and robotics aspects with state-of-the-art algorithms, demonstrating the high quality and versatility of annotations provided by our approach. Our website is available at //apeirony.github.io/ConceptFactory.

Transformation · Controller · Bounding box · Processing (programming language) · Denoising

November 1, 2024

GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation

Phillip Y. Lee,Taehoon Yoon,Minhyuk Sung

from arxiv, Accepted to NeurIPS 2024. Project Page: //groundit-diffusion.github.io/

We introduce GrounDiT, a novel training-free spatial grounding technique for text-to-image generation using Diffusion Transformers (DiT). Spatial grounding with bounding boxes has gained attention for its simplicity and versatility, allowing for enhanced user control in image generation. However, prior training-free approaches often rely on updating the noisy image during the reverse diffusion process via backpropagation from custom loss functions, which frequently struggle to provide precise control over individual bounding boxes. In this work, we leverage the flexibility of the Transformer architecture, demonstrating that DiT can generate noisy patches corresponding to each bounding box, fully encoding the target object and allowing for fine-grained control over each region. Our approach builds on an intriguing property of DiT, which we refer to as semantic sharing. Due to semantic sharing, when a smaller patch is jointly denoised alongside a generatable-size image, the two become semantic clones. Each patch is denoised in its own branch of the generation process and then transplanted into the corresponding region of the original noisy image at each timestep, resulting in robust spatial grounding for each bounding box. In our experiments on the HRS and DrawBench benchmarks, we achieve state-of-the-art performance compared to previous training-free approaches.
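The transplantation step, copying a separately denoised patch into its bounding-box region of the main noisy image at each timestep, can be sketched on plain 2-D grids. This is a deliberately simplified illustration: real inputs would be latent tensors, the patch is assumed to already match the box size, and the joint denoising that produces the patch is omitted. The function name and the `(top, left, bottom, right)` box convention are hypothetical.

```python
def transplant(image, patch, box):
    """Return a copy of `image` with `patch` written into `box`.

    `box` is (top, left, bottom, right) with exclusive bottom/right, and
    `patch` is assumed to have exactly the box's height and width.
    """
    top, left, bottom, right = box
    out = [row[:] for row in image]  # copy rows; leave the input untouched
    for r in range(top, bottom):
        out[r][left:right] = patch[r - top]
    return out


# Usage: place a 2x2 patch into the centre of a 4x4 grid.
image = [[0] * 4 for _ in range(4)]
patch = [[1, 1], [1, 1]]
result = transplant(image, patch, (1, 1, 3, 3))
```

In the setting the abstract describes, such a copy would be repeated at every denoising timestep, once per bounding box.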