
In this work, we present a new visual prompting method called 3DAxiesPrompts (3DAP) to unleash the capabilities of GPT-4V in performing 3D spatial tasks. Our investigation reveals that while GPT-4V is proficient at discerning the positions and interrelations of 2D entities through current visual prompting techniques, its ability to handle 3D spatial tasks has yet to be explored. In our approach, we create a 3D coordinate system tailored to 3D imagery, complete with annotated scale information. By presenting images overlaid with the 3DAP visual prompt as inputs, we enable GPT-4V to determine the spatial position of the given 3D target image with a high degree of precision. Through experiments, we identified three tasks that can be completed stably using the 3DAP method: 2D-to-3D point reconstruction, 2D-to-3D point matching, and 3D object detection. We perform experiments on our proposed dataset, 3DAP-Data; the results validate the efficacy of 3DAP-enhanced GPT-4V inputs, marking a significant stride in 3D spatial task execution.

Related content

GPT-4V


Analysis · Lecture notes · Directed · Robustness · Design

February 5, 2024

Augmenting Security and Privacy in the Virtual Realm: An Analysis of Extended Reality Devices

Derin Cayir,Abbas Acar,Riccardo Lazzeretti,Marco Angelini,Mauro Conti,Selcuk Uluagac

From arXiv: This is the author's version of the work. It is posted here for personal/educational use only. The definitive version was published in IEEE Security & Privacy Magazine, Jan/Feb 2024.

In this work, we present a device-centric analysis of security and privacy attacks and defenses on Extended Reality (XR) devices, highlighting the need for robust and privacy-aware security mechanisms. Based on our analysis, we present future research directions and propose design considerations to help ensure the security and privacy of XR devices.

Recall · MoDELS · Similarity · Semantic similarity · Language modeling

February 5, 2024

Multi-Lingual Malaysian Embedding: Leveraging Large Language Models for Semantic Representations

Husein Zolkepli,Aisyah Razak,Kamarul Adha,Ariff Nazhan

In this work, we present a comprehensive exploration of finetuning Malaysian language models, specifically Llama2 and Mistral, on embedding tasks involving negative and positive pairs. We release two distinct models tailored for Semantic Similarity and Retrieval-Augmented Generation (RAG). For Semantic Similarity, our 600 million parameter Llama2 model outperforms OpenAI text-embedding-ada-002 across all recall@k metrics for the b.cari.com.my, c.cari.com.my, Malay news, and Malaysian Twitter test sets. For RAG, our approach proves competitive with OpenAI text-embedding-ada-002 in the Malaysian context. Notably, our 2 billion parameter Llama2 model achieves superior Recall@5 and Recall@10 on the "Melayu" keyword research papers dataset and excels in Recall@3, Recall@5, and Recall@10 on the lom.agc.gov.my dataset. These findings underscore the effectiveness of our finetuning strategy and highlight the performance gains in both Semantic Similarity and RAG tasks. All models are released at //huggingface.co/collections/mesolitica/malaysian-embedding-6523612bfe5881ad35f81b99
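The abstract reports recall@k without defining it. As a rough illustration only, the sketch below uses the common retrieval definition: the fraction of relevant documents that appear among the top k retrieved results. The function names and the toy document IDs are hypothetical, and the paper's own evaluation code may differ.

```python
from typing import Hashable, Sequence, Set


def recall_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k of a ranked list."""
    if not relevant:
        raise ValueError("relevant set must be non-empty")
    top_k = set(ranked[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(rankings, relevants, k: int) -> float:
    """Average recall@k over several queries (one ranking and one relevant set per query)."""
    scores = [recall_at_k(r, rel, k) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores)


# Toy example with made-up document IDs (not data from the paper).
ranking = ["d3", "d1", "d7", "d2"]
relevant_docs = {"d1", "d2"}
print(recall_at_k(ranking, relevant_docs, 3))
print(recall_at_k(ranking, relevant_docs, 4))
```

Under this definition, increasing k can only keep recall the same or raise it, which is why results are usually reported at several cutoffs such as 3, 5, and 10.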

Performer · Pair · Closed-form · ML · Channel

February 5, 2024

On the Performance of RIS-Aided Spatial Modulation for Downlink Transmission

Xusheng Zhu,Qingqing Wu,Wen Chen

In this study, we explore the performance of a reconfigurable reflecting surface (RIS)-assisted transmit spatial modulation (SM) system for downlink transmission, wherein the RIS is deployed to provide blind-area coverage within the channel. At the receiving end, we present three detectors to recover the transmitted signal: the maximum likelihood (ML) detector, the two-stage ML detector, and the greedy detector. Using the ML detector, we first derive the conditional pair error probability expression for the proposed scheme. Subsequently, we leverage the central limit theorem (CLT) to obtain the probability density function of the combined channel. Following this, the Gauss-Chebyshev quadrature method is applied to derive a closed-form expression for the unconditional pair error probability and to establish a tight union upper bound on the average bit error probability (ABEP). Furthermore, we derive a closed-form expression for the ergodic capacity of the proposed RIS-SM scheme. Monte Carlo simulations are conducted not only to assess the complexity and reliability of the three detection algorithms but also to validate the theoretical derivations.
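The abstract does not state which integral the quadrature is applied to. For orientation, the sketch below shows the generic Gauss-Chebyshev rule of the first kind, which approximates the integral of f(x) / sqrt(1 - x^2) over [-1, 1] by an equally weighted sum over Chebyshev nodes. The function name and test integrand are illustrative assumptions, not the paper's derivation.

```python
import math
from typing import Callable


def gauss_chebyshev(f: Callable[[float], float], n: int) -> float:
    """Approximate the integral of f(x) / sqrt(1 - x^2) over [-1, 1] using n nodes."""
    total = 0.0
    for i in range(1, n + 1):
        # Chebyshev nodes of the first kind; every node carries the weight pi / n.
        node = math.cos((2 * i - 1) * math.pi / (2 * n))
        total += f(node)
    return math.pi / n * total


# Illustrative check: the integral of x^2 / sqrt(1 - x^2) over [-1, 1] equals pi / 2.
approx = gauss_chebyshev(lambda x: x * x, 4)
print(approx, math.pi / 2)
```

Because the rule turns an integral into a finite sum of function evaluations, it is a common route to closed-form approximations of error probabilities that otherwise require numerical integration.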

Correlation coefficient · Learning · Generalization theory · MoDELS · Regularization

February 5, 2024

Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate

Can Jin,Tong Che,Hongwu Peng,Yiyuan Li,Marco Pavone

Generalization remains a central challenge in machine learning. In this work, we propose Learning from Teaching (LoT), a novel regularization technique for deep neural networks to enhance generalization. Inspired by the human ability to capture concise and abstract patterns, we hypothesize that generalizable correlations are easier to teach. LoT operationalizes this concept to improve the generalization of the main model with auxiliary student learners. The student learners are trained by the main model and, by providing feedback, help the main model capture more generalizable and teachable correlations. Our experimental results across several domains, including Computer Vision, Natural Language Processing, and Reinforcement Learning, demonstrate that introducing LoT brings significant benefits compared to merely training models on the original training data. This suggests that LoT is effective at identifying generalizable information without falling into the swamp of complex patterns in data, making LoT a valuable addition to current machine learning frameworks.

Controller · Quadratic programming · Learning · MoDELS · Performer

February 3, 2024

Bridging the Gaps: Learning Verifiable Model-Free Quadratic Programming Controllers Inspired by Model Predictive Control

Yiwen Lu,Zishuo Li,Yihan Zhou,Na Li,Yilin Mo

In this paper, we introduce a new class of parameterized controllers, drawing inspiration from Model Predictive Control (MPC). The controller resembles a Quadratic Programming (QP) solver of a linear MPC problem, with the parameters of the controller being trained via Deep Reinforcement Learning (DRL) rather than derived from system models. This approach addresses the limitations, in terms of verifiability and performance guarantees, of common DRL controllers built on Multi-Layer Perceptron (MLP) or other general neural network architectures, and the learned controllers possess verifiable properties like persistent feasibility and asymptotic stability akin to MPC. At the same time, numerical examples illustrate that the proposed controller empirically matches MPC and MLP controllers in terms of control performance and has superior robustness against modeling uncertainty and noise. Furthermore, the proposed controller is significantly more computationally efficient than MPC and requires fewer parameters to learn than MLP controllers. Real-world experiments on a vehicle drift maneuvering task demonstrate the potential of these controllers for robotics and other demanding control tasks.

Directed · Learning · Estimation/Estimator · Robustness · Deep learning

February 2, 2024

Visual Gyroscope: Combination of Deep Learning Features and Direct Alignment for Panoramic Stabilization

Bruno Berenguel-Baeta,Antoine N. Andre,Guillaume Caron,Jesus Bermudez-Cameo,Jose J. Guerrero

In this article we present a visual gyroscope based on equirectangular panoramas. We propose a new pipeline where we take advantage of combining three different methods to obtain a robust and accurate estimation of the attitude of the camera. We quantitatively and qualitatively validate our method on two image sequences taken with a $360^\circ$ dual-fisheye camera mounted on different aerial vehicles.

Approximation · Optimizer · Approximation error · Decay · Functional

February 2, 2024

On the Approximation of Operator-Valued Riccati Equations in Hilbert Spaces

James Cheung

From arXiv: Revision 4

In this work, we present an abstract theory for the approximation of operator-valued Riccati equations posed on Hilbert spaces. It is demonstrated here, under the assumption of compactness in the coefficient operators, that the error of the approximate solution to the operator-valued Riccati equation is bounded above by the approximation error of the governing semigroup. One significant outcome of this result is the correct prediction of optimal convergence for finite element approximations of the operator-valued Riccati equations when the governing semigroup involves parabolic as well as hyperbolic processes. We derive the abstract theory for the time-dependent and time-independent operator-valued Riccati equations in the first part of this work. In the second part, we prove optimal convergence rates for the finite element approximation of the functional gain associated with model one-dimensional weakly damped wave and thermal LQR control systems. These theoretical claims are then corroborated with computational evidence.

NeRF · INFORMS · 3D · Regularization · Regularization term

February 1, 2024

ViCA-NeRF: View-Consistency-Aware 3D Editing of Neural Radiance Fields

Jiahua Dong,Yu-Xiong Wang

From arXiv: NeurIPS 2023; project page: //github.com/Dongjiahua/VICA-NeRF

We introduce ViCA-NeRF, the first view-consistency-aware method for 3D editing with text instructions. In addition to the implicit neural radiance field (NeRF) modeling, our key insight is to exploit two sources of regularization that explicitly propagate the editing information across different views, thus ensuring multi-view consistency. For geometric regularization, we leverage the depth information derived from NeRF to establish image correspondences between different views. For learned regularization, we align the latent codes in the 2D diffusion model between edited and unedited images, enabling us to edit key views and propagate the update throughout the entire scene. Incorporating these two strategies, our ViCA-NeRF operates in two stages. In the initial stage, we blend edits from different views to create a preliminary 3D edit. This is followed by a second stage of NeRF training, dedicated to further refining the scene's appearance. Experimental results demonstrate that ViCA-NeRF provides more flexible and efficient (3 times faster) editing with higher levels of consistency and detail than the state of the art. Our code is publicly available.

Backward · Learning · Estimation/Estimator · Constraint · Forward

February 1, 2024

ODICE: Revealing the Mystery of Distribution Correction Estimation via Orthogonal-gradient Update

Liyuan Mao,Haoran Xu,Weinan Zhang,Xianyuan Zhan

From arXiv: Spotlight @ ICLR 2024; the first two authors contributed equally

In this study, we investigate the DIstribution Correction Estimation (DICE) methods, an important line of work in offline reinforcement learning (RL) and imitation learning (IL). DICE-based methods impose a state-action-level behavior constraint, which is an ideal choice for offline learning. However, they typically perform much worse than current state-of-the-art (SOTA) methods that solely use an action-level behavior constraint. After revisiting DICE-based methods, we find that two gradient terms exist when learning the value function using the true-gradient update: the forward gradient (taken on the current state) and the backward gradient (taken on the next state). Using the forward gradient bears a large similarity to many offline RL methods, and thus can be regarded as applying an action-level constraint. However, directly adding the backward gradient may degenerate or cancel out its effect if these two gradients have conflicting directions. To resolve this issue, we propose a simple yet effective modification that projects the backward gradient onto the normal plane of the forward gradient, resulting in an orthogonal-gradient update, a new learning rule for DICE-based methods. We conduct thorough theoretical analyses and find that the projected backward gradient brings state-level behavior regularization, which reveals the mystery of DICE-based methods: the value learning objective does try to impose a state-action-level constraint, but needs to be used in a corrected way. Through toy examples and extensive experiments on complex offline RL and IL tasks, we demonstrate that DICE-based methods using orthogonal-gradient updates (O-DICE) achieve SOTA performance and great robustness.
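The projection step described in the abstract can be sketched with plain vectors. Projecting the backward gradient onto the normal plane of the forward gradient means subtracting its component along the forward gradient, so the result is orthogonal to it. The function names and the weighting factor `eta` are assumptions made for illustration; the abstract does not specify how the two terms are combined or scaled.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_onto_normal_plane(backward: Sequence[float],
                              forward: Sequence[float]) -> List[float]:
    """Remove from `backward` its component along `forward`."""
    norm_sq = dot(forward, forward)
    if norm_sq == 0.0:
        # No direction to project against; leave the backward gradient unchanged.
        return list(backward)
    scale = dot(backward, forward) / norm_sq
    return [b - scale * f for b, f in zip(backward, forward)]


def orthogonal_gradient_update(forward: Sequence[float],
                               backward: Sequence[float],
                               eta: float = 1.0) -> List[float]:
    """Forward gradient plus eta times the projected backward gradient (eta is hypothetical)."""
    projected = project_onto_normal_plane(backward, forward)
    return [f + eta * p for f, p in zip(forward, projected)]


# Toy vectors with conflicting directions along the first axis (not values from the paper).
g_forward = [1.0, 0.0]
g_backward = [-2.0, 3.0]
print(project_onto_normal_plane(g_backward, g_forward))
print(orthogonal_gradient_update(g_forward, g_backward))
```

In the toy case, the backward gradient's component that opposes the forward gradient is removed, so adding the projected term can no longer cancel the forward gradient's effect.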

entity · MINE · Reducible · Normalized · Entity alignment

March 29, 2021

Boosting the Speed of Entity Alignment 10*: Dual Attention Matching Network with Normalized Hard Sample Mining

Xin Mao,Wenting Wang,Yuanbin Wu,Man Lan

From arXiv: 12 pages; accepted by TheWebConf (WWW) 2021

Seeking equivalent entities among multi-source Knowledge Graphs (KGs), also known as entity alignment (EA), is the pivotal step in KG integration. However, most existing EA methods are inefficient and scale poorly. A recent summary points out that some of them even require several days to process a dataset containing 200,000 nodes (DWY100K). We believe that over-complex graph encoders and inefficient negative sampling strategies are the two main reasons. In this paper, we propose a novel KG encoder, Dual Attention Matching Network (Dual-AMN), which not only models both intra-graph and cross-graph information smartly, but also greatly reduces computational complexity. Furthermore, we propose the Normalized Hard Sample Mining Loss to smoothly select hard negative samples with reduced loss shift. Experimental results on widely used public datasets indicate that our method achieves both high accuracy and high efficiency. On DWY100K, the whole running process of our method finishes in 1,100 seconds, at least 10 times faster than previous work. Our method also outperforms previous work across all datasets, with Hits@1 and MRR improved by 6% to 13%.
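Hits@1 and MRR are named in the abstract but not defined there. The sketch below uses their usual ranking definitions, assuming each test entity comes with the 1-based rank at which its correct counterpart was retrieved. The function names and the toy ranks are hypothetical and are not results from the paper.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Share of queries whose correct match is ranked within the top k (ranks are 1-based)."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1 / rank over all queries (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Toy ranks of the correct entity for four queries (made-up values).
toy_ranks = [1, 2, 1, 4]
print(hits_at_k(toy_ranks, 1))
print(mean_reciprocal_rank(toy_ranks))
```

Hits@1 only rewards exact top-ranked matches, while MRR also gives partial credit to near misses, which is why the two are typically reported together.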