亚洲AV午夜成人片精品网站听书_亚洲欧美中文日韩A_国产精品久久无码不卡黑寡妇_男模野外自慰chinese_亚洲欧美专区精品综合伊人久久_亚洲日韩久久综合中文字幕人_1区2区3区4区产品不卡码网站

Multimodal UAVs (Unmanned Aerial Vehicles) are rarely capable of more than two modalities, i.e., flying and walking or flying and perching. However, being able to fly, perch, and walk could further improve their usefulness by expanding their operating envelope. For instance, an aerial robot could fly a long distance, perch in a high place to survey the surroundings, then walk to avoid obstacles that could potentially inhibit flight. Birds are capable of these three tasks, and so offer a practical example of how a robot might be developed to do the same. In this paper, we present a specialized avian-inspired claw design to enable UAVs to perch passively or walk. The key innovation is the combination of a Hoberman linkage leg with Fin Ray claw that uses the weight of the UAV to wrap the claw around a perch, or hyperextend it in the opposite direction to form a curved-up shape for stable terrestrial locomotion. Because the design uses the weight of the vehicle, the underactuated design is lightweight and low power. With the inclusion of talons, the 45g claws are capable of holding a 700g UAV to an almost 20-degree angle on a perch. In scenarios where cluttered environments impede flight and long mission times are required, such a combination of flying, perching, and walking is critical.

相關內容

Weight

關注 0

模型評估 · Boosting（一種模型訓練加速方式） · 置信度 · MoDELS · 相關系數 ·

2024 年 3 月 14 日

Self-Consistency Boosts Calibration for Math Reasoning

Ante Wang,Linfeng Song,Ye Tian,Baolin Peng,Lifeng Jin,Haitao Mi,Jinsong Su,Dong Yu

Calibration, which establishes the correlation between accuracy and model confidence, is important for LLM development. We design three off-the-shelf calibration methods based on self-consistency (Wang et al., 2022) for math reasoning tasks. Evaluation on two popular benchmarks (GSM8K and MathQA) using strong open-source LLMs (Mistral and LLaMA2), our methods better bridge model confidence and accuracy than existing methods based on p(True) (Kadavath et al., 2022) or logit (Kadavath et al., 2022).

MoDELS · 優化器 · 3D · 得分 · Guidance ·

2024 年 3 月 14 日

Score-Guided Diffusion for 3D Human Recovery

Anastasis Stathopoulos,Ligong Han,Dimitris Metaxas

from arxiv, CVPR 2024 (project page: //statho.github.io/ScoreHMR)

We present Score-Guided Human Mesh Recovery (ScoreHMR), an approach for solving inverse problems for 3D human pose and shape reconstruction. These inverse problems involve fitting a human body model to image observations, traditionally solved through optimization techniques. ScoreHMR mimics model fitting approaches, but alignment with the image observation is achieved through score guidance in the latent space of a diffusion model. The diffusion model is trained to capture the conditional distribution of the human model parameters given an input image. By guiding its denoising process with a task-specific score, ScoreHMR effectively solves inverse problems for various applications without the need for retraining the task-agnostic diffusion model. We evaluate our approach on three settings/applications. These are: (i) single-frame model fitting; (ii) reconstruction from multiple uncalibrated views; (iii) reconstructing humans in video sequences. ScoreHMR consistently outperforms all optimization baselines on popular benchmarks across all settings. We make our code and models available at the //statho.github.io/ScoreHMR.

泛函 · MoDELS · 隱狀態 · Performer · 線性的 ·

2024 年 3 月 14 日

Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks

Tingyu Qu,Tinne Tuytelaars,Marie-Francine Moens

Mainstream parameter-efficient fine-tuning (PEFT) methods, such as LoRA or Adapter, project a model's hidden states to a lower dimension, allowing pre-trained models to adapt to new data through this low-rank bottleneck. However, PEFT tasks involving multiple modalities, like vision-language (VL) tasks, require not only adaptation to new data but also learning the relationship between different modalities. Targeting at VL PEFT tasks, we propose a family of operations, called routing functions, to enhance VL alignment in the low-rank bottlenecks. The routing functions adopt linear operations and do not introduce new trainable parameters. In-depth analyses are conducted to study their behavior. In various VL PEFT settings, the routing functions significantly improve performance of the original PEFT methods, achieving over 20% improvement on VQAv2 ($\text{RoBERTa}_{\text{large}}$+ViT-L/16) and 30% on COCO Captioning (GPT2-medium+ViT-L/16). Also when fine-tuning a pre-trained multimodal model such as CLIP-BART, we observe smaller but consistent improvements across a range of VL PEFT tasks.

塑造 · 近似 · 優化器 · 相互獨立的 · CASES ·

2024 年 3 月 13 日

Tangential Fixpoint Iterations for Gromov-Wasserstein Barycenters

Florian Beier,Robert Beinert

The Gromov-Wasserstein (GW) transport problem is a relaxation of classic optimal transport, which seeks a transport between two measures while preserving their internal geometry. Due to meeting this theoretical underpinning, it is a valuable tool for the analysis of objects that do not possess a natural embedding or should be studied independently of it. Prime applications can thus be found in e.g. shape matching, classification and interpolation tasks. To tackle the latter, one theoretically justified approach is the employment of multi-marginal GW transport and GW barycenters, which are Fr\'echet means with respect to the GW distance. However, because the computation of GW itself already poses a quadratic and non-convex optimization problem, the determination of GW barycenters is a hard task and algorithms for their computation are scarce. In this paper, we revisit a known procedure for the determination of Fr\'echet means in Riemannian manifolds via tangential approximations in the context of GW. We provide a characterization of barycenters in the GW tangent space, which ultimately gives rise to a fixpoint iteration for approximating GW barycenters using multi-marginal plans. We propose a relaxation of this fixpoint iteration and show that it monotonously decreases the barycenter loss. In certain cases our proposed method naturally provides us with barycentric embeddings. The resulting algorithm is capable of producing qualitative shape interpolations between multiple 3d shapes with support sizes of over thousands of points in reasonable time. In addition, we verify our method on shape classification and multi-graph matching tasks.

Learning · 標注 · 樣例 · INFORMS · 語言模型化 ·

2024 年 3 月 13 日

In-Context Learning Learns Label Relationships but Is Not Conventional Learning

Jannik Kossen,Yarin Gal,Tom Rainforth

from arxiv, Accepted for publication at ICLR 2024

The predictions of Large Language Models (LLMs) on downstream tasks often improve significantly when including examples of the input--label relationship in the context. However, there is currently no consensus about how this in-context learning (ICL) ability of LLMs works. For example, while Xie et al. (2021) liken ICL to a general-purpose learning algorithm, Min et al. (2022) argue ICL does not even learn label relationships from in-context examples. In this paper, we provide novel insights into how ICL leverages label information, revealing both capabilities and limitations. To ensure we obtain a comprehensive picture of ICL behavior, we study probabilistic aspects of ICL predictions and thoroughly examine the dynamics of ICL as more examples are provided. Our experiments show that ICL predictions almost always depend on in-context labels and that ICL can learn truly novel tasks in-context. However, we also find that ICL struggles to fully overcome prediction preferences acquired from pre-training data and, further, that ICL does not consider all in-context information equally.

Analysis · 可辨認的 · 單元 · ChIP-Seq · cancer ·

2024 年 3 月 13 日

Kernel-Based Testing for Single-Cell Differential Analysis

Anthony Ozier-Lafontaine,Camille Fourneaux,Ghislain Durif,Céline Vallot,Olivier Gandrillon,Sandrine Giraud,Bertrand Michel,Franck Picard

Single-cell technologies offer insights into molecular feature distributions, but comparing them poses challenges. We propose a kernel-testing framework for non-linear cell-wise distribution comparison, analyzing gene expression and epigenomic modifications. Our method allows feature-wise and global transcriptome/epigenome comparisons, revealing cell population heterogeneities. Using a classifier based on embedding variability, we identify transitions in cell states, overcoming limitations of traditional single-cell analysis. Applied to single-cell ChIP-Seq data, our approach identifies untreated breast cancer cells with an epigenomic profile resembling persister cells. This demonstrates the effectiveness of kernel testing in uncovering subtle population variations that might be missed by other methods.

SQL · Learning · 可辨認的 · Performer · 網絡爬蟲 ·

2024 年 3 月 9 日

Schema-Aware Multi-Task Learning for Complex Text-to-SQL

Yangjun Wu,Han Wang

from arxiv, 8pages

Conventional text-to-SQL parsers are not good at synthesizing complex SQL queries that involve multiple tables or columns, due to the challenges inherent in identifying the correct schema items and performing accurate alignment between question and schema items. To address the above issue, we present a schema-aware multi-task learning framework (named MTSQL) for complicated SQL queries. Specifically, we design a schema linking discriminator module to distinguish the valid question-schema linkings, which explicitly instructs the encoder by distinctive linking relations to enhance the alignment quality. On the decoder side, we define 6-type relationships to describe the connections between tables and columns (e.g., WHERE_TC), and introduce an operator-centric triple extractor to recognize those associated schema items with the predefined relationship. Also, we establish a rule set of grammar constraints via the predicted triples to filter the proper SQL operators and schema items during the SQL generation. On Spider, a cross-domain challenging text-to-SQL benchmark, experimental results indicate that MTSQL is more effective than baselines, especially in extremely hard scenarios. Moreover, further analyses verify that our approach leads to promising improvements for complicated SQL queries.

蒸餾 · MoDELS · 聯邦學習 · 學成 · 歸納偏好 ·

2021 年 6 月 9 日

Data-Free Knowledge Distillation for Heterogeneous Federated Learning

Zhuangdi Zhu,Junyuan Hong,Jiayu Zhou

Federated Learning (FL) is a decentralized machine-learning paradigm, in which a global server iteratively averages the model parameters of local users without accessing their data. User heterogeneity has imposed significant challenges to FL, which can incur drifted global models that are slow to converge. Knowledge Distillation has recently emerged to tackle this issue, by refining the server model using aggregated knowledge from heterogeneous users, other than directly averaging their model parameters. This approach, however, depends on a proxy dataset, making it impractical unless such a prerequisite is satisfied. Moreover, the ensemble knowledge is not fully utilized to guide local model learning, which may in turn affect the quality of the aggregated model. Inspired by the prior art, we propose a data-free knowledge distillation} approach to address heterogeneous FL, where the server learns a lightweight generator to ensemble user information in a data-free manner, which is then broadcasted to users, regulating local training using the learned knowledge as an inductive bias. Empirical studies powered by theoretical implications show that, our approach facilitates FL with better generalization performance using fewer communication rounds, compared with the state-of-the-art.

entity · 小樣本學習 · 注意力機制 · 圖 · Networking ·

2020 年 10 月 19 日

Adaptive Attentional Network for Few-Shot Knowledge Graph Completion

Jiawei Sheng,Shu Guo,Zhenyu Chen,Juwei Yue,Lihong Wang,Tingwen Liu,Hongbo Xu

from arxiv, 11 pages, 3 figures

Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims at querying unseen facts of a relation given its few-shot reference entity pairs. Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties, i.e., entities may exhibit diverse roles within task relations, and references may make different contributions to queries. This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations. Specifically, entities are modeled by an adaptive neighbor encoder to discern their task-oriented roles, while references are modeled by an adaptive query-aware aggregator to differentiate their contributions. Through the attention mechanism, both entities and references can capture their fine-grained semantic meanings, and thus render more expressive representations. This will be more predictive for knowledge acquisition in the few-shot scenario. Evaluation in link prediction on two public datasets shows that our approach achieves new state-of-the-art results with different few-shot sizes.

Single-Shot · Branch · 目標檢測 · 推斷 · MS ·

2018 年 4 月 8 日

Single-Shot Object Detection with Enriched Semantics

Zhishuai Zhang,Siyuan Qiao,Cihang Xie,Wei Shen,Bo Wang,Alan L. Yuille

We propose a novel single shot object detection network named Detection with Enriched Semantics (DES). Our motivation is to enrich the semantics of object detection features within a typical deep detector, by a semantic segmentation branch and a global activation module. The segmentation branch is supervised by weak segmentation ground-truth, i.e., no extra annotation is required. In conjunction with that, we employ a global activation module which learns relationship between channels and object classes in a self-supervised manner. Comprehensive experimental results on both PASCAL VOC and MS COCO detection datasets demonstrate the effectiveness of the proposed method. In particular, with a VGG16 based DES, we achieve an mAP of 81.7 on VOC2007 test and an mAP of 32.8 on COCO test-dev with an inference speed of 31.5 milliseconds per image on a Titan Xp GPU. With a lower resolution version, we achieve an mAP of 79.7 on VOC2007 with an inference speed of 13.0 milliseconds per image.