亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We tackle the problem of 3D point cloud localization based on a few natural linguistic descriptions and introduce a novel neural network, Text2Loc, that fully interprets the semantic relationship between points and text. Text2Loc follows a coarse-to-fine localization pipeline: text-submap global place recognition, followed by fine localization. In global place recognition, relational dynamics among each textual hint are captured in a hierarchical transformer with max-pooling (HTM), whereas a balance between positive and negative pairs is maintained using text-submap contrastive learning. Moreover, we propose a novel matching-free fine localization method to further refine the location predictions, which completely removes the need for complicated text-instance matching and is lighter, faster, and more accurate than previous methods. Extensive experiments show that Text2Loc improves the localization accuracy by up to $2\times$ over the state-of-the-art on the KITTI360Pose dataset. We will make the code publicly available.

相關內容

根據激光測量原理得到的點云,包括三維坐標(XYZ)和激光反射強度(Intensity)。 根據攝影測量原理得到的點云,包括三維坐標(XYZ)和顏色信息(RGB)。 結合激光測量和攝影測量原理得到點云,包括三維坐標(XYZ)、激光反射強度(Intensity)和顏色信息(RGB)。 在獲取物體表面每個采樣點的空間坐標后,得到的是一個點的集合,稱之為“點云”(Point Cloud)

Large language models (LLMs) have exploded in popularity due to their ability to perform a wide array of natural language tasks. Text-based content moderation is one LLM use case that has received recent enthusiasm, however, there is little research investigating how LLMs perform in content moderation settings. In this work, we evaluate a suite of commodity LLMs on two common content moderation tasks: rule-based community moderation and toxic content detection. For rule-based community moderation, we instantiate 95 subcommunity specific LLMs by prompting GPT-3.5 with rules from 95 Reddit subcommunities. We find that GPT-3.5 is effective at rule-based moderation for many communities, achieving a median accuracy of 64% and a median precision of 83%. For toxicity detection, we evaluate a suite of commodity LLMs (GPT-3, GPT-3.5, GPT-4, Gemini Pro, LLAMA 2) and show that LLMs significantly outperform currently widespread toxicity classifiers. However, recent increases in model size add only marginal benefit to toxicity detection, suggesting a potential performance plateau for LLMs on toxicity detection tasks. We conclude by outlining avenues for future work in studying LLMs and content moderation.

Sentiment analysis and emotion detection are important research topics in natural language processing (NLP) and benefit many downstream tasks. With the widespread application of LLMs, researchers have started exploring the application of LLMs based on instruction-tuning in the field of sentiment analysis. However, these models only focus on single aspects of affective classification tasks (e.g. sentimental polarity or categorical emotions), and overlook the regression tasks (e.g. sentiment strength or emotion intensity), which leads to poor performance in downstream tasks. The main reason is the lack of comprehensive affective instruction tuning datasets and evaluation benchmarks, which cover various affective classification and regression tasks. Moreover, although emotional information is useful for downstream tasks, existing downstream datasets lack high-quality and comprehensive affective annotations. In this paper, we propose EmoLLMs, the first series of open-sourced instruction-following LLMs for comprehensive affective analysis based on fine-tuning various LLMs with instruction data, the first multi-task affective analysis instruction dataset (AAID) with 234K data samples based on various classification and regression tasks to support LLM instruction tuning, and a comprehensive affective evaluation benchmark (AEB) with 14 tasks from various sources and domains to test the generalization ability of LLMs. We propose a series of EmoLLMs by fine-tuning LLMs with AAID to solve various affective instruction tasks. We compare our model with a variety of LLMs on AEB, where our models outperform all other open-sourced LLMs, and surpass ChatGPT and GPT-4 in most tasks, which shows that the series of EmoLLMs achieve the ChatGPT-level and GPT-4-level generalization capabilities on affective analysis tasks, and demonstrates our models can be used as affective annotation tools.

We propose a novel, heterogeneous multi-agent architecture that miniaturizes rovers by outsourcing power generation to a central hub. By delegating power generation and distribution functions to this hub, the size, weight, power, and cost (SWAP-C) per rover are reduced, enabling efficient fleet scaling. As these rovers conduct mission tasks around the terrain, the hub charges an array of replacement battery modules. When a rover requires charging, it returns to the hub to initiate an autonomous docking sequence and exits with a fully charged battery. This confers an advantage over direct charging methods, such as wireless or wired charging, by replenishing a rover in minutes as opposed to hours, increasing net rover uptime. This work shares an open-source platform developed to demonstrate battery swapping on unknown field terrain. We detail our design methodologies utilized for increasing system reliability, with a focus on optimization, robust mechanical design, and verification. Optimization of the system is discussed, including the design of passive guide rails through simulation-based optimization methods which increase the valid docking configuration space by 258%. The full system was evaluated during integrated testing, where an average servicing time of 98 seconds was achieved on surfaces with a gradient up to 10{\deg}. We conclude by briefly proposing flight considerations for advancing the system toward a space-ready design. In sum, this prototype represents a proof of concept for autonomous docking and battery transfer on field terrain, advancing its Technology Readiness Level (TRL) from 1 to 3.

Knowledge distillation is the process of transferring knowledge from a more powerful large model (teacher) to a simpler counterpart (student). Numerous current approaches involve the student imitating the knowledge of the teacher directly. However, redundancy still exists in the learned representations through these prevalent methods, which tend to learn each spatial location's features indiscriminately. To derive a more compact representation (concept feature) from the teacher, inspired by human cognition, we suggest an innovative method, termed Generative Denoise Distillation (GDD), where stochastic noises are added to the concept feature of the student to embed them into the generated instance feature from a shallow network. Then, the generated instance feature is aligned with the knowledge of the instance from the teacher. We extensively experiment with object detection, instance segmentation, and semantic segmentation to demonstrate the versatility and effectiveness of our method. Notably, GDD achieves new state-of-the-art performance in the tasks mentioned above. We have achieved substantial improvements in semantic segmentation by enhancing PspNet and DeepLabV3, both of which are based on ResNet-18, resulting in mIoU scores of 74.67 and 77.69, respectively, surpassing their previous scores of 69.85 and 73.20 on the Cityscapes dataset of 20 categories. The source code of GDD is available at //github.com/ZhgLiu/GDD.

This study aims to provide a comprehensive assessment of single-objective and multi-objective optimisation algorithms for the design of an elbow-type draft tube, as well as to introduce a computationally efficient optimisation workflow. The proposed workflow leverages deep neural network surrogates trained on data obtained from numerical simulations. The use of surrogates allows for a more flexible and faster evaluation of novel designs. The success history-based adaptive differential evolution with linear reduction and the multi-objective evolutionary algorithm based on decomposition were identified as the best-performing algorithms and used to determine the influence of different objectives in the single-objective optimisation and their combined impact on the draft tube design in the multi-objective optimisation. The results for the single-objective algorithm are consistent with those of the multi-objective algorithm when the objectives are considered separately. Multi-objective approach, however, should typically be chosen, especially for computationally inexpensive surrogates. A multi-criteria decision analysis method was used to obtain optimal multi-objective results, showing an improvement of 1.5% and 17% for the pressure recovery factor and drag coefficient, respectively. The difference between the predictions and the numerical results is less than 0.5% for the pressure recovery factor and 3% for the drag coefficient. As the demand for renewable energy continues to increase, the relevance of data-driven optimisation workflows, as discussed in this study, will become increasingly important, especially in the context of global sustainability efforts.

While achieving tremendous success in various fields, existing multi-agent reinforcement learning (MARL) with a black-box neural network architecture makes decisions in an opaque manner that hinders humans from understanding the learned knowledge and how input observations influence decisions. Instead, existing interpretable approaches, such as traditional linear models and decision trees, usually suffer from weak expressivity and low accuracy. To address this apparent dichotomy between performance and interpretability, our solution, MIXing Recurrent soft decision Trees (MIXRTs), is a novel interpretable architecture that can represent explicit decision processes via the root-to-leaf path and reflect each agent's contribution to the team. Specifically, we construct a novel soft decision tree to address partial observability by leveraging the advances in recurrent neural networks, and demonstrate which features influence the decision-making process through the tree-based model. Then, based on the value decomposition framework, we linearly assign credit to each agent by explicitly mixing individual action values to estimate the joint action value using only local observations, providing new insights into how agents cooperate to accomplish the task. Theoretical analysis shows that MIXRTs guarantees the structural constraint on additivity and monotonicity in the factorization of joint action values. Evaluations on the challenging Spread and StarCraft II tasks show that MIXRTs achieves competitive performance compared to widely investigated methods and delivers more straightforward explanations of the decision processes. We explore a promising path toward developing learning algorithms with both high performance and interpretability, potentially shedding light on new interpretable paradigms for MARL.

We study semi-supervised sequence generation tasks where labeled data are too scarce to effectively finetune a model and at the same time few-shot prompting of a large language model (LLM) has suboptimal performance. This happens when a task, such as parsing, is expensive to annotate and also unfamiliar to a pretrained LLM. In this paper, we present a discovery that student models distilled from an in-context learned LLM can often generalize better than their teacher on such tasks. Leveraging this finding, we present a new method -- multistage collaborative knowledge distillation from an LLM (MCKD) -- for such tasks. MCKD first few-shot prompts an LLM to produce pseudolabels for unlabeled data. At each intermediate knowledge distillation (KD) stage, a new pair of students is trained on disjoint partitions of the pseudolabeled data. Each student then produces new and improved pseudolabels for its unseen partition to be used in the next stage of distillation. We demonstrate the advantage of multistage cross-partition labeling on several syntactic and semantic parsing tasks. On CRAFT biomedical parsing, for example, 3-stage MCKD with 50 labeled examples outperforms the prompted LLM and vanilla KD by 7.5% and 3.7% parsing F1, respectively, and matches the performance of supervised finetuning with 500 examples.

Receiving immediate and personalized feedback is crucial for second-language learners, and Automated Essay Scoring (AES) systems are a vital resource when human instructors are unavailable. This study investigates the effectiveness of Large Language Models (LLMs), specifically GPT-4 and fine-tuned GPT-3.5, as tools for AES. Our comprehensive set of experiments, conducted on both public and private datasets, highlights the remarkable advantages of LLM-based AES systems. They include superior accuracy, consistency, generalizability, and interpretability, with fine-tuned GPT-3.5 surpassing traditional grading models. Additionally, we undertake LLM-assisted human evaluation experiments involving both novice and expert graders. One pivotal discovery is that LLMs not only automate the grading process but also enhance the performance of human graders. Novice graders when provided with feedback generated by LLMs, achieve a level of accuracy on par with experts, while experts become more efficient and maintain greater consistency in their assessments. These results underscore the potential of LLMs in educational technology, paving the way for effective collaboration between humans and AI, ultimately leading to transformative learning experiences through AI-generated feedback.

With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose occupancy networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.

We study the problem of learning to reason in large scale knowledge graphs (KGs). More specifically, we describe a novel reinforcement learning framework for learning multi-hop relational paths: we use a policy-based agent with continuous states based on knowledge graph embeddings, which reasons in a KG vector space by sampling the most promising relation to extend its path. In contrast to prior work, our approach includes a reward function that takes the accuracy, diversity, and efficiency into consideration. Experimentally, we show that our proposed method outperforms a path-ranking based algorithm and knowledge graph embedding methods on Freebase and Never-Ending Language Learning datasets.

北京阿比特科技有限公司