
Although embodied AI has been explored for multiple decades, it remains a persistent challenge to endow agents with human-level intelligence, including perception, learning, reasoning, decision-making, control, and generalization capabilities, so that they can perform general-purpose tasks in open, unstructured, and dynamic environments. Recent advances in computer vision, natural language processing, and multi-modality learning have shown that foundation models can reach superhuman performance on specific tasks. They not only provide a solid cornerstone for integrating basic modules into embodied AI systems but also shed light on how to scale up robot learning from a methodological perspective. This survey aims to provide a comprehensive and up-to-date overview of foundation models in robotics, focusing on autonomous manipulation and encompassing high-level planning and low-level control. Moreover, we showcase their commonly used datasets, simulators, and benchmarks. Importantly, we emphasize the critical challenges intrinsic to this field and delineate potential avenues for future research, contributing to advancing the frontier of academic and industrial discourse.

Related content

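The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. MODELS attendees come from diverse backgrounds, including researchers, academics, engineers, and industry professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community the opportunity to further advance the foundations of modeling and to present innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability.
March 18, 2024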

Large Language Models (LLMs) are composed of neurons that exhibit various behaviors and roles, which become increasingly diversified as models scale. Recent studies have revealed that not all neurons are active across different datasets, and this sparsity correlates positively with the task-specific ability, leading to advancements in model pruning and training efficiency. Traditional fine-tuning methods engage all parameters of LLMs, which is computationally expensive and may not be necessary. In contrast, Parameter-Efficient Fine-Tuning (PEFT) approaches aim to minimize the number of trainable parameters, yet they still operate at a relatively macro scale (e.g., layer-level). We introduce Neuron-Level Fine-Tuning (NeFT), a novel approach that refines the granularity of parameter training down to the individual neuron, enabling more precise and computationally efficient model updates. The experimental results show that NeFT not only exceeded the performance of full-parameter fine-tuning and PEFT but also provided insights into the analysis of neurons.
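The idea of restricting training to individual neurons can be illustrated with a toy update rule. The sketch below is not the NeFT procedure itself: the abstract does not say how neurons are selected or how the update is implemented, so the function name `masked_sgd_step`, the choice of one weight-matrix row per neuron, and the plain SGD step are all assumptions made for illustration.

```python
def masked_sgd_step(weights, grads, trainable_rows, lr=0.1):
    """Apply one SGD step only to the rows (neurons) listed in trainable_rows.

    weights, grads: lists of rows, one row of floats per neuron (assumed layout).
    trainable_rows: set of row indices that are allowed to change (hypothetical
    selection; how such neurons are chosen is outside this sketch).
    """
    updated = []
    for i, (w_row, g_row) in enumerate(zip(weights, grads)):
        if i in trainable_rows:
            # Selected neuron: ordinary gradient-descent update.
            updated.append([w - lr * g for w, g in zip(w_row, g_row)])
        else:
            # Frozen neuron: copy the weights unchanged.
            updated.append(list(w_row))
    return updated


weights = [[1.0, 2.0], [3.0, 4.0]]
grads = [[1.0, 1.0], [1.0, 1.0]]
new_weights = masked_sgd_step(weights, grads, trainable_rows={1}, lr=0.5)
```

In this toy setting only the second neuron moves, while the first keeps its original weights, which is the sense in which the granularity is finer than freezing or training a whole layer.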

Large Language Models (LLMs) have garnered significant attention for their ability to understand text and images, generate human-like text, and perform complex reasoning tasks. However, their ability to generalize this advanced reasoning with a combination of natural language text for decision-making in dynamic situations requires further exploration. In this study, we investigate how well LLMs can adapt and apply a combination of arithmetic and common-sense reasoning, particularly in autonomous driving scenarios. We hypothesize that LLMs' hybrid reasoning abilities can improve autonomous driving by enabling them to analyze detected objects and sensor data, understand driving regulations and physical laws, and offer additional context. This addresses complex scenarios, like decisions in low visibility (due to weather conditions), where traditional methods might fall short. We evaluated LLMs based on accuracy by comparing their answers with human-generated ground truth inside CARLA. The results showed that when a combination of images (detected objects) and sensor data is fed into the LLM, it can offer precise information for brake and throttle control in autonomous vehicles across various weather conditions. This formulation and the resulting answers can assist in decision-making for auto-pilot systems.

Swarm robots, which are inspired by the way insects behave collectively to achieve a common goal, have become a major research topic, with applications including search and rescue, area exploration, and surveillance. In this paper, we present a swarm of robots that do not require individual extrinsic sensors to sense the environment but instead use a single central camera to locate and map the swarm. The robots can be easily built using readily available components with the main chassis being 3D printed, making the system low-cost, low-maintenance, and easy to replicate. We describe Zutu's hardware and software architecture, the algorithms to map the robots to the real world, and some experiments conducted using four of our robots. Finally, we discuss the possible applications of our system in research, education, and industry.

Ridesharing services, such as Uber or Didi, have attracted considerable attention in recent years due to their positive impact on environmental protection and the economy. Existing studies require quick responses to orders and therefore lack the flexibility to accommodate longer wait times for better grouping opportunities. In this paper, we address an NP-hard ridesharing problem, called Minimal Extra Time RideSharing (METRS), which balances waiting time and group quality (i.e., detour time) to improve riders' satisfaction. To tackle this problem, we propose a novel approach called WATTER (WAit To be fasTER), which leverages an order pooling management algorithm allowing orders to wait until they can be matched with suitable groups. The key challenge is to customize the extra time threshold for each order by reducing the original optimization objective into a convex function of the threshold, thus offering a theoretical guarantee that it can be optimized efficiently. We model the dispatch process using a Markov Decision Process (MDP) with a carefully designed value function to learn the threshold. Through extensive experiments on three real datasets, we demonstrate the efficiency and effectiveness of our proposed approaches.

As discussions around 6G begin, it is important to carefully quantify the spectral efficiency gains actually realized by deployed 5G networks as compared to 4G through various enhancements such as higher modulation, beamforming, and MIMO. This will inform the design of future cellular systems, especially in the mid-bands, which provide a good balance between bandwidth and propagation. Similar to 4G, 5G also utilizes low-band (<1 GHz) and mid-band spectrum (1 to 6 GHz), and hence comparing the performance of 4G and 5G in these bands will provide insights into how further improvements can be attained. In this work, we address a crucial question: is the performance boost in 5G compared to 4G primarily a result of increased bandwidth, or do the other enhancements play significant roles, and if so, under what circumstances? Hence, we conduct city-wide measurements of 4G and 5G cellular networks deployed in low- and mid-bands in Chicago and Minneapolis, and carefully quantify the contributions of different aspects of 5G advancements to its improved throughput performance. Our analyses show that (i) compared to 4G, the throughput improvement in 5G today is mainly influenced by the wider channel bandwidth, both from single channels and channel aggregation, (ii) in addition to wider channels, improved 5G throughput requires better signal conditions, which can be delivered by denser deployment and/or use of beamforming in mid-bands, (iii) the channel rank in real-world environments rarely supports the full 4 layers of 4x4 MIMO, and (iv) advanced features such as MU-MIMO and higher-order modulation such as 1024-QAM have yet to be widely deployed. These observations suggest designing the next generation of cellular systems with wider channels, perhaps with improved channel aggregation, and with dense deployment using more beams.

Passivity is necessary for robots to fluidly collaborate and interact with humans physically. Nevertheless, due to the unconstrained nature of passivity-based impedance control laws, the robot is vulnerable to infeasible and unsafe configurations upon physical perturbations. In this paper, we propose a novel control architecture that allows a torque-controlled robot to guarantee safety constraints such as kinematic limits, self-collisions, external collisions and singularities and is passive only when feasible. This is achieved by constraining a dynamical system based impedance control law with a relaxed hierarchical control barrier function quadratic program subject to multiple concurrent, possibly contradicting, constraints. Joint space constraints are formulated from efficient data-driven self- and external C^2 collision boundary functions. We theoretically prove constraint satisfaction and show that the robot is passive when feasible. Our approach is validated in simulation and real robot experiments on a 7DoF Franka Research 3 manipulator.

Large Language Models (LLMs) have demonstrated their strong ability in the field of machine translation (MT), yet they suffer from high computational cost and latency. Therefore, transferring translation knowledge from giant LLMs to medium-sized machine translation models is a promising research direction. However, traditional knowledge distillation methods do not take the capability of student and teacher models into consideration; they therefore repeatedly teach student models knowledge they have already learned and fail to extend to novel contexts and knowledge. In this paper, we propose a framework called MT-Patcher, which transfers knowledge from LLMs to existing MT models in a selective, comprehensive, and proactive manner. Considering the current translation ability of student MT models, we only identify and correct their translation errors, instead of distilling the whole translation from the teacher. Leveraging the strong language abilities of LLMs, we instruct LLM teachers to synthesize diverse contexts and anticipate more potential errors for the student. Experimental results on translating both specific language phenomena and general MT benchmarks demonstrate that finetuning the student MT model on about 10% of the examples can achieve results comparable to the traditional knowledge distillation method, and that synthesized potential errors and diverse contexts further improve translation performance on unseen contexts and words.

We propose an objective intelligibility measure (OIM), called the Gammachirp Envelope Similarity Index (GESI), which can predict the speech intelligibility (SI) of simulated hearing loss (HL) sounds for normal hearing (NH) listeners. GESI is an intrusive method that computes the SI metric using the gammachirp filterbank (GCFB), the modulation filterbank, and the extended cosine similarity measure. The unique features of GESI are that i) it reflects the hearing impaired (HI) listener's HL that appears in the audiogram and is caused by active and passive cochlear dysfunction, ii) it provides a single goodness metric, as in the widely used STOI and ESTOI, that can be used immediately to evaluate speech enhancement (SE) algorithms, and iii) it provides a simple control parameter to accept the level asymmetry of the reference and test sounds and to deal with individual listening conditions and environments. We evaluated GESI and the conventional OIMs (STOI, ESTOI, MBSTOI, and HASPI versions 1 and 2) by using four SI experiments on words of male and female speech sounds in both laboratory and remote environments. GESI was shown to outperform the other OIMs in the evaluations. GESI could be used to improve SE algorithms in assistive listening devices for individual HI listeners.

Reasoning is a fundamental aspect of human intelligence that plays a crucial role in activities such as problem solving, decision making, and critical thinking. In recent years, large language models (LLMs) have made significant progress in natural language processing, and it has been observed that these models may exhibit reasoning abilities when they are sufficiently large. However, it is not yet clear to what extent LLMs are capable of reasoning. This paper provides a comprehensive overview of the current state of knowledge on reasoning in LLMs, including techniques for improving and eliciting reasoning in these models, methods and benchmarks for evaluating reasoning abilities, findings and implications of previous research in this field, and suggestions on future directions. Our aim is to provide a detailed and up-to-date review of this topic and stimulate meaningful discussion and future work.

We investigate the problem of automatically determining what type of shoe left an impression found at a crime scene. This recognition problem is made difficult by the variability in types of crime scene evidence (ranging from traces of dust or oil on hard surfaces to impressions made in soil) and the lack of comprehensive databases of shoe outsole tread patterns. We find that mid-level features extracted by pre-trained convolutional neural nets are surprisingly effective descriptors for this specialized domain. However, the choice of similarity measure for matching exemplars to a query image is essential to good performance. For matching multi-channel deep features, we propose the use of multi-channel normalized cross-correlation and analyze its effectiveness. Our proposed metric significantly improves performance in matching crime scene shoeprints to laboratory test impressions. We also show its effectiveness in other cross-domain image retrieval problems: matching facade images to segmentation labels and aerial photos to map images. Finally, we introduce a discriminatively trained variant and fine-tune our system through our proposed metric, obtaining state-of-the-art performance.
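As a rough illustration of the similarity measure named in the abstract above, the following sketch computes a multi-channel normalized cross-correlation between two feature maps. The abstract does not give the formula, so this is one common formulation under stated assumptions: each channel is flattened to a list of floats, normalized independently (mean removed, unit norm), correlated with its counterpart, and the per-channel scores are averaged. The names `ncc` and `mcncc` and the `eps` stabilizer are hypothetical choices for this example.

```python
import math


def ncc(a, b, eps=1e-8):
    """Normalized cross-correlation of two equal-length channels."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Remove the per-channel mean so that constant offsets do not matter.
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    norm_a = math.sqrt(sum(v * v for v in da))
    norm_b = math.sqrt(sum(v * v for v in db))
    # eps (an assumption of this sketch) avoids division by zero on flat channels.
    return sum(p * q for p, q in zip(da, db)) / (norm_a * norm_b + eps)


def mcncc(x, y):
    """Average the per-channel NCC over all channels of two feature maps."""
    if len(x) != len(y):
        raise ValueError("feature maps must have the same number of channels")
    return sum(ncc(cx, cy) for cx, cy in zip(x, y)) / len(x)


# Two channels of four values each; y rescales and shifts every channel of x.
x = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
y = [[2.0 * v + 3.0 for v in channel] for channel in x]
score = mcncc(x, y)
```

Because each channel is normalized on its own, the score is insensitive to per-channel gain and offset, which is one plausible reason such a measure suits matching across domains with different feature statistics.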
