亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

In this article, we investigate self-supervised 3D scene flow estimation and class-agnostic motion prediction on point clouds. A realistic scene can be well modeled as a collection of rigidly moving parts, therefore its scene flow can be represented as a combination of the rigid motion of these individual parts. Building upon this observation, we propose to generate pseudo scene flow labels for self-supervised learning through piecewise rigid motion estimation, in which the source point cloud is decomposed into local regions and each region is treated as rigid. By rigidly aligning each region with its potential counterpart in the target point cloud, we obtain a region-specific rigid transformation to generate its pseudo flow labels. To mitigate the impact of potential outliers on label generation, when solving the rigid registration for each region, we alternately perform three steps: establishing point correspondences, measuring the confidence for the correspondences, and updating the rigid transformation based on the correspondences and their confidence. As a result, confident correspondences will dominate label generation and a validity mask will be derived for the generated pseudo labels. By using the pseudo labels together with their validity mask for supervision, models can be trained in a self-supervised manner. Extensive experiments on FlyingThings3D and KITTI datasets demonstrate that our method achieves new state-of-the-art performance in self-supervised scene flow learning, without any ground truth scene flow for supervision, even performing better than some supervised counterparts. Additionally, our method is further extended to class-agnostic motion prediction and significantly outperforms previous state-of-the-art self-supervised methods on nuScenes dataset.

相關內容

In this work, we tested the Triplet Extraction (TE) capabilities of a variety of Large Language Models (LLMs) of different sizes in the Zero- and Few-Shots settings. In detail, we proposed a pipeline that dynamically gathers contextual information from a Knowledge Base (KB), both in the form of context triplets and of (sentence, triplets) pairs as examples, and provides it to the LLM through a prompt. The additional context allowed the LLMs to be competitive with all the older fully trained baselines based on the Bidirectional Long Short-Term Memory (BiLSTM) Network architecture. We further conducted a detailed analysis of the quality of the gathered KB context, finding it to be strongly correlated with the final TE performance of the model. In contrast, the size of the model appeared to only logarithmically improve the TE capabilities of the LLMs.

We study building multi-task agents in open-world environments. Without human demonstrations, learning to accomplish long-horizon tasks in a large open-world environment with reinforcement learning (RL) is extremely inefficient. To tackle this challenge, we convert the multi-task learning problem into learning basic skills and planning over the skills. Using the popular open-world game Minecraft as the testbed, we propose three types of fine-grained basic skills, and use RL with intrinsic rewards to acquire skills. A novel Finding-skill that performs exploration to find diverse items provides better initialization for other skills, improving the sample efficiency for skill learning. In skill planning, we leverage the prior knowledge in Large Language Models to find the relationships between skills and build a skill graph. When the agent is solving a task, our skill search algorithm walks on the skill graph and generates the proper skill plans for the agent. In experiments, our method accomplishes 40 diverse Minecraft tasks, where many tasks require sequentially executing for more than 10 skills. Our method outperforms baselines by a large margin and is the most sample-efficient demonstration-free RL method to solve Minecraft Tech Tree tasks. The project's website and code can be found at //sites.google.com/view/plan4mc.

This article investigates the practical scenarios of chasing an adversarial evader in an unbounded environment with cluttered obstacles. We propose a Voronoi-based decentralized algorithm for multiple pursuers to encircle and capture the evader by reacting to collisions. An efficient approach is presented for constructing an obstacle-aware evader-centered bounded Voronoi cell (OA-ECBVC), which strictly ensures collision avoidance in various obstacle scenarios when pursuing the evader. The evader can be efficiently enclosed in a convex hull given random initial configurations. Furthermore, to cooperatively capture the evader, each pursuer continually compresses the boundary of its OA-ECBVC to quickly reduce the movement space of the evader while maintaining encirclement. Our OA-ECBVC algorithm is validated in various simulated environments with different dynamic systems of robots. Real-time performance of resisting uncertainties shows the superior reliability of our method for deployment on multiple robot platforms.

Low earth orbit (LEO) satellite communications can provide ubiquitous and reliable services, making it an essential part of the Internet of Everything network. Beam hopping (BH) is an emerging technology for effectively addressing the issue of low resource utilization caused by the non-uniform spatio-temporal distribution of traffic demands. However, how to allocate multi-dimensional resources in a timely and efficient way for the highly dynamic LEO satellite systems remains a challenge. This paper proposes a joint beam scheduling and power optimization beam hopping (JBSPO-BH) algorithm considering the differences in the geographic distribution of sink nodes. The JBSPO-BH algorithm decouples the original problem into two sub-problems. The beam scheduling problem is modelled as a potential game, and the Nash equilibrium (NE) point is obtained as the beam scheduling strategy. Moreover, the penalty function interior point method is applied to optimize the power allocation. Simulation results show that the JBSPO-BH algorithm has low time complexity and fast convergence and achieves better performance both in throughput and fairness. Compared with greedy-based BH, greedy-based BH with the power optimization, round-robin BH, Max-SINR BH and satellite resource allocation algorithm, the throughput of the proposed algorithm is improved by 44.99%, 20.79%, 156.06%, 15.39% and 8.17%, respectively.

This manuscript enriches the framework of continuous normalizing flows (CNFs) within causal inference, primarily to augment the geometric properties of parametric submodels used in targeted maximum likelihood estimation (TMLE). By introducing an innovative application of CNFs, we construct a refined series of parametric submodels that enable a directed interpolation between the prior distribution $p_0$ and the empirical distribution $p_1$. This proposed methodology serves to optimize the semiparametric efficiency bound in causal inference by orchestrating CNFs to align with Wasserstein gradient flows. Our approach not only endeavors to minimize the mean squared error in the estimation but also imbues the estimators with geometric sophistication, thereby enhancing robustness against misspecification. This robustness is crucial, as it alleviates the dependence on the standard $n^{\frac{1}{4}}$ rate for a doubly-robust perturbation direction in TMLE. By incorporating robust optimization principles and differential geometry into the estimators, the developed geometry-aware CNFs represent a significant advancement in the pursuit of doubly robust causal inference.

In this article, we propose an optimization-based integrated behavior planning and motion control scheme, which is an interpretable and adaptable urban autonomous driving solution that complies with complex traffic rules while ensuring driving safety. Inherently, to ensure compliance with traffic rules, an innovative design of potential functions (PFs) is presented to characterize various traffic rules related to traffic lights, traversable and non-traversable traffic line markings, etc. These PFs are further incorporated as part of the model predictive control (MPC) formulation. In this sense, high-level behavior planning is attained implicitly along with motion control as an integrated architecture, facilitating flexible maneuvers with safety guarantees. Due to the well-designed objective function of the MPC scheme, our integrated behavior planning and motion control scheme is competent for various urban driving scenarios and able to generate versatile behaviors, such as overtaking with adaptive cruise control, turning in the intersection, and merging in and out of the roundabout. As demonstrated from a series of simulations with challenging scenarios in CARLA, it is noteworthy that the proposed framework admits real-time performance and high generalizability.

Existing one-shot 4D head synthesis methods usually learn from monocular videos with the aid of 3DMM reconstruction, yet the latter is evenly challenging which restricts them from reasonable 4D head synthesis. We present a method to learn one-shot 4D head synthesis via large-scale synthetic data. The key is to first learn a part-wise 4D generative model from monocular images via adversarial learning, to synthesize multi-view images of diverse identities and full motions as training data; then leverage a transformer-based animatable triplane reconstructor to learn 4D head reconstruction using the synthetic data. A novel learning strategy is enforced to enhance the generalizability to real images by disentangling the learning process of 3D reconstruction and reenactment. Experiments demonstrate our superiority over the prior art.

Targeting vision applications at the edge, in this work, we systematically explore and propose a high-performance and energy-efficient Optical In-Sensor Accelerator architecture called OISA for the first time. Taking advantage of the promising efficiency of photonic devices, the OISA intrinsically implements a coarse-grained convolution operation on the input frames in an innovative minimum-conversion fashion in low-bit-width neural networks. Such a design remarkably reduces the power consumption of data conversion, transmission, and processing in the conventional cloud-centric architecture as well as recently-presented edge accelerators. Our device-to-architecture simulation results on various image data-sets demonstrate acceptable accuracy while OISA achieves 6.68 TOp/s/W efficiency. OISA reduces power consumption by a factor of 7.9 and 18.4 on average compared with existing electronic in-/near-sensor and ASIC accelerators.

We describe ACE0, a lightweight platform for evaluating the suitability and viability of AI methods for behaviour discovery in multiagent simulations. Specifically, ACE0 was designed to explore AI methods for multi-agent simulations used in operations research studies related to new technologies such as autonomous aircraft. Simulation environments used in production are often high-fidelity, complex, require significant domain knowledge and as a result have high R&D costs. Minimal and lightweight simulation environments can help researchers and engineers evaluate the viability of new AI technologies for behaviour discovery in a more agile and potentially cost effective manner. In this paper we describe the motivation for the development of ACE0.We provide a technical overview of the system architecture, describe a case study of behaviour discovery in the aerospace domain, and provide a qualitative evaluation of the system. The evaluation includes a brief description of collaborative research projects with academic partners, exploring different AI behaviour discovery methods.

In this paper, we propose the joint learning attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on the use of either model exist (e.g., for the task of image captioning), training such existing network architectures typically require pre-defined label sequences. For multi-label classification, it would be desirable to have a robust inference process, so that the prediction error would not propagate and thus affect the performance. Our proposed model uniquely integrates attention and Long Short Term Memory (LSTM) models, which not only addresses the above problem but also allows one to identify visual objects of interests with varying sizes without the prior knowledge of particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, prediction of multiple labels can be efficiently achieved by our proposed network model.

北京阿比特科技有限公司