亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Semantic segmentation is an essential technology for self-driving cars to comprehend their surroundings. Currently, real-time semantic segmentation networks commonly employ either encoder-decoder architecture or two-pathway architecture. Generally speaking, encoder-decoder models tend to be quicker,whereas two-pathway models exhibit higher accuracy. To leverage both strengths, we present the Spatial-Assistant Encoder-Decoder Network (SANet) to fuse the two architectures. In the overall architecture, we uphold the encoder-decoder design while maintaining the feature maps in the middle section of the encoder and utilizing atrous convolution branches for same-resolution feature extraction. Toward the end of the encoder, we integrate the asymmetric pooling pyramid pooling module (APPPM) to optimize the semantic extraction of the feature maps. This module incorporates asymmetric pooling layers that extract features at multiple resolutions. In the decoder, we present a hybrid attention module, SAD, that integrates horizontal and vertical attention to facilitate the combination of various branches. To ascertain the effectiveness of our approach, our SANet model achieved competitive results on the real-time CamVid and cityscape datasets. By employing a single 2080Ti GPU, SANet achieved a 78.4 % mIOU at 65.1 FPS on the Cityscape test dataset and 78.8 % mIOU at 147 FPS on the CamVid test dataset. The training code and model for SANet are available at //github.com/CuZaoo/SANet-main

相關內容

Networking:IFIP International Conferences on Networking。 Explanation:國際網絡會議。 Publisher:IFIP。 SIT:

Offline Imitation Learning (IL) is a powerful paradigm to learn visuomotor skills, especially for high-precision manipulation tasks. However, IL methods are prone to spurious correlation - expressive models may focus on distractors that are irrelevant to action prediction - and are thus fragile in real-world deployment. Prior methods have addressed this challenge by exploring different model architectures and action representations. However, none were able to balance between sample efficiency, robustness against distractors, and solving high-precision manipulation tasks with complex action space. To this end, we present $\textbf{C}$onstrained-$\textbf{C}$ontext $\textbf{C}$onditional $\textbf{D}$iffusion $\textbf{M}$odel (C3DM), a diffusion model policy for solving 6-DoF robotic manipulation tasks with high precision and ability to ignore distractions. A key component of C3DM is a fixation step that helps the action denoiser to focus on task-relevant regions around the predicted action while ignoring distractors in the context. We empirically show that C3DM is able to consistently achieve high success rate on a wide array of tasks, ranging from table top manipulation to industrial kitting, that require varying levels of precision and robustness to distractors. For details, please visit this //sites.google.com/view/c3dm-imitation-learning

Wire harnesses are essential hardware for electronic systems in modern automotive vehicles. With a shift in the automotive industry towards electrification and autonomous driving, more and more automotive electronics are responsible for energy transmission and safety-critical functions such as maneuvering, driver assistance, and safety system. This paradigm shift places more demand on automotive wiring harnesses from the safety perspective and stresses the greater importance of high-quality wire harness assembly in vehicles. However, most of the current operations of wire harness assembly are still performed manually by skilled workers, and some of the manual processes are problematic from different perspectives, such as quality control and ergonomics. There is also a persistent demand in the industry to increase competitiveness and gain market share. Hence, assuring assembly quality while improving ergonomics and optimizing labor costs is desired. Robotized assembly, accomplished by robots or in human-robot collaboration, is a key enabler for fulfilling the increasingly demanding quality and safety as it enables more replicable, transparent, and comprehensible processes than completely manual operations. However, robotized assembly of wire harnesses is challenging in real environments due to the flexibility of the deformable objects, though many preliminary automation solutions have been proposed under simplified industrial configurations. Previous research efforts have proposed the use of computer vision technology to facilitate robotized automation of wire harness assembly, enabling the robots to better perceive and manipulate the flexible wire harness. This article presents an overview on computer vision technology proposed for robotized wire harness assembly and derives research gaps that require further study to facilitate a more practical robotized assembly of wire harness.

Modeling the precise dynamics of off-road vehicles is a complex yet essential task due to the challenging terrain they encounter and the need for optimal performance and safety. Recently, there has been a focus on integrating nominal physics-based models alongside data-driven neural networks using Physics Informed Neural Networks. These approaches often assume the availability of a well-distributed dataset; however, this assumption may not hold due to regions in the physical distribution that are hard to collect, such as high-speed motions and rare terrains. Therefore, we introduce a physics-informed data augmentation methodology called PIAug. We show an example use case of the same by modeling high-speed and aggressive motion predictions, given a dataset with only low-speed data. During the training phase, we leverage the nominal model for generating target domain (medium and high velocity) data using the available source data (low velocity). Subsequently, we employ a physics-inspired loss function with this augmented dataset to incorporate prior knowledge of physics into the neural network. Our methodology results in up to 67% less mean error in trajectory prediction in comparison to a standalone nominal model, especially during aggressive maneuvers at speeds outside the training domain. In real-life navigation experiments, our model succeeds in 4x tighter waypoint tracking constraints than the Kinematic Bicycle Model (KBM) at out-of-domain velocities.

The rapid development and adoption of Generative AI (GAI) technology in the form of chatbots such as ChatGPT and Claude has greatly increased interest in agentic machines. This paper introduces the Autonomous Cognitive Entity (ACE) model, a novel framework for a cognitive architecture, enabling machines and software agents to operate more independently. Drawing inspiration from the OSI model, the ACE framework presents layers of abstraction to conceptualize artificial cognitive architectures. The model is designed to harness the capabilities of the latest generative AI technologies, including large language models (LLMs) and multimodal generative models (MMMs), to build autonomous, agentic systems. The ACE framework comprises six layers: the Aspirational Layer, Global Strategy, Agent Model, Executive Function, Cognitive Control, and Task Prosecution. Each layer plays a distinct role, ranging from setting the moral compass and strategic thinking to task selection and execution. The ACE framework also incorporates mechanisms for handling failures and adapting actions, thereby enhancing the robustness and flexibility of autonomous agents. This paper introduces the conceptual framework and proposes implementation strategies that have been tested and observed in industry. The goal of this paper is to formalize this framework so as to be more accessible.

Mobile autonomous robots have the potential to revolutionize manufacturing processes. However, employing large robot fleets in manufacturing requires addressing challenges including collision-free movement in a shared workspace, effective multi-robot collaboration to manipulate and transport large payloads, complex task allocation due to coupled manufacturing processes, and spatial planning for parallel assembly and transportation of nested subassemblies. We propose a full algorithmic stack for large-scale multi-robot assembly planning that addresses these challenges and can synthesize construction plans for complex assemblies with thousands of parts in a matter of minutes. Our approach takes in a CAD-like product specification and automatically plans a full-stack assembly procedure for a group of robots to manufacture the product. We propose an algorithmic stack that comprises: (i) an iterative radial layout optimization procedure to define a global staging layout for the manufacturing facility, (ii) a graph-repair mixed-integer program formulation and a modified greedy task allocation algorithm to optimally allocate robots and robot sub-teams to assembly and transport tasks, (iii) a geometric heuristic and a hill-climbing algorithm to plan collaborative carrying configurations of robot sub-teams, and (iv) a distributed control policy that enables robots to execute the assembly motion plan collision-free. We also present an open-source multi-robot manufacturing simulator implemented in Julia as a resource to the research community, to test our algorithms and to facilitate multi-robot manufacturing research more broadly. Our empirical results demonstrate the scalability and effectiveness of our approach by generating plans to manufacture a LEGO model of a Saturn V launch vehicle with 1845 parts, 306 subassemblies, and 250 robots in under three minutes on a standard laptop computer.

Learning continuous-time point processes is essential to many discrete event forecasting tasks. However, integration poses a major challenge, particularly for spatiotemporal point processes (STPPs), as it involves calculating the likelihood through triple integrals over space and time. Existing methods for integrating STPP either assume a parametric form of the intensity function, which lacks flexibility; or approximating the intensity with Monte Carlo sampling, which introduces numerical errors. Recent work by Omi et al. [2019] proposes a dual network approach for efficient integration of flexible intensity function. However, their method only focuses on the 1D temporal point process. In this paper, we introduce a novel paradigm: AutoSTPP (Automatic Integration for Spatiotemporal Neural Point Processes) that extends the dual network approach to 3D STPP. While previous work provides a foundation, its direct extension overly restricts the intensity function and leads to computational challenges. In response, we introduce a decomposable parametrization for the integral network using ProdNet. This approach, leveraging the product of simplified univariate graphs, effectively sidesteps the computational complexities inherent in multivariate computational graphs. We prove the consistency of AutoSTPP and validate it on synthetic data and benchmark real-world datasets. AutoSTPP shows a significant advantage in recovering complex intensity functions from irregular spatiotemporal events, particularly when the intensity is sharply localized. Our code is open-source at //github.com/Rose-STL-Lab/AutoSTPP.

Interaction-aware Autonomous Driving (IAAD) is a rapidly growing field of research that focuses on the development of autonomous vehicles (AVs) that are capable of interacting safely and efficiently with human road users. This is a challenging task, as it requires the autonomous vehicle to be able to understand and predict the behaviour of human road users. In this literature review, the current state of IAAD research is surveyed in this work. Commencing with an examination of terminology, attention is drawn to challenges and existing models employed for modelling the behaviour of drivers and pedestrians. Next, a comprehensive review is conducted on various techniques proposed for interaction modelling, encompassing cognitive methods, machine learning approaches, and game-theoretic methods. The conclusion is reached through a discussion of potential advantages and risks associated with IAAD, along with the illumination of pivotal research inquiries necessitating future exploration.

Data transmission between two or more digital devices in industry and government demands secure and agile technology. Digital information distribution often requires deployment of Internet of Things (IoT) devices and Data Fusion techniques which have also gained popularity in both, civilian and military environments, such as, emergence of Smart Cities and Internet of Battlefield Things (IoBT). This usually requires capturing and consolidating data from multiple sources. Because datasets do not necessarily originate from identical sensors, fused data typically results in a complex Big Data problem. Due to potentially sensitive nature of IoT datasets, Blockchain technology is used to facilitate secure sharing of IoT datasets, which allows digital information to be distributed, but not copied. However, blockchain has several limitations related to complexity, scalability, and excessive energy consumption. We propose an approach to hide information (sensor signal) by transforming it to an image or an audio signal. In one of the latest attempts to the military modernization, we investigate sensor fusion approach by investigating the challenges of enabling an intelligent identification and detection operation and demonstrates the feasibility of the proposed Deep Learning and Anomaly Detection models that can support future application for specific hand gesture alert system from wearable devices.

Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims at querying unseen facts of a relation given its few-shot reference entity pairs. Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties, i.e., entities may exhibit diverse roles within task relations, and references may make different contributions to queries. This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations. Specifically, entities are modeled by an adaptive neighbor encoder to discern their task-oriented roles, while references are modeled by an adaptive query-aware aggregator to differentiate their contributions. Through the attention mechanism, both entities and references can capture their fine-grained semantic meanings, and thus render more expressive representations. This will be more predictive for knowledge acquisition in the few-shot scenario. Evaluation in link prediction on two public datasets shows that our approach achieves new state-of-the-art results with different few-shot sizes.

We consider the problem of referring image segmentation. Given an input image and a natural language expression, the goal is to segment the object referred by the language expression in the image. Existing works in this area treat the language expression and the input image separately in their representations. They do not sufficiently capture long-range correlations between these two modalities. In this paper, we propose a cross-modal self-attention (CMSA) module that effectively captures the long-range dependencies between linguistic and visual features. Our model can adaptively focus on informative words in the referring expression and important regions in the input image. In addition, we propose a gated multi-level fusion module to selectively integrate self-attentive cross-modal features corresponding to different levels in the image. This module controls the information flow of features at different levels. We validate the proposed approach on four evaluation datasets. Our proposed approach consistently outperforms existing state-of-the-art methods.

北京阿比特科技有限公司