亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We present DiffRoom, a novel framework for tackling the problem of high-quality 3D indoor room reconstruction and generation, both of which are challenging due to the complexity and diversity of the room geometry. Although diffusion-based generative models have previously demonstrated impressive performance in image generation and object-level 3D generation, they have not yet been applied to room-level 3D generation due to their computationally intensive costs. In DiffRoom, we propose a sparse 3D diffusion network that is efficient and possesses strong generative performance for Truncated Signed Distance Field (TSDF), based on a rough occupancy prior. Inspired by KinectFusion's incremental alignment and fusion of local SDFs, we propose a diffusion-based TSDF fusion approach that iteratively diffuses and fuses TSDFs, facilitating the reconstruction and generation of an entire room environment. Additionally, to ease training, we introduce a curriculum diffusion learning paradigm that speeds up the training convergence process and enables high-quality reconstruction. According to the user study, the mesh quality generated by our DiffRoom can even outperform the ground truth mesh provided by ScanNet. Please visit our project page for the latest progress and demonstrations: //akirahero.github.io/DiffRoom/.

相關內容

 3D是(shi)(shi)(shi)英文“Three Dimensions”的(de)簡稱,中(zhong)文是(shi)(shi)(shi)指三維(wei)、三個維(wei)度、三個坐標,即(ji)有長、有寬(kuan)、有高,換句話說,就是(shi)(shi)(shi)立體的(de),是(shi)(shi)(shi)相對于只有長和(he)寬(kuan)的(de)平(ping)面(2D)而言。

This paper presents OmniDataComposer, an innovative approach for multimodal data fusion and unlimited data generation with an intent to refine and uncomplicate interplay among diverse data modalities. Coming to the core breakthrough, it introduces a cohesive data structure proficient in processing and merging multimodal data inputs, which include video, audio, and text. Our crafted algorithm leverages advancements across multiple operations such as video/image caption extraction, dense caption extraction, Automatic Speech Recognition (ASR), Optical Character Recognition (OCR), Recognize Anything Model(RAM), and object tracking. OmniDataComposer is capable of identifying over 6400 categories of objects, substantially broadening the spectrum of visual information. It amalgamates these diverse modalities, promoting reciprocal enhancement among modalities and facilitating cross-modal data correction. \textbf{The final output metamorphoses each video input into an elaborate sequential document}, virtually transmuting videos into thorough narratives, making them easier to be processed by large language models. Future prospects include optimizing datasets for each modality to encourage unlimited data generation. This robust base will offer priceless insights to models like ChatGPT, enabling them to create higher quality datasets for video captioning and easing question-answering tasks based on video content. OmniDataComposer inaugurates a new stage in multimodal learning, imparting enormous potential for augmenting AI's understanding and generation of complex, real-world data.

We propose three novel spatial data selection techniques for particle data in VR visualization environments. They are designed to be target- and context-aware and be suitable for a wide range of data features and complex scenarios. Each technique is designed to be adjusted to particular selection intents: the selection of consecutive dense regions, the selection of filament-like structures, and the selection of clusters -- with all of them facilitating post-selection threshold adjustment. These techniques allow users to precisely select those regions of space for further exploration -- with simple and approximate 3D pointing, brushing, or drawing input -- using flexible point- or path-based input and without being limited by 3D occlusions, non-homogeneous feature density, or complex data shapes. These new techniques are evaluated in a controlled experiment and compared with the Baseline method, a region-based 3D painting selection. Our results indicate that our techniques are effective in handling a wide range of scenarios and allow users to select data based on their comprehension of crucial features. Furthermore, we analyze the attributes, requirements, and strategies of our spatial selection methods and compare them with existing state-of-the-art selection methods to handle diverse data features and situations. Based on this analysis we provide guidelines for choosing the most suitable 3D spatial selection techniques based on the interaction environment, the given data characteristics, or the need for interactive post-selection threshold adjustment.

This research paper presents a novel approach to pothole detection using Deep Learning and Image Processing techniques. The proposed system leverages the VGG16 model for feature extraction and utilizes a custom Siamese network with triplet loss, referred to as RoadScan. The system aims to address the critical issue of potholes on roads, which pose significant risks to road users. Accidents due to potholes on the roads have led to numerous accidents. Although it is necessary to completely remove potholes, it is a time-consuming process. Hence, a general road user should be able to detect potholes from a safe distance in order to avoid damage. Existing methods for pothole detection heavily rely on object detection algorithms which tend to have a high chance of failure owing to the similarity in structures and textures of a road and a pothole. Additionally, these systems utilize millions of parameters thereby making the model difficult to use in small-scale applications for the general citizen. By analyzing diverse image processing methods and various high-performing networks, the proposed model achieves remarkable performance in accurately detecting potholes. Evaluation metrics such as accuracy, EER, precision, recall, and AUROC validate the effectiveness of the system. Additionally, the proposed model demonstrates computational efficiency and cost-effectiveness by utilizing fewer parameters and data for training. The research highlights the importance of technology in the transportation sector and its potential to enhance road safety and convenience. The network proposed in this model performs with a 96.12 % accuracy, 3.89 % EER, and a 0.988 AUROC value, which is highly competitive with other state-of-the-art works.

Process-aware information systems offer extensive advantages to companies, facilitating planning, operations, and optimization of day-to-day business activities. However, the time-consuming but required step of designing formal business process models often hampers the potential of these systems. To overcome this challenge, automated generation of business process models from natural language text has emerged as a promising approach to expedite this step. Generally two crucial subtasks have to be solved: extracting process-relevant information from natural language and creating the actual model. Approaches towards the first subtask are rule based methods, highly optimized for specific domains, but hard to adapt to related applications. To solve this issue, we present an extension to an existing pipeline, to make it entirely data driven. We demonstrate the competitiveness of our improved pipeline, which not only eliminates the substantial overhead associated with feature engineering and rule definition, but also enables adaptation to different datasets, entity and relation types, and new domains. Additionally, the largest available dataset (PET) for the first subtask, contains no information about linguistic references between mentions of entities in the process description. Yet, the resolution of these mentions into a single visual element is essential for high quality process models. We propose an extension to the PET dataset that incorporates information about linguistic references and a corresponding method for resolving them. Finally, we provide a detailed analysis of the inherent challenges in the dataset at hand.

We present dPASP, a novel declarative probabilistic logic programming framework for differentiable neuro-symbolic reasoning. The framework allows for the specification of discrete probabilistic models with neural predicates, logic constraints and interval-valued probabilistic choices, thus supporting models that combine low-level perception (images, texts, etc), common-sense reasoning, and (vague) statistical knowledge. To support all such features, we discuss the several semantics for probabilistic logic programs that can express nondeterministic, contradictory, incomplete and/or statistical knowledge. We also discuss how gradient-based learning can be performed with neural predicates and probabilistic choices under selected semantics. We then describe an implemented package that supports inference and learning in the language, along with several example programs. The package requires minimal user knowledge of deep learning system's inner workings, while allowing end-to-end training of rather sophisticated models and loss functions.

We study the coboundary expansion property of product codes called product expansion, which played a key role in all recent constructions of good qLDPC codes. It was shown before that this property is equivalent to robust testability and agreement testability for products of two codes with linear distance. First, we show that robust testability for product of many codes with linear distance is equivalent to agreement testability. Second, we provide an example of product of three codes with linear distance which is robustly testable but not product expanding.

Our work presents a novel spectrum-inspired learning-based approach for generating clothing deformations with dynamic effects and personalized details. Existing methods in the field of clothing animation are limited to either static behavior or specific network models for individual garments, which hinders their applicability in real-world scenarios where diverse animated garments are required. Our proposed method overcomes these limitations by providing a unified framework that predicts dynamic behavior for different garments with arbitrary topology and looseness, resulting in versatile and realistic deformations. First, we observe that the problem of bias towards low frequency always hampers supervised learning and leads to overly smooth deformations. To address this issue, we introduce a frequency-control strategy from a spectral perspective that enhances the generation of high-frequency details of the deformation. In addition, to make the network highly generalizable and able to learn various clothing deformations effectively, we propose a spectral descriptor to achieve a generalized description of the global shape information. Building on the above strategies, we develop a dynamic clothing deformation estimator that integrates frequency-controllable attention mechanisms with long short-term memory. The estimator takes as input expressive features from garments and human bodies, allowing it to automatically output continuous deformations for diverse clothing types, independent of mesh topology or vertex count. Finally, we present a neural collision handling method to further enhance the realism of garments. Our experimental results demonstrate the effectiveness of our approach on a variety of free-swinging garments and its superiority over state-of-the-art methods.

This paper presents ZePoP, a leader election protocol for distributed systems, optimizing a delay-based closeness centrality. We design the protocol specifically for the Peer to Peer(P2P) applications, where the leader peer (node) is responsible for collecting, processing, and redistributing data or control signals satisfying some timing constraints. The protocol elects an optimal leader node in the dynamically changing network and constructs a Data Collection and Distribution Tree (DCDT) rooted at the leader node. The elected optimal leader is closest to all nodes in the system compared to other nodes. We validate the proposed protocol through theoretical proofs as well as experimental results.

The current zero trust model adopted in System-on-Chip (SoC) design is vulnerable to various malicious entities, and modern SoC designs must incorporate various security policies to protect sensitive assets from unauthorized access. These policies involve complex interactions between multiple IP blocks, which poses challenges for SoC designers and security experts when implementing these policies and for system validators when ensuring compliance. Difficulties arise when upgrading policies, reusing IPs for systems targeting different security requirements, and the subsequent increase in design time and time-to-market. This paper proposes a generic and flexible framework, called DiSPEL, for enforcing security policies defined by the user represented in a formal way for any bus-based SoC design. It employs a distributed deployment strategy while ensuring trusted bus operations despite the presence of untrusted IPs. It relies on incorporating a dedicated, centralized module capable of implementing diverse security policies involving bus-level interactions while generating the necessary logic and appending in the bus-level wrapper for IP-level policies. The proposed architecture is generic and independent of specific security policy types supporting both synthesizable and non-synthesizable solutions. The experimental results demonstrate its effectiveness and correctness in enforcing the security requirements and viability due to low overhead in terms of area, delay, and power consumption tested on open-source standard SoC benchmarks.

Despite the significant research efforts on trajectory prediction for automated driving, limited work exists on assessing the prediction reliability. To address this limitation we propose an approach that covers two sources of error, namely novel situations with out-of-distribution (OOD) detection and the complexity in in-distribution (ID) situations with uncertainty estimation. We introduce two modules next to an encoder-decoder network for trajectory prediction. Firstly, a Gaussian mixture model learns the probability density function of the ID encoder features during training, and then it is used to detect the OOD samples in regions of the feature space with low likelihood. Secondly, an error regression network is applied to the encoder, which learns to estimate the trajectory prediction error in supervised training. During inference, the estimated prediction error is used as the uncertainty. In our experiments, the combination of both modules outperforms the prior work in OOD detection and uncertainty estimation, on the Shifts robust trajectory prediction dataset by $2.8 \%$ and $10.1 \%$, respectively. The code is publicly available.

北京阿比特科技有限公司