
Surface reconstruction has traditionally relied on the Multi-View Stereo (MVS)-based pipeline, which often suffers from noisy and incomplete geometry. Although MVS has proven effective at recovering scene geometry, especially in locally detailed areas with rich textures, it struggles in regions with low texture and large illumination variations, where photometric consistency is unreliable. Recently, Neural Implicit Surface Reconstruction (NISR), which combines surface rendering and volume rendering techniques and bypasses MVS as an intermediate step, has emerged as a promising alternative to overcome the limitations of traditional pipelines. While NISR has shown impressive results on simple scenes, recovering delicate geometry from uncontrolled real-world scenes remains challenging because its optimization is underconstrained. To this end, the PSDF framework is proposed, which resorts to external geometric priors from a pretrained MVS network and internal geometric priors inherent in the NISR model to facilitate high-quality neural implicit surface learning. Specifically, a visibility-aware feature consistency loss and depth prior-assisted sampling based on the external geometric priors are introduced. These components provide powerful geometric consistency constraints and help locate surface intersection points, thereby significantly improving the accuracy and detail of NISR reconstructions. Meanwhile, internal prior-guided importance rendering is presented to enhance the fidelity of the reconstructed surface mesh by mitigating the biased rendering issue in NISR. Extensive experiments on the Tanks and Temples dataset show that PSDF achieves state-of-the-art performance on complex uncontrolled scenes.
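To make the depth prior-assisted sampling idea concrete, here is a minimal sketch of how a depth estimate from a pretrained MVS network could concentrate ray samples near the likely surface intersection. The function name, the Gaussian sampling band, and all hyperparameters are illustrative assumptions, not PSDF's actual implementation.

```python
import numpy as np

def depth_prior_assisted_sampling(ray_o, ray_d, mvs_depth,
                                  n_uniform=32, n_guided=32,
                                  sigma=0.05, near=0.1, far=4.0):
    """Sample points along a ray: coarse uniform samples plus extra samples
    concentrated around the depth predicted by a pretrained MVS network.
    All hyperparameters here are illustrative assumptions."""
    # Coarse samples spread over the full [near, far] range.
    t_uniform = np.linspace(near, far, n_uniform)
    # Extra samples drawn from a Gaussian band around the MVS depth prior,
    # so more samples land near the likely surface intersection point.
    t_guided = np.clip(np.random.normal(loc=mvs_depth, scale=sigma, size=n_guided),
                       near, far)
    t_all = np.sort(np.concatenate([t_uniform, t_guided]))
    # 3D sample locations along the ray o + t * d.
    points = ray_o[None, :] + t_all[:, None] * ray_d[None, :]
    return points, t_all
```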

Related Content

Surface is a line of computers from Microsoft running Windows 10 (earlier models ran Windows 8.x); the current lineup comprises the Surface, Surface Pro, and Surface Book series. The first-generation Surface Pro/RT was announced by then-Microsoft CEO Steve Ballmer at a press event in Los Angeles on June 18, 2012, and went on sale on October 26, 2012.

Vision-Language Transformers (VLTs) have shown great success recently, but come with heavy computation costs, largely attributable to the large number of visual and language tokens. Existing token pruning research for compressing VLTs mainly follows a single-modality-based scheme and ignores the critical role of aligning different modalities for guiding the token pruning process, causing tokens important for one modality to be falsely pruned in another modality branch. Meanwhile, existing VLT pruning works also lack the flexibility to dynamically compress each layer based on different input samples. To this end, we propose a novel framework named Multimodal Alignment-Guided Dynamic Token Pruning (MADTP) for accelerating various VLTs. Specifically, we first introduce a well-designed Multi-modality Alignment Guidance (MAG) module that can align features of the same semantic concept from different modalities, to ensure the pruned tokens are less important for all modalities. We further design a novel Dynamic Token Pruning (DTP) module, which can adaptively adjust the token compression ratio in each layer based on different input instances. Extensive experiments on various benchmarks demonstrate that MADTP significantly reduces the computational complexity of various multimodal models while preserving competitive performance. Notably, when applied to the BLIP model on the NLVR2 dataset, MADTP can reduce GFLOPs by 80% with less than 4% performance degradation.
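As a rough illustration of alignment-guided token pruning, the sketch below scores each visual token by its best cosine alignment with the language tokens and keeps only the top fraction. The scoring rule and the per-sample keep ratio are assumptions for illustration and do not reproduce the MAG or DTP modules.

```python
import torch
import torch.nn.functional as F

def alignment_guided_prune(vis_tokens, txt_tokens, keep_ratio):
    """Prune visual tokens whose best alignment with any language token is low.
    A minimal sketch only; the scoring rule and keep_ratio are assumptions.
    vis_tokens: (B, Nv, D), txt_tokens: (B, Nt, D)."""
    v = F.normalize(vis_tokens, dim=-1)
    t = F.normalize(txt_tokens, dim=-1)
    # Cosine similarity between every visual token and every language token.
    sim = torch.bmm(v, t.transpose(1, 2))                 # (B, Nv, Nt)
    score = sim.max(dim=-1).values                        # per-visual-token importance
    n_keep = max(1, int(vis_tokens.size(1) * keep_ratio))
    idx = score.topk(n_keep, dim=1).indices.sort(dim=1).values  # keep original order
    batch = torch.arange(vis_tokens.size(0)).unsqueeze(-1)
    return vis_tokens[batch, idx]                         # (B, n_keep, D)
```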

Unsupervised Anomaly Detection (UAD) techniques aim to identify and localize anomalies without relying on annotations, only leveraging a model trained on a dataset known to be free of anomalies. Diffusion models learn to modify inputs $x$ to increase the probability that they belong to a desired distribution, i.e., they model the score function $\nabla_x \log p(x)$. Such a score function is potentially relevant for UAD, since $\nabla_x \log p(x)$ is itself a pixel-wise anomaly score. However, diffusion models are trained to invert a corruption process based on Gaussian noise, and the learned score function is unlikely to generalize to medical anomalies. This work addresses the problem of how to learn a score function relevant for UAD and proposes DISYRE: Diffusion-Inspired SYnthetic REstoration. We retain the diffusion-like pipeline but replace the Gaussian noise corruption with a gradual, synthetic anomaly corruption, so the learned score function generalizes to medical, naturally occurring anomalies. We evaluate DISYRE on three common brain MRI UAD benchmarks and substantially outperform other methods on two of the three tasks.
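A minimal sketch of how a restoration model trained on synthetic anomaly corruptions could be turned into a pixel-wise anomaly score at test time follows. The `restoration_net(x, t)` interface and the averaging over a few severity levels are assumptions, not DISYRE's published inference procedure.

```python
import torch

@torch.no_grad()
def anomaly_score(x, restoration_net, timesteps=(0.25, 0.5, 0.75, 1.0)):
    """Pixel-wise anomaly score from a diffusion-inspired restoration model.
    The restoration_net(x, t) -> restored image interface and the severity
    levels are assumptions for illustration.
    x: (B, C, H, W) image that may contain anomalies."""
    scores = []
    for t in timesteps:
        t_batch = torch.full((x.size(0),), t, device=x.device)
        x_restored = restoration_net(x, t_batch)
        # Where the model "restores" a lot, the input deviates from the healthy
        # distribution, so the residual acts like a score-function magnitude.
        scores.append((x - x_restored).abs().mean(dim=1, keepdim=True))
    return torch.stack(scores, dim=0).mean(dim=0)         # (B, 1, H, W)
```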

Weakly-supervised segmentation (WSS) has emerged as a solution to mitigate the conflict between annotation cost and model performance by adopting sparse annotation formats (e.g., point, scribble, block, etc.). Typical approaches attempt to exploit anatomy and topology priors to directly expand sparse annotations into pseudo-labels. However, due to a lack of attention to the ambiguous edges in medical images and insufficient exploration of sparse supervision, existing approaches tend to generate erroneous and overconfident pseudo proposals in noisy regions, leading to cumulative model error and performance degradation. In this work, we propose a novel WSS approach, named ProCNS, encompassing two synergistic modules devised with the principles of progressive prototype calibration and noise suppression. Specifically, we design a Prototype-based Regional Spatial Affinity (PRSA) loss to maximize the pair-wise affinities between spatial and semantic elements, providing our model of interest with more reliable guidance. The affinities are derived from the input images and the prototype-refined predictions. Meanwhile, we propose an Adaptive Noise Perception and Masking (ANPM) module to obtain more enriched and representative prototype representations, which adaptively identifies and masks noisy regions within the pseudo proposals, reducing potential erroneous interference during prototype computation. Furthermore, we generate specialized soft pseudo-labels for the noisy regions identified by ANPM, providing supplementary supervision. Extensive experiments on three medical image segmentation tasks involving different modalities demonstrate that the proposed framework significantly outperforms representative state-of-the-art methods.
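To illustrate the prototype idea, the sketch below computes soft class prototypes from the current predictions and re-scores every pixel by its similarity to them. It is only a schematic of prototype-refined predictions and does not reproduce the PRSA loss or the ANPM module.

```python
import torch
import torch.nn.functional as F

def prototype_refined_prediction(features, probs, eps=1e-6):
    """Compute soft class prototypes from predictions and re-score pixels by
    similarity to those prototypes. A minimal sketch in the spirit of
    prototype-based refinement; not the actual ProCNS modules.
    features: (B, D, H, W) pixel embeddings, probs: (B, K, H, W) softmax output."""
    B, D, H, W = features.shape
    K = probs.size(1)
    f = features.flatten(2)                  # (B, D, N)
    p = probs.flatten(2)                     # (B, K, N)
    # Soft class prototypes: prediction-weighted mean of pixel embeddings.
    protos = torch.bmm(p, f.transpose(1, 2)) / (p.sum(-1, keepdim=True) + eps)  # (B, K, D)
    # Cosine similarity of every pixel to every prototype -> refined prediction.
    sim = torch.bmm(F.normalize(protos, dim=-1), F.normalize(f, dim=1))         # (B, K, N)
    return sim.softmax(dim=1).view(B, K, H, W)
```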

We present AlloyInEcore, a tool for specifying metamodels with their static semantics to facilitate automated, formal reasoning on models. Software development projects require that software systems be specified in various models (e.g., requirements models, architecture models, test models, and source code). It is crucial to reason about those models to ensure the correct and complete system specifications. AlloyInEcore allows the user to specify metamodels with their static semantics, while, using the semantics, it automatically detects inconsistent models, and completes partial models. It has been evaluated on three industrial case studies in the automotive domain (//modelwriter.github.io/AlloyInEcore/).

The restoration lemma is a classic result by Afek, Bremler-Barr, Kaplan, Cohen, and Merritt [PODC '01], which relates the structure of shortest paths in a graph $G$ before and after some edges in the graph fail. Their work shows that, after one edge failure, any replacement shortest path avoiding this failing edge can be partitioned into two pre-failure shortest paths. More generally, this implies an additive tradeoff between fault tolerance and subpath count: for any $f, k$, we can partition any $f$-edge-failure replacement shortest path into $k+1$ subpaths which are each an $(f-k)$-edge-failure replacement shortest path. This generalized result has found applications in routing, graph algorithms, fault tolerant network design, and more. Our main result improves this to a multiplicative tradeoff between fault tolerance and subpath count. We show that for all $f, k$, any $f$-edge-failure replacement path can be partitioned into $O(k)$ subpaths that are each an $(f/k)$-edge-failure replacement path. We also show an asymptotically matching lower bound. In particular, our results imply that the original restoration lemma is exactly tight in the case $k=1$, but can be significantly improved for larger $k$. We also show an extension of this result to weighted input graphs, and we give efficient algorithms that compute path decompositions satisfying our improved restoration lemmas.
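For readability, the additive and multiplicative tradeoffs described above can be restated compactly as follows (notation taken from the abstract; this is only a paraphrase of the stated results).

```latex
% Generalized (additive) restoration lemma: an f-edge-failure replacement
% shortest path P can be partitioned as
\[
  P = P_1 \circ P_2 \circ \cdots \circ P_{k+1},
  \qquad \text{each } P_i \text{ an } (f-k)\text{-edge-failure replacement shortest path.}
\]
% Improved (multiplicative) tradeoff shown in this work:
\[
  P = P_1 \circ P_2 \circ \cdots \circ P_{O(k)},
  \qquad \text{each } P_i \text{ an } (f/k)\text{-edge-failure replacement shortest path,}
\]
% with an asymptotically matching lower bound.
```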

Nonlinear Model Predictive Control (NMPC) is a state-of-the-art approach for locomotion and manipulation which leverages trajectory optimization at each control step. While the performance of this approach is computationally bounded, implementations of direct trajectory optimization that use iterative methods to solve the underlying moderately large, sparse linear systems are a natural fit for parallel hardware acceleration. In this work, we introduce MPCGPU, a GPU-accelerated, real-time NMPC solver that leverages an accelerated preconditioned conjugate gradient (PCG) linear system solver at its core. We show that MPCGPU increases the scalability and real-time performance of NMPC, solving larger problems at faster rates. In particular, for tracking tasks using the Kuka IIWA manipulator, MPCGPU is able to scale to kilohertz control rates with trajectories as long as 512 knot points. This is driven by a custom PCG solver which outperforms state-of-the-art, CPU-based, linear system solvers by at least 10x for a majority of solves and 3.6x on average.
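For reference, the textbook preconditioned conjugate gradient iteration that such a solver accelerates looks like the sketch below; MPCGPU's GPU kernels, scheduling, and preconditioner choice are not reproduced here.

```python
import numpy as np

def preconditioned_cg(A, b, M_inv, x0=None, tol=1e-8, max_iter=100):
    """Preconditioned conjugate gradient for a symmetric positive-definite
    system A x = b, with M_inv(r) applying an approximate inverse preconditioner.
    A plain reference implementation, not the paper's GPU solver."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    z = M_inv(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv(r)
        rz_new = r @ z
        beta = rz_new / rz
        p = z + beta * p
        rz = rz_new
    return x
```

A simple Jacobi (diagonal) preconditioner could be passed as, e.g., `M_inv = lambda r: r / np.diag(A)`; stronger preconditioners trade more per-iteration work for fewer iterations.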

Accurate forecasting of renewable generation is crucial to facilitate the integration of renewable energy sources (RES) into the power system. Focusing on photovoltaic (PV) units, forecasting methods can be divided into two main categories: physics-based and data-based strategies, with AI-based models providing state-of-the-art performance. However, while these AI-based models can capture complex patterns and relationships in the data, they ignore the underlying physical prior knowledge of the phenomenon. Therefore, in this paper we propose MATNet, a novel self-attention transformer-based architecture for multivariate multi-step day-ahead PV power generation forecasting. It consists of a hybrid approach that combines the AI paradigm with the prior physical knowledge of PV power generation used by physics-based methods. The model is fed with historical PV data and historical and forecast weather data through a multi-level joint fusion approach. The effectiveness of the proposed model is evaluated using the Ausgrid benchmark dataset with different regression performance metrics. The results show that our proposed architecture significantly outperforms the current state-of-the-art methods. These findings demonstrate the potential of MATNet in improving forecasting accuracy and suggest that it could be a promising solution to facilitate the integration of PV energy into the power grid.
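As a schematic of multi-branch fusion for day-ahead forecasting, the toy model below encodes historical PV, historical weather, and forecast weather separately and fuses the branch summaries before decoding a 24-step horizon. Layer types, sizes, the fusion rule, and the horizon are assumptions and do not reflect MATNet's actual transformer architecture.

```python
import torch
import torch.nn as nn

class JointFusionForecaster(nn.Module):
    """Toy multi-branch day-ahead PV forecaster: separate encoders for
    historical PV, historical weather, and forecast weather, fused and decoded
    into the next-day horizon. A minimal sketch only."""
    def __init__(self, pv_dim=1, wx_dim=8, hidden=64, horizon=24):
        super().__init__()
        self.enc_pv = nn.GRU(pv_dim, hidden, batch_first=True)
        self.enc_wx_hist = nn.GRU(wx_dim, hidden, batch_first=True)
        self.enc_wx_fcst = nn.GRU(wx_dim, hidden, batch_first=True)
        self.head = nn.Sequential(nn.Linear(3 * hidden, hidden), nn.ReLU(),
                                  nn.Linear(hidden, horizon))

    def forward(self, pv_hist, wx_hist, wx_fcst):
        # Use the last hidden state of each branch as its summary representation.
        _, h_pv = self.enc_pv(pv_hist)
        _, h_wh = self.enc_wx_hist(wx_hist)
        _, h_wf = self.enc_wx_fcst(wx_fcst)
        fused = torch.cat([h_pv[-1], h_wh[-1], h_wf[-1]], dim=-1)
        return self.head(fused)            # (B, horizon) day-ahead PV power
```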

We introduce Crimson, a system that enhances the strategic reasoning capabilities of Large Language Models (LLMs) within the realm of cybersecurity. By correlating CVEs with MITRE ATT&CK techniques, Crimson advances threat anticipation and strategic defense efforts. Our approach includes defining and evaluating cybersecurity strategic tasks, alongside implementing a comprehensive human-in-the-loop data-synthesis workflow to develop the CVE-to-ATT&CK Mapping (CVEM) dataset. We further enhance LLMs' reasoning abilities through a novel Retrieval-Aware Training (RAT) process and its refined iteration, RAT-R. Our findings demonstrate that an LLM fine-tuned with our techniques, possessing 7 billion parameters, approaches the performance level of GPT-4, showing markedly lower rates of hallucination and errors, and surpassing other models in strategic reasoning tasks. Moreover, domain-specific fine-tuning of embedding models significantly improves performance within cybersecurity contexts, underscoring the efficacy of our methodology. By leveraging Crimson to convert raw vulnerability data into structured and actionable insights, we bolster proactive cybersecurity defenses.

Most existing event extraction (EE) methods merely extract event arguments within the sentence scope. However, such sentence-level EE methods struggle to handle soaring amounts of documents from emerging applications, such as finance, legislation, health, etc., where event arguments always scatter across different sentences, and even multiple such event mentions frequently co-exist in the same document. To address these challenges, we propose a novel end-to-end model, Doc2EDAG, which can generate an entity-based directed acyclic graph to fulfill the document-level EE (DEE) effectively. Moreover, we reformalize a DEE task with the no-trigger-words design to ease the document-level event labeling. To demonstrate the effectiveness of Doc2EDAG, we build a large-scale real-world dataset consisting of Chinese financial announcements with the challenges mentioned above. Extensive experiments with comprehensive analyses illustrate the superiority of Doc2EDAG over state-of-the-art methods. Data and codes can be found at //github.com/dolphin-zs/Doc2EDAG.
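As a rough sketch of entity-based DAG expansion, the function below greedily extends partial event records role by role over candidate entities. The `score_fn` interface and the thresholding heuristic are assumptions for illustration; Doc2EDAG instead uses learned path-expanding classifiers over document-level entity representations.

```python
def expand_edag(roles, candidate_entities, score_fn, threshold=0.5):
    """Greedy sketch of entity-based DAG expansion for document-level event
    extraction: each partial record is extended, role by role, with every
    candidate entity (or None) whose score passes a threshold.
    score_fn(partial_record, role, entity) -> float is an assumed interface."""
    records = [[]]                                      # start from the empty partial record
    for role in roles:
        new_records = []
        for rec in records:
            extended = False
            for ent in candidate_entities:
                if score_fn(rec, role, ent) >= threshold:
                    new_records.append(rec + [(role, ent)])
                    extended = True
            if not extended:
                new_records.append(rec + [(role, None)])   # role left unfilled
        records = new_records
    return records
```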

The problem of Multiple Object Tracking (MOT) consists in following the trajectory of different objects in a sequence, usually a video. In recent years, with the rise of Deep Learning, the algorithms that provide a solution to this problem have benefited from the representational power of deep models. This paper provides a comprehensive survey on works that employ Deep Learning models to solve the task of MOT on single-camera videos. Four main steps in MOT algorithms are identified, and an in-depth review of how Deep Learning was employed in each one of these stages is presented. A complete experimental comparison of the presented works on the three MOTChallenge datasets is also provided, identifying a number of similarities among the top-performing methods and presenting some possible future research directions.
