
High-resolution wide-angle fisheye images are becoming more and more important for robotics applications such as autonomous driving. However, using ordinary convolutional neural networks or vision transformers on this data is problematic due to projection and distortion losses introduced when projecting to a rectangular grid on the plane. We introduce the HEAL-SWIN transformer, which combines the highly uniform Hierarchical Equal Area iso-Latitude Pixelation (HEALPix) grid used in astrophysics and cosmology with the Hierarchical Shifted-Window (SWIN) transformer to yield an efficient and flexible model capable of training on high-resolution, distortion-free spherical data. In HEAL-SWIN, the nested structure of the HEALPix grid is used to perform the patching and windowing operations of the SWIN transformer, resulting in a one-dimensional representation of the spherical data with minimal computational overhead. We demonstrate the superior performance of our model for semantic segmentation and depth regression tasks on both synthetic and real automotive datasets. Our code is available at //github.com/JanEGerken/HEAL-SWIN.
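To illustrate how the nested structure of the grid supports patching on a one-dimensional array, here is a minimal sketch in plain Python. It relies on two standard HEALPix facts: a full-sphere grid at resolution nside has 12 * nside^2 pixels, and in the nested ordering the four children of pixel p at the next finer resolution have indices 4p to 4p+3. The function names `healpix_npix`, `nested_patches` and `mean_pool` are hypothetical and are not taken from the HEAL-SWIN code base.

```python
def healpix_npix(nside):
    """Number of pixels of a full-sphere HEALPix grid at resolution nside."""
    return 12 * nside * nside


def nested_patches(pixels, patch_size=4):
    """Group a 1-D list of nested-ordered pixel values into patches.

    patch_size must be a power of 4 so that every patch covers exactly one
    coarser HEALPix pixel (hypothetical helper, not the authors' code).
    """
    is_power_of_two = patch_size > 0 and (patch_size & (patch_size - 1)) == 0
    if not (is_power_of_two and patch_size.bit_length() % 2 == 1):
        raise ValueError("patch_size must be a power of 4")
    if len(pixels) % patch_size != 0:
        raise ValueError("pixel count must be divisible by patch_size")
    # In nested ordering, consecutive runs of patch_size pixels are siblings,
    # so patching is a plain reshape of the 1-D array.
    return [pixels[i:i + patch_size] for i in range(0, len(pixels), patch_size)]


def mean_pool(pixels, patch_size=4):
    """Downsample by averaging each patch (one value per coarser pixel)."""
    return [sum(p) / len(p) for p in nested_patches(pixels, patch_size)]


pixels = list(range(healpix_npix(2)))   # toy data: 48 pixels at nside = 2
patches = nested_patches(pixels, 4)     # 12 patches of 4 sibling pixels
coarse = mean_pool(pixels, 4)           # 12 values, i.e. the nside = 1 grid
```

Because siblings are contiguous in memory, the same slicing with a larger power of 4 yields windows, which is why no neighbour search on the sphere is needed for these two operations.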

Related content


Hive · Optimizer · Edge · Scenario · prototype

September 5, 2023

Computing Hive Plots: A Combinatorial Framework

Martin Nöllenburg, Markus Wallinger

from arXiv, Appears in the Proceedings of the 31st International Symposium on Graph Drawing and Network Visualization (GD 2023)

Hive plots are a graph visualization style placing vertices on a set of radial axes emanating from a common center and drawing edges as smooth curves connecting their respective endpoints. In previous work on hive plots, assignment to an axis and vertex positions on each axis were determined based on selected vertex attributes and the order of axes was prespecified. Here, we present a new framework focusing on combinatorial aspects of these drawings to extend the original hive plot idea and optimize visual properties such as the total edge length and the number of edge crossings in the resulting hive plots. Our framework comprises three steps: (1) partition the vertices into multiple groups, each corresponding to an axis of the hive plot; (2) optimize the cyclic axis order to bring more strongly connected groups near each other; (3) optimize the vertex ordering on each axis to minimize edge crossings. Each of the three steps is related to a well-studied, but NP-complete computational problem. We combine and adapt suitable algorithmic approaches, implement them as an instantiation of our framework and show in a case study how it can be applied in a practical setting. Furthermore, we conduct computational experiments to gain further insights regarding algorithmic choices of the framework. The code of the implementation and a prototype web application can be found on OSF.
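Step (2) of the framework, ordering the axes cyclically so that strongly connected groups end up next to each other, can be sketched as follows. This is a brute-force illustration that only works for a handful of axes; it is not the algorithm used by the authors, and the names `inter_group_weights` and `best_cyclic_order` as well as the toy graph are hypothetical.

```python
from itertools import permutations


def inter_group_weights(edges, group_of):
    """Count edges between each unordered pair of distinct groups."""
    weights = {}
    for u, v in edges:
        gu, gv = group_of[u], group_of[v]
        if gu != gv:
            key = frozenset((gu, gv))
            weights[key] = weights.get(key, 0) + 1
    return weights


def best_cyclic_order(groups, weights):
    """Try every cyclic order and keep the one with the most edges
    running between neighbouring axes (assumes at least 3 groups)."""
    first, rest = groups[0], groups[1:]   # fix one axis to skip rotations
    best_order, best_score = None, -1
    for perm in permutations(rest):
        order = (first,) + perm
        score = sum(
            weights.get(frozenset((order[i], order[(i + 1) % len(order)])), 0)
            for i in range(len(order))
        )
        if score > best_score:
            best_order, best_score = list(order), score
    return best_order, best_score


# Toy input: one vertex per group, edges given as vertex pairs.
group_of = {1: "A", 2: "B", 3: "C", 4: "D"}
edges = [(1, 3), (1, 3), (1, 3), (3, 2), (3, 2), (2, 4), (4, 1), (1, 2)]
weights = inter_group_weights(edges, group_of)
order, score = best_cyclic_order(["A", "B", "C", "D"], weights)
```

In this toy graph the heavily connected pairs A-C and C-B become neighbouring axes, and only the single A-B edge has to cross between non-adjacent axes.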

MoDELS · Integration · Boosting · Mask · Image restoration

September 5, 2023

SAM-Deblur: Let Segment Anything Boost Image Deblurring

Siwei Li, Mingxuan Liu, Yating Zhang, Shu Chen, Haoxiang Li, Hong Chen, Zifei Dou

from arXiv, Under review

Image deblurring is a critical task in the field of image restoration, aiming to eliminate blurring artifacts. However, the challenge of addressing non-uniform blurring leads to an ill-posed problem, which limits the generalization performance of existing deblurring models. To solve the problem, we propose a framework SAM-Deblur, integrating prior knowledge from the Segment Anything Model (SAM) into the deblurring task for the first time. In particular, SAM-Deblur is divided into three stages. First, we preprocess the blurred images, obtain image masks via SAM, and propose a mask dropout method for training to enhance model robustness. Then, to fully leverage the structural priors generated by SAM, we propose a Mask Average Pooling (MAP) unit specifically designed to average SAM-generated segmented areas, serving as a plug-and-play component which can be seamlessly integrated into existing deblurring networks. Finally, we feed the fused features generated by the MAP Unit into the deblurring model to obtain a sharp image. Experimental results on the RealBlurJ, ReloBlur, and REDS datasets reveal that incorporating our methods improves NAFNet's PSNR by 0.05, 0.96, and 7.03, respectively. Code will be available at //github.com/HPLQAQ/SAM-Deblur.
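The idea of averaging over segmented areas can be illustrated with a small sketch in plain Python. It assumes a single-channel image given as nested lists and an integer label map of the same shape standing in for the SAM masks; the function name `mask_average_pooling` is hypothetical, and the sketch is not the authors' MAP unit, which operates inside a network.

```python
def mask_average_pooling(image, labels):
    """Replace every pixel by the mean of the segment it belongs to.

    image  : 2-D list of floats (one channel)
    labels : 2-D list of segment ids with the same shape (assumed input,
             standing in for masks produced by a segmentation model)
    """
    sums, counts = {}, {}
    for value_row, label_row in zip(image, labels):
        for value, label in zip(value_row, label_row):
            sums[label] = sums.get(label, 0.0) + value
            counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}
    # Broadcast each segment mean back to the pixels of that segment.
    return [[means[label] for label in label_row] for label_row in labels]


image = [[1.0, 3.0],
         [5.0, 7.0]]
labels = [[0, 0],
          [1, 1]]
pooled = mask_average_pooling(image, labels)
```

The output keeps the spatial size of the input, so it can be concatenated with the blurred image as an extra structural cue.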

Color · Decoding · Extensibility · state-of-the-art · Performer

September 5, 2023

DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders

Xiaoyang Kang, Tao Yang, Wenqi Ouyang, Peiran Ren, Lingzhi Li, Xuansong Xie

from arXiv, ICCV 2023; Code: //github.com/piddnad/DDColor

Image colorization is a challenging problem due to multi-modal uncertainty and high ill-posedness. Directly training a deep neural network usually leads to incorrect semantic colors and low color richness. While transformer-based methods can deliver better results, they often rely on manually designed priors, suffer from poor generalization ability, and introduce color bleeding effects. To address these issues, we propose DDColor, an end-to-end method with dual decoders for image colorization. Our approach includes a pixel decoder and a query-based color decoder. The former restores the spatial resolution of the image, while the latter utilizes rich visual features to refine color queries, thus avoiding hand-crafted priors. Our two decoders work together to establish correlations between color and multi-scale semantic representations via cross-attention, significantly alleviating the color bleeding effect. Additionally, a simple yet effective colorfulness loss is introduced to enhance the color richness. Extensive experiments demonstrate that DDColor achieves superior performance to existing state-of-the-art works both quantitatively and qualitatively. The codes and models are publicly available at //github.com/piddnad/DDColor.

Estimation/Estimator · Gibbs sampling · Continuity · Design · Loss function (machine learning)

September 4, 2023

Hierarchical Regression Discontinuity Design: Pursuing Subgroup Treatment Effects

Shonosuke Sugasawa, Takuya Ishihara, Daisuke Kurisu

from arXiv, 21 pages

Regression discontinuity design (RDD) is widely adopted for causal inference under intervention determined by a continuous variable. While one is interested in treatment effect heterogeneity by subgroups in many applications, RDD typically suffers from small subgroup-wise sample sizes, which makes the estimation results highly unstable. To solve this issue, we introduce hierarchical RDD (HRDD), a hierarchical Bayes approach for pursuing treatment effect heterogeneity in RDD. A key feature of HRDD is to employ a pseudo-model based on a loss function to estimate subgroup-level parameters of treatment effects under RDD, and assign a hierarchical prior distribution to "borrow strength" from other subgroups. The posterior computation can be easily done by a simple Gibbs sampling. We demonstrate the proposed HRDD through simulation and real data analysis, and show that HRDD provides much more stable point and interval estimation than separately applying the standard RDD method to each subgroup.

Image retrieval · Branch · Multimodal · MoDELS · Regularization term

September 4, 2023

Target-Guided Composed Image Retrieval

Haokun Wen, Xian Zhang, Xuemeng Song, Yinwei Wei, Liqiang Nie

Composed image retrieval (CIR) is a new and flexible image retrieval paradigm, which can retrieve the target image for a multimodal query, including a reference image and its corresponding modification text. Although existing efforts have achieved compelling success, they overlook the conflict relationship modeling between the reference image and the modification text for improving the multimodal query composition and the adaptive matching degree modeling for promoting the ranking of the candidate images that could present different levels of matching degrees with the given query. To address these two limitations, in this work, we propose a Target-Guided Composed Image Retrieval network (TG-CIR). In particular, TG-CIR first extracts the unified global and local attribute features for the reference/target image and the modification text with the contrastive language-image pre-training model (CLIP) as the backbone, where an orthogonal regularization is introduced to promote the independence among the attribute features. Then TG-CIR designs a target-query relationship-guided multimodal query composition module, comprising a target-free student composition branch and a target-based teacher composition branch, where the target-query relationship is injected into the teacher branch for guiding the conflict relationship modeling of the student branch. Last, apart from the conventional batch-based classification loss, TG-CIR additionally introduces a batch-based target similarity-guided matching degree regularization to promote the metric learning process. Extensive experiments on three benchmark datasets demonstrate the superiority of our proposed method.
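An orthogonal regularization of the kind mentioned above can be sketched in plain Python. The abstract does not give the exact formula, so this uses one common form as an assumption: normalize each attribute feature vector, then penalize the squared off-diagonal entries of the Gram matrix, which equals the squared Frobenius norm of G - I for unit-norm rows. The function name `orthogonal_penalty` is hypothetical.

```python
import math


def orthogonal_penalty(features):
    """Sum of squared cosine similarities between distinct feature vectors.

    features : list of equal-length lists, one vector per attribute.
    The value is 0 when all vectors are mutually orthogonal and grows as
    they become more aligned (assumed form, not taken from the paper).
    """
    normed = []
    for vec in features:
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0.0:
            raise ValueError("feature vectors must be non-zero")
        normed.append([x / norm for x in vec])

    penalty = 0.0
    for i, a in enumerate(normed):
        for j, b in enumerate(normed):
            if i != j:
                dot = sum(x * y for x, y in zip(a, b))
                penalty += dot * dot
    return penalty


independent = orthogonal_penalty([[1.0, 0.0], [0.0, 2.0]])
redundant = orthogonal_penalty([[1.0, 0.0], [1.0, 0.0]])
```

Adding such a term to the training loss pushes the attribute features apart, so that each one tends to encode a different aspect of the image or text.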

Robustness · INTERACT · Controller · Robot · INFORMS

September 2, 2023

Proactive Human-Robot Co-Assembly: Leveraging Human Intention Prediction and Robust Safe Control

Ruixuan Liu, Rui Chen, Abulikemu Abuduweili, Changliu Liu

from arXiv, 7th IEEE Conference on Control Technology and Applications (CCTA 2023)

Human-robot collaboration (HRC) is one key component to achieving flexible manufacturing to meet the different needs of customers. However, it is difficult to build intelligent robots that can proactively assist humans in a safe and efficient way due to several challenges. First, it is challenging to achieve efficient collaboration due to diverse human behaviors and data scarcity. Second, it is difficult to ensure interactive safety due to uncertainty in human behaviors. This paper presents an integrated framework for proactive HRC. A robust intention prediction module, which leverages prior task information and human-in-the-loop training, is learned to guide the robot for efficient collaboration. The proposed framework also uses robust safe control to ensure interactive safety under uncertainty. The developed framework is applied to a co-assembly task using a Kinova Gen3 robot. The experiment demonstrates that our solution is robust to environmental changes as well as different human preferences and behaviors. In addition, it improves task efficiency by approximately 15-20%. Moreover, the experiment demonstrates that our solution can guarantee interactive safety during proactive collaboration.

Reducible · Question answering · Score · BBC News · INFORMS

September 2, 2023

LeanContext: Cost-Efficient Domain-Specific Question Answering Using LLMs

Md Adnan Arefeen, Biplob Debnath, Srimat Chakradhar

from arXiv, The paper is under review

Question-answering (QA) is a significant application of Large Language Models (LLMs), shaping chatbot capabilities across healthcare, education, and customer service. However, widespread LLM integration presents a challenge for small businesses due to the high expenses of LLM API usage. Costs rise rapidly when domain-specific data (context) is used alongside queries for accurate domain-specific LLM responses. One option is to summarize the context by using LLMs and reduce the context. However, this can also filter out useful information that is necessary to answer some domain-specific queries. In this paper, we shift from human-oriented summarizers to AI model-friendly summaries. Our approach, LeanContext, efficiently extracts k key sentences from the context that are closely aligned with the query. The choice of k is neither static nor random; we introduce a reinforcement learning technique that dynamically determines k based on the query and context. The rest of the less important sentences are reduced using a free open source text reduction method. We evaluate LeanContext against several recent query-aware and query-unaware context reduction approaches on prominent datasets (arxiv papers and BBC news articles). Despite cost reductions of 37.29% to 67.81%, LeanContext's ROUGE-1 score decreases only by 1.41% to 2.65% compared to a baseline that retains the entire context (no summarization). Additionally, if free pretrained LLM-based summarizers are used to reduce context (into human consumable summaries), LeanContext can further modify the reduced context to enhance the accuracy (ROUGE-1 score) by 13.22% to 24.61%.
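The sentence-selection step can be illustrated with a minimal sketch in plain Python. Two simplifications are assumptions of this example and not part of LeanContext: similarity is measured with a bag-of-words cosine instead of a learned embedding, and k is passed in by the caller instead of being chosen by reinforcement learning. The names `top_k_sentences`, `_bow` and `_cosine` are hypothetical.

```python
import math
import re
from collections import Counter


def _bow(text):
    """Lower-cased bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_sentences(context, query, k):
    """Keep the k sentences most similar to the query, in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    query_vec = _bow(query)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: _cosine(_bow(sentences[i]), query_vec),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]


context = (
    "The router supports WPA3 encryption. "
    "Shipping takes five days. "
    "Reset the router by holding the button for ten seconds."
)
reduced = top_k_sentences(context, "How do I reset the router?", k=1)
```

Only the selected sentences would then be sent to the LLM together with the query, which is where the cost saving comes from.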

Image denoising · Denoising · Noise · Estimation/Estimator · Networking

September 1, 2023

Variational Denoising Network: Toward Blind Noise Modeling and Removal

Zongsheng Yue, Hongwei Yong, Qian Zhao, Lei Zhang, Deyu Meng

from arXiv, Correct a minor typo

Blind image denoising is an important yet very challenging problem in computer vision due to the complicated acquisition process of real images. In this work we propose a new variational inference method, which integrates both noise estimation and image denoising into a unique Bayesian framework, for blind image denoising. Specifically, an approximate posterior, parameterized by deep neural networks, is presented by taking the intrinsic clean image and noise variances as latent variables conditioned on the input noisy image. This posterior provides explicit parametric forms for all its involved hyper-parameters, and thus can be easily implemented for blind image denoising with automatic noise estimation for the test noisy image. On one hand, as other data-driven deep learning methods, our method, namely variational denoising network (VDN), can perform denoising efficiently due to its explicit form of posterior expression. On the other hand, VDN inherits the advantages of traditional model-driven approaches, especially the good generalization capability of generative models. VDN has good interpretability and can be flexibly utilized to estimate and remove complicated non-i.i.d. noise collected in real scenarios. Comprehensive experiments are performed to substantiate the superiority of our method in blind image denoising.

Graph convolutional network · AdaBoost · Graph convolution · Graph · Networking

August 14, 2019

AdaGCN: Adaboosting Graph Convolutional Networks into Deep Models

Ke Sun, Zhouchen Lin, Zhanxing Zhu

The design of deep graph models still remains to be investigated, and the crucial part is how to explore and exploit the knowledge from different hops of neighbors in an efficient way. In this paper, we propose a novel RNN-like deep graph neural network architecture by incorporating AdaBoost into the computation of the network. The proposed graph convolutional network, called AdaGCN (AdaBoosting Graph Convolutional Network), has the ability to efficiently extract knowledge from high-order neighbors and integrate knowledge from different hops of neighbors into the network in an AdaBoost way. We also present the architectural difference between AdaGCN and existing graph convolutional methods to show the benefits of our proposal. Finally, extensive experiments demonstrate the state-of-the-art prediction performance and the computational advantage of our approach AdaGCN.

entity · Link prediction · Extensibility · Graph · Knowledge graph

March 13, 2019

MMKG: Multi-Modal Knowledge Graphs

Ye Liu, Hui Li, Alberto Garcia-Duran, Mathias Niepert, Daniel Onoro-Rubio, David S. Rosenblum

from arXiv, ESWC 2019

We present MMKG, a collection of three knowledge graphs that contain both numerical features and (links to) images for all entities as well as entity alignments between pairs of KGs. Therefore, multi-relational link prediction and entity matching communities can benefit from this resource. We believe this data set has the potential to facilitate the development of novel multi-modal learning approaches for knowledge graphs. We validate the utility of MMKG in the sameAs link prediction task with an extensive set of experiments. These experiments show that the task at hand benefits from learning of multiple feature types.