Following their success in visual recognition tasks, Vision Transformers (ViTs) are being increasingly employed for image restoration. As a few recent works claim that ViTs for image classification also have better robustness properties, we investigate whether the improved adversarial robustness of ViTs extends to image restoration. We consider the recently proposed Restormer model, as well as NAFNet and the "Baseline network", which are both simplified versions of a Restormer. For our robustness evaluation, we use Projected Gradient Descent (PGD) and CosPGD, a recently proposed adversarial attack tailored to pixel-wise prediction tasks. Our experiments are performed on real-world images from the GoPro dataset for image deblurring. Our analysis indicates that, contrary to what works on ViTs for image classification advocate, these models are highly susceptible to adversarial attacks. We attempt to improve their robustness through adversarial training. While this yields a significant increase in robustness for Restormer, results on the other networks are less promising. Interestingly, the design choices in NAFNet and the Baseline network, which were based on i.i.d. performance rather than on robust generalization, seem to be at odds with model robustness. Thus, we investigate this further and find a fix.
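To make the PGD attack mentioned in the abstract concrete, here is a minimal sketch of an L-infinity PGD loop. It is not the authors' evaluation code: the elementwise "model" `f(x)_i = weights[i] * x[i]`, the function names, and the values of `eps`, `alpha`, and `steps` are all assumptions chosen so the gradient can be written by hand. A real evaluation would use automatic differentiation on a restoration network, and CosPGD is not implemented here.

```python
def sign(v):
    """Return -1.0, 0.0, or 1.0 depending on the sign of v."""
    return (v > 0) - (v < 0)

def mse_loss(x, y, weights):
    """Mean squared error of the toy model output against the target y."""
    return sum((w * xi - t) ** 2 for w, xi, t in zip(weights, x, y)) / len(x)

def pgd_linf(x, y, weights, eps=0.03, alpha=0.01, steps=10):
    """L-infinity PGD against the hypothetical toy model f(x)_i = weights[i] * x[i]."""
    n = len(x)
    x_adv = list(x)
    for _ in range(steps):
        # Analytic gradient of the mean squared error with respect to the input.
        grad = [2.0 * w * (w * xa - t) / n for w, xa, t in zip(weights, x_adv, y)]
        # Signed gradient ascent step: the attack tries to increase the loss.
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Project back into the eps-ball around x and the valid range [0, 1].
        x_adv = [min(max(xa, xi - eps, 0.0), xi + eps, 1.0)
                 for xa, xi in zip(x_adv, x)]
    return x_adv

# Toy input, target, and model weights (all made up for illustration).
x = [0.2, 0.5, 0.8]
y = [0.25, 0.5, 0.8]
weights = [1.0, 0.9, 1.1]
x_adv = pgd_linf(x, y, weights)
print(mse_loss(x, y, weights), mse_loss(x_adv, y, weights))
```

The perturbation stays within `eps` of the clean input, while the loss on the perturbed input is larger than on the clean one.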

Related content

Robustness

MoDELS · Correlation coefficient · MCMC · Functional · Scaling

September 18, 2023

Scalable high-dimensional Bayesian varying coefficient models with unknown within-subject covariance

Ray Bai, Mary R. Boland, Yong Chen

from arxiv, 44 pages, 7 tables, 8 figures. This new version focuses on methodology and computation and includes new MCMC algorithms for uncertainty quantification

Nonparametric varying coefficient (NVC) models are useful for modeling time-varying effects on responses that are measured repeatedly for the same subjects. When the number of covariates is moderate or large, it is desirable to perform variable selection from the varying coefficient functions. However, existing methods for variable selection in NVC models either fail to account for within-subject correlations or require the practitioner to specify a parametric form for the correlation structure. In this paper, we introduce the nonparametric varying coefficient spike-and-slab lasso (NVC-SSL) for Bayesian high-dimensional NVC models. Through the introduction of functional random effects, our method allows for flexible modeling of within-subject correlations without needing to specify a parametric covariance function. We further propose several scalable optimization and Markov chain Monte Carlo (MCMC) algorithms. For variable selection, we propose an Expectation Conditional Maximization (ECM) algorithm to rapidly obtain maximum a posteriori (MAP) estimates. Our ECM algorithm scales linearly in the total number of observations $N$ and the number of covariates $p$. For uncertainty quantification, we introduce an approximate MCMC algorithm that also scales linearly in both $N$ and $p$. We demonstrate the scalability, variable selection performance, and inferential capabilities of our method through simulations and a real data application. These algorithms are implemented in the publicly available R package NVCSSL on the Comprehensive R Archive Network.

Color · Regularization term · Processing (programming language) · Analysis · Precision/accuracy

September 18, 2023

Digital analysis of early color photographs taken using regular color screen processes

Jan Hubička, Linda Kimrová, Kenzie Klaeser, Sara Manco, Doug Peterson

from arxiv, 8 pages, 4 figures, submitted to the proceedings of XVIII Color Conference

Some early color photographic processes based on special color screen filters pose specific challenges in their digitization and digital presentation. Those challenges include dynamic range, resolution, and the difficulty of stitching geometrically-repeating patterns. We describe a novel method used to digitize the collection of early color photographs at the National Geographic Society which makes use of a custom open-source software tool to analyze and precisely stitch regular color screen processes.

Functional · Analysis · Energy function · Global minimizer · binary

September 18, 2023

Lifting-based variational multiclass segmentation algorithm: design, convergence analysis, and implementation with applications in medical imaging

Nadja Gruber, Johannes Schwab, Sebastien Court, Elke Gizewski, Markus Haltmeier

We propose, analyze and realize a variational multiclass segmentation scheme that partitions a given image into multiple regions exhibiting specific properties. Our method determines multiple functions that encode the segmentation regions by minimizing an energy functional combining information from different channels. Multichannel image data can be obtained by lifting the image into a higher dimensional feature space using specific multichannel filtering or may already be provided by the imaging modality under consideration, such as an RGB image or multimodal medical data. Experimental results show that the proposed method performs well in various scenarios. In particular, promising results are presented for two medical applications involving classification of brain abscess and tumor growth, respectively. As main theoretical contributions, we prove the existence of global minimizers of the proposed energy functional and show its stability and convergence with respect to noisy inputs. In particular, these results also apply to the special case of binary segmentation, and these results are also novel in this particular situation.

Sphering · Contraction · Sample · Random walk · Performer

September 16, 2023

A dimension-independent bound on the Wasserstein contraction rate of geodesic slice sampling on the sphere for uniform target

Philip Schär, Thilo D. Stier

from arxiv, 11 pages, 2 figures

When faced with a constant target density, geodesic slice sampling on the sphere simplifies to a geodesic random walk. We prove that this random walk is Wasserstein contractive and that its contraction rate stabilizes with increasing dimension instead of deteriorating arbitrarily far. This demonstrates that the performance of geodesic slice sampling on the sphere can be entirely robust against dimension-increases, which had not been known before. Our result is also of interest due to its implications regarding the potential for dimension-independent performance by Gibbsian polar slice sampling, which is an MCMC method on $\mathbb{R}^d$ that implicitly uses geodesic slice sampling on the sphere within its transition mechanism.
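The geodesic random walk described above can be sketched in a few lines. This is an illustration, not the authors' code, and it rests on an assumption: that for a constant target density each step draws a uniformly random great circle through the current point and then a uniformly distributed point on that circle. The function name and the `rng` parameter are hypothetical.

```python
import math
import random

def geodesic_random_walk_step(x, rng=random):
    """One step on the unit sphere: random great circle through x, then a uniform point on it."""
    # Draw a Gaussian vector and project it onto the tangent space at x.
    g = [rng.gauss(0.0, 1.0) for _ in x]
    dot = sum(gi * xi for gi, xi in zip(g, x))
    v = [gi - dot * xi for gi, xi in zip(g, x)]
    # Normalize to get a uniformly distributed unit tangent direction.
    norm = math.sqrt(sum(vi * vi for vi in v))
    v = [vi / norm for vi in v]
    # Move along the great circle spanned by x and v by a uniform angle.
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return [math.cos(theta) * xi + math.sin(theta) * vi for xi, vi in zip(x, v)]

random.seed(0)
x = [1.0, 0.0, 0.0]  # starting point; must have unit norm
for _ in range(5):
    x = geodesic_random_walk_step(x)
print(sum(xi * xi for xi in x))  # squared norm stays at 1 up to rounding
```

Because `x` and `v` are orthonormal, every iterate remains on the unit sphere regardless of the dimension.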

Controller · MoDELS · Robot · Extensibility · Performer

September 14, 2023

Shared Telemanipulation with VR controllers in an anti slosh scenario

Max Grobbel, Balint Varga, Sören Hohmann

Telemanipulation has become a promising technology that combines human intelligence with robotic capabilities to perform tasks remotely. However, it faces several challenges such as insufficient transparency, low immersion, and limited feedback to the human operator. Moreover, the high cost of haptic interfaces is a major limitation for the application of telemanipulation in various fields, including elder care, where our research is focused. To address these challenges, this paper proposes the use of nonlinear model predictive control for telemanipulation using low-cost virtual reality controllers, including multiple control goals in the objective function. The framework utilizes models for human input prediction and task-related models of the robot and the environment. The proposed framework is validated on a UR5e robot arm in the scenario of handling liquid without spilling. Further extensions of the framework, such as pouring assistance and collision avoidance, can easily be included.

Speech recognition · Integration · Processing (programming language) · MoDELS · Automatic speech recognition

September 14, 2023

CPPF: A contextual and post-processing-free model for automatic speech recognition

Lei Zhang, Zhengkun Tian, Xiang Chen, Jiaming Sun, Hongyu Xiang, Ke Ding, Guanglu Wan

from arxiv, Submitted to ICASSP2024

ASR systems have become increasingly widespread in recent years. However, their textual outputs often require post-processing tasks before they can be practically utilized. To address this issue, we draw inspiration from the multifaceted capabilities of LLMs and Whisper, and focus on integrating multiple ASR text processing tasks related to speech recognition into the ASR model. This integration not only shortens the multi-stage pipeline, but also prevents the propagation of cascading errors, resulting in direct generation of post-processed text. In this study, we focus on ASR-related processing tasks, including Contextual ASR and multiple ASR post processing tasks. To achieve this objective, we introduce the CPPF model, which offers a versatile and highly effective alternative to ASR processing. CPPF seamlessly integrates these tasks without any significant loss in recognition performance.

Continuity · MoDELS · Performer · Speech synthesis · Speech recognition

September 14, 2023

Voxtlm: unified decoder-only models for consolidating speech recognition/synthesis and speech/text continuation tasks

Soumi Maiti, Yifan Peng, Shukjae Choi, Jee-weon Jung, Xuankai Chang, Shinji Watanabe

We propose a decoder-only language model, VoxtLM, that can perform four tasks: speech recognition, speech synthesis, text generation, and speech continuation. VoxtLM integrates text vocabulary with discrete speech tokens from self-supervised speech features and uses special tokens to enable multitask learning. Compared to a single-task model, VoxtLM exhibits a significant improvement in speech synthesis, with speech intelligibility improving from 28.9 to 5.6 and objective quality from 2.68 to 3.90. VoxtLM also improves speech generation and speech recognition performance over the single-task counterpart. VoxtLM is trained with publicly available data, and training recipes and model checkpoints will be open-sourced to make the work fully reproducible.

MoDELS · Sample · INFORMS · Next · Functional

September 14, 2023

A Diffusion model for POI recommendation

Yifang Qin, Hongjun Wu, Wei Ju, Xiao Luo, Ming Zhang

from arxiv, Accepted by ACM Transactions on Information Systems (TOIS 2023)

Next Point-of-Interest (POI) recommendation is a critical task in location-based services that aim to provide personalized suggestions for the user's next destination. Previous works on POI recommendation have focused on modeling the user's spatial preference. However, existing works that leverage spatial information are only based on the aggregation of users' previously visited positions, which discourages the model from recommending POIs in novel areas. This trait of position-based methods will harm the model's performance in many situations. Additionally, incorporating sequential information into the user's spatial preference remains a challenge. In this paper, we propose Diff-POI: a Diffusion-based model that samples the user's spatial preference for the next POI recommendation. Inspired by the wide application of diffusion algorithms in sampling from distributions, Diff-POI encodes the user's visiting sequence and spatial character with two tailor-designed graph encoding modules, followed by a diffusion-based sampling strategy to explore the user's spatial visiting trends. We leverage the diffusion process and its reversed form to sample from the posterior distribution and optimize the corresponding score function. We design a joint training and inference framework to optimize and evaluate the proposed Diff-POI. Extensive experiments on four real-world POI recommendation datasets demonstrate the superiority of our Diff-POI over state-of-the-art baseline methods. Further ablation and parameter studies on Diff-POI reveal the functionality and effectiveness of the proposed diffusion-based sampling strategy for addressing the limitations of existing methods.

Image classification · Feedforward network · INTERACT · Networking · Feedforward

May 7, 2021

ResMLP: Feedforward networks for image classification with data-efficient training

Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou

We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.
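The two alternating sublayers described in this abstract can be sketched as one residual block. This is a simplified illustration in plain Python, not the released implementation: the helper names and toy weights are hypothetical, GELU is assumed as the activation, and any normalization or scaling layers the actual model may use are omitted.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    """Elementwise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def gelu(v):
    """Exact GELU using the Gaussian error function."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))

def resmlp_block(x, patch_mix, w1, w2):
    """x: patches x channels; patch_mix: patches x patches;
    w1: channels x hidden; w2: hidden x channels."""
    # (i) Cross-patch linear layer: the same mixing matrix acts on every channel.
    x = add(x, matmul(patch_mix, x))
    # (ii) Two-layer feed-forward network applied to each patch independently.
    hidden = [[gelu(v) for v in row] for row in matmul(x, w1)]
    return add(x, matmul(hidden, w2))

# Toy example: 2 patches, 2 channels, hidden width 3 (all values made up).
x = [[1.0, 2.0], [3.0, 4.0]]
patch_mix = [[0.0, 0.1], [0.1, 0.0]]
w1 = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
w2 = [[0.1, 0.0], [0.0, 0.1], [0.05, 0.05]]
out = resmlp_block(x, patch_mix, w1, w2)
print(len(out), len(out[0]))  # output keeps the patches x channels shape
```

Both sublayers are wrapped in residual connections, so setting `patch_mix` and `w2` to zero turns the block into the identity.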

Performer · Color · Networking · CRAFT · Mean squared error

January 25, 2018

C2MSNet: A Novel approach for single image haze removal

Akshay Dudhane, Subrahmanyam Murala

from arxiv, Accepted in Winter Conference on Applications of Computer Vision (WACV-2018)

Degradation of image quality due to the presence of haze is a very common phenomenon. Existing methods such as DehazeNet [3] and MSCNN [11] tackled the drawbacks of hand-crafted haze-relevant features. However, these methods suffer from color distortion in gloomy (poorly illuminated) environments. In this paper, a cardinal (red, green and blue) color fusion network for single image haze removal is proposed. In the first stage, the network fuses the color information present in hazy images and generates multi-channel depth maps. The second stage estimates the scene transmission map from the generated dark channels using a multi-channel multi-scale convolutional neural network (McMs-CNN) to recover the original scene. To train the proposed network, we have used two standard datasets, namely ImageNet [5] and D-HAZY [1]. Performance evaluation of the proposed approach has been carried out using the structural similarity index (SSIM), mean square error (MSE) and peak signal-to-noise ratio (PSNR). Performance analysis shows that the proposed approach outperforms the existing state-of-the-art methods for single image dehazing.
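Two of the evaluation metrics named above, MSE and PSNR, follow directly from their standard definitions. The sketch below assumes 8-bit images flattened into lists of pixel values and a peak value of 255. The function names and sample pixels are illustrative, not taken from the paper, and SSIM is left out because it needs local window statistics.

```python
import math

def mse(reference, estimate):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

# Made-up pixel values standing in for a clean image and a dehazed estimate.
clean = [52, 55, 61, 59]
dehazed = [54, 55, 60, 57]
print(mse(clean, dehazed))             # 2.25
print(round(psnr(clean, dehazed), 2))  # 44.61
```

A lower MSE and a higher PSNR both indicate that the restored image is closer to the reference.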