
Recent advancements in diffusion models have made them effective at learning data priors for solving inverse problems. These methods use diffusion sampling steps to induce a data prior while applying a measurement-guidance gradient at each step to impose data consistency. For general inverse problems, approximations are needed when an unconditionally trained diffusion model is used, since the measurement likelihood is intractable; this leads to inaccurate posterior sampling. In other words, because of these approximations, such methods fail to preserve the generation process on the data manifold defined by the diffusion prior, producing artifacts in applications such as image restoration. To enhance the performance and robustness of diffusion models in solving inverse problems, we propose Diffusion State-Guided Projected Gradient (DiffStateGrad), which projects the measurement gradient onto a subspace that is a low-rank approximation of an intermediate state of the diffusion process. DiffStateGrad, as a module, can be added to a wide range of diffusion-based inverse solvers to improve the preservation of the diffusion process on the prior manifold and filter out artifact-inducing components. We highlight that DiffStateGrad improves the robustness of diffusion models with respect to the choice of measurement-guidance step size and noise, while also improving worst-case performance. Finally, we demonstrate that DiffStateGrad improves upon the state of the art on linear and nonlinear image restoration inverse problems.
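A minimal sketch of the kind of projection described above, assuming the intermediate diffusion state and the measurement gradient are given as 2-D arrays. The function name, the use of a plain truncated SVD, and the choice of projecting onto both the left and right singular subspaces of the state are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def project_measurement_gradient(x_t, grad, rank):
    """Project a measurement-guidance gradient onto a low-rank subspace
    derived from the intermediate diffusion state x_t.

    x_t, grad: 2-D arrays of the same shape (e.g., one image channel).
    rank:      number of singular vectors kept for the projection.
    """
    # Low-rank approximation of the current diffusion state.
    U, S, Vt = np.linalg.svd(x_t, full_matrices=False)
    U_r = U[:, :rank]      # dominant left singular basis
    V_r = Vt[:rank, :].T   # dominant right singular basis

    # Keep only the gradient components that live in the subspace
    # spanned by the state's dominant singular vectors; the rest is
    # treated as a potentially artifact-inducing direction.
    return U_r @ (U_r.T @ grad @ V_r) @ V_r.T
```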

Related content

Rate splitting multiple access (RSMA) is regarded as an essential and powerful physical-layer (PHY) paradigm for next-generation communication systems. In such a system, users employ successive interference cancellation (SIC), allowing them to decode a portion of the interference and treat the remainder as noise. However, current RSMA systems rely on fixed-position antenna arrays, limiting their capacity to fully exploit spatial freedom. This constraint restricts beamforming gain, which substantially degrades RSMA performance. To address this problem, we propose a movable antenna (MA)-aided RSMA scheme that allows the antennas at the base station (BS) to adjust their positions dynamically. Our goal is to maximize the system's sum rate of both common and private messages by jointly optimizing the MA positions, the beamforming matrix, and the common rate allocation. To tackle the formulated non-convex problem, we employ fractional programming (FP) and develop a two-stage, coarse-to-fine-grained search algorithm to obtain suboptimal solutions. Numerical results demonstrate that, with appropriate antenna adjustments, the MA-enabled system running the proposed algorithm significantly enhances the overall performance and reliability of RSMA compared to fixed-position antenna configurations.
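As a rough illustration of the two-stage, coarse-to-fine-grained search idea, the sketch below optimizes a single antenna position over a feasible interval. The sum_rate objective is assumed to be a black box (in the paper it would come from the FP-based beamforming and rate-allocation step), and all names and step counts are hypothetical.

```python
import numpy as np

def coarse_to_fine_search(sum_rate, lo, hi, coarse_steps=10, fine_steps=10):
    """Two-stage 1-D search for one movable antenna position.

    sum_rate: black-box objective mapping a position to achievable sum rate.
    lo, hi:   feasible interval for the antenna position.
    """
    # Stage 1: coarse grid over the whole feasible region.
    coarse = np.linspace(lo, hi, coarse_steps)
    best = max(coarse, key=sum_rate)

    # Stage 2: fine grid restricted to a neighborhood of the coarse optimum.
    step = (hi - lo) / (coarse_steps - 1)
    fine = np.linspace(max(lo, best - step), min(hi, best + step), fine_steps)
    return max(fine, key=sum_rate)
```

In practice one such 1-D search would be repeated per antenna and alternated with the beamforming and rate-allocation updates.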

The increasing popularity of deep learning models has created new opportunities for developing AI-based recommender systems. Designing recommender systems with deep neural networks requires careful architecture design, and further optimization demands extensive co-design efforts to jointly optimize model architecture and hardware. Design automation, such as Automated Machine Learning (AutoML), is necessary to fully exploit the potential of recommender model design, including model choices and model-hardware co-design strategies. We introduce a novel paradigm that utilizes weight sharing to explore abundant solution spaces. Our paradigm creates a large supernet to search for optimal architectures and co-design strategies, addressing the challenges of data multi-modality and heterogeneity in the recommendation domain. From a model perspective, the supernet includes a variety of operators, dense connectivity, and dimension search options. From a co-design perspective, it encompasses versatile Processing-In-Memory (PIM) configurations to produce hardware-efficient models. The scale, heterogeneity, and complexity of our solution space pose several challenges, which we address by proposing various techniques for training and evaluating the supernet. Our crafted models achieve state-of-the-art results on three Click-Through Rate (CTR) prediction benchmarks, outperforming both manually designed and AutoML-crafted models when focusing solely on architecture search. From a co-design perspective, we achieve 2x FLOPs efficiency, 1.8x energy efficiency, and 1.5x performance improvements in recommender models.
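As a toy illustration of the weight-sharing idea behind such a supernet, the sketch below shows a linear layer whose sub-networks of different output dimensions share one underlying weight matrix. The class name and dimension choices are hypothetical, and the actual search space in the paper also covers operators, connectivity, and PIM configurations.

```python
import torch
import torch.nn as nn

class SlimmableLinear(nn.Module):
    """Weight-sharing linear layer: one supernet weight matrix from which
    sub-networks with smaller output dimensions are sliced."""

    def __init__(self, in_dim, max_out_dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out_dim, in_dim) * 0.01)
        self.bias = nn.Parameter(torch.zeros(max_out_dim))

    def forward(self, x, out_dim):
        # All candidate dimensions share the same underlying weights,
        # so training the supernet trains every sub-network at once.
        return x @ self.weight[:out_dim].T + self.bias[:out_dim]

# Sampling a dimension per training step explores the search space.
layer = SlimmableLinear(in_dim=16, max_out_dim=64)
x = torch.randn(8, 16)
for out_dim in (16, 32, 64):
    y = layer(x, out_dim)   # output shapes: (8, 16), (8, 32), (8, 64)
```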

The goal of multi-objective optimization (MOO) is to learn under multiple, potentially conflicting, objectives. One widely used technique for tackling MOO is linear scalarization, where a fixed preference vector combines the objectives into a single scalar value for optimization. However, recent work (Hu et al., 2024) has shown that linear scalarization often fails to capture the non-convex regions of the Pareto front and thus cannot recover the complete set of Pareto-optimal solutions. In light of these limitations, this paper focuses on Tchebycheff scalarization, which optimizes for the worst-case objective. In particular, we propose an online mirror descent algorithm for Tchebycheff scalarization, which we call OMD-TCH. We show that OMD-TCH enjoys a convergence rate of $O(\sqrt{\log m/T})$, where $m$ is the number of objectives and $T$ is the number of iteration rounds. We also propose a novel adaptive online-to-batch conversion scheme that significantly improves the practical performance of OMD-TCH while maintaining the same convergence guarantees. We demonstrate the effectiveness of OMD-TCH and the adaptive conversion scheme on both synthetic problems and federated learning tasks under fairness constraints, showing state-of-the-art performance.
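The following is a minimal sketch of what an online-mirror-descent loop for Tchebycheff scalarization could look like, using an entropic mirror map on the weight simplex (exponentiated-gradient updates) and plain online-to-batch averaging in place of the paper's adaptive conversion scheme. The function signature and step sizes are illustrative assumptions.

```python
import numpy as np

def omd_tch(losses_and_grads, x0, m, T, eta_x=0.1, eta_w=0.1):
    """Sketch of online mirror descent for Tchebycheff scalarization,
    viewed as the saddle point min_x max_{w in simplex} w . f(x).

    losses_and_grads(x) -> (f, G): vector of m objective values f and
    an (m, d) matrix G of per-objective gradients at x.
    """
    x = x0.copy()
    w = np.full(m, 1.0 / m)       # simplex weights (entropic mirror map)
    avg_x = np.zeros_like(x0)
    for _ in range(T):
        f, G = losses_and_grads(x)
        # Descent step on the currently weighted objective w . f(x).
        x -= eta_x * (w @ G)
        # Exponentiated-gradient ascent on w: up-weight the worst objectives.
        w *= np.exp(eta_w * f)
        w /= w.sum()
        avg_x += x / T            # plain online-to-batch averaging
    return avg_x
```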

Recent advancements in solving Bayesian inverse problems have spotlighted denoising diffusion models (DDMs) as effective priors. Despite their great potential, DDM priors yield complex posterior distributions that are challenging to sample. Existing approaches to posterior sampling in this context address the problem either by retraining model-specific components, leading to stiff and cumbersome methods, or by introducing approximations with uncontrolled errors that affect the accuracy of the produced samples. We present an innovative framework, divide-and-conquer posterior sampling, which leverages the inherent structure of DDMs to construct a sequence of intermediate posteriors that guide the produced samples toward the target posterior. Our method significantly reduces the approximation error associated with current techniques without the need for retraining. We demonstrate the versatility and effectiveness of our approach for a wide range of Bayesian inverse problems. The code is available at https://github.com/Badr-MOUFAD/dcps.

By leveraging the principles of quantum mechanics, quantum machine learning (QML) opens doors to novel approaches in machine learning and offers potential speedups. However, machine learning models are well documented to be vulnerable to malicious manipulations, and this susceptibility extends to QML models. This situation necessitates a thorough understanding of QML's resilience against adversarial attacks, particularly in an era where quantum computing capabilities are expanding. In this regard, this paper examines model-independent bounds on adversarial performance for QML. To the best of our knowledge, we introduce the first computation of an approximate lower bound on the adversarial error when evaluating model resilience against sophisticated quantum-based adversarial attacks. Experimental results are compared to the computed bound, demonstrating the potential of QML models to achieve high robustness. In the best case, the experimental error is only 10% above the estimated bound, offering evidence of the inherent robustness of quantum models. This work not only advances our theoretical understanding of quantum model resilience but also provides a precise reference bound for the future development of robust QML algorithms.

Machine learning models have demonstrated substantial performance enhancements over non-learned alternatives in various fundamental data management operations, including indexing (locating items in an array), cardinality estimation (estimating the number of matching records in a database), and range-sum estimation (estimating aggregate attribute values for query-matched records). However, real-world systems frequently favor less efficient non-learned methods because they offer (worst-case) error guarantees, an aspect where learned approaches often fall short. The primary objective of these guarantees is system reliability: the chosen approach consistently delivers the desired level of accuracy across all databases. In this paper, we embark on the first theoretical study of such guarantees for learned methods, presenting the necessary conditions for the guarantees to hold when using machine learning to perform indexing, cardinality estimation, and range-sum estimation. Specifically, we present the first known lower bounds on the model size required to achieve the desired accuracy for these three key database operations. Our results bound the required model size for given average and worst-case errors, serving as the first theoretical guidelines on how model size must scale with data size to guarantee a given accuracy level. More broadly, our established guarantees pave the way for the broader adoption and integration of learned models into real-world systems.

Hyperparameter optimization is crucial for obtaining peak performance from machine learning models. The standard protocol evaluates various hyperparameter configurations using a resampling estimate of the generalization error to guide optimization and select a final hyperparameter configuration. Despite little supporting evidence, paired resampling splits, i.e., either a fixed train-validation split or a fixed cross-validation scheme, are often recommended. We show that, surprisingly, reshuffling the splits for every configuration often improves the final model's generalization performance on unseen data. Our theoretical analysis explains how reshuffling affects the asymptotic behavior of the validation loss surface and provides a bound on the expected regret in the limiting regime. This bound connects the potential benefits of reshuffling to the signal and noise characteristics of the underlying optimization problem. We confirm our theoretical results in a controlled simulation study and demonstrate the practical usefulness of reshuffling in a large-scale, realistic hyperparameter optimization experiment. While reshuffling leads to test performances competitive with those of fixed splits, it drastically improves results for a single train-validation holdout protocol, often making holdout competitive with standard CV at a lower computational cost.
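A small sketch of the reshuffled holdout protocol described above, using scikit-learn; make_model and configs are hypothetical placeholders, and the point is only that each configuration receives a fresh train-validation split instead of a fixed one.

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def evaluate_config(make_model, X, y, config, seed):
    """Holdout estimate of a configuration's generalization performance,
    reshuffling the train-validation split per configuration."""
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.2, random_state=seed)  # fresh split each call
    model = make_model(**config).fit(X_tr, y_tr)
    return accuracy_score(y_val, model.predict(X_val))

# Reshuffled protocol: a different seed (hence split) per configuration,
# in contrast to the commonly recommended fixed split.
# scores = [evaluate_config(make_model, X, y, c, seed=i)
#           for i, c in enumerate(configs)]
```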

Contrastive learning models have achieved great success in unsupervised visual representation learning; they maximize the similarities between feature representations of different views of the same image while minimizing the similarities between feature representations of views of different images. In text summarization, the output summary is a shorter form of the input document, and the two have similar meanings. In this paper, we propose a contrastive learning model for supervised abstractive text summarization, where we view a document, its gold summary, and its model-generated summaries as different views of the same meaning representation and maximize the similarities between them during training. We improve over a strong sequence-to-sequence text generation model (i.e., BART) on three different summarization datasets. Human evaluation also shows that our model achieves better faithfulness ratings compared to its counterpart without contrastive objectives.
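A minimal sketch of the kind of contrastive objective described above: embeddings of the document, its gold summary, and its generated summaries are pulled together by maximizing their cosine similarities. The function name and the exact loss form are illustrative assumptions rather than the paper's objective.

```python
import torch
import torch.nn.functional as F

def contrastive_summary_loss(doc_emb, view_embs):
    """Pull together representations of a document and its summary views.

    doc_emb:   (d,) embedding of the input document.
    view_embs: (k, d) embeddings of the gold summary and the
               model-generated summaries, treated as views of it.
    """
    doc = F.normalize(doc_emb, dim=-1)
    views = F.normalize(view_embs, dim=-1)
    # Maximize the mean cosine similarity between the document and
    # each summary view by minimizing its negative.
    return -(views @ doc).mean()
```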

Traffic forecasting is an important factor in the success of intelligent transportation systems. Deep learning models, including convolutional neural networks and recurrent neural networks, have been applied to traffic forecasting problems to model spatial and temporal dependencies. In recent years, to model the graph structures in transportation systems as well as contextual information, graph neural networks (GNNs) have been introduced as new tools and have achieved state-of-the-art performance in a series of traffic forecasting problems. In this survey, we review the rapidly growing body of recent research using different GNNs, e.g., graph convolutional and graph attention networks, in various traffic forecasting problems, e.g., road traffic flow and speed forecasting, passenger flow forecasting in urban rail transit systems, and demand forecasting in ride-hailing platforms. We also present a collection of open data and source resources for each problem, as well as future research directions. To the best of our knowledge, this paper is the first comprehensive survey exploring the application of graph neural networks to traffic forecasting problems. We have also created a public GitHub repository to keep the latest papers, open data, and source resources up to date.

We advocate the use of implicit fields for learning generative models of shapes and introduce an implicit field decoder for shape generation, aimed at improving the visual quality of the generated shapes. An implicit field assigns a value to each point in 3D space, so that a shape can be extracted as an iso-surface. Our implicit field decoder is trained to perform this assignment by means of a binary classifier. Specifically, it takes a point coordinate, along with a feature vector encoding a shape, and outputs a value indicating whether the point lies inside or outside the shape. By replacing conventional decoders with our decoder for representation learning and generative modeling of shapes, we demonstrate superior results for tasks such as shape autoencoding, generation, interpolation, and single-view 3D reconstruction, particularly in terms of visual quality.
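A compact sketch of an implicit field decoder of this kind: a small MLP that takes a 3D point coordinate concatenated with a shape feature vector and outputs an inside/outside logit. Layer sizes and activation choices are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ImplicitDecoder(nn.Module):
    """Binary classifier over 3D points: given a point coordinate and a
    shape feature vector, predict whether the point lies inside the shape."""

    def __init__(self, feat_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden), nn.LeakyReLU(),
            nn.Linear(hidden, hidden), nn.LeakyReLU(),
            nn.Linear(hidden, 1),  # logit: inside (1) vs. outside (0)
        )

    def forward(self, points, shape_feat):
        # points: (n, 3); shape_feat: (feat_dim,) broadcast to each point.
        feat = shape_feat.expand(points.shape[0], -1)
        return self.net(torch.cat([points, feat], dim=-1)).squeeze(-1)

# The shape is then extracted as the iso-surface where the predicted
# inside probability crosses 0.5 (e.g., via marching cubes).
```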
