Amortized variational inference (A-VI) is a method for approximating the intractable posterior distributions that arise in probabilistic models. The defining feature of A-VI is that it learns a global inference function that maps each observation to its local latent variable's approximate posterior. This stands in contrast to the more classical factorized (or mean-field) variational inference (F-VI), which directly learns the parameters of the approximating distribution for each latent variable. In deep generative models, A-VI is used as a computational trick to speed up inference for local latent variables. In this paper, we study A-VI as a general alternative to F-VI for approximate posterior inference. A-VI cannot produce an approximation with a lower Kullback-Leibler divergence than F-VI's optimal solution, because the amortized family is a subset of the factorized family. Thus a central theoretical problem is to characterize when A-VI still attains F-VI's optimal solution. We derive conditions on both the model and the inference function under which A-VI can theoretically achieve F-VI's optimum. We show that for a broad class of hierarchical models, including deep generative models, it is possible to close the gap between A-VI and F-VI. Further, for an even broader class of models, we establish when and how to expand the domain of the inference function to make amortization a feasible strategy. Finally, we prove that for certain models -- including hidden Markov models and Gaussian processes -- A-VI cannot match F-VI's solution, no matter how expressive the inference function is. We also study A-VI empirically [...]

Related content

Inference

Maximum mean discrepancy · Mean · Neural Networks · Functional · Networks

September 13, 2023

Maximum Mean Discrepancy Meets Neural Networks: The Radon-Kolmogorov-Smirnov Test

Seunghoon Paik,Michael Celentano,Alden Green,Ryan J. Tibshirani

Maximum mean discrepancy (MMD) refers to a general class of nonparametric two-sample tests that are based on maximizing the mean difference over samples from one distribution $P$ versus another $Q$, over all choices of data transformations $f$ living in some function space $\mathcal{F}$. Inspired by recent work that connects what are known as functions of $\textit{Radon bounded variation}$ (RBV) and neural networks (Parhi and Nowak, 2021, 2023), we study the MMD defined by taking $\mathcal{F}$ to be the unit ball in the RBV space of a given smoothness order $k \geq 0$. This test, which we refer to as the $\textit{Radon-Kolmogorov-Smirnov}$ (RKS) test, can be viewed as a generalization of the well-known and classical Kolmogorov-Smirnov (KS) test to multiple dimensions and higher orders of smoothness. It is also intimately connected to neural networks: we prove that the witness in the RKS test -- the function $f$ achieving the maximum mean difference -- is always a ridge spline of degree $k$, i.e., a single neuron in a neural network. This allows us to leverage the power of modern deep learning toolkits to (approximately) optimize the criterion that underlies the RKS test. We prove that the RKS test has asymptotically full power at distinguishing any distinct pair $P \not= Q$ of distributions, derive its asymptotic null distribution, and carry out extensive experiments to elucidate the strengths and weaknesses of the RKS test versus the more traditional kernel MMD test.
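
As a point of reference for the abstract above, the classical one-dimensional KS statistic can itself be read as a maximum mean difference in which $f$ ranges over threshold indicators $x \mapsto 1\{x \leq t\}$. The sketch below computes only that classical statistic. It is not the RKS test, and the function name `ks_statistic` is a hypothetical helper chosen for illustration.

```python
def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the two empirical CDFs,
    i.e. the maximum mean difference over indicators f(x) = 1{x <= t}.
    """
    # The empirical CDFs only change at observed values, so it is
    # enough to check thresholds at the pooled sample points.
    thresholds = sorted(set(xs) | set(ys))

    def ecdf(sample, t):
        # Fraction of the sample at or below threshold t.
        return sum(1 for v in sample if v <= t) / len(sample)

    return max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in thresholds)


if __name__ == "__main__":
    print(ks_statistic([1, 2, 3], [4, 5, 6]))  # fully separated samples
    print(ks_statistic([1, 3], [2, 4]))        # interleaved samples
```

This brute-force version costs quadratic time in the pooled sample size, which is acceptable for illustration. A sorted merge of the two samples would compute the same value in near-linear time.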

Serialization · Maximum · Reducible · Mutually independent · Stationary point

September 13, 2023

TTD Configurations for Near-Field Beamforming: Parallel, Serial, or Hybrid?

Zhaolin Wang,Xidong Mu,Yuanwei Liu,Robert Schober

from arxiv, 13 pages, 8 figures

True-time delayers (TTDs) are popular components for hybrid beamforming architectures to combat the spatial-wideband effect in wideband near-field communications. A serial and a hybrid serial-parallel TTD configuration are investigated for hybrid beamforming architectures. Compared to the conventional parallel configuration, the serial configuration exhibits a cumulative time delay through multiple TTDs, which potentially alleviates the maximum delay requirements on the TTDs. However, independent control of individual TTDs becomes impossible in the serial configuration. In this context, a hybrid TTD configuration is proposed as a compromise solution. Furthermore, a power equalization approach is proposed to address the cumulative insertion loss of the serial and hybrid TTD configurations. Moreover, the wideband near-field beamforming design for different configurations is studied for maximizing the spectral efficiency in both single-user and multiple-user systems. 1) For single-user systems, a closed-form solution for the beamforming design is derived. The preferred user locations and the required maximum time delay of each TTD configuration are characterized. 2) For multi-user systems, a penalty-based iterative algorithm is developed to obtain a stationary point of the spectral efficiency maximization problem for each TTD configuration. In addition, a mixed-forward-and-backward (MFB) implementation is proposed to enhance the performance of the serial configuration. Our numerical results confirm the effectiveness of the proposed designs and unveil that i) compared to the conventional parallel configuration, both the serial and hybrid configurations can significantly reduce the maximum time delays required for the TTDs and ii) the hybrid configuration excels in single-user systems, while the serial configuration is preferred in multi-user systems.

MIMO · Channel · Estimation/Estimator · massive MIMO · Statistic

September 13, 2023

Is Channel Estimation Necessary to Select Phase-Shifts for RIS-Assisted Massive MIMO?

Özlem Tuğfe Demir,Emil Björnson

from arxiv, Published in IEEE Transactions on Wireless Communications, vol. 21, no. 11, November 2022

Reconfigurable intelligent surfaces (RISs) consist of many passive elements of metamaterials whose impedance can be controlled to change the characteristics of wireless signals impinging on them. Channel estimation is a critical task when it comes to the control of a large RIS when having a channel with a large number of multipath components. In this paper, we derive Bayesian channel estimators for two RIS-assisted massive multiple-input multiple-output (MIMO) configurations: i) the short-term RIS configuration based on the instantaneous channel estimates; ii) the long-term RIS configuration based on the channel statistics. The proposed methods exploit spatial correlation characteristics at both the base station and the planar RISs, and other statistical characteristics of multi-specular fading in a mobile environment. Moreover, a novel heuristic for phase-shift selection at the RISs is developed. A computationally efficient fixed-point algorithm, which solves the max-min fairness power control optimally, is proposed. Simulation results demonstrate that the proposed uplink RIS-aided framework improves the spectral efficiency of the cell-edge mobile user equipments substantially in comparison to a conventional single-cell massive MIMO system. The impact of several channel effects is studied to gain insight about when the channel estimation, i.e., the short-term configuration, is preferable in comparison to the long-term RIS configuration to boost the spectral efficiency.

MINE · Dataset · Statistic · Identical · Algorithms and data structures

September 12, 2023

Private Distribution Testing with Heterogeneous Constraints: Your Epsilon Might Not Be Mine

Clément L. Canonne,Yucheng Sun

Private closeness testing asks to decide whether the underlying probability distributions of two sensitive datasets are identical or differ significantly in statistical distance, while guaranteeing (differential) privacy of the data. As in most (if not all) distribution testing questions studied under privacy constraints, however, previous work assumes that the two datasets are equally sensitive, i.e., must be provided the same privacy guarantees. This is often an unrealistic assumption, as different sources of data come with different privacy requirements; as a result, known closeness testing algorithms might be unnecessarily conservative, ``paying'' too high a privacy budget for half of the data. In this work, we initiate the study of the closeness testing problem under heterogeneous privacy constraints, where the two datasets come with distinct privacy requirements.

SC · Similarity · Semantic similarity · Performer · Exchangeable

September 9, 2023

How to Evaluate Semantic Communications for Images with ViTScore Metric?

Tingting Zhu,Bo Peng,Jifan Liang,Tingchen Han,Hai Wan,Jingqiao Fu,Junjie Chen

Semantic communications (SC) are expected to be a new paradigm that catalyzes next-generation communication, shifting the main concern from accurate bit transmission to effective semantic information exchange. However, the previous and widely used image metrics are not applicable to evaluating image semantic similarity in SC. Classical metrics for measuring the similarity between two images, such as PSNR and MS-SSIM, usually rely on the pixel level or the structural level. Straightforwardly using tailored deep-learning-based metrics from the CV community, such as LPIPS, is infeasible for SC. To tackle this, inspired by BERTScore in the NLP community, we propose a novel metric for evaluating image semantic similarity, named Vision Transformer Score (ViTScore). We prove theoretically that ViTScore has 3 important properties, namely symmetry, boundedness, and normalization, which make ViTScore convenient and intuitive for image measurement. To evaluate the performance of ViTScore, we compare ViTScore with 3 typical metrics (PSNR, MS-SSIM, and LPIPS) through 5 classes of experiments. Experimental results demonstrate that ViTScore can better evaluate the image semantic similarity than the other 3 typical metrics, which indicates that ViTScore is an effective performance metric when deployed in SC scenarios.

Performer · Decoding · Model evaluation · System design · Microsoft Surface

September 9, 2023

IRS-Enabled Covert and Reliable Communications: How Many Reflection Elements are Required?

Manlin Wang,Bin Xia,Yao Yao,Zhiyong Chen,Jiangzhou Wang

from arxiv, The paper has some shortcomings in the theoretical analysis. And it will not be published at the conference, as claimed in last comments

Short-packet communications are applied to various scenarios where transmission covertness and reliability are crucial due to the open wireless medium and finite blocklength. Although intelligent reflection surface (IRS) has been widely utilized to enhance transmission covertness and reliability, the question of how many reflection elements at IRS are required remains unanswered, which is vital to system design and practical deployment. The inherent strong coupling exists between the transmission covertness and reliability by IRS, leading to the question of intractability. To address this issue, the detection error probability at the warder and its approximation are derived first to reveal the relation between covertness performance and the number of reflection elements. Besides, to evaluate the reliability performance of the system, the decoding error probability at the receiver is also derived. Subsequently, the asymptotic reliability performance in high covertness regimes is investigated, which provides theoretical predictions about the number of reflection elements at IRS required to achieve a decoding error probability close to 0 with given covertness requirements. Furthermore, Monte-Carlo simulations verify the accuracy of the derived results for detection (decoding) error probabilities and the validity of the theoretical predictions for reflection elements. Moreover, results show that more reflection elements are required to achieve high reliability with tighter covertness requirements, longer blocklength and higher transmission rates.

Estimation/Estimator · Discretization · Sample · Reducible · state-of-the-art

September 7, 2023

DBsurf: A Discrepancy Based Method for Discrete Stochastic Gradient Estimation

Pau Mulet Arabi,Alec Flowers,Lukas Mauch,Fabien Cardinaux

from arxiv, 22 pages, 7 figures

Computing gradients of an expectation with respect to the distributional parameters of a discrete distribution is a problem arising in many fields of science and engineering. Typically, this problem is tackled using Reinforce, which frames the problem of gradient estimation as a Monte Carlo simulation. Unfortunately, the Reinforce estimator is especially sensitive to discrepancies between the true probability distribution and the drawn samples, a common issue in low sampling regimes that results in inaccurate gradient estimates. In this paper, we introduce DBsurf, a reinforce-based estimator for discrete distributions that uses a novel sampling procedure to reduce the discrepancy between the samples and the actual distribution. To assess the performance of our estimator, we subject it to a diverse set of tasks. Among existing estimators, DBsurf attains the lowest variance in a least squares problem commonly used in the literature for benchmarking. Furthermore, DBsurf achieves the best results for training variational auto-encoders (VAE) across different datasets and sampling setups. Finally, we apply DBsurf to build a simple and efficient Neural Architecture Search (NAS) algorithm with state-of-the-art performance.
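
To make the baseline in the abstract above concrete, here is a minimal sketch of the plain Reinforce (score-function) estimator for a single Bernoulli variable. It is not DBsurf and does not use its sampling procedure. The logit parameterization, the quadratic `toy_loss`, and all function names are assumptions made for illustration.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def reinforce_gradient(f, theta, num_samples, rng):
    """Monte Carlo estimate of d/dtheta E[f(x)], x ~ Bernoulli(sigmoid(theta))."""
    p = sigmoid(theta)
    total = 0.0
    for _ in range(num_samples):
        x = 1 if rng.random() < p else 0
        # Score function of a Bernoulli with logit theta:
        # d/dtheta log p(x) = x - p.
        total += f(x) * (x - p)
    return total / num_samples


def exact_gradient(f, theta):
    """Closed form: d/dtheta [p f(1) + (1 - p) f(0)] = p (1 - p) (f(1) - f(0))."""
    p = sigmoid(theta)
    return p * (1.0 - p) * (f(1) - f(0))


def toy_loss(x):
    # Hypothetical quadratic objective, chosen only for illustration.
    return (x - 0.45) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 1000, 100000):
        print(n, reinforce_gradient(toy_loss, 0.0, n, rng))
    print("exact", exact_gradient(toy_loss, 0.0))
```

With few samples, the empirical frequency of `x = 1` can differ noticeably from `p`, and the estimate inherits that mismatch. This is the low-sample discrepancy the abstract refers to.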

Learning · Processing (programming language) · Machine Learning · Reviewer · Vision

June 30, 2022

Causal Machine Learning: A Survey and Open Problems

Jean Kaddour,Aengus Lynch,Qi Liu,Matt J. Kusner,Ricardo Silva

Causal Machine Learning (CausalML) is an umbrella term for machine learning methods that formalize the data-generation process as a structural causal model (SCM). This allows one to reason about the effects of changes to this process (i.e., interventions) and what would have happened in hindsight (i.e., counterfactuals). We categorize work in CausalML into five groups according to the problems they tackle: (1) causal supervised learning, (2) causal generative modeling, (3) causal explanations, (4) causal fairness, (5) causal reinforcement learning. For each category, we systematically compare its methods and point out open problems. Further, we review modality-specific applications in computer vision, natural language processing, and graph representation learning. Finally, we provide an overview of causal benchmarks and a critical discussion of the state of this nascent field, including recommendations for future work.

Outlier · CASES · Anomaly detection · Reviewer · Machine Learning

October 21, 2021

Generalized Out-of-Distribution Detection: A Survey

Jingkang Yang,Kaiyang Zhou,Yixuan Li,Ziwei Liu

from arxiv, Issues, comments, and questions are all welcomed in //github.com/Jingkang50/OODSurvey

Out-of-distribution (OOD) detection is critical to ensuring the reliability and safety of machine learning systems. For instance, in autonomous driving, we would like the driving system to issue an alert and hand over the control to humans when it detects unusual scenes or objects that it has never seen before and cannot make a safe decision. This problem first emerged in 2017 and since then has received increasing attention from the research community, leading to a plethora of methods developed, ranging from classification-based to density-based to distance-based ones. Meanwhile, several other problems are closely related to OOD detection in terms of motivation and methodology. These include anomaly detection (AD), novelty detection (ND), open set recognition (OSR), and outlier detection (OD). Despite having different definitions and problem settings, these problems often confuse readers and practitioners, and as a result, some existing studies misuse terms. In this survey, we first present a generic framework called generalized OOD detection, which encompasses the five aforementioned problems, i.e., AD, ND, OSR, OOD detection, and OD. Under our framework, these five problems can be seen as special cases or sub-tasks, and are easier to distinguish. Then, we conduct a thorough review of each of the five areas by summarizing their recent technical developments. We conclude this survey with open challenges and potential research directions.

AdderNet · Neural Networks · Networking · Convolution · Model evaluation

December 31, 2019

AdderNet: Do We Really Need Multiplications in Deep Learning?

Hanting Chen,Yunhe Wang,Chunjing Xu,Boxin Shi,Chao Xu,Qi Tian,Chang Xu

Compared with the cheap addition operation, multiplication has much higher computational complexity. The widely-used convolutions in deep neural networks are exactly cross-correlations that measure the similarity between input feature and convolution filters, which involves massive multiplications between float values. In this paper, we present adder networks (AdderNets) to trade these massive multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs. In AdderNets, we take the $\ell_1$-norm distance between filters and input feature as the output response. The influence of this new similarity measure on the optimization of neural networks has been thoroughly analyzed. To achieve a better performance, we develop a special back-propagation approach for AdderNets by investigating the full-precision gradient. We then propose an adaptive learning rate strategy to enhance the training procedure of AdderNets according to the magnitude of each neuron's gradient. As a result, the proposed AdderNets can achieve 74.9% Top-1 accuracy and 91.7% Top-5 accuracy using ResNet-50 on the ImageNet dataset without any multiplication in the convolution layers.
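
The contrast between the two similarity measures in the abstract above can be sketched in one dimension. The example compares an ordinary cross-correlation response with an $\ell_1$-distance response. Negating the distance, so that a closer match gives a larger response, is an assumption made here for illustration, as are the function names. The paper's back-propagation scheme and adaptive learning rate are not shown.

```python
def cross_correlation_1d(signal, kernel):
    """Standard sliding-window response: sum of elementwise products."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def adder_response_1d(signal, kernel):
    """Multiplication-free response: negated l1 distance to the kernel.

    Only subtractions, absolute values, and additions are used, and the
    response peaks at 0 when a window equals the kernel.
    """
    k = len(kernel)
    return [
        -sum(abs(signal[i + j] - kernel[j]) for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


if __name__ == "__main__":
    signal = [1, 2, 3, 4]
    kernel = [2, 3]
    print(cross_correlation_1d(signal, kernel))
    print(adder_response_1d(signal, kernel))
```

On this input the cross-correlation response keeps growing with the magnitude of the signal, while the adder response is largest exactly where the window equals the kernel.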