亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Maximum mean discrepancies (MMDs) like the kernel Stein discrepancy (KSD) have grown central to a wide range of applications, including hypothesis testing, sampler selection, distribution approximation, and variational inference. In each setting, these kernel-based discrepancy measures are required to (i) separate a target P from other probability measures or even (ii) control weak convergence to P. In this article we derive new sufficient and necessary conditions to ensure (i) and (ii). For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels and for controlling convergence with bounded kernels. We use these results on $\mathbb{R}^d$ to substantially broaden the known conditions for KSD separation and convergence control and to develop the first KSDs known to exactly metrize weak convergence to P. Along the way, we highlight the implications of our results for hypothesis testing, measuring and improving sample quality, and sampling with Stein variational gradient descent.

相關內容

Resource allocation is a fundamental task in cell-free (CF) massive multi-input multi-output (MIMO) systems, which can effectively improve the network performance. In this paper, we study the downlink of CF MIMO networks with network clustering and linear precoding, and develop a sequential multiuser scheduling and power allocation scheme. In particular, we present a multiuser scheduling algorithm based on greedy techniques and a gradient ascent {(GA)} power allocation algorithm for sum-rate maximization when imperfect channel state information (CSI) is considered. Numerical results show the superiority of the proposed sequential scheduling and power allocation scheme and algorithms to existing approaches while reducing the computational complexity and the signaling load.

Machine Learning (ML) is an expressive framework for turning data into computer programs. Across many problem domains -- both in industry and policy settings -- the types of computer programs needed for accurate prediction or optimal control are difficult to write by hand. On the other hand, collecting instances of desired system behavior may be relatively more feasible. This makes ML broadly appealing, but also induces data sensitivities that often manifest as unexpected failure modes during deployment. In this sense, the training data available tend to be imperfect for the task at hand. This thesis explores several data sensitivities of modern machine learning and how to address them. We begin by discussing how to prevent ML from codifying prior human discrimination measured in the training data, where we take a fair representation learning approach. We then discuss the problem of learning from data containing spurious features, which provide predictive fidelity during training but are unreliable upon deployment. Here we observe that insofar as standard training methods tend to learn such features, this propensity can be leveraged to search for partitions of training data that expose this inconsistency, ultimately promoting learning algorithms invariant to spurious features. Finally, we turn our attention to reinforcement learning from data with insufficient coverage over all possible states and actions. To address the coverage issue, we discuss how causal priors can be used to model the single-step dynamics of the setting where data are collected. This enables a new type of data augmentation where observed trajectories are stitched together to produce new but plausible counterfactual trajectories.

In millimeter-wave (mmWave) cellular systems, reconfigurable intelligent surfaces (RISs) are foreseeably deployed with a large number of reflecting elements to achieve high beamforming gains. The large-sized RIS will make radio links fall in the near-field localization regime with spatial non-stationarity issues. Moreover, the discrete phase restriction on the RIS reflection coefficient incurs exponential complexity for discrete beamforming. It remains an open problem to find the optimal RIS reflection coefficient design in polynomial time. To address these issues, we propose a scalable partitioned-far-field protocol that considers both the near-filed non-stationarity and discrete beamforming. The protocol approximates near-field signal propagation using a partitioned-far-field representation to inherit the sparsity from the sophisticated far-field and facilitate the near-field localization scheme. To improve the theoretical localization performance, we propose a fast passive beamforming (FPB) algorithm that optimally solves the discrete RIS beamforming problem, reducing the search complexity from exponential order to linear order. Furthermore, by exploiting the partitioned structure of RIS, we introduce a two-stage coarse-to-fine localization algorithm that leverages both the time delay and angle information. Numerical results demonstrate that centimeter-level localization precision is achieved under medium and high signal-to-noise ratios (SNR), revealing that RISs can provide support for low-cost and high-precision localization in future cellular systems.

The Split and Rephrase task, which consists in splitting complex sentences into a sequence of shorter grammatical sentences, while preserving the original meaning, can facilitate the processing of complex texts for humans and machines alike. In this work, we describe an approach based on large language models, which improves over the state of the art by large margins on all the major metrics for the task, on publicly available datasets. We also describe results from two human evaluations that further establish the significant improvements obtained with large language models and the viability of the approach. We evaluate different strategies, including fine-tuning pretrained language models of varying parameter size, and applying both zero-shot and few-shot in-context learning on instruction-tuned language models. Although the latter were markedly outperformed by fine-tuned models, they still achieved promising results overall. Our results thus demonstrate the strong potential of different variants of large language models for the Split and Rephrase task, using relatively small amounts of training samples and model parameters overall.

Graph Neural Networks (GNNs) have shown promising results on a broad spectrum of applications. Most empirical studies of GNNs directly take the observed graph as input, assuming the observed structure perfectly depicts the accurate and complete relations between nodes. However, graphs in the real world are inevitably noisy or incomplete, which could even exacerbate the quality of graph representations. In this work, we propose a novel Variational Information Bottleneck guided Graph Structure Learning framework, namely VIB-GSL, in the perspective of information theory. VIB-GSL advances the Information Bottleneck (IB) principle for graph structure learning, providing a more elegant and universal framework for mining underlying task-relevant relations. VIB-GSL learns an informative and compressive graph structure to distill the actionable information for specific downstream tasks. VIB-GSL deduces a variational approximation for irregular graph data to form a tractable IB objective function, which facilitates training stability. Extensive experimental results demonstrate that the superior effectiveness and robustness of VIB-GSL.

Graph Neural Networks (GNNs) have received considerable attention on graph-structured data learning for a wide variety of tasks. The well-designed propagation mechanism which has been demonstrated effective is the most fundamental part of GNNs. Although most of GNNs basically follow a message passing manner, litter effort has been made to discover and analyze their essential relations. In this paper, we establish a surprising connection between different propagation mechanisms with a unified optimization problem, showing that despite the proliferation of various GNNs, in fact, their proposed propagation mechanisms are the optimal solution optimizing a feature fitting function over a wide class of graph kernels with a graph regularization term. Our proposed unified optimization framework, summarizing the commonalities between several of the most representative GNNs, not only provides a macroscopic view on surveying the relations between different GNNs, but also further opens up new opportunities for flexibly designing new GNNs. With the proposed framework, we discover that existing works usually utilize naive graph convolutional kernels for feature fitting function, and we further develop two novel objective functions considering adjustable graph kernels showing low-pass or high-pass filtering capabilities respectively. Moreover, we provide the convergence proofs and expressive power comparisons for the proposed models. Extensive experiments on benchmark datasets clearly show that the proposed GNNs not only outperform the state-of-the-art methods but also have good ability to alleviate over-smoothing, and further verify the feasibility for designing GNNs with our unified optimization framework.

Answering questions that require reading texts in an image is challenging for current models. One key difficulty of this task is that rare, polysemous, and ambiguous words frequently appear in images, e.g., names of places, products, and sports teams. To overcome this difficulty, only resorting to pre-trained word embedding models is far from enough. A desired model should utilize the rich information in multiple modalities of the image to help understand the meaning of scene texts, e.g., the prominent text on a bottle is most likely to be the brand. Following this idea, we propose a novel VQA approach, Multi-Modal Graph Neural Network (MM-GNN). It first represents an image as a graph consisting of three sub-graphs, depicting visual, semantic, and numeric modalities respectively. Then, we introduce three aggregators which guide the message passing from one graph to another to utilize the contexts in various modalities, so as to refine the features of nodes. The updated nodes have better features for the downstream question answering module. Experimental evaluations show that our MM-GNN represents the scene texts better and obviously facilitates the performances on two VQA tasks that require reading scene texts.

Knowledge graph completion aims to predict missing relations between entities in a knowledge graph. While many different methods have been proposed, there is a lack of a unifying framework that would lead to state-of-the-art results. Here we develop PathCon, a knowledge graph completion method that harnesses four novel insights to outperform existing methods. PathCon predicts relations between a pair of entities by: (1) Considering the Relational Context of each entity by capturing the relation types adjacent to the entity and modeled through a novel edge-based message passing scheme; (2) Considering the Relational Paths capturing all paths between the two entities; And, (3) adaptively integrating the Relational Context and Relational Path through a learnable attention mechanism. Importantly, (4) in contrast to conventional node-based representations, PathCon represents context and path only using the relation types, which makes it applicable in an inductive setting. Experimental results on knowledge graph benchmarks as well as our newly proposed dataset show that PathCon outperforms state-of-the-art knowledge graph completion methods by a large margin. Finally, PathCon is able to provide interpretable explanations by identifying relations that provide the context and paths that are important for a given predicted relation.

It is important to detect anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments on natural language processing and small- and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance. We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance.

Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.

北京阿比特科技有限公司