
In this paper, we study the (decentralized) distributed optimization problem with high-dimensional sparse structure. Building upon the FedDA algorithm, we propose a (Decentralized) FedDA-GT algorithm, which incorporates the \textbf{gradient tracking} technique. Gradient tracking eliminates the heterogeneity among different clients' objective functions while preserving a dimension-free convergence rate. Compared to the vanilla FedDA approach, (D)FedDA-GT significantly reduces the communication complexity, from ${O}(s^2\log d/\varepsilon^{3/2})$ to ${O}(s^2\log d/\varepsilon)$. When strong convexity holds, we introduce a multistep mechanism, resulting in the Multistep ReFedDA-GT algorithm, a slightly modified version of FedDA-GT. This approach achieves a communication complexity of ${O}\left(s\log d \log \frac{1}{\varepsilon}\right)$ through repeated calls to the ReFedDA-GT algorithm. Finally, we conduct numerical experiments illustrating that our proposed algorithms enjoy the dual advantage of being dimension-free and heterogeneity-free.
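The abstract does not spell out the update rule, but the gradient tracking idea itself is standard: each client maintains a tracker of the network-average gradient and descends along it. Below is a minimal NumPy sketch of a DIGing-style tracking recursion; it omits the dual averaging and soft-thresholding steps that FedDA-GT adds for sparsity, and all names are illustrative.

```python
import numpy as np

def gradient_tracking(grads, X0, W, eta=0.1, iters=200):
    """DIGing-style decentralized gradient descent with gradient tracking.

    grads : list of per-client gradient functions, grads[i](x) -> (d,) array
    X0    : (n_clients, d) initial local iterates
    W     : (n_clients, n_clients) doubly stochastic mixing matrix
    """
    X = X0.copy()
    G = np.stack([g(x) for g, x in zip(grads, X)])  # current local gradients
    Y = G.copy()                                    # trackers of the average gradient
    for _ in range(iters):
        X_new = W @ X - eta * Y                     # mix with neighbors, descend along tracker
        G_new = np.stack([g(x) for g, x in zip(grads, X_new)])
        Y = W @ Y + G_new - G                       # tracking update: preserves mean(Y) = mean(G)
        X, G = X_new, G_new
    return X
```

On heterogeneous objectives the trackers converge to the global average gradient, which is what removes the client-drift bias that plain decentralized gradient descent suffers from.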

Related content

In this paper, we explore optimal treatment allocation policies that target distributional welfare. Most of the literature on treatment choice has considered utilitarian welfare based on the conditional average treatment effect (ATE). While average welfare is intuitive, it may yield undesirable allocations, especially when individuals are heterogeneous (e.g., with outliers), which is the very reason individualized treatments were introduced in the first place. This observation motivates us to propose an optimal policy that allocates the treatment based on the conditional quantile of individual treatment effects (QoTE). Depending on the choice of the quantile probability, this criterion can accommodate a policymaker who is either prudent or negligent. The challenge of identifying the QoTE lies in its requirement for knowledge of the joint distribution of the counterfactual outcomes, which is generally hard to recover even with experimental data. Therefore, we introduce minimax policies that are robust to model uncertainty. A range of identifying assumptions can be used to yield more informative policies. For both stochastic and deterministic policies, we establish the asymptotic bound on the regret of implementing the proposed policies. In simulations and two empirical applications, we compare optimal decisions based on the QoTE with decisions based on other criteria. The framework can be generalized to any setting where welfare is defined as a functional of the joint distribution of the potential outcomes.
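Once the conditional quantile of the individual treatment effect is estimated (or bounded), the induced policy is a simple plug-in rule: treat whenever the estimated quantile of the effect is positive. The sketch below illustrates that rule; the estimator q_hat is hypothetical, since identifying the ITE distribution requires the copula-type assumptions discussed above.

```python
import numpy as np

def qote_policy(q_hat, tau, X):
    """Plug-in QoTE policy: treat individual x iff the estimated conditional
    tau-quantile of the individual treatment effect Y(1) - Y(0) is positive.

    q_hat : callable (tau, x) -> estimated conditional ITE quantile
            (hypothetical estimator; in practice only bounds on this quantile
            may be identified, leading to the paper's minimax policies)
    tau   : quantile level; a small tau encodes a prudent policymaker
    X     : (n, p) array of covariates
    """
    return np.array([1 if q_hat(tau, x) > 0 else 0 for x in X])
```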

In this paper, we concentrate on solving second-order singularly perturbed Fredholm integro-differential equations (SPFIDEs). Solving these equations analytically is a challenging endeavor because of the boundary and interior layers present in the domain. To overcome these challenges, we develop a fitted second-order difference scheme that captures the layer behavior of the solution accurately and efficiently. The scheme is based on integral identities with exponential basis functions, the composite trapezoidal rule, and appropriate interpolating quadrature rules with remainder terms in integral form, all on a piecewise uniform mesh. Hence, our numerical method acts as a superior alternative to the existing methods in the literature. Further, using standard techniques of error analysis, we establish the convergence and stability of the scheme in the discrete maximum norm. We provide experimental evidence that corroborates the theoretical results with a high degree of accuracy.
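The abstract names a piecewise uniform mesh; a common concrete choice for layer-resolving schemes is a Shishkin-type mesh that condenses points in the layer regions. The sketch below builds such a mesh for a problem with boundary layers at both endpoints of $[0,1]$; the transition-point formula and the constant C are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def shishkin_mesh(N, eps, C=2.0):
    """Piecewise-uniform (Shishkin-type) mesh on [0, 1] for a problem with
    boundary layers at both endpoints; N must be divisible by 4.

    The transition point tau = min(1/4, C * eps * ln N) and the constant C
    are illustrative; the right choice depends on the layer width and the
    order of the scheme.
    """
    tau = min(0.25, C * eps * np.log(N))
    left = np.linspace(0.0, tau, N // 4 + 1)         # N/4 fine cells in the left layer
    mid = np.linspace(tau, 1.0 - tau, N // 2 + 1)    # N/2 coarse cells in the interior
    right = np.linspace(1.0 - tau, 1.0, N // 4 + 1)  # N/4 fine cells in the right layer
    return np.unique(np.concatenate([left, mid, right]))
```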

In this paper, we extend diagrammatic reasoning in monoidal categories with algebraic operations and equations. We achieve this by considering monoidal categories that are enriched in the category of Eilenberg-Moore algebras for a monad. Under the condition that this monad is monoidal and affine, we construct an adjunction between symmetric monoidal categories and symmetric monoidal categories enriched over algebras for the monad. This allows us to devise an extension, and its semantics, of the ZX-calculus with probabilistic choices by freely enriching over convex algebras, which are the algebras of the finite distribution monad. We show how this construction can be used for diagrammatic reasoning of noise in quantum systems.
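For readers less familiar with the finite distribution monad, the following Python sketch spells out its unit and bind, together with the binary convex-combination operation that its Eilenberg-Moore (convex) algebras interpret; the encoding is purely illustrative and unrelated to any particular ZX-calculus implementation.

```python
from collections import defaultdict

class Dist:
    """A finite probability distribution; Dist, unit and bind form the
    finite distribution monad, whose Eilenberg-Moore algebras are the
    convex algebras used for the enrichment."""

    def __init__(self, weights):
        self.w = {x: p for x, p in weights.items() if p > 0}

    @staticmethod
    def unit(x):
        # Monadic return: the Dirac (point-mass) distribution on x.
        return Dist({x: 1.0})

    def bind(self, f):
        # Monadic bind: push each outcome through f and flatten.
        out = defaultdict(float)
        for x, p in self.w.items():
            for y, q in f(x).w.items():
                out[y] += p * q
        return Dist(out)

def choice(p, d1, d2):
    """Probabilistic choice p * d1 + (1 - p) * d2: the convex combination
    a convex algebra interprets, i.e. the kind of operation the enriched
    ZX-calculus adds to diagrams."""
    out = defaultdict(float)
    for x, q in d1.w.items():
        out[x] += p * q
    for x, q in d2.w.items():
        out[x] += (1.0 - p) * q
    return Dist(out)
```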

In this work, we present some new results for compressed sensing and phase retrieval. For compressed sensing, it is shown that if the unknown $n$-dimensional vector can be expressed as a linear combination of $s$ unknown Vandermonde vectors (with Fourier vectors as a special case) and the measurement matrix is a Vandermonde matrix, exact recovery of the vector with $2s$ measurements and $O(\mathrm{poly}(s))$ complexity is possible when $n \geq 2s$. From these results, a measurement matrix is constructed from which it is possible to recover $s$-sparse $n$-dimensional vectors for $n \geq 2s$ with as few as $2s$ measurements and with a recovery algorithm of $O(\mathrm{poly}(s))$ complexity. In the second part of the work, these results are extended to the challenging problem of phase retrieval. The most significant discovery in this direction is that if the unknown $n$-dimensional vector is composed of $s$ frequencies with at least one being non-harmonic, $n \geq 4s - 1$, and we take at least $8s-3$ Fourier measurements, then there are, remarkably, only two possible vectors producing the observed measurement values, and they are easily obtainable from each other. The two vectors can be found by an algorithm with only $O(\mathrm{poly}(s))$ complexity. An immediate application of the new result is the construction of a measurement matrix from which it is possible to recover all $s$-sparse $n$-dimensional signals (up to a global phase) from $O(s)$ magnitude-only measurements with $O(\mathrm{poly}(s))$ recovery complexity when $n \geq 4s - 1$.
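Although the abstract does not describe the recovery algorithm, the stated budget of $2s$ Fourier measurements with $O(\mathrm{poly}(s))$ complexity is exactly what a classical Prony-type annihilating-filter method delivers for on-grid sparse vectors. The following NumPy sketch implements that classical method as a plausible reference point; it is not claimed to be the paper's algorithm.

```python
import numpy as np

def prony_recover(y, n, s):
    """Recover an s-sparse vector x (support on the integer grid) from its
    first 2s DFT samples y[k] = sum_t x[t] * exp(-2j*pi*k*t/n), k = 0..2s-1,
    via a classical annihilating-filter (Prony) method.
    """
    # The filter h (length s+1) annihilates y: sum_j h[j] * y[k-j] = 0, k >= s.
    T = np.array([[y[k - j] for j in range(s + 1)] for k in range(s, 2 * s)])
    h = np.linalg.svd(T)[2][-1].conj()          # null vector of the s x (s+1) system
    roots = np.roots(h)                         # roots equal exp(+2j*pi*t/n)
    support = np.unique(
        np.round(np.angle(roots) * n / (2 * np.pi)).astype(int) % n
    )
    # Recover amplitudes by a Vandermonde least-squares fit on the support.
    k = np.arange(2 * s)[:, None]
    V = np.exp(-2j * np.pi * k * support[None, :] / n)
    amps = np.linalg.lstsq(V, y, rcond=None)[0]
    x = np.zeros(n, dtype=complex)
    x[support] = amps
    return x
```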

In this survey, we present in a unified way the categorical and syntactical settings of the recently introduced coherent differentiation, which shows that the basic ideas of differential linear logic and of the differential lambda-calculus are compatible with determinism. Indeed, due to the Leibniz rule of differential calculus, differential linear logic and the differential lambda-calculus feature an operation of addition of proofs or terms that is operationally interpreted as a strong form of nondeterminism. The main idea of coherent differentiation is that these sums can be controlled and kept in the realm of determinism by means of a notion of summability, by enforcing summability restrictions on the derivatives that can be written in the models and in the syntax.
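The source of these sums is already visible in ordinary calculus: the Leibniz rule $(fg)' = f'g + fg'$ turns the derivative of a single product into a sum of two terms. The analogous rule in the differential lambda-calculus forces a formal addition of terms, and it is exactly this addition that coherent differentiation keeps deterministic through the summability structure.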

Data augmentation has been widely used to improve the generalizability of machine learning models. However, comparatively little work studies data augmentation for graphs. This is largely due to the complex, non-Euclidean structure of graphs, which limits the possible manipulation operations: augmentation operations commonly used in vision and language have no analogs for graphs. Our work studies graph data augmentation for graph neural networks (GNNs) in the context of improving semi-supervised node classification. We discuss practical and theoretical motivations, considerations, and strategies for graph data augmentation. Our work shows that neural edge predictors can effectively encode class-homophilic structure to promote intra-class edges and demote inter-class edges in a given graph structure. Our main contribution is the GAug graph data augmentation framework, which leverages these insights to improve performance in GNN-based node classification via edge prediction. Extensive experiments on multiple benchmarks show that augmentation via GAug improves performance across GNN architectures and datasets.
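As a rough illustration of the edge-prediction idea (not the exact GAug procedure), the sketch below blends the observed adjacency with probabilities from a learned edge predictor and samples an augmented graph; M, alpha, and the sampling scheme are assumptions for illustration.

```python
import numpy as np

def augment_adjacency(A, M, alpha=0.5, rng=None):
    """Interpolate the observed adjacency with probabilities from a learned
    edge predictor and sample an augmented graph for GNN training.

    A     : (n, n) symmetric binary adjacency matrix
    M     : (n, n) edge probabilities from a neural edge predictor, e.g. a
            graph-autoencoder dot-product decoder (assumed given)
    alpha : interpolation weight (illustrative hyperparameter)
    """
    rng = rng or np.random.default_rng()
    P = alpha * M + (1.0 - alpha) * A            # blend predicted and observed edges
    upper = np.triu(rng.random(A.shape) < P, 1)  # sample each undirected edge once
    A_aug = upper.astype(float)
    return A_aug + A_aug.T                       # keep the graph undirected
```

Intuitively, confident intra-class edges get boosted and spurious inter-class edges get a chance to be dropped, which is the homophily-promoting effect the paragraph above describes.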

In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions that are temporally memorized in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency. Second, a new optimization approach, called dynamic-critical reinforcement learning, is introduced to extend the standard supervised method. It encourages the model to predict a more acceptable answer, addressing the convergence suppression problem that occurs in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Meanwhile, our model outperforms previous systems by over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD datasets.
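The reattention equations are not given in the abstract; as a toy illustration of the idea, the sketch below refines current attention scores with an attention distribution memorized from a previous alignment round. The additive gating form and the gate gamma are illustrative assumptions, not the paper's exact mechanism.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def reattend(S, E_past, gamma=0.5):
    """Refine current attention scores S (m x n) with the attention
    distribution E_past memorized from a previous alignment round.

    The memorized attention acts as an additive prior on the scores, so
    that positions over- or under-attended earlier (redundancy/deficiency)
    can be corrected in the next round.
    """
    return softmax(S + gamma * E_past)
```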

Inferring missing links in knowledge graphs (KGs) has attracted a lot of attention from the research community. In this paper, we tackle a practical query answering task: predicting the relation of a given entity pair. We frame this prediction problem as an inference problem in a probabilistic graphical model and aim to resolve it from a variational inference perspective. In order to model the relation between the query entity pair, we assume that there exists an underlying latent variable (the paths connecting the two nodes) in the KG, which carries the equivalent semantics of their relation. However, due to the intractability of connections in large KGs, we propose to use variational inference to maximize the evidence lower bound. More specifically, our framework (\textsc{Diva}) is composed of three modules: a posterior approximator, a prior (path finder), and a likelihood (path reasoner). By using variational inference, we are able to incorporate them closely into a unified architecture and jointly optimize them to perform KG reasoning. With active interactions among these sub-modules, \textsc{Diva} is better at handling noise and coping with more complex reasoning scenarios. To evaluate our method, we conduct link prediction experiments on two datasets and achieve state-of-the-art performance on both.
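The three modules fit together through a standard Monte Carlo estimate of the evidence lower bound. The sketch below shows that decomposition in PyTorch; the function and tensor names are illustrative, not \textsc{Diva}'s actual interface.

```python
import torch

def path_elbo(log_lik, log_prior, log_q):
    """Monte Carlo estimate of the evidence lower bound for relation
    prediction with latent paths (names are illustrative).

    log_lik   : (n_samples,) log p(relation | path)           -- path reasoner
    log_prior : (n_samples,) log p(path | entities)           -- path finder
    log_q     : (n_samples,) log q(path | entities, relation) -- posterior approximator
    """
    # ELBO = E_q[log p(r | path)] - KL(q || p), estimated with paths ~ q.
    # With discrete path samples, gradients w.r.t. q's parameters would
    # additionally need score-function (REINFORCE-style) estimators.
    return (log_lik + log_prior - log_q).mean()
```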

This paper proposes a method to modify traditional convolutional neural networks (CNNs) into interpretable CNNs, in order to clarify the knowledge representations in the high conv-layers of a CNN. In an interpretable CNN, each filter in a high conv-layer represents a certain object part. We do not need any annotations of object parts or textures to supervise the learning process; instead, the interpretable CNN automatically assigns each filter in a high conv-layer an object part during learning. Our method can be applied to different types of CNNs with various structures. The clear knowledge representation in an interpretable CNN can help people understand the logic inside a CNN, i.e., which patterns the CNN bases its decisions on. Experiments showed that filters in an interpretable CNN were more semantically meaningful than those in traditional CNNs.
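The paper's actual objective is a mutual-information loss with part templates; as a loose, simplified proxy for what "a filter represents an object part" asks of the activations, the sketch below penalizes activation mass far from each feature map's peak. It is an illustrative stand-in, not the paper's loss.

```python
import torch

def localization_penalty(fmap):
    """Penalize activation mass far from each feature map's peak, nudging a
    filter toward firing on a single localized region (a crude proxy for
    part-like behavior).

    fmap : (B, C, H, W) activations of a high conv-layer
    """
    B, C, H, W = fmap.shape
    a = fmap.clamp(min=0).flatten(2)                 # (B, C, H*W) nonnegative mass
    p = a / (a.sum(dim=-1, keepdim=True) + 1e-8)     # activation distribution
    idx = p.argmax(dim=-1)                           # peak cell per (batch, filter)
    ys, xs = idx // W, idx % W
    gy, gx = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    d2 = (gy.flatten()[None, None] - ys[..., None]) ** 2 \
       + (gx.flatten()[None, None] - xs[..., None]) ** 2
    return (p * d2).sum(dim=-1).mean()               # expected squared distance to peak
```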

In this paper, we propose a conceptually simple and geometrically interpretable objective function, the additive margin Softmax (AM-Softmax), for deep face verification. In general, the face verification task can be viewed as a metric learning problem, so learning large-margin face features whose intra-class variation is small and inter-class difference is large is of great importance for achieving good performance. Recently, Large-margin Softmax and Angular Softmax have been proposed to incorporate the angular margin in a multiplicative manner. In this work, we introduce a novel additive angular margin for the Softmax loss, which is intuitively appealing and more interpretable than the existing works. We also emphasize and discuss the importance of feature normalization. Most importantly, our experiments on LFW BLUFR and MegaFace show that our additive margin Softmax loss consistently performs better than the current state-of-the-art methods using the same network architecture and training dataset. Our code is available at https://github.com/happynear/AMSoftmax
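The additive-margin idea is compact enough to state in a few lines: normalize features and class weights, subtract a margin m from the target-class cosine, and scale by s before the cross-entropy. The PyTorch sketch below follows this commonly cited formulation; the defaults (s=30, m=0.35) are typical AM-Softmax values, not necessarily optimal for every dataset.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AMSoftmaxLoss(nn.Module):
    """Additive margin Softmax: normalize features and class weights,
    subtract the margin m from the target-class cosine, scale by s,
    then apply standard cross-entropy."""

    def __init__(self, feat_dim, n_classes, s=30.0, m=0.35):
        super().__init__()
        self.W = nn.Parameter(torch.randn(feat_dim, n_classes))
        self.s, self.m = s, m

    def forward(self, feats, labels):
        x = F.normalize(feats, dim=1)              # unit-norm features
        W = F.normalize(self.W, dim=0)             # unit-norm class weights
        cos = x @ W                                # (B, C) cosine similarities
        margin = self.m * F.one_hot(labels, cos.size(1)).float()
        return F.cross_entropy(self.s * (cos - margin), labels)
```

Because the margin is subtracted from the cosine rather than multiplied into the angle, the decision boundary shifts by a constant, which is what makes the objective easier to interpret and optimize than its multiplicative predecessors.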
