
This work is concerned with the classical wave equation with a high-contrast coefficient in the spatial derivative operator. We first treat the periodic case and derive a new limit in one dimension. The behavior is illustrated numerically and contrasted with the higher-dimensional case. For general unstructured high-contrast coefficients, we present the Localized Orthogonal Decomposition and show a priori error estimates in suitably weighted norms. Numerical experiments illustrate the convergence rates in various settings.

Related content

Contrast


Performer · SGD · MoDELS · Learning · Federated Learning

May 16, 2023

Faster Federated Learning with Decaying Number of Local SGD Steps

Jed Mills, Jia Hu, Geyong Min

In Federated Learning (FL), client devices connected over the internet collaboratively train a machine learning model without sharing their private data with a central server or with other clients. The seminal Federated Averaging (FedAvg) algorithm trains a single global model by performing rounds of local training on clients followed by model averaging. FedAvg can improve the communication-efficiency of training by performing more steps of Stochastic Gradient Descent (SGD) on clients in each round. However, client data in real-world FL is highly heterogeneous, which has been extensively shown to slow model convergence and harm final performance when $K > 1$ steps of SGD are performed on clients per round. In this work we propose decaying $K$ as training progresses, which can jointly improve the final performance of the FL model whilst reducing the wall-clock time and the total computational cost of training compared to using a fixed $K$. We analyse the convergence of FedAvg with decaying $K$ for strongly-convex objectives, providing novel insights into the convergence properties, and derive three theoretically-motivated decay schedules for $K$. We then perform thorough experiments on four benchmark FL datasets (FEMNIST, CIFAR100, Sentiment140, Shakespeare) to show the practical benefit of our approaches in terms of real-world convergence time, computational cost, and generalisation performance.
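The mechanics of FedAvg with a decaying number of local steps can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the scalar least-squares objective, the client data, and the geometric schedule in `decayed_k` are all hypothetical, and the schedule is not one of the three derived in the paper.

```python
import random

def local_sgd(w, data, steps, lr, rng):
    """Run `steps` SGD steps on one client's objective 0.5 * (w - x)^2."""
    for _ in range(steps):
        x = rng.choice(data)   # sample one local data point
        grad = w - x           # gradient of 0.5 * (w - x)^2 with respect to w
        w -= lr * grad
    return w

def decayed_k(k0, round_idx, decay=0.9):
    """Hypothetical schedule: geometric decay of K, never below 1."""
    return max(1, int(round(k0 * decay ** round_idx)))

def fedavg(client_data, rounds=50, k0=16, lr=0.1, seed=0):
    """FedAvg: each round, every client trains locally for K steps, then models are averaged."""
    rng = random.Random(seed)
    w_global = 0.0
    total_steps = 0
    for r in range(rounds):
        k = decayed_k(k0, r)
        local_models = [local_sgd(w_global, d, k, lr, rng) for d in client_data]
        w_global = sum(local_models) / len(local_models)  # model averaging
        total_steps += k * len(client_data)
    return w_global, total_steps

# Three clients with deliberately different (heterogeneous) data.
clients = [[1.0, 1.2, 0.8], [4.0, 4.1, 3.9], [-2.0, -1.9, -2.1]]
w, steps = fedavg(clients)
```

With a fixed $K = 16$ the same run would cost `50 * 16 * 3` local steps; the decaying schedule spends most of its computation in the early rounds and far less overall.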

Networking · INFORMS · Neural Networks · Cost · Convolutional Neural Networks

May 16, 2023

Content-Adaptive Downsampling in Convolutional Neural Networks

Robin Hesse, Simone Schaub-Meyer, Stefan Roth

From arXiv. Accepted at the CVPR 2023 Workshop on Efficient Deep Learning for Computer Vision (ECV). Code: //github.com/visinf/cad

Many convolutional neural networks (CNNs) rely on progressive downsampling of their feature maps to increase the network's receptive field and decrease computational cost. However, this comes at the price of losing granularity in the feature maps, limiting the ability to correctly understand images or recover fine detail in dense prediction tasks. To address this, common practice is to replace the last few downsampling operations in a CNN with dilated convolutions, which retain the feature map resolution without reducing the receptive field, albeit at increased computational cost. This makes it possible to trade off predictive performance against cost, depending on the output feature resolution. By either regularly downsampling or not downsampling the entire feature map, existing work implicitly treats all regions of the input image and subsequent feature maps as equally important, which generally does not hold. We propose an adaptive downsampling scheme that generalizes the above idea by processing informative regions at a higher resolution than less informative ones. In a variety of experiments, we demonstrate the versatility of our adaptive downsampling strategy and empirically show that it improves the cost-accuracy trade-off of various established CNNs.
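The trade-off between strided and dilated convolutions described above can be checked with standard receptive-field arithmetic. The sketch below is a one-dimensional illustration with hypothetical layer configurations; it does not reproduce the adaptive scheme proposed in the paper.

```python
def receptive_field(layers):
    """Compute (receptive field, output stride) in input pixels.

    `layers` is a list of (kernel_size, stride, dilation) tuples,
    one per convolution, applied in order.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf, jump

# Two downsampling layers followed by a regular 3-tap convolution.
strided = [(3, 2, 1), (3, 2, 1), (3, 1, 1)]
# Second downsampling removed; the following layer uses dilation 2 instead.
dilated = [(3, 2, 1), (3, 1, 1), (3, 1, 2)]

print(receptive_field(strided))  # (15, 4)
print(receptive_field(dilated))  # (15, 2)
```

Both stacks see 15 input pixels per output, but the dilated variant keeps twice the feature-map resolution, which is where the extra computational cost comes from.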

Dropout · Data Splitting · AIM · Proportional · SimPLe

May 15, 2023

Evaluating Splitting Approaches in the Context of Student Dropout Prediction

Bruno de M. Barros, Hugo A. D. do Nascimento, Raphael Guedes, Sandro E. Monsueto

From arXiv: 11 pages, 3 figures, 3 tables, FECS'21 - The 17th International Conference on Frontiers in Education: Computer Science and Computer Engineering, Transactions on Computational Science and Computational Intelligence

The prediction of academic dropout, with the aim of preventing it, is one of the current challenges of higher education institutions. Machine learning techniques are a great ally in this task. However, attention is needed in the way that academic data are used by such methods, so that their use reflects the reality of the prediction problem under study and allows achieving good results. In this paper, we study strategies for splitting and using academic data in order to create training and testing sets. Through a conceptual analysis and experiments with data from a public higher education institution, we show that random proportional data splitting, and even simple temporal splitting, are not suitable for dropout prediction. The study indicates that temporal splitting combined with a time-based selection of the students' incremental academic histories is the best strategy for the problem in question.
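For readers unfamiliar with the terminology, a simple temporal split can be sketched as follows. The `Record` fields and the cutoff are hypothetical, and this shows only the baseline the abstract calls "simple temporal splitting"; the time-based selection of incremental academic histories that the study recommends is not specified in the abstract and is not reproduced here.

```python
from typing import NamedTuple

class Record(NamedTuple):
    student_id: int     # hypothetical identifier
    year: int           # academic year the record refers to
    dropped_out: bool   # prediction target

def temporal_split(records, cutoff_year):
    """Train on records strictly before `cutoff_year`, test on the rest."""
    train = [r for r in records if r.year < cutoff_year]
    test = [r for r in records if r.year >= cutoff_year]
    return train, test

records = [
    Record(1, 2018, False),
    Record(2, 2019, True),
    Record(3, 2020, False),
    Record(4, 2021, True),
]
train, test = temporal_split(records, cutoff_year=2020)
```

Unlike a random proportional split, this never lets the model train on records from a later period than the one it is evaluated on.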

Integration · Output · Performer · Discretization · Scalar

May 14, 2023

Validated integration of semilinear parabolic PDEs

Jan Bouwe van den Berg, Maxime Breden, Ray Sheombarsing

Integrating evolutionary partial differential equations (PDEs) is an essential ingredient for studying the dynamics of the solutions. Indeed, simulations are at the core of scientific computing, but their mathematical reliability is often difficult to quantify, especially when one is interested in the output of a given simulation, rather than in the asymptotic regime where the discretization parameter tends to zero. In this paper we present a computer-assisted proof methodology to perform rigorous time integration for scalar semilinear parabolic PDEs with periodic boundary conditions. We formulate an equivalent zero-finding problem based on a variation of constants formula in Fourier space. Using Chebyshev interpolation and domain decomposition, we then finish the proof with a Newton--Kantorovich type argument. The final output of this procedure is a proof of existence of an orbit, together with guaranteed error bounds between this orbit and a numerically computed approximation. We illustrate the versatility of the approach with results for the Fisher equation, the Swift--Hohenberg equation, the Ohta--Kawasaki equation and the Kuramoto--Sivashinsky equation. We expect that this rigorous integrator can form the basis for studying boundary value problems for connecting orbits in partial differential equations.
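For orientation, a variation of constants formula in Fourier space has the standard form below. The notation is assumed for illustration and is not taken from the paper: the PDE is written as $u_t = Lu + N(u)$ with a linear part $L$ that is diagonal in the Fourier basis with eigenvalues $\lambda_k$, $a_k(t)$ is the $k$-th Fourier coefficient of $u(\cdot, t)$, and $N_k$ is the $k$-th Fourier coefficient of the nonlinearity.

```latex
a_k(t) = e^{\lambda_k t}\, a_k(0)
       + \int_0^t e^{\lambda_k (t - s)}\, N_k\bigl(a(s)\bigr)\, ds,
\qquad k \in \mathbb{Z}.
```

Under these assumptions, one natural zero-finding problem is to seek a zero of the map whose $k$-th component is the left-hand side minus the right-hand side of this identity.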

Signal Processing · Discretization · Processing (programming language) · Generalization Theory · Noise

May 14, 2023

Tao General Differential and Difference: Theory and Application

Linmi Tao, Ruiyang Liu, Donglai Tao, Wu Xia, Feilong Ma, Jingmao Cui

Modern numerical analysis operates on discrete data, and numerical difference computation is one of its indispensable core operations. Nevertheless, difference algorithms are critically sensitive to noise, which has long posed a challenge in various fields including signal processing. The difference is an extension or generalization of the differential to the discrete domain. However, because the interval in discrete calculation is finite, the difference fails to meet the most fundamental definition of the differential, in which dy and dx are both infinitesimal (Leibniz) or the limit of dx is 0 (Cauchy). In this regard, the generalization of the differential to the difference does not hold. To address this issue, we depart from the original derivative approach, construct a finite interval-based differential, and further generalize it to obtain the difference by convolution. Based on this theory, we present a variety of difference operators suitable for practical signal processing. Experimental results demonstrate that these difference operators possess exceptional signal processing capabilities, including high noise immunity.
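The general idea of computing a difference by convolution over a finite interval can be illustrated with a generic smoothed difference operator. The sketch below uses derivative-of-Gaussian weights, which is an assumption chosen for illustration; it is not the operator family defined in the paper, and `radius` and `sigma` are hypothetical parameters.

```python
import math

def smoothed_difference_weights(radius, sigma):
    """Antisymmetric weights w[j], j = -radius..radius, proportional to
    j * exp(-j^2 / (2 sigma^2)) and normalised so that sum(j * w[j]) == 1.
    The operator is then exact for linear signals sampled at unit spacing."""
    offsets = range(-radius, radius + 1)
    raw = [j * math.exp(-(j * j) / (2.0 * sigma * sigma)) for j in offsets]
    norm = sum(j * w for j, w in zip(offsets, raw))
    return [w / norm for w in raw]

def apply_difference(y, weights):
    """Slide the weights over y; only positions with full support are returned."""
    r = len(weights) // 2
    return [
        sum(w * y[i + j] for j, w in zip(range(-r, r + 1), weights))
        for i in range(r, len(y) - r)
    ]

weights = smoothed_difference_weights(radius=3, sigma=1.5)
line = [3.0 * i + 1.0 for i in range(20)]   # slope 3 everywhere
slopes = apply_difference(line, weights)     # each value is 3 up to rounding
```

For independent noise of equal variance on each sample, the noise variance of the output scales with the sum of squared weights. The two-point central difference with weights $(-1/2, 0, 1/2)$ has a sum of $0.5$, and the wider kernel above has a much smaller one, which is the usual reason wider difference kernels are less sensitive to noise.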

state-of-the-art · Model Evaluation · Scenario · MoDELS · Reviewer

May 13, 2023

Detection and Mitigation of Byzantine Attacks in Distributed Training

Konstantinos Konstantinidis, Namrata Vaswani, Aditya Ramamoorthy

From arXiv: 21 pages, 17 figures, 6 tables. The material in this work appeared in part in arXiv:2108.02416, which was published at the 2022 IEEE International Symposium on Information Theory.

A plethora of modern machine learning tasks require the utilization of large-scale distributed clusters as a critical component of the training pipeline. However, abnormal Byzantine behavior of the worker nodes can derail the training and compromise the quality of the inference. Such behavior can be attributed to unintentional system malfunctions or orchestrated attacks; as a result, some nodes may return arbitrary results to the parameter server (PS) that coordinates the training. Recent work considers a wide range of attack models and has explored robust aggregation and/or computational redundancy to correct the distorted gradients. In this work, we consider attack models ranging from strong ones ($q$ omniscient adversaries with full knowledge of the defense protocol that can change from iteration to iteration) to weak ones ($q$ randomly chosen adversaries with limited collusion abilities that change only every few iterations). Our algorithms rely on redundant task assignments coupled with detection of adversarial behavior. We also show the convergence of our method to the optimal point under common assumptions and settings considered in the literature. For strong attacks, we demonstrate a reduction in the fraction of distorted gradients ranging from 16% to 99% as compared to the prior state-of-the-art. Our top-1 classification accuracy results on the CIFAR-10 data set demonstrate a 25% advantage in accuracy (averaged over strong and weak scenarios) under the most sophisticated attacks compared to state-of-the-art methods.
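The combination of redundant task assignment and detection can be illustrated with a plain majority vote over replicated results. This is a generic sketch, not the authors' algorithm: the function name, the use of exact equality between replicas (which assumes deterministic gradient computation), and the strict-majority rule are all assumptions made for illustration.

```python
from collections import Counter

def majority_vote(results):
    """Pick the value returned by a strict majority of workers.

    `results` maps worker id -> returned gradient (as a hashable tuple).
    Returns (majority value, sorted list of workers that disagreed).
    """
    counts = Counter(results.values())
    value, votes = counts.most_common(1)[0]
    if votes * 2 <= len(results):
        raise ValueError("no strict majority among the replicas")
    suspects = sorted(w for w, v in results.items() if v != value)
    return value, suspects

# The same gradient task was assigned to three workers; worker 2 is faulty.
replies = {0: (1.0, 2.0), 1: (1.0, 2.0), 2: (9.0, 9.0)}
gradient, suspects = majority_vote(replies)
```

A vote of this kind recovers the correct gradient only while honest workers outnumber adversaries within each replica group, which is why the assignment of tasks to workers matters.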

Analysis · Regularization Term · Estimation/Estimator · Pair · Divergence

May 12, 2023

The Crouzeix-Raviart Element for non-conforming dual mixed methods: A Priori Analysis

Tomás P. Barrios, J. Manuel Cascón, Andreas Wachtel

From arXiv: 25 pages, 5 tables, 8 figures

Under some regularity assumptions, we report an a priori error analysis of a dG scheme for the Poisson and Stokes flow problems in their dual mixed formulation. Both formulations satisfy a Babuška-Brezzi type condition within the space H(div) x L2. It is well known that the lowest order Crouzeix-Raviart element paired with piecewise constants satisfies such a condition on (broken) H1 x L2 spaces. In the present article, we use this pair. The continuity of the normal component is weakly imposed by penalizing jumps of the broken H(div) component. For the resulting methods, we prove well-posedness and convergence with constants independent of data and mesh size. We report error estimates in the methods' natural norms and optimal local error estimates for the divergence error. In fact, our finite element solution shares for each triangle one DOF with the CR interpolant and the divergence is locally the best-approximation for any regularity. Numerical experiments support the findings and suggest that the other errors converge optimally even for the lowest regularity solutions and a crack-problem, as long as the crack is resolved by the mesh.

SMORE · Multi-hop · Graph · Performer · entity

October 28, 2021

SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs

Hongyu Ren, Hanjun Dai, Bo Dai, Xinyun Chen, Denny Zhou, Jure Leskovec, Dale Schuurmans

Knowledge graphs (KGs) capture knowledge in the form of head--relation--tail triples and are a crucial component in many AI systems. There are two important reasoning tasks on KGs: (1) single-hop knowledge graph completion, which involves predicting individual links in the KG; and (2) multi-hop reasoning, where the goal is to predict which KG entities satisfy a given logical query. Embedding-based methods solve both tasks by first computing an embedding for each entity and relation, then using them to form predictions. However, existing scalable KG embedding frameworks only support single-hop knowledge graph completion and cannot be applied to the more challenging multi-hop reasoning task. Here we present Scalable Multi-hOp REasoning (SMORE), the first general framework for both single-hop and multi-hop reasoning in KGs. Using a single machine, SMORE can perform multi-hop reasoning in Freebase KG (86M entities, 338M edges), which is 1,500x larger than previously considered KGs. The key to SMORE's runtime performance is a novel bidirectional rejection sampling that achieves a square root reduction of the complexity of online training data generation. Furthermore, SMORE exploits asynchronous scheduling, overlapping CPU-based data sampling, GPU-based embedding computation, and frequent CPU--GPU IO. SMORE increases throughput (i.e., training speed) over prior multi-hop KG frameworks by 2.2x with minimal GPU memory requirements (2GB for training 400-dim embeddings on 86M-node Freebase) and achieves near linear speed-up with the number of GPUs. Moreover, on the simpler single-hop knowledge graph completion task, SMORE achieves runtime performance comparable to or even better than that of state-of-the-art frameworks in both single-GPU and multi-GPU settings.

Graph Convolution · Graph Convolutional Neural Network/Graph Convolutional Network · Graph · Convolution · Networking

December 15, 2020

Coupled Layer-wise Graph Convolution for Transportation Demand Prediction

Junchen Ye, Leilei Sun, Bowen Du, Yanjie Fu, Hui Xiong

Graph Convolutional Network (GCN) has been widely applied in transportation demand prediction due to its excellent ability to capture non-Euclidean spatial dependence among station-level or regional transportation demands. However, in most of the existing research, the graph convolution was implemented on a heuristically generated adjacency matrix, which could neither reflect the real spatial relationships of stations accurately, nor capture the multi-level spatial dependence of demands adaptively. To cope with the above problems, this paper provides a novel graph convolutional network for transportation demand prediction. Firstly, a novel graph convolution architecture is proposed, which has different adjacency matrices in different layers, all of which are self-learned during the training process. Secondly, a layer-wise coupling mechanism is provided, which associates the upper-level adjacency matrix with the lower-level one. It also reduces the scale of parameters in our model. Lastly, a unitary network is constructed to give the final prediction result by integrating the hidden spatial states with a gated recurrent unit, which could capture the multi-level spatial dependence and temporal dynamics simultaneously. Experiments have been conducted on two real-world datasets, NYC Citi Bike and NYC Taxi, and the results demonstrate the superiority of our model over the state-of-the-art ones.

Discretization · Graph · Graphics Processing Unit · Neural Networks · Networking

March 28, 2019

Learning Discrete Structures for Graph Neural Networks

Luca Franceschi, Mathias Niepert, Massimiliano Pontil, Xiao He

From arXiv: 18 pages

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
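A discrete probability distribution on the edges of a graph can be pictured as one Bernoulli parameter per candidate edge, from which concrete graphs are sampled. The sketch below shows only that sampling step, under assumptions made for illustration: the graph is undirected without self-loops, edges are independent, and the function name is hypothetical. The bilevel optimisation that learns the parameters in the paper is not reproduced.

```python
import random

def sample_adjacency(edge_probs, rng):
    """Sample a symmetric 0/1 adjacency matrix.

    `edge_probs[i][j]` (for i < j) is the probability that the undirected
    edge {i, j} is present; the diagonal and lower triangle are ignored.
    """
    n = len(edge_probs)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < edge_probs[i][j]:
                adj[i][j] = adj[j][i] = 1
    return adj

rng = random.Random(0)
# Probabilities of 0 and 1 make the sample deterministic for this example.
probs = [[0.0, 1.0, 0.0],
         [1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0]]
graph = sample_adjacency(probs, rng)
```

In a learning setting the entries of `edge_probs` would lie strictly between 0 and 1, and several sampled graphs would typically be used to estimate the expected loss.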