
Developing robust sparse models for safety-critical and resource-constrained systems such as drones and autonomous robots has been a problem of longstanding interest. The inability of adversarial training mechanisms to provide a formal robustness guarantee motivates the need for verified local robustness mechanisms. This work aims to compute sparse networks whose (benign) accuracy and verified local robustness are comparable to those of their dense counterparts. Towards this objective, we examine several model sparsification approaches and present SparseVLR, a framework for finding verified locally robust sparse networks. We empirically investigate SparseVLR's efficacy and generalizability by evaluating it on various benchmark and application-specific datasets across several model architectures. Finally, we provide an in-depth study and reasoning to unveil the causes of SparseVLR's superiority.
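As a rough illustration of how sparsity search and verified robust training can be combined, the hedged sketch below alternates interval-bound-propagation (IBP) certified training with magnitude pruning. The network, the IBP surrogate loss, and the pruning schedule are all illustrative assumptions, not SparseVLR's actual algorithm.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sketch (not the paper's exact method): alternate
# certified-robust training with magnitude pruning via explicit masks.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
masks = {m: torch.ones_like(m.weight) for m in model if isinstance(m, nn.Linear)}
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def ibp_worst_case_logits(x, y, eps):
    """Propagate [x-eps, x+eps] through the masked network and return
    the worst-case logits inside the perturbation ball."""
    lb, ub = x - eps, x + eps
    for layer in model:
        if isinstance(layer, nn.Linear):
            w = layer.weight * masks[layer]
            mid = (lb + ub) / 2 @ w.t() + layer.bias
            rad = (ub - lb) / 2 @ w.abs().t()
            lb, ub = mid - rad, mid + rad
        else:  # ReLU
            lb, ub = lb.clamp(min=0), ub.clamp(min=0)
    worst = ub.clone()
    idx = torch.arange(len(y))
    worst[idx, y] = lb[idx, y]  # lower bound for the true class
    return worst

for round_ in range(3):                       # sparsify in stages
    for _ in range(100):                      # stand-in for one epoch
        x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))
        loss = F.cross_entropy(ibp_worst_case_logits(x, y, eps=0.1), y)
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        for m, mask in masks.items():         # prune half the survivors
            w = (m.weight * mask).abs().flatten()
            surviving = w[mask.flatten().bool()]
            thresh = surviving.sort().values[int(0.5 * mask.sum())]
            masks[m] = (m.weight.abs() * mask > thresh).float()
            m.weight.mul_(masks[m])
```

Because the mask multiplies the weights inside the loss, pruned connections receive zero gradient and the effective network stays sparse throughout the verified training phase.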

Related content

Deep neural networks provide state-of-the-art accuracy for vision tasks, but they require significant resources for training. Thus, they are trained on cloud servers far from the edge devices that acquire the data, which increases communication cost and runtime and raises privacy concerns. In this study, a novel hierarchical training method for deep neural networks is proposed that uses early exits in an architecture divided between edge and cloud workers to reduce communication cost, training runtime, and privacy concerns. The method introduces a brand-new use case for early exits: separating the backward pass of neural networks between the edge and the cloud during the training phase. We address a shortcoming of most available methods, which, due to the sequential nature of the training phase, either cannot train the levels of the hierarchy simultaneously or do so at the cost of compromising privacy. In contrast, our method can use both edge and cloud workers simultaneously, does not share the raw input data with the cloud, and does not require communication during the backward pass. Several simulations and on-device experiments for different neural network architectures demonstrate the effectiveness of this method. The proposed method reduces the training runtime by 29% for VGG-16 and 61% for ResNet-18 in a CIFAR-10 classification experiment when communication with the cloud takes place over a low-bit-rate channel, and this runtime gain is achieved with a negligible accuracy drop. The method is advantageous for online learning of high-accuracy deep neural networks on low-resource devices such as mobile phones or robots as part of an edge-cloud system, making them more flexible in facing new tasks and classes of data.
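A minimal sketch of the split-training idea follows. The module sizes and the single edge/cloud worker pair are assumptions; the point is that the early exit gives the edge its own loss, so only detached activations travel uplink and no gradients cross the link during the backward pass.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative split (not the paper's exact configuration).
edge_body = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(8), nn.Flatten())
early_exit = nn.Linear(16 * 8 * 8, 10)        # edge-side classifier head
cloud_head = nn.Sequential(nn.Linear(16 * 8 * 8, 256), nn.ReLU(),
                           nn.Linear(256, 10))

edge_opt = torch.optim.SGD([*edge_body.parameters(),
                            *early_exit.parameters()], lr=0.01)
cloud_opt = torch.optim.SGD(cloud_head.parameters(), lr=0.01)

x, y = torch.randn(32, 3, 32, 32), torch.randint(0, 10, (32,))

# Edge: forward, local early-exit loss, local backward pass.
features = edge_body(x)
edge_loss = F.cross_entropy(early_exit(features), y)
edge_opt.zero_grad(); edge_loss.backward(); edge_opt.step()

# Uplink: only detached features and labels are sent -- never the raw
# inputs, and never any gradients.
sent_features, sent_labels = features.detach(), y

# Cloud: trains its own layers independently and in parallel.
cloud_loss = F.cross_entropy(cloud_head(sent_features), sent_labels)
cloud_opt.zero_grad(); cloud_loss.backward(); cloud_opt.step()
```

Since the cloud's backward pass stops at the detached features, the two workers can run their updates concurrently once the uplink transfer completes.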

Personalized Federated Learning (PFL) is a new Federated Learning (FL) paradigm that particularly tackles the heterogeneity issues brought by the diverse mobile user equipments (UEs) in mobile edge computing (MEC) networks. However, due to the ever-increasing number of UEs and the complicated administrative work this brings, it is desirable to switch the PFL algorithm from its conventional two-layer framework to a multiple-layer one. In this paper, we propose hierarchical PFL (HPFL), an algorithm for deploying PFL over massive MEC networks. The UEs in HPFL are divided into multiple clusters; the UEs in each cluster forward their local updates to the edge server (ES) synchronously for edge model aggregation, while the ESs forward their edge models to the cloud server semi-asynchronously for global model aggregation. This training manner leads to a tradeoff between the training loss in each round and the round latency. HPFL combines the objectives of training loss minimization and round latency minimization while jointly determining the optimal bandwidth allocation as well as the ES scheduling policy in the hierarchical learning framework. Extensive experiments verify that HPFL not only guarantees convergence in hierarchical aggregation frameworks but also has advantages in round training loss minimization and round latency minimization.
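The hedged sketch below illustrates the two aggregation layers: synchronous FedAvg-style averaging within each cluster, followed by a staleness-discounted semi-asynchronous average at the cloud. The uniform intra-cluster weights and the 1/(1+staleness) discount are illustrative assumptions, not HPFL's exact policy.

```python
import torch.nn as nn

def average(states, weights):
    """Weighted average of a list of model state_dicts."""
    total = sum(weights)
    return {k: sum(w * s[k] for s, w in zip(states, weights)) / total
            for k in states[0]}

global_model = nn.Linear(10, 2)
clusters = {  # each edge server (ES) owns one cluster of UEs
    "es0": [nn.Linear(10, 2) for _ in range(3)],
    "es1": [nn.Linear(10, 2) for _ in range(4)],
}

# 1) Synchronous edge aggregation: every UE in a cluster reports its
#    local update before the edge model is formed.
edge_models = {es: average([ue.state_dict() for ue in ues], [1.0] * len(ues))
               for es, ues in clusters.items()}

# 2) Semi-asynchronous cloud aggregation: edge models arrive with
#    different staleness, so older edge models are discounted.
staleness = {"es0": 0, "es1": 2}  # rounds since each edge model was fresh
weights = [1.0 / (1 + staleness[es]) for es in edge_models]
global_model.load_state_dict(average(list(edge_models.values()), weights))
```

The tradeoff mentioned above is visible here: waiting for stale edge servers lowers per-round loss but inflates round latency, which is what HPFL's joint optimization balances.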

We prove a weak rate of convergence of a fully discrete scheme for the stochastic Cahn--Hilliard equation with additive noise, where the spectral Galerkin method is used in space and the backward Euler method in time. Compared with the Allen--Cahn type stochastic partial differential equation, the error analysis here is much more sophisticated due to the presence of the unbounded operator in front of the nonlinear term. To address this issue, we exploit a novel and direct approach that relies not on a Kolmogorov equation but on the integration-by-parts formula from Malliavin calculus. To the best of our knowledge, this is the first time that weak convergence rates have been established in the stochastic Cahn--Hilliard equation setting.
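For concreteness, a standard formulation consistent with this setting (the precise assumptions on the operator, nonlinearity, and noise are those of the paper) reads:

```latex
% Stochastic Cahn--Hilliard equation with additive noise, written with
% A = -\Delta (homogeneous Neumann or periodic boundary conditions) and a
% cubic-type nonlinearity F, e.g. F(x) = x^3 - x; note the unbounded
% operator A acting on the nonlinear term:
\mathrm{d}X(t) + A\bigl(A X(t) + F(X(t))\bigr)\,\mathrm{d}t = \mathrm{d}W(t),
\qquad X(0) = X_0 .
% Fully discrete scheme: spectral Galerkin projection P_N (A_N = P_N A)
% in space, backward Euler with step size \tau in time:
X^N_{m+1} - X^N_m + \tau A_N\bigl(A_N X^N_{m+1} + P_N F(X^N_{m+1})\bigr)
  = P_N\bigl(W(t_{m+1}) - W(t_m)\bigr).
```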

State-of-the-art NPUs are typically architected as self-contained sub-systems with multiple heterogeneous hardware computing modules and a dataflow-driven programming model. The industry lacks well-established methodologies and tools for evaluating and comparing the performance of NPUs with different architectures. We present an event-based performance modeling framework, VPU-EM, targeting scalable performance evaluation of modern NPUs across diversified AI workloads. The framework adopts a high-level event-based system-simulation methodology to abstract away design details for speed, while retaining hardware pipelining, concurrency, and interaction with software task scheduling. It is natively developed in Python and built to interface directly with AI frameworks such as TensorFlow, PyTorch, ONNX, and OpenVINO, linking various in-house NPU graph compilers to achieve optimized full-model performance. Furthermore, VPU-EM can also model the power characteristics of an NPU in Power-EM mode to enable joint performance/power analysis. Using VPU-EM, we conduct performance/power analysis of models from representative neural network architectures. We demonstrate that even though this framework is developed for Intel VPU, an Intel in-house NPU IP technology, the methodology can be generalized to the analysis of modern NPUs.
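A toy example of the event-driven simulation style such a framework builds on is sketched below; the module names, latencies, and API are invented for illustration and are not VPU-EM's interface.

```python
import heapq
import itertools

class EventSim:
    """Tiny event-driven simulator: a time-ordered queue of callbacks."""
    def __init__(self):
        self.now, self.queue = 0.0, []
        self._ids = itertools.count()  # tie-breaker for equal timestamps

    def schedule(self, delay, action):
        heapq.heappush(self.queue, (self.now + delay, next(self._ids), action))

    def run(self):
        while self.queue:
            self.now, _, action = heapq.heappop(self.queue)
            action()

sim = EventSim()

def dma_done():
    print(f"[{sim.now:6.1f} us] DMA transfer finished, starting MAC array")
    sim.schedule(40.0, mac_done)  # assumed compute latency of one tile

def mac_done():
    print(f"[{sim.now:6.1f} us] MAC array finished, writing results back")

# Hardware pipelining falls out naturally: overlapping events coexist in
# the queue and are processed strictly in timestamp order.
sim.schedule(10.0, dma_done)   # first tile's DMA
sim.schedule(25.0, dma_done)   # second tile's DMA overlaps the first compute
sim.run()
```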

High-performance deep neural network (DNN)-based systems are in high demand in edge environments. Because of their high computational complexity, it is challenging to deploy DNNs on edge devices with strict limits on computational resources. In this paper, we derive a compact yet highly accurate DNN model, termed dsODENet, by combining two recently proposed parameter-reduction techniques: Neural ODEs (Ordinary Differential Equations) and DSC (Depthwise Separable Convolution). Neural ODEs exploit a similarity between ResNet and ODEs, sharing most of the weight parameters among multiple layers, which greatly reduces memory consumption. We apply dsODENet to domain adaptation as a practical use case with image classification datasets. We also propose a resource-efficient FPGA-based design for dsODENet, in which all the parameters and feature maps except those of the pre- and post-processing layers can be mapped onto on-chip memories. The design is implemented on a Xilinx ZCU104 board and evaluated in terms of domain adaptation accuracy, inference speed, FPGA resource utilization, and speedup rate over a software counterpart. The results demonstrate that dsODENet achieves comparable or slightly better domain adaptation accuracy than our baseline Neural ODE implementation, while the total parameter size without the pre- and post-processing layers is reduced by 54.2% to 79.8%. Our FPGA implementation accelerates inference by a factor of 23.8.
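The following sketch shows the two combined ideas: a depthwise separable convolution serving as the derivative function of a Neural ODE block, whose single parameter set is reused by every Euler integration step. The channel count and step count are assumptions, not dsODENet's actual configuration.

```python
import torch
import torch.nn as nn

class DSC(nn.Module):
    """Depthwise separable convolution: a depthwise 3x3 followed by a
    pointwise 1x1, replacing one standard convolution with far fewer
    parameters."""
    def __init__(self, ch):
        super().__init__()
        self.depthwise = nn.Conv2d(ch, ch, 3, padding=1, groups=ch)
        self.pointwise = nn.Conv2d(ch, ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class ODEBlock(nn.Module):
    """Forward-Euler Neural ODE block: x_{k+1} = x_k + h * f(x_k).
    One parameter set is reused at every step, which is where the
    memory savings come from."""
    def __init__(self, ch, steps=4):
        super().__init__()
        self.f = nn.Sequential(DSC(ch), nn.ReLU())
        self.steps = steps
        self.h = 1.0 / steps

    def forward(self, x):
        for _ in range(self.steps):
            x = x + self.h * self.f(x)
        return x

block = ODEBlock(ch=16)
out = block(torch.randn(1, 16, 32, 32))
print(out.shape, sum(p.numel() for p in block.parameters()))
```

The fixed, small parameter set is also what makes the FPGA mapping attractive: the same on-chip weights are reused across all integration steps.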

In this paper, we tackle the new task of video-based Activated Muscle Group Estimation (AMGE), which aims to identify active muscle regions during physical activity. To this end, we provide the MuscleMap136 dataset featuring >15K video clips with 136 different activities and 20 labeled muscle groups. This dataset opens up possibilities for multiple video-based applications in sports and rehabilitation medicine. We further complement the main MuscleMap136 dataset, which specifically targets physical exercise, with Muscle-UCF90 and Muscle-HMDB41, new variants of the well-known activity recognition benchmarks extended with AMGE annotations. To make an AMGE model applicable in real-life situations, it is crucial to ensure that it generalizes well to types of physical activity not present during training that involve new combinations of activated muscles. To this end, our benchmark also covers an evaluation setting in which the model is exposed to activity types excluded from the training set. Our experiments reveal that the generalizability of existing architectures adapted for the AMGE task remains a challenge. We therefore propose a new approach, TransM3E, which employs a transformer-based model with cross-modal multi-label knowledge distillation and surpasses all popular video classification models on both previously seen and new types of physical activity. The datasets and code will be publicly available at //github.com/KPeng9510/MuscleMap.
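As a hedged illustration of a cross-modal multi-label distillation objective in this spirit (not TransM3E's exact loss), one can combine a per-label supervised term with a per-label soft-matching term against a teacher branch from another modality; the temperature, weighting, and choice of binary cross-entropy below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def multilabel_kd_loss(student_logits, teacher_logits, targets,
                       temperature=2.0, alpha=0.5):
    """Supervised multi-label term plus a soft per-label distillation term."""
    # Supervised term: one independent sigmoid per muscle group.
    hard = F.binary_cross_entropy_with_logits(student_logits, targets)
    # Distillation term: match the teacher branch's per-label
    # probabilities, softened by the temperature.
    teacher_soft = torch.sigmoid(teacher_logits / temperature)
    soft = F.binary_cross_entropy_with_logits(student_logits / temperature,
                                              teacher_soft)
    return alpha * hard + (1 - alpha) * soft

student = torch.randn(8, 20)                    # logits for 20 muscle groups
teacher = torch.randn(8, 20)                    # e.g. another modality's branch
labels = torch.randint(0, 2, (8, 20)).float()   # multi-label ground truth
print(multilabel_kd_loss(student, teacher, labels))
```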

Using typical solution strategies to compute the solution curve of challenging problems often leads to the breakdown of the algorithm. To improve the solution process, numerical continuation methods have proved to be a very efficient tool. However, these methods can still produce undesired results. In particular, near severe limit points and cusps, the solution process frequently encounters one of the following situations: divergence of the algorithm; a change in direction that makes the algorithm backtrack over a part of the solution curve that has already been obtained; or the omission of important regions of the solution curve by convergence to a point much farther along the curve than the one anticipated. Detecting these situations is not an easy task when solving practical problems, since the shape of the solution curve is not known in advance. This paper therefore presents a modified Moore-Penrose continuation method that includes two key ingredients for solving challenging problems: detection of problematic regions during the solution process and additional steps to deal with them. The proposed approach can either be used as a basic continuation method or be activated only when difficulties occur. Numerical examples are presented to show the efficiency of the new approach.
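For readers unfamiliar with the baseline method, a minimal Moore-Penrose predictor-corrector sketch is given below, tracing a curve F(u) = 0 with F : R^{n+1} -> R^n (here a circle, so n = 1). The detection and recovery steps the paper adds near limit points and cusps are deliberately not included; this only shows the basic structure they build on.

```python
import numpy as np

def F(u):                      # solution curve: the unit circle
    x, y = u
    return np.array([x**2 + y**2 - 1.0])

def J(u):                      # 1 x 2 Jacobian of F
    x, y = u
    return np.array([[2 * x, 2 * y]])

def tangent(u):                # unit vector spanning the null space of J
    _, _, vt = np.linalg.svd(J(u))
    return vt[-1]

u, h = np.array([1.0, 0.0]), 0.1
t = tangent(u)
for step in range(20):
    u_new = u + h * t                        # predictor along the tangent
    for _ in range(10):                      # Moore-Penrose corrector:
        u_new = u_new - np.linalg.pinv(J(u_new)) @ F(u_new)
        if np.linalg.norm(F(u_new)) < 1e-12:
            break
    t_new = tangent(u_new)
    if t_new @ t < 0:                        # keep a consistent direction,
        t_new = -t_new                       # avoiding accidental backtracking
    u, t = u_new, t_new
    print(step, u)
```

Near a severe limit point or cusp, the corrector in this basic form may diverge or the sign heuristic may pick the wrong branch, which is precisely the failure mode the paper's detection and recovery steps target.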

The conventional wisdom in learning deep classification models is to focus on badly classified examples and to ignore well-classified examples that are far from the decision boundary. For instance, when training with the cross-entropy loss, examples with higher likelihoods (i.e., well-classified examples) contribute smaller gradients in back-propagation. However, we theoretically show that this common practice hinders representation learning, energy optimization, and margin growth. To counteract this deficiency, we propose to reward well-classified examples with additive bonuses to revive their contribution to the learning process. This counterexample to the conventional wisdom theoretically addresses all three issues. We support this claim empirically, both by directly verifying the theoretical results and through the significant performance improvements our counterexample brings on diverse tasks, including image classification, graph classification, and machine translation. Furthermore, because our idea resolves these three issues, this paper shows that we can handle complex scenarios such as imbalanced classification, OOD detection, and applications under adversarial attacks. Code is available at: //github.com/lancopku/well-classified-examples-are-underestimated.
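A hedged sketch of the idea follows: augment cross-entropy with an additive bonus whose gradient grows as the predicted probability p approaches 1, so well-classified examples keep contributing to learning. The specific bonus log(1 - p) below is one plausible instantiation, not necessarily the paper's exact loss; consult the repository for the actual form.

```python
import torch
import torch.nn.functional as F

def encouraging_loss(logits, targets, eps=1e-6):
    """Cross-entropy plus an illustrative bonus that rewards (and keeps
    gradient flowing to) examples already classified with high confidence."""
    log_p = F.log_softmax(logits, dim=-1)
    p = log_p.exp().gather(1, targets[:, None]).squeeze(1)
    ce = F.nll_loss(log_p, targets)                 # usual cross-entropy
    bonus = torch.log((1 - p).clamp(min=eps))       # negative term = reward
    return ce + bonus.mean()

logits = torch.randn(16, 10, requires_grad=True)
targets = torch.randint(0, 10, (16,))
loss = encouraging_loss(logits, targets)
loss.backward()
# Unlike plain cross-entropy, the gradient does not vanish for examples
# whose p is close to 1: the bonus term keeps pushing them further.
print(loss.item(), logits.grad.abs().mean().item())
```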

Graph Neural Networks (GNNs) have been studied through the lenses of expressive power and generalization. However, their optimization properties are less well understood. We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs. First, we analyze linearized GNNs and prove that, despite the non-convexity of training, convergence to a global minimum at a linear rate is guaranteed under mild assumptions that we validate on real-world graphs. Second, we study what may affect GNNs' training speed. Our results show that the training of GNNs is implicitly accelerated by skip connections, greater depth, and/or a good label distribution. Empirical results confirm that our theoretical findings for linearized GNNs align with the training behavior of nonlinear GNNs. Our results provide the first theoretical support for the success of GNNs with skip connections in terms of optimization, and suggest that deep GNNs with skip connections would be promising in practice.
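A hedged sketch of the linearized setting (notation assumed here, not the paper's exact statement):

```latex
% Linearized (graph-convolution) setting: normalized adjacency \bar{A},
% node features X, stacked weight matrices W_1,\dots,W_L, squared loss:
f(X; W) = \bar{A}^{\,L} X \, W_1 W_2 \cdots W_L ,
\qquad
\ell(W) = \tfrac{1}{2}\,\lVert f(X; W) - Y \rVert_F^2 .
% "Convergence at a linear rate" means gradient descent/flow satisfies
\ell(W(t)) - \ell^{\ast} \le e^{-ct}\,\bigl(\ell(W(0)) - \ell^{\ast}\bigr)
% for some c > 0 under suitable conditions on \bar{A}X and the
% initialization, despite the product W_1 \cdots W_L being non-convex.
```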
