亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

This paper introduces scalable, sampling-based algorithms that optimize trained neural networks with ReLU activations. We first propose an iterative algorithm that takes advantage of the piecewise linear structure of ReLU neural networks and reduces the initial mixed-integer optimization problem (MIP) into multiple easy-to-solve linear optimization problems (LPs) through sampling. Subsequently, we extend this approach by searching around the neighborhood of the LP solution computed at each iteration. This scheme allows us to devise a second, enhanced algorithm that reduces the initial MIP problem into smaller, easier-to-solve MIPs. We analytically show the convergence of the methods and we provide a sample complexity guarantee. We also validate the performance of our algorithms by comparing them against state-of-the-art MIP-based methods. Finally, we show computationally how the sampling algorithms can be used effectively to warm-start MIP-based methods.

相關內容

We propose a novel automatic parameter selection strategy for variational imaging problems under Poisson noise corruption. The selection of a suitable regularization parameter, whose value is crucial in order to achieve high quality reconstructions, is known to be a particularly hard task in low photon-count regimes. In this work, we extend the so-called residual whiteness principle originally designed for additive white noise to Poisson data. The proposed strategy relies on the study of the whiteness property of a standardized Poisson noise process. After deriving the theoretical properties that motivate our proposal, we solve the target minimization problem with a linearized version of the alternating direction method of multipliers, which is particularly suitable in presence of a general linear forward operator. Our strategy is extensively tested on image restoration and computed tomography reconstruction problems, and compared to the well-known discrepancy principle for Poisson noise proposed by Zanella at al. and with a nearly exact version of it previously proposed by the authors.

Our theoretical understanding of deep learning has not kept pace with its empirical success. While network architecture is known to be critical, we do not yet understand its effect on learned representations and network behavior, or how this architecture should reflect task structure.In this work, we begin to address this gap by introducing the Gated Deep Linear Network framework that schematizes how pathways of information flow impact learning dynamics within an architecture. Crucially, because of the gating, these networks can compute nonlinear functions of their input. We derive an exact reduction and, for certain cases, exact solutions to the dynamics of learning. Our analysis demonstrates that the learning dynamics in structured networks can be conceptualized as a neural race with an implicit bias towards shared representations, which then govern the model's ability to systematically generalize, multi-task, and transfer. We validate our key insights on naturalistic datasets and with relaxed assumptions. Taken together, our work gives rise to general hypotheses relating neural architecture to learning and provides a mathematical approach towards understanding the design of more complex architectures and the role of modularity and compositionality in solving real-world problems. The code and results are available at //www.saxelab.org/gated-dln .

We establish the minimax risk for parameter estimation in sparse high-dimensional Gaussian mixture models and show that a constrained maximum likelihood estimator (MLE) achieves the minimax optimality. However, the optimization-based constrained MLE is computationally intractable due to non-convexity of the problem. Therefore, we propose a Bayesian approach to estimate high-dimensional Gaussian mixtures whose cluster centers exhibit sparsity using a continuous spike-and-slab prior, and prove that the posterior contraction rate of the proposed Bayesian method is minimax optimal. The mis-clustering rate is obtained as a by-product using tools from matrix perturbation theory. Computationally, posterior inference of the proposed Bayesian method can be implemented via an efficient Gibbs sampler with data augmentation, circumventing the challenging frequentist nonconvex optimization-based algorithms. The proposed Bayesian sparse Gaussian mixture model does not require pre-specifying the number of clusters, which is allowed to grow with the sample size and can be adaptively estimated via posterior inference. The validity and usefulness of the proposed method is demonstrated through simulation studies and the analysis of a real-world single-cell RNA sequencing dataset.

Interpreting the performance results of models that attempt to realize user behavior in platforms that employ recommenders is a big challenge that researchers and practitioners continue to face. Although current evaluation tools possess the capacity to provide solid general overview of a system's performance, they still lack consistency and effectiveness in their use as evident in most recent studies on the topic. Current traditional assessment techniques tend to fail to detect variations that could occur on smaller subsets of the data and lack the ability to explain how such variations affect the overall performance. In this article, we focus on the concept of data clustering for evaluation in recommenders and apply a neighborhood assessment method for the datasets of recommender system applications. This new method, named neighborhood-based evaluation, aids in better understanding critical performance variations in more compact subsets of the system to help spot weaknesses where such variations generally go unnoticed with conventional metrics and are typically averaged out. This new modular evaluation layer complements the existing assessment mechanisms and provides the possibility of several applications to the recommender ecosystem such as model evolution tests, fraud/attack detection and a possibility for hosting a hybrid model setup.

Block stacking storage systems are highly adaptable warehouse systems with low investment costs. With multiple, deep lanes they can achieve high storage densities, but accessing some unit loads can be time-consuming. The unit-load pre-marshalling problem sorts the unit loads in a block stacking storage system in off-peak time periods to prepare for upcoming orders. The goal is to find a minimum number of unit-load moves needed to sequence a storage bay in ascending order based on the retrieval priority group of each unit load. In this paper, we present two solution approaches for determining the minimum number of unit-load moves. We show that for storage bays with one access direction, it is possible to adapt existing, optimal tree search procedures and lower bound heuristics from the container pre-marshalling problem. For multiple access directions, we develop a novel, two-step solution approach based on a network flow model and an A* algorithm with an adapted lower bound that is applicable in all scenarios. We further analyze the performance of the presented solutions in computational experiments for randomly generated problem instances and show that multiple access directions greatly reduce both the total access time of unit loads and the required sorting effort.

The deployment of the sensor nodes (SNs) always plays a decisive role in the system performance of wireless sensor networks (WSNs). In this work, we propose an optimal deployment method for practical heterogeneous WSNs which gives a deep insight into the trade-off between the reliability and deployment cost. Specifically, this work aims to provide the optimal deployment of SNs to maximize the coverage degree and connection degree, and meanwhile minimize the overall deployment cost. In addition, this work fully considers the heterogeneity of SNs (i.e. differentiated sensing range and deployment cost) and three-dimensional (3-D) deployment scenarios. This is a multi-objective optimization problem, non-convex, multimodal and NP-hard. To solve it, we develop a novel swarm-based multi-objective optimization algorithm, known as the competitive multi-objective marine predators algorithm (CMOMPA) whose performance is verified by comprehensive comparative experiments with ten other stateof-the-art multi-objective optimization algorithms. The computational results demonstrate that CMOMPA is superior to others in terms of convergence and accuracy and shows excellent performance on multimodal multiobjective optimization problems. Sufficient simulations are also conducted to evaluate the effectiveness of the CMOMPA based optimal SNs deployment method. The results show that the optimized deployment can balance the trade-off among deployment cost, sensing reliability and network reliability. The source code is available on //github.com/iNet-WZU/CMOMPA.

Designing smart home services is a complex task when multiple services with a large number of sensors and actuators are deployed simultaneously. It may rely on knowledge-based or data-driven approaches. The former can use rule-based methods to design services statically, and the latter can use learning methods to discover inhabitants' preferences dynamically. However, neither of these approaches is entirely satisfactory because rules cannot cover all possible situations that may change, and learning methods may make decisions that are sometimes incomprehensible to the inhabitant. In this paper, PBRE (Pedagogic Based Rule Extractor) is proposed to extract rules from learning methods to realize dynamic rule generation for smart home systems. The expected advantage is that both the explainability of rule-based methods and the dynamicity of learning methods are adopted. We compare PBRE with an existing rule extraction method, and the results show better performance of PBRE. We also apply PBRE to extract rules from a smart home service represented by an NRL (Neural Network-based Reinforcement Learning). The results show that PBRE can help the NRL-simulated service to make understandable suggestions to the inhabitant.

Graph Neural Networks (GNNs) have been studied from the lens of expressive power and generalization. However, their optimization properties are less well understood. We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs. First, we analyze linearized GNNs and prove that despite the non-convexity of training, convergence to a global minimum at a linear rate is guaranteed under mild assumptions that we validate on real-world graphs. Second, we study what may affect the GNNs' training speed. Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution. Empirical results confirm that our theoretical results for linearized GNNs align with the training behavior of nonlinear GNNs. Our results provide the first theoretical support for the success of GNNs with skip connections in terms of optimization, and suggest that deep GNNs with skip connections would be promising in practice.

The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well, if not better than, the original dense networks. Sparsity can reduce the memory footprint of regular networks to fit mobile devices, as well as shorten training time for ever growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial of sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as well as to researchers whose goal is to push the frontier forward. We include the necessary background on mathematical methods in sparsification, describe phenomena such as early structure adaptation, the intricate relations between sparsity and the training process, and show techniques for achieving acceleration on real hardware. We also define a metric of pruned parameter efficiency that could serve as a baseline for comparison of different sparse networks. We close by speculating on how sparsity can improve future workloads and outline major open problems in the field.

Graph Neural Networks (GNNs) have received considerable attention on graph-structured data learning for a wide variety of tasks. The well-designed propagation mechanism which has been demonstrated effective is the most fundamental part of GNNs. Although most of GNNs basically follow a message passing manner, litter effort has been made to discover and analyze their essential relations. In this paper, we establish a surprising connection between different propagation mechanisms with a unified optimization problem, showing that despite the proliferation of various GNNs, in fact, their proposed propagation mechanisms are the optimal solution optimizing a feature fitting function over a wide class of graph kernels with a graph regularization term. Our proposed unified optimization framework, summarizing the commonalities between several of the most representative GNNs, not only provides a macroscopic view on surveying the relations between different GNNs, but also further opens up new opportunities for flexibly designing new GNNs. With the proposed framework, we discover that existing works usually utilize naive graph convolutional kernels for feature fitting function, and we further develop two novel objective functions considering adjustable graph kernels showing low-pass or high-pass filtering capabilities respectively. Moreover, we provide the convergence proofs and expressive power comparisons for the proposed models. Extensive experiments on benchmark datasets clearly show that the proposed GNNs not only outperform the state-of-the-art methods but also have good ability to alleviate over-smoothing, and further verify the feasibility for designing GNNs with our unified optimization framework.

北京阿比特科技有限公司