
A large number of current machine learning methods rely upon deep neural networks. Yet, viewing neural networks as nonlinear dynamical systems, it quickly becomes apparent that rigorously establishing, in a mathematical sense, certain patterns generated by the nodes of the network is extremely difficult. Indeed, it is well understood in the nonlinear dynamics of complex systems that, even in low-dimensional models, analytical techniques rooted in pencil-and-paper approaches quickly reach their limits. In this work, we propose a completely different perspective via the paradigm of rigorous numerical methods of nonlinear dynamics. The idea is to use computer-assisted proofs to mathematically validate the existence of nonlinear patterns in neural networks. As a case study, we consider a class of recurrent neural networks, for which we prove, with computer assistance, the existence of several hundred Hopf bifurcation points and their non-degeneracy, and hence also the existence of several hundred periodic orbits. Our paradigm has the capability to rigorously verify complex nonlinear behaviour of neural networks, which provides a first step towards explaining the full abilities, as well as potential sensitivities, of machine learning methods via computer-assisted proofs.
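To make the setting concrete, here is a minimal, non-rigorous sketch of how a Hopf bifurcation candidate can be located in a continuous-time RNN of the form $x'(t) = -x + s\,W\tanh(x)$ (a hypothetical model choice; the paper's class of networks and its interval-arithmetic proof machinery are not reproduced here). The origin is always an equilibrium with Jacobian $J(s) = -I + sW$, so a candidate Hopf point occurs at the gain $s$ where a complex-conjugate eigenvalue pair of $J$ crosses the imaginary axis; proving existence and non-degeneracy, as the paper does, requires validated numerics on top of this.

```python
# Non-rigorous location of a Hopf bifurcation candidate for x' = -x + s*W*tanh(x).
# Eigenvalues of J(s) = -I + s*W are s*lam - 1, so the complex pair lam crosses
# the imaginary axis at gain s = 1 / Re(lam).
import numpy as np

rng = np.random.default_rng(0)
n = 10
W = rng.normal(scale=1.0 / np.sqrt(n), size=(n, n))  # random coupling matrix

lam = np.linalg.eigvals(W)
pairs = lam[lam.imag > 1e-12]          # one representative per conjugate pair
unstable = pairs[pairs.real > 0]       # pairs that eventually cross Re = 0

if unstable.size:
    s_star = (1.0 / unstable.real).min()   # first pair to cross as s grows
    print(f"candidate Hopf gain s* = {s_star:.4f}")
else:
    print("no complex pair destabilizes for s > 0 with this draw of W")
```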

Related content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes high-quality submissions that contribute to the full range of neural networks research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analyses, to engineering and technological applications of systems that make significant use of neural network concepts and techniques. This uniquely broad range facilitates the cross-fertilization of ideas between biological and technological studies, and helps foster the development of the interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the Neural Networks editorial board represents experts in fields including psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles are published in one of five sections: Cognitive Science, Neuroscience, Learning Systems, Mathematics and Computational Analysis, and Engineering and Applications. Official website:

Neural networks have achieved discernible success in a wide range of applications. Their widespread adoption also raises concerns about their dependability and reliability. Like traditional decision-making programs, neural networks can have defects that need to be repaired. These defects may cause unsafe behaviors, raise security concerns, or lead to unjust societal impacts. In this work, we address the problem of repairing a neural network for desirable properties such as fairness and the absence of backdoors. The goal is to construct a neural network that satisfies the property by (minimally) adjusting the given neural network's parameters (i.e., weights). Specifically, we propose CARE (\textbf{CA}usality-based \textbf{RE}pair), a causality-based neural network repair technique that 1) performs causality-based fault localization to identify the `guilty' neurons and 2) optimizes the parameters of the identified neurons to reduce the misbehavior. We have empirically evaluated CARE on various tasks such as backdoor removal and neural network repair for fairness and safety properties. Our experimental results show that CARE is able to repair all neural networks efficiently and effectively. For fairness repair tasks, CARE successfully improves fairness by $61.91\%$ on average. For backdoor removal tasks, CARE reduces the attack success rate from over $98\%$ to less than $1\%$. For safety property repair tasks, CARE reduces the property violation rate to less than $1\%$. The results also show that, thanks to the causality-based fault localization, CARE's repair focuses on the misbehavior and preserves the accuracy of the neural networks.
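The following is a hypothetical sketch of causality-style fault localization in the spirit of step 1) above, not the authors' implementation: estimate each hidden neuron's causal effect on a misbehaviour rate by intervening on its activation, then select the top-k "guilty" neurons as repair targets. The toy MLP, the intervention baseline, and the misbehaviour property are all illustrative assumptions.

```python
# Hypothetical causality-style fault localization: rank hidden neurons by the
# change in misbehaviour rate when each activation is clamped (do-intervention).
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)   # toy 4-8-2 MLP (hypothetical)
W2, b2 = rng.normal(size=(2, 8)), np.zeros(2)

def forward(x, intervene_on=None, value=0.0):
    h = np.tanh(W1 @ x + b1)
    if intervene_on is not None:                # hard intervention do(h_i = value)
        h[intervene_on] = value
    return W2 @ h + b2

def misbehaviour_rate(xs, intervene_on=None):
    # Placeholder property: predicting class 1 counts as misbehaviour.
    return np.mean([np.argmax(forward(x, intervene_on)) == 1 for x in xs])

xs = [rng.normal(size=4) for _ in range(200)]
base = misbehaviour_rate(xs)
# Average causal effect of clamping each neuron to a baseline value (0 here).
effects = [abs(base - misbehaviour_rate(xs, i)) for i in range(8)]
guilty = np.argsort(effects)[-2:]               # top-2 candidates for repair
print("baseline misbehaviour:", base, "guilty neurons:", guilty)
```

In a full repair loop, step 2) would then optimize only the weights feeding the `guilty` neurons against the property, leaving the rest of the network untouched.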

Distributed machine learning (ML) can bring more computational resources to bear than single-machine learning, thus enabling reductions in training time. Distributed learning partitions models and data over many machines, allowing model and dataset sizes beyond the available compute power and memory of a single machine. In practice, though, distributed ML is challenging when distribution is mandatory rather than chosen by the practitioner. In such scenarios, data may unavoidably be separated among workers due to limited memory capacity per worker, or because of data privacy concerns. In these settings, existing distributed methods either fail outright due to dominant transfer costs across workers, or do not apply at all. We propose a new approach to distributed fully connected neural network learning, called independent subnet training (IST), to handle these cases. In IST, the original network is decomposed into a set of narrow subnetworks with the same depth. These subnetworks are then trained locally before parameters are exchanged to produce new subnets and the training cycle repeats. Such a naturally "model parallel" approach limits memory usage by storing only a portion of the network parameters on each device. Additionally, there is no requirement to share data between workers (i.e., subnet training is local and independent), and communication volume and frequency are reduced by decomposing the original network into independent subnets. These properties allow IST to cope with issues arising from distributed data, slow interconnects, or limited device memory, making it a suitable approach for cases of mandatory distribution. We show experimentally that IST results in training times much lower than those of common distributed learning approaches.
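A minimal single-machine sketch of the IST cycle for one hidden layer follows, under simplifying assumptions (squared loss, one local SGD step per round, shared data for brevity; in IST proper each worker would use its own local shard). This is an illustration of the decompose/train/exchange pattern, not the authors' implementation.

```python
# IST sketch: partition hidden units across "workers", train each narrow
# subnet independently, reassemble, and resample the partition each round.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h, workers, lr = 4, 12, 3, 1e-2
W1 = rng.normal(size=(d_h, d_in))      # input -> hidden
W2 = rng.normal(size=(1, d_h))         # hidden -> output

def local_step(W1s, W2s, X, y):
    """One SGD step on the narrow subnet  y_hat = W2s @ relu(W1s @ x)."""
    H = np.maximum(W1s @ X.T, 0.0)                 # (h_s, n) activations
    err = (W2s @ H).ravel() - y                    # residuals, (n,)
    gW2 = err[None, :] @ H.T / len(y)
    gW1 = ((W2s.T @ err[None, :]) * (H > 0)) @ X / len(y)
    return W1s - lr * gW1, W2s - lr * gW2

X, y = rng.normal(size=(64, d_in)), rng.normal(size=64)
for _ in range(10):                                # training rounds
    # Resample a partition of hidden units; each worker gets a narrow subnet.
    parts = np.array_split(rng.permutation(d_h), workers)
    for idx in parts:                              # workers run sequentially here
        W1[idx], W2[:, idx] = local_step(W1[idx], W2[:, idx], X, y)
```

Because each worker only ever holds the rows of `W1` and columns of `W2` for its own hidden units, per-device memory and communication both shrink roughly in proportion to the number of workers.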

Momentum methods, including heavy ball~(HB) and Nesterov's accelerated gradient~(NAG), are widely used in training neural networks for their fast convergence. However, there is a lack of theoretical guarantees for their convergence and acceleration, since the optimization landscape of a neural network is non-convex. Recently, some works have made progress towards understanding the convergence of momentum methods in the over-parameterized regime, where the number of parameters exceeds the number of training instances. Nonetheless, current results mainly focus on two-layer neural networks, which is far from explaining the remarkable success of momentum methods in training deep neural networks. Motivated by this, we investigate the convergence of NAG with constant learning rate and momentum parameter in training two architectures of deep linear networks: deep fully-connected linear neural networks and deep linear ResNets. Based on the over-parameterization regime, we first analyze the residual dynamics induced by the training trajectory of NAG for a deep fully-connected linear neural network under random Gaussian initialization. Our results show that NAG converges to the global minimum at a $(1 - \mathcal{O}(1/\sqrt{\kappa}))^t$ rate, where $t$ is the iteration number and $\kappa > 1$ is a constant depending on the condition number of the feature matrix. Compared to the $(1 - \mathcal{O}(1/{\kappa}))^t$ rate of gradient descent (GD), NAG achieves an acceleration. To the best of our knowledge, this is the first theoretical guarantee of NAG converging to the global minimum when training deep neural networks. Furthermore, we extend our analysis to deep linear ResNets and derive a similar convergence result.
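For concreteness, a standard constant-parameter form of the NAG iteration analyzed in such settings reads (notation schematic; the paper's exact parameterization may differ):

$$
\begin{aligned}
\theta_{t+1} &= \omega_t - \eta\,\nabla L(\omega_t),\\
\omega_{t+1} &= \theta_{t+1} + \beta\,(\theta_{t+1} - \theta_t),
\end{aligned}
$$

with constant learning rate $\eta$ and momentum parameter $\beta$. The comparison above is then between per-iteration contraction factors of the residual: $1 - \mathcal{O}(1/\sqrt{\kappa})$ for NAG versus $1 - \mathcal{O}(1/\kappa)$ for GD, which is the classical signature of momentum acceleration carried over to the over-parameterized deep linear setting.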

The dynamic response of legged robot locomotion is non-Lipschitz and can be stochastic due to environmental uncertainties. For testing, validating, and characterizing the safety performance of legged robots, existing solutions based on observed and inferred risk can be incomplete and sampling-inefficient, while some formal verification methods suffer from limited model precision and other surrogate assumptions. In this paper, we propose a scenario-sampling-based testing framework that characterizes the overall safety performance of a legged robot by specifying (i) where (in terms of a set of states) the robot is potentially safe, and (ii) how safe the robot is within the specified set. The framework can also help certify the commercial deployment of legged robots in real-world environments alongside humans, and can compare the safety performance of legged robots with different mechanical structures and dynamic properties. We further deploy the proposed framework to evaluate a group of state-of-the-art legged robot locomotion controllers from the literature, spanning model-based, deep-neural-network-based, and reinforcement-learning-based methods. Across a series of intended work domains of the studied legged robots (e.g., tracking speed on sloped surfaces, handling abrupt changes in commanded velocity, and resisting adversarial push-over disturbances), we show that the method adequately captures the overall safety characterization and subtle performance insights. Many of the observed safety outcomes have, to the best of our knowledge, never been reported in the existing legged robot literature.
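A hedged sketch of the scenario-sampling idea, with a toy stand-in for the robot simulator: sample initial states from a candidate safe set $S$, roll out the dynamics, and report the empirical safety probability with a distribution-free confidence bound. The simulator, the set $S$, and the failure criterion are all hypothetical placeholders.

```python
# Scenario-sampling safety characterization: Monte Carlo estimate of
# P(trajectory is safe | x0 in S), with a 95% Hoeffding confidence interval.
import numpy as np

rng = np.random.default_rng(0)

def rollout_is_safe(x0):
    """Placeholder for a legged-robot rollout; returns True if no fall."""
    return np.linalg.norm(x0) + 0.1 * rng.normal() < 1.0   # toy dynamics

def sample_from_S():
    return rng.uniform(-0.8, 0.8, size=3)                  # candidate set S

n = 2000
safe = sum(rollout_is_safe(sample_from_S()) for _ in range(n))
p_hat = safe / n
eps = np.sqrt(np.log(2 / 0.05) / (2 * n))                  # Hoeffding radius
print(f"P(safe | x0 in S) in [{p_hat - eps:.3f}, {p_hat + eps:.3f}] w.p. 0.95")
```

Answering "where" then amounts to shrinking or growing $S$ until the bound meets a target, and answering "how safe" to reporting the certified probability within the final set.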

Present-day atomistic simulations generate long trajectories of ever more complex systems. Analyzing these data, discovering metastable states, and uncovering their nature is becoming increasingly challenging. In this paper, we first use the variational approach to conformation dynamics to discover the slowest dynamical modes of the simulations. This allows the different metastable states of the system to be located and organized hierarchically. The physical descriptors that characterize metastable states are discovered by means of a machine learning method. We show in the cases of two proteins, Chignolin and Bovine Pancreatic Trypsin Inhibitor, how such analysis can be effortlessly performed in a matter of seconds. Another strength of our approach is that it can be applied to the analysis of both unbiased and biased simulations.
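As an illustration of the first step, here is a minimal sketch of the variational approach on toy data (not the proteins studied here): estimate instantaneous and time-lagged covariance matrices from a featurized trajectory and solve the generalized eigenvalue problem $C(\tau)\,v = \lambda\, C(0)\,v$, whose leading eigenvectors are the slowest dynamical modes.

```python
# Variational approach to conformation dynamics (VAC) on a toy trajectory:
# one slow AR(1) coordinate hidden among fast noise features (stand-ins for
# molecular descriptors such as contact distances or dihedral angles).
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
T, tau = 20000, 50
slow = np.zeros(T)
for t in range(1, T):
    slow[t] = 0.999 * slow[t - 1] + 0.05 * rng.normal()
X = np.column_stack([slow, rng.normal(size=(T, 4))])
X -= X.mean(axis=0)

C0 = X[:-tau].T @ X[:-tau] / (T - tau)        # instantaneous covariance C(0)
Ct = X[:-tau].T @ X[tau:] / (T - tau)         # time-lagged covariance C(tau)
Ct = 0.5 * (Ct + Ct.T)                        # symmetrize the estimator

evals, evecs = eigh(Ct, C0)                   # generalized eigenproblem
order = np.argsort(evals)[::-1]               # slowest modes first
timescales = -tau / np.log(np.clip(evals[order], 1e-12, 0.999999))
print("implied timescales:", timescales[:3])
```

The slowest mode's eigenvector then serves as the coordinate along which metastable states are located, and a feature-attribution method (as in the paper's machine learning step) can identify which physical descriptors load onto it.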

The conjoining of dynamical systems and deep learning has become a topic of great interest. In particular, neural differential equations (NDEs) demonstrate that neural networks and differential equations are two sides of the same coin. Traditional parameterised differential equations are a special case, and many popular neural network architectures, such as residual networks and recurrent networks, are discretisations of NDEs. NDEs are suitable for tackling generative problems, dynamical systems, and time series (particularly in physics, finance, ...) and are thus of interest to both modern machine learning and traditional mathematical modelling. NDEs offer high-capacity function approximation, strong priors on model space, the ability to handle irregular data, memory efficiency, and a wealth of available theory on both sides. This doctoral thesis provides an in-depth survey of the field. Topics include: neural ordinary differential equations (e.g. for hybrid neural/mechanistic modelling of physical systems); neural controlled differential equations (e.g. for learning functions of irregular time series); and neural stochastic differential equations (e.g. to produce generative models capable of representing complex stochastic dynamics, or sampling from complex high-dimensional distributions). Further topics include: numerical methods for NDEs (e.g. reversible differential equation solvers, backpropagation through differential equations, Brownian reconstruction); symbolic regression for dynamical systems (e.g. via regularised evolution); and deep implicit models (e.g. deep equilibrium models, differentiable optimisation). We anticipate this thesis will be of interest to anyone interested in the marriage of deep learning with dynamical systems, and hope it will provide a useful reference for the current state of the art.
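A minimal neural ODE sketch, the first of the NDE classes listed above, assuming the `torchdiffeq` package: the dynamics $\mathrm{d}y/\mathrm{d}t = f_\theta(t, y)$ are parameterized by a small network, and gradients are obtained by backpropagating through the solve.

```python
# Minimal neural ODE: dy/dt = f_theta(t, y), trained end-to-end.
import torch
import torch.nn as nn
from torchdiffeq import odeint  # pip install torchdiffeq

class VectorField(nn.Module):
    def __init__(self, dim=2, width=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, width), nn.Tanh(),
                                 nn.Linear(width, dim))

    def forward(self, t, y):        # f_theta(t, y); autonomous here
        return self.net(y)

f = VectorField()
y0 = torch.randn(16, 2)             # batch of initial conditions
t = torch.linspace(0.0, 1.0, 10)    # times at which to evaluate the solution
ys = odeint(f, y0, t)               # shape (len(t), batch, dim)
loss = ys[-1].pow(2).mean()         # toy terminal-state objective
loss.backward()                     # differentiates through the solver
```

Replacing the fixed time grid with observation times of an irregular series, or driving the dynamics with a control path or Brownian motion, yields the controlled and stochastic variants surveyed in the thesis.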

Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in the neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation, or neither of these. These findings cast doubt on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.
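Schematically (notation mine, not the paper's), the regimes in question concern the depth limit of the residual recursion

$$
x_{k+1} = x_k + \frac{1}{L^{\beta}}\, f(x_k, \theta_k), \qquad k = 0, \dots, L-1,
$$

for a network of depth $L$. If $\beta = 1$ and the trained weights converge to a smooth function $\theta(k/L)$ of the layer index, one recovers the neural ODE limit $\mathrm{d}x_s = f(x_s, \theta(s))\,\mathrm{d}s$; if instead $\beta = 1/2$ and the weights behave like roughly independent noise across layers, the limit is diffusive, i.e. a stochastic differential equation; and other combinations of scaling and weight regularity may yield neither.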

Behaviors of synthetic characters in current military simulations are limited, since they are generally generated by rule-based, reactive computational models with minimal intelligence. Such models cannot adapt to reflect the experience of the characters, resulting in brittle intelligence even for the most effective behavior models, which are devised via costly and labor-intensive processes. Observation-based behavior model adaptation, which leverages machine learning and the experience of synthetic entities in combination with appropriate prior knowledge, can address these issues and create a better training experience in military training simulations. In this paper, we introduce a framework that aims to create autonomous synthetic characters that can perform coherent sequences of believable behavior while being aware of human trainees and their needs within a training simulation. This framework brings together three mutually complementary components. The first is a Unity-based simulation environment, the Rapid Integration and Development Environment (RIDE), which supports One World Terrain (OWT) models and is capable of running and supporting machine learning experiments. The second is Shiva, a novel multi-agent reinforcement and imitation learning framework that can interface with a variety of simulation environments and can utilize a variety of learning algorithms. The final component is the Sigma Cognitive Architecture, which augments the behavior models with symbolic and probabilistic reasoning capabilities. We have successfully created proof-of-concept behavior models with this framework on realistic terrain as an essential step towards bringing machine learning into military simulations.

Since deep neural networks were developed, they have made huge contributions to everyday life, and machine learning now offers advice in almost every aspect of daily life that is often more rational than what humans can produce unaided. Despite this achievement, however, the design and training of neural networks remain challenging and unpredictable procedures. To lower the technical barrier for ordinary users, automated hyper-parameter optimization (HPO) has become a popular topic in both academia and industry. This paper provides a review of the most essential topics in HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods for defining their value ranges. The review then focuses on the major optimization algorithms and their applicability, covering their efficiency and accuracy, especially for deep learning networks. Next, it surveys major services and toolkits for HPO, comparing their support for state-of-the-art search algorithms, compatibility with major deep learning frameworks, and extensibility for user-designed modules. The paper concludes with the problems that arise when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation under limited computational resources.
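As a point of reference for the algorithms such reviews cover, here is a minimal random-search HPO sketch; the search space and the validation objective are hypothetical stand-ins for "train the model with this configuration and measure held-out error".

```python
# Random-search HPO baseline: sample configurations from a search space,
# evaluate a validation objective, and keep the best configuration.
import random

random.seed(0)
space = {
    "lr": lambda: 10 ** random.uniform(-5, -1),          # log-uniform draw
    "batch_size": lambda: random.choice([32, 64, 128, 256]),
    "dropout": lambda: random.uniform(0.0, 0.5),
}

def validation_error(cfg):
    # Stand-in for "train model with cfg, return error on held-out data".
    return (cfg["lr"] - 1e-3) ** 2 + 0.01 * cfg["dropout"]

best = min(({k: draw() for k, draw in space.items()} for _ in range(100)),
           key=validation_error)
print("best configuration:", best)
```

More sophisticated methods reviewed in such surveys (Bayesian optimization, Hyperband, evolutionary search) differ mainly in how the next configuration is chosen and how budgets are allocated across trials.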

To provide more accurate, diverse, and explainable recommendation, it is essential to go beyond modeling user-item interactions and take side information into account. Traditional methods like factorization machines (FM) cast this as a supervised learning problem, which assumes each interaction is an independent instance with side information encoded. Because they overlook the relations among instances or items (e.g., the director of a movie is also an actor in another movie), these methods are insufficient to distill the collaborative signal from the collective behaviors of users. In this work, we investigate the utility of the knowledge graph (KG), which breaks down the independent-interaction assumption by linking items with their attributes. We argue that in such a hybrid structure of KG and user-item graph, high-order relations --- which connect two items through one or multiple linked attributes --- are an essential factor for successful recommendation. We propose a new method named Knowledge Graph Attention Network (KGAT), which explicitly models high-order connectivities in the KG in an end-to-end fashion. It recursively propagates embeddings from a node's neighbors (which can be users, items, or attributes) to refine the node's embedding, and employs an attention mechanism to discriminate the importance of the neighbors. KGAT is conceptually advantageous over existing KG-based recommendation methods, which either exploit high-order relations by extracting paths or model them implicitly with regularization. Empirical results on three public benchmarks show that KGAT significantly outperforms state-of-the-art methods like Neural FM and RippleNet. Further studies verify the efficacy of embedding propagation for high-order relation modeling and the interpretability benefits brought by the attention mechanism.
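A hedged sketch of one embedding-propagation layer in the spirit of KGAT follows (simplified: KGAT's actual attention is relation-aware and its aggregation differs in detail). Each node aggregates its neighbors' embeddings weighted by attention scores normalized per destination node; stacking such layers captures the high-order connectivities described above.

```python
# One simplified attentive embedding-propagation layer over a toy graph.
import torch
import torch.nn.functional as F

n_nodes, dim = 6, 8
E = torch.randn(n_nodes, dim, requires_grad=True)      # node embeddings
W = torch.nn.Linear(dim, dim, bias=False)              # propagation weights
edges = torch.tensor([[0, 1], [0, 2], [1, 3], [2, 3], [3, 4], [4, 5]])

def propagate(E):
    src, dst = edges[:, 0], edges[:, 1]
    scores = (E[src] * E[dst]).sum(-1)                 # attention logits
    alpha = torch.zeros_like(scores)
    for v in dst.unique():                             # softmax per destination
        mask = dst == v
        alpha[mask] = F.softmax(scores[mask], dim=0)
    # Weighted sum of neighbor embeddings into each destination node.
    agg = torch.zeros_like(E).index_add_(0, dst, alpha.unsqueeze(-1) * E[src])
    return F.leaky_relu(W(E + agg))                    # refined embeddings

E1 = propagate(E)   # one hop; stack layers for higher-order relations
```

Because the attention weights `alpha` are explicit per edge, they double as an interpretability signal: high-weight paths from a user to a recommended item can be read off as explanations.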
