亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Geometric deep learning refers to the scenario in which the symmetries of a dataset are used to constrain the parameter space of a neural network and thus, improve their trainability and generalization. Recently this idea has been incorporated into the field of quantum machine learning, which has given rise to equivariant quantum neural networks (EQNNs). In this work, we investigate the role of classical-to-quantum embedding on the performance of equivariant quantum convolutional neural networks (EQCNNs) for the classification of images. We discuss the connection between the data embedding method and the resulting representation of a symmetry group and analyze how changing representation affects the expressibility of an EQCNN. We numerically compare the classification accuracy of EQCNNs with three different basis-permuted amplitude embeddings to the one obtained from a non-equivariant quantum convolutional neural network (QCNN). Our results show that all the EQCNNs achieve higher classification accuracy than the non-equivariant QCNN for small numbers of training iterations, while for large iterations this improvement crucially depends on the used embedding. It is expected that the results of this work can be useful to the community for a better understanding of the importance of data embedding choice in the context of geometric quantum machine learning.

相關內容

Networking:IFIP International Conferences on Networking。 Explanation:國際網絡會議。 Publisher:IFIP。 SIT:

The deformed energy method has shown to be a good option for dimensional synthesis of mechanisms. In this paper the introduction of some new features to such approach is proposed. First, constraints fixing dimensions of certain links are introduced in the error function of the synthesis problem. Second, requirements on distances between determinate nodes are included in the error function for the analysis of the deformed position problem. Both the overall synthesis error function and the inner analysis error function are optimized using a Sequential Quadratic Problem (SQP) approach. This also reduces the probability of branch or circuit defects. In the case of the inner function analytical derivatives are used, while in the synthesis optimization approximate derivatives have been introduced. Furthermore, constraints are analyzed under two formulations, the Euclidean distance and an alternative approach that uses the previous raised to the power of two. The latter approach is often used in kinematics, and simplifies the computation of derivatives. Some examples are provided to show the convergence order of the error function and the fulfilment of the constraints in both formulations studied under different topological situations or achieved energy levels.

Understanding the mechanisms through which neural networks extract statistics from input-label pairs is one of the most important unsolved problems in supervised learning. Prior works have identified that the gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, in a statement known as the Neural Feature Ansatz (NFA). However, the reason these quantities become correlated during training is poorly understood. In this work, we explain the emergence of this correlation. We identify that the NFA is equivalent to alignment between the left singular structure of the weight matrices and a significant component of the empirical neural tangent kernels associated with those weights. We establish that the NFA introduced in prior works is driven by a centered NFA that isolates this alignment. We show that the speed of NFA development can be predicted analytically at early training times in terms of simple statistics of the inputs and labels. Finally, we introduce a simple intervention to increase NFA correlation at any given layer, which dramatically improves the quality of features learned.

The development of cubical type theory inspired the idea of "extension types" which has been found to have applications in other type theories that are unrelated to homotopy type theory or cubical type theory. This article describes these applications, including on records, metaprogramming, controlling unfolding, and some more exotic ones.

We perform a quantitative assessment of different strategies to compute the contribution due to surface tension in incompressible two-phase flows using a conservative level set (CLS) method. More specifically, we compare classical approaches, such as the direct computation of the curvature from the level set or the Laplace-Beltrami operator, with an evolution equation for the mean curvature recently proposed in literature. We consider the test case of a static bubble, for which an exact solution for the pressure jump across the interface is available, and the test case of an oscillating bubble, showing pros and cons of the different approaches.

We consider nonparametric Bayesian inference in a multidimensional diffusion model with reflecting boundary conditions based on discrete high-frequency observations. We prove a general posterior contraction rate theorem in $L^2$-loss, which is applied to Gaussian priors. The resulting posteriors, as well as their posterior means, are shown to converge to the ground truth at the minimax optimal rate over H\"older smoothness classes in any dimension. Of independent interest and as part of our proofs, we show that certain frequentist penalized least squares estimators are also minimax optimal.

Collecting large quantities of high-quality data is often prohibitively expensive or impractical, and a crucial bottleneck in machine learning. One may instead augment a small set of $n$ data points from the target distribution with data from more accessible sources like public datasets, data collected under different circumstances, or synthesized by generative models. Blurring distinctions, we refer to such data as `surrogate data'. We define a simple scheme for integrating surrogate data into training and use both theoretical models and empirical studies to explore its behavior. Our main findings are: $(i)$ Integrating surrogate data can significantly reduce the test error on the original distribution; $(ii)$ In order to reap this benefit, it is crucial to use optimally weighted empirical risk minimization; $(iii)$ The test error of models trained on mixtures of real and surrogate data is well described by a scaling law. This can be used to predict the optimal weighting and the gain from surrogate data.

Anomaly detection in SDN using data flow prediction is a difficult task. This problem is included in the category of time series and regression problems. Machine learning approaches are challenging in this field due to the manual selection of features. On the other hand, deep learning approaches have important features due to the automatic selection of features. Meanwhile, RNN-based approaches have been used the most. The LSTM and GRU approaches learn dependent entities well; on the other hand, the IndRNN approach learns non-dependent entities in time series. The proposed approach tried to use a combination of IndRNN and LSTM approaches to learn dependent and non-dependent features. Feature selection approaches also provide a suitable view of features for the models; for this purpose, four feature selection models, Filter, Wrapper, Embedded, and Autoencoder were used. The proposed IndRNNLSTM algorithm, in combination with Embedded, was able to achieve MAE=1.22 and RMSE=9.92 on NSL-KDD data.

We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain on the accuracy when the model has access to it in addition to another modality. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.

The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.

Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic for various learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.

北京阿比特科技有限公司