亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

In this paper, we introduce a novel analysis of neural networks based on geometric (Clifford) algebra and convex optimization. We show that optimal weights of deep ReLU neural networks are given by the wedge product of training samples when trained with standard regularized loss. Furthermore, the training problem reduces to convex optimization over wedge product features, which encode the geometric structure of the training dataset. This structure is given in terms of signed volumes of triangles and parallelotopes generated by data vectors. The convex problem finds a small subset of samples via $\ell_1$ regularization to discover only relevant wedge product features. Our analysis provides a novel perspective on the inner workings of deep neural networks and sheds light on the role of the hidden layers.

相關內容

Networking:IFIP International Conferences on Networking。 Explanation:國際網絡會議。 Publisher:IFIP。 SIT:

In this paper, we propose a method to avoid "no-solution" situations of the control barrier function (CBF) for distributed collision avoidance in a multiagent autonomous robotic system (MARS). MARS, which is composed of distributed autonomous mobile robots, is expected to effectively perform cooperative tasks such as searching in a certain area. Therefore, collision avoidance must be considered when implementing MARS in the real world. The CBF is effective for solving collision-avoidance problems. However, in extreme conditions where many robots congregate at one location, the CBF constraints that ensure a safe distance between robots may be violated. We theoretically demonstrate that this problem can occur in certain situations, and introduce an asymmetric design for the inequality constraints of CBF. We asymmetrically decentralized inequality constraints with weight functions using the absolute speed of the robot so that other robots can take over the constraints of the robot in severe condition. We demonstrate the effectiveness of the proposed method in a two-dimensional situation wherein multiple robots congregate at one location. We implement the proposed method on real robots and the confirmed the effectiveness of this theory.

In this article, we establish the mathematical foundations for modeling the randomness of shapes and conducting statistical inference on shapes using the smooth Euler characteristic transform. Based on these foundations, we propose two parametric algorithms for testing hypotheses on random shapes. Simulation studies are presented to validate our mathematical derivations and to compare our algorithms with state-of-the-art methods to demonstrate the utility of our proposed framework. As real applications, we analyze a data set of mandibular molars from four genera of primates and show that our algorithms have the power to detect significant shape differences that recapitulate known morphological variation across suborders. Altogether, our discussions bridge the following fields: algebraic and computational topology, probability theory and stochastic processes, Sobolev spaces and functional analysis, statistical inference, and geometric morphometrics.

In this study, we present an investigation into the anisotropy dynamics and intrinsic dimension of embeddings in transformer architectures, focusing on the dichotomy between encoders and decoders. Our findings reveal that the anisotropy profile in transformer decoders exhibits a distinct bell-shaped curve, with the highest anisotropy concentrations in the middle layers. This pattern diverges from the more uniformly distributed anisotropy observed in encoders. In addition, we found that the intrinsic dimension of embeddings increases in the initial phases of training, indicating an expansion into higher-dimensional space. Which is then followed by a compression phase towards the end of training with dimensionality decrease, suggesting a refinement into more compact representations. Our results provide fresh insights to the understanding of encoders and decoders embedding properties.

This paper investigates a large unitarily invariant system (LUIS) involving a unitarily invariant sensing matrix, an arbitrarily fixed signal distribution, and forward error control (FEC) coding. A universal Gram-Schmidt orthogonalization is considered for constructing orthogonal approximate message passing (OAMP), enabling its applicability to a wide range of prototypes without the constraint of differentiability. We develop two single-input-single-output variational transfer functions for OAMP with Lipschitz continuous local estimators, facilitating an analysis of achievable rates. Furthermore, when the state evolution of OAMP has a unique fixed point, we reveal that OAMP can achieve the constrained capacity predicted by the replica method of LUIS based on matched FEC coding, regardless of the signal distribution. The replica method is rigorously validated for LUIS with Gaussian signaling and certain sub-classes of LUIS with arbitrary signal distributions. Several area properties are established based on the variational transfer functions of OAMP. Meanwhile, we present a replica constrained capacity-achieving coding principle for LUIS. This principle serves as the basis for optimizing irregular low-density parity-check (LDPC) codes specifically tailored for binary signaling in our simulation results. The performance of OAMP with these optimized codes exhibits a remarkable improvement over the unoptimized codes and even surpasses the well-known Turbo-LMMSE algorithm. For quadrature phase-shift keying (QPSK) modulation, we observe bit error rates (BER) performance near the replica constrained capacity across diverse channel conditions.

When training a neural network, it will quickly memorise some source-target mappings from your dataset but never learn some others. Yet, memorisation is not easily expressed as a binary feature that is good or bad: individual datapoints lie on a memorisation-generalisation continuum. What determines a datapoint's position on that spectrum, and how does that spectrum influence neural models' performance? We address these two questions for neural machine translation (NMT) models. We use the counterfactual memorisation metric to (1) build a resource that places 5M NMT datapoints on a memorisation-generalisation map, (2) illustrate how the datapoints' surface-level characteristics and a models' per-datum training signals are predictive of memorisation in NMT, (3) and describe the influence that subsets of that map have on NMT systems' performance.

In this work, we present an abstract theory for the approximation of operator-valued Riccati equations posed on Hilbert spaces. It is demonstrated here, under the assumption of compactness in the coefficient operators, that the error of the approximate solution to the operator-valued Riccati equation is bounded above by the approximation error of the governing semigroup. One significant outcome of this result is the correct prediction of optimal convergence for finite element approximations of the operator-valued Riccati equations for when the governing semigroup involves parabolic, as well as hyperbolic processes. We derive the abstract theory for the time-dependent and time-independent operator-valued Riccati equations in the first part of this work. In the second part, we prove optimal convergence rates for the finite element approximation of the functional gain associated with model one-dimensional weakly damped wave and thermal LQR control systems. These theoretical claims are then corroborated with computational evidence.

Recently, graph neural networks have been gaining a lot of attention to simulate dynamical systems due to their inductive nature leading to zero-shot generalizability. Similarly, physics-informed inductive biases in deep-learning frameworks have been shown to give superior performance in learning the dynamics of physical systems. There is a growing volume of literature that attempts to combine these two approaches. Here, we evaluate the performance of thirteen different graph neural networks, namely, Hamiltonian and Lagrangian graph neural networks, graph neural ODE, and their variants with explicit constraints and different architectures. We briefly explain the theoretical formulation highlighting the similarities and differences in the inductive biases and graph architecture of these systems. We evaluate these models on spring, pendulum, gravitational, and 3D deformable solid systems to compare the performance in terms of rollout error, conserved quantities such as energy and momentum, and generalizability to unseen system sizes. Our study demonstrates that GNNs with additional inductive biases, such as explicit constraints and decoupling of kinetic and potential energies, exhibit significantly enhanced performance. Further, all the physics-informed GNNs exhibit zero-shot generalizability to system sizes an order of magnitude larger than the training system, thus providing a promising route to simulate large-scale realistic systems.

In the past decade, we have witnessed the rise of deep learning to dominate the field of artificial intelligence. Advances in artificial neural networks alongside corresponding advances in hardware accelerators with large memory capacity, together with the availability of large datasets enabled researchers and practitioners alike to train and deploy sophisticated neural network models that achieve state-of-the-art performance on tasks across several fields spanning computer vision, natural language processing, and reinforcement learning. However, as these neural networks become bigger, more complex, and more widely used, fundamental problems with current deep learning models become more apparent. State-of-the-art deep learning models are known to suffer from issues that range from poor robustness, inability to adapt to novel task settings, to requiring rigid and inflexible configuration assumptions. Ideas from collective intelligence, in particular concepts from complex systems such as self-organization, emergent behavior, swarm optimization, and cellular systems tend to produce solutions that are robust, adaptable, and have less rigid assumptions about the environment configuration. It is therefore natural to see these ideas incorporated into newer deep learning methods. In this review, we will provide a historical context of neural network research's involvement with complex systems, and highlight several active areas in modern deep learning research that incorporate the principles of collective intelligence to advance its current capabilities. To facilitate a bi-directional flow of ideas, we also discuss work that utilize modern deep learning models to help advance complex systems research. We hope this review can serve as a bridge between complex systems and deep learning communities to facilitate the cross pollination of ideas and foster new collaborations across disciplines.

Small data challenges have emerged in many learning problems, since the success of deep neural networks often relies on the availability of a huge amount of labeled data that is expensive to collect. To address it, many efforts have been made on training complex models with small data in an unsupervised and semi-supervised fashion. In this paper, we will review the recent progresses on these two major categories of methods. A wide spectrum of small data models will be categorized in a big picture, where we will show how they interplay with each other to motivate explorations of new ideas. We will review the criteria of learning the transformation equivariant, disentangled, self-supervised and semi-supervised representations, which underpin the foundations of recent developments. Many instantiations of unsupervised and semi-supervised generative models have been developed on the basis of these criteria, greatly expanding the territory of existing autoencoders, generative adversarial nets (GANs) and other deep networks by exploring the distribution of unlabeled data for more powerful representations. While we focus on the unsupervised and semi-supervised methods, we will also provide a broader review of other emerging topics, from unsupervised and semi-supervised domain adaptation to the fundamental roles of transformation equivariance and invariance in training a wide spectrum of deep networks. It is impossible for us to write an exclusive encyclopedia to include all related works. Instead, we aim at exploring the main ideas, principles and methods in this area to reveal where we are heading on the journey towards addressing the small data challenges in this big data era.

In this paper, we propose a novel multi-task learning architecture, which incorporates recent advances in attention mechanisms. Our approach, the Multi-Task Attention Network (MTAN), consists of a single shared network containing a global feature pool, together with task-specific soft-attention modules, which are trainable in an end-to-end manner. These attention modules allow for learning of task-specific features from the global pool, whilst simultaneously allowing for features to be shared across different tasks. The architecture can be built upon any feed-forward neural network, is simple to implement, and is parameter efficient. Experiments on the CityScapes dataset show that our method outperforms several baselines in both single-task and multi-task learning, and is also more robust to the various weighting schemes in the multi-task loss function. We further explore the effectiveness of our method through experiments over a range of task complexities, and show how our method scales well with task complexity compared to baselines.

北京阿比特科技有限公司