
Weir has defined a hierarchy of language classes whose second member ($\mathcal{L}_2$) is generated by tree-adjoining grammars (TAG), linear indexed grammars (LIG), combinatory categorial grammars, and head grammars. The hierarchy is obtained using the mechanism of control, and $\mathcal{L}_2$ is obtained using a context-free grammar (CFG) whose derivations are controlled by another CFG. We adapt Weir's definition of a controllable CFG to give a definition of controllable pushdown automata (PDAs). This yields three new characterizations of $\mathcal{L}_2$ as the class of languages generated by PDAs controlling PDAs, PDAs controlling CFGs, and CFGs controlling PDAs. We show that these four formalisms are not only weakly equivalent but equivalent in a stricter sense that we call d-weak equivalence. Furthermore, using an even stricter notion of equivalence called d-strong equivalence, we make precise the intuition that a CFG controlling a CFG is a TAG, a PDA controlling a PDA is an embedded PDA, and a PDA controlling a CFG is a LIG. The fourth member of this family, a CFG controlling a PDA, does not correspond to any formalism we know of, so we invent one and call it a Pushdown Adjoining Automaton.

Related content

Poisson process models are defined in terms of their rates for outage and restore processes in power system resilience events. These outage and restore processes easily yield the performance curves that track the evolution of resilience events, and the area, nadir, and duration of the performance curves are standard resilience metrics. This letter analyzes typical resilience events by analyzing the area, nadir, and duration of mean performance curves. Explicit and intuitive formulas for these metrics are derived in terms of the Poisson process model parameters, and these parameters can be estimated from utility data. This clarifies the calculation of metrics of typical resilience events, and shows what they depend on. The metric formulas are derived with lognormal, exponential, or constant rates of restoration. The method is illustrated with a typical North American transmission event. Similarly nice formulas are obtained for the area metric for empirical power system data.
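
As an illustration of the kind of calculation the letter describes, the following Python sketch simulates one outage/restore event and computes the area, nadir, and duration of its performance curve. The constant outage rate over a fixed storm window and the lognormal restoration delays are illustrative assumptions standing in for the letter's fitted Poisson process rates.

```python
# A minimal Monte Carlo sketch of the outage/restore picture described above.
# The rate shapes (constant outage rate during a storm window, lognormal restore
# delays) are illustrative assumptions, not the letter's exact fitted model.
import numpy as np

rng = np.random.default_rng(0)

def simulate_event(outage_rate=5.0, storm_hours=8.0, restore_mu=2.0, restore_sigma=1.0):
    """Simulate one resilience event; return outage and restore times (hours)."""
    # Outages: Poisson process with constant rate during the storm window.
    n_out = rng.poisson(outage_rate * storm_hours)
    t_out = np.sort(rng.uniform(0.0, storm_hours, n_out))
    # Restores: each outage is restored after an independent lognormal delay.
    t_rest = t_out + rng.lognormal(restore_mu, restore_sigma, n_out)
    return t_out, np.sort(t_rest)

def performance_metrics(t_out, t_rest, dt=0.05):
    """Performance curve = -(number of components still out); area, nadir, duration."""
    t = np.arange(0.0, t_rest.max() + dt, dt)
    n_unrestored = (np.searchsorted(t_out, t, side="right")
                    - np.searchsorted(t_rest, t, side="right"))
    perf = -n_unrestored.astype(float)
    area = float(np.sum(perf) * dt)     # (negative) area under the curve, rectangle rule
    nadir = perf.min()                  # worst point of the event
    duration = t_rest.max() - t_out[0]  # first outage to last restore
    return area, nadir, duration

area, nadir, duration = performance_metrics(*simulate_event())
print(f"area={area:.1f} component-hours, nadir={nadir:.0f}, duration={duration:.1f} h")
```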

In this paper, we investigate the problem of controlling multiple unmanned aerial vehicles (UAVs) to enclose a moving target in a distributed fashion based on relative distance and self-displacement measurements. A relative localization technique is developed based on recursive least squares estimation (RLSE) with a forgetting factor to estimate both the ``UAV-UAV'' and ``UAV-target'' relative positions. The formation enclosing motion is planned using a coupled oscillator model, which generates the desired motion for the UAVs to distribute evenly on a circle. The coupled-oscillator-based motion also facilitates the exponential convergence of the relative localization due to its persistent excitation property. Based on the desired formation pattern generation strategy and the relative localization estimates, a cooperative formation tracking control scheme is proposed, which enables the geometric center of the formation to asymptotically converge to the moving target. The asymptotic convergence is analyzed theoretically for both the relative localization technique and the formation control algorithm. Numerical simulations are provided to show the efficiency of the proposed algorithm, and experiments with three quadrotors tracking one target are conducted to evaluate the proposed target-enclosing method on real platforms.
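
The following Python sketch shows recursive least squares with a forgetting factor, the estimator named above, on a generic scalar linear measurement model; the regressors built from actual distance and displacement measurements in the paper are replaced here by a stand-in persistently exciting signal.

```python
# A minimal sketch of recursive least squares (RLS) with a forgetting factor.
# The scalar-output linear model y_k = phi_k^T theta + noise is a generic
# stand-in; the paper's regressors come from distance/displacement measurements.
import numpy as np

class ForgettingRLS:
    def __init__(self, dim, lam=0.98, p0=1e3):
        self.theta = np.zeros(dim)        # parameter estimate
        self.P = p0 * np.eye(dim)         # covariance-like matrix
        self.lam = lam                    # forgetting factor in (0, 1]

    def update(self, phi, y):
        """One RLS step for measurement y with regressor phi."""
        phi = np.asarray(phi, dtype=float)
        Pphi = self.P @ phi
        gain = Pphi / (self.lam + phi @ Pphi)            # Kalman-like gain
        self.theta = self.theta + gain * (y - phi @ self.theta)
        self.P = (self.P - np.outer(gain, Pphi)) / self.lam
        return self.theta

# Usage: estimate a constant 2-D relative position from noisy linear measurements.
rng = np.random.default_rng(1)
true_pos = np.array([3.0, -1.5])
est = ForgettingRLS(dim=2)
for _ in range(200):
    phi = rng.normal(size=2)                             # persistently exciting regressor
    y = phi @ true_pos + 0.05 * rng.normal()
    est.update(phi, y)
print(est.theta)                                         # approximately [3.0, -1.5]
```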

Splines over triangulations and splines over quadrangulations (tensor product splines) are two common ways to extend bivariate polynomials to splines. Combining the two approaches leads to splines defined over mixed triangle and quadrilateral meshes within the isogeometric framework. Mixed meshes are especially useful for representing complicated geometries obtained, e.g., from trimming. As (bi-)linearly parameterized mesh elements are not flexible enough to cover smooth domains, we focus in this work on planar mixed meshes parameterized by (bi-)quadratic geometry mappings. In particular, we study in detail the space of $C^1$-smooth isogeometric spline functions of general polynomial degree over two such mixed mesh elements. We present the theoretical framework to analyze the smoothness conditions across the common interface for all possible configurations of mesh elements. This comprises the investigation of the dimension as well as the construction of a basis of the corresponding $C^1$-smooth isogeometric spline space over the domain described by the two elements. Several examples of interest are presented in detail.

Compositionality is at the heart of computer science and several other areas of applied category theory such as computational linguistics, categorical quantum mechanics, interpretable AI, dynamical systems, compositional game theory, and Petri nets. However, the meaning of the term seems to vary across the many different applications. This work contributes to understanding, and in particular qualifying, different kinds of compositionality. Formally, we introduce invariants of categories that we call zeroth and first homotopy posets, generalising in a precise sense the $\pi_0$ and $\pi_1$ of a groupoid. These posets can be used to obtain a qualitative description of how far an object is from being terminal and a morphism is from being iso. In the context of applied category theory, this formal machinery gives us a way to qualitatively describe the "failures of compositionality", seen as failures of certain (op)lax functors to be strong, by classifying obstructions to the (op)laxators being isomorphisms. Failure of compositionality, for example for the interpretation of a categorical syntax in a semantic universe, can both be a bad thing and a good thing, which we illustrate by respective examples in graph theory and quantum theory.

We establish a connection between stochastic optimal control and generative models based on stochastic differential equations (SDEs), such as recently developed diffusion probabilistic models. In particular, we derive a Hamilton-Jacobi-Bellman equation that governs the evolution of the log-densities of the underlying SDE marginals. This perspective allows us to transfer methods from optimal control theory to generative modeling. First, we show that the evidence lower bound is a direct consequence of the well-known verification theorem from control theory. Further, we can formulate diffusion-based generative modeling as a minimization of the Kullback-Leibler divergence between suitable measures in path space. Finally, we develop a novel diffusion-based method for sampling from unnormalized densities -- a problem frequently occurring in statistics and computational sciences. We demonstrate that our time-reversed diffusion sampler (DIS) can outperform other diffusion-based sampling approaches on multiple numerical examples.
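
To make the SDE-based generative modeling setup concrete, here is a Python sketch of reverse-time Euler-Maruyama sampling for a variance-preserving SDE whose data distribution is a one-dimensional Gaussian, so the time-dependent score is available in closed form. In practice the score would be a learned network, and the DIS sampler targets an unnormalized density rather than a data distribution; this closed-form setting only exhibits the mechanics.

```python
# A minimal sketch of reverse-time SDE sampling. The 1-D Gaussian data
# distribution is an assumption that makes the exact score available; a real
# sampler replaces `score` with a learned model.
import numpy as np

rng = np.random.default_rng(0)
beta, T, n_steps = 1.0, 5.0, 500          # forward VP-SDE: dX = -0.5*beta*X dt + sqrt(beta) dW
m, s2 = 2.0, 0.25                         # data distribution N(m, s2)

def score(x, t):
    """Exact score of the forward marginal p_t when p_0 = N(m, s2)."""
    a = np.exp(-0.5 * beta * t)           # mean contraction factor
    mean_t = a * m
    var_t = s2 * a**2 + 1.0 - a**2
    return -(x - mean_t) / var_t

# Reverse-time Euler-Maruyama from t = T (approximately standard normal) down to t = 0.
dt = T / n_steps
x = rng.normal(size=10_000)               # start from the reference N(0, 1)
for i in range(n_steps, 0, -1):
    t = i * dt
    drift = -0.5 * beta * x - beta * score(x, t)       # f - g^2 * score
    x = x - drift * dt + np.sqrt(beta * dt) * rng.normal(size=x.shape)

print(x.mean(), x.var())                  # approximately 2.0 and 0.25
```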

Recently, Transformer-like deep architectures have shown strong performance on tabular data problems. Unlike traditional models, e.g., MLP, these architectures map scalar values of numerical features to high-dimensional embeddings before mixing them in the main backbone. In this work, we argue that embeddings for numerical features are an underexplored degree of freedom in tabular DL, which allows constructing more powerful DL models and competing with GBDT on some traditionally GBDT-friendly benchmarks. We start by describing two conceptually different approaches to building embedding modules: the first one is based on a piecewise linear encoding of scalar values, and the second one utilizes periodic activations. Then, we empirically demonstrate that these two approaches can lead to significant performance boosts compared to the embeddings based on conventional blocks such as linear layers and ReLU activations. Importantly, we also show that embedding numerical features is beneficial for many backbones, not only for Transformers. Specifically, after proper embeddings, simple MLP-like models can perform on par with the attention-based architectures. Overall, we highlight embeddings for numerical features as an important design aspect with good potential for further improvements in tabular DL.
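
A minimal NumPy sketch of the two embedding ideas mentioned above is given below. The bin edges and frequencies are illustrative placeholders; in the paper the bins come from training-data quantiles (or targets) and the frequencies are trained parameters.

```python
# A minimal sketch of piecewise linear encoding and periodic embeddings for a
# scalar feature, written framework-agnostically in NumPy. Edges and frequencies
# are illustrative assumptions, not the paper's learned parameters.
import numpy as np

def piecewise_linear_encoding(x, bin_edges):
    """One value per bin: 1 below the value's bin, 0 above it, fractional inside."""
    lo, hi = bin_edges[:-1], bin_edges[1:]
    return np.clip((x - lo) / (hi - lo), 0.0, 1.0)

def periodic_embedding(x, frequencies):
    """Map scalar x to [sin(2*pi*f*x), cos(2*pi*f*x)] for each frequency f."""
    angles = 2.0 * np.pi * frequencies * x
    return np.concatenate([np.sin(angles), np.cos(angles)])

# Usage: bin edges from feature quantiles, a few hand-picked frequencies.
edges = np.quantile(np.random.default_rng(0).normal(size=1000), np.linspace(0, 1, 9))
print(piecewise_linear_encoding(0.3, edges))                 # 8-dimensional PLE vector
print(periodic_embedding(0.3, np.array([1.0, 2.0, 4.0])))    # 6-dimensional embedding
```

Either encoding is then passed through the usual linear layer (and nonlinearity) before entering the backbone, which is where the learned capacity of the embedding module sits.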

The level-$k$ $\ell_1$-Fourier weight of a Boolean function refers to the sum of absolute values of its level-$k$ Fourier coefficients. Fourier growth refers to the growth of these weights as $k$ grows. It has been extensively studied for various computational models, and bounds on the Fourier growth, even for the first few levels, have proven useful in learning theory, circuit lower bounds, pseudorandomness, and quantum-classical separations. We investigate the Fourier growth of certain functions that naturally arise from communication protocols for XOR functions (partial functions evaluated on the bitwise XOR of the inputs to Alice and Bob). If a protocol $\mathcal C$ computes an XOR function, then $\mathcal C(x,y)$ is a function of the parity $x\oplus y$. This motivates us to analyze the XOR-fiber of $\mathcal C$, defined as $h(z):=\mathbb E_{x,y}[\mathcal C(x,y)|x\oplus y=z]$. We present improved Fourier growth bounds for the XOR-fibers of protocols that communicate $d$ bits. For the first level, we show a tight $O(\sqrt d)$ bound and obtain a new coin theorem, as well as an alternative proof for the tight randomized communication lower bound for Gap-Hamming. For the second level, we show a $d^{3/2}\cdot\mathrm{polylog}(n)$ bound, which improves the previous $O(d^2)$ bound by Girish, Raz, and Tal (ITCS 2021) and implies a polynomial improvement on the randomized communication lower bound for the XOR-lift of Forrelation, extending its quantum-classical gap. Our analysis is based on a new way of adaptively partitioning a relatively large set in Gaussian space to control its moments in all directions. We achieve this via martingale arguments and by allowing protocols to transmit real values. We also show a connection between Fourier growth and lifting theorems with constant-sized gadgets as a potential approach to prove optimal bounds for the second level and beyond.
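
For concreteness, the following Python sketch computes level-$k$ $\ell_1$-Fourier weights by brute force for a small example function (majority on five bits); it only unpacks the definition above and is not part of the paper's techniques.

```python
# A minimal brute-force sketch of level-k l1 Fourier weights for a small
# +/-1-valued function on {0,1}^n. The majority function is an illustrative choice.
from itertools import combinations, product
import numpy as np

n = 5
points = list(product([0, 1], repeat=n))
f = {x: (1.0 if sum(x) > n / 2 else -1.0) for x in points}    # majority on n bits

def fourier_coefficient(S):
    """hat f(S) = E_x[ f(x) * (-1)^{sum of x_i for i in S} ]."""
    return np.mean([f[x] * (-1.0) ** sum(x[i] for i in S) for x in points])

def level_k_l1_weight(k):
    """Sum of |hat f(S)| over all sets S of size exactly k."""
    return sum(abs(fourier_coefficient(S)) for S in combinations(range(n), k))

for k in range(n + 1):
    print(k, level_k_l1_weight(k))
```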

Reachability types are a recent proposal that has shown promise in scaling to higher-order but monomorphic settings, tracking aliasing and separation on top of a substrate inspired by separation logic. The prior $\lambda^*$ reachability type system qualifies types with sets of reachable variables and guarantees separation if two terms have disjoint qualifiers. However, naive extensions with type polymorphism and/or precise reachability polymorphism are unsound, making $\lambda^*$ unsuitable for adoption in real languages. Combining reachability and type polymorphism that is precise, sound, and parametric remains an open challenge. This paper presents a rethinking of the design of reachability tracking and proposes a solution to the key challenge of reachability polymorphism. Instead of always tracking the transitive closure of reachable variables as in the original design, we only track variables reachable in a single step and compute transitive closures only when necessary, thus preserving chains of reachability over known variables that can be refined using substitution. To enable this property, we introduce a new freshness qualifier, which indicates variables whose reachability sets may grow during evaluation steps. These ideas yield the simply-typed $\lambda^\diamond$-calculus with precise lightweight, i.e., quantifier-free, reachability polymorphism, and the $\mathsf{F}_{<:}^\diamond$-calculus with bounded parametric polymorphism over types and reachability qualifiers. We prove type soundness and a preservation of separation property in Coq.
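
Abstracting away the type-system details, the bookkeeping idea of tracking one-step reachability and closing it transitively only on demand can be pictured with a small Python sketch over a plain variable graph; the names and API below are hypothetical and purely illustrative.

```python
# A minimal sketch, detached from the calculus itself, of storing only one-step
# reachability sets and computing transitive closures lazily when a query needs them.
from collections import defaultdict

class ReachabilityGraph:
    def __init__(self):
        self.step = defaultdict(set)          # variable -> variables reachable in one step

    def add_edge(self, src, dst):
        self.step[src].add(dst)

    def closure(self, var):
        """Compute the transitive closure of `var` on demand (iterative DFS)."""
        seen, stack = set(), [var]
        while stack:
            v = stack.pop()
            for w in self.step[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

g = ReachabilityGraph()
g.add_edge("p", "x"); g.add_edge("x", "buf")
print(g.step["p"])        # one-step view: {'x'}
print(g.closure("p"))     # closed only when needed: {'x', 'buf'}
```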

The generalization mystery in deep learning is the following: Why do over-parameterized neural networks trained with gradient descent (GD) generalize well on real datasets even though they are capable of fitting random datasets of comparable size? Furthermore, from among all solutions that fit the training data, how does GD find one that generalizes well (when such a well-generalizing solution exists)? We argue that the answer to both questions lies in the interaction of the gradients of different examples during training. Intuitively, if the per-example gradients are well-aligned, that is, if they are coherent, then one may expect GD to be (algorithmically) stable, and hence generalize well. We formalize this argument with an easy-to-compute and interpretable metric for coherence, and show that the metric takes on very different values on real and random datasets for several common vision networks. The theory also explains a number of other phenomena in deep learning, such as why some examples are reliably learned earlier than others, why early stopping works, and why it is possible to learn from noisy labels. Moreover, since the theory provides a causal explanation of how GD finds a well-generalizing solution when one exists, it motivates a class of simple modifications to GD that attenuate memorization and improve generalization. Generalization in deep learning is an extremely broad phenomenon, and therefore, it requires an equally general explanation. We conclude with a survey of alternative lines of attack on this problem and argue on this basis that the proposed approach is the most viable one.
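
As a rough illustration of the kind of measurement involved, the following Python sketch compares per-example gradient alignment for real versus random labels in a tiny least-squares problem. The statistic used (squared norm of the average gradient relative to the average squared gradient norm) is one simple alignment measure in this spirit, not necessarily the paper's exact coherence metric.

```python
# A minimal sketch of per-example gradient alignment on a least-squares toy model.
# The alignment statistic below is an illustrative choice, not the paper's metric.
import numpy as np

rng = np.random.default_rng(0)
n, d = 256, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y_real = X @ w_true + 0.1 * rng.normal(size=n)   # labels with shared structure
y_rand = rng.normal(size=n)                      # pure-noise labels

def per_example_gradients(w, y):
    """Gradients of 0.5*(x_i . w - y_i)^2 with respect to w, one row per example."""
    return (X @ w - y)[:, None] * X              # shape (n, d)

def coherence(grads):
    """||average gradient||^2 relative to the average squared gradient norm."""
    mean_grad = grads.mean(axis=0)
    return mean_grad @ mean_grad / np.mean(np.sum(grads**2, axis=1))

w0 = np.zeros(d)                                       # "early in training"
print(coherence(per_example_gradients(w0, y_real)))    # noticeably above the 1/n floor
print(coherence(per_example_gradients(w0, y_rand)))    # close to the 1/n floor (~0.004)
```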

This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.
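
The criticality tuning mentioned above can be previewed with a short NumPy sketch that tracks activation variance through a deep tanh MLP for weight variances below, at, and above the tanh critical point $\sigma_w^2 = 1$ (with zero biases). The widths, depths, and values are illustrative; the book's analysis goes far beyond this forward-pass picture.

```python
# A minimal sketch of signal propagation at and away from criticality in a deep
# tanh network with weights of variance sigma_w^2 / fan_in and zero biases.
# Below criticality the signal decays exponentially with depth; at sigma_w = 1 it
# decays only slowly; above it the activations saturate at an order-one level.
import numpy as np

rng = np.random.default_rng(0)
width, depth = 512, 50

def forward_variance(sigma_w):
    """Track the per-neuron activation variance layer by layer."""
    x = rng.normal(size=width)
    variances = []
    for _ in range(depth):
        W = rng.normal(0.0, sigma_w / np.sqrt(width), size=(width, width))
        x = np.tanh(W @ x)
        variances.append(x.var())
    return variances

for sigma_w in (0.5, 1.0, 2.0):
    v = forward_variance(sigma_w)
    print(f"sigma_w={sigma_w}: layer-1 var={v[0]:.3f}, layer-{depth} var={v[-1]:.3g}")
```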
