亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We address data-driven learning of the infinitesimal generator of stochastic diffusion processes, essential for understanding numerical simulations of natural and physical systems. The unbounded nature of the generator poses significant challenges, rendering conventional analysis techniques for Hilbert-Schmidt operators ineffective. To overcome this, we introduce a novel framework based on the energy functional for these stochastic processes. Our approach integrates physical priors through an energy-based risk metric in both full and partial knowledge settings. We evaluate the statistical performance of a reduced-rank estimator in reproducing kernel Hilbert spaces (RKHS) in the partial knowledge setting. Notably, our approach provides learning bounds independent of the state space dimension and ensures non-spurious spectral estimation. Additionally, we elucidate how the distortion between the intrinsic energy-induced metric of the stochastic diffusion and the RKHS metric used for generator estimation impacts the spectral learning bounds.

相關內容

The dominant paradigm in generative modeling consists of two steps: i) pre-training on a large-scale but unsafe dataset, ii) aligning the pre-trained model with human values via fine-tuning. This practice is considered safe, as no current method can recover the unsafe, pre-fine-tuning model weights. In this paper, we demonstrate that this assumption is often false. Concretely, we present Spectral DeTuning, a method that can recover the weights of the pre-fine-tuning model using a few low-rank (LoRA) fine-tuned models. In contrast to previous attacks that attempt to recover pre-fine-tuning capabilities, our method aims to recover the exact pre-fine-tuning weights. Our approach exploits this new vulnerability against large-scale models such as a personalized Stable Diffusion and an aligned Mistral.

To analyze the worst-case running time of branching algorithms, the majority of work in exponential time algorithms focuses on designing complicated branching rules over developing better analysis methods for simple algorithms. In the mid-$2000$s, Fomin et al. [2005] introduced measure & conquer, an advanced general analysis method, sparking widespread adoption for obtaining tighter worst-case running time upper bounds for many fundamental NP-complete problems. Yet, much potential in this direction remains untapped, as most subsequent work applied it without further advancement. Motivated by this, we present piecewise analysis, a new general method that analyzes the running time of branching algorithms. Our approach is to define a similarity ratio that divides instances into groups and then analyze the running time within each group separately. The similarity ratio is a scale between two parameters of an instance I. Instead of relying on a single measure and a single analysis for the whole instance space, our method allows to take advantage of different intrinsic properties of instances with different similarity ratios. To showcase its potential, we reanalyze two $17$-year-old algorithms from Fomin et al. [2007] that solve $4$-Coloring and #$3$-Coloring respectively. The original analysis in their paper gave running times of $O(1.7272^n)$ and $O(1.6262^n)$ respectively for these algorithms, our analysis improves these running times to $O(1.7207^n)$ and $O(1.6225^n)$.

We define and study translations between the maximal class of analytic display calculi for tense logics and labeled sequent calculi, thus solving an open problem about the translatability of proofs between the two formalisms. In particular, we provide PTIME translations that map cut-free display proofs to and from special cut-free labeled proofs, which we dub 'strict' labeled proofs. This identifies the space of cut-free display proofs with a polynomially equivalent subspace of labeled proofs, showing how calculi within the two formalisms polynomially simulate one another. We analyze the relative sizes of proofs under this translation, finding that display proofs become polynomially shorter when translated to strict labeled proofs, though with a potential increase in the length of sequents; in the reverse translation, strict labeled proofs may become polynomially larger when translated into display proofs. In order to achieve our results, we formulate labeled sequent calculi in a new way that views rules as 'templates', which are instantiated with substitutions to obtain rule applications; we also provide the first definition of primitive tense structural rules within the labeled sequent formalism. Therefore, our formulation of labeled calculi more closely resembles how display calculi are defined for tense logics, which permits a more fine-grained analysis of rules, substitutions, and translations. This work establishes that every analytic display calculus for a tense logic can be viewed as a labeled sequent calculus, showing conclusively that the labeled formalism subsumes and extends the display formalism in the setting of primitive tense logics.

Quantum information decoupling is a fundamental quantum information processing task, which also serves as a crucial tool in a diversity of topics in quantum physics. In this paper, we characterize the reliability function of catalytic quantum information decoupling, that is, the best exponential rate under which perfect decoupling is asymptotically approached. We have obtained the exact formula when the decoupling cost is below a critical value. In the situation of high cost, we provide meaningful upper and lower bounds. This result is then applied to quantum state merging, exploiting its inherent connection to decoupling. In addition, as technical tools, we derive the exact exponents for the smoothing of the conditional min-entropy and max-information, and we prove a novel bound for the convex-split lemma. Our results are given in terms of the sandwiched R\'enyi divergence, providing it with a new type of operational meaning in characterizing how fast the performance of quantum information tasks approaches the perfect.

Recently, graph neural networks have been gaining a lot of attention to simulate dynamical systems due to their inductive nature leading to zero-shot generalizability. Similarly, physics-informed inductive biases in deep-learning frameworks have been shown to give superior performance in learning the dynamics of physical systems. There is a growing volume of literature that attempts to combine these two approaches. Here, we evaluate the performance of thirteen different graph neural networks, namely, Hamiltonian and Lagrangian graph neural networks, graph neural ODE, and their variants with explicit constraints and different architectures. We briefly explain the theoretical formulation highlighting the similarities and differences in the inductive biases and graph architecture of these systems. We evaluate these models on spring, pendulum, gravitational, and 3D deformable solid systems to compare the performance in terms of rollout error, conserved quantities such as energy and momentum, and generalizability to unseen system sizes. Our study demonstrates that GNNs with additional inductive biases, such as explicit constraints and decoupling of kinetic and potential energies, exhibit significantly enhanced performance. Further, all the physics-informed GNNs exhibit zero-shot generalizability to system sizes an order of magnitude larger than the training system, thus providing a promising route to simulate large-scale realistic systems.

The generalization mystery in deep learning is the following: Why do over-parameterized neural networks trained with gradient descent (GD) generalize well on real datasets even though they are capable of fitting random datasets of comparable size? Furthermore, from among all solutions that fit the training data, how does GD find one that generalizes well (when such a well-generalizing solution exists)? We argue that the answer to both questions lies in the interaction of the gradients of different examples during training. Intuitively, if the per-example gradients are well-aligned, that is, if they are coherent, then one may expect GD to be (algorithmically) stable, and hence generalize well. We formalize this argument with an easy to compute and interpretable metric for coherence, and show that the metric takes on very different values on real and random datasets for several common vision networks. The theory also explains a number of other phenomena in deep learning, such as why some examples are reliably learned earlier than others, why early stopping works, and why it is possible to learn from noisy labels. Moreover, since the theory provides a causal explanation of how GD finds a well-generalizing solution when one exists, it motivates a class of simple modifications to GD that attenuate memorization and improve generalization. Generalization in deep learning is an extremely broad phenomenon, and therefore, it requires an equally general explanation. We conclude with a survey of alternative lines of attack on this problem, and argue that the proposed approach is the most viable one on this basis.

Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing the generalization capabilities of a model, it can also address many other challenges and problems, from overcoming a limited amount of training data over regularizing the objective to limiting the amount data used to protect privacy. Based on a precise description of the goals and applications of data augmentation (C1) and a taxonomy for existing works (C2), this survey is concerned with data augmentation methods for textual classification and aims to achieve a concise and comprehensive overview for researchers and practitioners (C3). Derived from the taxonomy, we divided more than 100 methods into 12 different groupings and provide state-of-the-art references expounding which methods are highly promising (C4). Finally, research perspectives that may constitute a building block for future work are given (C5).

Deep neural networks have revolutionized many machine learning tasks in power systems, ranging from pattern recognition to signal processing. The data in these tasks is typically represented in Euclidean domains. Nevertheless, there is an increasing number of applications in power systems, where data are collected from non-Euclidean domains and represented as the graph-structured data with high dimensional features and interdependency among nodes. The complexity of graph-structured data has brought significant challenges to the existing deep neural networks defined in Euclidean domains. Recently, many studies on extending deep neural networks for graph-structured data in power systems have emerged. In this paper, a comprehensive overview of graph neural networks (GNNs) in power systems is proposed. Specifically, several classical paradigms of GNNs structures (e.g., graph convolutional networks, graph recurrent neural networks, graph attention networks, graph generative networks, spatial-temporal graph convolutional networks, and hybrid forms of GNNs) are summarized, and key applications in power systems such as fault diagnosis, power prediction, power flow calculation, and data generation are reviewed in detail. Furthermore, main issues and some research trends about the applications of GNNs in power systems are discussed.

Neural machine translation (NMT) is a deep learning based approach for machine translation, which yields the state-of-the-art translation performance in scenarios where large-scale parallel corpora are available. Although the high-quality and domain-specific translation is crucial in the real world, domain-specific corpora are usually scarce or nonexistent, and thus vanilla NMT performs poorly in such scenarios. Domain adaptation that leverages both out-of-domain parallel corpora as well as monolingual corpora for in-domain translation, is very important for domain-specific translation. In this paper, we give a comprehensive survey of the state-of-the-art domain adaptation techniques for NMT.

Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc, and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.

北京阿比特科技有限公司