亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Multi-scale problems, where variables of interest evolve in different time-scales and live in different state-spaces. can be found in many fields of science. Here, we introduce a new recursive methodology for Bayesian inference that aims at estimating the static parameters and tracking the dynamic variables of these kind of systems. Although the proposed approach works in rather general multi-scale systems, for clarity we analyze the case of a heterogeneous multi-scale model with 3 time-scales (static parameters, slow dynamic state variables and fast dynamic state variables). The proposed scheme, based on nested filtering methodology of P\'erez-Vieites et al. (2018), combines three intertwined layers of filtering techniques that approximate recursively the joint posterior probability distribution of the parameters and both sets of dynamic state variables given a sequence of partial and noisy observations. We explore the use of sequential Monte Carlo schemes in the first and second layers while we use an unscented Kalman filter to obtain a Gaussian approximation of the posterior probability distribution of the fast variables in the third layer. Some numerical results are presented for a stochastic two-scale Lorenz 96 model with unknown parameters.

相關內容

“后驗”是指在考慮與所審查的特定案件有關的相關證據之后。類似地,后驗概率分布是未知量的概率分布,視從實驗或調查獲得的證據為條件,該未知量被視為隨機變量。

Reversible data hiding continues to attract significant attention in recent years. In particular, an increasing number of authors focus on the higher significant bit (HSB) plane of an image which can yield more redundant space. On the other hand, the lower significant bit plane is often discarded in existing schemes due to their harm to the embedding rate. This paper proposes an efficient reversible data hiding scheme via a double-peak two-layer embedding (DTLE) strategy with prediction error expansion. The higher six-bit planes of the image are assigned as the HSB plane, and double prediction error peaks are applied in either embedding layer. This makes fuller use of the redundancy space of images compared with the one error peak strategy. Moreover, we carry out the median-edge detector pre-processing for complex images to reduce the size of the auxiliary information. A series of experimental results show that our DTLE approach achieves up to 83% higher embedding rate on real-world data while providing better image quality.

We consider inference for a collection of partially observed, stochastic, interacting, nonlinear dynamic processes. Each process is identified with a label called its unit, and our primary motivation arises in biological metapopulation systems where a unit corresponds to a spatially distinct sub-population. Metapopulation systems are characterized by strong dependence through time within a single unit and relatively weak interactions between units, and these properties make block particle filters an effective tool for simulation-based likelihood evaluation. Iterated filtering algorithms can facilitate likelihood maximization for simulation-based filters. We introduce a new iterated block particle filter algorithm applicable when parameters are unit-specific or shared between units. We demonstrate this algorithm by performing inference on a coupled epidemiological model describing spatiotemporal measles case report data for twenty towns.

To investigate the heterogeneity of federated learning in real-world scenarios, we generalize the classical federated learning to federated hetero-task learning, which emphasizes the inconsistency across the participants in federated learning in terms of both data distribution and learning tasks. We also present B-FHTL, a federated hetero-task learning benchmark consisted of simulation dataset, FL protocols and a unified evaluation mechanism. B-FHTL dataset contains three well-designed federated learning tasks with increasing heterogeneity. Each task simulates the clients with different data distributions and learning tasks. To ensure fair comparison among different FL algorithms, B-FHTL builds in a full suite of FL protocols by providing high-level APIs to avoid privacy leakage, and presets most common evaluation metrics spanning across different learning tasks, such as regression, classification, text generation and etc. Furthermore, we compare the FL algorithms in fields of federated multi-task learning, federated personalization and federated meta learning within B-FHTL, and highlight the influence of heterogeneity and difficulties of federated hetero-task learning. Our benchmark, including the federated dataset, protocols, the evaluation mechanism and the preliminary experiment, is open-sourced at //github.com/alibaba/FederatedScope/tree/contest/v1.0.

In modern machine learning, users often have to collaborate to learn the distribution of the data. Communication can be a significant bottleneck. Prior work has studied homogeneous users -- i.e., whose data follow the same discrete distribution -- and has provided optimal communication-efficient methods for estimating that distribution. However, these methods rely heavily on homogeneity, and are less applicable in the common case when users' discrete distributions are heterogeneous. Here we consider a natural and tractable model of heterogeneity, where users' discrete distributions only vary sparsely, on a small number of entries. We propose a novel two-stage method named SHIFT: First, the users collaborate by communicating with the server to learn a central distribution; relying on methods from robust statistics. Then, the learned central distribution is fine-tuned to estimate their respective individual distribution. We show that SHIFT is minimax optimal in our model of heterogeneity and under communication constraints. Further, we provide experimental results using both synthetic data and $n$-gram frequency estimation in the text domain, which corroborate its efficiency.

Bilevel optimization has arisen as a powerful tool in modern machine learning. However, due to the nested structure of bilevel optimization, even gradient-based methods require second-order derivative approximations via Jacobian- or/and Hessian-vector computations, which can be costly and unscalable in practice. Recently, Hessian-free bilevel schemes have been proposed to resolve this issue, where the general idea is to use zeroth- or first-order methods to approximate the full hypergradient of the bilevel problem. However, we empirically observe that such approximation can lead to large variance and unstable training, but estimating only the response Jacobian matrix as a partial component of the hypergradient turns out to be extremely effective. To this end, we propose a new Hessian-free method, which adopts the zeroth-order-like method to approximate the response Jacobian matrix via taking difference between two optimization paths. Theoretically, we provide the convergence rate analysis for the proposed algorithms, where our key challenge is to characterize the approximation and smoothness properties of the trajectory-dependent estimator, which can be of independent interest. This is the first known convergence rate result for this type of Hessian-free bilevel algorithms. Experimentally, we demonstrate that the proposed algorithms outperform baseline bilevel optimizers on various bilevel problems. Particularly, in our experiment on few-shot meta-learning with ResNet-12 network over the miniImageNet dataset, we show that our algorithm outperforms baseline meta-learning algorithms, while other baseline bilevel optimizers do not solve such meta-learning problems within a comparable time frame.

This paper provides a variational analysis of the unconstrained formulation of the LASSO problem, ubiquitous in statistical learning, signal processing, and inverse problems. In particular, we establish smoothness results for the optimal value as well as Lipschitz properties of the optimal solution as functions of the right-hand side (or measurement vector) and the regularization parameter. Moreover, we show how to apply the proposed variational analysis to study the sensitivity of the optimal solution to the tuning parameter in the context of compressed sensing with subgaussian measurements. Our theoretical findings are validated by numerical experiments.

Hierarchical matrices provide a powerful representation for significantly reducing the computational complexity associated with dense kernel matrices. For general kernel functions, interpolation-based methods are widely used for the efficient construction of hierarchical matrices. In this paper, we present a fast hierarchical data reduction (HiDR) procedure with $O(n)$ complexity for the memory-efficient construction of hierarchical matrices with nested bases where $n$ is the number of data points. HiDR aims to reduce the given data in a hierarchical way so as to obtain $O(1)$ representations for all nearfield and farfield interactions. Based on HiDR, a linear complexity $\mathcal{H}^2$ matrix construction algorithm is proposed. The use of data-driven methods enables {better efficiency than other general-purpose methods} and flexible computation without accessing the kernel function. Experiments demonstrate significantly improved memory efficiency of the proposed data-driven method compared to interpolation-based methods over a wide range of kernels. Though the method is not optimized for any special kernel, benchmark experiments for the Coulomb kernel show that the proposed general-purpose algorithm offers competitive performance for hierarchical matrix construction compared to several state-of-the-art algorithms for the Coulomb kernel.

In this paper, we introduce a high-level controller synthesis framework that enables teams of heterogeneous agents to assist each other in resolving environmental conflicts that appear at runtime. This conflict resolution method is built upon temporal-logic-based reactive synthesis to guarantee safety and task completion under specific environment assumptions. In heterogeneous multi-agent systems, every agent is expected to complete its own tasks in service of a global team objective. However, at runtime, an agent may encounter un-modeled obstacles (e.g., doors or walls) that prevent it from achieving its own task. To address this problem, we take advantage of the capability of other heterogeneous agents to resolve the obstacle. A controller framework is proposed to redirect agents with the capability of resolving the appropriate obstacles to the required target when such a situation is detected. A set of case studies involving a bipedal robot Digit and a quadcopter are used to evaluate the controller performance in action. Additionally, we implement the proposed framework on a physical multi-agent robotic system to demonstrate its viability for real world applications.

High-dimensional parabolic partial integro-differential equations (PIDEs) appear in many applications in insurance and finance. Existing numerical methods suffer from the curse of dimensionality or provide solutions only for a given space-time point. This gave rise to a growing literature on deep learning based methods for solving partial differential equations; results for integro-differential equations on the other hand are scarce. In this paper we consider an extension of the deep splitting scheme due to arXiv:1907.03452 and arXiv:2006.01496v3 to PIDEs. Our main contribution is an analysis of the approximation error which yields convergence rates in terms of the number of neurons for shallow neural networks. Moreover we discuss several test case studies to show the viability of our approach.

Training foundation models, such as GPT-3 and PaLM, can be extremely expensive, often involving tens of thousands of GPUs running continuously for months. These models are typically trained in specialized clusters featuring fast, homogeneous interconnects and using carefully designed software systems that support both data parallelism and model/pipeline parallelism. Such dedicated clusters can be costly and difficult to obtain. Can we instead leverage the much greater amount of decentralized, heterogeneous, and lower-bandwidth interconnected compute? Previous works examining the heterogeneous, decentralized setting focus on relatively small models that can be trained in a purely data parallel manner. State-of-the-art schemes for model parallel foundation model training, such as Megatron, only consider the homogeneous data center setting. In this paper, we present the first study of training large foundation models with model parallelism in a decentralized regime over a heterogeneous network. Our key technical contribution is a scheduling algorithm that allocates different computational "tasklets" in the training of foundation models to a group of decentralized GPU devices connected by a slow heterogeneous network. We provide a formal cost model and further propose an efficient evolutionary algorithm to find the optimal allocation strategy. We conduct extensive experiments that represent different scenarios for learning over geo-distributed devices simulated using real-world network measurements. In the most extreme case, across 8 different cities spanning 3 continents, our approach is 4.8X faster than prior state-of-the-art training systems (Megatron).

北京阿比特科技有限公司