亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We consider online convex optimization (OCO) over a heterogeneous network with communication delay, where multiple workers together with a master execute a sequence of decisions to minimize the accumulation of time-varying global costs. The local data may not be independent or identically distributed, and the global cost functions may not be locally separable. Due to communication delay, neither the master nor the workers have in-time information about the current global cost function. We propose a new algorithm, termed Hierarchical OCO (HiOCO), which takes full advantage of the network heterogeneity in information timeliness and computation capacity to enable multi-step gradient descent at both the workers and the master. We analyze the impacts of the unique hierarchical architecture, multi-slot delay, and gradient estimation error to derive upper bounds on the dynamic regret of HiOCO, which measures the gap of costs between HiOCO and an offline globally optimal performance benchmark.

相關內容

We consider the problem of offline reinforcement learning with model-based control, whose goal is to learn a dynamics model from the experience replay and obtain a pessimism-oriented agent under the learned model. Current model-based constraint includes explicit uncertainty penalty and implicit conservative regularization that pushes Q-values of out-of-distribution state-action pairs down and the in-distribution up. While the uncertainty estimation, on which the former relies on, can be loosely calibrated for complex dynamics, the latter performs slightly better. To extend the basic idea of regularization without uncertainty quantification, we propose distributionally robust offline model-based policy optimization (DROMO), which leverages the ideas in distributionally robust optimization to penalize a broader range of out-of-distribution state-action pairs beyond the standard empirical out-of-distribution Q-value minimization. We theoretically show that our method optimizes a lower bound on the ground-truth policy evaluation, and it can be incorporated into any existing policy gradient algorithms. We also analyze the theoretical properties of DROMO's linear and non-linear instantiations.

Drift control is significant to the safety of autonomous vehicles when there is a sudden loss of traction due to external conditions such as rain or snow. It is a challenging control problem due to the presence of significant sideslip and nearly full saturation of the tires. In this paper, we focus on the control of drift maneuvers following circular paths with either fixed or moving centers, subject to change in the tire-ground interaction, which are common training tasks for drift enthusiasts and can therefore be used as benchmarks of the performance of drift control. In order to achieve the above tasks, we propose a novel hierarchical control architecture which decouples the curvature and center control of the trajectory. In particular, an outer loop stabilizes the center by tuning the target curvature, and an inner loop tracks the curvature using a feedforward/feedback controller enhanced by an $\mathcal{L}_1$ adaptive component. The hierarchical architecture is flexible because the inner loop is task-agnostic and adaptive to changes in tire-road interaction, which allows the outer loop to be designed independent of low-level dynamics, opening up the possibility of incorporating sophisticated planning algorithms. We implement our control strategy on a simulation platform as well as on a 1/10 scale Radio-Control~(RC) car, and both the simulation and experiment results illustrate the effectiveness of our strategy in achieving the above described set of drift maneuvering tasks.

We propose a hierarchical version of dual averaging for zeroth-order online non-convex optimization - i.e., learning processes where, at each stage, the optimizer is facing an unknown non-convex loss function and only receives the incurred loss as feedback. The proposed class of policies relies on the construction of an online model that aggregates loss information as it arrives, and it consists of two principal components: (a) a regularizer adapted to the Fisher information metric (as opposed to the metric norm of the ambient space); and (b) a principled exploration of the problem's state space based on an adapted hierarchical schedule. This construction enables sharper control of the model's bias and variance, and allows us to derive tight bounds for both the learner's static and dynamic regret - i.e., the regret incurred against the best dynamic policy in hindsight over the horizon of play.

Recent studies indicate that hierarchical Vision Transformer with a macro architecture of interleaved non-overlapped window-based self-attention \& shifted-window operation is able to achieve state-of-the-art performance in various visual recognition tasks, and challenges the ubiquitous convolutional neural networks (CNNs) using densely slid kernels. Most follow-up works attempt to replace the shifted-window operation with other kinds of cross-window communication paradigms, while treating self-attention as the de-facto standard for window-based information aggregation. In this manuscript, we question whether self-attention is the only choice for hierarchical Vision Transformer to attain strong performance, and the effects of different kinds of cross-window communication. To this end, we replace self-attention layers with embarrassingly simple linear mapping layers, and the resulting proof-of-concept architecture termed as LinMapper can achieve very strong performance in ImageNet-1k image recognition. Moreover, we find that LinMapper is able to better leverage the pre-trained representations from image recognition and demonstrates excellent transfer learning properties on downstream dense prediction tasks such as object detection and instance segmentation. We also experiment with other alternatives to self-attention for content aggregation inside each non-overlapped window under different cross-window communication approaches, which all give similar competitive results. Our study reveals that the \textbf{macro architecture} of Swin model families, other than specific aggregation layers or specific means of cross-window communication, may be more responsible for its strong performance and is the real challenger to the ubiquitous CNN's dense sliding window paradigm. Code and models will be publicly available to facilitate future research.

Many meta-learning approaches for few-shot learning rely on simple base learners such as nearest-neighbor classifiers. However, even in the few-shot regime, discriminatively trained linear predictors can offer better generalization. We propose to use these predictors as base learners to learn representations for few-shot learning and show they offer better tradeoffs between feature size and performance across a range of few-shot recognition benchmarks. Our objective is to learn feature embeddings that generalize well under a linear classification rule for novel categories. To efficiently solve the objective, we exploit two properties of linear classifiers: implicit differentiation of the optimality conditions of the convex problem and the dual formulation of the optimization problem. This allows us to use high-dimensional embeddings with improved generalization at a modest increase in computational overhead. Our approach, named MetaOptNet, achieves state-of-the-art performance on miniImageNet, tieredImageNet, CIFAR-FS, and FC100 few-shot learning benchmarks. Our code is available at //github.com/kjunelee/MetaOptNet.

Gradient-based meta-learning techniques are both widely applicable and proficient at solving challenging few-shot learning and fast adaptation problems. However, they have the practical difficulties of operating in high-dimensional parameter spaces in extreme low-data regimes. We show that it is possible to bypass these limitations by learning a low-dimensional latent generative representation of model parameters and performing gradient-based meta-learning in this space with latent embedding optimization (LEO), effectively decoupling the gradient-based adaptation procedure from the underlying high-dimensional space of model parameters. Our evaluation shows that LEO can achieve state-of-the-art performance on the competitive 5-way 1-shot miniImageNet classification task.

We propose a new method of estimation in topic models, that is not a variation on the existing simplex finding algorithms, and that estimates the number of topics K from the observed data. We derive new finite sample minimax lower bounds for the estimation of A, as well as new upper bounds for our proposed estimator. We describe the scenarios where our estimator is minimax adaptive. Our finite sample analysis is valid for any number of documents (n), individual document length (N_i), dictionary size (p) and number of topics (K), and both p and K are allowed to increase with n, a situation not handled well by previous analyses. We complement our theoretical results with a detailed simulation study. We illustrate that the new algorithm is faster and more accurate than the current ones, although we start out with a computational and theoretical disadvantage of not knowing the correct number of topics K, while we provide the competing methods with the correct value in our simulations.

Reinforcement learning and symbolic planning have both been used to build intelligent autonomous agents. Reinforcement learning relies on learning from interactions with real world, which often requires an unfeasibly large amount of experience. Symbolic planning relies on manually crafted symbolic knowledge, which may not be robust to domain uncertainties and changes. In this paper we present a unified framework {\em PEORL} that integrates symbolic planning with hierarchical reinforcement learning (HRL) to cope with decision-making in a dynamic environment with uncertainties. Symbolic plans are used to guide the agent's task execution and learning, and the learned experience is fed back to symbolic knowledge to improve planning. This method leads to rapid policy search and robust symbolic plans in complex domains. The framework is tested on benchmark domains of HRL.

Current convolutional neural networks algorithms for video object tracking spend the same amount of computation for each object and video frame. However, it is harder to track an object in some frames than others, due to the varying amount of clutter, scene complexity, amount of motion, and object's distinctiveness against its background. We propose a depth-adaptive convolutional Siamese network that performs video tracking adaptively at multiple neural network depths. Parametric gating functions are trained to control the depth of the convolutional feature extractor by minimizing a joint loss of computational cost and tracking error. Our network achieves accuracy comparable to the state-of-the-art on the VOT2016 benchmark. Furthermore, our adaptive depth computation achieves higher accuracy for a given computational cost than traditional fixed-structure neural networks. The presented framework extends to other tasks that use convolutional neural networks and enables trading speed for accuracy at runtime.

北京阿比特科技有限公司