亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We consider Time-to-Live (TTL) caches that tag every object in cache with a specific (and possibly renewable) expiration time. State-of-the-art models for TTL caches assume zero object fetch delay, i.e., the time required to fetch a requested object that is not in cache from a different cache or the origin server. Particularly, in cache hierarchies, this delay has a significant impact on performance metrics such as the object hit probability. Recent work suggests that the impact of the object fetch delay on the cache performance will continue to increase due to the scaling mismatch between shrinking inter-request times (due to higher data center link rates) in contrast to processing and memory access times. In this paper, we analyze tree-based cache hierarchies with random object fetch delays and provide an exact analysis of the corresponding object hit probability. Our analysis allows understanding the impact of random delays and TTLs on cache metrics for a wide class of request stream models characterized through Markov arrival processes. This is expressed through a metric that we denote delay impairment of the hit probability. In addition, we analyze and extend state-of-the-art approximations of the hit probability to take the delay into account. We provide numerical and trace-based simulation-based evaluation results showing that larger TTLs do not efficiently compensate for the detrimental effect of object fetch delays. Our evaluations also show that unlike our exact model the state-of-the-art approximations do not capture the impact of the object fetch delay well especially for cache hierarchies. Surprisingly, we show that the impact of this delay on the hit probability is not monotonic but depends on the request stream properties, as well as, the TTL.

相關內容

簡稱 哈工大,創建于1920年,是C9聯盟成員之一,國內工科頂尖高校。1999年成為首批九所985工程院校之一,校訓是“規格嚴格,功夫到家”。

As mobile edge computing (MEC) finds widespread use for relieving the computational burden of compute- and interaction-intensive applications on end user devices, understanding the resulting delay and cost performance is drawing significant attention. While most existing works focus on singletask offloading in single-hop MEC networks, next generation applications (e.g., industrial automation, augmented/virtual reality) require advance models and algorithms for dynamic configuration of multi-task services over multi-hop MEC networks. In this work, we leverage recent advances in dynamic cloud network control to provide a comprehensive study of the performance of multi-hop MEC networks, addressing the key problems of multi-task offloading, timely packet scheduling, and joint computation and communication resource allocation. We present a fully distributed algorithm based on Lyapunov control theory that achieves throughput-optimal performance with delay and cost guarantees. Simulation results validate our theoretical analysis and provide insightful guidelines on the interplay between communication and computation resources in MEC networks.

Safety is critical in autonomous robotic systems. A safe control law ensures forward invariance of a safe set (a subset in the state space). It has been extensively studied regarding how to derive a safe control law with a control-affine analytical dynamic model. However, in complex environments and tasks, it is challenging and time-consuming to obtain a principled analytical model of the system. In these situations, data-driven learning is extensively used and the learned models are encoded in neural networks. How to formally derive a safe control law with Neural Network Dynamic Models (NNDM) remains unclear due to the lack of computationally tractable methods to deal with these black-box functions. In fact, even finding the control that minimizes an objective for NNDM without any safety constraint is still challenging. In this work, we propose MIND-SIS (Mixed Integer for Neural network Dynamic model with Safety Index Synthesis), the first method to derive safe control laws for NNDM. The method includes two parts: 1) SIS: an algorithm for the offline synthesis of the safety index (also called as barrier function), which uses evolutionary methods and 2) MIND: an algorithm for online computation of the optimal and safe control signal, which solves a constrained optimization using a computationally efficient encoding of neural networks. It has been theoretically proved that MIND-SIS guarantees forward invariance and finite convergence. And it has been numerically validated that MIND-SIS achieves safe and optimal control of NNDM. From our experiments, the optimality gap is less than $10^{-8}$, and the safety constraint violation is $0$.

Transformers have achieved state-of-the-art results across multiple NLP tasks. However, the self-attention mechanism complexity scales quadratically with the sequence length, creating an obstacle for tasks involving long sequences, like in the speech domain. In this paper, we discuss the usefulness of self-attention for Direct Speech Translation. First, we analyze the layer-wise token contributions in the self-attention of the encoder, unveiling local diagonal patterns. To prove that some attention weights are avoidable, we propose to substitute the standard self-attention with a local efficient one, setting the amount of context used based on the results of the analysis. With this approach, our model matches the baseline performance, and improves the efficiency by skipping the computation of those weights that standard attention discards.

We study online convex optimization with switching costs, a practically important but also extremely challenging problem due to the lack of complete offline information. By tapping into the power of machine learning (ML) based optimizers, ML-augmented online algorithms (also referred to as expert calibration in this paper) have been emerging as state of the art, with provable worst-case performance guarantees. Nonetheless, by using the standard practice of training an ML model as a standalone optimizer and plugging it into an ML-augmented algorithm, the average cost performance can be even worse than purely using ML predictions. In order to address the "how to learn" challenge, we propose EC-L2O (expert-calibrated learning to optimize), which trains an ML-based optimizer by explicitly taking into account the downstream expert calibrator. To accomplish this, we propose a new differentiable expert calibrator that generalizes regularized online balanced descent and offers a provably better competitive ratio than pure ML predictions when the prediction error is large. For training, our loss function is a weighted sum of two different losses -- one minimizing the average ML prediction error for better robustness, and the other one minimizing the post-calibration average cost. We also provide theoretical analysis for EC-L2O, highlighting that expert calibration can be even beneficial for the average cost performance and that the high-percentile tail ratio of the cost achieved by EC-L2O to that of the offline optimal oracle (i.e., tail cost ratio) can be bounded. Finally, we test EC-L2O by running simulations for sustainable datacenter demand response. Our results demonstrate that EC-L2O can empirically achieve a lower average cost as well as a lower competitive ratio than the existing baseline algorithms.

Machine learning (ML) inference is a real-time workload that must comply with strict Service Level Objectives (SLOs), including latency and accuracy targets. Unfortunately, ensuring that SLOs are not violated in inference-serving systems is challenging due to inherent model accuracy-latency tradeoffs, SLO diversity across and within application domains, evolution of SLOs over time, unpredictable query patterns, and co-location interference. In this paper, we observe that neural networks exhibit high degrees of per-input activation sparsity during inference. . Thus, we propose SLO-Aware Neural Networks which dynamically drop out nodes per-inference query, thereby tuning the amount of computation performed, according to specified SLO optimization targets and machine utilization. SLO-Aware Neural Networks achieve average speedups of $1.3-56.7\times$ with little to no accuracy loss (less than 0.3%). When accuracy constrained, SLO-Aware Neural Networks are able to serve a range of accuracy targets at low latency with the same trained model. When latency constrained, SLO-Aware Neural Networks can proactively alleviate latency degradation from co-location interference while maintaining high accuracy to meet latency constraints.

Many forms of dependence manifest themselves over time, with behavior of variables in dynamical systems as a paradigmatic example. This paper studies temporal dependence in dynamical systems from a logical perspective, by extending a minimal modal base logic of static functional dependencies. We define a logic for dynamical systems with single time steps, provide a complete axiomatic proof calculus, and show the decidability of the satisfiability problem for a substantial fragment. The system comes in two guises: modal and first-order, that naturally complement each other. Next, we consider a timed semantics for our logic, as an intermediate between state spaces and temporal universes for the unfoldings of a dynamical system. We prove completeness and decidability by combining techniques from dynamic-epistemic logic and modal logic of functional dependencies with complex terms for objects. Also, we extend these results to the timed logic with functional symbols and term identity. Finally, we conclude with a brief outlook on how the system proposed here connects with richer temporal logics of system behavior, and with dynamic topological logic.

The best neural architecture for a given machine learning problem depends on many factors: not only the complexity and structure of the dataset, but also on resource constraints including latency, compute, energy consumption, etc. Neural architecture search (NAS) for tabular datasets is an important but under-explored problem. Previous NAS algorithms designed for image search spaces incorporate resource constraints directly into the reinforcement learning rewards. In this paper, we argue that search spaces for tabular NAS pose considerable challenges for these existing reward-shaping methods, and propose a new reinforcement learning (RL) controller to address these challenges. Motivated by rejection sampling, when we sample candidate architectures during a search, we immediately discard any architecture that violates our resource constraints. We use a Monte-Carlo-based correction to our RL policy gradient update to account for this extra filtering step. Results on several tabular datasets show TabNAS, the proposed approach, efficiently finds high-quality models that satisfy the given resource constraints.

We demonstrate that merely analog transmissions and match filtering can realize the function of an edge server in federated learning (FL). Therefore, a network with massively distributed user equipments (UEs) can achieve large-scale FL without an edge server. We also develop a training algorithm that allows UEs to continuously perform local computing without being interrupted by the global parameter uploading, which exploits the full potential of UEs' processing power. We derive convergence rates for the proposed schemes to quantify their training efficiency. The analyses reveal that when the interference obeys a Gaussian distribution, the proposed algorithm retrieves the convergence rate of a server-based FL. But if the interference distribution is heavy-tailed, then the heavier the tail, the slower the algorithm converges. Nonetheless, the system run time can be largely reduced by enabling computation in parallel with communication, whereas the gain is particularly pronounced when communication latency is high. These findings are corroborated via excessive simulations.

Universal coding of integers~(UCI) is a class of variable-length code, such that the ratio of the expected codeword length to $\max\{1,H(P)\}$ is within a constant factor, where $H(P)$ is the Shannon entropy of the decreasing probability distribution $P$. However, if we consider the ratio of the expected codeword length to $H(P)$, the ratio tends to infinity by using UCI, when $H(P)$ tends to zero. To solve this issue, this paper introduces a class of codes, termed generalized universal coding of integers~(GUCI), such that the ratio of the expected codeword length to $H(P)$ is within a constant factor $K$. First, the definition of GUCI is proposed and the coding structure of GUCI is introduced. Next, we propose a class of GUCI $\mathcal{C}$ to achieve the expansion factor $K_{\mathcal{C}}=2$ and show that the optimal GUCI is in the range $1\leq K_{\mathcal{C}}^{*}\leq 2$. Then, by comparing UCI and GUCI, we show that when the entropy is very large or $P(0)$ is not large, there are also cases where the average codeword length of GUCI is shorter. Finally, the asymptotically optimal GUCI is presented.

Present-day atomistic simulations generate long trajectories of ever more complex systems. Analyzing these data, discovering metastable states, and uncovering their nature is becoming increasingly challenging. In this paper, we first use the variational approach to conformation dynamics to discover the slowest dynamical modes of the simulations. This allows the different metastable states of the system to be located and organized hierarchically. The physical descriptors that characterize metastable states are discovered by means of a machine learning method. We show in the cases of two proteins, Chignolin and Bovine Pancreatic Trypsin Inhibitor, how such analysis can be effortlessly performed in a matter of seconds. Another strength of our approach is that it can be applied to the analysis of both unbiased and biased simulations.

北京阿比特科技有限公司