
Extended reality (XR) applications often perform resource-intensive tasks that are computed remotely, which makes latency a critical concern. To this end, this paper shows that by leveraging the power of the central cloud (CC), the close proximity of edge computers (ECs), and the flexibility of uncrewed aerial vehicles (UAVs), a UAV-aided hybrid cloud/mobile-edge computing architecture promises to handle the intricate requirements of future XR applications. In this context, this paper distinguishes between two types of XR devices, namely, strong and weak devices. The paper then introduces a cooperative non-orthogonal multiple access (Co-NOMA) scheme that pairs strong and weak devices so as to support the XR devices' quality of user experience by intelligently selecting either the direct or the relay links toward the weak XR devices. A sum logarithmic-rate maximization problem is thus formulated to jointly determine the computation and communication resources and the link-selection strategy, as a means to strike a trade-off between system throughput and fairness. Subject to realistic network constraints, e.g., power consumption and delay, the optimization problem is then solved iteratively via discrete relaxations, successive convex approximation, and fractional programming, an approach that can be implemented in a distributed fashion across the network. Simulation results validate the proposed algorithms' performance in terms of log-rate maximization, delay sensitivity, scalability, and runtime. The practical distributed Co-NOMA implementation in particular is shown to offer appreciable benefits over traditional multiple access and NOMA methods, highlighting its applicability in decentralized XR systems.
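
The throughput/fairness trade-off behind a sum logarithmic-rate objective can be illustrated with a toy comparison. The sketch below is not the paper's optimization problem: the function names and the two rate allocations are hypothetical, chosen only to show that a sum-log objective penalizes starving one device even when the plain sum rate is higher.

```python
import math

def sum_rate(rates):
    """Plain throughput objective: total rate across devices."""
    return sum(rates)

def sum_log_rate(rates):
    """Sum of log-rates: rewards balanced allocations (proportional fairness)."""
    return sum(math.log(r) for r in rates)

# Hypothetical per-device rates (arbitrary units), for illustration only.
unbalanced = [10.0, 0.1]   # strong device favored, weak device nearly starved
balanced = [6.0, 3.0]      # lower total, but both devices are served

# The unbalanced allocation wins on total throughput ...
print(sum_rate(unbalanced) > sum_rate(balanced))
# ... while the balanced allocation wins on the sum-log objective.
print(sum_log_rate(balanced) > sum_log_rate(unbalanced))
```

Because the logarithm grows slowly for large rates and drops sharply near zero, maximizing the sum of log-rates discourages allocations that leave a weak device with almost no rate.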

Related content

It is common to model a deterministic response function, such as the output of a computer experiment, as a Gaussian process with a Matérn covariance kernel. The smoothness parameter of a Matérn kernel determines many important properties of the model in the large data limit, including the rate of convergence of the conditional mean to the response function. We prove that the maximum likelihood estimate of the smoothness parameter cannot asymptotically undersmooth the truth when the data are obtained on a fixed bounded subset of $\mathbb{R}^d$. That is, if the data-generating response function has Sobolev smoothness $\nu_0 > d/2$, then the smoothness parameter estimate cannot be asymptotically less than $\nu_0$. The lower bound is sharp. Additionally, we show that maximum likelihood estimation recovers the true smoothness for a class of compactly supported self-similar functions. For cross-validation we prove an asymptotic lower bound $\nu_0 - d/2$, which however is unlikely to be sharp. The results are based on approximation theory in Sobolev spaces and some general theorems that restrict the set of values that the parameter estimators can take.
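
To make the role of the smoothness parameter concrete, here is a minimal sketch of the Matérn kernel using its standard closed forms for the half-integer values $\nu \in \{1/2, 3/2, 5/2\}$. The function name, the unit length scale, and the unit variance are illustrative assumptions; the paper's estimation procedure is not reproduced here.

```python
import math

def matern(r, nu, length_scale=1.0, variance=1.0):
    """Matérn covariance at distance r >= 0 for nu in {0.5, 1.5, 2.5}."""
    if r < 0:
        raise ValueError("distance r must be non-negative")
    s = r / length_scale
    if nu == 0.5:
        rate, poly = 1.0, 1.0                      # exponential kernel
    elif nu == 1.5:
        rate = math.sqrt(3.0)
        poly = 1.0 + rate * s
    elif nu == 2.5:
        rate = math.sqrt(5.0)
        poly = 1.0 + rate * s + 5.0 * s * s / 3.0
    else:
        raise ValueError("only nu in {0.5, 1.5, 2.5} is implemented in this sketch")
    return variance * poly * math.exp(-rate * s)

# At a short distance, a larger nu keeps the correlation closer to 1,
# which corresponds to smoother sample paths.
for nu in (0.5, 1.5, 2.5):
    print(nu, matern(0.1, nu))
```

For general $\nu$ the kernel involves a modified Bessel function of the second kind; the half-integer cases are used here only because they reduce to elementary functions.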

The increasing demand for latency-sensitive applications has necessitated the development of sophisticated algorithms that efficiently manage packets with end-to-end delay targets traversing the networked infrastructure. Network components must consider minimizing the packets' end-to-end delay violation probabilities (DVP) as a guiding principle throughout the transmission path to ensure timely deliveries. Active queue management (AQM) schemes are commonly used to mitigate congestion by dropping packets and controlling queuing delay. Today's established AQM schemes are threshold-driven: they identify congestion and trigger packet dropping using predefined criteria that are unaware of packets' DVPs. In this work, we propose a novel framework, Delta, that combines end-to-end delay characterization with AQM for minimizing DVP. In a queuing-theoretic environment, we show that such a policy is feasible by utilizing a data-driven approach to predict the queued packets' DVPs. That enables Delta AQM to effectively handle links with arbitrary stationary service time processes. The implementation is described in detail, and its performance is evaluated and compared with state-of-the-art AQM algorithms. Our results show that Delta outperforms current AQM schemes substantially, in particular in scenarios where high reliability, i.e., high quantiles of the tail latency distribution, is of interest.

Deploying Large Language Models (LLMs) in streaming applications that involve long contexts, particularly for extended dialogues and text analysis, is of paramount importance but presents two significant challenges. First, memory consumption is substantial during the decoding phase due to the caching of Key and Value states (KV) of previous tokens. Second, attention computation is time-consuming, with a time complexity of $O(n^2)$ for the generation of each token. At the recent OpenAI DevDay (Nov 6, 2023), OpenAI released a new model that is able to support a 128K-long document. In our paper, we focus on the memory-efficiency issue when the context length $n$ is much greater than 128K ($n \gg 2^d$). Considering a single-layer self-attention with Query, Key, and Value matrices $Q, K, V \in \mathbb{R}^{n \times d}$, the polynomial method approximates the attention output $T \in \mathbb{R}^{n \times d}$. It accomplishes this by constructing $U_1, U_2 \in \mathbb{R}^{n \times t}$ to expedite the attention ${\sf Attn}(Q, K, V)$ computation to within $n^{1+o(1)}$ time. Despite this, storing the Key and Value matrices $K, V \in \mathbb{R}^{n \times d}$ still necessitates $O( n d)$ space, leading to significant memory usage. In response to these challenges, we introduce a new algorithm that makes only one pass over the data in a streaming fashion. This method employs sublinear space $o(n)$ to store three sketch matrices, alleviating the need for exact $K, V$ storage. Notably, our algorithm exhibits exceptional memory efficiency on super-long token sequences. As the token length $n$ increases, our error guarantee diminishes while the memory usage remains nearly constant. This unique attribute underscores the potential of our technique in efficiently handling LLMs in streaming applications.
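
The $O(nd)$ storage cost of exact Key and Value matrices can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the function name, the dimension $d = 128$, and the 2-byte (fp16) value size are assumptions rather than figures from the paper, and it counts a single attention layer.

```python
def kv_cache_bytes(n_tokens, d, bytes_per_value=2):
    """Bytes needed to store exact K and V matrices, each of shape (n_tokens, d)."""
    return 2 * n_tokens * d * bytes_per_value

# Assumed setting: 128K tokens, d = 128, fp16 values, one layer.
n = 128 * 1024
size = kv_cache_bytes(n, d=128)
print(size / 2**20, "MiB")  # 64.0 MiB under these assumptions

# Exact storage grows linearly with the context length n.
print(kv_cache_bytes(10 * n, d=128) / size)  # 10.0
```

This linear growth in $n$ is the cost that a sublinear-space sketch is meant to avoid.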

Many modern applications rely on heterogeneous tabular data, which remains challenging for regression and classification. Many approaches have been proposed to adapt neural networks to this task, but boosting and bagging of decision trees are still the best-performing methods. In this paper, we show that a binomially initialized neural network can be used effectively on tabular data. The proposed approach is a simple but effective way of initializing the first hidden layer in neural networks. We also show that this initialization scheme can be used to jointly train ensembles by adding gradient masking to batch entries and using the binomial initialization for the last layer in a neural network. For this purpose, we modified the hinge binary loss and the softmax loss to make them applicable to joint ensemble training. We evaluate our approach on multiple public datasets and showcase the improved performance compared to other neural network-based approaches. In addition, we discuss the limitations of our approach and possible further research for improving the applicability of neural networks to tabular data. Link: //es-cloud.cs.uni-tuebingen.de/d/8e2ab8c3fdd444e1a135/?p=%2FInitializationNeuronalNetworksTabularData&mode=list

Linear real-valued computations over distributed datasets are common in many applications, most notably as part of machine learning inference. In particular, linear computations that are quantized, i.e., where the coefficients are restricted to a predetermined set of values (such as $\pm 1$), have gained increasing interest lately due to their role in efficient, robust, or private machine learning models. Given a dataset to store in a distributed system, we wish to encode it so that all such computations could be conducted by accessing a small number of servers, called the access parameter of the system. Doing so frees the remaining servers to execute other tasks. Minimizing the access parameter gives rise to an access-redundancy tradeoff, where a smaller access parameter requires more redundancy in the system, and vice versa. In this paper, we study this tradeoff and provide several explicit low-access schemes for $\{\pm1\}$ quantized linear computations based on covering codes in a novel way. While the connection to covering codes has been observed in the past, our results strictly outperform the state-of-the-art for two-valued linear computations. We further show that the same storage scheme can be used to retrieve any linear combination with two distinct coefficients -- regardless of what those coefficients are -- with the same access parameter. This universality result is then extended to all possible quantizations with any number of values; while the storage remains identical, the access parameter increases according to a new additive-combinatorics property we call coefficient complexity. We then turn to study the coefficient complexity -- we characterize the complexity of small sets of coefficients, provide bounds, and identify coefficient sets having the highest and lowest complexity.

Reasoning is a fundamental aspect of human intelligence that plays a crucial role in activities such as problem solving, decision making, and critical thinking. In recent years, large language models (LLMs) have made significant progress in natural language processing, and it has been observed that these models may exhibit reasoning abilities when they are sufficiently large. However, it is not yet clear to what extent LLMs are capable of reasoning. This paper provides a comprehensive overview of the current state of knowledge on reasoning in LLMs, including techniques for improving and eliciting reasoning in these models, methods and benchmarks for evaluating reasoning abilities, findings and implications of previous research in this field, and suggestions on future directions. Our aim is to provide a detailed and up-to-date review of this topic and stimulate meaningful discussion and future work.

The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcoming the limitations of purely data-driven approaches, and eventually to increasing the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction, and conformity. Special attention is given to applications in the field of autonomous driving.

Current models for event causality identification (ECI) mainly adopt a supervised framework, which relies heavily on labeled data for training. Unfortunately, the scale of current annotated datasets is relatively limited and cannot provide sufficient support for models to capture useful indicators from causal statements, especially when handling new, unseen cases. To alleviate this problem, we propose a novel approach, called CauSeRL, which leverages external causal statements for event causality identification. First, we design a self-supervised framework to learn context-specific causal patterns from external causal statements. Then, we adopt a contrastive transfer strategy to incorporate the learned context-specific causal patterns into the target ECI model. Experimental results show that our method significantly outperforms previous methods on EventStoryLine and Causal-TimeBank (+2.0 and +3.4 points on F1 value, respectively).

Visual dialogue is a challenging task that requires extracting implicit information from both visual (image) and textual (dialogue history) contexts. Classical approaches pay more attention to the integration of the current question, vision knowledge, and text knowledge, ignoring the heterogeneous semantic gaps between the cross-modal information. Meanwhile, the concatenation operation has become the de facto standard for cross-modal information fusion, although it has a limited ability in information retrieval. In this paper, we propose a novel Knowledge-Bridge Graph Network (KBGN) model that uses a graph to bridge the cross-modal semantic relations between vision and text knowledge at fine granularity and retrieves the required knowledge via an adaptive information selection mode. Moreover, the reasoning clues for visual dialogue can be clearly drawn from intra-modal entities and inter-modal bridges. Experimental results on the VisDial v1.0 and VisDial-Q datasets demonstrate that our model outperforms existing models with state-of-the-art results.

The problem of Multiple Object Tracking (MOT) consists in following the trajectory of different objects in a sequence, usually a video. In recent years, with the rise of Deep Learning, the algorithms that provide a solution to this problem have benefited from the representational power of deep models. This paper provides a comprehensive survey on works that employ Deep Learning models to solve the task of MOT on single-camera videos. Four main steps in MOT algorithms are identified, and an in-depth review of how Deep Learning was employed in each one of these stages is presented. A complete experimental comparison of the presented works on the three MOTChallenge datasets is also provided, identifying a number of similarities among the top-performing methods and presenting some possible future research directions.
