亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

In this paper, we aim to explore the use of uplink semantic communications with the assistance of UAV in order to improve data collection effiicency for metaverse users in remote areas. To reduce the time for uplink data collection while balancing the trade-off between reconstruction quality and computational energy cost, we propose a hybrid action reinforcement learning (RL) framework to make decisions on semantic model scale, channel allocation, transmission power, and UAV trajectory. The variables are classified into discrete type and continuous type, which are optimized by two different RL agents to generate the combined action. Simulation results indicate that the proposed hybrid action reinforcement learning framework can effectively improve the efficiency of uplink semantic data collection under different parameter settings and outperforms the benchmark scenarios.

相關內容

In this paper we tackle the problem of persistently covering a complex non-convex environment with a team of robots. We consider scenarios where the coverage quality of the environment deteriorates with time, requiring to constantly revisit every point. As a first step, our solution finds a partition of the environment where the amount of work for each robot, weighted by the importance of each point, is equal. This is achieved using a power diagram and finding an equitable partition through a provably correct distributed control law on the power weights. Compared to other existing partitioning methods, our solution considers a continuous environment formulation with non-convex obstacles. In the second step, each robot computes a graph that gathers sweep-like paths and covers its entire partition. At each planning time, the coverage error at the graph vertices is assigned as weights of the corresponding edges. Then, our solution is capable of efficiently finding the optimal open coverage path through the graph with respect to the coverage error per distance traversed. Simulation and experimental results are presented to support our proposal.

In this paper, we present significant advancements in the pretraining of Mistral 7B, a large-scale language model, using a dataset of 32.6 GB, equivalent to 1.1 billion tokens. We explore the impact of extending the context length, releasing models with context lengths of 4096 and 32768 tokens, and further refining performance with a specialized 16384 context length instruction-tuned model, we called it Malaysian Mistral. Our experiments demonstrate the efficacy of continue pretraining and the influence of extended context lengths on Mistral 7B's language understanding capabilities. Additionally, we release a model specifically tuned with a 16384 context length instruction, showcasing its potential for capturing nuanced language intricacies. Furthermore, our research contributes to the benchmarking of Malaysian Mistral against prominent language models, including ChatGPT3.5 and Claude 2. We present compelling results indicating Malaysian Mistral's superior performance on Tatabahasa (Malay grammar) test set, particularly when fine-tuned with instructions. All models released at //huggingface.co/collections/mesolitica/malaysian-mistral-7b-6528f2ec825f4bba46c1700c

In this paper, we present a novel deep image clustering approach termed PICI, which enforces the partial information discrimination and the cross-level interaction in a joint learning framework. In particular, we leverage a Transformer encoder as the backbone, through which the masked image modeling with two paralleled augmented views is formulated. After deriving the class tokens from the masked images by the Transformer encoder, three partial information learning modules are further incorporated, including the PISD module for training the auto-encoder via masked image reconstruction, the PICD module for employing two levels of contrastive learning, and the CLI module for mutual interaction between the instance-level and cluster-level subspaces. Extensive experiments have been conducted on six real-world image datasets, which demononstrate the superior clustering performance of the proposed PICI approach over the state-of-the-art deep clustering approaches. The source code is available at //github.com/Regan-Zhang/PICI.

In this paper, we propose and study construction of confidence bands for shape-constrained regression functions when the predictor is multivariate. In particular, we consider the continuous multidimensional white noise model given by $d Y(\mathbf{t}) = n^{1/2} f(\mathbf{t}) \,d\mathbf{t} + d W(\mathbf{t})$, where $Y$ is the observed stochastic process on $[0,1]^d$ ($d\ge 1$), $W$ is the standard Brownian sheet on $[0,1]^d$, and $f$ is the unknown function of interest assumed to belong to a (shape-constrained) function class, e.g., coordinate-wise monotone functions or convex functions. The constructed confidence bands are based on local kernel averaging with bandwidth chosen automatically via a multivariate multiscale statistic. The confidence bands have guaranteed coverage for every $n$ and for every member of the underlying function class. Under monotonicity/convexity constraints on $f$, the proposed confidence bands automatically adapt (in terms of width) to the global and local (H\"{o}lder) smoothness and intrinsic dimensionality of the unknown $f$; the bands are also shown to be optimal in a certain sense. These bands have (almost) parametric ($n^{-1/2}$) widths when the underlying function has ``low-complexity'' (e.g., piecewise constant/affine).

In this paper, we propose the use of self-supervised pretraining on a large unlabelled data set to improve the performance of a personalized voice activity detection (VAD) model in adverse conditions. We pretrain a long short-term memory (LSTM)-encoder using the autoregressive predictive coding (APC) framework and fine-tune it for personalized VAD. We also propose a denoising variant of APC, with the goal of improving the robustness of personalized VAD. The trained models are systematically evaluated on both clean speech and speech contaminated by various types of noise at different SNR-levels and compared to a purely supervised model. Our experiments show that self-supervised pretraining not only improves performance in clean conditions, but also yields models which are more robust to adverse conditions compared to purely supervised learning.

In this paper, we present a simultaneous target tracking and multi-user communications system realized by a full duplex holographic Multiple-Input Multiple-Output (MIMO) node equipped with Dynamic Metasurface Antennas (DMAs) at both its communication ends. Focusing on the near-field regime, we extend Fresnel's approximation to metasurfaces and devise a subspace tracking scheme with DMA-based hybrid Analog and Digital (A/D) reception as well as hybrid A/D transmission with a DMA for sum-rate maximization. The presented simulation results corroborate the efficiency of the proposed framework for various system parameters.

This paper considers the optimal sensor allocation for estimating the emission rates of multiple sources in a two-dimensional spatial domain. Locations of potential emission sources are known (e.g., factory stacks), and the number of sources is much greater than the number of sensors that can be deployed, giving rise to the optimal sensor allocation problem. In particular, we consider linear dispersion forward models, and the optimal sensor allocation is formulated as a bilevel optimization problem. The outer problem determines the optimal sensor locations by minimizing the overall Mean Squared Error of the estimated emission rates over various wind conditions, while the inner problem solves an inverse problem that estimates the emission rates. Two algorithms, including the repeated Sample Average Approximation and the Stochastic Gradient Descent based bilevel approximation, are investigated in solving the sensor allocation problem. Convergence analysis is performed to obtain the performance guarantee, and numerical examples are presented to illustrate the proposed approach.

In this paper, we formulate the multi-agent graph bandit problem as a multi-agent extension of the graph bandit problem introduced by Zhang, Johansson, and Li [CISS 57, 1-6 (2023)]. In our formulation, $N$ cooperative agents travel on a connected graph $G$ with $K$ nodes. Upon arrival at each node, agents observe a random reward drawn from a node-dependent probability distribution. The reward of the system is modeled as a weighted sum of the rewards the agents observe, where the weights capture the decreasing marginal reward associated with multiple agents sampling the same node at the same time. We propose an Upper Confidence Bound (UCB)-based learning algorithm, Multi-G-UCB, and prove that its expected regret over $T$ steps is bounded by $O(N\log(T)[\sqrt{KT} + DK])$, where $D$ is the diameter of graph $G$. Lastly, we numerically test our algorithm by comparing it to alternative methods.

We propose a novel single shot object detection network named Detection with Enriched Semantics (DES). Our motivation is to enrich the semantics of object detection features within a typical deep detector, by a semantic segmentation branch and a global activation module. The segmentation branch is supervised by weak segmentation ground-truth, i.e., no extra annotation is required. In conjunction with that, we employ a global activation module which learns relationship between channels and object classes in a self-supervised manner. Comprehensive experimental results on both PASCAL VOC and MS COCO detection datasets demonstrate the effectiveness of the proposed method. In particular, with a VGG16 based DES, we achieve an mAP of 81.7 on VOC2007 test and an mAP of 32.8 on COCO test-dev with an inference speed of 31.5 milliseconds per image on a Titan Xp GPU. With a lower resolution version, we achieve an mAP of 79.7 on VOC2007 with an inference speed of 13.0 milliseconds per image.

In this paper, we propose the joint learning attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on the use of either model exist (e.g., for the task of image captioning), training such existing network architectures typically require pre-defined label sequences. For multi-label classification, it would be desirable to have a robust inference process, so that the prediction error would not propagate and thus affect the performance. Our proposed model uniquely integrates attention and Long Short Term Memory (LSTM) models, which not only addresses the above problem but also allows one to identify visual objects of interests with varying sizes without the prior knowledge of particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, prediction of multiple labels can be efficiently achieved by our proposed network model.

北京阿比特科技有限公司