
Using a generating function approach, a computationally tractable expression is derived to predict the frame error rate at the output of a binary symmetric channel when a number of outer Reed-Solomon codes are concatenated with a number of inner Bose-Ray-Chaudhuri-Hocquenghem codes, thereby obviating the need for time-consuming Monte Carlo simulations. With (a) code performance measured via the gap to the Shannon limit, (b) decoding complexity via an estimate of the number of operations per decoded bit, and (c) decoding latency via the overall frame length, a code search is performed to determine the Pareto frontier for performance-complexity-latency trade-offs.
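As background, under the idealized assumptions that inner blocks fail independently, bounded-distance decoding is used, and each inner decoding failure corrupts exactly one outer symbol, the frame error rate reduces to two binomial tails; the generating-function expression in the paper refines this kind of computation. A minimal sketch with illustrative code parameters:

```python
# A minimal sketch (not the paper's generating-function method): frame error
# rate of an RS(N, K) outer code over t-error-correcting inner BCH(n, k)
# blocks on a BSC with crossover probability p.
from math import comb

def bch_block_error(n: int, t: int, p: float) -> float:
    """P(inner BCH block fails) = P(more than t bit errors among n bits)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(t + 1))

def rs_frame_error(N: int, T: int, p_block: float) -> float:
    """P(outer RS frame fails) = P(more than T of N symbols are corrupted)."""
    return 1.0 - sum(comb(N, j) * p_block**j * (1 - p_block)**(N - j)
                     for j in range(T + 1))

# Illustrative parameters: BCH(63, 45) correcting t = 3 bit errors, outer
# RS(255, 223) correcting T = 16 symbol errors, BSC crossover p = 1e-2.
p_inner = bch_block_error(n=63, t=3, p=1e-2)
print(f"inner block error rate: {p_inner:.3e}")
print(f"frame error rate: {rs_frame_error(N=255, T=16, p_block=p_inner):.3e}")
```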

Related Content

Impressive advances in text-to-image (T2I) generative models have yielded a plethora of high-performing models that generate aesthetically appealing, photorealistic images. Despite this progress, these models still struggle to produce images that are consistent with the input prompt, often failing to properly capture object quantities, relations, and attributes. Existing solutions for improving prompt-image consistency suffer from the following challenges: (1) they often require model fine-tuning, (2) they focus only on nearby prompt samples, and (3) they are affected by unfavorable trade-offs among image quality, representation diversity, and prompt-image consistency. In this paper, we address these challenges and introduce a T2I optimization-by-prompting framework, OPT2I, which leverages a large language model (LLM) to improve prompt-image consistency in T2I models. Our framework starts from a user prompt and iteratively generates revised prompts with the goal of maximizing a consistency score. Our extensive validation on two datasets, MSCOCO and PartiPrompts, shows that OPT2I can boost the initial consistency score by up to 24.9% in terms of DSG score while preserving the FID and increasing the recall between generated and real data. Our work paves the way toward building more reliable and robust T2I systems by harnessing the power of LLMs.
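The iterative loop can be pictured as follows. This is a minimal sketch of optimization-by-prompting in the spirit of OPT2I; `llm_revise`, `t2i_generate`, and `consistency_score` (e.g., a DSG-style question-answering scorer) are hypothetical stand-ins, not the paper's API:

```python
# Sketch of an LLM-driven prompt-optimization loop. Candidate prompts are
# generated, scored against the ORIGINAL user prompt, and fed back to the
# LLM together with their scores so it can exploit what worked before.
def optimize_prompt(user_prompt: str, llm_revise, t2i_generate,
                    consistency_score, n_iters: int = 10, n_candidates: int = 4):
    history = [(user_prompt,
                consistency_score(t2i_generate(user_prompt), user_prompt))]
    for _ in range(n_iters):
        # Ask the LLM for revised prompts, conditioning on the scored history.
        candidates = llm_revise(user_prompt, history, n=n_candidates)
        for cand in candidates:
            image = t2i_generate(cand)
            # Consistency is always measured against the original user intent.
            history.append((cand, consistency_score(image, user_prompt)))
    return max(history, key=lambda pair: pair[1])  # best (prompt, score)
```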

We propose a text-guided variational image generation method to address the challenge of obtaining clean data for anomaly detection in industrial manufacturing. Our method utilizes text information about the target object, learned from an extensive library of text documents, to generate non-defective data images resembling the input image. The proposed framework constrains the generated non-defective images to align with anticipated distributions derived from textual and image-based knowledge, ensuring stability and generality. Experimental results demonstrate the effectiveness of our approach, which surpasses previous methods even with limited non-defective data. Our approach is validated through generalization tests across four baseline models and three distinct datasets. We also present an additional analysis showing how the generated images can enhance the effectiveness of anomaly detection models.
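A heavily simplified sketch of what text-guided variational generation could look like; all modules (`image_encoder`, `text_encoder`, `decoder`) are hypothetical placeholders standing in for trained networks, not the paper's architecture:

```python
# Sketch: sample non-defective candidates from a variational posterior over
# the input image, conditioning the decoder on text-derived object knowledge.
import torch

def generate_non_defective(image, object_text, image_encoder, text_encoder,
                           decoder, n_samples: int = 8):
    mu, logvar = image_encoder(image)       # variational posterior over z
    text_emb = text_encoder(object_text)    # knowledge distilled from a text corpus
    samples = []
    for _ in range(n_samples):
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        samples.append(decoder(z, text_emb))  # text-conditioned decoding
    return samples
```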

The speed-precision trade-off is a critical problem in visual object tracking, which usually requires low latency and deployment on constrained resources. Existing solutions for efficient tracking mainly focus on adopting light-weight backbones or modules, which nevertheless come at the cost of precision. In this paper, inspired by dynamic network routing, we propose DyTrack, a dynamic transformer framework for efficient tracking. Real-world tracking scenarios exhibit diverse levels of complexity. We argue that a simple network is sufficient for easy frames in video sequences, while more computation should be assigned to difficult ones. DyTrack automatically learns to configure proper reasoning routes for various inputs, making better use of the available computational budget. It can thus achieve higher performance at the same running speed. We formulate instance-specific tracking as a sequential decision problem and attach terminating branches to intermediate layers of the model. In particular, to fully utilize the computation, we introduce a feature recycling mechanism that reuses the outputs of predecessor layers. Furthermore, a target-aware self-distillation strategy is designed to enhance the discriminative capability of early predictions by effectively mimicking the representation pattern of the deep model. Extensive experiments on multiple benchmarks demonstrate that DyTrack achieves promising speed-precision trade-offs with only a single model. For instance, DyTrack obtains 64.9% AUC on LaSOT at 256 fps.
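A minimal sketch of instance-specific early exiting with feature recycling in this spirit; the module names, the confidence test, and the recycling rule are illustrative assumptions, not the released DyTrack implementation:

```python
# Sketch: a transformer with a terminating branch after every layer. Easy
# inputs exit early; hard inputs traverse the full depth. Each layer also
# "recycles" its predecessor's output via an extra residual connection.
import torch
import torch.nn as nn

class EarlyExitTracker(nn.Module):
    def __init__(self, dim=256, n_layers=8, n_heads=8, threshold=0.9):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
            for _ in range(n_layers))
        # One terminating branch (box head + confidence head) per layer.
        self.box_heads = nn.ModuleList(nn.Linear(dim, 4) for _ in range(n_layers))
        self.conf_heads = nn.ModuleList(nn.Linear(dim, 1) for _ in range(n_layers))
        self.threshold = threshold

    def forward(self, tokens):  # tokens: (1, seq_len, dim); batch size 1
        recycled = 0.0
        for layer, box_head, conf_head in zip(self.layers, self.box_heads,
                                              self.conf_heads):
            tokens = layer(tokens) + recycled   # recycle predecessor output
            recycled = tokens
            pooled = tokens.mean(dim=1)
            conf = torch.sigmoid(conf_head(pooled))
            if conf.item() > self.threshold:    # easy frame: terminate early
                return box_head(pooled), conf
        return box_head(pooled), conf           # hardest route: full depth
```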

Formation trajectory planning that uses complete graphs to model collaborative constraints becomes computationally intractable as the number of drones increases, due to the curse of dimensionality. To tackle this issue, this paper presents a sparse graph construction method for formation planning that realizes a better efficiency-performance trade-off. Firstly, a sparsification mechanism for complete graphs is designed to ensure the global rigidity of the sparsified graphs, a necessary condition for a graph to uniquely determine a geometric shape. Secondly, a good sparse graph is constructed that sufficiently preserves the main structural features of the complete graph. Since the graph-based formation constraint is described by the Laplacian matrix, the sparse graph construction problem is equivalent to submatrix selection, which has combinatorial time complexity and needs a scoring metric. In comparative simulations, the Max-Trace matrix-revealing metric shows the most promising performance. The sparse graph is then integrated into the formation planning. Simulation results with 72 drones in complex environments demonstrate that, when preserving 30% of the connection edges, our method achieves formation error and recovery performance comparable to complete graphs, while planning efficiency is improved by approximately an order of magnitude. Benchmark comparisons and ablation studies are conducted to fully validate the merits of our method.
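An illustrative sketch of one way such sparsification could be organized: score edges, keep the top fraction, and greedily restore rigidity. Note that the rank test below checks only 2-D infinitesimal rigidity, a weaker condition than the global rigidity the paper requires, and the per-edge scores are assumed given rather than derived from the paper's exact Max-Trace metric:

```python
# Sketch: keep the highest-scoring edges of a formation graph, then add edges
# back until the 2-D infinitesimal-rigidity condition rank(R) = 2n - 3 holds.
import numpy as np

def rigidity_rank(points, edges):
    """Rank of the 2-D rigidity matrix; 2n - 3 means infinitesimally rigid."""
    n = len(points)
    R = np.zeros((len(edges), 2 * n))
    for row, (i, j) in enumerate(edges):
        d = points[i] - points[j]
        R[row, 2 * i:2 * i + 2] = d
        R[row, 2 * j:2 * j + 2] = -d
    return np.linalg.matrix_rank(R)

def sparsify(points, edges, scores, keep_ratio=0.3):
    order = np.argsort(scores)[::-1]            # best-scoring edges first
    k = int(np.ceil(keep_ratio * len(edges)))
    kept = [edges[i] for i in order[:k]]
    rest = [edges[i] for i in order[k:]]
    target = 2 * len(points) - 3
    while rigidity_rank(points, kept) < target and rest:
        kept.append(rest.pop(0))                # restore rigidity greedily
    return kept
```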

Extremely large-scale multiple-input multiple-output (XL-MIMO) systems are capable of improving spectral efficiency by employing far more antennas than conventional massive MIMO at the base station (BS). However, beam training in multiuser XL-MIMO systems is challenging. To tackle this issue, we conceive a three-phase graph neural network (GNN)-based beam training scheme for multiuser XL-MIMO systems. In the first phase, only far-field wide beams have to be tested for each user, and the GNN is utilized to map the beamforming gain information of the far-field wide beams to the optimal near-field beam for each user. In addition, the proposed GNN-based scheme can exploit the position correlation between adjacent users to further improve the accuracy of beam training. In the second phase, a beam allocation scheme based on the probability vectors produced at the outputs of the GNNs is proposed to resolve beam-direction conflicts between users. In the third phase, the hybrid transmit beamforming (TBF) is designed to further reduce inter-user interference. Our simulation results show that the proposed scheme outperforms the benchmarks in beam training performance. Moreover, the performance of the proposed beam training scheme approaches that of an exhaustive search, despite requiring only about 7% of the pilot overhead.
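A sketch of a per-user GNN of this flavor, mapping measured wide-beam gains to a distribution over near-field codewords while exchanging messages between position-correlated users; the dimensions and the message-passing rule are assumptions, not the paper's network:

```python
# Sketch: each user node embeds its far-field wide-beam gain vector, exchanges
# mean-aggregated messages with neighbouring users, and outputs a probability
# vector over the near-field codebook (used by the phase-2 beam allocation).
import torch
import torch.nn as nn

class BeamGNN(nn.Module):
    def __init__(self, n_wide=16, n_near=256, hidden=128, rounds=2):
        super().__init__()
        self.embed = nn.Linear(n_wide, hidden)
        self.msg = nn.Linear(hidden, hidden)
        self.update = nn.GRUCell(hidden, hidden)
        self.head = nn.Linear(hidden, n_near)
        self.rounds = rounds

    def forward(self, gains, adjacency):
        # gains: (n_users, n_wide) measured far-field beamforming gains
        # adjacency: (n_users, n_users) 0/1 user-proximity graph
        h = torch.relu(self.embed(gains))
        deg = adjacency.sum(dim=1, keepdim=True).clamp(min=1)
        for _ in range(self.rounds):
            m = adjacency @ self.msg(h) / deg   # mean message from neighbours
            h = self.update(m, h)
        return self.head(h).softmax(dim=-1)     # per-user beam probabilities
```

The phase-2 allocation can then assign beams greedily from these probability vectors, resolving conflicts when two users prefer the same beam direction.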

Audit logs are one of the most important tools for transparently tracking system events and maintaining continuous oversight in corporate organizations and enterprise business systems. In many cases the audit logs contain sensitive data, or are enormous, and dealing with a subset of the data is more practical than handling the entire data set. To provide a secure solution to these issues, a sanitizable signature scheme (SSS) is a viable cryptographic primitive. Herein, we present the first post-quantum secure multivariate-based SSS, namely Mul-SAN. Our proposed design provides unforgeability, privacy, immutability, signer accountability, and sanitizer accountability under the assumption that the MQ problem is NP-hard. Mul-SAN is very efficient and requires only field multiplications and additions over a finite field for its implementation. Mul-SAN presents itself as a practical method to partially delegate control of authenticated data in domains such as the healthcare industry and government organizations. We also explore using a blockchain to provide a tamper-proof and robust audit log mechanism.
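For intuition, the verification step that multivariate-quadratic (MQ) signatures build on is simply evaluating the public quadratic polynomials at the signature and comparing the result with the message digest. A toy illustration over GF(2), omitting key generation and signing (inverting the central map); this is not Mul-SAN itself:

```python
# Toy MQ verification over GF(2): the public key is m quadratic polynomials
# in n variables, each given by a quadratic form Q, a linear part L, and a
# constant c. A signature s is valid iff P(s) equals the message digest.
import hashlib
import numpy as np

def eval_mq(quads, lins, consts, x):
    """Evaluate the m public quadratic polynomials at a 0/1 vector x."""
    return np.array([(x @ Q @ x + L @ x + c) % 2
                     for Q, L, c in zip(quads, lins, consts)], dtype=np.uint8)

def digest_bits(message: bytes, m: int) -> np.ndarray:
    """First m bits of SHA-256 of the message (requires m <= 256)."""
    h = hashlib.sha256(message).digest()
    return np.unpackbits(np.frombuffer(h, dtype=np.uint8))[:m]

def verify(public_key, message: bytes, signature: np.ndarray) -> bool:
    quads, lins, consts = public_key
    return bool(np.array_equal(eval_mq(quads, lins, consts, signature),
                               digest_bits(message, len(consts))))
```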

Recent advancements in deep learning-based image compression are notable. However, prevalent schemes that employ a serial context-adaptive entropy model to enhance rate-distortion (R-D) performance are markedly slow. Furthermore, the complexities of the encoding and decoding networks are substantially high, rendering them unsuitable for some practical applications. In this paper, we propose two techniques to balance the trade-off between complexity and performance. First, we introduce two branching coding networks that independently learn a low-resolution latent representation and a high-resolution latent representation of the input image, discriminatively representing its global and local information. Second, we utilize the high-resolution latent representation as conditional information for the low-resolution latent representation, furnishing it with global information and thus helping to reduce redundancy between the two latent representations. We do not utilize any serial entropy models. Instead, we employ a parallel channel-wise auto-regressive entropy model to encode and decode the low-resolution and high-resolution latent representations. Experiments demonstrate that our method is approximately twice as fast in both encoding and decoding compared to the parallelizable checkerboard context model, and it also achieves a 1.2% improvement in R-D performance compared to state-of-the-art learned image compression schemes. Our method also outperforms classical image codecs, including H.266/VVC-intra (4:4:4), and some recent learned methods in rate-distortion performance, as validated by both PSNR and MS-SSIM metrics on the Kodak dataset.
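A structural sketch of the two-branch idea; every module below is a placeholder supplied by the caller, and the entropy models are assumed to return a quantized latent together with its bit cost, so this shows only the data flow, not the paper's networks:

```python
# Sketch: two independent latent branches; the coded high-resolution latent
# serves as conditioning context when entropy-coding the low-resolution one.
import torch.nn as nn

class TwoBranchCodec(nn.Module):
    def __init__(self, enc_lo, enc_hi, cond_net, entropy_lo, entropy_hi, dec):
        super().__init__()
        self.enc_lo, self.enc_hi = enc_lo, enc_hi
        self.cond_net = cond_net
        self.entropy_lo, self.entropy_hi = entropy_lo, entropy_hi
        self.dec = dec

    def forward(self, x):
        y_hi = self.enc_hi(x)       # high-resolution latent branch
        y_lo = self.enc_lo(x)       # low-resolution latent branch
        # Channel-wise autoregression: channel groups are coded in parallel
        # within a group, each group conditioned on previously coded groups.
        y_hi_hat, bits_hi = self.entropy_hi(y_hi)
        # The coded high-res latent conditions the low-res branch, reducing
        # redundancy between the two representations (as described above).
        ctx = self.cond_net(y_hi_hat)
        y_lo_hat, bits_lo = self.entropy_lo(y_lo, ctx)
        return self.dec(y_lo_hat, y_hi_hat), bits_lo + bits_hi
```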

The distortion-rate function of output-constrained lossy source coding with limited common randomness is analyzed for the special case of squared error distortion measure. An explicit expression is obtained when both source and reconstruction distributions are Gaussian. This further leads to a partial characterization of the information-theoretic limit of quadratic Gaussian rate-distortion-perception coding with the perception measure given by Kullback-Leibler divergence or squared quadratic Wasserstein distance.
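For context, here are the two classical extreme points that such a characterization interpolates between, for a source X ~ N(0, σ²) under squared error; the perfect-realism line is the known background result assuming unlimited common randomness, where the optimal coupling is jointly Gaussian with correlation ρ = √(1 − 2^{−2R}). This is background, not the paper's new expression:

```latex
% Background: quadratic Gaussian distortion-rate extremes.
\begin{align*}
  D(R) &= \sigma^2 \, 2^{-2R}
    && \text{(no perception constraint, Shannon)}, \\
  D_{\mathrm{p}}(R) &= 2\sigma^2\left(1 - \sqrt{1 - 2^{-2R}}\right)
    && \text{(perfect realism, } \hat{X} \sim \mathcal{N}(0,\sigma^2)\text{)}.
\end{align*}
```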

It is well believed that modeling the relationships between objects helps in representing and eventually describing an image. Nevertheless, there has been little evidence supporting this idea in image description generation. In this paper, we introduce a new design that explores the connections between objects for image captioning under the umbrella of the attention-based encoder-decoder framework. Specifically, we present a Graph Convolutional Networks plus Long Short-Term Memory (dubbed GCN-LSTM) architecture that integrates both semantic and spatial object relationships into the image encoder. Technically, we build graphs over the detected objects in an image based on their spatial and semantic connections. The representation of each object region proposal is then refined by leveraging the graph structure through GCN. With the learnt region-level features, our GCN-LSTM capitalizes on an LSTM-based captioning framework with an attention mechanism for sentence generation. Extensive experiments are conducted on the COCO image captioning dataset, and superior results are reported when compared to state-of-the-art approaches. Most remarkably, GCN-LSTM increases CIDEr-D performance from 120.1% to 128.7% on the COCO testing set.
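A minimal sketch of the graph-convolutional refinement on the encoder side; the layer form (normalized neighbour averaging with a residual connection) and the feature dimension are illustrative assumptions, not necessarily the paper's exact GCN variant:

```python
# Sketch: region features from an object detector are refined by graph
# convolution over a semantic/spatial relation graph, then handed to an
# attention-based LSTM decoder for sentence generation.
import torch
import torch.nn as nn

class RelationGCN(nn.Module):
    def __init__(self, dim=2048, n_layers=2):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_layers))

    def forward(self, regions, adj):
        # regions: (n_regions, dim) detector features
        # adj: (n_regions, n_regions) relation graph from spatial/semantic links
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        h = regions
        for layer in self.layers:
            h = torch.relu(layer((adj @ h) / deg) + h)  # message pass + skip
        return h  # refined region features for the attention LSTM decoder
```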

We introduce a generic framework that reduces the computational cost of object detection while retaining accuracy in scenarios where objects of varied sizes appear in high-resolution images. Detection progresses in a coarse-to-fine manner, first on a down-sampled version of the image and then on a sequence of higher-resolution regions identified as likely to improve detection accuracy. Built upon reinforcement learning, our approach consists of a model (R-net) that uses coarse detection results to predict the potential accuracy gain of analyzing a region at a higher resolution, and another model (Q-net) that sequentially selects regions to zoom in on. Experiments on the Caltech Pedestrian dataset show that our approach reduces the number of processed pixels by over 50% without a drop in detection accuracy. The merits of our approach become more significant on a high-resolution test set collected from the YFCC100M dataset, where it maintains high detection performance while reducing the number of processed pixels by about 70% and the detection time by over 50%.
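A schematic of the coarse-to-fine loop; `detector`, `r_net`, and `q_net` are hypothetical callables standing in for the trained models, detections are (x, y, w, h, score) tuples, and a real implementation would also rescale coarse detections into full-image coordinates:

```python
# Sketch: detect on a down-sampled image, score candidate regions by the
# R-net's predicted accuracy gain, and let the Q-net sequentially pick
# regions to re-detect at full resolution until the budget is spent.
import numpy as np

def downsample(img: np.ndarray, scale: float) -> np.ndarray:
    step = int(round(1.0 / scale))
    return img[::step, ::step]

def merge(dets, new_dets, box):
    # Shift zoomed-in detections back into full-image coordinates.
    x0, y0 = box[0], box[1]
    return dets + [(x + x0, y + y0, w, h, s) for (x, y, w, h, s) in new_dets]

def coarse_to_fine_detect(image, detector, r_net, q_net, budget=3, scale=0.25):
    coarse = downsample(image, scale)
    detections = detector(coarse)
    gains = r_net(coarse, detections)     # R-net: accuracy gain per region
    for _ in range(budget):
        box = q_net(detections, gains)    # Q-net: next region to zoom into
        if box is None:                   # no region worth the extra cost
            break
        x0, y0, x1, y1 = box
        crop = image[y0:y1, x0:x1]        # full-resolution crop
        detections = merge(detections, detector(crop), box)
    return detections
```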
