
The Rayleigh-product channel model is used to characterize the rank deficiency caused by keyhole effects. However, a finite blocklength (FBL) analysis for Rayleigh-product channels is not available in the literature. In this paper, we characterize the mutual information density (MID) and perform the FBL analysis to reveal the impact of rank deficiency in Rayleigh-product channels. To this end, we first establish a central limit theorem (CLT) for the MID over Rayleigh-product MIMO channels in the asymptotic regime where the number of scatterers, the number of antennas, and the blocklength go to infinity at the same pace. Then, we use the CLT to obtain upper and lower bounds on the packet error probability, whose approximations in the high and low signal-to-noise ratio (SNR) regimes are then derived to illustrate the impact of rank deficiency. One interesting observation is that rank deficiency degrades the performance of MIMO systems with FBL, and the fundamental limits of Rayleigh-product channels degenerate to those of the Rayleigh case when the number of scatterers approaches infinity.
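
As a rough illustration of how a CLT for the MID can be turned into an error-probability estimate, the sketch below evaluates the generic normal approximation P_e ≈ Q(√n (C − R) / √V). It is not the paper's closed-form bound: the function name `normal_approx_error` and the values used for C (mean of the normalized MID), V (its asymptotic variance), and R (coding rate) are assumptions chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_error(n, rate, capacity, dispersion):
    """Generic CLT-style estimate: P_e ~ Q(sqrt(n) * (C - R) / sqrt(V)).

    n          -- blocklength
    rate       -- coding rate R (same units as capacity)
    capacity   -- mean of the normalized MID, C (assumed given)
    dispersion -- asymptotic variance of the MID, V (assumed given)
    """
    if n <= 0:
        raise ValueError("blocklength must be positive")
    if dispersion <= 0:
        raise ValueError("dispersion must be positive")
    z = sqrt(n) * (capacity - rate) / sqrt(dispersion)
    # Q(z) = 1 - Phi(z), the upper tail of the standard normal distribution.
    return 1.0 - NormalDist().cdf(z)


# Hypothetical numbers: the estimate shrinks as the blocklength grows
# while the rate stays below the mean MID.
for n in (100, 400, 1600):
    print(n, normal_approx_error(n, rate=1.8, capacity=2.0, dispersion=1.0))
```

For a fixed rate below C, a larger variance V raises the estimated error probability at every blocklength in this approximation.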

Related content

MIMO

MoDELS · word embedding representation · Performer · state-of-the-art · mean

February 19, 2024

A Systematic Comparison of Contextualized Word Embeddings for Lexical Semantic Change

Francesco Periti,Nina Tahmasebi

from arxiv, Submitted to NAACL 2024

Contextualized embeddings are the preferred tool for modeling Lexical Semantic Change (LSC). Current evaluations typically focus on a specific task known as Graded Change Detection (GCD). However, performance comparisons across works are often misleading due to their reliance on diverse settings. In this paper, we evaluate state-of-the-art models and approaches for GCD under equal conditions. We further break the LSC problem into Word-in-Context (WiC) and Word Sense Induction (WSI) tasks, and compare models across these different levels. Our evaluation is performed across different languages on eight available benchmarks for LSC, and shows that (i) APD outperforms other approaches for GCD; (ii) XL-LEXEME outperforms other contextualized models for WiC, WSI, and GCD, while being comparable to GPT-4; (iii) there is a clear need for improving the modeling of word meanings, as well as a focus on how, when, and why these meanings change, rather than solely focusing on the extent of semantic change.
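
APD here is usually read as average pairwise distance: the mean distance between every embedding of a word from one time period and every embedding of the same word from another. A minimal sketch follows, assuming cosine distance and toy 2-D vectors in place of real contextualized embeddings; the function names are illustrative and not taken from the paper.

```python
from math import sqrt


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def average_pairwise_distance(embs_t1, embs_t2):
    """Mean cosine distance over all cross-period pairs of usage embeddings."""
    if not embs_t1 or not embs_t2:
        raise ValueError("both periods need at least one embedding")
    total = sum(cosine_distance(u, v) for u in embs_t1 for v in embs_t2)
    return total / (len(embs_t1) * len(embs_t2))


# Toy usage vectors (hypothetical): a word whose usages keep one direction
# versus a word whose usages rotate to an orthogonal direction.
stable_old, stable_new = [[1.0, 0.0], [2.0, 0.0]], [[3.0, 0.0]]
shifted_old, shifted_new = [[1.0, 0.0]], [[0.0, 1.0], [0.0, 2.0]]

print(average_pairwise_distance(stable_old, stable_new))
print(average_pairwise_distance(shifted_old, shifted_new))
```

A higher score indicates a larger graded change; in a GCD evaluation, such scores are typically ranked across target words and compared with human-annotated change scores.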

MoDELS · directed · Consistent Optimization · fidelity · optimizer

February 19, 2024

Direct Consistency Optimization for Compositional Text-to-Image Personalization

Kyungmin Lee,Sangkyung Kwak,Kihyuk Sohn,Jinwoo Shin

from arxiv, Preprint. See our project page (//dco-t2i.github.io/) for more examples and codes

Text-to-image (T2I) diffusion models, when fine-tuned on a few personal images, can generate visuals with a high degree of consistency. However, they still fall short in synthesizing images of different scenarios or styles that the original pretrained models can produce. To address this, we propose to fine-tune the T2I model by maximizing consistency to reference images while penalizing deviation from the pretrained model. We devise a novel training objective for T2I diffusion models that minimally fine-tunes the pretrained model to achieve consistency. Our method, dubbed Direct Consistency Optimization, is as simple as the regular diffusion loss, while significantly enhancing the compositionality of personalized T2I models. Also, our approach induces a new sampling method that controls the tradeoff between image fidelity and prompt fidelity. Lastly, we emphasize the necessity of using a comprehensive caption for reference images to further enhance the image-text alignment. We show the efficacy of the proposed method on T2I personalization for subject, style, or both. In particular, our method results in a superior Pareto frontier to the baselines. Generated examples and code are on our project page (//dco-t2i.github.io/).

Performer · MoDELS · discretization · overfitting · full

February 19, 2024

Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation

Nineli Lashkarashvili,Wen Wu,Guangzhi Sun,Philip C. Woodland

Foundation models have shown superior performance for speech emotion recognition (SER). However, given the limited data in emotion corpora, finetuning all parameters of large pre-trained models for SER can be both resource-intensive and susceptible to overfitting. This paper investigates parameter-efficient finetuning (PEFT) for SER. Various PEFT adaptors are systematically studied for both classification of discrete emotion categories and prediction of dimensional emotional attributes. The results demonstrate that the combination of PEFT methods surpasses full finetuning with a significant reduction in the number of trainable parameters. Furthermore, a two-stage adaptation strategy is proposed to adapt models trained on acted emotion data, which is more readily available, to make the model more adept at capturing natural emotional expressions. Both intra- and cross-corpus experiments validate the efficacy of the proposed approach in enhancing the performance on both the source and target domains.

nonlinear model · linear · MoDELS · mix · BASIC

February 17, 2024

Linear and Non-Linear Models for Master Scheduling of Dynamic Resources Product Mix

Ayman R. Mohammed,Ahmad Abu Sleem,Mohammad A. M. Abdel-Aal

Master production scheduling for product mix problems under the Theory of Constraints (TOC) has been considered in many previous studies. Most studies assume static resource availability. In this study, the raw materials supplied to the manufacturer are treated as dynamic, depending on the results of the problem. Thus, an integer linear heuristic, an integer non-linear optimization model, and a basic non-linear model are developed to find a good solution to the problem. The results of the three models are compared in terms of profit, raw material costs, inventory costs, and raw material utilization. Recent studies in the field are reviewed and conclusions are drawn.
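
For orientation, the classic static TOC product-mix heuristic ranks products by contribution margin per minute on the bottleneck resource and fills demand in that order. The sketch below shows only that baseline, with a single bottleneck and fixed resource availability; it does not implement the dynamic raw-material supply studied in the paper, and the product data and function name are hypothetical.

```python
def toc_product_mix(products, bottleneck_capacity):
    """Greedy TOC heuristic: prioritize margin per bottleneck minute.

    products            -- list of dicts with keys name, margin,
                           bottleneck_minutes (per unit), and demand
    bottleneck_capacity -- available minutes on the bottleneck resource
    Returns (plan, profit) where plan maps product name to integer units.
    """
    ranked = sorted(
        products,
        key=lambda p: p["margin"] / p["bottleneck_minutes"],
        reverse=True,
    )
    remaining = bottleneck_capacity
    plan = {}
    for p in ranked:
        # Produce as many whole units as demand and remaining capacity allow.
        units = min(p["demand"], remaining // p["bottleneck_minutes"])
        plan[p["name"]] = units
        remaining -= units * p["bottleneck_minutes"]
    profit = sum(plan[p["name"]] * p["margin"] for p in products)
    return plan, profit


# Hypothetical two-product instance with 2400 bottleneck minutes.
example_products = [
    {"name": "P", "margin": 45, "bottleneck_minutes": 15, "demand": 100},
    {"name": "Q", "margin": 60, "bottleneck_minutes": 30, "demand": 50},
]
plan, profit = toc_product_mix(example_products, bottleneck_capacity=2400)
print(plan, profit)
```

Product Q has the higher margin per unit, but P earns more per bottleneck minute (3 versus 2), so the heuristic fills the demand for P first and gives the remaining capacity to Q.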

optimizer · Prompt · MoDELS · reducible · university

February 16, 2024

Universal Prompt Optimizer for Safe Text-to-Image Generation

Zongyu Wu,Hongcheng Gao,Yueze Wang,Xiang Zhang,Suhang Wang

Text-to-Image (T2I) models have shown great performance in generating images based on textual prompts. However, these models are vulnerable to unsafe inputs that lead them to generate unsafe content such as sexual, harassment, and illegal-activity images. Existing approaches based on image checkers, model fine-tuning, and embedding blocking are impractical in real-world applications. Hence, we propose the first universal prompt optimizer for safe T2I generation in the black-box scenario. We first construct a dataset of toxic-clean prompt pairs using GPT-3.5 Turbo. To guide the optimizer to convert toxic prompts into clean prompts while preserving semantic information, we design a novel reward function measuring the toxicity and text alignment of generated images, and train the optimizer through Proximal Policy Optimization. Experiments show that our approach can effectively reduce the likelihood of various T2I models generating inappropriate images, with no significant impact on text alignment. It can also be flexibly combined with other methods to achieve better performance.

MoDELS · functional · Processing (programming language) · ML · Learning

February 16, 2024

A Mass-Conserving-Perceptron for Machine Learning-Based Modeling of Geoscientific Systems

Yuan-Heng Wang,Hoshin V. Gupta

from arxiv, 65 pages, 7 figures in the main text, 10 figures, and 10 tables in the supplementary materials

Although decades of effort have been devoted to building Physical-Conceptual (PC) models for predicting the time-series evolution of geoscientific systems, recent work shows that Machine Learning (ML) based Gated Recurrent Neural Network (GRNN) technology can be used to develop models that are much more accurate. However, the difficulty of extracting physical understanding from ML-based models complicates their utility for enhancing scientific knowledge regarding system structure and function. Here, we propose a physically interpretable Mass Conserving Perceptron (MCP) as a way to bridge the gap between PC-based and ML-based modeling approaches. The MCP exploits the inherent isomorphism between the directed graph structures underlying both PC models and GRNNs to explicitly represent the mass-conserving nature of physical processes while enabling the functional nature of such processes to be directly learned (in an interpretable manner) from available data using off-the-shelf ML technology. As a proof of concept, we investigate the functional expressivity (capacity) of the MCP, explore its ability to parsimoniously represent the rainfall-runoff (RR) dynamics of the Leaf River Basin, and demonstrate its utility for scientific hypothesis testing. To conclude, we discuss extensions of the concept to enable ML-based physical-conceptual representation of the coupled nature of mass-energy-information flows through geoscientific systems.
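
The mass-conserving idea can be sketched as a single storage node whose gates split the current state into an output flux, a loss flux, and a retained part, with gate fractions that sum to one. This is an assumption-laden toy and not the authors' architecture: the gate logits are fixed constants here (the abstract describes the process functions as learned from data), and all names and values are hypothetical.

```python
from math import exp


def mcp_step(state, inflow, out_logit, loss_logit):
    """One update of a toy mass-conserving storage node.

    A softmax over three logits (output, loss, and a fixed 0 for 'retain')
    yields fractions that sum to 1, so no mass is created or destroyed.
    """
    e_out, e_loss, e_keep = exp(out_logit), exp(loss_logit), 1.0
    total = e_out + e_loss + e_keep
    outflow = state * e_out / total   # e.g. streamflow leaving the node
    loss = state * e_loss / total     # e.g. evaporative loss
    new_state = state - outflow - loss + inflow
    return new_state, outflow, loss


def simulate(inflows, initial_state, out_logit, loss_logit):
    """Run the node over an inflow series; return final state and fluxes."""
    state, outflows, losses = initial_state, [], []
    for inflow in inflows:
        state, outflow, loss = mcp_step(state, inflow, out_logit, loss_logit)
        outflows.append(outflow)
        losses.append(loss)
    return state, outflows, losses


# Hypothetical rainfall series and gate logits.
rain = [5.0, 0.0, 12.0, 3.0, 0.0, 0.0]
final_state, q, e = simulate(rain, initial_state=10.0, out_logit=-1.0, loss_logit=-2.0)

# Mass balance: initial storage + inputs = final storage + outputs + losses.
residual = 10.0 + sum(rain) - (final_state + sum(q) + sum(e))
print(final_state, residual)
```

Because the fluxes are fractions of the stored mass, the balance closes up to floating-point rounding whatever the gate values are, which is the property that keeps such a node physically interpretable.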

INFORMS · comprehensibility · model evaluation · Instagram · YouTube

February 16, 2024

Making Short-Form Videos Accessible with Hierarchical Video Summaries

Tess Van Daele,Akhil Iyer,Yuning Zhang,Jalyn C. Derry,Mina Huh,Amy Pavel

from arxiv, To appear at CHI 2024

Short videos on platforms such as TikTok, Instagram Reels, and YouTube Shorts (i.e. short-form videos) have become a primary source of information and entertainment. Many short-form videos are inaccessible to blind and low vision (BLV) viewers due to their rapid visual changes, on-screen text, and music or meme-audio overlays. In our formative study, 7 BLV viewers who regularly watched short-form videos reported frequently skipping such inaccessible content. We present ShortScribe, a system that provides hierarchical visual summaries of short-form videos at three levels of detail to support BLV viewers in selecting and understanding short-form videos. ShortScribe allows BLV users to navigate between video descriptions based on their level of interest. To evaluate ShortScribe, we assessed description accuracy and conducted a user study with 10 BLV participants comparing ShortScribe to a baseline interface. When using ShortScribe, participants reported higher comprehension and provided more accurate summaries of video content.

identifiable · binary · MoDELS · estimation/estimator · Analysis

February 15, 2024

A Spectral Method for Identifiable Grade of Membership Analysis with Binary Responses

Ling Chen,Yuqi Gu

from arxiv, Psychometrika (2024)

Grade of Membership (GoM) models are popular individual-level mixture models for multivariate categorical data. GoM allows each subject to have mixed memberships in multiple extreme latent profiles. Therefore GoM models have a richer modeling capacity than latent class models that restrict each subject to belong to a single profile. The flexibility of GoM comes at the cost of more challenging identifiability and estimation problems. In this work, we propose a singular value decomposition (SVD) based spectral approach to GoM analysis with multivariate binary responses. Our approach hinges on the observation that the expectation of the data matrix has a low-rank decomposition under a GoM model. For identifiability, we develop sufficient and almost necessary conditions for a notion of expectation identifiability. For estimation, we extract only a few leading singular vectors of the observed data matrix, and exploit the simplex geometry of these vectors to estimate the mixed membership scores and other parameters. We also establish the consistency of our estimator in the double-asymptotic regime where both the number of subjects and the number of items grow to infinity. Our spectral method has a huge computational advantage over Bayesian or likelihood-based methods and is scalable to large-scale and high-dimensional data. Extensive simulation studies demonstrate the superior efficiency and accuracy of our method. We also illustrate our method by applying it to a personality test dataset.

Vision · model evaluation · reducible · computer vision · DNN

March 24, 2020

A Survey of Methods for Low-Power Deep Learning and Computer Vision

Abhinav Goel,Caleb Tung,Yung-Hsiang Lu,George K. Thiruvathukal

from arxiv, Accepted for publication at 2020 IEEE 6th World Forum on Internet of Things (WF-IoT), New Orleans, LA, USA 2020

Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy-, computation-, and memory-intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically with regard to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.
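
To make category (1) concrete, here is a minimal sketch of uniform weight quantization and magnitude-based pruning on a flat list of weights. It assumes the simplest textbook variants (min-max uniform levels, global magnitude threshold) and is not drawn from any specific method covered by the survey; the function names and weight values are illustrative.

```python
def quantize_uniform(weights, num_bits):
    """Snap each weight to one of 2**num_bits evenly spaced levels
    spanning the observed [min, max] range."""
    if num_bits < 1:
        raise ValueError("num_bits must be at least 1")
    low, high = min(weights), max(weights)
    if high == low:
        return list(weights)
    step = (high - low) / (2 ** num_bits - 1)
    return [low + round((w - low) / step) * step for w in weights]


def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    pruned, removed = [], 0
    for w in weights:
        # The counter keeps exactly k zeros even when magnitudes tie.
        if abs(w) <= threshold and removed < k:
            pruned.append(0.0)
            removed += 1
        else:
            pruned.append(w)
    return pruned


# Hypothetical layer weights.
weights = [0.9, -0.05, 0.4, -0.7, 0.02, 0.1]
print(quantize_uniform(weights, num_bits=2))
print(prune_by_magnitude(weights, sparsity=0.5))
```

Quantization cuts the bits stored per weight, while pruning cuts the number of nonzero weights; the two are often combined, usually followed by fine-tuning to recover accuracy.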

ConvNets · DAM · feature space · unsupervised · image segmentation

April 29, 2018

Unsupervised Cross-Modality Domain Adaptation of ConvNets for Biomedical Image Segmentations with Adversarial Loss

Qi Dou,Cheng Ouyang,Cheng Chen,Hao Chen,Pheng-Ann Heng

from arxiv, Accepted to IJCAI 2018

Convolutional networks (ConvNets) have achieved great success in various challenging vision tasks. However, the performance of ConvNets degrades when encountering domain shift. Domain adaptation is both more significant and more challenging in the field of biomedical image analysis, where cross-modality data have largely different distributions. Given that annotating medical data is especially expensive, supervised transfer learning approaches are not quite optimal. In this paper, we propose an unsupervised domain adaptation framework with adversarial learning for cross-modality biomedical image segmentation. Specifically, our model is based on a dilated fully convolutional network for pixel-wise prediction. Moreover, we build a plug-and-play domain adaptation module (DAM) to map the target input to features that are aligned with the source domain feature space. A domain critic module (DCM) is set up to discriminate between the feature spaces of the two domains. We optimize the DAM and DCM via an adversarial loss without using any target domain labels. Our proposed method is validated by adapting a ConvNet trained on MRI images to unpaired CT data for cardiac structure segmentation, and achieves very promising results.