
Domain shift degrades the performance of cross-domain spoken language recognition (SLR). Unsupervised domain adaptation (UDA) algorithms have been explored to address domain shift in SLR without relying on class labels in the target domain. One successful UDA approach learns domain-invariant representations that align the feature distributions of the two domains. However, ignoring the class structure while learning domain-invariant representations can result in over-alignment, which harms the classification task. To overcome this limitation, we propose an optimal transport (OT)-based UDA algorithm for cross-domain SLR that leverages OT's awareness of the geometric structure of the distributions. During domain alignment, the algorithm uses an OT-based discrepancy measure on a joint distribution over feature and label information. Our previous study found that completely aligning the source and target distributions can introduce negative transfer, where classes, or irrelevant classes, from the source domain are mapped to a different class in the target domain during distribution alignment. This negative transfer degrades the performance of the adapted model. To mitigate this issue, we introduce coupling-weighted partial optimal transport (POT) into our UDA framework for SLR, where a soft weighting on the OT coupling, based on transport cost, is set adaptively during domain alignment. In experiments on a cross-domain SLR task, the proposed UDA algorithm significantly outperformed existing UDA algorithms under cross-channel conditions.
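The idea of an OT coupling that is softly down-weighted by transport cost can be illustrated with a minimal sketch. It assumes entropy-regularized OT solved by Sinkhorn iterations on toy 1-D features. The abstract does not give the paper's weighting rule, so the function `soft_weight`, its exponential form, and the parameters `reg` and `tau` are hypothetical choices made only for illustration.

```python
import math

def sinkhorn(cost, a, b, reg=1.0, n_iter=500):
    """Entropy-regularized OT coupling between histograms a (source) and b (target)."""
    n, m = len(a), len(b)
    # Gibbs kernel derived from the pairwise transport cost.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns toward the two marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

def soft_weight(plan, cost, tau=1.0):
    """Hypothetical rule (not the paper's): shrink coupling entries as cost grows."""
    return [[p * math.exp(-c / tau) for p, c in zip(pr, cr)]
            for pr, cr in zip(plan, cost)]

# Toy 1-D features (hypothetical): the third source sample is an outlier.
source = [0.0, 1.0, 5.0]
target = [0.1, 0.9]
cost = [[(s - t) ** 2 for t in target] for s in source]
a = [1.0 / len(source)] * len(source)
b = [1.0 / len(target)] * len(target)

plan = sinkhorn(cost, a, b)
weighted = soft_weight(plan, cost)
# Fraction of each source sample's mass that survives the weighting.
kept = [sum(wr) / sum(pr) for wr, pr in zip(weighted, plan)]
print(kept)
```

In this toy setting the outlier keeps almost none of its transported mass, which is the qualitative effect partial transport aims for: samples that are expensive to match contribute little to the alignment.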

Related content

The locations of different mRNA molecules can be revealed by multiplexed in situ RNA detection. By assigning detected mRNA molecules to individual cells, it is possible to identify many different cell types in parallel. This in turn enables investigation of the spatial cellular architecture in tissue, which is crucial for furthering our understanding of biological processes and diseases. However, cell typing typically depends on the segmentation of cell nuclei, which is often done based on images of a DNA stain, such as DAPI. Limiting cell definition to a nuclear stain makes it fundamentally difficult to determine accurate cell borders, and thereby also difficult to assign mRNA molecules to the correct cell. As such, we have developed a computational tool that segments cells solely based on the local composition of mRNA molecules. First, a small neural network is trained to compute attractive and repulsive edges between pairs of mRNA molecules. The signed graph is then partitioned by a mutex watershed into components corresponding to different cells. We evaluated our method on two publicly available datasets and compared it against the current state-of-the-art and older baselines. We conclude that combining neural networks with combinatorial optimization is a promising approach for cell segmentation of in situ transcriptomics data.
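The partitioning step can be sketched as follows, assuming the signed edges are already available. In the paper they come from a trained neural network; here the edge weights are hand-set, hypothetical values. The sketch follows the usual description of the mutex watershed: edges are visited in order of decreasing absolute weight, attractive edges merge clusters unless a mutual-exclusion constraint forbids it, and repulsive edges add such constraints.

```python
def mutex_watershed(n_nodes, edges):
    """Partition nodes given (u, v, weight) edges.

    weight > 0 is attractive, weight <= 0 is treated as repulsive.
    Returns one cluster label (a root node id) per node.
    """
    parent = list(range(n_nodes))
    # mutex[r] holds the roots of clusters that cluster r must never join.
    mutex = [set() for _ in range(n_nodes)]

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v, w in sorted(edges, key=lambda e: -abs(e[2])):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        if w > 0:
            if rv in mutex[ru]:
                continue  # merge forbidden by an earlier, stronger repulsion
            parent[rv] = ru
            # Move the constraints of the absorbed cluster onto the new root.
            for r in mutex[rv]:
                mutex[r].discard(rv)
                mutex[r].add(ru)
            mutex[ru] |= mutex[rv]
            mutex[rv] = set()
        else:
            mutex[ru].add(rv)
            mutex[rv].add(ru)
    return [find(i) for i in range(n_nodes)]

# Five hypothetical mRNA molecules with hand-set edge weights.
edges = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, -0.95), (3, 4, 0.7), (1, 3, 0.5)]
labels = mutex_watershed(5, edges)
print(labels)
```

The strong repulsive edge between molecules 2 and 3 blocks the weaker attractive edge between 1 and 3, so the molecules split into the groups {0, 1, 2} and {3, 4}.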

Sample selection models are a common methodology for correcting bias induced by data missing not at random. It is well known that these models are not empirically identifiable without exclusion restrictions; that is, identification requires that some variables predictive of missingness do not affect the outcome model of interest. The drive to establish this requirement often leads to the inclusion of irrelevant variables in the model. A recent proposal uses adaptive LASSO to circumvent this problem, but its performance depends on the so-called covariance assumption, which can be violated in small to moderate samples. Additionally, there are no tools yet for post-selection inference for this model. To address these challenges, we propose two families of spike-and-slab priors to conduct Bayesian variable selection in sample selection models. These prior structures allow for constructing a Gibbs sampler with tractable conditionals, which is scalable to the dimensions of practical interest. We illustrate the performance of the proposed methodology through a simulation study and present a comparison against adaptive LASSO and stepwise selection. We also provide two applications using publicly available real data. An implementation and code to reproduce the results in this paper can be found at //github.com/adam-iqbal/selection-spike-slab

The Conformer has become the most popular encoder model for automatic speech recognition (ASR). It adds convolution modules to a transformer to learn both local and global dependencies. In this work we describe a faster, more memory-efficient, and better-performing transformer, called Zipformer. Modeling changes include: 1) a U-Net-like encoder structure where middle stacks operate at lower frame rates; 2) a reorganized block structure with more modules, within which we re-use attention weights for efficiency; 3) a modified form of LayerNorm called BiasNorm, which allows us to retain some length information; 4) new activation functions SwooshR and SwooshL, which work better than Swish. We also propose a new optimizer, called ScaledAdam, which scales the update by each tensor's current scale to keep the relative change about the same, and also explicitly learns the parameter scale. It achieves faster convergence and better performance than Adam. Extensive experiments on LibriSpeech, Aishell-1, and WenetSpeech datasets demonstrate the effectiveness of our proposed Zipformer over other state-of-the-art ASR models. Our code is publicly available at //github.com/k2-fsa/icefall.
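The optimizer idea of scaling the update by each tensor's current scale can be illustrated with a simplified sketch. This is not the icefall implementation of ScaledAdam: it shows only a standard Adam step multiplied by the root-mean-square of the parameter tensor, and it omits the part that learns the parameter scale. The function names, the `min_rms` floor, and the hyperparameter values are assumptions.

```python
import math

def rms(xs):
    """Root-mean-square of a list of floats."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))

def new_state(n):
    """Fresh Adam moment estimates for a tensor with n entries."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n}

def relative_scale_adam_step(param, grad, state, lr=0.01,
                             beta1=0.9, beta2=0.98, eps=1e-8, min_rms=1e-5):
    """One Adam step whose size is proportional to the tensor's RMS (assumed form)."""
    state["t"] += 1
    t = state["t"]
    scale = max(rms(param), min_rms)  # current scale of this tensor
    updated = []
    for i, (p, g) in enumerate(zip(param, grad)):
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        m_hat = state["m"][i] / (1 - beta1 ** t)  # bias-corrected moments
        v_hat = state["v"][i] / (1 - beta2 ** t)
        updated.append(p - lr * scale * m_hat / (math.sqrt(v_hat) + eps))
    return updated

def relative_change(old, new):
    """RMS of the update divided by the RMS of the original tensor."""
    return rms([n - o for o, n in zip(old, new)]) / rms(old)

# Two hypothetical tensors that differ only by a factor of 100 in scale.
small = [0.01, -0.02, 0.03]
large = [100.0 * x for x in small]
grad = [0.5, -1.0, 2.0]

rel_small = relative_change(small, relative_scale_adam_step(small, grad, new_state(3)))
rel_large = relative_change(large, relative_scale_adam_step(large, grad, new_state(3)))
print(rel_small, rel_large)
```

Both tensors change by about the same fraction of their own magnitude, whereas plain Adam would apply the same absolute step to each and therefore perturb the small tensor far more in relative terms.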

We discuss probabilistic neural networks with a fixed internal representation as models for machine understanding. Here understanding is intended as mapping data to an already existing representation which encodes an a priori organisation of the feature space. We derive the internal representation by requiring that it satisfies the principles of maximal relevance and of maximal ignorance about how different features are combined. We show that, when hidden units are binary variables, these two principles identify a unique model -- the Hierarchical Feature Model (HFM) -- which is fully solvable and provides a natural interpretation in terms of features. We argue that learning machines with this architecture enjoy a number of interesting properties, like the continuity of the representation with respect to changes in parameters and data, the possibility to control the level of compression and the ability to support functions that go beyond generalisation. We explore the behaviour of the model with extensive numerical experiments and argue that models where the internal representation is fixed reproduce a learning modality which is qualitatively different from that of traditional models such as Restricted Boltzmann Machines.

Hesitant fuzzy sets are widely used in situations involving uncertainty and hesitation. The inclusion relationship is an important and foundational definition for sets. The hesitant fuzzy set, as a kind of set, needs an explicit definition of the inclusion relationship. Based on the discrete form of the hesitant fuzzy membership degree, several kinds of inclusion relationships for hesitant fuzzy sets are proposed. Some foundational propositions of hesitant fuzzy sets and of families of hesitant fuzzy sets are then presented. Finally, some foundational propositions of hesitant fuzzy information systems with respect to parameter reductions are put forward, and an example and an algorithm are given to illustrate the process of parameter reduction.

We study the severity of conflict-related violence in Colombia at an unprecedented granular scale in space and across time. Splitting the data into different geographical regions and different historically-relevant eras, we uncover variations in the patterns of conflict severity which we then explain in terms of local conflict actors' different collective behaviors and/or conditions using a simple mathematical model of conflict actors' grouping dynamics (coalescence and fragmentation). Specifically, variations in the approximate scaling values of the distributions of event lethalities can be explained by the changing strength ratio of the local conflict actors for distinct conflict periods and organizational regions. In this way, our findings open the door to a new granular spectroscopy of human conflicts in terms of local conflict actor strength ratios for any armed conflict.
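Estimating an approximate scaling value for a severity distribution can be sketched as below. The sketch assumes the tail follows a continuous power law above a known threshold and uses the standard maximum-likelihood estimator, alpha = 1 + n / sum(ln(x_i / x_min)). Event lethalities are integer counts, so this continuous form is only an approximation; the samples are synthetic, not conflict data, and the names `powerlaw_alpha_mle` and `x_min` are illustrative.

```python
import math
import random

def powerlaw_alpha_mle(values, x_min):
    """Continuous power-law MLE for the exponent of the tail x >= x_min."""
    tail = [x for x in values if x >= x_min]
    alpha = 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)
    stderr = (alpha - 1.0) / math.sqrt(len(tail))  # asymptotic standard error
    return alpha, stderr

# Synthetic samples drawn by inverse-transform sampling (not real data).
random.seed(0)
true_alpha, x_min = 2.5, 1.0
samples = [x_min * (1.0 - random.random()) ** (-1.0 / (true_alpha - 1.0))
           for _ in range(20000)]

alpha_hat, stderr = powerlaw_alpha_mle(samples, x_min)
print(alpha_hat, stderr)
```

Applying the same estimator separately to each region and era would give the kind of per-subset scaling values whose variation the abstract describes.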

JPEG is still the most widely used image compression algorithm. Most image compression algorithms only consider uncompressed original images, while ignoring the large number of already existing JPEG images. Recently, JPEG recompression approaches have been proposed to further reduce the size of JPEG files. However, those methods only consider JPEG lossless recompression, which is just a special case of the rate-distortion theorem. In this paper, we propose a unified lossy and lossless JPEG recompression framework, which consists of a learned quantization table and Markovian hierarchical variational autoencoders. Experiments show that our method can achieve arbitrarily low distortion when the bitrate is close to the upper bound, namely the bitrate of the lossless compression model. To the best of our knowledge, this is the first learned method that bridges the gap between lossy and lossless recompression of JPEG images.

Sign Languages (SL) serve as the predominant mode of communication for the Deaf and Hard of Hearing communities. The advent of deep learning has aided numerous methods in SL recognition and translation, achieving remarkable results. However, Sign Language Production (SLP) poses a challenge for the computer vision community as the motions generated must be realistic and have precise semantic meanings. Most SLP methods rely on 2D data, thus impeding their ability to attain a necessary level of realism. In this work, we propose a diffusion-based SLP model trained on a curated large-scale dataset of 4D signing avatars and their corresponding text transcripts. The proposed method can generate dynamic sequences of 3D avatars from an unconstrained domain of discourse using a diffusion process formed on a novel and anatomically informed graph neural network defined on the SMPL-X body skeleton. Through a series of quantitative and qualitative experiments, we show that the proposed method considerably outperforms previous methods of SLP. We believe that this work presents an important and necessary step towards realistic neural sign avatars, bridging the communication gap between Deaf and hearing communities. The code, method and generated data will be made publicly available.

Representations from transformer-based unidirectional language models are known to be effective at predicting brain responses to natural language. However, most studies comparing language models to brains have used GPT-2 or similarly sized language models. Here we tested whether larger open-source models such as those from the OPT and LLaMA families are better at predicting brain responses recorded using fMRI. Mirroring scaling results from other contexts, we found that brain prediction performance scales logarithmically with model size from 125M to 30B parameter models, with ~15% increased encoding performance as measured by correlation with a held-out test set across 3 subjects. Similar logarithmic behavior was observed when scaling the size of the fMRI training set. We also characterized scaling for acoustic encoding models that use HuBERT, WavLM, and Whisper, and we found comparable improvements with model size. A noise ceiling analysis of these large, high-performance encoding models showed that performance is nearing the theoretical maximum for brain areas such as the precuneus and higher auditory cortex. These results suggest that increasing scale in both models and data will yield incredibly effective models of language processing in the brain, enabling better scientific understanding as well as applications such as decoding.
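A logarithmic scaling relation of this kind can be checked with a simple fit, sketched below under stated assumptions. The model sizes span the 125M to 30B range mentioned above, but the intermediate sizes are illustrative and the scores are synthetic values generated from an exact log-linear law. They are not results from the study, and `fit_log_linear` is a hypothetical helper.

```python
import math

def fit_log_linear(sizes, scores):
    """Least-squares fit of score = a + b * log10(size)."""
    xs = [math.log10(s) for s in sizes]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(scores) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

# Illustrative parameter counts between 125M and 30B.
sizes = [125e6, 350e6, 1.3e9, 2.7e9, 6.7e9, 13e9, 30e9]

# Synthetic scores from an assumed law; placeholders, not measured correlations.
true_a, true_b = -0.10, 0.02
scores = [true_a + true_b * math.log10(s) for s in sizes]

a_hat, b_hat = fit_log_linear(sizes, scores)
print(a_hat, b_hat)
```

With real measurements, the slope would give the expected gain in encoding performance per tenfold increase in parameter count, and the residuals would show how closely the data follow a logarithmic trend.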

Longitudinal studies are often subject to missing data. The ICH E9(R1) addendum addresses the importance of defining a treatment effect estimand with the consideration of intercurrent events. Jump-to-reference (J2R) is one classically envisioned control-based scenario for the treatment effect evaluation using the hypothetical strategy, where the participants in the treatment group after intercurrent events are assumed to have the same disease progress as those with identical covariates in the control group. We establish new estimators to assess the average treatment effect based on a proposed potential outcomes framework under J2R. Various identification formulas are constructed under the assumptions addressed by J2R, motivating estimators that rely on different parts of the observed data distribution. Moreover, we obtain a novel estimator inspired by the efficient influence function, with multiple robustness in the sense that it achieves $n^{1/2}$-consistency if any pairs of multiple nuisance functions are correctly specified, or if the nuisance functions converge at a rate not slower than $n^{-1/4}$ when using flexible modeling approaches. The finite-sample performance of the proposed estimators is validated in simulation studies and an antidepressant clinical trial.
