云南虫谷在线观看免费观看电视剧-亚洲丁香婷婷久久综合激情综合

We explore inference within sparse linear models, focusing on scenarios where both predictors and errors carry serial correlations. We establish a clear link between predictor serial correlation and the finite sample performance of the LASSO, showing that even orthogonal or weakly correlated stationary AR processes can lead to significant spurious correlations due to their serial correlations. To address this challenge, we propose a novel approach named ARMAr-LASSO (ARMA residuals LASSO), which applies the LASSO to predictor time series that have been pre-whitened with ARMA filters and lags of dependent variable. Utilizing the near-epoch dependence framework, we derive both asymptotic results and oracle inequalities for the ARMAr-LASSO, and demonstrate that it effectively reduces estimation errors while also providing an effective forecasting and feature selection strategy. Our findings are supported by extensive simulations and an application to real-world macroeconomic data, which highlight the superior performance of the ARMAr-LASSO for handling sparse linear models in the context of time series.

相關內容

預測器/決策函數

關注 1

正交 · MoDELS · 表示 · Learning · 相互獨立的 ·

2024 年 10 月 1 日

Measuring Orthogonality in Representations of Generative Models

Robin C. Geyer,Alessandro Torcinovich,Jo?o B. Carvalho,Alexander Meyer,Joachim M. Buhmann

In unsupervised representation learning, models aim to distill essential features from high-dimensional data into lower-dimensional learned representations, guided by inductive biases. Understanding the characteristics that make a good representation remains a topic of ongoing research. Disentanglement of independent generative processes has long been credited with producing high-quality representations. However, focusing solely on representations that adhere to the stringent requirements of most disentanglement metrics, may result in overlooking many high-quality representations, well suited for various downstream tasks. These metrics often demand that generative factors be encoded in distinct, single dimensions aligned with the canonical basis of the representation space. Motivated by these observations, we propose two novel metrics: Importance-Weighted Orthogonality (IWO) and Importance-Weighted Rank (IWR). These metrics evaluate the mutual orthogonality and rank of generative factor subspaces. Throughout extensive experiments on common downstream tasks, over several benchmark datasets and models, IWO and IWR consistently show stronger correlations with downstream task performance than traditional disentanglement metrics. Our findings suggest that representation quality is closer related to the orthogonality of independent generative processes rather than their disentanglement, offering a new direction for evaluating and improving unsupervised learning models.

簇 · 解碼 · CLIP · contrastive · 樣例 ·

2024 年 10 月 1 日

Finding Shared Decodable Concepts and their Negations in the Brain

Cory Efird,Alex Murphy,Joel Zylberberg,Alona Fyshe

Prior work has offered evidence for functional localization in the brain; different anatomical regions preferentially activate for certain types of visual input. For example, the fusiform face area preferentially activates for visual stimuli that include a face. However, the spectrum of visual semantics is extensive, and only a few semantically-tuned patches of cortex have so far been identified in the human brain. Using a multimodal (natural language and image) neural network architecture (CLIP) we train a highly accurate contrastive model that maps brain responses during naturalistic image viewing to CLIP embeddings. We then use a novel adaptation of the DBSCAN clustering algorithm to cluster the parameters of these participant-specific contrastive models. This reveals what we call Shared Decodable Concepts (SDCs): clusters in CLIP space that are decodable from common sets of voxels across multiple participants. Examining the images most and least associated with each SDC cluster gives us additional insight into the semantic properties of each SDC. We note SDCs for previously reported visual features (e.g. orientation tuning in early visual cortex) as well as visual semantic concepts such as faces, places and bodies. In cases where our method finds multiple clusters for a visuo-semantic concept, the least associated images allow us to dissociate between confounding factors. For example, we discovered two clusters of food images, one driven by color, the other by shape. We also uncover previously unreported areas such as regions of extrastriate body area (EBA) tuned for legs/hands and sensitivity to numerosity in right intraparietal sulcus, and more. Thus, our contrastive-learning methodology better characterizes new and existing visuo-semantic representations in the brain by leveraging multimodal neural network representations and a novel adaptation of clustering algorithms.

奇異的 · 優化器 · 論文 · Branch · 哈希學習 ·

2024 年 10 月 1 日

On the Counting of Involutory MDS Matrices

Susanta Samanta

The optimal branch number of MDS matrices has established their importance in designing diffusion layers for various block ciphers and hash functions. As a result, numerous matrix structures, including Hadamard and circulant matrices, have been proposed for constructing MDS matrices. Also, in the literature, significant attention is typically given to identifying MDS candidates with optimal implementations or proposing new constructions across different orders. However, this paper takes a different approach by not emphasizing efficiency issues or introducing new constructions. Instead, its primary objective is to enumerate Hadamard MDS and involutory Hadamard MDS matrices of order $4$ within the field $\mathbb{F}_{2^r}$. Specifically, it provides an explicit formula for the count of both Hadamard MDS and involutory Hadamard MDS matrices of order $4$ over $\mathbb{F}_{2^r}$. Additionally, it derives the count of Hadamard Near-MDS (NMDS) and involutory Hadamard NMDS matrices, each with exactly one zero in each row, of order $4$ over $\mathbb{F}_{2^r}$. Furthermore, the paper discusses some circulant-like matrices for constructing NMDS matrices and proves that when $n$ is even, any $2n \times 2n$ Type-II circulant-like matrix can never be an NMDS matrix. While it is known that NMDS matrices may be singular, this paper establishes that singular Hadamard matrices can never be NMDS matrices. Moreover, it proves that there exist exactly two orthogonal Type-I circulant-like matrices of order $4$ over $\mathbb{F}_{2^r}$.

Java · 層 · Scala · state-of-the-art · Kotlin ·

2024 年 9 月 30 日

Reasoning About Exceptional Behavior At the Level of Java Bytecode

Marco Paganoni,Carlo A. Furia

A program's exceptional behavior can substantially complicate its control flow, and hence accurately reasoning about the program's correctness. On the other hand, formally verifying realistic programs is likely to involve exceptions -- a ubiquitous feature in modern programming languages. In this paper, we present a novel approach to verify the exceptional behavior of Java programs, which extends our previous work on ByteBack. ByteBack works on a program's bytecode, while providing means to specify the intended behavior at the source-code level; this approach sets ByteBack apart from most state-of-the-art verifiers that target source code. To explicitly model a program's exceptional behavior in a way that is amenable to formal reasoning, we introduce Vimp: a high-level bytecode representation that extends the Soot framework's Grimp with verification-oriented features, thus serving as an intermediate layer between bytecode and the Boogie intermediate verification language. Working on bytecode through this intermediate layer brings flexibility and adaptability to new language versions and variants: as our experiments demonstrate, ByteBack can verify programs involving exceptional behavior in all versions of Java, as well as in Scala and Kotlin (two other popular JVM languages).

語音識別 · MoDELS · 語言模型化 · 大語言模型 · INFORMS ·

2024 年 9 月 30 日

Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition

Yosuke Higuchi,Tetsuji Ogawa,Tetsunori Kobayashi

from arxiv, Submitted to ICASSP2025

We propose to utilize an instruction-tuned large language model (LLM) for guiding the text generation process in automatic speech recognition (ASR). Modern large language models (LLMs) are adept at performing various text generation tasks through zero-shot learning, prompted with instructions designed for specific objectives. This paper explores the potential of LLMs to derive linguistic information that can facilitate text generation in end-to-end ASR models. Specifically, we instruct an LLM to correct grammatical errors in an ASR hypothesis and use the LLM-derived representations to refine the output further. The proposed model is built on the joint CTC and attention architecture, with the LLM serving as a front-end feature extractor for the decoder. The ASR hypothesis, subject to correction, is obtained from the encoder via CTC decoding and fed into the LLM along with a specific instruction. The decoder subsequently takes as input the LLM output to perform token predictions, combining acoustic information from the encoder and the powerful linguistic information provided by the LLM. Experimental results show that the proposed LLM-guided model achieves a relative gain of approximately 13\% in word error rates across major benchmarks.

類別 · binary · 二分類 · Performer · 閾值 ·

2024 年 9 月 29 日

Balancing the Scales: A Comprehensive Study on Tackling Class Imbalance in Binary Classification

Mohamed Abdelhamid,Abhyuday Desai

from arxiv, 13 pages including appendix, 4 tables

Class imbalance in binary classification tasks remains a significant challenge in machine learning, often resulting in poor performance on minority classes. This study comprehensively evaluates three widely-used strategies for handling class imbalance: Synthetic Minority Over-sampling Technique (SMOTE), Class Weights tuning, and Decision Threshold Calibration. We compare these methods against a baseline scenario of no-intervention across 15 diverse machine learning models and 30 datasets from various domains, conducting a total of 9,000 experiments. Performance was primarily assessed using the F1-score, although our study also tracked results on additional 9 metrics including F2-score, precision, recall, Brier-score, PR-AUC, and AUC. Our results indicate that all three strategies generally outperform the baseline, with Decision Threshold Calibration emerging as the most consistently effective technique. However, we observed substantial variability in the best-performing method across datasets, highlighting the importance of testing multiple approaches for specific problems. This study provides valuable insights for practitioners dealing with imbalanced datasets and emphasizes the need for dataset-specific analysis in evaluating class imbalance handling techniques.

線性的 · 類別 · Extensibility · 優化器 · Analysis ·

2024 年 9 月 27 日

On the Effects of Data Heterogeneity on the Convergence Rates of Distributed Linear System Solvers

Boris Velasevic,Rohit Parasnis,Christopher G. Brinton,Navid Azizan

from arxiv, 16 pages, 6 figures

We consider the problem of solving a large-scale system of linear equations in a distributed or federated manner by a taskmaster and a set of machines, each possessing a subset of the equations. We provide a comprehensive comparison of two well-known classes of algorithms used to solve this problem: projection-based methods and optimization-based methods. First, we introduce a novel geometric notion of data heterogeneity called angular heterogeneity and discuss its generality. Using this notion, we characterize the optimal convergence rates of the most prominent algorithms from each class, capturing the effects of the number of machines, the number of equations, and that of both cross-machine and local data heterogeneity on these rates. Our analysis establishes the superiority of Accelerated Projected Consensus in realistic scenarios with significant data heterogeneity and offers several insights into how angular heterogeneity affects the efficiency of the methods studied. Additionally, we develop distributed algorithms for the efficient computation of the proposed angular heterogeneity metrics. Our extensive numerical analyses validate and complement our theoretical results.

語言模型化 · MoDELS · 知識 (knowledge) · 大語言模型 · Principle ·

2024 年 9 月 27 日

A Survey on the Honesty of Large Language Models

Siheng Li,Cheng Yang,Taiqiang Wu,Chufan Shi,Yuji Zhang,Xinyu Zhu,Zesen Cheng,Deng Cai,Mo Yu,Lemao Liu,Jie Zhou,Yujiu Yang,Ngai Wong,Xixin Wu,Wai Lam

from arxiv, Project Page: //github.com/SihengLi99/LLM-Honesty-Survey

Honesty is a fundamental principle for aligning large language models (LLMs) with human values, requiring these models to recognize what they know and don't know and be able to faithfully express their knowledge. Despite promising, current LLMs still exhibit significant dishonest behaviors, such as confidently presenting wrong answers or failing to express what they know. In addition, research on the honesty of LLMs also faces challenges, including varying definitions of honesty, difficulties in distinguishing between known and unknown knowledge, and a lack of comprehensive understanding of related research. To address these issues, we provide a survey on the honesty of LLMs, covering its clarification, evaluation approaches, and strategies for improvement. Moreover, we offer insights for future research, aiming to inspire further exploration in this important area.

潛在 · Networking · 流形 · Learning · 自適應學習 ·

2024 年 9 月 27 日

Adaptive Learning of the Latent Space of Wasserstein Generative Adversarial Networks

Yixuan Qiu,Qingyi Gao,Xiao Wang

Generative models based on latent variables, such as generative adversarial networks (GANs) and variational auto-encoders (VAEs), have gained lots of interests due to their impressive performance in many fields. However, many data such as natural images usually do not populate the ambient Euclidean space but instead reside in a lower-dimensional manifold. Thus an inappropriate choice of the latent dimension fails to uncover the structure of the data, possibly resulting in mismatch of latent representations and poor generative qualities. Towards addressing these problems, we propose a novel framework called the latent Wasserstein GAN (LWGAN) that fuses the Wasserstein auto-encoder and the Wasserstein GAN so that the intrinsic dimension of the data manifold can be adaptively learned by a modified informative latent distribution. We prove that there exist an encoder network and a generator network in such a way that the intrinsic dimension of the learned encoding distribution is equal to the dimension of the data manifold. We theoretically establish that our estimated intrinsic dimension is a consistent estimate of the true dimension of the data manifold. Meanwhile, we provide an upper bound on the generalization error of LWGAN, implying that we force the synthetic data distribution to be similar to the real data distribution from a population perspective. Comprehensive empirical experiments verify our framework and show that LWGAN is able to identify the correct intrinsic dimension under several scenarios, and simultaneously generate high-quality synthetic data by sampling from the learned latent distribution.

Faster R-CNN · domain shift · R-CNN · 目標檢測 · 可約的 ·

2018 年 3 月 8 日

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Yuhua Chen,Wen Li,Christos Sakaridis,Dengxin Dai,Luc Van Gool

from arxiv, Accepted to CVPR 2018

Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc, and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.