The purpose of modeling document relevance for search engines is to rank better in subsequent searches. Document-specific historical click-through rates can be important features in a dynamic ranking system that updates as more samples accumulate. This paper describes the properties of several such features and tests them in controlled experiments. Extending the inverse propensity weighting method to documents yields an unbiased estimate of document relevance. This feature can approximate relevance accurately, leading to near-optimal ranking in ideal circumstances. However, it has high variance, which increases with the degree of position bias. Furthermore, inaccurate position bias estimation leads to poor performance. Under several scenarios this feature can perform worse than biased click-through rates. This paper underscores the need for accurate position bias estimation and is unique in suggesting the simultaneous use of biased and unbiased position bias features.
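The contrast between the biased click-through rate and the inverse-propensity-weighted feature can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `ctr_estimates`, the log format `(doc_id, position, clicked)`, and the assumption that per-position examination propensities are already known are all hypothetical.

```python
from collections import defaultdict


def ctr_estimates(impressions, propensity):
    """Return {doc_id: (biased_ctr, ipw_ctr)} from a click log.

    impressions: iterable of (doc_id, position, clicked) tuples (assumed format).
    propensity:  dict mapping position -> probability that a result shown at
                 that position is examined (assumed known and non-zero).
    """
    shown = defaultdict(int)       # impressions per document
    clicks = defaultdict(int)      # raw clicks per document
    weighted = defaultdict(float)  # clicks re-weighted by 1 / propensity

    for doc, pos, clicked in impressions:
        shown[doc] += 1
        if clicked:
            clicks[doc] += 1
            weighted[doc] += 1.0 / propensity[pos]

    return {
        doc: (clicks[doc] / shown[doc], weighted[doc] / shown[doc])
        for doc in shown
    }
```

Because each click is divided by its propensity, a single click at a rarely examined position receives a large weight. That is one way to see why the variance of the weighted feature grows with the degree of position bias, and why a misestimated propensity distorts the estimate directly.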

Related content

Biased

Sphering · Functional · MoDELS · Euclidean space · Gradient descent

March 18, 2024

Riemannian gradient descent for spherical area-preserving mappings

Marco Sutti, Mei-Heng Yueh

from arxiv, 30 pages, 11 figures, 8 tables

We propose a new Riemannian gradient descent method for computing spherical area-preserving mappings of topological spheres using a Riemannian retraction-based framework with theoretically guaranteed convergence. The objective function is based on the stretch energy functional, and the minimization is constrained on a power manifold of unit spheres embedded in 3-dimensional Euclidean space. Numerical experiments on several mesh models demonstrate the accuracy and stability of the proposed framework. Comparisons with two existing state-of-the-art methods for computing area-preserving mappings demonstrate that our algorithm is both competitive and more efficient. Finally, we present a concrete application to the problem of landmark-aligned surface registration of two brain models.

Diameter · Analysis · Optimizer · Engineering · Knowledge

March 16, 2024

Additive Schwarz methods for fourth-order variational inequalities

Jongho Park

from arxiv, 22 pages, 2 figures

Fourth-order variational inequalities are encountered in various scientific and engineering disciplines, including elliptic optimal control problems and plate obstacle problems. In this paper, we consider additive Schwarz methods for solving fourth-order variational inequalities. Based on a unified framework of various finite element methods for fourth-order variational inequalities, we develop one- and two-level additive Schwarz methods. We prove that the two-level method is scalable in the sense that the convergence rate of the method depends on $H/h$ and $H/\delta$ only, where $h$ and $H$ are the typical diameters of an element and a subdomain, respectively, and $\delta$ measures the overlap among the subdomains. This proof relies on a new nonlinear positivity-preserving coarse interpolation operator, the construction of which was previously unknown. To the best of our knowledge, this analysis represents the first investigation into the scalability of the two-level additive Schwarz method for fourth-order variational inequalities. Our theoretical results are verified by numerical experiments.

Factorized · Code · Decoding · Operation

March 16, 2024

Codes from Goppa codes

Chunlei Liu

from arxiv, The article is reorganized

The Frobenius acts on a Goppa code whose structure polynomial has coefficients in the symbol field. Its fixed codewords form a subcode. Deleting the naturally occurring redundancy, we obtain a new code. It is proved that these new codes approach the Gilbert-Varshamov bound. It is also proved that these codes can be decoded within $O(n^2(\log n)^a)$ operations in the symbol field, which is usually much smaller than the location field, where $n$ is the codeword length and $a$ is a constant determined by the polynomial factorization algorithm.

Approximation · Statistic · INFORMS · Closed form · MoDELS

March 15, 2024

Approximation and bounding techniques for the Fisher-Rao distances

Frank Nielsen

from arxiv, 38 pages

The Fisher-Rao distance between two probability distributions of a statistical model is defined as the Riemannian geodesic distance induced by the Fisher information metric. In order to calculate the Fisher-Rao distance in closed form, we need (1) to elicit a formula for the Fisher-Rao geodesics, and (2) to integrate the Fisher length element along those geodesics. We consider several numerically robust approximation and bounding techniques for the Fisher-Rao distances: First, we report generic upper bounds on Fisher-Rao distances based on closed-form 1D Fisher-Rao distances of submodels. Second, we describe several generic approximation schemes depending on whether the Fisher-Rao geodesics or pregeodesics are available in closed form or not. In particular, we obtain a generic method to guarantee an arbitrarily small additive error on the approximation provided that Fisher-Rao pregeodesics and tight lower and upper bounds are available. Third, we consider the case of Fisher metrics being Hessian metrics, and report generic tight upper bounds on the Fisher-Rao distances using techniques of information geometry. Uniparametric and biparametric statistical models always have Fisher Hessian metrics, and in general a simple test makes it possible to check whether the Fisher information matrix yields a Hessian metric or not. Fourth, we consider elliptical distribution families and show how to apply the above techniques to these models. We also propose two new distances based either on the Fisher-Rao lengths of curves serving as proxies of Fisher-Rao geodesics, or on the Birkhoff/Hilbert projective cone distance. Last, we consider an alternative group-theoretic approach for statistical transformation models based on the notion of maximal invariant, which yields insights into the structure of the Fisher-Rao distance formula that may be used fruitfully in applications.
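The two steps named in the abstract (find the geodesic, then integrate the length element) can be written out as a short worked example. The symbols $\rho$, $\gamma$, and $I(\theta)$ are notation assumed here for illustration, and the 1D case shown, a normal family with fixed mean and unknown scale, is a standard textbook example rather than a result taken from the paper.

```latex
% Fisher-Rao distance as the length of the shortest curve under the Fisher metric I(\theta)
\rho(\theta_1,\theta_2)
  = \inf_{\gamma}\int_0^1
    \sqrt{\dot\gamma(t)^{\top}\, I(\gamma(t))\, \dot\gamma(t)}\;\mathrm{d}t,
  \qquad \gamma(0)=\theta_1,\ \gamma(1)=\theta_2 .

% 1D example: N(\mu,\sigma^2) with \mu fixed, parameter \sigma > 0, Fisher information I(\sigma) = 2/\sigma^2
\rho(\sigma_1,\sigma_2)
  = \left|\int_{\sigma_1}^{\sigma_2} \frac{\sqrt{2}}{\sigma}\,\mathrm{d}\sigma\right|
  = \sqrt{2}\,\left|\log\frac{\sigma_2}{\sigma_1}\right| .
```

In one dimension the geodesic is simply the interval between the two parameters, so only the integration step remains. This is the kind of closed-form 1D submodel distance that the abstract's first technique builds upper bounds from.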

CLIP · INFORMS · Unlabeled · Category · Learning

March 15, 2024

GET: Unlocking the Multi-modal Potential of CLIP for Generalized Category Discovery

Enguang Wang, Zhimao Peng, Zhengyuan Xie, Xialei Liu, Ming-Ming Cheng

Given unlabelled datasets containing both old and new categories, generalized category discovery (GCD) aims to accurately discover new classes while correctly classifying old classes, leveraging the class concepts learned from labeled samples. Current GCD methods use only a single visual modality of information, resulting in poor classification of visually similar classes. Although certain classes are visually confusable, their text information might be distinct, motivating us to introduce text information into the GCD task. However, the lack of class names for unlabelled data makes it impractical to utilize text information. To tackle this challenging problem, we propose a Text Embedding Synthesizer (TES) to generate pseudo text embeddings for unlabelled samples. Specifically, our TES leverages the property that CLIP can generate aligned vision-language features, converting visual embeddings into tokens of CLIP's text encoder to generate pseudo text embeddings. In addition, we employ a dual-branch framework: through joint learning and instance consistency across the modality branches, visual and semantic information mutually enhance each other, promoting the interaction and fusion of the visual and text embedding spaces. Our method unlocks the multi-modal potential of CLIP and outperforms the baseline methods by a large margin on all GCD benchmarks, achieving a new state of the art. The code will be released at \url{//github.com/enguangW/GET}.

Empirical distribution · Euclidean space · Separable · Covariance matrix · Factorized

March 14, 2024

Sharp bounds for max-sliced Wasserstein distances

March T. Boedihardjo

We obtain essentially matching upper and lower bounds for the expected max-sliced 1-Wasserstein distance between a probability measure on a separable Hilbert space and its empirical distribution from $n$ samples. By proving a Banach space version of this result, we also obtain an upper bound, that is sharp up to a log factor, for the expected max-sliced 2-Wasserstein distance between a symmetric probability measure $\mu$ on a Euclidean space and its symmetrized empirical distribution in terms of the norm of the covariance matrix of $\mu$ and the diameter of the support of $\mu$.

Feature selection · Invariant · MoDELS · CASE · Scenario

March 14, 2024

Model-based causal feature selection for general response types

Lucas Kook, Sorawit Saengkyongam, Anton Rask Lundborg, Torsten Hothorn, Jonas Peters

from arxiv, Code available at //github.com/LucasKook/tramicp.git

Discovering causal relationships from observational data is a fundamental yet challenging task. Invariant causal prediction (ICP, Peters et al., 2016) is a method for causal feature selection which requires data from heterogeneous settings and exploits that causal models are invariant. ICP has been extended to general additive noise models and to nonparametric settings using conditional independence tests. However, the latter often suffer from low power (or poor type I error control) and additive noise models are not suitable for applications in which the response is not measured on a continuous scale, but reflects categories or counts. Here, we develop transformation-model (TRAM) based ICP, allowing for continuous, categorical, count-type, and uninformatively censored responses (these model classes, generally, do not allow for identifiability when there is no exogenous heterogeneity). As an invariance test, we propose TRAM-GCM based on the expected conditional covariance between environments and score residuals with uniform asymptotic level guarantees. For the special case of linear shift TRAMs, we also consider TRAM-Wald, which tests invariance based on the Wald statistic. We provide an open-source R package 'tramicp' and evaluate our approach on simulated data and in a case study investigating causal features of survival in critically ill patients.

Estimation/Estimator · Maximum likelihood · MoDELS · Likelihood · Logistic regression

March 14, 2024

Existence of Firth's modified estimates in binomial regression models

Mitsunori Ogawa, Yui Tomo

In logistic regression modeling, Firth's modified estimator is widely used to address the issue of data separation, which results in the nonexistence of the maximum likelihood estimate. Firth's modified estimator can be formulated as a penalized maximum likelihood estimator in which Jeffreys' prior is adopted as the penalty term. Despite its widespread use in practice, the formal verification of the corresponding estimate's existence has not been established. In this study, we establish the existence theorem of Firth's modified estimate in binomial logistic regression models, assuming only the full column rankness of the design matrix. We also discuss other binomial regression models obtained through alternating link functions and prove the existence of similar penalized maximum likelihood estimates for such models.
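The penalized-likelihood formulation mentioned in the abstract can be written out for the binomial logistic case. The notation below ($X$, $x_i$, $m_i$, $y_i$, $\pi_i$, $\ell^{*}$) is assumed here for illustration and may differ from the paper's.

```latex
% Binomial logistic model: y_i successes out of m_i trials, design matrix X with rows x_i^T
\pi_i(\beta) = \frac{1}{1+\exp(-x_i^{\top}\beta)}, \qquad
\ell(\beta) = \sum_{i=1}^{n}\Bigl[\, y_i \log \pi_i(\beta) + (m_i - y_i)\log\bigl(1-\pi_i(\beta)\bigr) \Bigr] + \text{const}.

% Fisher information and Jeffreys' prior
I(\beta) = X^{\top} W(\beta)\, X, \qquad
W(\beta) = \operatorname{diag}\bigl( m_i\, \pi_i(\beta)\,(1-\pi_i(\beta)) \bigr), \qquad
p_J(\beta) \propto \det I(\beta)^{1/2}.

% Firth's modified estimate maximizes the log-likelihood penalized by the log of Jeffreys' prior
\hat\beta_{F} = \arg\max_{\beta}\ \ell^{*}(\beta), \qquad
\ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2}\log\det I(\beta).
```

Under data separation $\ell(\beta)$ has no maximizer because it keeps increasing as $\lVert\beta\rVert$ grows in some direction, while the penalty term tends to $-\infty$ as the fitted probabilities approach 0 or 1. The existence question the abstract addresses is whether the penalty always wins that trade-off.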

MoDELS · Biased · Language modeling · Generative AI · AI

March 13, 2024

Non-discrimination Criteria for Generative Language Models

Sara Sterlie, Nina Weng, Aasa Feragen

from arxiv, 14 pages, 5 figures. Submitted to ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT 2024)

In recent years, generative AI, such as large language models, has undergone rapid development. As these models become increasingly available to the public, concerns arise about perpetuating and amplifying harmful biases in applications. Gender stereotypes can be harmful and limiting for the individuals they target, whether they consist of misrepresentation or discrimination. Recognizing gender bias as a pervasive societal construct, this paper studies how to uncover and quantify the presence of gender biases in generative language models. In particular, we derive generative AI analogues of three well-known non-discrimination criteria from classification, namely independence, separation and sufficiency. To demonstrate these criteria in action, we design prompts for each of the criteria with a focus on occupational gender stereotypes, specifically utilizing the medical test to introduce the ground truth in the generative AI context. Our results address the presence of occupational gender bias within such conversational language models.

Base · Origin · Large language model · Scaling · Performer

March 13, 2024

Scaling Laws of RoPE-based Extrapolation

Xiaoran Liu, Hang Yan, Shuo Zhang, Chenxin An, Xipeng Qiu, Dahua Lin

from arxiv, 26 pages, 12 figures, Accepted by ICLR 2024

The extrapolation capability of Large Language Models (LLMs) based on Rotary Position Embedding (RoPE) is currently a topic of considerable interest. The mainstream approach to addressing extrapolation with LLMs involves modifying RoPE by replacing 10000, the rotary base of $\theta_n={10000}^{-2n/d}$ in the original RoPE, with a larger value and providing longer fine-tuning text. In this work, we first observe that fine-tuning a RoPE-based LLM with either a smaller or larger base within the pre-training context length could significantly enhance its extrapolation performance. After that, we propose Scaling Laws of RoPE-based Extrapolation, a unified framework from the periodic perspective, to describe the relationship between the extrapolation performance and the base value as well as the tuning context length. In this process, we also explain the origin of the RoPE-based extrapolation issue through the critical dimension for extrapolation. Besides these observations and analyses, we achieve extrapolation up to 1 million context length within only 16K training length on LLaMA2 7B and 13B.
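The effect of the rotary base on the formula $\theta_n = \text{base}^{-2n/d}$ can be seen with a few lines of Python. This is only a sketch of the formula quoted in the abstract: the helper names `rope_angles` and `rope_periods` are hypothetical, and the period $2\pi/\theta_n$ is used here as a simple proxy for the "periodic perspective", not as the paper's scaling law.

```python
import math


def rope_angles(base, d):
    """Rotary angles theta_n = base ** (-2n / d) for n = 0 .. d/2 - 1."""
    return [base ** (-2 * n / d) for n in range(d // 2)]


def rope_periods(base, d):
    """Number of positions after which each rotary dimension pair repeats."""
    return [2 * math.pi / theta for theta in rope_angles(base, d)]


if __name__ == "__main__":
    d = 128  # per-head dimension, chosen only for illustration
    for base in (500.0, 10000.0, 1000000.0):
        periods = rope_periods(base, d)
        print(f"base={base:>9.0f}  shortest={periods[0]:.2f}  longest={periods[-1]:.0f}")
```

A larger base stretches the periods of the high-index dimensions, so fewer of them complete a full cycle within a given training length, while a smaller base shortens them. This is the quantity that changes when the base is replaced.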