Fairness in decision-making processes is often quantified using probabilistic metrics. However, these metrics may not fully capture the real-world consequences of unfairness. In this article, we adopt a utility-based approach to more accurately measure the real-world impacts of decision-making processes. In particular, we show that employing the concept of $\varepsilon$-fairness can lead to outcomes that are maximally unfair in the real-world context. Additionally, we address the common issue of unavailable data on false negatives by proposing a reduced setting that still captures essential fairness considerations. We illustrate our findings with two real-world examples: college admissions and credit risk assessment. Our analysis reveals that while traditional probability-based evaluations might suggest fairness, a utility-based approach uncovers the actions necessary to truly achieve equality. For instance, in the college admission case, we find that enhancing completion rates is crucial for ensuring fairness. In summary, this paper highlights the importance of considering the real-world context when evaluating fairness.

Related content

Weight · Graph · Edge · Undirected · Undirected graph

June 25, 2024

Dynamic Metric Embedding into $\ell_p$ Space

Kiarash Banihashem, MohammadTaghi Hajiaghayi, Dariusz R. Kowalski, Jan Olkowski, Max Springer

from arXiv, accepted to ICML 2024 (15 pages, 3 figures)

We give the first non-trivial decremental dynamic embedding of a weighted, undirected graph $G$ into $\ell_p$ space. Given a weighted graph $G$ undergoing a sequence of edge weight increases, the goal of this problem is to maintain a (randomized) mapping $\phi: (G,d) \to (X,\ell_p)$ from the set of vertices of the graph to the $\ell_p$ space such that for every pair of vertices $u$ and $v$, the expected distance between $\phi(u)$ and $\phi(v)$ in the $\ell_p$ metric is within a small multiplicative factor, referred to as the distortion, of their distance in $G$. Our main result is a dynamic algorithm with expected distortion $O(\log^2 n)$ and total update time $O\left((m^{1+o(1)} \log^2 W + Q)\log(nW) \right)$, where $W$ is the maximum weight of the edges, $Q$ is the total number of updates and $n, m$ denote the number of vertices and edges in $G$ respectively. This is the first result of its kind, extending the seminal result of Bourgain to the expanding field of dynamic algorithms. Moreover, we demonstrate that in the fully dynamic regime, where we tolerate edge insertions as well as deletions, no algorithm can explicitly maintain an embedding into $\ell_p$ space that has a low distortion with high probability.

GANs · Statistic · Functional · Linear · Performer

June 24, 2024

Concentration Inequalities for $(f,\Gamma)$-GANs

Jeremiah Birrell

from arXiv, 21 pages

Generative adversarial networks (GANs) are unsupervised learning methods for training a generator distribution to produce samples that approximate those drawn from a target distribution. Many such methods can be formulated as minimization of a metric or divergence. Recent works have proven the statistical consistency of GANs that are based on integral probability metrics (IPMs), e.g., WGAN which is based on the 1-Wasserstein metric. IPMs are defined by optimizing a linear functional (difference of expectations) over a space of discriminators. A much larger class of GANs, which allow for the use of nonlinear objective functionals, can be constructed using $(f,\Gamma)$-divergences; these generalize and interpolate between IPMs and $f$-divergences (e.g., KL or $\alpha$-divergences). Instances of $(f,\Gamma)$-GANs have been shown to exhibit improved performance in a number of applications. In this work we study the statistical consistency of $(f,\Gamma)$-GANs for general $f$ and $\Gamma$. Specifically, we derive finite-sample concentration inequalities. These derivations require novel arguments due to nonlinearity of the objective functional. We demonstrate that our new results reduce to the known results for IPM-GANs in the appropriate limit while also significantly extending the domain of applicability of this theory.

Performer · Learning · Analysis · Deep reinforcement learning · Diversity

June 24, 2024

$\text{Alpha}^2$: Discovering Logical Formulaic Alphas using Deep Reinforcement Learning

Feng Xu, Yan Yin, Xinyu Zhang, Tianyuan Liu, Shengyi Jiang, Zongzhang Zhang

Alphas are pivotal in providing signals for quantitative trading. The industry highly values the discovery of formulaic alphas for their interpretability and ease of analysis, compared with the expressive yet overfitting-prone black-box alphas. In this work, we focus on discovering formulaic alphas. Prior studies on automatically generating a collection of formulaic alphas were mostly based on genetic programming (GP), which is known to suffer from sensitivity to the initial population, convergence to local optima, and slow computation speed. Recent efforts employing deep reinforcement learning (DRL) for alpha discovery have not fully addressed key practical considerations such as alpha correlations and validity, which are crucial for their effectiveness. In this work, we propose a novel framework for alpha discovery using DRL by formulating the alpha discovery process as program construction. Our agent, $\text{Alpha}^2$, assembles an alpha program optimized for an evaluation metric. A search algorithm guided by DRL navigates through the search space based on value estimates for potential alpha outcomes. The evaluation metric encourages both the performance and the diversity of alphas for a better final trading strategy. Our formulation of searching alphas also brings the advantage of pre-calculation dimensional analysis, ensuring the logical soundness of alphas, and pruning the vast search space to a large extent. Empirical experiments on real-world stock markets demonstrate $\text{Alpha}^2$'s capability to identify a diverse set of logical and effective alphas, which significantly improves the performance of the final trading strategy. The code of our method is available at //github.com/x35f/alpha2.

Discretization · Discrete mathematics

June 24, 2024

Local Limit Theorems for $q$-Multinomial and Multiple Heine Distributions

Malvina Vamvakari

from arXiv, in Proceedings GASCom 2024, arXiv:2406.14588

In this work we establish local limit theorems for $q$-multinomial and multiple Heine distributions. Specifically, we establish the pointwise convergence of the $q$-multinomial distribution of the first kind, as well as of its discrete limit, the multiple Heine distribution, to a multivariate Stieltjes-Wigert type distribution.

Optimizer · Linear · Rank · Jacobian · Positive definite

June 23, 2024

The $\omega$-Condition Number for Optimal Preconditioning and Low Rank Generalized Jacobian Updating

Woosuk L. Jung, David Torregrosa-Belén, Henry Wolkowicz

Preconditioning is essential in iterative methods for solving linear systems. It is also the implicit objective in updating approximations of Jacobians in optimization methods, e.g., in quasi-Newton methods. Motivated by the latter, we study a nonclassical matrix condition number, the $\omega$-condition number. We do this in the context of optimal conditioning for: (i) our application to low rank updating of generalized Jacobians; (ii) iterative methods for linear systems: (iia) clustering of eigenvalues and (iib) convergence rates. For a positive definite matrix, the $\omega$-condition measure is the ratio of the arithmetic and geometric means of the eigenvalues. In particular, our applications concentrate on linear systems with low rank updates of ill-conditioned positive definite matrices. These systems arise in the context of nonsmooth Newton methods using generalized Jacobians. We are able to use optimality conditions and derive explicit formulae for $\omega$-optimal preconditioners and preconditioned updates. Connections to partial Cholesky sparse preconditioners are made. Evaluating or estimating the classical condition number $\kappa$ can be expensive. We show that the $\omega$-condition number can be evaluated explicitly following a Cholesky or LU factorization. Moreover, the simplicity of $\omega$ allows for the derivation of formulae for optimal preconditioning in various scenarios, i.e., this avoids the need for expensive algorithmic calculations. Empirically, $\omega$ estimates the actual conditioning of a linear system significantly better than $\kappa$. Moreover, our empirical results show a significant decrease in the number of iterations required for a requested accuracy in the residual during an iterative method, i.e., these results confirm the efficacy of using the $\omega$-condition number compared to the classical condition number.
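
The definition quoted in the abstract (arithmetic mean of the eigenvalues divided by their geometric mean) can be illustrated without computing any eigenvalues, since the arithmetic mean equals $\operatorname{trace}(A)/n$ and the geometric mean equals $\det(A)^{1/n}$, with the determinant read off a Cholesky factor. The following is a minimal sketch under those identities, not the authors' implementation; the function names `cholesky` and `omega_condition` are hypothetical, and the input is assumed to be a symmetric positive definite matrix given as a list of rows.

```python
import math


def cholesky(A):
    """Return lower-triangular L with A = L L^T (A symmetric positive definite).

    Raises ValueError when a non-positive pivot shows A is not positive definite.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def omega_condition(A):
    """omega(A) = (trace(A) / n) / det(A)**(1 / n), computed from a Cholesky factor."""
    n = len(A)
    L = cholesky(A)
    arithmetic_mean = sum(A[i][i] for i in range(n)) / n
    # det(A) = (product of diag(L))**2, so work with logarithms to avoid overflow.
    log_geometric_mean = 2.0 * sum(math.log(L[i][i]) for i in range(n)) / n
    return arithmetic_mean / math.exp(log_geometric_mean)


if __name__ == "__main__":
    # Eigenvalues of this matrix are 1 and 3, so omega = 2 / sqrt(3).
    print(omega_condition([[2.0, 1.0], [1.0, 2.0]]))
```

By the arithmetic-geometric mean inequality the value is at least 1, with equality exactly when all eigenvalues coincide, which is why a smaller value indicates a better conditioned matrix.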

MoDELS · Rank · Statistic · INFORMS · Marginalization

June 22, 2024

Statistical Models of Top-$k$ Partial Orders

Amel Awadelkarim, Johan Ugander

from arXiv, 9 pages, 5 figures

In many contexts involving ranked preferences, agents submit partial orders over available alternatives. Statistical models often treat these as marginal in the space of total orders, but this approach overlooks information contained in the list length itself. In this work, we introduce and taxonomize approaches for jointly modeling distributions over top-$k$ partial orders and list lengths $k$, considering two classes of approaches: composite models that view a partial order as a truncation of a total order, and augmented ranking models that model the construction of the list as a sequence of choice decisions, including the decision to stop. For composite models, we consider three dependency structures for joint modeling of order and truncation length. For augmented ranking models, we consider different assumptions on how the stop-token choice is modeled. Using data consisting of partial rankings from San Francisco school choice and San Francisco ranked choice elections, we evaluate how well the models predict observed data and generate realistic synthetic datasets. We find that composite models, explicitly modeling length as a categorical variable, produce synthetic datasets with accurate length distributions, and an augmented model with position-dependent item utilities jointly models length and preferences in the training data best, as measured by negative log loss. Methods from this work have significant implications on the simulation and evaluation of real-world social systems that solicit ranked preferences.

Transformation · MoDELS · Language modeling · Representational capacity · Comprehensibility

June 20, 2024

Transformers Can Represent $n$-gram Language Models

Anej Svete, Ryan Cotterell

Existing work has analyzed the representational capacity of the transformer architecture by means of formal models of computation. However, the focus so far has been on analyzing the architecture in terms of language acceptance. We contend that this is an ill-suited problem in the study of language models (LMs), which are definitionally probability distributions over strings. In this paper, we focus on the relationship between transformer LMs and $n$-gram LMs, a simple and historically relevant class of language models. We show that transformer LMs using the hard or sparse attention mechanisms can exactly represent any $n$-gram LM, giving us a concrete lower bound on their probabilistic representational capacity. This provides a first step towards understanding the mechanisms that transformer LMs can use to represent probability distributions over strings.
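
To make concrete what it means for an $n$-gram LM to be a probability distribution over strings, the sketch below builds a maximum-likelihood $n$-gram model from a toy corpus and scores whole strings, including the end-of-string event. It illustrates only the standard $n$-gram definition, not the transformer construction from the paper; the function names, the `<s>`/`</s>` padding symbols, and the toy corpus are all assumptions made for this example.

```python
from collections import Counter, defaultdict

BOS, EOS = "<s>", "</s>"  # hypothetical padding and end-of-string symbols


def train_ngram(corpus, n):
    """Maximum-likelihood n-gram LM: maps each (n-1)-token context to a next-token distribution."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = [BOS] * (n - 1) + list(sentence) + [EOS]
        for i in range(n - 1, len(tokens)):
            context = tuple(tokens[i - n + 1:i])
            counts[context][tokens[i]] += 1
    return {
        context: {tok: c / sum(ctr.values()) for tok, c in ctr.items()}
        for context, ctr in counts.items()
    }


def string_probability(model, sentence, n):
    """Probability of the whole string: product of next-token probabilities, ending with EOS."""
    tokens = [BOS] * (n - 1) + list(sentence) + [EOS]
    prob = 1.0
    for i in range(n - 1, len(tokens)):
        context = tuple(tokens[i - n + 1:i])
        prob *= model.get(context, {}).get(tokens[i], 0.0)
    return prob


if __name__ == "__main__":
    bigram = train_ngram(["ab", "ab", "aa"], n=2)
    for s in ["ab", "a", "aa", "b"]:
        print(s, string_probability(bigram, s, n=2))
```

Each next-token probability depends only on the previous $n-1$ tokens, which is the property a transformer LM would have to reproduce exactly in order to represent such a model.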

Dataset · Networking · Neural Networks · MoDELS · ForCES

June 20, 2024

$\nabla^2$DFT: A Universal Quantum Chemistry Dataset of Drug-Like Molecules and a Benchmark for Neural Network Potentials

Kuzma Khrabrov, Anton Ber, Artem Tsypin, Konstantin Ushenin, Egor Rumiantsev, Alexander Telepov, Dmitry Protasov, Ilya Shenbin, Anton Alekseev, Mikhail Shirokikh, Sergey Nikolenko, Elena Tutubalina, Artur Kadurin

Methods of computational quantum chemistry provide accurate approximations of molecular properties crucial for computer-aided drug discovery and other areas of chemical science. However, high computational complexity limits the scalability of their applications. Neural network potentials (NNPs) are a promising alternative to quantum chemistry methods, but they require large and diverse datasets for training. This work presents a new dataset and benchmark called $\nabla^2$DFT that is based on nablaDFT. It contains twice as many molecular structures, three times more conformations, new data types and tasks, and state-of-the-art models. The dataset includes energies, forces, 17 molecular properties, Hamiltonian and overlap matrices, and a wavefunction object. All calculations were performed at the DFT level ($\omega$B97X-D/def2-SVP) for each conformation. Moreover, $\nabla^2$DFT is the first dataset that contains relaxation trajectories for a substantial number of drug-like molecules. We also introduce a novel benchmark for evaluating NNPs in molecular property prediction, Hamiltonian prediction, and conformational optimization tasks. Finally, we propose an extendable framework for training NNPs and implement 10 models within it.

Fidelity · Graph · Pruning · Feasible · Edge

June 17, 2024

On the Feasibility of Fidelity$^-$ for Graph Pruning

Yong-Min Shin, Won-Yong Shin

from arXiv, 6 pages, 3 figures, 2 tables; IJCAI Workshop on Explainable AI (XAI 2024) (to appear) (Please cite our workshop version.)

As one of the popular quantitative metrics for assessing the quality of explanations of graph neural networks (GNNs), fidelity measures the output difference after removing unimportant parts of the input graph. Fidelity has been widely used due to its straightforward interpretation that the underlying model should produce similar predictions when features deemed unimportant by the explanation are removed. This raises a natural question: "Does fidelity induce a global (soft) mask for graph pruning?" To answer this, we explore the potential of the fidelity measure to be used for graph pruning, eventually enhancing the GNN models for better efficiency. To this end, we propose Fidelity$^-$-inspired Pruning (FiP), an effective framework to construct global edge masks from local explanations. Our empirical observations using 7 edge attribution methods demonstrate that, surprisingly, general eXplainable AI methods outperform methods tailored to GNNs in terms of graph pruning performance.

Decomposable · Reducible · Cluster · Diameter · Optimizer

June 16, 2024

Moderate Dimension Reduction for $k$-Center Clustering

Shaofeng H.-C. Jiang, Robert Krauthgamer, Shay Sapir

from arXiv, 23 pages, appeared in SoCG 2024. Minor corrections on page 8 and in Section 5

The Johnson-Lindenstrauss (JL) Lemma introduced the concept of dimension reduction via a random linear map, which has become a fundamental technique in many computational settings. For a set of $n$ points in $\mathbb{R}^d$ and any fixed $\epsilon>0$, it reduces the dimension $d$ to $O(\log n)$ while preserving, with high probability, all the pairwise Euclidean distances within factor $1+\epsilon$. Perhaps surprisingly, the target dimension can be lower if one only wishes to preserve the optimal value of a certain problem on the point set, e.g., Euclidean max-cut or $k$-means. However, for some notorious problems, like diameter (aka furthest pair), dimension reduction via the JL map to below $O(\log n)$ does not preserve the optimal value within factor $1+\epsilon$. We propose to focus on another regime, of moderate dimension reduction, where a problem's value is preserved within factor $\alpha>1$ using target dimension $\tfrac{\log n}{\operatorname{poly}(\alpha)}$. We establish the viability of this approach and show that the famous $k$-center problem is $\alpha$-approximated when reducing to dimension $O(\tfrac{\log n}{\alpha^2}+\log k)$. Along the way, we address the diameter problem via the special case $k=1$. Our result extends to several important variants of $k$-center (with outliers, capacities, or fairness constraints), and the bound improves further with the input's doubling dimension. While our $\operatorname{poly}(\alpha)$-factor improvement in the dimension may seem small, it actually has significant implications for streaming algorithms, and easily yields an algorithm for $k$-center in dynamic geometric streams, that achieves $O(\alpha)$-approximation using space $\operatorname{poly}(kdn^{1/\alpha^2})$. This is the first algorithm to beat $O(n)$ space in high dimension $d$, as all previous algorithms require space at least $\exp(d)$. Furthermore, it extends to the $k$-center variants mentioned above.
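
The random linear map mentioned in the abstract can be sketched as multiplication by a dense Gaussian matrix, which is one standard JL construction; the abstract does not say which map the paper analyzes, so this choice is an assumption. The helper names `jl_project` and `diameter`, the seed handling, and the toy dimensions are likewise hypothetical. The example compares the diameter (furthest pair) before and after projection, the quantity the abstract singles out as hard to preserve within factor $1+\epsilon$ at low target dimension.

```python
import math
import random


def jl_project(points, target_dim, seed=0):
    """Map each point x to Gx, where G is target_dim x d with i.i.d. N(0, 1/target_dim) entries.

    With this scaling, squared Euclidean norms are preserved in expectation.
    """
    rng = random.Random(seed)
    d = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    G = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(target_dim)]
    return [[sum(g * x for g, x in zip(row, p)) for row in G] for p in points]


def diameter(points):
    """Largest pairwise Euclidean distance (furthest pair), by brute force."""
    return max(
        math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]
    )


if __name__ == "__main__":
    rng = random.Random(1)
    pts = [[rng.uniform(-1.0, 1.0) for _ in range(200)] for _ in range(40)]
    for k in (5, 30, 100):
        ratio = diameter(jl_project(pts, k)) / diameter(pts)
        print(f"target_dim={k}: projected/original diameter = {ratio:.3f}")
```

Running it with several target dimensions gives a feel for the trade-off the paper studies: smaller dimensions allow larger deviations of the ratio from 1, which is acceptable in the moderate regime where only an $\alpha$-approximation is required.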