
Recent work in algorithmic fairness has highlighted the challenge of defining racial categories for the purposes of anti-discrimination. These challenges are not new but have previously fallen to the state, which enacts race through government statistics, policies, and evidentiary standards in anti-discrimination law. Drawing on the history of state race-making, we examine how longstanding questions about the nature of race and discrimination appear within the algorithmic fairness literature. Through a content analysis of 60 papers published at FAccT between 2018 and 2020, we analyze how race is conceptualized and formalized in algorithmic fairness frameworks. We note that differing notions of race are adopted inconsistently, at times even within a single analysis. We also explore the institutional influences and values associated with these choices. While we find that categories used in algorithmic fairness work often echo legal frameworks, we demonstrate that values from academic computer science play an equally important role in the construction of racial categories. Finally, we examine the reasoning behind different operationalizations of race, finding that few papers explicitly describe their choices and even fewer justify them. We argue that the construction of racial categories is a value-laden process with significant social and political consequences for the project of algorithmic fairness. The widespread lack of justification around the operationalization of race reflects institutional norms that allow these political decisions to remain obscured within the backstage of knowledge production.

Related Content

In this work, our goal is to develop a theoretical framework that can eventually be used for analyzing the effectiveness of visual stories, ranging from feature films to comic books. To develop this framework, we introduce a new story element called moments. Our conjecture is that any linear story, such as the story of a feature film, can be decomposed into a sequence of moments. Moments are defined as the perception of the actions, interactions, and expressions of all characters or a single character during a given time period. We categorize moments into two major types: story moments and discourse moments. Each type can further be classified into three subtypes, which we call universal storytelling moments. We believe these universal moments foster or erode the audience's emotional attachment to a particular character or to the story. We present a methodology to catalog the occurrences of these universal moments as they are found in the story. The cataloged moments can be represented using curves or color strips, so a character's journey through the story can be visualized as either a 3D curve or a color strip. We also demonstrate that both story and discourse moments can be aggregated into a single lump-sum attraction parameter. This attraction parameter, tracked over time, yields a function that can be plotted on a timeline to illustrate changes in the audience's emotional attachment to a character or to the story. By inspecting these functions, the story analyst can analytically identify the moments in the story where attachment is being established, maintained, or strengthened, or conversely where it is languishing.
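As an illustration only, the lump-sum attraction parameter can be read as a running total of signed moment scores along the timeline; the scoring scheme and running-sum aggregation below are assumptions made for this sketch, not the paper's exact formulation.

```python
from dataclasses import dataclass

@dataclass
class Moment:
    time: float   # position on the story timeline
    score: float  # signed contribution: positive fosters attachment, negative erodes it

def attraction_curve(moments, times):
    """Lump-sum attraction at each query time, taken here as the running
    total of all moment scores seen so far; a simplifying assumption,
    not the paper's exact aggregation."""
    return [sum(m.score for m in moments if m.time <= t) for t in times]

# Hypothetical catalog for one character: two fostering moments, one eroding one.
catalog = [Moment(1.0, +0.5), Moment(3.0, +0.8), Moment(5.0, -0.6)]
print(attraction_curve(catalog, times=[0, 2, 4, 6]))  # roughly [0, 0.5, 1.3, 0.7]
```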

Self-training is a well-known approach for semi-supervised learning. It consists of iteratively assigning pseudo-labels to unlabeled data for which the model is confident and treating them as labeled examples. For neural networks, softmax prediction probabilities are often used as a confidence measure, despite the fact that they are known to be overconfident, even for wrong predictions. This phenomenon is particularly pronounced in the presence of sample selection bias, i.e., when data labeling is subject to some constraint. To address this issue, we propose a novel confidence measure, called $\mathcal{T}$-similarity, built upon the prediction diversity of an ensemble of linear classifiers. We provide a theoretical analysis of our approach by studying stationary points and describing the relationship between the diversity of the individual members and their performance. We empirically demonstrate the benefit of our confidence measure for three different pseudo-labeling policies on classification datasets of various data modalities.
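A minimal sketch of a diversity-based confidence score in this spirit appears below: confidence is measured by how strongly an ensemble of heads agrees on an example. The mean-pairwise-agreement form is an assumption for illustration and not necessarily the paper's exact $\mathcal{T}$-similarity.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def diversity_confidence(probs):
    """probs has shape (M, K): softmax outputs of M linear heads over K
    classes for one unlabeled example. Returns the mean pairwise inner
    product of the heads' predictions, which is high when the heads agree
    and low when they diverge. An illustrative agreement score, not
    necessarily the paper's exact T-similarity."""
    M = probs.shape[0]
    gram = probs @ probs.T                 # (M, M) pairwise inner products
    return (gram.sum() - np.trace(gram)) / (M * (M - 1))

rng = np.random.default_rng(0)
heads_logits = rng.normal(size=(5, 3))     # 5 linear heads, 3 classes
print(diversity_confidence(softmax(heads_logits)))
```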

The persistent homology transform (PHT) represents a shape with a multiset of persistence diagrams parameterized by the sphere of directions in the ambient space. In this work, we describe a finite set of diagrams that discretize the PHT such that it faithfully represents the underlying shape. We provide a discretization that is exponential in the dimension of the shape. Moreover, we show that this discretization is stable with respect to various perturbations. Furthermore, we provide an algorithm for computing the discretization. Our approach relies only on knowing the heights and dimensions of topological events, which means that it can be adapted to provide discretizations of other dimension-returning topological transforms, including the Betti curve transform. With mild alterations, we also adapt our methods to faithfully discretize the Euler Characteristic curve transform.
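For context, the standard definition reads as follows: for a shape $M \subset \mathbb{R}^d$, the PHT assigns to each direction on the sphere the persistence diagram of the corresponding height filtration,

$$\mathrm{PHT}(M)\colon S^{d-1} \to \mathcal{D}, \qquad v \longmapsto \mathrm{Dgm}(f_v), \quad f_v(x) = \langle x, v \rangle \ \text{for } x \in M,$$

where $\mathrm{Dgm}(f_v)$ denotes the persistence diagram of the sublevel-set filtration of $f_v$. A discretization in the sense above replaces the sphere of directions with a finite set $\{v_1, \dots, v_m\} \subset S^{d-1}$ whose diagrams still determine the shape.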

Conventionally, during the knowledge distillation process (e.g., feature distillation), an additional projector is often required to perform feature transformation due to the dimension mismatch between the teacher and the student networks. Interestingly, we discover that even if the student and the teacher have the same feature dimensions, adding a projector still helps to improve the distillation performance; projectors even improve logit distillation when added to the architecture. Inspired by these surprising findings, and given the general lack of understanding of projectors in the existing knowledge distillation literature, this paper investigates the implicit but so far overlooked role that projectors play. Our empirical study shows that a student with a projector (1) obtains a better trade-off between training accuracy and testing accuracy than a student without one when it has the same feature dimensions as the teacher, (2) better preserves its similarity to the teacher beyond shallow and numeric resemblance, from the viewpoint of Centered Kernel Alignment (CKA), and (3) avoids inheriting the teacher's over-confidence at test time. Motivated by these positive effects, we propose a projector-ensemble-based feature distillation method to further improve distillation performance. Despite its simplicity, evaluations on benchmark classification datasets demonstrate superior performance across a broad range of teacher-student pairs, and analyses of CKA and model calibration verify that the projector ensemble improves the quality of the student's features.
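The following is a minimal sketch of projector-ensemble feature distillation under stated assumptions: the two-layer projector architecture, the number of projectors, the output averaging, and the MSE matching loss are all illustrative choices, not necessarily the paper's exact design.

```python
import torch
import torch.nn as nn

class ProjectorEnsemble(nn.Module):
    """Maps student features into the teacher's feature space through several
    independent projectors and averages the results. A minimal sketch; the
    projector architecture, count, and averaging scheme are illustrative
    assumptions rather than the paper's exact design."""
    def __init__(self, s_dim, t_dim, n_proj=3):
        super().__init__()
        self.projectors = nn.ModuleList([
            nn.Sequential(nn.Linear(s_dim, t_dim), nn.ReLU(), nn.Linear(t_dim, t_dim))
            for _ in range(n_proj)
        ])

    def forward(self, f_student):
        return torch.stack([p(f_student) for p in self.projectors]).mean(dim=0)

def feature_distillation_loss(f_student, f_teacher, ensemble):
    # Match the ensemble-projected student features to the (detached) teacher
    # features with a simple mean-squared error.
    return nn.functional.mse_loss(ensemble(f_student), f_teacher.detach())

ens = ProjectorEnsemble(s_dim=256, t_dim=512)
loss = feature_distillation_loss(torch.randn(8, 256), torch.randn(8, 512), ens)
loss.backward()
```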

With policing coming under greater scrutiny in recent years, researchers have begun to study the effects of contact between police and minority communities more thoroughly. Although archives of hundreds of thousands of recorded Broadcast Police Communications (BPC) are openly available to the public, large-scale analysis of the language of policing remains largely unexplored. While this research is critical to understanding a "pre-reflective" notion of policing, the large quantity of data presents numerous challenges for its organization and analysis. In this paper, we describe preliminary work towards enabling Speech Emotion Recognition (SER) in an analysis of the Chicago Police Department's (CPD) BPC by demonstrating the pipelined creation of a datastore that enables multimodal analysis of the composed raw audio files.
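One way such a datastore step might look in practice is sketched below. This is purely illustrative: the filename convention, the metadata fields, and the SQLite schema are invented for the example and are not the paper's pipeline.

```python
import sqlite3
from pathlib import Path

def build_datastore(audio_dir, db_path="bpc.db"):
    """Walk a directory of BPC audio clips and index them for later
    multimodal (audio + metadata) analysis. Purely illustrative: the
    assumed filename convention "<zone>_<timestamp>.wav" and the schema
    are invented for this sketch, not the paper's pipeline."""
    con = sqlite3.connect(db_path)
    con.execute("""CREATE TABLE IF NOT EXISTS clips (
                       path TEXT PRIMARY KEY,
                       zone TEXT,
                       recorded_at TEXT)""")
    for wav in Path(audio_dir).glob("**/*.wav"):
        zone, _, timestamp = wav.stem.partition("_")
        con.execute("INSERT OR IGNORE INTO clips VALUES (?, ?, ?)",
                    (str(wav), zone, timestamp))
    con.commit()
    return con
```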

In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing rapid spread and wide adoption in recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.

The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge longstanding dogma in the field. One of the most important riddles is the good empirical generalization of overparameterized models. Overparameterized models are excessively complex relative to the size of the training dataset, and consequently they fit (i.e., interpolate) the training data perfectly, even though it is usually noisy. Such interpolation of noisy data is traditionally associated with detrimental overfitting, and yet a wide range of interpolating models, from simple linear models to deep neural networks, have recently been observed to generalize extremely well on fresh test data. Indeed, the recently discovered double descent phenomenon has revealed that highly overparameterized models often improve over the best underparameterized model in test performance. Understanding learning in this overparameterized regime requires new theory and foundational empirical studies, even for the simplest case of the linear model. The underpinnings of this understanding have been laid in very recent analyses of overparameterized linear regression and related statistical learning tasks, which resulted in precise analytic characterizations of double descent. This paper provides a succinct overview of this emerging theory of overparameterized ML (henceforth abbreviated as TOPML), which explains these recent findings through a statistical signal processing perspective. We emphasize the unique aspects that define the TOPML research area as a subfield of modern ML theory and outline interesting open questions that remain.
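Double descent in the linear model can be reproduced in a few lines with minimum-norm least squares. The sketch below is illustrative: the data model, dimensions, and noise level are chosen for the example rather than taken from the paper. The printed test error rises as the number of used features approaches the number of samples, peaks at the interpolation threshold, and descends again in the overparameterized regime.

```python
import numpy as np

rng = np.random.default_rng(0)
n, D, n_test = 40, 200, 1000
w = rng.normal(size=D) / np.sqrt(D)          # true signal spread over all D features
X, Xt = rng.normal(size=(n, D)), rng.normal(size=(n_test, D))
y = X @ w + 0.1 * rng.normal(size=n)
yt = Xt @ w

for p in [10, 20, 40, 80, 200]:              # model uses only the first p features
    # Minimum-norm least squares: the unique OLS solution when p <= n and
    # the least-norm interpolator when p > n.
    w_hat = np.linalg.pinv(X[:, :p]) @ y
    err = np.mean((Xt[:, :p] @ w_hat - yt) ** 2)
    print(f"p={p:3d}  test MSE={err:.3f}")   # peaks near p = n, descends after
```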

The dominant NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction-based question answering, or machine translation). However, it builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information, and it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and to propose benchmarks and evaluation metrics to measure its effect on current deep learning architectures. We then proceed to take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we show that these approaches yield models that are more robust on a selection of realistic problems. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule which alleviates catastrophic forgetting during adaptation.
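To make the DRO idea concrete, the sketch below shows one common non-parametric instantiation, in the spirit of group DRO: adversarial weights over data groups are updated by exponentiated gradient and the model is trained on the reweighted loss. This is a swapped-in illustration of the general framework, not the thesis's parametric reformulation.

```python
import torch

def dro_reweight(losses_per_group, q, eta=0.1):
    """Exponentiated-gradient update of the adversarial group weights q,
    followed by a q-weighted robust loss. A simple non-parametric DRO
    instantiation (group-DRO style); the thesis instead parameterizes the
    adversarial distribution, which this sketch does not reproduce."""
    with torch.no_grad():
        q.mul_(torch.exp(eta * losses_per_group))  # upweight high-loss groups
        q.div_(q.sum())                            # renormalize to a distribution
    return (q * losses_per_group).sum()            # robust loss to backprop

q = torch.ones(3) / 3                              # three hypothetical data groups
losses = torch.tensor([0.2, 1.5, 0.7], requires_grad=True)
robust_loss = dro_reweight(losses, q)
robust_loss.backward()                             # gradient concentrates on group 2
```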

This work considers the question of how convenient access to copious data impacts our ability to learn causal effects and relations. In what ways is learning causality in the era of big data different from -- or the same as -- the traditional one? To answer this question, this survey provides a comprehensive and structured review of both traditional and frontier methods in learning causality and relations along with the connections between causality and machine learning. This work points out on a case-by-case basis how big data facilitates, complicates, or motivates each approach.

Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models. This survey provides a brief introduction to the field and a quick overview of deep learning architectures and methods. It then sifts through the plethora of recent studies and summarizes a large assortment of relevant contributions. Analyzed research areas include several core linguistic processing issues in addition to a number of applications of computational linguistics. A discussion of the current state of the art is then provided along with recommendations for future research in the field.
