亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Public art shapes our shared spaces. Public art should speak to community and context, and yet, recent work has demonstrated numerous instances of art in prominent institutions favoring outdated cultural norms and legacy communities. Motivated by this, we develop a novel recommender system to curate public art exhibits with built-in equity objectives and a local value-based allocation of constrained resources. We develop a cost matrix by drawing on Schelling's model of segregation. Using the cost matrix as an input, the scoring function is optimized via a projected gradient descent to obtain a soft assignment matrix. Our optimization program allocates artwork to public spaces in a way that de-prioritizes "in-group" preferences, by satisfying minimum representation and exposure criteria. We draw on existing literature to develop a fairness metric for our algorithmic output, and we assess the effectiveness of our approach and discuss its potential pitfalls from both a curatorial and equity standpoint.

相關內容

For sensible progress in natural language processing, it is important that we are aware of the limitations of the evaluation metrics we use. In this work, we evaluate how robust metrics are to non-standardized dialects, i.e. spelling differences in language varieties that do not have a standard orthography. To investigate this, we collect a dataset of human translations and human judgments for automatic machine translations from English to two Swiss German dialects. We further create a challenge set for dialect variation and benchmark existing metrics' performances. Our results show that existing metrics cannot reliably evaluate Swiss German text generation outputs, especially on segment level. We propose initial design adaptations that increase robustness in the face of non-standardized dialects, although there remains much room for further improvement. The dataset, code, and models are available here: //github.com/textshuttle/dialect_eval

Diffusion models have shown great potential for vision-related tasks, particularly for image generation. However, their training is typically conducted in a centralized manner, relying on data collected from publicly available sources. This approach may not be feasible or practical in many domains, such as the medical field, which involves privacy concerns over data collection. Despite the challenges associated with privacy-sensitive data, such domains could still benefit from valuable vision services provided by diffusion models. Federated learning (FL) plays a crucial role in enabling decentralized model training without compromising data privacy. Instead of collecting data, an FL system gathers model parameters, effectively safeguarding the private data of different parties involved. This makes FL systems vital for managing decentralized learning tasks, especially in scenarios where privacy-sensitive data is distributed across a network of clients. Nonetheless, FL presents its own set of challenges due to its distributed nature and privacy-preserving properties. Therefore, in this study, we explore the FL strategy to train diffusion models, paving the way for the development of federated diffusion models. We conduct experiments on various FL scenarios, and our findings demonstrate that federated diffusion models have great potential to deliver vision services to privacy-sensitive domains.

Internet of Things (IoT) devices have become prevalent, embedding intelligence into our environment. It is projected that over 75 billion IoT devices will be connected by 2025 worldwide, with the majority being operated indoors. Dye-sensitized solar cells (DSSC) have recently been optimized for ambient light, having the capabilities of providing sufficient energy for self-powered IoT devices. Interaction with digital technologies, termed Human Computer Interaction (HCI), is often achieved via physical mechanisms (e.g. remote controls, cell phones) which can hinder the natural interface between users and IoT devices, a key consideration for HCI. What if the solar cell that is powering the IoT device can also recognize hand gestures which would allow the user to naturally interact with the system? Previous attempts to achieve this have necessarily employed an array of solar cell/photodiodes to detect directionality. In this work, we demonstrate that by monitoring the photocurrent output of an asymmetrically patterned monolithic (i.e., single cell) DSSC, and using machine learning, we can recognize simple hand gestures, achieving an accuracy prediction of 97.71%. This work shows that, DSSCs are the perfect choice for self-powered interactive technologies, both in terms of powering IoT devices in ambient light conditions and having aesthetic qualities that are prioritized by users. As well as powering interactive technologies, they can also provide a means of interactive control.

We propose a diarization system, that estimates "who spoke when" based on spatial information, to be used as a front-end of a meeting transcription system running on the signals gathered from an acoustic sensor network (ASN). Although the spatial distribution of the microphones is advantageous, exploiting the spatial diversity for diarization and signal enhancement is challenging, because the microphones' positions are typically unknown, and the recorded signals are initially unsynchronized in general. Here, we approach these issues by first blindly synchronizing the signals and then estimating time differences of arrival (TDOAs). The TDOA information is exploited to estimate the speakers' activity, even in the presence of multiple speakers being simultaneously active. This speaker activity information serves as a guide for a spatial mixture model, on which basis the individual speaker's signals are extracted via beamforming. Finally, the extracted signals are forwarded to a speech recognizer. Additionally, a novel initialization scheme for spatial mixture models based on the TDOA estimates is proposed. Experiments conducted on real recordings from the LibriWASN data set have shown that our proposed system is advantageous compared to a system using a spatial mixture model, which does not make use of external diarization information.

Reconfigurable intelligent surfaces (RIS) can improve signal propagation environments by adjusting the phase of the incident signal. However, optimizing the phase shifts jointly with the beamforming vector at the access point is challenging due to the non-convex objective function and constraints. In this study, we propose an algorithm based on weighted minimum mean square error optimization and power iteration to maximize the weighted sum rate (WSR) of a RIS-assisted downlink multi-user multiple-input single-output system. To further improve performance, a model-driven deep learning (DL) approach is designed, where trainable variables and graph neural networks are introduced to accelerate the convergence of the proposed algorithm. We also extend the proposed method to include beamforming with imperfect channel state information and derive a two-timescale stochastic optimization algorithm. Simulation results show that the proposed algorithm outperforms state-of-the-art algorithms in terms of complexity and WSR. Specifically, the model-driven DL approach has a runtime that is approximately 3% of the state-of-the-art algorithm to achieve the same performance. Additionally, the proposed algorithm with 2-bit phase shifters outperforms the compared algorithm with continuous phase shift.

In the realm of few-shot learning, foundation models like CLIP have proven effective but exhibit limitations in cross-domain robustness especially in few-shot settings. Recent works add text as an extra modality to enhance the performance of these models. Most of these approaches treat text as an auxiliary modality without fully exploring its potential to elucidate the underlying class visual features distribution. In this paper, we present a novel approach that leverages text-derived statistics to predict the mean and covariance of the visual feature distribution for each class. This predictive framework enriches the latent space, yielding more robust and generalizable few-shot learning models. We demonstrate the efficacy of incorporating both mean and covariance statistics in improving few-shot classification performance across various datasets. Our method shows that we can use text to predict the mean and covariance of the distribution offering promising improvements in few-shot learning scenarios.

Pre-training has achieved remarkable success when transferred to downstream tasks. In machine learning, we care about not only the good performance of a model but also its behavior under reasonable shifts of condition. The same philosophy holds when pre-training a foundation model. However, the foundation model may not uniformly behave well for a series of related downstream tasks. This happens, for example, when conducting mask recovery regression where the recovery ability or the training instances diverge like pattern features are extracted dominantly on pre-training, but semantic features are also required on a downstream task. This paper considers pre-training a model that guarantees a uniformly good performance over the downstream tasks. We call this goal as $\textit{downstream-task robustness}$. Our method first separates the upstream task into several representative ones and applies a simple minimax loss for pre-training. We then design an efficient algorithm to solve the minimax loss and prove its convergence in the convex setting. In the experiments, we show both on large-scale natural language processing and computer vision datasets our method increases the metrics on worse-case downstream tasks. Additionally, some theoretical explanations for why our loss is beneficial are provided. Specifically, we show fewer samples are inherently required for the most challenging downstream task in some cases.

Managing project risk is a key part of the successful implementation of any large project and is widely recognized as a best practice for public agencies to deliver infrastructures. The conventional method of identifying and evaluating project risks involves getting input from subject matter experts at risk workshops in the early phases of a project. As a project moves through its life cycle, these identified risks and their assessments evolve. Some risks are realized to become issues, some are mitigated, and some are retired as no longer important. Despite the value provided by conventional expert-based approaches, several challenges remain due to the time-consuming and expensive processes involved. Moreover, limited is known about how risks evolve from ex-ante to ex-post over time. How well does the project team identify and evaluate risks in the initial phase compared to what happens during project execution? Using historical data and artificial intelligence techniques, this study addressed these limitations by introducing a data-driven framework to identify risks automatically and to examine the quality of early risk registers and risk assessments. Risk registers from more than 70 U.S. major transportation projects form the input dataset.

Diffusion models (DMs) have shown great potential for high-quality image synthesis. However, when it comes to producing images with complex scenes, how to properly describe both image global structures and object details remains a challenging task. In this paper, we present Frido, a Feature Pyramid Diffusion model performing a multi-scale coarse-to-fine denoising process for image synthesis. Our model decomposes an input image into scale-dependent vector quantized features, followed by a coarse-to-fine gating for producing image output. During the above multi-scale representation learning stage, additional input conditions like text, scene graph, or image layout can be further exploited. Thus, Frido can be also applied for conditional or cross-modality image synthesis. We conduct extensive experiments over various unconditioned and conditional image generation tasks, ranging from text-to-image synthesis, layout-to-image, scene-graph-to-image, to label-to-image. More specifically, we achieved state-of-the-art FID scores on five benchmarks, namely layout-to-image on COCO and OpenImages, scene-graph-to-image on COCO and Visual Genome, and label-to-image on COCO. Code is available at //github.com/davidhalladay/Frido.

Machine learning techniques have deeply rooted in our everyday life. However, since it is knowledge- and labor-intensive to pursue good learning performance, human experts are heavily involved in every aspect of machine learning. In order to make machine learning techniques easier to apply and reduce the demand for experienced human experts, automated machine learning (AutoML) has emerged as a hot topic with both industrial and academic interest. In this paper, we provide an up to date survey on AutoML. First, we introduce and define the AutoML problem, with inspiration from both realms of automation and machine learning. Then, we propose a general AutoML framework that not only covers most existing approaches to date but also can guide the design for new methods. Subsequently, we categorize and review the existing works from two aspects, i.e., the problem setup and the employed techniques. Finally, we provide a detailed analysis of AutoML approaches and explain the reasons underneath their successful applications. We hope this survey can serve as not only an insightful guideline for AutoML beginners but also an inspiration for future research.

北京阿比特科技有限公司