
In this paper, we propose a new class of local differential privacy (LDP) schemes based on combinatorial block designs for discrete distribution estimation. This class not only recovers many known LDP schemes in a unified framework of combinatorial block design, but also suggests a novel way of finding new schemes achieving the exactly optimal (or near-optimal) privacy-utility trade-off with lower communication costs. Indeed, we find many new LDP schemes that achieve the exactly optimal privacy-utility trade-off, with the minimum communication cost among all unbiased or consistent schemes, for certain combinations of input data size and LDP constraint. Furthermore, to partially address the issue that block designs exist only for a sparse set of parameters, we consider a broader class of LDP schemes based on regular and pairwise-balanced designs, called RPBD schemes, which relax one of the symmetry requirements on block designs. By considering this broader class of RPBD schemes, we can find LDP schemes achieving a near-optimal privacy-utility trade-off with reasonably low communication costs for a much larger set of input data sizes and LDP constraints.
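As a rough illustration of how a block-design-based randomizer can operate, the Python sketch below reports one block of an incidence structure as the privatized output, with blocks containing the true input up-weighted by exp(eps); for a point-regular design the normalizer is the same for every input, which keeps the mechanism eps-LDP. The function name, the simple weighting, and the Fano-plane example are illustrative choices, not the paper's construction.

import math, random

def block_design_ldp(x, blocks, eps):
    # Report one block; blocks containing the true input x get weight exp(eps),
    # all other blocks get weight 1. For a point-regular design the total weight
    # is identical for every input, so the output probabilities of any two inputs
    # differ by a factor of at most exp(eps).
    weights = [math.exp(eps) if x in B else 1.0 for B in blocks]
    total = sum(weights)
    r, acc = random.random() * total, 0.0
    for B, w in zip(blocks, weights):
        acc += w
        if r <= acc:
            return B
    return blocks[-1]

# Toy example: the Fano plane (a 2-(7,3,1) design) as the output alphabet for 7 inputs.
fano = [frozenset(b) for b in
        [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5)]]
print(block_design_ldp(3, fano, eps=1.0))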

Related Content

Despite the progress seen in classification methods, current approaches for handling videos with distribution shifts in source and target domains remain source-dependent as they require access to the source data during the adaptation stage. In this paper, we present a self-training based source-free video domain adaptation approach to address this challenge by bridging the gap between the source and the target domains. We use the source pre-trained model to generate pseudo-labels for the target domain samples, which are inevitably noisy. Thus, we treat the problem of source-free video domain adaptation as learning from noisy labels and argue that the samples with correct pseudo-labels can help us in adaptation. To this end, we leverage the cross-entropy loss as an indicator of the correctness of the pseudo-labels and use the resulting small-loss samples from the target domain for fine-tuning the model. We further enhance the adaptation performance by implementing a teacher-student framework, in which the teacher, which is updated gradually, produces reliable pseudo-labels. Meanwhile, the student undergoes fine-tuning on the target domain videos using these generated pseudo-labels to improve its performance. Extensive experimental evaluations show that our methods, termed CleanAdapt and CleanAdapt + TS, achieve state-of-the-art results, outperforming the existing approaches on various open datasets. Our source code is publicly available at //avijit9.github.io/CleanAdapt.
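The small-loss selection step can be pictured with the following PyTorch-style sketch; the function name and the fixed keep ratio are illustrative assumptions rather than the authors' exact implementation.

import torch
import torch.nn.functional as F

def select_small_loss_samples(model, loader, keep_ratio=0.5, device="cpu"):
    # Pseudo-label each unlabeled target clip with the source-pretrained model and
    # keep the keep_ratio fraction with the smallest cross-entropy against its own
    # pseudo-label, assuming low loss correlates with a correct pseudo-label.
    model.eval()
    losses, samples = [], []
    with torch.no_grad():
        for clips in loader:                      # batches of unlabeled target clips
            clips = clips.to(device)
            logits = model(clips)
            pseudo = logits.argmax(dim=1)         # pseudo-labels from the source model
            loss = F.cross_entropy(logits, pseudo, reduction="none")
            for clip, y, l in zip(clips, pseudo, loss):
                losses.append(l.item())
                samples.append((clip.cpu(), y.item()))
    k = int(keep_ratio * len(samples))
    keep = sorted(range(len(samples)), key=lambda i: losses[i])[:k]
    return [samples[i] for i in keep]             # (clip, pseudo-label) pairs for fine-tuning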

The bi-encoder architecture provides a framework for understanding machine-learned retrieval models based on dense and sparse vector representations. Although these representations capture parametric realizations of the same underlying conceptual framework, their respective implementations of top-$k$ similarity search require the coordination of different software components (e.g., inverted indexes, HNSW indexes, and toolkits for neural inference), often knitted together in complex architectures. In this work, we ask the following question: What's the simplest design, in terms of requiring the fewest changes to existing infrastructure, that can support end-to-end retrieval with modern dense and sparse representations? The answer appears to be that Lucene is sufficient, as we demonstrate in Anserini, a toolkit for reproducible information retrieval research. That is, effective retrieval with modern single-vector neural models can be efficiently performed directly in Java on the CPU. We examine the implications of this design for information retrieval researchers pushing the state of the art as well as for software engineers building production search systems.
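For reference, the computation that both inverted indexes and HNSW indexes accelerate is a plain top-k inner-product search over encoder outputs; the NumPy sketch below is a brute-force stand-in for that operation (toy random vectors, not Anserini code).

import numpy as np

def topk_dense(query_vec, doc_matrix, k=10):
    # Exhaustive top-k inner-product search; an HNSW (or inverted) index computes an
    # approximation of exactly this ranking with sub-linear query time.
    scores = doc_matrix @ query_vec
    idx = np.argsort(-scores)[:k]
    return list(zip(idx.tolist(), scores[idx].tolist()))

docs = np.random.randn(1000, 768).astype(np.float32)   # stand-in single-vector encodings
q = np.random.randn(768).astype(np.float32)
print(topk_dense(q, docs, k=3))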

Vision-language models (VLMs) pre-trained on web-scale datasets have demonstrated remarkable capabilities on downstream tasks when fine-tuned with minimal data. However, many VLMs rely on proprietary data and are not open-source, which restricts the use of white-box approaches for fine-tuning. As such, we aim to develop a black-box approach to optimize VLMs through natural language prompts, thereby avoiding the need to access model parameters, feature embeddings, or even output logits. We propose employing chat-based LLMs to search for the best text prompt for VLMs. Specifically, we adopt an automatic hill-climbing procedure that converges to an effective prompt by evaluating the performance of current prompts and asking LLMs to refine them based on textual feedback, all within a conversational process without human-in-the-loop. In a challenging 1-shot image classification setup, our simple approach surpasses the white-box continuous prompting method (CoOp) by an average of 1.5% across 11 datasets including ImageNet. Our approach also outperforms both human-engineered and LLM-generated prompts. We highlight the advantage of conversational feedback that incorporates both positive and negative prompts, suggesting that LLMs can utilize the implicit gradient direction in textual feedback for a more efficient search. In addition, we find that the text prompts generated through our strategy are not only more interpretable but also transfer well across different VLM architectures in a black-box manner. Lastly, we demonstrate our framework on a state-of-the-art black-box VLM (DALL-E 3) for text-to-image optimization.
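A minimal sketch of the hill-climbing loop is given below; evaluate and llm_refine are placeholder callables for the black-box VLM scoring and the chat-LLM refinement step, since the actual APIs and prompt templates are not specified here.

def hill_climb_prompt(init_prompt, evaluate, llm_refine, iters=20):
    # evaluate(prompt)    -> validation accuracy of the VLM under this text prompt
    # llm_refine(history) -> a new candidate prompt proposed by a chat LLM, given
    #                        past (prompt, score) pairs as textual feedback
    best_prompt, best_score = init_prompt, evaluate(init_prompt)
    history = [(best_prompt, best_score)]
    for _ in range(iters):
        candidate = llm_refine(history)      # the LLM sees both good and bad prompts
        score = evaluate(candidate)
        history.append((candidate, score))
        if score > best_score:               # hill climbing: keep only improving moves
            best_prompt, best_score = candidate, score
    return best_prompt, best_score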

We demonstrate a user-focused verification approach for evaluating probability forecasts of binary outcomes (also known as probabilistic classifiers) that is (i) based on proper scoring rules, (ii) focuses on user decision thresholds, and (iii) provides actionable insights. We argue that the widespread use of categorical performance diagrams and the critical success index to evaluate probabilistic forecasts may produce misleading results and instead illustrate how Murphy diagrams are better for understanding performance across user decision thresholds. The use of proper scoring rules that account for the relative importance of different user decision thresholds is shown to impact scores of overall performance, as well as supporting measures of discrimination and calibration. These methods are demonstrated by evaluating several probabilistic thunderstorm forecast systems. Furthermore, we illustrate an approach that allows a fair comparison between continuous probabilistic forecasts and categorical outlooks using the FIxed Risk Multicategorical (FIRM) score and establish the relationship between the FIRM score and Murphy diagrams. The results highlight how the performance of thunderstorm forecasts produced for tropical Australian waters varies between operational meteorologists and an automated system depending on what decision thresholds a user is acting on. A hindcast of a new automated system is shown to generally perform better than both meteorologists and the old automated system across tropical Australian waters. While the methods are illustrated using thunderstorm forecasts, they are applicable for evaluating probabilistic forecasts for any situation with binary outcomes.
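A Murphy diagram plots, against the decision threshold, the mean of an elementary score of the kind sketched below (written for binary outcomes, up to boundary conventions); the code is an illustrative sketch, not the verification pipeline used in the paper.

import numpy as np

def elementary_score(p, y, theta):
    # Mean elementary score at decision threshold theta for binary outcomes:
    # a miss (y = 1 but p <= theta) costs (1 - theta); a false alarm
    # (y = 0 but p > theta) costs theta. Lower is better.
    p, y = np.asarray(p, dtype=float), np.asarray(y, dtype=float)
    miss = (1.0 - theta) * ((y == 1) & (p <= theta))
    false_alarm = theta * ((y == 0) & (p > theta))
    return float(np.mean(miss + false_alarm))

# One point of a Murphy diagram: the score at the theta = 0.3 decision threshold.
print(elementary_score([0.1, 0.7, 0.4], [0, 1, 1], theta=0.3))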

In this paper, we address the limitations of existing text-to-image diffusion models in generating demographically fair results when given human-related descriptions. These models often struggle to disentangle the target language context from sociocultural biases, resulting in biased image generation. To overcome this challenge, we propose Fair Mapping, a general, model-agnostic, and lightweight approach that modifies a pre-trained text-to-image model by controlling the prompt to achieve fair image generation. One key advantage of our approach is its high efficiency. The training process only requires updating a small number of parameters in an additional linear mapping network. This not only reduces the computational cost but also accelerates the optimization process. We first demonstrate the issue of bias in generated results caused by language biases in text-guided diffusion models. By developing a mapping network that projects language embeddings into an unbiased space, we enable the generation of relatively balanced demographic results based on a keyword specified in the prompt. With comprehensive experiments on face image generation, we show that our method significantly improves image generation performance when prompted with descriptions related to human faces. By effectively addressing the issue of bias, we produce fairer and more diverse image outputs. This work contributes to the field of text-to-image generation by enhancing the ability to generate images that accurately reflect the intended demographic characteristics specified in the text.
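The "small number of parameters in an additional linear mapping network" can be pictured with the PyTorch sketch below: a single residual linear layer applied to frozen text embeddings. The layer shape, residual form, and initialization are illustrative guesses, not the paper's exact architecture.

import torch
import torch.nn as nn

class FairLinearMap(nn.Module):
    # A single residual linear layer applied to frozen text-encoder embeddings, so
    # only dim*dim + dim parameters are trained while the diffusion model is untouched.
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        nn.init.zeros_(self.proj.weight)   # start as the identity mapping
        nn.init.zeros_(self.proj.bias)

    def forward(self, text_emb):           # text_emb: (batch, tokens, dim)
        return text_emb + self.proj(text_emb)

mapper = FairLinearMap(dim=768)
emb = torch.randn(2, 77, 768)              # CLIP-like text embeddings
print(mapper(emb).shape)                   # torch.Size([2, 77, 768])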

We propose a robust transceiver design for a covert integrated sensing and communications (ISAC) system with imperfect channel state information (CSI). Considering both bounded and probabilistic CSI error models, we formulate worst-case and outage-constrained robust optimization problems of joint transceiver beamforming and radar waveform design to balance the radar performance of multiple targets while ensuring communications performance and covertness of the system. The optimization problems are challenging due to the non-convexity arising from the semi-infinite constraints (SICs) and the coupled transceiver variables. In an effort to tackle the former difficulty, the S-procedure and a Bernstein-type inequality are introduced for converting the SICs into finite convex linear matrix inequalities (LMIs) and second-order cone constraints. A robust alternating optimization framework, referred to as alternating double-checking, is developed for decoupling the transceiver design problem into feasibility-checking transmitter- and receiver-side subproblems, transforming the rank-one constraints into a set of LMIs, and verifying the feasibility of beamforming by invoking the matrix-lifting scheme. Numerical results are provided to demonstrate the effectiveness and robustness of the proposed algorithm in improving the performance of covert ISAC systems.
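For context, the S-procedure typically invoked to turn a bounded-error semi-infinite constraint into a finite LMI can be stated as follows (standard form; the paper's exact application may differ):

Let $f_i(\mathbf{x}) = \mathbf{x}^H \mathbf{A}_i \mathbf{x} + 2\,\mathrm{Re}\{\mathbf{b}_i^H \mathbf{x}\} + c_i$, $i = 0, 1$, with Hermitian $\mathbf{A}_i$. If there exists $\hat{\mathbf{x}}$ such that $f_1(\hat{\mathbf{x}}) < 0$, then
$$ \bigl(f_1(\mathbf{x}) \le 0 \;\Rightarrow\; f_0(\mathbf{x}) \le 0\bigr) \;\Longleftrightarrow\; \exists\, \lambda \ge 0:\ \lambda \begin{bmatrix} \mathbf{A}_1 & \mathbf{b}_1 \\ \mathbf{b}_1^H & c_1 \end{bmatrix} - \begin{bmatrix} \mathbf{A}_0 & \mathbf{b}_0 \\ \mathbf{b}_0^H & c_0 \end{bmatrix} \succeq \mathbf{0}. $$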

Current models for event causality identification (ECI) mainly adopt a supervised framework, which heavily relies on labeled data for training. Unfortunately, the scale of current annotated datasets is relatively limited, which cannot provide sufficient support for models to capture useful indicators from causal statements, especially for handling new, unseen cases. To alleviate this problem, we propose a novel approach, named CauSeRL, which leverages external causal statements for event causality identification. First, we design a self-supervised framework to learn context-specific causal patterns from external causal statements. Then, we adopt a contrastive transfer strategy to incorporate the learned context-specific causal patterns into the target ECI model. Experimental results show that our method significantly outperforms previous methods on EventStoryLine and Causal-TimeBank (+2.0 and +3.4 F1 points, respectively).
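To make the contrastive transfer idea concrete, the sketch below shows a generic InfoNCE-style contrastive loss of the kind such a strategy could use; it is illustrative only and not necessarily the paper's exact objective.

import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, temperature=0.1):
    # anchor, positive: (batch, dim); negatives: (batch, n_neg, dim).
    # The anchor is pulled toward its positive and pushed away from the negatives.
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos = (anchor * positive).sum(dim=-1, keepdim=True)        # (batch, 1)
    neg = torch.einsum("bd,bnd->bn", anchor, negatives)        # (batch, n_neg)
    logits = torch.cat([pos, neg], dim=1) / temperature
    labels = torch.zeros(anchor.size(0), dtype=torch.long)     # positive sits at index 0
    return F.cross_entropy(logits, labels)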

Image-to-image translation aims to learn the mapping between two visual domains. There are two main challenges for many applications: 1) the lack of aligned training pairs and 2) multiple possible outputs from a single input image. In this work, we present an approach based on disentangled representation for producing diverse outputs without paired training images. To achieve diversity, we propose to embed images onto two spaces: a domain-invariant content space capturing shared information across domains and a domain-specific attribute space. Our model takes the encoded content features extracted from a given input and the attribute vectors sampled from the attribute space to produce diverse outputs at test time. To handle unpaired training data, we introduce a novel cross-cycle consistency loss based on disentangled representations. Qualitative results show that our model can generate diverse and realistic images on a wide range of tasks without paired training data. For quantitative comparisons, we measure realism with a user study and diversity with a perceptual distance metric. We apply the proposed model to domain adaptation and show competitive performance when compared to the state-of-the-art on the MNIST-M and the LineMod datasets.
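The cross-cycle idea can be sketched as follows: attributes are swapped between two images, then swapped back, and the originals should be recovered. The module names (Ec, Ea, G) are placeholders for a content encoder, an attribute encoder, and a generator; this is a simplified sketch, not the full training objective.

import torch.nn.functional as F

def cross_cycle_loss(Ec, Ea, G, x_a, x_b):
    # Ec: content encoder, Ea: attribute encoder, G: generator (placeholder modules).
    # Swapping attributes twice should reconstruct the original images.
    c_a, c_b = Ec(x_a), Ec(x_b)            # domain-invariant content codes
    s_a, s_b = Ea(x_a), Ea(x_b)            # domain-specific attribute codes
    u, v = G(c_a, s_b), G(c_b, s_a)        # first swap
    x_a_rec = G(Ec(u), Ea(v))              # second swap reverses the first
    x_b_rec = G(Ec(v), Ea(u))
    return F.l1_loss(x_a_rec, x_a) + F.l1_loss(x_b_rec, x_b)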

This paper introduces an online model for object detection in videos designed to run in real-time on low-powered mobile and embedded devices. Our approach combines fast single-image object detection with convolutional long short-term memory (LSTM) layers to create an interleaved recurrent-convolutional architecture. Additionally, we propose an efficient Bottleneck-LSTM layer that significantly reduces computational cost compared to regular LSTMs. Our network achieves temporal awareness by using Bottleneck-LSTMs to refine and propagate feature maps across frames. This approach is substantially faster than existing detection methods in video, improving on the fastest single-frame models in model size and computational cost while attaining accuracy comparable to much more expensive single-frame models on the ImageNet VID 2015 dataset. Our model reaches a real-time inference speed of up to 15 FPS on a mobile CPU.
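The sketch below shows a simplified bottleneck convolutional LSTM cell: the input and previous hidden state are first projected to a smaller bottleneck channel count, so the gate convolutions operate on far fewer channels than a regular ConvLSTM. The layer shapes and activations here are illustrative, not the exact published configuration.

import torch
import torch.nn as nn

class BottleneckConvLSTMCell(nn.Module):
    # Project [input, hidden] to a bottleneck, then compute the four LSTM gates
    # with convolutions over the reduced channel count.
    def __init__(self, in_ch, hid_ch):
        super().__init__()
        self.bottleneck = nn.Conv2d(in_ch + hid_ch, hid_ch, 3, padding=1)
        self.gates = nn.Conv2d(hid_ch, 4 * hid_ch, 3, padding=1)

    def forward(self, x, h, c):
        b = torch.relu(self.bottleneck(torch.cat([x, h], dim=1)))
        i, f, o, g = self.gates(b).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

cell = BottleneckConvLSTMCell(in_ch=32, hid_ch=16)
h = c = torch.zeros(1, 16, 20, 20)
h, c = cell(torch.randn(1, 32, 20, 20), h, c)
print(h.shape)   # torch.Size([1, 16, 20, 20])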

In this paper, we propose jointly learning attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on the use of either model exist (e.g., for the task of image captioning), training such existing network architectures typically requires pre-defined label sequences. For multi-label classification, it would be desirable to have a robust inference process, so that the prediction error does not propagate and thus affect the performance. Our proposed model uniquely integrates attention and Long Short-Term Memory (LSTM) models, which not only addresses the above problem but also allows one to identify visual objects of interest with varying sizes without prior knowledge of a particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, prediction of multiple labels can be efficiently achieved by our proposed network model.
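The beam-search inference step can be pictured with the generic sketch below, which searches over label sequences produced by a step function; next_log_probs is a placeholder for the recurrent decoder, and the exact scoring and stopping rules of the paper's decoder are not reproduced here.

def beam_search_labels(next_log_probs, stop, beam=3, max_len=5):
    # next_log_probs(seq) -> list of log-probabilities over all label symbols,
    # where index `stop` is an end-of-prediction symbol.
    beams = [([], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for label, lp in enumerate(next_log_probs(seq)):
                if label == stop:
                    finished.append((seq, score + lp))
                elif label not in seq:           # predict each label at most once
                    candidates.append((seq + [label], score + lp))
        if not candidates:
            break
        beams = sorted(candidates, key=lambda t: -t[1])[:beam]
    finished.extend(beams)
    return max(finished, key=lambda t: t[1])[0]  # highest-scoring label set found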
