亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We study the fundamental problem of sampling independent events, called subset sampling. Specifically, consider a set of $n$ events $S=\{x_1, \ldots, x_n\}$, where each event $x_i$ has an associated probability $p(x_i)$. The subset sampling problem aims to sample a subset $T \subseteq S$, such that every $x_i$ is independently included in $S$ with probability $p_i$. A naive solution is to flip a coin for each event, which takes $O(n)$ time. However, the specific goal is to develop data structures that allow drawing a sample in time proportional to the expected output size $\mu=\sum_{i=1}^n p(x_i)$, which can be significantly smaller than $n$ in many applications. The subset sampling problem serves as an important building block in many tasks and has been the subject of various research for more than a decade. However, most of the existing subset sampling approaches are conducted in a static setting, where the events or their associated probability in set $S$ is not allowed to be changed over time. These algorithms incur either large query time or update time in a dynamic setting despite the ubiquitous time-evolving events with changing probability in real life. Therefore, it is a pressing need, but still, an open problem, to design efficient dynamic subset sampling algorithms. In this paper, we propose ODSS, the first optimal dynamic subset sampling algorithm. The expected query time and update time of ODSS are both optimal, matching the lower bounds of the subset sampling problem. We present a nontrivial theoretical analysis to demonstrate the superiority of ODSS. We also conduct comprehensive experiments to empirically evaluate the performance of ODSS. Moreover, we apply ODSS to a concrete application: influence maximization. We empirically show that our ODSS can improve the complexities of existing influence maximization algorithms on large real-world evolving social networks.

相關內容

We present a critical analysis of the simulation framework RICE-N, an integrated assessment model (IAM) for evaluating the impacts of climate change on the economy. We identify key issues with RICE-N, including action masking and irrelevant actions, and suggest improvements such as utilizing tariff revenue and penalizing overproduction. We also critically engage with features of IAMs in general, namely overly optimistic damage functions and unrealistic abatement cost functions. Our findings contribute to the ongoing efforts to further develop the RICE-N framework in an effort to improve the simulation, making it more useful as an inspiration for policymakers.

A lately created metaheuristic algorithm called Child Drawing Development Optimization (CDDO) has proven to be effective in a number of benchmark tests. A Binary Child Drawing Development Optimization (BCDDO) is suggested for choosing the wrapper features in this study. To achieve the best classification accuracy, a subset of crucial features is selected using the suggested BCDDO. The proposed feature selection technique's efficiency and effectiveness are assessed using the Harris Hawk, Grey Wolf, Salp, and Whale optimization algorithms. The suggested approach has significantly outperformed the previously discussed techniques in the area of feature selection to increase classification accuracy. Moderate COVID, breast cancer, and big COVID are the three datasets utilized in this study. The classification accuracy for each of the three datasets was (98.75, 98.83%, and 99.36) accordingly.

Schema matching is a core data integration task, focusing on identifying correspondences among attributes of multiple schemata. Numerous algorithmic approaches were suggested for schema matching over the years, aiming at solving the task with as little human involvement as possible. Yet, humans are still required in the loop -- to validate algorithms and to produce ground truth data for algorithms to be trained against. In recent years, a new research direction investigates the capabilities and behavior of humans while performing matching tasks. Previous works utilized this knowledge to predict, and even improve, the performance of human matchers. In this work, we continue this line of research by suggesting a novel measure to evaluate the performance of human matchers, based on calibration, a common meta-cognition measure. The proposed measure enables detailed analysis of various factors of the behavior of human matchers and their relation to human performance. Such analysis can be further utilized to develop heuristics and methods to better asses and improve the annotation quality.

Given a graph $G$, the number of its vertices is represented by $n(G)$, while the number of its edges is denoted as $m(G)$. An independent set in a graph is a set of vertices where no two vertices are adjacent to each other and the size of the maximum independent set is denoted by $\alpha(G)$. A matching in a graph refers to a set of edges where no two edges share a common vertex and the maximum matching size is denoted by $\mu(G)$. If $\alpha(G) + \mu(G) = n(G)$, then the graph $G$ is called a K\"{o}nig-Egerv\'{a}ry graph. Considering a graph $G$ with a degree sequence $d_1 \leq d_2 \leq \cdots \leq d_n$, the annihilation number $a(G)$ is defined as the largest integer $k$ such that the sum of the first $k$ degrees in the sequence is less than or equal to $m(G)$ (Pepper, 2004). It is a known fact that $\alpha(G)$ is less than or equal to $a(G)$ for any graph $G$. Our goal is to estimate the difference between these two parameters. Specifically, we prove a series of inequalities, including $a(G) - \alpha(G) \leq \frac{\mu(G) - 1}{2}$ for trees, $a(G) - \alpha(G) \leq 2 + \mu(G) - 2\sqrt{1 + \mu(G)}$ for bipartite graphs and $a(G) - \alpha(G) \leq \mu(G) - 2$ for K\"{o}nig-Egerv\'{a}ry graphs. Furthermore, we demonstrate that these inequalities serve as tight upper bounds for the difference between the annihilation and independence numbers, regardless of the assigned value for $\mu(G)$.

We present a new algorithm for automatically bounding the Taylor remainder series. In the special case of a scalar function $f: \mathbb{R} \to \mathbb{R}$, our algorithm takes as input a reference point $x_0$, trust region $[a, b]$, and integer $k \ge 1$, and returns an interval $I$ such that $f(x) - \sum_{i=0}^{k-1} \frac {1} {i!} f^{(i)}(x_0) (x - x_0)^i \in I (x - x_0)^k$ for all $x \in [a, b]$. As in automatic differentiation, the function $f$ is provided to the algorithm in symbolic form, and must be composed of known atomic functions. At a high level, our algorithm has two steps. First, for a variety of commonly-used elementary functions (e.g., $\exp$, $\log$), we use recently-developed theory to derive sharp polynomial upper and lower bounds on the Taylor remainder series. We then recursively combine the bounds for the elementary functions using an interval arithmetic variant of Taylor-mode automatic differentiation. Our algorithm can make efficient use of machine learning hardware accelerators, and we provide an open source implementation in JAX. We then turn our attention to applications. Most notably, in a companion paper we use our new machinery to create the first universal majorization-minimization optimization algorithms: algorithms that iteratively minimize an arbitrary loss using a majorizer that is derived automatically, rather than by hand. We also show that our automatically-derived bounds can be used for verified global optimization and numerical integration, and to prove sharper versions of Jensen's inequality.

Causal Machine Learning (CausalML) is an umbrella term for machine learning methods that formalize the data-generation process as a structural causal model (SCM). This allows one to reason about the effects of changes to this process (i.e., interventions) and what would have happened in hindsight (i.e., counterfactuals). We categorize work in \causalml into five groups according to the problems they tackle: (1) causal supervised learning, (2) causal generative modeling, (3) causal explanations, (4) causal fairness, (5) causal reinforcement learning. For each category, we systematically compare its methods and point out open problems. Further, we review modality-specific applications in computer vision, natural language processing, and graph representation learning. Finally, we provide an overview of causal benchmarks and a critical discussion of the state of this nascent field, including recommendations for future work.

Images can convey rich semantics and induce various emotions in viewers. Recently, with the rapid advancement of emotional intelligence and the explosive growth of visual data, extensive research efforts have been dedicated to affective image content analysis (AICA). In this survey, we will comprehensively review the development of AICA in the recent two decades, especially focusing on the state-of-the-art methods with respect to three main challenges -- the affective gap, perception subjectivity, and label noise and absence. We begin with an introduction to the key emotion representation models that have been widely employed in AICA and description of available datasets for performing evaluation with quantitative comparison of label noise and dataset bias. We then summarize and compare the representative approaches on (1) emotion feature extraction, including both handcrafted and deep features, (2) learning methods on dominant emotion recognition, personalized emotion prediction, emotion distribution learning, and learning from noisy data or few labels, and (3) AICA based applications. Finally, we discuss some challenges and promising research directions in the future, such as image content and context understanding, group emotion clustering, and viewer-image interaction.

Deep Learning (DL) is vulnerable to out-of-distribution and adversarial examples resulting in incorrect outputs. To make DL more robust, several posthoc anomaly detection techniques to detect (and discard) these anomalous samples have been proposed in the recent past. This survey tries to provide a structured and comprehensive overview of the research on anomaly detection for DL based applications. We provide a taxonomy for existing techniques based on their underlying assumptions and adopted approaches. We discuss various techniques in each of the categories and provide the relative strengths and weaknesses of the approaches. Our goal in this survey is to provide an easier yet better understanding of the techniques belonging to different categories in which research has been done on this topic. Finally, we highlight the unsolved research challenges while applying anomaly detection techniques in DL systems and present some high-impact future research directions.

Joint image-text embedding is the bedrock for most Vision-and-Language (V+L) tasks, where multimodality inputs are jointly processed for visual and textual understanding. In this paper, we introduce UNITER, a UNiversal Image-TExt Representation, learned through large-scale pre-training over four image-text datasets (COCO, Visual Genome, Conceptual Captions, and SBU Captions), which can power heterogeneous downstream V+L tasks with joint multimodal embeddings. We design three pre-training tasks: Masked Language Modeling (MLM), Image-Text Matching (ITM), and Masked Region Modeling (MRM, with three variants). Different from concurrent work on multimodal pre-training that apply joint random masking to both modalities, we use conditioned masking on pre-training tasks (i.e., masked language/region modeling is conditioned on full observation of image/text). Comprehensive analysis shows that conditioned masking yields better performance than unconditioned masking. We also conduct a thorough ablation study to find an optimal setting for the combination of pre-training tasks. Extensive experiments show that UNITER achieves new state of the art across six V+L tasks (over nine datasets), including Visual Question Answering, Image-Text Retrieval, Referring Expression Comprehension, Visual Commonsense Reasoning, Visual Entailment, and NLVR2.

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, XLNet outperforms BERT on 20 tasks, often by a large margin, and achieves state-of-the-art results on 18 tasks including question answering, natural language inference, sentiment analysis, and document ranking.

北京阿比特科技有限公司