
We consider the best arm identification (BAI) problem in the stochastic multi-armed bandit framework where each arm has a tiny probability of realizing large rewards while with overwhelming probability the reward is zero. A key application of this framework is in online advertising, where click rates of advertisements could be a fraction of a single percent and final conversion to sales, while highly profitable, may again be a small fraction of the click rates. Lately, algorithms for BAI problems have been developed that minimize sample complexity while providing statistical guarantees on the correct arm selection. As we observe, these algorithms can be computationally prohibitive. We exploit the fact that the reward process for each arm is well approximated by a compound Poisson process to arrive at algorithms that are faster, with a small increase in sample complexity. We analyze the problem in an asymptotic regime as the probability of reward occurrence decreases to zero and reward amounts increase to infinity. This helps illustrate the benefits of the proposed algorithm. It also sheds light on the underlying structure of the optimal BAI algorithms in the rare event setting.
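
To see why a Poisson-type approximation is plausible in the rare-reward regime, the following minimal sketch compares the exact number of nonzero rewards in n pulls, Binomial(n, p), with a Poisson(np) count. It is not the paper's algorithm. The function names and the values of n and p are illustrative assumptions, and the bound used for comparison is Le Cam's inequality.

```python
import math


def binomial_pmf(n, p, k):
    """P(exactly k nonzero rewards in n independent pulls, each with probability p)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_pmf(lam, k):
    """P(Poisson(lam) = k)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def total_variation(n, p, k_max=40):
    """Total variation distance between Binomial(n, p) and Poisson(n * p).

    The sum is truncated at k_max (an assumption of this sketch), so the
    result slightly underestimates the true distance when n * p is large.
    """
    lam = n * p
    gap = sum(
        abs(binomial_pmf(n, p, k) - poisson_pmf(lam, k)) for k in range(k_max + 1)
    )
    return 0.5 * gap


# Hypothetical ad-click setting: 10,000 impressions, 0.05% reward probability.
n, p = 10_000, 0.0005
tv = total_variation(n, p)
le_cam_bound = n * p**2  # Le Cam: the distance is at most n * p^2
print(f"TV distance = {tv:.2e}, Le Cam bound = {le_cam_bound:.2e}")
```

Because the bound n p^2 shrinks as p decreases with np held fixed, the count of reward events looks increasingly Poisson. Attaching a random reward size to each event then gives a compound Poisson total.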

Related content


Integrating sensing functionalities is envisioned as a distinguishing feature of next-generation mobile networks, which has given rise to the development of a novel enabling technology -- \emph{Integrated Sensing and Communication (ISAC)}. Portraying the theoretical performance bounds of ISAC systems is fundamentally important to understand how sensing and communication functionalities interact (e.g., competitively or cooperatively) in terms of resource utilization, while revealing insights and guidelines for the development of effective physical-layer techniques. In this paper, we characterize the fundamental performance tradeoff between the detection probability for target monitoring and the user's achievable rate in ISAC systems. To this end, we first discuss the achievable rate of the user under sensing-free and sensing-interfered communication scenarios. Furthermore, we derive closed-form expressions for the probability of false alarm (PFA) and the successful probability of detection (PD) for monitoring the target of interest, where we consider both communication-assisted and communication-interfered sensing scenarios. In addition, the effects of the unknown channel coefficient are also taken into account in our theoretical analysis. Based on our analytical results, we then carry out a comprehensive assessment of the performance tradeoff between sensing and communication functionalities. Specifically, we formulate a power allocation problem to minimize the transmit power at the base station (BS) under the constraints of ensuring a required PD for perception as well as the communication user's quality of service requirement in terms of achievable rate. Finally, simulation results corroborate the accuracy of our theoretical analysis and the effectiveness of the proposed power allocation solutions.

Electrocardiogram (ECG) signals are recordings of the heart's electrical activity and are widely used in the medical field to diagnose various cardiac conditions and monitor heart function. The accurate classification of ECG signals is crucial for the early detection and treatment of heart-related diseases. This paper proposes a novel approach based on an improved differential evolution (DE) algorithm for ECG signal classification. To this end, after the preprocessing step, we extracted several features such as beats per minute (BPM), inter-beat interval (IBI), and the standard deviation of NN intervals (SDNN). Then, the features are fed into a multi-layer perceptron (MLP). While MLPs are still widely used for ECG signal classification, gradient-based training methods, the most widely used approach for the training process, have significant disadvantages, such as the possibility of getting stuck in local optima. Population-based metaheuristic techniques have been effectively used to address this. This paper employs an enhanced DE algorithm, one of the most effective population-based algorithms, for the training process. To this end, we improved DE based on a clustering-based strategy, opposition-based learning, and a local search. Clustering-based strategies can act as crossover operators, while the goal of the opposition operator is to improve the exploration of the DE algorithm. The weights and biases found by the improved DE algorithm are then fed into six gradient-based local search algorithms. In other words, the weights found by the DE are employed as an initialization point. Therefore, we introduced six different algorithms for the training process (in terms of different local search algorithms). In an extensive set of experiments, we showed that our proposed training algorithm could provide better results than the conventional training algorithms.
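
The features named in the abstract above can be computed from detected R-peak times. The sketch below assumes the R-peaks have already been located by a preprocessing step that is not shown. The function name `hrv_features` and the sample timestamps are hypothetical, and using the sample standard deviation for SDNN is one common convention, not necessarily the authors' choice.

```python
import statistics


def hrv_features(r_peak_times_s):
    """Compute simple heart-rate features from R-peak timestamps in seconds.

    Returns the mean inter-beat interval (IBI), the heart rate in beats per
    minute (BPM), and SDNN, the standard deviation of the intervals, in ms.
    """
    if len(r_peak_times_s) < 3:
        raise ValueError("need at least three R-peaks to estimate SDNN")
    # Inter-beat intervals are the gaps between consecutive R-peaks.
    ibis = [later - earlier for earlier, later in zip(r_peak_times_s, r_peak_times_s[1:])]
    mean_ibi = statistics.fmean(ibis)
    return {
        "ibi_mean_s": mean_ibi,
        "bpm": 60.0 / mean_ibi,
        "sdnn_ms": statistics.stdev(ibis) * 1000.0,
    }


# Hypothetical R-peak times for a short recording.
features = hrv_features([0.00, 0.80, 1.62, 2.41, 3.22])
print(features)
```

A feature vector like this would then be the input to the MLP described above.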

Travel time derivatives are introduced as financial derivatives based on road travel times, a non-tradable underlying asset. In the transportation area, they are proposed as a more fundamental approach to value pricing because they base road pricing on not only the level but also the volatility of travel time; in the financial market, they are proposed as an innovative hedging instrument against market risk, especially after the recent stress in the crypto market and the traditional banking sector. The paper addresses (a) the motivation for introducing such derivatives (that is, the demand for hedging), (b) the potential market, and (c) the product design and pricing schemes. Pricing schemes are designed based on the travel time data captured by real-time sensors, which are modeled as Ornstein-Uhlenbeck processes and, more generally, continuous-time autoregressive moving average (CARMA) models. The calibration of such models is conducted via a hidden factor model, which describes the dynamics of travel time processes. The risk-neutral pricing principle is used to generate the derivative price, with reasonably designed procedures to identify the market value of risk.
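
As an illustration of the Ornstein-Uhlenbeck model mentioned above, the sketch below simulates dX = theta (mu - X) dt + sigma dW using the exact Gaussian transition over a step of length dt. The parameter values (a 20-minute long-run travel time, and so on) are hypothetical and are not calibrated to any data from the paper.

```python
import math
import random


def simulate_ou(x0, theta, mu, sigma, dt, n_steps, seed=0):
    """Simulate an Ornstein-Uhlenbeck path with the exact discretization.

    x0: initial value, theta: mean-reversion speed (> 0), mu: long-run mean,
    sigma: volatility, dt: step length, n_steps: number of steps.
    """
    rng = random.Random(seed)
    decay = math.exp(-theta * dt)
    # Standard deviation of the Gaussian shock over one step of length dt.
    noise_sd = sigma * math.sqrt((1.0 - decay**2) / (2.0 * theta))
    path = [x0]
    for _ in range(n_steps):
        shock = noise_sd * rng.gauss(0.0, 1.0)
        path.append(mu + (path[-1] - mu) * decay + shock)
    return path


# Hypothetical travel time (minutes) starting above its long-run mean.
path = simulate_ou(x0=30.0, theta=0.5, mu=20.0, sigma=2.0, dt=0.1, n_steps=200)
print(f"start = {path[0]:.1f}, end = {path[-1]:.1f}")
```

The mean-reversion term pulls the simulated travel time back toward mu, while sigma controls the volatility that the abstract says such a derivative would price.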

Most of the existing federated multi-armed bandits (FMAB) designs are based on the presumption that clients will implement the specified design to collaborate with the server. In reality, however, it may not be possible to modify the client's existing protocols. To address this challenge, this work focuses on clients who always maximize their individual cumulative rewards, and introduces a novel idea of "reward teaching", where the server guides the clients towards global optimality through implicit local reward adjustments. Under this framework, the server faces two tightly coupled tasks of bandit learning and target teaching, whose combination is non-trivial and challenging. A phased approach, called Teaching-After-Learning (TAL), is first designed to encourage and discourage clients' explorations separately. General performance analyses of TAL are established when the clients' strategies satisfy certain mild requirements. With novel technical approaches developed to analyze the warm-start behaviors of bandit algorithms, particularized guarantees of TAL with clients running UCB or epsilon-greedy strategies are then obtained. These results demonstrate that TAL achieves logarithmic regrets while only incurring logarithmic adjustment costs, which is order-optimal w.r.t. a natural lower bound. As a further extension, the Teaching-While-Learning (TWL) algorithm is developed with the idea of successive arm elimination to break the non-adaptive phase separation in TAL. Rigorous analyses demonstrate that when facing clients with UCB1, TWL outperforms TAL in terms of the dependencies on sub-optimality gaps thanks to its adaptive design. Experimental results demonstrate the effectiveness and generality of the proposed algorithms.

The analysis of large-scale time-series network data, such as social media and email communications, remains a significant challenge for graph analysis methodology. In particular, the scalability of graph analysis is a critical issue hindering further progress in large-scale downstream inference. In this paper, we introduce a novel approach called "temporal encoder embedding" that can efficiently embed large amounts of graph data with linear complexity. We apply this method to an anonymized time-series communication network from a large organization spanning 2019-2020, consisting of over 100 thousand vertices and 80 million edges. Our method embeds the data within 10 seconds on a standard computer and enables the detection of communication pattern shifts for individual vertices, vertex communities, and the overall graph structure. Through supporting theory and synthesis studies, we demonstrate the theoretical soundness of our approach under random graph models and its numerical effectiveness through simulation studies.

A framework consists of an undirected graph $G$ and a matroid $M$ whose elements correspond to the vertices of $G$. Recently, Fomin et al. [SODA 2023] and Eiben et al. [arXiv 2023] developed parameterized algorithms for computing paths of rank $k$ in frameworks. More precisely, for vertices $s$ and $t$ of $G$, and an integer $k$, they gave FPT algorithms parameterized by $k$ deciding whether there is an $(s,t)$-path in $G$ whose vertex set contains a subset of elements of $M$ of rank $k$. These algorithms are based on the Schwartz-Zippel lemma for polynomial identity testing and thus are randomized; the existence of a deterministic FPT algorithm for this problem therefore remains open. We present the first deterministic FPT algorithm that solves the problem in frameworks whose underlying graph $G$ is planar. While the running time of our algorithm is worse than the running times of the recent randomized algorithms, our algorithm works on more general classes of matroids. In particular, this is the first FPT algorithm for the case when matroid $M$ is represented over the rationals. Our main technical contribution is the nontrivial adaptation of the classic irrelevant vertex technique to frameworks to reduce the given instance to one of bounded treewidth. This allows us to employ the toolbox of representative sets to design a dynamic programming procedure solving the problem efficiently on instances of bounded treewidth.

1. Species identification errors may have severe implications for the inference of species distributions. Accounting for misclassification in species distributions is an important topic of biodiversity research. With an increasing amount of biodiversity data coming from Citizen Science projects, where identification is not verified by preserved specimens, this issue is becoming more important. This has often been dealt with by accounting for false positives in species distribution models. However, the problem should account for misclassifications in general. 2. Here we present a flexible framework that accounts for misclassification in the distribution models and provides estimates of the uncertainty around the resulting estimates. The model was applied to data on viceroy, queen and monarch butterflies in the United States. The data were obtained from the iNaturalist database for the period 2019 to 2020. 3. Simulations and analysis of butterfly data showed that the proposed model was able to correct the reported abundance distribution for misclassification and also predict the true state of misclassified observations.

Deep learning-based approaches have produced models with good insect classification accuracy; most of these models are conducive for application in controlled environmental conditions. One of the primary emphases of researchers is to implement identification and classification models in real agricultural fields, which is challenging because input images that are wildly out of the distribution (e.g., images of vehicles, animals, or humans, a blurred image of an insect, or an insect class that the model has not yet been trained on) can produce an incorrect insect classification. Out-of-distribution (OOD) detection algorithms provide an exciting avenue to overcome these challenges, as they ensure that a model abstains from making incorrect classification predictions on non-insect and/or untrained insect class images. We generate and evaluate the performance of state-of-the-art OOD algorithms on insect detection classifiers. These algorithms represent a diversity of methods for addressing an OOD problem. Specifically, we focus on extrusive algorithms, i.e., algorithms that wrap around a well-trained classifier without the need for additional co-training. We compared three OOD detection algorithms: (i) Maximum Softmax Probability, which uses the softmax value as a confidence score; (ii) a Mahalanobis distance-based algorithm, which uses a generative classification approach; and (iii) an Energy-Based algorithm that maps the input data to a scalar value, called energy. We performed an extensive series of evaluations of these OOD algorithms across three performance axes: (a) \textit{Base model accuracy}: How does the accuracy of the classifier impact OOD performance? (b) \textit{Level of dissimilarity to the domain}: How does it impact OOD performance? and (c) \textit{Data imbalance}: How sensitive is OOD performance to the imbalance in per-class sample size?
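
Two of the three scores compared above can be computed directly from a classifier's logits. The sketch below uses the commonly cited formulations: the maximum softmax probability, and the energy E = -T log sum_i exp(z_i / T). The logit vectors are made up for illustration, and the abstract does not state the exact configuration the authors used, so treat the temperature default as an assumption.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_softmax_probability(logits):
    """Confidence score: higher values suggest an in-distribution input."""
    return max(softmax(logits))


def energy_score(logits, temperature=1.0):
    """Energy E = -T * log(sum(exp(z / T))): lower values suggest in-distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return -temperature * log_sum_exp


# Hypothetical logits: a confident insect prediction versus a flat response.
confident, flat = [10.0, 0.0, 0.0], [0.0, 0.0, 0.0]
for name, logits in (("confident", confident), ("flat", flat)):
    msp, energy = max_softmax_probability(logits), energy_score(logits)
    print(f"{name}: MSP = {msp:.3f}, energy = {energy:.3f}")
```

In a deployment, an input would be rejected as OOD when its score falls on the wrong side of a threshold chosen on held-out data. Both scores wrap an already trained classifier, which matches the "extrusive" setting described above.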

Control design for robotic systems is complex and often requires solving an optimization to follow a trajectory accurately. Online optimization approaches like Model Predictive Control (MPC) have been shown to achieve great tracking performance, but require high computing power. Conversely, learning-based offline optimization approaches, such as Reinforcement Learning (RL), allow fast and efficient execution on the robot but hardly match the accuracy of MPC in trajectory tracking tasks. In systems with limited compute, such as aerial vehicles, an accurate controller that is efficient at execution time is imperative. We propose an Analytic Policy Gradient (APG) method to tackle this problem. APG exploits the availability of differentiable simulators by training a controller offline with gradient descent on the tracking error. We address training instabilities that frequently occur with APG through curriculum learning and experiment on a widely used controls benchmark, the CartPole, and two common aerial robots, a quadrotor and a fixed-wing drone. Our proposed method outperforms both model-based and model-free RL methods in terms of tracking error. Concurrently, it achieves similar performance to MPC while requiring more than an order of magnitude less computation time. Our work provides insights into the potential of APG as a promising control method for robotics. To facilitate the exploration of APG, we open-source our code and make it available at //github.com/lis-epfl/apg_trajectory_tracking.

Games and simulators can be a valuable platform to execute complex multi-agent, multiplayer, imperfect information scenarios with significant parallels to military applications: multiple participants manage resources and make decisions that command assets to secure specific areas of a map or neutralize opposing forces. These characteristics have attracted the artificial intelligence (AI) community by supporting development of algorithms with complex benchmarks and the capability to rapidly iterate over new ideas. The success of AI algorithms in real-time strategy games such as StarCraft II has also attracted the attention of the military research community, which aims to explore similar techniques in military counterpart scenarios. Aiming to bridge games and military applications, this work discusses past and current efforts on how games and simulators, together with AI algorithms, have been adapted to simulate certain aspects of military missions and how they might impact the future battlefield. This paper also investigates how advances in virtual reality and visual augmentation systems open new possibilities in human interfaces with gaming platforms and their military parallels.
