Conducting experiments with objectives that take significant delays to materialize (e.g. conversions, add-to-cart events, etc.) is challenging. Although the classical "split sample testing" is still valid for the delayed feedback, the experiment will take longer to complete, which also means spending more resources on worse-performing strategies due to their fixed allocation schedules. Alternatively, adaptive approaches such as "multi-armed bandits" are able to effectively reduce the cost of experimentation. But these methods generally cannot handle delayed objectives directly out of the box. This paper presents an adaptive experimentation solution tailored for delayed binary feedback objectives by estimating the real underlying objectives before they materialize and dynamically allocating variants based on the estimates. Experiments show that the proposed method is more efficient for delayed feedback compared to various other approaches and is robust in different settings. In addition, we describe an experimentation product powered by this algorithm. This product is currently deployed in the online experimentation platform of JD.com, a large e-commerce company and a publisher of digital ads.
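As a point of reference for the delayed-feedback setting described above, the following is a minimal sketch of a Beta-Bernoulli Thompson sampler that only updates on feedback that has already materialized. It is not the estimator proposed in the paper: impressions that are still pending are simply ignored, which is the naive baseline the abstract argues against. The class name, method names, and the idea of an explicit expiry event (a conversion window closing without a conversion) are assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class ArmState:
    conversions: int = 0       # impressions whose conversion has been observed
    matured_failures: int = 0  # impressions whose window closed with no conversion
    pending: int = 0           # impressions still waiting for feedback


class DelayedThompsonSampler:
    """Thompson sampling that ignores impressions whose outcome is still pending."""

    def __init__(self, n_arms, seed=None):
        self.arms = [ArmState() for _ in range(n_arms)]
        self.rng = random.Random(seed)

    def posterior_params(self, arm):
        # Beta(1, 1) prior updated with materialized outcomes only.
        state = self.arms[arm]
        return (1 + state.conversions, 1 + state.matured_failures)

    def choose(self):
        # Draw one sample per arm from its Beta posterior and pick the largest.
        samples = [
            self.rng.betavariate(*self.posterior_params(i))
            for i in range(len(self.arms))
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def record_impression(self, arm):
        self.arms[arm].pending += 1

    def record_conversion(self, arm):
        # A pending impression converted (possibly long after it was served).
        self.arms[arm].pending -= 1
        self.arms[arm].conversions += 1

    def record_expiry(self, arm):
        # A pending impression aged out of the (hypothetical) conversion window.
        self.arms[arm].pending -= 1
        self.arms[arm].matured_failures += 1
```

Because pending impressions carry no weight here, the sampler reacts slowly when delays are long; estimating the eventual conversion rate before outcomes materialize, as the abstract describes, is meant to close that gap.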

Related content

Question answering · Knowledge · INFORMS · Datasets · Benchmarks

April 19, 2022

Expert Finding in Legal Community Question Answering

Arian Askari, Suzan Verberne, Gabriella Pasi

from arxiv, Accepted at Proceedings of the 44th European Conference on Information Retrieval, ECIR 2022. Please cite the published version

Expert finding has been well-studied in community question answering (QA) systems in various domains. However, none of these studies addresses expert finding in the legal domain, where the goal is for citizens to find lawyers based on their expertise. In the legal domain, there is a large knowledge gap between the experts and the searchers, and the content on the legal QA websites consists of a combination of formal and informal communication. In this paper, we propose methods for generating query-dependent textual profiles for lawyers covering several aspects including sentiment, comments, and recency. We combine query-dependent profiles with existing expert finding methods. Our experiments are conducted on a novel dataset gathered from an online legal QA service. We discovered that taking into account different lawyer profile aspects improves the best baseline model. We make our dataset publicly available for future work.

Estimation/Estimators · Optimizers · Markov chains · Output · INTERACT

April 19, 2022

Adaptive measurement filter: efficient strategy for optimal estimation of quantum Markov chains

Alfred Godley, Madalin Guta

from arxiv, 23 pages 6 figures

Continuous-time measurements are instrumental for a multitude of tasks in quantum engineering and quantum control, including the estimation of dynamical parameters of open quantum systems monitored through the environment. However, such measurements do not extract the maximum amount of information available in the output state, so finding alternative optimal measurement strategies is a major open problem. In this paper we solve this problem in the setting of discrete-time input-output quantum Markov chains. We present an efficient algorithm for optimal estimation of one-dimensional dynamical parameters which consists of an iterative procedure for updating a 'measurement filter' operator and determining successive measurement bases for the output units. A key ingredient of the scheme is the use of a coherent quantum absorber as a way to post-process the output after the interaction with the system. This is designed adaptively such that the joint system and absorber stationary state is pure at a reference parameter value. The scheme offers an exciting prospect for optimal continuous-time adaptive measurements, but more work is needed to find realistic practical implementations.

Knowledge · Mutually independent · Approximation · Numerical analysis

April 18, 2022

Utilizing Time-Reversibility for Shock Capturing in Nonlinear Hyperbolic Conservation Laws

Tarik Dzanic, Will Trojak, Freddie D. Witherden

from arxiv, 20 pages, 14 figures

In this work, we introduce a novel approach to formulating an artificial viscosity for shock capturing in nonlinear hyperbolic systems by utilizing the property that the solutions of hyperbolic conservation laws are not reversible in time in the vicinity of shocks. The proposed approach does not require any additional governing equations or a priori knowledge of the hyperbolic system in question, is independent of the mesh and approximation order, and requires the use of only one tunable parameter. The primary novelty is that the resulting artificial viscosity is unique for each component of the conservation law which is advantageous for systems in which some components exhibit discontinuities while others do not. The efficacy of the method is shown in numerical experiments of multi-dimensional hyperbolic conservation laws such as nonlinear transport, Euler equations, and ideal magnetohydrodynamics using a high-order discontinuous spectral element method on unstructured grids.

Bandits/Slot machines · PDE · Optimizers · Bayes risk · Normalized

April 18, 2022

Risk and optimal policies in bandit experiments

Karun Adusumilli

We provide a decision theoretic analysis of bandit experiments. The setting corresponds to a dynamic programming problem, but solving this directly is typically infeasible. Working within the framework of diffusion asymptotics, we define suitable notions of asymptotic Bayes and minimax risk for bandit experiments. For normally distributed rewards, the minimal Bayes risk can be characterized as the solution to a nonlinear second-order partial differential equation (PDE). Using a limit of experiments approach, we show that this PDE characterization also holds asymptotically under both parametric and non-parametric distribution of the rewards. The approach further describes the state variables it is asymptotically sufficient to restrict attention to, and therefore suggests a practical strategy for dimension reduction. The upshot is that we can approximate the dynamic programming problem defining the bandit experiment with a PDE which can be efficiently solved using sparse matrix routines. We derive the optimal Bayes and minimax policies from the numerical solutions to these equations. The proposed policies substantially dominate existing methods such as Thompson sampling. The framework also allows for substantial generalizations to the bandit problem such as time discounting and pure exploration motives.

Performer · INFORMS · Learned · Marginalization · Trials

April 17, 2022

An Adaptive Task-Related Component Analysis Method for SSVEP recognition

Vangelis P. Oikonomou

from arxiv, 23 pages, 3 Figures, 6 Tables

Steady-state visual evoked potential (SSVEP) recognition methods learn from the subject's calibration data and can achieve very high performance in SSVEP-based brain-computer interfaces (BCIs); however, their performance deteriorates drastically if the calibration trials are insufficient. This study develops a new method to learn from limited calibration data, and it proposes and evaluates a novel adaptive data-driven spatial filtering approach for enhancing SSVEP detection. The spatial filter learned from each stimulus utilizes temporal information from the corresponding EEG trials. To introduce the temporal information into the overall procedure, a multitask learning approach based on the Bayesian framework is adopted. The performance of the proposed method was evaluated on two publicly available benchmark datasets, and the results demonstrated that our method outperforms competing methods by a significant margin.

Controllers · Robustness · Classes · Lipschitz · Estimation/Estimators

April 17, 2022

Robust Stability of Neural-Network Controlled Nonlinear Systems with Parametric Variability

Soumyabrata Talukder, Ratnesh Kumar

from arxiv, 15 pages, 7 figures

Stability certification and identification of the stabilizable operating region of a dynamical system are two important concerns to ensure its operational safety/security and robustness. With the advent of machine-learning tools, these issues are especially important for systems with machine-learned components in the feedback loop. Here, in the presence of unknown discrete variation (DV) of its parameters within a bounded range, a system controlled by a static feedback controller in which the closed-loop (CL) equilibria are subject to variation-induced drift is equivalently represented using a class of time-invariant systems, each with the same control policy. To develop a general theory for stability and stabilizability of such a class of neural-network (NN) controlled nonlinear systems, a Lyapunov-based convex stability certificate is proposed and is further used to devise an estimate of a local Lipschitz upper bound for the NN and a corresponding operating domain in the state space containing an initialization set, starting from where the CL local asymptotic stability of each system in the class is guaranteed, while the trajectory of the original system remains confined to the domain if the DV of the parameters satisfies a certain quasi-stationarity condition. To compute such a robustly stabilizing NN controller, a stability-guaranteed training (SGT) algorithm is also proposed. The effectiveness of the proposed framework is demonstrated using illustrative examples.

Estimation/Estimators · Binary · Identifiable · Performance · Variance

April 16, 2022

Model-assisted complier average treatment effect estimates in randomized experiments with non-compliance and a binary outcome

Jiyang Ren

In randomized experiments, the actual treatments received by some experimental units may differ from their treatment assignments. This non-compliance issue often occurs in clinical trials, social experiments, and the applications of randomized experiments in many other fields. Under certain assumptions, the average treatment effect for the compliers is identifiable and equal to the ratio of the intention-to-treat effects of the potential outcomes to that of the potential treatment received. To improve the estimation efficiency, we propose three model-assisted estimators for the complier average treatment effect in randomized experiments with a binary outcome. We study their asymptotic properties, compare their efficiencies with that of the Wald estimator, and propose the Neyman-type conservative variance estimators to facilitate valid inferences. Moreover, we extend our methods and theory to estimate the multiplicative complier average treatment effect. Our analysis is randomization-based, allowing the working models to be misspecified. Finally, we conduct simulation studies to illustrate the advantages of the model-assisted methods and apply these analysis methods in a randomized experiment to evaluate the effect of academic services or incentives on academic performance.
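The ratio described above, the intention-to-treat effect on the outcome divided by the intention-to-treat effect on the treatment received, is the Wald estimator that the abstract uses as its baseline. A minimal sketch follows; the function name `wald_cace` and the list-based inputs are assumptions for illustration, and the model-assisted estimators proposed in the paper are not reproduced here.

```python
def _mean(values):
    return sum(values) / len(values)


def wald_cace(z, d, y):
    """Wald estimate of the complier average treatment effect.

    z: treatment assignment per unit (0 or 1)
    d: treatment actually received per unit (0 or 1)
    y: binary outcome per unit (0 or 1)
    """
    if not (len(z) == len(d) == len(y)):
        raise ValueError("z, d and y must have the same length")

    y_assigned = [yi for zi, yi in zip(z, y) if zi == 1]
    y_control = [yi for zi, yi in zip(z, y) if zi == 0]
    d_assigned = [di for zi, di in zip(z, d) if zi == 1]
    d_control = [di for zi, di in zip(z, d) if zi == 0]
    if not y_assigned or not y_control:
        raise ValueError("both assignment groups must be non-empty")

    # Intention-to-treat effects on the outcome and on the treatment received.
    itt_y = _mean(y_assigned) - _mean(y_control)
    itt_d = _mean(d_assigned) - _mean(d_control)
    if itt_d == 0:
        raise ValueError("assignment has no effect on treatment received")
    return itt_y / itt_d
```

For example, with `z = [1, 1, 1, 1, 0, 0, 0, 0]`, `d = [1, 1, 0, 0, 0, 0, 0, 0]` and `y = [1, 1, 1, 0, 1, 0, 0, 0]`, both intention-to-treat effects equal 0.5, so the estimate is 1.0.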

Estimation/Estimators · Performer · Channels · CC · Reducible

April 16, 2022

MMV-Based Sequential AoA and AoD Estimation for Millimeter Wave MIMO Channels

Wei Zhang, Miaomiao Dong, Taejoon Kim

from arxiv, Accepted by IEEE Transactions on Communications

The fact that the millimeter-wave (mmWave) multiple-input multiple-output (MIMO) channel has sparse support in the spatial domain has motivated recent compressed sensing (CS)-based mmWave channel estimation methods, where the angles of arrival (AoAs) and angles of departure (AoDs) are quantized using angle dictionary matrices. However, the existing CS-based methods usually obtain the estimation result through one-stage channel sounding, which has two limitations: (i) the requirement of a large-dimensional dictionary and (ii) unresolvable quantization error. These two drawbacks are irreconcilable; improvement of the one implies deterioration of the other. To address these challenges, we propose, in this paper, a two-stage method to estimate the AoAs and AoDs of mmWave channels. In the proposed method, the channel estimation task is divided into two stages, Stage I and Stage II. Specifically, in Stage I, the AoAs are estimated by solving a multiple measurement vectors (MMV) problem. In Stage II, based on the estimated AoAs, the receive sounders are designed to estimate AoDs. The dimension of the angle dictionary in each stage can be reduced, which in turn reduces the computational complexity substantially. We then analyze the successful recovery probability (SRP) of the proposed method, revealing the superiority of the proposed framework over the existing one-stage CS-based methods. We further enhance the reconstruction performance by performing resource allocation between the two stages. We also overcome the unresolvable quantization error issue present in the prior techniques by applying the atomic norm minimization method to each stage of the proposed two-stage approach. The simulation results illustrate the substantially improved performance with low complexity of the proposed two-stage method.

Learned · Controllers · Networking · Biological plausibility · Weight

April 14, 2022

Minimizing Control for Credit Assignment with Strong Feedback

Alexander Meulemans, Matilde Tristany Farinha, Maria R. Cervera, João Sacramento, Benjamin F. Grewe

from arxiv, 25 pages, 3 figures

The success of deep learning attracted interest in whether the brain learns hierarchical representations using gradient-based learning. However, current biologically plausible methods for gradient-based credit assignment in deep neural networks need infinitesimally small feedback signals, which is problematic in biologically realistic noisy environments and at odds with experimental evidence in neuroscience showing that top-down feedback can significantly influence neural activity. Building upon deep feedback control (DFC), a recently proposed credit assignment method, we combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view on neural network optimization. Instead of gradually changing the network weights towards configurations with low output loss, weight updates gradually minimize the amount of feedback required from a controller that drives the network to the supervised output label. Moreover, we show that the use of strong feedback in DFC allows learning forward and feedback connections simultaneously, using a learning rule fully local in space and time. We complement our theoretical results with experiments on standard computer-vision benchmarks, showing competitive performance to backpropagation as well as robustness to noise. Overall, our work presents a fundamentally novel view of learning as control minimization, while sidestepping biologically unrealistic assumptions.

Negative examples · INFORMS · Graphs · MoDELS · Samples

March 12, 2020

Reinforced Negative Sampling over Knowledge Graph for Recommendation

Xiang Wang, Yaokun Xu, Xiangnan He, Yixin Cao, Meng Wang, Tat-Seng Chua

from arxiv, WWW 2020 oral presentation

Properly handling missing data is a fundamental challenge in recommendation. Most present works perform negative sampling from unobserved data to supply the training of recommender models with negative signals. Nevertheless, existing negative sampling strategies, either static or adaptive ones, are insufficient to yield high-quality negative samples, that is, samples both informative to model training and reflective of users' real needs. In this work, we hypothesize that an item knowledge graph (KG), which provides rich relations among items and KG entities, could be useful to infer informative and factual negative samples. Towards this end, we develop a new negative sampling model, Knowledge Graph Policy Network (KGPolicy), which works as a reinforcement learning agent to explore high-quality negatives. Specifically, by conducting our designed exploration operations, it navigates from the target positive interaction, adaptively receives knowledge-aware negative signals, and ultimately yields a potential negative item to train the recommender. We tested on a matrix factorization (MF) model equipped with KGPolicy, and it achieves significant improvements over both state-of-the-art sampling methods like DNS and IRGAN, and KG-enhanced recommender models like KGAT. Further analyses from different angles provide insights into knowledge-aware sampling. We release the codes and datasets at //github.com/xiangwang1223/kgpolicy.
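To make the starting point concrete, the static strategy that the abstract contrasts itself with can be sketched as uniform sampling from the items a user has not interacted with. This is only that baseline under assumed inputs (a dict of user to interacted item set, and a flat item list); it is not KGPolicy, and the function name `sample_negatives` is hypothetical.

```python
import random


def sample_negatives(user_items, all_items, user, k, rng):
    """Draw k distinct items uniformly from those the user has not interacted with.

    user_items: dict mapping each user to the set of items they interacted with
    all_items:  iterable of every item identifier in the catalogue
    rng:        a random.Random instance, passed in for reproducibility
    """
    # Sort the candidates so that results depend only on the rng seed,
    # not on set iteration order.
    candidates = sorted(set(all_items) - user_items.get(user, set()))
    if k > len(candidates):
        raise ValueError("not enough unobserved items to sample from")
    return rng.sample(candidates, k)
```

Every unobserved item is equally likely here, regardless of whether the user would actually dislike it, which is the weakness the abstract's knowledge-aware sampler is designed to address.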