Online auction scenarios, such as bidding searches on advertising platforms, often require bidders to participate repeatedly in auctions for the same or similar items. We design an algorithm for adaptive automatic bidding in repeated auctions in which the seller and other bidders also update their strategies. We apply and improve the opponent modeling algorithm to allow bidders to learn optimal bidding strategies in this multiagent reinforcement learning environment. The algorithm uses almost no private information about the opponent or restrictions on the strategy space, so it can be extended to multiple scenarios. Our algorithm improves the utility compared to both static bidding strategies and dynamic learning strategies. We hope the application of opponent modeling in auctions will promote the research of automatic bidding strategies in online auctions and the design of non-incentive compatible auction mechanisms.
Sequential decoding, commonly applied to substitution channels, is a sub-optimal alternative to Viterbi decoding with significantly reduced memory costs. In this work, a sequential decoder for convolutional codes over channels that are prone to insertion, deletion, and substitution errors, is described and analyzed. Our decoder expands the code trellis by a new channel-state variable, called drift state, as proposed by Davey and MacKay. A suitable decoding metric on that trellis for sequential decoding is derived, generalizing the original Fano metric. The decoder is also extended to facilitate the simultaneous decoding of multiple received sequences that arise from a single transmitted sequence. Under low-noise environments, our decoding approach reduces the decoding complexity by a couple orders of magnitude in comparison to Viterbi's algorithm, albeit at slightly higher bit error rates. An analytical method to determine the computational cutoff rate is also suggested. This analysis is supported with numerical evaluations of bit error rates and computational complexity, which are compared with respect to optimal Viterbi decoding.
Stochastic volatility models, where the volatility is a stochastic process, can capture most of the essential stylized facts of implied volatility surfaces and give more realistic dynamics of the volatility smile/skew. However, they come with the significant issue that they take too long to calibrate. Alternative calibration methods based on Deep Learning (DL) techniques have been recently used to build fast and accurate solutions to the calibration problem. Huge and Savine developed a Differential Machine Learning (DML) approach, where Machine Learning models are trained on samples of not only features and labels but also differentials of labels to features. The present work aims to apply the DML technique to price vanilla European options (i.e. the calibration instruments), more specifically, puts when the underlying asset follows a Heston model and then calibrate the model on the trained network. DML allows for fast training and accurate pricing. The trained neural network dramatically reduces Heston calibration's computation time. In this work, we also introduce different regularisation techniques, and we apply them notably in the case of the DML. We compare their performance in reducing overfitting and improving the generalisation error. The DML performance is also compared to the classical DL (without differentiation) one in the case of Feed-Forward Neural Networks. We show that the DML outperforms the DL. The complete code for our experiments is provided in the GitHub repository: //github.com/asridi/DML-Calibration-Heston-Model
Classification systems are normally trained by minimizing the cross-entropy between system outputs and reference labels, which makes the Kullback-Leibler divergence a natural choice for measuring how closely the system can follow the data. Precision and recall provide another perspective for measuring the performance of a classification system. Non-binary references can arise from various sources, and it is often beneficial to use the soft labels for training instead of the binarized data. However, the existing definitions for precision and recall require binary reference labels, and binarizing the data can cause erroneous interpretations. We present a novel method to calculate precision, recall and F-score without quantizing the data. The proposed metrics extend the well established metrics as the definitions coincide when used with binary labels. To understand the behavior of the metrics we show simple example cases and an evaluation of different sound event detection models trained on real data with soft labels.
Causal inference is often complicated by post-treatment variables, which appear in many scientific problems, including noncompliance, truncation by death, mediation, and surrogate endpoint evaluation. Principal stratification is a strategy that adjusts for the potential values of the post-treatment variables, defined as the principal strata. It allows for characterizing treatment effect heterogeneity across principal strata and unveiling the mechanism of the treatment on the outcome related to post-treatment variables. However, the existing literature has primarily focused on binary post-treatment variables, leaving the case with continuous post-treatment variables largely unexplored, due to the complexity of infinitely many principal strata that challenge both the identification and estimation of causal effects. We fill this gap by providing nonparametric identification and semiparametric estimation theory for principal stratification with continuous post-treatment variables. We propose to use working models to approximate the underlying causal effect surfaces and derive the efficient influence functions of the corresponding model parameters. Based on the theory, we construct doubly robust estimators and implement them in an R package.
The study of complex networks has significantly advanced our understanding of community structures which serves as a crucial feature of real-world graphs. Detecting communities in graphs is a challenging problem with applications in sociology, biology, and computer science. Despite the efforts of an interdisciplinary community of scientists, a satisfactory solution to this problem has not yet been achieved. This review article delves into the topic of community detection in graphs, which serves as a crucial role in understanding the organization and functioning of complex systems. We begin by introducing the concept of community structure, which refers to the arrangement of vertices into clusters, with strong internal connections and weaker connections between clusters. Then, we provide a thorough exposition of various community detection methods, including a new method designed by us. Additionally, we explore real-world applications of community detection in diverse networks. In conclusion, this comprehensive review provides a deep understanding of community detection in graphs. It serves as a valuable resource for researchers and practitioners in multiple disciplines, offering insights into the challenges, methodologies, and applications of community detection in complex networks.
Understanding causality helps to structure interventions to achieve specific goals and enables predictions under interventions. With the growing importance of learning causal relationships, causal discovery tasks have transitioned from using traditional methods to infer potential causal structures from observational data to the field of pattern recognition involved in deep learning. The rapid accumulation of massive data promotes the emergence of causal search methods with brilliant scalability. Existing summaries of causal discovery methods mainly focus on traditional methods based on constraints, scores and FCMs, there is a lack of perfect sorting and elaboration for deep learning-based methods, also lacking some considers and exploration of causal discovery methods from the perspective of variable paradigms. Therefore, we divide the possible causal discovery tasks into three types according to the variable paradigm and give the definitions of the three tasks respectively, define and instantiate the relevant datasets for each task and the final causal model constructed at the same time, then reviews the main existing causal discovery methods for different tasks. Finally, we propose some roadmaps from different perspectives for the current research gaps in the field of causal discovery and point out future research directions.
Ensembles over neural network weights trained from different random initialization, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter efficient. In this paper, we design ensembles not only over weights, but over hyperparameters to improve the state of the art in both settings. For best performance independent of budget, we propose hyper-deep ensembles, a simple procedure that involves a random search over different hyperparameters, themselves stratified across multiple random initializations. Its strong performance highlights the benefit of combining models with both weight and hyperparameter diversity. We further propose a parameter efficient version, hyper-batch ensembles, which builds on the layer structure of batch ensembles and self-tuning networks. The computational and memory costs of our method are notably lower than typical ensembles. On image classification tasks, with MLP, LeNet, and Wide ResNet 28-10 architectures, our methodology improves upon both deep and batch ensembles.
Translational distance-based knowledge graph embedding has shown progressive improvements on the link prediction task, from TransE to the latest state-of-the-art RotatE. However, N-1, 1-N and N-N predictions still remain challenging. In this work, we propose a novel translational distance-based approach for knowledge graph link prediction. The proposed method includes two-folds, first we extend the RotatE from 2D complex domain to high dimension space with orthogonal transforms to model relations for better modeling capacity. Second, the graph context is explicitly modeled via two directed context representations. These context representations are used as part of the distance scoring function to measure the plausibility of the triples during training and inference. The proposed approach effectively improves prediction accuracy on the difficult N-1, 1-N and N-N cases for knowledge graph link prediction task. The experimental results show that it achieves better performance on two benchmark data sets compared to the baseline RotatE, especially on data set (FB15k-237) with many high in-degree connection nodes.
Benefit from the quick development of deep learning techniques, salient object detection has achieved remarkable progresses recently. However, there still exists following two major challenges that hinder its application in embedded devices, low resolution output and heavy model weight. To this end, this paper presents an accurate yet compact deep network for efficient salient object detection. More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while keep accuracy. Secondly, we further propose reverse attention to guide such side-output residual learning in a top-down manner. By erasing the current predicted salient regions from side-output features, the network can eventually explore the missing object parts and details which results in high resolution and accuracy. Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, and with advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc, and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.