In this paper, we introduce and study the problem of \textit{binary stretch embedding} of edge-weighted graphs. This problem is closely related to the well-known \textit{addressing problem} of Graham and Pollak. The addressing problem asks for an assignment of the shortest possible strings (called ``addresses") over the alphabet $\{0,1,*\}$ to the vertices of an input graph $G$ with the following property: for every pair $u,v$ of vertices, the number of positions in which one of their addresses is $1$ and the other is $0$ equals the distance between $u$ and $v$ in $G$. When the addresses do not contain the symbol $*$, the problem is called \textit{isometric hypercube embedding}. As far as we know, isometric hypercube embedding was introduced by Firsov in 1965; it is known that such addresses do not exist for general graphs. Inspired by the addressing problem, we introduce the \textit{binary stretch embedding problem}, or BSEP for short, for edge-weighted undirected graphs, and we discuss how it relates to other graph embedding problems in the literature. Using tools and techniques such as Hadamard codes and the theory of linear programming, we derive several upper and lower bounds as well as exact solutions for certain classes of graphs. As an application of these results, we obtain improved upper bounds or exact values for the maximum size of Lee metric codes with certain parameters.
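To make the addressing condition concrete, the following minimal Python sketch (our own illustration, not code from the paper) computes the distance encoded by two addresses over $\{0,1,*\}$: positions where one address holds a $0$ and the other a $1$ contribute one unit each, while the symbol $*$ never clashes with anything.

```python
# Minimal sketch of the Graham-Pollak addressing distance.
def address_distance(a: str, b: str) -> int:
    assert len(a) == len(b), "addresses must have equal length"
    # count positions where one address is '0' and the other is '1'
    return sum(1 for x, y in zip(a, b) if {x, y} == {"0", "1"})

# Example: "10*" and "0*1" clash (1 vs 0) only in the first position,
# so the encoded distance is 1.
print(address_distance("10*", "0*1"))  # -> 1
```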
In this paper, we focus on the problem of conformal prediction with conditional guarantees. Prior work has shown that it is impossible to construct nontrivial prediction sets with full conditional coverage guarantees. A wealth of research has considered relaxations of full conditional guarantees, relying on some predefined uncertainty structures. Departing from this line of thinking, we propose Partition Learning Conformal Prediction (PLCP), a framework to improve the conditional validity of prediction sets by learning uncertainty-guided features from the calibration data. We implement PLCP efficiently with alternating gradient descent, utilizing off-the-shelf machine learning models. We further analyze PLCP theoretically and provide conditional guarantees in both the infinite- and finite-sample regimes. Finally, our experimental results over four real-world and synthetic datasets show the superior performance of PLCP compared to state-of-the-art methods in terms of coverage and length in both classification and regression scenarios.
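For intuition, the sketch below illustrates the partition-based idea behind conditional validity: calibration scores are split by group and a separate conformal quantile is computed per group. The grouping function, the score definition, and all names here are our own assumptions; PLCP itself learns the partition with alternating gradient descent, which is not reproduced.

```python
import numpy as np

def groupwise_conformal_quantiles(scores, groups, alpha=0.1):
    """Per-group (1 - alpha) conformal quantiles of calibration scores.

    scores : nonconformity scores on the calibration set, shape (n,)
    groups : integer group label per calibration point, shape (n,)
    This is a sketch of partition-conditional split conformal; PLCP *learns*
    the grouping, which is not shown here.
    """
    q = {}
    for g in np.unique(groups):
        s = np.sort(scores[groups == g])
        n = len(s)
        k = int(np.ceil((n + 1) * (1 - alpha))) - 1  # finite-sample correction
        q[g] = s[min(k, n - 1)]
    return q

def prediction_interval(y_hat, group, quantiles):
    """Interval for a test point, given its point prediction and group label."""
    r = quantiles[group]
    return y_hat - r, y_hat + r
```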
In this paper, we set the mathematical foundations of the Dynamical Low-Rank Approximation (DLRA) method for stochastic differential equations (SDEs). DLRA aims to approximate the solution as a linear combination of a small number of basis vectors with random coefficients (low-rank format), with the peculiarity that both the basis vectors and the random coefficients vary in time. While the formulation and properties of DLRA are now well understood for random/parametric equations, the same cannot be said for SDEs, and this work aims to fill this gap. We start by rigorously formulating a Dynamically Orthogonal (DO) approximation (an instance of DLRA successfully used in applications) for SDEs, which we then generalize to define a parametrization-independent DLRA for SDEs. We show local well-posedness of the DO equations and their equivalence with the DLRA formulation. We also characterize the explosion time of the DO solution by a loss of linear independence of the random coefficients defining the solution expansion, and we give sufficient conditions for global existence.
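For orientation, a rank-$R$ DO/DLRA ansatz can be written schematically as follows; the notation and gauge condition below are the standard ones for DO approximations of random equations and serve only as an illustration, since the paper's precise formulation for SDEs may differ:
\[
  X_t \;\approx\; \sum_{i=1}^{R} Y_i(t)\, U_i(t),
  \qquad
  \langle U_i(t), U_j(t)\rangle = \delta_{ij},
  \qquad
  \langle \dot{U}_i(t), U_j(t)\rangle = 0,
\]
where the $U_i$ are deterministic, time-dependent basis vectors, the $Y_i$ are random, time-dependent coefficients, and the last (gauge) condition removes the non-uniqueness of the factorization.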
In this work, we present a novel application of an uncertainty-quantification framework called Deep Evidential Learning to radiotherapy dose prediction. Using medical images from the Open Knowledge-Based Planning Challenge dataset, we found that this model can be effectively harnessed to yield uncertainty estimates that correlate with prediction errors upon completion of network training. This was achieved only after reformulating the original loss function for a stable implementation. We found that (i) epistemic uncertainty was highly correlated with prediction errors, with various association indices comparable to or stronger than those for Monte-Carlo Dropout and Deep Ensemble methods; (ii) the median error varied with the uncertainty threshold much more linearly for epistemic uncertainty in Deep Evidential Learning than in these two conventional frameworks, indicating a more uniformly calibrated sensitivity to model errors; and (iii) relative to epistemic uncertainty, aleatoric uncertainty showed a more significant shift in its distribution in response to Gaussian noise added to CT intensity, compatible with its interpretation as reflecting data noise. Collectively, our results suggest that Deep Evidential Learning is a promising approach for endowing deep-learning models for radiotherapy dose prediction with statistical robustness. Towards enhancing its clinical relevance, we demonstrate how such a model can be used to construct confidence intervals for the predicted Dose-Volume Histograms.
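As background for the epistemic/aleatoric split discussed above, the sketch below follows the standard Deep Evidential Regression parametrization with a Normal-Inverse-Gamma head (an assumption on our part; the paper's reformulated loss is not reproduced here). The network predicts $(\gamma, \nu, \alpha, \beta)$ per voxel, and the two uncertainties are obtained in closed form.

```python
import torch

def evidential_uncertainties(gamma, nu, alpha, beta):
    """Uncertainty decomposition under a Normal-Inverse-Gamma evidential head.

    gamma, nu, alpha, beta : tensors predicted by the network (nu, beta > 0, alpha > 1).
    """
    aleatoric = beta / (alpha - 1.0)          # E[sigma^2]: data (noise) uncertainty
    epistemic = beta / (nu * (alpha - 1.0))   # Var[mu]:    model uncertainty
    prediction = gamma                        # predicted dose value
    return prediction, aleatoric, epistemic

# Toy example for a single voxel with (gamma, nu, alpha, beta) = (60, 2, 3, 4):
print(evidential_uncertainties(torch.tensor(60.0), torch.tensor(2.0),
                               torch.tensor(3.0), torch.tensor(4.0)))
# -> (60.0, aleatoric = 2.0, epistemic = 1.0)
```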
In this paper, we propose a method for calculating the three-dimensional (3D) position and orientation of a pallet placed on a shelf beside a forklift truck using a 360-degree camera. A 360-degree camera mounted on the forklift truck can observe both the pallet at the side of the forklift and one several meters ahead. However, the pallet appears in the captured image with a distortion that depends on its 3D position, which makes it difficult to extract the pallet from the image. To solve this problem, a method [1] has been proposed that detects a pallet by projecting the 360-degree image onto a vertical plane coinciding with the front of the shelf, producing an image similar to a view from directly in front of the shelf. Detection also yields the approximate position and orientation of the pallet, but the accuracy is not sufficient for automatic control of the forklift truck. In this paper, we propose a method for accurately estimating the yaw angle, i.e., the angle of the front surface of the pallet in the horizontal plane, by projecting the 360-degree image onto a horizontal plane that contains the boundary line of the detected pallet's front surface. The position of the pallet is then determined by moving a vertical plane with the detected yaw angle back and forth and finding the position at which the agreement between the projection image on the vertical plane and the actual size of the pallet's front surface is maximized. Experiments using real images taken in a laboratory and in an actual warehouse confirm that the proposed method can calculate the position and orientation of a pallet within a reasonable computation time and with the accuracy necessary for inserting the fork into the hole in the front of the pallet.
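To illustrate the core resampling step of such plane projections, the sketch below maps an equirectangular 360-degree image onto a vertical plane in front of the camera. The coordinate conventions (+Z forward, +Y up, camera at the origin) and all parameter names are our assumptions; the paper additionally searches over plane pose, which is not shown.

```python
import numpy as np

def project_equirect_to_plane(equi, plane_w_m, plane_h_m, dist_m,
                              out_w=640, out_h=480):
    """Resample a 360-degree (equirectangular) image onto a vertical plane
    perpendicular to +Z at `dist_m` meters. A minimal sketch only."""
    H, W = equi.shape[:2]
    # metric coordinates of each output pixel on the plane
    xs = np.linspace(-plane_w_m / 2, plane_w_m / 2, out_w)
    ys = np.linspace(plane_h_m / 2, -plane_h_m / 2, out_h)
    X, Y = np.meshgrid(xs, ys)
    Z = np.full_like(X, dist_m)
    # viewing direction of each plane point, as longitude/latitude
    lon = np.arctan2(X, Z)
    lat = np.arctan2(Y, np.hypot(X, Z))
    # corresponding source pixel in the equirectangular image
    u = ((lon / (2 * np.pi) + 0.5) * W).astype(int) % W
    v = np.clip(((0.5 - lat / np.pi) * H).astype(int), 0, H - 1)
    return equi[v, u]
```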
Data-driven predictions are often perceived as inaccurate in hindsight due to behavioral responses. In this study, we explore the role of interface design choices in shaping individuals' decision-making processes in response to predictions presented on a shared information display in a strategic setting. We introduce a novel staged experimental design to investigate the effects of design features, such as visualizations of prediction uncertainty and error, within a repeated congestion game. In this game, participants assume the role of taxi drivers and use a shared information display to decide where to search for their next ride. Our experimental design endows agents with varying level-$k$ depths of thinking, allowing some agents to possess greater sophistication in anticipating the decisions of others using the same information display. Through several extensive experiments, we identify trade-offs between displays that optimize individual decisions and those that best serve the collective social welfare of the system. We find that the influence of display characteristics varies based on an agent's strategic sophistication. We observe that design choices promoting individual-level decision-making can lead to suboptimal system outcomes, as manifested by a lower realization of potential social welfare. However, this decline in social welfare is offset by a reduction in the distribution shift, narrowing the gap between predicted and realized system outcomes, which potentially enhances the perceived reliability and trustworthiness of the information display post hoc. Our findings pave the way for new research questions concerning the design of effective prediction interfaces in strategic environments.
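To make the notion of level-$k$ depth of thinking concrete, the toy sketch below (our own simplification, not the experimental platform from the study) models a driver choosing between two locations from a shared display of predicted waiting times: a level-0 agent trusts the display literally, while a level-$k$ agent anticipates the crowding caused by level-$(k-1)$ agents.

```python
import numpy as np

def level_k_choice(pred_wait, k):
    """Illustrative level-k reasoning in a two-location congestion game.

    pred_wait : displayed predicted waiting time at each location
    k         : depth of strategic reasoning
    """
    pred_wait = np.asarray(pred_wait, dtype=float)
    if k == 0:
        # level-0: best respond to the shared display as if others do not react
        return int(np.argmin(pred_wait))
    # level-k: assume everyone else reasons at level k-1 and crowds that choice,
    # inflating its effective waiting time
    crowd = level_k_choice(pred_wait, k - 1)
    penalty = np.zeros_like(pred_wait)
    penalty[crowd] = pred_wait.mean()  # crude stand-in for a congestion cost
    return int(np.argmin(pred_wait + penalty))

print(level_k_choice([3.0, 5.0], k=0))  # -> 0: go to the shorter displayed wait
print(level_k_choice([3.0, 5.0], k=1))  # -> 1: anticipate crowding at location 0
```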
We introduce and investigate a powerful hyperlogical framework in the linear-time setting, which we call generalized HyperLTL with stuttering and contexts (GHyperLTL_SC for short). GHyperLTL_SC unifies known asynchronous extensions of HyperLTL and the well-known extension KLTL of LTL with knowledge modalities under both the synchronous and asynchronous perfect-recall semantics. As a main contribution, we identify a meaningful fragment of GHyperLTL_SC, which we call simple GHyperLTL_SC, with a decidable model-checking problem; it is more expressive than HyperLTL and than the known fragments of asynchronous extensions of HyperLTL with a decidable model-checking problem. Simple GHyperLTL_SC subsumes KLTL under the synchronous semantics and the one-agent fragment of KLTL under the asynchronous semantics, and, to the best of our knowledge, it is the only hyperlogic with a decidable model-checking problem that can express powerful non-regular trace properties when interpreted on singleton sets of traces. We justify the relevance of simple GHyperLTL_SC by showing that it can express diagnosability properties, interesting classes of information-flow security policies in both the synchronous and asynchronous settings, and bounded termination (more generally, global promptness in the style of Prompt LTL).
In this paper, we show how mixed-integer conic optimization can be used to combine feature subset selection with holistic generalized linear models to fully automate the model selection process. Concretely, we directly optimize for the Akaike and Bayesian information criteria while imposing constraints designed to deal with multicollinearity in the feature selection task. Specifically, we propose a novel pairwise correlation constraint that combines the sign coherence constraint with ideas from classical statistical models like Ridge regression and the OSCAR model.
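For intuition, one common way to write down such a selection model is sketched below; this is a schematic of our own, not the paper's exact formulation, whose pairwise correlation constraint combines sign coherence with Ridge/OSCAR-style ideas rather than the hard exclusion shown here:
\begin{align*}
  \min_{\beta,\, z} \quad & 2\sum_{j=1}^{p} z_j \;-\; 2\log \hat{L}(\beta)
      && \text{(AIC-style objective)}\\
  \text{s.t.} \quad & -M z_j \le \beta_j \le M z_j, && j = 1,\dots,p,\\
  & z_i + z_j \le 1 && \text{whenever } |\rho_{ij}| \ge \rho_{\max},\\
  & z_j \in \{0,1\}, && j = 1,\dots,p,
\end{align*}
where $z_j$ indicates whether feature $j$ enters the model, $M$ is a big-M bound linking coefficients to indicators, and $\rho_{ij}$ is the sample correlation between features $i$ and $j$.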
BERT, a pre-trained Transformer model, has achieved ground-breaking performance on multiple NLP tasks. In this paper, we describe BERTSUM, a simple variant of BERT, for extractive summarization. Our system is the state of the art on the CNN/Dailymail dataset, outperforming the previous best-performing system by 1.65 on ROUGE-L. The code to reproduce our results is available at //github.com/nlpyang/BertSum
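The sketch below illustrates the BERTSUM-style idea of scoring sentences for extraction: each sentence is prefixed with a [CLS] token, the encoder's [CLS] vectors serve as sentence representations, and a sigmoid classifier assigns inclusion scores. This is our own minimal illustration (it omits BERTSUM's interval segment embeddings and inter-sentence summarization layers), not the released code.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")
scorer = torch.nn.Linear(encoder.config.hidden_size, 1)  # untrained placeholder

def score_sentences(sentences):
    ids, cls_positions = [], []
    for sent in sentences:
        cls_positions.append(len(ids))                   # remember each [CLS] slot
        ids += [tokenizer.cls_token_id]
        ids += tokenizer.encode(sent, add_special_tokens=False)
        ids += [tokenizer.sep_token_id]
    input_ids = torch.tensor([ids[:512]])                # truncate to BERT's limit
    with torch.no_grad():
        hidden = encoder(input_ids).last_hidden_state[0]  # (seq_len, hidden)
    cls_positions = [p for p in cls_positions if p < input_ids.shape[1]]
    sent_vecs = hidden[cls_positions]                    # one vector per sentence
    return torch.sigmoid(scorer(sent_vecs)).squeeze(-1)  # inclusion scores

print(score_sentences(["BERT encodes text.", "BERTSUM scores sentences."]))
```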
In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions that are temporally memorized in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency. Second, a new optimization approach, called dynamic-critical reinforcement learning, is introduced to extend the standard supervised method. It always encourages the model to predict a more acceptable answer, addressing the convergence suppression problem that occurs in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Meanwhile, our model outperforms previous systems by over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD datasets.
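As a rough illustration of the reattention idea (a simplification of our own, not the paper's exact formula), the sketch below refines the current query-context similarity with the attention map memorized from the previous alignment round before renormalizing, so positions attended earlier can be boosted or suppressed.

```python
import torch
import torch.nn.functional as F

def reattention(similarity, past_attention, gamma):
    """Schematic re-attention step.

    similarity     : (num_queries, num_context) raw similarity of this round
    past_attention : (num_queries, num_context) attention weights of last round
    gamma          : scalar gate (a learnable parameter in a real model)
    """
    refined = similarity + gamma * past_attention
    return F.softmax(refined, dim=-1)

# Toy example: a second alignment round reuses the first round's attention map
# as a temporal memory.
sim = torch.randn(4, 6)
past = F.softmax(torch.randn(4, 6), dim=-1)
print(reattention(sim, past, gamma=0.5).shape)  # torch.Size([4, 6])
```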
In this paper, we propose a conceptually simple and geometrically interpretable objective function, i.e., additive margin Softmax (AM-Softmax), for deep face verification. In general, the face verification task can be viewed as a metric learning problem, so learning large-margin face features whose intra-class variation is small and whose inter-class difference is large is of great importance for achieving good performance. Recently, Large-margin Softmax and Angular Softmax have been proposed to incorporate the angular margin in a multiplicative manner. In this work, we introduce a novel additive angular margin for the Softmax loss, which is intuitively appealing and more interpretable than the existing works. We also emphasize and discuss the importance of feature normalization. Most importantly, our experiments on LFW BLUFR and MegaFace show that our additive margin Softmax loss consistently performs better than the current state-of-the-art methods using the same network architecture and training dataset. Our code has also been made available at //github.com/happynear/AMSoftmax
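A minimal PyTorch sketch of the loss described above is given below: cosine similarities between L2-normalized features and class weights are computed, the margin m is subtracted from the target-class cosine only, and the result is scaled by s before a standard cross-entropy. The hyperparameter values shown are common defaults, not necessarily those used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AMSoftmaxLoss(nn.Module):
    """Additive margin Softmax loss for deep face verification (sketch)."""
    def __init__(self, feat_dim, num_classes, s=30.0, m=0.35):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.s, self.m = s, m

    def forward(self, features, labels):
        # cosine of the angle between normalized features and class weights
        cos = F.linear(F.normalize(features), F.normalize(self.weight))
        margin = F.one_hot(labels, cos.size(1)).float() * self.m
        logits = self.s * (cos - margin)  # subtract m only for the true class
        return F.cross_entropy(logits, labels)

# Example usage with a 512-dim embedding and 1000 identities:
loss_fn = AMSoftmaxLoss(feat_dim=512, num_classes=1000)
loss = loss_fn(torch.randn(8, 512), torch.randint(0, 1000, (8,)))
```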