Data analysis is challenging as analysts must navigate nuanced decisions that may yield divergent conclusions. AI assistants have the potential to support analysts in planning their analyses, enabling more robust decision making. Though AI-based assistants that target code execution (e.g., GitHub Copilot) have received significant attention, limited research addresses assistance for both analysis execution and planning. In this work, we characterize helpful planning suggestions and their impacts on analysts' workflows. We first review the analysis planning literature and crowd-sourced analysis studies to categorize suggestion content. We then conduct a Wizard-of-Oz study (n=13) to observe analysts' preferences and reactions to planning assistance in a realistic scenario. Our findings highlight subtleties in contextual factors that impact suggestion helpfulness, emphasizing design implications for supporting different abstractions of assistance, forms of initiative, increased engagement, and alignment of goals between analysts and assistants.
Pose refinement is an interesting and practically relevant research direction. Pose refinement can be used to (1) obtain a more accurate pose estimate from an initial prior (e.g., from retrieval), (2) serve as pre-processing, i.e., provide a better starting point for a more expensive pose estimator, or (3) act as post-processing for a more accurate localizer. Existing approaches focus on learning features / scene representations for the pose refinement task. This involves training an implicit scene representation or learning features while optimizing a camera pose-based loss. A natural question is whether training specific features / representations is truly necessary or whether similar results can already be achieved with more generic features. In this work, we present a simple approach that combines pre-trained features with a particle filter and a renderable representation of the scene. Despite its simplicity, it achieves state-of-the-art results, demonstrating that one can easily build a pose refiner without the need for specific training. The code is at //github.com/ga1i13o/mcloc_poseref
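To make the pipeline concrete, here is a minimal NumPy sketch of render-and-compare particle-filter pose refinement. It assumes hypothetical callables `render_view` (standing in for the renderable scene representation) and `extract_features` (any generic pre-trained backbone); these names, the 6-vector pose parameterization, and the annealing schedule are illustrative assumptions, not the authors' API.

```python
import numpy as np

def cosine_sim(a, b):
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def perturb(pose, sigma, rng):
    # Pose as a 6-vector (translation + axis-angle), jittered with Gaussian noise.
    return pose + rng.normal(scale=sigma, size=6)

def refine_pose(query_img, init_pose, render_view, extract_features,
                n_particles=64, n_iters=10, sigma=0.05, seed=0):
    rng = np.random.default_rng(seed)
    query_feat = extract_features(query_img)
    particles = [perturb(np.asarray(init_pose, float), sigma, rng)
                 for _ in range(n_particles)]
    best_pose, best_score = np.asarray(init_pose, float), -np.inf
    for _ in range(n_iters):
        # Score each candidate pose by feature similarity between the
        # rendered view and the query image.
        scores = np.array([cosine_sim(query_feat,
                                      extract_features(render_view(p)))
                           for p in particles])
        if scores.max() > best_score:
            best_score, best_pose = scores.max(), particles[int(scores.argmax())]
        # Resample around high-scoring poses, shrinking the noise each round.
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        sigma *= 0.7  # anneal the search radius as the estimate sharpens
        particles = [perturb(particles[i], sigma, rng) for i in idx]
    return best_pose
```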
Retrieving relevant tables containing the necessary information to accurately answer a given question over tables is critical to open-domain question-answering (QA) systems. Previous methods assume the answer to such a question can be found either in a single table or in multiple tables identified through question decomposition or rewriting. However, neither of these approaches is sufficient, as many questions require retrieving multiple tables and joining them through a join plan that cannot be discerned from the user query itself. If the join plan is not considered in the retrieval stage, the subsequent steps of reasoning and answering based on those retrieved tables are likely to be incorrect. To address this problem, we introduce a method that uncovers useful join relations for any query and database during table retrieval. We use a novel re-ranking method formulated as a mixed-integer program that considers not only table-query relevance but also table-table relevance that requires inferring join relationships. Our method outperforms the state-of-the-art approaches for table retrieval by up to 9.3% in F1 score and for end-to-end QA by up to 5.4% in accuracy.
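The abstract does not spell out the exact program, but a re-ranking MIP of this flavor can be prototyped in a few lines with PuLP; the objective weights, the pairwise join-relevance scores, and the cardinality cap `k` below are illustrative assumptions rather than the paper's formulation.

```python
import pulp

def select_tables(query_rel, join_rel, k=3):
    """query_rel[i]: table-query relevance for table i; join_rel[(i, j)]:
    inferred join (table-table) relevance for candidate pairs.
    Returns indices of the chosen tables."""
    n = len(query_rel)
    prob = pulp.LpProblem("table_reranking", pulp.LpMaximize)
    x = [pulp.LpVariable(f"x_{i}", cat=pulp.LpBinary) for i in range(n)]
    y = {p: pulp.LpVariable(f"y_{p[0]}_{p[1]}", cat=pulp.LpBinary)
         for p in join_rel}
    # Reward individually relevant tables and jointly useful, joinable pairs.
    prob += (pulp.lpSum(query_rel[i] * x[i] for i in range(n))
             + pulp.lpSum(join_rel[p] * y[p] for p in join_rel))
    # A join is only counted if both of its tables are selected.
    for (i, j), var in y.items():
        prob += var <= x[i]
        prob += var <= x[j]
    prob += pulp.lpSum(x) <= k  # retrieve at most k tables
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [i for i in range(n) if x[i].value() > 0.5]
```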
Extrinsic manipulation, the use of environment contacts to achieve manipulation objectives, enables strategies that are otherwise impossible with a parallel jaw gripper. However, orchestrating a long-horizon sequence of contact interactions between the robot, object, and environment is notoriously challenging due to the scene diversity, large action space, and difficult contact dynamics. We observe that most extrinsic manipulation tasks are combinations of short-horizon primitives, each of which depends strongly on initializing from a desirable contact configuration to succeed. Therefore, we propose to generalize one extrinsic manipulation trajectory to diverse objects and environments by retargeting contact requirements. We prepare a single library of robust short-horizon, goal-conditioned primitive policies, and design a framework to compose state constraints stemming from the contact specifications of each primitive. Given a test scene and a single demo prescribing the primitive sequence, our method enforces the state constraints on the test scene and finds intermediate goal states using inverse kinematics. The goals are then tracked by the primitive policies. Using a 7+1 DoF robotic arm-gripper system, we achieved an overall success rate of 80.5% on hardware over 4 long-horizon extrinsic manipulation tasks, each with up to 4 primitives. Our experiments cover 10 objects and 6 environment configurations. We further show empirically that our method admits a wide range of demonstrations, and that contact retargeting is indeed the key to successfully combining primitives for long-horizon extrinsic manipulation. Code and additional details are available at stanford-tml.github.io/extrinsic-manipulation.
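Schematically, the compose-and-track loop might look like the following sketch. Every name here (`contact_spec`, `retarget`, `solve_ik`, the policy interface) is hypothetical; the abstract only fixes the overall structure: retarget contacts, solve IK for intermediate goals, track with goal-conditioned primitives.

```python
def execute_demo(demo_primitives, test_scene, policies, solve_ik):
    """demo_primitives: ordered primitives from a single demo, each carrying
    a contact specification; policies: name -> goal-conditioned policy."""
    state = test_scene.initial_state()
    for prim in demo_primitives:
        # Retarget the demo's contact spec (robot-object-environment
        # contacts) onto the geometry of the test scene.
        constraints = prim.contact_spec.retarget(test_scene)
        # Inverse kinematics finds an intermediate goal state satisfying them.
        goal = solve_ik(constraints, seed=state)
        if goal is None:
            raise RuntimeError(f"no feasible goal for primitive {prim.name}")
        # The short-horizon, goal-conditioned policy tracks the goal.
        state = policies[prim.name].rollout(state, goal)
    return state
```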
While reachability analysis is one of the most promising approaches for formal verification of dynamic systems, a major disadvantage preventing a more widespread application is the requirement to manually tune algorithm parameters such as the time step size. Manual tuning is especially problematic if one aims to verify that the system satisfies complicated specifications described by signal temporal logic formulas, since the effect the tightness of the reachable set has on the satisfaction of the specification is often non-trivial for humans to see. We address this problem with a fully-automated verifier for linear systems, which automatically refines all parameters for reachability analysis until it can either prove or disprove that the system satisfies a signal temporal logic formula for all initial states and all uncertain inputs. Our verifier combines reachset temporal logic with dependency preservation to obtain a model checking approach whose over-approximation error converges to zero for adequately tuned parameters. While we focus on linear systems in this work for simplicity, the general concept we present can equivalently be applied to nonlinear and hybrid systems.
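Conceptually, the verifier is a refine-until-decided loop. The sketch below assumes hypothetical helpers `compute_reachable_set` and a three-valued `check_stl`; the particular parameters refined (time step, set-representation order) and the halving/doubling schedule are illustrative guesses, not the paper's interface.

```python
def verify(compute_reachable_set, check_stl, system, stl_formula,
           dt=0.1, order=10, max_iters=20):
    for _ in range(max_iters):
        reach = compute_reachable_set(system, dt=dt, set_order=order)
        verdict = check_stl(reach, stl_formula)  # "SAT" / "UNSAT" / "UNKNOWN"
        if verdict in ("SAT", "UNSAT"):
            # Proved or disproved for all initial states and uncertain inputs.
            return verdict
        # Inconclusive: the over-approximation is too coarse, so tighten the
        # parameters; the error converges to zero as they are refined.
        dt /= 2
        order *= 2
    return "UNKNOWN"
```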
Conventional distributed approaches to coverage control may suffer from non-convergence and poor performance because agents have limited information, especially in non-convex discrete environments. To address this issue, we extend the approach of [Marden 2016], which demonstrates how a limited degree of inter-agent communication can be exploited to overcome such pitfalls in one-dimensional discrete environments. The focus of this paper is on extending these results to general dimensional settings. We show that the extension is convergent and preserves the approximation ratio of 2, meaning that any stable solution is guaranteed to perform within 50% of the optimal one. Experimental results show that our algorithm outperforms several state-of-the-art algorithms, and that its runtime is scalable.
Mediation analysis is an important statistical tool in many research fields. Its aim is to investigate the mechanism along the causal pathway between an exposure and an outcome. The joint significance test is widely utilized as a prominent statistical approach for examining mediation effects in practical applications. Nevertheless, this mediation testing method is limited by its conservative Type I error, which reduces its statistical power and constrains its popularity and utility. To address this gap, we propose the adjusted joint significance test for one mediator, a novel data-adaptive test for the mediation effect that offers significant advancements over the traditional joint significance test. The proposed method is designed to be user-friendly, eliminating the need for complicated procedures. We derive explicit expressions for size and power, ensuring the theoretical validity of our approach. Furthermore, we extend the adjusted joint significance test to small-scale mediation hypotheses with family-wise error rate (FWER) control. Additionally, a novel adjusted Sobel-type approach is proposed for estimating confidence intervals for the mediation effects, demonstrating significant advancements over conventional Sobel's confidence intervals in terms of achieving desirable coverage probabilities. Our mediation testing and confidence interval procedures are evaluated through comprehensive simulations and compared with numerous existing approaches. Finally, we illustrate the usefulness of our method by analysing three real-world datasets with continuous, binary and time-to-event outcomes, respectively.
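For context, the classical quantities this abstract builds on (standard definitions, not the paper's adjusted versions): with mediator model M = aX + e1 and outcome model Y = bM + cX + e2, the mediation effect is the product ab. The joint significance test rejects only when both component tests do, so its actual size falls below the nominal level, which is the conservatism noted above; Sobel's statistic underlies the conventional confidence interval.

```latex
% Joint significance (max-P) test for H_0 : ab = 0:
\text{reject } H_0 \iff \max(p_a, p_b) \le \alpha
% Sobel's statistic and the conventional confidence interval:
Z = \frac{\hat{a}\hat{b}}{\sqrt{\hat{a}^{2}\hat{\sigma}_{b}^{2} + \hat{b}^{2}\hat{\sigma}_{a}^{2}}},
\qquad
\hat{a}\hat{b} \pm z_{1-\alpha/2}\sqrt{\hat{a}^{2}\hat{\sigma}_{b}^{2} + \hat{b}^{2}\hat{\sigma}_{a}^{2}}
```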
Existing prompt learning methods have shown certain capabilities in Out-of-Distribution (OOD) detection, but the absence of OOD images from the target dataset during training can lead to mismatches between OOD images and In-Distribution (ID) categories, resulting in a high false positive rate. To address this issue, we introduce a novel OOD detection method, named 'NegPrompt', that learns a set of negative prompts, each representing a negative connotation of a given class label, to delineate the boundaries between ID and OOD images. It learns such negative prompts with ID data only, without any reliance on external outlier data. Further, current methods assume the availability of samples of all ID classes, rendering them ineffective in open-vocabulary learning scenarios where the inference stage can contain novel ID classes not present during training. In contrast, our learned negative prompts are transferable to novel class labels. Experiments on various ImageNet benchmarks show that NegPrompt surpasses state-of-the-art prompt-learning-based OOD detection methods and maintains a consistent lead in hard OOD detection in both closed- and open-vocabulary classification scenarios. Code is available at //github.com/mala-lab/negprompt.
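The abstract does not give NegPrompt's exact loss or score, but the idea of contrasting ID prompts with learned negative prompts can be illustrated with a small PyTorch scoring sketch; the softmax temperature and the mass-on-negatives score below are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def ood_score(image_feat, pos_text_feats, neg_text_feats, temp=0.01):
    """image_feat: (d,); pos_text_feats: (C, d) ID-class prompt embeddings;
    neg_text_feats: (C * K, d) learned negative-prompt embeddings.
    All inputs are assumed L2-normalized, CLIP-style."""
    sims = torch.cat([pos_text_feats, neg_text_feats]) @ image_feat  # cosine sims
    probs = F.softmax(sims / temp, dim=0)
    # Probability mass assigned to negative prompts: high => likely OOD.
    return probs[pos_text_feats.shape[0]:].sum().item()
```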
Even though research has repeatedly shown that non-cash incentives can be effective, cash incentives are the de facto standard in crowdsourcing contests. In this multi-study research, we quantify ideators' preferences for non-cash incentives and investigate how allowing ideators to self-select their preferred incentive -- offering ideators a choice between cash and non-cash incentives -- affects their creative performance. We further explore whether the market context of the organization hosting the contest -- social (non-profit) or monetary (for-profit) -- moderates incentive preferences and their effectiveness. We find that individuals exhibit heterogeneous incentive preferences and often prefer non-cash incentives, even in for-profit contexts. Offering ideators a choice of incentives can enhance creative performance. Market context moderates the effect of incentives, such that ideators who receive non-cash incentives in for-profit contexts tend to exert less effort. We show that heterogeneity of ideators' preferences (and the ability to satisfy diverse preferences with suitably diverse incentive options) is a critical boundary condition to realizing benefits from offering ideators a choice of incentives. We provide managers with guidance to design effective incentives by improving incentive-preference fit for ideators.
Gaussian processes (GPs) are commonly used for geospatial analysis, but they suffer from high computational complexity when dealing with massive data. For instance, the log-likelihood function required in estimating the statistical model parameters for geospatial data is a computationally intensive procedure that involves computing the inverse of a covariance matrix of size n × n, where n represents the number of geographical locations. As a result, in the literature, studies have shifted towards approximation methods to handle larger values of n effectively while maintaining high accuracy. These methods encompass a range of techniques, including low-rank and sparse approximations. Vecchia approximation is one of the most promising methods to speed up evaluating the log-likelihood function. This study presents a parallel implementation of the Vecchia approximation, utilizing batched matrix computations on contemporary GPUs. The proposed implementation relies on batched linear algebra routines to efficiently execute individual conditional distributions in the Vecchia algorithm. We rely on the KBLAS linear algebra library to perform batched linear algebra operations, reducing the time to solution compared to the state-of-the-art parallel implementation of the likelihood estimation operation in the ExaGeoStat software by up to 700X, 833X, and 1380X on 32GB GV100, 80GB A100, and 80GB H100 GPUs, respectively. We also successfully manage larger problem sizes on a single NVIDIA GPU, accommodating up to 1M locations with 80GB A100 and H100 GPUs while maintaining the necessary application accuracy. We further assess the accuracy performance of the implemented algorithm, identifying the optimal settings for the Vecchia approximation algorithm to preserve accuracy on two real geospatial datasets: soil moisture data in the Mississippi Basin area and wind speed data in the Middle East.
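For readers unfamiliar with the approximation itself: Vecchia replaces the joint Gaussian likelihood with a product of conditionals, each conditioned on a small neighbor set, so the single n × n factorization becomes many independent m × m solves; these are exactly the small problems that map onto batched GPU routines. Below is a plain (unbatched, CPU) NumPy sketch of the quantity being computed, with a user-supplied covariance function `cov` as the only assumption.

```python
import numpy as np

def vecchia_loglik(y, locs, neighbors, cov):
    """y: (n,) observations; locs: (n, d) coordinates; neighbors[i]: indices
    of the (at most m) conditioning points for location i; cov(A, B):
    covariance matrix between two sets of locations."""
    ll = 0.0
    for i in range(len(y)):
        nb = neighbors[i]
        var_i = cov(locs[i:i+1], locs[i:i+1])[0, 0]
        if len(nb) == 0:
            mu, var = 0.0, var_i
        else:
            C = cov(locs[nb], locs[nb])            # small m x m system
            c = cov(locs[nb], locs[i:i+1])[:, 0]
            L = np.linalg.cholesky(C)
            w = np.linalg.solve(L.T, np.linalg.solve(L, c))  # C^{-1} c
            mu = w @ y[nb]                          # conditional mean
            var = var_i - c @ w                     # conditional variance
        ll += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mu) ** 2 / var)
    return ll
```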
Multi-relation Question Answering is a challenging task due to the need for elaborate analysis of questions and reasoning over multiple fact triples in a knowledge base. In this paper, we present a novel model called the Interpretable Reasoning Network, which employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis.
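A schematic PyTorch sketch of the hop loop the abstract describes; the dimensions, update rules, and module choices are guesses for illustration, not the paper's exact architecture (and the hard argmax is for inference only; training would need a differentiable choice).

```python
import torch
import torch.nn as nn

class HopByHopReasoner(nn.Module):
    def __init__(self, d, n_relations, n_hops=3):
        super().__init__()
        self.attn = nn.Linear(d, 1)        # which question part to analyze
        self.rel_clf = nn.Linear(d, n_relations)
        self.rel_emb = nn.Embedding(n_relations, d)
        self.state_upd = nn.GRUCell(d, d)
        self.n_hops = n_hops

    def forward(self, q_words):            # q_words: (L, d) word vectors
        state = q_words.mean(dim=0)
        trace = []                          # intermediate predictions
        for _ in range(self.n_hops):
            # 1. Decide which part of the question to analyze at this hop.
            a = torch.softmax(self.attn(q_words + state).squeeze(-1), dim=0)
            focus = a @ q_words
            # 2. Predict the relation for the current parsed result.
            rel = self.rel_clf(focus).argmax()
            trace.append(int(rel))
            # 3. Use the predicted relation to update the question
            #    representation and the reasoning state.
            q_words = q_words - a.unsqueeze(-1) * self.rel_emb(rel)
            state = self.state_upd(self.rel_emb(rel).unsqueeze(0),
                                   state.unsqueeze(0)).squeeze(0)
        return trace                        # traceable hop-by-hop relations
```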