It does not matter whether it is a job interview with Tech Giants, Wall Street firms, or a small startup; all candidates want to demonstrate their best selves or even present themselves better than they really are. Meanwhile, recruiters want to know the candidates' authentic selves and detect soft skills that prove an expert candidate would be a great fit in any company. Recruiters worldwide usually struggle to find employees with the highest level of these skills. Digital footprints can assist recruiters in this process by providing candidates' unique set of online activities, while social media delivers one of the largest digital footprints to track people. In this study, for the first time, we show that a wide range of behavioral competencies consisting of 16 in-demand soft skills can be automatically predicted from Instagram profiles based on the following lists and other quantitative features using machine learning algorithms. We also provide predictions on Big Five personality traits. Models were built based on a sample of 400 Iranian volunteer users who answered an online questionnaire and provided their Instagram usernames which allowed us to crawl the public profiles. We applied several machine learning algorithms to the uniformed data. Deep learning models mostly outperformed by demonstrating 70% and 69% average Accuracy in two-level and three-level classifications respectively. Creating a large pool of people with the highest level of soft skills, and making more accurate evaluations of job candidates is possible with the application of AI on social media user-generated data.
Prompt and accurate detection of system anomalies is essential to ensure the reliability of software systems. Unlike manual efforts that exploit all available run-time information, existing approaches usually leverage only a single type of monitoring data (often logs or metrics) or fail to make effective use of the joint information among multi-source data. Consequently, many false predictions occur. To better understand the manifestations of system anomalies, we conduct a comprehensive empirical study based on a large amount of heterogeneous data, i.e., logs and metrics. Our study demonstrates that system anomalies could manifest distinctly in different data types. Thus, integrating heterogeneous data can help recover the complete picture of a system's health status. In this context, we propose HADES, the first work to effectively identify system anomalies based on heterogeneous data. Our approach employs a hierarchical architecture to learn a global representation of the system status by fusing log semantics and metric patterns. It captures discriminative features and meaningful interactions from multi-modal data via a novel cross-modal attention module, enabling accurate system anomaly detection. We evaluate HADES extensively on large-scale simulated and industrial datasets. The experimental results present the superiority of HADES in detecting system anomalies on heterogeneous data. We release the code and the annotated dataset for reproducibility and future research.
Pressure for higher productivity and faster delivery is increasingly pervading software organizations. This can lead software engineers to act like chess players playing a gambit -- making sacrifices of their technically sound estimates, thus submitting their teams to time pressure. In turn, time pressure can have varied detrimental effects, such as poor product quality and emotional distress, decreasing productivity, which leads to more time pressure and delays: a hard-to-stop vicious cycle. This reveals a need for moving on from the more passive strategy of yielding to pressure to a more active one of defending software estimates. Therefore, we propose an approach to support software estimators in acquiring knowledge on how to carry out such defense, by introducing negotiation principles encapsulated in a set of defense lenses, presented through a digital simulation. We evaluated the proposed approach through a controlled experiment with software practitioners from different companies. We collected data on participants' attitudes, subjective norms, perceived behavioral control, and intentions to perform the defense of their estimates in light of the Theory of Planned Behavior. We employed a frequentist and a bayesian approach to data analysis. Results show improved scores among experimental group participants after engaging with the digital simulation and learning about the lenses. They were also more inclined to choose a defense action when facing pressure scenarios than a control group exposed to questions to reflect on the reasons and outcomes of pressure over estimates. Qualitative evidence reveals that practitioners perceived the set of lenses as useful in their current work environments. Collectively, these results show the effectiveness of the proposed approach and its perceived relevance for the industry, despite the low amount of time required to engage with it.
Earth imaging satellites are a crucial part of our everyday lives that enable global tracking of industrial activities. Use cases span many applications, from weather forecasting to digital maps, carbon footprint tracking, and vegetation monitoring. However, there are also limitations; satellites are difficult to manufacture, expensive to maintain, and tricky to launch into orbit. Therefore, it is critical that satellites are employed efficiently. This poses a challenge known as the satellite mission planning problem, which could be computationally prohibitive to solve on large scales. However, close-to-optimal algorithms can often provide satisfactory resolutions, such as greedy reinforcement learning, and optimization algorithms. This paper introduces a set of quantum algorithms to solve the mission planning problem and demonstrate an advantage over the classical algorithms implemented thus far. The problem is formulated as maximizing the number of high-priority tasks completed on real datasets containing thousands of tasks and multiple satellites. This work demonstrates that through solution-chaining and clustering, optimization and machine learning algorithms offer the greatest potential for optimal solutions. Most notably, this paper illustrates that a hybridized quantum-enhanced reinforcement learning agent can achieve a completion percentage of 98.5% over high-priority tasks, which is a significant improvement over the baseline greedy methods with a completion rate of 63.6%. The results presented in this work pave the way to quantum-enabled solutions in the space industry and, more generally, future mission planning problems across industries.
A recent line of works apply machine learning techniques to assist or rebuild cost based query optimizers in DBMS. While exhibiting superiority in some benchmarks, their deficiencies, e.g., unstable performance, high training cost, and slow model updating, stem from the inherent hardness of predicting the cost or latency of execution plans using machine learning models. In this paper, we introduce a learning to rank query optimizer, called Lero, which builds on top of the native query optimizer and continuously learns to improve query optimization. The key observation is that the relative order or rank of plans, rather than the exact cost or latency, is sufficient for query optimization. Lero employs a pairwise approach to train a classifier to compare any two plans and tell which one is better. Such a binary classification task is much easier than the regression task to predict the cost or latency, in terms of model efficiency and effectiveness. Rather than building a learned optimizer from scratch, Lero is designed to leverage decades of wisdom of databases and improve the native optimizer. With its non intrusive design, Lero can be implemented on top of any existing DBMS with minimum integration efforts. We implement Lero and demonstrate its outstanding performance using PostgreSQL. In our experiments, Lero achieves near optimal performance on several benchmarks. It reduces the execution time of the native PostgreSQL optimizer by up to 70% and other learned query optimizers by up to 37%. Meanwhile, Lero continuously learns and automatically adapts to query workloads and changes in data.
Expert decision-makers (DMs) in high-stakes AI-assisted decision-making (AIaDM) settings receive and reconcile recommendations from AI systems before making their final decisions. We identify distinct properties of these settings which are key to developing AIaDM models that effectively benefit team performance. First, DMs incur reconciliation costs from exerting decision-making resources (e.g., time and effort) when reconciling AI recommendations that contradict their own judgment. Second, DMs in AIaDM settings exhibit algorithm discretion behavior (ADB), i.e., an idiosyncratic tendency to imperfectly accept or reject algorithmic recommendations for any given decision task. The human's reconciliation costs and imperfect discretion behavior introduce the need to develop AI systems which (1) provide recommendations selectively, (2) leverage the human partner's ADB to maximize the team's decision accuracy while regularizing for reconciliation costs, and (3) are inherently interpretable. We refer to the task of developing AI to advise humans in AIaDM settings as learning to advise and we address this task by first introducing the AI-assisted Team (AIaT)-Learning Framework. We instantiate our framework to develop TeamRules (TR): an algorithm that produces rule-based models and recommendations for AIaDM settings. TR is optimized to selectively advise a human and to trade-off reconciliation costs and team accuracy for a given environment by leveraging the human partner's ADB. Evaluations on synthetic and real-world benchmark datasets with a variety of simulated human accuracy and discretion behaviors show that TR robustly improves the team's objective across settings over interpretable, rule-based alternatives.
In this paper, we introduce a conformal prediction method to construct prediction sets in a oneshot federated learning setting. More specifically, we define a quantile-of-quantiles estimator and prove that for any distribution, it is possible to output prediction sets with desired coverage in only one round of communication. To mitigate privacy issues, we also describe a locally differentially private version of our estimator. Finally, over a wide range of experiments, we show that our method returns prediction sets with coverage and length very similar to those obtained in a centralized setting. Overall, these results demonstrate that our method is particularly well-suited to perform conformal predictions in a one-shot federated learning setting.
Object detectors often experience a drop in performance when new environmental conditions are insufficiently represented in the training data. This paper studies how to automatically fine-tune a pre-existing object detector while exploring and acquiring images in a new environment without relying on human intervention, i.e., in an utterly self-supervised fashion. In our setting, an agent initially learns to explore the environment using a pre-trained off-the-shelf detector to locate objects and associate pseudo-labels. By assuming that pseudo-labels for the same object must be consistent across different views, we learn an exploration policy mining hard samples and we devise a novel mechanism for producing refined predictions from the consensus among observations. Our approach outperforms the current state-of-the-art, and it closes the performance gap against a fully supervised setting without relying on ground-truth annotations. We also compare various exploration policies for the agent to gather more informative observations. Code and dataset will be made available upon paper acceptance
Selecting appropriate visual encodings is critical to designing effective visualization recommendation systems, yet few findings from graphical perception are typically applied within these systems. We observe two significant limitations in translating graphical perception knowledge into actionable visualization recommendation rules/constraints: inconsistent reporting of findings and a lack of shared data across studies. How can we translate the graphical perception literature into a knowledge base for visualization recommendation? We present a review of 59 papers that study user perception and performance across ten visual analysis tasks. Through this study, we contribute a JSON dataset that collates existing theoretical and experimental knowledge and summarizes key study outcomes in graphical perception. We illustrate how this dataset can inform automated encoding decisions with three representative visualization recommendation systems. Based on our findings, we highlight open challenges and opportunities for the community in collating graphical perception knowledge for a range of visualization recommendation scenarios.
Unsupervised domain adaptation has recently emerged as an effective paradigm for generalizing deep neural networks to new target domains. However, there is still enormous potential to be tapped to reach the fully supervised performance. In this paper, we present a novel active learning strategy to assist knowledge transfer in the target domain, dubbed active domain adaptation. We start from an observation that energy-based models exhibit free energy biases when training (source) and test (target) data come from different distributions. Inspired by this inherent mechanism, we empirically reveal that a simple yet efficient energy-based sampling strategy sheds light on selecting the most valuable target samples than existing approaches requiring particular architectures or computation of the distances. Our algorithm, Energy-based Active Domain Adaptation (EADA), queries groups of targe data that incorporate both domain characteristic and instance uncertainty into every selection round. Meanwhile, by aligning the free energy of target data compact around the source domain via a regularization term, domain gap can be implicitly diminished. Through extensive experiments, we show that EADA surpasses state-of-the-art methods on well-known challenging benchmarks with substantial improvements, making it a useful option in the open world. Code is available at //github.com/BIT-DA/EADA.
Machine learning is completely changing the trends in the fashion industry. From big to small every brand is using machine learning techniques in order to improve their revenue, increase customers and stay ahead of the trend. People are into fashion and they want to know what looks best and how they can improve their style and elevate their personality. Using Deep learning technology and infusing it with Computer Vision techniques one can do so by utilizing Brain-inspired Deep Networks, and engaging into Neuroaesthetics, working with GANs and Training them, playing around with Unstructured Data,and infusing the transformer architecture are just some highlights which can be touched with the Fashion domain. Its all about designing a system that can tell us information regarding the fashion aspect that can come in handy with the ever growing demand. Personalization is a big factor that impacts the spending choices of customers.The survey also shows remarkable approaches that encroach the subject of achieving that by divulging deep into how visual data can be interpreted and leveraged into different models and approaches. Aesthetics play a vital role in clothing recommendation as users' decision depends largely on whether the clothing is in line with their aesthetics, however the conventional image features cannot portray this directly. For that the survey also highlights remarkable models like tensor factorization model, conditional random field model among others to cater the need to acknowledge aesthetics as an important factor in Apparel recommendation.These AI inspired deep models can pinpoint exactly which certain style resonates best with their customers and they can have an understanding of how the new designs will set in with the community. With AI and machine learning your businesses can stay ahead of the fashion trends.