
SQL query performance is critical in database applications, and query rewriting is a technique that transforms an original query into an equivalent query with better performance. In a wide range of database-backed systems, both the application and the database layer are black boxes, and developers need to use their knowledge about the data and domain to rewrite queries sent from the application to the database for better performance. Unfortunately, existing solutions do not give users enough freedom to express their rewriting needs. To address this problem, we propose QueryBooster, a novel middleware-based service architecture for human-centered query rewriting, where users can use its expressive and easy-to-use rule language (called VarSQL) to formulate rewriting rules based on their needs. QueryBooster also allows users to express rewriting intentions by providing examples of an original query and its rewritten form; it automatically generalizes them to rewriting rules and suggests high-quality ones. We conduct a user study to show the benefits of VarSQL for formulating rewriting rules. Our experiments on real and synthetic workloads show the effectiveness of the rule-suggesting framework and the significant advantages of using QueryBooster for human-centered query rewriting to improve end-to-end query performance.
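
To make the idea concrete, here is a minimal, hypothetical sketch of a pattern-based rewriting rule in the spirit of what such a rule language enables; the actual VarSQL syntax is defined in the paper, and the rule, query, and names below are illustrative only.

```python
import re

# Hypothetical rule: replace a non-sargable prefix predicate with an
# index-friendly LIKE, e.g.  STRPOS(col, 'abc') = 1  ->  col LIKE 'abc%'.
# A middleware like QueryBooster would match such patterns (with variables
# for column names and literals) in queries flowing from the application.
PATTERN = re.compile(r"STRPOS\((\w+),\s*'([^']*)'\)\s*=\s*1", re.IGNORECASE)

def rewrite(query: str) -> str:
    """Apply the rewriting rule wherever the pattern matches."""
    return PATTERN.sub(lambda m: f"{m.group(1)} LIKE '{m.group(2)}%'", query)

original = "SELECT * FROM users WHERE STRPOS(name, 'Al') = 1"
print(rewrite(original))
# SELECT * FROM users WHERE name LIKE 'Al%'
```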

Related Content

The core numbers of vertices in a graph form one of the most well-studied cohesive subgraph models, owing to the linear running time of core decomposition. In practice, many data graphs are dynamic, continuously changing through edge insertions and removals. Updating core numbers under edge insertions and deletions is called core maintenance. When a burst of a large number of inserted or removed edges arrives, these edges must be handled promptly to keep up with the data stream. There are two main sequential algorithms for core maintenance, Traversal and Order. It has been proved that the Order algorithm significantly outperforms the Traversal algorithm over all tested graphs, with up to 2,083x speedups. To the best of our knowledge, all existing parallel approaches are based on the Traversal algorithm; moreover, their parallelism exists only across affected vertices with different core numbers, so they reduce to sequential execution when all vertices have the same core number. In this paper, we propose a new parallel core maintenance algorithm based on the Order algorithm. Importantly, our new approach always has parallelism, even for graphs where all vertices have the same core number. Extensive experiments are conducted over real-world, temporal, and synthetic graphs on a 64-core machine. The results show that for inserting and removing 100,000 edges using 16 workers, our method achieves up to 289x and 10x speedups, respectively, compared with the most efficient existing method.
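
For background, the following is a minimal sketch of the classic peeling algorithm that computes core numbers from scratch; maintenance algorithms such as Traversal and Order exist precisely to avoid re-running it after every edge update. The graph and its dictionary representation are illustrative.

```python
def core_numbers(adj):
    """Core number of every vertex via the standard peeling algorithm.

    Repeatedly remove a minimum-degree vertex; the core number of v is the
    largest minimum degree seen up to its removal.
    """
    deg = {v: len(ns) for v, ns in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=deg.get)  # O(n) scan; bucket queues give O(m)
        k = max(k, deg[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                deg[u] -= 1
    return core

# Triangle a-b-c with a pendant vertex d attached to a.
adj = {"a": ["b", "c", "d"], "b": ["a", "c"], "c": ["a", "b"], "d": ["a"]}
print(core_numbers(adj))  # {'d': 1, 'a': 2, 'b': 2, 'c': 2} (order may vary)
```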

Coverage path planning is the problem of finding the shortest path that covers the entire free space of a given confined area, with applications ranging from robotic lawn mowing and vacuum cleaning to demining and search-and-rescue tasks. While offline methods can find provably complete, and in some cases optimal, paths for known environments, their value is limited in online scenarios where the environment is not known beforehand, especially in the presence of non-static obstacles. We propose an end-to-end reinforcement-learning-based approach in continuous state and action space for the online coverage path planning problem, capable of handling unknown environments. We construct the observation space from both global maps and local sensory inputs, allowing the agent to plan a long-term path and simultaneously act on short-term obstacle detections. To account for large-scale environments, we propose a multi-scale map input representation. Furthermore, we propose a novel total variation reward term for eliminating thin strips of uncovered space in the learned path. To validate the effectiveness of our approach, we perform extensive experiments in simulation with a distance sensor, surpassing the performance of a recent reinforcement-learning-based approach.
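
As a rough illustration of the total variation idea (not the paper's exact formulation), the sketch below penalises the length of the boundary between covered and uncovered cells on a grid map, which is what discourages thin uncovered strips.

```python
import numpy as np

def total_variation(coverage: np.ndarray) -> float:
    """Sum of absolute differences between neighbouring grid cells."""
    c = coverage.astype(float)
    return np.abs(np.diff(c, axis=0)).sum() + np.abs(np.diff(c, axis=1)).sum()

def shaped_reward(prev_cov: np.ndarray, cov: np.ndarray, tv_weight: float = 0.1):
    """Reward newly covered cells, penalise growth in boundary length."""
    new_area = np.logical_and(cov, np.logical_not(prev_cov)).sum()
    return new_area - tv_weight * (total_variation(cov) - total_variation(prev_cov))

prev = np.zeros((4, 4), dtype=bool)
cov = prev.copy(); cov[0] = True   # the agent covered the first row
print(shaped_reward(prev, cov))    # 4 new cells minus a small TV penalty
```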

Numerous studies have underscored the significant privacy risks associated with various leakage patterns in encrypted data stores. Most existing systems that conceal leakage either (1) incur substantial overheads, (2) focus on specific subsets of leakage patterns, or (3) apply the same security notion across various workloads, thereby impeding fine-tuned privacy-efficiency trade-offs. In light of these detrimental leakage patterns, this paper starts with an investigation into which specific leakage patterns require attention in the contexts of key-value, range-query, and dynamic workloads, respectively. Subsequently, we introduce new security notions tailored to the specific privacy requirements of these workloads. Accordingly, we present SWAT, an efficient construction that progressively enables these workloads while provably mitigating system-wide leakage via a suite of algorithms with tunable privacy-efficiency trade-offs. We conducted extensive experiments and compiled a detailed analysis of the results, showing the efficiency of our solution. SWAT is about 10.6x slower than an encryption-only data store that reveals various leakage patterns and about 31.6x faster than a trivially zero-leakage solution. Meanwhile, the performance of SWAT remains highly competitive compared to other designs that mitigate specific types of leakage.
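
As one concrete example of a tunable leakage-mitigation technique of the general kind discussed above (not SWAT's actual construction), the sketch below pads every response to a bucketed size to hide the volume pattern, trading bandwidth for privacy; the bucket base is the tunable knob.

```python
import math
import os

def pad_response(rows, bucket=8):
    """Pad a result set with dummy rows up to the next power of `bucket`.

    The server then only reveals a coarse bucket size rather than the exact
    result volume; larger bucket bases leak less but waste more bandwidth.
    """
    target = bucket ** max(1, math.ceil(math.log(max(len(rows), 1), bucket)))
    dummies = [os.urandom(16) for _ in range(target - len(rows))]
    return rows + dummies

print(len(pad_response([b"r1", b"r2", b"r3"])))  # 8: true volume 3 is hidden
```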

Software maintenance is an important part of a software system's life cycle. Maintenance tasks suffer from architecture information that diverges from the implementation over time (architectural drift). The Digital Architecture Twin (DArT) can support software maintenance by providing up-to-date architecture information: it gathers such information and co-evolves with the software system, enabling continuous reverse engineering. However, the crucial link for stakeholders to retrieve this information has been missing. To fill this gap, we contribute the Architecture Information Query Language (AIQL), which enables stakeholders to access up-to-date and tailored architecture information. We derived four application scenarios in the context of continuous reverse engineering and showed that the AIQL provides the functionality required to formulate queries for these scenarios and that the language scales for use with real-world software systems. In a user study, stakeholders agreed that the language is easy to understand and assessed its value for the application scenarios specific to their roles.

A unified approach to Positive and Unlabelled (PU) learning, Semi-Supervised Learning (SSL), and Open-Set Recognition (OSR) would significantly enhance the development of cost-efficient, application-grade classifiers. However, previous attempts have conflated the definitions of observed and unobserved novel categories. Observed novel categories are defined in PU-learning as those present in unlabelled training data; they exist because the training set's category labels are incomplete. In contrast, unobserved novel categories are defined in OSR as those that exist only in the testing data and represent new and interesting patterns that emerge over time. To maintain safe and practical classifier development, models must generalise the difference between these novel category types. In this letter, we thoroughly review the relevant machine learning research fields and propose a new unified machine learning policy called Open-set Learning with Augmented Categories by exploiting Unlabelled data, or Open-LACU. Specifically, Open-LACU requires models to accurately classify $K > 1$ labelled categories while simultaneously detecting and separating observed novel categories into the augmented background category ($K + 1$) and unobserved novel categories into the augmented unknown category ($K + 2$). Open-LACU is the first machine learning policy to generalise observed and unobserved novel categories. Its significance is further highlighted by discussing applications in semantic segmentation of remote sensing images, object detection within medical radiology images, and disease identification through cough sound analysis.
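
A minimal sketch of the Open-LACU decision rule, under assumed interfaces: the model emits logits for the $K$ labelled categories plus the background category $K+1$, and a separate confidence score routes unfamiliar inputs to the unknown category $K+2$. The score and threshold below are illustrative, not the paper's concrete method.

```python
import numpy as np

K = 5            # labelled categories 1..K
UNKNOWN = K + 2  # augmented unknown category for unobserved novelty

def classify(logits: np.ndarray, confidence: float, tau: float = 0.5) -> int:
    """Return a category in 1..K, K+1 (background), or K+2 (unknown).

    `logits` has K+1 entries: the background category (K+1) is learned from
    observed novel examples in the unlabelled training data, while inputs
    unlike anything seen in training fall below the confidence threshold.
    """
    if confidence < tau:
        return UNKNOWN
    return int(np.argmax(logits)) + 1

logits = np.array([0.1, 2.3, 0.2, 0.1, 0.0, 0.4])
print(classify(logits, confidence=0.9))  # -> 2 (a labelled category)
print(classify(logits, confidence=0.2))  # -> 7 (the unknown category K+2)
```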

Context modeling and recognition are crucial for adaptive mobile and ubiquitous computing. Context-awareness in mobile environments relies on prompt reactions to context changes. However, current solutions focus on limited context information processed on centralized architectures, risking privacy leakage and lacking personalization. On-device context modeling and recognition are emerging research trends that address these concerns. Social interactions and visited locations play significant roles in characterizing daily-life scenarios. This paper proposes an unsupervised and lightweight approach to model the user's social context and locations directly on the mobile device. Leveraging the ego-network model, the system extracts high-level, semantic-rich context features from smartphone-embedded sensor data. For the social context, the approach utilizes data on physical and cyber social interactions among users and their devices. For location, it prioritizes modeling the familiarity degree of specific places over raw location data such as GPS coordinates and device proximity. The effectiveness of the proposed approach is demonstrated through three sets of experiments employing five real-world datasets. These experiments evaluate the structure of social and location ego networks, provide a semantic evaluation of the proposed models, and assess mobile computing performance. Finally, the relevance of the extracted features is showcased by the improved performance of three machine learning models in recognizing daily-life situations. Compared to using only features related to physical context, the proposed approach achieves a 3% improvement in AUROC, 9% in Precision, and 5% in Recall.
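
As a toy illustration of the ego-network idea (the layer thresholds and data are assumed, not taken from the paper), peers can be grouped into concentric layers by contact frequency computed on-device:

```python
from collections import Counter

# Assumed on-device interaction log: one entry per social interaction
# (physical or cyber) with a peer, over an observation window.
interactions = ["alice", "bob", "alice", "carol", "alice", "bob", "dave"]

freq = Counter(interactions)  # contacts per peer

def ego_layers(freq, thresholds=(3, 2, 1)):
    """Assign each peer to the innermost ego-network layer it qualifies for.

    Inner layers hold the most frequently contacted peers, mirroring the
    concentric structure of the ego-network model.
    """
    layers = [[] for _ in thresholds]
    for peer, f in freq.items():
        for i, t in enumerate(thresholds):
            if f >= t:
                layers[i].append(peer)
                break
    return layers

print(ego_layers(freq))  # [['alice'], ['bob'], ['carol', 'dave']]
```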

In-context learning (ICL) improves language models' performance on a variety of NLP tasks by simply demonstrating a handful of examples at inference time. It is not well understood why the ICL ability emerges, as the model has never been specifically trained on such demonstrations. Unlike prior work that explores implicit mechanisms behind ICL, we study ICL by investigating the pretraining data. Specifically, we first adapt an iterative, gradient-based approach to find a small subset of pretraining data that supports ICL. We observe that continued pretraining on this small subset significantly improves the model's ICL ability, by up to 18%. We then compare the supportive subset contrastively with random subsets of pretraining data and discover that: (1) the supportive pretraining data do not have higher domain relevance to downstream tasks; (2) the supportive pretraining data have a higher mass of rarely occurring, long-tail tokens; and (3) the supportive pretraining data are challenging examples where the information gain from long-range context is below average, indicating that learning to incorporate difficult long-range context encourages ICL. Our work takes a first step towards understanding ICL by analyzing instance-level pretraining data. Our insights have the potential to enhance the ICL ability of language models by actively guiding the construction of pretraining data in the future.
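
A sketch of how the long-range information gain in finding (3) could be measured, assuming a Hugging Face-style causal LM that accepts `labels` and returns a mean cross-entropy `.loss`; this is illustrative, not the paper's exact procedure.

```python
import torch

def long_range_info_gain(model, tokens, target_len=32, short_len=64):
    """loss(target | short context) - loss(target | full context).

    A small gain means the long-range context adds little predictive
    information for the final span; per finding (3), supportive examples
    tend to show below-average gain.
    """
    def suffix_loss(ids):
        labels = ids.clone()
        labels[:-target_len] = -100  # score only the final target span
        with torch.no_grad():
            return model(ids.unsqueeze(0), labels=labels.unsqueeze(0)).loss.item()

    short = tokens[-(short_len + target_len):]
    return suffix_loss(short) - suffix_loss(tokens)
```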

Weakly supervised phrase grounding aims at learning region-phrase correspondences using only image-sentence pairs. A major challenge thus lies in the missing links between image regions and sentence phrases during training. To address this challenge, we leverage a generic object detector at training time and propose a contrastive learning framework that accounts for both region-phrase and image-sentence matching. Our core innovation is the learning of a region-phrase score function, on top of which an image-sentence score function is constructed. Importantly, the region-phrase score function is learned by distilling from soft matching scores between the detected object class names and candidate phrases within an image-sentence pair, while the image-sentence score function is supervised by ground-truth image-sentence pairs. This design removes the need for object detection at test time, thereby significantly reducing the inference cost. Without bells and whistles, our approach achieves state-of-the-art results on the task of visual phrase grounding, surpassing previous methods that require expensive object detectors at test time.
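
A minimal sketch of the score construction, with assumed tensor shapes (R region features and P phrase features of dimension D per image-sentence pair); the aggregation shown is one plausible choice, not necessarily the paper's exact formulation.

```python
import torch

def region_phrase_scores(regions: torch.Tensor, phrases: torch.Tensor):
    """Cosine similarity between every region and every phrase: R x P."""
    r = torch.nn.functional.normalize(regions, dim=-1)
    p = torch.nn.functional.normalize(phrases, dim=-1)
    return r @ p.t()

def image_sentence_score(regions: torch.Tensor, phrases: torch.Tensor):
    """Each phrase attends to its best-matching region; scores are averaged.

    Only this aggregate is supervised by ground-truth image-sentence pairs;
    grounding emerges from the underlying region-phrase scores.
    """
    return region_phrase_scores(regions, phrases).max(dim=0).values.mean()

score = image_sentence_score(torch.randn(10, 256), torch.randn(4, 256))
```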

Retrieving object instances among cluttered scenes efficiently requires compact yet comprehensive regional image representations. Intuitively, object semantics can help build the index that focuses on the most relevant regions. However, due to the lack of bounding-box datasets for objects of interest among retrieval benchmarks, most recent work on regional representations has focused on either uniform or class-agnostic region selection. In this paper, we first fill the void by providing a new dataset of landmark bounding boxes, based on the Google Landmarks dataset, that includes $94k$ images with manually curated boxes from $15k$ unique landmarks. Then, we demonstrate how a trained landmark detector, using our new dataset, can be leveraged to index image regions and improve retrieval accuracy while being much more efficient than existing regional methods. In addition, we further introduce a novel regional aggregated selective match kernel (R-ASMK) to effectively combine information from detected regions into an improved holistic image representation. R-ASMK boosts image retrieval accuracy substantially at no additional memory cost, while even outperforming systems that index image regions independently. Our complete image retrieval system improves upon the previous state-of-the-art by significant margins on the Revisited Oxford and Paris datasets. Code and data will be released.
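
For context, the sketch below shows the selective match kernel (ASMK) similarity that R-ASMK builds on: per visual word, aggregated residuals from two images are compared and passed through a selectivity function that suppresses weak matches. R-ASMK additionally aggregates across detected regions; the parameters and toy data here are illustrative.

```python
import numpy as np

def selective(u, alpha=3.0, tau=0.0):
    """sigma(u) = sign(u) * |u|^alpha if u > tau else 0: keeps strong matches."""
    return np.where(u > tau, np.sign(u) * np.abs(u) ** alpha, 0.0)

def asmk_similarity(agg_x, agg_y):
    """agg_x, agg_y: dict visual_word -> L2-normalised aggregated residual.

    Only visual words present in both images contribute to the similarity.
    """
    return sum(
        selective(float(agg_x[w] @ agg_y[w]))
        for w in agg_x.keys() & agg_y.keys()
    )

x = {1: np.array([0.6, 0.8]), 2: np.array([1.0, 0.0])}
y = {1: np.array([0.6, 0.8]), 3: np.array([0.0, 1.0])}
print(asmk_similarity(x, y))  # only word 1 is shared: sigma(1.0) = 1.0
```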

Verifiability is one of the core editing principles in Wikipedia: editors are encouraged to provide citations for added statements, where a statement can be any piece of text, from a single sentence up to a paragraph. However, in many cases, citations are outdated, missing, or link to non-existing references (e.g., dead URLs or moved content). In 20% of such cases, the citations refer to news articles, which represent the second most cited source. Even when citations are provided, there are no explicit indicators of the span of text a citation covers. Beyond these verifiability issues, many Wikipedia entity pages are incomplete, missing relevant information that is already available in online news sources, and even for existing citations there is often a delay between the news publication time and the reference time. In this thesis, we address the aforementioned issues and propose automated approaches that enforce the verifiability principle in Wikipedia and suggest relevant, missing news references to further enrich Wikipedia entity pages.
