Due to the rapid development of science and technology, the importance of imprecise, noisy, and uncertain data is increasing at an exponential rate. Thus, mining patterns in uncertain databases have drawn the attention of researchers. Moreover, frequent sequences of items from these databases need to be discovered for meaningful knowledge with great impact. In many real cases, weights of items and patterns are introduced to find interesting sequences as a measure of importance. Hence, a constraint of weight needs to be handled while mining sequential patterns. Besides, due to the dynamic nature of databases, mining important information has become more challenging. Instead of mining patterns from scratch after each increment, incremental mining algorithms utilize previously mined information to update the result immediately. Several algorithms exist to mine frequent patterns and weighted sequences from incremental databases. However, these algorithms are confined to mine the precise ones. Therefore, we have developed an algorithm to mine frequent sequences in an uncertain database in this work. Furthermore, we have proposed two new techniques for mining when the database is incremental. Extensive experiments have been conducted for performance evaluation. The analysis showed the efficiency of our proposed framework.
Sequential decision-making domains such as recommender systems, healthcare and education often have unobserved heterogeneity in the population that can be modeled using latent bandits $-$ a framework where an unobserved latent state determines the model for a trajectory. While the latent bandit framework is compelling, the extent of its generality is unclear. We first address this by establishing a de Finetti theorem for decision processes, and show that $\textit{every}$ exchangeable and coherent stateless decision process is a latent bandit. The latent bandit framework lends itself particularly well to online learning with offline datasets, a problem of growing interest in sequential decision-making. One can leverage offline latent bandit data to learn a complex model for each latent state, so that an agent can simply learn the latent state online to act optimally. We focus on a linear model for a latent bandit with $d_A$-dimensional actions, where the latent states lie in an unknown $d_K$-dimensional subspace for $d_K \ll d_A$. We present SOLD, a novel principled method to learn this subspace from short offline trajectories with guarantees. We then provide two methods to leverage this subspace online: LOCAL-UCB and ProBALL-UCB. We demonstrate that LOCAL-UCB enjoys $\tilde O(\min(d_A\sqrt{T}, d_K\sqrt{T}(1+\sqrt{d_AT/d_KN})))$ regret guarantees, where the effective dimension is lower when the size $N$ of the offline dataset is larger. ProBALL-UCB enjoys a slightly weaker guarantee, but is more practical and computationally efficient. Finally, we establish the efficacy of our methods using experiments on both synthetic data and real-life movie recommendation data from MovieLens.
Noisy correspondence that refers to mismatches in cross-modal data pairs, is prevalent on human-annotated or web-crawled datasets. Prior approaches to leverage such data mainly consider the application of uni-modal noisy label learning without amending the impact on both cross-modal and intra-modal geometrical structures in multimodal learning. Actually, we find that both structures are effective to discriminate noisy correspondence through structural differences when being well-established. Inspired by this observation, we introduce a Geometrical Structure Consistency (GSC) method to infer the true correspondence. Specifically, GSC ensures the preservation of geometrical structures within and between modalities, allowing for the accurate discrimination of noisy samples based on structural differences. Utilizing these inferred true correspondence labels, GSC refines the learning of geometrical structures by filtering out the noisy samples. Experiments across four cross-modal datasets confirm that GSC effectively identifies noisy samples and significantly outperforms the current leading methods.
The popularity of data science as a discipline and its importance in the emerging economy and industrial progress dictate that machine learning be democratized for the masses. This also means that the current practice of workforce training using machine learning tools, which requires low-level statistical and algorithmic details, is a barrier that needs to be addressed. Similar to data management languages such as SQL, machine learning needs to be practiced at a conceptual level to help make it a staple tool for general users. In particular, the technical sophistication demanded by existing machine learning frameworks is prohibitive for many scientists who are not computationally savvy or well versed in machine learning techniques. The learning curve to use the needed machine learning tools is also too high for them to take advantage of these powerful platforms to rapidly advance science. In this paper, we introduce a new declarative machine learning query language, called {\em MQL}, for naive users. We discuss its merit and possible ways of implementing it over a traditional relational database system. We discuss two materials science experiments implemented using MQL on a materials science workflow system called MatFlow.
Critical learning periods are periods early in development where temporary sensory deficits can have a permanent effect on behavior and learned representations. Despite the radical differences between biological and artificial networks, critical learning periods have been empirically observed in both systems. This suggests that critical periods may be fundamental to learning and not an accident of biology. Yet, why exactly critical periods emerge in deep networks is still an open question, and in particular it is unclear whether the critical periods observed in both systems depend on particular architectural or optimization details. To isolate the key underlying factors, we focus on deep linear network models, and show that, surprisingly, such networks also display much of the behavior seen in biology and artificial networks, while being amenable to analytical treatment. We show that critical periods depend on the depth of the model and structure of the data distribution. We also show analytically and in simulations that the learning of features is tied to competition between sources. Finally, we extend our analysis to multi-task learning to show that pre-training on certain tasks can damage the transfer performance on new tasks, and show how this depends on the relationship between tasks and the duration of the pre-training stage. To the best of our knowledge, our work provides the first analytically tractable model that sheds light into why critical learning periods emerge in biological and artificial networks.
Spatial point process models are widely applied to point pattern data from various fields in the social and environmental sciences. However, a serious hurdle in fitting point process models is the presence of duplicated points, wherein multiple observations share identical spatial coordinates. This often occurs because of decisions made in the geo-coding process, such as assigning representative locations (e.g., aggregate-level centroids) to observations when data producers lack exact location information. Because spatial point process models like the Log-Gaussian Cox Process (LGCP) assume unique locations, researchers often employ {\it ad hoc} solutions (e.g., jittering) to address duplicated data before analysis. As an alternative, this study proposes a Modified Minimum Contrast (MMC) method that adapts the inference procedure to account for the effect of duplicates without needing to alter the data. The proposed MMC method is applied to LGCP models, with simulation results demonstrating the gains of our method relative to existing approaches in terms of parameter estimation. Interestingly, simulation results also show the effect of the geo-coding process on parameter estimates, which can be utilized in the implementation of the MMC method. The MMC approach is then used to infer the spatial clustering characteristics of conflict events in Afghanistan (2008-2009).
Machine unlearning, a novel area within artificial intelligence, focuses on addressing the challenge of selectively forgetting or reducing undesirable knowledge or behaviors in machine learning models, particularly in the context of large language models (LLMs). This paper introduces a methodology to align LLMs, such as Open Pre-trained Transformer Language Models, with ethical, privacy, and safety standards by leveraging the gradient ascent algorithm for knowledge unlearning. Our approach aims to selectively erase or modify learned information in LLMs, targeting harmful responses and copyrighted content. This paper presents a dual-pronged approach to enhance the ethical and safe behavior of large language models (LLMs) by addressing the issues of harmful responses and copyrighted content. To mitigate harmful responses, we applied gradient ascent on the PKU dataset, achieving a 75\% reduction in harmful responses for Open Pre-trained Transformer Language Models (OPT1.3b and OPT2.7b) \citet{zhang2022opt} while retaining previous knowledge using the TruthfulQA dataset \citet{DBLP:journals/corr/abs-2109-07958}. For handling copyrighted content, we constructed a custom dataset based on the Lord of the Rings corpus and aligned LLMs (OPT1.3b and OPT2.7b) \citet{zhang2022opt} through LoRA: Low-Rank Adaptation of Large Language Models \citet{DBLP:journals/corr/abs-2106-09685} finetuning. Subsequently, we employed gradient ascent to unlearn the Lord of the Rings content, resulting in a remarkable reduction in the presence of copyrighted material. To maintain a diverse knowledge base, we utilized the Book Corpus dataset. Additionally, we propose a new evaluation technique for assessing the effectiveness of harmful unlearning.
The integration of technology into exercise regimens has emerged as a strategy to enhance normal human capabilities and return human motor function after injury or illness by enhancing motor learning and retention. Much research has focused on how active devices, whether confined to a lab or made into a wearable format, can apply forces at set times and conditions to optimize the process of learning. However, the focus on active force production often forces devices to either be confined to simple movements or interventions. As such, in this paper, we investigate how passive device behaviors can contribute to the process of motor learning by themselves. Our approach involves using a wearable resistance (WR) device, which is outfitted with elastic bands, to apply a force field that changes in response to a person's movements while performing exercises. We develop a method to measure the produced forces from the device without impeding the function and we characterize the device's force generation abilities. We then present a study assessing the impact of the WR device on motor learning of proper squat form compared to visual or no feedback. Biometrics such as knee and hip angles were used to monitor and assess subject performance. Our findings indicate that the force fields produced while training with the WR device can improve performance in full-body exercises similarly to a more direct visual feedback mechanism, though the improvement is not consistent across all performance metrics. Through our research, we contribute important insights into the application of passive wearable resistance technology in practical exercise settings.
Advances in artificial intelligence often stem from the development of new environments that abstract real-world situations into a form where research can be done conveniently. This paper contributes such an environment based on ideas inspired by elementary Microeconomics. Agents learn to produce resources in a spatially complex world, trade them with one another, and consume those that they prefer. We show that the emergent production, consumption, and pricing behaviors respond to environmental conditions in the directions predicted by supply and demand shifts in Microeconomics. We also demonstrate settings where the agents' emergent prices for goods vary over space, reflecting the local abundance of goods. After the price disparities emerge, some agents then discover a niche of transporting goods between regions with different prevailing prices -- a profitable strategy because they can buy goods where they are cheap and sell them where they are expensive. Finally, in a series of ablation experiments, we investigate how choices in the environmental rewards, bartering actions, agent architecture, and ability to consume tradable goods can either aid or inhibit the emergence of this economic behavior. This work is part of the environment development branch of a research program that aims to build human-like artificial general intelligence through multi-agent interactions in simulated societies. By exploring which environment features are needed for the basic phenomena of elementary microeconomics to emerge automatically from learning, we arrive at an environment that differs from those studied in prior multi-agent reinforcement learning work along several dimensions. For example, the model incorporates heterogeneous tastes and physical abilities, and agents negotiate with one another as a grounded form of communication.
Analyzing observational data from multiple sources can be useful for increasing statistical power to detect a treatment effect; however, practical constraints such as privacy considerations may restrict individual-level information sharing across data sets. This paper develops federated methods that only utilize summary-level information from heterogeneous data sets. Our federated methods provide doubly-robust point estimates of treatment effects as well as variance estimates. We derive the asymptotic distributions of our federated estimators, which are shown to be asymptotically equivalent to the corresponding estimators from the combined, individual-level data. We show that to achieve these properties, federated methods should be adjusted based on conditions such as whether models are correctly specified and stable across heterogeneous data sets.
Benefit from the quick development of deep learning techniques, salient object detection has achieved remarkable progresses recently. However, there still exists following two major challenges that hinder its application in embedded devices, low resolution output and heavy model weight. To this end, this paper presents an accurate yet compact deep network for efficient salient object detection. More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while keep accuracy. Secondly, we further propose reverse attention to guide such side-output residual learning in a top-down manner. By erasing the current predicted salient regions from side-output features, the network can eventually explore the missing object parts and details which results in high resolution and accuracy. Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, and with advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).