Cambodia's agricultural landscape is rapidly transforming, particularly in the cashew sector. Despite the country's rapid emergence and ambition to become the largest cashew producer, comprehensive data on plantation areas and the environmental impacts of this expansion are lacking. This study addresses the gap in detailed land use data for cashew plantations in Cambodia and assesses the implications of agricultural advancements. We collected over 80,000 training polygons across Cambodia to train a convolutional neural network using high-resolution optical and SAR satellite data for precise cashew plantation mapping. Our findings indicate that Cambodia ranks among the top five in terms of cultivated area and the top three in global cashew production, driven by high yields. Significant cultivated areas are located in Kampong Thom, Kratie, and Ratanak Kiri provinces. Balancing rapid agricultural expansion with environmental stewardship, particularly forest conservation, is crucial. Cambodia's cashew production is poised for further growth, driven by high-yielding trees and premium nuts. However, sustainable expansion requires integrating agricultural practices with economic and environmental strategies to enhance local value and protect forested areas. Advanced mapping technologies offer comprehensive tools to support these objectives and ensure the sustainable development of Cambodia's cashew industry.
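To illustrate the kind of model this abstract refers to, here is a minimal sketch (not the authors' architecture) of a small convolutional classifier that takes stacked optical and SAR bands as input channels; the band counts, patch size, and layer sizes are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class PlantationPatchClassifier(nn.Module):
    """Toy CNN that labels a satellite patch as cashew / non-cashew.

    Assumes 4 optical bands + 2 SAR bands stacked into 6 input channels;
    the study's actual architecture and inputs are not specified here.
    """
    def __init__(self, in_channels: int = 6, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.classifier(h)

# Example: a batch of 8 patches, 6 bands, 64x64 pixels.
model = PlantationPatchClassifier()
logits = model(torch.randn(8, 6, 64, 64))
print(logits.shape)  # torch.Size([8, 2])
```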
Insider threats refer to threats originating from people inside organizations. Although such threats are a classical research topic, the systematization of existing knowledge is still limited, particularly with respect to non-technical research approaches. To this end, this paper presents a systematic literature review on the psychology of insider threats. According to the review results, the literature has operated with multiple distinct theories, but there is still a lack of robust theorization with respect to psychology. The literature has also considered characteristics of a person, his or her personal situation, and other more or less objective facts about the person. These are seen to correlate with psychological concepts such as personality traits and psychological states. In addition, the review discusses gaps and limitations in the existing research, thus opening the door for further psychology research.
Transformers are the current architecture of choice for NLP, but their attention layers do not scale well to long contexts. Recent works propose replacing attention with linear recurrent layers -- this is the case for state space models, which enjoy efficient training and inference. However, it remains unclear whether these models are competitive with transformers in machine translation (MT). In this paper, we provide a rigorous and comprehensive experimental comparison between transformers and linear recurrent models for MT. Concretely, we experiment with RetNet, Mamba, and hybrid versions of Mamba that incorporate attention mechanisms. Our findings demonstrate that Mamba is highly competitive with transformers on sentence- and paragraph-level datasets, where in the latter case both models benefit from shifting the training distribution towards longer sequences. Further analysis shows that integrating attention into Mamba improves translation quality, robustness to sequence length extrapolation, and the ability to recall named entities.
Chain-of-thought (CoT) prompting is a simple and effective method for improving the reasoning capabilities of Large Language Models (LLMs). The basic idea of CoT is to let LLMs break down their thought processes step by step by putting exemplars in the input prompt. However, the densely structured prompt exemplars of CoT may cause cognitive overload in LLMs. Inspired by human cognition, we introduce COT-SEP, a method that strategically places separators at the end of each exemplar in CoT prompting. These separators are designed to help LLMs better understand their thought processes while reasoning. Interestingly, it turns out that COT-SEP significantly improves LLM performance on complex reasoning tasks (e.g., GSM8K, AQuA, CSQA) compared with vanilla CoT, which does not use separators. We also study the effect of the type and location of separators on multiple LLMs, including GPT-3.5-Turbo, GPT-4, and LLaMA-2 7B.
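To make the idea concrete, here is a hypothetical sketch of how CoT exemplars might be joined with an explicit separator at the end of each one; the separator string and exemplar wording below are illustrative assumptions, not the paper's exact choices.

```python
SEPARATOR = "\n###\n"  # hypothetical separator; the paper studies several separator types

exemplars = [
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. How many balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. 5 + 6 = 11. The answer is 11.",
    "Q: There are 3 cars in the lot and 2 more arrive. How many cars are in the lot?\n"
    "A: There are 3 cars. 2 more arrive, so 3 + 2 = 5. The answer is 5.",
]

def build_cot_prompt(question: str, use_separators: bool = True) -> str:
    """Concatenate CoT exemplars, optionally ending each one with a separator."""
    joiner = SEPARATOR if use_separators else "\n\n"
    demo = joiner.join(exemplars) + joiner
    return demo + f"Q: {question}\nA:"

print(build_cot_prompt("If a train travels 60 km in 1.5 hours, what is its average speed?"))
```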
Left, Center, Right is a popular dice game. We analyze the game using Markov chain and Monte Carlo methods. We compute the expected game length for two to eight players and determine the probability of winning for each player in the game. We discuss the surprising conclusions of which players have the highest and lowest chance of winning, and we propose a small rule change that makes the game a little more fair.
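As a complement to the Markov-chain analysis described above, here is a minimal Monte Carlo sketch under the standard LCR rules (each turn a player rolls one die per chip held, up to three; L passes a chip to the left neighbor, R to the right, C to the center pot, and a dot keeps it; the last player holding chips wins). The win rates and game lengths it estimates come from simulation, not from the paper's exact computation.

```python
import random

def play_lcr(num_players: int, chips: int = 3, rng=random):
    """Simulate one game; return (winner_index, number_of_turns taken)."""
    stacks = [chips] * num_players
    turns = 0
    player = 0
    while sum(1 for s in stacks if s > 0) > 1:
        rolls = min(stacks[player], 3)          # roll one die per chip, max 3
        for _ in range(rolls):
            face = rng.choice("LCR...")          # each die: 1/6 L, 1/6 C, 1/6 R, 3/6 dot
            if face == "L":
                stacks[player] -= 1
                stacks[(player - 1) % num_players] += 1
            elif face == "R":
                stacks[player] -= 1
                stacks[(player + 1) % num_players] += 1
            elif face == "C":
                stacks[player] -= 1              # chip goes to the center pot
        turns += 1                               # counts every seat's turn, even with no chips
        player = (player + 1) % num_players
    winner = max(range(num_players), key=lambda i: stacks[i])
    return winner, turns

# Estimate win probabilities and expected game length for 4 players.
games, wins, total_turns = 20_000, [0] * 4, 0
for _ in range(games):
    w, t = play_lcr(4)
    wins[w] += 1
    total_turns += t
print([w / games for w in wins], total_turns / games)
```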
This study investigates whether OpenAI's ChatGPT-3.5 and ChatGPT-4 can forecast future events. To evaluate the accuracy of the predictions, we take advantage of the fact that the training data at the time of our experiments (mid-2023) stopped at September 2021, and we ask about events that happened in 2022. We employed two prompting strategies: direct prediction and what we call future narratives, which ask ChatGPT to tell fictional stories set in the future with characters retelling events that happened in the past, but after ChatGPT's training data had been collected. We prompted ChatGPT to engage in storytelling, particularly within economic contexts. After analyzing 100 trials, we find that future narrative prompts significantly enhanced ChatGPT-4's forecasting accuracy. This was especially evident in its predictions of major Academy Award winners as well as economic trends, the latter inferred from scenarios where the model impersonated public figures such as the Federal Reserve Chair, Jerome Powell. As a falsification exercise, we repeated our experiments in May 2024, by which time the models included more recent training data. ChatGPT-4's accuracy significantly improved when the training window included the events being prompted for, achieving 100% accuracy in many instances. The poorer accuracy for events outside the training window suggests that in the 2023 prediction experiments, ChatGPT-4 was forming predictions based solely on its training data. Narrative prompting also consistently outperformed direct prompting. These findings indicate that narrative prompts leverage the models' capacity for hallucinatory narrative construction, facilitating more effective data synthesis and extrapolation than straightforward predictions. Our research reveals new aspects of LLMs' predictive capabilities and suggests potential future applications in analytical contexts.
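Purely as an illustration of the contrast between the two prompting strategies (the study's exact wording is not reproduced here, and the narrator, year, and question below are made up), the prompts might be structured like this:

```python
def direct_prompt(event_question: str) -> str:
    """Ask the model to predict an outcome directly."""
    return f"Predict the following: {event_question}"

def future_narrative_prompt(event_question: str, narrator: str, setting_year: int) -> str:
    """Ask for a fictional story set after the event, with a character
    recounting it as something that has already happened."""
    return (
        f"Write a short story set in {setting_year}. In it, {narrator} looks back "
        f"on the past year and mentions, as a matter of record, "
        f"the answer to: {event_question}"
    )

q = "Who won Best Actor at the 2022 Academy Awards?"
print(direct_prompt(q))
print(future_narrative_prompt(q, "a film critic", 2023))
```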
In real-world scenarios, individuals often cooperate for mutual benefit. However, differences in wealth can lead to varying outcomes for similar actions. In complex social networks, individuals' choices are also influenced by their neighbors. To explore the evolution of strategies in realistic settings, we conducted repeated asymmetric prisoner's dilemma experiments on a weighted BA scale-free network. Our analysis highlights how the four components of memory-one strategies affect win rates, identifies two special strategies that emerge in the evolutionary process, and shows how cooperation levels among individuals can be increased. These findings offer practical insights for addressing real-world problems.
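For readers unfamiliar with memory-one strategies, the following is an illustrative sketch only: a memory-one strategy is a vector of four cooperation probabilities, one for each outcome of the previous round (CC, CD, DC, DD). The payoff values are assumptions, and the weighted scale-free network and wealth asymmetry studied in the paper are not modeled here.

```python
import random

# Standard symmetric PD payoffs (assumed for illustration): T > R > P > S.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def play_repeated_pd(p, q, rounds=1000, rng=random):
    """Play two memory-one strategies p and q against each other.

    p and q map the previous (own_move, opponent_move) pair to a
    probability of cooperating this round; both start as if the
    previous round were mutual cooperation. Returns the average
    payoff per round for each player.
    """
    moves = ("C", "C")
    score_p = score_q = 0.0
    for _ in range(rounds):
        a = "C" if rng.random() < p[moves] else "D"
        b = "C" if rng.random() < q[(moves[1], moves[0])] else "D"
        sa, sb = PAYOFF[(a, b)]
        score_p += sa
        score_q += sb
        moves = (a, b)
    return score_p / rounds, score_q / rounds

# Tit-for-tat vs. always-defect, written as memory-one vectors over (CC, CD, DC, DD).
tit_for_tat   = {("C", "C"): 1.0, ("C", "D"): 0.0, ("D", "C"): 1.0, ("D", "D"): 0.0}
always_defect = {k: 0.0 for k in tit_for_tat}
print(play_repeated_pd(tit_for_tat, always_defect))
```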
We approach productivity in science in a longitudinal fashion: we track careers over time, up to 40 years. We first allocate scientists to decile-based publishing productivity classes, from the bottom 10% to the top 10%. Then, we seek patterns of mobility between the classes in two career stages: assistant professorship and associate professorship. Our findings confirm that radical changes in publishing productivity level (upward or downward) almost never happen. Scientists with a very weak past track record in publications emerge as having marginal chances of becoming scientists with a very strong future track record across all science, technology, engineering, mathematics, and medicine (STEMM) fields. Hence, our research shows the long-term character of careers in science, with publishing productivity during the apprenticeship period of assistant professorship heavily influencing productivity during the more independent period of associate professorship. We use individual-level microdata on academic careers (from a national registry of scientists) and individual-level metadata on publications (from the Scopus raw dataset). Polish associate professors tend to be stuck in their productivity classes for years: high performers tend to remain high performers, and low performers tend to remain low performers over their careers. Logistic regression analysis strongly supports our two-dimensional results. We examine all internationally visible Polish associate professors in five STEMM fields (N = 4,165, with N_art = 71,841 articles).
In the rapidly evolving landscape of artificial intelligence (AI), generative large language models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However, the computational intensity and memory consumption of deploying these models present substantial challenges in terms of serving efficiency, particularly in scenarios demanding low latency and high throughput. This survey addresses the imperative need for efficient LLM serving methodologies from a machine learning system (MLSys) research perspective, standing at the crux of advanced AI innovations and practical system optimizations. We provide in-depth analysis, covering a spectrum of solutions, ranging from cutting-edge algorithmic modifications to groundbreaking changes in system designs. The survey aims to provide a comprehensive understanding of the current state and future directions in efficient LLM serving, offering valuable insights for researchers and practitioners in overcoming the barriers of effective LLM deployment, thereby reshaping the future of AI.
Compared with cheap addition operations, multiplication operations have much higher computational complexity. The widely used convolutions in deep neural networks are exactly cross-correlations that measure the similarity between input features and convolution filters, which involves massive multiplications between floating-point values. In this paper, we present adder networks (AdderNets) to trade these massive multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs. In AdderNets, we take the $\ell_1$-norm distance between the filters and the input feature as the output response. The influence of this new similarity measure on the optimization of neural networks is thoroughly analyzed. To achieve better performance, we develop a special back-propagation approach for AdderNets by investigating the full-precision gradient. We then propose an adaptive learning rate strategy to enhance the training procedure of AdderNets according to the magnitude of each neuron's gradient. As a result, the proposed AdderNets achieve 74.9% Top-1 accuracy and 91.7% Top-5 accuracy with ResNet-50 on the ImageNet dataset without any multiplications in the convolution layers.
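A minimal sketch of the core idea (not the paper's full training procedure or back-propagation scheme): the filter response is the negative $\ell_1$ distance between the filter and each input patch, so the forward pass needs only subtractions, absolute values, and additions.

```python
import numpy as np

def adder_response(patch: np.ndarray, filt: np.ndarray) -> float:
    """Negative L1 distance between an input patch and a filter.

    Larger (closer to zero) means the patch is more similar to the filter;
    no multiplications are required, only subtraction and absolute value.
    """
    return -np.abs(patch - filt).sum()

def adder_conv2d(x: np.ndarray, filt: np.ndarray) -> np.ndarray:
    """Valid 'convolution' over a single-channel image using the L1 measure."""
    kh, kw = filt.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = adder_response(x[i:i + kh, j:j + kw], filt)
    return out

x = np.random.randn(8, 8)
w = np.random.randn(3, 3)
print(adder_conv2d(x, w).shape)  # (6, 6)
```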
Language model pre-training has proven to be useful for learning universal language representations. As a state-of-the-art pre-trained language model, BERT (Bidirectional Encoder Representations from Transformers) has achieved impressive results on many language understanding tasks. In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on text classification tasks and provide a general solution for BERT fine-tuning. Finally, the proposed solution obtains new state-of-the-art results on eight widely studied text classification datasets.
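As a generic illustration of BERT fine-tuning for text classification (a standard setup using the Hugging Face transformers library, not the paper's specific recipe; the toy data, hyperparameters, and model name are assumptions):

```python
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Toy two-class data; a real run would use a benchmark such as IMDb or AG News.
texts = ["a wonderful, moving film", "a dull and lifeless movie"]
labels = torch.tensor([1, 0])

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

inputs = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)  # small learning rate, typical for BERT fine-tuning

model.train()
for epoch in range(3):
    optimizer.zero_grad()
    out = model(**inputs, labels=labels)  # returns a loss when labels are provided
    out.loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss = {out.loss.item():.4f}")
```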