In this paper, we consider the fundamental problem of testing for monotone trend in a time series. While the term "trend" is commonly used and has an intuitive meaning, it is first crucial to specify its exact meaning in a hypothesis testing context. A commonly used well-known test is the Mann-Kendall test, which we show does not offer Type 1 error control even in large samples. On the other hand, by an appropriate studentization of the Mann-Kendall statistic, we construct permutation tests that offer asymptotic error control quite generally, but retain the exactness property of permutation tests for i.i.d. observations. We also introduce "local" Mann-Kendall statistics as a means of testing for local rather than global trend in a time series. Similar properties of permutation tests are obtained for these tests as well.
In this paper, we introduce two iterative methods for longest minimal length partition problem, which asks whether the disc (ball) is the set maximizing the total perimeter of the shortest partition that divides the total region into sub-regions with given volume proportions, under a volume constraint. The objective functional is approximated by a short-time heat flow using indicator functions of regions and Gaussian convolution. The problem is then represented as a constrained max-min optimization problem. Auction dynamics is used to find the shortest partition in a fixed region, and threshold dynamics is used to update the region. Numerical experiments in two-dimensional and three-dimensional cases are shown with different numbers of partitions, unequal volume proportions, and different initial shapes. The results of both methods are consistent with the conjecture that the disc in two dimensions and the ball in three dimensions are the solution of the longest minimal length partition problem.
For the first time, this paper presents a taxonomy of legal risks associated with generative AI (GenAI) by breaking down complex legal concepts to provide a common understanding of potential legal challenges for developing and deploying GenAI models. The methodology is based on (1) examining the legal claims that have been filed in existing lawsuits and (2) evaluating the reasonably foreseeable legal claims that may be filed in future lawsuits. First, we identified 29 lawsuits against prominent GenAI entities and tallied the claims of each lawsuit. From there, we identified seven claims that are cited at least four times across these lawsuits as the most likely claims for future GenAI lawsuits. For each of these seven claims, we describe the elements of the claim (what the plaintiff must prove to prevail) and provide an example of how it may apply to GenAI. Next, we identified 30 other potential claims that we consider to be more speculative, because they have been included in fewer than four lawsuits or have yet to be filed. We further separated those 30 claims into 19 that are most likely to be made in relation to pre-deployment of GenAI models and 11 that are more likely to be made in connection with post-deployment of GenAI models since the legal risks will vary between entities that create versus deploy them. For each of these claims, we describe the elements of the claim and the potential remedies that plaintiffs may seek to help entities determine their legal risks in developing or deploying GenAI. Lastly, we close the paper by noting the novelty of GenAI technology and propose some applications for the paper's taxonomy in driving further research.
Slope coefficients in rank-rank regressions are popular measures of intergenerational mobility. In this paper, we first point out two important properties of the OLS estimator in such regressions: commonly used variance estimators do not consistently estimate the asymptotic variance of the OLS estimator and, when the underlying distribution is not continuous, the OLS estimator may be highly sensitive to the way in which ties are handled. Motivated by these findings we derive the asymptotic theory for the OLS estimator in a general rank-rank regression specification without making assumptions about the continuity of the underlying distribution. We then extend the asymptotic theory to other regressions involving ranks that have been used in empirical work. Finally, we apply our new inference methods to three empirical studies. We find that the confidence intervals based on estimators of the correct variance may sometimes be substantially shorter and sometimes substantially longer than those based on commonly used variance estimators. The differences in confidence intervals concern economically meaningful values of mobility and thus may lead to different conclusions when comparing mobility across different regions or countries.
In this paper we present Large Language Model Assisted Retrieval Model Ranking (LARMOR), an effective unsupervised approach that leverages LLMs for selecting which dense retriever to use on a test corpus (target). Dense retriever selection is crucial for many IR applications that rely on using dense retrievers trained on public corpora to encode or search a new, private target corpus. This is because when confronted with domain shift, where the downstream corpora, domains, or tasks of the target corpus differ from the domain/task the dense retriever was trained on, its performance often drops. Furthermore, when the target corpus is unlabeled, e.g., in a zero-shot scenario, the direct evaluation of the model on the target corpus becomes unfeasible. Unsupervised selection of the most effective pre-trained dense retriever becomes then a crucial challenge. Current methods for dense retriever selection are insufficient in handling scenarios with domain shift. Our proposed solution leverages LLMs to generate pseudo-relevant queries, labels and reference lists based on a set of documents sampled from the target corpus. Dense retrievers are then ranked based on their effectiveness on these generated pseudo-relevant signals. Notably, our method is the first approach that relies solely on the target corpus, eliminating the need for both training corpora and test labels. To evaluate the effectiveness of our method, we construct a large pool of state-of-the-art dense retrievers. The proposed approach outperforms existing baselines with respect to both dense retriever selection and ranking. We make our code and results publicly available at //github.com/ielab/larmor/.
In this paper, we study the contextual multinomial logit (MNL) bandit problem in which a learning agent sequentially selects an assortment based on contextual information, and user feedback follows an MNL choice model. There has been a significant discrepancy between lower and upper regret bounds, particularly regarding the feature dimension $d$ and the maximum assortment size $K$. Additionally, the variation in reward structures between these bounds complicates the quest for optimality. Under uniform rewards, where all items have the same expected reward, we establish a regret lower bound of $\Omega(d\sqrt{\smash[b]{T/K}})$ and propose a constant-time algorithm, OFU-MNL+, that achieves a matching upper bound of $\tilde{O}(d\sqrt{\smash[b]{T/K}})$. Under non-uniform rewards, we prove a lower bound of $\Omega(d\sqrt{T})$ and an upper bound of $\tilde{O}(d\sqrt{T})$, also achievable by OFU-MNL+. Our empirical studies support these theoretical findings. To the best of our knowledge, this is the first work in the contextual MNL bandit literature to prove minimax optimality -- for either uniform or non-uniform reward setting -- and to propose a computationally efficient algorithm that achieves this optimality up to logarithmic factors.
This book is the result of a seminar in which we reviewed multimodal approaches and attempted to create a solid overview of the field, starting with the current state-of-the-art approaches in the two subfields of Deep Learning individually. Further, modeling frameworks are discussed where one modality is transformed into the other, as well as models in which one modality is utilized to enhance representation learning for the other. To conclude the second part, architectures with a focus on handling both modalities simultaneously are introduced. Finally, we also cover other modalities as well as general-purpose multi-modal models, which are able to handle different tasks on different modalities within one unified architecture. One interesting application (Generative Art) eventually caps off this booklet.
In this paper we provide a comprehensive introduction to knowledge graphs, which have recently garnered significant attention from both industry and academia in scenarios that require exploiting diverse, dynamic, large-scale collections of data. After a general introduction, we motivate and contrast various graph-based data models and query languages that are used for knowledge graphs. We discuss the roles of schema, identity, and context in knowledge graphs. We explain how knowledge can be represented and extracted using a combination of deductive and inductive techniques. We summarise methods for the creation, enrichment, quality assessment, refinement, and publication of knowledge graphs. We provide an overview of prominent open knowledge graphs and enterprise knowledge graphs, their applications, and how they use the aforementioned techniques. We conclude with high-level future research directions for knowledge graphs.
BERT, a pre-trained Transformer model, has achieved ground-breaking performance on multiple NLP tasks. In this paper, we describe BERTSUM, a simple variant of BERT, for extractive summarization. Our system is the state of the art on the CNN/Dailymail dataset, outperforming the previous best-performed system by 1.65 on ROUGE-L. The codes to reproduce our results are available at //github.com/nlpyang/BertSum
In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing to past attentions that are temporally memorized in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency. Second, a new optimization approach, called dynamic-critical reinforcement learning, is introduced to extend the standard supervised method. It always encourages to predict a more acceptable answer so as to address the convergence suppression problem occurred in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Meanwhile, our model outperforms previous systems by over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD datasets.
In this paper, we propose a conceptually simple and geometrically interpretable objective function, i.e. additive margin Softmax (AM-Softmax), for deep face verification. In general, the face verification task can be viewed as a metric learning problem, so learning large-margin face features whose intra-class variation is small and inter-class difference is large is of great importance in order to achieve good performance. Recently, Large-margin Softmax and Angular Softmax have been proposed to incorporate the angular margin in a multiplicative manner. In this work, we introduce a novel additive angular margin for the Softmax loss, which is intuitively appealing and more interpretable than the existing works. We also emphasize and discuss the importance of feature normalization in the paper. Most importantly, our experiments on LFW BLUFR and MegaFace show that our additive margin softmax loss consistently performs better than the current state-of-the-art methods using the same network architecture and training dataset. Our code has also been made available at //github.com/happynear/AMSoftmax