
We define `ousiometrics' to be the study of essential meaning in whatever context meaningful signals are communicated, and `telegnomics' as the study of remotely sensed knowledge. From work emerging through the middle of the 20th century, the essence of meaning has become generally accepted as being well captured by the three orthogonal dimensions of evaluation, potency, and activation (EPA). By re-examining first types and then tokens for the English language, and through the use of automatically annotated histograms -- `ousiograms' -- we find here that: 1. The essence of meaning conveyed by words is instead best described by a compass-like power-danger (PD) framework, and 2. Analysis of a disparate collection of large-scale English-language corpora -- literature, news, Wikipedia, talk radio, and social media -- shows that natural language exhibits a systematic bias toward safe, low-danger words -- a reinterpretation of the Pollyanna principle's positivity bias for written expression. To help justify our choice of dimension names and to address the problems with representing observed ousiometric dimensions by bipolar adjective pairs, we introduce and explore `synousionyms' and `antousionyms' -- ousiometric counterparts of synonyms and antonyms. We further show that the PD framework revises the circumplex model of affect as a more general model of state of mind. Finally, we use our findings to construct and test a prototype `ousiometer', a telegnomic instrument that measures ousiometric time series for temporal corpora. We contend that our power-danger ousiometric framework complements entropy-based measurements and may be of value for the study of a wide variety of communication across biological and artificial life.
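
To make the ousiometer idea concrete, here is a minimal Python sketch of scoring a temporal corpus along power and danger axes. The lexicon values, function names, and windowing scheme are illustrative placeholders, not the published ousiometric scores or instrument.

```python
# A minimal sketch of an "ousiometer": scoring a temporal corpus on
# power and danger axes. The lexicon values below are illustrative
# placeholders, not the published ousiometric scores.
from collections import Counter

pd_lexicon = {            # hypothetical word -> (power, danger) scores
    "strong": (0.9, 0.1),
    "weak":   (-0.8, -0.2),
    "attack": (0.6, 0.9),
    "calm":   (0.2, -0.7),
}

def ousiometric_score(text):
    """Average power and danger over lexicon words found in `text`."""
    counts = Counter(w for w in text.lower().split() if w in pd_lexicon)
    total = sum(counts.values())
    if total == 0:
        return None
    power = sum(pd_lexicon[w][0] * c for w, c in counts.items()) / total
    danger = sum(pd_lexicon[w][1] * c for w, c in counts.items()) / total
    return power, danger

# Time series over daily documents of a temporal corpus.
daily_docs = ["the calm crowd stayed strong", "a weak attack caused panic"]
series = [ousiometric_score(doc) for doc in daily_docs]
print(series)
```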

Related content

A corpus is a foundational resource for corpus linguistics and a primary resource for empirical approaches to the study of language. Corpora are used in lexicography, language teaching, traditional linguistic research, and statistics- or example-based work in natural language processing, among other areas.

Dense retrieval has shown promise in the first-stage retrieval process when trained on in-domain labeled datasets. However, previous studies have found that dense retrieval is hard to generalize to unseen domains due to its weak modeling of domain-invariant and interpretable features (i.e., the matching signal between two texts, which is the essence of information retrieval). In this paper, we propose BERM, a novel method that improves the generalization of dense retrieval by capturing the matching signal. Fully fine-grained expression and query-oriented saliency are two properties of the matching signal. Accordingly, in BERM, a passage is segmented into multiple units, and two unit-level requirements are imposed on the representation as training constraints in order to obtain an effective matching signal. One is semantic unit balance and the other is essential matching unit extractability. The unit-level view and balanced semantics let the representation express the text in a fine-grained manner. Essential matching unit extractability makes the passage representation sensitive to the given query, so that the pure matching information can be extracted from a passage with complex context. Experiments on BEIR show that our method can be effectively combined with different dense retrieval training methods (vanilla, hard negative mining, and knowledge distillation) to improve their generalization ability without any additional inference overhead or target-domain data.
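
As a rough illustration of the two unit-level constraints described above, the following numpy sketch treats a passage as a handful of unit embeddings and computes a balance term and an extractability term. The embeddings are random stand-ins for a trained encoder, and the exact loss forms used in BERM may differ.

```python
# A rough numpy sketch of the two unit-level constraints, under simplifying
# assumptions (random embeddings stand in for a trained encoder).
import numpy as np

rng = np.random.default_rng(0)
unit_embs = rng.normal(size=(4, 8))    # 4 passage units, dim 8
passage_emb = unit_embs.mean(axis=0)   # passage as aggregate of its units
query_emb = rng.normal(size=8)

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Semantic unit balance: discourage the passage vector from being
# dominated by a single unit (low variance of unit-passage similarity).
unit_sims = np.array([cos(u, passage_emb) for u in unit_embs])
balance_loss = unit_sims.var()

# Essential matching unit extractability: the best-matching unit for the
# query should carry (at least) the passage-level matching signal.
best_unit_sim = max(cos(u, query_emb) for u in unit_embs)
extract_loss = max(0.0, cos(passage_emb, query_emb) - best_unit_sim)

print(balance_loss, extract_loss)
```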

The ability of artificial intelligence (AI) systems to perceive and comprehend audio signals is crucial for many applications. Although significant progress has been made in this area since the development of AudioSet, most existing models are designed to map audio inputs to pre-defined, discrete sound label sets. In contrast, humans can not only classify sounds into coarse-grained categories, but also listen to the details of the sounds, explain the reasons for their predictions, infer what the sound implies, and understand the scene and what action needs to be taken. Such capabilities beyond perception are not yet present in existing audio models. On the other hand, modern large language models (LLMs) exhibit emerging reasoning ability but lack audio perception capabilities. Therefore, we ask the question: can we build an AI model that has both audio perception and reasoning ability? In this paper, we propose a novel audio foundation model, called LTU (Listen, Think, and Understand). To train LTU, we created a new OpenAQA-5M dataset consisting of 1.9 million closed-ended and 3.7 million open-ended, diverse (audio, question, answer) tuples, and used an autoregressive training framework with a perception-to-understanding curriculum. LTU demonstrates strong performance and generalization ability on conventional audio tasks such as classification and captioning. Moreover, it exhibits remarkable reasoning and comprehension abilities in the audio domain. To the best of our knowledge, LTU is the first audio-enabled large language model that bridges audio perception with advanced reasoning.
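
As a hedged illustration of what an (audio, question, answer) tuple might look like once serialized for autoregressive training, consider the sketch below. The prompt template, placeholder audio tag, and example content are assumptions for illustration; they are not the actual LTU prompt format or OpenAQA-5M data.

```python
# Illustrative only: one way to serialize (audio, question, answer) tuples
# into autoregressive training text; the real LTU interface injects audio
# embeddings rather than a text tag.
def format_example(audio_id, question, answer):
    # A placeholder tag marks where audio embeddings would be inserted.
    prompt = f"<audio:{audio_id}>\nQuestion: {question}\nAnswer:"
    target = f" {answer}"
    return prompt, target

prompt, target = format_example(
    "clip_000123",
    "What could be the source of this buzzing sound?",
    "Most likely an electric razor or a small motor running nearby.",
)
print(prompt + target)
```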

The graph traversal edit distance (GTED) is an elegant distance measure defined as the minimum edit distance between strings reconstructed from Eulerian trails in two edge-labeled graphs. GTED can be used to infer evolutionary relationships between species by comparing de Bruijn graphs directly, without the computationally costly and error-prone process of genome assembly. Ebrahimpour Boroojeny et al.~(2018) suggest two ILP formulations for GTED and claim that GTED is polynomially solvable because the linear programming relaxation of one of the ILPs always yields optimal integer solutions. The claim that GTED is polynomially solvable contradicts the complexity results of existing string-to-graph matching problems. We resolve this conflict by proving that GTED is NP-complete and showing that the ILPs proposed by Ebrahimpour Boroojeny et al. do not solve GTED but instead solve for a lower bound of GTED and are not solvable in polynomial time. In addition, we provide the first two correct ILP formulations of GTED and evaluate their empirical efficiency. These results provide solid algorithmic foundations for comparing genome graphs and point in the direction of approximation heuristics. The source code to reproduce the experimental results is available at //github.com/Kingsford-Group/gtednewilp/.
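
For intuition, the following brute-force Python sketch enumerates the label strings spelled by Eulerian trails of two tiny edge-labeled graphs and takes their minimum pairwise edit distance. This exponential-time enumeration is purely illustrative and is not one of the ILP formulations discussed above; the graphs and labels are made up.

```python
# Brute-force illustration of GTED on tiny edge-labeled graphs: enumerate
# Eulerian trails, spell out their label strings, and take the minimum
# pairwise edit distance. For intuition only (the general problem is
# NP-complete, so this does not scale).
def eulerian_strings(edges):
    """All strings spelled by Eulerian trails of a small edge-labeled multigraph.

    `edges` is a list of (u, v, label) tuples for directed, labeled edges.
    """
    results = set()

    def extend(trail, remaining, last):
        if not remaining:
            results.add("".join(lbl for _, _, lbl in trail))
            return
        for i, (u, v, lbl) in enumerate(remaining):
            if last is None or u == last:
                extend(trail + [(u, v, lbl)], remaining[:i] + remaining[i + 1:], v)

    extend([], edges, None)
    return results

def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    dp = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)]
          for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            dp[i][j] = min(dp[i - 1][j] + 1, dp[i][j - 1] + 1,
                           dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
    return dp[len(a)][len(b)]

g1 = [(0, 1, "a"), (1, 2, "b"), (2, 0, "a")]
g2 = [(0, 1, "a"), (1, 2, "b"), (2, 0, "b")]
gted = min(edit_distance(s, t)
           for s in eulerian_strings(g1) for t in eulerian_strings(g2))
print(gted)
```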

In this paper, we propose the Ordered Median Tree Location Problem (OMT). The OMT is a single-allocation facility location problem in which p facilities must be placed on a network and connected by a non-directed tree. The objective is to minimize the sum of the ordered weighted averaged allocation costs plus the sum of the costs of connecting the facilities in the tree. We present different MILP formulations for the OMT based on properties of the minimum spanning tree problem and of ordered median optimization. Given that ordered median hub location problems are rather difficult to solve, we improve the OMT solution performance by introducing covering variables in a valid reformulation and by developing two pre-processing phases to reduce the size of these formulations. In addition, we propose a Benders decomposition algorithm for the OMT. We present an empirical comparison between these new formulations and provide enhancements that, together with a proper formulation, allow medium-sized instances on general random graphs to be solved.
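
A small sketch of one ingredient of the objective, the ordered weighted average of allocation costs, may help: sorting the costs before weighting lets the same cost vector model median-like or center-like criteria depending on the weights. The weights and costs below are illustrative, and this is not the full OMT model.

```python
# Sketch of the ordered median objective ingredient: allocation costs are
# sorted and then weighted, so different weight vectors recover median-like
# or center-like objectives (weights here are illustrative).
def ordered_median_cost(allocation_costs, weights):
    """Ordered weighted average of allocation costs (sorted ascending)."""
    assert len(allocation_costs) == len(weights)
    return sum(w * c for w, c in zip(weights, sorted(allocation_costs)))

costs = [4.0, 1.0, 7.0, 3.0]
print(ordered_median_cost(costs, [1, 1, 1, 1]))  # median-like: sum of all costs
print(ordered_median_cost(costs, [0, 0, 0, 1]))  # center-like: worst cost only
```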

Viewing Transformers as interacting particle systems, we describe the geometry of learned representations when the weights are not time dependent. We show that particles, representing tokens, tend to cluster toward particular limiting objects as time tends to infinity. Cluster locations are determined by the initial tokens, confirming context-awareness of representations learned by Transformers. Using techniques from dynamical systems and partial differential equations, we show that the type of limiting object that emerges depends on the spectrum of the value matrix. Additionally, in the one-dimensional case we prove that the self-attention matrix converges to a low-rank Boolean matrix. The combination of these results mathematically confirms the empirical observation made by Vaswani et al. [VSP'17] that leaders appear in a sequence of tokens when processed by Transformers.
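
The clustering behaviour can be illustrated numerically with a toy simulation of tokens as interacting particles under repeated self-attention with fixed weights; with the value matrix set to the identity, the token positions tend to collapse toward a few cluster locations. The step size, inverse temperature, and update rule below are illustrative choices, not the exact dynamics analyzed in the paper.

```python
# Toy particle simulation of repeated self-attention with time-independent
# weights; an illustration of clustering, not the paper's exact dynamics.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 2))          # 8 tokens as particles in 2 dimensions
V = np.eye(2)                        # value matrix (identity for simplicity)
beta = 4.0                           # inverse temperature of the attention

for step in range(200):
    scores = beta * X @ X.T          # attention logits
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)
    X = X + 0.1 * (A @ X @ V - X)    # small step toward the attention average

print(np.round(X, 2))                # near-duplicate rows indicate clustering
```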

The rank invariant (RI), one of the best-known invariants of persistence modules $M$ over a given poset P, is defined as the map sending each comparable pair $p\leq q$ in P to the rank of the linear map $M(p\leq q)$. The recently introduced notion of the generalized rank invariant (GRI) acquires more discriminating power than the RI at the expense of enlarging the domain of the RI to the set Int(P) of intervals of P, or to an even larger set. Given that the size of Int(P) can be much larger than that of the domain of the RI, restricting the domain of the GRI to smaller, more manageable subcollections $\mathcal{I}$ of Int(P) is desirable to reduce the total cost of computing the GRI. This work studies the tension between computational efficiency and discriminating strength when restricting the domain of the GRI to different choices of $\mathcal{I}$. In particular, we prove that the discriminating power of the GRI over restricted collections $\mathcal{I}$ strictly increases as $\mathcal{I}$ interpolates between the domain of the RI and Int(P). Along the way, some well-known results regarding the RI or GRI from the literature are contextualized within the framework of the M\"obius inversion formula, and we obtain a notion of generalized persistence diagram that does not require local finiteness of the indexing poset for persistence modules. Lastly, motivated by a recent finding that zigzag persistence can be used to compute the GRI, we pay special attention to comparing the discriminating power of the GRI for persistence modules $M$ over $\mathbb{Z}^2$ with the so-called Zigzag-path-Indexed Barcode (ZIB), a map sending each zigzag path $\Gamma$ in $\mathbb{Z}^2$ to the barcode of the restriction of $M$ to $\Gamma$. Clarifying the connection between the GRI and the ZIB is potentially important for understanding to what extent zigzag persistence algorithms can be exploited for computing the GRI.
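
As a concrete anchor for the definitions above, the following toy computation evaluates the classical rank invariant (not the GRI) for a small persistence module over the chain 0 <= 1 <= 2, with the internal linear maps given as matrices chosen only for illustration.

```python
# Toy computation of the classical rank invariant over the chain poset
# 0 <= 1 <= 2; the matrices are illustrative, not from any real dataset.
import numpy as np

maps = {
    (0, 1): np.array([[1, 0], [0, 0]]),   # M(0 <= 1): R^2 -> R^2
    (1, 2): np.array([[1, 1]]),           # M(1 <= 2): R^2 -> R^1
}

def rank_invariant(p, q):
    """Rank of the composite map M(p <= q) along the chain."""
    if p == q:
        return maps[(p, p + 1)].shape[1] if p < 2 else maps[(p - 1, p)].shape[0]
    mat = np.eye(maps[(p, p + 1)].shape[1])
    for i in range(p, q):
        mat = maps[(i, i + 1)] @ mat
    return int(np.linalg.matrix_rank(mat))

for p in range(3):
    for q in range(p, 3):
        print((p, q), rank_invariant(p, q))
```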

We present a method for extracting general modules for ontologies formulated in the description logic ALC. A module for an ontology is, ideally, a substantially smaller ontology that preserves all entailments for a user-specified set of terms. As such, it has applications such as ontology reuse and ontology analysis. Different from classical modules, general modules may use axioms not explicitly present in the input ontology, which allows for additional conciseness. So far, general modules have only been investigated for lightweight description logics. We present the first work that considers the more expressive description logic ALC. In particular, our contribution is a new method based on uniform interpolation, supported by new theoretical results. Our evaluation indicates that our general modules are often smaller than classical modules and than uniform interpolants computed by the state of the art and, compared with uniform interpolants, can be computed in significantly less time. Moreover, our method can be used for, and in fact improves, the computation of uniform interpolants and classical modules.

Prediction algorithms that quantify the expected benefit of a given treatment conditional on patient characteristics can critically inform medical decisions. Quantifying the performance of treatment benefit prediction algorithms is an active area of research. A recently proposed metric, the concordance statistic for benefit (cfb), evaluates the discriminative ability of a treatment benefit predictor by directly extending the concept of the concordance statistic from a risk model with a binary outcome to a model for treatment benefit. In this work, we scrutinize cfb on multiple fronts. Through numerical examples and theoretical developments, we show that cfb is not a proper scoring rule. We also show that it is sensitive to the unestimable correlation between counterfactual outcomes and to the definition of matched pairs. We argue that measures of statistical dispersion applied to predicted benefits do not suffer from these issues and can serve as alternative metrics for the discriminatory performance of treatment benefit predictors.
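
A simplified sketch of how a c-for-benefit statistic can be computed may help fix ideas: treated and control patients are matched (here, by sorted predicted benefit), the observed benefit of a matched pair is the difference of the two binary outcomes, and cfb is the usual concordance between predicted and observed benefit over pairs of matched pairs. The simulated data, benefit model, and matching rule below are illustrative assumptions, not those of the cited work.

```python
# Simulated sketch of a c-for-benefit computation under one possible
# matching rule; everything here is illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 100
risk_ctrl_t = rng.uniform(0.1, 0.9, size=n)     # control-risk predictions, treated arm
risk_ctrl_c = rng.uniform(0.1, 0.9, size=n)     # control-risk predictions, control arm
benefit_t = 0.2 * risk_ctrl_t                   # predicted benefit, treated arm
benefit_c = 0.2 * risk_ctrl_c                   # predicted benefit, control arm
y_t = rng.binomial(1, risk_ctrl_t - benefit_t)  # outcomes under treatment
y_c = rng.binomial(1, risk_ctrl_c)              # outcomes under control

# Pair the k-th treated and k-th control patient after sorting each arm by
# predicted benefit (one of several pairing rules used in the literature).
t = np.argsort(benefit_t)
c = np.argsort(benefit_c)
pair_pred = (benefit_t[t] + benefit_c[c]) / 2
pair_obs = y_c[c] - y_t[t]                      # 1, 0, or -1: observed pair benefit

concordant, comparable = 0, 0
for i in range(n):
    for j in range(i + 1, n):
        if pair_obs[i] != pair_obs[j]:
            comparable += 1
            concordant += int((pair_pred[i] > pair_pred[j]) == (pair_obs[i] > pair_obs[j]))
print(concordant / comparable)
```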

Reasoning with knowledge expressed in natural language and Knowledge Bases (KBs) is a major challenge for Artificial Intelligence, with applications in machine reading, dialogue, and question answering. General neural architectures that jointly learn representations and transformations of text are very data-inefficient, and it is hard to analyse their reasoning process. These issues are addressed by end-to-end differentiable reasoning systems such as Neural Theorem Provers (NTPs), although they can only be used with small-scale symbolic KBs. In this paper we first propose Greedy NTPs (GNTPs), an extension to NTPs addressing their complexity and scalability limitations, thus making them applicable to real-world datasets. This result is achieved by dynamically constructing the computation graph of NTPs and including only the most promising proof paths during inference, thus obtaining orders of magnitude more efficient models. Then, we propose a novel approach for jointly reasoning over KBs and textual mentions, by embedding logic facts and natural language sentences in a shared embedding space. We show that GNTPs perform on par with NTPs at a fraction of their cost while achieving competitive link prediction results on large datasets, providing explanations for predictions, and inducing interpretable models. Source code, datasets, and supplementary material are available online at //github.com/uclnlp/gntp.
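
The core of the "only the most promising proof paths" idea can be sketched as a nearest-neighbour selection over fact embeddings: instead of unifying a goal with every fact in the KB, keep only the top-k most similar facts as candidates to expand. The embeddings below are random placeholders, and real GNTPs score candidates with differentiable unification over learned symbol embeddings rather than a single cosine similarity.

```python
# Minimal sketch of greedy candidate selection over fact embeddings;
# illustrative embeddings, not the GNTP scoring function itself.
import numpy as np

rng = np.random.default_rng(0)
fact_embs = rng.normal(size=(10000, 64))     # one embedding per KB fact
goal_emb = rng.normal(size=64)

def top_k_facts(goal, facts, k=5):
    """Indices of the k facts most similar to the goal (cosine similarity)."""
    sims = facts @ goal / (np.linalg.norm(facts, axis=1) * np.linalg.norm(goal))
    return np.argsort(-sims)[:k]

print(top_k_facts(goal_emb, fact_embs))      # candidate facts to expand in the proof
```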

While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions that fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on the ImageNet classification task have been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new Full Reference Image Quality Assessment (FR-IQA) dataset of perceptual human judgments, orders of magnitude larger than previous datasets. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by large margins. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations.
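
The following PyTorch sketch shows the core idea of a deep-feature perceptual distance: compare channel-normalized VGG activations of two images at a few layers. It assumes torchvision's pretrained VGG-16 is available; the chosen layers and unweighted averaging are simplifications, so this is not the calibrated metric trained on the dataset described above.

```python
# Sketch of a deep-feature perceptual distance: compare channel-normalized
# VGG-16 activations at a few layers. Not the calibrated metric itself.
import torch
import torchvision.models as models

vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()
layers = {3, 8, 15, 22}            # a few ReLU outputs to compare

def deep_feature_distance(img_a, img_b):
    """img_a, img_b: float tensors of shape (1, 3, H, W), ImageNet-normalized."""
    dist = 0.0
    xa, xb = img_a, img_b
    with torch.no_grad():
        for i, layer in enumerate(vgg):
            xa, xb = layer(xa), layer(xb)
            if i in layers:
                na = xa / (xa.norm(dim=1, keepdim=True) + 1e-8)
                nb = xb / (xb.norm(dim=1, keepdim=True) + 1e-8)
                dist += ((na - nb) ** 2).mean().item()
    return dist

x = torch.rand(1, 3, 224, 224)
print(deep_feature_distance(x, x + 0.05 * torch.randn_like(x)))
```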
