
Recently, Ivan Mihajlin and Alexander Smal proved a composition theorem for the universal relation and some function via the so-called xor composition: there exists a function $f:\{0,1\}^n \rightarrow \{0,1\}$ such that $\textsf{CC}(\text{U}_n \diamond \text{KW}_f) \geq 1.5n-o(n)$, where $\textsf{CC}$ denotes the communication complexity of the problem. In this paper, we significantly improve their result and present an asymptotically tight and much more general composition theorem for the universal relation and most functions: for most functions $f:\{0,1\}^n \rightarrow \{0,1\}$, we have $\textsf{CC}(\text{U}_m \diamond \text{KW}_f) \geq m+n-O(\sqrt{m})$ when $m=\omega(\log^2 n)$ and $n=\omega(\sqrt{m})$. This is achieved by a direct proof of a composition theorem for the universal relation and a multiplexor in the partially half-duplex model, avoiding the xor composition; the proof works even when the multiplexor contains only a few functions. One crucial ingredient of our proof is a combinatorial problem: constructing a tree with many leaves in which every leaf contains a non-overlapping set of functions. For each leaf there is a set of inputs on which every function in the leaf takes the same value; that is, all functions are restricted. We show how to choose a set of good inputs that effectively restrict these functions, forcing the number of functions in each leaf to be as small as possible while maintaining the total number of functions across all leaves. This results in a large number of leaves.
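For reference, the objects in the bounds above can be written out as follows. This is the standard formulation of Karchmer--Wigderson games and their block composition; the paper's own notation may differ in details:

```latex
% Karchmer--Wigderson relation of f: Alice holds x with f(x)=1, Bob holds
% y with f(y)=0, and the goal is a coordinate where they differ.
\mathrm{KW}_f = \{(x, y, i) : f(x) = 1,\; f(y) = 0,\; x_i \neq y_i\}

% Universal relation on m bits: the only promise is x \neq y.
\mathrm{U}_m = \{(x, y, i) : x, y \in \{0,1\}^m,\; x \neq y,\; x_i \neq y_i\}

% Block composition U_m \diamond KW_f: Alice holds X \in \{0,1\}^{m \times n},
% Bob holds Y, with the promise that applying f row-wise gives distinct
% strings f(X) \neq f(Y); the goal is a position (j, i) with
% X_{j,i} \neq Y_{j,i}.
```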

Related content

Background: Identifying and characterising the longitudinal patterns of multimorbidity associated with stroke is needed to better understand patients' needs and inform new models of care. Methods: We used an unsupervised patient-oriented clustering approach to analyse primary care electronic health records (EHR) of 30 common long-term conditions (LTC) in patients with stroke aged over 18, registered in 41 general practices in south London between 2005 and 2021. Results: Of 849,968 registered patients, 9,847 (1.16%) had a record of stroke; 46.5% were female and the median age at record was 65.0 years (IQR: 51.5 to 77.0). The median number of LTCs in addition to stroke was 3 (IQR: 2 to 5). Patients were stratified into eight clusters. These clusters revealed contrasting patterns of multimorbidity, socio-demographic characteristics (age, gender and ethnicity) and risk factors. Besides a core of 3 clusters associated with conventional stroke risk factors, minor clusters exhibited less common but recurrent combinations of LTCs, including mental health conditions, asthma, osteoarthritis and sickle cell anaemia. Importantly, complex profiles combining mental health conditions, infectious diseases and substance dependency emerged. Conclusion: This patient-oriented approach to EHRs uncovers the heterogeneity of the profiles of multimorbidity and socio-demographic characteristics associated with stroke. It highlights the importance of conventional stroke risk factors, as well as the association of mental health conditions with the complex profiles of multimorbidity displayed by a significant proportion of patients. These results address the need for a better understanding of stroke-associated multimorbidity and complexity to inform more efficient and patient-oriented healthcare models.
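As a minimal illustration of the kind of unsupervised stratification described above, one can cluster binary long-term-condition indicators with plain k-means. The abstract does not specify the paper's actual clustering method, so this is a generic stand-in on made-up data:

```python
import numpy as np

def kmeans(X, centers, iters=20):
    """Plain k-means with fixed initial centers -- a generic stand-in for
    the paper's (unspecified here) patient-oriented clustering approach."""
    for _ in range(iters):
        # squared Euclidean distance of every patient to every center
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # move each center to the mean of its assigned patients
        for j in range(len(centers)):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    return labels

# Toy binary patient-by-condition matrix (illustrative, not real EHR data):
# rows are patients, columns are long-term-condition indicators.
X = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0],
              [0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]])
labels = kmeans(X, X[[0, 3]].astype(float))
print(labels)  # two strata: [0 0 0 1 1 1]
```

Each row's cluster label plays the role of a multimorbidity stratum; on real EHR data one would also profile each cluster's socio-demographic composition.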

Motivated by the goals of dataset pruning and defect identification, a growing body of methods has been developed to score individual examples within a dataset. These methods, which we call "example difficulty scores", are typically used to rank or categorize examples, but the consistency of rankings between different training runs, scoring methods, and model architectures is generally unknown. To determine how example rankings vary due to these random and controlled effects, we systematically compare different formulations of scores over a range of runs and model architectures. We find that scores largely share the following traits: they are noisy over individual runs of a model, strongly correlated with a single notion of difficulty, and reveal examples that range from being highly sensitive to insensitive to the inductive biases of certain model architectures. Drawing from statistical genetics, we develop a simple method for fingerprinting model architectures using a few sensitive examples. These findings guide practitioners in maximizing the consistency of their scores (e.g., by choosing appropriate scoring methods, numbers of runs, and subsets of examples) and establish comprehensive baselines for evaluating scores in the future.
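A toy simulation makes the run-to-run noise concrete. Here a hypothetical latent difficulty stands in for the "single notion of difficulty" (none of the paper's actual scoring methods are implemented), and averaging scores over several runs visibly improves rank consistency:

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation between two score vectors (no ties assumed)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float(ra @ rb / np.sqrt((ra @ ra) * (rb @ rb)))

rng = np.random.default_rng(0)
n_examples, n_runs = 200, 8
# latent per-example difficulty (a modeling assumption for this sketch)
true_difficulty = rng.uniform(0, 1, n_examples)
# each training run observes difficulty plus run-specific noise
runs = true_difficulty[None, :] + 0.3 * rng.normal(size=(n_runs, n_examples))

single_run = spearman(runs[0], runs[1])          # noisy single-run rankings
averaged = spearman(runs[:4].mean(0), runs[4:].mean(0))  # denoised by averaging
print(single_run, averaged)  # averaging raises rank consistency
```

The same comparison, run with real scores over real training runs, is essentially a consistency baseline of the kind the abstract advocates.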

We present a result according to which certain functions of covariance matrices are maximized at scalar multiples of the identity matrix. This is used to show that experimental designs that are optimal under an assumption of independent, homoscedastic responses can be minimax robust in broad classes of alternative covariance structures. In particular, it can justify the common practice of disregarding possible dependence, or heteroscedasticity, at the design stage of an experiment.

We address the choice of penalty parameter in the Smoothness-Penalized Deconvolution (SPeD) method of estimating a probability density under additive measurement error. Cross-validation gives an unbiased estimate of the risk (for the present sample size n) at a given penalty parameter, and this estimate can be minimized over the penalty parameter. Least-squares cross-validation, which has been proposed for the similar Deconvoluting Kernel Density Estimator (DKDE), performs quite poorly for SPeD. We instead estimate the risk function for a smaller sample size n_1 < n at a given penalty parameter, use this to choose the penalty parameter for sample size n_1, and then use the asymptotics of the optimal penalty parameter to choose it for sample size n. In a simulation study, we find that this has dramatically better performance than cross-validation, is an improvement over a SURE-type method previously proposed for this estimator, and compares favorably to the classic DKDE with its recommended plug-in method. We prove that the maximum error in estimating the risk function is of smaller order than its optimal rate of convergence.
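The subsample-then-extrapolate step can be sketched as follows. The decay exponent `beta` and the toy risk curve are illustrative placeholders, not quantities derived for SPeD in the paper:

```python
import numpy as np

def choose_penalty(risk_at_n1, grid, n1, n, beta=0.8):
    """Minimize an estimated risk curve at subsample size n1, then rescale
    the minimizer to the full sample size n under the assumption that the
    optimal penalty decays like n**(-beta).  beta = 0.8 is an illustrative
    placeholder, not the rate actually derived for SPeD."""
    lam_n1 = grid[int(np.argmin([risk_at_n1(lam) for lam in grid]))]
    # lambda_opt(n) ~ C * n**(-beta)  =>  lambda_n = lambda_n1 * (n1/n)**beta
    return lam_n1 * (n1 / n) ** beta

# Toy risk curve with its minimum at lambda = 0.1 (purely illustrative).
risk = lambda lam: (np.log10(lam) + 1.0) ** 2
grid = np.logspace(-3, 1, 81)
lam_n = choose_penalty(risk, grid, n1=500, n=2000)
print(lam_n)  # smaller than the subsample optimum 0.1, since n > n1
```

In the actual method, `risk_at_n1` would be the cross-validation risk estimate computed on subsamples of size n_1, which is where the estimator is better behaved.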

At the intersection of computation and cognitive science, graph theory is used as a formalized description of complex relationships and structures. Traditional graph models are often static, lacking dynamic and autonomous behavioral patterns, and rely on algorithms with a global view; this differs significantly from biological neural networks, so simulating information storage and retrieval processes requires overcoming the limitations of centralized algorithms. This study introduces a directed graph model that equips each node with adaptive learning and decision-making capabilities, thereby facilitating decentralized dynamic information storage and the modeling and simulation of the brain's memory process. We abstract different storage instances as directed graph paths, transforming the storage of information into the assignment, discrimination, and extraction of different paths. To address writing and reading challenges, each node has a personalized adaptive learning ability. A storage algorithm without a God's-eye view is developed, in which each node uses its limited neighborhood information to facilitate the extension, formation, solidification, and awakening of directed graph paths, achieving competitive, reciprocal, and sustainable utilization of limited resources. Storage behavior occurs at each node, with the adaptive learning behavior of nodes concretized in a microcircuit centered around a variable resistor, simulating the electrophysiological behavior of neurons. Under the constraints that neurobiology places on the anatomy and electrophysiology of biological neural networks, this model offers a plausible explanation for the mechanism of memory realization, providing a comprehensive, system-level experimental validation of memory trace theory.
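The path-as-memory idea admits a tiny software sketch: each node knows only its own outgoing edge weights (the role played by the variable resistor), writing a storage instance reinforces a path edge by edge, and reading follows the locally strongest edge. This is a toy illustration of decentralized storage, not the paper's microcircuit model:

```python
class Node:
    """A node with only local knowledge: its outgoing edge weights."""
    def __init__(self, name):
        self.name = name
        self.out = {}  # neighbor Node -> adaptive weight (the "variable resistor")

    def reinforce(self, neighbor, lr=0.5):
        # solidification: repeated writes strengthen the same edge
        self.out[neighbor] = self.out.get(neighbor, 0.0) + lr

    def recall(self):
        # awakening: follow the strongest outgoing edge, a purely local decision
        return max(self.out, key=self.out.get) if self.out else None

def store_path(nodes, path):
    """Write one storage instance as a directed path, node by node."""
    for a, b in zip(path, path[1:]):
        nodes[a].reinforce(nodes[b])

def read_path(nodes, start, length):
    """Read by following locally strongest edges; no global view is used."""
    cur, out = nodes[start], [start]
    for _ in range(length - 1):
        cur = cur.recall()
        if cur is None:
            break
        out.append(cur.name)
    return out

nodes = {c: Node(c) for c in "ABCDE"}
store_path(nodes, "ABCD")
store_path(nodes, "ABCD")  # repetition solidifies this trace
store_path(nodes, "ABED")  # a competing, weaker trace
print(read_path(nodes, "A", 4))  # -> ['A', 'B', 'C', 'D']
```

Competition between the two traces at node B is resolved by the learned weights alone, which is the decentralized behavior the abstract describes.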

Printed Electronics (PE) feature distinct and remarkable characteristics that make them a prominent technology for achieving true ubiquitous computing. This is particularly relevant in application domains that require conformal and ultra-low-cost solutions, which have experienced limited penetration of computing until now. Unlike silicon-based technologies, PE offer unparalleled features such as low non-recurring engineering costs, ultra-low manufacturing cost, and on-demand fabrication of conformal, flexible, non-toxic, and stretchable hardware. However, PE face certain limitations due to their large feature sizes, which impede the realization of complex circuits such as machine learning classifiers. In this work, we address these limitations by leveraging the principles of Approximate Computing and Bespoke (fully-customized) design. We propose an automated framework for designing ultra-low-power Multilayer Perceptron (MLP) classifiers which employs, for the first time, a holistic approach to approximate all functions of the MLP's neurons: multiplication, accumulation, and activation. Through comprehensive evaluation across MLPs of varying size, our framework demonstrates the ability to enable battery-powered operation of even the most intricate MLP architecture examined, significantly surpassing the current state of the art.
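One way to get intuition for neuron-level approximation is a software model of a single approximate neuron. The specific choices below (bit-truncated fixed-point multiply, saturating accumulate, ReLU) are illustrative assumptions; the paper derives bespoke hardware approximations, not this model:

```python
import numpy as np

def approx_neuron(x, w, b, frac_bits=8, drop_bits=4):
    """One approximate neuron: fixed-point weights with the low `drop_bits`
    truncated (approximate multiply), a saturating accumulate, and a ReLU
    activation.  All three approximations are illustrative stand-ins for
    the hardware-level ones co-designed in the paper."""
    scale = 1 << frac_bits
    # coarsen weights: quantize, then zero the low-order bits
    wq = (np.round(w * scale).astype(int) >> drop_bits) << drop_bits
    acc = np.clip(x @ (wq / scale) + b, -8, 8)  # saturating accumulate
    return np.maximum(acc, 0.0)                 # ReLU activation

x = np.array([0.5, -0.25, 1.0])
w = np.array([[0.30, -0.70], [0.11, 0.05], [-0.42, 0.88]])
b = np.array([0.1, 0.0])
exact = np.maximum(np.clip(x @ w + b, -8, 8), 0.0)
approx = approx_neuron(x, w, b)
print(np.abs(exact - approx).max())  # small, bounded deviation from exact
```

The design question the framework automates is how aggressively each of the three stages can be approximated (here, how large `drop_bits` can be) before classification accuracy degrades.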

Elixir is a dynamically-typed functional language running on the Erlang Virtual Machine, designed for building scalable and maintainable applications. Its characteristics have earned it a surging adoption by hundreds of industrial actors and tens of thousands of developers. Static typing seems nowadays to be the most important request coming from the Elixir community. We present a gradual type system we plan to include in the Elixir compiler, outline its characteristics and design principles, and show by some short examples how to use it in practice. Developing a static type system suitable for Erlang's family of languages has been an open research problem for almost two decades. Our system transposes to this family of languages a polymorphic type system with set-theoretic types and semantic subtyping. To do that, we had to improve and extend both semantic subtyping and the associated typing techniques to account for several characteristics of these languages -- and of Elixir in particular -- such as the arity of functions, the use of guards, a uniform treatment of records and dictionaries, and the need for a new sound gradual typing discipline that does not rely on the insertion at compile time of specific run-time type tests but, rather, takes into account both the type tests performed by the virtual machine and those explicitly added by the programmer. The system presented here is "gradually" being implemented and integrated into Elixir, but a prototype implementation is already available. The aim of this work is to serve as a longstanding reference that will be used to introduce types to Elixir programmers, as well as to hint at some future directions and possible evolutions of the Elixir language.

The Paterson--Stockmeyer method is an evaluation scheme for matrix polynomials with scalar coefficients that arise in many state-of-the-art algorithms based on polynomial or rational approximation, for example, those for computing transcendental matrix functions. We derive a mixed-precision version of the Paterson--Stockmeyer method that is particularly useful for evaluating matrix polynomials with scalar coefficients of decaying magnitude. The key idea is to perform computations on data of small magnitude in low precision, and rounding error analysis is provided for the use of lower-than-working precisions. We focus on the evaluation of the Taylor approximants of the matrix exponential and show the applicability of our method to the existing scaling and squaring algorithms, particularly when the norm of the input matrix (which in practical algorithms is often scaled toward the origin) is sufficiently small. We also demonstrate through experiments the general applicability of our method to the computation of the polynomials from the Pad\'e approximant of the matrix exponential and the Taylor approximant of the matrix cosine. Numerical experiments show that our mixed-precision Paterson--Stockmeyer algorithms can be more efficient than their fixed-precision counterparts while delivering the same level of accuracy.
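The underlying scheme is easy to state in code. The sketch below is the classical fixed-precision Paterson--Stockmeyer evaluation; the paper's contribution, computing small-magnitude coefficient blocks in lower precisions, is deliberately omitted:

```python
import math
import numpy as np

def paterson_stockmeyer(coeffs, A, s=None):
    """Evaluate p(A) = sum_k coeffs[k] * A**k by the Paterson-Stockmeyer
    scheme in a single working precision.  (The mixed-precision variant
    would additionally round blocks of decaying coefficients to lower
    precisions; that logic is not modeled here.)"""
    d = len(coeffs) - 1
    if s is None:
        s = max(1, math.isqrt(d) + 1)  # s ~ sqrt(d) balances the two costs
    # explicit powers A^0, A^1, ..., A^s
    pows = [np.eye(A.shape[0])]
    for _ in range(s):
        pows.append(pows[-1] @ A)
    # Horner recurrence in A^s over coefficient blocks of length s
    result = None
    for j in range(d // s, -1, -1):
        block = sum(coeffs[j * s + k] * pows[k]
                    for k in range(min(s, d - j * s + 1)))
        result = block if result is None else result @ pows[s] + block
    return result

# Degree-10 Taylor approximant of exp(A), checked against naive evaluation.
A = np.array([[0.1, 0.2], [0.0, 0.3]])
coeffs = [1 / math.factorial(k) for k in range(11)]
naive = sum(c * np.linalg.matrix_power(A, k) for k, c in enumerate(coeffs))
print(np.allclose(paterson_stockmeyer(coeffs, A), naive))  # True
```

With block length s, the scheme needs only about s + d/s matrix multiplications instead of d, which is why it is the workhorse inside scaling-and-squaring algorithms.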

Pre-trained Language Models (PLMs), which are trained on large text corpora via self-supervised learning, have yielded promising performance on various tasks in Natural Language Processing (NLP). However, though PLMs with huge numbers of parameters can effectively possess rich knowledge learned from massive training text and benefit downstream tasks at the fine-tuning stage, they still have some limitations, such as poor reasoning ability, due to the lack of external knowledge. Research has been dedicated to incorporating knowledge into PLMs to tackle these issues. In this paper, we present a comprehensive review of Knowledge-Enhanced Pre-trained Language Models (KE-PLMs) to provide a clear insight into this thriving field. We introduce appropriate taxonomies for Natural Language Understanding (NLU) and Natural Language Generation (NLG) respectively, to highlight these two main tasks of NLP. For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG), and rule knowledge. The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods. Finally, we point out some promising future directions for KE-PLMs.

Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion using a strongly convex regularizer. This allows us to relax both the optimal value and the solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
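The core trick, replacing max by an entropy-regularized (log-sum-exp) max inside the DP recursion, can be sketched in a few lines. This is a minimal chain-structured Viterbi value computation in the spirit of the paper's smoothed Viterbi instantiation, not its actual implementation:

```python
import numpy as np

def smoothed_max(x, gamma):
    """Entropy-smoothed max: gamma * logsumexp(x / gamma).  Unlike max it is
    differentiable everywhere, and its gradient softmax(x / gamma) plays the
    role of a relaxed argmax."""
    m = x.max()
    return m + gamma * np.log(np.exp((x - m) / gamma).sum())

def smoothed_viterbi_value(theta, trans, gamma):
    """Chain DP with max replaced by smoothed_max.
    theta[t, j]: unary score of state j at step t; trans[i, j]: transition."""
    v = theta[0].copy()
    for t in range(1, len(theta)):
        v = theta[t] + np.array([smoothed_max(v + trans[:, j], gamma)
                                 for j in range(theta.shape[1])])
    return smoothed_max(v, gamma)

rng = np.random.default_rng(1)
theta, trans = rng.normal(size=(5, 3)), rng.normal(size=(3, 3))
soft = smoothed_viterbi_value(theta, trans, gamma=0.01)
# hard Viterbi value via the same recursion with an exact max
v = theta[0].copy()
for t in range(1, 5):
    v = theta[t] + np.array([(v + trans[:, j]).max() for j in range(3)])
hard = v.max()
print(hard <= soft <= hard + 0.1)  # smoothing upper-bounds and approaches max
```

As gamma shrinks the smoothed value converges to the hard optimum from above, while for any gamma > 0 the whole recursion stays differentiable and can sit inside a network trained by backpropagation.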
