
Normal numbers were introduced by Borel. Normality is certainly a weak notion of randomness; for instance, there are computable numbers which are absolutely normal. In the present paper, we introduce a relativization of normality to a fixed representation system. When we require normality with respect to large sets of such systems, we obtain variants of normality that imply randomness notions much stronger than absolute normality. The primary classes of numbers investigated in this paper are the supernormal numbers and the highly normal numbers, which we define. These are relativizations of normality that are robust to all reasonable changes of representation. Among other results, we prove that the highly normal numbers are exactly those of computable dimension 1, which we believe is a more natural characterization of this interesting class than those previously known.
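
For context, Borel's classical definition can be stated compactly (a standard formulation, using the base-$b$ digit expansion $x = 0.x_1 x_2 x_3 \ldots$; the paper's supernormal and highly normal numbers quantify over more general representation systems rather than bases alone):

```latex
% x is normal in base b if every digit block w of length k
% appears with its expected asymptotic frequency:
\[
\lim_{n \to \infty}
  \frac{\#\{\, i \le n : x_i x_{i+1} \cdots x_{i+k-1} = w \,\}}{n}
  = b^{-k}
\quad \text{for all } k \ge 1,\ w \in \{0, \dots, b-1\}^k,
\]
% and x is absolutely normal if it is normal in every base b >= 2.
```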

Related Content

Temporal data are increasingly prevalent in modern data science. A fundamental question is whether two time series are related. Existing approaches often have limitations, such as relying on parametric assumptions, detecting only linear associations, and requiring multiple tests and corrections. While many non-parametric and universally consistent dependence measures have recently been proposed, directly applying them to temporal data can inflate the p-value and result in an invalid test. To address these challenges, this paper introduces the temporal dependence statistic with block permutation to test independence between temporal data. Under proper assumptions, the proposed procedure is asymptotically valid and universally consistent for testing independence between stationary time series, and is capable of estimating the optimal dependence lag that maximizes the dependence. Notably, it is compatible with a rich family of distance- and kernel-based dependence measures, eliminates the need for multiple testing, and demonstrates superior power in multivariate, low-sample-size, and nonlinear settings. An analysis of neural connectivity with fMRI data reveals various temporal dependencies among signals within the visual network and the default mode network.
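
To make the block-permutation idea concrete, here is a minimal sketch. The names (`dcov_stat`, `block_permutation_pvalue`) and the fixed block size are our own illustrative choices, not the paper's implementation; distance covariance stands in for the rich family of admissible dependence measures:

```python
import numpy as np

def dcov_stat(x, y):
    """Biased sample distance covariance on 1-D series; any distance-
    or kernel-based dependence measure could be substituted here."""
    def centered(a):
        d = np.abs(a[:, None] - a[None, :])           # pairwise distances
        return d - d.mean(0) - d.mean(1)[:, None] + d.mean()
    return (centered(x) * centered(y)).mean()

def block_permutation_pvalue(x, y, block_size, n_perm=500, seed=0):
    """Permute y in contiguous blocks, so within-block temporal
    dependence is preserved under the null of independence."""
    rng = np.random.default_rng(seed)
    n = (len(x) // block_size) * block_size           # trim to full blocks
    x, y = x[:n], y[:n]
    blocks = np.arange(n).reshape(-1, block_size)
    observed = dcov_stat(x, y)
    hits = 0
    for _ in range(n_perm):
        y_perm = y[blocks[rng.permutation(len(blocks))].ravel()]
        hits += dcov_stat(x, y_perm) >= observed
    return (1 + hits) / (1 + n_perm)
```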

Due to their conceptual simplicity, k-means algorithm variants have been extensively used for unsupervised cluster analysis. However, one main shortcoming of these algorithms is that they essentially fit a mixture of identical spherical Gaussians to data that may vastly deviate from such a distribution. In comparison, general Gaussian Mixture Models (GMMs) can fit richer structures but require estimating a quadratic number of parameters per cluster to represent the covariance matrices. This poses two main issues: (i) the underlying optimization problems are challenging due to their larger number of local minima, and (ii) their solutions can overfit the data. In this work, we design search strategies that circumvent both issues. We develop more effective optimization algorithms for general GMMs, and we combine these algorithms with regularization strategies that avoid overfitting. Through extensive computational analyses, we observe that optimization or regularization in isolation does not substantially improve cluster recovery. However, combining these techniques permits a level of performance previously unattained by k-means algorithm variants, unraveling vastly different cluster structures. These results shed new light on the status quo between GMM and k-means methods and suggest the more frequent use of general GMMs for data exploration. To facilitate such applications, we provide open-source code as well as Julia packages (UnsupervisedClustering.jl and RegularizedCovarianceMatrices.jl) implementing the proposed techniques.
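
The underlying phenomenon can be seen in a toy example. The sketch below uses scikit-learn rather than the paper's Julia packages, and a simple ridge-style covariance regularizer (`reg_covar`) rather than the paper's regularization strategies; it only illustrates why full covariances help on elongated clusters:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
# Two elongated, correlated Gaussian clusters: far from the identical
# spherical components that k-means implicitly assumes.
cov = np.array([[5.0, 4.0], [4.0, 5.0]])
X = np.vstack([rng.multivariate_normal([0, 0], cov, 200),
               rng.multivariate_normal([6, 0], cov, 200)])
labels = np.repeat([0, 1], 200)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
# reg_covar adds a small ridge to each covariance diagonal -- a crude
# stand-in for the regularization strategies developed in the paper.
gmm = GaussianMixture(n_components=2, covariance_type="full",
                      reg_covar=1e-3, n_init=10, random_state=0).fit(X)

print("k-means ARI:", adjusted_rand_score(labels, km.labels_))
print("GMM ARI:    ", adjusted_rand_score(labels, gmm.predict(X)))
```

On data like this, the full-covariance GMM typically recovers the true partition substantially better than k-means.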

Whether embedding spaces use all their dimensions equally, i.e., whether they are isotropic, has been a recent subject of discussion. Evidence has been accrued both for and against enforcing isotropy in embedding spaces. In the present paper, we stress that isotropy imposes requirements on the embedding space that are not compatible with the presence of clusters -- which also negatively impacts linear classification objectives. We demonstrate this fact empirically and use it to shed light on previous results from the literature.
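
A small numerical illustration of this tension (a sketch; `isotropy_score` is a simple variance-ratio proxy, one of several isotropy measures in the literature, not necessarily the one used in the paper):

```python
import numpy as np

def isotropy_score(E):
    """Ratio of smallest to largest variance along principal axes
    (1.0 would mean all dimensions are used equally)."""
    E = E - E.mean(0)
    var = np.linalg.svd(E, compute_uv=False) ** 2
    return var.min() / var.max()

rng = np.random.default_rng(0)
n, d = 2000, 64
isotropic = rng.standard_normal((n, d))
# Clustered embeddings: points concentrated around a few centroids, so
# most variance lies in the low-dimensional span of the centroids.
centroids = rng.standard_normal((4, d)) * 10
clustered = centroids[rng.integers(0, 4, n)] + rng.standard_normal((n, d))

print("isotropic:", isotropy_score(isotropic))   # noticeably higher
print("clustered:", isotropy_score(clustered))   # near zero
```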

Influence Maximization (IM) is a crucial problem in data science. The goal is to find a fixed-size set of highly influential seed vertices on a network to maximize the influence spread along the edges. While IM is NP-hard on commonly used diffusion models, a greedy algorithm can achieve a $(1-1/e)$-approximation by repeatedly selecting the vertex with the highest marginal gain in influence as the next seed. Owing to this theoretical guarantee, a rich literature focuses on improving the performance of the greedy algorithm. To estimate the marginal gain, existing work either runs Monte Carlo (MC) simulations of influence spread or pre-stores hundreds of sketches (usually per-vertex information). However, these approaches can be inefficient in time (MC simulation) or space (storing sketches), preventing them from scaling to today's large-scale graphs. This paper significantly improves the scalability of IM using two key techniques. The first is a sketch-compression technique for the independent cascade model on undirected graphs, which allows combining the simulation and sketching approaches to achieve a time-space tradeoff. The second consists of new data structures for parallel seed selection. Using our new approaches, we implemented PaC-IM: Parallel and Compressed IM. We compare PaC-IM with state-of-the-art parallel IM systems on a 96-core machine with 1.5TB of memory. PaC-IM can process large-scale graphs with up to 900M vertices and 74B edges in about 2 hours. On average across all tested graphs, our uncompressed version is 5--18$\times$ faster and about 1.4$\times$ more space-efficient than existing parallel IM systems. Compression further saves 3.8$\times$ space with only 70% overhead in time on average.
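
For readers unfamiliar with the baseline being accelerated, here is a minimal sketch of MC-simulation-based greedy IM under the independent cascade model (illustrative names and parameters; PaC-IM's sketch compression and parallel data structures are not reproduced here):

```python
import random

def ic_spread(graph, seeds, p=0.1, n_sim=200, seed=0):
    """Monte Carlo estimate of expected spread under the independent
    cascade model: each newly activated vertex gets one chance to
    activate each neighbor, independently with probability p."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_sim):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            u = frontier.pop()
            for v in graph[u]:
                if v not in active and rng.random() < p:
                    active.add(v)
                    frontier.append(v)
        total += len(active)
    return total / n_sim

def greedy_im(graph, k, **sim_kw):
    """(1 - 1/e)-approximate greedy baseline: repeatedly add the vertex
    with the largest estimated marginal gain in spread."""
    seeds = []
    for _ in range(k):
        base = ic_spread(graph, seeds, **sim_kw) if seeds else 0.0
        gains = {v: ic_spread(graph, seeds + [v], **sim_kw) - base
                 for v in graph if v not in seeds}
        seeds.append(max(gains, key=gains.get))
    return seeds

# Tiny undirected graph as an adjacency dict.
g = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2, 4], 4: [3]}
print(greedy_im(g, k=2))
```

The inner loop makes the cost clear: every candidate evaluation reruns the simulations, which is exactly the time bottleneck the paper's sketch-compression technique targets.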

Higher-dimensional automata (HDAs) are models of non-interleaving concurrency for analyzing concurrent systems. There is a rich literature on bisimulations for concurrent systems, some of which have been extended to HDAs. However, no logical characterizations of these relations are currently available for HDAs. In this work, we address this gap by introducing Ipomset modal logic, a Hennessy-Milner-type logic over HDAs, and show that it characterizes Path-bisimulation, a variant of ST-bisimulation from the literature. We also define a notion of Cell-bisimulation, using the open-maps framework of Joyal, Nielsen, and Winskel, and establish the relationship between these bisimulations (and also their "strong" variants, which take restrictions into account). In our work, we rely on the new categorical definition of HDAs as presheaves over concurrency lists and on track objects.

While the field of continuous Entropic Optimal Transport (EOT) has been actively developing in recent years, it has become evident that the classic EOT problem is prone to several issues, such as sensitivity to outliers and to class imbalance between the source and target measures. This fact inspired the development of solvers for the unbalanced EOT (UEOT) problem, a generalization of EOT that mitigates these issues by relaxing the marginal constraints. Surprisingly, the existing solvers are either based on heuristic principles or are heavyweight, with complex optimization objectives involving several neural networks. We address this challenge and propose a novel, theoretically justified, and lightweight unbalanced EOT solver. Our advancement consists in developing a novel view on the optimization of the UEOT problem, yielding a tractable and non-minimax optimization objective. We show that, combined with a light parametrization recently proposed in the field, our objective leads to a fast, simple, and effective solver. It allows solving the continuous UEOT problem in minutes on a CPU. We provide illustrative examples of the performance of our solver.
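
For orientation, one common way the UEOT objective is written in the literature (our notation; the paper's exact formulation may differ):

```latex
% Hard marginal constraints of EOT are replaced by divergence
% penalties with weights \tau_1, \tau_2:
\[
\operatorname{UEOT}_{\varepsilon,\tau}(\mu,\nu)
  = \inf_{\pi \ge 0} \int c(x,y)\,\mathrm{d}\pi(x,y)
  + \varepsilon\,\mathrm{KL}(\pi \,\|\, \mu \otimes \nu)
  + \tau_1\,\mathrm{KL}(\pi_x \,\|\, \mu)
  + \tau_2\,\mathrm{KL}(\pi_y \,\|\, \nu),
\]
% where \pi_x, \pi_y are the marginals of the plan \pi; balanced EOT
% is recovered in the limit \tau_1, \tau_2 \to \infty.
```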

This is an expository note explaining how the geometric notions of local connectedness and properness are related to the $\Sigma$-type and $\Pi$-type constructors of dependent type theory.

Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in natural sciences. Today, AI has started to advance natural sciences by improving, accelerating, and enabling our understanding of natural phenomena at a wide range of spatial and temporal scales, giving rise to a new area of research known as AI for science (AI4Science). Being an emerging research paradigm, AI4Science is unique in that it is an enormous and highly interdisciplinary area. Thus, a unified and technical treatment of this field is needed yet challenging. This work aims to provide a technically thorough account of a subarea of AI4Science; namely, AI for quantum, atomistic, and continuum systems. These areas aim at understanding the physical world from the subatomic (wavefunctions and electron density), atomic (molecules, proteins, materials, and interactions), to macro (fluids, climate, and subsurface) scales and form an important subarea of AI4Science. A unique advantage of focusing on these areas is that they largely share a common set of challenges, thereby allowing a unified and foundational treatment. A key common challenge is how to capture physics first principles, especially symmetries, in natural systems by deep learning methods. We provide an in-depth yet intuitive account of techniques to achieve equivariance to symmetry transformations. We also discuss other common technical challenges, including explainability, out-of-distribution generalization, knowledge transfer with foundation and large language models, and uncertainty quantification. To facilitate learning and education, we provide categorized lists of resources that we found to be useful. We strive to be thorough and unified and hope this initial effort may trigger more community interests and efforts to further advance AI4Science.
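
The central symmetry notions the survey discusses, invariance and equivariance, can be checked numerically in a few lines. This is a self-contained sketch of the properties themselves with toy readouts, not of any particular equivariant architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def pairwise_energy(pos):
    # Invariant readout: built from interatomic distances only,
    # so rotating the input leaves the output unchanged.
    d = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
    iu = np.triu_indices(len(pos), k=1)
    return (1.0 / d[iu]).sum()

def center_of_mass(pos):
    # Equivariant readout: rotates along with the input.
    return pos.mean(axis=0)

pos = rng.standard_normal((10, 3))          # toy 3-D point cloud
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal map

assert np.isclose(pairwise_energy(pos @ Q), pairwise_energy(pos))
assert np.allclose(center_of_mass(pos @ Q), center_of_mass(pos) @ Q)
print("invariance and equivariance checks pass")
```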

Contrastive loss has been increasingly used in learning representations from multiple modalities. In the limit, the nature of the contrastive loss encourages modalities to exactly match each other in the latent space. Yet it remains an open question how modality alignment affects downstream task performance. In this paper, based on an information-theoretic argument, we first prove that exact modality alignment is sub-optimal in general for downstream prediction tasks. Hence we advocate that the key to better performance lies in meaningful latent modality structures rather than perfect modality alignment. To this end, we propose three general approaches to constructing latent modality structures. Specifically, we design 1) a deep feature separation loss for intra-modality regularization; 2) a Brownian-bridge loss for inter-modality regularization; and 3) a geometric consistency loss for both intra- and inter-modality regularization. Extensive experiments are conducted on two popular multi-modal representation learning frameworks: the CLIP-based two-tower model and the ALBEF-based fusion model. We test our method on a variety of tasks, including zero/few-shot image classification, image-text retrieval, visual question answering, visual reasoning, and visual entailment. Our method achieves consistent improvements over existing methods, demonstrating the effectiveness and generalizability of our proposed approach to latent modality structure regularization.
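
For reference, the symmetric InfoNCE loss that drives CLIP-style alignment is sketched below (a standard formulation, not the paper's contribution; the paper's structure-inducing regularizers would be added on top of a loss like this):

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired image/text embeddings.
    In the limit, it pulls matched pairs to coincide in the latent
    space -- the 'exact alignment' the paper argues is sub-optimal."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(len(img), device=img.device)
    # Matched pairs sit on the diagonal; classify in both directions.
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2
```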

AI in finance broadly refers to the application of AI techniques in financial businesses. This area has been developing for decades, with both classic and modern AI techniques applied to increasingly broad areas of finance, the economy, and society. In contrast to surveys that either discuss the problems, aspects, and opportunities of finance that have benefited from specific AI techniques, in particular some new-generation AI and data science (AIDS) areas, or review the progress of applying specific techniques to certain financial problems, this review offers a comprehensive and dense roadmap of the challenges, techniques, and opportunities of AI research in finance over the past decades. The landscape and challenges of financial businesses and data are first outlined, followed by a comprehensive categorization and a dense overview of the decades of AI research in finance. We then structure and illustrate the data-driven analytics and learning of financial businesses and data. Next, we compare, critique, and discuss classic versus modern AI techniques for finance. Lastly, we discuss open issues and opportunities for future AI-empowered finance and finance-motivated AI research.
