Strongly lensed quasars offer unique probes of the cosmic expansion rate, the dark matter profile within the foreground deflectors, and the quasar host galaxies. Unfortunately, identifying them in astronomical images is challenging because they are vastly outnumbered by non-lenses. To address this, we have developed a novel approach by ensembling cutting-edge convolutional networks (CNNs) -- for instance, ResNet, Inception, NASNet, MobileNet, EfficientNet, and RegNet -- along with vision transformers (ViTs) trained on realistic galaxy-quasar lens simulations based on the Hyper Suprime-Cam (HSC) multiband images. While each individual model performs remarkably well on the test dataset, achieving an area under the receiver operating characteristic curve of $>$97.3% and a median false positive rate of 3.6%, it struggles to generalize to real data, as indicated by the numerous spurious sources picked by each classifier. A significant improvement is achieved by averaging these CNNs and ViTs, which reduces the impurities by factors of up to 50. Subsequently, combining the HSC images with the UKIRT, VISTA, and unWISE data, we retrieve approximately 60 million sources as parent samples and reduce this to 892,609 after employing a photometry preselection to discover $z>1.5$ lensed quasars with Einstein radii of $\theta_\mathrm{E}<5$ arcsec. Afterward, the ensemble classifier indicates 3080 sources with a high probability of being lenses, which we visually inspect, yielding 210 prevailing candidates awaiting spectroscopic confirmation. These outcomes suggest that automated deep learning pipelines hold great potential for effectively detecting strong lenses in vast datasets with minimal manual visual inspection.
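The averaging step in this abstract can be illustrated with a minimal sketch. The source IDs, the per-model scores, and the 0.5 threshold are hypothetical values chosen for illustration; the abstract does not give the actual decision threshold or any per-source scores.

```python
from statistics import mean


def ensemble_probability(per_model_probs):
    """Average the lens probabilities that several classifiers assign to one source."""
    return mean(per_model_probs)


def select_candidates(scores_by_source, threshold=0.5):
    """Return the source IDs whose averaged probability exceeds the threshold."""
    return [
        source_id
        for source_id, probs in scores_by_source.items()
        if ensemble_probability(probs) > threshold
    ]


# Hypothetical scores from three classifiers for three sources.
scores = {
    "src_a": [0.95, 0.91, 0.97],  # all models agree: likely lens
    "src_b": [0.92, 0.08, 0.11],  # one model fires alone: likely spurious
    "src_c": [0.10, 0.05, 0.02],  # all models agree: non-lens
}
candidates = select_candidates(scores)
print(candidates)
```

A source that only one network flags, such as `src_b` here, is suppressed by the average. This is one plausible reading of why ensembling reduces spurious detections.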

Related content

Specialization · Transformation · Inference · Reducible · MoDELS

October 6, 2023

Exploiting Transformer Activation Sparsity with Dynamic Inference

Mikołaj Piórczyński, Filip Szatkowski, Klaudia Bałazy, Bartosz Wójcik

Transformer models, despite their impressive performance, often face practical limitations due to their high computational requirements. At the same time, previous studies have revealed significant activation sparsity in these models, indicating the presence of redundant computations. In this paper, we propose Dynamic Sparsified Transformer Inference (DSTI), a method that radically reduces the inference cost of Transformer models by enforcing activation sparsity and subsequently transforming a dense model into its sparse Mixture of Experts (MoE) version. We demonstrate that it is possible to train small gating networks that successfully predict the relative contribution of each expert during inference. Furthermore, we introduce a mechanism that dynamically determines the number of executed experts individually for each token. DSTI can be applied to any Transformer-based architecture and has negligible impact on the accuracy. For the BERT-base classification model, we reduce inference cost by almost 60%.
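A per-token choice of how many experts to run might look like the sketch below. The selection rule (keep every expert whose predicted contribution is at least a fraction `tau` of the largest one) and all names and numbers are assumptions for illustration; the abstract does not specify the rule that DSTI uses.

```python
def select_experts(predicted_contributions, tau=0.5):
    """Return the indices of experts whose predicted contribution is at least
    tau times the largest predicted contribution for this token."""
    cutoff = tau * max(predicted_contributions)
    return [i for i, c in enumerate(predicted_contributions) if c >= cutoff]


# Hypothetical gating-network outputs for two tokens over four experts.
easy_token = [0.90, 0.05, 0.02, 0.03]  # one dominant expert
hard_token = [0.40, 0.35, 0.30, 0.05]  # contributions spread across experts

easy_selection = select_experts(easy_token)
hard_selection = select_experts(hard_token)
print(len(easy_selection), len(hard_selection))
```

Under this rule, the number of executed experts varies from token to token, so tokens with one dominant expert cost less than tokens whose contributions are spread out.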

FFT · Operation · Processing (programming language) · Extensibility · Performer

October 6, 2023

A Structured Matrix Method for Nonequispaced Neural Operators

Levi Lingsch, Mike Michelis, Emmanuel de Bezenac, Sirani M. Perera, Robert K. Katzschmann, Siddhartha Mishra

from arxiv, 22 pages, 12 figures

The computational efficiency of many neural operators, widely used for learning solutions of PDEs, relies on the fast Fourier transform (FFT) for performing spectral computations. However, as FFT is limited to equispaced (rectangular) grids, this limits the efficiency of such neural operators when applied to problems where the input and output functions need to be processed on general non-equispaced point distributions. We address this issue by proposing a novel method that leverages batch matrix multiplications to efficiently construct Vandermonde-structured matrices and compute forward and inverse transforms, on arbitrarily distributed points. An efficient implementation of such structured matrix methods is coupled with existing neural operator models to allow the processing of data on arbitrary non-equispaced distributions of points. With extensive empirical evaluation, we demonstrate that the proposed method allows one to extend neural operators to very general point distributions with significant gains in training speed over baselines, while retaining or improving accuracy.
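The idea of replacing the FFT with a Vandermonde-structured matrix can be sketched in one dimension. This is a plain-Python illustration under simplifying assumptions, not the authors' batched implementation: the function names are hypothetical, and the uniform `1/N` weighting in `forward` is exact only on an equispaced grid.

```python
import cmath
import math


def vandermonde(points, modes):
    """Matrix V with V[k][j] = exp(-i * modes[k] * points[j])."""
    return [[cmath.exp(-1j * k * x) for x in points] for k in modes]


def forward(values, points, modes):
    """Fourier coefficients from samples at the given points in [0, 2*pi).
    Assumption: uniform 1/N weights, which are exact only for equispaced points."""
    V = vandermonde(points, modes)
    n = len(points)
    return [sum(v * f for v, f in zip(row, values)) / n for row in V]


def inverse(coeffs, points, modes):
    """Evaluate the truncated Fourier series at arbitrary points
    by applying the conjugate transpose of V."""
    V = vandermonde(points, modes)
    return [
        sum(c * V[k][j].conjugate() for k, c in enumerate(coeffs))
        for j in range(len(points))
    ]


modes = [-1, 0, 1]
grid = [2 * math.pi * j / 8 for j in range(8)]        # equispaced samples
coeffs = forward([math.cos(x) for x in grid], grid, modes)

scattered = [0.1, 0.7, 2.0, 3.3, 5.9]                 # non-equispaced points
values = inverse(coeffs, scattered, modes)
print([round(v.real, 6) for v in values])
```

Because the matrix is built directly from the point coordinates, the synthesis step works on any point set, which is the property the FFT lacks.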

Vectorization · SOFT · Modality · Linear · Projection

October 6, 2023

Vector Space Semantics for Lambek Calculus with Soft Subexponentials

Lachlan McPheat, Hadi Wazni, Mehrnoosh Sadrzadeh

from arxiv, arXiv:2111.11331v2 was intended to replace arXiv:2005.03074. now restoring arXiv:2111.11331v1

We develop a vector space semantics for Lambek Calculus with Soft Subexponentials, apply the calculus to construct compositional vector interpretations for parasitic gap noun phrases and discourse units with anaphora and ellipsis, and experiment with the constructions in a distributional sentence similarity task. As opposed to previous work, which used Lambek Calculus with a Relevant Modality, the calculus in this paper uses a bounded version of the modality and is decidable. The vector space semantics of this new modality allows us to meaningfully define contraction as projection and provide a linear theory behind what we could previously only achieve via nonlinear maps.

Graph · Pairwise · Statistic · Latent · Random field

October 5, 2023

A Probabilistic Graph Coupling View of Dimension Reduction

Hugues Van Assel, Thibault Espinasse, Julien Chiquet, Franck Picard

Most popular dimension reduction (DR) methods, such as t-SNE and UMAP, are based on minimizing a cost between input and latent pairwise similarities. Though widely used, these approaches lack the clear probabilistic foundations needed to fully understand their properties and limitations. To that end, we introduce a unifying statistical framework based on the coupling of hidden graphs using cross entropy. These graphs induce a Markov random field dependency structure among the observations in both input and latent spaces. We show that existing pairwise similarity DR methods can be retrieved from our framework with particular choices of priors for the graphs. Moreover, this reveals that these methods suffer from a statistical deficiency that explains their poor performance in conserving coarse-grain dependencies. Our model is leveraged and extended to address this issue, while new links are drawn with Laplacian eigenmaps and PCA.

FAST · Augmented Lagrangian method · Cost function · Discretization · Functional

October 4, 2023

Stability Improvements for Fast Matrix Multiplication

Charlotte Vermeylen, Marc Van Barel

We implement an Augmented Lagrangian method to minimize a constrained least-squares cost function designed to find polyadic decompositions of the matrix multiplication tensor. We use this method to obtain new discrete decompositions and parameter families of decompositions. Using these parametrizations, faster and more stable matrix multiplication algorithms can be discovered.
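The least-squares objective behind such decompositions can be made concrete with a small check. The sketch below builds the 2x2 matrix multiplication tensor and evaluates the unconstrained squared residual for Strassen's classical rank-7 decomposition. Strassen's factors are a well-known example and not a result of this paper; the constraints and the Augmented Lagrangian solver are not shown, and the function names are hypothetical.

```python
def matmul_tensor(n):
    """Tensor T with T[a][b][c] = 1 when entry a of A times entry b of B
    contributes to entry c of C = A @ B (row-major vectorization)."""
    size = n * n
    T = [[[0] * size for _ in range(size)] for _ in range(size)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                T[i * n + k][k * n + j][i * n + j] = 1
    return T


def residual(T, U, V, W):
    """Squared distance between T and the sum over r of U[r] (x) V[r] (x) W[r]."""
    size = len(T)
    total = 0
    for a in range(size):
        for b in range(size):
            for c in range(size):
                approx = sum(u[a] * v[b] * w[c] for u, v, w in zip(U, V, W))
                total += (T[a][b][c] - approx) ** 2
    return total


# Strassen's seven products; entries are ordered [11, 12, 21, 22].
U = [[1, 0, 0, 1], [0, 0, 1, 1], [1, 0, 0, 0], [0, 0, 0, 1],
     [1, 1, 0, 0], [-1, 0, 1, 0], [0, 1, 0, -1]]
V = [[1, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, -1], [-1, 0, 1, 0],
     [0, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
W = [[1, 0, 0, 1], [0, 0, 1, -1], [0, 1, 0, 1], [1, 0, 1, 0],
     [-1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0]]

T = matmul_tensor(2)
cost = residual(T, U, V, W)
print(cost)
```

A residual of zero means the seven rank-one terms reproduce the tensor exactly, so the factors define a valid algorithm that uses seven multiplications instead of eight.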

Episode · Generalization theory · GROOVE · Learning · Performer

October 4, 2023

Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design

Matthew Thomas Jackson, Minqi Jiang, Jack Parker-Holder, Risto Vuorio, Chris Lu, Gregory Farquhar, Shimon Whiteson, Jakob Nicolaus Foerster

from arxiv, Published at NeurIPS 2023

The past decade has seen vast progress in deep reinforcement learning (RL) on the back of algorithms manually designed by human researchers. Recently, it has been shown that it is possible to meta-learn update rules, with the hope of discovering algorithms that can perform well on a wide range of RL tasks. Despite impressive initial results from algorithms such as Learned Policy Gradient (LPG), there remains a generalization gap when these algorithms are applied to unseen environments. In this work, we examine how characteristics of the meta-training distribution impact the generalization performance of these algorithms. Motivated by this analysis and building on ideas from Unsupervised Environment Design (UED), we propose a novel approach for automatically generating curricula to maximize the regret of a meta-learned optimizer, in addition to a novel approximation of regret, which we name algorithmic regret (AR). The result is our method, General RL Optimizers Obtained Via Environment Design (GROOVE). In a series of experiments, we show that GROOVE achieves superior generalization to LPG, and evaluate AR against baseline metrics from UED, identifying it as a critical component of environment design in this setting. We believe this approach is a step towards the discovery of truly general RL algorithms, capable of solving a wide range of real-world environments.

MoDELS · Training data · Identifiable · Score matching · state-of-the-art

October 4, 2023

On Memorization in Diffusion Models

Xiangming Gu, Chao Du, Tianyu Pang, Chongxuan Li, Min Lin, Ye Wang

Due to their capacity to generate novel and high-quality samples, diffusion models have attracted significant research interest in recent years. Notably, the typical training objective of diffusion models, i.e., denoising score matching, has a closed-form optimal solution that can only generate samples replicating the training data. This indicates that memorization behavior is theoretically expected, which contradicts the common generalization ability of state-of-the-art diffusion models and thus calls for a deeper understanding. Looking into this, we first observe that memorization behaviors tend to occur on smaller-sized datasets, which motivates our definition of effective model memorization (EMM), a metric measuring the maximum size of training data at which a learned diffusion model approximates its theoretical optimum. Then, we quantify the impact of the influential factors on these memorization behaviors in terms of EMM, focusing primarily on data distribution, model configuration, and training procedure. Besides comprehensive empirical results identifying the influential factors, we surprisingly find that conditioning training data on uninformative random labels can significantly trigger memorization in diffusion models. Our study holds practical significance for diffusion model users and offers clues to theoretical research in deep generative models. Code is available at https://github.com/sail-sg/DiffMemorize.
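The closed-form optimum mentioned in this abstract can be illustrated with the standard posterior-mean formula for a Gaussian forward process over a finite training set. This is a textbook form written under the assumption that $x_t = \alpha x_0 + \sigma \epsilon$, not code from the linked repository; the function name and the toy data are hypothetical.

```python
import math


def optimal_denoiser(x_t, training_data, alpha, sigma):
    """Posterior mean E[x_0 | x_t] under the empirical training distribution,
    assuming the forward process x_t = alpha * x_0 + sigma * noise."""
    log_weights = [
        -sum((xt - alpha * xi) ** 2 for xt, xi in zip(x_t, x)) / (2 * sigma ** 2)
        for x in training_data
    ]
    top = max(log_weights)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - top) for lw in log_weights]
    norm = sum(weights)
    return [
        sum(w * x[d] for w, x in zip(weights, training_data)) / norm
        for d in range(len(x_t))
    ]


# Hypothetical 2-D training set and a noisy query near the second point.
train = [[0.0, 0.0], [1.0, 1.0], [4.0, -2.0]]
estimate = optimal_denoiser([0.9, 1.2], train, alpha=1.0, sigma=0.1)
print(estimate)
```

The output is a softmax-weighted average of training points, and at low noise the weights concentrate on the nearest one. This is why the exact optimum can only reproduce training samples.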

MoDELS · MOS · Extensibility · Label space · Performer

October 4, 2023

A Foundation Model for General Moving Object Segmentation in Medical Images

Zhongnuo Yan, Tong Han, Yuhao Huang, Lian Liu, Han Zhou, Jiongquan Chen, Wenlong Shi, Yan Cao, Xin Yang, Dong Ni

from arxiv, 6 pages, 8 figures, 3 tables

Medical image segmentation aims to delineate the anatomical or pathological structures of interest, playing a crucial role in clinical diagnosis. A substantial amount of high-quality annotated data is crucial for constructing high-precision deep segmentation models. However, medical annotation is highly cumbersome and time-consuming, especially for medical videos or 3D volumes, due to the huge labeling space and poor inter-frame consistency. Recently, a fundamental task named Moving Object Segmentation (MOS) has made significant advancements in natural images. Its objective is to delineate moving objects from the background within image sequences, requiring only minimal annotations. In this paper, we propose the first foundation model, named iMOS, for MOS in medical images. Extensive experiments on a large multi-modal medical dataset validate the effectiveness of the proposed iMOS. Specifically, with the annotation of only a small number of images in the sequence, iMOS can achieve satisfactory tracking and segmentation performance of moving objects throughout the entire sequence in both directions. We hope that the proposed iMOS can help speed up expert annotation and boost the development of medical foundation models.

Machine Learning · Learning · MoDELS · Model complexity · Robustness

October 4, 2023

On Computational Entanglement and Its Interpretation in Adversarial Machine Learning

YenLung Lai, Xingbo Dong, Zhe Jin

Adversarial examples in machine learning have emerged as a focal point of research due to their remarkable ability to deceive models with seemingly inconspicuous input perturbations, potentially resulting in severe consequences. In this study, we embark on a comprehensive exploration of adversarial machine learning models, shedding light on their intrinsic complexity and interpretability. Our investigation reveals intriguing links between machine learning model complexity and Einstein's theory of special relativity, through the concept of entanglement. More specifically, we define entanglement computationally and demonstrate that distant feature samples can exhibit strong correlations, akin to entanglement in the quantum realm. This revelation challenges conventional perspectives in describing the phenomenon of adversarial transferability observed in contemporary machine learning models. By drawing parallels with the relativistic effects of time dilation and length contraction during computation, we gain deeper insights into adversarial machine learning, paving the way for more robust and interpretable models in this rapidly evolving field.

Estimation/Estimator · Biased · Machine Learning · Variance · Learning

October 1, 2023

Stable Estimation of Survival Causal Effects

Khiem Pham, David A. Hirshberg, Phuong-Mai Huynh-Pham, Michele Santacatterina, Ser-Nam Lim, Ramin Zabih

from arxiv, 32 pages, 5 figures

We study the problem of estimating survival causal effects, where the aim is to characterize the impact of an intervention on survival times, i.e., how long it takes for an event to occur. Applications include determining if a drug reduces the time to ICU discharge or if an advertising campaign increases customer dwell time. Historically, the most popular estimates have been based on parametric or semiparametric (e.g., proportional hazards) models; however, these methods suffer from problematic levels of bias. Recently, debiased machine learning approaches have become increasingly popular, especially in applications to large datasets. However, despite their appealing theoretical properties, these estimators tend to be unstable because the debiasing step involves the use of the inverses of small estimated probabilities -- small errors in the estimated probabilities can result in huge changes in their inverses and therefore in the resulting estimator. This problem is exacerbated in survival settings, where probabilities are a product of treatment assignment and censoring probabilities. We propose a covariate balancing approach to estimating these inverses directly, sidestepping this problem. The result is an estimator that is stable in practice and enjoys many of the same theoretical properties. In particular, under overlap and asymptotic equicontinuity conditions, our estimator is asymptotically normal with negligible bias and optimal variance. Our experiments on synthetic and semi-synthetic data demonstrate that our method has competitive bias and smaller variance than debiased machine learning approaches.
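The instability described in this abstract is easy to reproduce with simple arithmetic. In the sketch below, all probabilities are hypothetical, and the same absolute error of 0.02 is applied to each estimated probability in two scenarios. It only illustrates the problem with plug-in inverse weights, not the covariate balancing estimator the authors propose.

```python
def inverse_weight(p_treatment, p_uncensored):
    """Inverse of the product of treatment-assignment and censoring probabilities."""
    return 1.0 / (p_treatment * p_uncensored)


# Small probabilities (poor overlap): the weight is about 200, its estimate about 417.
true_small = inverse_weight(0.05, 0.10)
estimated_small = inverse_weight(0.03, 0.08)
ratio_small = estimated_small / true_small

# Larger probabilities (good overlap): the weight is about 3.3, its estimate about 3.6.
true_large = inverse_weight(0.50, 0.60)
estimated_large = inverse_weight(0.48, 0.58)
ratio_large = estimated_large / true_large

print(round(ratio_small, 2), round(ratio_large, 2))
```

The same estimation error more than doubles the weight when the probabilities are small but changes it by under 10% when they are large. Taking a product of two probabilities pushes the denominator toward zero and makes this worse.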