
We provide a compact data structure for representing polyominoes that supports neighborhood and visibility queries. Neighborhood queries report the cells adjacent to a given cell, and visibility queries determine whether a straight line connecting two specified cells can be drawn within the polyomino. For an arbitrarily small $\epsilon >0$, our data structure can encode a polyomino with $n$ cells in $(3+\epsilon)n + o(n)$ bits while supporting all queries in constant time. The space complexity can be improved to $3n+o(n)$ bits, while supporting neighborhood queries in $\mathcal{O}(1)$ time and visibility queries in $\mathcal{O}(t(n))$ time for any $t(n) \in \omega(1)$. Previous work on enumerating polyominoes indicates that at least $2.00091n - o(n)$ bits are required to differentiate between distinct polyominoes, which shows our data structure is compact. In addition, we introduce a succinct data structure tailored for bar graphs, a specific subclass of polyominoes resembling histograms. We demonstrate that a bar graph comprising $n$ cells can be encoded using only $n + o(n)$ bits, enabling constant-time query processing. Meanwhile, $n-1$ bits are necessary to represent any bar graph, proving our data structure is succinct.
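The $n-1$ bit figure for bar graphs can be made concrete with a small sketch. It assumes a bar graph is a left-to-right sequence of bottom-aligned columns with positive heights, so bar graphs with $n$ cells correspond to the $2^{n-1}$ compositions of $n$. The function names are hypothetical, and this is not necessarily the encoding used in the paper: it omits the $o(n)$-bit auxiliary structures, so the neighborhood query here runs on the decoded heights rather than in constant time on the bit string.

```python
def encode_bar_graph(heights):
    """Encode column heights (each >= 1) as n - 1 bits, where n = sum(heights).

    Cells are listed column by column, bottom to top. Bit i is 1 if cell i + 1
    starts a new column and 0 if it is stacked on top of cell i.
    """
    if not heights or any(h < 1 for h in heights):
        raise ValueError("heights must be a non-empty list of positive integers")
    bits = []
    for h in heights:
        bits.extend([0] * (h - 1))  # cells stacked within this column
        bits.append(1)              # boundary to the next column
    bits.pop()                      # the last column has no successor
    return bits


def decode_bar_graph(bits):
    """Recover the column heights from the n - 1 boundary bits."""
    heights = [1]
    for b in bits:
        if b:
            heights.append(1)   # start a new column
        else:
            heights[-1] += 1    # grow the current column
    return heights


def neighbors(heights, col, row):
    """Report the cells adjacent to (col, row); rows are counted from 0 at the bottom."""
    if not (0 <= col < len(heights) and 0 <= row < heights[col]):
        raise ValueError("cell is not part of the bar graph")
    result = []
    if row + 1 < heights[col]:
        result.append((col, row + 1))   # cell above
    if row > 0:
        result.append((col, row - 1))   # cell below
    if col > 0 and row < heights[col - 1]:
        result.append((col - 1, row))   # cell to the left
    if col + 1 < len(heights) and row < heights[col + 1]:
        result.append((col + 1, row))   # cell to the right
    return result


bits = encode_bar_graph([3, 1, 2])  # 6 cells, so 5 bits
```

Encoding and then decoding returns the original heights, which is what makes the bit string a complete representation of the bar graph.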

Related content

Generalization theory · Examples · Learning · Parse · Few-shot learning

January 18, 2024

Compositional Program Generation for Few-Shot Systematic Generalization

Tim Klinger,Luke Liu,Soham Dan,Maxwell Crouse,Parikshit Ram,Alexander Gray

from arxiv, 7 pages of text with 1 page of references

Compositional generalization is a key ability of humans that enables us to learn new concepts from only a handful of examples. Neural machine learning models, including the now ubiquitous Transformers, struggle to generalize in this way, and typically require thousands of examples of a concept during training in order to generalize meaningfully. This difference in ability between humans and artificial neural architectures motivates this study on a neuro-symbolic architecture called the Compositional Program Generator (CPG). CPG has three key features: \textit{modularity}, \textit{composition}, and \textit{abstraction}, in the form of grammar rules, that enable it to generalize both systematically to new concepts in a few-shot manner, as well as productively by length on various sequence-to-sequence language tasks. For each input, CPG uses a grammar of the input language and a parser to generate a parse in which each grammar rule is assigned its own unique semantic module, a probabilistic copy or substitution program. Instances with the same parse are always processed with the same composed modules, while those with different parses may be processed with different modules. CPG learns parameters for the modules and is able to learn the semantics for new rules and types incrementally, without forgetting or retraining on rules it has already seen. It achieves perfect generalization on both the SCAN and COGS benchmarks using just 14 examples for SCAN and 22 examples for COGS -- state-of-the-art accuracy with a 1000x improvement in sample efficiency.

Analysis · Optimizer · Mutually independent · Motivation · Performer

January 18, 2024

Multiobjective Optimization Analysis for Finding Infrastructure-as-Code Deployment Configurations

Eneko Osaba,Josu Diaz-de-Arcaya,Juncal Alonso,Jesus L. Lobo,Gorka Benguria,Iñaki Etxaniz

from arxiv, 9 pages, 1 figure, 4 tables. Paper presented in the 11th International Conference on Computer and Communications Management (ICCCM 2023)

Multiobjective optimization is a hot topic in the artificial intelligence and operations research communities. The design and development of multiobjective methods is a frequent task for researchers and practitioners. As a result of this vibrant activity, a myriad of techniques have been proposed in the literature to date, demonstrating significant effectiveness in dealing with situations from a wide range of real-world areas. This paper focuses on a multiobjective problem related to optimizing Infrastructure-as-Code deployment configurations. The system implemented for solving this problem has been named the IaC Optimizer Platform (IOP). Although a prototypical version of the IOP has been introduced in the literature before, a deeper analysis focused on the resolution of the problem is needed in order to determine which multiobjective method is the most appropriate for embedding in the IOP. The main motivation behind the analysis conducted in this work is to enhance the IOP performance as much as possible. This is a crucial aspect of this system, given that it will be deployed in a real environment, as it is being developed as part of an H2020 European project. Going deeper, we resort in this paper to nine different evolutionary computation-based multiobjective algorithms. For assessing the quality of the considered solvers, 12 different problem instances have been generated based on real-world settings. Results obtained by each method after 10 independent runs have been compared using Friedman's non-parametric tests. The findings from these tests led to the creation of a multi-algorithm system capable of applying different techniques according to the user's needs.
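As an illustration of the comparison step, the following is a minimal sketch of the Friedman test statistic. It assumes one score per algorithm per problem instance (for example, a quality indicator averaged over the independent runs) and that lower scores are better. The function names and the numbers in the example are made up for illustration and are not taken from the paper. The sketch applies no tie correction and does not compute a p-value, which would normally come from a chi-square approximation with $k-1$ degrees of freedom for $k$ algorithms.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """Friedman chi-square statistic without tie correction.

    scores[i][j] is the score of algorithm j on problem instance i.
    """
    n = len(scores)        # number of problem instances (blocks)
    k = len(scores[0])     # number of algorithms (treatments)
    rank_sums = [0.0] * k
    for row in scores:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Made-up scores: 3 problem instances, 3 algorithms, identical ordering on every instance.
consistent = [[0.1, 0.2, 0.3], [0.2, 0.5, 0.9], [1.0, 2.0, 3.0]]
# Made-up scores where every algorithm ends up with the same rank sum.
balanced = [[1.0, 2.0, 3.0], [2.0, 3.0, 1.0], [3.0, 1.0, 2.0]]
```

With a perfectly consistent ordering the statistic reaches its maximum $n(k-1)$, and with balanced rank sums it is zero, so larger values indicate stronger evidence that the algorithms differ.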

Partition function · Functional · Mixing · Partition · PIN

January 17, 2024

From Zero-Freeness to Strong Spatial Mixing via a Christoffel-Darboux Type Identity

Shuai Shao,Xiaowei Ye

We derive the strong spatial mixing property for the general 2-spin system from zero-free regions of its partition function. We view the partition function of the 2-spin system as a multivariate function over three complex parameters $(\beta, \gamma, \lambda)$, and we allow the zero-free regions of $\beta, \gamma$ or $\lambda$ to be of arbitrary shapes. As long as the zero-free region contains a positive point and it is a complex neighborhood of $\lambda=0$ when fixing $\beta, \gamma \in \mathbb{C}$, or a complex neighborhood of $\beta\gamma=1$ when fixing $\beta, \lambda\in \mathbb{C}$ or $\gamma, \lambda\in \mathbb{C}$ respectively, we are able to show that the corresponding 2-spin system exhibits strong spatial mixing on such a region. The underlying graphs of the 2-spin system are not necessarily of bounded degree, but are required to include graphs with pinned vertices. We prove this result by establishing a Christoffel-Darboux type identity for the 2-spin system on trees and using certain tools from complex analysis. To the best of our knowledge, our result is general enough to turn all currently known zero-free regions of the partition function of the 2-spin system where pinned vertices are allowed into the strong spatial mixing property. Moreover, we extend our result to obtain strong spatial mixing for the ferromagnetic Ising model (even with non-uniform external fields) from the celebrated Lee-Yang circle theorem.

Adversarial examples · Samples · Loss · Controller · Surrogate loss

January 17, 2024

Diffusion-Based Adversarial Sample Generation for Improved Stealthiness and Controllability

Haotian Xue,Alexandre Araujo,Bin Hu,Yongxin Chen

from arxiv, Accepted as a conference paper in NeurIPS'2023. Code repo: //github.com/xavihart/Diff-PGD

Neural networks are known to be susceptible to adversarial samples: small variations of natural examples crafted to deliberately mislead the models. While they can be easily generated using gradient-based techniques in digital and physical scenarios, they often differ greatly from the actual data distribution of natural images, resulting in a trade-off between strength and stealthiness. In this paper, we propose a novel framework dubbed Diffusion-Based Projected Gradient Descent (Diff-PGD) for generating realistic adversarial samples. By exploiting a gradient guided by a diffusion model, Diff-PGD ensures that adversarial samples remain close to the original data distribution while maintaining their effectiveness. Moreover, our framework can be easily customized for specific tasks such as digital attacks, physical-world attacks, and style-based attacks. Compared with existing methods for generating natural-style adversarial samples, our framework enables the separation of optimizing adversarial loss from other surrogate losses (e.g., content/smoothness/style loss), making it more stable and controllable. Finally, we demonstrate that the samples generated using Diff-PGD have better transferability and anti-purification power than traditional gradient-based methods. Code will be released in //github.com/xavihart/Diff-PGD
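For context on the "traditional gradient-based methods" used as a baseline, here is a minimal sketch of plain projected gradient descent under an $L_\infty$ budget. It is not Diff-PGD: there is no diffusion model, and the toy linear classifier, the logistic loss, and all function and parameter names (`pgd_attack`, `eps`, `alpha`, `steps`) are assumptions chosen only to keep the example self-contained.

```python
import math


def score(w, b, x):
    """Linear score w . x + b of a toy binary classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def logistic_loss(w, b, x, y):
    """Logistic loss log(1 + exp(-y * score)) for a label y in {-1, +1}."""
    return math.log1p(math.exp(-y * score(w, b, x)))


def loss_gradient(w, b, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    coeff = -y / (1.0 + math.exp(y * score(w, b, x)))
    return [coeff * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def pgd_attack(w, b, x, y, eps, alpha, steps):
    """Plain PGD: signed gradient ascent on the loss, projected onto the L-infinity ball."""
    x_adv = list(x)
    for _ in range(steps):
        grad = loss_gradient(w, b, x_adv, y)
        stepped = [xi + alpha * sign(gi) for xi, gi in zip(x_adv, grad)]
        # Projection: clip each coordinate back into [x - eps, x + eps].
        x_adv = [min(max(si, x0 - eps), x0 + eps) for si, x0 in zip(stepped, x)]
    return x_adv


# Toy example with made-up numbers.
w, b = [1.0, -2.0], 0.0
x, y = [0.5, -0.5], 1
x_adv = pgd_attack(w, b, x, y, eps=0.1, alpha=0.03, steps=10)
```

The projection step keeps the perturbation within the budget `eps` but does nothing to keep the sample close to the natural data distribution, which is the gap the abstract says the diffusion-guided gradient is meant to address.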

Networking · Equivariance · Neural Networks · Networks · Invariance

January 17, 2024

A Characterization Theorem for Equivariant Networks with Point-wise Activations

Marco Pacini,Xiaowen Dong,Bruno Lepri,Gabriele Santin

from arxiv, Accepted at the 12th International Conference on Learning Representations (ICLR 2024)

Equivariant neural networks have shown improved performance, expressiveness and sample complexity on symmetrical domains. But for some specific symmetries, representations, and choices of coordinates, the most common point-wise activations, such as ReLU, are not equivariant, hence they cannot be employed in the design of equivariant neural networks. The theorem we present in this paper describes all possible combinations of finite-dimensional representations, choices of coordinates and point-wise activations that yield an exactly equivariant layer, generalizing and strengthening existing characterizations. Notable cases of practical relevance are discussed as corollaries. Indeed, we prove that rotation-equivariant networks can only be invariant, as happens for any network that is equivariant with respect to connected compact groups. Then, we discuss implications of our findings when applied to important instances of exactly equivariant networks. First, we completely characterize permutation equivariant networks such as Invariant Graph Networks with point-wise nonlinearities and their geometric counterparts, highlighting a plethora of models whose expressive power and performance are still unknown. Second, we show that feature spaces of disentangled steerable convolutional neural networks are trivial representations.
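The claim that point-wise ReLU is equivariant for some representations and not for others can be checked numerically. The sketch below is a toy check in two dimensions, not the paper's theorem: it compares applying the group action before and after ReLU, once for a coordinate permutation and once for a rotation in standard coordinates. All names are hypothetical.

```python
import math


def relu(v):
    """Point-wise ReLU applied to each coordinate."""
    return [max(0.0, x) for x in v]


def permute(v, perm):
    """Act on v by a coordinate permutation."""
    return [v[i] for i in perm]


def rotate2d(v, theta):
    """Act on a 2D vector by a rotation of angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1]]


def close(a, b, tol=1e-9):
    return all(abs(x - y) <= tol for x, y in zip(a, b))


x = [1.0, -2.0]

# Permutation action: ReLU commutes with swapping the two coordinates.
perm = [1, 0]
perm_equivariant = close(relu(permute(x, perm)), permute(relu(x), perm))

# Rotation action: ReLU does not commute with a 90-degree rotation.
theta = math.pi / 2
rot_equivariant = close(relu(rotate2d(x, theta)), rotate2d(relu(x), theta))
```

A single counterexample is enough to show that ReLU fails to be equivariant for the rotation action, while the permutation check passing on one input is only consistent with equivariance and does not prove it.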

Med-PaLM 2 · Performer · Language modeling · MoDELS · Question answering

May 16, 2023

Towards Expert-Level Medical Question Answering with Large Language Models

Karan Singhal,Tao Tu,Juraj Gottweis,Rory Sayres,Ellery Wulczyn,Le Hou,Kevin Clark,Stephen Pfohl,Heather Cole-Lewis,Darlene Neal,Mike Schaekermann,Amy Wang,Mohamed Amin,Sami Lachgar,Philip Mansfield,Sushant Prakash,Bradley Green,Ewa Dominowska,Blaise Aguera y Arcas,Nenad Tomasev,Yun Liu,Renee Wong,Christopher Semturs,S. Sara Mahdavi,Joelle Barral,Dale Webster,Greg S. Corrado,Yossi Matias,Shekoofeh Azizi,Alan Karthikesalingam,Vivek Natarajan

Recent artificial intelligence (AI) systems have reached milestones in "grand challenges" ranging from Go to protein-folding. The capability to retrieve medical knowledge, reason over it, and answer medical questions comparably to physicians has long been viewed as one such grand challenge. Large language models (LLMs) have catalyzed significant progress in medical question answering; Med-PaLM was the first model to exceed a "passing" score in US Medical Licensing Examination (USMLE) style questions with a score of 67.2% on the MedQA dataset. However, this and other prior work suggested significant room for improvement, especially when models' answers were compared to clinicians' answers. Here we present Med-PaLM 2, which bridges these gaps by leveraging a combination of base LLM improvements (PaLM 2), medical domain finetuning, and prompting strategies including a novel ensemble refinement approach. Med-PaLM 2 scored up to 86.5% on the MedQA dataset, improving upon Med-PaLM by over 19% and setting a new state-of-the-art. We also observed performance approaching or exceeding state-of-the-art across MedMCQA, PubMedQA, and MMLU clinical topics datasets. We performed detailed human evaluations on long-form questions along multiple axes relevant to clinical applications. In pairwise comparative ranking of 1066 consumer medical questions, physicians preferred Med-PaLM 2 answers to those produced by physicians on eight of nine axes pertaining to clinical utility (p < 0.001). We also observed significant improvements compared to Med-PaLM on every evaluation axis (p < 0.001) on newly introduced datasets of 240 long-form "adversarial" questions to probe LLM limitations. While further studies are necessary to validate the efficacy of these models in real-world settings, these results highlight rapid progress towards physician-level performance in medical question answering.

Generalization theory · Vision · Domain shift · Object recognition · Person re-identification

July 18, 2021

Domain Generalization in Vision: A Survey

Kaiyang Zhou,Ziwei Liu,Yu Qiao,Tao Xiang,Chen Change Loy

from arxiv, v4: includes the word "vision" in the title; improves the organization and clarity in Section 2-3; adds future directions; and more

Generalization to out-of-distribution (OOD) data is a capability natural to humans yet challenging for machines to reproduce. This is because most learning algorithms strongly rely on the i.i.d. assumption on source/target data, which is often violated in practice due to domain shift. Domain generalization (DG) aims to achieve OOD generalization by using only source data for model learning. Since it was first introduced in 2011, research in DG has made great progress. In particular, intensive research on this topic has led to a broad spectrum of methodologies, e.g., those based on domain alignment, meta-learning, data augmentation, or ensemble learning, just to name a few; and has covered various vision applications such as object recognition, segmentation, action recognition, and person re-identification. In this paper, for the first time a comprehensive literature review is provided to summarize the developments in DG for computer vision over the past decade. Specifically, we first cover the background by formally defining DG and relating it to other research fields like domain adaptation and transfer learning. Second, we conduct a thorough review of existing methods and present a categorization based on their methodologies and motivations. Finally, we conclude this survey with insights and discussions on future research directions.

Distillation · MoDELS · Learning · Student-Teacher · Vision

June 17, 2021

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

Lin Wang,Kuk-Jin Yoon

from arxiv, Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021. Some references are updated in this version

Deep neural models in recent years have been successful in almost every field, including extremely complex problem statements. However, these models are huge in size, with millions (and even billions) of parameters, thus demanding heavier computation power and failing to be deployed on edge devices. Besides, the performance boost is highly dependent on redundant labeled data. To achieve faster speeds and to handle the problems caused by the lack of data, knowledge distillation (KD) has been proposed to transfer information learned from one model to another. KD is often characterized by the so-called `Student-Teacher' (S-T) learning framework and has been broadly applied in model compression and knowledge transfer. This paper is about KD and S-T learning, which have been actively studied in recent years. First, we aim to provide explanations of what KD is and how/why it works. Then, we provide a comprehensive survey of the recent progress of KD methods together with S-T frameworks, typically for vision tasks. In general, we consider some fundamental questions that have been driving this research area and thoroughly summarize the research progress and technical details. Additionally, we systematically analyze the research status of KD in vision applications. Finally, we discuss the potential and open challenges of existing methods and outline future directions of KD and S-T learning.

Generalization theory · Extensibility · State-of-the-art · Test data · Learning

April 16, 2021

Deep Stable Learning for Out-Of-Distribution Generalization

Xingxuan Zhang,Peng Cui,Renzhe Xu,Linjun Zhou,Yue He,Zheyan Shen

Approaches based on deep neural networks have achieved striking performance when testing data and training data share a similar distribution, but can significantly fail otherwise. Therefore, eliminating the impact of distribution shifts between training and testing data is crucial for building performance-promising deep models. Conventional methods assume either the known heterogeneity of training data (e.g. domain labels) or the approximately equal capacities of different domains. In this paper, we consider a more challenging case where neither of the above assumptions holds. We propose to address this problem by removing the dependencies between features via learning weights for training samples, which helps deep models get rid of spurious correlations and, in turn, concentrate more on the true connection between discriminative features and labels. Through extensive experiments on distribution generalization benchmarks including PACS, VLCS, MNIST-M, and NICO, we show the effectiveness of our method compared with state-of-the-art counterparts.

Hyperparameters · Performer · Weight · Ensemble · Robustness

June 24, 2020

Hyperparameter Ensembles for Robustness and Uncertainty Quantification

Florian Wenzel,Jasper Snoek,Dustin Tran,Rodolphe Jenatton

Ensembles over neural network weights trained from different random initializations, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter efficient. In this paper, we design ensembles not only over weights, but over hyperparameters to improve the state of the art in both settings. For best performance independent of budget, we propose hyper-deep ensembles, a simple procedure that involves a random search over different hyperparameters, themselves stratified across multiple random initializations. Its strong performance highlights the benefit of combining models with both weight and hyperparameter diversity. We further propose a parameter-efficient version, hyper-batch ensembles, which builds on the layer structure of batch ensembles and self-tuning networks. The computational and memory costs of our method are notably lower than those of typical ensembles. On image classification tasks, with MLP, LeNet, and Wide ResNet 28-10 architectures, our methodology improves upon both deep and batch ensembles.