In this work, we propose an ensemble modeling approach for multimodal action recognition. We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO [21] dataset. Based on the underlying principle of focal loss, which captures the relationship between tail (scarce) classes and their prediction difficulties, we propose an exponentially decaying variant of focal loss for our current task. It initially emphasizes learning from the hard misclassified examples and gradually adapts to the entire range of examples in the dataset. This annealing process encourages the model to strike a balance between focusing on the sparse set of hard samples, while still leveraging the information provided by the easier ones. Additionally, we opt for the late fusion strategy to combine the resultant probability distributions from RGB and Depth modalities for final action prediction. Experimental evaluations on the MECCANO dataset demonstrate the effectiveness of our approach.
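The annealing and late-fusion ideas above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the exponential schedule for the focusing parameter, the names `annealed_gamma` and `late_fusion`, and the values `gamma0=2.0`, `decay=0.1`, and `w_rgb=0.5` are all assumptions made for the example.

```python
import math

def focal_loss(probs, target, gamma):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t)."""
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def annealed_gamma(epoch, gamma0=2.0, decay=0.1):
    """Hypothetical schedule: gamma decays exponentially toward 0,
    at which point the focal loss reduces to plain cross-entropy."""
    return gamma0 * math.exp(-decay * epoch)

def late_fusion(p_rgb, p_depth, w_rgb=0.5):
    """Weighted average of two per-class probability vectors (assumed fusion rule)."""
    return [w_rgb * a + (1.0 - w_rgb) * b for a, b in zip(p_rgb, p_depth)]

# An easy example (true class predicted at 0.9) and a hard one (0.2).
easy = [0.9, 0.05, 0.05]
hard = [0.2, 0.7, 0.1]
for epoch in (0, 10, 50):
    g = annealed_gamma(epoch)
    # How much more the hard example weighs than the easy one at this epoch.
    ratio = focal_loss(hard, 0, g) / focal_loss(easy, 0, g)
    print(epoch, round(g, 3), round(ratio, 2))

# Fuse hypothetical RGB and Depth predictions and pick the top class.
fused = late_fusion([0.6, 0.3, 0.1], [0.2, 0.7, 0.1])
prediction = max(range(len(fused)), key=fused.__getitem__)
```

Early in training the hard example dominates the loss. As gamma shrinks, the ratio falls toward the plain cross-entropy ratio, so the easy examples regain influence.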

Related content

RetinaNet

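RetinaNet is a 2018 contribution to object detection from the Facebook AI team. Its author list prominently includes Ross Girshick and Kaiming He. Sun Jian's team at Microsoft and the Ross/Kaiming team now at Facebook hold a singular standing in visual object classification and detection, comparable to that of the famed rivals "Qiao Feng of the North, Murong of the South". Papers from these two labs often serve as signposts for where the field is heading. RetinaNet merely combines the existing FPN and FCN networks, so its detection framework offers no particularly striking innovation. The paper's main contribution is the focal loss and its successful application in the one-stage detector RetinaNet (in essence ResNet + FPN + FCN). Focal loss is a modified cross-entropy (CE) loss: it multiplies the original CE loss by a power-law factor that weakens the contribution of easily detected targets to training. This addresses the problem that positive and negative sample regions are extremely imbalanced in object detection, so the detection loss is easily dominated by the large number of negative samples. This problem is the main cause of the accuracy gap between one-stage detectors (such as SSD and the YOLO series) and two-stage detectors (such as Faster R-CNN and R-FCN). Before focal loss was proposed, existing detectors addressed the problem with methods such as bootstrapping and hard example mining. The authors' experiments show that focal loss works in a one-stage detector, which can then match or exceed the accuracy of two-stage detectors while running faster.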
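For reference, the modulating factor described above can be written out as follows, where $p_t$ denotes the model's predicted probability for the true class and $\gamma \ge 0$ is the focusing parameter (the RetinaNet paper additionally applies a class-balancing weight $\alpha_t$, omitted here):

```latex
\mathrm{CE}(p_t) = -\log(p_t), \qquad
\mathrm{FL}(p_t) = -(1 - p_t)^{\gamma} \log(p_t)
```

As a worked example with $\gamma = 2$: a well-classified sample with $p_t = 0.9$ has its loss scaled by $(1 - 0.9)^2 = 0.01$, while a hard sample with $p_t = 0.1$ is scaled by only $(1 - 0.1)^2 = 0.81$. Setting $\gamma = 0$ recovers the plain cross-entropy loss.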

Moments · Estimation/Estimator · Mean · Robustness · Noise ·

November 8, 2023

Robust Mean Estimation Without Moments for Symmetric Distributions

Gleb Novikov,David Steurer,Stefan Tiegel

from arxiv, Accepted at NeurIPS 2023

We study the problem of robustly estimating the mean or location parameter without moment assumptions. We show that for a large class of symmetric distributions, the same error as in the Gaussian setting can be achieved efficiently. The distributions we study include products of arbitrary symmetric one-dimensional distributions, such as product Cauchy distributions, as well as elliptical distributions. For product distributions and elliptical distributions with known scatter (covariance) matrix, we show that given an $\varepsilon$-corrupted sample, we can with probability at least $1-\delta$ estimate its location up to error $O(\varepsilon \sqrt{\log(1/\varepsilon)})$ using $\tfrac{d\log(d) + \log(1/\delta)}{\varepsilon^2 \log(1/\varepsilon)}$ samples. This result matches the best-known guarantees for the Gaussian distribution and known SQ lower bounds (up to the $\log(d)$ factor). For elliptical distributions with unknown scatter (covariance) matrix, we propose a sequence of efficient algorithms that approaches this optimal error. Specifically, for every $k \in \mathbb{N}$, we design an estimator using time and samples $\tilde{O}({d^k})$ achieving error $O(\varepsilon^{1-\frac{1}{2k}})$. This matches the error and running time guarantees when assuming certifiably bounded moments of order up to $k$. For unknown covariance, such error bounds of $o(\sqrt{\varepsilon})$ are not even known for (general) sub-Gaussian distributions. Our algorithms are based on a generalization of the well-known filtering technique. We show how this machinery can be combined with Huber-loss-based techniques to work with projections of the noise that behave more nicely than the initial noise. Moreover, we show how SoS proofs can be used to obtain algorithmic guarantees even for distributions without a first moment. We believe that this approach may find other applications in future works.
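To make the role of the Huber loss concrete, here is a one-dimensional sketch of a Huber-loss location estimate on corrupted Cauchy samples. It is not the paper's algorithm, which combines filtering with Huber-loss techniques in high dimensions. The function name `huber_location`, the threshold `delta=1.0`, the 5% corruption level, and the outlier value are all assumptions chosen for illustration.

```python
import math
import random

def huber_location(samples, delta=1.0, iterations=50):
    """Minimize the Huber loss over a location parameter
    by iteratively reweighted averaging."""
    ordered = sorted(samples)
    theta = ordered[len(ordered) // 2]  # start from the (upper) median
    for _ in range(iterations):
        weighted_sum = 0.0
        weight_total = 0.0
        for x in samples:
            r = abs(x - theta)
            # Residuals beyond delta are down-weighted, so their pull is bounded.
            w = 1.0 if r <= delta else delta / r
            weighted_sum += w * x
            weight_total += w
        theta = weighted_sum / weight_total
    return theta

random.seed(0)
true_location = 3.0
n = 2000
# Standard Cauchy noise via the inverse CDF; this distribution has no mean.
clean = [true_location + math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]
# Replace a 5% fraction with a fixed outlier value to mimic corruption.
n_bad = n // 20
corrupted = [50.0] * n_bad + clean[n_bad:]
estimate = huber_location(corrupted)
print(estimate)
```

Because the Huber loss grows only linearly in the tails, each sample's influence on the estimate is bounded. This is why the estimate stays near the true location even though the sample mean is not meaningful for Cauchy data.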

MoDELS · Earth · Annotation · Extensibility · Reducible ·

November 8, 2023

Foundation Models for Generalist Geospatial Artificial Intelligence

Johannes Jakubik,Sujit Roy,C. E. Phillips,Paolo Fraccaro,Denys Godwin,Bianca Zadrozny,Daniela Szwarcman,Carlos Gomes,Gabby Nyirjesy,Blair Edwards,Daiki Kimura,Naomi Simumba,Linsong Chu,S. Karthik Mukkavilli,Devyani Lambhate,Kamal Das,Ranjini Bangalore,Dario Oliveira,Michal Muszynski,Kumar Ankur,Muthukumaran Ramasubramanian,Iksha Gurung,Sam Khallaghi,Hanxi Li,Michael Cecil,Maryam Ahmadi,Fatemeh Kordi,Hamed Alemohammad,Manil Maskey,Raghu Ganti,Kommy Weldemariam,Rahul Ramachandran

Significant progress in the development of highly adaptable and reusable Artificial Intelligence (AI) models is expected to have a significant impact on Earth science and remote sensing. Foundation models are pre-trained on large unlabeled datasets through self-supervision, and then fine-tuned for various downstream tasks with small labeled datasets. This paper introduces a first-of-a-kind framework for the efficient pre-training and fine-tuning of foundational models on extensive geospatial data. We have utilized this framework to create Prithvi, a transformer-based geospatial foundational model pre-trained on more than 1TB of multispectral satellite imagery from the Harmonized Landsat-Sentinel 2 (HLS) dataset. Our study demonstrates the efficacy of our framework in successfully fine-tuning Prithvi to a range of Earth observation tasks that have not been tackled by previous work on foundation models involving multi-temporal cloud gap imputation, flood mapping, wildfire scar segmentation, and multi-temporal crop segmentation. Our experiments show that the pre-trained model accelerates the fine-tuning process compared to leveraging randomly initialized weights. In addition, pre-trained Prithvi compares well against the state-of-the-art, e.g., outperforming a conditional GAN model in multi-temporal cloud imputation by up to 5pp (or 5.7%) in the structural similarity index. Finally, due to the limited availability of labeled data in the field of Earth observation, we gradually reduce the quantity of available labeled data for refining the model to evaluate data efficiency and demonstrate that data can be decreased significantly without affecting the model's accuracy. The pre-trained 100 million parameter model and corresponding fine-tuning workflows have been released publicly as open source contributions to the global Earth sciences community through Hugging Face.

Gating · Linear · Recurrent Neural Network · MoDELS · Networking ·

November 8, 2023

Hierarchically Gated Recurrent Neural Network for Sequence Modeling

Zhen Qin,Songlin Yang,Yiran Zhong

from arxiv, NeurIPS 2023 Spotlight. Zhen Qin and Songlin Yang contributed equally to this paper. Yiran Zhong is the corresponding author. The source code is available at https://github.com/OpenNLPLab/HGRN

Transformers have surpassed RNNs in popularity due to their superior abilities in parallel training and long-term dependency modeling. Recently, there has been a renewed interest in using linear RNNs for efficient sequence modeling. These linear RNNs often employ gating mechanisms in the output of the linear recurrence layer while ignoring the significance of using forget gates within the recurrence. In this paper, we propose a gated linear RNN model dubbed Hierarchically Gated Recurrent Neural Network (HGRN), which includes forget gates that are lower bounded by a learnable value. The lower bound increases monotonically when moving up layers. This allows the upper layers to model long-term dependencies and the lower layers to model more local, short-term dependencies. Experiments on language modeling, image classification, and long-range arena benchmarks showcase the efficiency and effectiveness of our proposed model. The source code is available at https://github.com/OpenNLPLab/HGRN.
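A simplified, real-valued scalar sketch of a lower-bounded forget gate follows. It is an illustration under assumptions and not the released HGRN code. The function names are hypothetical, and the softmax-then-cumulative-sum construction is just one simple way to obtain bounds that increase with depth.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_lower_bounds(params):
    """Softmax over per-layer parameters followed by an exclusive cumulative sum,
    so the bounds start at 0, increase with depth, and stay below 1."""
    shift = max(params)
    exps = [math.exp(p - shift) for p in params]
    total = sum(exps)
    bounds, running = [], 0.0
    for e in exps:
        bounds.append(running)
        running += e / total
    return bounds

def gated_linear_recurrence(inputs, gate_logits, lower_bound):
    """h_t = f_t * h_{t-1} + (1 - f_t) * x_t with f_t in [lower_bound, 1)."""
    h = 0.0
    states = []
    for x, z in zip(inputs, gate_logits):
        f = lower_bound + (1.0 - lower_bound) * sigmoid(z)
        h = f * h + (1.0 - f) * x
        states.append(h)
    return states

bounds = layer_lower_bounds([0.0, 0.0, 0.0])
impulse = [1.0] + [0.0] * 9
logits = [0.0] * 10
# Fraction of the initial impulse response still present after 9 more steps.
retention = []
for lb in bounds:
    states = gated_linear_recurrence(impulse, logits, lb)
    retention.append(states[-1] / states[0])
print(bounds, retention)
```

With identical gate logits, the layer with the larger lower bound forgets more slowly. This matches the intuition that upper layers track long-term dependencies while lower layers stay local.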

Modal · Extensibility · Contraction · Paper ·

November 8, 2023

Nested Sequents for Quantified Modal Logics

Tim S. Lyon,Eugenio Orlandelli

from arxiv, accepted to TABLEAUX 2023

This paper studies nested sequents for quantified modal logics. In particular, it considers extensions of the propositional modal logics definable by the axioms D, T, B, 4, and 5 with varying, increasing, decreasing, and constant domains. Each calculus is proved to have good structural properties: weakening and contraction are height-preserving admissible and cut is (syntactically) admissible. Each calculus is shown to be equivalent to the corresponding axiomatic system and, thus, to be sound and complete. Finally, it is argued that the calculi are internal -- i.e., each sequent has a formula interpretation -- whenever the existence predicate is expressible in the language.

Learning · Linear · MoDELS · Performer · Statistics ·

November 8, 2023

Learning Linear Gaussian Polytree Models with Interventions

D. Tramontano,L. Waldmann,M. Drton,E. Duarte

from arxiv, To be published in: IEEE Journal on Selected Areas in Information Theory, Special Issue: Causality: Fundamental Limits and Applications

We present a consistent and highly scalable local approach to learn the causal structure of a linear Gaussian polytree using data from interventional experiments with known intervention targets. Our methods first learn the skeleton of the polytree and then orient its edges. The output is a CPDAG representing the interventional equivalence class of the polytree of the true underlying distribution. The skeleton and orientation recovery procedures we use rely on second order statistics and low-dimensional marginal distributions. We assess the performance of our methods under different scenarios in synthetic data sets and apply our algorithm to learn a polytree in a gene expression interventional data set. Our simulation studies demonstrate that our approach is fast, has good accuracy in terms of structural Hamming distance, and handles problems with thousands of nodes.

Learning · Operation · Estimation/Estimator · Reducible · Minimax ·

November 8, 2023

Sharp Spectral Rates for Koopman Operator Learning

Vladimir Kostic,Karim Lounici,Pietro Novelli,Massimiliano Pontil

from arxiv, Accepted to the thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

Nonlinear dynamical systems can be handily described by the associated Koopman operator, whose action evolves every observable of the system forward in time. Learning the Koopman operator and its spectral decomposition from data is enabled by a number of algorithms. In this work we present for the first time non-asymptotic learning bounds for the Koopman eigenvalues and eigenfunctions. We focus on time-reversal-invariant stochastic dynamical systems, including the important example of Langevin dynamics. We analyze two popular estimators: Extended Dynamic Mode Decomposition (EDMD) and Reduced Rank Regression (RRR). Our results critically hinge on novel minimax estimation bounds for the operator norm error, which may be of independent interest. Our spectral learning bounds are driven by the simultaneous control of the operator norm error and a novel metric distortion functional of the estimated eigenfunctions. The bounds indicate that both EDMD and RRR have similar variance, but EDMD suffers from a larger bias which might be detrimental to its learning rate. Our results shed new light on the emergence of spurious eigenvalues, an issue which is well known empirically. Numerical experiments illustrate the implications of the bounds in practice.
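As a toy illustration of EDMD, the sketch below fits a Koopman matrix by least squares on a small dictionary of observables. The setting is an assumption made for simplicity: deterministic scalar dynamics $x_{t+1} = a x_t$ with the hypothetical dictionary $(x, x^2)$, rather than the stochastic systems analyzed in the paper. In this setting the exact Koopman eigenvalues on the dictionary's span are $a$ and $a^2$.

```python
import math

def features(x):
    """Hypothetical dictionary of observables: psi(x) = (x, x^2)."""
    return (x, x * x)

def edmd_2d(xs, ys):
    """Least-squares solution K of Psi_Y ~ Psi_X K for a two-function dictionary."""
    g = [[0.0, 0.0], [0.0, 0.0]]  # Psi_X^T Psi_X
    a = [[0.0, 0.0], [0.0, 0.0]]  # Psi_X^T Psi_Y
    for x, y in zip(xs, ys):
        px, py = features(x), features(y)
        for i in range(2):
            for j in range(2):
                g[i][j] += px[i] * px[j]
                a[i][j] += px[i] * py[j]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    inv = [[g[1][1] / det, -g[0][1] / det],
           [-g[1][0] / det, g[0][0] / det]]
    return [[sum(inv[i][k] * a[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix, assuming they are real."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return ((tr + root) / 2.0, (tr - root) / 2.0)

a_true = 0.9
xs = [0.5, 1.0, 1.5, 2.0, -1.0]   # sampled states
ys = [a_true * x for x in xs]     # their one-step successors
koopman = edmd_2d(xs, ys)
eigs = eigenvalues_2x2(koopman)
print(eigs)
```

With noisy data or a dictionary that is not closed under the dynamics, the estimated spectrum is no longer exact. That is the regime where bias, variance, and spurious eigenvalues become relevant.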

Discriminator · Learning · Generative Adversarial Networks · Networking · GANs ·

November 7, 2023

Dynamically Masked Discriminator for Generative Adversarial Networks

Wentian Zhang,Haozhe Liu,Bing Li,Jinheng Xie,Yawen Huang,Yuexiang Li,Yefeng Zheng,Bernard Ghanem

from arxiv, Updated v2 -- NeurIPS 2023 camera ready version

Training Generative Adversarial Networks (GANs) remains a challenging problem. The discriminator trains the generator by learning the distribution of real/generated data. However, the distribution of generated data changes throughout the training process, which is difficult for the discriminator to learn. In this paper, we propose a novel method for GANs from the viewpoint of online continual learning. We observe that the discriminator model, trained on historically generated data, often slows down its adaptation to the changes in the newly arriving generated data, which accordingly decreases the quality of generated results. By treating the generated data in training as a stream, we propose to detect whether the discriminator slows down the learning of new knowledge in generated data. Therefore, we can explicitly force the discriminator to learn new knowledge quickly. In particular, we propose a new discriminator, which automatically detects its retardation and then dynamically masks its features, such that the discriminator can adaptively learn the temporally varying distribution of generated data. Experimental results show our method outperforms the state-of-the-art approaches.

INFORMS · Networking · MoDELS · Performer · Attention ·

November 7, 2023

Lightweight Portrait Matting via Regional Attention and Refinement

Yatao Zhong,Ilya Zharkov

We present a lightweight model for high resolution portrait matting. The model does not use any auxiliary inputs such as trimaps or background captures and achieves real time performance for HD videos and near real time for 4K. Our model is built upon a two-stage framework with a low resolution network for coarse alpha estimation followed by a refinement network for local region improvement. However, a naive implementation of the two-stage model suffers from poor matting quality when it does not use any auxiliary inputs. We address the performance gap by leveraging the vision transformer (ViT) as the backbone of the low resolution network, motivated by the observation that the tokenization step of ViT can reduce spatial resolution while retaining as much pixel information as possible. To inform local regions of the context, we propose a novel cross region attention (CRA) module in the refinement network to propagate the contextual information across the neighboring regions. We demonstrate that our method achieves superior results and outperforms other baselines on three benchmark datasets while using only $1/20$ of the FLOPS compared to the existing state-of-the-art model.

Data Augmentation · Graph · GPU · Performer · Neural Networks ·

December 2, 2020

Data Augmentation for Graph Neural Networks

Tong Zhao,Yozen Liu,Leonardo Neves,Oliver Woodford,Meng Jiang,Neil Shah

from arxiv, AAAI 2021. This complete version contains the Appendix

Data augmentation has been widely used to improve generalizability of machine learning models. However, comparatively little work studies data augmentation for graphs. This is largely due to the complex, non-Euclidean structure of graphs, which limits possible manipulation operations. Augmentation operations commonly used in vision and language have no analogs for graphs. Our work studies graph data augmentation for graph neural networks (GNNs) in the context of improving semi-supervised node-classification. We discuss practical and theoretical motivations, considerations and strategies for graph data augmentation. Our work shows that neural edge predictors can effectively encode class-homophilic structure to promote intra-class edges and demote inter-class edges in given graph structure, and our main contribution introduces the GAug graph data augmentation framework, which leverages these insights to improve performance in GNN-based node classification via edge prediction. Extensive experiments on multiple benchmarks show that augmentation via GAug improves performance across GNN architectures and datasets.
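The idea of promoting intra-class edges and demoting inter-class edges via edge prediction can be sketched as below. This is an illustration under assumptions and not the GAug implementation. The function `augment_edges` is hypothetical, and the hand-written `edge_prob` table stands in for the output of a trained neural edge predictor.

```python
def augment_edges(edges, edge_prob, n_add, n_remove):
    """Drop the n_remove least likely existing edges and add the
    n_add most likely missing edges, according to edge_prob."""
    existing = set(edges)
    to_remove = sorted(existing, key=lambda e: edge_prob[e])[:n_remove]
    candidates = [e for e in edge_prob if e not in existing]
    to_add = sorted(candidates, key=lambda e: edge_prob[e], reverse=True)[:n_add]
    return (existing - set(to_remove)) | set(to_add)

# Toy graph: nodes 0, 1, 2 share one class; nodes 3, 4 share another.
edges = [(0, 1), (1, 3), (3, 4)]
# Stand-in predictor scores for every node pair (assumed values).
edge_prob = {
    (0, 1): 0.90, (0, 2): 0.80, (1, 2): 0.70, (3, 4): 0.85,
    (0, 3): 0.05, (0, 4): 0.05, (1, 3): 0.10, (1, 4): 0.10,
    (2, 3): 0.10, (2, 4): 0.05,
}
augmented = augment_edges(edges, edge_prob, n_add=1, n_remove=1)
print(sorted(augmented))
```

In this toy case the inter-class edge (1, 3) is removed and the intra-class edge (0, 2) is added, so the modified graph is more class-homophilic than the original.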

Shaping · Decoding · MoDELS · Learned · Generative Models ·

December 6, 2018

Learning Implicit Fields for Generative Shape Modeling

Zhiqin Chen,Hao Zhang

We advocate the use of implicit fields for learning generative models of shapes and introduce an implicit field decoder for shape generation, aimed at improving the visual quality of the generated shapes. An implicit field assigns a value to each point in 3D space, so that a shape can be extracted as an iso-surface. Our implicit field decoder is trained to perform this assignment by means of a binary classifier. Specifically, it takes a point coordinate, along with a feature vector encoding a shape, and outputs a value which indicates whether the point is outside the shape or not. By replacing conventional decoders with our decoder for representation learning and generative modeling of shapes, we demonstrate superior results for tasks such as shape autoencoding, generation, interpolation, and single-view 3D reconstruction, particularly in terms of visual quality.
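The notion of an implicit field can be illustrated with an analytic stand-in for the learned decoder. In this sketch, which is an assumption-laden toy and not the paper's network, the "shape code" is simply a sphere radius, the field is a soft inside/outside value, and the shape is recovered by thresholding the field at 0.5 on a regular grid.

```python
import math

def implicit_field(point, radius):
    """Soft occupancy for a sphere centered at the origin:
    above 0.5 inside the shape, below 0.5 outside."""
    dist = math.sqrt(sum(c * c for c in point))
    return 1.0 / (1.0 + math.exp(-20.0 * (radius - dist)))

def occupied_volume(radius, resolution=20, threshold=0.5):
    """Query the field at grid cell centers in [-1, 1]^3 and
    sum the volume of cells classified as inside."""
    step = 2.0 / resolution
    centers = [-1.0 + (i + 0.5) * step for i in range(resolution)]
    inside = sum(
        1
        for x in centers
        for y in centers
        for z in centers
        if implicit_field((x, y, z), radius) > threshold
    )
    return inside * step ** 3

radius = 0.5
estimated = occupied_volume(radius)
exact = 4.0 / 3.0 * math.pi * radius ** 3
print(estimated, exact)
```

A learned decoder would replace `implicit_field` with a network that takes the point and a shape feature vector. Because the field can be queried at any point, the extracted surface is not tied to a fixed output resolution.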