
Robust estimation provides essential tools for analyzing data that contain outliers, ensuring that statistical models remain reliable even in the presence of some anomalous data. While robust methods have long been available in R, users of Python have lacked a comprehensive package that offers these methods in a cohesive framework. RobPy addresses this gap by offering a wide range of robust methods in Python, built upon established libraries including NumPy, SciPy, and scikit-learn. This package includes tools for robust preprocessing, univariate estimation, covariance matrices, regression, and principal component analysis, which are able to detect outliers and to mitigate their effect. In addition, RobPy provides specialized diagnostic plots for visualizing casewise and cellwise outliers. This paper presents the structure of the RobPy package, demonstrates its functionality through examples, and compares its features to existing implementations in other statistical software. By bringing robust methods to Python, RobPy enables more users to perform robust data analysis in a modern and versatile programming language.
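As a minimal illustration of the robust univariate estimation and outlier detection mentioned above, the sketch below compares the mean and standard deviation with the median and the normalized median absolute deviation (MAD) on a small sample containing one outlier. It uses only NumPy and SciPy and does not use RobPy's own API, which this abstract does not describe. The helper names `robust_location_scale` and `flag_outliers`, the toy data, and the cutoff of 3 are assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import median_abs_deviation


def robust_location_scale(x):
    """Return (median, normalized MAD) of a 1-D sample (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    loc = np.median(x)
    # scale="normal" rescales the MAD so that it estimates the standard
    # deviation when the clean data are Gaussian.
    scale = median_abs_deviation(x, scale="normal")
    return loc, scale


def flag_outliers(x, cutoff=3.0):
    """Flag points whose robust z-score exceeds `cutoff` (assumed value).

    Assumes the MAD is nonzero, i.e. fewer than half the points are tied.
    """
    x = np.asarray(x, dtype=float)
    loc, scale = robust_location_scale(x)
    return np.abs(x - loc) / scale > cutoff


# Toy sample: six measurements near 10 and one gross outlier.
data = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 55.0])

loc, scale = robust_location_scale(data)
flags = flag_outliers(data)

print("mean / std    :", data.mean(), data.std(ddof=1))
print("median / MAD  :", loc, scale)
print("flagged values:", data[flags])
```

The single outlier pulls the mean and inflates the standard deviation, while the median and MAD stay close to the bulk of the data, so the robust z-score isolates the anomalous point.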

Related content

Robustness


Learning · Representation · Processing (programming language) · Linear · Linear combination

December 16, 2024

LeARN: Learnable and Adaptive Representations for Nonlinear Dynamics in System Identification

Arunabh Singh,Joyjit Mukherjee

from arxiv, This work has been submitted to the 7th Annual Learning for Dynamics & Control Conference for review

System identification, the process of deriving mathematical models of dynamical systems from observed input-output data, has undergone a paradigm shift with the advent of learning-based methods. Addressing the intricate challenges of data-driven discovery in nonlinear dynamical systems, these methods have garnered significant attention. Among them, Sparse Identification of Nonlinear Dynamics (SINDy) has emerged as a transformative approach, distilling complex dynamical behaviors into interpretable linear combinations of basis functions. However, SINDy relies on domain-specific expertise to construct its foundational "library" of basis functions, which limits its adaptability and universality. In this work, we introduce a nonlinear system identification framework called LeARN that transcends the need for prior domain knowledge by learning the library of basis functions directly from data. To enhance adaptability to evolving system dynamics under varying noise conditions, we employ a novel meta-learning-based system identification approach that uses a lightweight deep neural network (DNN) to dynamically refine these basis functions. This not only captures intricate system behaviors but also adapts seamlessly to new dynamical regimes. We validate our framework on the Neural Fly dataset, showcasing its robust adaptation and generalization capabilities. Despite its simplicity, our LeARN achieves competitive dynamical error performance compared to SINDy. This work presents a step toward the autonomous discovery of dynamical systems, paving the way for a future where machine learning uncovers the governing principles of complex systems without requiring extensive domain-specific interventions.
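To make the phrase "interpretable linear combinations of basis functions" concrete, here is a minimal sketch of the sparse-regression step used in standard SINDy (sequentially thresholded least squares) with a hand-chosen polynomial library. It is not the LeARN method, which learns the library from data. The function name `stlsq`, the toy system, the threshold, and the iteration count are all assumptions for illustration.

```python
import numpy as np


def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares (hypothetical helper).

    Finds a sparse coefficient vector xi such that theta @ xi ~ dxdt.
    """
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0  # prune coefficients below the threshold
        big = ~small
        if not big.any():
            break
        # Refit using only the basis functions that survived pruning.
        xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi


# Toy system (assumed): dx/dt = -2 x + 0.5 x^3, with exact derivatives.
x = np.linspace(-2.0, 2.0, 200)
dxdt = -2.0 * x + 0.5 * x**3

# Hand-chosen library of candidate basis functions: [1, x, x^2, x^3].
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])

xi = stlsq(theta, dxdt)
print(dict(zip(["1", "x", "x^2", "x^3"], xi)))
```

The recovered coefficients are nonzero only for the `x` and `x^3` terms, which is the kind of interpretable model SINDy aims for. The choice of columns in `theta` is exactly the domain-specific step that the abstract says LeARN replaces with a learned library.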

Optimization · Optimizer · Processing (programming language) · Identifiable · Directed

December 16, 2024

The Selection Problem in Multi-Query Optimization: a Comprehensive Survey

Sergey Zinchenko,Denis Ponomaryov

View materialization, index selection, and plan caching are well-known techniques for optimization of query processing in database systems. The essence of these tasks is to select and save a subset of the most useful candidates (views/indexes/plans) for reuse within given space/time budget constraints. In this paper, based on the View Selection Problem, we propose a unified view on these problems. We identify the root causes of the complexity of these selection problems and provide a detailed analysis of techniques to cope with them. Our survey provides a modern classification of selection algorithms known in the literature, including the latest ones based on Machine Learning. We provide a ground for the reuse of the selection techniques between different optimization scenarios and highlight challenges and promising directions in the field.

Reducible · MoDELS · Game theory · Performer · Scaling

December 15, 2024

GAP: Game Theory-Based Approach for Reliability and Power Management in Emerging Fog Computing

Abolfazl Younesi,Mohsen Ansari,Alireza Ejlali,Mohammad Amin Fazli,Muhammad Shafique,Jörg Henkel

from arxiv, 13 pages, 10 figures

Fog computing brings about a transformative shift in data management, presenting unprecedented opportunities for enhanced performance and reduced latency. However, one of the key aspects of fog computing revolves around ensuring efficient power and reliability management. To address this challenge, we introduce a novel model that proposes a non-cooperative game theory-based strategy to strike a balance between power consumption and reliability in decision-making processes. Our proposed model capitalizes on the Cold Primary/Backup strategy (CPB) to guarantee the reliability target by re-executing tasks on different nodes when a fault occurs, while also leveraging Dynamic Voltage and Frequency Scaling (DVFS) to reduce power consumption during task execution and maximize overall efficiency. Non-cooperative game theory plays a pivotal role in our model, as it facilitates the development of strategies and solutions that uphold reliability while reducing power consumption. By treating the trade-off between power and reliability as a non-cooperative game, our proposed method yields significant energy savings, with up to a 35% reduction in energy consumption, a 41% decrease in wait time, and a 31% shorter completion time compared to state-of-the-art approaches. Our findings underscore the value of game theory in optimizing power and reliability within fog computing environments, demonstrating its potential for driving substantial improvements

MoDELS · Dataset · CMS · Performer · domain shift

December 13, 2024

Aspen Open Jets: Unlocking LHC Data for Foundation Models in Particle Physics

Oz Amram,Luca Anzalone,Joschka Birk,Darius A. Faroughy,Anna Hallin,Gregor Kasieczka,Michael Krämer,Ian Pang,Humberto Reyes-Gonzalez,David Shih

from arxiv, 11 pages, 4 figures, the AspenOpenJets dataset can be found at //doi.org/10.25592/uhhfdm.16505

Foundation models are deep learning models pre-trained on large amounts of data which are capable of generalizing to multiple datasets and/or downstream tasks. This work demonstrates how data collected by the CMS experiment at the Large Hadron Collider can be useful in pre-training foundation models for HEP. Specifically, we introduce the AspenOpenJets dataset, consisting of approximately 180M high $p_T$ jets derived from CMS 2016 Open Data. We show how pre-training the OmniJet-$\alpha$ foundation model on AspenOpenJets improves performance on generative tasks with significant domain shift: generating boosted top and QCD jets from the simulated JetClass dataset. In addition to demonstrating the power of pre-training of a jet-based foundation model on actual proton-proton collision data, we provide the ML-ready derived AspenOpenJets dataset for further public use.

Cluster · INTERACT · Similarity · MoDELS · Processing (programming language)

December 13, 2024

Sims: An Interactive Tool for Geospatial Matching and Clustering

Akram Zaytar,Girmaw Abebe Tadesse,Caleb Robinson,Eduardo G. Bendito,Medha Devare,Meklit Chernet,Gilles Q. Hacheme,Rahul Dodhia,Juan M. Lavista Ferres

Acquiring, processing, and visualizing geospatial data requires significant computing resources, especially for large spatio-temporal domains. This challenge hinders the rapid discovery of predictive features, which is essential for advancing geospatial modeling. To address this, we developed Similarity Search (Sims), a no-code web tool that allows users to visualize, compare, cluster, and perform similarity search over defined regions of interest using Google Earth Engine as a backend. Sims is designed to complement existing modeling tools by focusing on feature exploration rather than model creation. We demonstrate the utility of Sims through a case study analyzing simulated maize yield data in Rwanda, where we evaluate how different combinations of soil, weather, and agronomic features affect the clustering of yield response zones. Sims is open source and available at //github.com/microsoft/Sims

Prompt · Optimizer · MoDELS · Language modeling · Performer

December 12, 2024

GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers

Sarkar Snigdha Sarathi Das,Ryo Kamoi,Bo Pang,Yusen Zhang,Caiming Xiong,Rui Zhang

from arxiv, 32 pages, 8 figures

The effectiveness of large language models (LLMs) is closely tied to the design of prompts, making prompt optimization essential for enhancing their performance across a wide range of tasks. Many existing approaches to automating prompt engineering rely exclusively on textual feedback, refining prompts based solely on inference errors identified by large, computationally expensive LLMs. Unfortunately, smaller models struggle to generate high-quality feedback, resulting in complete dependence on large LLM judgment. Moreover, these methods fail to leverage more direct and finer-grained information, such as gradients, due to operating purely in text space. To this end, we introduce GReaTer, a novel prompt optimization technique that directly incorporates gradient information over task-specific reasoning. By utilizing task loss gradients, GReaTer enables self-optimization of prompts for open-source, lightweight language models without the need for costly closed-source LLMs. This allows high-performance prompt optimization without dependence on massive LLMs, closing the gap between smaller models and the sophisticated reasoning often needed for prompt refinement. Extensive evaluations across diverse reasoning tasks including BBH, GSM8k, and FOLIO demonstrate that GReaTer consistently outperforms previous state-of-the-art prompt optimization methods, even those reliant on powerful LLMs. Additionally, GReaTer-optimized prompts frequently exhibit better transferability and, in some cases, boost task performance to levels comparable to or surpassing those achieved by larger language models, highlighting the effectiveness of prompt optimization guided by gradients over reasoning. Code of GReaTer is available at //github.com/psunlpgroup/GreaTer.

Estimation/Estimator · MS · INFORMS · Integration · Multimodal

December 12, 2024

At First Contact: Stiffness Estimation Using Vibrational Information for Prosthetic Grasp Modulation

Anway S. Pimpalkar,Ariel Slepyan,Nitish V. Thakor

from arxiv, 5 pages, 7 figures, for IEEE Sensors Letters

Stiffness estimation is crucial for delicate object manipulation in robotic and prosthetic hands but remains challenging due to dependence on force and displacement measurement and real-time sensory integration. This study presents a piezoelectric sensing framework for stiffness estimation at first contact during pinch grasps, addressing the limitations of traditional force-based methods. Inspired by human skin, a multimodal tactile sensor that captures vibrational and force data is developed and integrated into a prosthetic hand's fingertip. Machine learning models, including support vector machines and convolutional neural networks, demonstrate that vibrational signals within the critical 15 ms after first contact reliably encode stiffness, achieving classification accuracies up to 98.6% and regression errors as low as 2.39 Shore A on real-world objects of varying stiffness. Inference times of less than 1.5 ms are significantly faster than the average grasp closure time (16.65 ms in our dataset), enabling real-time stiffness estimation before the object is fully grasped. By leveraging the transient asymmetry in grasp dynamics, where one finger contacts the object before the others, this method enables early grasp modulation, enhancing safety and intuitiveness in prosthetic hands while offering broad applications in robotics.

Learning · Robot · Controller · contrastive · Diversity

December 12, 2024

Learning to Adapt: Bio-Inspired Gait Strategies for Versatile Quadruped Locomotion

Joseph Humphreys,Chengxu Zhou

from arxiv, 15 pages, 8 figures, journal paper

Deep reinforcement learning (DRL) has revolutionised quadruped robot locomotion, but existing control frameworks struggle to generalise beyond their training-induced observational scope, resulting in limited adaptability. In contrast, animals achieve exceptional adaptability through gait transition strategies, diverse gait utilisation, and seamless adjustment to immediate environmental demands. Inspired by these capabilities, we present a novel DRL framework that incorporates key attributes of animal locomotion: gait transition strategies, pseudo gait procedural memory, and adaptive motion adjustments. This approach enables our framework to achieve unparalleled adaptability, demonstrated through blind zero-shot deployment on complex terrains and recovery from critically unstable states. Our findings offer valuable insights into the biomechanics of animal locomotion, paving the way for robust, adaptable robotic systems.

state-of-the-art · Comprehensibility · BERT · Denoising autoencoder · Performer

June 19, 2019

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Zhilin Yang,Zihang Dai,Yiming Yang,Jaime Carbonell,Ruslan Salakhutdinov,Quoc V. Le

from arxiv, Pretrained models and code are available at //github.com/zihangdai/xlnet

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, XLNet outperforms BERT on 20 tasks, often by a large margin, and achieves state-of-the-art results on 18 tasks including question answering, natural language inference, sentiment analysis, and document ranking.

Optimizer · Machine Learning · MoDELS · Learned · Mathematical optimization

January 16, 2019

Optimization Models for Machine Learning: A Survey

Claudio Gambella,Bissan Ghaddar,Joe Naoum-Sawaya

This paper surveys the machine learning literature and presents machine learning as optimization models. Such models can benefit from the advancement of numerical optimization techniques which have already played a distinctive role in several machine learning settings. Particularly, mathematical optimization models are presented for commonly used machine learning approaches for regression, classification, clustering, and deep neural networks, as well as new emerging applications in machine teaching and empirical model learning. The strengths and the shortcomings of these models are discussed and potential research directions are highlighted.