亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We demonstrate the first algorithms for the problem of regression for generalized linear models (GLMs) in the presence of additive oblivious noise. We assume we have sample access to examples $(x, y)$ where $y$ is a noisy measurement of $g(w^* \cdot x)$. In particular, \new{the noisy labels are of the form} $y = g(w^* \cdot x) + \xi + \epsilon$, where $\xi$ is the oblivious noise drawn independently of $x$ \new{and satisfies} $\Pr[\xi = 0] \geq o(1)$, and $\epsilon \sim \mathcal N(0, \sigma^2)$. Our goal is to accurately recover a \new{parameter vector $w$ such that the} function $g(w \cdot x)$ \new{has} arbitrarily small error when compared to the true values $g(w^* \cdot x)$, rather than the noisy measurements $y$. We present an algorithm that tackles \new{this} problem in its most general distribution-independent setting, where the solution may not \new{even} be identifiable. \new{Our} algorithm returns \new{an accurate estimate of} the solution if it is identifiable, and otherwise returns a small list of candidates, one of which is close to the true solution. Furthermore, we \new{provide} a necessary and sufficient condition for identifiability, which holds in broad settings. \new{Specifically,} the problem is identifiable when the quantile at which $\xi + \epsilon = 0$ is known, or when the family of hypotheses does not contain candidates that are nearly equal to a translated $g(w^* \cdot x) + A$ for some real number $A$, while also having large error when compared to $g(w^* \cdot x)$. This is the first \new{algorithmic} result for GLM regression \new{with oblivious noise} which can handle more than half the samples being arbitrarily corrupted. Prior work focused largely on the setting of linear regression, and gave algorithms under restrictive assumptions.

相關內容

We investigate the dimension-parametric complexity of the reachability problem in vector addition systems with states (VASS) and its extension with pushdown stack (pushdown VASS). Up to now, the problem is known to be $\mathcal{F}_k$-hard for VASS of dimension $3k+2$ (the complexity class $\mathcal{F}_k$ corresponds to the $k$th level of the fast-growing hierarchy), and no essentially better bound is known for pushdown VASS. We provide a new construction that improves the lower bound for VASS: $\mathcal{F}_k$-hardness in dimension $2k+3$. Furthermore, building on our new insights we show a new lower bound for pushdown VASS: $\mathcal{F}_k$-hardness in dimension $\frac k 2 + 4$. This dimension-parametric lower bound is strictly stronger than the upper bound for VASS, which suggests that the (still unknown) complexity of the reachability problem in pushdown VASS is higher than in plain VASS (where it is Ackermann-complete).

We propose a novel iterative numerical method to solve the three-dimensional inverse obstacle scattering problem of recovering the shape of the obstacle from far-field measurements. To address the inherent ill-posed nature of the inverse problem, we advocate the use of a trained latent representation of surfaces as the generative prior. This prior enjoys excellent expressivity within the given class of shapes, and meanwhile, the latent dimensionality is low, which greatly facilitates the computation. Thus, the admissible manifold of surfaces is realistic and the resulting optimization problem is less ill-posed. We employ the shape derivative to evolve the latent surface representation, by minimizing the loss, and we provide a local convergence analysis of a gradient descent type algorithm to a stationary point of the loss. We present several numerical examples, including also backscattered and phaseless data, to showcase the effectiveness of the proposed algorithm.

Large Language Models (LLMs) combined with program-based solving techniques are increasingly demonstrating proficiency in mathematical reasoning. However, such progress is mostly demonstrated in closed-source models such as OpenAI-GPT4 and Claude. In this paper, we seek to study the performance of strong open-source LLMs. Specifically, we analyze the outputs of Code Llama (7B) when applied to math word problems. We identify a category of problems that pose a challenge for the model, particularly those involving quantities that span multiple types or units. To address this issue, we propose a systematic approach by defining units for each quantity and ensuring the consistency of these units during mathematical operations. We developed Unit Consistency Programs (UCPs), an annotated dataset of math word problems, each paired with programs that contain unit specifications and unit verification routines. Finally, we finetune the Code Llama (7B) model with UCPs to produce VerityMath and present our preliminary findings.

Nonresponse after probability sampling is a universal challenge in survey sampling, often necessitating adjustments to mitigate sampling and selection bias simultaneously. This study explored the removal of bias and effective utilization of available information, not just in nonresponse but also in the scenario of data integration, where summary statistics from other data sources are accessible. We reformulate these settings within a two-step monotone missing data framework, where the first step of missingness arises from sampling and the second originates from nonresponse. Subsequently, we derive the semiparametric efficiency bound for the target parameter. We also propose adaptive estimators utilizing methods of moments and empirical likelihood approaches to attain the lower bound. The proposed estimator exhibits both efficiency and double robustness. However, attaining efficiency with an adaptive estimator requires the correct specification of certain working models. To reinforce robustness against the misspecification of working models, we extend the property of double robustness to multiple robustness by proposing a two-step empirical likelihood method that effectively leverages empirical weights. A numerical study is undertaken to investigate the finite-sample performance of the proposed methods. We further applied our methods to a dataset from the National Health and Nutrition Examination Survey data by efficiently incorporating summary statistics from the National Health Interview Survey data.

Parameterized convex minorant (PCM) method is proposed for the approximation of the objective function in amortized optimization. In the proposed method, the objective function approximator is expressed by the sum of a PCM and a nonnegative gap function, where the objective function approximator is bounded from below by the PCM convex in the optimization variable. The proposed objective function approximator is a universal approximator for continuous functions, and the global minimizer of the PCM attains the global minimum of the objective function approximator. Therefore, the global minimizer of the objective function approximator can be obtained by a single convex optimization. As a realization of the proposed method, extended parameterized log-sum-exp network is proposed by utilizing a parameterized log-sum-exp network as the PCM. Numerical simulation is performed for parameterized non-convex objective function approximation and for learning-based nonlinear model predictive control to demonstrate the performance and characteristics of the proposed method. The simulation results support that the proposed method can be used to learn objective functions and to find a global minimizer reliably and quickly by using convex optimization algorithms.

Kernel ridge regression, KRR, is a generalization of linear ridge regression that is non-linear in the data, but linear in the parameters. Here, we introduce an equivalent formulation of the objective function of KRR, opening up both for using penalties other than the ridge penalty and for studying kernel ridge regression from the perspective of gradient descent. Using a continuous-time perspective, we derive a closed-form solution for solving kernel regression with gradient descent, something we refer to as kernel gradient flow, KGF, and theoretically bound the differences between KRR and KGF, where, for the latter, regularization is obtained through early stopping. We also generalize KRR by replacing the ridge penalty with the $\ell_1$ and $\ell_\infty$ penalties, respectively, and use the fact that analogous to the similarities between KGF and KRR, $\ell_1$ regularization and forward stagewise regression (also known as coordinate descent), and $\ell_\infty$ regularization and sign gradient descent, follow similar solution paths. We can thus alleviate the need for computationally heavy algorithms based on proximal gradient descent. We show theoretically and empirically how the $\ell_1$ and $\ell_\infty$ penalties, and the corresponding gradient-based optimization algorithms, produce sparse and robust kernel regression solutions, respectively.

Graph-based kNN algorithms have garnered widespread popularity for machine learning tasks due to their simplicity and effectiveness. However, as factual data often inherit complex distributions, the conventional kNN graph's reliance on a unified k-value can hinder its performance. A crucial factor behind this challenge is the presence of ambiguous samples along decision boundaries that are inevitably more prone to incorrect classifications. To address the situation, we propose the Distribution-Informed adaptive kNN Graph (DaNNG), which combines adaptive kNN with distribution-aware graph construction. By incorporating an approximation of the distribution with customized k-adaption criteria, DaNNG can significantly improve performance on ambiguous samples, and hence enhance overall accuracy and generalization capability. Through rigorous evaluations on diverse benchmark datasets, DaNNG outperforms state-of-the-art algorithms, showcasing its adaptability and efficacy across various real-world scenarios.

The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.

The accurate and interpretable prediction of future events in time-series data often requires the capturing of representative patterns (or referred to as states) underpinning the observed data. To this end, most existing studies focus on the representation and recognition of states, but ignore the changing transitional relations among them. In this paper, we present evolutionary state graph, a dynamic graph structure designed to systematically represent the evolving relations (edges) among states (nodes) along time. We conduct analysis on the dynamic graphs constructed from the time-series data and show that changes on the graph structures (e.g., edges connecting certain state nodes) can inform the occurrences of events (i.e., time-series fluctuation). Inspired by this, we propose a novel graph neural network model, Evolutionary State Graph Network (EvoNet), to encode the evolutionary state graph for accurate and interpretable time-series event prediction. Specifically, Evolutionary State Graph Network models both the node-level (state-to-state) and graph-level (segment-to-segment) propagation, and captures the node-graph (state-to-segment) interactions over time. Experimental results based on five real-world datasets show that our approach not only achieves clear improvements compared with 11 baselines, but also provides more insights towards explaining the results of event predictions.

Multi-relation Question Answering is a challenging task, due to the requirement of elaborated analysis on questions and reasoning over multiple fact triples in knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis, thereby allowing manual manipulation in predicting the final answer.

北京阿比特科技有限公司