
We give the first agnostic, efficient, proper learning algorithm for monotone Boolean functions. Given $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$ uniformly random examples of an unknown function $f:\{\pm 1\}^n \rightarrow \{\pm 1\}$, our algorithm outputs a hypothesis $g:\{\pm 1\}^n \rightarrow \{\pm 1\}$ that is monotone and $(\mathrm{opt} + \varepsilon)$-close to $f$, where $\mathrm{opt}$ is the distance from $f$ to the closest monotone function. The running time of the algorithm (and consequently the size and evaluation time of the hypothesis) is also $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$, nearly matching the lower bound of Blais et al. (RANDOM '15). We also give an algorithm that runs in time $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$ and estimates, up to additive error $\varepsilon$, the distance of an unknown function $f$ to monotone. Previously, sample-efficient algorithms were known for both of these problems, but they were not run-time efficient; our work thus closes this gap between running time and sample complexity. This work builds upon the improper learning algorithm of Bshouty and Tamon (JACM '96) and the proper semiagnostic learning algorithm of Lange, Rubinfeld, and Vasilyan (FOCS '22), which obtains a non-monotone Boolean-valued hypothesis and then ``corrects'' it to monotone using query-efficient local computation algorithms on graphs. This black-box correction approach can achieve no error better than $2\mathrm{opt} + \varepsilon$ information-theoretically; we bypass this barrier by (a) augmenting the improper learner with a convex optimization step, and (b) learning and correcting a real-valued function before rounding its values to Boolean. Our real-valued correction algorithm solves the ``poset sorting'' problem of [LRV22] for functions over general posets with non-Boolean labels.
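
To make the target quantity concrete, here is a brute-force sketch (our illustration, not the paper's algorithm) that computes $\mathrm{opt}$, the distance to the nearest monotone function, exactly. It is feasible only for tiny $n$; the point of the results above is to approximate this in time $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$ rather than the doubly exponential cost of this toy.

```python
from itertools import product

def dist_to_monotone(f, n):
    """Exact distance from f: {-1,1}^n -> {-1,1} to the nearest monotone
    function, by brute force over all 2^(2^n) Boolean functions.
    Feasible only for tiny n; the paper achieves 2^{O~(sqrt(n)/eps)} time."""
    points = list(product([-1, 1], repeat=n))
    below = lambda x, y: all(a <= b for a, b in zip(x, y))  # hypercube order
    best = 1.0
    for table in product([-1, 1], repeat=len(points)):
        g = dict(zip(points, table))
        if all(g[x] <= g[y] for x in points for y in points if below(x, y)):
            best = min(best, sum(g[x] != f(x) for x in points) / len(points))
    return best

# The anti-dictator f(x) = -x_1 is maximally non-monotone on n = 2:
print(dist_to_monotone(lambda x: -x[0], 2))  # 0.5
```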

Related Content

Many real-world optimization problems contain unknown parameters that must be predicted prior to solving. To train the predictive machine learning (ML) models involved, the commonly adopted approach focuses on maximizing predictive accuracy. However, this approach does not always lead to the minimization of the downstream task loss. Decision-focused learning (DFL) is a recently proposed paradigm whose goal is to train the ML model by directly minimizing the task loss. However, state-of-the-art DFL methods are limited by the assumptions they make about the structure of the optimization problem (e.g., that the problem is linear) and by the fact that they can only predict parameters that appear in the objective function. In this work, we address these limitations by instead predicting \textit{distributions} over parameters and adopting score function gradient estimation (SFGE) to compute decision-focused updates to the predictive model, thereby widening the applicability of DFL. Our experiments show that by using SFGE we can: (1) deal with predictions that occur both in the objective function and in the constraints; and (2) effectively tackle two-stage stochastic optimization problems.
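
For intuition, here is a minimal numpy sketch of score function gradient estimation on a toy one-dimensional problem. The Gaussian parameterization, quadratic task loss, and step sizes are illustrative assumptions of ours, not the paper's setup; the key property on display is that the task loss is only evaluated, never differentiated.

```python
import numpy as np

rng = np.random.default_rng(0)

def sfge_grad(theta, task_loss, sigma=1.0, n_samples=1000):
    """Score function (REINFORCE) estimate of d/dtheta E_{z~N(theta,sigma^2)}[L(z)],
    via E[L(z) * (z - theta) / sigma^2]. L is only evaluated, never
    differentiated, so it may wrap a non-differentiable downstream optimizer."""
    z = rng.normal(theta, sigma, size=n_samples)
    losses = np.array([task_loss(zi) for zi in z])
    baseline = losses.mean()  # simple variance-reduction baseline
    return np.mean((losses - baseline) * (z - theta) / sigma**2)

# Toy decision problem: the downstream task loss is minimized when the
# (hypothetical) predicted parameter equals 3.
task_loss = lambda z: (z - 3.0) ** 2
theta = 0.0
for _ in range(200):          # plain SGD on the predicted distribution's mean
    theta -= 0.05 * sfge_grad(theta, task_loss)
print(theta)                  # drifts toward 3
```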

The dominant theories of rational choice assume logical omniscience. That is, they assume that when facing a decision problem, an agent can perform all relevant computations and determine the truth value of all relevant logical/mathematical claims. This assumption is unrealistic when, for example, we offer bets on remote digits of pi or when an agent faces a computationally intractable planning problem. Furthermore, the assumption of logical omniscience creates contradictions in cases where the environment can contain descriptions of the agent itself. Importantly, strategic interactions as studied in game theory are decision problems in which a rational agent is predicted by its environment (the other players). In this paper, we develop a theory of rational decision making that does not assume logical omniscience. We consider agents who repeatedly face decision problems (including ones like betting on digits of pi or games against other agents). The main contribution of this paper is to provide a sensible theory of rationality for such agents. Roughly, we require that a boundedly rational inductive agent tests each efficiently computable hypothesis infinitely often and follows those hypotheses that keep their promises of high rewards. We then prove that agents that are rational in this sense have other desirable properties. For example, they learn to value random and pseudo-random lotteries at their expected reward. Finally, we consider strategic interactions between different agents and prove a folk theorem characterizing the strategies that boundedly rational inductive agents can converge to.
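
The following toy loop is our loose sketch of the flavor of that rationality requirement, not the paper's formal definitions: every hypothesis is tested infinitely often on a fixed schedule, and otherwise the agent follows the hypothesis with the best record of kept promises. All names and numbers are invented for illustration.

```python
import random

class Hypothesis:
    """A hypothesis recommends an action and promises a reward for it."""
    def __init__(self, action, promised):
        self.action, self.promised = action, promised
        self.trials, self.kept = 0, 0

hyps = [Hypothesis(action=a, promised=p) for a, p in [(0, 0.2), (1, 0.9), (2, 0.6)]]
true_reward = {0: 0.2, 1: 0.5, 2: 0.6}          # hypothesis 1 over-promises

for t in range(1, 10001):
    if t % len(hyps) == 0:                       # exploration: each hypothesis
        h = hyps[(t // len(hyps)) % len(hyps)]   # is revisited infinitely often
    else:                                        # exploitation: best kept-promise
        h = max(hyps, key=lambda h: (h.kept / h.trials if h.trials else 1.0,
                                     h.promised))
    reward = true_reward[h.action] + random.gauss(0, 0.05)
    h.trials += 1
    h.kept += reward >= h.promised - 0.1         # promise kept (with slack)?

best = max(hyps, key=lambda h: (h.kept / h.trials, h.promised))
print(best.action)  # settles on action 2: its promise of 0.6 is actually kept
```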

We present a novel algorithm called GOTHIC (Graph BOosted iterated HIll Climbing) to detect double nuclei galaxies (DNGs), i.e., to determine whether a given image of a galaxy has two or more closely separated nuclei. Our aim is to detect samples of dual or multiple active galactic nuclei (AGN) in galaxies. Although galaxy mergers are common, the detection of dual AGN is rare. Their detection is very important as they help us understand the formation of supermassive black hole (SMBH) binaries, SMBH growth, and AGN feedback effects in multiple-nucleus systems. There is thus a need for an algorithm to systematically survey existing imaging data for the discovery of DNGs and dual AGN. We tested GOTHIC on a known sample of DNGs and subsequently applied it to a sample of approximately one million SDSS DR16 galaxies that lie in the redshift range 0 to 0.75 and have spectroscopic data available. We detected 159 dual AGN in this sample, of which 2 are triple AGN systems. Our results show that dual AGN are not common, and triple AGN are even rarer. The color (u-r) magnitude plots of the DNGs indicate that star formation is quenched as the nuclei come closer and as the AGN fraction increases. The quenching is especially prominent for dual/triple AGN galaxies that lie at the extreme end of the red sequence.
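
The abstract does not spell out GOTHIC's internals, so the sketch below is only a toy stand-in for the underlying detection primitive: counting closely separated brightness peaks in a galaxy cutout. The smoothing scale and thresholds are our invented illustration, not GOTHIC's.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def count_nuclei(image, size=5, rel_threshold=0.3):
    """Toy stand-in for a DNG detection primitive: count well-separated
    local brightness maxima in a (synthetic) galaxy cutout."""
    smooth = gaussian_filter(image, sigma=1.0)             # suppress pixel noise
    peaks = (smooth == maximum_filter(smooth, size=size))  # local maxima
    peaks &= smooth > rel_threshold * smooth.max()         # drop faint bumps
    return int(peaks.sum())

# Synthetic "galaxy" with two nearby Gaussian nuclei.
y, x = np.mgrid[0:64, 0:64]
blob = lambda cx, cy: np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / 18.0)
image = blob(28, 32) + 0.8 * blob(38, 32)
print(count_nuclei(image))  # 2 -> flag as a double-nucleus candidate
```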

This work, for the first time, introduces two constant factor approximation algorithms with linear query complexity for non-monotone submodular maximization over a ground set of size $n$ subject to a knapsack constraint, $\mathsf{DLA}$ and $\mathsf{RLA}$. $\mathsf{DLA}$ is a deterministic algorithm that provides an approximation factor of $6+\epsilon$ while $\mathsf{RLA}$ is a randomized algorithm with an approximation factor of $4+\epsilon$. Both run in $O(n \log(1/\epsilon)/\epsilon)$ query complexity. The key idea to obtain a constant approximation ratio with linear query complexity lies in: (1) dividing the ground set into two appropriate subsets to find a near-optimal solution over these subsets with linear queries, and (2) combining a threshold greedy with properties of two disjoint sets or a random selection process to improve solution quality. In addition to the theoretical analysis, we have evaluated our proposed solutions with three applications: Revenue Maximization, Image Summarization, and Maximum Weighted Cut, showing that our algorithms not only return results comparable to state-of-the-art algorithms but also require significantly fewer queries.
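
The sketch below illustrates the textbook threshold-greedy ingredient mentioned in point (2) on a toy coverage instance under a knapsack constraint. It is a hedged illustration of that building block only, not $\mathsf{DLA}$ or $\mathsf{RLA}$ themselves.

```python
def threshold_greedy_knapsack(ground, f, cost, budget, eps=0.1):
    """Density-threshold greedy for submodular maximization under a knapsack
    constraint: add any element whose marginal gain per unit cost clears the
    threshold, then lower the threshold geometrically."""
    S = frozenset()
    tau = max(f(frozenset([e])) / cost[e] for e in ground)  # highest density
    lo = eps * tau / len(ground)
    while tau >= lo:
        for e in ground:
            if e in S or cost[e] + sum(cost[v] for v in S) > budget:
                continue
            gain = f(S | {e}) - f(S)
            if gain / cost[e] >= tau:        # marginal density clears threshold
                S = S | {e}
        tau *= 1 - eps                        # geometrically lower the threshold
    return S

# Toy coverage instance: f(S) = size of the union of covered items.
cover = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d"}, 4: {"a", "b", "c", "d"}}
f = lambda S: len(set().union(*(cover[e] for e in S))) if S else 0
print(threshold_greedy_knapsack(set(cover), f, {1: 1, 2: 1, 3: 1, 4: 3}, budget=3))
```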

Deep neural networks (DNNs) are increasingly being deployed to perform safety-critical tasks. The opacity of DNNs, which prevents humans from reasoning about them, presents new safety and security challenges. To address these challenges, the verification community has begun developing techniques for rigorously analyzing DNNs, with numerous verification algorithms proposed in recent years. While a significant amount of work has gone into developing these verification algorithms, little work has been devoted to rigorously studying the computability and complexity of the underlying theoretical problems. Here, we seek to contribute to bridging this gap. We focus on two kinds of DNNs: those that employ piecewise-linear activation functions (e.g., ReLU), and those that employ piecewise-smooth activation functions (e.g., Sigmoids). We prove the following two theorems: 1) The decidability of verifying DNNs with a particular set of piecewise-smooth activation functions is equivalent to a well-known, open problem formulated by Tarski; and 2) The DNN verification problem for any quantifier-free linear arithmetic specification can be reduced to the DNN reachability problem, whose approximation is NP-complete. These results answer two fundamental questions about the computability and complexity of DNN verification, and the ways it is affected by the network's activation functions and error tolerance; and could help guide future efforts in developing DNN verification tools.
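
As a minimal illustration of the reachability question the reduction targets, the following sketch over-approximates the outputs of a small ReLU network with interval arithmetic. This is a standard sound-but-loose technique of our choosing, not a construction from the paper.

```python
import numpy as np

def interval_reach(weights, biases, lo, hi):
    """Interval-arithmetic over-approximation of the outputs a ReLU network
    can reach on the input box [lo, hi]: sound but coarse bounds."""
    for i, (W, b) in enumerate(zip(weights, biases)):
        center, radius = (lo + hi) / 2, (hi - lo) / 2
        c = W @ center + b
        r = np.abs(W) @ radius          # worst-case spread through the layer
        lo, hi = c - r, c + r
        if i < len(weights) - 1:        # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)
    return lo, hi

W1, b1 = np.array([[1.0, -1.0], [0.5, 0.5]]), np.zeros(2)
W2, b2 = np.array([[1.0, 1.0]]), np.array([-0.25])
lo, hi = interval_reach([W1, W2], [b1, b2],
                        np.array([-1.0, -1.0]), np.array([1.0, 1.0]))
print(lo, hi)  # every true output lies inside [lo, hi] = [-0.25, 2.75]
```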

In the field of Artificial Intelligence (AI) and Machine Learning (ML), the approximation of unknown target functions $y=f(\mathbf{x})$ using limited instances $S=\{(\mathbf{x}^{(i)},y^{(i)})\}$, where $\mathbf{x}^{(i)} \in D$ and $D$ represents the domain of interest, is a common objective. We refer to $S$ as the training set and aim to identify a low-complexity mathematical model that can effectively approximate this target function for new instances $\mathbf{x}$. Consequently, the model's generalization ability is evaluated on a separate set $T=\{\mathbf{x}^{(j)}\} \subset D$, where $T \neq S$, frequently with $T \cap S = \emptyset$, to assess its performance beyond the training set. However, certain applications require accurate approximation not only within the original domain $D$ but also in an extended domain $D'$ that encompasses $D$. This becomes particularly relevant in scenarios involving the design of new structures, where minimizing errors in approximations is crucial. For example, when developing new materials through data-driven approaches, the AI/ML system can provide valuable insights to guide the design process by serving as a surrogate function. Consequently, the learned model can be employed to facilitate the design of new laboratory experiments. In this paper, we propose a method for multivariate regression based on iterative fitting of a continued fraction, incorporating additive spline models. We compare the performance of our method with established techniques, including AdaBoost, Kernel Ridge, Linear Regression, Lasso Lars, Linear Support Vector Regression, Multi-Layer Perceptrons, Random Forests, Stochastic Gradient Descent, and XGBoost. To evaluate these methods, we focus on an important problem in the field: predicting the critical temperature of superconductors based on physical-chemical characteristics.
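
A minimal sketch of the model class follows, assuming a depth-2 continued fraction with affine (not spline) terms and an off-the-shelf least-squares fitter rather than the paper's iterative procedure.

```python
import numpy as np
from scipy.optimize import curve_fit

# Depth-2 continued fraction with affine terms in x:
#   g(x) = a0 + b0*x + (a1 + b1*x) / (1 + a2 + b2*x)
# A crude stand-in for the model class; the paper fits such fractions
# iteratively and uses additive spline terms instead of affine ones.
def cf_model(x, a0, b0, a1, b1, a2, b2):
    return a0 + b0 * x + (a1 + b1 * x) / (1.0 + a2 + b2 * x)

rng = np.random.default_rng(1)
x = np.linspace(0.5, 4.0, 200)
y = 1.0 / x + 0.3 * x + rng.normal(0, 0.01, x.size)   # rational-ish target

params, _ = curve_fit(cf_model, x, y, p0=np.ones(6), maxfev=20000)
print(np.max(np.abs(cf_model(x, *params) - y)))        # residual on S
# Extrapolation check on D' \supset D, the regime the abstract emphasizes:
x_new = np.linspace(4.0, 6.0, 50)
print(np.max(np.abs(cf_model(x_new, *params) - (1.0 / x_new + 0.3 * x_new))))
```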

Maximum weight independent set (MWIS) admits a $\frac1k$-approximation in inductively $k$-independent graphs and a $\frac{1}{2k}$-approximation in $k$-perfectly orientable graphs. These are parameterized classes of graphs that generalize $k$-degenerate graphs, chordal graphs, and intersection graphs of various geometric shapes such as intervals, pseudo-disks, and several others. We consider a generalization of MWIS to a submodular objective. Given a graph $G=(V,E)$ and a non-negative submodular function $f: 2^V \rightarrow \mathbb{R}_+$, the goal is to approximately solve $\max_{S \in \mathcal{I}_G} f(S)$ where $\mathcal{I}_G$ is the set of independent sets of $G$. We obtain an $\Omega(\frac1k)$-approximation for this problem in the two mentioned graph classes. The first approach is via the multilinear relaxation framework and a simple contention resolution scheme, and this results in a randomized algorithm with approximation ratio at least $\frac{1}{e(k+1)}$. This approach also yields parallel (or low-adaptivity) approximations. Motivated by the goal of designing efficient and deterministic algorithms, we describe two other algorithms for inductively $k$-independent graphs that are inspired by work on streaming algorithms: a preemptive greedy algorithm and a primal-dual algorithm. In addition to being simpler and faster, these algorithms, in the monotone submodular case, yield the first deterministic constant factor approximations for various special cases that have been previously considered, such as intersection graphs of intervals, disks and pseudo-disks.
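
As a baseline illustration (not the paper's preemptive greedy or primal-dual algorithm), the sketch below runs a plain marginal-gain greedy over the independent sets of a small interval graph, intervals being a canonical inductively $1$-independent class, with a monotone submodular coverage objective. The instance is invented.

```python
# Plain marginal-gain greedy for max f(S) over independent sets of an
# interval graph. Illustration only; no approximation guarantee is claimed.
intervals = {"a": (0, 3), "b": (2, 5), "c": (4, 7), "d": (6, 9)}

def overlap(u, v):
    (a1, b1), (a2, b2) = intervals[u], intervals[v]
    return a1 < b2 and a2 < b1

def greedy_independent(f):
    S = set()
    while True:
        gains = {v: f(S | {v}) - f(S) for v in intervals
                 if v not in S and not any(overlap(v, u) for u in S)}
        v = max(gains, key=gains.get, default=None)
        if v is None or gains[v] <= 0:
            return S
        S.add(v)

# Monotone submodular objective: coverage of features attached to intervals.
features = {"a": {1, 2}, "b": {2, 3, 4}, "c": {4}, "d": {5}}
f = lambda S: len(set().union(*(features[v] for v in S))) if S else 0
print(greedy_independent(f))  # {'b', 'd'}: independent, covers {2, 3, 4, 5}
```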

The Newton, Gauss--Newton and Levenberg--Marquardt methods all use the first derivative of a vector function (the Jacobian) to minimise its sum of squares. When the Jacobian matrix is ill-conditioned, the function varies much faster in some directions than others and the space of possible improvement in sum of squares becomes a long narrow ellipsoid in the linear model. This means that even a small amount of nonlinearity in the problem parameters can cause a proposed point far down the long axis of the ellipsoid to fall outside of the actual curved valley of improved values, even though it is quite nearby. This paper presents a differential equation that `follows' these valleys, based on the technique of geodesic acceleration, which itself provides a 2$^\mathrm{nd}$ order improvement to the Levenberg--Marquardt iteration step. Higher derivatives of this equation are computed that allow $n^\mathrm{th}$ order improvements to the optimisation methods to be derived. These higher-order accelerated methods up to 4$^\mathrm{th}$ order are tested numerically and shown to provide substantial reduction of both number of steps and computation time.
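
A minimal numpy sketch of the 2nd-order (geodesic acceleration) step that the paper's higher-order methods extend: solve the damped normal equations for the velocity, then correct with the second directional derivative of the residuals along that velocity. The fixed damping, finite-difference scheme, and Rosenbrock-style test problem are our illustrative choices.

```python
import numpy as np

def lm_step_with_acceleration(residual, x, lam=1e-3, h=1e-4):
    """One Levenberg-Marquardt step plus geodesic acceleration: solve
    (J^T J + lam I) v = -J^T r for the velocity v, then correct with the
    curvature term r_vv (second directional derivative of the residuals
    along v, here by central finite differences)."""
    r = residual(x)
    J = numerical_jacobian(residual, x)
    A = J.T @ J + lam * np.eye(len(x))
    v = np.linalg.solve(A, -J.T @ r)                    # 1st-order (LM) step
    r_vv = (residual(x + h * v) - 2 * r + residual(x - h * v)) / h**2
    a = np.linalg.solve(A, -J.T @ r_vv)                 # geodesic acceleration
    return x + v + 0.5 * a                              # 2nd-order update

def numerical_jacobian(residual, x, h=1e-6):
    r0 = residual(x)
    J = np.zeros((len(r0), len(x)))
    for j in range(len(x)):
        e = np.zeros(len(x)); e[j] = h
        J[:, j] = (residual(x + e) - r0) / h
    return J

# Rosenbrock-style residuals: a classic long, narrow, curved valley.
residual = lambda p: np.array([10 * (p[1] - p[0] ** 2), 1 - p[0]])
x = np.array([-1.2, 1.0])
for _ in range(50):
    x = lm_step_with_acceleration(residual, x)
print(x)  # approaches the minimum at [1, 1]
```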

Positive linear programs (LPs) model many graph and operations research problems. One can solve for a $(1+\epsilon)$-approximation for positive LPs, for any selected $\epsilon$, in polylogarithmic depth and near-linear work via variations of the multiplicative weight update (MWU) method. Despite extensive theoretical work on these algorithms through the decades, their empirical performance is not well understood. In this work, we implement and test an efficient parallel algorithm for solving positive LP relaxations, and apply it to graph problems such as densest subgraph, bipartite matching, vertex cover and dominating set. We accelerate the algorithm via a new step size search heuristic. Our implementation uses sparse linear algebra optimization techniques such as fusion of vector operations and use of sparse format. Furthermore, we devise an implicit representation for graph incidence constraints. We demonstrate parallel scalability using OpenMP threading and MPI on the Stampede2 supercomputer. We compare this implementation with exact libraries and specialized libraries for the above problems in order to evaluate MWU's practical standing, in both accuracy and performance, among other methods. Our results show this implementation is faster than general purpose LP solvers (IBM CPLEX, Gurobi) in all of our experiments, and in some instances, outperforms state-of-the-art specialized parallel graph algorithms.
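
For flavor, here is a textbook MWU feasibility sketch for a packing LP over the simplex (our illustration; the paper's solver, step-size search, and sparse-algebra optimizations are far more refined). Weights live on the constraints, and a trivial oracle picks the single coordinate that best satisfies the weighted average constraint.

```python
import numpy as np

def mwu_packing(A, b, eps=0.05, T=2000):
    """Multiplicative-weights sketch for packing feasibility: is there x in
    the simplex with Ax <= b? Returns an approximately feasible average
    iterate, or None with a weight vector certifying infeasibility."""
    m, n = A.shape
    w = np.ones(m)
    rho = np.abs(A).max() + np.abs(b).max()   # scales losses into [-1, 1]
    xs = []
    for _ in range(T):
        p = w / w.sum()
        j = np.argmin(p @ A)                  # best pure response on the simplex
        if p @ A[:, j] > p @ b:
            return None                       # no x in the simplex can satisfy p
        x = np.zeros(n); x[j] = 1.0
        xs.append(x)
        violation = A[:, j] - b               # per-constraint slack/violation
        w *= 1.0 + eps * violation / rho      # up-weight violated constraints
    return np.mean(xs, axis=0)                # satisfies Ax <= b + O(eps * rho)

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 1.0])
x = mwu_packing(A, b)
print(x, A @ x)  # each row of A @ x exceeds b only by a small additive error
```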

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
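
The re-weighting scheme follows directly from the formula above: weight each class by the inverse of its effective number of samples $(1-\beta^{n})/(1-\beta)$. Below is a minimal numpy sketch; the normalization convention (weights summing to the number of classes) is our choice for illustration.

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights from the effective number of samples
    E_n = (1 - beta^n) / (1 - beta); weight ~ 1 / E_n, normalized here
    so that the weights sum to the number of classes."""
    n = np.asarray(samples_per_class, dtype=float)
    effective_num = (1.0 - beta ** n) / (1.0 - beta)
    w = 1.0 / effective_num
    return w * len(n) / w.sum()

# Long-tailed toy: the head class has 10k samples, the tail class only 10.
counts = [10000, 2000, 100, 10]
w = class_balanced_weights(counts)
print(w.round(3))  # tail classes receive much larger weights

def class_balanced_ce(logits, label, weights):
    """Softmax cross-entropy re-weighted by the label's class weight."""
    logp = logits - np.log(np.exp(logits - logits.max()).sum()) - logits.max()
    return -weights[label] * logp[label]
```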
