The assumption of independent or i.i.d. innovations is essential in the literature on analyzing vector time series. However, this assumption is either too restrictive for real-life time series to satisfy or hard to verify through a hypothesis test. This paper performs statistical inference on a sparse high-dimensional vector autoregressive time series, allowing its white-noise innovations to be dependent and even non-stationary. To achieve this goal, it adopts a post-selection estimator to fit the vector autoregressive model and derives the asymptotic distribution of the post-selection estimator. Because the innovations are not assumed to be independent, the covariance matrices of the autoregressive coefficient estimators are complex and difficult to estimate. We develop a bootstrap algorithm so that practitioners can perform statistical inference without engaging in sophisticated calculations. Simulations and real-life data experiments confirm the validity of the proposed methods and theoretical results. Real-life data rarely satisfy an autoregressive model with independent or i.i.d. innovations exactly, so our work should reflect reality better than the literature that assumes i.i.d. innovations.
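The abstract does not specify the paper's bootstrap algorithm; as a minimal sketch of the general idea (resampling residuals in a way that preserves their dependence), here is a moving-block residual bootstrap for an AR(1) whose innovations are uncorrelated but dependent (an ARCH-type construction). The model, block length, and all other choices are illustrative assumptions, not the paper's procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# AR(1) series whose innovations are uncorrelated but *dependent*
# (ARCH-type conditional heteroscedasticity)
n, phi = 500, 0.6
eps = rng.standard_normal(n + 1)
innov = eps[1:] * np.sqrt(0.5 + 0.5 * eps[:-1] ** 2)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + innov[t]

def ols_ar1(y):
    # least-squares estimate of the AR(1) coefficient
    return y[:-1] @ y[1:] / (y[:-1] @ y[:-1])

phi_hat = ols_ar1(y)
resid = y[1:] - phi_hat * y[:-1]

# moving-block bootstrap of the residuals, preserving serial dependence
B, block = 500, 20
ests = []
for _ in range(B):
    starts = rng.integers(0, len(resid) - block, size=n // block + 1)
    r = np.concatenate([resid[s:s + block] for s in starts])[:n]
    yb = np.zeros(n)
    for t in range(1, n):
        yb[t] = phi_hat * yb[t - 1] + r[t]
    ests.append(ols_ar1(yb))
lo, hi = np.quantile(ests, [0.05, 0.95])  # 90% percentile interval for phi
```

The percentile interval sidesteps the explicit estimation of the estimator's covariance, which is the practical point the abstract makes.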
Aberrant respondents are common yet extremely detrimental to the quality of social surveys and questionnaires. Recently, factor mixture models have been employed to identify individuals providing deceptive or careless responses. We propose a comprehensive factor mixture model that combines confirmatory and exploratory factor models to represent both the non-aberrant and aberrant components of the responses. The flexibility of the proposed solution allows for the identification of two of the most common aberrant response styles, namely faking and careless responding. We validated our approach by means of two simulation studies and two case studies. The results indicate the effectiveness of the proposed model in handling aberrant responses in social and behavioral surveys.
In this work, we consider a class of continuous-time autonomous dynamical systems describing many important phenomena and processes arising in real-world applications. We apply the nonstandard finite difference (NSFD) methodology proposed by Mickens to design a generalized NSFD method for the dynamical system models under consideration. The method is constructed from a novel non-local approximation of the right-hand-side functions of the dynamical systems. Rigorous mathematical analysis proves that the NSFD method is dynamically consistent with respect to positivity, asymptotic stability, and three classes of conservation laws: direct conservation, generalized conservation, and sub-conservation laws. Furthermore, the NSFD method is easy to implement and can be applied to a broad range of mathematical models arising in real-life applications. Finally, a set of numerical experiments illustrates the theoretical findings and shows the advantages of the proposed NSFD method.
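The abstract does not give the generalized method's formulas, but the flavor of Mickens-style NSFD design can be sketched on the logistic equation u' = u(1 - u), using a nonstandard denominator φ(h) = e^h − 1 and the non-local approximation u² ≈ u_k u_{k+1}. Both choices are classical textbook examples, not the paper's scheme; for this particular equation they happen to reproduce the exact solution while preserving positivity for any step size.

```python
import numpy as np

def nsfd_logistic(u0, h, steps):
    """NSFD scheme for u' = u(1 - u): with phi = e^h - 1 and u^2 ~ u_k u_{k+1},
        (u_{k+1} - u_k) / phi = u_k - u_k u_{k+1}
    solves to u_{k+1} = (1 + phi) u_k / (1 + phi u_k), positive for all h > 0."""
    phi = np.expm1(h)
    us = [u0]
    for _ in range(steps):
        u = us[-1]
        us.append((1 + phi) * u / (1 + phi * u))
    return np.array(us)

u = nsfd_logistic(0.5, h=0.5, steps=20)
t = 0.5 * np.arange(21)
exact = 0.5 * np.exp(t) / (1 + 0.5 * (np.exp(t) - 1))
u_big_step = nsfd_logistic(0.5, h=10.0, steps=5)  # still positive and stable
```

Positivity and asymptotic stability of the equilibrium u = 1 hold unconditionally here, which is exactly the kind of dynamic consistency the abstract refers to.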
Currently, the cloud computing paradigm is experiencing rapid growth as workloads shift from other distributed computing methods and traditional IT infrastructure towards it. Consequently, optimised task scheduling techniques have become crucial in managing the expanding cloud computing environment. In cloud computing, numerous tasks must be scheduled on a limited number of diverse virtual machines so as to minimise the degree of imbalance across machines and optimise system utilisation. Task scheduling is NP-complete, so no polynomial-time exact algorithm is known and only near-optimal results are achievable in practice, particularly for large-scale task sets in cloud computing. This paper proposes an optimised strategy, Cuckoo-based Discrete Symbiotic Organisms Search (C-DSOS), which incorporates Lévy flights for optimal task scheduling in the cloud computing environment with the aim of minimising the degree of imbalance. The strategy is based on the standard Symbiotic Organisms Search (SOS), a nature-inspired metaheuristic optimisation algorithm designed for numerical optimisation problems. SOS simulates the symbiotic relationships observed in ecosystems: mutualism, commensalism, and parasitism. To evaluate the proposed technique, experiments were conducted with the CloudSim toolkit simulator. The results demonstrate that C-DSOS outperforms the Simulated Annealing Symbiotic Organism Search (SASOS) algorithm, a benchmark algorithm commonly used for task scheduling problems. C-DSOS exhibits a favourable convergence rate, especially in larger search spaces, making it suitable for task scheduling problems in the cloud. A t-test reveals that the improvement of C-DSOS over the benchmarked SASOS algorithm is statistically significant, particularly in scenarios involving a large search space.
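The discretisation and Lévy-flight step of C-DSOS are not described in the abstract, but the three SOS phases it names can be sketched on a continuous test function. The following is a minimal sketch of canonical SOS (mutualism, commensalism, parasitism) on the sphere function; population size, bounds, and the 0.5 parasite mutation rate are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):
    return float(np.sum(x ** 2))

def sos(f, dim=5, pop=20, iters=200, lo=-5.0, hi=5.0):
    X = rng.uniform(lo, hi, size=(pop, dim))
    fit = np.array([f(x) for x in X])
    for _ in range(iters):
        for i in range(pop):
            best = X[np.argmin(fit)]
            # mutualism: i and a random partner j both move toward the best
            j = rng.choice([k for k in range(pop) if k != i])
            mutual = (X[i] + X[j]) / 2
            bf1, bf2 = rng.integers(1, 3, size=2)  # benefit factors in {1, 2}
            for k, step in ((i, bf1), (j, bf2)):
                xk = np.clip(X[k] + rng.random(dim) * (best - mutual * step), lo, hi)
                fx = f(xk)
                if fx < fit[k]:
                    X[k], fit[k] = xk, fx
            # commensalism: i benefits from j, j is unaffected
            j = rng.choice([k for k in range(pop) if k != i])
            xi = np.clip(X[i] + rng.uniform(-1, 1, dim) * (best - X[j]), lo, hi)
            fx = f(xi)
            if fx < fit[i]:
                X[i], fit[i] = xi, fx
            # parasitism: a mutated copy of i competes with a random j
            j = rng.choice([k for k in range(pop) if k != i])
            parasite = X[i].copy()
            mask = rng.random(dim) < 0.5
            parasite[mask] = rng.uniform(lo, hi, size=mask.sum())
            fx = f(parasite)
            if fx < fit[j]:
                X[j], fit[j] = parasite, fx
    return fit.min()

best_val = sos(sphere)
```

Note that SOS needs no algorithm-specific tuning parameters beyond population size and iteration count, one reason it is attractive as a scheduling metaheuristic.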
Controlling the departure times of trucks from a container hub is important to both the traffic and logistics systems. This, however, requires an intelligent decision support system that can control and manage truck arrival times at terminal gates. This paper introduces an integrated model that can be used to understand, predict, and control logistics and traffic interactions in the port-hinterland ecosystem. The approach is context-aware and uses big historical data to predict system states and to apply control policies on truck inflow and outflow accordingly. The control policies ensure the satisfaction of multiple stakeholders, including trucking companies, terminal operators, and road traffic agencies. The proposed method consists of five integrated modules orchestrated to systematically steer truckers toward the time slots that are expected to yield lower gate waiting times and more cost-effective schedules. A simulation supported by real-world data shows that significant gains can be obtained in the system.
Following Inoue et al., we define a word to be a repetition if it is a (fractional) power of exponent at least 2. A word has a repetition factorization if it is the product of repetitions. We study repetition factorizations in several (generalized) automatic sequences, including the infinite Fibonacci word, the Thue-Morse word, paperfolding words, and the Rudin-Shapiro sequence.
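The two definitions above translate directly into a small dynamic program: a word of length L with smallest period p is a fractional power of exponent L/p, so it is a repetition iff L ≥ 2p, and a word has a repetition factorization iff some prefix does and the remaining suffix is a repetition. A sketch (smallest periods via the KMP failure function; the specific words tested are our own examples, not results from the paper):

```python
def has_repetition_factorization(w):
    """True iff w is a product of repetitions, i.e. of words whose
    exponent (length / smallest period) is at least 2."""
    n = len(w)
    # rep[j][i]: w[j:i] is a repetition
    rep = [[False] * (n + 1) for _ in range(n + 1)]
    for j in range(n):
        s = w[j:]
        m = len(s)
        fail = [0] * (m + 1)          # KMP failure function of s
        k = 0
        for i in range(1, m):
            while k and s[i] != s[k]:
                k = fail[k]
            if s[i] == s[k]:
                k += 1
            fail[i + 1] = k
        for L in range(1, m + 1):
            period = L - fail[L]       # smallest period of s[:L]
            rep[j][j + L] = (L >= 2 * period)
    # can[i]: the prefix w[:i] has a repetition factorization
    can = [False] * (n + 1)
    can[0] = True
    for i in range(1, n + 1):
        can[i] = any(can[j] and rep[j][i] for j in range(i))
    return can[n]
```

For example, "aabb" factors as "aa" · "bb", while "aaab" has no factorization because no repetition ends at its final letter "b".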
Nonlinear systems arising from time integrators like Backward Euler can sometimes be reformulated as optimization problems, known as incremental potentials. We show through a comprehensive experimental analysis that the widely used Projected Newton method, which relies on unconditional semidefinite projection of Hessian contributions, typically exhibits a reduced convergence rate compared to classical Newton's method. We demonstrate how factors like resolution, element order, projection method, material model, and boundary handling impact the convergence of Projected Newton and Newton. Drawing on these findings, we propose the hybrid method Project-on-Demand Newton, which projects only conditionally, and show that it enjoys both the robustness of Projected Newton and the convergence rate of Newton. We additionally introduce Kinetic Newton, a regularization-based method that takes advantage of the structure of incremental potentials and avoids projection altogether. We compare the four solvers on hyperelasticity and contact problems. We also present a nuanced discussion of convergence criteria, and propose a new acceleration-based criterion that avoids problems associated with existing residual norm criteria and is easier to interpret. We finally address a fundamental limitation of the Armijo backtracking line search that occasionally blocks convergence, especially for stiff problems. We propose a novel parameter-free, robust line search technique to eliminate this issue.
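The semidefinite projection the abstract refers to can be sketched as an eigenvalue clamp. The paper's actual conditional trigger for Project-on-Demand Newton is not given in the abstract; a common proxy, used below, is to project only when a Cholesky factorization of the Hessian fails, i.e. when the raw Newton system is not positive definite.

```python
import numpy as np

def project_psd(H, eps=0.0):
    """Semidefinite projection of a symmetric matrix: clamp eigenvalues at eps."""
    w, V = np.linalg.eigh(H)
    return (V * np.maximum(w, eps)) @ V.T

def newton_direction_on_demand(H, g, eps=1e-8):
    """Project-on-demand sketch: use the raw Hessian when it is positive
    definite (Cholesky succeeds); otherwise fall back to the projection."""
    try:
        L = np.linalg.cholesky(H)
        y = np.linalg.solve(L, -g)
        return np.linalg.solve(L.T, y)
    except np.linalg.LinAlgError:
        return np.linalg.solve(project_psd(H, eps), -g)

H_pd = np.array([[2.0, 1.0], [1.0, 2.0]])
H_ind = np.array([[2.0, 0.0], [0.0, -1.0]])   # indefinite
g = np.array([1.0, 1.0])
d_pd = newton_direction_on_demand(H_pd, g)     # pure Newton step
d_ind = newton_direction_on_demand(H_ind, g)   # projected fallback
```

The eps floor keeps the projected system invertible and guarantees the fallback direction is a descent direction (g·d < 0), which is what Projected Newton trades convergence rate for.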
For the iterative decoupling of elliptic-parabolic problems such as poroelasticity, we introduce time discretization schemes of up to order $5$ based on the backward differentiation formulae. Their analysis combines techniques known from fixed-point iterations with the convergence analysis of the temporal discretization. As the main result, we show that the convergence depends on the interplay between the time step size and the parameters governing the contraction of the iterative scheme. Moreover, this connection is quantified explicitly, which allows for balancing the individual error components. Several numerical experiments illustrate and validate the theoretical results, including a three-dimensional example from biomechanics.
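The backward differentiation formulae underlying the schemes can be generated from a small linear system: requiring the formula to be exact on polynomials up to degree k yields the BDF-k coefficients. The sketch below does this and checks the expected second-order convergence of BDF2 on the scalar decay problem y' = −y; it illustrates only the time discretization, not the paper's decoupling iteration.

```python
import numpy as np

def bdf_coefficients(k):
    """Coefficients a_0..a_k of BDF-k, written as
        sum_j a_j y_{n+1-j} = h f(t_{n+1}, y_{n+1}).
    Exactness on t^m over nodes t_j = -j gives
        sum_j a_j (-j)^m = (1 if m == 1 else 0),  m = 0..k."""
    M = np.array([[float((-j) ** m) for j in range(k + 1)]
                  for m in range(k + 1)])
    rhs = np.zeros(k + 1)
    rhs[1] = 1.0
    return np.linalg.solve(M, rhs)

def bdf2_decay(lam, y0, T, N):
    """BDF2 applied to y' = lam * y, with an exact second starting value."""
    h = T / N
    a0, a1, a2 = bdf_coefficients(2)
    ys = [y0, np.exp(lam * h) * y0]
    for _ in range(N - 1):
        # a0 y_{n+1} + a1 y_n + a2 y_{n-1} = h lam y_{n+1}
        ys.append((-a1 * ys[-1] - a2 * ys[-2]) / (a0 - h * lam))
    return ys[-1]

err40 = abs(bdf2_decay(-1.0, 1.0, 1.0, 40) - np.exp(-1.0))
err80 = abs(bdf2_decay(-1.0, 1.0, 1.0, 80) - np.exp(-1.0))
```

Halving the step size should reduce the error by roughly a factor of four, consistent with second order; the same coefficient generator produces the order-3 to order-5 formulae the abstract mentions.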
Refinement calculus provides a structured framework for the progressive and modular development of programs, ensuring their correctness throughout the refinement process. This paper introduces a refinement calculus tailored for quantum programs. To this end, we first study the partial correctness of nondeterministic programs within a quantum while language featuring prescription statements. Orthogonal projectors, which are equivalent to subspaces of the state Hilbert space, are taken as assertions for quantum states. In addition to the denotational semantics, where a nondeterministic program is associated with a set of trace-nonincreasing super-operators, we also present their semantics as transformers mapping a postcondition to the weakest liberal precondition and, conversely, a precondition to the strongest postcondition. Subsequently, refinement rules are introduced based on these dual semantics, offering a systematic approach to the incremental development of quantum programs applicable in various contexts. To illustrate the practical application of the refinement calculus, we examine examples such as the implementation of a $Z$-rotation gate, the repetition code, and the quantum-to-quantum Bernoulli factory. Furthermore, we present Quire, a Python-based interactive prototype tool that provides practical support to programmers engaged in the stepwise development of correct quantum programs.
Previous researchers conducting Just-In-Time (JIT) defect prediction tasks have primarily focused on the performance of individual pre-trained models, without exploring the relationships between different pre-trained models used as backbones. In this study, we build six models: RoBERTaJIT, CodeBERTJIT, BARTJIT, PLBARTJIT, GPT2JIT, and CodeGPTJIT, each with a distinct pre-trained model as its backbone. We systematically explore the differences and connections between these models. Specifically, we investigate the performance of the models when using commit code and commit messages as inputs, as well as the relationship between training efficiency and model distribution among these six models. Additionally, we conduct an ablation experiment to explore the sensitivity of each model to its inputs. Furthermore, we investigate how the models perform in zero-shot and few-shot scenarios. Our findings indicate that models built on different backbones all show improvements, and when the backbones' pre-training models are similar, the required training resources are much closer. We also observe that commit code plays a significant role in defect detection, and that different pre-trained models demonstrate better defect detection ability with a balanced dataset under few-shot scenarios. These results provide new insights for optimizing JIT defect prediction tasks using pre-trained models and highlight the factors that require more attention when constructing such models. Additionally, CodeGPTJIT and GPT2JIT achieved better performance than DeepJIT and CC2Vec on the two datasets, respectively, with 2000 training samples. These findings emphasize the effectiveness of transformer-based pre-trained models in JIT defect prediction tasks, especially in scenarios with limited training data.
We propose and compare methods for the analysis of extreme events in complex systems governed by PDEs that involve random parameters, in situations where we are interested in quantifying the probability that a scalar function of the system's solution is above a threshold. If the threshold is large, this probability is small and its accurate estimation is challenging. To tackle this difficulty, we blend theoretical results from large deviation theory (LDT) with numerical tools from PDE-constrained optimization. Our methods first compute parameters that minimize the LDT rate function over the set of parameters leading to extreme events, using adjoint methods to compute the gradient of this rate function. The minimizers give information about the mechanism of the extreme events as well as estimates of their probability. We then propose a series of methods to refine these estimates, either via importance sampling or geometric approximation of the extreme event sets. Results are formulated for general parameter distributions, and detailed expressions are provided for Gaussian distributions. We give theoretical and numerical arguments showing that the performance of our methods is insensitive to the extremeness of the events we are interested in. We illustrate the application of our approach by quantifying the probability of extreme tsunami events on shore. Tsunamis are typically caused by a sudden, unpredictable change of the ocean floor elevation during an earthquake. We model this change as a random process that takes the underlying physics into account. We use the one-dimensional shallow water equations to model tsunamis numerically. In the context of this example, we present a comparison of our methods for extreme event probability estimation, and find which type of ocean floor elevation change leads to the largest tsunamis on shore.
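The two-step structure (minimize the rate function over the extreme-event set, then refine by importance sampling centered at the minimizer) can be sketched on a toy Gaussian problem where the answer is known in closed form. The linear observable F(θ) = a·θ, the threshold, and the use of SLSQP stand in for the paper's PDE-constrained, adjoint-based optimization; they are illustrative assumptions only.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)

# toy observable F(theta) = a . theta with theta ~ N(0, I); threshold z
a = np.array([0.6, 0.8])            # unit vector, so P(F > z) = Phi(-z)
z = 4.0

# Step 1 (LDT): minimize the Gaussian rate function ||theta||^2 / 2
# over the extreme-event set {F(theta) >= z}
res = minimize(lambda th: 0.5 * th @ th,
               x0=np.zeros(2),
               jac=lambda th: th,
               constraints={"type": "ineq",
                            "fun": lambda th: a @ th - z,
                            "jac": lambda th: a},
               method="SLSQP")
theta_star = res.x                   # dominating point ("instanton")

# Step 2: importance sampling with the shifted proposal N(theta_star, I);
# the likelihood ratio is exp(-x . theta_star + ||theta_star||^2 / 2)
N = 50_000
xs = theta_star + rng.standard_normal((N, 2))
logw = -xs @ theta_star + 0.5 * theta_star @ theta_star
est = np.mean((xs @ a > z) * np.exp(logw))
truth = norm.sf(z)                   # exact tail probability for comparison
```

Plain Monte Carlo would need on the order of 1/Φ(−4) ≈ 30,000 samples per hit; the shifted sampler concentrates on the extreme-event set, which is why the estimator stays accurate as the event becomes more extreme.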