
Staggered treatment adoption arises in the evaluation of policy impact and implementation in a variety of settings. This occurs in both randomized stepped-wedge trials and non-randomized quasi-experimental designs using causal inference methods based on difference-in-differences analysis. In both settings, it is crucial to carefully consider the target estimand and possible treatment effect heterogeneities in order to estimate the effect without bias and in an interpretable fashion. This paper proposes a novel non-parametric approach to this estimation for either setting. By constructing an estimator using two-by-two difference-in-differences comparisons as building blocks with arbitrary weights, the investigator can select weights to target the desired estimand in an unbiased manner under assumed treatment effect homogeneity, and minimize the variance under an assumed working covariance structure. This provides desirable bias properties with a relatively small sacrifice in variance and power by using the comparisons efficiently. The method is demonstrated on toy examples to show the process, as well as in the re-analysis of a stepped-wedge trial on the impact of novel tuberculosis diagnostic tools. A full algorithm with R code is provided to implement this method. The proposed method allows for high flexibility and clear targeting of desired effects, providing one solution to the bias-variance-generalizability tradeoff.
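The building-block idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or its R code: the function names, the toy outcome table, and the weights are all hypothetical, and the weights are simply chosen to sum to one rather than derived from an unbiasedness constraint or a working covariance structure. The toy data assume unit fixed effects, a common trend of +1 per period, and a homogeneous treatment effect of 2.

```python
def two_by_two_did(y, i, j, t0, t1):
    """Two-by-two difference-in-differences: change in unit i minus change in unit j."""
    return (y[i][t1] - y[i][t0]) - (y[j][t1] - y[j][t0])


def weighted_did(y, comparisons, weights):
    """Weighted sum of two-by-two comparisons (weights supplied by the caller)."""
    return sum(w * two_by_two_did(y, *c) for c, w in zip(comparisons, weights))


# Hypothetical toy outcomes: rows are units, columns are periods 0..2.
# Unit 0 adopts treatment in period 1, unit 1 in period 2, unit 2 never adopts.
y = [
    [1.0, 4.0, 5.0],
    [2.0, 3.0, 6.0],
    [0.0, 1.0, 2.0],
]

# Each tuple is (switching unit, comparison unit, period before, period after).
comparisons = [(0, 1, 0, 1), (0, 2, 0, 1), (1, 2, 1, 2)]
weights = [0.5, 0.25, 0.25]  # illustrative only; they sum to one

estimate = weighted_did(y, comparisons, weights)
print(estimate)
```

Because every comparison in this toy table recovers the same effect, any weights that sum to one give the same estimate; the choice of weights only matters for variance, or when effects are heterogeneous.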

Related content

Traditional methods for point forecasting in univariate random walks often fail to surpass naive benchmarks due to data unpredictability. This study introduces a novel forecasting method that fuses movement prediction (binary classification) with naive forecasts for accurate one-step-ahead point forecasting. The method's efficacy is demonstrated through theoretical analysis, simulations, and real-world data experiments. It reliably exceeds naive forecasts with movement prediction accuracies as low as 0.55, outperforming baseline models like ARIMA, linear regression, MLP, and LSTM networks in forecasting the S&P 500 index and Bitcoin prices. This method is particularly advantageous when accurate point predictions are challenging but accurate movement predictions are attainable, translating movement predictions into point forecasts in random walk contexts.
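One way to picture how a movement prediction might be turned into a point forecast is sketched below. The abstract does not give the fusion rule, so this is an assumption rather than the study's method: the naive forecast (the last observed value) is shifted in the predicted direction by the mean absolute one-step change seen so far. The function names and the price series are hypothetical.

```python
def mean_abs_change(series):
    """Average absolute one-step change in the observed series."""
    diffs = [abs(b - a) for a, b in zip(series, series[1:])]
    return sum(diffs) / len(diffs)


def fused_forecast(series, predicted_up):
    """Naive forecast shifted by the typical step size in the predicted direction.

    Assumed fusion rule for illustration only, not taken from the source.
    """
    direction = 1.0 if predicted_up else -1.0
    return series[-1] + direction * mean_abs_change(series)


prices = [100.0, 102.0, 101.0, 104.0]  # hypothetical observations
naive = prices[-1]                     # naive one-step-ahead forecast
up_forecast = fused_forecast(prices, predicted_up=True)
down_forecast = fused_forecast(prices, predicted_up=False)
print(naive, up_forecast, down_forecast)
```

In this sketch the classifier's output enters only through `predicted_up`, so any binary movement model could supply it.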

This work presents a procedure to solve the Euler equations by explicitly updating, in a conservative manner, a generic thermodynamic variable such as temperature, pressure or entropy instead of the total energy. The presented procedure is valid for any equation of state and spatial discretization. When using complex equations of state such as Span-Wagner, choosing the temperature as the generic thermodynamic variable yields great reductions in the computational costs associated with thermodynamic evaluations. Results computed with a state-of-the-art thermodynamic model are presented, and computational times are analyzed. Particular attention is dedicated to the conservation of total energy, the propagation speed of shock waves and jump conditions. The procedure is thoroughly tested using the Span-Wagner equation of state through the CoolProp thermodynamic library and the Van der Waals equation of state, both in the ideal and non-ideal compressible fluid-dynamics regimes, by comparing it to the standard total energy update and analytical solutions where available.

Remote proctoring technology, a cheating-preventive measure, often raises privacy and fairness concerns that may affect test-takers' experiences and the validity of test results. Our study explores how selectively obfuscating information in video recordings can protect test-takers' privacy while ensuring effective and fair cheating detection. Interviews with experts (N=9) identified four key video regions indicative of potential cheating behaviors: the test-taker's face, body, background and the presence of individuals in the background. Experts recommended specific obfuscation methods for each region based on privacy significance and cheating behavior frequency, ranging from conventional blurring to advanced methods like replacement with deepfake, 3D avatars and silhouetting. We then conducted a vignette experiment with potential test-takers (N=259, non-experts) to evaluate their perceptions of cheating detection, visual privacy and fairness, using descriptions and examples of still images for each expert-recommended combination of video regions and obfuscation methods. Our results indicate that the effectiveness of obfuscation methods varies by region. Tailoring remote proctoring with region-specific advanced obfuscation methods can improve the perceptions of privacy and fairness compared to the conventional methods, though it may decrease perceived information sufficiency for detecting cheating. However, non-experts preferred conventional blurring for videos they were more willing to share, highlighting a gap between the perceived effectiveness of the advanced obfuscation methods and their practical acceptance. This study contributes to the field of user-centered privacy by suggesting promising directions to address current remote proctoring challenges and guiding future research.

Evaluation of multilingual Large Language Models (LLMs) is challenging due to a variety of factors: the lack of benchmarks with sufficient linguistic diversity, contamination of popular benchmarks into LLM pre-training data, and the lack of local, cultural nuances in translated benchmarks. In this work, we study human and LLM-based evaluation in a multilingual, multi-cultural setting. We evaluate 30 models across 10 Indic languages by conducting 90K human evaluations and 30K LLM-based evaluations and find that models such as GPT-4o and Llama-3 70B consistently perform best for most Indic languages. We build leaderboards for two evaluation settings, pairwise comparison and direct assessment, and analyse the agreement between humans and LLMs. We find that humans and LLMs agree fairly well in the pairwise setting, but the agreement drops for direct assessment evaluation, especially for languages such as Bengali and Odia. We also check for various biases in human and LLM-based evaluation and find evidence of self-bias in the GPT-based evaluator. Our work presents a significant step towards scaling up multilingual evaluation of LLMs.

We devise a control-theoretic reinforcement learning approach to support direct learning of the optimal policy. We establish theoretical properties of our approach and derive an algorithm based on a specific instance of this approach. Our empirical results demonstrate the significant benefits of our approach.

We consider the method of mappings for performing shape optimization for unsteady fluid-structure interaction (FSI) problems. In this work, we focus on the numerical implementation. We model the optimization problem such that it takes several theoretical results into account, such as regularity requirements on the transformations and a differential geometrical point of view on the manifold of shapes. Moreover, we discretize the problem such that we can compute exact discrete gradients. This allows for the use of general purpose optimization solvers. We focus on an FSI benchmark problem to validate our numerical implementation. The method is used to optimize parts of the outer boundary and the interface. The numerical simulations build on FEniCS, dolfin-adjoint and IPOPT. Moreover, as an additional theoretical result, we show that for a linear special case the adjoint attains the same structure as the forward problem but reverses the temporal flow of information.

The widely adopted Business Process Model and Notation (BPMN) is a cornerstone of industry standards for business process modeling. However, its ambiguous execution semantics often result in inconsistent interpretations, depending on the software used for implementation. In response, the Process Specification Language (PASS) provides formally defined semantics to overcome these interpretational challenges. Despite its clear advantages, PASS has not reached the same level of industry penetration as BPMN. This feasibility study proposes using PASS as an intermediary framework to translate and execute BPMN models. It describes the development of a prototype translator that converts specific BPMN elements into a format compatible with PASS. These models are then transformed into source code and executed in a bespoke workflow environment, marking a departure from traditional BPMN implementations. Our findings suggest that integrating PASS enhances compatibility across different modeling and execution tools and offers a more robust methodology for implementing business processes across organizations. This study lays the groundwork for more accurate and unified business process model executions, potentially transforming industry standards for process modeling and execution.

Segmentation models for brain lesions in MRI are commonly developed for a specific disease and trained on data with a predefined set of MRI modalities. Each such model cannot segment the disease using data with a different set of MRI modalities, nor can it segment any other type of disease. Moreover, this training paradigm does not allow a model to benefit from learning from heterogeneous databases that may contain scans and segmentation labels for different types of brain pathologies and diverse sets of MRI modalities. Is it feasible to use Federated Learning (FL) for training a single model on client databases that contain scans and labels of different brain pathologies and diverse sets of MRI modalities? We demonstrate promising results by combining appropriate, simple, and practical modifications to the model and training strategy: designing a model with input channels that cover the whole set of modalities available across clients, training with random modality drop, and exploring the effects of feature normalization methods. Evaluation on 7 brain MRI databases with 5 different diseases shows that such an FL framework can train a single model that is shown to be very promising in segmenting all disease types seen during training. Importantly, it is able to segment these diseases in new databases that contain sets of modalities different from those in training clients. These results demonstrate, for the first time, the feasibility and effectiveness of using FL to train a single segmentation model on decentralised data with diverse brain diseases and MRI modalities, a necessary step towards leveraging heterogeneous real-world databases. Code will be made available at: //github.com/FelixWag/FL-MultiDisease-MRI

Despite advances in areas such as the personalization of robots, sustaining adoption of robots for long-term use in families remains a challenge. Recent studies have identified integrating robots into families' routines and rituals as a promising approach to support long-term adoption. However, few studies have explored the integration of robots into family routines, and there is a gap in systematic measures to capture family preferences for robot integration. Building upon existing routine inventories, we developed the Family-Robot Routines Inventory (FRRI), with 24 family routines and 24 child routine items, to capture parents' attitudes toward and expectations from the integration of robotic technology into their family routines. Using this inventory, we collected data from 150 parents through an online survey. Our analysis indicates that parents had varying perceptions of the utility of integrating robots into their routines. For example, parents found robot integration to be more helpful in children's individual routines than in the collective routines of their families. We discuss the design implications of these preliminary findings, and how they may serve as a first step toward understanding the diverse challenges and demands of designing and integrating household robots for families.

The success of AI models relies on the availability of large, diverse, and high-quality datasets, which can be challenging to obtain due to data scarcity, privacy concerns, and high costs. Synthetic data has emerged as a promising solution by generating artificial data that mimics real-world patterns. This paper provides an overview of synthetic data research, discussing its applications, challenges, and future directions. We present empirical evidence from prior art to demonstrate its effectiveness and highlight the importance of ensuring its factuality, fidelity, and unbiasedness. We emphasize the need for responsible use of synthetic data to build more powerful, inclusive, and trustworthy language models.
