
**DOI:** https://doi.org/10.48550/arXiv.2303.00652
**Abstract:**
Explainable artificial intelligence (XAI) methods shed light on the predictions of machine learning algorithms. Several different approaches exist and have already been applied in climate science. However, usually missing ground truth explanations complicate their evaluation and comparison, subsequently impeding the choice of the XAI method. Therefore, in this work, we introduce XAI evaluation in the climate context and discuss different desired explanation properties, namely robustness, faithfulness, randomization, complexity, and localization....
**Bullet points summary:**
- XAI evaluation is introduced to climate science to compare and assess the performance of explanation methods based on desirable properties, helping climate researchers choose suitable XAI methods.
- The paper identifies and discusses five desirable properties of XAI methods: robustness, faithfulness, randomization, complexity, and localization. These properties are evaluated in the climate context using an established classification task.
- Different XAI methods exhibit varying strengths and weaknesses with respect to the five evaluation properties, and their performance can be architecture-dependent. For example, salience methods tend to show improved faithfulness and complexity but reduced randomization skill; sensitivity methods, on the other hand, tend to achieve higher randomization skill scores but sacrifice faithfulness and complexity skill.
- The paper proposes a framework for selecting an appropriate XAI method for a specific research task: identify the essential XAI properties, calculate evaluation skill scores across these properties for different XAI methods, and then rank or compare the skill scores to determine the best-performing method or combination of methods.
- XAI evaluation can support researchers in choosing an explanation method, independent of the network structure and targeted to their specific research problem. Using evaluation metrics alongside benchmark datasets contributes to the benchmarking of explanation methods.
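The selection framework summarized above (score each method on the properties relevant to the task, then rank by skill) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the method names are common XAI techniques, but all skill-score values below are invented for the example.

```python
# Hypothetical sketch of the XAI-method ranking step: aggregate per-property
# evaluation skill scores and sort methods from best to worst.
PROPERTIES = ("robustness", "faithfulness", "randomization",
              "complexity", "localization")

# Illustrative skill scores in [0, 1], higher is better (invented numbers).
skill_scores = {
    "Gradient":   {"robustness": 0.55, "faithfulness": 0.40,
                   "randomization": 0.85, "complexity": 0.45,
                   "localization": 0.60},
    "SmoothGrad": {"robustness": 0.70, "faithfulness": 0.50,
                   "randomization": 0.80, "complexity": 0.50,
                   "localization": 0.65},
    "LRP":        {"robustness": 0.65, "faithfulness": 0.75,
                   "randomization": 0.55, "complexity": 0.70,
                   "localization": 0.70},
}

def mean_skill(scores, properties):
    """Average skill over the properties the researcher deems essential."""
    return sum(scores[p] for p in properties) / len(properties)

def rank_methods(all_scores, properties=PROPERTIES):
    """Return method names sorted from highest to lowest mean skill."""
    return sorted(all_scores,
                  key=lambda m: mean_skill(all_scores[m], properties),
                  reverse=True)

print(rank_methods(skill_scores))
```

Restricting `properties` to the subset that matters for a given research question (e.g. only `("randomization",)` for a model-sanity check) changes the ranking, which is exactly the task-dependent choice the framework is meant to support.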
**Citation:**
Bommer, Philine Lou, et al. "Finding the right XAI method—A guide for the evaluation and ranking of explainable AI methods in climate science." Artificial Intelligence for the Earth Systems 3.3 (2024): e230074.