United States | Information Technology | Volume 11 Issue 12, December 2022 | Pages: 1505 - 1511
Making AI Explainable for Stronger Distributed Systems: Clear Machine Learning for Smarter Fault Tolerance
Abstract: Today's distributed systems depend heavily on machine learning (ML) to predict and recover from faults, but the "black-box" nature of many ML models makes them hard to trust and understand. To address this, we present an approach that integrates interpretable ML (IML) methods, such as SHAP, LIME, and rule-based models, into adaptive fault tolerance systems. Unlike traditional methods that focus only on accuracy, our framework both predicts failures effectively and explains why they happen in terms humans can grasp. We built a hybrid system that pairs real-time ML fault detection with explainable decision-making, helping system operators trust and act on AI-driven insights. In tests on the Parallel Distributed Task Infrastructure (PDTI), our method cuts false alarms by 30% compared to deep learning models while maintaining over 95% recovery accuracy across different failure scenarios. We also explore the trade-off between explainability and computational cost, giving practical advice for using explainable AI (XAI) in time-sensitive systems. This research closes the gap between fully automated resilience and human oversight, making distributed systems more transparent and reliable, especially in large-scale, dynamic environments.
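To make the idea of an explained fault prediction concrete, the following is a minimal sketch and not the paper's implementation. It scores a node with a simple linear model and reports each feature's additive contribution relative to a baseline, which for a linear model with independent features coincides with its SHAP values. The feature names, weights, baselines, and threshold are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical features, weights, and baselines: illustrative only, not from the paper.
WEIGHTS = {"cpu_load": 2.0, "heartbeat_delay_s": 1.5, "error_rate": 4.0}
BASELINE = {"cpu_load": 0.5, "heartbeat_delay_s": 0.2, "error_rate": 0.01}
BIAS = -1.0       # score of a node sitting exactly at the baseline values
THRESHOLD = 0.0   # assumed decision threshold: flag a fault when score > 0


@dataclass
class Explanation:
    score: float
    is_fault: bool
    contributions: dict  # feature name -> additive contribution to the score


def explain(sample):
    """Score one telemetry sample and attribute the score to its features."""
    contributions = {
        name: WEIGHTS[name] * (sample[name] - BASELINE[name]) for name in WEIGHTS
    }
    score = BIAS + sum(contributions.values())
    return Explanation(score, score > THRESHOLD, contributions)


def top_reasons(explanation, k=2):
    """Return the k features with the largest absolute contribution."""
    ranked = sorted(
        explanation.contributions.items(), key=lambda kv: abs(kv[1]), reverse=True
    )
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    sample = {"cpu_load": 0.9, "heartbeat_delay_s": 1.2, "error_rate": 0.01}
    result = explain(sample)
    print(result.is_fault, top_reasons(result))
```

An operator-facing system could surface the output of `top_reasons` next to each alert, so that a flagged node comes with the signals that drove the decision rather than a bare score.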
Keywords: Adaptive fault tolerance, distributed systems, explainable AI (XAI), interpretable machine learning, fault prediction, real-time monitoring, system resilience, parallel computing, heterogeneous networks, trustworthy AI
How to Cite?: Rajani Kumari Vaddepalli, "Making AI Explainable for Stronger Distributed Systems: Clear Machine Learning for Smarter Fault Tolerance", Volume 11 Issue 12, December 2022, International Journal of Science and Research (IJSR), Pages: 1505-1511, https://www.ijsr.net/getabstract.php?paperid=SR221223115122, DOI: https://dx.doi.org/10.21275/SR221223115122