International Journal of Science and Research (IJSR)

Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064



India | Computer Science Engineering | Volume 9 Issue 3, March 2020 | Pages: 1678 - 1689


Chaos Engineering for Building Resilient Distributed Systems

Venkata Naga Sai Kiran Challa

Abstract: Chaos Engineering is a methodology for verifying the reliability and fault tolerance of distributed systems. By deliberately injecting faults, it tests how systems behave under real-world failure conditions and exposes vulnerabilities that traditional testing may miss. This proactive approach helps organizations such as Netflix, Amazon, Google, and Microsoft maintain high availability of their services. Integrating Machine Learning (ML) into Chaos Engineering further enhances its effectiveness by predicting anomalies, automating experiments, and improving observability. This combined strategy promotes a culture of continuous learning and resilience, which is crucial for modern, complex systems.

Keywords: Chaos Engineering, fault tolerance, distributed systems, Machine Learning, resilience


