DigitalSE Logo

Exploiting Formal Modeling in Resilient System Design: Key Concepts, Current Practice, and Innovative Approach

Abstract

While traditional fault-tolerance methods and extensive testing have contributed significantly to the realization of dependable systems, today's complex systems face increasing threats from “unknown unknowns” that result from unexpected use, misuse, and unpredictable external disruptions. Resilience, a nonfunctional property that, specifically addresses these concerns, is focused on supporting continued, dependable operations in uncertain operational environments. While the need for resilience is well recognized, the solution methods tend to be ad hoc , leading to inefficient implementations of questionable effectiveness and flexibility. This chapter describes a formal modeling approach for implementing core resilience functions and reducing the risk of disruptive events that degrade performance and dependability. The term “formal,” as used in this chapter, connotes mathematical rigor consistent with correctness proofs. The approach presented in this chapter uses a flexible contract (FC) construct that extends the conventional design-by-contract paradigm to address the flexibility needs of resilient systems. An illustrative example of the approach is presented in the context of quadcopter operations in real-world situations.


Leads

Azad M. Madni

University of Southern California

Michael Sievers

University of Southern California

Publications

  1. Abdelmalak , M. and Benidris , M. ( 2021 ). A Markov decision process to enhance power system operation resilience during hurricanes . 2021 IEEE Power & Energy Society General Meeting (PESGM) . Washington, DC, USA , 2021, pp. 1 – 5 . IEEE .

  2. Acheson , A. and Dagli , C. ( 2016 ). Modeling resilience in system of systems architecture . Procedia Computer Science 95 : 111 – 118 .

  3. Avizienis , A. , Laprie , J.-C. , Randell , B. , and Landwehr , C. ( 2004 ). Basic concepts and taxonomy of dependable and secure computing . IEEE Transactions on Dependable Computing 1 ( 1 ): 11 – 33 .

  4. Bakirtzis , G. , Sherburne , T. , Adams , S. et al. ( 2022 ). An ontological metamodel for cyber-physical system safety, security, and resilience coengineering . Software and Systems Modeling 21 : 113 – 117 .

  5. Balluchi , A. , Casagrande , A. , Collins , P. , et al. ( 2006 ). Ariadne: a framework for reachability analysis of hybrid automata . Symposium on Mathematical Theory of Networks and Systems (MTNS) (2006) Kyoto, Japan (24–28 July 2006). Semantic Scholar .

  6. Bellini , E. , Marrone , S. , and Marulli , F. ( 2021 ). Cyber resilience metamodeling: the railway communication case study . Electronics 10 ( 583 ).

  7. Bellman , R. ( 1952 ). On the theory of dynamic programming . Proceedings of the National Academy of Sciences of the United States of America 38 : 716 – 719 .

  8. Benvenuti , L. , Ferrari , A. , Mazzi , E. , and Vincentelli , A. ( 1970 ). Contract-based design for computation and verification of a closed-loop hybrid system International Workshop on Hybrid Systems: Computation and Control (22 April 2008). pp. 58 – 71 .

  9. Biggar , O. and Zamani , M. ( 2020 ). A framework for formal verification of behavior trees with linear temporal logic . IEEE Robotics and Automation Letters 5 ( 2 ): 2341 – 2348 .

  10. Büchi , J.R. ( 1960 ). Weak second-order arithmetic and finite automata . Mathematical Logic Quarterly 6 ( 1–6 ): 66 – 92 .

  11. Büchi , J.R. ( 1962 ). On a decision method in restricted second-order arithmetic . In: Logic Methodology and Philosophy of Science (ed. E. Nagel , P. Suppes , and A. Tarski ), 1 – 11 . Stanford : Stanford University Press .

  12. Catano , N. ( 2022 ). Program synthesis for cyber-resilience . IEEE Transactions on Software Engineering , 49 ( 3 ): 962 – 972 .

  13. Cichonski , P. , Millar , T. , Grance , T. , and Scarfone , K. ( 2012 ). Computer Security Incident Handling Guide . NIST Computer Security Resource Center. SP 800-61 Rev. 2

  14. Cimatti , A. and Tonetta , S. ( 2012 ). A property-based proof system for contract-based design . 38th Euromicro Conference on Software Engineering and Advanced Applications . pp. 21 – 28 . IEEE .

  15. Clarke , E. , Grumberg , O. , and Peled , D. ( 2001 ). Model Checking . MIT Press . ISBN: 978-0-262-03270-4.

  16. Clarke , E. , Emerson , E.A. , and Sifakis , J. ( 2009 ). Model checking: algorithmic verification and debugging . Communications ACM 22 ( 11 ): 74 – 84 .

  17. Delahaye , B. , Caillaud , B. , and Legay , A. ( 2010 ). Probabilistic contracts: a compositional reasoning methodology for the design of stochastic systems . In: 2010 10th International Conference on Application of Concurrency to System Design , Braga, Portugal (21–25 June 2010). pp. 223 – 232 . IEEE.

  18. Di Pietro ( 2020 ). Machine Learning with Python: Classification (complete tutorial) . Towards Data Science, 11 May 2020.

  19. Dijkstra , E.W. ( 1975 ). Guarded commands, nondeterminacy and formal derivation of programs . Communications of the ACM 18 ( 8 ): 453 – 457 .

  20. Eddy , S.R. ( 2004 ). What is dynamic programming? Nature Biotechnology 22 ( 7 ): 909 – 910 .

  21. Enseling , O. ( 2001 ). iContract: Design by Contract in Java . InfoWorld.

  22. Gerth , R. , Peled , D. , Vardi , M.Y. , and Wolper , P. ( 1995 ). Simple on-the-fly automatic verification of linear temporal logic . In: Protocol Specification, Testing and Verification XV. PSTV 1995 (ed. P. Dembinski and M. Sredniawa ). Boston, MA : IFIP Advances in Information and Communication Technology, Springer .

  23. Gholami , A. , Shekari , T. , Amirioun , M.H. et al. ( 2018 ). Toward a consensus on the definition and taxonomy of power system resilience . IEEE Access 6 : 32035 – 32053 .

  24. Goerger , S.R. , Madni , A.M. , and Eslinger , O.J. ( 2014 ). Engineered resilient systems: a DoD perspective . Procedia Computer Science 20 ( 28 ): 865 – 872 .

  25. Goldbeck , N. , Angeloudis , P. , and Ochieng , W. ( 2019 ). Resilience assessment for interdependent urban infrastructure systems using dynamic flow models . Reliability Engineering and System Safety 188 : 62 – 79 .

  26. Hastie , T. , Tibshirani , R. , and Friedman , J. ( 2009 ). Unsupervised Learning. The Elements of Statistical Learning , 485 – 585 . New York : Springer .

  27. Hoang , T.S. ( 2017 ). Appendix A: An Introduction to the Event-B Modeling Method . Berlin : Springer Verlag .

  28. Hoare , C.A.R. ( 1969 ). An axiomatic basis for computer programming . Communications of the ACM 12 ( 10 ): 576 – 585 .

  29. Holzmann , G. ( 2003 ). Trends in software verification . International Symposium on Formal Methods , Pisa, Italy, Europe (8–14 September 2003). pp. 40 – 50 . Springer .

  30. Kaelbling , L.P. , Littman , M. , and Cassandra , A. ( 1998 ). Planning and acting in partially observable stochastic domains . Artificial Intelligence 101 : 99 – 134 .

  31. Kesten , Y. , Pnueli , A. , and Raviv , L. ( 1998 ). Algorithmic Verification of Linear Temporal Logic Specifications . ICALP'98,LNCS 1443. pp. 1 – 16 .

  32. Kotsiantis , S.B. ( 2007 ). Supervised machine learning: a review of classification techniques . Informatics 31 : 249 – 268 .

  33. Kuo , C.C.J. and Madni , A.M. ( 2023 ). Green learning: introduction, examples and outlook . Journal of Visual Communication and Image Representation 90 .

  34. Lai , C.C. , Hsu , C.Y. , Jen , H.H. et al. ( 2021 ). The Bayesian susceptible-exposed-infected-recovered model for the outbreak of COVID-19 on the diamond princess cruise ship . Stoch Environ Res Risk Assess 35 : 1319 – 1333 .

  35. Lauri , M. , Hsu , D. , and Pajarinen , J. ( 2022 ). Partially observable Markov Decision processes in robotics: a survey . IEEE Trans on Robotics 1 – 20 .

  36. Li , J. , Nuzzo , P. , Sangiovanni-Vincentelli , A. et al. ( 2017 ). Stochastic contracts for cyber-physical systems under probabilistic requirements . In: MEMOCODE '17: Proceedings on 15 th ACM-IEEE International Conference of Formal Methods and Models for System Design , 5 – 14 . Vienna, Austria . ACM .

  37. Lin , Y. and Bie , Z. ( 2018 ). Tri-level optimal hardening plan for a resilient distribution system considering reconfiguration and DG islanding . Applied Energy 201 : 1266 – 1279 .

  38. Madni , A.M. and Jackson , S. ( 2009 ). Towards a conceptual framework for resilience engineering . IEEE Systems Journal 3 ( 2 ): 181 – 191 .

  39. Madni , A.M. , Sievers , M. , Ordoukhanian , E. , and Pouya , P. ( 2018a ). Extending formal modeling for resilient systems . 2018 INCOSE International Symposium .

  40. Madni , A.M. , Sievers , M. , Madni , A. et al. ( 2018b ). Extending formal modeling for resilient system design . INCOSE INSIGHT 21 ( 3 ): 34 – 41 .

  41. Madni , A.M. , Sievers , M. , and Erwin , D. ( 2019 ). Formal and Probabilistic Modeling in the Design of Resilient Systems and System-of-Systems . San Diego, California : AIAA Science and Technology Forum .

  42. Madni , A.M. , Erwin , D. , and Sievers , M. ( 2020 ). Constructing models for system resilience: challenges, concepts, and formula models . Systems 8 ( 3 ).

  43. Meyer , B. ( 1989 ). Eiffel: An Introduction . Interactive Software Engineering.

  44. Meyer , B. ( 1992 ). Applying “design by contract” . Computer 25 ( 10 ): 40 – 51 .

  45. Ouimet , M. ( 2007 ). Formal software verification: model checking and theorem proving . Embedded Systems Laboratory Technical Report ESL_TIK-00214, MIT , Cambridge, MA .

  46. Pnueli , A. ( 1977 ). The temporal logic of programs . Proceedings of the 18th Annual Symposium on Foundations of Computer Science , Providence, RI (31 October–2 November 1977). pp. 46 – 57 . IEEE .s

  47. Puterman , M. ( 1990 ). Markov Decision processes , Chapter 8. In: Handbooks in Operations Research and Management Science vol. 2 , 331 – 434 .

  48. Rabiner , L. ( 1989 ). A tutorial on hidden Markov models and selected applications in speech recognition . Proceedings of the IEEE 77 ( 2 ): 257 – 286 .

  49. Reason , J. ( 1990 ). The contribution of laten human failures to the breakdown of complex systems . Philosophical Transactions of the Royal Society B: Biological Sciences 327 ( 1241 ): 475 – 484 .

  50. Sarraute , C. , Buffet , O. , and Hoffmann , J. ( 2012 ). POMDPs make better hackers: accounting for uncertainty in penetration testing . Proceedings of the 26 AAAI Conference on Artificial Intelligence , Toronto, Ontario, Canada (22–26 July 2012). pp. 1816 – 1824 .

  51. Schimpf , A. , Merz , S. , and Smaus , J.G. ( 2009 ). Construction of Büchi automata for LTL model checking verified in Isabelle/HOL . International Conference on Theorem Proving in Higher Order Logics , Munich, Germany (17–20 August 2009). pp. 1816 – 1824 . Springer .

  52. Sievers , M. and Madni , A.M. ( 2016a ). Agent-based flexible design contracts for resilient spacecraft swarms . In: Proceedings of the AIAA Science and Technology 2016 Forum and Exposition , SanDiego, California, USA .

  53. Sievers , M. and Madni , A.M. ( 2016b ). Agent-Based Flexible Design Contracts for Resilient Space Systems . AIAA Scitech .

  54. Smallwood , R. and Sondik , E. ( 1973 ). The optimal control of partially observable Markov processes over a finite horizon . Operations Research 21 ( 5 ): 1071 – 1088 .

  55. Smith , T. and Simmons , R. ( 2004 ). Heuristic search value iteration for POMDPs . Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence . pp. 520 – 527 .

  56. Taleb , N. ( 2010 ). The Black Swan – The Impact of the Highly Improbably . Random House .

  57. Thomas , G. ( 2020 ). Markov decision processes . 6 April 2020. mdps.pdf (stanford.edu) (accessed January 2023).

  58. Tipireddy , R. , Chatterjee , S. , Paulson , P. , et al. ( 2017 ). Agent-centric approach for cybersecurity decision-support with partial observability . 2017 IEEE International Symposium on Technologies for Homeland Security (HST) , Waltham, MA (25–26 April 2017). pp. 1 – 6 . IEEE .

  59. Tran , H. , Campos-Nanez , E. , Fomin , P. , and Waek , J. ( 2016 ). Cyber resilience recovery model to combat zero-day malware attacks . Computers and Security 61 : 19 – 31 .

  60. Vardi , M.Y. ( 2021 ). Program verification: a 70+ year history . 2021 International Symposium on Theoretical Aspects of Software Engineering (TASE) , Shanghai, China (25–27 August 2021). pp. 1 – 2 . IEEE .

  61. Viguerie , A. , Lorenzo , G. , Auricchio , F. et al. ( 2021 ). Simulating the spread of COVID-19 via a spatially-resolved susceptible–exposed–infected–recovered–deceased (SEIRD) model with heterogeneous diffusion . Applied Mathematics Letters 111 .

  62. Wang , J. , Zuo , W. , Barbarigos-Rhode , L. et al. ( 2019 ). Literature review on modeling and simulation of energy infrastructure from a resilience perspective . Reliability Engineering and System Safety 183 : 360 – 373 .

  63. Wang , C. , Ju , P. , Lei , S. et al. ( 2020 ). Markov decision process-based resilience enhancement for distribution systems: an approximate dynamic programming approach . IEEE Transactions on Smart Grid 11 ( 3 ): 2498 – 2510 .

  64. Welsh , T. and Benkhelifa , E. ( 2020 ). On resilience in cloud computing: a survey of techniques across the cloud domain . ACM Computing Surveys 53 ( 3 ).

  65. Westrum , R. ( 2007 ). A Typology of Resilience Situations , Chapter 5. In: Resilience Engineering Concepts and Precepts (ed. E. Hollnagel , D. Woods , and N. Leveson ). England : Ashgate Publishing Ltd .

  66. Xu , D. , Gossler , G. , and Girautl , A. ( 2012 ). Probabilistic contracts for component-based design . Formal Methods in System Design 41 : 211 – 231 .

SERC Logo

The Systems Engineering Research Center (SERC) was established in the Fall of 2008 as a government-designated University Affiliated Research Center (UARC). The SERC has produced 15 years of research, focused on an updated systems engineering toolkit (methods, tools, and practices) for the complex cyber-physical systems of today and tomorrow.


Follow us on

LinkedIn