Design-for-Availability: Designing Safety, Mission and Infrastructure Critical Systems to Meet Availability Targets

(NSF Grant No. CMMI-1129697, July 2011 – June 2014)


Introduction to the Problem

Availability is the ability of a service or a system to be functional when it is requested for use or operation. It is a function of an item’s reliability (how often it fails) and maintainability (how efficiently it can be restored when it does fail). Availability is a significant issue for many systems. Reduced availability of an ATM inconveniences customers; the unavailability of a point-of-sale system at retail outlets can cause large financial losses; the unavailability of hospital equipment can result in loss of life; poor availability can make wind farms non-viable; and the unavailability of aircraft causes airlines to cancel or delay flights. For safety, mission, and infrastructure critical systems, customers are often interested in buying the availability of a system through “availability contracts” instead of buying the system itself. However, evaluating an availability requirement is a challenge for manufacturers and supporters of systems because determining how to deliver a specific availability is not trivial.
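As a point of reference for how reliability and maintainability combine, the sketch below computes the standard textbook steady-state (inherent) availability, MTBF / (MTBF + MTTR). It is only an illustration: the function name and the example numbers are assumptions, and this simple ratio is not necessarily the availability measure used in the project.

```python
def inherent_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state (inherent) availability: MTBF / (MTBF + MTTR).

    mtbf_hours: mean time between failures (a reliability measure).
    mttr_hours: mean time to repair (a maintainability measure).
    Both names are illustrative; the formula ignores logistics delays.
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical item: fails on average every 990 h, takes 10 h to restore.
    print(inherent_availability(990.0, 10.0))
```

The same availability can be reached by improving reliability (larger MTBF) or maintainability (smaller MTTR), which is why an availability target alone does not determine a unique design.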

Predicting availability from known or predicted system design and operational parameters, reliability, logistics, etc., is straightforward and, for real systems, is often accomplished using Markov models or discrete event simulators. While a significant body of literature addresses availability optimization (maximizing availability), little work has been done on designing to meet a specific availability requirement, as would be done for an availability contract. Most simple availability optimization approaches provide solutions only at selected points in time (not all times), implicitly assume that all uptimes are the same and all downtimes are the same (i.e., unrealistic systems), and attempt to maximize availability rather than meet a minimum. Recent interest in availability contracts that specify a required availability has created interest in deriving system design and support parameters directly from an availability requirement (in particular, a requirement expressed as a probability distribution). In general, determining design parameters from an availability requirement is a stochastic reverse simulation problem. While determining the availability that results from a sequence of events is straightforward, determining the events that result in a desired availability is not, and has only been accomplished using “brute force” search or iteration-based methods that are not general and quickly become impractical for real systems and when uncertainties are introduced.
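To make the forward and reverse directions concrete, the following minimal sketch simulates availability from assumed failure and repair parameters, then uses the kind of "brute force" search described above to find a design parameter that meets a requirement. It is not the project's methodology. All names, the exponential uptime and downtime distributions, and the "required availability met with a given confidence" form of the requirement are assumptions made for illustration.

```python
import random


def simulate_availability(mtbf, mean_repair, horizon, rng):
    """Forward problem: one sample path of alternating random up/down times.

    Returns the fraction of the horizon spent operational. Exponential
    uptimes and downtimes are an assumption of this sketch.
    """
    t = 0.0
    uptime = 0.0
    while t < horizon:
        up = min(rng.expovariate(1.0 / mtbf), horizon - t)
        uptime += up
        t += up
        if t >= horizon:
            break
        down = min(rng.expovariate(1.0 / mean_repair), horizon - t)
        t += down
    return uptime / horizon


def fraction_meeting(mtbf, mean_repair, horizon, required, trials, seed):
    """Fraction of simulated sample paths whose availability >= required."""
    rng = random.Random(seed)
    hits = sum(
        simulate_availability(mtbf, mean_repair, horizon, rng) >= required
        for _ in range(trials)
    )
    return hits / trials


def brute_force_min_mtbf(candidates, mean_repair, horizon, required,
                         confidence, trials=2000, seed=0):
    """Reverse problem by exhaustive search: smallest candidate MTBF for which
    the requirement is met in at least `confidence` of the trials.
    Returns None if no candidate qualifies."""
    for mtbf in sorted(candidates):
        if fraction_meeting(mtbf, mean_repair, horizon, required,
                            trials, seed) >= confidence:
            return mtbf
    return None


if __name__ == "__main__":
    # Hypothetical inputs: 20 h mean repair, 10,000 h support life,
    # availability of at least 0.95 required in 90% of simulated lives.
    print(brute_force_min_mtbf(range(100, 2001, 100), 20.0, 10_000.0,
                               0.95, 0.90))
```

Each candidate costs a full Monte Carlo run, so the work grows with the number of candidates multiplied by the number of trials, and multiplies again for every additional design parameter searched. This is the scaling problem that makes search-based approaches impractical for real systems.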

The objective of this project is to develop a new methodology that uses an availability requirement as an input to the process of determining the optimal design and management of a system (as opposed to treating availability as an output that is a consequence of system design, management, and logistics inputs). The methodology is targeted at systems whose availability is driven by their reliability and by the logistics supporting their return to operation after failure. Note that "design-for-availability" is also used in the server and network support community, where it refers to fault recovery dictated by the time required to switch to backup systems, and where time to repair and system reliability are not primary concerns.

This project includes the development of a design-for-availability methodology applicable to single and multiple design parameters; integration with life-cycle cost analyses; and application to logistics and reliability parameters and within Prognostics and Health Management (PHM) environments.

Prognostics and Health Management (PHM) Background:

PHM refers to a family of methodologies and technologies that seek to provide advance warning of system failures. The advance warning can then be used to avoid failure and/or optimize the maintenance of the system [1]. PHM uses real-time data from a system to observe the state of the system (condition monitoring) and thus determine its health. Based on the observed health and the expected future environmental stresses, PHM methods provide a prognosis for future operation of the system in the form of a Remaining Useful Life (RUL) and support decisions about how the system should be managed. CBM (Condition-Based Maintenance) is a subset of PHM that focuses on taking maintenance actions only when they are necessary [2]. The application of PHM approaches to systems directly impacts the detection of future failures, the performance of preventative maintenance, and the efficiency with which failures can be diagnosed when they do occur. All of these affect the availability of systems, and the use of PHM in systems is widely expected to improve availability [3].

[1] P. Sandborn and M. Pecht, Guest Editorial: Introduction to Special Section on Electronic Systems Prognostics and Health Management, Microelectronics Reliability, Vol. 47, No. 12, Dec. 2007, pp. 1847-1848.

[2] J. H. Williams, A. Davies, and P. R. Drake (Editors), Condition-Based Maintenance and Machine Diagnostics, Chapman & Hall, 1994.

[3] M. G. Pecht, Prognostics and Health Management of Electronics, Wiley, New York, NY, 2008.


Related Publications and Presentations

T. Jazouli, P. Sandborn, and A. Kashani-Pour, "A 'Design for Availability' Approach to Systems Design and Support," submitted to International Journal of Performability Engineering.

G. Haddad, P. A. Sandborn, and M. G. Pecht, "Using Maintenance Options to Maximize the Benefits of Prognostics for Wind Farms," to be published in Wind Energy.

P. Sandborn, "Making Business Cases for Health Management - Return on Investment," in IVHM - The Business Case, ed. I. Jennions, SAE International, 2013.

G. Haddad, P. Sandborn and M. Pecht, "Using Real Options to Valuate Decisions for Systems with Prognostic Capabilities," in IVHM - The Business Case, ed. I. Jennions, SAE International, 2013.

G. Haddad, P.A. Sandborn, and M.G. Pecht, “An Options Approach for Decision Support of Systems with Prognostic Capabilities,” IEEE Transactions on Reliability, Vol. 61, No. 4, pp. 872-883, December 2012.

P. Sandborn, T. Jazouli, and G. Haddad, "Supporting Business Cases for PHM - Return on Investment and Availability Impacts," in Diagnostics and Prognostics of Engineering Systems: Methods and Techniques, S. Kadry editor, IGI Global, 2012.



Wikipedia page supported by this project:

This project is part of the Electronic Products and Systems Cost Modeling Laboratory at the University of Maryland.


Last Updated: March 11, 2013