System availability
System Availability
- The percentage of time that the system can perform its intended function
System Availability = 1 – System Downtime (with downtime expressed as a fraction of total time)
Downtime per year:

Availability | Nines | Downtime per year
90%      | 1 | 36.5 days
99%      | 2 | 3.65 days
99.9%    | 3 | 8.76 hours
99.99%   | 4 | 52 minutes
99.999%  | 5 | 5 minutes
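The downtime figures above follow from the definition: multiply the unavailable fraction (1 – availability) by the number of hours in a year. A minimal Python sketch, assuming a 365-day (8760-hour) year; function and variable names are illustrative, not from this article:

```python
# Illustrative sketch: downtime per year derived from an availability figure,
# assuming a 365-day (8760-hour) year.

HOURS_PER_YEAR = 365 * 24  # 8760

def downtime_per_year(availability: float) -> float:
    """Yearly downtime in hours for a given availability (0..1)."""
    return (1.0 - availability) * HOURS_PER_YEAR

for nines, availability in [(1, 0.90), (2, 0.99), (3, 0.999), (4, 0.9999), (5, 0.99999)]:
    hours = downtime_per_year(availability)
    label = f"{availability * 100:g}% ({nines} nines)"
    if hours >= 24:
        print(f"{label}: {hours / 24:.2f} days/year")
    elif hours >= 1:
        print(f"{label}: {hours:.2f} hours/year")
    else:
        print(f"{label}: {hours * 60:.1f} minutes/year")
```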
System Downtime
Many events can cause system downtime:
- HW fault
- Software fault
- Vandalism
- Extreme conditions (fire, flooding etc)
- Power outage
- IP network failure
- Planned system maintenance
System Downtime = ∑ P × S × MTTR (summed over all event types)

P    = probability of the event taking place
S    = severity of the event, i.e. the percentage of the service affected by the fault
MTTR = Mean Time To Repair = mean time to detect the fault + mean time to fix the fault
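A small sketch of how the sum above could be evaluated. The event classes and the P, S and MTTR values below are purely illustrative assumptions, not figures from this article; P is treated here as the expected number of events per year so that the sum comes out in hours per year.

```python
# Illustrative sketch: expected yearly downtime as the sum of P * S * MTTR
# over a set of assumed event classes.

events = [
    # (name, P = expected events per year, S = fraction of service affected, MTTR in hours)
    ("HW fault",            0.5, 0.10, 4.0),
    ("Software fault",      2.0, 0.05, 0.5),
    ("Power outage",        1.0, 1.00, 0.25),
    ("Planned maintenance", 4.0, 1.00, 0.5),
]

def yearly_downtime(event_list):
    """Sum P * S * MTTR over all event classes (hours per year)."""
    return sum(p * s * mttr for _, p, s, mttr in event_list)

HOURS_PER_YEAR = 365 * 24
downtime = yearly_downtime(events)
availability = 1.0 - downtime / HOURS_PER_YEAR
print(f"Expected downtime: {downtime:.2f} hours/year")
print(f"System availability: {availability * 100:.3f}%")
```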
HW failure
MTBF
- The probability of HW faults is calculated using MTBF figures
- MTBF ≠ System Availability
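One way to see why MTBF on its own is not system availability is the common steady-state relation A = MTBF / (MTBF + MTTR): the same MTBF yields very different availability depending on how quickly faults are repaired. A minimal sketch with illustrative figures:

```python
# Illustrative sketch: why MTBF alone is not system availability.
# Assumes the common steady-state model A = MTBF / (MTBF + MTTR).

def availability(mtbf_hours: float, mttr_hours: float) -> float:
    return mtbf_hours / (mtbf_hours + mttr_hours)

MTBF = 100_000  # hours; illustrative figure

for mttr in (1, 8, 72):  # repaired within 1 hour, 1 working day, or 3 days
    print(f"MTBF={MTBF} h, MTTR={mttr} h -> availability {availability(MTBF, mttr) * 100:.4f}%")
```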
MTBF calculations
- Empirical method
- MIL-HDBK-217
- Telcordia
Empirical methods
- Based on statistics from the field
MIL-HDBK-217 and Telcordia
- All components are entered into a database with set environmental conditions
- Usually gives lower MTBF figures than empirical methods
- Does not include real usage conditions
- Uses worst-case environmental conditions
- More components give lower MTBF figures (see the sketch after this list)
- MTBF figures must be assessed together with single points of failure
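A minimal sketch of the series-MTBF point above: assuming constant failure rates, the failure rates of components that are all required for the service add up, so the combined MTBF drops as components are added. The per-component MTBF values are illustrative assumptions:

```python
# Illustrative sketch: with constant failure rates, components that are all
# needed for the service (series configuration) have failure rates that add,
# so every extra component lowers the combined MTBF.

def series_mtbf(component_mtbfs):
    """Combined MTBF of components in series (all must work)."""
    return 1.0 / sum(1.0 / m for m in component_mtbfs)

BOARD = 200_000         # hours; illustrative per-component MTBF
POWER_SUPPLY = 300_000  # hours; illustrative per-component MTBF

print(f"Board only:         {series_mtbf([BOARD]):,.0f} h")
print(f"Board + power:      {series_mtbf([BOARD, POWER_SUPPLY]):,.0f} h")
print(f"Two boards + power: {series_mtbf([BOARD, BOARD, POWER_SUPPLY]):,.0f} h")
```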
Other failures
Software fault
- Automatic watchdog functions
- Automatic recovery functions
- Maturity of system
- Structured software design and test
Vandalism and Extreme conditions
- Robustness to vandalism and extreme conditions
- IP and IK class
- IP security functions to hinder denial-of-service (DoS) attacks
Power outage
- UPS and redundant power supplies
IP network failure
- Network service level
- Redundancy and switchover functions
Planned system maintenance
- Expansion, add users etc
- Ability to do maintenance without service interruptions
Redundancy
Redundancy is about parallelism and removing single points of failure.
Redundancy usually gives lower MTBF figures:
- Requires more components
Redundancy usually provides significantly higher service availability:
- A single failure shall have minimal or no impact on service availability (see the sketch below)
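A small sketch of this trade-off, assuming independent units and the steady-state model A = MTBF / (MTBF + MTTR): a redundant pair contains more parts, so its combined MTBF is lower, but the service only fails when both units are down at the same time, so service availability improves sharply. All figures are illustrative assumptions:

```python
# Illustrative sketch: a redundant pair has more components (so a lower
# combined MTBF), but the service stays up as long as at least one unit
# works, so service availability increases sharply.

def unit_availability(mtbf: float, mttr: float) -> float:
    # Steady-state model: A = MTBF / (MTBF + MTTR)
    return mtbf / (mtbf + mttr)

def parallel_availability(a_unit: float, n: int) -> float:
    """Service fails only when all n independent units are down at once."""
    return 1.0 - (1.0 - a_unit) ** n

a = unit_availability(mtbf=50_000, mttr=8)  # illustrative figures
print(f"Single unit:    {a * 100:.5f}% available")
print(f"Redundant pair: {parallel_availability(a, 2) * 100:.7f}% available")
```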