This document discusses service level agreements (SLAs) between business function owners and IT service providers. It defines key terms and outlines a standardized approach to categorizing business functions into support tiers with predefined SLAs. Standardizing SLAs results in consistent architectures, documentation, support processes, and performance metrics across business functions. The goal is to establish clear expectations for both parties around system availability and resolve any potential disputes.
Service Level Agreement
1. Business Continuity Mt Xia Inc. May 2011 Service Level Agreements http://www.mtxia.com 615.556.0456
2. Scope. This presentation is limited to discussions of Service Level Agreements. It is presented as an insight into the information technology management techniques used by very large data center operators. It is not a procedural plan or operational tutorial. It is intended to prompt discussion of the requirements for managing expanding IT operations, and it introduces a business function approach to system administration.
3. Perspective. The business function owner is a customer of the service provider, and is typically the business function department manager or an application support team. Business function owners do NOT dictate architectures and do NOT own the hardware or infrastructure. The service provider is typically the information technology department. Service providers determine the infrastructure and architecture needed to fulfill the requirements of each instance of an SLA, and they own and manage the hardware and infrastructure. The relationship between business function owner and service provider should NOT be adversarial, but it should be by the book.
4. Perspective (continued). Service is provided according to a mutual agreement: terms and conditions; time period; cost; binding contract. All systems should be included (production, pre-production, test, development, proof-of-concept, etc.) to set expectations for all parties and all systems. Service Level Agreement high-level outline: Overview; Service Description; Roles and Responsibilities; Requesting Service; Hours of Coverage, Response Times & Escalation; Maintenance and Service Changes; Pricing; Reviewing and Reporting; Approvals and Signatures.
5. Definition of Terms. Business Continuity: Policies (those things that shall be done); Guidelines (those things that should be done); Standards (technical specifications derived from policies and guidelines); Procedures (step-by-step instructions for implementation of standards); Resource planning and deployment; Organizational Structure; Business Impact Analysis (BIA); Security Management; Document Management; Change Management; Audit Management; Service Level Agreements.
6. Definition of Terms (continued). Disaster Recovery: business function recovery between geographically separated data centers, using some form of storage replication between the data centers. The output of disaster recovery planning is a disaster recovery project plan. The goal is to minimize downtime for business functions, not systems. Business function recovery times and maximum allowable data loss are specified during the business impact analysis. High Availability: business function failover between two or more physical frames within the same data center, using a single shared storage location. Elimination of single points of failure (SPOFs) is a necessary part of HA. The goal is to minimize downtime for business functions, not systems. This is NOT non-stop computing; downtime will be experienced during failover.
7. Definition of Terms (continued). Business Impact Analysis: an examination of ALL business functions to determine those regarded as critical. It assigns a recovery time objective and a recovery point objective to each business function, a support tier to each business function, and a Service Level Agreement to each support tier. Service Level Agreement: an agreement between a business function owner and the service provider which designates the amount of time, on an annualized basis, the business function will be available. Conversely, the SLA also designates the amount of time, on an annualized basis, for which the business function will NOT be available. This should not be regarded as allowable downtime, but rather as mandatory downtime that requires management approval to reschedule. The SLA is associated with a business function, not with any particular machine, system, or frame.
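The annualized definition above can be made concrete with a small calculation. The following is a minimal sketch, assuming availability is expressed as a percentage of a 365-day year; the function name `annual_split` and the sample percentages are illustrative and do not come from the presentation.

```python
from typing import Tuple

HOURS_PER_YEAR = 365 * 24  # 8760 hours; leap years are ignored in this sketch


def annual_split(availability_pct: float) -> Tuple[float, float]:
    """Return (uptime_hours, downtime_hours) per year for an availability percentage."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    uptime = HOURS_PER_YEAR * availability_pct / 100.0
    return uptime, HOURS_PER_YEAR - uptime


if __name__ == "__main__":
    # Illustrative availability figures only (assumed, not taken from the slides).
    for pct in (99.0, 99.9, 99.99):
        up, down = annual_split(pct)
        print(f"{pct}% available: {up:.2f} h up, {down:.2f} h down per year")
```

Under the presentation's definition, the second number is the mandatory downtime that comes with the SLA, not a ceiling the service provider merely tries to stay under.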
8. Definition of Terms (continued). Frame: a physical computing device; may host one or more partitions or logical systems. Partition: a logical grouping of resources such as CPU, memory, network and SAN adapters. CPU: may be shared between multiple systems simultaneously; allocated and deallocated on an as-needed basis; the SLA may address the number of CPUs provided, not the underlying processing units. Memory: assigned to a single system at a time; may be allocated and deallocated on an as-needed basis; may be reassigned during off-peak processing times.
9. Definition of Terms (continued). Virtual I/O: virtual representations of physical adapters; may be shared between multiple systems simultaneously. Live Partition Mobility: an IBM Power6 capability for moving a live running partition from one frame to another. Requires virtualized I/O. Requires equal or greater CPU and memory on the target frame. Requires synchronization of slot numbers between frames. Not a high availability or disaster recovery solution. Used to eliminate business function downtime utilizing two healthy systems: planned maintenance outages; preemptive problem management.
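The three Live Partition Mobility requirements listed above lend themselves to a pre-move checklist. The sketch below is hypothetical: the `Partition` and `Frame` types, their fields, and `mobility_blockers` are invented for illustration and are not an IBM API. It also assumes that "equal or greater CPU and memory" refers to capacity available on the target frame, and that slot-number synchronization has been verified elsewhere and is passed in as a flag.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Partition:
    """Hypothetical description of a running partition."""
    name: str
    cpus: int
    memory_gb: int
    uses_virtual_io: bool


@dataclass
class Frame:
    """Hypothetical description of a candidate target frame."""
    name: str
    free_cpus: int
    free_memory_gb: int


def mobility_blockers(part: Partition, target: Frame,
                      slots_synchronized: bool) -> List[str]:
    """Return the reasons a move is blocked; an empty list means none were found."""
    blockers = []
    if not part.uses_virtual_io:
        blockers.append("partition I/O is not virtualized")
    if target.free_cpus < part.cpus:
        blockers.append("target frame has too few CPUs available")
    if target.free_memory_gb < part.memory_gb:
        blockers.append("target frame has too little memory available")
    if not slots_synchronized:
        blockers.append("slot numbers are not synchronized between frames")
    return blockers


if __name__ == "__main__":
    lpar = Partition("lpar1", cpus=4, memory_gb=32, uses_virtual_io=True)
    print(mobility_blockers(lpar, Frame("frameB", 8, 64), slots_synchronized=True))
```

Returning a list of reasons rather than a single boolean keeps the result usable in a change-control record, where every unmet requirement needs to be documented.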
10. Example Architecture Support Tiers. Limited offerings under standardized SLAs. Tier 1: BC / HA / DR / HA. Tier 2: BC / HA / DR. Tier 3: BC / DR. Tier 4: BC / HA. Tier 5: BC. Each tier carries its own RTO / RPO / Uptime / Downtime and its own architecture and infrastructure. Do not attempt to be all things for all purposes.
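The tier list above is effectively a small lookup table. The following is a minimal sketch of how it might be held as data, with a helper that picks a tier for a set of required capabilities. The names `SUPPORT_TIERS` and `least_tier_for` are illustrative. Two assumptions are made: a higher tier number is the smaller, cheaper offering, and the second "HA" listed for Tier 1 is not modeled, so Tiers 1 and 2 look identical here. RTO, RPO, and uptime values are omitted because the slide does not state them.

```python
from typing import Dict, FrozenSet, Iterable

KNOWN_CAPABILITIES = frozenset({"BC", "HA", "DR"})

# BC = business continuity, HA = high availability, DR = disaster recovery.
# Capabilities per tier as listed on the slide (Tier 1's repeated "HA" is not modeled).
SUPPORT_TIERS: Dict[int, FrozenSet[str]] = {
    1: frozenset({"BC", "HA", "DR"}),
    2: frozenset({"BC", "HA", "DR"}),
    3: frozenset({"BC", "DR"}),
    4: frozenset({"BC", "HA"}),
    5: frozenset({"BC"}),
}


def least_tier_for(required: Iterable[str]) -> int:
    """Return the highest-numbered tier that offers every required capability.

    Assumption: a higher tier number means a smaller, cheaper offering.
    """
    needed = frozenset(required)
    unknown = needed - KNOWN_CAPABILITIES
    if unknown:
        raise ValueError(f"unknown capabilities: {sorted(unknown)}")
    return max(t for t, caps in SUPPORT_TIERS.items() if needed <= caps)


if __name__ == "__main__":
    print(least_tier_for(["HA", "DR"]))
    print(least_tier_for(["DR"]))
```

Keeping the catalog as a fixed table reflects the slide's point that the offerings are limited: a business function is matched to an existing tier instead of receiving a custom design.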
16. Standardized Service Level Agreements. The business function owner is NOT required to comply with predefined SLAs, but will then be responsible for finding or hiring another service provider. For each support tier, define the following: quantity and quality of support personnel; audit requirements; data retention periods; data replication methods; architecture requirements; infrastructure requirements; systems; networking; facilities management; level of performance monitoring statistics; change control requirements; system monitoring data needed to measure SLA compliance.
17. Standardized Service Level Agreements (continued). Uptime / Downtime Compliance: downtime specified by the SLA is a compliance requirement. Rescheduling requires management approval. Missed downtime (due to a request by the business function owner) suspends SLA compliance requirements until the downtime is performed. Missed downtime (due to a request by the business function owner) is credited to the service provider for future outages, regardless of whether it is rescheduled for a later date. Any outages during a suspended SLA are charged against the business function owner, not the service provider. The SLA can only be re-activated by performing the missed downtime, or by performing the next scheduled downtime. Outages associated with mandatory SLA downtime are scheduled a year in advance. This eliminates excuses that scheduled downtime was not known about, and eliminates excuses for scheduling conflicts.
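The suspension rules above amount to a small state machine: a skipped downtime suspends the SLA, outages during suspension are charged to the business function owner, and performing a downtime re-activates the SLA. The sketch below shows one way to track this. The `SlaLedger` class and its method names are hypothetical. The slides do not say how the credited downtime is applied to future outages, so the sketch only records the credit.

```python
from dataclasses import dataclass, field
from typing import List

PROVIDER = "service provider"
OWNER = "business function owner"


@dataclass
class SlaLedger:
    """Hypothetical ledger tracking SLA suspension for one business function."""
    suspended: bool = False
    credited_hours: float = 0.0  # missed downtime credited to the service provider
    charged_to_provider: List[float] = field(default_factory=list)
    charged_to_owner: List[float] = field(default_factory=list)

    def owner_skips_downtime(self, hours: float) -> None:
        """Owner asked to skip scheduled downtime: suspend the SLA and record the credit."""
        self.suspended = True
        self.credited_hours += hours

    def perform_downtime(self) -> None:
        """Performing the missed or the next scheduled downtime re-activates the SLA.

        The credit is kept, since it applies regardless of rescheduling.
        """
        self.suspended = False

    def record_outage(self, hours: float) -> str:
        """Record an unplanned outage and return the party it is charged against."""
        if self.suspended:
            self.charged_to_owner.append(hours)
            return OWNER
        self.charged_to_provider.append(hours)
        return PROVIDER


if __name__ == "__main__":
    ledger = SlaLedger()
    ledger.owner_skips_downtime(4.0)
    print(ledger.record_outage(2.0))
    ledger.perform_downtime()
    print(ledger.record_outage(0.5))
```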
18. Standardized Service Level Agreements (continued). The business function owner or customer is paying for a Service Level Agreement, not uptime. Uptime is provided in specified blocks and associated with support tiers. Downtime is mandatory for SLA compliance, whether or not maintenance is performed during the outage. The customer does NOT specify desired uptime; they select an SLA for their business function, which comes with a specific amount of uptime. Additional uptime requires a support tier upgrade, which may require additional infrastructure, additional system resources, additional personnel resources, and additional vendor support availability. Cost increases exponentially with uptime. Uptime is valuable and costly. It is not given away just because there is no pending maintenance to perform.
19. Standardized Service Level Agreements (continued). SLA Hardware Provisions. CPU: can be guaranteed by number of CPUs (virtual or logical); the underlying processing units associated with CPUs are NOT guaranteed or referenced by the SLA; processing units are dynamic and controlled by the service provider. Memory: unused memory may be reallocated dynamically to other LPARs. Network: bandwidth may be shared or dedicated depending upon business function requirements. Storage: multi-tiered to accommodate the tiered service structure. Data Replication: sync or async replication between SANs; tape with automated replication between data centers; tape with off-site storage; local tape only.
20. Standardized Service Level Agreements (continued). SLA Performance Metrics: business function uptime / downtime, NOT measured by system or frame uptime / downtime. Quality of service is measured and analysed: audit compliance; department and personnel performance reviews; compensation and bonus reviews. Data Center Automation: dependent upon standardization of Service Level Agreements; limited support tiers and architectures; infrastructure and hardware requirements; deployment, configuration, and support methodologies; documentation; auditing; security.
21. Summary. Service Level Agreements: categorize all business functions into a limited number of SLAs. Results and benefits include standardized: architectures; configurations; documentation; Level 1 support; performance monitoring; maintenance; training; audit response; data center automation; performance reviews (department and personnel); bonus calculations.
22. Contact Information Mt Xia Inc. http://www.mtxia.com Dana French, President 615.556.0456