ITIL Practitioner Plan and Improve (IPPI) All-in-One Exam Guide and Certification Work Book: IT Service Management with Availability Management, Capacity Management and Disaster Recovery, IT Service Continuity Management
Notice of Rights: Copyright © The Art of Service. All rights reserved. No part of this book may be reproduced or transmitted in any form by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Notice of Liability: The information in this book is distributed on an “As Is” basis without warranty. While every precaution has been taken in the preparation of the book, neither the author nor the publisher shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the instructions contained in this book or by the products described in it. Trademarks: Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations appear as requested by the owner of the trademark. All other product names and services identified throughout this book are used in editorial fashion only and for the benefit of such companies with no intention of infringement of the trademark. No such use, or the use of any trade name, is intended to convey endorsement or other affiliation with this book. ITIL® is a Registered Community Trade Mark of OGC (Office of Government Commerce, London, UK), and is Registered in the U.S. Patent and Trademark Office.
ITIL Practitioner: Plan and Improve
Table of Contents
1 INTRODUCTION
2 GENERAL TERMS AND CONCEPTS
3 INTRODUCTION TO CAPACITY MANAGEMENT
4 INTRODUCTION TO AVAILABILITY MANAGEMENT
5 INTRODUCTION TO IT SERVICE CONTINUITY MANAGEMENT
6 PLAN AND IMPROVE
6.1 MANAGING – CAPACITY MANAGEMENT
6.2 ORGANIZING – CAPACITY MANAGEMENT
6.3 OPTIMIZING – CAPACITY MANAGEMENT
6.4 MANAGING – AVAILABILITY MANAGEMENT
6.5 ORGANIZING – AVAILABILITY MANAGEMENT
6.6 OPTIMIZING – AVAILABILITY MANAGEMENT
6.7 MANAGING – IT SERVICE CONTINUITY MANAGEMENT
6.8 ORGANIZING – IT SERVICE CONTINUITY MANAGEMENT
6.9 OPTIMIZING – IT SERVICE CONTINUITY MANAGEMENT
6.10 SUMMARIES
7 ASSIGNMENTS
7.1 ASSIGNMENT 1 – CAPACITY MANAGEMENT
7.2 ASSIGNMENT 2 – AVAILABILITY MANAGEMENT
7.3 ASSIGNMENT 3 – IT SERVICE CONTINUITY MANAGEMENT
8 ASSIGNMENT RESOURCES
8.1 CASE STUDY – RECE SHOE COMPANY
9 MOCK EXAM
10 MOCK EXAM ANSWERS
11 FURTHER READING
1 INTRODUCTION
The requirement to look at the provision of IT Services in a professional way is driven by a variety of factors. Organizations understand the benefits of having Information Technology (IT) throughout their structure. However, few realize the potential of truly aligning the IT department’s objectives with the business objectives. More and more organizations are beginning to recognize IT as being a crucial delivery mechanism of services to their customers. The starting point for IT Service Management (ITSM) and the ITIL Framework is not the technology, not the processes; it is the organizational objectives.
The IT Infrastructure Library is a set of books/CDs with best (good) practice processes on how to manage the provision of IT services. It is the “process” part of the Service Management challenge (people, process and technology). The core set of material is the following set of tightly coupled areas:
• Service Delivery
• Service Support
• Security Management
• Business Perspective: The IS View on Delivering Services to the Business (Vol I)
• Applications Management
• ICT Infrastructure Management
• Planning to implement Service Management
• Software Asset Management
OGC: Office of Government Commerce (the trademark owners of ITIL).
APMG: In 2006 APMG won the tender to own the rights for accreditation and certification of the ITIL courses. EXIN and ISEB used to be independent bodies, but now sublicense through APMG.
EXIN: Stichting EXameninstituut voor INformatica (translation: Foundation for EXamination INformation Systems).
ISEB: Information Systems Examination Board. This certification is recognized worldwide!
TSO: The Stationery Office.
Tool Vendors (HP, Infra, Remedy, HEAT, etc.): Provide technical solutions for customers trying to implement ITIL/IT service management. There is no such thing as an ITIL certified tool; only people can be ITIL certified.
ITSMF (IT Service Management Forum): the only internationally recognized and independent organization dedicated to ITSM.
Accredited Vendors (e.g. The Art of Service): Only accredited vendors can provide ITIL training.
This slide demonstrates the learning pathways that can be taken in developing your knowledge and skills for ITSM. The Art of Service provides world-class accredited programs for all ITIL certification levels as described above.
2 GENERAL TERMS AND CONCEPTS
Balanced scorecard (BSC) – one of the most common methods of measurement. This method uses the objectives of the organization or process to define Critical Success Factors (CSFs).
Critical Success Factors (CSFs) – defined for a number of areas of interest or perspectives: customer/market, business processes, personnel/innovation and finance.
Key Performance Indicators (KPIs) – the parameters for measuring progress relative to key objectives or CSFs in the organization.
Process – a logically related series of activities conducted towards a defined objective. Processes may define roles, responsibilities, tools, management controls, policies, standards, guidelines, activities and work instructions if they are needed.
Process Owner – responsible for the process results. Example: the owner of the Service Level Management process.
Process Manager – responsible for the realization and structure of the process, and reports to the process owner.
Function – a team or group of people and the tools they use to carry out one or more Processes or Activities. Functions provide units of organization responsible for specific outcomes. E.g. the Service Desk function is responsible for performing activities from other ITIL processes, including Incident Management, Event Management and Request Fulfillment.
Vital Business Function (VBF) – a function of a business process that is critical to the success of the business. Vital Business Functions are an important consideration of all ITIL processes.
More and more people see the importance of PROCESSES (the primary purpose of this program is studying some of these in detail). The ORGANIZATIONAL PERSPECTIVE issue ensures that we align the vision, strategy and goals of the business with the delivery of IT services. The PEOPLE issue is without doubt where we, as IT Service Management professionals, face our greatest challenges. Historically, the TECHNOLOGY considerations got most attention. Now we understand that without well-defined processes we can often make inappropriate selections in this area.
To impress stakeholders (e.g. customers, investors, personnel) your organization will have to communicate the ‘vision’ and why they should do business with you, e.g. because you are the cheapest, the most reliable etc. The vision can be communicated in the form of a mission statement (see slide). The mission statement should be a short, concise description of the objectives of the organization and the values it believes in. The policy of the organization is the combination of all decisions and measures taken to define and realize the objectives. Implementing policies in the form of specific activities requires planning. Realization of the planned activities requires action. Actions are allocated to personnel as tasks, or outsourced to external organizations. There is a danger that, over time, the mission, objectives and policies will be forgotten by the organization. This is why it is so important to measure at every stage of the organization’s maturity, to ensure remedial action is taken when necessary.
So if we accept that the objectives are the starting point for the provision of IT Services, how can we position the delivery of services and the utilization of the ITIL Framework? The objective tree is a very powerful tool, not only in showing the benefit and alignment of IT and Service Management principles to the business, but also in being able to effectively sell the concepts as well. See if you can follow along in the following paragraphs and match the sentences to the above objective tree.
To meet organizational objectives, the organization has business processes in place. Examples of business processes are sales, admin and financial departments working together in a “sales process”, or logistics, customer service and freight, who have a “customer returns process”.
Each of the units involved in these business processes needs one or more services (e.g. CRM application, e-mail, word processing, financial tools).
Each of these services runs on IT infrastructure. IT Infrastructure includes hardware, software, procedures, policies, documentation, etc. This IT Infrastructure has to be managed. ITIL provides a framework for the management of IT Infrastructure.
Proper management of IT Infrastructure will ensure that the services required by the business processes are available, so that the organizational objectives can be met.
The questions raised on the left hand side of this diagram arise constantly in the process-based approach typical of modern IT Service Management. The tools to answer these questions are shown on the right hand side of the diagram. The standards for the output of each process have to be defined in such a way that the complete chain of processes meets the corporate objective if each process complies with its process standard. If the result of a process meets the defined standard, then the process is effective. If the activities in the process are also carried out with the minimum required effort and cost, then the process is efficient. Processes are often described using procedures and work instructions.
Elements derived from the OGC – Planning to Implement Service Management. GQM = Goals/Questions/Metrics.
The first and most important question that needs to be answered is whether the “initiative” or adoption of a CSIP (Continuous Service Improvement Program) will deliver sufficient revenue/savings to justify the expense. The principal challenge is that much of the benefit cannot be quantified (e.g. quality, flexibility, service satisfaction).
With regard to Risk Management, we must remember that this initiative is not the only one competing for organizational budget. What is the effect on other initiatives if this IT initiative is successful at attracting funds? Also, what risks will be faced after the program has begun? The salient point is that risks will always be present, so don’t attempt to eliminate them; simply get to understand them.
The Gap analysis is generally an opportunity to use “scores” to measure current and future expected levels of maturity. For example, a 5 level model of maturity is defined in the OGC “Planning to Implement Service Management” text (Initial, Repeatable, Defined, Managed, and Optimized).
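The scoring idea behind a gap analysis can be sketched in a few lines of Python. This is only an illustration: the five level names come from the text above, but the numeric scores (1 to 5), the function names and the sample processes and ratings are assumptions made for the example.

```python
# Maturity levels as named in the text; scoring them 1-5 is an assumption.
MATURITY_LEVELS = ["Initial", "Repeatable", "Defined", "Managed", "Optimized"]


def maturity_score(level: str) -> int:
    """Return the 1-based score of a named maturity level."""
    return MATURITY_LEVELS.index(level) + 1


def gap_analysis(current: dict, target: dict) -> dict:
    """Return target score minus current score per process.

    A positive gap means improvement is needed to reach the target level.
    """
    return {
        process: maturity_score(target[process]) - maturity_score(current[process])
        for process in current
    }


# Hypothetical assessment results, for illustration only.
current = {"Capacity Management": "Initial", "Availability Management": "Defined"}
target = {"Capacity Management": "Defined", "Availability Management": "Managed"}

gaps = gap_analysis(current, target)
for process, gap in sorted(gaps.items(), key=lambda item: -item[1]):
    print(f"{process}: gap of {gap} level(s)")
```

Sorting by the size of the gap is one possible way to suggest where to look first; as the following slides note, the real starting point also depends on priorities, pain points and process interdependencies.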
Elements derived from the OGC – Planning to Implement Service Management. IT organizational maturity progresses through a variety of stages (Technology focus, Product/Service focus, Customer focus, Business focus, Value Chain). To move from one stage to the next requires management and control over people, processes, technology, steering, attitude and changing relationships between the business and the IT organization. IT process maturity is best measured by using an assessment tool for a more detailed analysis. A variety of such assessments is available, both as “self-assessments” and as commercially available solutions (also discussed in the next slide).
Elements derived from the OGC – Planning to Implement Service Management (notice some of the concepts raised by Kotter creeping through; the fundamentals are all the same).
The starting point should not be determined by the emotions or “feelings” of an individual. The starting point will be a combination of the current maturity levels of the IT organization and the IT processes, AS WELL AS overall organization priorities, IT “pain points”, process interdependencies (e.g. you cannot have Problem Management without Incident Management) and other factors (including budget, available resources and current workloads).
The most important aspect of the communication plan is that it needs to be ongoing, not just given enthusiastic support at the outset of the CSIP initiative.
Under the banner of managing organizational change, it simply isn’t enough to get a group of people together who have project management and ITIL skills and experience. There is a multitude of issues that have to be considered and addressed (including resistance, gaining and keeping commitment, involvement and communication).
Cultural change is largely reflected in the leadership/management style that is prevalent/dominant in the organization. Remember that some organizations suit dominant leadership styles (e.g. regimented organizations such as military units).
Careful role definition is required (see later section on using the ARCI model), especially when considering the allocation of new activities to existing staff (overload) or the conflict that could arise when processes are implemented across traditional hierarchical models.
As people are an integral part of the CSIP, there is a need to provide appropriate levels of education. The critical element is the timing of delivery. All too often we see education programs delivered with no follow-up activity that gives the participant an opportunity to put the theory into practice.
Finally, tools. The raging debate is which comes first: tools or process? In reality, most organizations have a variety of IT Service Management tools already. It is not the place of this program to suggest that existing tools are a wasted investment; however, the commonly accepted principle is that the processes should be supported by the tools and not the other way around.
(Service Management tools are covered later in the program)
We need to use measures and metrics to help us determine a “point of acceptance” with regard to process improvements and progress of the CSIP. CSFs are quite simply the things that we have to be able to do. They tell us in qualitative terms the basic objective for a process (e.g. the CSFs for Change Management can be (a) a repeatable change process, (b) ability to deliver change quickly, (c) deliver organizational benefits via change management). KPIs can be considered the quantitative elements of a process that give us a guide to the process efficiency (e.g. the KPIs for Change Management can be (a) the number of rejected RFCs, (b) the size of the change backlog, (c) quantity of changes delivered according to RFC schedule). Note that there are generally a lot more KPIs than CSFs for a given process. Organizational drivers are the points that are raised by “business people” as their “wish list” for the IT department. There have been many surveys conducted over many years, but the essence of what business people want from their IT departments can be condensed down into several points. These include support business operation, facilitate delivery of electronic services, help to deliver business strategies, enable change to match pace with required business change.
Elements derived from the OGC – Planning to Implement Service Management.
Quick wins should be used to help drive more change that is medium to long term. Also consider what your legacy of change will be. Consider a point 10 years from now, when all current staff have moved on or been replaced. When asked why certain activities are done the way they are, what will be the response? “That is the way it’s done around here.” Perhaps that could be considered a good legacy, provided that part of “the way it’s done around here” involves continual monitoring and reviews.
The real essence of a CSIP is the first word of the acronym: Continuous! The way to create continuity is to go back to the start and begin again. Remember it begins with the VISION. The vision must essentially promote business and IT alignment.
Knowledge management is another way to ensure the legacy of the work you do is not lost. The rate of change, staff turnover, expectations of performance and market place competition are all increasing. With these trends the organization cannot afford to lose knowledge or to fail to learn from previous mistakes and lessons. The overall design of the CSIP must include consideration for the gathering, organizing and accessibility of data/information/knowledge.
Most businesses are hierarchically arranged. They have departments, which are responsible for a group of employees. There are various ways of structuring departments, for example by customer, product, region or discipline. In this type of arrangement we often see that communication is ineffective.
A process is a logically related series of activities for the benefit of a defined objective. This slide illustrates cross functional process flows. The concept of processes “linking” functional areas together applies to business departments just as it does to IT functional areas. For example, a Pizza shop has various functional groups (silos), including the phone operators, the cooks and the delivery drivers. Fulfilling a customer’s order requires a process that links the various groups together: the operator takes the order and passes it to the cooks, who, when finished, inform the driver that it is ready to be delivered.
Example process: Baking a Cake
What is the goal? To bake a cake!
What are the inputs? The ingredients (eggs, milk, flour, sugar, butter, chocolate) plus equipment such as pans, mixers etc.
What are the activities? Mix, pre-heat oven, bake, cool, decorate etc.
What are the measures? How much of each ingredient, how long to bake, at what temperature etc.
What are the norms? The recipe.
What are the outputs? CAKE!
As we can see, the basis of ITIL’s approach to Service Management is the interrelated activities.
• Unlike a project, a process is never ending.
• In this example, baking a specific cake (e.g. a birthday cake for Jane) is a project.
• The goals, activities, inputs, outputs, measures and norms defined make up the process for baking cakes.
Question: What is the process owner (e.g. the head chef in a kitchen) responsible for? The output of the process, e.g. the cake itself.
The ARCI model helps show how a process actually works end to end across several functional groups.
A – Accountability (is made accountable for ensuring that the action takes place, even if they might not do it themselves).
R – Responsibility (actually does the work for that activity but is responsible to the function or position that has an “A” against it).
C – Consult (advice / guidance / information can be gained from this function or position prior to the action taking place).
I – Inform (the function or position that is told about the event after it has happened).
General Rules:
• Only 1 “A” per Row (ensures accountability; more than one “A” would confuse this)
• At least 1 “R” per Row (shows that actions are taking place)
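The two general rules lend themselves to a simple automated check. The sketch below is a minimal illustration, assuming an ARCI matrix is stored as a dictionary of activity rows with one letter per function; the function name `validate_arci`, the activity names and the role names are all hypothetical.

```python
def validate_arci(matrix: dict) -> list:
    """Check each activity row: exactly one 'A' and at least one 'R'.

    `matrix` maps an activity name to a dict of {function: letter}.
    Returns a list of human-readable rule violations (empty if none).
    """
    problems = []
    for activity, assignments in matrix.items():
        codes = list(assignments.values())
        if codes.count("A") != 1:
            problems.append(
                f"{activity}: expected exactly 1 'A', found {codes.count('A')}"
            )
        if codes.count("R") < 1:
            problems.append(f"{activity}: expected at least 1 'R', found none")
    return problems


# Hypothetical matrix: the second row deliberately breaks the "R" rule.
matrix = {
    "Log change request": {
        "Service Desk": "R", "Change Manager": "A", "Capacity Manager": "I",
    },
    "Assess capacity impact": {
        "Service Desk": "I", "Change Manager": "A", "Capacity Manager": "C",
    },
}

problems = validate_arci(matrix)
for problem in problems:
    print(problem)
```

For simplicity the sketch allows only one letter per cell; a matrix in which the same function is both accountable and responsible for an activity would need a richer representation.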
3 INTRODUCTION TO CAPACITY MANAGEMENT
Capacity Management needs to:
• Understand the BUSINESS REQUIREMENTS (the required Service Delivery)
• Understand the ORGANIZATION’S operations (the current Service Delivery)
• Understand the IT INFRASTRUCTURE REQUIREMENTS (the means of Service Delivery),
and ensure that all the current and future Capacity and performance aspects of the business requirements are provided cost-effectively.
Essentially it is a balancing act of COST against CAPACITY (ensuring cost-effective purchases for business capacity needs).
Capacity Management has 3 sub-processes:
The 1st sub-process is Business Capacity Management, which includes:
• Managing Capacity to meet future BUSINESS requirements for IT services
• Planning and implementing sufficient capacity with an appropriate timescale and Business Capital Management
• Should be included in Change Management and Project Management activities
The 2nd sub-process is Service Capacity Management, where the:
• Focus is on managing ongoing SERVICE performance as detailed in SLA or SLR
• Establishes baselines and profiles of use of Services
The 3rd sub-process is Resource Capacity Management, which:
• Identifies and manages each of the RESOURCES of the IT Infrastructure, for example CPU, memory, disks and network bandwidth
• Evaluates NEW technology and load balances across resources
CDB: Capacity Database – holds business, service, technical, financial, and utilization data. Used for reports, forecasts etc.
Performance Monitoring: Measuring, monitoring, and tuning the performance of IT Infrastructure components.
Capacity Planning: Analysing the current situation and predicting the future use of the IT infrastructure and resources needed to meet the expected demand for IT Services.
Modelling: Used to forecast the behaviour of the infrastructure.
Application Sizing: Determining the hardware or network capacity to support new or modified applications and the predicted workload.
Demand Management: Aims to influence the demand on capacity. It is about moving demand.
Capacity Database (CDB): holds business, service, technical, financial, and utilisation data. Used for reports, forecasts etc.
Forecast & Predictive Reports: used by all areas to analyse, predict and forecast particular business and IT scenarios and their potential IT solutions.
Exception Reports: show management and technical staff when the capacity and performance of a particular component or service becomes unacceptable; they are also a required form of analysis of capacity data.
Tuning: identifies areas of IT infrastructure that could be better utilized.
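As an illustration of what an exception report does, the sketch below flags components whose peak utilisation exceeds a threshold. It is not taken from the ITIL material: the `Sample` record, the `exception_report` function, the 80% threshold and the sample readings are all assumptions chosen for the example, and what counts as “unacceptable” would in practice come from the agreed service levels.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One utilisation reading for a component (hypothetical CDB record)."""
    component: str
    utilization_pct: float


def exception_report(samples, threshold_pct=80.0):
    """Return (component, peak) pairs whose peak exceeds the threshold, worst first."""
    peaks = {}
    for sample in samples:
        previous = peaks.get(sample.component, 0.0)
        peaks[sample.component] = max(previous, sample.utilization_pct)
    breaches = [(name, peak) for name, peak in peaks.items() if peak > threshold_pct]
    return sorted(breaches, key=lambda pair: pair[1], reverse=True)


# Made-up readings for illustration only.
samples = [
    Sample("cpu", 65.0),
    Sample("cpu", 91.5),
    Sample("disk", 70.0),
    Sample("network", 85.0),
]

report = exception_report(samples)
for component, peak in report:
    print(f"EXCEPTION: {component} peaked at {peak}%")
```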
Capacity Management has a business focus and should be the focal point for all IT Performance and Capacity issues. Capacity Management encompasses operational and development environments including ALL:
• Hardware
• Software
• Peripherals
• Human resources, where an IT organization believes that a lack of staff will negatively impact on service levels.
4 INTRODUCTION TO AVAILABILITY MANAGEMENT
Other Availability objectives are:
• Reduction in the frequency and duration of Availability related incidents
• Maintain a forward looking Availability plan
Why are users happy with a 60min outage and unhappy with a 30min outage? For a consumer/user of an IT Service, its Availability and Reliability can directly influence both the perception of, and satisfaction with, the overall IT Service provision. Can availability be improved without understanding how IT services support the business?
Security: Security Management determines requirements; Availability Management implements measures.
Availability: The ability of an IT Service or component to perform its required function at a stated instant or over a stated period of time.
Reliability: Freedom from operational failure.
Resilience: The ability to withstand failure.
Maintainability: (internal) The ability of an IT component to be retained in, or restored to, an operational state. Based on skills, knowledge, technology, backups, availability of staff.
Serviceability: (external) The contractual obligation / arrangements made with 3rd party external suppliers. Measured by Availability, Reliability and Maintainability of IT Service and components under control of the external suppliers.
Vital Business Function (VBF): The business critical elements of the business process supported by an IT Service.
Methods and Techniques:
TOP – Technical Observation Post – a prearranged gathering of specialist technical support staff from within the IT support organization, brought together to focus on specific aspects of IT Availability. They monitor events in real time as they occur, with the specific aim of identifying improvement opportunities or bottlenecks which exist within the current IT Infrastructure.
CRAMM – identifies risks and provides justified countermeasures to reduce or eliminate the threats posed by those risks. CRAMM describes a means of identifying justifiable countermeasures to protect Confidentiality, Integrity and Availability of the IT Infrastructure.
SOA – Systems Outage Analysis – provides a structured approach to identify end to end Availability improvement opportunities that deliver benefits to the user.
FTA – Fault Tree Analysis – can be used to determine the chain of events that causes the disruption of IT Services.
CFIA – Component Failure Impact Analysis – provides the information necessary to predict and evaluate the impact on IT Service Availability arising from component failures within the proposed IT Infrastructure and service design.
Mean Time Between Failures (MTBF) or UPTIME is the average time between the recovery from one incident and the occurrence of the next incident.
Mean Time To Repair (MTTR) or DOWNTIME is the average time between the occurrence of a fault and service recovery.
Mean Time Between System Incidents (MTBSI) is the average time between the occurrence of two consecutive incidents. It is the sum of the MTTR and MTBF.
A high ratio of MTBF/MTBSI indicates there are many MINOR faults.
A low ratio of MTBF/MTBSI indicates there are few MAJOR faults.
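A small worked example may help make these three measures concrete. The sketch below assumes incidents are recorded as (failure time, recovery time) pairs in hours; the function name `availability_metrics` and the sample figures are invented for illustration. To keep MTBSI exactly equal to MTTR plus MTBF, it averages over complete failure-to-next-failure cycles, so the repair time of the final incident is not counted.

```python
def availability_metrics(incidents):
    """Compute MTTR, MTBF and MTBSI from (failure_hour, recovery_hour) pairs.

    `incidents` must be sorted by failure time and contain at least two
    entries, because MTBF and MTBSI are measured between consecutive incidents.
    """
    if len(incidents) < 2:
        raise ValueError("need at least two incidents")
    repairs, uptimes, cycles = [], [], []
    for (failed, recovered), (next_failed, _) in zip(incidents, incidents[1:]):
        repairs.append(recovered - failed)       # downtime of this incident
        uptimes.append(next_failed - recovered)  # uptime until the next incident
        cycles.append(next_failed - failed)      # incident-to-incident time
    count = len(cycles)
    return {
        "MTTR": sum(repairs) / count,
        "MTBF": sum(uptimes) / count,
        "MTBSI": sum(cycles) / count,
    }


# Made-up incident log, in hours since the start of the measurement period.
incidents = [(100, 102), (300, 304), (500, 501)]

metrics = availability_metrics(incidents)
ratio = metrics["MTBF"] / metrics["MTBSI"]
print(metrics)
print(f"MTBF/MTBSI ratio: {ratio:.3f}")
```

With these figures MTTR is 3 hours, MTBF is 197 hours and MTBSI is 200 hours, giving a ratio of 0.985.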
5 INTRODUCTION TO IT SERVICE CONTINUITY MANAGEMENT
Disaster: an event that is NOT part of daily OPERATIONAL activities and requires a separate system.
BCM (BUSINESS Continuity Management): ITSCM is usually a sub-set of the BCM plan.
Risk Assessment: evaluates Assets, THREATS and VULNERABILITIES.
Counter Measures are measures to prevent or RECOVER from disaster.
A Manual Workaround uses a NON-IT based solution to overcome IT service disruption.
Gradual Recovery is otherwise known as a COLD standby (>72hrs).
Intermediate Recovery is otherwise known as a WARM standby (24-72hrs).
Immediate Recovery is otherwise known as a HOT standby (<24hrs, usually implies 1-2 hrs).
A Reciprocal Arrangement is an agreement with another similar sized company to share disaster recovery obligations.
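The time bands for the three standby options can be expressed as a simple lookup. The sketch below uses the thresholds quoted above; the function name `recovery_option` and the treatment of the exact 24 and 72 hour boundaries as WARM are assumptions made for the example.

```python
def recovery_option(required_recovery_hours: float) -> str:
    """Map a required recovery time (in hours) to the matching standby option."""
    if required_recovery_hours < 24:
        return "Immediate Recovery (HOT standby)"
    if required_recovery_hours <= 72:
        return "Intermediate Recovery (WARM standby)"
    return "Gradual Recovery (COLD standby)"


# Illustrative recovery requirements, in hours.
for hours in (2, 48, 100):
    print(f"{hours}h -> {recovery_option(hours)}")
```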
6
PLAN AND IMPROVE
6.1
MANAGING – Capacity Management
Capacity Management has a close, two-way relationship with the business strategy and planning processes within an organization. On a regular basis, the long term strategy of an organization is detailed in an update of the business plans. The business plans are developed from the organization’s understanding of the external factors such as the competitive market-place, economic outlook and legislation, and its internal capability in terms of manpower, delivery capability etc. Capacity Management needs to understand the long-term strategy of the business while providing information on the latest ideas, trends and technologies being developed by the suppliers of computing hardware and software.
Reviews should identify:
• Current responsibility for any Capacity Management
• The tools already in use
• Current and desired requirements by other SM processes, especially SLM, Availability, FMIT
• Current budget and cost-effectiveness
• The management commitment to the introduction of Capacity Management.
When the review is completed, it should be possible to produce a report on:
• The assessment of the current situation with regard to Capacity Management
• The improvements that need to be made
• The need for Capacity Management and the benefits that can be expected
• How the improvements can be implemented
• Example output from the Capacity Management process
• A project plan, showing timescales, staffing levels and costs, and specifying the objectives, main tasks, ongoing activities and outputs from each part of the process.
When there is a documented statement of what already exists, it is then possible to plan for a full, ITIL-compliant Capacity Management process.
The Capacity Management Database (CDB) is key to the success of the process and is a logical database (NOT necessarily ONE physical DB). The CDB contains much more than just technical data:
Business Data: used to forecast and validate how changes in business drivers affect performance and capacity.
Service Data: gathered from resources; provides data on the performance and capacity of services.
Technical Data: performance and capacity data at the IT component level, including threshold alerts.
Financial Data: for cost-benefit analysis of upgrade proposals, RFCs and Capacity Planning.
Utilization Data: utilization of IT components by time period (e.g. second, minute, hour, day, etc.).
Setting up a hierarchical structure for utilization data allows for data consolidation, refinement and archiving.
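The hierarchical consolidation of utilization data mentioned above might look like the following sketch, in which fine-grained samples are rolled up into coarser time buckets. The data layout (epoch seconds and a utilization percentage) and the `consolidate` helper are assumptions made for illustration, not part of any CDB product.

```python
from collections import defaultdict
from statistics import mean

def consolidate(samples, bucket_seconds):
    """Average (timestamp, utilization %) samples into buckets of bucket_seconds.

    Averaging already-averaged values is only exact when every bucket holds
    the same number of samples, as in the example data below.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[timestamp - timestamp % bucket_seconds].append(value)
    return sorted((start, mean(values)) for start, values in buckets.items())

# Hypothetical per-minute samples: (epoch seconds, utilization %).
minute_samples = [(0, 40), (60, 60), (3600, 80), (3660, 90)]
hourly = consolidate(minute_samples, 3600)   # minute level rolled up to hours
daily = consolidate(hourly, 86400)           # hour level rolled up to days
```

Once a level has been consolidated, the finer-grained samples beneath it become candidates for archiving.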
Train Staff – The IT Services in most organizations are supported by an IT Infrastructure that consists of a large variety of hardware and software resources from many different suppliers. A wide range of technical skills is required to install, set up and run all the necessary monitors, to analyze the information, and to make and implement tuning recommendations.
Establishing monitoring and the CDB – The monitoring facilities need to be established on each of the hardware platforms and for each of the services. The CDB also needs to be established, if necessary across a range of platforms. These activities should be carried out as part of the CM process.
Business Capacity Management – When the necessary data is being produced by each of the Resource, Service and Business Capacity Management sub-processes, Business Capacity Management can use the data to produce a series of resource and service utilization reports and graphs for IT Management and the business. Some reports will be linked to SLA targets, while others will be confirmation that SLR’s can be met. The main document Business Capacity Management produces is the Capacity Plan. These are all reported on a regular basis.
Service Capacity Management – When service monitoring has taken place, the data needs to be analyzed and action taken to tune performance. E.g. the activities of monitoring and analysis may indicate that the design and implementation of a particular service needs to be tuned for improved on-line response times.
Resource Capacity Management – When the necessary resource monitoring has been established, the data produced needs to be analyzed and action taken to tune the resources used. E.g. the activities of monitoring and analysis may produce exception reports for the SLM process that indicate the tuning of a particular resource is required.
Questions to ask:
Capacity utilization:
Effective - is the right data provided by appropriate monitoring tools?
Efficient - is the right amount of data provided in a timely and cost-effective manner?
Report provision: Are the reports providing the data to make better decisions?
Analyze the market place: Do you review the market place for new products or services which may improve capacity management (especially application sizing and modeling)?
Seek feedback: Do you analyze how other ITIL processes are using Capacity Management data to support their process?
6.2
ORGANIZING – Capacity Management
This diagram shows how new requirements for Capacity Management drive the Business Capacity Management sub-process. Business Capacity Mgmt works with other processes – this is described on the next slide. The three highlighted areas are largely the responsibility of Capacity Management and, of these, only ‘New Requirements’ is the responsibility of Business Capacity Management. ‘Ensure operational service complies with SLA’ and ‘Resolve Capacity related Incidents and Problems’ are the responsibilities of the Service Capacity Management and/or Resource Capacity Management sub-processes.
Identify and agree SLR’s – assist SLM in understanding the customer’s requirements and with the negotiation process by providing possible solutions to a number of scenarios. Modeling and Application Sizing may be used here.
Design, procure or amend configuration – be involved in the design of new services and make recommendations for the procurement of hardware and software, where performance and/or capacity are factors. In some cases Capacity Mgmt will instigate the implementation of the new requirement through Change Mgmt, where it may also be involved as a member of the CAB. Business Capacity Mgmt obtains the costs of proposed solutions.
Update CMDB and CDB – The details of the new or amended CI’s should be recorded in the CMDB under Change Management. The CDB should be updated to include the technical specification of the procured or amended CI’s. From this information, thresholds can be identified and monitored. The iterative activities of Capacity Mgmt will address threshold breaches and near misses.
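Threshold breaches and near misses of the kind described above could be detected with a check along these lines. This is a hedged sketch: the 10-point near-miss margin, the resource names and the `classify` helper are all assumptions, since the notes do not define how close a "near miss" is.

```python
def classify(utilisation, threshold, near_miss_margin=10):
    """Classify a utilization reading (in %) against a threshold recorded in the CDB.

    near_miss_margin is an assumed number of percentage points below the
    threshold that still counts as a near miss.
    """
    if utilisation >= threshold:
        return "breach"
    if utilisation >= threshold - near_miss_margin:
        return "near miss"
    return "ok"

# Hypothetical readings checked against a threshold of 80%.
readings = {"cpu": 85, "disk": 72, "memory": 40}
status = {name: classify(value, threshold=80) for name, value in readings.items()}
```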
Verify SLA – The SLA will include anticipated service throughputs and the performance requirements. Capacity Mgmt provides SLM with targets that can be monitored and provide a base for service design. Modeling will be used to provide confidence that the service design will meet the SLR’s and provide the ability for future growth.
Sign SLA – The results of the Modeling activities provide the verification of service performance capabilities. From these findings, SLM may need to renegotiate the SLA. Business Capacity Mgmt provides support to SLM with these renegotiations by recommending potential solutions and associated cost information.
Once Business Capacity Mgmt has passed on the business requirements and the service is operational, Service Capacity Mgmt is responsible for ensuring that it meets the agreed service level targets. Monitoring the service provides trend analysis for normal levels of service. Occasionally Incidents and Problems will be referred to Capacity Mgmt. Generally this will be directed to Resource Capacity Mgmt and will relate to a CI or other resource. If the Incident or Problem relates to the design or programming of an application, the service performance needs to be managed – this will be referred to Service Capacity Mgmt. The key to successful Service Capacity Management is to pre-empt difficulties, when possible. It is more proactive than reactive. However, there are times when it has to react to specific performance Problems. By understanding the performance requirements of each of the services being run, the effects of Changes in the use of service can be estimated, and actions taken to ensure that the required service performance can be achieved.
Resource Capacity Management ensures the optimum use of the current hardware and software resources in order to achieve and maintain the agreed service levels. All hardware components and some software components in the IT Infrastructure have a finite Capacity which, when exceeded, has the potential to cause Problems. Resource Capacity Management also involves understanding new technology and how it can be used to support the business. It may be appropriate to introduce new technology to improve the provision and support of the IT Services on which the organization is dependent. This information can be gathered by studying professional literature (magazines and press articles) and by attending:
• Promotional seminars by hardware and software suppliers
• User group meetings of suppliers of potential hardware and software
• User group meetings for other IT professionals involved in Capacity Mgmt.
Resource Capacity Management is also responsible for identifying the resilience inherent in the IT Infrastructure or any subset of it. In conjunction with the Availability Management process, Capacity Mgmt should use techniques that highlight how susceptible the current configuration is to the failure of individual components, and make recommendations on any cost-effective solutions.
Tuning Techniques:
Balancing workloads – transactions may arrive at the host or server at a particular gateway, depending on where the transaction was initiated; balancing the ratio of initiation points to gateways can provide tuning benefits.
Balancing disk traffic – storing data on disk efficiently and strategically, e.g. striping data across many spindles may reduce data contention.
Definition of an acceptable locking strategy – specifies when locks are necessary and the appropriate level, e.g. database, page, file, record and row; delaying the lock until an update is necessary may provide benefits.
Efficient use of memory – may include looking to utilize more or less memory depending upon the circumstances.
Before implementing any of the recommendations arising from the tuning techniques, it may be appropriate to consider using one of the on-going activities to test the validity of the recommendation.
Short Term Demand Management: e.g. partial failure of a critical resource in the IT Infrastructure.
Long Term Demand Management: e.g. when it is difficult to cost justify an expensive upgrade.
Demand Management needs to understand which services are utilizing the resource and to what level, and needs to know the schedule of when they must be run. Then a decision can be made on whether it will be possible to influence the use of the resource and, if so, which option is appropriate. Demand Management can be carried out as part of any one of the sub-processes of Capacity Mgmt. However, Demand Mgmt must be carried out sensitively, without causing damage to the business customers or to the reputation of the IT organization. It is necessary to understand fully the requirements of the business and the demands on the IT Services, and to ensure that the customers are kept informed of all the actions being taken.
The plan should clearly indicate any assumptions made, with any recommendations quantified in terms of resource required, cost, benefit, impact etc. The production and update of the Capacity Plan should occur at pre-defined intervals. It is, essentially, an investment plan and should be published annually and completed before the negotiations on future budgets. A quarterly re-issue of the updated plan may be necessary to take into account changes in business plans, to report on the accuracy of forecasts and to make or refine recommendations.
6.3
OPTIMIZING – Capacity Management
Over expectation: Customer expectations often exceed technical capability, so it is critical that expectations (including cost implications) are managed from the outset. This is supported by the application sizing activity.
Vendor Influence: Where budget and sales target deadlines coincide, it is not uncommon to be offered what seems to be the deal of a lifetime. On face value, cost efficiencies can be realized; however, points to remember before purchasing are the rapid pace of change, technological advancement and the overall reducing cost of technology. Manufacturers’ quoted performance figures are often not achievable within a production environment. Care should be taken when negotiating with vendors for additional performance.
Lack of Information: Traditionally it is difficult to obtain accurate business forecasts in order to predict increases and decreases in demand for IT Capacity. This affects the consistency in providing high quality service levels cost-effectively. However, Capacity Management can work effectively even with crude business estimates. It also helps if Capacity Mgmt understands the business and can talk to the customer in their language.
Capacity Management in a distributed environment: Capacity Mgmt is often considered only as a requirement within the host environment. The network and client environments are not included as part of the Capacity Mgmt process.
Level of monitoring to be implemented: Careful consideration should be given to the level of monitoring and reporting to be undertaken, and the decision should be based upon: business impact of component failure, utilization volatility, ability to monitor components, and cost of component monitoring and reporting.
6.4
MANAGING – Availability Management
Availability Management commences as soon as the Availability requirements for an IT Service are clear enough to be articulated. It is an ongoing process, finishing only when the IT Service is decommissioned.
Develop a spider web diagram demonstrating the relationships (inputs and outputs) between Availability Management and 4 other ITIL processes.
The Availability and reliability of IT can directly influence Customer satisfaction and the reputation of the business. This is why today Availability Management is essential in ensuring IT delivers the right levels of Availability required by the business to satisfy its business objectives and deliver the quality of service demanded by its customers. With this emphasis, the deployment of Availability Management can make a positive contribution to enhancing the relationship with the business: the IT organization is seen to recognize and respond to IT Availability opportunities and challenges, with the business needs understood.
With any process implementation, it is important that an appropriate scope and goal are carefully considered and communicated to all involved staff and stakeholders.
Single Points of Failure (SPOF): any component within the IT Infrastructure that has no back-up capability and can cause impact to the business and user when it fails. It is important that no unrecognized single points of failure exist within the IT Infrastructure design. The use of Component Failure Impact Analysis (CFIA) as a technique to identify single points of failure is recommended. CFIA can be used to identify business and user impact and help to determine what alternatives can or should be considered to cater for the weakness in design.
Risk Analysis & Management: To assess the vulnerability to failure within the configuration and capability of the IT support organization, it is recommended that the proposed IT Infrastructure, service configurations, service design and supporting organization (internal and external suppliers) are subject to a formal Risk Analysis. CRAMM is a technique that can be used to identify justifiable countermeasures that can protect the Availability of IT Systems.
Testing & Simulation: To assess whether new components within the design can match the stated requirements, it is important that the testing regime used ensures that the expected Availability can be delivered. Simulation tools to generate the expected user demand for the new IT Service should be seriously considered to ensure components continue to operate under volume and stress conditions.
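The use of CFIA to find single points of failure, as recommended above, can be approximated with a small component-to-service mapping. This is a simplified sketch rather than the full CFIA grid: the service names, component names and data layout are hypothetical.

```python
def single_points_of_failure(services):
    """Return {component: set of services} for components that have no back-up.

    services maps each service to the functions it needs; each function lists
    the components able to provide it. A function with exactly one provider
    makes that component a single point of failure for the service.
    """
    spofs = {}
    for service, functions in services.items():
        for providers in functions.values():
            if len(providers) == 1:
                spofs.setdefault(providers[0], set()).add(service)
    return spofs

# Hypothetical configuration: email has two servers, payroll has one,
# and both services share a single router.
services = {
    "email": {"server": ["srv1", "srv2"], "network": ["router1"]},
    "payroll": {"server": ["srv3"], "network": ["router1"]},
}
spofs = single_points_of_failure(services)
```

Here the shared router affects both services when it fails, which makes it the first candidate for an alternative in the design.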
The second stage is to re-evaluate the IT Infrastructure design if the Availability requirements cannot be met and identify cost justified design Changes. The focus here is on improving the design. Improvements in design to meet the Availability requirements can be achieved by reviewing the capability of the technology to be deployed in the proposed IT Infrastructure design. Achieving a consistent and sustained level of high Availability requires investment in, and deployment of, effective Service Management processes, systems management tools, high Availability design and ultimately special solutions with full redundancy. This slide illustrates that achieving higher levels of Availability requires investment in more than just the base products and components.
Before any SLR is accepted and ultimately the SLA is agreed between the business and the IT organization, it is essential that the Availability requirements of the business are analyzed to assess if/how the IT Infrastructure can deliver the required levels of Availability. Availability Management plays an important role in translating the business and user requirements into quantifiable Availability terms and conditions. It is important that the business is consulted early in the development lifecycle so that the business Availability needs of new or enhanced IT Services can be costed and agreed. This should be documented and agreed. What the business understands by downtime may differ from the IT perspective. Documenting and agreeing all plans avoids misunderstandings and enables subsequent design activities to commence with a clear, unambiguous understanding of what is required.
Determining the Availability requirements is likely to be an iterative process, particularly where there is a need to balance the business Availability requirement against the associated costs. The steps are:
• Determine the business impact caused by loss of service.
• From the business requirements, specify the Availability, reliability and maintainability requirements for the IT components controlled by the support organization.
• For IT Services and components provided externally, identify the serviceability requirements.
• Estimate the costs involved in meeting the Availability, reliability, maintainability and serviceability requirements.
• Determine with the business if the costs identified in meeting the Availability requirements are justified.
• Determine from the business the costs likely to be incurred from loss or degradation of service.
Where these are seen as cost justified, define the Availability, reliability, maintainability and serviceability requirements in agreements and negotiate into contracts.
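The balance between the cost of meeting an Availability requirement and the cost of lost service can be illustrated with a simple comparison. Everything in this sketch is an assumption made for illustration: the design options, their costs, the expected downtime figures and the hourly cost of loss are invented, and a real assessment would be agreed with the business.

```python
def total_annual_cost(option, loss_per_hour):
    """Cost of providing the option plus the business cost of its expected downtime."""
    return option["annual_cost"] + option["expected_downtime_hours"] * loss_per_hour

# Hypothetical design options with assumed annual cost and expected downtime.
options = [
    {"name": "basic", "annual_cost": 10_000, "expected_downtime_hours": 88},
    {"name": "resilient", "annual_cost": 50_000, "expected_downtime_hours": 9},
    {"name": "full redundancy", "annual_cost": 200_000, "expected_downtime_hours": 1},
]
loss_per_hour = 1_000  # assumed business cost of one hour of lost service

best = min(options, key=lambda option: total_annual_cost(option, loss_per_hour))
best_total = total_annual_cost(best, loss_per_hour)
```

With these assumed figures the middle option has the lowest combined cost; a higher cost of loss would shift the balance towards full redundancy, which is why the process is iterative.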
Positive outcomes of managing failure situations:
• Normal business operations are resumed quickly to minimize impact to the business and user
• The availability requirements are met within the cost parameters set, as a result of timely and effective recovery reducing the amount of downtime incurred by the business
• The IT organization is seen as responsive and business focused.
Providing focus on the ‘design for recovery’ aspects of the overall Availability design can ensure that every failure is an opportunity to maintain and even enhance business and user satisfaction.
The role of Incident Management & Service Desk: avoid small incidents becoming major by ensuring the right people are involved early enough to avoid mistakes and ensure appropriate business and technical recovery procedures are invoked.
Understanding the Incident ‘lifecycle’: It is important that each incident passes through each stage (activity) of the Incident Management process.
Systems Management: The provision of system management tools positively influences the levels of Availability that can be delivered. Implementation and exploitation should have a strong focus on achieving high Availability and enhanced recovery objectives.
Diagnostic data capture procedures: if the time taken to capture diagnostics is considered excessive, a review should be instigated to identify ways of streamlining. This should also look at the scope of the diagnostics.
Determine backup and recovery requirements: should cover hardware, software and data. Document recovery plans.
Develop and test a backup and recovery strategy and schedule: anticipating and preparing for recovery, so that reinstatement of service is effective and efficient, requires development and testing of appropriate recovery plans. This will highlight approximate timings and communication requirements.
Recovery metrics: Gathered from the other elements.
Backup and recovery performance: Availability Management must continuously seek and promote faster methods of recovery for all potential Incidents. This can be achieved via a range of methods including automated failure detection, automated recovery, more stringent escalation procedures, and exploitation of new and faster recovery tools and techniques.
Service restoration and verification: Where confirmation from a user that the incident is resolved and the service is restored is not possible (e.g. internet based services, ATM services), it is recommended that IT Service verification procedures are developed to enable the IT support organization to verify that a restored service is now working as expected, e.g. visual checks of transaction throughput or user simulation scripts that validate the end-to-end service.
Availability Management has a close link with Security Management. During the gathering of Availability requirements it is important that the requirements for IT security are defined. These requirements are then applied within the design phase. For many organizations the approach taken to IT security is covered by an IT security policy owned and maintained by Security Management. In execution of the security policy, Availability Management plays an important role in its operation for new IT Services. Typical security considerations that must be addressed:
• Access for authorized personnel only
• Recovery from failure with reference to CIA
• Recovery within secure parameters, i.e. not compromising security policy
• Physical access for authorized personnel only
• OLA’s and UC’s must reflect the adherence to security controls.
Possible problems with implementation:
• The IT organization views Availability as a responsibility of all senior managers and is reluctant to justify the costs of appointing a single individual to be accountable
• The IT organization has difficulty understanding how Availability Management can make a difference, particularly where there are existing disciplines in other ITSM processes, e.g. Change, Incident, Problem Management
• The IT organization views current levels of availability as good and sees no reason for the creation of a new role
• The IT organization fails to delegate the appropriate authority to enable the process owner for Availability Management to influence all areas of the IT organization.
6.5
ORGANIZING – Availability Management
Once the requirements for managing scheduled maintenance have been defined and agreed, these should be documented, as a minimum, in the following:
• SLA’s
• OLA’s
• Underpinning Contracts
• Change Management schedules
• Release Management schedules.
The areas responsible for implementing and managing Change, i.e. Service Desk, Network Management and Computer Operations, need to be aware of the maintenance targets and any future revisions.
Minimizing Business Impact areas of focus:
• Assessing service Impact: e.g. CFIA
• Scheduling downtime
• Aggregation of maintenance activity
• Service maintenance objectives.
Key:
AST = Agreed Service Time
DT = Actual downtime during agreed service time
Additionally, the outputs of these calculations can also be used as input to any Availability modeling tools that are available.
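The calculation that this key belongs to appeared on the slide and is not reproduced in these notes. The sketch below assumes the commonly used form, Availability % = (AST - DT) / AST x 100; the function name and the sample figures are hypothetical.

```python
def availability_percent(agreed_service_time, downtime):
    """Availability % = (AST - DT) / AST * 100, with AST and DT in the same unit."""
    return (agreed_service_time - downtime) * 100 / agreed_service_time

# Hypothetical month: 200 agreed service hours, 5 hours of downtime within them.
monthly_availability = availability_percent(agreed_service_time=200, downtime=5)
```

Downtime outside the agreed service time is excluded by definition, so only outages within AST should be passed in as DT.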
Availability Management works closely with Incident Management and Problem Management in the analysis of unavailability incidents. A good technique to help with this technical analysis of incidents affecting the availability of components and IT Services is to take an Incident Lifecycle view. Incidents can be broken down into stages which can be timed and measured. The stages are:
• Incident start
• Incident detection
• Incident diagnosis
• Incident repair
• Incident recovery
• Incident restoration
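Timing the stages listed above can be sketched as follows, assuming a timestamp (in minutes from the start of the incident) is recorded as each stage is reached. The dictionary layout, the helper name and the sample timings are hypothetical.

```python
STAGES = ["start", "detection", "diagnosis", "repair", "recovery", "restoration"]

def stage_durations(minutes_at_stage):
    """Return the elapsed minutes between each pair of consecutive lifecycle stages."""
    return {
        f"{earlier} -> {later}": minutes_at_stage[later] - minutes_at_stage[earlier]
        for earlier, later in zip(STAGES, STAGES[1:])
    }

# Hypothetical incident: minutes elapsed when each stage was reached.
incident = {"start": 0, "detection": 5, "diagnosis": 25,
            "repair": 55, "recovery": 70, "restoration": 80}
durations = stage_durations(incident)
longest = max(durations, key=durations.get)
```

The longest interval points to the part of the lifecycle where improvement effort is likely to shorten downtime the most.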
The traditional IT approach to measurement and reporting provides indicators of IT Availability and component reliability, which are important for the internal IT support organization. However, to the business and user these measures fail to reflect Availability from their perspective and are rarely understood. The suggested approach is to divide the focus between:
• Measuring User Availability
• Business driven measurement and reporting (see next slide)
The methodology employed to reflect User Availability could consider 2 approaches:
• Impact by user minutes lost – base calculations on the duration of downtime multiplied by the number of users impacted.
• Impact by business transaction – base calculations on the number of business transactions that could not be processed during the period of downtime.
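Both approaches reduce to simple arithmetic, sketched below. The function names and figures are hypothetical, and estimating unprocessed transactions from a normal transaction rate is an assumption; where real counts of rejected or queued transactions exist, they should be used instead.

```python
def user_minutes_lost(downtime_minutes, users_impacted):
    """Duration of downtime multiplied by the number of users impacted."""
    return downtime_minutes * users_impacted

def transactions_lost(downtime_minutes, normal_transactions_per_minute):
    """Estimate of transactions not processed, assuming the normal rate would have applied."""
    return downtime_minutes * normal_transactions_per_minute

# Hypothetical outage: 30 minutes, 200 users, normally 12 transactions per minute.
lost_user_minutes = user_minutes_lost(30, 200)
lost_transactions = transactions_lost(30, 12)
```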
This approach ensures that SLA’s and IT Availability reporting are based on measures that are understood by both the business and IT. By measuring the VBF’s (Vital Business Functions) that rely upon IT, measurement and reporting become business driven, with the impact of failure reflecting the consequences to the business. Establishing the VBF to measure is something that needs to be agreed with the business. Benefits of this method:
• Provides a common measure everyone can understand
• Visibly demonstrates to the business tangible improvement, creating added value
• Easily identifies degrading levels of service to enable pro-activity
• Demonstrates user and business impact with suppliers to drive productivity.
Problems with this method:
• Relating business experience to Incidents, if no end-to-end monitoring is done
• Determining who owns the VBF measurements and data
• Integration and mapping of this data with IT component Availability data
VBF: to provide measurement and reporting that demonstrates the consequences of IT Availability on the business functions key to the business operations.
Application Services: to provide measurement and reporting of the application services required to run the business operation and service user input.
Data: to provide measurement and reporting of data Availability that is essential to support the business operation.
Key Components: to provide measurement and reporting that reflects Availability, Reliability and Maintainability of IT Infrastructure components supplied and maintained by the IT organization.
Platform: to provide measurement and reporting of the IT platform that ultimately supports the processing of the business applications.
Reporting Dimensions: to provide a balanced and meaningful view of the Availability of an IT Service or component, reporting should consider:
Availability – to provide measurement and reporting that reflects Availability against defined and agreed targets.
Reliability – to provide measurement and reporting that reflects the frequency of failures.
Maintainability – to provide measurement and reporting that reflects the duration of failures.
Response Times – to provide measurement and reporting that reflects the performance as experienced by the user. (Capacity Management may supply reports.)
IT Availability Metrics Model (ITAMM): This metrics model is a recommended aid to considering the range of measures and reporting dimensions that should be borne in mind when establishing Availability measurement and reporting. Availability measurement and reporting produced to support the Availability Management process can be used as input into other ITSM processes, e.g.:
Capacity Management – to highlight trends that indicate capacity or response time issues.
FMIT – to provide cost of failure information and incorporate availability levels into profit/cost models.
SLM – to provide reporting for SLA and OLA activities.
Incident and Problem Management – to highlight problem black spots impacting availability.
Change – to show Availability impact due to poor quality change, and the % of planned maintenance activities that have overrun their agreed Service Maintenance Objectives.
During the production of The Plan, it is recommended that liaison with the following functional areas is undertaken:
• SLM – concerning changing business and user requirements for existing IT Services.
• ITSCM – concerning business impact and resilience improvements.
• Business Relationship Management – to understand major customer concerns and/or future needs that relate to IT Availability.
• Capacity Management – concerning the scenarios for upgrading the software, hardware and network layers...
• FMIT – concerning cost and budget implications.
• Application Management – concerning the availability requirements of new services.
• Areas responsible for IT supplier management and the managing of relationships and contracts with suppliers.
• Technical support groups responsible for testing and maintenance functions, concerning the reliability and maintainability of existing services.
The Plan should cover a period of 1-2 years with a more detailed view of information for the first 6 months. It should be reviewed regularly, with minor revisions every quarter and major revisions every half year.
Important considerations:
• Availability Management monitoring and trend analysis
• Determine (changed) Availability requirements
• The costs of improving Availability
• Methods and techniques (students are advised to read pages 262-289, section 8.9, of the Service Delivery book):
• CFIA
• FTA
• CRAMM
• SOA
• TOP
6.6 OPTIMIZING – Availability Management
Availability Management can provide the IT organization with a real business and user perspective on how deficiencies within the IT infrastructure and the underpinning processes and procedures impact the business operation and, ultimately, its customers. The use of business-driven metrics can demonstrate this impact in real terms and, importantly, also help quantify the benefits of improvement opportunities. The wider benefits of Availability Management having a continuous improvement focus within the IT support organization are that it:
• Provides direction to best exploit skills and competencies
• Creates an understanding of how the business uses the technology
• Can identify ‘quick win’ low cost improvements
• Delivers incremental Availability improvement
• Provides positive feedback to staff on ‘how they made a difference’
• Demonstrates to the business the added value of the IT support organization
• Helps to promote a ‘service culture’
6.7 MANAGING – IT Service Continuity Management
It is not possible to have effective ITSCM without support from the business. These are the four stages of the Business Continuity lifecycle that have particular emphasis on IT aspects.
The extent to which these activities need to be considered during the initiation process depends on the contingency facilities that have already been applied within the organization. Some parts of the business may have an established continuity plan based on manual workarounds, and the IT organization may have developed its own disaster plans for supporting critical systems. However, effective ITSCM depends on supporting critical business functions and ensuring that the available budget is applied in the most appropriate way.
Policy setting: a policy should be established and communicated as soon as possible so that all members of the organization involved in BCM issues are aware of their responsibilities to comply with and support ITSCM. As a minimum, the policy should set out management intention and objectives.
Specify terms of reference and scope: this includes defining the scope and responsibilities of managers and staff, and work methods. It also covers risk assessment, BIA and the ‘command and control’ structure required to support a business interruption. Other issues to consider are outstanding audit issues; regulatory, insurance or client requirements; and compliance with other standards such as BS 7799 (Security Management) or ISO/IEC 20000.
Allocate resources: effective BCM and ITSCM require considerable resources in terms of money and manpower. Depending on the maturity of the organization, external consultants may be required to assist with the BIA and similar activities.
Define the project organization and control structure: it is advisable to use a standard project planning methodology such as PRINCE2, complemented by a project planning tool. The appointment of an experienced project manager who reports to a steering committee and guides the work groups is the key to success.
Agree project and quality plans: these enable the project to be controlled and variances addressed. Quality plans ensure that deliverables are achieved, and to an acceptable level of quality.
This stage provides the foundation for ITSCM and is a critical component in order to determine how well an organization will survive a business interruption or disaster and the costs that will be incurred.
The Business Impact Analysis (BIA) identifies the minimum critical requirements to support the business. The purpose of the BIA is to assess this by identifying:
• Critical business processes.
• The potential damage or loss that may be caused to the organization as a result of a disruption to critical business processes.
The BIA also identifies:
• The form that the damage or loss may take, e.g. lost income, reputation or competitive advantage.
• How the degree of damage or loss is likely to escalate.
• The staffing and skills necessary to continue operating at acceptable levels.
• The time within which minimum staffing, facilities and services should be recovered.
• The time within which all required business processes and supporting staff should be fully recovered.
Impacts are measured against particular scenarios for each business process, such as an inability to invoice for a period of days. It is also important to understand how impacts change over time: it may be possible to function without a particular process for a short period, but over a longer period its absence could become critical.
ITSCM ensures that contingency options are identified so that the appropriate measure can be applied at the appropriate time to keep the business impact of service disruptions to a minimum.
Identify risks: i.e. risks to particular IT Service components (assets) that support the business process and whose failure would cause an interruption to service.
Assess threat and vulnerability levels: the threat is defined as ‘how likely it is that a disruption will occur’ and the vulnerability is defined as ‘whether, and to what extent, the organization will be affected by the threat materializing’.
Assess the levels of risk: the overall risk can then be measured. This may be done quantitatively, as a measurement, if quantitative data has been collected, or qualitatively, using a subjective assessment of, for example, low, medium or high.
Following the Risk Assessment it is possible to determine appropriate countermeasures or risk reduction measures to manage the risks, i.e. reduce the risk to an acceptable minimum level or mitigate the risk.
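The qualitative assessment described above could be sketched as follows. This is only one possible way of combining the two levels. The scoring rule (adding the two level indexes and banding the result) and the asset names are assumptions made for illustration, not a method prescribed by ITIL or CRAMM.

```python
LEVELS = ["low", "medium", "high"]


def assess_risk(threat: str, vulnerability: str) -> str:
    """Combine qualitative threat and vulnerability levels into a risk level.

    Assumption: the combined score is the sum of the two level indexes
    (0 to 4), banded into low (0-1), medium (2) and high (3-4).
    """
    score = LEVELS.index(threat) + LEVELS.index(vulnerability)
    if score <= 1:
        return "low"
    if score == 2:
        return "medium"
    return "high"


# Hypothetical assets with (threat, vulnerability) levels.
assets = {
    "order processing server": ("high", "high"),
    "e-mail gateway": ("medium", "low"),
    "intranet wiki": ("low", "low"),
}
for asset, (threat, vulnerability) in assets.items():
    print(f"{asset}: {assess_risk(threat, vulnerability)} risk")
```

Assets that come out as high risk would be the first candidates for the countermeasures and risk reduction measures mentioned above.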
A threat is dependent on such factors as:
• For deliberate service disruptions: the likely motivation, capability and resources. Examples given include malicious damage to computer systems, commercial failure of a key technology provider, attacks against an organization’s web servers and corruption of internet sites.
• For accidental service disruptions: the organization’s location, environment, and the quality of internal systems and procedures.
Business processes are vulnerable where there are single points of failure for the delivery of IT Services.
Risk reduction measures include:
• A comprehensive backup and recovery strategy, including off-site storage.
• Elimination of single points of failure, e.g. a single power supply from a single utility organization.
• Outsourcing services to more than one provider.
• Resilient IT systems and networks, constantly change-managed to ensure maximum performance in meeting increasing business requirements.
• Greater security controls, e.g. physical access control using swipe cards.
• Better controls to detect local service disruptions, e.g. fire detection systems linked with suppression systems.
• Improved procedures to reduce the likelihood of errors or failures, e.g. Change control.
IT Recovery Options need to be considered for:
• People and accommodation
• IT systems and networks
• Critical services, e.g. power, water, telecommunications etc.
• Critical assets, e.g. paper records and reference material
The costs and benefits of each option need to be analyzed, including a comparative assessment of:
• Ability to meet business recovery objectives
• Likely reduction in the potential impact
• Costs of establishing the option
• Costs of maintaining, testing and implementing the option
• Technical, organizational, cultural and administrative implications
It is important that the organization checks that the recovery options chosen are capable of implementation and integration at the time they are required, and that the required service recovery can be achieved.
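One way to make the comparative assessment of options explicit is a simple weighted score, sketched below. The weights, the 1 to 5 rating scale and the two options are all hypothetical. The fifth criterion (technical, organizational, cultural and administrative implications) is left out here on the assumption that it remains a qualitative judgement.

```python
# Hypothetical weights for four of the comparison criteria (they sum to 1.0).
WEIGHTS = {
    "meets_recovery_objectives": 0.4,
    "impact_reduction": 0.3,
    "setup_cost": 0.15,
    "running_cost": 0.15,
}


def score_option(ratings):
    """Weighted score from ratings of 1 (worst) to 5 (best) per criterion.

    For the two cost criteria a higher rating means a cheaper option.
    """
    return sum(WEIGHTS[criterion] * ratings[criterion] for criterion in WEIGHTS)


# Illustrative ratings only.
options = {
    "option A": {"meets_recovery_objectives": 5, "impact_reduction": 5,
                 "setup_cost": 1, "running_cost": 1},
    "option B": {"meets_recovery_objectives": 3, "impact_reduction": 3,
                 "setup_cost": 4, "running_cost": 4},
}
ranked = sorted(options, key=lambda name: score_option(options[name]), reverse=True)
for name in ranked:
    print(f"{name}: {score_option(options[name]):.2f}")
```

A score of this kind only supports the decision. The chosen option still has to be checked for feasibility at the time it is required, as noted above.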
As with Recovery Options, it is important that the reduction of one risk does not increase another. The risk to the Availability of systems and data may be reduced by outsourcing to an off-site third party; however, this potentially increases the risk of compromise of confidential information unless rigorous security controls are applied.
Typical responsibilities for ITSCM in planning for and dealing with a disaster are similar to the way First Aid Officers and Fire Wardens act in planning and operational roles. Skill requirements for the ITSCM Manager and staff:
• Knowledge of the business (to help set priorities)
• Calm under pressure
• Analytical (problem solving)
• Leadership and team playing
• Negotiation and communication
6.8 ORGANIZING – IT Service Continuity Management
The Implementation stage consists of the following processes:
• Establish the organization and develop implementation plans
• Implement stand-by arrangements
• Implement risk reduction measures
• Develop IT recovery plans
• Develop procedures
• Undertake initial tests
Each of the above is considered with respect to the specific responsibilities that IT must action.
Organization Planning
Executive: includes senior management/the executive board, with overall authority and control within the organization, responsible for crisis management and liaison with other departments, divisions, organizations, the media, regulators, emergency services etc.
Co-ordination: typically one level below the Executive group and responsible for co-ordinating the overall recovery effort within the organization.
Recovery: a series of business and service recovery teams representing the critical business functions and the services that need to be established to support these functions. Each team is responsible for executing the plans within its own area and for liaison with staff, customers and third parties. Within IT, the recovery teams should be grouped by IT Service and application.
Plan development is one of the most important parts of the implementation process; without workable plans the process will certainly fail. At the highest level there is a need for an overall co-ordination plan.
Phase 1: these plans are used to identify and respond to a service disruption, ensure the safety of all affected staff members and visitors, and determine whether there is a need to implement the business recovery process. If there is such a need, the Phase 2 plans are put into effect. These include the key support functions.
As part of the implementation planning process, it is vitally important to review the key and critical contracts required to deliver business-critical services. These contracts should be reviewed to ensure that, where appropriate, they provide a BCM service, that there is a defined and agreed Service Level, and that the contracts remain valid and in force if operations have to switch to the recovery site (either in total or partially). If contracts do not include these details, the service criticality should be reviewed and the risks associated with the service not being provided should be assessed.
All Risk Reduction measures need to be implemented, as discussed with the earlier example slide. This is often achieved in conjunction with Availability Management as many of these reduce the probability of failure affecting the Availability of service. RAID: Redundant Array of Inexpensive Disks
Training and new procedures may be required to operate, test and maintain the stand-by arrangements and to ensure that they can be initiated when required.
Recovery plans should contain sufficient information for a technician who is not familiar with the system to follow the procedures; involve people who are not familiar with the system in performing a recovery test. The recovery plans include key details such as the data recovery point, a list of dependent systems, the nature of each dependency and its data recovery point, system hardware and software requirements, configuration details and references to other relevant or essential information about the system. A check-list is included that covers the specific actions required during all stages of recovery for the system. For example, after the system has been restored to an operational state, connectivity checks, functionality checks, and data consistency and integrity checks should be carried out prior to handing the system over to the business.
Education and Awareness: this should cover the organization as a whole and the IT organization in particular, for service continuity specific items. It ensures that all staff are aware of the implications of Business and Service Continuity and consider them part of their normal routine and budget.
Review: a regular review of all of the deliverables from the ITSCM process needs to be undertaken to ensure that they remain current. With respect to IT, this is required whenever there is a major Change to the IT Infrastructure, assets or dependencies, such as new systems or networks or a change in service providers, as well as whenever there is a change in business direction and strategy or in IT strategy.
Testing: following the initial testing, it is necessary to establish a program of regular testing to ensure that the critical components of the strategy are tested at least annually, or as directed by senior management or audit. It is important that any changes to the IT Infrastructure are included in the strategy, implemented appropriately and tested to ensure they function correctly.
Change Control: following tests and reviews, and day-to-day changes, the ITSCM plan needs to be updated. ITSCM must be included as part of the change management process to ensure that all changes are reflected in the contingency arrangements provided by IT or third parties. Inaccurate plans and inadequate recovery capabilities may result in the failure of ITSCM.
Training: IT may be involved in training non-IT-literate business recovery team members to ensure they have the necessary level of competence to facilitate recovery.
Assurance: the final process in the ITSCM lifecycle involves obtaining assurance that the quality of the ITSCM deliverables is acceptable to senior business management and that operational management processes are working satisfactorily.
6.9 OPTIMIZING – IT Service Continuity Management
The decision to invoke needs to take into account a number of factors:
• The extent of the damage and the scope of the potential invocation.
• The likely length of the disruption and unavailability of the premises and/or services.
• The time of day/month/year and the potential business impact. At year end, the need to invoke may be more pressing to ensure that year-end processing is completed on time.
• Specific requirements of the business, depending on the work being undertaken at the time.
Call trees (communicating with the organization): a mechanism for communicating effectively and efficiently with identified recovery personnel throughout the organization. The Plan should include details of the key personnel to be contacted to initiate the business and ITSCM plans.
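A call tree can be thought of as a simple hierarchy in which each person contacts the people listed beneath them. The sketch below shows one way this might be represented and walked level by level. The role names and the structure of the tree are hypothetical and would come from the organization's own plan.

```python
from collections import deque

# Hypothetical call tree: each person phones the people listed under them.
CALL_TREE = {
    "ITSCM Manager": ["Recovery Team Lead A", "Recovery Team Lead B"],
    "Recovery Team Lead A": ["Technician 1", "Technician 2"],
    "Recovery Team Lead B": ["Technician 3"],
}


def call_order(tree, root):
    """Return (caller, callee) pairs in breadth-first order from the root."""
    calls = []
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        for callee in tree.get(caller, []):
            calls.append((caller, callee))
            queue.append(callee)
    return calls


for caller, callee in call_order(CALL_TREE, "ITSCM Manager"):
    print(f"{caller} calls {callee}")
```

Walking the tree breadth-first reflects the idea that each level is contacted before the next, so no single person has to make every call.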
6.10 SUMMARIES
7 ASSIGNMENTS
7.1 Assignment 1 – Capacity Management
You will complete the following tasks relating to Capacity Management over the course of the day, culminating in a presentation that summarizes the knowledge you have gained. You will need to use the associated case study when responding to these tasks and for the final presentation. You will act in the role of an external consultant who is assisting the organization described in the case study in implementing the Capacity Management process.
Task a) Design and develop your presentation content for Task c
• Create a definition of the Capacity Management process that would easily be understood by management staff who have little or no IT experience.
• List 7 benefits that introducing the Capacity Management process would bring to the organization. Be sure to make clear reference to elements of the case study.
Trainers will be looking for organizational and writing skills, relevance and accuracy of contents, clear references to the Rece Case Study, inclusion of terminology and appropriateness for the audience.
Task b) Demand Management – Manage and organize
Using the Case Study, create a demand model that includes all the activities that will need to be performed to ensure that the demand on the POS system, which ultimately led to its failure, is managed, organized and optimized. Your findings will need to be included in the final presentation (Task c). Students will have read the entire Case Study and highlighted relevant areas of concern with regard to this assignment, e.g. the disaster situation that occurred as a result of the POS system failure. Trainers will be looking for analytical, planning and organization skills, suitable use of terminology and applied knowledge of the IPPI processes. The results of this assignment will be presented as part of the final presentation (Task c).
Task c) Final Presentation
Acting as an external consultant, you are required to deliver a presentation to the CIO and managing directors who are considering implementing Capacity Management. Your presentation will need to include:
• A brief description of Capacity Management
• What are the goals of Capacity Management?
• What are the activities that are performed?
• Who will be involved (roles and responsibilities)?
• What are the benefits that would be delivered to the organization?
• What challenges may be faced and how would these be overcome?
• Conclusion, including how implementation of the Capacity Management process will optimize the current performance of Rece services.
Trainers will be looking at the presentation skills of the student, specifically written and communication skills. The content of the presentation should be accurate, contain suitable terminology and be appropriate for the suggested audience. Students need to present their findings from Task b, and these should be included in the presentation at a suitable juncture. The presentation should address specific issues with regard to the Rece Case Study, and students can also be assessed on their ability to answer questions on the presentation and on the IPPI process itself.
7.2 Assignment 2 – Availability Management
You will complete the following tasks relating to Availability Management over the course of the day, culminating in a presentation that summarizes the knowledge you have gained. You will need to use the associated case study when responding to these tasks and for the final presentation. You will act in the role of an external consultant who is assisting the organization described in the case study in implementing the Availability Management process.
Task a) Develop Proposal
• Create a definition of Availability Management that would easily be understood by management staff who have little or no IT experience.
• List 5 benefits that introducing Availability Management would bring to the organization. Be sure to make clear reference to elements of the case study.
Task b) Design the draft contents of an Availability Plan
Design the contents page for the Availability Plan, with a detailed supporting explanation. Your explanation will have to specifically address issues from the Case Study in describing how this plan will help to improve service performance.
Task c) Memo – TOP
Organize a formal memo. This memo will include an explanation/definition of what a TOP is, and its benefits, specifically related to issues within the Case Study. The memo will be addressed to all the technical groups you consider to be relevant, with regard to their contribution in identifying effective and efficient solutions.
Task d) SIP
Using the Case Study, prepare a written proposal for the Service Level Manager. This proposal will provide information that will feed into the Service Improvement Plan, and provide valuable improvement ideas.
Task e) Final Presentation
Acting as an external consultant, you are required to deliver a presentation to the CIO and managing directors who are considering implementing Availability Management. Your presentation will need to include:
• A brief description of Availability Management
• What are the goals of Availability Management?
• What are the activities that are performed?
• Who will be involved?
• What are the benefits that would be delivered to the organization?
• What challenges may be faced and how would these be overcome?
• Conclusion.
7.3 Assignment 3 – IT Service Continuity Management
You will complete the following tasks relating to ITSCM over the course of the day, culminating in a presentation that summarizes the knowledge you have gained. You will need to use the associated case study when responding to these tasks and for the final presentation. You will act in the role of an external consultant who is assisting the organization described in the case study in implementing the ITSCM process.
Task a) Design and develop your presentation for Task d
• Create a definition of the ITSCM process that would easily be understood by management staff who have little or no IT experience.
• List 7 benefits that introducing the ITSCM process would bring to the organization. Be sure to make clear reference to elements of the case study.
Task b) ITSCM Plan
Using the Case Study, identify the reasons why the ITSCM plan was not effective and when the plan should have been invoked. Explain what you would have done differently and why. Your findings will need to be included in your final presentation (Task d).
Task c) BCM
Using the new additions to the Business Continuity Strategy from the Case Study, organize a written proposal suggesting how the current ITSCM Strategy can be improved to incorporate and support the BCM needs.
Task d) Final Presentation
Acting as an external consultant, you are required to deliver a presentation to the CIO and managing directors who are considering implementing ITSCM. Your presentation will need to include:
• A brief description of ITSCM
• What are the goals of ITSCM?
• What are the activities that are performed?
• Who will be involved?
• What are the benefits that would be delivered to the organization?
• What challenges may be faced and how would these be overcome?
• Conclusion.
8 ASSIGNMENT RESOURCES
8.1 Case Study – RECE Shoe Company
Business Case Study RECE Shoe Company Global Conglomerate ITIL Practitioners Case Study V1.0
Rece Couture Shoe Design For the love of craftsmanship
Introduction
Rece Couture Shoe Company of London, England, is a privately owned shoe company founded in 1994. Rece has grown rapidly to become one of the leading shoe retailers in the world. In recent years Rece has expanded substantially, becoming the third most favoured designer shoe label in 2004. The company has also expanded its product line with the introduction of an accessory line. Such spectacular growth has been achieved internally and not through acquisition or merger. Rece has major operations centres in New York, Paris, Milan, Hong Kong and Melbourne, as well as 150 retail outlets worldwide (Rece stores and concessions in major department store locations). In addition, Rece operates an online store with the capacity to accept in excess of 5000 online orders per day. Rece provides an unparalleled service network via its own dedicated offices throughout the world and remains a truly independent and private company, able to respond quickly to market changes and implement long-term plans without unnecessary interference or delay. With a streamlined management structure in London, England, Rece has become a leading customer-focused and cost-effective global retailer. Its first-class craftsmanship is favoured by international stars and elegant women worldwide.
Officers & Employees
CEO: Claire Enever (co-founder and president)
Creative Director: Angela Miller
Group Managing Director, Finance and Administration: Paul J. Rizzo
Group Managing Director, Commercial and Consumer: Bernard Scholl
Group Managing Director, Retailing Business Solutions: Cecile Yelland
Group Managing Director, International Shipping: Anton Chirac
Group Managing Director, Employee Relations: Rebecca Cartwright
Group Managing Director, Network and Technology: Anwar Sadat
Group Managing Director, Sales and Marketing: Steve Birch
Group Managing Director, Press and Public Relations: Carla Scott
Group Managing Director, Convergent Business: Rachael Alfonsin
Group Managing Director, Legal and Regulatory: Pien Ch'iao
2007 Employees: 13,840
1-Year Employee Growth: (7.7%)
Headquarters: Level 42, 76 Oxford Street, London, England
Vision
Rece’s vision is to enhance its position as the leading designer and retailer of designer shoes in the world. To realise this vision and prepare for competition, Rece has adopted a four-part growth strategy, entailing:
• Optimising returns from the ‘classic’ products and services throughout the world
• Developing and delivering value-added services via an online interface and an extended accessories line
• Transforming our corporate culture and improving productivity
• Extending our global scope
The Rece Organization
The organization is virtually identical at each of the head offices. For example, each head office will have the following departments:
• Marketing and Sales
• Design and Manufacturing
• Retailing and Logistics
• Shipping and Distribution
• Customer Services
• Maintenance
• Legal
• Accounts Department
• Human Resources
• ICT Department
The manager of each listed department reports directly to the director of that local office. The CEO of the company and her managing directors are located at Head Office in London, England.
Logistics
The Logistics department’s main responsibility is to ensure that all goods being shipped by sea or land are loaded onto the appropriate carrier. This ensures that Rece can fulfil its obligations regarding paid orders from its customers. The Logistics department organises the loading and consignment of the goods, but it also needs to ensure that the appropriate crews have been notified and that transport is available. The Sales, Manufacturing and Logistics departments work closely together. As a result of the complexity of this relationship, a number of business process issues can arise. In the event that it becomes difficult to ascertain the nature of, or the solution for, an issue, the Logistics department will be given priority to manage the issue to resolution. The Logistics department therefore has the final authority in these decisions.
Maintenance
The Maintenance department is responsible for maintaining and stocking the necessary parts for both road and air transport. Its primary objective is to keep the delivery road vehicles in a good state of repair. Rece operates a number of large container workshops around the world so that maintenance can be carried out more efficiently.
Sales
The Sales department is responsible for obtaining orders from international company chains and stores requiring Rece concession stock. It is also responsible for the online orders made via the Rece website. Rece operates a total of 350 dedicated sales offices across the globe and employs approximately 8,000 sales staff. Because of this large number of people, each regional head office has a Sales Director who looks after the sales offices within that region.
Accounts Department
The Accounts Department takes care of the head office's financial accounts, including the management of the accounts payable and accounts receivable ledgers. The Accounts Department also takes care of the payment of salaries to staff members and any contractors. There are also small Accounts Departments at each key location across the globe. The managers of these departments report directly to the manager at the head office for that region. As a general rule, there are no more than four Accounts Departments per region. The Accounts Departments work very closely with the Sales Departments. The Sales Departments interface with the computer systems controlled by the Accounts Department.
Human Resources
The Human Resources department is involved in the recruitment, selection and discharge of personnel and in human resource management. For example, each head office employs a company doctor and a psychologist, who provide medical and mental assistance to employees. Rece sees this function as being critical to the organization. Sales, Design and some Manufacturing staff are required to travel on a regular basis, sometimes at short notice and for extended periods of time. This can place a serious strain on staff and their families, and as a result Rece wishes to ensure the well-being of its staff members, as it recognises that they are integral to the organization. Rece does this through medical and mental assistance programs, and by offering generous product discount and annual leave packages.
ICT Department
Each head office around the world employs a small team of IT personnel to help deliver and support IT Services for its specific region. For all intents and purposes, these groups run fairly autonomously, having their own support teams, including their own Service Desks. However, in London there is a central ICT Department that provides IT support for all the store and online requirements.
The Rece information systems
General
The computerisation of Rece’s information systems has not been fully completed at this stage. However, certain aspects can be considered fairly mature. A large part of the financial accounts of the entire Rece organization has been computerised. Because of the geographical spread of the head offices, a need was identified for a virtually identical computerisation standard at these offices. Unfortunately, a large part of the order handling and planning processes has not been computerised as well as it could be. This is generally due to local constraints enforced by the relevant government organizations within the various regions in which Rece operates. Resolving this is seen as a key aspect of the success of the Rece organization.
Systems
FINANCE is the information system used by the Accounts Department to prepare financial reports. FINANCE contains modules for accounts receivable, accounts payable, salary records and book keeping. After an order is completed, the relevant invoices are created automatically. Payments to suppliers of shipping and payments to providers of specialist maintenance are also made by means of this system. Because the various head offices need to comply with local laws regarding finances, the FINANCE system has evolved to be the most diverse system in the organization, being different at virtually every head office.
PAYPOINT is the company’s point of sale system. It is due for an update: the technology in use is out of date and not up to the standard the company wants to represent, which is to be state of the art but at the right cost.

STOCK is the Rece Inventory Management system. It holds the list of goods and materials held in stock by the business. Inventory is held to hide from the customer the fact that the manufacture/supply delay is longer than the delivery delay, and to ease the effect of imperfections in the manufacturing process, which lower production efficiency if production capacity stands idle for lack of materials.

FOCUS is the company’s CRM system, which manages its relationships with customers, including the capture, storage and analysis of customer, vendor, partner, and internal process information. There are three aspects to the FOCUS system:
• Operational - automation or support of customer processes that include the company sales or service reps
• Collaborative - direct communication with customers that does not include the company sales or service reps
• Analytical - analysis of customer data for a broad range of purposes
Currently the third aspect, Analytical, is not used effectively.

General Systems
There is also a general suite of applications, mainly Microsoft with some Lotus Notes, which includes e-mail, a word processor, a spreadsheet application, appointment calendar and scheduling software, and the Human Resources applications... These suites of applications are offered via local networks. The links between local networks and the links between desktop computers are completely transparent to the users. This software is stored centrally on the main servers within the ICT Department in London, which allows remote users to download it as needed. However, at each head office around the world, local versions are kept, as this allows for easier management of the local standard operating environments.

Hardware
Each of the head offices uses a series of UNIX based servers for capturing and recording its information. These servers have direct data links back to the head office in London, where the information is then stored on a central mainframe. The mainframe is equipped with disk and tape storage facilities.
However, across the organization the network structure is fairly similar at each of the head offices. This allows for easier deployment of applications across the entire organization.

Additional Infrastructure Information
The wide area network has been outsourced. The outsourcer manages and coordinates the leasing of the necessary network infrastructure, and is responsible for providing monitoring information regarding the availability of the wide area network. The cost of the organization maintaining the WAN infrastructure itself was considered too great. However, the organization has recently come to see that the infrastructure may not be as stable as proposed, and is looking to the IT Departments to manage this in a more structured manner. This instability has resulted in incomplete transactions being fed back into the central mainframe, potentially costing the organization millions of dollars.

IT Organization
Due to the diverse nature of Rece, there are a number of IT Departments. However, policy and objectives for IT are created and managed from the London office. London is seen as the central IT Department. However, autonomy is provided to the other head offices to manage, deliver and support IT Services as they need. As such, each head office has a Chief Information Officer (CIO), who reports to the Group Managing Director, Network and Technology. All CIOs have a monthly video conference call, with a bi-annual face-to-face meeting. At the bi-annual meeting, strategy and policies for Information Services and Information Technology are discussed and agreed upon. This meeting is chaired by the Group Managing Director, Network and Technology. In addition to this, there are regular consultations between the head offices regarding technical matters. Telephone and e-mail are the means most commonly used for this, with the occasional video conference.
There is, however, a general consensus amongst the senior IT Managers that the IT functions within Rece could probably contribute more to the business objectives of the company. Nevertheless, they all agree that in general terms the deployment and organization of its IT resources is reasonably good:
• Communication internationally is regular with good information and results
• Head Offices are communicating well
• The technical infrastructure has been extensively documented by each head office
London
London uses a Fujitsu 8500 series mainframe as its central computer. The mainframe centralizes all the data from the other head offices and has approximately 3000 GB of disk storage. Due to the size of the organization, it was determined that there also needed to be a development mainframe, although scaled down in size. At this stage the organization does not have a testing mainframe and, as such, most of the testing is carried out in the development environment. The remote head offices can access the mainframe via deployed client applications and, for some specific uses, via the World Wide Web.

Sales Offices
The following IT components have been installed at each local sales office:
• Personal Computers
  • Operating System: Windows 2000
  • No CD ROMs
  • No Floppy Disk Drives
  • Pentium III - 500 MHz
• At least 1 Server
  • Operating System: Windows 2000 Server
This allows individual sales offices to enter shipping orders into the system, as well as create and distribute information via email for their local regions. This is seen as a key aspect, as each sales office is responsible for generating its own revenue.

General - Other Head Offices
At certain times and for certain parts of the infrastructure, development and maintenance are contracted out to various suppliers with whom a maintenance agreement has been signed. At this stage Rece does not have any reciprocal arrangements with any other company in the event of a disaster. The IT organizational structure of each of the head offices is as follows:
• Regional IT Manager - overall manager for the specific region
• Network Manager - Local Area Network infrastructure
• Project Manager - responsible for testing and coordinating any modifications to the systems and solving small problems
• Service Desk - manager specialising in the management of the service desk representatives and dealing with IT Incidents
• Desktop Manager - responsible for managing and coordinating the roving engineers
• Change Control Co-ordinator
In addition to this, there is a group of roving engineers who travel their local region and are trained to solve the most commonly occurring problems independently. If an engineer is unable to do this, the head office is contacted. The call is then routed through to the local head office and, if resolution is still not possible, suppliers who are able to solve the problem are called in.

Issues to be Considered
A. At the end of 2006 Rece experienced a disaster situation. The POS system failed. The SLA states that if this situation continues for more than 4 hours, during business hours, it is considered a breach. The system was down for just over 9 hours. The ITSCM plan was invoked after 4 hours 5 minutes. Little is known about the other activities that took place throughout the disaster situation, as no ongoing records were taken. What is known is that staff felt uninformed and unsupported: they were not given sufficient progress reports as to what was being done to rectify the situation or when they could advise customers that the service would be back up and running. Little or no training had been provided on how staff could perform the POS activities manually, and there were no written procedures for them to follow. In discussion with the staff, they commented that they were aware of one senior manager whom they could contact in the event of a situation, but when they attempted to contact him they were informed that he was out of the country on annual leave for the next 10 days. Staff also advised that they had been experiencing issues with the POS system for up to 3 hours before it completely failed. Seventeen incident reports regarding the POS system had been logged by the Service Desk. Analysis of the system after the fact identified that the ultimate cause of the failure was the number of transactions per second exceeding the service capacity of the POS system.
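The outage timeline in issue A can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the 4-hour breach threshold and the 4 hours 5 minutes invocation time from the case study, and assumes an outage of exactly 9 hours (the text says "just over 9 hours") that fell entirely within business hours. The function and variable names are hypothetical.

```python
from datetime import timedelta

# Figures taken from the case study (issue A).
SLA_BREACH_THRESHOLD = timedelta(hours=4)          # longer outages count as a breach
ITSCM_INVOKED_AFTER = timedelta(hours=4, minutes=5)

# Assumption: "just over 9 hours" is rounded to 9 hours, all within business hours.
OUTAGE_DURATION = timedelta(hours=9)


def is_breach(outage: timedelta, threshold: timedelta) -> bool:
    """An outage breaches the SLA when it lasts longer than the threshold."""
    return outage > threshold


def time_over_threshold(outage: timedelta, threshold: timedelta) -> timedelta:
    """How far the outage ran past the threshold (zero if it did not)."""
    return max(outage - threshold, timedelta(0))


breached = is_breach(OUTAGE_DURATION, SLA_BREACH_THRESHOLD)
overrun = time_over_threshold(OUTAGE_DURATION, SLA_BREACH_THRESHOLD)
invocation_delay = ITSCM_INVOKED_AFTER - SLA_BREACH_THRESHOLD

print(f"SLA breached: {breached}")
print(f"Time beyond the breach threshold: {overrun}")
print(f"ITSCM plan invoked {invocation_delay} after the breach point")
```

Under these assumptions the plan was invoked only after the service was already in breach, which is the kind of observation a post-incident review would be expected to record.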
Following this disaster, Rece has made some revisions to the company’s Business Continuity Plan.
1. The vital business functions have been discussed and redefined. The POS system is considered to be a VBF. Therefore, the service level requirement is that any failure that is not restored within 2.5 hours will be considered a breach.
2. A manual system with supporting procedure for staff...

B. Rece plans to expand the retail network by opening 5 new stores within the next 18 months. Furthermore, within the same timeframe, Rece is looking to develop the website and increase online orders (currently approximately 200 orders per day on average worldwide) by 25%. The development of the website has not been a priority over the last two years, so the internal perception is that it looks dated and does not reflect the ‘expensive’ and polished look the company is aiming for. The website has proved to be popular with customers all over the world; their only concerns have been with regard to availability times and website security for online credit card payments. The agreed downtime, within the SLA, is two hours per week. On average the downtime has been four hours per week.
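The figures in issue B translate directly into availability percentages and an order target. The following is a minimal sketch, assuming the website is expected to be available around the clock (168 hours per week), which the case study does not state explicitly; the function name is hypothetical.

```python
HOURS_PER_WEEK = 7 * 24  # assumption: 24x7 agreed service hours for the website


def availability_pct(downtime_hours: float, service_hours: float = HOURS_PER_WEEK) -> float:
    """Availability = (agreed service time - downtime) / agreed service time, in percent."""
    return (service_hours - downtime_hours) / service_hours * 100


agreed = availability_pct(2)   # SLA: two hours of downtime per week
actual = availability_pct(4)   # observed average: four hours of downtime per week

current_orders_per_day = 200   # approximate worldwide average from the case study
target_orders_per_day = current_orders_per_day * 1.25  # planned 25% increase

print(f"Agreed availability: {agreed:.2f}%")
print(f"Actual availability: {actual:.2f}%")
print(f"Target online orders per day: {target_orders_per_day:.0f}")
```

If the agreed service hours were shorter than 24x7, the same two and four hours of downtime would represent a larger share of service time, so both percentages would be lower.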
9 MOCK EXAM
QUESTION 1
As a member of the planning and improvement team, you are discussing staffing issues with the manager of the print room where salary slips and invoices for the company's customers are produced. The manager of this department tells you that he uses the Capacity Management method to make sure he is not understaffed. Who is responsible for recommendations on sufficient human resources for the print room?
a) The manager of the print room, because he has adapted the Capacity Management method
b) The manager of the print room, because he runs the department
c) The Plan and Improve team, because it is responsible for making recommendations on IT resources and meeting Business requirements
d) The Plan and Improve team, because it is responsible for the throughput of the print process
QUESTION 2
During the implementation of IT Service Continuity Management (ITSCM) you find out that an amount of crucial data is stored on local workstations. The workstations are also connected to the LAN. If this data is to be available after a possible system crash, a regular backup of all the data is necessary. What is the best way to make sure this backup is realized?
a) ask the employees who are storing data on their workstation to store the data on the network drives
b) ask the IT engineers to make regular backups of the workstations involved
c) plan an awareness session for all employees to make sure data is stored on the network drives
QUESTION 3
What is a benefit of an Availability Plan?
a) an Availability Plan provides the necessary information on current and planned resource utilization of individual Components
b) the frequency and duration of IT Service failures can be reduced over time
c) the IT organization can actively manage the Infrastructure and use systems to reduce the impact of Component failure
QUESTION 4
Senior management wishes to document the benefits of an IT Service Continuity Management (ITSCM) plan and execute regular testing against that plan. Which of the following should not be included as a benefit of successful testing against an IT Service Continuity plan?
a) successful testing against the ITSCM plan demonstrates that the business processes are able to continue to operate in the event of a disaster
b) successful testing against the ITSCM plan generates credibility with customers and business partners
c) the demonstrated ability to meet regulatory requirements
QUESTION 5
As a member of the Plan & Improve team you are asked to advise IT management how to measure IT Availability. Traditionally these measurements have concentrated on Component Availability and are based on a combination of an Availability percentage (%), the amount of time lost and the frequency of failure. Why are these traditional measurements no longer acceptable?
a) manufacturers of Configuration Items (CIs) provide detailed data on resilience
b) much better measurement techniques are available
c) they fail to reflect Availability from a Business perspective
QUESTION 6
During a review of the IT Service Continuity Management (ITSCM) process, you take the following into account:
• Major Changes to the IT Infrastructure
• New systems or networks
• New service providers
What is a fourth element you need to take into account?
a) new Business requirements
b) new Configuration Items (CIs) in the Configuration Management Database
c) new employees in the Recovery team
d) new Incidents since the last review of the IT Service Continuity Management process
QUESTION 7
In the Plan and Improve process monthly reports are produced on the Capacity of the company's systems. While writing your monthly report you find out that free disk space on one system has rapidly decreased since last month. In which document will you sound the alarm for this system?
a) the current monthly report, because you found out in time
b) the next monthly report, because it needs investigation first
c) you write the monthly report and an exception report for this specific system
QUESTION 8
You are initiating the Capacity Management process. How do you find out what Monitoring activities are already being executed?
a) You check the current work instructions on System Monitoring used by the engineers supporting the IT environment.
b) You check which monitoring tools have been installed on the various systems.
c) You interview the system engineers who are responsible for the various systems.
QUESTION 9
You are planning the sub-process Resource Capacity Management in your organization. Your organization is a large mail company with distribution centers all over the country. Because there are various systems and applications, you decide that Capacity Management only identifies the monitoring requirements and that the actual implementation of the process is done locally. In this case, how should you implement the Capacity Database (CDB)?
a) centralized, where the sites send their raw data to the central CDB
b) decentralized, where every site reports on the agreed requirements
c) decentralized with raw data, centralized with an analyzed set
QUESTION 10
What is the output of the sub-process Business Capacity Management?
a) a proactive Change
b) the Capacity Plan
c) the Service Level Requirements
QUESTION 11
While monitoring the systems an exception report is produced, because the required Capacity of certain resources has not been met. Which process is responsible for initiating Changes on the systems?
a) the Change Management process
b) the Incident Management process
c) the Service Level Management process
QUESTION 12
Both tangible and intangible costs result from non-availability of a System. Examples are:
• imposed fines or penalties;
• loss of customer goodwill;
• loss of customers;
• lost User productivity;
• overtime payments.
What is an example of a tangible cost?
a) damage to the business' reputation
b) loss of business opportunities
c) lost revenue
QUESTION 13
Your company is running several systems on a 24/7 basis. You are asked to plan service windows for maintenance on these systems. What is the first step for planning the required service windows?
a) find out what is agreed in the Service Level Agreement (SLA) and plan accordingly
b) find out which components can be serviced concurrently in order to minimize Impact
c) perform a Component Failure Impact Analysis (CFIA) on the systems
d) perform Monitoring to check out what time windows are least critical
QUESTION 14
You are the Availability Manager in your organization. You recognize that the reports currently produced by the IT support organization are mainly focused on IT component Availability. Why is it important to shift to a more User and Business orientated perspective concerning IT Availability?
a) because it is necessary to meet the Service Level Agreements (SLAs)
b) because the other processes need information reflecting Availability
c) because Users need the IT Availability in order to perform business tasks
QUESTION 15
You are defining a policy for IT Service Continuity Management (ITSCM) for your company. Your company has a critical business system "TravelReg" where your customers are working 5 days a week (Monday - Friday) from 08:30 am until 5:30 pm. You guarantee that the system is available for 98 percent of these opening hours. What is the best policy description for this critical business system?
a) The system "TravelReg" is available for the company's customers as agreed in the SLA.
b) The system "TravelReg" will be available on working days during working hours for at least 98 percent during this time.
c) The system "TravelReg" will be available on working days from 08:30 am till 5:30 pm for 98 percent.
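To see what a 98 percent target means in practice for the hours stated in Question 15, a rough calculation can help. The sketch below assumes availability is measured per week against the agreed service hours only; the question does not specify a measurement period, and the variable names are hypothetical.

```python
from datetime import datetime

# Service window from Question 15: Monday to Friday, 08:30 to 17:30.
open_time = datetime.strptime("08:30", "%H:%M")
close_time = datetime.strptime("17:30", "%H:%M")
WORKING_DAYS_PER_WEEK = 5
TARGET_AVAILABILITY_PCT = 98

# Assumption: availability is measured weekly, against agreed service hours only.
minutes_per_day = int((close_time - open_time).total_seconds() // 60)
agreed_minutes_per_week = minutes_per_day * WORKING_DAYS_PER_WEEK
allowed_downtime_minutes = agreed_minutes_per_week * (100 - TARGET_AVAILABILITY_PCT) / 100

print(f"Agreed service time per week: {agreed_minutes_per_week / 60:.0f} hours")
print(f"Downtime allowed within the target: {allowed_downtime_minutes:.0f} minutes per week")
```

Stating the target against explicit service hours, rather than against calendar time, is what makes a figure like this measurable.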
QUESTION 16
The computer systems your organization uses are being hosted by a third party. The production systems (four servers) are placed in one rack. The test systems, configured for multiple test cases (also four servers), are placed in another rack. All systems are up and running. The Recovery Strategy in case of a malfunction of the production system is that the test system will be put online with a backup of the correct software and data. What is the name of this Recovery strategy?
a) immediate Recovery
b) intermediate Recovery
c) gradual Recovery
QUESTION 17
Which of the following is the best description of Demand Management?
a) Demand Management delivers extra resources when the business needs them.
b) Demand Management influences business needs for computing resources.
c) Demand Management inquires into business needs and takes appropriate action.
QUESTION 18
As a result of Capacity Management, Tuning Changes are implemented. Such Changes carry more Impact and risk than other kinds of Changes. Why is the Impact and risk associated with Tuning Changes likely to be greater than that of other types of Changes?
a) Tuning activities can have unexpected side effects. Therefore a thorough Impact analysis is necessary.
b) Tuning Changes can have major implications on the Customers of the service involved.
c) Tuning Changes usually deal with core resources, such as CPU usage, memory and disk usage or network bandwidth.
QUESTION 19
In your role as the manager of the Capacity Management process, you have been made aware of new technology that could be used to improve some of the existing proposed Capacity recommendations. What is the first action you should take?
a) determine the Impact of using this new technology on current projects to properly size applications
b) modify and reissue the Capacity Plan
c) recommend the use of this new technology through a Service Improvement Program
QUESTION 20
The iterative activities within Capacity Management use many sources of input data. Which of the following is not a major source of input into these iterative activities?
a) Service Level Management Thresholds
b) The Capacity Management Database
c) The Forward Schedule of Change
QUESTION 21
As the manager of the Capacity Management process, you have become aware of Business plans to hire a number of new sales staff who are expected to use a sales related Service. What action should you take to ensure sufficient Capacity will exist to support the new users of the service?
a) You should prepare to use Demand Management techniques to manage Service utilization.
b) You should use analytical modeling to predict the additional Capacity required to support the new users of the Service.
c) You should use trend analysis to predict the additional Capacity required to support the new users of the Service.
QUESTION 22
One of your applications is hosted by an external party. Internally you have agreed that Availability on the desktop should be 90 percent. You want to define and agree the Availability percentage in a contract with the external party. The complete route between third party and desktop is: Host (H), VPN tunnel (VT), Internal Network (IN), Server (S) and Desktop (DT). How can the required availability of the Host (H) be calculated?
a) H = (VT * IN * S) / 0.90
b) H = VT * IN * S * 0.90
c) H = 0.90 / (VT * IN * S)
QUESTION 23
As a member of the Plan & Improve team you are asked to advise on selecting an Availability Management technique. Which Availability Management technique is appropriate for analyzing Maintainability and its influence on Downtime?
a) CCTA Risk Analysis and Management Method (CRAMM)
b) Expanded Incident Lifecycle
c) Technical Observation Post
QUESTION 24
Your organization is growing towards more mature Availability Management. Until now your department (Plan & Improve) has reported Availability levels expressed as the percentage (%) available on critical systems, but your Customers want the reports to reflect user and business experience. Incident recording is at a basic level. What would be the next step in measuring and reporting on Availability if you want to satisfy your customers?
a) measure and report on Impact of failure
b) measure and report the duration and frequency of Unavailability. Present duration in hours and minutes.
c) measure and report the duration of Unavailability in hours and minutes
QUESTION 25
Availability measurements show that some network components have become more and more unavailable. To which Service Management process could this best be reported?
a) Capacity Management
b) Incident and Problem Management
c) Service Level Management
QUESTION 26
As a member of the Plan & Improve team you are asked to advise on Availability Management techniques. Which Availability Management technique is suitable for Availability reporting?
a) CCTA Risk Analysis and Management Method (CRAMM)
b) Component Failure Impact Analysis (CFIA)
c) Fault Tree Analysis (FTA)
QUESTION 27
The Availability Plan should be a long-term plan for the proactive improvement of IT Availability within the imposed Cost constraints. The impetus to improve Availability comes from, among others:
• the inability of a new IT Service to meet its Service Level Agreement (SLA) on a consistent basis
• Availability measurement trends indicating a gradual deterioration in Availability
• unacceptable IT Service recovery and restoration time
• requests from the business to increase the level of Availability provided
• increasing Impact on the business and its customers from IT Service failures as a result of growth and/or increased Business functionality
What could be another reason to improve Availability?
a) a request from Service Level Management (SLM) to improve Availability
b) an update of the Availability Management plan
c) developing Business and User measurement and reporting
QUESTION 28
While setting up IT Service Continuity Management (ITSCM) you have to plan its organization. For Recovery purposes, IT is only part of the overall command, control and communications structure. In which layers of this ITSCM organization structure does IT take part?
a) in the Business Impact Analysis, Data Management and Recovery layer
b) in the executive, coordination and recovery layer
c) in the Risk Reduction, Testing and recovery layer
QUESTION 29
Which of the following is not a risk reduction measure?
a) a comprehensive backup strategy and off-site storage of tapes
b) improving Change control
c) outsourcing services to a third party
QUESTION 30
While implementing new Changes it is necessary that the Requests for Change (RFCs) are evaluated against the IT Service Continuity Management (ITSCM) plans. In which step of the Change Management process will this take place?
a) Change building, testing and implementation
b) Change Impact and resource assessment
c) Change logging and filtering
QUESTION 31
Every year your department tests the IT Service Continuity Management (ITSCM) plans. After a 3-year period one of the ITSCM plans fails. What is the first action you will take?
a) review all Changes of the last year regarding Impact on this ITSCM plan
b) review the ITSCM plan
c) review the ITSCM plan together with the report of the failed test
QUESTION 32
As your organization progresses to the final stages of the Business Continuity project, the emphasis shifts from awareness of the need for IT Service Continuity Management (ITSCM) mechanisms towards the responsibilities and actions necessary to implement, test and maintain those mechanisms in an operational environment. Who has the key role in ongoing awareness and commitment throughout the entire organization?
a) senior management
b) the ITSCM manager in the Plan and Improve team
c) the Plan and Improve team
QUESTION 33
What must be done during the review of the Availability, Capacity and IT Service Continuity plans to ensure the validity of these plans?
a) check if the Key Performance Indicators are still being met
b) check if the procedures are still carried out
c) check if the scope of the plan is still being met
QUESTION 34
As a member of the Plan and Improve team it is your responsibility to review the tools used for Availability Management, Capacity Management and IT Service Continuity Management. Which approach ensures that your review gives the desired results?
a) checking the tools' results with another tool
b) having the tools certified by a specialized party
c) performing an EDP audit
QUESTION 35
Which of the following approaches is the most appropriate in conducting process audits?
a) ad hoc process audits
b) regularly planned process audits
c) regularly planned and ad hoc process audits
QUESTION 36
What is an important advantage of using the continuous improvement methodology in Availability Management?
a) all threats become identified and their levels accurately assessed
b) it delivers Availability improvements from a customer's point of view
c) it enables IT staff to observe the operational environment
QUESTION 37
You are reviewing the capacity of the mail system of your organization. After one year the current utilization of storage capacity of the mail system is 40 percent. Each mailbox is limited to 200 MB and there are 50 users. The disk capacity threshold is set to 80 percent. The management of the company does not want a limit on the mailboxes at all and asks you to remove this restriction, since there is more than enough space on the server. What is the best action you can take?
a) You order the limit to be removed for all users and have additional hard disks installed.
b) You order the limit to be removed for all users and set the disk capacity threshold to 60 percent.
c) You order the limit to be removed for all users because there is enough space.
QUESTION 38
Last month you were confronted with Capacity Problems on a specific system. You find out that the reason for those Problems is that the number of Users on this system has increased rapidly. In the future, in which process should you take corrective action?
a) in the Business Capacity Management sub-process
b) in the Change Management process
c) in the Service Capacity Management sub-process
QUESTION 39
How can you effectively identify new IT Service Continuity Management (ITSCM) requirements in order to plan appropriate actions in the ITSCM plans?
a) by performing a Business Impact Analysis and a risk assessment
b) by reviewing all new or updated Service Level Agreements (SLAs)
c) by scrutinizing all new or amended laws concerning new requirements to the Business processes
QUESTION 40
Some Changes have an effect on one or more IT Service Continuity Management (ITSCM) plans. It is your responsibility to identify the Changes and the effects. How can you keep track of the Changes that have an effect on ITSCM plans?
a) All new Changes must be known.
b) The Forward Schedule of Changes (FSC) must be known.
c) You need to be a member of the Change Advisory Board (CAB) and attend meetings on a regular basis.
10 MOCK EXAM ANSWERS
The table below shows the correct answers to the questions.

Question Number   Answer      Question Number   Answer
1                 C           21                B
2                 C           22                C
3                 B           23                B
4                 B           24                B
5                 A           25                A
6                 C           26                B
7                 C           27                A
8                 C           28                B
9                 C           29                C
10                B           30                B
11                C           31                C
12                C           32                A
13                C           33                C
14                C           34                A
15                B           35                C
16                B           36                B
17                B           37                B
18                B           38                A
19                B           39                A
20                B           40                A
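For Question 22, the reasoning behind answer C is that components in series multiply: the end-to-end availability is the product of the individual availabilities, so the host requirement is the target divided by the product of the other components. The sketch below illustrates this with hypothetical component figures, since the question gives none. It follows the answer options in leaving the Desktop (DT) out of the product, and the function name is an assumption for illustration.

```python
from math import prod

TARGET_END_TO_END = 0.90  # availability agreed internally (Question 22)

# Hypothetical availabilities for the other components in the chain;
# the question itself gives no figures for these.
components = {
    "VT": 0.99,  # VPN tunnel
    "IN": 0.98,  # internal network
    "S": 0.97,   # server
}


def required_host_availability(target: float, others: dict[str, float]) -> float:
    """Serial chain: target = H * product(others), so H = target / product(others)."""
    return target / prod(others.values())


required_h = required_host_availability(TARGET_END_TO_END, components)
end_to_end = required_h * prod(components.values())  # sanity check: should equal the target

print(f"Required host availability: {required_h:.4f}")
print(f"Resulting end-to-end availability: {end_to_end:.4f}")
```

If the product of the other components is already below the target, the result exceeds 1, which signals that no host availability can meet the target and one of the other links has to be improved first.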
11 FURTHER READING
For more information on other products available from The Art of Service, you can visit our website: http://www.theartofservice.com
If you found this workbook helpful, you can find more publications from The Art of Service at: http://www.amazon.com