Implementing Analytics
A Blueprint for Design,
Development, and Adoption
Nauman Sheikh
AMSTERDAM • BOSTON • HEIDELBERG • LONDON
NEW YORK • OXFORD • PARIS • SAN DIEGO
SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO
Morgan Kaufmann is an imprint of Elsevier
Acquiring Editor: Andrea Dierna
Editorial Project Manager: Heather Scherer
Project Manager: Punithavathy Govindaradjane
Designer: Russell Purdy
225 Wyman Street, Waltham, MA 02451, USA
Copyright © 2013 Elsevier Inc. All rights reserved
No part of this publication may be reproduced or transmitted in any form or by any means,
electronic or mechanical, including photocopying, recording, or any information storage and
retrieval system, without permission in writing from the publisher. Details on how to seek
permission, further information about the Publisher’s permissions policies and our arrangements
with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency,
can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the
Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience
broaden our understanding, changes in research methods or professional practices may become
necessary. Practitioners and researchers must always rely on their own experience and knowledge
in evaluating and using any information or methods described herein. In using such information
or methods they should be mindful of their own safety and the safety of others, including parties
for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors,
assume any liability for any injury and/or damage to persons or property as a matter of products
liability, negligence or otherwise, or from any use or operation of any methods, products,
instructions, or ideas contained in the material herein.
Library of Congress Cataloging-in-Publication Data
Sheikh, Nauman Mansoor.
Implementing analytics : a blueprint for design, development, and adoption/Nauman Sheikh.
pages cm
Includes bibliographical references and index.
ISBN 978-0-12-401696-5 (alk. paper)
1. System analysis. I. Title.
T57.6.S497 2013
003—dc23 2013006254
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
For information on all MK publications,
visit our website at www.mkp.com
Printed and bound in the United States of America
Contents
ACKNOWLEDGMENTS
AUTHOR BIOGRAPHY
INTRODUCTION
Part 1 Concept
CHAPTER 1 Defining Analytics
The Hype
The Challenge of Definition
Definition 1: Business Value Perspective
Definition 2: Technical Implementation Perspective
Analytics Techniques
Algorithm versus Analytics Model
Forecasting
Descriptive Analytics
Predictive Analytics
Decision Optimization
Conclusion of Definition
CHAPTER 2 Information Continuum
Building Blocks of the Information Continuum
Theoretical Foundation in Data Sciences
Tools, Techniques, and Technology
Skilled Human Resources
Innovation and Need
Information Continuum Levels
Search and Lookup
Counts and Lists
Operational Reporting
Summary Reporting
Historical (Snapshot) Reporting
Metrics, KPIs, and Thresholds
Analytical Applications
Analytics Models
Decision Strategies
Monitoring and Tuning: Governance
Summary
CHAPTER 3 Using Analytics
Healthcare
Emergency Room Visit
Patients with the Same Disease
Customer Relationship Management
Customer Segmentation
Propensity to Buy
Human Resource
Employee Attrition
Resumé Matching
Consumer Risk
Borrower Default
Insurance
Probability of a Claim
Telecommunication
Call Usage Patterns
Higher Education
Admission and Acceptance
Manufacturing
Predicting Warranty Claims
Analyzing Warranty Claims
Energy and Utilities
The New Power Management Challenge
Fraud Detection
Benefits Fraud
Credit Card Fraud
Patterns of Problems
How Much Data
Performance or Derived Variables
Part 2 Design
CHAPTER 4 Performance Variables and Model Development
Performance Variables
What are Performance Variables?
Designing Performance Variables
Working Example
Model Development
What is a Model?
Model and Characteristics in Predictive Modeling
Model and Characteristics in Descriptive Modeling
Model Validation and Tuning
Champion–Challenger: A Culture of Constant Innovation
CHAPTER 5 Automated Decisions and Business Innovation
Automated Decisions
Decision Strategy
Business Rules in Business Operations
Decision Automation and Business Rules
Joint Business and Analytics Sessions for Decision Strategies
Examples of Decision Strategy
Decision Automation and Intelligent Systems
Learning versus Applying
Strategy Integration Methods
Strategy Evaluation
Retrospective Processing
Reprocessing
Champion–Challenger Strategies
Business Process Innovation
CHAPTER 6 Governance: Monitoring and Tuning of Analytics Solutions
Analytics and Automated Decisions
The Risk of Automated Decisions
Monitoring Layer
Audit and Control Framework
Organization and Process
Audit Datamart
Control Definition
Reporting and Action
Part 3 Implementation
CHAPTER 7 Analytics Adoption Roadmap
Learning from Success of Data Warehousing
Lesson 1: Simplification
Lesson 2: Quick Results
Lesson 3: Evangelize
Lesson 4: Efficient Data Acquisition
Lesson 5: Holistic View
Lesson 6: Data Management
The Pilot
Business Problem
Management Attention and Champion
The Project
Results, Roadshow, and Case for Wider Adoption
CHAPTER 8 Requirements Gathering for Analytics Projects
Purpose of Requirements
Requirements: Historical Perspective
Calculations
Process Automation
Analytical and Reporting Systems
Analytics and Decision Strategy
Requirements Extraction
Problem Statement and Goal
Data Requirements
Model and Decision Strategy Requirements
Business Process Integration Requirements
CHAPTER 9 Analytics Implementation Methodology
Centralized versus Decentralized
Centralized Approach
Decentralized Approach
A Hybrid Approach
Building on the Data Warehouse
Methodology
Requirements
Analysis
Design
Implementation
Deployment
Execution and Monitoring
CHAPTER 10 Analytics Organization and Architecture
Organizational Structure
BICC Organization Chart
Roles and Responsibilities
Skills Summary
Technical Components in Analytics Solutions
Analytics Datamart
CHAPTER 11 Big Data, Hadoop, and Cloud Computing
Big Data
Velocity
Variety
Volume
Big Data Implementation Challenge
Hadoop
Hadoop Technology Stack
Hadoop Solution Architecture
Hadoop as an Analytical Engine
Cloud Computing (For Analytics)
Disintegration in Cloud Computing
Analytics in Cloud Computing
CONCLUSION
REFERENCES
INDEX
Description: Implementing Analytics demystifies the concept, technology, and application of analytics and breaks its implementation down into repeatable and manageable steps, making widespread adoption possible across all functions of an organization. Implementing Analytics simplifies and helps democratize a