Information Storage and Management
Storing, Managing, and Protecting Digital Information
EMC Education Services
Information Storage and Management
Published by Wiley Publishing, Inc. 10475 Crosspoint Boulevard Indianapolis, IN 46256 www.wiley.com
Copyright © 2009 by EMC Corporation
Published by Wiley Publishing, Inc., Indianapolis, Indiana
Published simultaneously in Canada
ISBN: 978-0-470-29421-
Manufactured in the United States of America
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Legal Department, Wiley Publishing, Inc., 10475 Crosspoint Blvd., Indianapolis, IN 46256, (317) 572-3447, fax (317) 572-4355, or online at www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Web site may provide or recommendations it may make. Further, readers should be aware that Internet Web sites listed in this work may have changed or disappeared between when this work was written and when it is read.
For general information on our other products and services please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Library of Congress Cataloging-in-Publication Data is available from the publisher.
Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. Wiley Publishing, Inc. is not associated with any product or vendor mentioned in this book.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
EMC², EMC, EMC Centera, EMC ControlCenter, AdvantEdge, AlphaStor, ApplicationXtender, Avamar, Captiva, Catalog Solution, Celerra, Centera, CentraStar, ClaimPack, ClaimsEditor, ClaimsEditor Professional, CLARalert, CLARiiON, ClientPak, CodeLink, Connectrix, Co-StandbyServer, Dantz, Direct Matrix Architecture, DiskXtender, DiskXtender 2000, Document Sciences, Documentum, EmailXaminer, EmailXtender, EmailXtract, eRoom, Event Explorer, FLARE, FormWare, HighRoad, InputAccel, Invista, ISIS, Max Retriever, Navisphere, NetWorker, nLayers, OpenScale, PixTools, Powerlink, PowerPath, Rainfinity, RepliStor, ResourcePak, Retrospect, Smarts, SnapShotServer, SnapView/IP, SRDF, Symmetrix, TimeFinder, VisualSAN, Voyence, VSAM-Assist, WebXtender, where information lives, xPression, xPresso, Xtender, and Xtender Solutions are registered trademarks and EMC LifeLine, EMC OnCourse, EMC Proven, EMC Snap, EMC Storage Administrator, Acartus, Access Logix, ArchiveXtender, Atmos, Authentic Problems, Automated Resource Manager, AutoStart, AutoSwap, AVALONidm, C-Clip, Celerra Replicator, CenterStage, CLARevent, Codebook Correlation Technology, Common Information Model, CopyCross, CopyPoint, DatabaseXtender, Digital Mailroom, Direct Matrix, EDM, E-Lab, eInput, Enginuity, FarPoint, FirstPass, Fortress, Global File Virtualization, Graphic Visualization, InfiniFlex, InfoMover, Infoscape, InputAccel Express, MediaStor, MirrorView, Mozy, MozyEnterprise, MozyHome, MozyPro, OnAlert, PowerSnap, QuickScan, RepliCare, SafeLine, SAN Advisor, SAN Copy, SAN Manager, SDMS, SnapImage, SnapSure, SnapView, StorageScope, SupportMate, SymmAPI, SymmEnabler, Symmetrix DMX, UltraFlex, UltraPoint, UltraScale, Viewlets, Virtual Provisioning, and VisualSRM are trademarks of EMC Corporation. All other trademarks used herein are the property of their respective owners. © Copyright 2009 EMC Corporation. All rights reserved. Published in the USA. 01/
About the Editors
G. Somasundaram (Somu) is a graduate of the Indian Institute of Technology
in Mumbai, India, and has over 22 years of experience in the IT industry, the
last 10 with EMC Corporation. Currently he is director, EMC Global Services,
leading worldwide industry readiness initiatives. Somu is the architect of EMC’s
open storage curriculum, aimed at addressing the storage knowledge “gap”
that exists in the IT industry. Under his leadership and direction, industry
readiness initiatives, such as the EMC Learning Partner and Academic Alliance
programs, continue to experience significant growth and educate thousands of
students worldwide on information storage and management technologies. Key
areas of Somu’s responsibility include guiding a global team of professionals,
identifying and partnering with global IT education providers, and setting the
overall direction for EMC’s industry readiness initiatives. Prior to his current
role, Somu held various managerial and leadership roles with EMC as well as
other leading IT vendors.
Alok Shrivastava is senior director, EMC Global Services, and has focused
on education since 2003. Alok is the architect of several of EMC’s successful
education initiatives, including the industry-leading EMC Proven Professional
program, industry readiness programs such as EMC’s Academic Alliance, and
most recently this unique and valuable book on information storage technology.
Alok provides vision and leadership to a team of highly talented experts and
professionals that develops world-class technical education for EMC’s employees,
partners, customers, and other industry professionals. Prior to his success
in education, Alok built and led a highly successful team of EMC presales
engineers in Asia-Pacific and Japan. Earlier in his career, Alok was a systems
manager, storage manager, and a backup/restore/disaster recovery consultant
working with some of the world’s largest data centers and IT installations. He
holds dual master’s degrees from the Indian Institute of Technology in Mumbai,
India, and the University of Sagar in India. Alok has worked in information
storage technology and has held a unique passion for this field for most of his
25-plus year career in IT.
Acknowledgments
When we embarked upon the project to develop this book, the very first challenge
was to identify a team of subject matter experts covering the vast range
of technologies that form the modern information storage infrastructure.
A key factor working in our favor is that at EMC, we have the technologies,
the know-how, and many of the best talents in the industry. When we reached
out to individual experts, they were as excited as we were about the prospect of
publishing a comprehensive book on information storage technology. This was an
opportunity to share their expertise with professionals and students worldwide.
This book is the result of efforts and contributions from a number of key EMC
organizations led by EMC Education Services and supported by the office of
CTO, Global Marketing, and EMC Engineering.
In addition to his own research and expertise, Ganesh Rajaratnam, from
EMC Education Services, led the efforts with other subject matter experts to
develop the first draft of the book. Dr. David Black, from the EMC CTO office,
devoted many valuable hours to combing through the content and providing
cogent advice on the key topics covered in this book.
We are very grateful to the following experts from EMC Education Services
for developing the content for various sections and chapters of this book:
Rodrigo Alves
Charlie Brooks
Debasish Chakrabarty
Diana Davis
Amit Deshmukh
Michael Dulavitz
Ashish Garg
Dr. Vanchi Gurumoorthy
Simon Hawkshaw
Anbuselvi Jeyakumar
Sagar Kotekar Patil
Andre Rossouw
Tony Santamaria
Saravanaraj Sridharan
Ganesh Sundaresan
Jim Tracy
Anand Varkar
Dr. Viswanth VS
The following experts thoroughly reviewed the book at various stages and
provided valuable feedback and guidance:
Ronen Artzi
Eric Baize
Greg Baltazar
Edward Bell
Christopher Chaulk
Roger Dupuis
Deborah Filer
Bala Ganeshan
Jason Gervickas
Nancy Gessler
Jody Goncalves
Jack Harwood
Arthur Johnson
Michelle Lavoie
Tom McGowan
Jeffery Moore
Toby Morral
Peter Popieniuck
Kevin Sheridan
Ed VanSickle
We also thank NIIT Limited for their help with the initial draft, Muthaiah
Thiagarajan of EMC and DreaMarT Interactive Pvt. Ltd. for their support in
creating all illustrations, and the publisher, John Wiley & Sons, for their timely
support in bringing this book to the industry.
— G. Somasundaram
Director, Education Services, EMC Corporation
— Alok Shrivastava
Senior Director, Education Services, EMC Corporation
March 2009
Contents
- Section I Storage System Introduction
- Chapter 1 Introduction to Information Storage and Management
- 1.1 Information Storage
- 1.1.1 Data
- 1.1.2 Types of Data
- 1.1.3 Information
- 1.1.4 Storage
- 1.2 Evolution of Storage Technology and Architecture
- 1.3 Data Center Infrastructure
- 1.3.1 Core Elements
- 1.3.2 Key Requirements for Data Center Elements
- 1.3.3 Managing Storage Infrastructure
- 1.4 Key Challenges in Managing Information
- 1.5 Information Lifecycle
- 1.5.1 Information Lifecycle Management
- 1.5.2 ILM Implementation
- 1.5.3 ILM Benefits
- Summary
- Chapter 2 Storage System Environment
- 2.1 Components of a Storage System Environment
- 2.1.1 Host
- 2.1.2 Connectivity
- 2.1.3 Storage
- 2.2 Disk Drive Components
- 2.2.1 Platter
- 2.2.2 Spindle
- 2.2.3 Read/Write Head
- 2.2.4 Actuator Arm Assembly
- 2.2.5 Controller
- 2.2.6 Physical Disk Structure
- 2.2.7 Zoned Bit Recording
- 2.2.8 Logical Block Addressing
- 2.3 Disk Drive Performance
- 2.4 Fundamental Laws Governing Disk Performance
- 2.5 Logical Components of the Host
- 2.5.1 Operating System
- 2.5.2 Device Driver
- 2.5.3 Volume Manager
- 2.5.4 File System
- 2.5.5 Application
- 2.6 Application Requirements and Disk Performance
- Summary
- Chapter 3 Data Protection: RAID
- 3.1 Implementation of RAID
- 3.1.1 Software RAID
- 3.1.2 Hardware RAID
- 3.2 RAID Array Components
- 3.3 RAID Levels
- 3.3.1 Striping
- 3.3.2 Mirroring
- 3.3.3 Parity
- 3.3.4 RAID 0
- 3.3.5 RAID 1
- 3.3.6 Nested RAID
- 3.3.7 RAID 3
- 3.3.8 RAID 4
- 3.3.9 RAID 5
- 3.3.10 RAID 6
- 3.4 RAID Comparison
- 3.5 RAID Impact on Disk Performance
- 3.5.1 Application IOPS and RAID Configurations
- 3.6 Hot Spares
- Summary
- Chapter 4 Intelligent Storage System
- 4.1 Components of an Intelligent Storage System
- 4.1.1 Front End
- 4.1.2 Cache
- 4.1.3 Back End
- 4.1.4 Physical Disk
- 4.2 Intelligent Storage Array
- 4.2.1 High-end Storage Systems
- 4.2.2 Midrange Storage System
- 4.3 Concepts in Practice: EMC CLARiiON and Symmetrix
- 4.3.1 CLARiiON Storage Array
- 4.3.2 CLARiiON CX4 Architecture
- 4.3.3 Managing the CLARiiON
- 4.3.4 Symmetrix Storage Array
- 4.3.5 Symmetrix Component Overview
- 4.3.6 Direct Matrix Architecture
- Summary
- Section II Storage Networking Technologies and Virtualization
- Chapter 5 Direct-Attached Storage and Introduction to SCSI
- 5.1 Types of DAS
- 5.1.1 Internal DAS
- 5.1.2 External DAS
- 5.2 DAS Benefits and Limitations
- 5.3 Disk Drive Interfaces
- 5.3.1 IDE/ATA
- 5.3.2 SATA
- 5.3.3 Parallel SCSI
- 5.4 Introduction to Parallel SCSI
- 5.4.1 Evolution of SCSI
- 5.4.2 SCSI Interfaces
- 5.4.3 SCSI-3 Architecture
- 5.4.4 Parallel SCSI Addressing
- 5.5 SCSI Command Model
- 5.5.1 CDB Structure
- 5.5.2 Operation Code
- 5.5.3 Control Field
- 5.5.4 Status
- Summary
- Chapter 6 Storage Area Networks
- 6.1 Fibre Channel: Overview
- 6.2 The SAN and Its Evolution
- 6.3 Components of SAN
- 6.3.1 Node Ports
- 6.3.2 Cabling
- 6.3.3 Interconnect Devices
- 6.3.4 Storage Arrays
- 6.3.5 SAN Management Software
- 6.4 FC Connectivity
- 6.4.1 Point-to-Point
- 6.4.2 Fibre Channel Arbitrated Loop
- 6.4.3 Fibre Channel Switched Fabric
- 6.5 Fibre Channel Ports
- 6.6 Fibre Channel Architecture
- 6.6.1 Fibre Channel Protocol Stack
- 6.6.2 Fibre Channel Addressing
- 6.6.3 FC Frame
- 6.6.4 Structure and Organization of FC Data
- 6.6.5 Flow Control
- 6.6.6 Classes of Service
- 6.7 Zoning
- 6.8 Fibre Channel Login Types
- 6.9 FC Topologies
- 6.9.1 Core-Edge Fabric
- 6.9.2 Mesh Topology
- 6.10 Concepts in Practice: EMC Connectrix
- Summary
- Chapter 7 Network-Attached Storage
- 7.1 General-Purpose Servers vs. NAS Devices
- 7.2 Benefits of NAS
- 7.3 NAS File I/O
- 7.3.1 File Systems and Remote File Sharing
- 7.3.2 Accessing a File System
- 7.3.3 File Sharing
- 7.4 Components of NAS
- 7.5 NAS Implementations
- 7.5.1 Integrated NAS
- 7.5.2 Gateway NAS
- 7.5.3 Integrated NAS Connectivity
- 7.5.4 Gateway NAS Connectivity
- 7.6 NAS File-Sharing Protocols
- 7.7 NAS I/O Operations
- 7.7.1 Hosting and Accessing Files on NAS
- 7.8 Factors Affecting NAS Performance and Availability
- 7.9 Concepts in Practice: EMC Celerra
- 7.9.1 Architecture
- 7.9.2 Celerra Product Family
- Summary
- Chapter 8 IP SAN
- 8.1 iSCSI
- 8.1.1 Components of iSCSI
- 8.1.2 iSCSI Host Connectivity
- 8.1.3 Topologies for iSCSI Connectivity
- 8.1.4 iSCSI Protocol Stack
- 8.1.5 iSCSI Discovery
- 8.1.6 iSCSI Names
- 8.1.7 iSCSI Session
- 8.1.8 iSCSI PDU
- 8.1.9 Ordering and Numbering
- 8.1.10 iSCSI Error Handling and Security
- 8.2 FCIP
- 8.2.1 FCIP Topology
- 8.2.2 FCIP Performance and Security
- Summary
- Chapter 9 Content-Addressed Storage
- 9.1 Fixed Content and Archives
- 9.2 Types of Archives
- 9.3 Features and Benefits of CAS
- 9.4 CAS Architecture
- 9.5 Object Storage and Retrieval in CAS
- 9.6 CAS Examples
- 9.6.1 Health Care Solution: Storing Patient Studies
- 9.6.2 Finance Solution: Storing Financial Records
- 9.7 Concepts in Practice: EMC Centera
- 9.7.1 EMC Centera Models
- 9.7.2 EMC Centera Architecture
- 9.7.3 Centera Tools
- 9.7.4 EMC Centera Universal Access
- Summary
- Chapter 10 Storage Virtualization
- 10.1 Forms of Virtualization
- 10.1.1 Memory Virtualization
- 10.1.2 Network Virtualization
- 10.1.3 Server Virtualization
- 10.1.4 Storage Virtualization
- 10.2 SNIA Storage Virtualization Taxonomy
- 10.3 Storage Virtualization Configurations
- 10.4 Storage Virtualization Challenges
- 10.4.1 Scalability
- 10.4.2 Functionality
- 10.4.3 Manageability
- 10.4.4 Support
- 10.5 Types of Storage Virtualization
- 10.5.1 Block-Level Storage Virtualization
- 10.5.2 File-Level Virtualization
- 10.6 Concepts in Practice
- 10.6.1 EMC Invista
- 10.6.2 Rainfinity
- Summary
- Section III Business Continuity
- Chapter 11 Introduction to Business Continuity
- 11.1 Information Availability
- 11.1.1 Causes of Information Unavailability
- 11.1.2 Measuring Information Availability
- 11.1.3 Consequences of Downtime
- 11.2 BC Terminology
- 11.3 BC Planning Lifecycle
- 11.4 Failure Analysis
- 11.4.1 Single Point of Failure
- 11.4.2 Fault Tolerance
- 11.4.3 Multipathing Software
- 11.5 Business Impact Analysis
- 11.6 BC Technology Solutions
- 11.7 Concept in Practice: EMC PowerPath
- 11.7.1 PowerPath Features
- 11.7.2 Dynamic Load Balancing
- 11.7.3 Automatic Path Failover
- Summary
- Chapter 12 Backup and Recovery
- 12.1 Backup Purpose
- 12.1.1 Disaster Recovery
- 12.1.2 Operational Backup
- 12.1.3 Archival
- 12.2 Backup Considerations
- 12.3 Backup Granularity
- 12.4 Recovery Considerations
- 12.5 Backup Methods
- 12.6 Backup Process
- 12.7 Backup and Restore Operations
- 12.8 Backup Topologies
- 12.9 Backup in NAS Environments
- 12.10 Backup Technologies
- 12.10.1 Backup to Tape
- 12.10.2 Physical Tape Library
- 12.10.3 Backup to Disk
- 12.10.4 Virtual Tape Library
- 12.11 Concepts in Practice: EMC NetWorker
- 12.11.1 NetWorker Backup Operation
- 12.11.2 NetWorker Recovery
- 12.11.3 EmailXtender
- 12.11.4 DiskXtender
- 12.11.5 Avamar
- 12.11.6 EMC Disk Library (EDL)
- Summary
- Chapter 13 Local Replication
- 13.1 Source and Target
- 13.2 Uses of Local Replicas
- 13.3 Data Consistency
- 13.3.1 Consistency of a Replicated File System
- 13.3.2 Consistency of a Replicated Database
- 13.4 Local Replication Technologies
- 13.4.1 Host-Based Local Replication
- 13.4.2 Storage Array–Based Replication
- 13.5 Restore and Restart Considerations
- 13.5.1 Tracking Changes to Source and Target
- 13.6 Creating Multiple Replicas
- 13.7 Management Interface
- 13.8 Concepts in Practice: EMC TimeFinder and EMC SnapView
- 13.8.1 TimeFinder/Clone
- 13.8.2 TimeFinder/Mirror
- 13.8.3 EMC SnapView
- 13.8.4 EMC SnapSure
- Summary
- Chapter 14 Remote Replication
- 14.1 Modes of Remote Replication
- 14.2 Remote Replication Technologies
- 14.2.1 Host-Based Remote Replication
- 14.2.2 Storage Array-Based Remote Replication
- 14.2.3 SAN-Based Remote Replication
- 14.3 Network Infrastructure
- 14.3.1 DWDM
- 14.3.2 SONET
- 14.4 Concepts in Practice: EMC SRDF, EMC SAN Copy, and EMC MirrorView
- 14.4.1 SRDF Family
- 14.4.2 Disaster Recovery with SRDF
- 14.4.3 SRDF Operations for Concurrent Access
- 14.4.4 EMC SAN Copy
- 14.4.5 EMC MirrorView
- Summary
- Section IV Storage Security and Management
- Chapter 15 Securing the Storage Infrastructure
- 15.1 Storage Security Framework
- 15.2 Risk Triad
- 15.2.1 Assets
- 15.2.2 Threats
- 15.2.3 Vulnerability
- 15.3 Storage Security Domains
- 15.3.1 Securing the Application Access Domain
- 15.3.2 Securing the Management Access Domain
- 15.3.3 Securing Backup, Recovery, and Archive (BURA)
- 15.4 Security Implementations in Storage Networking
- 15.4.1 SAN
- 15.4.2 NAS
- 15.4.3 IP SAN
- Summary
- Chapter 16 Managing the Storage Infrastructure
- 16.1 Monitoring the Storage Infrastructure
- 16.1.1 Parameters Monitored
- 16.1.2 Components Monitored
- 16.1.3 Monitoring Examples
- 16.1.4 Alerts
- 16.2 Storage Management Activities
- 16.2.1 Availability Management
- 16.2.2 Capacity Management
- 16.2.3 Performance Management
- 16.2.4 Security Management
- 16.2.5 Reporting
- 16.2.6 Storage Management Examples
- 16.3 Storage Infrastructure Management Challenges
- 16.4 Developing an Ideal Solution
- 16.4.1 Storage Management Initiative
- 16.4.2 Enterprise Management Platforms
- 16.5 Concepts in Practice: EMC ControlCenter
- 16.5.1 ControlCenter Features and Functionality
- 16.5.2 ControlCenter Architecture
- Summary
- Appendix
- Glossary
- Index
Icons used in this book
[Figure: icon legend. Icons shown: WAN, LAN, FC SAN, Storage Network, IP, Virtualization Appliance, Host, Host with Internal Storage, Host with 1 HBA, Host with 2 HBA, Tape Library, RAID Array, JBOD, Control Station, NAS Head, Client, FC Director, Storage Array, CAS, Integrated NAS, Generic Array with ports, Firewall, File System, Standard disk, LUN, Striped disk, Logical Volume, iSCSI Bridge, FCIP Gateway, IP Switch, IP connectivity, FC connectivity, FC Hub, FC Switch, FC Router, IP Router.]
Foreword
Ralph Waldo Emerson, the great American essayist, philosopher, and poet, once
said that the invariable mark of wisdom is seeing the miraculous in the
common. Today, common miracles surround us, and it is virtually impossible not
to see them. Most of us have modern gadgetry such as digital cameras, video
camcorders, cell phones, fast computers that can access millions of websites,
instant messaging, social networking sites, search engines, music downloads …
the list goes on. All of these examples have one thing in common: they generate
huge volumes of data. Not only are we in an information age, we’re in an age
where information is exploding into a digital universe that requires enhanced
technology and a new generation of professionals who are able to manage,
leverage, and optimize storage and information management solutions.
Just to give you an idea of the challenges we face today, in one year the amount
of digital information created, captured, and replicated is millions of times the
amount of information in all the books ever written. Information is the most
important asset of a business. To realize the inherent power of information, it
must be intelligently and efficiently stored, protected, and managed—so that it
can be made accessible, searchable, shareable, and, ultimately, actionable.
We are currently in the perfect storm. Everything is increasing: the information,
the costs, and the demand for skilled professionals to store and manage it,
professionals who are not available in sufficient numbers to meet the growing
need. The IT manager’s number one concern is how to manage this storage
growth. Enterprises cannot simply purchase bigger and better “boxes” to store
their data. IT managers must not only worry about budgets for storage technology,
but also be concerned with energy-efficient, footprint-reducing technology
that is easy to install, manage, and use. Although many IT managers intend to
hire more trained staff, they are facing a shortage of skilled, storage-educated
professionals who can take control of managing and optimizing the data.
I was unable to find a comprehensive book in the marketplace that provided
insight into the various technologies deployed to store and manage information.
As an industry leader, we have the subject-matter expertise and practical
experience to help fill this gap; and now this book can give you a behind-the-scenes
view of the technologies used in information storage and management.
You will learn where data goes, how it is managed, and how you can contribute
to your company’s profitability.
If you’ve chosen storage and information infrastructure management as your
career, you are a pioneer in a profession that is undergoing constant change,
but one in which the challenges lead to great rewards.
Regardless of your current role in IT, this book should be a key part of your
IT library and professional development.
Thomas P. Clancy
Vice President, Education Services, EMC Corporation
March 2009