Distributed Databases: Concepts, Advantages, and Strategies, Study notes of Database Management Systems (DBMS)

These notes introduce the concept of distributed databases, the reasons for using them, and the differences between distributed and decentralized databases. They cover the options for distributed databases, such as homogeneous and heterogeneous environments, along with their major objectives and significant trade-offs. They also compare distributed databases to centralized databases and discuss the advantages and disadvantages of distributing a database through data replication and partitioning.

Typology: Study notes

2010/2011

Uploaded on 08/29/2011

aditi

Data Communication

Distributed and Centralized computing

Definitions

Distributed Database: A single logical database spread physically across computers in multiple locations that are connected by a data communications link

Decentralized Database: A collection of independent databases on non-networked computers

They are NOT the same thing!

Figure 13-1 Distributed database environments (adapted from Bell and Grimson, 1992)

Distributed Database Options

• Homogeneous: same DBMS at each node
  • Autonomous: independent DBMSs
  • Non-autonomous: central, coordinating DBMS
  • Easy to manage, difficult to enforce
• Heterogeneous: different DBMSs at different nodes
  • Systems: with full or partial DBMS functionality
  • Gateways: simple paths are created to other databases without the benefits of one logical database
  • Difficult to manage, preferred by independent organizations

Figure 13-2 Homogeneous distributed database environment, with identical DBMSs at each node (source: adapted from Bell and Grimson, 1992)

Typical Heterogeneous Environment

• Data distributed across all the nodes
• Different DBMSs may be used at each node
• Local access is done using the local DBMS and schema
• Remote access is done using the global schema
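
The split between local and global schemas can be illustrated with a small sketch. It assumes two hypothetical sites that store the same kind of data under different column names; the site names, column names, and the `GLOBAL_TO_LOCAL` mapping are all invented for illustration and do not come from the notes.

```python
# Hypothetical mapping from global-schema column names to each site's local names.
GLOBAL_TO_LOCAL = {
    "site_a": {"customer_id": "cust_no", "customer_name": "cname"},
    "site_b": {"customer_id": "ID", "customer_name": "FULL_NAME"},
}

# Hypothetical data held at each site, stored under its own local schema.
LOCAL_DATA = {
    "site_a": [{"cust_no": 1, "cname": "Ada"}],
    "site_b": [{"ID": 2, "FULL_NAME": "Lin"}],
}


def local_query(site, column):
    """Local access: use the site's own column names directly."""
    return [row[column] for row in LOCAL_DATA[site]]


def global_query(column):
    """Remote access: translate a global column name for every site."""
    results = []
    for site, mapping in GLOBAL_TO_LOCAL.items():
        results.extend(row[mapping[column]] for row in LOCAL_DATA[site])
    return results


local_names = local_query("site_a", "cname")
all_names = global_query("customer_name")
```

A user at site_a works with `cname` as usual, while a query phrased against the global schema reaches both sites without knowing either local name.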

Major Objectives

• Location Transparency
  • User does not have to know the location of the data
  • Data requests automatically forwarded to appropriate sites
• Local Autonomy
  • Local site can operate with its database when network connections fail
  • Each site controls its own data, security, logging, recovery
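
A minimal sketch of both objectives, assuming a simple in-memory model: the `Site` and `DistributedDB` classes, the catalog dictionary, and the `online` flag are hypothetical names chosen for illustration, not part of any real DBMS.

```python
class Site:
    """One node holding its own local data (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}        # key -> row stored at this site
        self.online = True    # False simulates a failed network link

    def get(self, key):
        return self.rows[key]


class DistributedDB:
    """Presents several sites as one logical database."""

    def __init__(self, sites):
        self.sites = {s.name: s for s in sites}
        self.catalog = {}     # key -> name of the site that stores it

    def put(self, key, row, site_name):
        self.sites[site_name].rows[key] = row
        self.catalog[key] = site_name

    def get(self, key):
        # Location transparency: the caller never names a site.
        site = self.sites[self.catalog[key]]
        if not site.online:
            raise ConnectionError(f"site {site.name} is unreachable")
        return site.get(key)


boston = Site("boston")
denver = Site("denver")
db = DistributedDB([boston, denver])
db.put(101, {"customer": "Ada"}, "boston")
db.put(202, {"customer": "Lin"}, "denver")

remote_row = db.get(202)      # forwarded to denver via the catalog

# Local autonomy: if the link to denver fails, boston still serves its own data.
denver.online = False
local_row = boston.get(101)
```

The catalog lookup stands in for automatic request forwarding; a real system would also have to keep that catalog itself available and consistent.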

Significant Trade-Offs

• Synchronous Distributed Database
  • All copies of the same data are always identical
  • Data updates are immediately applied to all copies throughout the network
  • Good for data integrity
  • High overhead → slow response times
• Asynchronous Distributed Database
  • Some data inconsistency is tolerated
  • Data update propagation is delayed
  • Lower data integrity
  • Less overhead → faster response time

NOTE: all this assumes replicated data (to be discussed later)
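
The trade-off can be sketched with a toy model of one replicated value. The `ReplicatedValue` class and its method names are hypothetical; the sketch ignores networking, failures, and concurrent writers.

```python
class ReplicatedValue:
    """One data item copied to several sites (hypothetical model)."""

    def __init__(self, n_sites, initial):
        self.copies = [initial] * n_sites
        self.pending = []     # (site_index, value) updates not yet applied

    def write_sync(self, value):
        # Synchronous: every copy is updated before the write returns.
        for i in range(len(self.copies)):
            self.copies[i] = value

    def write_async(self, value, origin=0):
        # Asynchronous: only the originating copy changes now;
        # updates for the other sites are queued for later propagation.
        self.copies[origin] = value
        for i in range(len(self.copies)):
            if i != origin:
                self.pending.append((i, value))

    def propagate(self):
        # Apply the queued updates in the order they were issued.
        for i, value in self.pending:
            self.copies[i] = value
        self.pending.clear()

    def consistent(self):
        return len(set(self.copies)) == 1


price = ReplicatedValue(n_sites=3, initial=10)
price.write_sync(12)
sync_ok = price.consistent()      # all copies agree right after the write

price.write_async(15)
stale = price.consistent()        # copies disagree until propagation runs
price.propagate()
converged = price.consistent()
```

The synchronous write touches every copy on each update, which is where the overhead comes from; the asynchronous write returns after one copy changes, at the cost of a window in which sites disagree.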

Disadvantages of Distributed Database Compared to Centralized Databases

• Software cost and complexity
• Processing overhead
• Data integrity exposure
• Slower response for certain queries

Options for Distributing a Database

• Data replication
  • Copies of data distributed to different sites
• Horizontal partitioning
  • Different rows of a table distributed to different sites
• Vertical partitioning
  • Different columns of a table distributed to different sites
• Combinations of the above
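
The two partitioning options can be sketched on a small in-memory table. The `customers` rows, the site names, and the helper functions below are hypothetical. Repeating the key column in every vertical fragment is an assumption made here so that the original rows can be rebuilt.

```python
customers = [
    {"id": 1, "name": "Ada", "region": "east", "balance": 250},
    {"id": 2, "name": "Lin", "region": "west", "balance": 90},
    {"id": 3, "name": "Omar", "region": "east", "balance": 40},
]


def horizontal_partition(rows, column):
    """Split by the value of one column: each site gets whole rows."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def vertical_partition(rows, key, column_groups):
    """Split columns into groups; every fragment repeats the key column."""
    return {
        site: [{c: row[c] for c in [key, *cols]} for row in rows]
        for site, cols in column_groups.items()
    }


def rejoin(fragments, key):
    """Rebuild full rows from vertical fragments by matching on the key."""
    merged = {}
    for rows in fragments.values():
        for row in rows:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


by_region = horizontal_partition(customers, "region")
by_columns = vertical_partition(
    customers, "id", {"sales": ["name", "region"], "accounting": ["balance"]}
)
rebuilt = rejoin(by_columns, "id")
```

Horizontal fragments are recombined by taking the union of their rows, while vertical fragments need a join on the shared key, which is one source of the slower response for certain queries.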

Data Replication (cont.)

• Disadvantages:
  • Additional requirements for storage space
  • Additional time for update operations
  • Complexity and cost of updating
  • Integrity exposure of getting incorrect data if replicated data is not updated simultaneously

Therefore, better when used for non-volatile (read-only) data

Types of Data Replication

• Push Replication: updating site sends changes to other sites
• Pull Replication: receiving sites control when update messages will be processed
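
The difference between the two types comes down to who decides when a change takes effect at a receiving site. A minimal sketch, assuming one updating site and in-memory replicas; the `Primary` and `Replica` classes and the inbox list are hypothetical.

```python
class Replica:
    """A receiving site with a queue of unprocessed update messages."""

    def __init__(self):
        self.data = {}
        self.inbox = []       # update messages received but not yet applied

    def apply_inbox(self):
        # Pull: the receiving site decides when to process its messages.
        for key, value in self.inbox:
            self.data[key] = value
        self.inbox.clear()


class Primary:
    """The updating site."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas

    def update_push(self, key, value):
        # Push: the updating site sends the change and it is applied at once.
        self.data[key] = value
        for replica in self.replicas:
            replica.data[key] = value

    def update_pull(self, key, value):
        # The change is only delivered as a message; replicas apply it later.
        self.data[key] = value
        for replica in self.replicas:
            replica.inbox.append((key, value))


a, b = Replica(), Replica()
primary = Primary([a, b])

primary.update_push("stock", 5)
pushed = (a.data["stock"], b.data["stock"])   # both replicas already updated

primary.update_pull("stock", 4)
before = b.data["stock"]      # b has not processed the message yet
b.apply_inbox()
after = b.data["stock"]       # b is current; a still holds the older value
```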

Factors in Choice of Distributed Strategy

• Funding, autonomy, security
• Site data referencing patterns
• Growth and expansion needs
• Technological capabilities
• Costs of managing complex technologies
• Need for reliable service