This document provides an overview and introduction to database management. It discusses the history and evolution of databases from early file systems to modern database management systems (DBMS). It also describes the basic components of a DBMS including data models, languages, users, and transactions to ensure data integrity and consistency when accessed concurrently. The goal of a DBMS is to abstract the complexities of data storage and access to provide applications and users a uniform way to interface with persistent and reliable data.
1 of 22
More Related Content
1 overview-handout-notes
1. Overview of Data Management
Grant Weddell
David R. Cheriton School of Computer Science
University of Waterloo
CS 348
Introduction to Database Management
Spring 2014
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 1 / 22
Notes
2. Course Logistics
Webpage www.cs.uwaterloo.ca/gweddell/cs348
Text Book Database Management Systems (3rd Edition).
Raghu Ramakrishnan and Johannes Gehrke.
McGraw Hill, 2000.
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 2 / 22
Notes
3. Course Content
Why do we use databases?
Functionality provided by a Database Management System
Database Models: Relational, Network, OO
How do we use a DBMS?
Relational model, foundational query languages
SQL
Application programming
Transactions and concurrency
How do we design a database?
Entity-Relationship (ER) modeling
Dependencies and constraints
Redundancy and normal forms
How do we administer a DBMS?
Security and authorization
Physical design/tuning
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 3 / 22
Notes
4. What is a Database?
Definition (Database)
A large and persistent collection of (more-or-less similar) pieces of
information organized in a way that facilitates efficient retrieval and
modification.
Examples:
a file cabinet
a library system
a personnel management system
Definition (Database Management System (DBMS))
A program (or set of programs) that manages details related to storage
and access for a database.
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 4 / 22
Notes
5. Application of Databases
Original inventory control
payroll
banking and financial systems
reservation systems
More recent computer aided design (CAD)
software development (CASE, SDE/SSE)
telecommunication systems
e-commerce
dynamic/personalized web content
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 5 / 22
Notes
6. Application of Databases (contd)
Common Circumstances:
There is lots of data (mass storage)
Data is formatted
Requirements:
persistence and reliability
efficient and concurrent access
Issues:
many files with different structure
shared files or replicated data
need to exchange data (translation programs)
Note
The data maintained by the system are much more important and
valuable than the system itself.
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 6 / 22
Notes
7. Brief History of Data Management: Ancient
2000 BC: Sumerian Records
350 BC: Syllogisms (Aristotle)
296 BC: Library of Alexandria
1879: Modern Logic (Frege)
1884: U.S. Census (Hollerith)
1941: Model Theory (Tarski)
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 7 / 22
Notes
8. Brief History of Data Management: 1950s
First generation 50s and 60s
batch processing
sequential files and tape
input on punched cards
Second generation (60s)
disk enabled random access files
new access methods (ISAM, hash files)
mostly batch with some interactivity
independent applications with separate files
growing applications base
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 8 / 22
Notes
9. Brief History of Data Management: 1960s (contd)
As the application base grows, we end up with
many shared files
a multitude of file structures
a need to exchange data among applications
This causes a variety of problems
redundancy: multiple copies
inconsistency: independent updates
inaccuracy: concurrent updates
incompatibility: multiple formats
insecurity: proliferation
inauditability: poor chain of responsibility
inflexibility: changes are difficult to apply
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 9 / 22
Notes
10. Brief History of Data Management: 1960s (contd)
Hierarchical data model
IBMs Information Management System (IMS): concurrent access
only allows 1:N parent-child relationships (i.e. a tree)
hierarchy can be exploited for efficiency
queries navigate up and down treesone record at a time
data access language embedded in business processing language
difficult to express some queries
Network data model
Charles Bachmans Integrated Data Store (IDS)
model standardized by Conference On DAta SYstems Languages
(CODASYL)
data organized as collections of sets of records
separation of physical data representation from users view of data
pointers between records represent relationships
set types encoded as lists
queries navigate between recordsstill one record at a time
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 10 / 22
Notes
11. Database Management System
Idea
Abstracts common functions and creates a uniform well defined
interface for applications accessing data.
1 Data Model
all data stored in a well defined way
2 Access control
only authorized people get to see/modify it
3 Concurrency control
multiple concurrent applications access data
4 Database recovery
nothing gets accidentally lost
5 Database maintenance
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 11 / 22
Notes
12. Brief History of Data Management: 1970s
Edgar Codd proposes relational data model (1970)
firm mathematical foundation ! declarative queries
Charles Bachman wins ACM Turing award (1973)
The Programmer as Navigator
Peter Chen proposes E-R model (1976)
Transaction concepts (Jim Gray and others)
IBMs System R and UC Berkeleys Ingres systems demonstrate
feasibility of relational DBMS (late 1970s)
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 12 / 22
Notes
13. Three Level Schema Architecture
Definition (Schema)
A schema is a description of the data interface to the database (i.e.,
how the data is organized).
1 External schema (view): what the application programs and user
see. May differ for different users of the same database.
2 Conceptual schema: description of the logical structure of all data
in the database.
3 Physical schema: description of physical aspects (selection of files,
devices, storage algorithms, etc.)
Definition (Instance)
A database instance is a database (real data) that conforms to a
given schema.
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 13 / 22
Notes
14. Three-level Schema Architecture (cont.)
External
Schema
Conceptual
Schema
Internal
Schema
Conceptual
view
Internal
view
External
view
External
view
External
view
Users/
Applications
DBMS
Database
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 14 / 22
Notes
15. Data Independence
Idea
Applications do not access data directly but, rather through an
abstract data model provided by the DBMS.
Two kinds of data independence:
Physical: applications immune to changes in storage structures
Logical: applications immune to changes in data organization
Note
One of the most important reasons to use a DBMS!
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 15 / 22
Notes
16. Interfacing to the DBMS
Data Definition Language (DDL): for specifying schemas
may have different DDLs for external schema, conceptual schema,
internal schema
information is stored in the data dictionary, or catalog
Data Manipulation Language (DML): for specifying queries and
updates
navigational (procedural)
non-navigational (declarative)
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 16 / 22
Notes
17. Types of Database Users
End user:
Accesses the database indirectly through forms or other
query-generating applications, or
Generates ad-hoc queries using the DML.
Application developer:
Designs and implements applications that access the database.
Database administrator (DBA):
Manages conceptual schema.
Assists with application view integration.
Monitors and tunes DBMS performance.
Defines internal schema.
Loads and reformats database.
Is responsible for security and reliability.
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 17 / 22
Notes
18. Transactions
When multiple applications access the same data, undesirable results
occur.
Example:
withdraw(AC,1000) withdraw(AC,500)
Bal := getbal(AC)
Bal := getbal(AC)
if (Bal1000) if (Bal500)
give-money give-money
setbal(AC,Bal-1000)
setbal(AC,Bal-500)
Idea
Every application may think it is the sole application accessing the
data. The DBMS should guarantee correct execution.
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 18 / 22
Notes
19. Transactions (contd)
Definition (Transaction)
An application-specified atomic and durable unit of work.
Properties of transactions ensured by the DBMS:
Atomic: a transaction occurs entirely, or not at all
Consistency: each transaction preserves the consistency
of the database
Isolated: concurrent transactions do not interfere
with each other
Durable: once completed, a transactions changes
are permanent
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 19 / 22
Notes
20. Brief History of Data Management: 1980s
Development of commercial relational technology
IBM DB2, Oracle, Informix, Sybase
Edgar Codd wins ACM Turing award (1981)
SQL standardization efforts through ANSI and ISO
Object-oriented DBMSs
persistent objects
object ids, methods, inheritence
navigational interface reminicent of hierarchical model
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 20 / 22
Notes
21. Brief History of Data Management: 1990s-Present
Continued expansion of SQL and system capabilities
New application areas:
the Internet
On-Line Analytic Processing (OLAP)
data warehousing
embedded systems
multimedia
XML
data streams
Jim Gray wins ACM Turing award (1998)
Relational DBMSs incorporate objects (late 1990s)
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 21 / 22
Notes
22. Summary
Using a DBMS to manage data helps:
to remove common code from applications
to provide uniform access to data
to guarantee data integrity
to manage concurrent access
to protect against system failure
to set access policies for data
CS 348 (Intro to DB Mgmt) Overview of Data Management Spring 2014 22 / 22
Notes