Databases
Lecture 2 – History and Design
Bernhard Reus
1
History (Highlights)
Timeline
1960
3
© Bernhard Reus, University of Sussex, 2004-16
1960 – 1970
• 61-64: Network DBMS by
Integrated Data Store IDS
• 62: SABRE: IBM & Am. Airlines,
1st real time multiple user
• From 1966: Information Management System IMS. Hierarchical DBMS by for
4
1970
• Relational model by
Ted Codd (IBM research fellow)
database theory
Famous paper:
“A Relational Model of Data for Large Shared Data Banks”
Turing Award 1981
1924-2003
5
© Bernhard Reus, University of Sussex, 2004-16
1970 – 80
• 1974 prototype: System R (first Relational DBMS, 1.2MB)
introduces SEQUEL
James Gray 1944-2007
developed by James Gray who disappeared January 28th 2007 on a 40-foot fiberglass cruiser called Tenacious around the Farallon Islands (Frisco Bay)
• Amazon provided imagery and 40,000 people looked at hundreds of GB of satellite images to find a trace of his boat
• James Gray won the Turing Award 1998
6
1970 – 80
• 1977 Lary Ellison was quicker than IBM to pick up on (the business potential of) Codd’s ideas and founded
• which became the market leader so Ellison the 3rd richest American and 5th richest man on the planet ($56 billion)
• he has been CEO for 36 years; in 2014 he decided to stand down to become chairman
7
© Bernhard Reus, University of Sussex, 2004-16
1975 – 80
• 1975 ACM Special interest group:
Management Of Data (SIGMOD)
• 1975 INGRES (Univ. of California)
predecessor of Sybase
• 1976 Entity-relationship (E/R) model by Peter Chen
8
1980 – 90
• DBMS for personal computers (dBASE
by Ashton-Tate)
• 1983: commercial systems emerge: IBM:DB2, Oracle, Sybase, Informix
• 1986: first SQL standard
• 1989: constraints added
• client-server architecture
9
© Bernhard Reus, University of Sussex, 2004-16
1990 – 2003
• “Third generation DBMSes”
– DBMS for spatial, multimedia data – Object-relational Databases
– Object-Oriented DBMS (ObjectStore)
– XML (Yukon, Tamino, Lore)
• Massive parallel processing
• Standards:
– 1992: SQL2
– 1999: SQL3 (OODBMS)
– 2003: SQL:2003, also called SQL/XML
10
2003 – 2010 • Data explosion (data
doubles every two years)
• 2006: MySQL (started 1996) gains momentum (OPEN SOURCE database); 70,000 downloads a day
• 2008: MySQL bought by SUN ($1 billion), brain- drain reported
2009 Oracle
acquired SUN (and
thus Java, MySQL
etc) for $7.4 billion
11
© Bernhard Reus, University of Sussex, 2004-16
Present and future
• Cloud databases eg Amazon EC2
“A cloud is water vapor.”
(Larry Ellison 2010)
Databases as Service, provided by Data centers
• NoSQL databases (e.g. CouchDB,Hbase,BigTable)
• Big Data: too large for standard DBMSs (petabytes)
• Mobility in networks requires distributed database systems
Facebook handles over 250 billion photos, uses HBase
• Embedded Databases, ie. serverless (or “light”) databases for applications (SQLite)
12
ANSI/SPARC Architecture
• Proposed 1975; suggests three levels:
– External level: tailored to a user
– Conceptual level: community level, what data is stored and what are relationships
– Internal level: physical representation, how it is actually stored
13
ANSI: American National Standards Institute
SPARC: StandardsPlanningandRequirementsCommittee
© Bernhard Reus, University of Sussex, 2004-16
external level
conceptual level
internal level
physical data organisation
DBMS
logical physical
14
ANSI / SPARC Architecture
User 1
View 1
User 2
User 3
View 2
View 3
Conceptual Schema
Internal schema
Independence
• Logical Data Independence users/applications are immune to changes of the logical structure of the database, i.e. changes to the conceptual schema.
• Physical Data Independence users/applications are immune to changes of the physical representation of the stored data in the database
15
© Bernhard Reus, University of Sussex, 2004-16
data independence
Database Design
• Process of creating a design for a database that will support the enterprise’s operations and objectives.
• What form of data organization, domains, constraints etc. are needed?
16
Why bother?
• The student next to you would like to have a database application for their records/cds/mp3s, but does not know anything about Database software or theory.
• Try to explain & discuss the “shape of the data” (data model) for this application (what is stored and how it’s organised) in plain English.
• Then reverse roles.
• Then report why this is difficult.
17
© Bernhard Reus, University of Sussex, 2004-16
Need for a Data Model
• Every DBMS comes with a data definition language
• which is low level; need for a higher-level description:
Data Model: “An integrated collection of concepts for describing and manipulating data, relationships between data, and constraints on the data in an organization.”
Connolly & Begg: Database Systems
18
Design Phases
Build a (semantic) model of the info used independent of all physical considerations.
Construct a database model based on a specific kind of data model, independent of any particular DBMS.
Implementation of the DBMS, how data is stored on disk.
19
Conceptual Design
Logical Design
Physical Design
© Bernhard Reus, University of Sussex, 2004-16
Conceptual schema
internal schema
Design Phases vs. Architecture
User 1
Logical Design internalPhysical Design
User 2
User 3
Ceuxtsetronmal ised levevl iews
View 1
View 2
View 3
conceptual
logical physical
20
Conceptual Design
Conceptual Schema
level level
Internal schema
physical data organisation
DBMS
© Bernhard Reus, University of Sussex, 2004-16
data independence