2021/8/8 Third normal form – Wikipedia
https://en.wikipedia.org/wiki/Third_normal_form 1/5
Third normal form
Third normal form (3NF) is a database schema design approach for relational databases which uses
normalizing principles to reduce the duplication of data, avoid data anomalies, ensure referential
integrity, and simplify data management. It was defined in 1971 by Edgar F. Codd, an English computer
scientist who invented the relational model for database management.
A database relation (e.g. a database table) is said to meet third normal form standards if all the attributes
(e.g. database columns) are functionally dependent on solely the primary key. Codd defined this as a
relation in second normal form where all non-prime attributes depend only on the candidate keys and do
not have a transitive dependency on another key.[1]
A hypothetical example of a failure to meet third normal form would be a hospital database having a
table of patients which included a column for the telephone number of their doctor. The phone number is
dependent on the doctor, rather than the patient, thus would be better stored in a table of doctors. The
negative outcome of such a design is that a doctor’s number will be duplicated in the database if they
have multiple patients, thus increasing both the chance of input error and the cost and risk of updating
that number should it change (compared to a third normal form-compliant data model that only stores a
doctor’s number once on a doctor table).
Codd later realized that 3NF did not eliminate all undesirable data anomalies and developed a stronger
version to address this in 1974, known as Boyce–Codd normal form.
Definition of third normal form
“Nothing but the key”
Computation
Derivation of Zaniolo conditions
Normalization beyond 3NF
Considerations for use in reporting environments
See also
References
Further reading
External links
The third normal form (3NF) is a normal form used in database normalization. 3NF was originally
defined by E. F. Codd in 1971.[2]
Contents
Definition of third normal form
https://en.wikipedia.org/wiki/Database_schema
https://en.wikipedia.org/wiki/Relational_database
https://en.wikipedia.org/wiki/Database_normalization
https://en.wikipedia.org/wiki/Software_bug
https://en.wikipedia.org/wiki/Referential_integrity
https://en.wikipedia.org/wiki/Edgar_F._Codd
https://en.wikipedia.org/wiki/Relational_model
https://en.wikipedia.org/wiki/Database
https://en.wikipedia.org/wiki/Relation_(database)
https://en.wikipedia.org/wiki/Table_(database)
https://en.wikipedia.org/wiki/Column_(database)
https://en.wikipedia.org/wiki/Functional_dependency
https://en.wikipedia.org/wiki/Primary_key
https://en.wikipedia.org/wiki/Second_normal_form
https://en.wikipedia.org/wiki/Candidate_key
https://en.wikipedia.org/wiki/Transitive_dependency
https://en.wikipedia.org/wiki/Boyce%E2%80%93Codd_normal_form
https://en.wikipedia.org/wiki/Database_normalization#Normal_forms
https://en.wikipedia.org/wiki/Database_normalization
https://en.wikipedia.org/wiki/E._F._Codd
2021/8/8 Third normal form – Wikipedia
https://en.wikipedia.org/wiki/Third_normal_form 2/5
Codd’s definition states that a table is in 3NF if and only if both of the following conditions hold:
The relation R (table) is in second normal form (2NF).
Every non-prime attribute of R is non-transitively dependent on every key of R.
A non-prime attribute of R is an attribute that does not belong to any candidate key of R.[3] A transitive
dependency is a functional dependency in which X → Z (X determines Z) indirectly, by virtue of X → Y
and Y → Z (where it is not the case that Y → X).[4]
A 3NF definition that is equivalent to Codd’s, but expressed differently, was given by Carlo Zaniolo in
1982. This definition states that a table is in 3NF if and only if for each of its functional dependencies X
→ A, at least one of the following conditions holds:[5][6]
X contains A (that is, A is a subset of X, meaning X → A is trivial functional dependency),
X is a superkey,
every element of A \ X, the set difference between A and X, is a prime attribute (i.e., each
attribute in A \ X is contained in some candidate key).
Zaniolo’s definition gives a clear sense of the difference between 3NF and the more stringent Boyce–
Codd normal form (BCNF). BCNF simply eliminates the third alternative (“Every element of A \ X, the
set difference between A and X, is a prime attribute.”).
An approximation of Codd’s definition of 3NF, paralleling the traditional pledge to give true evidence in a
court of law, was given by Bill Kent: “[every] non-key [attribute] must provide a fact about the key, the
whole key, and nothing but the key”.[7] A common variation supplements this definition with the oath “so
help me Codd”.[8]
Requiring existence of “the key” ensures that the table is in 1NF; requiring that non-key attributes be
dependent on “the whole key” ensures 2NF; further requiring that non-key attributes be dependent on
“nothing but the key” ensures 3NF. While this phrase is a useful mnemonic, the fact that it only mentions
a single key means it defines some necessary but not sufficient conditions to satisfy the 2nd and 3rd
normal forms. Both 2NF and 3NF are concerned equally with all candidate keys of a table and not just
any one key.
Chris Date refers to Kent’s summary as “an intuitively attractive characterization” of 3NF and notes that
with slight adaptation it may serve as a definition of the slightly stronger Boyce–Codd normal form:
“Each attribute must represent a fact about the key, the whole key, and nothing but the key.”[9] The 3NF
version of the definition is weaker than Date’s BCNF variation, as the former is concerned only with
ensuring that non-key attributes are dependent on keys. Prime attributes (which are keys or parts of
keys) must not be functionally dependent at all; they each represent a fact about the key in the sense of
providing part or all of the key itself. (This rule applies only to functionally dependent attributes, as
applying it to all attributes would implicitly prohibit composite candidate keys, since each part of any
such key would violate the “whole key” clause.)
An example of a 2NF table that fails to meet the requirements of 3NF is:
“Nothing but the key”
https://en.wikipedia.org/wiki/If_and_only_if
https://en.wikipedia.org/wiki/Relation_(database)
https://en.wikipedia.org/wiki/Second_normal_form
https://en.wikipedia.org/wiki/Candidate_key
https://en.wikipedia.org/wiki/Transitive_dependency
https://en.wikipedia.org/wiki/Functional_dependency
https://en.wikipedia.org/wiki/Superkey
https://en.wikipedia.org/wiki/Complement_(set_theory)#Relative_complement
https://en.wikipedia.org/wiki/Candidate_key
https://en.wikipedia.org/wiki/Boyce%E2%80%93Codd_normal_form
https://en.wikipedia.org/wiki/Sworn_testimony
https://en.wikipedia.org/wiki/Edgar_F._Codd
https://en.wikipedia.org/wiki/First_normal_form
https://en.wikipedia.org/wiki/Second_normal_form
https://en.wikipedia.org/wiki/Christopher_J._Date
https://en.wikipedia.org/wiki/Boyce%E2%80%93Codd_normal_form
2021/8/8 Third normal form – Wikipedia
https://en.wikipedia.org/wiki/Third_normal_form 3/5
Tournament winners
Tournament Year Winner Winner’s date of birth
Indiana Invitational 1998 Al Fredrickson 21 July 1975
Cleveland Open 1999 Bob Albertson 28 September 1968
Des Moines Masters 1999 Al Fredrickson 21 July 1975
Indiana Invitational 1999 Chip Masterson 14 March 1977
Because each row in the table needs to tell us who won a particular Tournament in a particular Year, the
composite key {Tournament, Year} is a minimal set of attributes guaranteed to uniquely identify a row.
That is, {Tournament, Year} is a candidate key for the table.
The breach of 3NF occurs because the non-prime attribute (Winner’s date of birth) is transitively
dependent on the candidate key {Tournament, Year} through the non-prime attribute Winner. The fact
that Winner’s date of birth is functionally dependent on Winner makes the table vulnerable to logical
inconsistencies, as there is nothing to stop the same person from being shown with different dates of
birth on different records.
In order to express the same facts without violating 3NF, it is necessary to split the table into two:
Tournament winners
Tournament Year Winner
Indiana Invitational 1998 Al Fredrickson
Cleveland Open 1999 Bob Albertson
Des Moines Masters 1999 Al Fredrickson
Indiana Invitational 1999 Chip Masterson
Winner’s dates of birth
Winner Date of birth
Chip Masterson 14 March 1977
Al Fredrickson 21 July 1975
Bob Albertson 28 September 1968
Update anomalies cannot occur in these tables, because unlike before, Winner is now a candidate key in
the second table, thus allowing only one value for Date of birth for each Winner.
A relation can always be decomposed in third normal form, that is, the relation R is rewritten to
projections R1, …, Rn whose join is equal to the original relation. Further, this decomposition does not
lose any functional dependency, in the sense that every functional dependency on R can be derived from
the functional dependencies that hold on the projections R1, …, Rn. What is more, such a decomposition
can be computed in polynomial time.[10]
The definition of 3NF offered by Carlo Zaniolo in 1982, and given above, is proven in the following way:
Let X → A be a nontrivial FD (i.e. one where X does not contain A) and let A be a non-key attribute. Also
let Y be a key of R. Then Y → X.
Computation
Derivation of Zaniolo conditions
https://en.wikipedia.org/wiki/Relational_projection
https://en.wikipedia.org/wiki/Natural_join
https://en.wikipedia.org/wiki/Functional_dependencies
https://en.wikipedia.org/wiki/Polynomial_time
https://en.wikipedia.org/wiki/Functional_dependency
2021/8/8 Third normal form – Wikipedia
https://en.wikipedia.org/wiki/Third_normal_form 4/5
Most 3NF tables are free of update, insertion, and deletion anomalies. Certain types of 3NF tables, rarely
met with in practice, are affected by such anomalies; these are tables which either fall short of Boyce–
Codd normal form (BCNF) or, if they meet BCNF, fall short of the higher normal forms 4NF or 5NF.
While 3NF was ideal for machine processing, the segmented nature of the data model can be difficult to
consume by a human user. Analytics via query, reporting, and dashboards were often facilitated by a
different type of data model that provided pre-calculated analysis such as trend lines, period-to-date
calculations (month-to-date, quarter-to-date, year-to-date), cumulative calculations, basic statistics
(average, standard deviation, moving averages) and previous period comparisons (year ago, month ago,
week ago) e.g. dimensional modeling and beyond dimensional modeling, flattening of stars via Hadoop
and data science.[11][12]
Attribute-value system
1. Codd, E. F. “Further Normalization of the Data Base Relational Model”, p. 34.
2. Codd, E. F. “Further Normalization of the Data Base Relational Model”. (Presented at Courant
Computer Science Symposia Series 6, “Data Base Systems”, New York City, May 24–25, 1971.)
IBM Research Report RJ909 (August 31, 1971). Republished in Randall J. Rustin (ed.), Data Base
Systems: Courant Computer Science Symposia Series 6. Prentice-Hall, 1972.
3. Codd, p. 43.
4. Codd, p. 45–46.
5. Zaniolo, Carlo. “A New Normal Form for the Design of Relational Database Schemata”. ACM
Transactions on Database Systems 7(3), September 1982.
6. Abraham Silberschatz, Henry F. Korth, S. Sudarshan, Database System Concepts (http://www.d
b-book.com/) (5th edition), p. 276–277.
7. Kent, William. “A Simple Guide to Five Normal Forms in Relational Database Theory” (http://ww
w.bkent.net/Doc/simple5.htm), Communications of the ACM 26 (2), Feb. 1983, pp. 120–125.
8. The author of a 1989 book on database management credits one of his students with coming
up with the “so help me Codd” addendum. Diehr, George. Database Management (Scott,
Foresman, 1989), p. 331.
9. Date, C. J. An Introduction to Database Systems (7th ed.) (Addison Wesley, 2000), p. 379.
10. Serge Abiteboul, Richard B. Hull, Victor Vianu: Foundations of Databases. Addison-Wesley,
1995. http://webdam.inria.fr/Alice/ ISBN 0201537710. Theorem 11.2.14.
11. “Comparisons between Data Warehouse modelling techniques – Roelant Vos” (http://roelantvo
s.com/blog/?p=740). roelantvos.com. Retrieved 5 March 2018.
12. “Hadoop Data Modeling Lessons | EMC” (https://infocus.dellemc.com/william_schmarzo/hadoo
p-data-modeling-lessons-vin-diesel/). InFocus Blog | Dell EMC Services. 23 September 2014.
Retrieved 5 March 2018.
Normalization beyond 3NF
Considerations for use in reporting environments
See also
References
https://en.wikipedia.org/wiki/Boyce%E2%80%93Codd_normal_form
https://en.wikipedia.org/wiki/Fourth_normal_form
https://en.wikipedia.org/wiki/Fifth_normal_form
https://en.wikipedia.org/wiki/Dimensional_modeling
https://en.wikipedia.org/wiki/Hadoop
https://en.wikipedia.org/wiki/Data_science
https://en.wikipedia.org/wiki/Attribute-value_system
http://www.db-book.com/
http://www.bkent.net/Doc/simple5.htm
https://en.wikipedia.org/wiki/Serge_Abiteboul
https://en.wikipedia.org/wiki/Victor_Vianu
http://webdam.inria.fr/Alice/
https://en.wikipedia.org/wiki/ISBN_(identifier)
https://en.wikipedia.org/wiki/Special:BookSources/0201537710
http://roelantvos.com/blog/?p=740
https://infocus.dellemc.com/william_schmarzo/hadoop-data-modeling-lessons-vin-diesel/
2021/8/8 Third normal form – Wikipedia
https://en.wikipedia.org/wiki/Third_normal_form 5/5
Date, C. J. (1999), An Introduction to Database Systems (https://web.archive.org/web/2005040
4010227/http://www.aw-bc.com/catalog/academic/product/0,1144,0321197844,00.html) (8th
ed.). Addison-Wesley Longman. ISBN 0-321-19784-4.
Kent, W. (1983) A Simple Guide to Five Normal Forms in Relational Database Theory (http://ww
w.bkent.net/Doc/simple5.htm), Communications of the ACM, vol. 26, pp. 120–126
Litt’s Tips: Normalization (http://www.troubleshooters.com/littstip/ltnorm.html)
Database Normalization Basics (http://databases.about.com/od/specificproducts/a/normalizati
on.htm) by Mike Chapple (About.com)
An Introduction to Database Normalization (http://mikehillyer.com/articles/an-introduction-to-
database-normalization/) by Mike Hillyer.
A tutorial on the first 3 normal forms (http://phlonx.com/resources/nf3/) by Fred Coulson
Description of the database normalization basics (http://support.microsoft.com/kb/283878) by
Microsoft
Third Normal Form with Simple Examples (http://exploredatabase.com/2014/02/third-normal-f
orm-3nf-with-example.html) by exploreDatabase
Retrieved from “https://en.wikipedia.org/w/index.php?title=Third_normal_form&oldid=1018107159”
This page was last edited on 16 April 2021, at 09:02 (UTC).
Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. By using
this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia
Foundation, Inc., a non-profit organization.
Further reading
External links
https://web.archive.org/web/20050404010227/http://www.aw-bc.com/catalog/academic/product/0,1144,0321197844,00.html
https://en.wikipedia.org/wiki/ISBN_(identifier)
https://en.wikipedia.org/wiki/Special:BookSources/0-321-19784-4
http://www.bkent.net/Doc/simple5.htm
http://www.troubleshooters.com/littstip/ltnorm.html
http://databases.about.com/od/specificproducts/a/normalization.htm
http://phlonx.com/resources/nf3/
http://support.microsoft.com/kb/283878
http://exploredatabase.com/2014/02/third-normal-form-3nf-with-example.html
https://en.wikipedia.org/w/index.php?title=Third_normal_form&oldid=1018107159
https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License
https://foundation.wikimedia.org/wiki/Terms_of_Use
https://foundation.wikimedia.org/wiki/Privacy_policy
Imagine a world in which every single human being can freely share in the sum of all knowledge.