Tuesday, October 14, 2014

Database Normalization



NORMALIZATION

Normalization is a process that helps analysts or database designers to design table structures for an application. The focus of normalization is to attempt to reduce redundant table data to the very minimum. Through the normalization process, the collection of data in a single table is replaced, by the same data being distributed over multiple tables with a specific relationship being setup between the tables. By this process RDBMS schema designers try their best to reduce table data to the very minimum.

Having said this, when the process of normalization is applied to table data and this data is spread across several associated (i.e. a specific relationship has been established) tables, it takes a query much longer to run and retrieve user data from the set of tables.

Hence, often in a commercial application after 100% normalization is carried out across the master tables often the table structures are de normalized deliberately to make SQL queries run faster. This means that in commercial applications there is often a trade off between redundant table data and the speed of query execution.

Normalization is carried out for the following reasons:
To structure the data between tables so that data maintenance is simplified
To allow data retrieval at optimal speed
To simplify data maintenance through updates, inserts and deletes
To reduce the need to restructure tables as new application requirements arise
To improve the quality of design for an application by rationalization of table data

Normalization is a technique that:
Decomposes data into two dimensional tables
Eliminates any relationships in which table data does fully depend upon the primary key of a record
Eliminates any relationship that contains transitive dependencies

A description of the three forms of Normalization is as mentioned below.

First Normal Form

When a table is decomposed into two-dimensional tables with all repeating groups of data eliminated, the table data is said to be in its firs normal form.

The repetitive portion of data belonging to the record is termed as repeating groups.

Second Normal Form

A table is said to be in its second normal form when each record in the table is in the first normal form and column in the record is fully dependent on its primary key.

Third Normal Form

Table data is said to be in third normal format when all transitive dependencies are removed from this data.

No comments: