|
Why De-Normalize Data

As applications evolved from batch to online systems, data normalization became necessary for real-time updating, but it made reports harder to generate. Ironically, IBM developed Structured Query Language early on as a tool that would let non-programmers create reports without worrying about the complexities of hierarchical and network databases. Today, SQL implies a relational database management system that makes sense only to experts.

Flat Files: Simplest Database Structure

As databases grew into warehouses, mining them for actionable data became more important. The process is like mining for gold or silver: a lot of digging before a strike. In practice, that means creating many reports, many of which are summarily discarded. Data miners, who are usually end users, should have simple tools at their disposal. Tools based on relational databases are not simple, because they invariably need technical support and maintenance.

The simplest databases are flat files. They are, after all, the lowest common denominator of all databases, including relational, multidimensional, network and hierarchical ones. In addition, flat files must be de-normalized for easier reporting. Duplication is not a concern because the files are read-only, so repeated values cannot drift out of sync through updates.
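To make the idea of a de-normalized flat file concrete, here is a minimal sketch in Python. The tables, field names and figures are hypothetical, invented only for illustration, and an in-memory string stands in for a file on disk. It copies the customer columns onto every order row, writes the result as CSV, and then produces a report from the flat file alone, with no join.

```python
import csv
import io
from collections import defaultdict

# Hypothetical normalized tables: each customer is stored once,
# and orders refer to customers by key.
customers = {
    1: {"name": "Acme", "region": "East"},
    2: {"name": "Birch", "region": "West"},
}
orders = [
    {"order_id": 100, "customer_id": 1, "amount": 250},
    {"order_id": 101, "customer_id": 1, "amount": 100},
    {"order_id": 102, "customer_id": 2, "amount": 75},
]

FIELDS = ["order_id", "name", "region", "amount"]


def denormalize(orders, customers):
    """Copy the customer columns onto every order row."""
    rows = []
    for order in orders:
        customer = customers[order["customer_id"]]
        rows.append({
            "order_id": order["order_id"],
            "name": customer["name"],      # repeated for each order
            "region": customer["region"],  # repeated for each order
            "amount": order["amount"],
        })
    return rows


def write_flat_file(rows):
    """Serialize the rows as CSV text (an in-memory stand-in for a file)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def total_by_region(flat_text):
    """Report from the flat file alone; no join is needed."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(flat_text)):
        totals[row["region"]] += int(row["amount"])
    return dict(totals)


flat_text = write_flat_file(denormalize(orders, customers))
report = total_by_region(flat_text)
```

The customer name and region appear once per order in the flat file. That duplication is the price of join-free reporting, and it is acceptable here only because the file is treated as read-only.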
|