The Business Intelligence Guide
   BI Strategy | BI Program | BI Projects | BI Data | BI Infrastructure | BI User Tools | BI Vendors | Resource Guides | Articles | BI Blog | BIG Bookstore

Get a FREE Sample of the
The Total BI Guide
THE LOGICAL ORGANIZATION

and receive the
TLO NEWSLETTER
Just enter your details below

Email:
Name:
 
HOME
 
Business Intelligence
BI Strategy
BI Program Guide
BI Tools
- Dashboards
- Scorecards
- Operational BI
- Analytics
BI Software Solutions
Data Management
Databases
DataWarehouses
Decision Support
Marketing Tools
Industry Solutions
Case Studies
BI Surveys & Awards
RESOURCES
ARTICLES
BI EVENTS
BI NEWS
Sitemap

About the Author

View Gail La Grouw's profile on LinkedIn

Google+ Gail La Grouw

Bookmark and Share

Data Warehouses


Data Warehouse

A data warehouse is the main storage server hosting an organizations historical data, its corporate memory. It is distinct from operational or transactional systems supporting the business functions.

A data warehouse can be normalized or denormalized. It can be a relational database, multidimensional database, flat file, hierarchical database, object database, etc.

Uses of Data Warehouse

The datawarehouse is used to contain the raw data for management's decision support systems [business intelligence tools].

Not all data warehouses are used for decision support. Many data warehouses are used for post-decision monitoring of the effects of decisions [for "operational" issues].

Main Benefit

The main benefit of a data warehouse is that complex queries and analysis, such as data mining, can be made on organisational information without slowing down the operational or transactional systems.

 

DW Architecture

A typical, traditional Data Warehouse architecture includes three main components:

  1. Server
  2. Storage [SAN or Storage Area Network]
  3. Database software

 

These DW solutions were typically built from systems developed for on-line transaction processing [OLTP]. They were not designed to support large and complex Business Intelligence (BI) analysis. Inherent constraints result in limited performance and high costs of acquisition and ownership.

For this reason, more innovative DW solutions evolved to DW integrated with Analytics, such as:

Performance Constraints

General-purpose servers - same computers used in data centers [web servers,
email servers or application servers] designed for OLTP applications requiring efficient, RAM-based operations on individual data elements.

General-purpose storage - arrays require time-consuming, careful synchronization of loaders and data striping mechanisms to ensure that data is distributed to all BI users.

General-purpose database - full power of general-purpose database management systems [DBMS] such as DB2 or Oracle is lost when embedded within general-purpose hardware and used for data warehousing. The software is not designed to extract optimal performance out of even the most advanced servers and storage.

 

Defintion of Data Warehouse

A data warehouse can be defined in the following terms:

  • Subject-oriented - The data in the database is organized so that all the data elements relating to the same real-world event or object are linked together.
  • Time-variant - The changes to the data in the database are tracked and recorded so that reports can be produced showing changes over time.
  • Non-volatile - Data in the database is never over-written or deleted - once committed, the data is static, read-only, but retained for future reporting; and
  • Integrated - The database contains data from most or all of an organization's operational applications, and that this data is made consistent.

While operational systems are optimized for simplicity and speed of modification [Online Transactional Processing, or OLTP] through heavy use of database normalization and an entity-relationship model, the data warehouse is optimized for reporting and analysis [online analytical processing, or OLAP].

Data in data warehouses are often heavily denormalised, summarised or stored in a dimension-based model. This is not always required to achieve acceptable query response times.

 

Types of Data Warehouses

With technology improvements supporting lower cost for more performance and to support user requirements for faster data load cycle times and more features, data warehouses have evolved through different stages:

Off line Operational Databases - Data warehouses developed by simply copying the database of an operational system to an off-line server where the processing load of reporting does not impact on the operational system's performance.

Off line Data Warehouse - Data warehouses are updated on a regular time cycle (usually daily, weekly or monthly) from the operational systems and the data is stored in an integrated reporting-oriented data structure.

Real Time Data Warehouse - Data warehouses are updated on a transaction or event basis, every time an operational system performs a transaction [e.g. an order or a delivery or a booking etc.]

Integrated Analytics Data Warehouse - Data warehouses are used to generate activity or transactions that are passed back into the operational systems for use in the daily activity of the organization. These are now fully integrated to the various types of business intelligence tools, such as MS Excel, Scorecards, Dashboards, Analytic Services.

 

Advantages Of Data Warehouses

There are many advantages to using a data warehouse, including:

  1. Enhances end-user access to a wide variety of data.
  2. Decision support system users can obtain specified trend reports, e.g. the item with the most sales in a particular area/country within the last two years.
  3. A data warehouse can be a significant enabler of commercial business applications, most notably customer relationship management (CRM).

 

Data Warehouse Concerns

There are some general concerns regarding the use of data warehouses:

  1. Extracting, transforming and loading data consumes a lot of time and computational resources.
  2. Data warehousing project scope must be actively managed to deliver a release of defined content and value.
  3. Compatibility problems with systems already in place.
  4. Security could develop into a serious issue, especially if the data warehouse is web accessible.
  5. Data Storage design controversy warrants careful consideration and perhaps prototyping of the data warehouse solution for each project's environments.

Next: Teradata Active Data Warehouse

Back To Top

For The World's Leading Guide To BI Strategy, Program & Technology


Database Index | Relational Model | Object Model | Other Models | DBMS | Contextual Types | Microsoft SQL | Oracle OODBMS | Data Warehouse | Teradata ADW | Data Mart | MS vs Teradata | Development Platforms | ODBC | JDBC | SMP | MPP | SN | Glossary

Bookmark and Share
 


EFFECTIVE DASHBOARD DESIGN

Design Secrets to Getting More Value From Performance Dashboard

THE COMPLETE DESIGN GUIDE
Effective Dashboard Design