The Business Intelligence Guide
   BI Strategy | BI Program | BI Projects | BI Data | BI Infrastructure | BI User Tools | BI Vendors | Resource Guides | Articles | BI Blog | BIG Bookstore

Get a FREE Sample of the
The Total BI Guide
THE LOGICAL ORGANIZATION

and receive the
TLO NEWSLETTER
Just enter your details below

Email:
Name:
 
HOME
 
Business Intelligence
BI Strategy
BI Program Guide
BI Tools
- Dashboards
- Scorecards
- Operational BI
- Analytics
BI Software Solutions
Data Management
Databases
DataWarehouses
Decision Support
Marketing Tools
Industry Solutions
Case Studies
BI Surveys & Awards
RESOURCES
ARTICLES
BI EVENTS
BI NEWS
Sitemap

About the Author

View Gail La Grouw's profile on LinkedIn

Google+ Gail La Grouw

Bookmark and Share

Microsoft SQL Server 2008


Microsoft SQL Server 2008 aims to provide a comprehensive, scalable data warehouse platform with enhancements for data warehousing that make it faster to build a data warehouse.

SQL Server 2008 integrates all relevant data into a comprehensive warehouse platform, empowering businesses to manage their growing data volumes with an enterprise-ready, relational database.

With this capability, organisations are better equipped to deliver actionable insights through the integration of the Microsoft Business Intelligence (BI) platform.

The key elements of this 2008 version over previous 2005 include:

  • Manageablity
  • Reliability - Server Core, Networking, Clustering
  • Web - Internet Information Services, Sharepoint Services, Media Services
  • Security - Read Only Domain Controller
  • Virtualization - Terminal Services, Gateway Apps

 

SQL Server 2008 & Data Warehouse

MS Server 2008 provides major performance improvements in Data warehouse scenarios

Improvement in performance and scalability stem from improvements:

  1. Integration Services
  2. Analysis Services
  3. Reporting Services
  4. Administration

MS SQL Server 2008 has been tested on large-scale, customer driven configuration:

Database Engine: to 100 billion fact table rows on a 32 core machine

Analysis Services: to 25 billion fact table rows on 16 core hardware

Improvements can be considered in three pillars:

  1. Extract, Transform and Load [ETL] – Merge, Minimal Logging, Insert Over DML, Change Data Capture, Integration Services Threading Model, Lookup Enhancements and Compression
  2. Query and Analysis
  3. Administration and Manageability


ETL Improvements

Merge

Syntax SQL standard used to combine multiple data insert operations into a single statement. Useful in DW to make ETL more efficient. MS has also added support for merge statement, to comply with SQL standard. If the source matches the target, it performs an update, if there is no match, it deletes the source.

For instance take the daily trades and merge them into the stock tables. If we reach as tock of Zero, the record is deleted. If the stock is not Zero, the value is updated. If no match, then INSERT the values [Stock and Delta]

Logging

Improves INSERT statements by logging only what is absolutely necessary for rollback:

  • Normally individual rows are logged
  • Page allocations are sufficient to UNDO insertions
  • The recovery model must be simple or bulk logged

Expect 3-5 times performance boost over fully logged INSERT compared to current SQL server fully logged INSERT. This helps maintain user SLAs inside service windows.

Insert / DML

The INSERT statement has the option of being able to consume DML results. This is useful in tracking history of slow changing dimensions, and dumping DML data streams to a secondary table for post-processing.

Change Data Capture

This mechanism provides for easy tracking of changes on a table. The changes are capatured fro the log asynchronously, and information is shown as to what has changed on source table. Easily consumed from Integration services.

Capture process consumes the changes, inserts into change table, which changes them further. Stored Procedures are used to qwitch on Change Data Capture Table

New Look Up

In previous versions, the initialisation process of a look up process took longer than the ETL maintenance process. The main improvement is that you are able to persist the tables on a local disk so look up does not have to be run on the database. This provides an unrestricted cache size and persistent cache

Compression On Data base Engine

The main goal of compression is to shrink data warehouse fact tables, and thus improve query performance. Can be on a table or index basis. There is a trade off on storage or CPU usage.

  • In SQL 2005 – decimal format was compressed
  • In SQL 2008 - extends this to other formats

Prefix compression - A prefix list is stored in the page for common prefixes. Individual values are replaced by:

  • Token for the prefix
  • Suffix for the value

Dictionary Compression – a common value dictionary is stored in the page, allowing common values to be replaced by tokens.

Going over the data in the rows to ID the common values, these are represented in the green area, then point to tokens in the dictionary. Gives real savings of up to 7x, depending upon data. Different degrees of redundancy in data.

 

Analysis Services Enhancements

The design enhancements may be the best performance and scalability improvement. By integrating best practice knowledge into the designers, using:

  • Presentation of attribute relationships
  • Improved wizard output
  • Simplified creation of composite keys
  • Analysis Management Objects [AMO] warnings
  • Improved support functions

Query and Analysis


Partitioned Table Parallelism

The major change here is in how threads are allocated to table query. For example, running weekly queries:

  • On Monday only running on one week – previous week. All threads used for a single partition.
  • On Tuesday, have to run across two weeks, two partitions. Using two threads, one for each partition.

In 2008 an arbitrary number of threads [CPUs] against the first partition, then move automatically into the second partition, so you always get the full degree of performance on your data.

On a mid range machine, where most of the data is in one partition and only a small volume in the second partition.

2008 just changes the way we deal with multiple partitions, with improvements most noticeable when most of the data is in one partition.

Tests on a 4-core machine [x64] and 40-GB fact table, partitioned by week provided staggering improvements from SQL Server 2005 performance.

  Table 1 Table 2
SQL Server 2005 1276 ms 15,780 ms
SQL Server 2008 368 ms 992 ms
Improvement Factor 3.5x 15x


Star Joint Queries

Decision support query that joins fact tables. If using compression, expect to see scans speed up. Using Hop query processing – selects 10-75% of fact table.

Bitmap filters in SQL 2005, now 2008 can have multiple to do semi join productions. Objective to move all bit map filters for all dimensions and locate in fact table. Then process the first bit map filter that has the most data first, to reduce the volume of data to be queried.

Query response times improved by 17% - 23%. A query in 2005 took one hour, 2008 out of box impoved down to 35 seconds.

 

Reporting Enhancements

With a scalable reporting engine and word rendering, you no longer need to rely on IIS, you can now render reports into Word, making report layout more flexible.

 

Administration

Back up Compression

Keeping disk-based back ups online is expensive, and backups take longe, when maintenance windows are reducing.

With SQL 2008 you can add a COMPRESSION clause to BACKUP statement.

This means that less storage is required to keep backups on line, backups will run significantly faster [less IO is done] and restore automatically detects compression and adjusts accordingly

In SQL Server 2005 - back up times grow exponentially beyond 20G

In SQL Server 2008 - with the replaced back up system, times only grow linearly.





Back To Top

For The World's Leading Guide To BI Strategy, Program & Technology


DW Solutions | Teradata ADW | Netezza NPS | DATAllegro | HP NeoView | Teradata 12.0 | Green Data Centers

Bookmark and Share
 


EFFECTIVE DASHBOARD DESIGN

Design Secrets to Getting More Value From Performance Dashboard

THE COMPLETE DESIGN GUIDE
Effective Dashboard Design