Data Mining
Data mining, also referred to as statistics and data mining, is
the interrogation of data for the purpose of identifying
trends and patterns that indicate notable business activity.
Statistical and data mining tools can perform predictive modeling
or discover correlations between two metrics that may point to a
cause-and-effect relationship.
This includes:
- Advanced Analysis
- Hypothesis Testing
- Predictive Analysis
Where relationships between data are known,
data query is used. Data mining is employed when such relationships
are not known.
Data mining uncovers subtle relationships, such as price elasticity
and sales trends, using set theory techniques, statistical
treatments and other advanced mathematical functions. As an advanced
BI tool, data mining is typically used only by experienced
analysts, who apply it to:
- Correlation analysis
- Trend analysis
- Projections
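
As a rough illustration of correlation analysis, the sketch below computes a Pearson correlation coefficient between price and units sold. The figures are hypothetical and the `pearson` helper is written here for illustration only; it is not taken from any particular BI product.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly observations: unit price and units sold.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
units_sold = [200, 190, 170, 160, 140]

r = pearson(prices, units_sold)
print(f"price vs. units sold: r = {r:.3f}")
```

A value of r close to -1 or +1 indicates a strong linear relationship between the two metrics; on its own it does not show that one metric causes the other.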
The outcomes of data mining provide critical business insight for
both strategic and tactical decision making. Data mining can be used
across the business, for instance:
- Fraud detection
- Targeted marketing
- Risk management
- Business analysis and optimization
Example - Data Mining in Marketing
A marketing analyst may build a revenue model for a particular
product group to show gross margins by quarter as a function of
shipment times, pricing and demand. This can be used to estimate
the financial impact of delayed shipments. The outcome will help
the marketing team to determine any pricing changes required on
current stock to help cover the cost of any shipment-driven losses
or determine promotional spend to push a substitute product until
shipments can be normalized to a more profitable standard pattern.
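
A much simplified sketch of such a model is shown below. It assumes gross margin depends on shipment time alone (the model described above would also include pricing and demand), fits a straight line by ordinary least squares, and uses it to estimate the effect of a delay. All figures and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical quarterly history for one product group:
# average shipment time in days and gross margin in percent.
shipment_days = [5, 7, 9, 11, 13]
gross_margin_pct = [32.1, 29.8, 28.2, 25.9, 24.0]

intercept, slope = fit_line(shipment_days, gross_margin_pct)


def predict_margin(days):
    """Estimated gross margin (%) for a given average shipment time."""
    return intercept + slope * days


def margin_impact(extra_days):
    """Estimated change in gross margin (%) caused by a shipment delay."""
    return slope * extra_days


print(f"margin change per extra day: {slope:.3f} points")
print(f"estimated margin at 15 days: {predict_margin(15):.2f}%")
```

The slope gives the estimated margin lost per additional day of shipment time, which is the figure the marketing team would weigh against a price change or promotional spend.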
Analysis or data mining tools are used to answer questions such
as “What is the median spending of customers in each customer
region?” or “What is our market share growth this year by store,
for those stores where actual sales exceeded target sales by more
than 15%?”
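
Two of the calculations behind questions like these can be sketched in a few lines: the median spend per region, and the filter for stores that beat target by more than 15%. The records, field names and function names below are hypothetical; a BI tool would normally generate the equivalent query against the warehouse.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (customer_region, spend) records.
purchases = [
    ("North", 120.0), ("North", 80.0), ("North", 200.0),
    ("South", 50.0), ("South", 70.0),
    ("West", 300.0),
]

# Hypothetical actual and target sales per store.
stores = [
    {"store": "A", "actual": 120.0, "target": 100.0},
    {"store": "B", "actual": 110.0, "target": 100.0},
    {"store": "C", "actual": 95.0, "target": 80.0},
]


def median_spend_by_region(records):
    """Group spend values by region and take the median of each group."""
    by_region = defaultdict(list)
    for region, spend in records:
        by_region[region].append(spend)
    return {region: median(values) for region, values in by_region.items()}


def stores_beating_target(rows, margin=0.15):
    """Names of stores whose actual sales exceed target by more than `margin`."""
    return [row["store"] for row in rows
            if row["actual"] > row["target"] * (1 + margin)]


print(median_spend_by_region(purchases))
print(stores_beating_target(stores))
```

Market share growth would then be calculated only for the stores returned by the second function.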
Data mining tools learn what is 'normal' in your business,
and identify elements that do not conform to this pattern.
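
One simple way to express this idea is a z-score check: summarize a historical baseline by its mean and standard deviation, then flag new values that fall too far from it. The sketch below assumes that approach; commercial tools use a range of more sophisticated techniques, and the data, threshold and function name are hypothetical.

```python
from statistics import mean, pstdev


def flag_anomalies(history, new_values, threshold=3.0):
    """Return the new values lying more than `threshold` standard
    deviations away from the mean of the historical baseline."""
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # A constant history: anything different from it is unusual.
        return [v for v in new_values if v != baseline]
    return [v for v in new_values if abs(v - baseline) / spread > threshold]


# Hypothetical daily transaction counts for one account.
history = [100, 102, 98, 101, 99, 100, 103, 97]
new_values = [101, 99, 140, 96]

print(flag_anomalies(history, new_values))
```

The threshold controls the trade-off between missed anomalies and false alarms, so in practice it would be tuned to the business process being monitored, such as fraud detection.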
Data Mining Technology
The standard data mining interfaces are based on the OLE DB for Data
Mining specification and the Data Mining Extensions (DMX) query language.
These are widely used as standard interfaces to data mining objects
and algorithms on various data mining platforms.
Data Mining Tool Constraints
Cube-based BI architectures - These have inherent limitations
that prevent them from providing a comprehensive picture
of the inter-relationships of data across the enterprise. This is
especially so for businesses with very large data volumes, such as
Telecoms and Financial Services providers.
User interfaces - These are fairly complex, although
more recent tools provide more usable interfaces. In spite
of this, an understanding of statistical analysis is still required
in most instances.
Data - The quality of the insights gained from
Analysis or Data Mining tools is directly related to the quality
and completeness of the underlying data. Calculations are performed
by highly sophisticated SQL Generation Engines and a specialized
Analytic Engine.
Microsoft Data Mining Tools
In SQL Server 2000, Microsoft embedded data mining technology
in its business intelligence platform. The more recent SQL
Server 2005 platform uses the OLE DB for Data Mining and DMX standards
to provide data mining capability.
SQL Server Analysis Services 2005 includes nine algorithms
for data mining:
- Microsoft Association Rules
- Microsoft Clustering
- Microsoft Decision Trees
- Microsoft Linear Regression
- Microsoft Logistic Regression
- Microsoft Naïve Bayes
- Microsoft Neural Network
- Microsoft Sequence Clustering
- Microsoft Time Series
These algorithms can be applied to warehouse data stored in SQL Server
and other SQL platforms, or to multidimensional
(OLAP) data stored in Analysis Services 2005. Because it can be
queried directly with an open query language, the output from SQL
Server data mining can be used in a variety of ways.
Data mining results can be shown directly in reports, can be used
to drive actions in other applications (such as suggesting cross-selling
items in websites and Point-of-Sale systems), and can generate advanced
data visualizations through interfaces in business analytical tools,
SQL Server’s Business Intelligence Development Studio, and
other graphical data interfaces.
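
To give a feel for how cross-selling suggestions can be derived, the sketch below computes support and confidence for pairs of items bought together. It is a plain Python illustration of the general association-rule idea, not the Microsoft Association Rules algorithm or its API; the baskets, thresholds and function name are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "jam"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) rules for item pairs.

    support    = share of baskets containing both items
    confidence = share of baskets containing lhs that also contain rhs
    """
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


for lhs, rhs, support, confidence in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

A website or Point-of-Sale system could use rules of this kind to suggest the right-hand item whenever the left-hand item is in the customer's basket.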