Just What is a Data Scientist?

6:18 am Analytics, Big Data

In the emerging world of Big Data, one term making an entrance is ‘Data Scientist’. But, just what is a Data Scientist? I did a bit of digging around to find out. This is what I found:

First, let’s start with what it is not – a Business Analyst or a Data Analyst.

Although this ‘new role’ has emerged with the hype around Big Data, it is not exclusive to Big Data projects. According to IBM, “A data scientist represents an evolution from the business or data analyst role”. I'm not sure I agree with this. Both of these roles are highly varied in skills and responsibilities. I would suggest that Data Scientists come from a more scientific, mathematical and strategic background than most BAs and DAs. They certainly have advanced analytical skills that few BAs, and probably no DAs, possess.

The role of the Data Scientist is to extract trends and other patterns from large volumes of data – using a range of mathematical and statistical models. Add to that the ability to configure ‘What If’ predictive models, and the all-important ability to communicate their conclusions and recommendations to organisational leaders.
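To make "extracting a trend" and asking "What If" a little more concrete, here is a minimal sketch in Python. It is only an illustration: the data, the variable names and the choice of a simple least-squares line are my own assumptions, not something taken from IBM or from the video.

```python
from statistics import mean


def fit_trend(xs, ys):
    """Fit a straight line y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def what_if(slope, intercept, x):
    """Predict y for a hypothetical x using the fitted trend line."""
    return slope * x + intercept


# Hypothetical data (made up for illustration):
# monthly ad spend in thousands, and units sold in that month.
ad_spend = [1, 2, 3, 4, 5]
units_sold = [12, 14, 16, 18, 20]

slope, intercept = fit_trend(ad_spend, units_sold)

# 'What If' question: what might sales look like if ad spend rose to 8?
forecast = what_if(slope, intercept, 8)
print(f"trend: {slope:.1f} extra units per unit of spend; forecast at 8: {forecast:.1f}")
```

A real project would use far more data and a richer model, but the shape is the same: fit a model to what has happened, then query it with a scenario that has not happened yet.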

The Data Scientist is certainly a role of the future.

From the video above [by Data Scientist Troy Sadkowsky], we learn that Data Science is about:

  • Data Acquisition – from multiple sources
  • Data Management – storage, format, standardisation
  • Information Visualisation – presentation of data in a meaningful way
  • Analytics – statistical analysis
  • Insight and Innovation


Insight and Innovation

Insight and innovation have three perspectives:

  1. New understanding
  2. New product
  3. Alternative histories – quantum theory suggests that a particle moves between point A and point B along multiple possible paths. Humans also have multiple possible paths.

Current to New

In moving from current to new insight and innovation, there are two aspects to consider, knowledge and technology:

  • Knowledge – is defined by four questions: what, how, who and why
  • Technology – has three levels: organising data [data management], packaging [visualisation and analytics] and delivery [communication]. These correspond to the physical, logical and emotional aspects of data.

In moving to a new position, we ask the same questions: what we want to move towards, how we might get there, who needs to be involved and why we should even try.

The gap between the current and the new can be defined by answering these questions for both states. That gap then becomes the basis for the roadmap to the new ideal.

If we consider the current state – we have an ever-growing sphere of Big Data, enabled by reduced cost of storage space and the technology to read it, created and used by millions of parties around the world, for the purpose of sharing information.

Answering the technology questions, we are organising data in the Cloud, packaging it for mobile devices and delivering it via an Open network [the Internet].

Looking to the future state – the focus will be on Big Value, enabled by Data Science, around the World for the purpose of transforming the way we live, work and play. We don’t need more data – we need insight to transform our existence in a positive way.

The technology will be more advanced: consider Cloud 2.0, Mobile 2.0 and Open 2.0.

So what does this all mean? What is the point? According to Troy, there are two reasons to do something: to react to something, or just to be creative.

For templates used in this presentation: www.datascientist.net/innovation

For recommended books: BIG Bookstore
