
SSIS Junkie

Business Intelligence, what is it and what is it not?

Introduction

What is Business Intelligence? The term "Business Intelligence" was popularised by Howard Dresner in the late 80s to describe an emerging discipline concerned with the discovery of information in an enterprise. Over the years the term has become diluted and ambiguous and thus has come to mean a great many things to a great many people. In this blog entry I want to explain exactly what I believe Business Intelligence (BI) is and, crucially, what it is not.


Gazumping Some Myths

Firstly, it is necessary to explode a few myths about BI. I use the term "information worker" herein to refer to a person that would make use of BI.

Myth #1 : BI is synonymous with Online Analytical Processing (OLAP)

I know one particular technology-focused commentator that believes this to be true. It isn't true. OLAP's raison d'etre is to make information available quickly. If an information worker wants an answer then an OLAP system should, assuming the information is available, get it to him/her quickly. That is a noble purpose and indeed, getting access to data quickly is important for BI, but don't be fooled into thinking that BI and OLAP are the same thing. There are many ways to access data quickly and OLAP is just one of them.

Some argue that OLAP is not quick at all because of the inherent latency between information being produced and that information being viewed by an OLAP consumer. There are movements in the OLAP discipline to eradicate that latency but today it is by-and-large still prevalent.

I alluded to an inherent problem with OLAP when I said "assuming the information is available". If an OLAP system does not possess the information that an information worker requires then he/she would probably have to wait weeks (maybe months) for the information to become available in the OLAP system or go somewhere else to get it. Unfortunately this is sometimes a necessary evil.

Myth #2 : BI is all about tools

There are a great many software vendors out there (I don't need to list them here) that have built a business by selling tools and toolsets on the promise of delivering BI to a person's desktop. There are tools available for presenting information, manipulating it, moving it, changing it, storing it and aggregating it, and many of these tools badge themselves as "BI tools". That's a bit of a misnomer. BI is all about discovering information that wasn't known before; tools are just a means to that end.

It is possible to build a BI system or realise a BI strategy without buying a so-called BI tool. That isn't to say that these tools do not have their place, but they are not a panacea.

Myth #3 : BI is synonymous with a data warehouse

"If you are practicing BI then you need a data warehouse." Wrong!

A data warehouse can be thought of as a place to collect all the information that a business possesses so that it becomes a one-stop-shop for answering any question that an information worker may have. Collecting information into one place is not a pre-requisite to making use of that information. Data warehouses have risen in popularity on the basic premise that in order to make information useful it all needs to reside in one place; that is simply not the case.

"You have a data warehouse ergo you are practicing BI." Also wrong!

Once you have that data then you can start to do useful things with it but it is important to know that BI does not end at the data warehouse. Nor does it start at it. A data warehouse justifies its existence if the information it contains is both useful and discoverable. Chucking lots of data into a single place does not, on its own, achieve either of those things.

I talk a little more about data warehouses below.

 

There are vendors out there, both large and small, that have a vested interest in perpetuating these myths. Sales channels and marketing campaigns are built on top of them and these vendors have no desire to see their market share eroded.


Data Warehouses

What is a data warehouse?

Data warehousing advocates may be a little put out by what I said before but let me promise them that I don't think data warehousing is a discipline that should be disregarded. I have been spoonfed data warehouse theory (and Kimball theory in particular) since I started working in the BI arena in 2000 and that has helped me build a healthy appreciation of both the good and bad points of the theory.

First, let's explore why data warehouses came about by looking at the benefits they provide:

  • A data warehouse is intended to provide the fabled "single version of the truth". Data is often inconsistent across different systems of record (SORs) and the business logic that underpins the removal of data inconsistencies is tightly woven into the jobs that populate a data warehouse.
  • Data quality is improved.
  • It reduces impact on SORs by ensuring that BI information is not sought from these operational, transactional systems.
  • Data is generally structured in such a way that is optimised for analysis.
  • All data required for BI purposes is centralised in one place thus providing ease of access to that information.
  • Measurable data can be aggregated in a data warehouse to provide quicker analysis capabilities.
  • Security. Centralising data means that access to that data can be more easily controlled.
  • History can be more easily stored in a data warehouse whereas history sometimes needs to be removed from operational systems for performance reasons.
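
The point about aggregation can be illustrated with a minimal sketch. The sales rows, the product names and the build_daily_totals helper below are all made up for illustration; a real warehouse would do this in its load jobs or in an OLAP engine rather than in a Python dictionary.

```python
from collections import defaultdict

# Hypothetical detail-level sales rows: (date, product, amount).
sales = [
    ("2007-07-16", "widget", 120.0),
    ("2007-07-16", "gadget", 80.0),
    ("2007-07-17", "widget", 200.0),
    ("2007-07-17", "widget", 50.0),
]


def build_daily_totals(rows):
    """Aggregate detail rows once, at load time, into totals per (date, product)."""
    totals = defaultdict(float)
    for date, product, amount in rows:
        totals[(date, product)] += amount
    return dict(totals)


daily_totals = build_daily_totals(sales)

# A query now reads one pre-computed value instead of scanning every detail row.
print(daily_totals[("2007-07-17", "widget")])
```

The cost of the aggregation is paid once when the data is loaded, which is exactly why it also contributes to the latency discussed below.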

However, there are some inherent problems (not all prevalent all the time) with data warehouses that we cannot ignore:

  • Processing windows (i.e. the time available to move data into a data warehouse) are narrowing as traditional bricks and mortar businesses move to 24-hour sales channels. Quite simply businesses such as financial institutions and global online merchants do not have "downtime" in which their SORs are available for batch processing. Business models are evolving and BI must do the same to accommodate.
  • Business logic required to determine "truth" is buried within documentation and batch processing jobs. It is hidden from the people (i.e. the information workers) that are in charge of that business logic.
  • There is a latency period between data being created and then being made available to the information worker. Put more bluntly, real-time BI is not possible when a data warehouse is employed.
  • Building a data warehouse is expensive.
  • Data warehouses are often IT-driven solutions that do not fully embrace the real requirements of the business.

There may be other bullet points that can be listed here but I see these as the main points. I don't want to make you think that data warehouses are bad - far from it. There are a lot more positives listed here than negatives after all. Be aware of the negatives though, they are important considerations. As is my next point.

 

Papering over the cracks

I want to explore this issue of "single version of the truth" a little bit more. The use of a data warehouse in determining "truth" is popular but that ignores a more pertinent point - why is the "truth" not available in our SORs? If a data warehouse is used for this purpose then aren't we just ignoring the fact that data inconsistencies exist in our operational systems? Aren't we just "papering over the cracks"?

It does bother me that companies sometimes seem more willing to invest in systems that take data out of operational systems in order to cleanse it rather than cleanse it at source. Let me give you an example. On my current engagement my team is trying to answer the question "How much of a particular substance do we generate every day?" and so far we have identified three different SORs that can answer that question and each one provides a different answer. Furthermore, depending on which information worker you speak to each one of those answers is correct. Shouldn't we be trying to eliminate that data inconsistency by fixing the problem when data is gathered and thus creating the truth at source rather than trying to do it after the fact?
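
To make the problem concrete, here is a minimal sketch of what "three SORs, three answers" looks like. The system names, the figures, the unit (tonnes) and the tolerance are all hypothetical; they are not taken from the engagement described above.

```python
# Hypothetical daily figures (in tonnes) for the same substance from three SORs.
readings = {
    "plant_control_system": 1520.0,
    "finance_ledger": 1498.5,
    "regulatory_report": 1533.2,
}


def find_disagreement(values, tolerance):
    """Return the spread between the sources and whether it exceeds the tolerance."""
    spread = max(values.values()) - min(values.values())
    return spread, spread > tolerance


spread, inconsistent = find_disagreement(readings, tolerance=1.0)
if inconsistent:
    print(f"Sources disagree by {spread:.1f} tonnes")
```

A check like this only detects the inconsistency. Deciding which figure is "true", and fixing the process that gathers it, is still a conversation with the information workers.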


Quit talking about what BI is not and tell us what it is.

So, now that I've blown some myths apart and risked retribution by debunking some established theory, let's focus on what BI is rather than what it is not. I like using short snappy statements to describe what I'm talking about and one such statement that I could use here is "Business Intelligence helps people to make better decisions". The term "decision support system" doesn't do the rounds these days as much as it used to and I think that's a shame because it sums up pretty well what BI is all about.

A key tenet of BI is making information easily discoverable. Better visualisation techniques can help and the aforementioned BI vendors definitely have a part to play here. Different delivery mechanisms are also required. No longer are information workers tethered to a desk; they are mobile and they need their information to be mobile as well.

Information needs to be readily available and it is up to the information workers to define what "readily available" means. For example, if the information worker requires real-time data then straightaway a design decision has been made for you - you cannot employ a data warehouse.
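
One way to turn "readily available" into a design input is to compare the latency the information worker will accept with the worst-case latency of a batch-loaded store. The sketch below assumes a simple model that is my own illustration, not an established formula: a record created just after one extract waits a full batch interval for the next extract, plus the time that load takes. The function names and durations are hypothetical.

```python
from datetime import timedelta


def worst_case_latency(batch_interval, load_duration):
    """A record created just after an extract waits one full interval, then the load time."""
    return batch_interval + load_duration


def batch_load_is_sufficient(acceptable_latency, batch_interval, load_duration):
    """True if a batch-loaded store can meet the latency the information worker asked for."""
    return worst_case_latency(batch_interval, load_duration) <= acceptable_latency


# A requirement of "by the day after tomorrow" is met by a nightly load.
relaxed = batch_load_is_sufficient(
    acceptable_latency=timedelta(hours=48),
    batch_interval=timedelta(hours=24),
    load_duration=timedelta(hours=3),
)

# A requirement of "within five minutes" is not.
strict = batch_load_is_sufficient(
    acceptable_latency=timedelta(minutes=5),
    batch_interval=timedelta(hours=24),
    load_duration=timedelta(hours=3),
)

print(relaxed, strict)
```

The important input is acceptable_latency, and that number has to come from the information worker rather than from the architect.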

Above all, BI is about giving people the information they need to do their jobs. That statement gives rise to a multitude of techniques, technologies and architectures but that's OK because BI is all of those things and more.


A worthy example

To wrap up, I want to show you an example of what I consider to be the best BI implementation I have seen in a long, long time. Why so? Because it gives information workers what they want, it doesn't conform to a stereotypical BI architecture, and data is presented in a way that just makes sense. You won't hear the term "Business Intelligence" mentioned once in this video but that's OK. I want to blow away pre-conceived ideas about what constitutes a BI system and instead concentrate on what is important - making people's working lives easier. Yes, it's a Microsoft-heavy presentation but try and ignore that - concentrate on why this is a compelling BI demo.

Download the video from here: http://download.microsoft.com/download/f/4/6/f46669b6-f269-4fe9-8efa-4c6ae15fdea9/ms_en_v500_600x450_00.zip

You will want to start watching from about 38:30 but the good stuff starts after 44:00. Enjoy!


Conclusion

It is time for Business Intelligence applications to step over the boundaries that traditional paradigms have imposed upon them. The world of work is changing, people have different demands of their data, and delivering valuable information in a timely manner is more important now than it has ever been.

 

-Jamie

 

 


Published 18 July 2007 06:18 by jamie.thomson

Comments

 

Abdul Aleem said:

Hi!

interesting article..

I would like to tag this article on my blog.

Please reply at maaleemq@gmail.com

will be thankful to you for the same.

July 18, 2007 11:08
 

Phil Brammer said:

Oh boy.  Michelle IS NOT going to like this entry....  She's going to have a hard time keeping up with posts like this, Jamie!

July 18, 2007 16:24
 

SCB said:

Great video demo - on the data visualization & accessibility note - a colleague showed me this today and I thought it was relevant to the discussion....a Swedish professor shows a very cool data visualization tool.....

http://www.ted.com/index.php/talks/view/id/92

July 18, 2007 19:06
 

jamie.thomson said:

July 18, 2007 21:54
 

Michelle.Flynn said:

This is an excellent blog as always

I have given up trying to keep up with your blogs and am happy to let you continue attracting people to Conchango as they can see what a great team we have.

July 18, 2007 21:54
 

Phil Brammer said:

Indeed.  Indeed she did.  

Yep, looks like there's quite a bit of fun going on over on the other side of the pond.

July 19, 2007 01:47
 

jamie.thomson said:

Phil,

We've got a US office as well y'know. And I am currently working in California. hint hint.

-Jamie

July 19, 2007 02:51
 

SQL Server, BI and .NET said:

This morning I read an interesting article written by Jamie Thompson about BI, from

July 19, 2007 09:11
 

Robert Ham said:

Very nice piece.  Concise and right on target.  

Some thoughts I might add if I may.

BI is about knowledge discovery through data mining - discovering things that help you manage your business through the manipulation/analysis of data that you already have.  While helpful in organizing, presenting and processing data, none of the BI tools available will automatically generate knowledge.  It is good people (with the appropriate skill sets) defining the requirements, building the solution and analyzing the results that generate new knowledge.  The tools, whether simple or complex, just facilitate the process.  

The skill sets required to do complex BI include technical expertise in hardware and software platforms and data architecture/modeling, but also business process architecture/modeling, statistical analysis, computational finance and subject matter expertise for the relevant business.  Too often, it is the latter skills that are missing from a BI project.  A common view among business managers is that all you need for a successful BI project is the right BI vendor selection, a good IT infrastructure team (that can handle server set up, performance tuning and data collection) and a few business analysts who have experience in finance or product development or OLTP systems, but who have never been involved in a BI project.  The most crucial skill sets are often omitted.

Without the proper skill sets, the results and analysis that a BI project produces can be worse than useless and anything but benign.  If you get your model wrong, or pick the wrong key business drivers for balanced scorecards, you may end up with managers sitting in front of sophisticated dashboards, pulling the business levers that move the dials  in the right direction, all the while driving the business into the ground.

To expand on the overused analogy of Systems/Data architecture - architects are trained in structural engineering, which teaches them to evaluate and avoid failure points.  In system architecture, we've got a relatively good understanding of hardware/software structural engineering and failure points.  Given an amount of data and software’s performance benchmarks, we can build out the processing hardware, network and storage environment to avoid system failures (or, if we miscalculate, we can just buy more power).  However, when it comes to data architecture and the business logic that is built into the systems, we are lacking a discipline akin to structural engineering, or rather, the discipline is in its infancy.

My new axiom for BI is: "Your business intelligence solution will only be as intelligent as the people who design and use it."

July 20, 2007 17:48
 

Dan said:

I like this post.  It's always tricky when trying to identify specifics about a "cloud" entity like BI, but you do a good job.  BI is made up of many "specific" things, but no "specific" thing defines BI by itself.

For years, technology has excelled in helping the business "DO what they need to DO when they need to DO it" using technology.  I think BI solutions should be designed and built to help business "KNOW what they need to KNOW when they need to KNOW it".  The method doesn't matter as much as the resulting value to the business.

Thanks for showing me there are others out there who think the same as me. :)

July 20, 2007 19:25
 

SQL Musings said:

I actually started writing an article on this, but it's been sidetracked with travel and other stuff....

July 21, 2007 16:50
 

Mathew said:

I'm no expert, but someone else may be able to elaborate. Isn't the comment 'real-time BI is not possible when a data warehouse is employed' a bit inaccurate? There are many products/ methodologies in place to create real-time/ near real-time data warehouses?

Most of the other pitfalls of data warehousing (ie, poor business relationship, bad design), would hold true for any kind of BI solution; data warehouse or not.

Other than that, a great read.

July 23, 2007 00:43
 

jamie.thomson said:

Hi Mathew,

Firstly, thanks for taking the time to leave a comment (that goes to Dan and Robert Ham as well).

I think it depends on what your definition of real-time is. To me, it means having access to the data at the moment it is created (or something approaching that). In my head that means that if you have to move it somewhere before you access it, it's no longer real-time. If you say that near-real-time is as good as real-time then I'm not going to argue with you. My definition of real-time is no more correct and no more wrong than anyone else's so I guess it depends more on what the actual requirement is and what the acceptable latency is.

Probably the more pertinent fact to take away is that it doesn't matter what you or I define real-time to be; the definition of real-time (and acceptable latency) has to come from the end user. The information worker. And that could be different depending on which project you are on, perhaps even depending on which person on a project you speak to. Hence, a thorough requirements definition phase is a cornerstone of any BI project. That's a point that I tried to make implicitly above; perhaps I should have made it more explicit.

Thanks again for the comment. It's been occupying my mind since I first read it on my phone about 2 hours ago and those are the kind of comments I like.

-Jamie

July 23, 2007 03:29
 

Mohit Nayyar said:

Excellent Jamie………at last I can see something outside SSIS world (as you said)…..keep it up

To me BI is more to do with making sound decisions based on some input (years of experience and data - consolidated). Now on top of that we need some good tools to play with this data and obviously a few more tools to make this data available to decision makers.

So, there are a few “refined” terms that we use when asking for BI resources. To the technical world that is BI, and to the business world BI is just about the information they need to perform at their very best.

July 23, 2007 09:44
 

Common Sense said:

None of your "inherent problems with data warehouses" seem inherent or even very problematic.

SOR downtime

Hidden business logic

Real-time BI

Cost

IT-driven

These all seem eminently surmountable with the right technology, consultancy, and creativity.

July 23, 2007 16:17
 

Vivienne said:

The word GAZUMP has been inappropriately used and I assume you mean 'disproving':

gazump:

1. to cheat (a house buyer) by raising the price, at the time a contract is to be signed, over the amount originally agreed upon.  

Liked the rest of the article though - good for my management to read.

July 28, 2007 01:16
 

jamie.thomson said:

Hi Vivienne,

Fair point. I used it colloquially where perhaps I shouldn't have done. Hopefully the point came across.

Regards

Jamie

July 28, 2007 01:23
 

RaduP said:

I think real-time is a concept that may help drive the initial, and often crucial, data and warehouse modeling.

First, a little story: I used to fly satellites. That's real-time or closest to it conceptually; you get a screen-full of critical telemetry (the non-critical you can access by some navigation menu, but it's not in your face) flowing at 1Hz or less - one frame per second. If something goes wrong, usually there is a yellow or red color popping up instead of the previous nice and green telemetry: "ok, the transmitter is overheating; we have to switch to secondary hardware - we'll lose contact for about 30 seconds; type in the necessary commands NOW!" The bird may cost anywhere from $5M to $800M and if you make a mistake you may be writing your resume by sunset (ok, not really, but it happens). It's real time.

Other examples you could use are a Formula 1 McLaren tech watching the axle temperatures to detect a brake malfunction and give the driver the chance to adjust brake fluid pressure or such before the tire overheats (only by 2 deg. C) - his driver is fighting for the championship in Brazil, the last race of the season. Or even think football, when a second-tier coach misses the fact that one of his players is breathing harder than usual and is pale - isn't this real-time BI in the football business?

So if one is asked to design/implement a near real-time BI system, a storytelling scenario similar to the above (and there are many others you could think of) may bring the design discussion from the stratosphere to the ground: "It's market financial data, we need it in real-time!" There are core financial networks that update in near-real-time, but they are available to those very few on the market floor and adjacent systems - they make real-time decisions. The question here is: "what kind of DIFFERENT decisions would you make if you had the information ten minutes earlier at a cost of extra $$$?"

Where I'm going also ties into one of the comments above stating that the BI community lacks a defined position similar to that of an architect. I couldn't agree more. I find myself trying to convince BI clients and beneficiaries to stay simple, work with the clean and available data now, while we can work to make other data available. One iteration at a time. But they ask who am I (although silently at times) because I end up wearing many hats whenever I get involved in a BI project: DBA (albeit not a very good one), SSIS/ETL dev., OLAP and web dev sometimes. I don't know why, but BI involves so many aspects and I can't help trying to understand everything. I think at times that this is a mistake, but usually there aren't enough specialized people in each discipline involved in a full BI implementation so I'm needed anyway. And it's really fun to understand the data flow from inception to its use in decision making.

So what am I: a project manager, a programmer, an architect, anything in between? The biggest issue here is that I can't define the authority that I need to hold down the customer and say "from simple to complicated, one step at a time" without actually holding down the person because they lack the awareness of how important a BI implementation's intent is. This subject also helps understand why along with new disciplines and their acronyms come labels trying to box a profession; there are too many labels out there - do we need a new one? What is the BI profession?

So where did the real-time thing go? Well, from any position within a BI team one can always ask "why do you need this data fast" or "is it more important to have the data clean rather than seemingly in real-time?" Sometimes these questions need to be repeated and some people may get annoyed (so push within reason), but I found that it helps bring beneficiaries within the real-time perspective. The whole emerging discussion may raise much needed awareness about the respective implementation's intent.

I'll come back to this blog, glad I found it.

Cheers.

Radu

August 14, 2007 16:32
 

jamie.thomson said:

Radu,

I can't thank you enough for these wonderful insights.

I like the examples you mention. I have another. Credit Card companies have masses of data that they need to process in (near) real-time. Any lag in that process and they're in a situation where catching up is a damned hard thing to do. Every time I use my credit card I think of the technical machinations that I have put into motion, and it inspires me to build better and more reliable solutions. If they can do it? Why can't I?

-Jamie

August 14, 2007 16:54
 

Andrew Fryer said:

Good stuff and although I'm not in the partner bit of the blue monster, do look me up at the technet quarterly roadshows or if you're in TVP.

September 13, 2007 08:10
 

jamie.thomson said:

Hey Andrew. Coincidentally I came across your name and pic not 24 hours ago in the latest technet magazine and wondered "Hmm..who's he. Never heard of him before." :)

Might be a while till I'm in TVP seeing as I live in Texas but as and when I do I'll come and say hello :)

-Jamie

September 13, 2007 23:11
 

SSIS Junkie said:

Kalen Delaney just posted a great blog entry " Geek City: New geeky words ". She describes a new term

November 30, 2007 21:29
 

SSIS Junkie said:

As I get more and more embroiled in the Master Data Management (MDM) discipline I'm forming nascent opinions

February 2, 2008 00:35
 

SSIS Junkie said:

In July 2007 I wrote a blog entry called Business Intelligence, what is it and what is it not? which

June 15, 2008 19:40