Introduction
What is Business Intelligence? The term "Business Intelligence" was popularised by Howard Dresner in the late 80s to describe an emerging discipline concerned with the discovery of information in an enterprise. Over the years the term has become diluted and ambiguous and thus has come to mean a great many things to a great many people. In this blog entry I want to explain exactly what I believe Business Intelligence (BI) is and, crucially, what it is not.
Gazumping Some Myths
Firstly, it is necessary to explode a few myths about BI. I use the term "information worker" herein to refer to a person that would make use of BI.
Myth #1 : BI is synonymous with Online Analytical Processing (OLAP)
I know one particular technology-focused commentator that believes this to be true. It isn't true. OLAP's raison d'être is to make information available quickly. If an information worker wants an answer then an OLAP system should, assuming the information is available, get it to him/her quickly. That is a noble purpose and indeed, getting access to data quickly is important for BI, but don't be fooled into thinking that BI and OLAP are the same thing. There are many ways to access data quickly and OLAP is just one of them.
Some argue that OLAP is not quick at all because of the inherent latency between information being produced and that information being viewed by an OLAP consumer. There are movements in the OLAP discipline to eradicate that latency but today it is by and large still prevalent.
I alluded to an inherent problem with OLAP when I said "assuming the information is available". If an OLAP system does not possess the information that an information worker requires then he/she would probably have to wait weeks (maybe months) for the information to become available in the OLAP system or go somewhere else to get it. Unfortunately this is sometimes a necessary evil.
Myth #2 : BI is all about tools
There are a great many software vendors out there (I don't need to list them here) that have built a business by selling tools and toolsets on the promise of delivering BI to a person's desktop. There are tools available for presenting information, manipulating it, moving it, changing it, storing it and aggregating it, and many of these tools badge themselves as "BI tools". That's a bit of a misnomer. BI is all about discovering information that wasn't known before; tools are just a means to that end.
It is possible to build a BI system or realise a BI strategy without buying a so-called BI tool. That isn't to say that these tools do not have their place, but they are not a panacea.
Myth #3 : BI is synonymous with a data warehouse
"If you are practicing BI then you need a data warehouse." Wrong!
A data warehouse can be thought of as a place to collect all the information that a business possesses so that it becomes a one-stop-shop for answering any question that an information worker may have. Collecting information into one place is not a pre-requisite to making use of that information. Data warehouses have risen in popularity on the basic premise that in order to make information useful it all needs to reside in one place; that is simply not the case.
"You have a data warehouse ergo you are practicing BI." Also wrong!
Once you have that data then you can start to do useful things with it but it is important to know that BI does not end at the data warehouse. Nor does it start at it. A data warehouse justifies its existence if the information it contains is both useful and discoverable. Chucking lots of data into a single place does not, on its own, achieve either of those things.
I talk a little more about data warehouses below.
There are vendors out there, both large and small, that have a vested interest in perpetuating these myths. Sales channels and marketing campaigns are built on top of them and these vendors have no desire to see their market share eroded.
Data Warehouses
What is a data warehouse?
Data warehousing advocates may be a little put out by what I said before but let me promise them that I don't think data warehousing is a discipline that should be disregarded. I have been spoonfed data warehouse theory (and Kimball theory in particular) since I started working in the BI arena in 2000 and that has helped me build a healthy appreciation of both the good and bad points of the theory.
First, let's explore why data warehouses came about by looking at the benefits they provide:
- A data warehouse is intended to provide the fabled "single version of the truth". Data is often inconsistent across different systems of record (SORs) and the business logic that underpins the removal of data inconsistencies is tightly woven into the jobs that populate a data warehouse.
- Data quality is improved.
- It reduces impact on SORs by ensuring that BI information is not sought from these operational, transactional systems.
- Data is generally structured in a way that is optimised for analysis.
- All data required for BI purposes is centralised in one place thus providing ease of access to that information.
- Measurable data can be aggregated in a data warehouse to provide quicker analysis capabilities.
- Security. Centralising data means that access to that data can be more easily controlled.
- History can be more easily stored in a data warehouse whereas history sometimes needs to be removed from operational systems for performance reasons.
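To make the aggregation benefit a little more concrete, here is a minimal sketch in Python. It pre-computes daily totals from transaction-level rows so that a later question reads a small summary rather than scanning every row. The data, the field layout and the function name are all hypothetical; this is only an illustration of the idea, not how any particular warehouse product does it.

```python
from collections import defaultdict

# Hypothetical transaction-level rows: (day, product, amount).
SALES = [
    ("2024-01-01", "widget", 10.0),
    ("2024-01-01", "gadget", 5.0),
    ("2024-01-02", "widget", 7.5),
    ("2024-01-02", "widget", 2.5),
]


def build_daily_totals(rows):
    """Pre-aggregate amounts per day, as a warehouse load job might."""
    totals = defaultdict(float)
    for day, _product, amount in rows:
        totals[day] += amount
    return dict(totals)


# Computed once at load time; later lookups do not touch SALES at all.
DAILY_TOTALS = build_daily_totals(SALES)
print(DAILY_TOTALS["2024-01-02"])
```

The trade-off is the one discussed in this post: the summary is only as fresh as the last time the load job ran.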
However, there are some inherent problems (not all prevalent all the time) with data warehouses that we cannot ignore:
- Processing windows (i.e. the time available to move data into a data warehouse) are narrowing as traditional bricks and mortar businesses move to 24-hour sales channels. Quite simply businesses such as financial institutions and global online merchants do not have "downtime" in which their SORs are available for batch processing. Business models are evolving and BI must do the same to accommodate.
- Business logic required to determine "truth" is buried within documentation and batch processing jobs. It is hidden from the people (i.e. the information workers) that are in charge of that business logic.
- There is a latency period between data being created and then being made available to the information worker. Put more bluntly, real-time BI is not possible when a data warehouse is employed.
- Building a data warehouse is expensive.
- Data warehouses are often IT-driven solutions that do not fully embrace the real requirements of the business.
There may be other bullet points that can be listed here but I see these as the main points. I don't want to make you think that data warehouses are bad - far from it. There are a lot more positives listed here than negatives after all. Be aware of the negatives though, they are important considerations. As is my next point.
Papering over the cracks
I want to explore this issue of "single version of the truth" a little bit more. The use of a data warehouse in determining "truth" is popular but that ignores a more pertinent point - why is the "truth" not available in our SORs? If a data warehouse is used for this purpose then aren't we just ignoring the fact that data inconsistencies exist in our operational systems? Aren't we just "papering over the cracks"?
It does bother me that companies sometimes seem more willing to invest in systems that take data out of operational systems in order to cleanse it rather than cleanse it at source. Let me give you an example. On my current engagement my team is trying to answer the question "How much of a particular substance do we generate every day?" and so far we have identified three different SORs that can answer that question and each one provides a different answer. Furthermore, depending on which information worker you speak to each one of those answers is correct. Shouldn't we be trying to eliminate that data inconsistency by fixing the problem when data is gathered and thus creating the truth at source rather than trying to do it after the fact?
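The situation above can be sketched in a few lines of Python. The three system names and the figures are invented purely for illustration (they are not the real numbers from my engagement); the point is only that the disagreement between SORs is cheap to detect, which makes it all the more odd that it is so rarely fixed at source.

```python
# Hypothetical daily figures for the same measure, as reported by three SORs.
READINGS = {
    "plant_system": 1200.0,
    "finance_system": 1185.5,
    "lab_system": 1210.0,
}


def find_disagreement(values, tolerance=0.0):
    """Return the spread between the highest and lowest reported value,
    and whether that spread exceeds the given tolerance."""
    spread = max(values.values()) - min(values.values())
    return spread, spread > tolerance


spread, inconsistent = find_disagreement(READINGS)
print(f"spread={spread}, inconsistent={inconsistent}")
```

A check like this only tells you that the sources disagree; deciding which one is "true" is still a business question.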
Quit talking about what BI is not and tell us what it is.
So, now that I've blown some myths apart and risked retribution by debunking some established theory, let's focus on what BI is rather than what it is not. I like using short snappy statements to describe what I'm talking about and one such statement that I could use here is "Business Intelligence helps people to make better decisions". The term "decision support system" doesn't do the rounds these days as much as it used to and I think that's a shame because it sums up pretty well what BI is all about.
A key tenet of BI is making information easily discoverable. Better visualisation techniques can help and the aforementioned BI vendors definitely have a part to play here. Different delivery mechanisms are also required. No longer are information workers tethered to a desk; they are mobile and they need their information to be mobile as well.
Information needs to be readily available and it is up to the information workers to define what "readily available" means. For example, if the information worker requires real-time data then straightaway a design decision has been made for you - you cannot employ a data warehouse.
Above all, BI is about giving people the information they need to do their jobs. That statement gives rise to a multitude of techniques, technologies and architectures but that's OK because BI is all of those things and more.
A worthy example
To wrap up, I want to show you an example of what I consider to be the best BI implementation I have seen in a long, long time. Why so? Because it gives information workers what they want, it doesn't conform to a stereotypical BI architecture, and data is presented in a way that just makes sense. You won't hear the term "Business Intelligence" mentioned once in this video but that's OK. I want to blow away pre-conceived ideas about what constitutes a BI system and instead concentrate on what is important - making people's working lives easier. Yes, it's a Microsoft-heavy presentation but try to ignore that - concentrate on why this is a compelling BI demo.
Download the video from here: http://download.microsoft.com/download/f/4/6/f46669b6-f269-4fe9-8efa-4c6ae15fdea9/ms_en_v500_600x450_00.zip
You will want to start watching from about 38:30 but the good stuff starts after 44:00. Enjoy!
Conclusion
It is time for Business Intelligence applications to step over the boundaries that traditional paradigms have imposed upon them. The world of work is changing, people have different demands of their data and delivering valuable information in a timely manner is more important now than it has ever been.
-Jamie
Further reading: