The Galloping Data Architect

It's not a hobby, it's therapy.

Never Mind the Pollux....Here's the Last Briefing from Seattle

Well, it was comedy all the way at the start of day two.  I arrived in the main hall for the keynote speeches to be greeted by a 12-foot image of our glorious MD and serial blogger, Dorf, and his glamorous assistant David Portas in a video presentation of the work Conchango has done with McLaren F1 using the FILESTREAM component of SQL Server 2008.

Following that we had a speech from Ben Stein; you probably know him as the boring teacher in Ferris Bueller's Day Off..."Bueller.....Bueller...".  He's actually a respected economist and he gave a great speech about the state of the world's economies at the moment, throwing in a few jokes and discussing some of the major social and economic problems facing the US today.  Not what you expect from a keynote but compelling stuff. 

Having sat through three keynotes, I'm now concerned that I haven't heard any mention of two areas that are important to the work I've been doing for the last 3 or 4 years, one of which I thought would be fundamental to Microsoft's strategy for BI going forward.

The glaring omission is Master Data Management.  In 2007 Microsoft purchased Stratature, maker of one of the leading MDM tools on the SQL Server platform, which looked incredibly sensible at the time since it filled a big gap in Microsoft's portfolio for BI and data integration.

I'm left wondering where Microsoft see Stratature (now "Microsoft MDM") fitting into their vision for BI; clearly it's not near the top of the priority list. There have been several posts on these blog pages about what MDM is and where it fits into data integration architecture, and one of its key roles is in the enablement of data services as part of an SOA.  Data services are the second omission from the keynotes so far and I'll come back to them.

Microsoft seems to be playing safe with the data warehouse as the future of BI.  With Project Madison on the horizon finally bringing the ability to scale, that's not surprising.  It’s a steady strategy that is proven and popular and the components that make up an MDM solution have historically been buried in traditional data warehouse projects.  You can't do BI or build a data warehouse without using components of MDM - data mapping, cross reference, hierarchy management, data quality - but taking these concepts and building them out into a separate application promotes the ability to build more agile applications and integration technologies that can reuse the services that are captured in that separate MDM application.

I went to the MDM Customer round-table and got the first two questions in.  OK they were probably a bit out of context for a customer session but worth throwing in there.  The first was basically "where is MDM in Microsoft's strategy?" - The only response was from Stratature founder and compere for the session, John McAllister who acknowledged that "...we didn't get top billing".  My second question asked where the panel thought MDM lived in the world of Gemini where users were being encouraged to do their data integration on the desktop.  They looked at me like a taxi driver might look at someone running towards them at five-thirty on a Sunday morning in the rain wearing only a pair of underpants.  The panel went mute but John moved us along by answering that if there was a need to map between different reference data, then Microsoft MDM would be a good solution for that.  The problem there of course is that that doesn't quite fit the vision of "self service BI".  Anyway I was left with the distinct impression that we have two teams driving forward but not speaking to each other a great deal. 

Many of Microsoft's competitors are branching out into areas like EII.  Forrester paint a good picture of how EII works via the provision of an "information fabric" - a virtual layer that sits above a set of integrated data services, exposing data through a SQL interface (for bulk loading or performance requirements) or a web service interface.  Behind the fabric sits more traditional looking data integration where data is extracted from systems of record, integrated through reference to an MDM system and finally published to the interface.  There may be a database behind the scenes holding data (the cache or warehouse) but equally there may be data services that issue pass-through queries to systems of record.  The point is that the data presented to consuming applications is consistent, clean, integrated and the single version of the truth, and the consumers don't need to know the actual pattern that is used to get their data.  The added benefit of more real-time access to data is also important.
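For anyone who prefers code to boxes and arrows, here is a minimal sketch of that idea in Python. It is purely illustrative and is not taken from Forrester or from any product: the class names (`InformationFabric`, `CachedService`, `PassThroughService`), the entity names and the sample records are all made up for this example. It shows one consumer-facing interface sitting in front of two delivery patterns, a cache and a pass-through query that resolves keys via an MDM-style cross-reference.

```python
from typing import Callable, Dict, Protocol


class DataService(Protocol):
    """Anything that can return a record for a master identifier."""

    def fetch(self, master_id: str) -> dict: ...


class CachedService:
    """Serves records that were previously loaded into a cache or warehouse."""

    def __init__(self, cache: Dict[str, dict]):
        self._cache = cache

    def fetch(self, master_id: str) -> dict:
        return self._cache[master_id]


class PassThroughService:
    """Queries a system of record on demand, translating keys via a cross-reference."""

    def __init__(self, xref: Dict[str, str], query: Callable[[str], dict]):
        self._xref = xref    # master id -> local id in the source system
        self._query = query  # hypothetical call into the system of record

    def fetch(self, master_id: str) -> dict:
        record = dict(self._query(self._xref[master_id]))
        record["id"] = master_id  # publish under the master identifier
        return record


class InformationFabric:
    """Single entry point; consumers never see which pattern serves an entity."""

    def __init__(self) -> None:
        self._services: Dict[str, DataService] = {}

    def register(self, entity: str, service: DataService) -> None:
        self._services[entity] = service

    def get(self, entity: str, master_id: str) -> dict:
        return self._services[entity].fetch(master_id)


# Made-up sample data: customers come from a cache, orders from a live source.
fabric = InformationFabric()
fabric.register("customer", CachedService({"C1": {"id": "C1", "name": "Acme"}}))
order_system = {"0042": {"id": "0042", "status": "open"}}
fabric.register("order", PassThroughService({"O1": "0042"}, order_system.__getitem__))

print(fabric.get("customer", "C1"))
print(fabric.get("order", "O1"))
```

The consumer calls `fabric.get(...)` in both cases; swapping the order service from pass-through to cached later would not change a line of consuming code, which is the whole point of the pattern.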

This seems so far away from the DW-centric strategy coming from Microsoft that I'm wondering if it’s even on their radar.

I've heard people mention Project Velocity and Project Astoria as potential ways to enable data services.  I'm not going to comment on them here but neither was mentioned in any of the keynotes or sessions that I attended so I think if you're looking for them on the BI Strategy priority list, I'd start at the bottom and work upward.

Things took an unexpected turn when I went to Steve Walker's session on Building a BI Competency Centre.  Steve is a data architect in the Database Consulting group at Chevron so our paths cross frequently.  At the end of the slides, the audience woke up and it turned into an impromptu question and answer session and eventually Steve started fending questions in my direction.  This started when someone in the audience asked about the work done in Aberdeen on Project Seer to enable BI through Service Orientation.  "This is your lucky day, we have the author of that paper in the room....." and so it started.  It ended with Steve changing the deck on the fly to put my email address up in lights. I'm claiming that as a speaking slot and will expect my Microsoft BI branded denim shirt in the post. 

I was also approached in the Chevron session by someone who'd spotted my badge.  I was expecting another question on Data Services but all I got was "...do you work with Jamie....I'm a fan of his..?"

OK.  Timeout.  For those of you out there missing Jamie's blog, I can ease the pain slightly by referring you to an interview with the great man.  In summary, he “likes nothing more than getting the laptop out and hammering away", which I think we've all done in our time, however given his recent marriage his hammering is clearly taking priority over his blogging.  Don't worry, he'll be back as soon as his two week honeymoon in a caravan outside the main gate of the Redmond campus is over. 

The conference party was at the Qwest stadium, home of the Seattle Seahawks.  It was good to get to walk around the pitch and have a few beers with some fellow Brits from Contemporary, but events that encourage 3000 overweight men to queue up to play rock band karaoke aren't usually on my to-do list, and all the free beer in the world sometimes isn't enough to banish the crushing sense of despair I feel working in this industry.

Back to the conference.   Amir Netz gave us more details about Gemini, in a presentation focused around "That Guy".  Expect this to be coming to a conference near you, repeatedly over the next two years.  The spin was slightly different to the first day's keynote in that it focused on the overwhelming odds against the BI practitioner when facing "Those guys" that propagate Excel Hell throughout the organisation.  The premise of Genesis is that it’s much better to bring them into the BI effort in a controlled manner than to try and stop them, which given their numbers is impossible. 

 So some more technical details for you.

 In this demo, Amir pulled 100 million rows into Excel. Data is stored in memory in a SSAS instance tied to the spreadsheet.  In memory will be a new SSAS storage mode. There will be a lightweight set of ETL-like tools available within Genesis to transform the data on the way into Excel.  ETL on the desktop folks.  Any defined transformations will be re-applied when the data is reloaded.   Relationships between data will be determined by a heuristic engine in the background that will look for column name matches or similarities and might even look at the data to see if relationships can be inferred from the values of the data elements themselves.  Where the system isn't sure, the user will be prompted.
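No detail was given on how the heuristic engine actually works, so the following is only a guess at the general shape of such a thing, sketched in Python. Everything in it is an assumption for illustration: the function names, the equal weighting of name similarity and value overlap, and the `sure` and `maybe` thresholds are all invented. It scores each pair of columns across two tables on how alike their names are and how much their values overlap, accepts the confident matches and queues the borderline ones for the user.

```python
from difflib import SequenceMatcher


def _normalise(name: str) -> str:
    """Lower-case a column name and drop underscores before comparing."""
    return name.lower().replace("_", "")


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score for two column names."""
    return SequenceMatcher(None, _normalise(a), _normalise(b)).ratio()


def value_overlap(child_values, parent_values) -> float:
    """Fraction of distinct child values that also appear in the parent column."""
    child, parent = set(child_values), set(parent_values)
    return len(child & parent) / len(child) if child else 0.0


def suggest_relationships(tables, sure=0.9, maybe=0.6):
    """Score every cross-table column pair.

    tables: {table_name: {column_name: [values]}}
    Returns (accepted, ask_user), each a list of (score, left_column, right_column).
    The thresholds are arbitrary illustrative values.
    """
    accepted, ask_user = [], []
    names = sorted(tables)
    for i, left in enumerate(names):
        for right in names[i + 1:]:
            for lcol, lvals in tables[left].items():
                for rcol, rvals in tables[right].items():
                    # Either column may be the "parent" side, so take the better overlap.
                    overlap = max(value_overlap(lvals, rvals), value_overlap(rvals, lvals))
                    score = 0.5 * name_similarity(lcol, rcol) + 0.5 * overlap
                    candidate = (round(score, 2), f"{left}.{lcol}", f"{right}.{rcol}")
                    if score >= sure:
                        accepted.append(candidate)
                    elif score >= maybe:
                        ask_user.append(candidate)  # not sure: prompt the user
    return accepted, ask_user


# Made-up sample tables.
tables = {
    "sales": {"CustomerID": [1, 2, 2, 3], "Amount": [10.0, 20.0, 5.0, 7.5]},
    "customers": {"customer_id": [1, 2, 3, 4], "Name": ["a", "b", "c", "d"]},
}
accepted, ask_user = suggest_relationships(tables)
print(accepted)
print(ask_user)
```

Even this toy version shows where the worry lies: the thresholds decide how often the user gets asked, and a column pair that matches by coincidence will happily score well.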

 When published, the "sandbox" is posted back to the SharePoint server and the cube from then on is hosted in memory on the server, along with any other sandboxes that have been posted back.  The server will automatically configure multiple SSAS instances behind SharePoint and connectivity between the UI and its cube will be enabled through SSAS web services.  You'll be able to see who is using your stuff through some "socialisation features", such as the names of people using each spreadsheet, its popularity etc.

 Amir also demonstrated a nice looking admin UI - the operations dashboard - which showed a summary of all  the sandboxes by popularity, size, queries etc and enabled the ability to drill into specific areas to show how trends for each sandbox had changed over time.  There is also the capability to monitor the performance of the server in terms of both memory and CPU to diagnose potential problems before they happen.

Security will be in the file through SharePoint, not the data - so if you do pull those salary figures out of the HR database, do remember to lock it down in SharePoint.

Finally there will be an option to take popular sandboxes, hit an "upgrade to PerformancePoint" button and move them across to a regular SSAS instance.

The session was packed out and it seems to be universally popular among the attendees and I can't deny that we have seen some very slick demos.  

So that’s the end of the conference for me and the last of my briefs.  The headline act was undoubtedly Gemini which stole the show. 

There is going to be a lot of hype around Gemini and don’t forget this is two years away so maybe I’m expecting too much of a complete vision at this stage.  I have concerns around governance, without which we’re just automating bad practice.  The push for integration and transformation on the desktop is a worry, and if I’m sat in the audience next year I’d want to see Gemini plugged into something much more realistic to see how the UI works with several large dimensions as filters.  The stuff we’ve seen so far looks like EIS for the 21st century.

I’d like to see some slightly deeper BI questions thrown at the tool, ones that make the SSAS engine on the desktop think beyond SUM(Sales) reports.  We also need to see the realities of server requirements to cope with the creation of dozens of user-generated sandboxes.

The lack of clarity of where MDM fits in to Microsoft’s strategy has been a disappointing omission this week so I would like to see that given higher priority next year and hopefully by then we’ll know more about how Zoomix will be integrated into the stack.

I’ll end on the most interesting piece of news from a Conchango/EMC perspective – the DATAllegro acquisition is bearing fruit, and with EMC hardware part of the reference architecture, Conchango’s history in implementing large enterprise data warehouses and our gold partner status with Microsoft in BI, we’re in a pretty unique position to help people take advantage of the advances in the technology.

Finally, using all my powers of investigative journalism, I found out the correct name of the cover band on stage on Tuesday morning.  Wait for it.....The Dudley Manlove Quartet.  I couldn't make this stuff up.  It’s not a great name from a merchandising perspective, is it?  I'm still looking round for a tour shirt though - "Manlove!" on the front and "Sleepless in Seattle 2008" on the back.

Preferably in denim. 

Maybe sleeveless.

Published Thursday, October 09, 2008 7:12 PM by Anonymous


Why Microsoft isn’t “Into” Master Data Management « Charlie Maitland’s Blog said:

October 9, 2008 9:41 PM

Donald Farmer said:

Your blogs from the conference have had me laughing out loud, Mick.

Re Gemini and MDM - watch this space. We still have some things to reveal. It's early days in our discussions with the MDM team, but we do have some good things in the pipeline, although we're not talking about them for now.

Now I could do with a good drink to get the whole underpants image out of my head.

October 10, 2008 3:34 AM

Anonymous said:

Thanks Donald.  Crikey...if I thought about who actually reads this stuff, I probably wouldn't write it.

I look forward to you dispelling all my grumblings over the coming months!

October 10, 2008 7:14 AM

Mosha said:

Agree with Donald - your blogs were the most entertaining ones, although I don't think I got all the jokes - must be the elusive British humor - like why did you rename Gemini to Genesis - is this some kind of subtle Monty Python reference ?

October 10, 2008 6:50 PM

Anonymous said:

Lol!  Oops. That one was a mistake!  For some reason I was doing that all the way through writing them and thought I changed them back.  Put it down to a combination of jetlag, rushing to get them finished and a deep-rooted psychological obsession with Phil Collins for which I will be seeking immediate medical attention!

October 10, 2008 6:59 PM
