SQL Server 2014 CTP 1 Business Intelligence Summary
In an unusual but warmly welcomed move, Microsoft SQL Server Technical Marketing released the SQL Server 2014 CTP 1 Product Guide before the CTP was released to the public. The first CTP for SQL Server 2014 is expected to be released later this month. If you have not looked through SQL Server product guides before, they contain fantastic technical overviews, white papers, and presentations. I used to be on the Microsoft SQL Server Technical Marketing team that put this type of material together, and I can share that these resources are excellent for keeping up to date on what is coming next with SQL Server and for getting a feel for the key investment themes when future-proofing your technology roadmaps.
After downloading the latest product guide, I found the general decks and the BI 300 deck and took a quick peek through them, hoping to see some exciting news on mobile BI, predictive analytics, and other areas that I am dying to get an update on. Here are some notes on the overall BI investment themes that I did see in the CTP 1 product guide decks for the next release of SQL Server: In-Memory Across Workloads, Performance & Scale, Hybrid Cloud, HDInsight, and Cloud BI.
Note that "Cloud BI" (a.k.a. Office365 Excel and desktop Excel as the BI presentation layer) seems to be the BI message in this next release. This "SQL Server 2014 BI = Excel for presentation layer" message aligns with the "Basic Intelligence" Excel BI theme from the PASS BA Conference a few months ago. There was notably NO mention of Reporting Services that I could see in the BI decks or whitepapers... hmmmm, interesting.
SQL Server 2014 CTP 1 Product Guide BI themes:
- HDInsight Big Data (most of the emphasis and focus in this release)
- PDW Polybase (Remember Dr. DeWitt's keynotes, PASS 2011, NoSQL + SQL = SQL Server 2014)
- Excel, Data Explorer, PowerPivot, Power View and GeoFlow (heavy emphasis on Excel)
- BI Semantic Model, Analysis Services Tabular, and Multidimensional flavors (lightly covered)
- Reporting Services (notably missing? No Reporting Services or Azure Reporting Services mentioned)
- Data Quality Services and Master Data Services (mentioned but no coverage)
- SharePoint and Office365 (lightly covered)
If something is not covered or is covered lightly, that does not necessarily mean there are no investments. I know from various Microsoft product team public blogs that there are investments in Analysis Services Tabular and other areas that were only lightly glossed over here. However, I was truly surprised not to see classic Reporting Services covered in any of the decks, since Reporting Services is by far the most popular Microsoft BI application used across the globe today (aside from Excel, of course). Excel is not really an operational reporting tool, and operational reporting is needed everywhere and comes up in most BI conversations. In the SQL Server 2012 release, there was not a lot of classic Reporting Services enhancement aside from SharePoint integration; Power View was the main focus of the Reporting Services team at that time. Power View does not really replace operational reporting, especially as it stands today with no parameters, scheduling, or bursting... I'd love to know the future vision for classic Reporting Services, as would a bazillion other people. I get asked about it every time I do a Wave 15 / What is New in 2013 BI overview. I also saw one mention of the old Excel Data Mining Add-Ins, which I totally love, by the way. It was very nice to see them shown in here, and I do hope that means those great Excel predictive analytics add-ins will still be around in v2014.
It was quite clear to me that HDInsight and PDW are getting most of the love in 2014, judging from the sheer amount and depth of coverage in the product guide content. The above image of the Microsoft big data technical architecture components and how they all tie together is also key. I would like to see an updated technical architecture diagram like the one SAP uses (see picture below) so customers can clearly understand Microsoft's comparable offerings. Here is my hack of what I saw in SQL Server 2014 and how it relates to SAP's big data framework, which offers the same types of technical functionality.
The HDInsight content covered tooling such as PowerShell and cross-platform CLI tools, and the MapReduce material included conceptual overviews and code examples. I imagine this area will have the most significant learning curve for most of us, since it looks more like traditional application development than database development. If you haven't started learning about big data yet, you really, really, really should... it is still early adoption, but it is gaining serious market traction, will be mainstream soon, and thus will soon be a need-to-know skill.
Ah, PDW Polybase! Finally Dr. DeWitt's keynotes from PASS 2011 of combining the NoSQL world with the relational database SQL world come to life in SQL Server 2014. PDW Polybase allows native querying across PDW and Hadoop using regular SQL queries, integrating structured and unstructured data.
- SQL query access to data stored in HDFS, seen as "external tables" in PDW
- Basic statistics support for data coming from HDFS
- Distributed join querying across PDW and Hadoop tables
- Fully parallelized, high performance import of data from HDFS files into PDW tables
- Fully parallelized, high performance export of data in PDW tables into HDFS files
- Integration with various Hadoop distributions: Hadoop on Windows Server, Hortonworks, and Cloudera.
- Support for Hadoop 1.0 and 2.0
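To make the "external tables" idea concrete, here is a rough sketch of the Polybase pattern. All table names, columns, and HDFS paths below are hypothetical, and the exact PDW DDL may differ from this - consider it an illustration of the concept rather than the precise product syntax:

```sql
-- Hypothetical sketch: expose an HDFS file as an external table in PDW.
-- The head-node address, path, and delimiter are made-up examples.
CREATE EXTERNAL TABLE hdfsCustomer (
    CustomerId INT,
    Region     NVARCHAR(50)
)
WITH (
    LOCATION = 'hdfs://hadoop-head-node:8020/data/customer/',
    FORMAT_OPTIONS (FIELD_TERMINATOR = '|')
);

-- Regular SQL can then join the Hadoop-backed table with a PDW table,
-- which is the distributed-join scenario the bullet list describes.
SELECT s.CustomerId, s.TotalSales, h.Region
FROM   dbo.Sales AS s
JOIN   hdfsCustomer AS h ON s.CustomerId = h.CustomerId;
```

The appeal is exactly what Dr. DeWitt described: the analyst writes ordinary SQL and Polybase handles the parallelized HDFS access behind the scenes.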
Polybase will be a powerful and totally fantastic feature - if only I could get my hands on a PDW to play with it! I am seeing PDW demand pick up this year. For those of you that don't know PDW, it is Microsoft's Parallel Data Warehouse appliance. Although it is not covered in the BI section, one BI-related item is that the updateable columnstore index is in 2014 - a very nice enhancement to the existing columnstore index from 2012.
In the SQL Server 2014 release, we do see a lot of Excel where we used to see Reporting Services or PerformancePoint. I noted only one mention of PerformancePoint, in a recycled slide. The message is loud and clear = Excel with add-ins and Office365 Excel is the future for Microsoft BI. The 300 deck opened with Excel Data Explorer and a review of all the data source types that can be searched and explored.
I have a nice overview of Excel Data Explorer, an easy self-service ETL tool, in a previous blog if you are interested. Also in the Excel slides, there was basic coverage of Excel Web Apps, PowerPivot, Power View, and GeoFlow. No specific enhancements were really mentioned in these decks yet. I am guessing that future CTP deck iterations will add deeper-dive technical content for these areas - possibly even a deck for each one, like there used to be in past SQL Server release product guides.
Lastly, there was light coverage of the BI Semantic Model, Analysis Services Tabular and Multidimensional flavors,
Data Quality Services, and Master Data Services. Again, no specific enhancements were mentioned in any of the decks yet. My wish list for the next iteration of SQL Server 2014 Product Guide BI content includes:
- Expand upon classic Reporting Services or cover what will be the future for operational reporting
- Add more Predictive Analytics, Mobile BI, and Cloud BI content
- Expand Data Explorer and include things like powerful M scripting
- Add deeper PowerPivot, Analysis Services, DQS, and MDS material for what is new, wild, and exciting
That wraps up what I saw in the current SQL Server 2014 material from the latest product guide release. I highly advise keeping a pulse on these docs as they evolve throughout the next year.
The Beginning of Impact Analytix
After much reflection, thought, and consideration, I chose to pursue one of my life dreams - starting my own business. My new business, Impact Analytix, LLC, will consult and develop solutions with the BI, analytics, and technology platforms customers choose. Over the last 20 years I have seen many BI peers go out on their own, take a leap of faith, and SUCCEED! I have quietly envied their ability to play with varying database and BI platform technologies, decide when to accept or turn away projects, write books, blog, speak at amazing industry conferences, take long vacations, write off technology and continued learning as business expenses, and so on. Having saved enough money, lived modestly, gained relevant work experience, and with my husband's support and continual urging to start my own business for years now - I FINALLY DID IT. It is exciting and I am completely reinvigorated. I registered my own business, got an EIN and business accounts, set up Office365 (super fantastic for a small business), started researching technology partnership programs across various information technology vendors, looked at cloud options, set up the computer systems, interviewed numerous small business owners, and continue to plan and research as I get my first engagements started. Thanks to Mark Tabladillo, Lynn Overall, Mayra Harley, and others for being true inspirations and invaluable mentors for me in this new venture. I know there are many more of you out there that have made this leap too. I DO want to talk to you, hear your stories, and learn your lessons learned.
So that is my scoop on the beginning of Impact Analytix, LLC. Fear not, my next blog will be a technical topic again - like SAS JMP, Tableau European Conference news, Microsoft SQL Server's latest service pack with the long-awaited and finally released DAXMD allowing Power View reporting on top of Analysis Services multidimensional cubes (Reporting Services/SSRS SharePoint mode right now, not for Excel 2013 yet), or something else truly geeky.
New Data Explorer for Excel 2010 and 2013
I recently blogged about Self-Service ETL Tool Options. That blog was pretty much a high-level summary of current market solutions in that space, with links to further explore the reviewed tools and find more related tools. Having now spent some hands-on time with the new Excel Data Explorer Preview add-in, I'd like to supplement that post with additional feedback. To be fair to the other excellent self-service ETL vendors, Alteryx and LavaStorm, I will explore their offerings more deeply in future blogs too. I do know that Alteryx has released an upgrade that works with varied local R installations, so I can better evaluate their predictive features in my next review. Since Excel touches the lives of at least 2 billion people globally and the Data Explorer add-in is currently free, I wanted to first dig deeper into the new Excel Data Explorer Preview add-in, since most people will naturally start there and then decide if another tool is needed.
Just to refresh you, the Data Explorer Preview is an Excel add-in for Excel 2010 and 2013. You can download and install it from http://www.microsoft.com/en-us/download/details.aspx?id=36803. Data Explorer provides easy, self-service ETL type functionality within Excel.
It has a simple, non-technical wizard interface for loading, combining, and transforming both structured and unstructured data into Excel - it also has some complex scripting capability via a new language called "M". You can
read up on M in the M technical docs. Additional resources to ramp up on Data Explorer include an introduction tutorial and a more advanced one on combining data sources.
To get started, navigate to the Excel Data Explorer menu and choose Online Search or a data source type to find, query, prepare, cleanse, and load data into Excel. The current list of supported data sources is fairly large: Web Page, Excel or CSV files, XML, Text, Folders and file metadata, SQL Server database, Windows Azure SQL Database, Access, Oracle, IBM DB2, MySQL, PostgreSQL, Teradata, SharePoint Lists, OData, Hadoop HDFS, Windows Azure Marketplace, Active Directory, Windows Azure HDInsight, and Facebook. More data sources are sure to be added over time - I wish Twitter and LinkedIn, for example, were much easier to get to. TweetSQL, DataSift, R packages, and many other third parties can get you your social data to analyze. The Data Explorer Web Page and Online Search data sources were interesting and quite excellent. I can't begin to count the number of times I have had to search and then copy, paste, and restructure data from a web page where the figures were embedded in messy HTML tables.
A few of the other features that I liked and foresee being really invaluable to typical data analysts are the pivot/unpivot, scripting, joining, filtering, deduplicating, grouping, splitting, and transform capabilities. The Data Explorer formula language is pretty straightforward, though I do wish learning yet another new programming language was not necessary. Many of the Microsoft MVPs have great blogs on Data Explorer - Chris Webb, Paul Turley, Melissa Coates, Bill Pearson, Mark Tabladillo, Teo Lachev, Rob Kerr, and Jason Thomas are probably my top-of-mind favorite bloggers with good technical coverage. I know I am forgetting a few of the best of the best out there. One of the things that I found "hidden", or not best implemented, was the nice menu for cleansing and data preparation shown above. The "trick" to see/access this menu is to right-click on top of the data column header name in the Query Editor window. That right-click is not at all intuitive given the little drop-down arrow the user is naturally drawn to click on. If users do not know to right-click on top of the data column header, they miss out on the true beauty and power of all the transformation capabilities. I do hope that nitpick is fixed, or I fear many analysts may overlook Data Explorer's depth of functionality altogether or fear the script window being too complex. After visually choosing your transformations, you can click on the script icon to see the generated M scripts. If you are a hard-core scripting fan, you can write your own M scripts in the script window.
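To give a flavor of what that generated script looks like, here is a small hand-written sketch in "M". The file path, column names, and transformation steps are all made up for illustration - your actual script will reflect whatever steps you click through in the Query Editor:

```
// Hypothetical example: load a CSV, promote headers, filter, and group.
// The path and the [Region]/[Sales]/[Product] columns are invented names.
let
    Source = Csv.Document(File.Contents("C:\Data\Sales.csv")),
    FirstRowAsHeader = Table.PromoteHeaders(Source),
    FilteredRows = Table.SelectRows(FirstRowAsHeader, each [Region] = "West"),
    Grouped = Table.Group(FilteredRows, {"Product"},
        {{"TotalSales", each List.Sum([Sales]), type number}})
in
    Grouped
```

Each `let` step corresponds to one transformation you chose visually, which is why reading the generated script is a nice way to learn the language.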
Personally, I don't think Excel Data Explorer got the love it deserved at PASS BAC, most likely since the UI does not "sizzle" and there were some grumbles about the PASS 2012 Day 2 keynotes. Regardless, it is one of the best things to come out of Microsoft BI this year, and it is something to add to your bag of analytical tricks. Since Excel Data Explorer is more for personal ETL versus enterprise ETL, you may be asking when you should use it. This is where I think Excel Data Explorer makes the most sense:
- You need basic ETL on data sources with less than one million rows (refer to the Data Explorer limits)
- You are struggling with a mess of VBA routines or are using Access linked servers with merge queries, and so forth (There are much better ways to do data mash-ups today though I keep seeing the VBA and Access messes all over the place in the real world.)
- You need personal data cleansing
- You want lightweight web page content imports or text mining capabilities
I feel that Excel Data Explorer currently is not a go-to solution for the following scenarios:
- You need large scale data for ETL or want true web or text mining
- You need ETL scheduling, monitoring, automation, etc.
- You want a data quality solution that references third-party data sources, can be automated, and uses workflows, data stewards, reporting, etc. (use a real DQS tool for this type of thing)
- You need to merge the same attributes (i.e. Customer, Product) from different systems with system-of-record priority, workflows, data stewards, and automation (use a real MDM tool for this type of thing)
That concludes my "slightly" deeper dive into Excel Data Explorer. I know it is still pretty shallow from a "true techy" perspective but I do hope that it was enough to provide a better understanding for what Data Explorer is, what it can do, and where it fits.
Combined BI solutions with SAP for SAPPHIRE 2013
I recently went to SAP SAPPHIRE and had a great time seeing my BI friends from Microsoft, Simplement, Accenture, Deloitte, Hitachi, Ernst & Young, and numerous customer accounts. SAPPHIRE seems to be a much more heavily attended conference than SAP TechEd, which I attended last fall. To prepare for this event, I had to ramp up on combined Tableau with SAP solutions. In the past, I have worked with combined Microsoft with SAP solutions - to read up on those, check out PowerPivot
with SAP HANA and Microsoft BI with SAP. Now having seen both combined vendor solutions with SAP, I do see common patterns and I also see some areas where it sure does help to have a great SAP relationship for the combined solution connectivity!
Tableau directly connects to SAP HANA, SAP BW, and SAP Sybase IQ. Tableau can also be called from or embedded into SAP BusinessObjects - the Ablaze Group has an excellent video of combining Tableau with SAP BusinessObjects. Historically, SAP customers have been using Tableau with SAP BW for a while now. This year I am seeing a little more SAP HANA than I did last year as the database market share competition heats up. Thus I started my exploration of the SAP + Tableau experience with SAP HANA and found that it was super easy. The first thing I did was install the SAP HANA ODBC driver
on my laptop and connected to an SAP HANA instance hosted on AWS. From there the Tableau experience was exactly the same as it is with any other relational database - simple drag and drop visual analysis with the lightning fast speeds that SAP HANA in memory delivers. See the image above of my Mortgage Analytics testing (using fake data of course).
Although it was not required for Tableau, I also installed SAP HANA Studio so I could play with SAP HANA in general, load data, create stored procedures, and test the Predictive Analysis Library for a later post. For learning SAP HANA, I watched a few of the wonderful videos on SAP Academy. Working with SAP HANA feels like most other relational databases that I have worked with in the past: DBA/developer management studio, role-based permissions, tables, stored procedures, import/export, SQL, explain plans, sequences, and so forth. Some of the differences were the optional Analytics Function Library, Predictive Analysis Library, R integration (it relies on Rserve, so that seems more like a wrapper to me than true integration), and a nice data previewer with charting capabilities. More to come on SAP HANA, but for now back to Tableau + SAP.
Next on the list was SAP BW with Tableau. There are quite a few of the largest companies in the world already doing this, and they were sure to stop by the booth and ping me, so I had to be on my A-game! To get connected to SAP BW using the Tableau SAP BW connector, I found the Knowledge Base article quite helpful. This connector uses the OLE DB for OLAP provider and issues live MDX queries to SAP BW. The connection experience is similar to using Tableau with other OLAP data sources, such as Microsoft SQL Server Analysis Services and Oracle Essbase. I needed to install the SAP GUI for Windows and the OLE DB for OLAP provider from the SAP Service Marketplace. Tableau enables you to connect to a BEx Query or to an InfoCube in SAP BW - other SAP objects like MultiProviders and ODS objects must be accessed through a BEx Query that has the "Allow External Access to this Query" option selected on the Properties tab in the BEx Query Designer application. In Tableau, choose the SAP NetWeaver Business Warehouse connection type, enter your credentials, and then select an InfoProvider to connect to a BEx Query or InfoCube. Note that BEx queries often perform better than connecting to an InfoCube, since they can be designed to get specific information from SAP BW rather than everything. BEx queries can be defined using the BEx Query Designer. Once a BEx Query or InfoCube is selected, Tableau detects the available objects, dimensions, hierarchies, and key figures, as shown side by side in the image below.
Now you can get to the real fun and start creating amazing Tableau visualizations of this SAP data, blend it with other data, and publish it to Tableau Server for sharing - or even embed the analysis into a web site, web application, SharePoint, or an SAP reporting portal.
While on this topic, quite a few people asked about SAP Visual Intelligence, now called SAP Lumira, versus Tableau this week. I would advise you to check them both out side-by-side - test, play, and experience the vast differences in product maturity. Tableau has a good 10+ years of development head start on SAP in this space. Tableau is #1 in this class; its depth, breadth, and design are far superior. It is notable that SAP Lumira's embedded predictive analytics with SAP HANA, R, and writeback features make predictive work much easier than it is with Tableau today. SAP Lumira also has much nicer data cleansing options than Tableau right now. However, data connectivity and just about everything else in visual analytics is much better in Tableau than SAP Lumira. In time SAP Lumira may offer something more compelling, but I have not seen many enhancements since its debut last October at SAP TechEd. Bottom line = test them both and choose the best strategic analytics tool for you.
Hopefully this was a good primer on how to get started with Tableau and SAP. This happens to be a pretty popular combined solution at some of the largest companies in the world. If you have any questions or need further guidance about this topic, please don't hesitate to reach out to me.
Mobile BI for iOS, Android, Surface, and Smart Phones
Mobile, mobile, mobile BI! Powerful information delivery for anytime, anywhere analytics - getting answers to your questions when and where you need them most, in the field. If you follow me, you know that I am obsessed with mobile. I have loved mobile apps since the 90s. I have been dying to write this post for a long time now. Honestly, I was waiting to see if any news would be announced at the PASS BA Conference to add that info into this post, since I get soooooooo many pings about mobilizing Microsoft BI. Alas, crickets were chirping on the mobile BI topic - there was NO mobile news. So if you have pure Microsoft BI, the same limited, not ideal, but workable options that I showed you last year with HTML5, jQuery, SSRS 2012 SP1, Excel 2010/2013, Visio 2013, PerformancePoint, and SharePoint 2010/2013 with browser-based mobile for iPad are still around. However, Power View is still Silverlight and thus not mobile-friendly. GeoFlow is not mobile-friendly. The PerformancePoint right-click feature to access menus broke late last year when Apple disabled that menu navigation option. I also would not recommend building any new PerformancePoint content now that "Excel has been positioned as the future BI direction for Microsoft BI" at the PASS BA Conference. There are still a wide variety of third-party mobile BI options you can use - Tableau being my favorite.
I first learned of Tableau's mobile capabilities with Microsoft BI from Jen Stirrup, a Microsoft BI MVP, and since have fallen in love with it. Why do I love Tableau + Microsoft BI for Mobile BI?
1) Tableau's touch-optimized business intelligence is available across a wide array of mobile device types, browsers, and operating systems including Windows, iPad, and Android - both as native apps and browser-based. In the BYOD world we live in today, having wide device support is critical. Don't limit your mobile BI consumers to one device type.
2) Mobile users can view, edit, or even author new visualizations on their favorite mobile tablets - not only viewing data but also asking on-the-fly questions in meetings or while on site to get immediate answers.
3) With Tableau you can mobilize the most popular Microsoft BI data sources as well as many others.
4) Tableau mobile business intelligence emphasizes usability. It is touch and context sensitive. You can filter, scroll, pinch, and zoom, and it all just works without HTML5 coding hoops - there are NO coding hoops.
5) Tableau's author-once, view-anywhere approach means that every dashboard you publish is automatically touch-enabled.
7) You can use Tableau to easily combine data from many other enterprise platforms like SAP or IBM, not just Microsoft, to fully leverage AGILE, mobile data discovery alongside your existing enterprise BI implementations. Tableau is not a rip-and-replace vendor - Tableau values supplementing other vendors' BI solutions for best-of-breed results.
8) You can install the fully functional Tableau Desktop app on a Windows Surface Tablet if you are a true, purist, die-hard Microsoft fan!
9) You can embed live Tableau visualizations in PowerPoint. (I was mad when the Power View export to PowerPoint was removed in Excel 2013... boo!) You can have that favorite feature with Tableau. Oh, and one more bonus: you can play/animate all Tableau views over time periods to see trends and patterns over time on maps, in bar charts, line charts, scatter charts, combination charts, and even in background image charts. I have seen some amazing animated visual analytics applied to supply chain, finance, healthcare, retail, transportation, science, weather, and spatial analytics scenarios.
10) Tableau mobile BI has solid security. You can enforce existing security protocols and integrate with Active Directory via Tableau Server. If a user loses their iPad or Android tablet, simply disable their Tableau Server account and give them a new one. Sensitive data is not stored on mobile devices.
So that wraps up my Top 10 list for Why Tableau + Microsoft BI for Mobile BI. I am sure I could come up with many more reasons. To get started exploring and deploying Tableau mobile BI, check out your mobile phone or tablet app store to install and play with the latest free Tableau native mobile BI app samples. You can also download a fully functional, free trial of Tableau Desktop and Tableau Server to test it internally on your network - or - you can use the free Tableau Public option and publish your dashboards there to see what they look like from anywhere in the world that has an internet connection on your favorite mobile device.
Feel free to reach out to me if you want more information on this super hot mobile BI topic. Note that I will also try to provide a SlideShare deck and some videos soon, like I have done in the past.
Self-Service ETL, Data Quality, Cleansing, Prep and other Analytic Data Goodies
As promised, I wanted to share some of the options out there that I have been evaluating for self-service ETL, data quality, and cleansing. In this post I will briefly highlight the Microsoft Data Explorer Excel add-in, Data Wrangler, LavaStorm Analytics, Alteryx, and Microsoft Data Quality Services. There are actually quite a few of these tools out there! A great list of more options is available on KDnuggets, my favorite data mining site. Data prep is probably the most important aspect of getting a great predictive or analytic data model.
Lately there has been a lot of good Microsoft Excel MVP buzz about the Data Explorer Preview, an Excel add-in for Excel 2010 and 2013. Data Explorer seems to be a Microsoft light version of an Oracle Endeca-like offering. Data Explorer pricing and 2013 version support are not yet known - the preview is free. After the Rob Collie Excel 2013 PowerPivot rant, I wanted to be sure to clarify that Data Explorer pricing/version details are totally unknown to me right now. (Digression: also, Microsoft is not evil - wow, Rob's blog post got a huge uproar. Microsoft is not a charity - they just happen to do a lot of charitable things and build low-cost, less feature-rich, for-the-masses-type tools. Microsoft is a great company that makes a positive difference in the world. Yes, I am biased.) Data Explorer is absolutely worth checking out if you want easy, self-service ETL type functionality within Excel.
Data Explorer has a simple, non-technical wizard interface for loading, combining, and transforming both structured and unstructured data into Excel. The learning curve is minimal. The Data Explorer team did a great job of covering common data loading, data prep, and tweaking tasks for the most popular data and web data sources. They even have a social media connector and a big data connector. It is notably interesting that instead of using SQL to query and tweak data, they introduced a new language called "M" with Data Explorer. "M" seems pretty powerful. You can
read up on it in the M docs. Additional resources to ramp up on Data Explorer include an introduction tutorial and a more advanced one on combining data sources.
Another tool in this self-service ETL space is LavaStorm Analytics. I really liked what this group offers, but I do confess that now we are closer to a power user or enterprise ETL developer user experience. The LavaStorm Analytics tool user interface is much simpler than the enterprise ETL tools I have used before. The introductory YouTube video was quite helpful for getting a quick understanding of its capabilities and beginning to build ETL solutions right away, the same day. I liked the richness of the Lavastorm Analytics Engine functionality, the ETL component reusability, the pre-packaged analytic libraries to incorporate predictive analytic models and statistical data analysis, and the out-of-the-box R task/R integration. You do need to script the R yourself right now in the R task (bummer), but I am hopeful that will improve in the future. The ETL component reusability is a big plus, since analytic routines can be complex and often are shared across numerous analytic projects. I am not doing this great self-service ETL tool justice in a one-paragraph overview. Download it, watch the online free video training, and play with it for yourself. They have a free desktop version and several levels of paid desktop and server versions starting at $4,500. They also have other product offerings around specific analytic use cases like fraud, spend, and optimization.
Alteryx seems to be the "king" of the self-service ETL tools I reviewed, and they also had the highest pricing at ~$45,000 per seat and $15,000 for the personal edition. Alteryx was a highlighted vendor in the 2013 Gartner BI Magic Quadrant report. Alteryx is super feature rich and comes with embedded, prepackaged industry data sources like MOSAIC profiles, TomTom geospatial, Dun & Bradstreet, Experian, and other in-tool geographic, demographic, and business data that you get when you buy it. So the high price is really a bundled price for both the industry analytic data sources and the self-service ETL tool. Alteryx has bulk data loaders for SalesForce, Amazon S3, SharePoint, Teradata, Oracle, MongoDB, Hive Hadoop, and many other structured and unstructured data sources. They offer a read-write, hosted analytics application wrapper framework on their website. They also have nice R integration that did not require any R coding AND an R code task option if you want or need R code within your self-service ETL flows. I was not able to test the R functions myself since I had a version conflict with my local open source R installation. Alteryx's predictive product manager was kind enough to spend time with me covering the R features in depth and also shared that they are working on a vNext iteration that will support side-by-side installs of various R flavors. What I felt were true strengths with Alteryx are their deep location/geospatial analytics and the prepackaged industry data sources for mash-ups. Common but powerful analytic tasks - like getting the geocodes for plotting 15-minute drive times from a specific store location on a map - are very hard to geocode and then visualize with other tools; in Alteryx it is easy. They also just added a Tableau TDE destination to allow Alteryx ETL data to be easily sent to Tableau for deep visual, interactive analytics - see the image above for the type of analytic solution this awesome combination empowers.
Alteryx seems to be used by larger enterprise analytic groups - it is not for the occasional one-off ETL project or for pure ETL alone. If you just need ETL, there are much cheaper ETL tools like LavaStorm Analytics, SSIS, or Pentaho.
Last but not least, if you just want data cleansing and your company already owns SQL Server 2012, don't forget about SQL Server Data Quality Services (DQS) and the Data Quality Services add-in for Excel. I am going to copy the description directly from the official overview: "SQL Server Data Quality Services (DQS) is a knowledge-driven data quality product. DQS enables you to build a knowledge base and use it to perform a variety of critical data quality tasks, including correction, enrichment, standardization, and de-duplication of your data. DQS enables you to perform data cleansing by using cloud-based reference data services provided by reference data providers. You can also perform data quality processes by using the DQS Cleansing component in Integration Services and Master Data Services (MDS) data quality functionality, both of which are based on DQS." Now this option is not really self-service ETL, but I felt the need to mention it here since many self-service ETL projects really are basic data cleansing projects where DQS fits very well. There are tons of free videos and resources to help you get started with DQS.
Tip! Self-Service BI/Data Discovery Software Evaluations
Tip of the day: download free trials of self-service BI/data discovery tools and experience using them yourself BEFORE you buy. "Experiencing" the differences in HOW you build views with your own data sources, and playing with the depth of available analytic features, is really important since many of these tools look exactly the same when you only see the end-result views or dashboards. If it is really a self-service or data discovery tool, then you should be able to point to/load your own data sources, mash up/blend your own data sources, add your own time intelligence (YTD, QTD, MTD, Year over Year, Parallel Period), and build your own analytic views - all by yourself! Don't let BI vendor sales reps fool you by copying your data into "pre-made" self-service BI demos or POCs. The HOW in these use cases is a Critical Success Factor.
Tip! Analysis Services Cubes with Tableau Design Patterns
I found a "golden nugget" for using Analysis Services (SSAS) cubes with Tableau. There is a knowledge base article on the topic that is a little dated, BUT the attached workbook with common design pattern examples is really excellent. There seem to be a lot of customers using Analysis Services cubes with Tableau, or trying to figure out how to do it. Keep this handy.
KB Article: http://kb.tableausoftware.com/articles/knowledgebase/functional-differences-olap-relational
Design Patterns: http://kbcdn.tableausoftware.com/workbooks/adventureworks_cube_on_scdemo-dbs.twbx
Tableau v8 "Kraken" is Officially Released
So that wraps up another late-night blog post. I was not expecting this release today and was pleasantly surprised when I saw the tweets. My next planned blog post will be on self-service ETL, covering the new Microsoft Data Explorer, Alteryx, and LavaStorm Analytics.
Big Data Visual Analytics
Earlier this week I attended and presented at the TDWI Big Data Analytics Solution Summit. It was interesting to hear the other presenters, talk to groups using big data, and play hands-on with messy XML big data stored on a Cloudera Hadoop cluster for my big data visual analytics with Tableau demo. I first tooled around with big data over a year ago, after Denny Lee, Saptak Sen, and a few others inspired me to do so while at Microsoft, but I confess I felt overwhelmed by all the strange terminology of Pig, Hive, and Sqoop, and by the high level of effort required to be functional with it. Since then the ease of using big data has vastly improved. Today most big data vendors offer connectors like the Hive ODBC driver, and market-leading visual analytics tools like Tableau include additional user-friendly big data function libraries with those drivers, making the power of big data analytics accessible to a much wider audience of analysts and developers. As a result, this technology is spreading fast. If you are a business intelligence professional, you need to learn what big data is and how to work with it. In this post, I will share how simple it is to do big data analytics with Tableau and cover some cool, value-added benefits of using the Tableau-specific features with unstructured, messy big data objects - a common pain point with many other visual analytic tools.
To get started with Tableau and big data, you need a big data source. Tableau can connect to quite a few of them, including Hortonworks Hadoop Hive, MapR Hadoop Hive, Cloudera Hadoop, Cassandra, Hadapt, Karmasphere, and Google BigQuery. If you don't have one of these, most of the vendors offer downloadable virtual machines with a cluster already set up that you can use to learn; Cloudera Hadoop is the one I am using. Once you have a big data source, you need the related Hive ODBC driver installed on each machine running Tableau Desktop or Tableau Server, and you need to ensure Hive is set up. Hive enables SQL-like queries on Hadoop file systems; more about what Hive is and how it works can be found here. I have 64-bit Windows 7 and found that setting up the ODBC DSN was a little tricky: on a 64-bit OS I had to run "C:\WINDOWS\SysWOW64\odbcad32.exe" from the command line to get the right ODBC user interface to see the Hive driver and create a DSN to use in Tableau. Once I had that figured out, I could point to the DSN in the Tableau data source connection window, choose my connection type and schema, and start having some real fun visualizing the big data. Creating big data visualizations in Tableau was no different than with any other data source type - it was a drag-and-drop, pleasant, visual data discovery experience.
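Once the DSN exists, any ODBC client can use it, not just Tableau. As a minimal sketch (assuming the third-party pyodbc package and a hypothetical DSN named "HiveCDH" created via the 32-bit ODBC admin described above), this is roughly what pointing at the DSN looks like from code; the actual connect/execute lines are commented out so the sketch runs without a live cluster:

```python
# Hypothetical sketch: the DSN name "HiveCDH" and the weblogs table are
# examples, not part of any real installation.
dsn_name = "HiveCDH"
connection_string = f"DSN={dsn_name}"

# A simple HiveQL aggregate - the same kind of query Tableau generates
# behind the scenes when you drag and drop fields.
sample_query = "SELECT page, COUNT(*) AS hits FROM weblogs GROUP BY page"

# With a live cluster and the Hive ODBC driver installed, you would run:
# import pyodbc
# conn = pyodbc.connect(connection_string, autocommit=True)
# rows = conn.cursor().execute(sample_query).fetchall()
```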
Tableau shines in big data visualization: no programming is required, and there is no waiting on super slow, long-running MapReduce queries if you choose to use and schedule extracts, as most groups do in real-world use after the initial exploration and experimentation. In the 2012 TCC keynote demos, Christian Chabot showcased Tableau rendering over 800,000 data points with ease, blowing away other data visualization vendors that often fail to render more than a thousand data points and rely on uncontrollable data sampling techniques to hide those limitations. In a world of big data, the ability to easily see and analyze patterns, outliers, or exceptions can be the difference between struggling to survive and thriving in your industry. I talked about this strategic competitive advantage during my TDWI Solution Summit demo.
I also discussed how working with big data sources often means working with messy data, unstructured data, JSON, and XML files that can be exceptionally challenging to analyze directly on a Hadoop cluster. Most other visual analytics tools require programming or ETL before you can analyze this type of data. The key point I showed was the low level of effort required with Tableau in these common big data scenarios! I demonstrated how easy it is to do drag-and-drop visual analysis of XML files over a live, direct Hadoop connection, using Tableau's enhanced big data driver libraries for XML, JSON, and other string functions that simplify text processing, unpacking nested data, performing data transformations, and processing URLs. Tableau's big data driver features also support pass-through Hadoop UDFs and Java programs. This is another very important point to keep in mind, because one of the benefits of open source is being able to freely leverage the work of the worldwide open source developer community, further easing big data analytics. In Tableau, you can call Hadoop UDFs, often built as JAR files, by appending the JAR file location to the Initial SQL (add JAR /usr/lib/hive/lib/hive-contrib-0.7.1-cdh3u1.jar; add FILE /mnt/hive_backlink_mapper.py;). For more information on this point, refer to the Hive CLI documentation.
Another thing you can do with Tableau Initial SQL is tune the Hadoop MapReduce query settings. To force more query parallelism, you can lower the data set size threshold for a single unit of work (set mapred.max.split.size=1000000;). You can also tell Hive to use clustered, "bucketed" fields to improve big data join performance (set hive.optimize.bucketmapjoin=true;). Another Tableau Initial SQL big data optimization technique is to consider the shape of your data. Big data is often unevenly distributed, so MapReduce tasks can create hot spots where a small number of compute nodes get slammed with most of the computation work. The following Tableau Initial SQL setting informs Hive that the data may be skewed and to take a different approach when formulating MapReduce jobs: (set hive.groupby.skewindata=true;)
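Collected together, the scattered Initial SQL statements above form a single block you could paste into Tableau's Initial SQL box. Here is a small Python sketch that assembles it (the numeric value is the illustrative one from the text, not a tuned recommendation):

```python
# Hive tuning settings discussed above, gathered into one Initial SQL block.
hive_settings = {
    "mapred.max.split.size": "1000000",     # smaller splits -> more parallel map tasks
    "hive.optimize.bucketmapjoin": "true",  # use bucketed fields to speed up joins
    "hive.groupby.skewindata": "true",      # tell Hive the data may be skewed
}

initial_sql = "\n".join(f"set {key}={value};" for key, value in hive_settings.items())
print(initial_sql)
```

The generated text is exactly what would go into the Initial SQL dialog, one `set` statement per line.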
For ad-hoc big data queries, on-the-fly ETL, or data cleansing, Tableau Custom SQL can be used. In real-world implementations, a LIMIT clause is often added to Custom SQL statements during development and later removed when the Tableau views are deployed into production. There are many other tips and tricks for using Tableau with big data; I hope the above provides a taste of some of the things you can easily do to start getting up to speed. More information on this topic is available in the Tableau knowledge base and forums.
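As a sketch of that dev-versus-production pattern, here is a small hypothetical Python helper that strips a trailing LIMIT clause from a Custom SQL statement before deployment (the helper and query are illustrative only, not part of Tableau):

```python
import re

def strip_limit(custom_sql: str) -> str:
    """Remove a trailing LIMIT clause from a Custom SQL statement."""
    return re.sub(r"\s+LIMIT\s+\d+\s*;?\s*$", "", custom_sql, flags=re.IGNORECASE)

# During development, keep the LIMIT so exploratory queries return quickly.
dev_sql = "SELECT user_id, clicks FROM weblogs LIMIT 1000"

# Before publishing the workbook, strip it so production sees the full data.
prod_sql = strip_limit(dev_sql)
```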
How To Visually Analyze R Predictive Models in Tableau
One of the questions I am frequently asked these days when talking to analysts about Tableau solutions is "what does Tableau offer around predictive analytics?" There does not seem to be a lot of information available on this topic. There is a fantastic how-to tutorial on using R with Tableau in the Tableau community, including some great pointers from Bora Beran on using SQL UDFs with R. I have been continuously updating a SlideShare presentation as I learn more about this topic.
If I can find the time, I will write up a few white papers on various predictive solutions with Tableau alone and by combining Tableau with open source R, SAS, SPSS, Alteryx, Microsoft SQL Server and Excel Data Mining, Oracle Data Mining, Oracle SQL Extensions, SAP HANA, Sybase RAP, and other options. In the meantime, I hope my blogs on this fun topic will be helpful. In my January blog, I covered how to combine Microsoft SQL Server Data Mining with Tableau (live view here). In this blog, I want to cover the super popular R. Using R with Tableau is actually very EASY to do, does not require R scripting, or any scripting at all if you prefer simple point-and-click user interfaces.
Before we jump into R, keep in mind that Tableau has some native, out-of-the-box predictive features around Trending and Forecasting (new in v8) that do not require R or any other stats program. There are also multi-pass aggregation capabilities that can be used for some predictive processing scenarios and are easily overlooked. The Tableau Trending feature covers both linear and nonlinear model types, including linear, logarithmic, exponential, and polynomial modeling, with statistical significance details: model formulas, p-value, t-value, degrees of freedom, standard error, mean squared error, confidence bands, residual analysis, and related model evaluation information. The Tableau Forecasting feature is new in Tableau v8 and uses exponential smoothing algorithms with and without trend and seasonality detection. Both Trending and Forecasting can be accessed by right-clicking on a visualization in Tableau or via the Analysis menu. Anyone can easily use these Tableau predictive features - you do not need to be a data scientist, though understanding statistical significance helps when interpreting the generated model formulas, judging a good model versus a not-so-good one, and diving deeper into the model evaluation details.
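To make the Forecasting idea concrete, here is a tiny pure-Python sketch of simple exponential smoothing, the simplest member of the family of models Tableau's forecasting draws on. This is an illustration of the concept only, not Tableau's actual implementation (which adds trend and seasonality variants):

```python
def simple_exponential_smoothing(series, alpha=0.5):
    """Return smoothed values; the next-period forecast is the last smoothed value."""
    smoothed = [series[0]]  # initialize with the first observation
    for value in series[1:]:
        # each smoothed point blends the new observation with the running level
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

sales = [100, 110, 105, 120, 115]  # toy monthly sales figures
fitted = simple_exponential_smoothing(sales, alpha=0.5)
forecast_next = fitted[-1]
```

A higher `alpha` makes the forecast react faster to recent observations; a lower one smooths out noise.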
If you are looking for more sophisticated data mining models, a combined R and Tableau solution can be used. You can use R, with or without the nice point-and-click Rattle user interface, to generate predictive models such as classification (kmeans, decision tree, regression tree), clustering (hierarchical, ewkm, bicluster), association (market basket), neural networks, support vector machines, or other types of R models; export the R predictive model scoring output to a .csv file or a database; and visually analyze those predictive model outputs in Tableau just as you would any other data source. For most scenarios you can use the free versions of R and Rattle. For very large predictive models with hundreds of attributes and variables that surpass the free R version's memory limits, something like Revolution Analytics' R may be a better option for creating the predictive models and scoring.
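The overall pattern here - score rows with a model, write the scores to a .csv, then treat that .csv as an ordinary Tableau data source - can be sketched in a few lines. This stand-in uses a deliberately trivial, hypothetical scoring rule in Python instead of a real R/Rattle model, purely to show the shape of the output file:

```python
import csv
import io

def score(row):
    # Hypothetical propensity rule standing in for a real trained model's output.
    return round(min(1.0, 0.2 + 0.01 * row["visits"]), 2)

customers = [
    {"customer_id": 1, "visits": 30},
    {"customer_id": 2, "visits": 75},
]

# In a real flow this buffer would be open("scores.csv", "w", newline="").
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["customer_id", "visits", "probability"])
writer.writeheader()
for row in customers:
    writer.writerow({**row, "probability": score(row)})

csv_output = buffer.getvalue()
```

The resulting text file is exactly what you would point Tableau's "Text file" connector at in the walkthrough below.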
Let's walk through it. 1) Your data should already be prepared for data mining: a single data set in a flattened format with variables and transformed variables like those used in statistics. Dorian Pyle's Data Preparation for Data Mining book is my favorite for learning how to best structure data for predictive analytics. Regardless of your predictive analytic tool choice, preparing data is something that has to be done, and it is a bit of an art and a science. If none of this makes sense to you so far, you need to learn the basics of data mining to understand what you are doing, even if the tools make it easy to do. The classic book used to teach data mining 101 is Witten, Frank, and Hall's Data Mining: Practical Machine Learning Tools and Techniques. The data prep task typically takes the most time in any predictive project, much like ETL in a data warehouse project.
Your flattened, prepared data can reside in a variety of places; R can load data from .csv files, Excel, databases via ODBC, and other formats. 2) To load your data into R Rattle, choose the data source and click the Execute button. Now choose the fields/attributes/variables that you want to predict (Target) and the variables that are the influencers (Inputs) used in the prediction. You can optionally assign weightings to the inputs.
Now you can 3) start creating predictive models by choosing the desired predictive model type tab and model option.
Then you can 4) evaluate the model to check if it is a good or poor predictive model and continue experimenting until you have a good model.
When you are happy with your model's Lift/ROC and want to run new data through the model, score it for predictions, and visualize it in Tableau: 5) click the Evaluate tab, choose Type Score, Report Class or Probability, and Include All to generate the model scoring output as a .csv file. Then click the Execute button and select where to save the output.
Yay - now we have predictive model output that we can start exploring and visualizing with all the awesome features within Tableau that you already know: create a dashboard, publish it, or even explore it further on a mobile device like an iPad, Android, or Surface tablet. 6) To load the predictive model output, open Tableau, choose Connect to Data, select Text file, and "ta da" - now you can use Tableau with the predictive model output just like any other data source type. If you do want to incorporate some of the R visuals that are not in the box yet, like the decision tree, you can embed that image into your Tableau workbook or dashboard like I did in the sample pictures at the start of this article.
Now there is no excuse for not adding some advanced predictive modeling into your Tableau analytics. It is FREE, quite easy to do, fun, and powerful!
I do confess that this blog post was rushed - written on a plane trip from Washington DC to Tampa due to high demand after I mentioned doing this for a recent POC. There will be much more to come on this topic when I have a few spare moments to breathe! In addition to the white paper, I have a SlideShare presentation with a sample live demo of a Tableau packaged workbook from my previous post so you can play with this yourself. Enjoy!
Thoughts on the Gartner 2013 Magic Quadrant for BI Platforms
***See a visualization of the rankings changes over time posted here.***
Tableau is ranked as a Leader in the 2013 Gartner Magic Quadrant for Business Intelligence report.
Customers rate Tableau #1 for ease of use and fastest development times, and it has unmatched big data visualization. Told ya so... I couldn't resist! I literally just said that in my last few posts. I knew that Tableau had matured into a true beauty for data lovers, the upcoming version 8 is simply amazing, and apparently the Gartner industry analysts see it in a similar light. Data discovery is now mainstream, with every vendor trying to emulate what Tableau has been doing for years. It should get interesting down the road, but Tableau appears to be way ahead of this trend and best in class in this niche space. I told my Microsoft friends that Tableau was complementary, and indeed Gartner confirmed that important point as well today. I am sooooo excited that both of my loved BI platforms were seen as stars.
This year the Gartner Magic Quadrant for Business Intelligence report took on a whole new level of special meaning for me, since I was on the technical product marketing team that submitted Microsoft's response. It was an excellent experience: working hard with both the engineering and marketing teams, huddling up and rallying around the best answers with brilliant engineers who don't give their solutions enough of the rah-rah they deserve. The teams did a fantastic job responding, presenting, and supporting the analysts' reviews. It is a thorough process.
After reading through a few of the BI vendor detail comments, I wanted to share a few notes and takeaways. The first is that ease of use and total solution time-to-value are typically the most important decision factors and need to be included, or "scored", in BI vendor evaluations. Time is money, period. I see intelligent people overlook that in their RFP/RFI evaluation matrices. The happiest customers have BI platforms that they enjoy working with and that are agile and supportable. I did appreciate that, for the second year in a row, QlikView had the worst/longest development times of any vendor reviewed. Having been in a global BI technical marketing role and listened to win/loss customer reviews around the world, I heard my fair share of customers cite QlikView pains, excessive development times, and total cost of ownership far exceeding what was estimated/budgeted. If you are reviewing QlikView, make them show you how it is built with the scripts and time intelligence, how to make changes, add new columns, and do mash-ups, and have them test real-time data, incremental loads, load testing, and so forth. The next item I want to bring up is cloud BI. Many of the vendors were given kudos about cloud BI, but in the Spotfire commentary section I noted a point that "67% of the survey population at large will never put their enterprise BI in the cloud". That is an interesting figure that I'd like to understand better. A couple of vendors are betting big on the cloud, including Birst and Microsoft. I do think it is interesting that in 2013 several of the pure in-memory players are finally waking up to the direct query advantages in the world of big data. The mobile BI commentary in the MicroStrategy section was spot on with what I have seen in the market over the past year.
The report noted that 50% of MicroStrategy customers were piloting mobile BI in 2012, and that MicroStrategy was being invited into competitive bids at companies that see mobility as a strategic imperative, succeeding in replacing long-established BI vendors that lag behind in mobile BI. It is a shame that innovative MicroStrategy tumbled this year. It is a great but expensive and complex solution that takes a bit of time to get the shared metadata layers ready before you can enjoy Visual Insight and other reporting. Congrats to a peer of mine at IBM that again rocked this report. I used to develop with Cognos, got some Cognos v10 books this year to play around with, and I do like some of the things I am seeing from them. SAP...oooh. I went to SAP TechEd last fall and left there feeling exuberant about Microsoft BI. SAP is all over the place: obsessed with SAP HANA, alienating acquired Business Objects shops that are not SAP ERP, and, much like Qlik, SAP customers cite long project development times and pains. I did like SAP's Predictive Analytics with R, but I didn't see any mention of it in this report. Spotfire will be one to keep an eye on this year! I do like what they are doing with predictive and R. Since they are tied into TIBCO, I wonder how much of the middleware may become a future requirement. Lastly, I do find it a bit intriguing that vendors with a narrow scope of offerings were included and rated side-by-side with vendors that have a plethora of functionality across all the BI lifecycle areas. This report should be reviewed along with other reports on the specific functional areas of interest to get apples-to-apples comparisons across vendors.
I could ramble on and on but I do need to get some sleep. To wrap up, congratulations to all the BI Leaders. The rapid BI industry development and vendor progression in this space is thrilling. I can't wait to see how this plays out in the upcoming years.
Combining the Best of Microsoft BI and Tableau
The past month I have been fielding a lot of calls and email inquiries - feel free to email me if you want to chat. Let me take this opportunity to reassure all of my Microsoft friends that I continue to LOVE Microsoft deeply, with all my heart and soul. As mentioned in my previous post, in 2013 I will be advocating the best of combined Microsoft + Tableau solutions, for many reasons. One of the biggest is that Tableau has mature, touch-friendly, "browser-based" mobile BI for a wide variety of mobile device types. Tableau published content is mobile aware and touch optimized with no extra coding or dual authoring - a publish-once, view-anywhere model. Tableau also has "native" mobile BI apps for both iOS and Android - by far the most popular BYOD mobile devices in the market today. I anxiously await the Microsoft Surface Pro and I do hope there will be market uptake of it; if you do have a Windows mobile device, you can use Tableau's "browser-based" mobile BI option. Anyone who has followed me through the years knows that I have tooled around with mobile gadgets since Pocket PC and SQL Server CE in 1999/2000. I also need to apologize to Jen Stirrup, because she was recommending this Microsoft + Tableau mobile BI combination last year and I kindly asked her, as a Microsoft MVP, to downplay it... now here I am, only a little over a year later, singing the praises of the solution she first evangelized. I am sorry, Jen - you did find a great solution and I was slow to embrace it.
Why Tableau? Tableau has won many accolades from Gartner, Forrester, and other analysts as the BI industry's data visualization "sweetheart". It has been best in class for a few years now. It also won the top award in SQL Server Pro magazine's 2012 Best BI Reporting product category. Tableau is data and BI platform vendor neutral and allows optional in-memory or direct connections. Tableau customers totally love them - much like I totally love Microsoft. I have never in my career replaced Tableau or heard bad things about them from customers. I have replaced QlikView quite often and, more than once, heard cries for help only months after a QlikView purchase. Spotfire seemed interesting to me, especially with the R integration for predictive, but I didn't hear the amazing kind of customer feedback from Spotfire customers that I heard from Tableau customers. I also figured that I'd find a way to combine R with Tableau. When I tested Tableau I found it user friendly and functionally deep for analytics, with gorgeous ad-hoc data visualization and a huge, dreamy array of native direct data source connectivity and visualization options. The mobile BI aspects for iPad and Android sealed the deal, so to speak.
For over 15 years I have been a consultant recommending, building, and delivering the best mixed-technology solutions. It is only in the past few years that I became vendor specific due to my role. I am a true engineer who listens to her customers, admits when she is wrong, calls you up when you are wrong, and likes nothing more than exploring what all the technology vendors offer so she can recommend the best solutions. It is incredibly wonderful to have customers that evolve into true friends as the years pass... I thrive on hearing that a solution I built or recommended a zillion years ago is still up, running, and doing the job today. It is no secret that I am motivated by making a difference - not by making the most money. If I can help people and be intellectually challenged while doing so, then it is a fantastic day. Thank you to all of the people who have been reaching out to me. This is what I have been up to lately. I still love Microsoft and I am still a huge data geek.
Happy New Year!
Forbes' 100 Must-Watch Trends in 2013. OK, so it is not my article, but I did feel it was a "must include" for my readers to stay at the top of their professions. I especially appreciated "22. Data Scientists: The New Hotshots" (Tableau is great visualization software for data science types), along with 59. Mobile-Optimized Goes Mainstream, 72. Personal Data Ownership, 77. Responsive Web Design, 80. Self-Service, and 88. Tablet Shopping.
Predictive Data Visualization with Tableau
One of my New Year's resolutions was to drink a little less Microsoft Kool-Aid and become more open minded to combined analytic solutions that have best in class offerings. Don't take that the wrong way, I still LOVE the Microsoft data platform, SQL Server, confess to using Microsoft Office daily and even pay for an Office365 personal account - BUT - I also feel it makes a lot of sense to supplement Microsoft products when there are voids or the basic, "for the masses", features become a limiting factor or even a liability. I believe that BI and analytics tools are strategic market differentiators. Organizations that have the best BI and analytics will have a competitive advantage, period.
Let's look at a scenario where I was unable to easily explore or visualize a large data set in Excel 2013. I needed something much more robust that could deliver interactive data visualizations, at the speed of thought, with data sets of hundreds of thousands or millions of records - a scenario that even the latest Excel 2013 and/or Power View simply can't handle, or compromises on with data sampling techniques. Power View starts sampling at 1,000 data points to approximate the display of large data sets; Tableau, on the other hand, rapidly rendered over 800,000 data points with ease in the last TCC keynote demos. The Microsoft Yahoo! case study and Klout case study are other excellent scenarios where it just made sense to supplement the largest Microsoft data platform projects with Tableau for the large-scale visual analytics.
Having said that, let's get on to what the title of this blog promises. I wanted to take the Tableau v8 "Kraken" beta and combine it with Microsoft SSAS prediction queries (DMX) to showcase the power of combining two best-of-breed offerings. To see this in action, I posted a live demo on Tableau Public.
Looking back on 2012...
WOW! 2012 was an incredible year. I wanted to fondly look back upon the entire year of 2012 before diving into what is new, wild, and exciting in 2013. This past year has been amazing personally and professionally, surpassing anything that I could have dreamed up. For many data science and analytics lovers, 2012 was a HUGE year. Most BI vendors had significant and more frequent releases. There are rapid BI industry and player changes all around us, with never-before-seen speeds to market! New and old BI vendors are literally releasing on-premise features every couple of weeks, and almost all BI vendors are continuously rolling out native mobile app updates. Traditional BI is evolving. Self-service BI has matured and is mainstream. Consumerization of IT has altered the application, operating system, and device landscape, with iOS and Android now dominating market share. Microsoft finally got serious about the mobile device market with Windows 8 and the Surface tablet at the end of 2012. Mobile BI has become a true "need to have" in the continually fragile, highly sensitive global economy. Cloud, SaaS, niche, and open source BI app vendors are popping up everywhere and coming to the table with compelling solution options that can't be ignored. Big data, social, mobile, and in-memory have taken front and center stage in daily customer conversations all year long. Predictive analytics and data mining gained a lot of momentum with the big data explosion. That R book I bought years ago got dusted off, and I am ecstatic to be applying knowledge from my old UCSD Data Mining cert program - if only I could finish that data prep white paper I started in 2011. There is growing "cloud noise" these days across all vendors, not just BI vendors, with Amazon, Salesforce, Google, Oracle, IBM, SAP, and Microsoft Azure all battling for early market share. Current cloud BI vendors see some early adoption in the small business market segments, but most enterprise customers are not quite ready to make the cloud BI leap of faith into the unknown world of cloud data management, latency, reliability, security, compliance, and costs.
In January 2012, I updated my blog and web site to be mobile device friendly using HTML5, CSS, and media queries. Then I shared how to use those same techniques with the SharePoint 2010 December 2011 CU to enable iPad display of Microsoft BI content. In February 2012, the Gartner BI Magic Quadrant report was released, with a few niche BI vendors shifting up and traditional BI vendors moving down in the rankings due to speed to market on high-demand mobile, social, in-memory, and self-service innovations. Overall BI demand was EXTREME - it felt a lot like the late-90s dot-com bubble to me, but this time for BI-related apps. SAP HANA, the IBM Cognos 10 suite release, QlikTech's relentless telemarketing, MicroStrategy's mobile BI road shows, and Tableau's big data visualization excellence were everywhere. I won't forget the 18-20 hour days trying to keep up with all the industry changes while also helping my customers effectively utilize and strategically plan future BI solutions. In March 2012, SQL Server 2012 was released. Then came several spring SQL Saturdays, numerous user group meetings, TechEd North America 2012, TDWI, my SlideShare data visualization presentation going "viral", and then a call regarding an awesome global BI technical product management role. In July, I had a Donald Trump "Apprentice-like" experience with a last-minute, major-event keynote presentation build/demo. There is nothing like the adrenaline rush of a bleeding edge technology keynote demo being telecast globally. Also in July, Wave 15 of Office365 and Office 2013 were publicly previewed, further changing the dynamics of Microsoft BI-land since PowerPivot, and now SSRS, has continued the merge of BI into Excel. Event planning-fest dominated August and September... In October, HDInsight was revealed at Strata and Azure IaaS was covered at BUILD. In November at PASS and the SharePoint Conference, additional Excel-focused BI futures were unveiled with Data Explorer (self-service SSIS), GeoFlow (a stand-alone Excel add-in), and native mobile app sneak peeks for iOS and Windows. Hekaton, in-memory OLTP for SQL Server, and the update-capable columnstore index future were also shown... and last but not least, another bleeding edge keynote demo of DAXMD with Excel 2013, just in case I didn't get my fill of adrenaline from the summer events. In December, event planning-fest for 2013 continued while many other changes began to take shape.
This past year was exceptional. I absolutely loved helping people
take BI to the next level. I loved waking up and making a difference.
I loved working with MVPs, partners and the talented engineers that
build Analysis Services, Excel, PowerPivot, Power View, Reporting
Services, PerformancePoint, and other Microsoft BI products. Without
question, 2012 will be a year that I will always remember and cherish
deeply. I am humbled by all the innovations I see happening all
around me. The BI world is changing quickly...and with all these
changes, I too must continue to learn, grow and evolve.
Unstructured Data Mining/Predictive Analytics with Mahout
I came across some tutorials on Hadoop on Azure, now called HDInsight, with Mahout. If you don't know what Hadoop, HDInsight or Big Data is and you are a data pro then you'd better start learning about it. You can start digging into the HDInsight hands-on tutorials or spin up your own Windows Azure trial account.
Although I knew a little bit about Hadoop, HDInsight and Big Data, Mahout was completely new to me. I have heard of and played around with R in the past, and R is getting quite popular in the data analytics world. For those of you like me who did not know about Mahout, it is an Apache top-level project (TLP) for creating scalable machine learning algorithms. At first glance, Mahout seems to have a pretty nice library of algorithms including classification, clustering, neural nets, Bayesian methods, support vector machines, item-based collaborative filtering (recommenders), dimensionality reduction and principal component analysis. Since I have been fascinated with predictive analytics for going on ten years now, I could not resist sharing these cool new analytic tutorials the moment I saw them. After you get HDInsight figured out, you can get started tooling with the Mahout on Azure tutorials. Classification is typically the easiest technique for non-data-mining experts to understand, validate and learn - that one may be the best Azure with Mahout tutorial to try first.
What is particularly exciting about this specific combination, big data + predictive machine learning, is not the cloud aspect at all but rather the "easy button" machine and infrastructure spin-up for learning very sophisticated, powerful, unstructured data analytics...the very same type of unstructured data analytics I'd bet governments, intelligence agencies, the IRS, banks and other data-savvy institutions are using to protect citizens and target the good/bad guys in the world of exploding data. Data mining is coming to the forefront of analytics now that historical reporting and self-service reporting have matured. Big data further increases the need for more sophisticated data analytics and the people who know how to do it.
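For intuition on what the classification tutorials are doing under the hood, here is a minimal, single-machine sketch of a multinomial naive Bayes text classifier in Python - the same family of algorithm Mahout distributes across a Hadoop cluster. The corpus and labels below are made-up examples, not taken from the tutorials.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesText:
    """Tiny multinomial naive Bayes text classifier with Laplace smoothing.

    Illustrates the kind of classification Mahout runs at Hadoop scale;
    this toy version is purely in-memory and single-machine.
    """

    def __init__(self):
        self.class_docs = Counter()              # documents seen per class
        self.word_counts = defaultdict(Counter)  # class -> word -> count
        self.vocab = set()

    def train(self, label, text):
        words = text.lower().split()
        self.class_docs[label] += 1
        self.word_counts[label].update(words)
        self.vocab.update(words)

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, float("-inf")
        for label in self.class_docs:
            # log prior + sum of Laplace-smoothed log likelihoods
            score = math.log(self.class_docs[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in words:
                score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Train on a toy spam/ham corpus, then classify new text.
nb = NaiveBayesText()
nb.train("spam", "win free money now")
nb.train("spam", "free prize claim now")
nb.train("ham", "meeting agenda for tomorrow")
nb.train("ham", "project status report")
```

The math is the same whether the counts come from four sentences or four billion log lines; scaling the counting step is exactly the part Hadoop/Mahout takes care of.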
PASS and SharePoint Conference 2012 BI Highlights
GeoFlow Excel Add-In Heatmapping
It literally has been an "eventful" month or two - pun intended! After being on the road for multiple Microsoft product launches, SQL Saturdays and attending SAP TechEd, I am ready to get back to the BI blog. I also made a deal to get some extra BI coverage at events by stepping back and letting the in-memory database engine get all the spotlight coverage for a few weeks. My beloved BI seems to always be asked to step aside... that is, until the new PASS Microsoft Business Analytics Conference, April 10-12, 2013, where Microsoft BI can finally be front and center stage with an Excel audience. Note that although there will be some BI professional topics at this new conference, it will not be structured like the former Microsoft BI Summit. The new Microsoft Business Analytics Conference heavily targets Excel user audiences and Excel topics versus SQL Server BI professionals and SQL Server BI related topics - be sure to check out the event web site to fully understand what will be covered in the various tracks.
For those of you who just want a quick summary and some visuals of everything new across all recent BI events/highlights, check out my latest What is New for BI SQL Saturday deck. For further commentary, please read on.
For BI professionals there were some BI highlights at PASS Summit 2012. Most of the big news this year was around SQL Server relational in-memory database advancements; Teo Lachev has a good overview. Unlike Teo, who thought SQL Server 2012 SP1 availability was his top BI announcement, my favorite BI announcements were 1) the DAXMD preview coming soon, which allows Power View to be used with SSAS multidimensional data sources, 2) update-capable columnstore indexes (xVelocity in-memory), which will be helpful for real-time reporting solutions, 3)
Polybase, which enables distributed SQL queries between unstructured Hadoop data and structured PDW data warehouse data without moving the unstructured data. Last year the ease of using the Excel Hive ODBC connector was shown, and this year the future of the combined unstructured-plus-structured data solutions that DeWitt described is really starting to take shape. And 4) the BI Power Hour was over the top as usual, and the mobile BI Azure app by Patrick LeBlanc - which I won't dare describe here - stole the show! I am still laughing about that one weeks later.
Then on to SharePoint Conference (SPC) 2012. I loved SPC 2012 - there was a lot of positive energy at this event and a lot more BI people than I expected. At SPC, SharePoint 2013, Office 2013 and Office 365 launched. In the Business Intelligence track opening session, Microsoft native mobile business intelligence applications for Excel and Power View were demonstrated on iOS iPad and Windows 8 devices. There was a lot of good crowd buzz around the "sneak peek" native app demos; more information on those will be shared in H1 CY2013. Another very cool BI project shown at SPC is GeoFlow. GeoFlow looks a little bit like Power View, but it is an add-in to Excel for 3D geospatial data overlay visualizations. GeoFlow allows visual data exploration of geospatial data and records the steps for optional video replays. The heat mapping feature was fantastic. Note that GeoFlow should be available for public preview soon. ExcelMashUp.com was showcased in BI booth demos, along with Excel Inquire, Excel Power View, Excel Flash Fill, Excel Quick Analysis, and many other Excel 2013 enhancements. Clearly Excel is getting all the BI love these days. Other great BI track sessions included Kay Unkroth's Self-Service BI Governance, Kevin Donovan's PerformancePoint deep dive and Excel session, and all of Diego Oppenheimer's Excel sessions.
Extend Excel Data Visualizations with Office Apps
Excel 2013 Content Office Apps (video)
You can try out some of the public apps for Excel or build one yourself. It is absolutely addicting when you see it all working together, and a lot of fun. Finally I can break out of the box and truly go nuts with any data visualization I want without excessive barriers and pain!
If you want to build one for yourself, this is where I started: http://msdn.microsoft.com/en-us/library/jj220082(v=office.15).aspx and http://msdn.microsoft.com/en-us/library/jj220065(v=office.15).aspx. You will need a free O365 Developer account that you can get here: http://msdn.microsoft.com/en-us/library/fp161507(v=office.15).aspx and Office 2013 Public Preview.
What-If Analytic Scenario Simulation Options
A common advanced analytic requirement is "what if" scenario modeling. How can we enable our analysts to alter certain parameters
within known constraints to test potential impact? Well, let me walk through a few ways this can be done.
Quite honestly, many analysts use the default, out-of-the-box Excel features
for "what if" scenario modeling and never realize all the options available to them.
Excel actually has five kinds of what-if analysis tools within
Excel or Excel Add-Ins: 1) scenarios, 2) data tables, and 3) Goal
Seek. There is also 4) Solver, an Excel add-in that allows for
more variables. Lastly, the most advanced what-if analysis option
is 5) the SQL Server Data Mining Add-Ins for Excel.
The forecasting/time series what-if option in the Data Mining
Add-Ins can cross-predict patterns with many variables, scales to
extremely large data sets and uses true predictive algorithms.
For basic what-if analysis, Excel Scenarios and data tables take sets
of input values and determine possible results. These options are
limited to a few variables and can only have up to 32 values. Goal
Seek is different from Scenarios since it works backwards...it
takes the end result and determines potential input values to reach
that result. Note that Goal Seek is limited to one variable.
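To make Goal Seek's "work backwards" behavior concrete, here is a small Python sketch that does the same thing with bisection. The loan-payment scenario is a hypothetical example of mine, not something from Excel's documentation.

```python
def goal_seek(f, target, lo, hi, tol=1e-9):
    """Mimic Excel Goal Seek: find x in [lo, hi] where f(x) ~= target.

    Uses bisection, so f must bracket the target on the interval
    (e.g. be monotonic). Pure illustration of the concept.
    """
    g = lambda x: f(x) - target
    if g(lo) * g(hi) > 0:
        raise ValueError("target is not bracketed on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if g(lo) * g(mid) <= 0:   # sign change in left half -> root there
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

# Example: what annual interest rate makes the monthly payment on a
# 30-year, $250,000 loan equal to $1,500?
def monthly_payment(annual_rate, principal=250_000, months=360):
    r = annual_rate / 12.0
    return principal * r / (1.0 - (1.0 + r) ** -months)

rate = goal_seek(monthly_payment, target=1500.0, lo=0.001, hi=0.20)
```

Just like Goal Seek, there is exactly one adjustable input (the rate) and one target output (the payment) - which is why the one-variable limitation mentioned above is inherent to the technique.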
Solver is similar to Goal Seek except it allows for more
variables and enables building forecasts by using the fill handle
or series commands that are built into Excel. It is limited to
linear trend or growth trend calculations. To forecast
complex, nonlinear data models, you can use the Analysis ToolPak add-in or the Data Mining Add-Ins for Excel. Solver
works with a range of cells that are related to a formula in the
target cell, adjusting values in variable a.k.a.
"adjustable" cells. Solver also allows constraints to restrict the
values that can be used in the solution.
Lastly, the most advanced what-if analysis
option, Cross Prediction, is available via the Data Mining Add-Ins for Excel. The
Data Mining Add-Ins for Excel have
forecasting/time series algorithms that can
be used for complex, non-linear patterns with seasonality or
periodicity. The time series analysis cross-predicts how patterns in one time series impact
other forecasts or series in the same model. This option allows
you to specify future values for a series and then continue the
forecasts from that point. The time series algorithms use Fast
Fourier transformation to detect seasonality before training. This most advanced option is also the
most scalable for large data sets that exceed Excel limits, since
the Add-Ins can reference data stored in a data warehouse or other
database server. The Data Mining Add-Ins for Excel also do not suffer from the variable limitations of the other
options and offer true predictive modeling capabilities thanks to
the power of the Analysis Services Data Mining engine.
Cross Prediction is a query-time feature added in 2008 that is
independent of OLAP writeback. This feature allows you to
replace or extend the training-time time slice data as part of the
prediction (forecast) query. Thus you can apply the patterns
detected at training time to a new series (or to an extension of
the training series). For example, if you suspect diesel
prices follow the same trends as gasoline prices, and gasoline data
is more readily available, you can build a forecast model with gas
prices but then use that model to predict diesel prices.
Here is a video on time series forecasting with this Excel add-in. Here is a tutorial
that covers Cross Prediction in more detail: http://msdn.microsoft.com/en-us/library/cc879290.aspx.
There is also a cool sample application that can be used to demo
this concept visually. The demo allows you to drag a
forecast curve and see the related series change: http://www.sqlserverdatamining.com/ssdm/Default.aspx?tabid=97&Id=382.
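The gas-to-diesel cross-prediction idea can be sketched in a few lines of Python: fit the relationship between the two series on historical data, then reuse it to predict the scarcer series. This simple linear fit is only a stand-in for the actual time series algorithms in the Data Mining Add-Ins, and the prices below are made up.

```python
def fit_linear(x, y):
    """Least-squares fit y ~= a*x + b (pure Python, no libraries)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
        / sum((xi - mx) ** 2 for xi in x)
    return a, my - a * mx

# Hypothetical gas/diesel prices observed over the same weeks.
gas    = [3.10, 3.25, 3.40, 3.30, 3.50]
diesel = [3.55, 3.70, 3.85, 3.75, 3.95]

# Learn the gas -> diesel relationship from history...
a, b = fit_linear(gas, diesel)

# ...then predict diesel for a week where only a gas price is known.
def predict_diesel(gas_price):
    return a * gas_price + b
```

The real Cross Prediction feature does this with full time series models (trend, seasonality, periodicity) rather than a single linear relationship, but the pattern - train on the plentiful series, forecast the scarce one - is the same.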
Beautiful BI Sites with SharePoint 2013
SharePoint site design and difficulty creating SharePoint master pages has
plagued most of the BI developers I know including myself. In the last 16 years,
I have seen my fair share of ugly SharePoint BI sites, dashboards, reports and applications with
incredibly powerful information or decision making capabilities disregarded by
the business simply because the user interface didn't "look good".
Shallow...absolutely, but indeed true. It is human nature to perceive and
associate the quality of an application or information based upon the look and
feel of the packaging around what is being shown. User interface design
and dashboard design are both a bit of an art and science.
So here is the good news for anyone that has struggled with
SharePoint user interface design for BI sites and
applications. In SharePoint 2013, the new Design Manager features can automatically convert HTML designs you create or get from standard HTML web template resources into SharePoint master pages.
There is some tweaking that needs to be done to put the main
content div and web parts in the desired locations but the developer experience
to get a decent looking SharePoint site is light years easier than
it is in SharePoint 2010. You can also use any HTML editor
now - you are no longer locked into SharePoint Designer.
That is another rah-rah from me since I am an old Dreamweaver fan
myself from the 90's. You can then combine the better
branding with the new mobile device channel features for applying responsive design
targeted to specific mobile devices to better "mobilize" your
SharePoint BI...the possibilities are quite amazing for the mere
human to now easily accomplish. To learn more about these
nice new features, Benjamin's blog has good step-by-step coverage,
and here is the link to the related MSDN topics.
One of my wish list items is to have a
few BI friendly master pages added to CodePlex…working that one
already but always love to get creative, inspirational ideas from
others in the data loving profession.
Free BI Technical Training from Microsoft Product Teams
WOW - how did I miss
the memo? I did not know this
great technical resource even existed, so I thought I would share it here in case others are
also not aware of it. The Microsoft Virtual Academy (MVA) web site contains FREE technical training content for
customers and external parties developed by the Microsoft product engineering
teams - so not the marketing "fluff". There are videos, PDF documents and also a
scoring/point system for those of you who have the burning need to compete with
other techies globally. The BI track content posted thus far is below. You
can bet that additional material for the newer wave of BI will also get
published on it.
Breakthrough Insights using
Microsoft SQL Server 2012 - Reporting Services
This course is
designed to help IT professionals understand how to implement the new
features of Microsoft SQL Server 2012 that will enable them to
provide insightful and visual reports. The 9 modules include:
•What's new in Microsoft SQL Server 2012 Reporting Services
•Introduction to Self-Service Alerting
•What's new in Microsoft SQL Server 2012 Analysis
Services and PowerPivot
•How to use Power View for insightful visual reports
•How to build great Business Intelligence Semantic Models for Power View
•Deploying and managing PowerPivot in Microsoft SQL Server 2012 for SharePoint
•A deeper understanding of PowerPivot in Microsoft SQL Server 2012
•How to install and configure Reporting Services in SharePoint integrated mode
•An overview of Reporting Services in Microsoft SQL Azure
SQL Server 2012 - Analysis Services
This course is designed to show you how you can take
advantage of the newest features in Microsoft SQL Server 2012. The 7 modules of
this course will introduce you to concepts and practices such as:
•Developing and managing a Business Intelligence Semantic Model
•Enhancing your BI Semantic Models with Data Analysis Expressions (DAX)
•Using Master Data Services (MDS) in your Data Warehouse
Office 2013 Business Intelligence Highlights
The Office 2013 Consumer Preview was released and there are some pretty awesome business intelligence highlights
within it. Here is a quick summary of the features I have seen thus far that I
am excited about. I will be digging deeper into these new features soon. In the
meantime, if you want to play with Power View in Excel today, simply install
Office 2013 from the link above and then download the demo
Excel 2013 file. That is all you need to do to get started - no SharePoint required.
- Power View and PowerPivot are now directly embedded into Excel 2013: powerful "in-memory" analytics and highly scalable, manageable, self-service BI will be accessible to every single Excel 2013 user
- Power View new data visualizations: fully interactive, drillable geospatial mapping, pie charts, cross-filtered pie charts and KPIs (finally, easy map reporting in Excel or the web browser)
- Power View report design and branding enhancements: images, background images and many more design themes to style and brand reports
- Power View hierarchy support: multiple-level drill up and down, ad-hoc user-defined drill paths and hyperlink features for web-based drill paths to other applications
- Excel Quick Explore: visual, interactive slicing/dicing interaction with data and charts on the desktop and in the web browser - a "ProClarity-like" data exploration experience in Excel that is mobile device friendly across IE, Safari, iOS and other popular mobile device browsers. I couldn't resist testing it tonight and I can confirm that Quick Explore features do indeed work on iPad!
- SharePoint-published Excel reports can now also be edited in the web browser by exposing the pivot field lists and choosing the data to show in the report
- Excel now has timeline filters, cycle charts, trend charts and a neat Flash Fill feature
- Discovery and assessment of user-created spreadsheets with SharePoint 2013: comparing spreadsheets for compliance, tracking lineage, conducting interactive diagnostics and creating spreadsheet analysis reports with "Inquire" in Excel
- Excel Office Apps ("Napa"): apps can be published to the Office App Store and then embedded into Excel sheets for dynamically displaying images, inline frames/web site URLs, reading/writing data, and getting selection or data-change events on the desktop and in web browsers (I saw one very cool demo of a decomposition tree within Excel 2013 using this technique)
- Easier to work with external and cloud data sources: new query tables from unstructured sources, easier web services consumption, URL and XML data imports (stock prices, Twitter feeds), OData, REST and JSON
- SharePoint, SharePoint Online and O365 BI and mobile enhancements
- New data set recommender features that can find and suggest related external data sources posted to Azure Data Market based on the data in an Excel model
- SharePoint 2013 also improved site design and mobile BI web browser design features such as dynamic image renditions and "channel" enhancements that allow for easily surfacing and adapting content for presentation through multiple channels and mobile devices
- Visio Services has improved data-driven diagrams in mobile web browsers like the iPad's for unlimited data visualization: improved support for touch across devices, no more Silverlight (diagrams and strategy maps are now mobile friendly), more supported data sources like BCS and SQL Azure, and commenting is supported in mobile web browsers as well as the desktop
For more information on what is new for BI and SQL Server, keep
an eye on http://blogs.msdn.com/b/microsoft_business_intelligence1.
A great blog from the Power View engineering product team
showcases data visualization and interactive mapping in Excel.
Power View tips from Sr. PM Sean Boon: http://blogs.msdn.com/b/seanboon.
A more comprehensive list of what is new in Power View and Excel 2013 for developers.
Mobile Business Intelligence
TechEd 2012 Mobile BI Session Recording:
Good News for Microsoft BI on iPad http://technet.microsoft.com/en-us/library/hh697482.aspx
SSRS now officially supported http://msdn.microsoft.com/en-us/library/jj659023.aspx
Mobile BI on iPad tutorials http://www.msbiacademy.com/Lesson.aspx?id=91
Excellent Mobile BI video! http://youtu.be/uDZ-CsOk8C8
I just finished my first Mobile
Business Intelligence presentation. Although the BI demos worked
fine, the VGA adapters were not so fun! SharePoint's
out-of-the-box, browser-based mobile access supports dashboards,
web part pages, line-of-business data, documents, lists, calendars,
searches, workflow, wikis, blogs and SMS or email alerting on SharePoint
content. In 2011, Microsoft publicly announced a commitment to delivering
highly interactive, touch-based and fully immersive business intelligence
experiences across different mobile devices to all users wherever
they are. Our Flexible Work Styles mission (http://microsoft.com/workstyle)
is to provide a mobile ready platform allowing the business instant
secure access to files, sites, applications and analytics with an
internet connection. In December 2011, the first set of mobile device
enhancements for iPad was released as detailed in http://technet.microsoft.com/en-us/library/hh697482.aspx . In
2012, additional touch-based data exploration and enhanced visualization
capabilities for iOS, Android and Microsoft device platforms will
be released. Looking forward to the Windows 8 wave, Microsoft will
deliver best-in-class mobile productivity experiences with Windows 8.
Today SharePoint out-of-the-box supports a wide range of browsers:
- IE Mobile on Windows Mobile 5/6/6.1/6.5
- iPad/iPhone Safari can be used with or without mobile-specific views
- Safari4 on iPhone 3G/S
- BlackBerry 4.x and newer versions
- Nokia S60
- NetFront 3.4, 3.5 and newer versions
- Opera Mobile 8.65 and newer versions
- Openwave 6.2, 7.0 and newer versions
Users can easily use SharePoint dashboards, scorecards and reports
to review performance metrics, add and view comments, visually explore
information further with interactive slice and dice, drill across
and drill down to detail using mobile device UI navigation techniques.
Office Web Apps for SharePoint can also be utilized to open Word,
Excel and PowerPoint reports in a Web-enabled device.
PowerPivot in SharePoint is also mobile device friendly out-of-the-box
today and is very popular for self-service business intelligence
and data analytics. PowerPivot empowers users of all levels to access
and mash up data from virtually any source. Users can create their
own compelling reports and analytical applications, easily share
insights, and collaborate with colleagues through Microsoft Excel
2010 and SharePoint 2010. SQL Server 2012 introduces Power View,
providing SharePoint users with highly interactive, browser-based
data exploration, visualization, and presentation capabilities.
FAQ: SSAS Tabular Model Security White Paper Released
Continuing on with my FAQ theme, Dave Wickert just directed me
to a recently released security white paper for SSAS Tabular Models
written by Cathy Dumas that I could not resist sharing with everyone.
The white paper describes the security model, roles and security
considerations for handling credentials when connecting to a SSAS
tabular model and connecting to a relational data source for processing.
There are examples of both static and dynamic security scenarios.
Dynamic security approaches shown include step-by-step instructions
for creating models that implement security based on the currently
logged on Windows user name or based on custom data passed to Analysis
Services. Two different methods of applying row filters are also
covered - one using many-to-many relationships in DAX, and one using
a more computational approach.
The SSAS Tabular Model Security white paper is located at http://msdn.microsoft.com/en-us/library/jj127437.aspx.
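The dynamic security idea in the white paper - filter rows based on who is asking - can be sketched outside of Analysis Services in a few lines of Python. The usernames and territory permissions below are hypothetical; in a real tabular model this logic lives in a DAX row filter driven by the current Windows user.

```python
# user -> territories that user is allowed to see (hypothetical data)
PERMISSIONS = {
    "CONTOSO\\alice": {"East", "West"},
    "CONTOSO\\bob": {"East"},
}

SALES = [
    {"territory": "East", "amount": 100},
    {"territory": "West", "amount": 250},
    {"territory": "South", "amount": 75},
]

def visible_rows(rows, username):
    """Apply a dynamic row filter based on the current user - the same
    idea a tabular model expresses with a row filter on the user name.
    Unknown users see nothing (deny by default)."""
    allowed = PERMISSIONS.get(username, set())
    return [r for r in rows if r["territory"] in allowed]
```

The key property, as in the white paper's examples, is that the security data is itself just data: changing who sees what means updating the permissions table, not redeploying the model.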
How to deploy SQL Server 2012 BI in a SharePoint 2010 farm
UPDATE: August 2012 Power View Infrastructure Installation:
Step-by-Step and Scripts
General guidance for implementing SQL Server 2012 BI in a SharePoint
2010 farm (with one dedicated BI Server in the middle tier): http://msdn.microsoft.com/en-us/library/hh231687(v=sql.110).aspx
After the SharePoint 2010 farm is successfully deployed and tested:
1. Understand Kerberos and how to implement it for your environment:
2. For connection to data outside of the farm, review:
3. Follow the detailed guidance to install the first SSRS Web [Service]
Application at: http://msdn.microsoft.com/en-us/library/gg492276(v=SQL.110).aspx on the dedicated app server that will be the “BI Server.”
4. Follow the detailed guidance for each WFE in the farm to add
the Reporting Services Add-in at: http://msdn.microsoft.com/en-us/library/hh479775.aspx.
Additional information is at: http://msdn.microsoft.com/en-us/library/aa905871.aspx
5. After validating that #1 and #2 are done, verify the installation via the SSRS verification steps
6. Follow the overall guidance to install the PowerPivot functionality
at: http://msdn.microsoft.com/en-us/library/ee210654.aspx on the
dedicated app server that will be the “BI Server.”
a. Then the detailed installation guidance
b. Then the initial configuration guidance at:
c. PowerPivot Verification is at:
FAQ: How is PowerPivot Different than Excel?
I have wanted to write this post for a LOOOOONG time now.
I cannot delay this again, period. Deep technical resources
often ask me what the difference is between PowerPivot and the native
Excel they have known forever, and how it differs from the various
BI vendor product add-ins that export data to Excel, or from Excel
queries with raw Excel data connections, VLOOKUPs, etc.
Well, let me share some of my top-of-mind significant differences between these two technologies. They do look alike from the
user interface, but behind the scenes the actual engines
are totally different!
- PowerPivot has the capability to load massive data volumes in the tens of millions of rows; this capability does not exist in native Excel
- PowerPivot's in-memory, highly scalable VertiPaq technology is up to 100x faster than classic VLOOKUPs in native Excel
- PowerPivot data models offer 10x and even 15x data compression; there is ZERO data compression in native Excel
- PowerPivot is truly an n-dimensional, in-memory analytic model with 80+ contextual Data Analysis Expressions (DAX) for measures and computations, advanced time intelligence, distinct count, and many other complex logical functions; this capability does not exist in native Excel
- PowerPivot allows true user-defined, reusable hierarchies, KPIs and easy self-service data modeling for reporting; this capability does not exist in native Excel
- PowerPivot automagically detects relationships between unrelated data sources; this capability does not exist in native Excel
- SharePoint PowerPivot adds the PowerPivot Gallery and automated, scheduled data refresh features; this capability does not exist in native Excel
- SharePoint PowerPivot is surfaced as an OLAP reporting data source that can be used in Power View and many front-end OLAP reporting tools (not just Microsoft's reporting tools) that support Analysis Services; this capability does not exist in native Excel
- SharePoint PowerPivot has data source, usage and query source tracking for robust auditing/monitoring-type reporting; this capability does not exist in native Excel
- PowerPivot models can be directly upgraded to an Analysis Services server, retaining all calculation logic, user-defined hierarchies, KPIs and data models in the upgrade, for large-scale deployment to billions/trillions of records plus partitioning and advanced dynamic security; this capability does not exist in native Excel
I am sure there are many more differences I could list here.
My top of mind points should be enough to give you a feeling for
the true differences in the technologies and how
much more powerful PowerPivot is than Excel without it.
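The compression point deserves a quick illustration. Columnar engines like VertiPaq lean heavily on dictionary encoding; the Python sketch below shows just that core idea (the real engine adds run-length encoding, bit-packing and more, so treat this as a conceptual sketch, not the actual algorithm).

```python
def dictionary_encode(column):
    """Dictionary-encode a column of values into (dictionary, ids).

    Repeated values are stored once in the dictionary, and each row
    holds only a small integer id - which is why low-cardinality
    columns compress so dramatically in columnar engines.
    """
    dictionary, index, ids = [], {}, []
    for value in column:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        ids.append(index[value])
    return dictionary, ids

# A million-row column with only three distinct region names needs a
# three-entry dictionary plus one tiny integer per row.
region = ["East", "West", "East", "East", "South"]
dico, ids = dictionary_encode(region)
```

Row-oriented storage (like a native Excel sheet) repeats every string on every row; with dictionary encoding, adding another million "East" rows adds only a million small integers, not a million strings.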
SQL Server 2012 Improvements for SAP Environments
Microsoft is showcasing mobile tablets, SQL Server 2012, Windows
8, Office, SharePoint and Duet solutions with SAP at the SAPPHIRE conference in Orlando, May 14-16. There is a very
slick video being shown on the Innovation page to give you a
feel for the vision of using these two platforms "better together".
In conjunction with this event, a new SQL Server 2012 for SAP white paper was released that discusses
new features of SQL Server 2012 and other key SQL Server capabilities
that are relevant to SAP NetWeaver 7.0 (SR3) and 7.1 products. The
white paper deep dives into AlwaysOn multiple secondaries and backup
from a secondary, auditing for non-SAP database access, columnstore
for SAP NetWeaver BW, extended online table maintenance and support for 15,000 partitions.
PowerPivot "How To" for Large Scale Projects
It seems many BI pros do not realize the wonderful large scale
potential for PowerPivot projects using the easy SSAS upgrade path
in SQL Server 2012. Most of the enterprise customers I help do enjoy
PowerPivot for self-service BI but thought that PowerPivot was limited
by the desktop Excel file size limits. No, it is not - PowerPivot
is highly scalable if you upgrade it to the server-based SSAS Tabular
Mode version! Even better, unlike other platforms that "lock"
you into their limited tools, PowerPivot models on the server can
be used in various Microsoft and non-Microsoft reporting tools as
slice/dice OLAP cubes. How cool is that?
If you have a project with, say, billions of records, PowerPivot
can be directly upgraded to a server version called SSAS Tabular
Mode. The upgrade retains all the business user logic and calculations
to enable that model to be used on large scale data sets far beyond
the 80-100 million record data sets of PowerPivot. You can also
partition that PowerPivot model for highly performing incremental
uploads after the upgrade, optionally use the extremely fast "in-memory"
partitions or Direct Query partitions features to leverage the database
engine, apply security and so forth.
How To: You can upgrade PowerPivot to the server version by either
directly importing the Excel file into Visual Studio OR directly
restoring the Excel file on the SSAS server. The entire model, calculations,
KPIs and hierarchies are retained. Pictures of these two upgrade
options are below. From here you can now schedule processing and
refresh just like other enterprise server BI processes.
The PowerPivot to server upgrade approach is a great way to maximize
BI projects ROI by reducing the back and forth of the BI team and
the Excel Power User in requirements gathering, prototypes and build
out. Some vendors claim this same exact process saves 100-300x ROI.
Simply let the Excel Power User build their own model on a smaller
data set and then have the BI team upgrade it to be a true enterprise
capable model with hierarchies, KPIs and all the crazy calculations
business users dream up.
SAP Hana and PowerPivot
Last year I shared a fantastic white paper on various ways to consume
SAP and SAP BW data in PowerPivot for more agile self-service business
intelligence with SAP sources - using-sql-server-powerpivot-for-excel-to-analyze-sap-data.aspx.
At the time this paper was written SAP HANA was not readily available.
Just today I came across an excellent blog and video on the SAP
SDN site that showcases how to bring SAP HANA data into Excel using
the ODBO connector, which would also hold true for PowerPivot scenarios.
It is actually very easy for a business user to do this themselves.
Take a peek for yourself.
In the field I am often asked "why PowerPivot when I have SAP
BusinessObjects, Xcelsius or other SAP tools?" Excel is the business
user analytics tool of choice - bar none. Exporting to Excel
is the #1 requirement of most BI tool evaluations. PowerPivot
is easy to learn. PowerPivot is a powerful, in-memory analytics
tool with exceptionally superior performance and data compression
that can work with up to 100 million records on a desktop, and it can
be upgraded in SharePoint with the SSAS Tabular upgrade wizard to
take that business user's model and apply it to billions and
trillions of records for a highly scalable, securable enterprise
BI solution. Lastly, PowerPivot is mobile
device friendly and can display on iPads as well as tablets
since it renders as HTML.
Many of my SAP customers are combining SAP with PowerPivot today
as a "best of both" answer for real win-win with self-service
BI needs. PowerPivot gives them the ability to deliver actionable
information much more quickly in an agile, practical, scalable,
upgradable and secure manner.
Windows 8 Consumer Preview
The Windows 8 Consumer Preview was released on February 29, 2012.
You can download and install it to get an idea of what the future
brings. Finally, users can have a “no compromise” tablet
and mobile device experience. It feels like I have been waiting forever,
seriously...but I am a bit impatient when I get truly excited about
something. If you have been following me, you know I am obsessed with
mobile BI, mobile devices and mobile applications.
Windows 8 delivers a touch-first experience along with full support
for mouse and keyboard, so users get the convenience and mobility of
a tablet, the power and familiar experience of a full PC, and the
ability to move effortlessly between work and personal activities.
Windows 8 tablets also pair the connected experience of Metro style
apps with all the desktop LOB and productivity apps users run today
on Windows 7. Finally, since it is Windows, companies can manage
Windows 8 tablets with their existing management and security
infrastructure. That last point is a big deal when it comes to
mobile BI and securing your
data. Security is the #1 concern when it comes to mobile apps
since cool gadgets do get lost and stolen. System Center 2012
actually has some very nice features around deployment and security
for iPads, Android and other devices. Back to Windows 8, here
are a few other points about it:
- Designed to be Cloud Connected
- New Windows Store
- Internet Explorer 10 was designed for Windows 8
This is a pretty big deal since in addition to the new, cool stuff
Microsoft also has to support millions of existing applications.
Thus Windows 8 will provide two UI modes: Metro and Classic.
Metro Style - This is the new “immersive experience” mode,
optimized for touch input and designed to compete with all modern
tablet devices. Metro style applications can be built in HTML5 and
JavaScript or in XAML.
Classic Desktop - Classic mode will continue to be appropriate for
“work at your desk” scenarios and supports the existing
applications that run on Windows today. Silverlight and other browser
plugins will continue to be supported the way they work today.
More posts to come on this topic! In the meantime, download
the ISO to see it for yourself.
Big Data, Hadoop and StreamInsight
Big Data seems to be everywhere and I notice quite a bit of confusion
around it. Last week I saw Bill Inmon, the father of data warehousing,
speak about his new book "Building
the Unstructured Data Warehouse". It was a good overview
of how to combine traditional data warehouses with Big Data and
unstructured data sources. It also covered why Big Data is not replacing
the data warehouse - it is supplementing it. Big Data technology
does a great job of loading and storing large volumes of data but is
not really practical for the reporting queries, joins and other workloads
that data warehouses have been optimized to deliver.
One related aspect that Bill did not cover was event based processing
with Big Data. Most of the groups I talk to that are asking about
Big Data are also wondering about capturing events, identifying
patterns and anomalies from event streams. Torsten Grabs, a Microsoft
Principal Program Manager, wrote an easy to understand article on
these technologies and also on how to combine them. The full article
can be read here.
- Batch-oriented processing: Microsoft’s Hadoop-based services
provide the MapReduce technology for scale-out across many machines
in order to quickly process large volumes of data.
- Event-driven processing: Microsoft StreamInsight™ provides
the capabilities to perform rich and expressive analysis over
large amounts of continuously arriving data in a highly efficient manner.
- Complex event processing at scale: Microsoft StreamInsight™
can run in the reducer step of Hadoop MapReduce jobs on Windows
Server clusters to detect patterns or calculate trends over time.
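The event-driven style in the list above can be illustrated with a minimal sketch. This is plain Python, not StreamInsight itself, and the event shape (timestamp, value) and threshold are hypothetical: a sliding count-window over a stream of readings that flags events whenever the window average crosses a threshold.

```python
from collections import deque

def detect_anomalies(readings, window_size=5, threshold=100.0):
    """Flag the timestamp of any reading whose trailing window
    average exceeds the threshold. Events are hypothetical
    (timestamp, value) tuples arriving in order."""
    window = deque(maxlen=window_size)  # sliding count-window
    anomalies = []
    for timestamp, value in readings:
        window.append(value)
        if len(window) == window_size:
            avg = sum(window) / window_size
            if avg > threshold:
                anomalies.append(timestamp)
    return anomalies

# A spike starting at t=5 pushes the rolling average over the threshold.
stream = [(t, 90.0) for t in range(5)] + [(t, 150.0) for t in range(5, 10)]
print(detect_anomalies(stream, window_size=5, threshold=100.0))
# → [5, 6, 7, 8, 9]
```

A real StreamInsight query would express the same window and aggregate declaratively, but the core idea - continuous analysis over arriving events rather than batch scans - is the same.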
Social Analytics PowerPivot Solutions
UPDATED 8/14/2013 Microsoft BI Social Analytics Solution and Technical Support Site
Social analytics enables companies to keep a constant pulse on
the marketplace by reviewing customer sentiment, likes, dislikes
and hot topics at any given moment. Extended Results has built
several incredibly useful, free, social analytics solutions for
Microsoft Business Intelligence using PowerPivot and Visual Studio
Tools for Office. There is a solution for Twitter, LinkedIn
and Facebook. Extended Results also has related analytics
tools for monitoring retail TAG campaigns. You can review and download the
social analytics PowerPivot workbooks at http://www.extendedresults.com/products in the Social Insights section.
To build your own sentiment analysis, check out the Sentiment140 API at http://help.sentiment140.com/api.
Another free Excel add-in for social analytics is the NodeXL social network analysis tool.
PowerPivot Maps, Tabs, Image Overlays and Other Creative Tips
Let's face it, dashboard presentation design matters and you can
get fancy with PowerPivot. PowerPivot can be seriously fun, functional
and powerful - especially with the upcoming Hadoop ODBC driver and
Hive Query Add-in. Here are a few creative ideas to spice up your dashboards.
To add maps you can use the free Microsoft Labs "Layerscape" Geospatial Mapping Add-In for Excel, Excel shapes with conditional color formatting, or a third-party
add-in like MapCite.
To embed maps, check out the Clearly and Simply blog. Chandoo.org has pretty cool social
networking Excel dashboard examples and an advanced dashboard
design course. To simulate tabs on the top of a page, simply color
the top cell range, highlight current page cell and use the Excel
hyperlink feature to provide navigation at the top. Note that you
can also connect slicers to objects on multiple sheets in a workbook
for a fully contextual analytic experience as the user browses the
dashboards. To add a little extra depth overlay a transparent png
image. To call out actionable items use conditional formatting and
to provide trending information at a glance use the sparklines feature
with thicker lines for improved visibility. For other maps, gauges
and custom objects check out BeGraphic.
Good News for Microsoft BI on iPad
The December 2011 Cumulative Update for SharePoint 2010 further
improves the iPad SharePoint BI experience. The SharePoint
Product Team Blog announcement was made on December 19, 2011. With
this cumulative update, users can better view Microsoft BI artifacts
on iPad devices running the iOS 5 Safari browser. What works with
iPad: Business Intelligence Center SharePoint sites, interactive
Dashboards, PowerPivot Reports, Excel Services Reports, Reporting
Services Reports, Scorecards, Analytic Charts and Analytic Grids.
Here is a good blog post with SharePoint Project Server BI on iPad: http://epmsource.com/tag/ipad.
Here is a TechNet article http://technet.microsoft.com/en-us/library/hh697482.aspx that provides iPad viewing best practices, navigation tips and
configuration details. I will post a presentation on
this topic in March 2012.
***Note if you have stand-alone SSRS Reporting Services reports
you want to use in iPad, check out the MobiWeave SketchMD! app: http://sketchmd.com/Products/bi.aspx
Playing with the Big Data Hadoop ODBC Driver
At PASS, Microsoft announced the upcoming TAP for a Hadoop ODBC
driver. This driver will enable integration of Apache Hadoop,
Windows and Azure Hadoop flavors with various Microsoft BI products.
The SQOOP connectors and/or the HiveODBC drivers can be used with
the Microsoft data platform to extract data from Hadoop and put
it into SQL and vice versa.
In the field, I am seeing major enterprise organizations leverage
Big Data Hadoop solutions for inventory, customer intelligence and
various other predictive analytics use cases. By using the
Microsoft Hadoop connector, the massive data storage and some data
crunching can be done on Hadoop's MapReduce framework with results
delivered back to SQL Server for easy business consumption.
Another use case I am seeing is Hadoop Hive querying directly within Excel.
To easily test drive this feature, download the installation package
and double-click HiveODBCSetupx64.msi to install the HiveODBC driver
(x64 and x86) and the Excel Hive Add-in, then give it a try! To verify
the install worked, click on the Data tab in Excel and you should see
the Hive Pane button. Click it and the Hive pane will appear on the
right side of the main Excel window, where you can begin querying your
Hadoop Big Data sources. So easy and powerful...
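If you prefer scripting to the Excel add-in, the same HiveODBC driver can be reached from any ODBC-capable client. Here is a minimal Python sketch under stated assumptions: the host name, table and driver keyword below are hypothetical, and the actual connection (shown in a comment) requires the pyodbc package plus the installed driver, so check the DSN your HiveODBCSetup installer registered for the exact values.

```python
def build_hive_connection_string(host, port=10000, default_db="default"):
    """Assemble an ODBC connection string for a Hive ODBC driver.
    The Driver name and keywords are illustrative assumptions, not
    the documented HiveODBC syntax - verify against your DSN setup."""
    return (
        "Driver={HIVE};"
        f"Host={host};Port={port};Database={default_db};"
    )

conn_str = build_hive_connection_string("myhadoopcluster", 10000)
hiveql = "SELECT country, COUNT(*) AS hits FROM weblogs GROUP BY country"

# With pyodbc installed and the driver configured, the query would run as:
# import pyodbc
# with pyodbc.connect(conn_str, autocommit=True) as conn:
#     rows = conn.cursor().execute(hiveql).fetchall()

print(conn_str)
```

The same connection string style also works from SSIS or any other ODBC consumer, which is what makes the driver interesting beyond Excel.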
- Get Started with Hadoop
- Query Apache or Windows Hadoop in Excel
Microsoft BI Metadata, Impact Analysis, Dependency Analysis and Lineage
There are multiple ways to manage Microsoft BI metadata, impact
analysis, dependency analysis and lineage. In SQL Server 2012,
these features will be available in the "Project Barcelona"
SQL Server Feature Pack installs. A live preview is available
at http://projectbarcelona.cloudapp.net.
A current fan favorite is the low-cost BI Documenter by
Pragmatic Works: http://pragmaticworks.com/Products/Business-Intelligence/BIDocumenter/Overview.aspx.
BI Documenter works with the full BI stack - database, SSIS, SSAS,
SSRS - with versions 2005 and up. It has many nice features
such as a metadata editor, object change compare viewer, dependency
graphing, workflow and totally automated solution documentation.
Another current option is the Microsoft Metadata 2008 Toolkit at http://sqlmetadata.codeplex.com.
The metadata toolkit tracks data lineage, technical metadata, and
performs impact analysis for SSIS, SSAS, SSRS, and SQL. Features include:
- Database Repository to store the metadata
- Stand-alone Dependency Analyzer program to view dependencies
- Report model for ad hoc reporting on technical metadata
- Reports on impact analysis, dependency analysis, and lineage
Additional Metadata Management Resources
- SQL Server Metadata Toolkit
- Managing Metadata in SQL Server
- Microsoft Master Data Management
How Microsoft Rolled Out Self-Service Business Intelligence
to 40,000+ Global Users
Are you thinking about deploying PowerPivot, Excel Services, Reporting
Services and/or PerformancePoint dashboard tools in your environment?
If yes, then the following Microsoft IT Showcase articles and downloadable
white papers are a must read to jump start your
project. They cover topics like having the right onboarding
processes in place, governance controls and much more.
Big Data, Hadoop + PDW Better Together: must see for data lovers!
Microsoft Big Data Resources
I just watched the recording of Dr. DeWitt from PASS since I did not
get to go. I thought the points on how/when to use Hadoop and PDW
put into context what the practical future roadmap looks like and
how a future multiple universe data management system will work to
bridge both structured and unstructured data worlds. For the current
state, data architects dealing with data explosion will find it helpful
to understand the realities of Hadoop's simple-join and analysis query
performance pains, the impracticality of Hadoop query syntax for
business users, and how/when PDW analytic queries perform much better
than Hadoop queries.
The bottom line seems to be a better together message: Hadoop + PDW.
Visio Services Data Driven Visualizations in SharePoint
I just gave a quick demo of this powerful and amazing feature that
is often overlooked among the SharePoint Business Intelligence visualization options.
Note that Visio Services in SharePoint 2010 does not require the
viewers to have Visio - the diagrams are rendered as images or Silverlight.
These images can be connected to external data sources to provide
live, rich visual context. You can also add interactivity into diagrams
to enable drill-through and also pass parameters to other web parts.
Here are a few samples of a simple heatmap, retail floor plan connected
to POS data, hotel room availability, a strategy map, a business
process diagram and a map. There are many other examples that
overlay images and objects to get even more creative!
To learn more about Visio Services Data Driven Diagram features
Extending Project Server 2010 Business Intelligence
Great white paper for using PerformancePoint Services with Project
Server 2010 written by Emmanuel Fadullon, Delivery Architect Microsoft
Consulting Services. The document provides a detailed guide for Microsoft
Project Server 2010 users who want to extend Project Server Business
Intelligence (BI) features. It provides a few nice visualization samples
- risk matrix, custom KPIs, decomposition tree and strategy maps to
name a few. Note it does not cover the out-of-box Project Server reports.
This is about taking the Project Server data to a whole other level.
Automate Unstructured Document Data Extraction
Just saw Brainware's brilliant usage of data mining neural network
technology to extract and store unstructured data that in turn can
be incorporated into your data warehouse, ERP, SharePoint or enterprise
search. Imagine the possibilities...unstructured data represents
80% of the world’s document content! Check it out http://www.brainware.com.
SharePoint BI/Excel Services Tip!!!
If you have multiple pivot tables/charts in a SharePoint EWA web
part you can use a slicer as a controller to those multiple objects
and pass parameters to the slicer from an EWA filter. To pass multiple
values in the EWA filter parameter use a comma separated string.
I am going to play with this over the weekend for show and tell.
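The multi-value handoff in the tip above is just string plumbing. A quick sketch in plain Python (the filter values are hypothetical) of building the comma-separated parameter on the sending side and splitting it back apart on the receiving side:

```python
def to_ewa_filter_param(values):
    """Collapse multiple slicer values into the single comma-separated
    string that the EWA filter parameter expects."""
    return ",".join(str(v) for v in values)

def from_ewa_filter_param(param):
    """Split the parameter string back into individual values,
    trimming whitespace and dropping empty entries."""
    return [v.strip() for v in param.split(",") if v.strip()]

param = to_ewa_filter_param(["FY2011", "FY2012"])
print(param)                          # → FY2011,FY2012
print(from_ewa_filter_param(param))   # → ['FY2011', 'FY2012']
```

One caveat to test for in your own workbook: if a slicer value can itself contain a comma, you will need an escaping scheme, since a plain comma-separated string cannot distinguish it from a delimiter.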
SQL Server Denali Data Quality Services - "What and How"
Check out all the good free videos on the brand new Data Quality
Services features coming in the next SQL Server release to learn both
the "What and How" of this addition to SQL Server. http://msdn.microsoft.com/en-us/sqlserver/hh323832.aspx
Data Impact and Lineage Analysis as it Evolves
Preview Microsoft impact and data flow analysis at http://projectbarcelona.cloudapp.net as it evolves. The preview contains a fictional sample data
topology and will be refreshed every few weeks.
Using PowerPivot as PerformancePoint Data Sources
I just finished a really fun BI POC build. In this POC,
I have been using an Excel PowerPivot .xlsx file as a PerformancePoint
data source! I was skeptical but hey, it works even better
than I thought it might. I have PerformancePoint Filters, PerformancePoint
Analytic Charts, "slice/dice", drill downs etc. all sourced
from an Excel PowerPivot file that uses Tabular data sources instead
of SSAS OLAP cubes. It was fast, easy and actually pretty cool.
If you want to give this a try for yourself, here is a TechNet link
that describes the PowerPivot in PerformancePoint connection string. http://technet.microsoft.com/en-us/library/ff191197.aspx
The key was to create a data source that points to the URL of
the PowerPivot file in the connection string:
(Note that I did not need the TechNet mentioned hotfix to do this
stuff; try it first without that hotfix install hoopla.)
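For reference, the trick is simply to use the workbook URL as the Analysis Services data source in the connection string. A sketch of what that looks like - the site and library path below are hypothetical, so check the TechNet link above for the exact syntax in your environment:

```
Provider=MSOLAP;Data Source=http://contoso/BICenter/Documents/SalesModel.xlsx
```

PerformancePoint then treats the embedded PowerPivot model like any other tabular Analysis Services source.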
A few things that did not seem to work are Parent-Child Hierarchies
in Tree Filter Controls and the PerformancePoint-specific Time Intelligence
formulas. However, overall it works pretty well.
Undocumented SSRS 2008R2 with SSAS Stricter Parameter Naming Change
There is an undocumented SSRS 2008R2 with SSAS behavior change that
you may encounter with parameter names. Starting in 2008R2,
you cannot use reserved words as parameter names. For example,
I have used STRTOSET(@Set) as an SSRS parameter with SSRS 2005 and
SSRS 2008 for querying date ranges with Calendar control parameters
with SSAS cube reports. After upgrading to SSRS 2008R2 I could not
figure out what was going on when none of my SSAS cube date range
reports were working. A simple parameter name change from
@Set to @DateSet fixed them all! I hope
this tidbit helps someone else.
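To avoid hunting for this again, here is a tiny, hypothetical pre-flight check you could run over report parameter names before an upgrade. It is a plain Python sketch, and the reserved-word list is an illustrative subset (Set, Select, Member and so on are MDX keywords), not an exhaustive one:

```python
# Illustrative subset of MDX reserved words that make bad parameter names.
MDX_RESERVED = {"set", "select", "from", "where", "member", "case"}

def risky_parameter_names(param_names):
    """Return the parameter names that collide with MDX reserved words.
    Comparison is case-insensitive and ignores the leading '@', so
    '@Set' (as in STRTOSET(@Set)) is flagged."""
    return [p for p in param_names if p.lstrip("@").lower() in MDX_RESERVED]

print(risky_parameter_names(["@Set", "@DateSet", "@Member"]))
# → ['@Set', '@Member']
```

Running something like this against your report definitions before moving to 2008R2 would have saved me the debugging session described above.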
SQL Server 2008R2 Data Quality and Master Data Management
Just came across a Data Quality and Master Data Management free ebook from Solid Quality Mentors at
Cool! Interested in querying and monitoring Tweets?
Check out Microsoft Analytics for Twitter http://blogs.msdn.com/b/oneclickbi/archive/2011/06/10/microsoft-analytics-for-twitter.aspx
Analysis Services Operations Guide
This paper is a very deep dive into monitoring, capacity planning, and diagnosing SQL Server 2008 Analysis Services operations issues.
More MS BI with SAP
SQL CAT White Paper http://sqlcat.com/whitepapers/archive/2011/01/26/
on how to use PowerPivot to empower business users to integrate
and analyze SAP and non-SAP data to ease and accelerate analytics.
The paper reviews six scenarios where PowerPivot is used to analyze
SAP ERP or SAP BW data:
- Importing data into PowerPivot via an Excel PivotTable connected
to SAP BW
- Importing data into PowerPivot via SQL Server Reporting Services
connected to SAP BW
- Importing data into PowerPivot from a text file generated using
SAP BW Open Hub Service
- Importing data into PowerPivot from a SQL Server Analysis Services
cube connected to SAP BW created using ERP-Link iNet.BI
- Importing data into PowerPivot using Xtract PPV connected to SAP
- Importing data into PowerPivot using Simplement Data Liberator
connected to SAP ERP
TechEd 2011 BI Tips
Just finished watching "Enriching Your BI Semantic Models
Using Data Analysis Expressions (DAX)"
Demos of PowerPivot features and DAX language enhancements:
- Multiple table relationships
- Role playing dimensions
- Relationship diagrams designer
- Date type table and better Time Intelligence
- Hierarchies / Parent Child
- Restore Excel PowerPivot to BISM Server Model
- DAX server queries and language improvements
- Statistics STDEV, VAR, and more
- Rank, Top N, Switch and more
- Many to many optimizations via DAX
For a list of all PowerPivot 2012 enhancements check out: http://technet.microsoft.com/en-us/library/hh272053(v=sql.110).aspx
The Data Loading Performance Guide
This guide covers bulk loading design patterns and best
practices. It also talks about how the TPCH ETL World Record was
achieved. A MUST READ for every Microsoft data warehouse ETL developer. http://msdn.microsoft.com/en-us/library/dd425070(SQL.100).aspx
Analysis Services Performance Guide
While on the topic of performance guides, SSAS also has a similar MUST READ for SSAS developers.
Microsoft BI with SAP
Recently I was asked by a client about Microsoft SSRS on top of
SAP/BW. It turns out there are a lot of great demos and links
available on this topic. If you are thinking about more than
just SSRS - say SSIS, SSAS, PerformancePoint and PowerPivot - then
the SAP BI interoperability papers are a good overview of the end-to-end story.
- Short older demo video of using SQL Server 2005 Reporting Services with SAP NetWeaver Business Intelligence
- Whitepaper: Using SQL Server 2008 Reporting Services with SAP NetWeaver
- Whitepaper: How to use parameterized MDX in Microsoft SQL Server Reporting Services for SAP BW InfoProvider access
- Several more SAP BI interoperability papers
Data Warehousing Best Practices and Guides
- Data Warehouse Fundamentals
- Kimball Data Warehouse Design Methodology
- Best Practices for Data Warehousing with SQL Server 2008
- Data Warehouse Maturity Level