Friday, April 25, 2008
An interesting side note is that Teradata's marketing slogan on their website is "The Power of Choice". It is very Keanu Reeves from the Matrix trilogy. I wonder whether there is some subliminal message there with HP's DW solution, Neoview.
In this second part of the article he takes us a little deeper into several of these components:
(i) Transformation of data that does not meet expected rules (for example, the contents of data elements and the validation of referential integrity relationships)
(ii) Mapping of data elements to some standard or common value
(iii) Cleansing of data to improve the data content (for example to cleanse and standardize name and address data) that extends the data transformation process a step further
(iv) Determining what action to take when those integration rules fail
(v) Ensuring proper ownership of the data quality process
Data transformations may be as simple as replacing one attribute value with another or validating that a piece of reference data exists. The extent of this data validation effort depends on the extent of the data quality issues and may require a detailed data quality initiative to understand exactly what issues exist. At a minimum, the data model that supports the data integration effort should be designed to enforce data integrity across its components and to enforce data quality on any component that contains important business content. The solution must have a process in place to determine what actions to take when a data integration issue is encountered, and should provide a method for communicating and ultimately resolving those issues, typically enforced by implementing a solid technical solution that meets each of these requirements.
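To make the ideas above concrete, here is a minimal sketch of a rule-driven validation and transformation pass. The reference set, status mapping, and record layout are all invented for illustration; a real implementation would drive these from the data model and quality rules described above.

```python
# Illustrative only: invented reference data and mapping, not from the article.
VALID_REGIONS = {"NA", "EMEA", "APAC"}          # reference data to validate against
STATUS_MAP = {"A": "ACTIVE", "I": "INACTIVE"}   # simple one-for-one value replacement

def process_record(record):
    """Return (record, issues). Failed rules are reported, not silently dropped."""
    issues = []

    # Transformation: replace one attribute value with another
    record["status"] = STATUS_MAP.get(record.get("status"), record.get("status"))

    # Validation: confirm a piece of reference data exists
    if record.get("region") not in VALID_REGIONS:
        issues.append(("region", record.get("region"), "unknown region code"))

    return record, issues

rec, problems = process_record({"status": "A", "region": "LATAM"})
print(rec["status"])   # the transformed status value
print(problems)        # issues routed for communication and resolution
```

The key design point is that failed rules produce issues for communication and resolution rather than silently dropping or passing bad records.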
As organizations grow via mergers and/or acquisitions, so too does the number of data sources, and eventually a lack of insight into overall corporate performance develops. Integration of these systems upstream may not be feasible, so the BI application may be tasked with this integration dilemma. A typical example is the integration of financial data from what used to be multiple organizations, or the integration of data from different geographical systems.
This integration is a challenge. It must consider (i) the number of sources to be integrated, (ii) commonality and differences across the different sources, (iii) requirements to conform attributes (such as accounts) to a common value while retaining visibility to the original data values and (iv) how to model this information to support future integration efforts as well as downstream applications. All attributes of all sources must be analyzed to determine what is needed and what can be thrown away. Common attribute domains must be understood and translated to common values. Transformation rules and templates must be developed and maintained. The intended data usage must be clearly understood, especially if the transformation is expected to lose visibility into any data that is transformed (for example, when translating financial data to a common chart of accounts).
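As a sketch of point (iii), conforming attributes to a common value while retaining the original: the mapping table and account codes below are made up, but the shape of the solution is typical.

```python
# Hypothetical mapping of source account codes to a common chart of accounts.
ACCOUNT_MAP = {
    ("LEGACY_A", "4000"): "REV-SALES",
    ("LEGACY_B", "410"):  "REV-SALES",   # different source code, same conformed account
    ("LEGACY_A", "6100"): "EXP-TRAVEL",
}

def conform(source_system, source_account):
    # Keep the original system/account alongside the conformed value so the
    # transformation never loses visibility into the source data.
    return {
        "source_system": source_system,
        "source_account": source_account,
        "conformed_account": ACCOUNT_MAP.get((source_system, source_account), "UNMAPPED"),
    }

row = conform("LEGACY_B", "410")
print(row["conformed_account"])  # REV-SALES
print(row["source_account"])     # original value retained: 410
```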
Making Information Accessible to Downstream Applications
With this data integration effort in place, it is important to understand the eventual usage of this information (downstream applications and data marts) and to ensure that downstream applications can extract data efficiently. The data integration process should be designed to support the requirements for integrating data, that is, to support the data acquisition and data validation/data quality processes (validation, reporting, recycling, etc.), to be flexible enough to support future data integration requirements and to support historical data changes (regardless of any reporting expectations that may require only a subset of this functionality). It should also support both the push and pull of data. With that in mind, the data integration model should provide metadata that can assist downstream processes (for example, timestamps that indicate when data elements are added or modified), partition large data sets (to enable efficient extraction), provide reliable effective dating of model entities (to allow simple point-in-time identification) and be designed consistently.
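A small sketch of the two metadata aids mentioned above, using invented entities: modification timestamps let downstream processes pull only changed data, and effective dating allows simple point-in-time identification.

```python
# Illustrative rows for a single customer entity; all values are invented.
from datetime import date

rows = [
    {"key": "CUST1", "value": "Acme", "eff_from": date(2008, 1, 1),
     "eff_to": date(2008, 3, 31), "modified": date(2008, 1, 1)},
    {"key": "CUST1", "value": "Acme Corp", "eff_from": date(2008, 4, 1),
     "eff_to": date(9999, 12, 31), "modified": date(2008, 4, 1)},
]

def changed_since(rows, cutoff):
    """Downstream pull: only rows added or modified after the last extract."""
    return [r for r in rows if r["modified"] > cutoff]

def as_of(rows, key, on):
    """Point-in-time lookup via effective dating."""
    return next(r for r in rows
                if r["key"] == key and r["eff_from"] <= on <= r["eff_to"])

print(len(changed_since(rows, date(2008, 2, 1))))        # 1
print(as_of(rows, "CUST1", date(2008, 2, 15))["value"])  # Acme
```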
The data integration process may at first seem daunting. But by breaking the BI architecture into its core components (data acquisition, data integration, information access), developing a consistent data model to support the data integration effort, establishing a robust exception handling and data quality initiative, and finally implementing processes to manage the data transformation and integration rules, the goal of creating a solid foundation for data integration can be met.
Thursday, April 24, 2008
I first saw this video in an article on “car vs. bicycle” traffic accidents, which noted that motorists almost always say “I never saw him” or “She came out of nowhere” after snapping the bike and/or rider like a twig. The video, produced as a public-service message by Transport for London, is a brilliant illustration of how people often fail to see a change in their surroundings because their attention is elsewhere.
I’ll save my post on bicycle safety laws for another day, and instead ask whether this same phenomenon applies in BI or Performance Management applications – do your reports, scorecards, and dashboards show you “what you want to see” or are they designed so that you can spot the “moonwalking bear” in your company’s performance?
Here’s just one example, from an article in USA Today, where data was potentially misinterpreted and misused with disastrous results. Documents from Vioxx lawsuits indicate that Merck & Co. apparently downplayed evidence showing the pain-killer tripled the risk of death in Alzheimer’s-prone patients. Was Merck so anxious for the clinical trials to be successful that they “saw what they wanted to see” in the results? The company claims they did nothing wrong; we’ll see what the lawsuits ultimately determine.
Sometimes the data is good, but the visualization of that data is bad. Dashboards that look like this
are useful if you’re interested in variance analysis of high-level metrics. But the visualization (essentially a hardcopy report with traffic-lights) doesn’t help with the really interesting stuff, which are the drivers underneath those high-level metrics.
Advanced visualization methods are becoming more prevalent in dashboard designs. Over the next few weeks, we’ll look at some examples of visualization methods that can improve awareness of underlying data and help spot the moonwalking bears.
In the meantime, do you have examples of good techniques you’ve used or situations where better visualization of data would’ve helped improve performance?
Dennis Newman will present how the Boston Globe has established an Enterprise Information Management (EIM) initiative to address data integrity challenges to support an enhanced customer reporting platform. The Globe established a set of common definitions for customer-centric metrics to deliver sales and marketing analytics.
David Roberts will discuss the Central Intelligence Agency's approach to maximizing value from enterprise data assets. The CIA is highly dependent on quality data and information to drive decisions. David will present the CIA's enterprise data architecture and the value that the intelligence community has gained by having a robust data platform.
Dan Power from Hub Solutions Design will present on the importance of establishing a master data management initiative and platform. Dan has over 20 years of experience in enterprise technology, with a specialization in master data management (MDM), customer data integration (CDI) and enterprise data architecture.
Wednesday, April 23, 2008
For those either already registered or considering attending the DIG event in May, here are some serious and not-so-serious reasons to attend:
- Architect your enterprise data warehouse and create your personal “one version of the truth” to rationalize your missing expense receipts from your DIG conference trip.
- Have Jeffrey Ma sign your copy of “Bringing Down the House” after he talks about harnessing the power of rational, quantitative analysis to make smarter business decisions.
- See case studies from Reliant Energy, Kelley Blue Book and Infosys on how their respective organizations are embedding advanced analytic techniques into their management processes to make better decisions.
- Post to your blog, co-create your wiki, join the DIG social network and Twitter your impressions of DIG in real-time to become part of the Enterprise 2.0 phenomenon.
- Meet Andrew McAfee, who coined the term “Enterprise 2.0”, as he discusses the value creation that organizations are realizing through Web 2.0 concepts and technologies.
- Hear the Boston Globe and Central Intelligence Agency speak about leading practices to capture, organize and establish a common set of information assets to create one version of the truth.
- Test your driver tree analysis techniques and advanced analytic dashboards at the card tables to pay for your conference registration.
- Listen to Charles Fishman, award-winning journalist at Fast Company and author of “The Wal-mart Effect”, speak about leading organizations that are using information to uncover insights about their customers and what it means to be a “fast company”.
- Understand how Google, AT&T and the BBC are all leveraging Web 2.0 technologies such as blogs, wikis, social networks, tagging and prediction markets to drive mass collaboration inside and outside the organization.
- Attend the only conference that combines the theories, concepts and real world practical examples of data architecture, analytics and Enterprise 2.0 in a single agenda.
Several articles/posts from the past few weeks push on this very topic. The McKinsey Quarterly (registration required) published an article on Innovation lessons from Pixar which highlights how Brad Bird, Pixar’s Oscar-winning director, motivates his people by including them in the dialog. Fast Company spoke with Gartner researcher, Tom Austin, about how IT’s Not about the Technology but rather information technology is about leveraging the people. And Susan Scrupski posted her comments in SocialMediaToday on Corporate Antisocial Behavior: the Enemy is Us.
Each of these articles pushes us, in one way or another, to focus on the key driver behind business success: motivated teams of people. People are paramount to making things happen. E2.0 tools are technologies that magnify and broadcast the culture that empowers people.
I see three opportunities in E2.0:
1) Get people talking about the business - E2.0 can highlight and build conversation around the “social objects” of a business. In my last post, I spoke about social objects: the things that allow people to connect and be in dialog. In business, these objects are things like business goals, customer wants, or new innovations.
2) Get the facts to the people - E2.0 can reduce what I call the perception gap: the gap between what you think is happening in the business and what is actually happening. Once the facts are clear, true dialog and problem solving can begin. E2.0 tools can integrate business intelligence into mainstream business conversations. These same tools can then be used to solve problems collaboratively by channeling both experts' thoughts and front-line operators' experiences into creative solutions.
3) Equip people with a contextual understanding of the business - E2.0 can provide a more holistic understanding of a business. Through these tools, people are exposed to and vicariously taught about tangential yet pertinent topics beyond their specialized skills. This broader knowledge gives people the insight to act or respond with a systems-thinking mindset that is coherent with the overall business. In this way, people are more naturally prepared to act in a manner that supports the business and adapts it to its changing marketplace.
All three of these benefits of E2.0 foster a more pronounced business culture - good or bad.
Do you agree?
Tuesday, April 22, 2008
The research paper highlights four key needs for an accurate prediction market:
- Ability to aggregate information and knowledge from individuals
- Incentives to encourage active participation
- Feedback to participants based on market prices
- Anonymous trading
The results from the case study were quite positive. Acxiom Corporation was the test case and used the Inkling Markets software to host the market to predict 26 milestone events of an internal IT project. Two results jumped out at me. The market was 92% accurate on the milestone events (24 for 26) and had an 87% participation rate (33 participants). There was also a higher perceived level of collaboration as a project team, which had positive impact on the outcome.
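For the curious, prediction-market platforms commonly aggregate individual trades into a probability-like price using an automated market maker such as Hanson's logarithmic market scoring rule (LMSR). The sketch below is illustrative only; the paper does not specify Inkling's internal mechanism, and the liquidity parameter and trade sizes are invented.

```python
# Sketch of an LMSR market maker for a binary (yes/no) milestone market.
import math

def lmsr_price(q_yes, q_no, b=100.0):
    """Instantaneous price (implied probability) of the YES outcome.
    q_yes/q_no are shares outstanding; b is the liquidity parameter."""
    e_yes, e_no = math.exp(q_yes / b), math.exp(q_no / b)
    return e_yes / (e_yes + e_no)

q_yes = q_no = 0.0
print(round(lmsr_price(q_yes, q_no), 2))  # 0.5 (no information in the market yet)

# Participants who believe the milestone will hit buy YES shares,
# pushing the price (the market's aggregate forecast) upward.
q_yes += 80
print(lmsr_price(q_yes, q_no) > 0.5)  # True
```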
The authors of the paper are Herbert Remidez, Jr. Ph.D. and Curtis Joslin from the University of Arkansas. Looking forward to seeing further output from the research.
This data is critical for marketers when deciding where to spend their ad dollars. You should read the full article to gain a full appreciation of the entire story, but here are a few snippets that are relevant to the importance of having “one version of the truth”.
Sarah Fay, chief executive of both Carat and Isobar US, ad companies owned by Aegis group said “We have not expected the numbers to be 100%”. It’s good to see that no expectations were being set out of the gates. Not sure this would fly when discussing something like revenue for an organization.
The article goes on to point out that comScore and Nielsen data don't always match up. "To complicate matters, disparities between comScore and Nielsen data are common, as the two companies use different methodologies to measure their audience panels." This is exactly the kind of thing we hear inside the four walls of a corporation about something like a measure calculation rule.
Brad Bortner, an analyst with Forrester Research, points out: "There is no truth on the Internet, but you have two companies vying to say they are the truth of the Internet, and they disagree."
And finally, my favorite quote in the article came from Sean Muzzy, senior partner and media director at digital ad agency http://www.ogilvy.com/neo/. “We are not going to look at comScore to determine the effectiveness of Google. We are going to look at our own campaign-performance measures”. This would be the equivalent of “if you don’t like the results, try a different measure.”
I have always wavered on the need for highly accurate data for certain types of measurement, especially something like clickstream analysis. I guess that wavering has now ended: I have fallen in with the camp that treats this like the other types of data that require precision and accuracy.
Monday, April 21, 2008
Outsourcing your data warehouse
In this article on TWeb by Jannie Strydom, the idea of outsourcing an organization’s data warehouse is proposed. The primary drivers are around lack of skills to properly maintain and keep the warehouse relevant to the business. As much as I agree with outsourcing the components of a warehouse that are repetitive and process oriented (loading data, maintaining production processes, fixing errors), it is a slippery slope to outsource aspects that are critical to meet the business needs. A strong understanding of an organization’s business model and needs should be weighed heavily against the value gained (typically cost savings) by outsourcing certain aspects of a data warehouse, especially those that can help facilitate better management of performance.
More on Enterprise Mashups
I made a post a few weeks back on BI and enterprise mashups. This news story from the O’Reilly Web 2.0 Expo caught my eye because of its mention of integration with Excel. In particular, the article discusses going beyond the geographic mashups being done with Google Maps and starting to “mash” multiple external data sources for enhanced analytics inside of Excel. One example is pulling competitor data directly into your own organization’s performance reporting (I am working at a client where modeling this in their data mart has become a bit of a challenge). Two software companies mentioned in the article that provide these types of mashup services and software are Kapow Technologies and JackBe Corporation. I have not kicked the tires on these two products, but they sound extremely valuable to a business user trying to consume multiple different types of data sources in a single “view”. I would like to understand how these products might fit into an overall information architecture from a consistency and “one version of the truth” perspective.
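Stripped of the vendor tooling, the core mashup idea is just joining an external feed with internal data into one view. A toy sketch follows (both data sets and the gap metric are invented; real products add the scraping, refresh and governance around this):

```python
# Invented internal and external data sets, in $M by quarter.
internal_sales = {"2008-Q1": 120.0, "2008-Q2": 135.0}   # our own revenue
competitor_feed = {"2008-Q1": 150.0, "2008-Q2": 140.0}  # external/scraped source

def mashup(internal, external):
    """One combined 'view' keyed by period, with a derived comparison metric."""
    view = {}
    for period in internal.keys() & external.keys():
        view[period] = {
            "us": internal[period],
            "them": external[period],
            "gap_pct": round(100 * (internal[period] - external[period])
                             / external[period], 1),
        }
    return view

combined = mashup(internal_sales, competitor_feed)
print(combined["2008-Q2"]["gap_pct"])  # -3.6
```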
Business Intelligence and My Carbon Footprint
This one builds on the “everything must be environmental” green movement. At the WTTC Global Travel & Tourism Summit in Dubai, Travelport announced their new Carbon Tracker reporting tool. It is designed for travel agencies and corporations to track the carbon footprint of corporate travel. It provides different analytic views using standard environmental calculations. The reporting tool includes travel budget and environmental impact analysis and comparisons to other modes of travel (car, bus, train, flying). There is a slick product overview with screenshots on the Travelport website. I am considering using this tool to calculate the carbon footprint of DIG in Las Vegas versus another location in the US. I may need to recommend that the speakers ride bicycles to the event to reduce our environmental impact. I know Mark Lorence would be up for it (Mark is an avid cycling enthusiast who continues to educate me on the nuances of professional cycling…we have a prediction market already established on his first post that links Lance Armstrong to Business Intelligence). I may need to start referring to the first theme of DIG as "An Inconvenient One Version of the Truth".
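Out of curiosity, here is a back-of-the-envelope version of the kind of comparison such a tool makes. The emission factors are rough illustrative values in kg of CO2 per passenger-km, not Travelport's actual calculations.

```python
# Illustrative emission factors (kg CO2 per passenger-km); not authoritative.
EMISSION_FACTORS = {"flight": 0.18, "car": 0.17, "train": 0.04, "bus": 0.07}

def trip_footprint(mode, distance_km, travelers=1):
    """Estimated kg of CO2 for a trip, for comparing modes of travel."""
    return EMISSION_FACTORS[mode] * distance_km * travelers

# Compare modes for one attendee traveling roughly 700 km to the conference.
for mode in sorted(EMISSION_FACTORS):
    print(mode, round(trip_footprint(mode, 700), 1))
```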