Showing posts with label analytics. Show all posts

Thursday, May 8, 2008

Composite Metrics

Here are some interesting examples of composite metrics - metrics whose values are determined by a mathematical formula involving other metrics. Composite metrics can be very effective in dashboards and scorecards, as they can quickly represent high-level information with a single number based on multiple underlying values (think Dow Jones Industrial Average).
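To make the idea concrete, here is a minimal sketch of one common way to build a composite: normalize each input to a 0-1 range, weight it, and rescale the blend to a single number. The function name, metric names, bounds, and weights are all hypothetical and invented for illustration; none of the metrics discussed in this post publish their formula in this form.

```python
def composite_score(values, bounds, weights, scale=10.0):
    """Blend several metrics into one number on a 0..scale range.

    values  -- raw reading per metric
    bounds  -- (best, worst) raw value per metric, used to normalize
    weights -- relative importance per metric (need not sum to 1)
    """
    total_weight = sum(weights.values())
    score = 0.0
    for name, raw in values.items():
        best, worst = bounds[name]
        # Normalize to 0 (best) .. 1 (worst) and clamp out-of-range readings.
        fraction = (raw - best) / (worst - best)
        fraction = min(1.0, max(0.0, fraction))
        score += weights[name] * fraction
    return scale * score / total_weight


# Hypothetical inputs, invented for illustration only.
readings = {"late_orders_pct": 30, "open_complaints": 1}
bounds = {"late_orders_pct": (0, 60), "open_complaints": (0, 4)}
weights = {"late_orders_pct": 3, "open_complaints": 1}

print(composite_score(readings, bounds, weights))  # 4.375
```

The weights are where the judgment lives: changing them changes what the single number "means," which is worth remembering whenever a composite shows up on a dashboard.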



The first is from traffic.com and is called the Jam Factor, which sounds like the name of a bad 80’s rock band but is an extremely useful metric.

The Traffic.com Jam Factor is like a Richter Scale for traffic. It’s an overall measure of the traffic intensity on a roadway, or on a section of a roadway. Because the Jam Factor calculation uses real-time and historical speed data from our digital sensors and those of our partners, as well as our detailed accident, construction and congestion information, it’s a comprehensive measuring tool that is unique to Traffic.com.

The Jam Factor is measured on a scale of 0-10, with 10 representing the worst traffic conditions. This numerical scale also provides color coding to give you a quick, at-a-glance picture of conditions on the roadways.


The second is a software analysis tool called WKO+ from TrainingPeaks. WKO+ provides a variety of tools that cyclists can use to analyze their training with data from heart-rate monitors, power meters, and GPS devices. Working with exercise physiologists, TrainingPeaks developed two metrics that are used in their product: Training Stress Score (TSS) and Intensity Factor (IF). TSS tells you how much stress you put on your body during a workout, and IF tells you how intense the workout was relative to your own threshold.

According to Gear Fisher, Chief Technology Officer at TrainingPeaks:

“The beauty of TSS and IF is that, combined, they can tell the amount of physiological stress put on a person’s body. They are all based on an individual rider’s threshold. So unlike heart-rate or even power zones, where 400 watts is 400 watts but if I weigh 300 pounds and the guy next to me weighs 150 pounds the end result is something dramatically different in terms of velocity. If I go out and do 200 TSS points, or Lance Armstrong goes out and does 200 TSS points, the relative effect on each of our bodies is the same. So he put his body through the same amount of stress as I did, even though it only took me 2 hours to get 200 TSS points and it might take Lance Armstrong 3 hours – or even an hour, depending on how hard he’s going.”
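The quote does not give the formulas, so the sketch below relies on the commonly published definitions as I understand them: IF is normalized power divided by the rider's threshold power, and TSS is scaled so that one hour ridden exactly at threshold scores 100 points. Treat both as assumptions rather than as TrainingPeaks' own code. The function names and rider numbers are hypothetical, and normalized power is taken as a given input rather than computed from raw data.

```python
def intensity_factor(normalized_power, threshold_power):
    """IF: effort relative to the rider's own threshold (1.0 = at threshold)."""
    return normalized_power / threshold_power


def training_stress_score(duration_s, normalized_power, threshold_power):
    """TSS: 100 points corresponds to one hour ridden exactly at threshold."""
    i_f = intensity_factor(normalized_power, threshold_power)
    return (duration_s * normalized_power * i_f) / (threshold_power * 3600) * 100


# Two hypothetical riders with very different thresholds (in watts),
# each riding two hours at exactly their own threshold.
amateur = training_stress_score(2 * 3600, 200, 200)
pro = training_stress_score(2 * 3600, 400, 400)
print(amateur, pro)  # 200.0 200.0
```

Because everything is divided by the rider's own threshold, the absolute wattage drops out, which is the point of the quote: the same score implies the same relative stress.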

The third is from a co-worker, whose Slapdown Index is calculated from the number of hours of sleep she had the night before, the length of her commute that morning, and the frequency of annoying email requests she gets before 10:00am. A high Slapdown Index is a leading indicator of her propensity to inflict bodily harm on those who dare approach her cube.

I’ve started using Jam Factor, TSS, and Slapdown Index to optimize my daily performance. What composite metrics have you found to be useful?




Tuesday, April 29, 2008

Taking the Heat Out of a Hot Kitchen

(Long-time fans of the Pittsburgh hockey team will understand the title of this post. Go Pens!)

We’ve all seen ‘heat maps’ used as visualization tools. A heat map is a graphical representation of data where the values taken by the variables are represented as colors. Often, heat maps are used in conjunction with an actual map – like the weather map on the back page of USA Today, or the real-time traffic display at traffic.com. And while the information from these maps is useful - “It’s cold and rainy in Boston in April, and the traffic on the Mass Pike is really bad at 5:00pm” - it’s not particularly insightful.

Here’s an interesting application of heat map visualization. It’s from Purdue University’s Project Vulcan, which is quantifying North American fossil fuel carbon dioxide (CO2) emissions at space and time scales much finer than have been achieved in the past. This 5-minute video provides an overview and shows several fascinating examples of the heat map visualizations used in representing the underlying data:



Again, some of the results are expected – "carbon dioxide emissions are high where there are lots of people spending lots of time in their cars" – but not overly insightful. More interesting, however, are the discoveries that researchers have made from analyzing the data in graphical form. There’s an excellent summary in the April 27, 2008 issue of the Boston Globe and two results stand out:

“When you rank America’s counties by their carbon emissions, San Juan County, NM – a mostly empty stretch of desert with just 100,000 people – comes in sixth, above heavily populated places like Boston and even New York City. It turns out that San Juan County hosts two generating plants fired by coal, the dirtiest form of electrical production in use today.”

And the heat map shows a small, bright-red area (high carbon emissions) in the northwest corner of New Mexico surrounded by wide expanses colored green.

“Purdue researchers discovered higher-than-expected emissions levels in the Southeast, likely due to the increasing population of the Sun Belt, long commutes, and the region’s heavy use of air conditioning. According to Kevin Gurney, assistant professor of atmospheric science at Purdue and the project leader, this part of the map also overturns the prevailing assumption that industry follows population centers: In the Southeast, smaller factories and plants are distributed more evenly across the landscape. Cities, meanwhile, prove less damaging than their large populations might suggest, partly thanks to shorter commutes and efficient mass transit.”
Work is underway to add Canadian and Mexican data to the Project Vulcan inventories. It will be interesting to see what other non-intuitive conclusions will be reached with these analytical and visualization techniques.

Monday, April 28, 2008

What Gets in the Way of Good Analytics?

Today at Bank Systems and Technology, there’s an article on the increasing importance of analytics to the banking industry. The story is fairly typical in the genre – “we used to manage by gut, but better information about our customers can help us in so many ways!”

What caught my attention was that quite a few of the contributed quotes came from places on the org chart that just don't exist at most organizations – the “Director of Statistics and Modeling” and the “Department of Insight and Innovation” to name two. These references were threaded alongside a frequent comparison of “mature” analytics areas, such as credit card predictive modeling, and “growing” areas, such as customer attrition modeling. This might suggest that organizations that create a dedicated function for analytics and related disciplines are more successful at spreading the competency internally than those that leave it to chance. This is the position put forth by Thomas Davenport in Competing on Analytics, and it is certainly intuitive in some respects.

It’s easy to envision a success story for such a group – evangelizing the power of analytics, introducing new skills to functions without a historical strength in analysis, etc. But what are the likely barriers and points of failure? How can an organization considering such an investment get ahead of the curve and mitigate the risk?

I’d speculate there are a handful of key reasons for struggle or failure:

  1. Lack of a starting point / quick win “pilot” - Perhaps it is difficult for a Center of Excellence-type structure to get off the ground without one demonstrated benefit within the first year or so
  2. Insufficient data trail - For businesses or domains without a solid trail of transactional information, it might be tougher to get started (there goes my idea for a chain of cash-only restaurants with no POS system)
  3. Lack of data architecture / infrastructure investment - If a new analytics team’s first report includes a request for $5 million just to organize the data, rough roads may be ahead
  4. Active resistance to the scientific approach - If a CEO is commonly heard to say “you guys think too much,” is that an organization likely to be hospitable to analytics?

What do you think is the biggest barrier? One I didn’t identify? What are the keys to success in building an organization's overall competency in analytics?

Thursday, April 24, 2008

Seeing What You Want To See

Before reading further, please watch this 1-minute video:



I first saw this video in an article on “car vs. bicycle” traffic accidents, which noted that motorists almost always say “I never saw him” or “She came out of nowhere” after snapping the bike and/or rider like a twig. The video, produced as a public-service message by Transport for London, is a brilliant illustration of how people often fail to see a change in their surroundings because their attention is elsewhere.

I’ll save my post on bicycle safety laws for another day, and instead ask whether this same phenomenon applies in BI or Performance Management applications – do your reports, scorecards, and dashboards show you “what you want to see” or are they designed so that you can spot the “moonwalking bear” in your company’s performance?

Here’s just one example, from an article in USA Today, where data was potentially misinterpreted and misused with disastrous results. Documents from Vioxx lawsuits indicate that Merck & Co. apparently downplayed evidence showing the pain-killer tripled the risk of death in Alzheimer’s-prone patients. Was Merck so anxious for the clinical trials to be successful that they “saw what they wanted to see” in the results? The company claims they did nothing wrong; we’ll see what the lawsuits ultimately determine.

Sometimes the data is good, but the visualization of that data is bad. Dashboards that look like this



are useful if you’re interested in variance analysis of high-level metrics. But the visualization (essentially a hardcopy report with traffic-lights) doesn’t help with the really interesting stuff: the drivers underneath those high-level metrics.

Advanced visualization methods are becoming more prevalent in dashboard designs. Over the next few weeks, we’ll look at some examples of visualization methods that can improve awareness of underlying data and help spot the moonwalking bears.

In the meantime, do you have examples of good techniques you’ve used or situations where better visualization of data would’ve helped improve performance?

Thursday, April 17, 2008

Politics: There's No "I" in "DIG"

What do sports, politics and DIG have in common? Well, of course, it’s prediction markets. There’s Protrade, and Tradesports and the Iowa Electronic Markets and, well, Las Vegas itself, kind of. But thinking across to the other themes of the conference, the similarities disappear. Much has been said and written about the use of data and analytics in sports (Moneyball, footballoutsiders.com, 82games.com), but the closest most politicos get to analysis is focus groups, commissioned polls and a cornucopia (or is it hodgepodge?) of cognitive biases (“we need to focus on ‘soccer moms!’”).

In the last few years, some individuals and organizations have begun to make a dent in this space; notable among them is Get Out the Vote: How to Increase Voter Turnout, by a couple of Yale professors who base their recommendations on actual research. More recently, Brendan Nyhan at Duke reports on his blog the founding of “The Analyst Institute,” which states as its mission “for all voter contact to be informed by evidence-based best practices. To ensure that the progressive community becomes more effective with every election, we facilitate and support organizations in building evaluation into their election plans.”

It’s not as if there isn’t incentive to win, and it’s not as if there’s a lack of interested funding. So why is politics behind the curve on data and analytics? Is there a rational (or irrational) belief that politics needs to be managed by gut? Or are there structural reasons? Or am I mistaken in thinking politics is late to the game, and that McCain is hiding the next Billy Beane somewhere on the Straight Talk Express?

Saturday, April 12, 2008

All-You-Can-Eat Seats

I learned last week that the Pittsburgh Pirates are joining a growing trend across Major League Baseball (as well as other sports) by offering an All-You-Can-Eat seating section during specific games for the 2008 season. Fans purchasing a $35 advance ticket ($40 on game-day) will receive a wristband providing access to a dedicated concession stand and all the hot dogs, hamburgers, nachos, salads, popcorn, peanuts, ice cream and soda they can eat.

I’m interested in the analytics behind this decision and wonder if the following conversation took place:

Marketing Executive: The fans want a winning team.

Baseball Executive: Are you kidding? Have you seen our lineup? What if we gave them unlimited hot dogs?

Marketing Executive: I’ll start working on the spreadsheet…

The seats are normally $17. At the $35 price, you’d need to stuff yourself with $18 worth of concessions in order to “break even” – not a particularly hard thing to do given current stadium prices.

Nutritionists and public-health officials oppose the plan, calling it a “recipe for obesity” as fans try to get their money’s worth by over-indulging. Team officials say they’re getting rid of tickets and making fans happy.

There are 164 seats in the All-You-Can-Eat section at the Pirates’ PNC Park. An advance sell-out would generate about $3,000 more in revenue at $35 than at the regular $17 price, but expose a liability of 164 hungry fans trying things like “Let’s have a hot dog every time a Pirates reliever gives up a hit,” which – given the Bucs’ early-season performance - could result in numerous emergency shipments from Oscar Mayer to the Golden Triangle.
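For anyone who wants to start on that spreadsheet, the arithmetic above can be sketched in a few lines. The prices and seat count come from this post; the per-fan food cost is a number I made up, since the team's actual concession costs aren't public here, and `team_net_gain` is a hypothetical helper.

```python
REGULAR_PRICE = 17        # dollars per seat, from the post
AYCE_ADVANCE_PRICE = 35   # dollars per seat, from the post
SECTION_SEATS = 164       # seats in the section, from the post

# What a fan must eat (at menu prices) to come out ahead.
fan_break_even = AYCE_ADVANCE_PRICE - REGULAR_PRICE

# Extra ticket revenue from an advance sell-out versus the regular price.
extra_revenue = fan_break_even * SECTION_SEATS


def team_net_gain(food_cost_per_fan, seats_sold=SECTION_SEATS):
    """Extra revenue minus what the food actually costs the team.

    food_cost_per_fan is a hypothetical input: the post does not say
    what the concessions cost the club to provide.
    """
    return seats_sold * (fan_break_even - food_cost_per_fan)


print(fan_break_even)   # 18
print(extra_revenue)    # 2952
print(team_net_gain(food_cost_per_fan=6))  # 1968, using an invented cost of $6
```

Note the asymmetry: the fan breaks even at menu prices, while the team breaks even at its own cost of goods, which is presumably much lower.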

This doesn’t strike me (strike, get it?) as a good deal for the team, and potentially has some problems for the fans as well – this Braves fan did some analysis on Atlanta’s plans to offer a similar promotion.

Do promotions like these ever make business sense? Often they are designed to be loss-leaders, enticing customers with a lower entry price with the hope they’ll spend more later. Perhaps these Pirate fans, tired of shelling peanuts while watching their pitchers get shelled, will buy an over-priced souvenir.

Other promotions are designed to attract first-time customers and turn them into repeat customers. So those who can’t get tickets to the Penguins playoff games might say “What the heck, let’s go across the river, watch some baseball, and see if we can eat 5 trays of nachos before the 7th-inning stretch.”

What analytical techniques have you used to evaluate promotional activities – before, during, and after the promotion?

Early results in the 'Burgh are inconclusive. At last Wednesday’s game against the Cubs, 67 All-You-Can-Eat seats were sold. The total attendance was 9,735 so gluttons comprised less than 1% of the crowd. But the game lasted 15 innings, so they had a really, really long time to eat. And, in an amazing coincidence, the Cubs player with the winning RBI was center fielder Felix Pie.

At Red Sox games they play the Dropkick Murphys’ “I’m Shipping Up To Boston” when the closer enters late in the game. The Pirates may need their own version – “I’m Throwing Up in Pittsburgh” – if this All-You-Can-Eat craze takes off…

Picture source: Keith Srakocic, Associated Press

Wednesday, April 9, 2008

In the Mood

How are you getting your engines revved for Vegas (er, the DIG conference)?

Consider this an open thread to share book and article recommendations related to data, analytics or enterprise 2.0. The poster with the most compelling suggestion will...be treated to their choice of a soft drink or adult beverage at the Green Valley Ranch by legendary DIG conference chair Pete “Memphis Ruined My Week” Graham.

Monday, April 7, 2008

Shifting Mindsets on BI

Pete Graham recently wrote a post on Using Business Intelligence in E2.0 that challenged each of us to bring business intelligence (BI) into the business conversation (versus creating a business conversation around BI). It was a prickly role reversal for those of us who like to look at the information value chain in a linear fashion beginning with data: data -> information -> knowledge (picture below of basic analytical information systems strategy). However, he provided a gentle but persuasive reminder that our mindsets and diagrams need to shift.

Let me explain. The idea of information and its use within business is an old one, but its mastery remains rather elusive. There are three core competencies that need to be achieved: Data IN, Information OUT, & Knowledge AROUND.

Data IN
Every time something happens within a business, there exists the opportunity for us to capture a piece of “data” that records its occurrence. For instance, when someone walks into a retail outlet, their visit can be recorded with a date stamp and time stamp. When the visitor buys a greeting card, the transaction is stored, inventory is marked down, and cash can be credited. If the person happens to pay by credit card, the purchase is tagged with the person’s card number. If the customer scanned their loyalty card, the transaction is immediately tagged with their profile information - and on and on. We could go on to name thousands of activities that are tracked within our organizations. These transactions let us know that something has happened!

This is not surprising. We live in a digital world where many of our actions are recorded. The challenge for businesses is to store this point-in-time data in a timely fashion and in such a way that it can be accessed quickly and easily in the future. I call this exercise the “Data IN” process. This is the opportunity for our organizations to capture all of the happenings within our business ecosystem. Unfortunately, this raw data is unwieldy to the average business person.

Information OUT
Therefore, an organization is tasked with putting this data into context so that users can see an evolving narrative about their business. This narrative helps us to understand the what, when, and how of our businesses and their performance within the marketplace. We get to see the single occurrence (or piece of data) within the context of the business story. This process of transforming data into “information” is invaluable and gives us the digestible analytics to manage, measure, and improve our businesses.

Getting “Information OUT” is achieved by answering both traditional and current business questions with information about the past or with forecasts about the future.

Knowledge AROUND
The last piece of the information value chain is to seize the Aha! moments and business insights and push them out to the organization. For instance, a store manager who sees a declining trend in her customer base may realize that a profound shift is taking place in her market. With the combination of some analytical reporting and some field observation, she may notice that a local competitor has cut deeply into her customer base. This “knowledge” needs to be shared with her organization so that other store managers can prevent a similar decline and so functional groups within the organization can support or assist with planning a response (or change to the business). Our companies have a need to easily and quickly share insights throughout the organization, or broadcast “Knowledge AROUND”.

Today’s E2.0 tools have brought renewed energy to the business conversation represented by the Knowledge AROUND piece of the value chain. Tools like blogging, microblogging, wikis, and prediction markets are democratizing the voice of the market-facing parts of our organizations! This is exciting because it allows the conversation that is happening out in the field, between the people in the field and the market (customers, vendors, etc.), to more effectively influence the information value chain. To Pete’s point at the beginning of this post, our organizations need to bring BI into the business conversation. If we do, we have the opportunity to consistently adapt to fulfill the needs of our changing markets.

Let’s keep thinking about the paradigm shifts required to bring BI to E2.0. What do you think? What topics should we be discussing?

Friday, April 4, 2008

Predicting Lost Luggage

I read an interesting article on prediction markets by Gary Stix in the March 2008 issue of Scientific American. The bulk of the article discusses the success rate of the Iowa Electronic Markets in predicting election results based on buying and selling “securities” – portfolios of contracts for both candidates. In presidential elections from 1988 to 2004, the Iowa Electronic Markets have predicted final results better than the polls three times out of four.

The article provides a great description of how the market works. It also highlights other prediction markets that allow speculators to predict almost any conceivable event, from a Chinese moon landing by 2020 (Foresight Exchange) to Katie Couric departing from CBS News (Intrade) to the first human-to-human transmission of avian flu (Avian Influenza Prediction Market).

While these events are important, and it might be fun to risk a few dollars predicting them, I was most interested in the internal markets that are being established to gauge the success of business efforts:

“Attracted by the markets’ apparent soothsaying powers, companies such as Hewlett-Packard, Google and Microsoft have established internal markets that allow employees to trade on the prospect of meeting a quarterly sales goal or a deadline for release of a new software product. As in other types of prediction markets, traders frequently seem to do better than the internal forecasts do.”

I wonder whether an internal prediction market might have helped with the disastrous opening of Heathrow Airport’s new Terminal 5. Despite headlines like this:



they clearly weren’t ready for their opening week - hundreds of cancelled flights, thousands of lost bags, and a financial and PR nightmare for British Airways and BAA.



There has been a lot of Monday-morning quarterbacking (or the equivalent soccer term) about the decision to open the new terminal in “big bang” fashion. Critics have suggested a phased approach might have reduced the problems, and cited other major infrastructure projects (like the new St. Pancras rail station) as examples. I’m guessing that the executive team considered both options and researched other airline terminal openings before making their decision. (I remember when the new Pittsburgh airport opened in 1992; the last flight landed at the old airport about 10 pm, an army of people and moving vans transferred all the operations equipment to the new terminal about a mile away, and the first flight landed at the new airport at 6:00 am. Despite some initial problems with the automated baggage-handling systems, this big-bang approach went much more smoothly than Heathrow’s.)

Would an internal market, reflecting the collective knowledge of the Heathrow employees, have predicted such a chaotic opening? Experts still don’t know exactly how prediction markets work. I’m wondering whether the accuracy might have something to do with the “degree of influence” the market participants have over the outcome.

For many events – like predicting the amount of snowfall in Central Park, or the outcome of the NCAA tournament games – a trader has no influence over the outcome and is, effectively, guessing.

For other events – like predicting an election outcome or the success of a new movie – a trader has limited influence. An individual vote influences election results (unless you’re a Republican living in Massachusetts). A person can attend the opening of a movie and tell all their friends how great it was.

Most intriguing are those events where traders have significant or considerable influence over the outcome – the sales manager responsible for meeting the quarterly target, the project manager trying to launch on time, or the baggage handlers at Heathrow who not only have to use the new systems but have to show up at a new location before they even see their first bag of the day.

Is there a correlation between “amount of influence” and “accuracy of prediction?” Can markets provide field-level insight that executives can’t (or won’t) see? If a “Terminal 5” market had existed and “successful opening” contracts were trading at low prices, would BA chief executive Willie Walsh have used this information to delay the opening, conduct more testing, and phase-in new operations over time?
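To show what “trading at low prices” would be telling an executive, here is a small sketch. It assumes a simple winner-take-all contract that pays $1 if the event happens and nothing otherwise, and it ignores fees and traders' risk preferences. The “Terminal 5” market, the function names, and the prices are all hypothetical.

```python
def implied_probability(price, payout=1.0):
    """Read a contract price as the crowd's estimated chance of the event."""
    return price / payout


def expected_profit(price, believed_probability, payout=1.0):
    """Expected gain per contract for a trader who holds a different belief."""
    return believed_probability * payout - price


# Hypothetical: "successful opening" contracts trading at 30 cents.
market_price = 0.30
print(implied_probability(market_price))

# A trader who privately thinks success is a coin flip sees a bargain;
# one who thinks it is a 10% shot would rather sell.
print(expected_profit(market_price, believed_probability=0.50))
print(expected_profit(market_price, believed_probability=0.10))
```

Under these assumptions, a 30-cent price is the field's way of saying “about a 30% chance,” and insiders who know better have a financial reason to move it.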

Does your company use prediction markets? Have they been successful?

NCAA Update: Well, the Selection Committee looks pretty good as – for the first time in NCAA men’s basketball history – all four No. 1 seeds are in the Final Four. Would a prediction market have helped? According to this news story,

“…of the 500,000 fans playing on CBSSports.com, more than 51,000 correctly predicted the final four teams…”

Assuming that some of those 10% were basketball junkies while others picked their brackets based on the teams’ jersey colors, can we draw any conclusions about a “wisdom of the crowd” factor in the NCAA tournament?

Thursday, April 3, 2008

Reliant Energy, Kelley Blue Book and Infosys to Speak on Analytics at DIG 2008

I am pleased to announce that Reliant Energy, Kelley Blue Book and Infosys will be speaking on the topic of Insights from Advanced Analytics at DIG 2008. Each organization will be presenting case studies on how they leverage analytics to drive better decisions.

Christi Megow and Jason Stults from Reliant Energy will discuss how they are using performance models to drive the operations planning process. Reliant Energy provides electricity and energy services to retail and wholesale customers in the United States. Reliant Energy uses driver trees, dashboards and scorecards to define and communicate corporate strategy to stakeholders and help drive the planning, forecasting, reporting and analysis process for decision-making.

Bruce Hoffman will discuss how Kelley Blue Book uses data gathered from their website to better predict market trends and vehicle pricing. KBB.com has over 12 million site visits per month that are captured, cleansed and consolidated for analysis. Using the clickstream data, KBB is able to provide better analytics both internally and externally to their OEM clients.

Our final presenter in the analytics theme is Romil Bahl from Infosys. In this session, Mr. Bahl will describe how Infosys aligns strategic priorities and objectives with performance measures. Using analytic techniques, Infosys is able to model expected financial performance for strategic initiatives in real time. Infosys integrates these analytics across industry and horizontal business units to enhance the ‘One Infy’ experience and deepen Infosys’ transformational capabilities.

Tuesday, April 1, 2008

Opening Day (Part 2)

To help me get ready for the upcoming baseball season, I recently purchased the 2008 Baseball Prospectus, an annual almanac of analysis and predictions from the folks who brought us 21st century metrics like VORP (Value Over Replacement-level Player) and BABIP (Batting Average on Balls In Play).

I got to thinking about the evolving perception of analytics in a baseball context. While mentioning those metrics at the water cooler of 2008 might get you some confused looks, you're probably not in danger of being stuffed in a locker, because analytics have begun to earn respect from a wider baseball audience. How, exactly, did this happen? Off the top of my head, I came up with four possible drivers of this change (not mutually exclusive):

  1. Boil the frog slowly – If On-Base Percentage really correlates better with runs scored than batting average, well, who am I to argue? And if that's true, then maybe I ought to listen to some of your other ideas...

  2. Myth-busting – Some assertions have been controversial (e.g. “There’s no such thing as a clutch player”), but maybe they’re plausible and interesting enough to get attention

  3. Case studies – Michael Lewis’ best-seller Moneyball and the 2004 and 2007 Boston Red Sox (World Series winners) have shined a public light on organizations that succeeded with analytics-friendly leadership

  4. Audience evolution – People in general have better quantitative reasoning skills than they did, say, 20 years ago, and so are more open to evidence-based insights

Here’s my question, and it’s not about baseball: In your organization and mine, a major barrier to extracting value from analytics is a rejection of the methods and implications from (for lack of a better term) the “old school” crowd. What’s the best way to make the case for analytics in your organization? Take on a single cherished nugget of conventional wisdom and prove it wrong? Or is that too risky? Is it better to plug along cautiously, incrementally adding some objectivity and trickling new metrics into the soup until the organization is ready? Or is it the Moneyball approach – find one manager willing to try the Kool-Aid and make something happen?

Opening Day (Part 1): Roger Clemens is Innocent

I’m excited to join such a distinguished group of bloggers, and proud to reveal a new analytic insight in my first post.

It seems to be the consensus among baseball fans that Roger Clemens used performance enhancing drugs in the latter stages of his Hall of Fame career. Accusers support their claims by referring to his apparent improvement when he left the Boston Red Sox for the Toronto Blue Jays before the 1997 season, his age-34 season. But advanced analytics tell a different story altogether.

Using some new functionality in the latest version of SAS, I created two new metrics. One is something I call Adjusted Prevented Runs (In League). Basically, it controls for a handful of factors not addressed in the pitching metrics most favored by sabermetricians today and in one number tells you how effective a starting pitcher is, relative to the rest of the league, in keeping the opposing team from scoring. I call the second metric Factored Outs Over League, because in terms of pitching performance, while keeping opposition runs down is the ultimate goal, the way to achieve it is by getting outs. It’s like a pitcher’s version of On Base Percentage.

Because of the relationship between preventing runs and getting outs, the best way to get an overall view is to multiply the two metrics. And when you look at the data, Roger Clemens didn’t actually register a significant change in his performance from his Boston years to the Toronto years.

Thus, with the help of APR(IL)*FOOL, the power of analytics is demonstrated once again: this time, to clear the name of an innocent American.

Wednesday, March 26, 2008

Correlation, Causation, and Flat Tires


I was out of town this week and received a call from my wife. The rear tire on her car was flat, she couldn’t figure out how to change it, and ultimately called AAA. The culprit turned out to be a nail. “The tow truck guy said he’s seeing lots of these in our town. He thinks it has to do with all the home construction that’s going on.”

Always on the lookout for good causation / correlation examples, I apologized for not being home to change the tire myself and quickly Googled “flat tire correlation.” The first hit I got was from a discussion forum for BMW owners. The thread was discussing whether high-performance tires were more prone to flats. “I suspect there is a correlation between flats in general and construction activity in your area,” reported chuck92103.

Interesting, but was it chance, coincidence, or a pattern?

I took the car to NTB this morning to have the tire repaired, and the serviceman, unprompted, provided more evidence for my fledgling theory. “Yep, we’ve been getting between 15 and 20 nails per day,” he said. “We’ve seen a lot more since all the home construction started back up.”

Well, that was all the proof I needed. Now, in addition to the sub-prime crisis, the high cost of gasoline, and whether my kid’s Thomas the Tank Engine is covered in lead paint, I had a new problem to worry about. Thousands of rogue nails, escaping from construction sites, hiding along the roads, and leaping up to impale themselves in the tires of unsuspecting minivan drivers throughout Metrowest Boston.

“I think I’ll take the train into town on Friday,” I thought. That is, until I saw this news item from yesterday’s paper. A freight car loaded with building materials broke loose from a siding at a lumber yard, rolled three miles down the main track, and collided with a commuter rail train. 150 people were injured (fortunately, none seriously).

Could it be any more obvious? Increased home construction requires more lumber. More lumber means more freight cars. More freight cars increase the probability that one will break loose and (somehow) thwart the devices intended to prevent runaways. And more runaways, of course, mean your ride home may be interrupted with potentially disastrous consequences. Not to mention all those flat tires.

My conclusion? Stop the McMansion-ization of the suburbs, and transportation safety will increase throughout the region!

Correlation analysis can be a powerful tool in determining the root causes of business performance. But like any tool, it has its limitations. Have you ever tried to correlate outputs and inputs and arrived at an unusual result?


NCAA update: The ScoreCard algorithm missed both first-round upsets. I correctly predicted Villanova over Clemson. But the Pitt Panthers guaranteed I wouldn’t have my office pool winnings available to pay for the flat tire by dropping their second-round game to Michigan State. Human intuition – 50%. Computer – 0%. Man does not win by analytics alone…


Wednesday, March 19, 2008

Big Dance Update

The NCAA Tournament prediction algorithm - Dance Card - referenced last week correctly predicted 30 of the 34 at-large selections, an accuracy rate of 88%. According to Dance Card's rankings, Illinois State, Dayton, Ohio State, and UMass should've been invited. The Selection Committee felt differently, and invited Villanova, Oregon, St. Joseph's, and Kansas State instead.

The creators of Dance Card have a second formula called Score Card which is designed to predict the results of NCAA tournament games. This might be a handy tool to use prior to submitting your brackets on Thursday morning!

Score Card predicts two first-round upsets: #9 Kent State over #8 Nevada Las Vegas, and #11 Baylor over #6 Purdue.

I'll add my predictions, completely devoid of analytics: a 12-seed always seems to beat a 5-seed, so I'll pick Villanova over Clemson (the Tigers come out flat after their ACC title-game loss to UNC; Villanova validates the Committee's theory that the Big East is the tougher conference) and the Pitt Panthers make it to the Final Four.

We'll see who's right...let the Madness begin!

Friday, March 14, 2008

Jeffrey Ma to Speak at DIG 2008

I am excited to announce that Jeffrey Ma will be a keynote speaker at DIG 2008. If you are not familiar with Jeffrey, he is the main character in the best-selling book Bringing Down the House and co-founder of the fantasy sports site Protrade.com. The book has been made into a major motion picture, 21, which will be released on March 28th. Below is the movie trailer.



Jeffrey's background is in number crunching, which is a required skill at the blackjack tables. Jeffrey will discuss how to use data-driven analysis to make smarter business decisions. The Protrade fantasy sports site leverages Jeffrey's proprietary analysis tools and metrics to predict performance. His approach is to help the audience move away from using their "gut" and instead base decisions on quantifiable data. We are looking forward to Jeffrey's keynote and expect the venue in Las Vegas to fit the bill.

Providing visibility into municipal performance

I was recently forwarded a link to an interesting New York City online reporting tool. The purpose of the tool is to provide visibility into the city's service performance. As a diehard Boston Red Sox fan, I find the next statement a tough pill to swallow: this is a pretty cool system for New York City, and for a government agency, to develop.

The City Performance Reporting (CPR) tool provides performance results and outcomes organized by 8 citywide themes, such as Education and Public Safety. The purpose of CPR is to provide transparency and accountability for city services to the "users" of these services...New York City residents.

Some of the features that CPR provides include aggregation of metrics, year-over-year performance, traffic lighting for variances, metric definitions and graphical representation of data. One additional feature that I thought was interesting was the general trend, up or down, for the aggregate of all performance metrics: for example, the number of measures that are improving versus the number of measures that are declining (screenshot to the left).

In comparison to some of the solutions that I have seen deployed, this one is pretty basic. That being said, the fact that the city has identified the performance measures that are critical for city services and has changed processes to capture these metrics is an important first step. To then "expose" this information to the general public in a live reporting system shows that they have committed to hold themselves accountable.

Based on reading the cryptic URLs while clicking through the different reports, it looks like CPR is using the Oracle Siebel Analytics tool. Just a guess.

Thursday, March 13, 2008

Who’s going to the Big Dance?

The NCAA Men’s Basketball Tournament tips off this weekend with the announcement of the 65-team field. The tournament, affectionately referred to as “The Big Dance,” is a yearly highlight for college basketball fans and culminates in the crowning of the national champion after 64 games over three weeks of March Madness.

Two of these fans, who also happen to be business professors, have developed an analytical model using SAS® software to predict “at-large” teams – those schools that do not receive an automatic bid to the tournament.

Jay Coleman, an operations management professor at the University of North Florida in Jacksonville, and Allen Lynch, an economics professor at Mercer University in Macon, Georgia, built a model that has achieved an impressive 94% accuracy rate in predicting tournament teams.

The actual selections are made by the NCAA Tournament Selection Committee, and will be announced this weekend. Coleman and Lynch used historical results from this Committee, along with 42 pieces of information to build their model. Interestingly, they found that only 6 items are significant in determining whether a team gets an at-large bid:

1) RPI (Ratings Percentage Index) Rank
2) Conference RPI Rank
3) Number of wins against teams ranked from 1-25 in RPI
4) Difference in number of wins and losses in the conference
5) Difference in number of wins and losses against teams ranked 26-50 in RPI
6) Difference in number of wins and losses against teams ranked 51-100 in RPI
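To make the idea concrete, here is a minimal sketch of how six factors like these could be combined into a single bid probability. It assumes a simple logistic form, and every weight below is invented for illustration; none of it is taken from the Dance Card model, whose actual functional form and coefficients are not given here.

```python
import math

# HYPOTHETICAL weights, invented for illustration only.
# They are NOT the Dance Card coefficients.
WEIGHTS = {
    "rpi_rank": -0.08,             # a lower (better) rank raises the score
    "conf_rpi_rank": -0.05,        # same idea for the conference's rank
    "wins_vs_top25": 0.60,         # wins against RPI 1-25 teams
    "conf_win_loss_diff": 0.15,    # conference wins minus losses
    "win_loss_diff_26_50": 0.30,   # wins minus losses vs. RPI 26-50
    "win_loss_diff_51_100": 0.20,  # wins minus losses vs. RPI 51-100
}
INTERCEPT = 2.0  # also hypothetical


def bid_probability(team):
    """Return an illustrative probability of an at-large bid (logistic form)."""
    z = INTERCEPT + sum(WEIGHTS[name] * team[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


# Two made-up team profiles: one strong resume, one weak one.
strong_team = {
    "rpi_rank": 10, "conf_rpi_rank": 2, "wins_vs_top25": 4,
    "conf_win_loss_diff": 8, "win_loss_diff_26_50": 3,
    "win_loss_diff_51_100": 4,
}
weak_team = {
    "rpi_rank": 70, "conf_rpi_rank": 9, "wins_vs_top25": 0,
    "conf_win_loss_diff": -2, "win_loss_diff_26_50": -1,
    "win_loss_diff_51_100": 0,
}

for label, team in (("strong", strong_team), ("weak", weak_team)):
    print(f"{label}: {bid_probability(team):.3f}")
```

In a real model the weights would be estimated from the Committee's historical selections rather than typed in by hand; the sketch only shows how a handful of inputs collapse into one number.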

Here is a link to their website and a 2-minute video about their model.

As they mention in the video, predictive models have many applications in the business world. These models can be difficult to build (the DanceCard model has been refined over 14 years) and validate (we don’t have the equivalent of a 10-member committee announcing their results live on CBS). But simplifications may exist (of 42 drivers in the DanceCard model, only 6 are significant) so don’t be afraid of the complexity.

Advances in analytical software, coupled with the increased availability of data, make predictive models a powerful tool to use in optimizing your business. And we have the benefit of a real-life market to test our ability to predict the future.

Today’s burning hoops question: Will Ohio State, who lost to Florida in last year’s championship game, even make it into this year’s field with a 19-12 record and an RPI of 48? DanceCard will have their final prediction later this week.

What’s your burning business question? Have you tried to build a predictive model to answer this question? How well did you do?

Thursday, March 6, 2008

Does your business have a canary?

Ronald J. Baker, in his book “Measure What Matters to Customers,” draws a parallel between leading indicators and the canaries used by coal miners to alert them to the presence of noxious gases. If the amount of carbon monoxide reached a heightened level in an underground cavern, the canaries would stop chirping, have trouble breathing, and in some instances even die. This early-warning system gave the miners the time they needed to evacuate the mine.

Baker suggests that, in addition to lagging measures (which follow changes in a business cycle) and coincident measures (which run in sync with a business cycle), firms should identify leading measures which anticipate changes in a business cycle.

Leading measures are harder to identify – whether financial or operational. Traditional reporting paradigms (P&L, Balance Sheet, and Statement of Cash Flows) focus primarily on lagging, financial measures of business performance. Identification of leading indicators requires development of a “theory of the business” in order to find those measures that are correlated with desired performance but can be measured prior to the results rather than afterwards.

Earlier this week, an article in the Wall Street Journal about the slowdown in the construction industry reported that “the American Institute of Architects’ monthly index of billings at architectural firms was down 14% in January from its peak in July. That means fewer construction projects will start this year, said AIA Chief Economist Kermit Baker.” In this example, the index of billings plays the role of the canary – an early-warning indicator about future negative performance. The challenge is: what do you do with this information? Can the AIA use this data to make operational adjustments which will increase billings and therefore increase future construction projects?

Does your business have a canary? What leading indicators do you use to predict future performance? How did you develop those indicators?

***
Mark Lorence bio – I’m a Director in the Strategy Practice at Palladium with 18 years of consulting experience in large-scale systems implementation, planning & budgeting solutions, and Balanced Scorecard implementations. My current area of focus is incorporating analytics into traditional Balanced Scorecard projects, integrating strategic planning and business planning processes, and augmenting these solutions with next-generation dashboards. I am a lifelong Pittsburgh Steelers fan now living in Boston and admit to a small amount of childish satisfaction from the results of this year’s Super Bowl.

Sunday, March 2, 2008

Welcome to the Analytics Theme

How do you make important decisions? Do you trust your gut - or crunch the numbers? Flip a coin - or build a spreadsheet? Ask your spouse - or ask your SPSS?

Does your company make decisions the same way? Decision-making is becoming a key management competency driven by globalization, complexity, and risk. Should we be making these bigger, harder, riskier decisions the same way we've decided things in the past?

In the Analytics Theme at DIG we're going to discuss how decision-making can be improved by developing performance models and applying different analytical techniques to those models.

These techniques - decision trees, probability and statistics, simulation, regression, and optimization - may be ideas you vaguely remember from your Management Science 101 class, or they may be things you and your company are doing on a daily basis. Either way, we want to talk about them.

Once limited to the "quant jocks" with their cumbersome analytical software packages, these techniques are now widely available thanks to advances in software tools and increased availability of data. And they're being used in some fascinating ways.

We'll hear some of these stories from our speakers and clinicians, but we're hoping to hear the best ones from you. How are you using the tools? How are you applying the techniques? And how are you improving decision-making through analytics?

To get started on this section of the blog, check out CNN's coverage of the Texas and Ohio primaries this Tuesday. I've been watching the campaigns with interest and have been fascinated with CNN's "Delegate Counting Map." They have the typical color-coded states (or counties, depending on the view) showing the results, but are able to run numerous simulations of future scenarios just by tapping a few icons..."If Obama wins the remaining states 55-45, here's what the delegate-count will look like heading into the Pennsylvania primary..."

An intuitive user interface, lots of good data, and the ability to quickly run simulations - that's a powerful analytical environment. Wouldn't it be great to apply the same ideas to your monthly reporting environment?
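The kind of what-if scenario described above can be sketched in a few lines. This is a deliberately simplified illustration: it assumes the remaining delegates are split strictly in proportion to a single vote share, which is not how real delegate allocation works, and the candidate labels and totals are invented rather than actual 2008 counts.

```python
def project_delegates(current, remaining, share_a):
    """Split the remaining delegates proportionally and add them to current totals.

    `current` maps the hypothetical candidates "A" and "B" to their delegate
    counts; `share_a` is candidate A's assumed share of the remaining vote.
    """
    gained_a = round(remaining * share_a)
    gained_b = remaining - gained_a  # whatever A does not win goes to B
    return current["A"] + gained_a, current["B"] + gained_b


# Invented starting point: not real delegate counts.
current = {"A": 1000, "B": 900}
remaining = 500

# Run a few scenarios, the way the on-screen map lets you tap through them.
for share_a in (0.45, 0.50, 0.55):
    total_a, total_b = project_delegates(current, remaining, share_a)
    print(f"A wins {share_a:.0%}: A = {total_a}, B = {total_b}")
```

The same pattern, a small function plus a loop over assumptions, is what a scenario view in a monthly reporting environment boils down to: change one driver and watch the totals move.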