Jill [or Jack] of all trades, Master of none…In the Business Intelligence context, which is better? You decide…

So, all of the holiday festivities are done, and I start my new job in 5 days. After a much needed sojourn and period of deep self-reflection, I am back, and with a fresh perspective. Me being me, my "ah-ha" of awareness centered deeply on the power of scorecards. Whether it is an activity you are doing to monitor the health of your business, or something you do for fun (albeit without recognizing that you are doing it) like golf, the act of scoring something, anything really, is an activity as old as time. Scoring is the physical act of putting pen to paper or, more aptly stated, the recording of live scores/metrics/key performance indicators and the delivery of said metrics in a way that makes sense to the scorer. Often, the scorer can clearly articulate the method to their "scoring madness", but upon displaying the scorecard to onlookers, will most notably be asked to translate the meaning of the scorer’s own perfected, personalized means of recording new entries. "Oh, that is my symbol for the bogey I landed at hole 8; or Red means it is generally under forecast for revenue but for satisfaction, it means we were within a range that was under realized…" Huh? And typically, the scorer, on noticing the look of confusion spread across your face, or in responding to your request for further contextual information, will explain [at length] their scoring methodology, oftentimes losing the interest of the listener. Come on folks, even I, who was quoted as "getting personally excited by [the program] Excel," can get bored; after all, for all the excitement that I find in talking turkey with BI industry folks, it truly isn’t the most exciting space in the world. It isn’t like announcing "I’m a rocket scientist" or "I’m a CIA operative"; "I’m a business intelligence consultant" just isn’t as shiny on the surface.
Oh but it is…The typical devil’s advocate, I see BI practitioners as some of the most cunning problem solvers, drawing on several best practices and, of course, experience to build an analytical web of cause and effect and to drill down into the architectural infrastructure at the core of any BI solution. The "5 Whys" of Lean Six Sigma stipulate asking "Why?" 5 times of any question when you are trying to solve a problem. Here is an illustration from a typical personal goods retail store: The sales manager asks his analyst for a report showing the winter sales actuals, since he knows from his staff meeting that his team missed forecast by a substantial amount. Director 1 says to manager X, "I believe we didn’t have the inventory on the shelves to support the winter marketing campaign," which he surmises from his pass-through of the store during the Christmas rush. "I overheard two customers complaining about our practices of offering promotions without stocking up accordingly." These practices create a sense of panic-induced scarcity, as customers start buying the products fueled by a fear of not being able to get them anywhere else. To the customer, it merely creates feelings of annoyance and causes a buyer’s nightmare, as shoppers try to get a leg up on their fellow shoppers [think 2007 Christmas gift nightmare trying to acquire a Nintendo Wii].
The director asks you, the manager, to put together some slides for the leadership team addressing the potential causes, and adds that he is interested in your opinions on mitigating this risk in the future. As he is leaving your office, he adds that his strong preference is for you to start with his hypothesis (baked, mind you, in nothing but a "gut feel" and an attempt to spy on customers) and let the data found there drive the other causal paths.
Ask yourself as the manager, where do I begin? What do I glean from the problem: "We didn’t hit forecast for winter sales." Secondarily, you were told "we didn’t have the inventory on the shelves to support the winter marketing campaign."
Well on the surface, manager X knows [for a fact]…
  • –It was Q4 2007 (winter campaign)
  • –Forecasted numbers are generated by your team
  • –The marketing team was also affected to some extent (the marketing campaign is part of the director’s hypothesis as a cause, but for now all you know is that a campaign was also running during the period in question). Remember, ‘correlation isn’t causation’.
  • –You definitely know you didn’t ask enough of the right questions, since pondering the factual points leads you to wonder why the director believes it is related to inventory shortages; questions like "did he see some report outlining sales for the period…by category of retail item…did we hit targets for any product, which was buried in the shortages by other products…"

Assumptions to ponder in your research:


  • –Was the marketing campaign related and, if so, how is it related?
  • –What items weren’t on the shelves, broken down by a specific set of date-related stratification factors: during the after-Thanksgiving sales, the week before Christmas, the day after Christmas, New Year’s Day, etc.?
  • Extent of shortages was great enough that customers were complaining out loud, implying this happens all of the time [a point you are convinced is wrong considering you are responsible for forecasting sales and reporting to leadership how close/far you came to hitting store targets, and would have noticed this problem in the data].

As the manager, you immediately log into your instant messaging service and see that your best analyst is online. You call her into your office along with your assistant to have a debrief meeting. Analyst I is told the situation, and ponders the problem for a moment. As the manager rattles off the director’s hypothesis, the analyst thinks about the "5 Whys" of Lean, zeroing in on the 2nd part of the problem statement: "we didn’t have the inventory on the shelves to support the winter marketing campaign."

Why 1: Why does the director believe we didn’t have the inventory as marketed in our winter marketing campaign? This prompts you to pull the list of items from the campaign and the related sales for the time period into a dataset. The results show you that 2 of the 5 items had substantially lower sales than the other 3 items. In fact, the other 3 items hit target in terms of transactions sold and revenue generated, but why…

Why 2: Why did the other 2 items, which had revenue shortfalls in the double digits, look to have sold enough units to not raise flags from what was forecasted?

So, you pull up the ordering list and compare the items ordered to what was forecasted and see everything in check. You notice that you issued an unusually high number of rain checks for the 2 items, which you safely note in the back of your head to research later.

You query the database once again and generate an inventory report from your central warehouse, and nothing is out of the ordinary; they supplied what was ordered. You also notice, however, that what was ordered was the same volume as what was ordered the previous month. This doesn’t pass your sniff test…How is it possible that the month prior to a huge marketing campaign and the campaign’s own sales month were forecasted to be the same?
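The sniff test above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `flat_campaign_orders`, the period labels, and the unit counts are all hypothetical and not taken from the store’s real data.

```python
def flat_campaign_orders(orders, campaign_periods):
    """Return campaign periods whose ordered units did not exceed the prior period's.

    orders: list of (period, units_ordered) tuples in chronological order.
    campaign_periods: set of period labels in which a campaign ran.
    """
    flagged = []
    # Walk consecutive pairs of periods: (previous, current).
    for (_prev_period, prev_units), (period, units) in zip(orders, orders[1:]):
        if period in campaign_periods and units <= prev_units:
            flagged.append(period)
    return flagged


# Hypothetical order history: December (the campaign month) matches November.
orders = [("2007-10", 900), ("2007-11", 1200), ("2007-12", 1200)]
print(flat_campaign_orders(orders, {"2007-12"}))
```

A flagged period is not proof of a problem; it is simply a prompt to ask the next "Why".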

You cross check the list of items shipped to the store, and everything was accounted for by receivables. Upon further review, you leave this train of thought without noticing the slight change in numbers that your query returned vs. what was in the report from your boss. The report came from the powers that be, so you didn’t think to question where the numbers came from; nor do you now, but I wanted to tickle you with that thought for when this applies to your real world situation of things to think about – the effects of dirty data on quantification of business intelligence gaps / problems. But I digress…Back to our shared train of thought…

So, if we ordered what we projected we would need which was supplied without any replenishment challenges, why 3 enters the scene… 

Why 3: Why, if the supply chain process worked as expected, did we issue an unusually high number of rain checks for the items advertised in our marketing campaign during what we consider to be our highest retail sales period of the year?

While pondering this question, you remember and ask yourself Why 4:

Why 4: Why was the warehouse overstocked in the two items that showed revenue shortfalls more than any other? This prompts you to open a spreadsheet. After pasting the results of your query into column A, which you label in row 1 as ‘Forecast from System 1’, you enter ‘Forecast from Leadership Report’ in column B, ‘Time Period’ in column C, and ‘Variance’ in column D. You then save the document as ‘Multiple Forecast Systems Impact and Gap Analysis’ and pull the order form again. You notice the authorization signature was that of the head of marketing and not your boss, as you had expected. After surveying some of the employees over in marketing, you map a secondary process for quarterly seasonal marketing campaigns, since, as you discovered, it involves a slightly different process. Since the process is only different in terms of which department submits the order to the warehouse, you add this as column E, titled ‘Authorized By’.
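For readers who prefer code to spreadsheets, here is a minimal sketch of the same gap-analysis table. The class name `ForecastRow`, its field names, and every number are hypothetical placeholders; only the five column meanings come from the spreadsheet described above.

```python
from dataclasses import dataclass


@dataclass
class ForecastRow:
    forecast_system_1: int           # column A: units from the ad hoc query
    forecast_leadership_report: int  # column B: units from the leadership report
    time_period: str                 # column C
    authorized_by: str               # column E

    @property
    def variance(self) -> int:
        # Column D: leadership report forecast minus System 1 forecast.
        return self.forecast_leadership_report - self.forecast_system_1


# Illustrative rows only; the real figures are not given in this post.
rows = [
    ForecastRow(1000, 1000, "2007-10", "Sales"),
    ForecastRow(1200, 1200, "2007-11", "Sales"),
    ForecastRow(3000, 1200, "2007-12", "Marketing"),
]

for row in rows:
    print(row.time_period, row.variance, row.authorized_by)
```

Computing the variance as a derived property, rather than typing it in, keeps column D from drifting out of step with columns A and B.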

The results of the data transformation paint an interesting picture. In every case of a forecast shortfall or overage, the authorizer was the head of marketing and the order was for a marketing campaign. Knowing the head of marketing to be very good about not making false claims about the health of the company, the analyst wasn’t overly swayed that this was the silver bullet, by any means. This became a bone on a subsequent fishbone diagram and Cause and Effect matrix, two effective tools when using the 5 Whys to solve a question. In other periods, your team of analysts produces the forecast by running the root queries directly against the data warehouse system; the marketing team, by contrast, uses a decision support system that the BI team built to deliver reports to the execs on demand.

Looking at the 2 data source columns, it was evident that the marketing team always used the reporting platform, whereas we were running ad hoc queries directly against the DW. When our team was the approver of the RO, the data source was the DW. When the marketing team head was the authorizer, they used the reporting system. The last of the 5 Whys: Why did the numbers used by marketing vary so substantially from the numbers we pulled in terms of the expected forecast for campaign periods, if the sources of data were ultimately generated by the same team?
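The pattern described above (variances lining up with one authorizer and one data source) is easy to check by grouping. The following is a rough sketch with made-up rows; the function `summarize_by_authorizer` and the tuple layout are assumptions for illustration, not the team’s actual tooling.

```python
from collections import defaultdict

# Hypothetical rows: (time_period, authorized_by, data_source, variance_in_units)
rows = [
    ("2007-Q1", "Sales", "data warehouse", 0),
    ("2007-Q2", "Marketing", "reporting platform", -400),
    ("2007-Q3", "Sales", "data warehouse", 0),
    ("2007-Q4", "Marketing", "reporting platform", -1800),
]


def summarize_by_authorizer(rows):
    """Count, per authorizer, the periods seen, the periods with a variance, and the sources used."""
    summary = defaultdict(lambda: {"periods": 0, "with_variance": 0, "sources": set()})
    for _period, authorizer, source, variance in rows:
        entry = summary[authorizer]
        entry["periods"] += 1
        entry["sources"].add(source)
        if variance != 0:
            entry["with_variance"] += 1
    return dict(summary)


summary = summarize_by_authorizer(rows)
for authorizer, entry in sorted(summary.items()):
    print(authorizer, entry["with_variance"], "of", entry["periods"], sorted(entry["sources"]))
```

Remember the earlier caution, though: a clean split like this is a correlation worth chasing, not yet a cause.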

And the answer started to reveal itself…The numbers in the table compiled above were different in every period involving a campaign ordered by the marketing team, who used the reporting platform to generate their forecasts of inventory to order. Our analyst now exploring the 5 Whys is the author of those reports, so the assumption was that they were accurate. The few times they were out of sync, pre-set alarms went off, alerting analysts before the impact worsened with time. The variances were there, and they were large enough that I questioned why the alarm thresholds had been set in a way that let them escape notice.
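To make the threshold question concrete, here is a small hedged sketch of an out-of-sync alarm. The function name `out_of_sync`, the relative-difference formula, and the tolerance values are my own assumptions; the post does not say how the real alarms were configured.

```python
def out_of_sync(forecast_a, forecast_b, tolerance=0.05):
    """Return True when two forecasts differ by more than `tolerance`.

    The difference is measured relative to the larger of the two forecasts.
    """
    baseline = max(abs(forecast_a), abs(forecast_b))
    if baseline == 0:
        return False  # both forecasts are zero, so there is nothing to compare
    return abs(forecast_a - forecast_b) / baseline > tolerance


# Hypothetical numbers: the data warehouse says 3000 units, the report says 1200.
print(out_of_sync(3000, 1200))                  # strict 5% tolerance
print(out_of_sync(3000, 1200, tolerance=0.70))  # a very forgiving tolerance
```

With the strict tolerance the gap raises the alarm; with the forgiving one, the same gap slips through quietly, which is the kind of setting worth questioning.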

Believing the assumption that the forecasts were the same, and thus didn’t need to be compared to one another, was a simple mistake to make, and one to learn to spot before substantial financial loss occurs. Remember, "supposed to be" ranks up there with should’a, would’a and could’a when it comes to multiple data reporting sources. In time, all systems can fall out of sync because one source gets updated without anyone fully analyzing or realizing the impact on related or linked sources. Or, when two disparate replications of another system exist (like we BI professionals don’t know all about the "under the desk" solution, or servers living out of IT and under our desk…Pish posh) but are not linked, one gets updated and the other produces inaccurate and outdated information, even if the same team of IT or business intelligence folks is responsible for maintaining both sources of information.

It is a concept known as systems synchronization. Companies mitigate this problem in a number of ways, and when syncing issues do occur, they analyze and correct the issue, often without the wider population noticing, in their attempts to align the data between sources. But there is always a period in between synchronizations when the systems are out of sync. During that period, groups of employees are exposed to those same systems and pieces of data, which they use to perform their manual ETL magic, transforming that data into readable information and, ultimately, fancy business PowerPoints for leadership, often telling a tale in which 2 or more systems are intertwined in order to answer a question. Any question posed by anyone in the organization can be answered by generating and compiling data sets into a sort of Information Symphony, with no one aware that 50% of the time, someone is going to roll the dice and land in the unaware zone, a land of growing data integrity issues that has formed in one of those systems. And, even worse, who is to say which source is the system of record, or is even correct, once the damage is done? People will always question the validity of the data from that system, out of nothing but disbelief in the intelligence being propagated for their benefit. The effect this can have once the damage is done is significant, not minimal.

In the end, it was the slightly different process causing the issue. The process was changed so that a centralized orderer is responsible for submitting to the warehouse, and the out-of-sync problem was fixed on a more permanent basis by creating a centralized ordering system.

However, without the skill of the ‘Jill of all trades’ (a title a mentor once gave me, along with ‘Crackerjack Analyst’), this company might have lost additional campaign-related revenue from other botched campaigns. An analyst at heart is hard to find among employees; that trait is especially necessary among the business intelligence professionals of the world, though it is often neglected in standard hiring practices.

I learned what I learned while I was an analyst, though I appreciated it the least, as it wasn’t a respected technical position within my org. In fact, I believe it was dubbed the lowest notch on the proverbial totem pole within the IT department. As a final farewell to 6 years of my life and amazing growth in this field, I bid the greatest of thanks to Expedia for everything I walk away with, especially the respect of my colleagues.

Hopefully, readers of the world, this example will help you to see and share (in the form of comments) real-life examples of BI problems and solutions, whether dreamt up or actually affecting you.

To those ‘Crackerjack analysts’ out there…we are fewer and fewer by the day and have to watch out for each other. Our tenet is simple…To provide objective and meaningful data to enable decision making for the organization in which we are employed. Happy New Year to all and best wishes for making 2008 the year in which you shine.

