Popular wisdom says that to do business analytics you first need to get your data in order. Consolidate it, cleanse it, standardize it, centralize it…create one version of the truth. I’m not about to debunk that, but I do think it’s only part of the story.
There are cases where we want and need the best-quality information we can get. If you’re the CFO, your earnings release needs to be right, not just close. But there are also times when it would be valuable to have “quite clean” information. The trick is to get the labeling right.
Faced with a choice between no data and data marked “70% reliable,” I know that I’d usually go for imperfection. What’s more, faced with a choice between perfect data next week and “good enough” data right now, there are also cases where I’d go for the latter. Not all decisions are equal, and many times getting information immediately is more important than perfect quality, as long as there’s a measure of reliability attached.
If that reliability number comes straight from the data quality experts in IT, so much the better. But even if it’s crowdsourced from a group of people that I know and trust, I might still be better off using it than having to wait.
So although sometimes we need that single version of the truth, there are also times when dirty data can be good for you!



