Data Analysis - Featured TestSphere Card

One hundred cards. One hundred Test-related concepts.
Here on The Club, we’ll feature a card from the TestSphere deck every month for people to write their stories about.

I challenge you:
Take a few minutes to think about your experiences with the featured card.

What bugs have you found that are related? Which ones have you missed?
How have you tackled testing for this concept?
What made it difficult or easy?
What have you learned? What can others learn from your experience?

Take one of those experiences and put it into prose.
Telling your stories is as valuable to you as it is to others.


My Data Analysis story is set in a DNA analysis system. The client had been storing millions of lines of data in huge Excel files, but wanted a software solution that kept everything in a central place where everyone could work together to store, update and use those big files. I probably don’t have to tell you that this old data contained a lot of outdated and invalid entries.

A big part of our testing was validating the old data to see whether it was actually good or not, which resulted in a lot of uploading, exporting and comparing of rules and data.
But given the sheer number of possible mistakes in an invalid export, getting the actual offending data out of it was quite the challenge.
During the development of this functionality, the feedback was often imprecise or just plain missing.
So we had to really delve into the data and fish out those problems ourselves.

We did know that if we found one invalid data point, chances were high there’d be more of the same problem in that file. Using targeted formulas, conditional formatting and other Excel magic really helped us visualize and sort out the bad data.
Would there be a better way to do this using code? Probably, yes. But none as quick, reliable and versatile.
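For the curious, here is a rough sketch of what such a scripted check could have looked like. The column names, validation rules and file format are entirely made up for illustration (the real export and rules were far more involved); it just shows the same idea of flagging suspect rows in one pass instead of eyeballing them.

```python
# Hypothetical sketch of scripted data checks on an exported file.
# Column names, bounds and the CSV format are illustrative assumptions,
# not the project's actual export or validation rules.
import pandas as pd

# Read the exported data (assumed CSV here; the real files were Excel).
df = pd.read_csv("export.csv")

problems = []

# Flag rows with missing values in required columns.
required = ["sample_id", "gene", "position"]
missing = df[df[required].isna().any(axis=1)]
if not missing.empty:
    problems.append(("missing values", missing))

# Flag duplicate sample/position pairs: the "more of the same" pattern.
dupes = df[df.duplicated(subset=["sample_id", "position"], keep=False)]
if not dupes.empty:
    problems.append(("duplicate rows", dupes))

# Flag positions outside a plausible range (bounds are made up).
out_of_range = df[(df["position"] < 1) | (df["position"] > 250_000_000)]
if not out_of_range.empty:
    problems.append(("position out of range", out_of_range))

# Print each category with its offending rows, roughly what
# conditional formatting highlighted for us in Excel.
for label, rows in problems:
    print(f"{label}: {len(rows)} rows")
    print(rows.head(), "\n")
```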

What’s your story?