Trouble with manual data capture

Asking people to fill out forms to monitor performance, track a phenomenon or gather data for problem solving too often leads to trouble when the data are finally collected and analysed.

The case is about manual data capture into paper forms and logbooks on production lines. A precious source of information for a consultant like me. Potentially.

Alas, as I started to transfer the precious bits of information from the paper forms into a spreadsheet, I soon realized how poorly the original data had been recorded:

Most of the forms were not thoroughly filled out: boxes and fields left blank, totals missing or wrong, dates not specified, and a lot of bad handwriting leading to possible misinterpretation, among other liberties taken.

It seems obvious that the production operators do not understand the importance of the data they are supposed to capture, nor why accuracy and completeness matter.

To them it is probably a mere chore; not understanding what the data will later be used for, they pay minimal attention to it.

It is also obvious that management is complacent about the situation and does not use the data; otherwise somebody would have pointed out the mess before me and, hopefully, acted upon it.

Well, we can’t change the past, and the data that were lost are lost for good. The poorly recorded ones were all I could get, so I had to make do with what I had.

Thanks to a relatively large (I dare not write big) amount of data, the flaws do not have too much impact and the big picture remains truthful. For me, what matters is the big picture, not the accuracy of each single data point. (A takeaway from my exposure to big data!)

I noticed that most of the worst-filled forms related to “special events”, when production suffered a breakdown, shortages and the like. These dots on the performance curve would have been regarded as outliers anyway and discarded for the sake of a more meaningful trend.

So it was not a big deal to disregard them from the beginning.

However, the pity was that no robust, deeper analysis could be conducted on these “special events”, which were not that unusual over a six-month period.

Some incomplete data could be restored indirectly, for example by calculating a duration from the start and end times or, conversely, by restoring a missing timestamp from the other timestamp and the duration. Sometimes these kinds of fixes introduced some uncertainty in the values, but again I was not after accuracy; I was trying to depict and understand the big picture.
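For readers who would rather script this kind of repair than do it by hand in a spreadsheet, here is a minimal sketch in Python. It is only an illustration of the idea, not the way the data in this case were actually processed: the `LogEntry` record, its field names and the `restore_entry` function are all made up for the example. It fills in whichever one of start time, end time or duration is missing, and flags the value as inferred so that the added uncertainty stays visible.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    """One transcribed form line (hypothetical layout, for illustration only)."""
    start: Optional[datetime] = None
    end: Optional[datetime] = None
    duration: Optional[timedelta] = None
    inferred: Optional[str] = None  # name of the field that was reconstructed


def restore_entry(entry: LogEntry) -> LogEntry:
    """Fill in the single missing field when the other two are known.

    Entries that are already complete, or that miss two or more fields,
    are returned unchanged: there is nothing reliable to derive them from.
    """
    have_start = entry.start is not None
    have_end = entry.end is not None
    have_duration = entry.duration is not None

    if have_start and have_end and not have_duration:
        return replace(entry, duration=entry.end - entry.start, inferred="duration")
    if have_start and have_duration and not have_end:
        return replace(entry, end=entry.start + entry.duration, inferred="end")
    if have_end and have_duration and not have_start:
        return replace(entry, start=entry.end - entry.duration, inferred="start")
    return entry


# Example: a form with start and end times but no duration written down.
incomplete = LogEntry(
    start=datetime(2024, 1, 1, 8, 0),
    end=datetime(2024, 1, 1, 9, 30),
)
fixed = restore_entry(incomplete)
print(fixed.duration, "reconstructed field:", fixed.inferred)
```

The sketch does not check plausibility (an end time earlier than the start time, for instance), so reconstructed values should still be reviewed before they feed any analysis.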

To be fair to the personnel on the lines, I have to admit that some of the forms were poorly designed. A better design could have led to less misunderstanding or confusion. That said, the data reporting was not left to anyone’s discretion, as it is mandatory by regulation.

Because, to my great surprise and disappointment, this happened in the pharma industry.


About the author, Chris HOHMANN
