Ensuring data quality — part 2: Tools and processes to avoid missteps

Data Architecture 26 October 2017

In our first article, we saw that traits such as daring, organisation, and curiosity are key professional skills when it comes to managing the quality of your data (if you’re just joining us, you can catch up here).

But even the best qualities are dulled when it comes to quality management: work gets sloppy, and repetitive tasks take their toll. Even with the best of intentions, professional skills need direction to be used wisely, like a train placed on the right track. In this second part, we discuss the tools and processes that can be put in place to stay on track.

Data quality depends directly on the care taken during the testing phase

Though it is true that all (or most) projects run on a tight timeline, data quality depends directly on the care taken during the testing phase. The number and duration of the testing phases must therefore be agreed upon before the project begins. At least three testing phases should be planned, as this gives development teams two opportunities for correction. This “detail” of the timeline is too often overlooked in favour of so-called “more agile” methods or “digital” projects where everything must move quickly. But in reality, no project manager can guarantee data quality without first having planned to check the data.

This is why it’s important to describe the tests to be carried out from the very beginning of a project: first broadly, and then, once the collection plan has been defined, in more detail (KPI by KPI). Though digital projects are often innovative, the tests themselves follow a tried and true logic. More specifically, we’re talking about unit tests (KPI by KPI) and global tests (across all data).

These tests are to be rolled out across several levels, following the data supply chain:

  • On the site, to validate the variables provided by the developer (Does the variable exist? Is its value coherent with the information visible on the page?)
  • In the TMS, which captures and reroutes the data, to validate the reprocessing of these variables into hits, according to the chosen web analytics tool
  • In the web analytics interface, to validate that the tool processes the data correctly
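The first, site-level check can be automated as a unit test per page. The sketch below is a minimal, hypothetical example: the dataLayer keys (`page_name`, `page_category`, `currency`) and the comparison against the visible page title are assumptions for illustration, not names from any particular tagging plan.

```python
# Hypothetical site-level unit test: compare the variables a developer
# exposes (e.g. in a dataLayer) against what is actually visible on the page.
# All variable names here are illustrative assumptions.

def check_data_layer(data_layer: dict, page: dict) -> list[str]:
    """Return a list of problems found for one page (empty list = OK)."""
    errors = []
    # Does the variable exist?
    for required in ("page_name", "page_category", "currency"):
        if required not in data_layer:
            errors.append(f"missing variable: {required}")
    # Is its value coherent with the information visible on the page?
    if data_layer.get("page_name") != page.get("title"):
        errors.append("page_name does not match the visible page title")
    return errors

# Example: a page whose dataLayer is incomplete and mislabelled.
issues = check_data_layer(
    {"page_name": "Home", "currency": "EUR"},
    {"title": "Product A"},
)
print(issues)
```

Run KPI by KPI across a sample of pages, a check like this turns the “detail” of the test phase into something reproducible rather than a one-off manual pass.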

Don’t forget: simply checking the data in the interface is not enough to validate the correct implementation of your KPIs! For example, you might see events that correspond to newsletter subscriptions on your site, but how can you be sure these events were not sent after a faulty interaction, such as an invalid subscription caused by a typo?
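One way to guard against that pitfall is to make the event depend on the interaction actually succeeding. The sketch below is a hypothetical illustration: the event name `newsletter_subscription` and the simple email pattern are assumptions, not part of any real tracking spec.

```python
import re

# Hypothetical sketch: send the analytics event only once the interaction
# has actually succeeded, never on every click. The event name and the
# (deliberately simple) email validation rule are illustrative assumptions.

def subscribe(email: str, send_event) -> bool:
    """Validate the subscription first; only then emit the tracking event."""
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        return False  # invalid subscription (e.g. a typo): no event is sent
    send_event("newsletter_subscription")
    return True

sent = []
subscribe("user@example.com", sent.append)  # valid address: event is sent
subscribe("user@@typo", sent.append)        # typo: nothing is sent
print(sent)
```

With this ordering, the count you see in the analytics interface can only reflect subscriptions that actually went through.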

Once the site has been tested, the implementation validated, and the tracking up and running, it may seem like you’re in the clear. But your digital assets evolve: new pages, new sections, new features. You must make sure that data quality remains satisfactory through regular verification. This, too, should take place across multiple levels: on the site, and in your web analytics reports. Several tools can simulate a site visit (generally using headless browsers) and send back a full report of the information gathered. Similarly, today’s analytics tools include automatic alert systems that must be configured. Take advantage of them! Instead of losing weeks of data (the time it takes for someone to check the relevant report and then flag the problem), you’ll only lose a day if there’s a tracking problem.
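The alerting logic such tools apply can be sketched in a few lines. This is a hypothetical example, not any vendor’s actual alert system: the KPI (daily pageviews), the rolling baseline, and the 50% drop threshold are all assumptions you would tune to your own traffic.

```python
# Hypothetical monitoring sketch: compare today's volume for a KPI against
# a rolling baseline and raise an alert on a sharp drop, so a tracking
# break costs one day of data rather than weeks. Names and the 0.5
# threshold are illustrative assumptions.

def check_volume(history: list[int], today: int, drop_ratio: float = 0.5) -> bool:
    """Return True (alert) if today fell below drop_ratio of the recent average."""
    baseline = sum(history) / len(history)
    return today < baseline * drop_ratio

daily_pageviews = [980, 1010, 995, 1005, 990]  # last five days
if check_volume(daily_pageviews, today=120):
    print("ALERT: pageview volume dropped, check the tracking")
```

Scheduled daily, even a check this crude catches a broken tag the morning after it breaks.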

Spend as much time on documentation as you do on implementation

Lastly, data collection projects are no exception to a major rule: spend as much time on documentation as you do on implementation. How many people pick up a project midway and complain about inadequate documentation, yet are hard-pressed to supply their own! Several documents are needed:

  • Tagging plan (including functional and technical specifications) (mandatory)
  • Test status (KPI by KPI, detailed enough to be easily reproduced) (mandatory)
  • Tag details and rules in the TMS (if applicable)

It is true that some documentation can be automated (generated from the tagging plan, for example) to save time. Above all, careful documentation helps guarantee a project’s success, particularly when contributors are constantly changing (consultants, agencies, etc.).
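To make the automation idea concrete, here is a minimal sketch of generating documentation from a tagging plan. The CSV columns (`kpi`, `variable`, `description`) and the markdown output are hypothetical: your tagging plan will have its own format.

```python
import csv
import io

# Sketch of automating documentation from the tagging plan: read a
# (hypothetical) CSV tagging plan and emit a markdown table. The column
# names below are illustrative assumptions.

TAGGING_PLAN = """kpi,variable,description
page_view,page_name,Name of the viewed page
newsletter_subscription,form_id,Identifier of the subscription form
"""

def plan_to_markdown(csv_text: str) -> str:
    """Turn each row of the tagging plan into a row of a markdown table."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    lines = ["| KPI | Variable | Description |", "| --- | --- | --- |"]
    lines += [f"| {r['kpi']} | {r['variable']} | {r['description']} |" for r in rows]
    return "\n".join(lines)

print(plan_to_markdown(TAGGING_PLAN))
```

Because the documentation is derived from the plan itself, it stays in sync with the implementation instead of drifting as contributors change.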

Though data is now a major asset for many industries, data quality is often a project’s poor relation. Yet it is an essential supporting character, the cornerstone on which the project’s long-term success depends. How can you happily drive a car that you have to bring to the mechanic with every fill-up? To avoid this situation, you must take great care that your data is correctly implemented. Take the time to learn the mechanics of data collection, to understand its ins and outs, and to keep an eye on collection for as long as your site exists. That is the price of future analyses that make sense.


Translated from the original French by Niamh Cloughley
