Harmonising Data to make it sing

By Tom Doolan, Operations Director APAC

Smoothing the Rough Spots

Data sets often differ widely in consistency, mode of measurement and coverage. This is a challenge when trying to get an accurate picture of the consumer packaged goods market and its consumers. Point-of-sale (POS) and Shopper Panel data are among the most long-standing and basic data sets in IRI’s vast library. IRI uses them in a complementary fashion to provide clients with a well-rounded view of attitudes and behaviours, down to the household level.

To ensure maximum data accuracy and minimise data’s innate weaknesses, IRI adjusts and aligns this data through complex statistical processes. One of these processes is negative binomial distribution (NBD) adjustment, which is based on a widely used probability model for count data.

Painting a Picture of the Shopper Journey

Because panel and POS data form an integral foundation on which FMCG companies base marketing and overall business decisions, IRI has invested heavily to eliminate innate weaknesses within the data and capitalise on the benefits that both have to offer.

POS data is the gold standard of marketplace measurement of product sales and share trends, measuring actual levels of product sales within and across retail channels. It also provides valuable insight into distribution, price and promotion considerations. Because it is so comprehensive, POS data is ideal for trend analysis.

Panel data relies on a panel of Australian shoppers who agree to scan all their purchases and participate in surveys to share their opinions, beliefs and behaviours. Through the panel data, FMCG marketers can glean insights into a variety of household-level measures, including:

  1. Trial and repeat behaviours
  2. Penetration
  3. Purchase frequency
  4. Cross-purchase behaviour
  5. Differences in purchase behaviour across demographics

Furthermore, because panellists provide basic demographic data and answer surveys, panel data provides the ability to link these demographics, behaviours and attitudes to product purchases. Panel data also provides important consumer-level statistics, such as brand penetration; buy rate; brand loyalty (an intra-purchase correlation); and the correlations among categories, subcategories and brands. Taken together, these measures provide valuable insights into “the why behind the buy.”

Rising to the next level of accuracy

IRI moves to the next level of accuracy through the use of a complementary process known as NBD adjustment. The NBD is a widely used probability distribution for modelling observed count data, that is, data that counts rather than ranks. Examples of count data include the number of purchases made by a given household or the number of children in a household. By applying NBD adjustment after panel data is weighted demographically, coverage becomes consistent and year-over-year variability is eliminated. And by using NBD to adjust penetration and buy rate, the internal consistency of the data is maintained.
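To make the idea of count data concrete, the following is a minimal sketch of how an NBD describes household purchase counts. It is an illustration only, not IRI's implementation: the function name `nbd_pmf`, the mean-and-shape parameterisation and the numeric values are all assumptions chosen for the example.

```python
import math


def nbd_pmf(r: int, mean: float, k: float) -> float:
    """Probability that a household makes exactly r purchases in the period,
    under an NBD with the given mean and shape parameter k (both > 0)."""
    # Work in logs so large r does not overflow the gamma function.
    log_p = (
        math.lgamma(k + r) - math.lgamma(k) - math.lgamma(r + 1)
        + k * math.log(k / (k + mean))
        + r * math.log(mean / (k + mean))
    )
    return math.exp(log_p)


# Hypothetical values, chosen only for illustration.
mean_purchases = 1.5  # average purchases per household, buyers and non-buyers
shape_k = 0.5         # smaller k means households differ more from one another

# Share of households buying 0, 1, 2, ... 5 times.
probs = [nbd_pmf(r, mean_purchases, shape_k) for r in range(6)]

# Penetration is the share of households that buy at least once.
penetration = 1.0 - probs[0]

# Buy rate is the average number of purchases among buying households.
buy_rate = mean_purchases / penetration
```

With these illustrative values, penetration works out to 0.5 and buy rate to 3.0 (up to floating-point rounding), so half of all households account for every purchase.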

In the FMCG world, IRI’s NBD adjustment methodology resolves any dissonance between panel and POS data, ensuring that various household-level statistics, including buy rate and penetration, are consistent with POS mean sales. And, since mean sales is equivalent to buy rate times penetration, the product of the two is perfectly aligned to the mean sales implied by POS. Similarly, statistics such as repeat purchasers and cross-purchasers, post-NBD adjustment, should likewise be consistent with POS.
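The relationship between mean sales, penetration and buy rate can be sketched in code. The example below is a simplified illustration under stated assumptions, not IRI's methodology: it measures mean sales in purchases per household, fits the NBD shape parameter to the panel's penetration and mean, holds that shape fixed, and then recomputes penetration and buy rate at the POS-implied mean. The names `solve_shape` and `align_to_pos` and all input figures are hypothetical.

```python
import math
from typing import NamedTuple


class Adjusted(NamedTuple):
    shape_k: float
    penetration: float
    buy_rate: float


def nbd_penetration(mean: float, k: float) -> float:
    """Share of households buying at least once: 1 - (1 + mean/k) ** (-k)."""
    return 1.0 - math.exp(-k * math.log1p(mean / k))


def solve_shape(mean: float, penetration: float) -> float:
    """Find the shape k whose NBD reproduces the given mean and penetration."""
    if not 0.0 < penetration < 1.0 - math.exp(-mean):
        raise ValueError("penetration is not attainable by an NBD with this mean")
    lo, hi = 1e-9, 1e6
    # Penetration rises with k, so bisect (on a log scale) until lo and hi meet.
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if nbd_penetration(mean, mid) < penetration:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def align_to_pos(panel_mean: float, panel_penetration: float,
                 pos_mean: float) -> Adjusted:
    """Recompute penetration and buy rate so their product equals pos_mean.

    Assumption for this sketch: the panel's shape k carries over unchanged.
    """
    k = solve_shape(panel_mean, panel_penetration)
    penetration = nbd_penetration(pos_mean, k)
    buy_rate = pos_mean / penetration
    return Adjusted(k, penetration, buy_rate)


# Hypothetical inputs: the panel reports 1.6 purchases per household and 40%
# penetration, while POS implies 2.0 purchases per household.
adjusted = align_to_pos(panel_mean=1.6, panel_penetration=0.40, pos_mean=2.0)
```

By construction, `adjusted.penetration * adjusted.buy_rate` equals the POS-implied mean of 2.0. Holding the shape fixed is only one possible design choice; it keeps the panel's picture of how much households differ from one another while letting POS set the overall level.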

Harmonising these data sets allows the emergence of a complete and accurate measure of product sales and the consumer-level details associated with those sales. It brings to life not just the story of products being purchased, but also the consumers making those purchases and the influencers of those purchase behaviours. In short, NBD adjustment combines the granularity of panel data with the accuracy of POS.

Constantly adjusting NBD to the evolving landscape

IRI is continually working to address ongoing complexities inherent in NBD, including those related to the reduced coverage in the beer/wine/spirits channel and the convenience store geography. Our statisticians are actively testing the fundamental assumptions of zero-inflation and heterogeneity adjustment methodologies to understand how robust these models are in low-coverage situations. They are also assessing extended versions of NBD that allow coverage to be estimated.

The team is also exploring the utility of new and alternative data sources, such as convenience channel frequent shopper program (FSP) and shipment data, as a means of improving the adjustment of panel data. In addition, it is analysing the multivariate time-series aspect of the data so that past observations can inform future estimates, and fusing multiple data sources to better stabilise estimates.

In other words, we are hard at work to ensure that NBD also means “now, better data!”
