Taming Tracking Data: Using the Kalman Filter to Improve Reliability of Tracking

The challenge: be the first person to get to the moon (and back)

Put yourself in the shoes of Frank Borman, the commander of Apollo 8, the first manned spacecraft to orbit the moon and return to earth. Picture yourself piloting a vehicle that is just 11 feet tall and 13 feet across. Your tiny spacecraft will hurtle through space at 24,200 mph, aiming to enter the orbit of the moon—a target that is also racing through the cosmos at 2,288 miles per hour.

To get to the moon and back you need to navigate your way there with just a sextant and a measure of acceleration. It’s difficult to get accurate measurements with the sextant, and you have relatively long periods between observations, in part because your target is tearing through space at an incredible rate. So you have noisy data and lots of missing information, but a critical need to get the estimation right because your life depends on it. You need help.

Fortunately, tracking studies are not matters of life and death, but we need to get them right too. That’s why, to help us improve the accuracy of our tracking, we turned to the Kalman filter, an estimation algorithm that was used to help guide the Apollo spacecraft to the moon.

Tracking a brand in the mind of consumers is similar to tracking the movement of a spacecraft. Both follow a “smooth” trajectory, but we only get to observe it periodically. And our observation is subject to large random measurement errors. So the key is to optimally separate the signal from the noise.

How it works

The Kalman filter works by combining a prediction of the true value with the new measurement, using a weighted average. The weighted average is an estimate that lies between the prediction and the measurement, and has lower estimated uncertainty than either alone. This process is repeated at every time step, with the new estimate informing the prediction used in the following iteration. This recursion is efficient because it requires only the last “best guess” and its uncertainty to calculate a new estimate.
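The recursion described above can be sketched for the simplest case: a single tracked number, such as an awareness score. This is a minimal illustration rather than the implementation used in the papers. It assumes the true level simply carries over from one period to the next (a "local level" model), and the names `process_var` (how much the true value may drift between periods) and `measurement_var` (how noisy each reading is) are introduced here for illustration.

```python
def kalman_step(estimate, variance, measurement, process_var, measurement_var):
    """One predict/update cycle of a scalar (local-level) Kalman filter."""
    # Predict: the level is assumed to carry over unchanged,
    # but our uncertainty about it grows by process_var.
    pred_estimate = estimate
    pred_variance = variance + process_var

    # Update: a weighted average of the prediction and the new measurement.
    gain = pred_variance / (pred_variance + measurement_var)
    new_estimate = pred_estimate + gain * (measurement - pred_estimate)
    new_variance = (1.0 - gain) * pred_variance
    return new_estimate, new_variance, gain


def kalman_filter(measurements, initial_estimate, initial_variance,
                  process_var, measurement_var):
    """Run the filter over a series, carrying only the last best guess forward."""
    estimate, variance = initial_estimate, initial_variance
    filtered = []
    for z in measurements:
        estimate, variance, _ = kalman_step(
            estimate, variance, z, process_var, measurement_var
        )
        filtered.append(estimate)
    return filtered
```

With a prior estimate of 50 (variance 4) and a new reading of 60 (variance 4), `kalman_step(50.0, 4.0, 60.0, 0.0, 4.0)` lands halfway at 55 with variance 2, which is lower than the uncertainty of either input.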

How much to trust the new measurement relative to the current estimate is managed through the Kalman filter’s “gain”. The Kalman gain is the weight given to the new measurement relative to the current state estimate, and it can be “tuned” to achieve particular performance.

With a high gain, the filter places more weight on the most recent measurements, and thus is more responsive to changes in the new data. With a low gain, the filter follows the model predictions more closely. At the extremes, a gain close to one will result in a jumpier estimated trajectory, while a gain close to zero will smooth out noise but decrease responsiveness. Set too high, the filter is more easily fooled by randomness. Set too low, it will be slow to identify real change. Additional tracking data sources, such as social media, can help inform how you set the gain.
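The trade-off can be seen by fixing the gain by hand. In a full Kalman filter the gain is derived from the assumed process and measurement noise rather than set directly; the sketch below pins it to a constant only to isolate its effect. The two series are made-up numbers, not data from the studies.

```python
def fixed_gain_filter(measurements, gain, initial_estimate):
    """Update the estimate with a constant gain between 0 and 1."""
    estimate = initial_estimate
    filtered = []
    for z in measurements:
        estimate = estimate + gain * (z - estimate)
        filtered.append(estimate)
    return filtered


def roughness(series):
    """Mean absolute period-to-period change: a simple 'jumpiness' score."""
    changes = [abs(b - a) for a, b in zip(series, series[1:])]
    return sum(changes) / len(changes)


# Hypothetical series 1: a flat true level of 40 with alternating +/-3 noise.
noisy_flat = [40.0 + (3.0 if i % 2 == 0 else -3.0) for i in range(12)]
# Hypothetical series 2: a genuine jump from 40 to 50 with no noise.
step = [40.0] * 6 + [50.0] * 6

high_gain = fixed_gain_filter(noisy_flat, 0.9, 40.0)  # chases the noise
low_gain = fixed_gain_filter(noisy_flat, 0.1, 40.0)   # stays close to 40

high_step = fixed_gain_filter(step, 0.9, 40.0)  # catches the jump quickly
low_step = fixed_gain_filter(step, 0.1, 40.0)   # still lagging after six periods
```

On the noisy flat series the high-gain output is far rougher than the low-gain output. On the step series the ranking reverses: the high-gain filter is almost at 50 after six periods, while the low-gain filter has closed less than half of the gap.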

The Kalman filter has numerous applications today, well beyond navigation systems. It is widely used for time series analysis in fields such as signal processing and econometrics, and is even used to remove camera jitter in videos. But it has rarely been applied in the field of market research.

Kalman filter in action with tracking data

I presented a paper entitled “Fly Me to the Moon: the Application of Kalman Filter to Tracking Data” at the AMA’s Advanced Research Techniques (ART) Forum in 2016, in collaboration with my colleague Andrew Grenville and Karen Buros of Radius Global Market Research. It described the application of the Kalman filter technique and demonstrated how it could be used to provide less noisy estimates in various types of tracking studies.

The chart below is one example from the presentation. It tracks awareness of Royal Bank of Canada (RBC) as an Olympic sponsor across three periods: before promotion of the Games started (2008), the year in which the Games occurred (2010), and after the Games (2011).

Awareness is a useful measure for demonstration purposes because it can be expected to increase and decrease in fairly predictable ways. Awareness was measured each week with relatively small Canadian samples, so we expected the weekly readings to be relatively noisy. In the chart, the red line is the data after the Kalman filter has been applied. The dark gray line is a typical 4-week rolling average and the light gray line is the raw data for each week.

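[Chart: weekly awareness of RBC as an Olympic sponsor, showing raw data, 4-week rolling average, and Kalman-filtered estimate]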

The Kalman filter produced results that appear more accurate and reliable than the 4-week rolling average, and it tamed the volatility of the raw data.
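For comparison, the rolling-average benchmark can be written in a few lines. The sketch assumes a trailing window that is shortened at the start of the series, which is one common convention; the chart may have been built with a different one.

```python
def rolling_average(values, window=4):
    """Trailing rolling mean; early points use however many values exist."""
    averaged = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


# Hypothetical series: a genuine jump from 40 to 50 in week 5.
weekly = [40.0, 40.0, 40.0, 40.0, 50.0, 50.0, 50.0, 50.0]
smoothed = rolling_average(weekly)
# smoothed == [40.0, 40.0, 40.0, 40.0, 42.5, 45.0, 47.5, 50.0]
```

Every week inside the window gets the same weight regardless of how noisy it was, and a real change takes the full four weeks to show up completely. That is the sense in which a rolling average is a blunter tool than a filter whose weights depend on the stated uncertainty.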

I followed that paper up at the 2017 ART Forum with a piece co-authored with Jack Horne entitled “Sampling in the River Twitter: A DIY Tool for Social Media Tracking via the REST API”. In this paper, we demonstrated that it is possible to sample tweets about a brand and to use that data within the Kalman filter as an indicator of potential instability or change.
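One way such an outside signal could feed the filter is to let it widen the assumed period-to-period drift when tweet volume spikes, so the filter trusts the next survey reading more. The rule below is a hypothetical illustration, not the method from the paper: the function names, the ratio-to-baseline scaling, and the cap of 10 are all assumptions made for this sketch.

```python
def process_var_from_buzz(base_process_var, tweet_count, baseline_count,
                          max_multiplier=10.0):
    """Hypothetical rule: scale process variance by relative tweet volume."""
    if baseline_count <= 0:
        raise ValueError("baseline_count must be positive")
    ratio = tweet_count / baseline_count
    # Never go below the base variance, and cap the inflation.
    multiplier = min(max(ratio, 1.0), max_multiplier)
    return base_process_var * multiplier


def filter_with_buzz(measurements, tweet_counts, baseline_count,
                     initial_estimate, initial_variance,
                     base_process_var, measurement_var):
    """Scalar Kalman filter whose process variance follows tweet volume."""
    estimate, variance = initial_estimate, initial_variance
    filtered = []
    for z, tweets in zip(measurements, tweet_counts):
        process_var = process_var_from_buzz(
            base_process_var, tweets, baseline_count
        )
        pred_variance = variance + process_var
        gain = pred_variance / (pred_variance + measurement_var)
        estimate = estimate + gain * (z - estimate)
        variance = (1.0 - gain) * pred_variance
        filtered.append(estimate)
    return filtered
```

Given the same survey readings, a period with five times the usual tweet volume yields a larger gain for that period, so the estimate moves further toward the new reading than it would in a quiet week.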

In that analysis, we looked at tracking something more volatile: approval of President Trump. In the chart below we show the raw data, from samples of 1,000 people per observation, plus the data with the application of the Kalman filter, and the application of the Kalman filter informed by data sampled from Twitter.

In the chart below we see that the Kalman filter has less effect with a large sample, but it still smooths out the data in ways that appear to reduce noise. The Twitter update makes relatively little difference to the Kalman filtered data.

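[Chart: tracking approval of President Trump, showing raw data, Kalman-filtered data, and Kalman-filtered data informed by Twitter]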

When we look at a smaller sample, the impact of the Kalman filter is more noticeable. In the chart below we have the same data as above, but this time for a smaller sample: just women in the Midwest. Here the filtering has a more pronounced effect, smoothing out what is more likely to be sample bounce than true 30% swings in approval over the course of a few short weeks.

[Chart: tracking approval of President Trump among women in the Midwest]
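One way to see why a smaller sample is smoothed more heavily: the sampling variance of a proportion grows as the sample shrinks, and a larger measurement variance lowers the gain. The sketch below assumes simple random sampling with no design effect. The approval level, the prediction variance, and the subsample size of 100 are hypothetical numbers chosen for illustration; only the full sample of 1,000 comes from the study.

```python
def proportion_sampling_variance(p, n):
    """Sampling variance of a proportion under simple random sampling."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return p * (1.0 - p) / n


def update_gain(pred_variance, measurement_var):
    """Scalar Kalman gain: the weight placed on the new measurement."""
    return pred_variance / (pred_variance + measurement_var)


approval = 0.40          # hypothetical approval level (proportion scale)
pred_variance = 0.0004   # hypothetical: a standard deviation of 2 points

gain_full = update_gain(
    pred_variance, proportion_sampling_variance(approval, 1000)
)
gain_subgroup = update_gain(
    pred_variance, proportion_sampling_variance(approval, 100)
)
```

Under these assumptions the gain is roughly 0.63 for the full sample and roughly 0.14 for the subgroup, so a surprising weekly reading moves the subgroup estimate far less.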

Conclusion

Tracking data, like all survey data, has some “noise” in it, especially when the samples are smaller. Rather than being constantly distracted by random variation in tracking data, we often resort to blunt tools like rolling averages. But that’s not ideal.

The papers described here provide us with powerful evidence that using a Kalman filter helps point you in the right direction. We know it helped safely get the first astronauts to the moon and back. Could Kalman filtering help you?

To learn more, download the presentations below (hint: read the notes sections) or contact me.

Download – “Fly Me to the Moon: the Application of Kalman Filter to Tracking Data”

Download – “Sampling in the River Twitter: A DIY Tool for Social Media Tracking via the REST API”