Developing a Message Filter
Our aim was to develop an approach to message testing that is:
- quick and easy to answer;
- mobile friendly;
- sensitive to differences between messages;
- and leads to the same conclusion as longer traditional methods.
We conducted a five-arm study in which we asked four different types of binary questions and compared those results to a traditional 5-point scale.
Five measures we tested
- 5-point scale: Very closely associated / Fairly closely associated / Somewhat associated / Not closely associated / Not associated at all
- Forced-choice binary: Closely associated / Not closely associated
- Forced-choice binary: Describes the message well / Does not describe the message well
- Yes/no: “Please indicate which of the following attributes you feel describes this message well and which ones you do not” (Yes / No)
- Selection: “Please indicate which of the following attributes you feel describes this message well (select all that apply)”
Comparing a 5-point scale to alternative binary measures
We already know that binary measures tend to be three times faster to answer than scaled questions (1). We also know that they are easy to answer on a mobile device, so we sought to understand how binary measures compare to the more common 5-point scale in terms of the answers received.
In the graph below we show the average answer, across all questions and all messages, for the five different types of measures we tested.
Binary measures are closer to top 3 box (the share choosing the top three scale points) than to top 2 box. All three forced-choice binary measures are selected at similar rates. Selection results in notably fewer attributes being chosen.
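For readers unfamiliar with top-box scoring, the sketch below shows how top 2 box and top 3 box shares can be derived from 5-point scale answers; the numbers are illustrative, not the study's data.

```python
# Illustrative top-box scoring from 5-point scale data.
# Top 2 box counts the two most positive answers; top 3 box the three.
from collections import Counter

# Hypothetical answers coded 1 ("Not associated at all") to
# 5 ("Very closely associated") for one attribute on one message.
answers = [5, 4, 4, 3, 3, 3, 2, 5, 1, 4]

counts = Counter(answers)
n = len(answers)
top2 = sum(counts[v] for v in (4, 5)) / n     # share answering 4 or 5
top3 = sum(counts[v] for v in (3, 4, 5)) / n  # share answering 3, 4 or 5

print(f"top 2 box: {top2:.0%}, top 3 box: {top3:.0%}")  # → top 2 box: 50%, top 3 box: 80%
```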
The messages we tested were public health messages aimed at encouraging people to get the influenza vaccine. We collected them from the communications of public health authorities in the US, UK, Ireland, Canada, New Zealand and Australia.
We included nine different measures in these tests, and the patterns of answering are quite consistent across all measures.
Minimizing the number of attributes
In our quest to make the question quick and easy to answer, we also wanted to ask as few questions as necessary. While many of the measures are highly correlated, a close examination of the correlations and conceptual categories suggested we could drop “makes me feel positive about getting the flu vaccine” and “provides compelling information”. Reliability analysis confirmed that removing those items had no appreciable effect, as other measures already covered the same ground. That leaves us with seven distinct measures that can be answered in less than a minute.
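As a rough illustration of this kind of item-reduction check, the sketch below computes Cronbach's alpha, a common internal-consistency statistic, on made-up binary answers. The write-up does not name the exact reliability statistic used, so treat the choice of alpha as an assumption.

```python
# Cronbach's alpha sketch on made-up 0/1 answers
# (rows = respondents, columns = items); not the study's data.

def variance(xs):
    """Sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(rows):
    k = len(rows[0])                    # number of items
    items = list(zip(*rows))            # column-wise item scores
    item_var_sum = sum(variance(list(col)) for col in items)
    totals = [sum(row) for row in rows] # each respondent's sum score
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

answers = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
print(f"alpha = {cronbach_alpha(answers):.2f}")  # → alpha = 0.74
```

In practice one would recompute alpha with and without the candidate items and confirm it stays roughly unchanged before dropping them.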
Sensitivity—measuring differences between messages and attributes
One of the biggest challenges researchers face is being able to differentiate between messages in terms of which one is the “winner”. By using messages that have been developed and used by public health authorities in major nations, we set the bar high. These are all, presumably, good messages that have already passed testing, so we expected being able to differentiate between them would be challenging.
The analysis we did allows us to look at differences between messages and also between attributes, which helps us identify which of the five measures was most sensitive overall.
In order to level the playing field across the measures, we looked at the differences between the highest and lowest scores, as a percentage of the highest score.
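That sensitivity metric, the gap between the highest and lowest scores expressed as a percentage of the highest score, can be sketched as follows; the scores below are hypothetical, not the study's results.

```python
# Sensitivity: (max - min) / max as a percentage, so measures on
# different response formats can be compared on a level footing.

def sensitivity(scores):
    highest = max(scores)
    lowest = min(scores)
    return 100 * (highest - lowest) / highest

# Hypothetical per-message proportions for two measures
top2_box = [0.62, 0.55, 0.48, 0.41]
yes_no = [0.58, 0.50, 0.44, 0.36]

print(f"top 2 box sensitivity: {sensitivity(top2_box):.1f}%")
print(f"yes/no sensitivity:    {sensitivity(yes_no):.1f}%")
```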
Yes/no and top 2 box showed somewhat more sensitivity to differences between messages. Selection, because it results in fewer attributes being chosen, showed higher sensitivity when comparing between attributes. Yes/no appears to be slightly more sensitive when comparing between attributes, although the difference may not be meaningful. What we can conclude is that yes/no appears to be as sensitive as the more typical, but time-consuming, scaled approach.
Both the scaled approach and the yes/no method selected the same message as having the highest overall appeal: “Influenza can be anywhere. Don’t get it. Don’t give it. Get immunized.”
Our results indicate it is possible to use a yes/no binary approach to message testing that is quick and easy to answer, reliable and sensitive to differences between messages. In fact, it is as sensitive as the scaled data and leads to the same conclusion.
Given these results, and in light of how mobile friendly and quick to answer the question is, we recommend using a yes/no binary approach to message testing.
We conducted five studies as part of Canadian omnibus surveys. The samples were representative of the Canadian population and surveys were conducted in French and English. The sample sizes were: Arm 1: 1,019; Arm 2: 1,526; Arm 3: 1,518; Arm 4: 1,514; Arm 5: 1,515.
1. A. Grenville, “What Works Better, Scaled or Binary Brand Ratings?” Vision Critical University.