How do you know that a sample is representative? Simply check whether the demographics look like the statistics from the residents’ registration office. That sounds simple. Good luck with it!
You need a large number of respondents to obtain stable results. A representative sample, on the other hand, is needed to obtain truthful results. Stable and true are two independent characteristics: a large but skewed sample will reliably give you the same wrong answer. While stability is easy to establish, representativeness often gives market researchers gray hair.
Let’s assume that 100 people are surveyed. In line with official statistics, 20 percent of the sample are young people, and 20 percent are high earners. But now it may be that (for whatever reason) the 20 young people are also the 20 high earners. Both quotas are met, yet the sample would not be representative at all. Admittedly, this is an extreme example. It is only intended to illustrate one point: quotas based on demographic characteristics are a blunt instrument when it comes to ensuring representativeness.
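The extreme case above can be replayed in a few lines of Python. This is only a sketch: the sample is constructed by hand to match the example, and the comparison value assumes, purely for illustration, that age and income are independent in the population. The real overlap is exactly the thing that is usually unknown.

```python
from collections import Counter

# Hand-built sample of 100 respondents as (is_young, is_high_earner) pairs.
# Extreme case from the text: the 20 young people are exactly the 20 high earners.
sample = [(True, True)] * 20 + [(False, False)] * 80
n = len(sample)

# Marginal shares: this is all that a demographic quota checks.
share_young = sum(young for young, _ in sample) / n
share_high_earner = sum(high for _, high in sample) / n

# Joint (cross) distribution actually present in the sample.
cells = Counter(sample)
share_young_and_high = cells[(True, True)] / n

# ASSUMPTION for illustration only: if age and income were independent
# in the population, this is the overlap we would expect to see.
expected_if_independent = share_young * share_high_earner

print(f"young: {share_young:.0%}, high earners: {share_high_earner:.0%}")
print(f"young AND high earner in sample: {share_young_and_high:.0%}")
print(f"expected under independence:     {expected_if_independent:.0%}")
```

Both quotas come out at 20 percent, so a check on the marginals passes, while the cell "young and high earner" is far larger than the independence benchmark. A quota on each characteristic separately says nothing about how the characteristics are combined.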
But it gets even trickier. In many cases, it makes no sense at all to focus on demographics. If, for example, you want to measure a political opinion or the propensity to buy an electric car, then it is crucial that the sample is representative of the value types prevalent in the population. Value types usually correlate only moderately with pure demographics.
“Ok, then I’ll use the value types” is what comes to mind. Sure, you can query value types in the screener and try to weight the results according to a hopefully known distribution.
But there are two catches. First, I need to know what influences the metric I care about, for example political opinion or the propensity to buy an electric car. Only then can I manage this type of representativeness in advance. Second, I need to know how these moderating influences are distributed in the population.
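To make the weighting idea concrete, here is a minimal Python sketch of simple cell weighting (population share divided by sample share). The three value types, their population shares, and the answers are all invented for illustration. The code simply assumes that both catches are already solved: that the value type is the relevant influence and that its population distribution is known.

```python
# HYPOTHETICAL value types and population shares, made up for illustration.
# In practice this distribution is exactly what is often not known.
population_share = {"traditional": 0.5, "pragmatic": 0.3, "progressive": 0.2}

# HYPOTHETICAL respondents: (value type from the screener, would buy an electric car?)
respondents = (
    [("traditional", False)] * 27 + [("traditional", True)] * 3
    + [("pragmatic", False)] * 21 + [("pragmatic", True)] * 9
    + [("progressive", False)] * 16 + [("progressive", True)] * 24
)
n = len(respondents)

# Share of each value type in the sample.
sample_share = {
    segment: sum(1 for s, _ in respondents if s == segment) / n
    for segment in population_share
}

# Cell weight: under-represented types count more, over-represented types less.
weight = {s: population_share[s] / sample_share[s] for s in population_share}

# Unweighted and weighted share of respondents who would buy.
unweighted = sum(buys for _, buys in respondents) / n
weighted = (
    sum(weight[s] * buys for s, buys in respondents)
    / sum(weight[s] for s, _ in respondents)
)

print(f"unweighted: {unweighted:.0%}, weighted: {weighted:.0%}")
```

In this made-up sample the "progressive" type is over-represented, so the unweighted figure overstates the propensity to buy and the weighted figure comes out lower. The correction is only as good as the two assumptions behind it: if the wrong influence is weighted, or the population shares are off, the weighted number is just as misleading as the raw one.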
In short, representativeness is almost impossible to control in practical market research. It’s a bit like flying blind.
The influences on representativeness are different in every study and are usually little known. Their distribution in the population is also unclear. On top of that, I cannot ensure that the multidimensional distribution (the cross-distribution of the different dimensions) is correct.