## Motivation through punishment, or reward?

Reward. The amount of oversimplification there is intense. Let’s go deeper.

To decide which of those is more effective, we must first distinguish between “feedback” and “reinforcement”. Feedback is when an output becomes part of the next input. That’s something to be avoided if you’re a sound engineer, but is desired if you’re involved in a creative endeavour. Reinforcement, on the other hand, is when a stimulus is used to increase the probability of a certain response. It’s not the same thing, though consistent feedback can act as a reinforcement. Also important is the fact that negative reinforcement is NOT the same thing as punishment. Reinforcement has two flavours: positive, the addition of a “good” stimulus (e.g. praise), and negative, the removal of a “bad” stimulus (e.g. nagging). Neither of those involves discouraging a behaviour (e.g. through humiliation) – that’s punishment, which would *lower* the probability of a certain response.

Of those three choices, positive reinforcement is generally regarded as the best. That said, this column will now be examining feedback.

Next, let’s distinguish between the “law of averages” and “regression to the mean”. The “law of averages” is a mental fabrication, a misinterpretation of the much more mathematical “law of large numbers” (or Bernoulli’s Law in statistics). The “law of averages” is the notion that, for example, if a coin flips 5 heads in a row, tails then becomes more likely. While it IS true that, over a large number of flips (a SERIOUSLY large number), the proportion of heads will settle towards 50%… that happens because the early streak gets swamped by all the later flips, not because tails somehow “catches up”. The chances for the next flip have not changed. Every flip is independent of the last. Even after 5 heads, heads is just as likely to occur as tails. (Unless you’ve got a two-headed coin.) This can be a difficult thing for us to wrap our heads around, particularly when we consider “regression to the mean” – which IS legitimate mathematics, and the topic (finally!) that will contrast punishment with reward.
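For anyone who would rather test the coin claim than take my word for it, here is a minimal Python sketch. It assumes a fair simulated coin, and the function name `heads_rate_after_streak`, the seed and the flip count are all just illustrative choices of mine. It flips the coin many times and looks only at the flips that come right after at least 5 heads in a row.

```python
import random

def heads_rate_after_streak(n_flips=200_000, streak=5, seed=0):
    """Fraction of heads on flips that directly follow `streak` or more heads."""
    rng = random.Random(seed)      # seeded so the run is repeatable
    run = 0                        # length of the current run of heads
    followers = []                 # flips that came right after a long enough run
    for _ in range(n_flips):
        flip = rng.random() < 0.5  # True = heads (assumes a fair coin)
        if run >= streak:
            followers.append(flip)
        run = run + 1 if flip else 0
    return sum(followers) / len(followers), len(followers)

rate, count = heads_rate_after_streak()
print(f"{count} flips followed a streak; {rate:.1%} of them were heads")
```

If every flip really is independent, the printed rate should sit close to 50%, streak or no streak.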

**That’s So Mean**

Regression (or reversion) to the mean essentially says: The further a measurement is from “normal”, the higher the chances that subsequent measurements will be closer to “normal” (whatever “normal” happens to be for the data). To use another example, if you have a really good (or bad) day, chances are that your next day will be closer to average. Again, this is not saying that a bad event becomes more likely after several good ones – where “good” may be someone’s definition of “average” – what it’s saying is that an extreme event, when it occurs, is likely to be followed by a more average one. (We see this frequently in sports.) Granted, “regression to the mean” does not eliminate the possibility that your definition of average may change over time, for instance as skill level increases. (Consider my last column about the Dunning-Kruger Effect.) But it DOES say that, following an extreme event, we will tend to regress back towards “normal”… *regardless* of whether the feedback received for that event was in the form of a reward or a punishment.
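To see regression to the mean with no reward or punishment involved at all, here is a small Python sketch under a deliberately simple assumption of mine: every “day” is an independent score drawn around a fixed “normal” of 50. The function name `day_after_extremes` and all the numbers are hypothetical. It picks out the worst and best 10% of days and checks what the following day looked like.

```python
import random
import statistics

def day_after_extremes(n_days=100_000, seed=1):
    rng = random.Random(seed)
    # Assumed model: independent daily scores, "normal" is 50, spread is 10.
    days = [rng.gauss(50, 10) for _ in range(n_days)]
    ranked = sorted(days)
    low_cut = ranked[n_days // 10]       # threshold for the worst 10% of days
    high_cut = ranked[-(n_days // 10)]   # threshold for the best 10% of days
    # Skip the final day, since it has no "next day" to look at.
    bad = [i for i in range(n_days - 1) if days[i] <= low_cut]
    good = [i for i in range(n_days - 1) if days[i] >= high_cut]
    mean = statistics.mean
    return {
        "bad day": mean(days[i] for i in bad),
        "day after bad": mean(days[i + 1] for i in bad),
        "good day": mean(days[i] for i in good),
        "day after good": mean(days[i + 1] for i in good),
    }

result = day_after_extremes()
for label, value in result.items():
    print(f"{label}: {value:.1f}")
```

Nobody in this simulation was praised or scolded, yet the day after an extreme one lands back near 50 in both directions. That is the trap: whatever feedback you did give would appear to have “worked”.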

The Veritasium channel explains the concept very well in this 7-minute video.

It’s this problem of “regression to the mean” that requires studies and experiments to have a control group, generally one that is given a placebo (a substance or procedure that is known to have no effect). After all, given people who are by definition outside the “norm” (otherwise why would they need treatment?), we must compare a set of them who receive care with those who may simply be regressing to the mean. If both the treated and untreated groups improve by about the same amount, the treatment is ineffectual. A study on osteoarthritis of the knee even showed that *surgery* could be a placebo – the patients improved regardless of whether a real procedure was done. But what does all of this mean in terms of feedback?
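Here is a rough Python sketch of why the control group matters. Everything in it is an assumption for illustration: a made-up symptom score where higher is worse, a stable personal level plus measurement noise, enrolment only for scores above 70, a hypothetical treatment worth 3 points, and alternating assignment instead of proper randomization.

```python
import random
import statistics

def run_trial(n_people=200_000, true_effect=3.0, seed=2):
    rng = random.Random(seed)
    treated_gain, control_gain = [], []
    enrolled = 0
    for _ in range(n_people):
        level = rng.gauss(50, 10)          # person's stable symptom level
        before = level + rng.gauss(0, 10)  # noisy first measurement
        if before <= 70:                   # only unusually bad scores get enrolled
            continue
        after = level + rng.gauss(0, 10)   # noisy second measurement, untreated
        if enrolled % 2 == 0:              # alternate: treated, control, treated...
            treated_gain.append(before - (after - true_effect))
        else:
            control_gain.append(before - after)
        enrolled += 1
    return statistics.mean(treated_gain), statistics.mean(control_gain)

treated, control = run_trial()
print(f"treated improved by {treated:.1f}, control by {control:.1f}")
print(f"difference credited to the treatment: {treated - control:.1f}")
```

In this toy model the untreated group improves a lot simply by regressing to the mean. Only the gap between the two groups comes close to the 3 points the treatment actually contributes.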

Consider this scenario: You do really poorly on an interview. There is little benefit to beating yourself up over it. That event was outside the norm. Statistically speaking, you will most likely do better next time. Similarly, if you get a really high hit count on one blog post, the count is not likely to be repeated next week. That good post was outside the norm, and you should not expect to maintain that level of performance (statistically speaking, all other things being equal). More to the point, while it is similarly futile to *reward* yourself for that great event… doing so consistently can turn your internal feedback into a message of reinforcement. Specifically, a message of positive reinforcement (with a reward) rather than negative reinforcement (no longer berating yourself) or punishment (refusing your needs until things are done right).

Hence my saying that reward beats punishment.

That said, the message of the reward is just as important as the reward itself! If you reward yourself for “being so smart”, you’re actually encouraging a fixed mindset. The implication is that your “normal” did not change, but somehow you “beat the odds”. (The same sort of problem will occur if you decide there is nothing to *learn* from that really poor interview.) On the other hand, if you reward yourself for “your hard work”, you’re encouraging a growth mindset. The implication is that your efforts are changing your “normal”, and if you keep this up, what was once an extreme event may become the new average. Which means that it’s the message you give to yourself – and perhaps more importantly, to the others you speak with – that’s important!

Of course, we may not get it right the first time. But it’s our average performance over the short term that, in the end, leads us towards our great expectations.

For further viewing:

Identifying Negative Reinforcement

Regression Toward the Mean

Coaching and Regression to the Mean (Video)

*Got an idea or a question for a future TANDQ column? Let me know in the comments, or through email!*