We call ourselves clicker trainers or reward-based trainers, but there are many variations of training methods and many names for them. Positive training, traditional training, balanced training, and resource-based training are other variations you may hear. But what do they all mean? And why is it important to keep track of them?
Well, because we want to know what a training method entails and how and why it works. We also want our course participants to know which training tips they will receive when they come to us – and which ones they will not. We usually don’t talk about the theoretical definitions because they can feel a bit cumbersome, but when I (Elsa) looked through our extensive blog archive (we started blogging over ten years ago), I saw that it had actually been ten years since we last wrote about them. So we thought it was time to do it again 🙂
Learning has been researched extensively, and there are clear definitions of it. You probably recognize the names Pavlov and Skinner: Pavlov with classical conditioning (reflexes), the bell and the dogs, and Skinner with operant (voluntary) learning. You may also have heard of Thorndike, who as early as 1898 defined the Law of Effect: “responses that produce a satisfying effect in a particular situation become more likely to occur again in that situation, and responses that produce a discomforting effect become less likely to occur again in that situation.” Note that the law is phrased in terms of responses in general, not dogs in particular – so the Law of Effect also applies to humans.
So what can we gain from the Law of Effect? Well, this is where the definition of consequences comes in, and with it the first language barrier. We often interpret the word “consequence” as something negative, but it actually isn’t. A consequence is simply something that comes after a behavior – and it can be pleasant or unpleasant for the one experiencing it.
There are four different consequences and the definition of them is also a bit tricky due to how we are used to expressing ourselves, so hold on to your hats now:
Positive means we add something.
Negative means we remove something.
Reinforcement means we get more of a behavior.
Punishment means we get less of a behavior.
Clear as mud, right? 😉 BUT THINK MATH! POSITIVE IS TO ADD (+) AND NEGATIVE IS TO SUBTRACT (-).
Now we’ll combine them:
Positive reinforcement means we add something, and as a result we get more of a behavior. Reward the dog for sitting and it will sit more often.
Negative punishment means we take something away, and as a result we get less of a behavior. Don’t give the reward if the dog doesn’t sit.
Positive punishment means we add something, and as a result we get less of a behavior. Yank the leash when the dog pulls – the dog stops pulling (at least for the moment).
Negative reinforcement means we take something away, and as a result we get more of a behavior. The handler stops pinching the dog’s ear when it holds onto the retrieve object – the dog holds onto the retrieve object and avoids discomfort. Keeping the dog under constant pressure (“thumb in the eye”) or under the threat of positive punishment also falls into this category.
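For those who like to see the “think math” idea spelled out, here is a small Python sketch of the same two-by-two grid. It is only an illustration: the names `Change`, `Effect`, and `name_consequence` are made up for this example, and the way the four training examples are tagged is simply our reading of the descriptions above.

```python
from enum import Enum


class Change(Enum):
    """What we do after the behavior (hypothetical helper for this post)."""
    ADD = "positive"      # we add something (+)
    REMOVE = "negative"   # we remove something (-)


class Effect(Enum):
    """What happens to the behavior over time (hypothetical helper)."""
    MORE = "reinforcement"  # we get more of the behavior
    LESS = "punishment"     # we get less of the behavior


def name_consequence(change: Change, effect: Effect) -> str:
    """Combine the two words into the name of the consequence."""
    return f"{change.value} {effect.value}"


# The four examples from the text, tagged by hand.
examples = {
    "reward the dog for sitting": (Change.ADD, Effect.MORE),
    "withhold the reward when the dog doesn't sit": (Change.REMOVE, Effect.LESS),
    "yank the leash when the dog pulls": (Change.ADD, Effect.LESS),
    "stop the ear pinch when the dog holds the object": (Change.REMOVE, Effect.MORE),
}

for description, (change, effect) in examples.items():
    print(f"{name_consequence(change, effect)}: {description}")
```

The point of the sketch is that “positive” and “negative” only describe adding or removing, while “reinforcement” and “punishment” only describe whether the behavior increases or decreases. Neither word says anything about how the consequence feels.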
In the image below, Buddy shows how he experiences the different consequences:
So those were the four consequences. As the Thorndike quote at the start says, behaviors are maintained by pleasant or unpleasant consequences. If the consequence leads to more of the behavior we want (or less of the behavior we don’t want), then the training has worked.
BUT, linked to each of the four consequences, there is also a feeling. We think a lot about HOW we achieve results and want training to be fun for both us and the dog. This means that we stay in the upper part of the image: positive reinforcement, rewarding what we want, and negative punishment, making sure that what we don’t want doesn’t get a reward. Of course, we are also humans who can get annoyed, do something impulsive, or accidentally scare our dogs – but we don’t base our training on that. When we plan our training, we always start from positive reinforcement and negative punishment and our intention is therefore to work only with that.
Now for today’s final language challenge and another charged word: correction. To correct simply means to “make right”. It is not inherently negative, but in many contexts it is interpreted that way. “I certainly correct my dog when it does something wrong” is something you might hear, for example. In that case, it usually means that the person in question takes hold of the dog and the dog perceives it as unpleasant, that is, positive punishment (adding an unpleasant experience).
The thing is, we also correct our dogs – we correct behavior, we redo and get the dog to do it right. But we do it through negative punishment: we withhold a reward and make sure that the unwanted behavior doesn’t get rewarded. It also means that we prevent the dog from rewarding itself. If the dog doesn’t listen to our recall and is fully engaged in sniffing, we don’t stay there waving our meatball. We go (or run) and get the dog. We simply interrupt the fun it was having (sniffing) so that the behavior “continue to sniff when you hear the recall signal” doesn’t get rewarded.
Did it become clearer? Or does it still feel like a bunch of strange terms?
Of course, you don’t have to use them every day, but now you at least know what they mean, how learning can be described in theory, and what it means in practice.
If you want to delve even deeper, we have written a bit on the same topic before:
And also about how we think about clicker training:
We will probably come back to this later as well – there is even more to say about why we don’t mix training methods, the effects of positive punishment versus positive reinforcement, and the effectiveness of training versus achieving results at all costs. 🙂