Every now and then, I come across the misconception that classical conditioning is the same as “traditional” training, with traditional training defined as training that relies to a greater or lesser degree on aversives. And sometimes clicker training is referred to as operant learning, which is correct, but almost ALL learning is operant! So I thought I’d try to shed some light on the concepts, since classical conditioning and “traditional” training aren’t remotely the same, and since one could say that almost all dog training is operant learning.
Classical conditioning has to do with emotions and reflexes, in other words responses that aren’t voluntary. This was what Pavlov discovered and did his research on, which is why it’s also sometimes referred to as Pavlovian conditioning. It is a form of associative learning: two formerly unconnected events become paired. It’s not the reflex in itself that is learned; it’s the association between a stimulus (in Pavlov’s case the rattling of the food bowls or the ringing of the bell) and a response (in Pavlov’s case the dogs’ salivation). When a neutral stimulus (one that means nothing to the individual from the beginning) comes just before an unconditioned stimulus (one that elicits a reflex, the unconditioned response, without any learning) enough times, the neutral stimulus becomes a conditioned stimulus that elicits the response on its own. The response is then called a conditioned response.
We have all learned a lot through classical conditioning. For example, if you think about a lemon, about biting into it and how sour it is, can you feel yourself begin to salivate? If you didn’t know what a lemon was, that wouldn’t happen. If the phone rings, we feel happy (if it’s normally nice people calling) or sad (if there tends to be bad news when the phone rings), just from hearing that sound.
Operant learning has to do with voluntary behaviour. The individual “operates” on its environment and learns from the consequences of the different behaviours. A consequence can be pleasant or aversive to the individual, but the individual can always choose whether or not to perform the behaviour (even if the outcome of not doing so might be highly aversive). Compare this to the reflexes mentioned earlier: we can never choose not to execute a reflex.
The consequences are divided into positive and negative punishment and positive and negative reinforcement. “Positive” simply means that something is added and “negative” means that something is removed. Punishment makes behaviour decrease and reinforcement makes behaviour increase. Emotions are tied to each of the four consequences, and those emotions fall under classical conditioning. Thus we can’t completely separate the two kinds of learning from each other, but we can choose which emotions we want to work with.
- Positive reinforcement means that we add something and the behaviour increases as a result, for example giving the dog a treat when he sits down. The connected emotion is joy over the reward. Looking at it from a human perspective, the thought connected to this is “Doing this will lead to…”
- Negative punishment means that we remove something and the behaviour decreases as a result, for example turning our backs to the jumping dog (thus making the chance of reinforcement disappear). The connected emotions are frustration and some disappointment regarding the disappearing reinforcer.
- Positive punishment means that something is added and the behaviour decreases as a result, for example a leash pop when the dog pulls on the leash. The connected emotion is fear of the aversive (or of the individual that makes the aversive happen).
- Negative reinforcement means that something is removed and the behaviour increases as a result. There are two kinds of negative reinforcement. Either an aversive is removed (think of a forced retrieve where the trainer pinches the dog’s ear until the dog takes the object) or the aversive is avoided. If, for example, the dog has learned that jumping onto the dinner table leads to a very unpleasant reaction from the owners, then by getting off the table at a mere glance from the owner, the dog avoids the aversive. The emotion connected to negative reinforcement is relief. From a human perspective, the connected thought is “Do this or else…”
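For readers who like to see things laid out in code, the four consequences above form a simple two-by-two grid: was something added or removed, and did the behaviour increase or decrease? Here is a minimal Python sketch of that grid. The names `Quadrant` and `classify` are my own made-up labels for illustration, not established terms, and the examples are the ones from the list above.

```python
from enum import Enum


class Quadrant(Enum):
    """The four operant consequences (illustrative names, not a standard API)."""
    POSITIVE_REINFORCEMENT = "positive reinforcement"
    NEGATIVE_REINFORCEMENT = "negative reinforcement"
    POSITIVE_PUNISHMENT = "positive punishment"
    NEGATIVE_PUNISHMENT = "negative punishment"


def classify(something_added: bool, behaviour_increases: bool) -> Quadrant:
    """Map the two yes/no questions onto one of the four consequences.

    something_added:     True if something is added, False if something is removed.
    behaviour_increases: True if the behaviour becomes more frequent, False if less.
    """
    if behaviour_increases:
        # Reinforcement: the behaviour increases.
        if something_added:
            return Quadrant.POSITIVE_REINFORCEMENT
        return Quadrant.NEGATIVE_REINFORCEMENT
    # Punishment: the behaviour decreases.
    if something_added:
        return Quadrant.POSITIVE_PUNISHMENT
    return Quadrant.NEGATIVE_PUNISHMENT


# The examples from the list above, as (something_added, behaviour_increases).
examples = {
    "treat when the dog sits": (True, True),
    "turning our backs to the jumping dog": (False, False),
    "leash pop when the dog pulls": (True, False),
    "ear pinch stops when the dog takes the object": (False, True),
}

for description, (added, increases) in examples.items():
    print(f"{description}: {classify(added, increases).value}")
```

Note that the sketch only captures the definitions. It says nothing about the emotions tied to each consequence, which is where the real difference between training methods lies.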
This means that almost all dog training is operant, but that different methods make use of different consequences.
- “Traditional” training focuses on positive punishment (punishing the incorrect response) and negative reinforcement (removing aversives when the dog responds correctly), with the use of some positive reinforcement (rewarding the correct responses). Sometimes it also comes with an unwarranted belief in pack dynamics and in a leadership that is demanded, not earned.
- Clicker training focuses on the use of positive reinforcement (reinforcing the correct response), but also negative punishment (making sure that the incorrect responses aren’t reinforced).
Then there are a number of other training methods that combine the consequences in different ways. Keep in mind that all consequences work, as long as you have two opposites to work with (reinforcement/no reinforcement, punishment/no punishment). Whichever opposites you choose, timing is extremely important for what the learner will actually learn. If the punishment or reinforcement is late, the dog will already have moved on to doing other things, and you run the risk of punishing something good or reinforcing something unwanted. As a clicker trainer, I also see the choice of consequences as an ethical standpoint: I want the training to be fun for both dog and trainer, and if I choose to use aversives, that won’t be the case. And you know, if you can train wild dolphins to find mines with thawed fish as their reinforcer, why shouldn’t we be able to train our dogs without aversives?