In Part 1 I defined clicker training, starting with a definition from Karen Pryor and then expanding on it to describe the “clicker umbrella” that is the framework for all the interactions I have with my horses. Now in Part 2 we’ll look at the basic procedure.
Part Two: One Click – One Reinforcer
So how exactly does all this work? When my learner gives me the response I’m looking for, I mark that behavior with a clear, distinctive, consistent signal. That’s the “click” in clicker training. It can be made with a plastic clicker, but more often I use a tongue click.
The click is ALWAYS followed by a reinforcer. Some people use the clicker differently: when they click, they may reinforce, or they may ask for the behavior again. For them the click serves more as a keep-going signal, and other indicators, such as reaching into the treat pouch, become the clearer marker signal for the horse.
This is NOT how I use the click. The click is a cue. You use and respond to cues all the time. If you were to say “come” to a dog, you would want “come” to mean “come”. You wouldn’t want it to mean “come” this time, but next time it might mean spin in circles. That doesn’t make any sense.
The same holds true with the click. It tells your learner that he has just done something you like, and he should now go into treat-retrieval behavior. That means he first orients to you, the handler, to find out where his treat is going to be delivered. Are you going to bring it to him, or does he need to come to you to get it? That’s just a detail. What he can count on is: if I click, he’s going to get reinforced. That’s a pairing that I want to keep very clear to avoid confusion and frustration.
When I was first exploring clicker training, I watched canine behaviorist Gary Wilkes with his cattle dog, Megan. She was a clicker superstar, especially when Gary gave her new puzzles to solve. Gary used treatless clicks as a shaping tool. Megan would become frustrated when something that had just gotten a click and treat now only earned her a click. In her frustration she’d try harder or she’d offer something new. That’s what Gary wanted. He was using treatless clicks and the frustration they caused to get behavior to vary.
Gary warned people about falling into patterns. Humans like patterns and our animals are very good at spotting them. So he cautioned everyone who was experimenting with treatless clicks to watch out for inadvertent patterns.
I was impressed by Megan, so I asked my horse what he thought of the technique. I found he wasn’t the only one who was getting frustrated. If I clicked and treated the first time, but not the next, what should I do on the third and fourth trial? Was I falling into a pattern? Treatless clicks gave me too many things to keep track of. I decided two things:
First: it may have been okay for Gary to use treatless clicks with Megan. She seemed emotionally resilient enough to work through the frustration, but I was going to be sitting on the animal I was training, and frustration didn’t seem like a good shaping tool to be using. I didn’t need to rely on this strategy. I had other ways I could get behavior to vary.
Second: I could be much clearer if I clicked behavior that met criterion and reinforced EVERY time I clicked. That avoided the trap of falling into patterns. What I varied was not WHETHER to treat, but what I used as my reinforcer.
Gary Wilkes was one of the early pioneers of clicker training. He helped introduce clicker training into the dog community. Since those early experiments, we have learned a great deal about how to set up our training plans for success. We don’t need to rely on treatless clicks and other extinction processes in our training. There are better techniques available to us.
Treating after every click can sound as though I am clicking every little thing my learner does. This is not the case. I click and treat on a one-to-one ratio: every time my learner meets criterion, I click, and every time I click, I treat. But the complexity of the behavior I’m looking for increases over time. What was reinforced in a previous session is now a component of a larger chain of behaviors. What I click changes over time, but always – if I click, I treat.
Coming Next: Reinforcers. If I treat after every click, what do I use for a reinforcer?