
I’m wondering if one reason so many people feel compelled to stop using rewards is a common misapplication of psychology.
I expect most of us know how a variable reward schedule will keep people pumping money into a slot machine once they start, because they never know when the payout will come. They keep putting in more coins because, who knows, the next one might just win the jackpot! And so they spend far more than they meant to.
Variable reward schedules basically create a behaviour that’s very resistant to extinction. If the fruit machine never paid out you’d soon stop playing, but if it pays unpredictably you’ll carry on for much longer before you give up.
I don’t think this is all that relevant when we’re teaching a dog to do something though. It just muddies the water if we’re trying to communicate to a dog that we like what they’re doing but we don’t always reward it. A particularly intelligent dog might decide it isn’t worth the bother, or might start to make connections that aren’t really there. Skinner did an experiment with pigeons where food was delivered at regular intervals regardless of what the birds were doing. They ended up with strange “superstitious” behaviours: if they happened to, say, nod their head just before the food arrived, they behaved as though the nod had caused the reward when actually it was coming anyway - so they kept doing it even though it made no difference to the reward schedule at all. And of course, if they kept offering head nods, some of those nods would keep accidentally coinciding with the food, which strengthened the connection further.
So maybe an intelligent dog (who isn’t 100% proofed on sit yet) is going to try to figure out why sit #1 was good enough but sit #2 wasn’t, and come to the conclusion that it’s because they turned their head to the side the first time. Then you’ve got a dog that turns its head every time it sits, just like a person who wears their lucky underpants to every football game because clearly that’s what causes their team to win!
Obviously I’m not talking about shaping here, where you progress to rewarding closer and closer approximations of the goal behaviour. Nor am I talking about giving a jackpot reward for a particularly fast sit, or a recall from something tempting, or whatever. I’ve used shaping with River since he was 10 weeks old, and he understands perfectly that I will ask him to change his behaviour as we train, and that if I withhold a reward it’s because that particular behaviour wasn’t right and he needs to figure out how to change it. But he’s very intelligent and has 2 years of experience with this kind of training. He’d be extremely confused if I arbitrarily skipped a reward when he was sitting perfectly.