Contact: LDRidgeway at gmail dot com

Thursday, October 1, 2009

2Q versus 4Q

[Posted to PositiveGunDogs on 10-1-2009 at 7:48 AM]

In the thread "Hunting vs Trialing":

On Oct 1, 2009, at 6:27 AM, Denise Parlin wrote:
Being new to this list, can someone (in private might be better, to save everyone else) please explain 2Q vs 4Q to me so I can have a better understanding?
Hi, Denise. Others might not know the terminology either, so I'll post my perspective on these terms to the list.

Operant conditioning (OC), one of the recognized behavioral mechanisms, is commonly described in terms of four quadrants: positive reinforcement (+R or R+), negative reinforcement (-R or R-), positive punishment (+P or P+), and negative punishment (-P or P-).

Reinforcement refers to outcomes that result in the previous behavior increasing in subsequent trials.

Punishment refers to outcomes that result in the previous behavior declining in subsequent trials.

Positive refers to the subject's learned anticipation that a stimulus will occur.

Negative refers to the subject's learned anticipation that a stimulus will not occur or will stop occurring.

The stimuli that occur as outcomes might be rewards (pleasant to the subject) or aversives (unpleasant to the subject).

A reward or an aversive might or might not have a conditioning effect. That is, you might give a treat, and the dog might eat the treat and seem to like it, but that doesn't guarantee that the preceding behavior will increase.

By contrast, the terms "reinforcement" and "punishment" refer specifically to a statistical change in behavior: an increase for reinforcement, a decrease for punishment.

So combining all of the above, we have the four OC quadrants:
  • +R: a reward outcome that results in the preceding behavior increasing
  • -R: an aversive that occurs before the behavior, which the subject learns a behavior to escape, OR an aversive outcome that the subject learns a behavior to avoid (a synonym for -R is escape/avoidance conditioning)
  • +P: an aversive outcome that results in the preceding behavior declining
  • -P: a reward that occurs before the behavior, where the subject learns not to perform the behavior in order to keep the reward, OR the loss of opportunity for a reward as an outcome that results in the preceding behavior declining (the latter is much more common than the former in my experience)
As you can see, two of the quadrants involve reward stimuli, while the other two involve aversive stimuli and escape/avoidance.

4Q trainers use all four quadrants. 2Q trainers do not use (or at least try not to use) the two quadrants that involve aversive stimuli. That is, 2Q trainers limit themselves to +R and -P.
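
For readers who find a lookup table easier to follow, the quadrant definitions and the 2Q/4Q distinction above can be sketched in a few lines of Python. This is only an illustration of the classification as described here; the function names (quadrant, stimulus_kind, quadrants_used) are hypothetical and chosen for this example.

```python
# Illustrative sketch only: all names here are made up for this example.

ALL_QUADRANTS = {"+R", "-R", "+P", "-P"}


def quadrant(stimulus_expected: bool, behavior_increases: bool) -> str:
    """Name the quadrant from the two distinctions described above.

    stimulus_expected: True if the subject learns that a stimulus will occur
        (positive), False if it learns the stimulus will not occur or will
        stop occurring (negative).
    behavior_increases: True for reinforcement, False for punishment.
    """
    sign = "+" if stimulus_expected else "-"
    effect = "R" if behavior_increases else "P"
    return sign + effect


def stimulus_kind(q: str) -> str:
    """Return which kind of stimulus a quadrant involves."""
    if q not in ALL_QUADRANTS:
        raise ValueError(f"unknown quadrant: {q!r}")
    # +R delivers a reward; -P withholds or removes one.
    # -R and +P both involve an aversive.
    return "reward" if q in ("+R", "-P") else "aversive"


def quadrants_used(style: str) -> set:
    """Return the quadrants a 2Q or 4Q trainer uses (or tries to use)."""
    if style == "4Q":
        return set(ALL_QUADRANTS)
    if style == "2Q":
        # 2Q trainers avoid the quadrants that involve aversive stimuli.
        return {q for q in ALL_QUADRANTS if stimulus_kind(q) == "reward"}
    raise ValueError(f"unknown training style: {style!r}")
```

For example, quadrant(False, True) gives "-R" (the anticipated removal or absence of a stimulus, with the behavior increasing), and quadrants_used("2Q") gives the set containing "+R" and "-P".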

"2Q training" is almost a synonym for "positive training", but some positive trainers try to limit themselves exclusively to +R.

I'll throw in a few more points because they are so important to field training, though they may play little role in other animal training:
  • For a field retriever, the most important +R by far is not any outcome provided (in the dog's perception) by a human, but is rather the opportunity to retrieve, especially if the article is a bird, and most especially if the article is a live or recently shot bird. Thus all field trainers, 2Q and 4Q alike, rely primarily on positive reinforcement.
  • By the same token, the field context also provides far more opportunities for environmental +R than the training and event venues of most other sports. As a result, undesired behaviors that are positively reinforced by environmental stimuli are a vastly more challenging problem in field training than in most other animal training.
  • Conversely, the field context also provides far more hazards for environmental -R or +P than the training and event venues of most other sports. As a result, desired behaviors to which the dog learns an avoidance response are a vastly more challenging problem in field training than in most other animal training.
  • The behaviors involved in field training, both desired and undesired, are far closer to the dog's inbred instincts than the behaviors involved in most other sports. Instincts are not learned behaviors; they are inborn tendencies. While they can be influenced by OC, they remain very powerful and, often to the confusion of the trainer, tend to become more prominent as the dog spends more and more time involved in field work. The result is a learning curve in which the dog seems to respond as expected to the trainer's methods, and then, over time, behaviors that seemed to be established or on their way to being established begin to deteriorate. While a conflict between OC and instinct can occur in any training project, it's especially probable in field training.
  • The closer a stimulus or the opportunity for a stimulus is, the more it influences the subject. This fact is noticed in some other sports such as agility and obedience, which require some distance work. Of course, it becomes a huge factor in field training.
Sorry if I've given you a hundred times more information than you wanted. Sometimes it helps me to review these ideas for myself.

Lindsay, with Lumi & Laddie (Goldens)
Laytonsville, Maryland

Field training blog: http://lumi-laddie-test-series.blogspot.com (see "Archive of Video Blog Entries" in right margin)

YouTube playlists:
-- Lumi: http://www.youtube.com/view_play_list?p=BC338082E0B890DB
-- Laddie: http://www.youtube.com/view_play_list?p=9A44913FB240932A

To further explore the frontiers of dog training, join our DogTrek list at: http://groups.yahoo.com/group/DogTrek
