Reinforcement and Punishment
Positive reinforcement, negative reinforcement, consequences, rewards, positive punishment, negative punishment, reinforcement schedules, and social reinforcers
Classical conditioning: Pavlov claimed that people learn when an event (stimulus) evokes a response. In his famous experiment with dog food, a bell, and a slobbering dog, he showed that a stimulus that did not originally cause the response (a ringing bell) can be paired with one that does (dog food) to create a new association in the subject's (the dog's) brain. The new stimulus (bell ringing) then causes the same response (slobbering) as the original stimulus (dog food).
John B. Watson (1920s) demonstrated how a similar procedure could cause learning in people. In his famous experiment he paired the sight of a rat with a loud, startling noise to teach an 11-month-old boy, Little Albert, to fear rats.
E. L. Thorndike (1913) described the law of effect, which claims that behavior is influenced by the effects that follow it.
Skinner (1940s) later refined these ideas and described learning as operant conditioning. A person operates in the environment with a certain behavior, which results in either reinforcement, which will increase the behavior, or punishment, which will decrease it.
Premack Principle (1959): using a preferred activity as a reward for a less preferred activity (Grandma’s rule). For example: eat your beans before you can have dessert.
We think of events and their related behaviors as reinforcers, and of the results that follow these behaviors as consequences. Consequences can be thought of as reinforcers, which maintain or increase behavior, or punishers, which decrease or eliminate behavior. They occur after a behavior and can either add something new to the environment (party, recess, free time, ... ) or remove something present from the environment (alarm squeal, person's presence, swarming bees, ...).
A positive reinforcer is usually perceived as something pleasant (attention, privileges, honors, social approval, free time, freedom, grades, praise, tokens, trophy, food, candy, sticker, smile, star, being with friends, being in class, socializing, learning, understanding ... ).
A negative reinforcer is usually perceived as something unpleasant or something the person wants removed or taken away (pain from a thorn, seat belt alarm, fire alarm, failing, fear, unhappiness, being under pressure, stress, nagging parent or teacher, attention, being teased, being bullied, ... ).
However, by definition, for something to be a reinforcer it only needs to increase the frequency of a behavior. A positive reinforcer increases behavior when it is present. Likewise, by definition, something is a negative reinforcer only if it increases the frequency of a behavior when it is removed. A child may misbehave to get increased attention (positive reinforcement), or may misbehave to get his parents to stop arguing (negative reinforcement).
A reward is a gift, recognition, or something given for service, effort, or achievement. Rewards are often confused with positive reinforcers, but a consequence is a positive reinforcer only if it actually increases behavior. A reward may cause students to participate or make an effort to achieve something, but it does not require that any behavior be positively reinforced. In fact, many rewards actually decrease the likelihood of a behavior being repeated: a student may make the effort to achieve the reward but have less desire to repeat the behavior, or make the effort to receive a medal and then stop making the effort afterward. Was it the medal, or the effort in pursuit of personal achievement, that was the reinforcer?
Punishment is a consequence of a behavior that weakens or decreases the behavior when it is present (actual or imagined). It is often retribution for a behavior.
Because reinforcement always increases behavior, negative reinforcement is not the same as punishment. For example, a parent who spanks a child to make him stop misbehaving, and actually decreases the child's misbehavior, is using punishment. A parent who stops nagging once the child starts studying, so that the child actually studies more, is using negative reinforcement: something unpleasant is removed and the behavior increases.
Shaping is the gradual application of operant conditioning, in which closer and closer approximations of a behavior are reinforced. For example, an infant who learns that smiling elicits positive parental attention will smile at its parents more. Babies generally respond well to operant conditioning.
Behavioral systems are physiological processes in the brain that trigger behavior. Release of dopamine, adrenaline, ... will increase activity in the behavioral activation system (BAS) and the behavioral inhibition system (BIS). Dopamine causes feelings of joy, hope, and interest, and an optimistic urge to approach an object or event for personal gain.
Verbal prompts with reinforcement: When reinforcing an event, be careful about giving verbal or nonverbal prompts. The prompt should be removed as soon as possible so that the person associates the behavior with something other than the prompt. For example, a child enters a room and does not hang up his or her coat. If you prompt the child, the prompt may become associated with hanging up the coat rather than with coming into the room or taking the coat off. Use a firm and direct prompt, and do NOT add "OK", because that implies a choice when none is intended. If there is resistance, you might repeat the directive once; if compliance is still not achieved, apply a brief time out and return to the practice.
Tangible Rewards - Recess, money, coupons to exchange for gifts, free time, video, stickers, trophy, certificates, medals, token economy...
The first public school in New York City used a token economy of coupons for toys. It was abandoned because the school's trustees felt it fostered a mercenary spirit.
Alfie Kohn (1993), in his book Punished by Rewards, cites thirty years of research showing that the more people are rewarded for completing a task, the more they lose interest in that task in the long run.
See the passage in Stargirl where one character asks the other why anyone would do a random act of kindness without ever being acknowledged.
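Behavior increases - usually positive
- Positive reinforcement (+ reinforcer + behavior): something is added and the behavior increases.
- Negative reinforcement (- reinforcer + behavior): something is removed and the behavior increases. Example: the teacher stares until the student works.
Behavior decreases - usually negative
- Positive punishment (+ punishment - behavior): something is added and the behavior decreases.
- Negative punishment (- punishment - behavior): something is removed and the behavior decreases.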
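The four combinations above can be sketched as a small lookup. This is only an illustration and not part of the original notes: the names `Consequence` and `classify_consequence` are hypothetical, and the sketch assumes the only two inputs are whether something was added to or removed from the environment and whether the behavior then increased or decreased.

```python
from enum import Enum


class Consequence(Enum):
    """The four kinds of consequence named in these notes."""
    POSITIVE_REINFORCEMENT = "positive reinforcement"
    NEGATIVE_REINFORCEMENT = "negative reinforcement"
    POSITIVE_PUNISHMENT = "positive punishment"
    NEGATIVE_PUNISHMENT = "negative punishment"


def classify_consequence(stimulus_added: bool, behavior_increased: bool) -> Consequence:
    """Classify a consequence (hypothetical helper, for illustration only).

    stimulus_added:     True if something was added to the environment,
                        False if something was removed from it.
    behavior_increased: True if the behavior became more frequent afterward,
                        False if it became less frequent.
    """
    if behavior_increased:
        # Anything that increases behavior is a reinforcer by definition.
        if stimulus_added:
            return Consequence.POSITIVE_REINFORCEMENT
        return Consequence.NEGATIVE_REINFORCEMENT
    # Anything that decreases behavior is a punisher.
    if stimulus_added:
        return Consequence.POSITIVE_PUNISHMENT
    return Consequence.NEGATIVE_PUNISHMENT


# The teacher's stare ends (removed) once the student works (behavior increases).
print(classify_consequence(stimulus_added=False, behavior_increased=True).value)
```

Note that the classification depends on what the behavior actually did afterward, not on whether the consequence was intended as pleasant or unpleasant.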
When two behaviors oppose each other, a change in one direction for one produces a change in the opposite direction for the other. Distracting behaviors decrease or stop as on-task behaviors increase.
Continuous - all responses deemed satisfactory are reinforced.
Intermittent - some responses deemed satisfactory are reinforced.
- Variable - meaning unpredictable
- Variable ratio - reinforcement is given for a set proportion of appropriate responses (for example, 50%), but the person does not know which ones. It could come after 1, 3, 4, or more appropriate responses, or several times in a row; only the overall rate stays proportional. Each decision could be made by a coin flip after the response is deemed appropriate, by a roll of a die with even numbers reinforced and odd numbers not, or by some other random generator.
- Variable interval - reinforcement is based on the passage of time, at irregular intervals that keep to an overall average. Say, for example, you want to acknowledge a student roughly every minute on average. You walk by and smile after 1 1/2 minutes, then 1/2 a minute later make eye contact and smile, then a minute later say something positive, then two minutes later answer a question and follow up immediately with a positive statement about the last five minutes.
- Fixed - meaning predictable
- Fixed ratio - proportional, but predictable. For example, at twenty-five percent (one in four), a person who was just reinforced knows that the next three appropriate responses will not be reinforced and the fourth will. It can also be thought of as: I have to do this four times before I get reinforced.
- Fixed interval - based on time, and predictable. Reinforcement occurs every ten minutes if the behavior is appropriate, or it is provided for the first occurrence within a ten-minute period. If there is no occurrence in that period, it is provided whenever the first occurrence happens. After reinforcement is provided, the clock resets and timing starts again.
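The four intermittent schedules above can be sketched in code. This is a minimal illustration under stated assumptions, not a definitive model: the class names are hypothetical, the variable ratio schedule is treated as an independent chance per response (like the coin flip described above), the interval schedules reinforce the first appropriate response after the waiting time has passed, and the variable interval draws each wait uniformly between half and one and a half times the mean, which is an arbitrary choice.

```python
import random


class FixedRatio:
    """Reinforce every n-th appropriate response (predictable)."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, time=None):
        self.count += 1
        if self.count == self.n:
            self.count = 0  # start counting toward the next reinforcement
            return True
        return False


class VariableRatio:
    """Reinforce each appropriate response with probability p (unpredictable)."""

    def __init__(self, p, rng=None):
        self.p = p
        self.rng = rng or random.Random()

    def respond(self, time=None):
        return self.rng.random() < self.p  # the "coin flip" for this response


class FixedInterval:
    """Reinforce the first appropriate response once `period` has elapsed."""

    def __init__(self, period):
        self.period = period
        self.last = 0.0  # time of the last reinforcement

    def respond(self, time):
        if time - self.last >= self.period:
            self.last = time  # the clock resets after reinforcement
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but each wait is drawn at random around a mean."""

    def __init__(self, mean, rng=None):
        self.mean = mean
        self.rng = rng or random.Random()
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: waits are uniform between 0.5x and 1.5x the mean.
        return self.rng.uniform(0.5 * self.mean, 1.5 * self.mean)

    def respond(self, time):
        if time - self.last >= self.wait:
            self.last = time
            self.wait = self._draw()
            return True
        return False


# One in four, predictable: the fourth and eighth responses are reinforced.
fr = FixedRatio(4)
print([fr.respond() for _ in range(8)])
# Ten-minute fixed interval, with responses at minutes 3, 10, 12, 19, and 25.
fi = FixedInterval(10)
print([fi.respond(t) for t in (3, 10, 12, 19, 25)])
```

Each `respond` call stands for one response already judged appropriate; the return value says whether that response is reinforced. A continuous schedule would simply return `True` every time.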
- Ignore inappropriate behavior. (time out or extinction)
- Reinforcement should immediately follow the appropriate behavior. (reward)
- Reinforcement must be contingent on the specific behavior. (reward)
- Reinforcement should be individual. (A reinforcer for one may not be for another)
- Reinforce continuously at first. (to associate behavior with consequences)
- Reinforce approximations of behavior. (shaping)
- Reinforce intermittently after behavior is established. (move from extrinsic to intrinsic)
Dr. Robert Sweetland's Notes ©