Reinforcement: Positive, negative, social, and punishment
- Historical background
- Current reinforcement related ideas
- Positive reinforcement
- Negative reinforcement
- Positive punishment
- Negative punishment
- Reinforcement schedules
- Social reinforcers
- Social reinforcement
Classical conditioning: Pavlov claimed that learning occurs when an event (stimulus) evokes a response. In his famous experiment with dog food, a bell, and a slobbery dog, he showed how a stimulus that did not originally cause a response (bell ringing) can be paired with one that does (dog food) to create a new association in the subject's (dog's) brain. The new stimulus (bell ringing) then causes the same response (dog slobbering) as the original stimulus (dog food).
John B. Watson (1920s) demonstrated how a similar procedure could cause learning in people. His famous experiment took an 11-month-old boy, Little Albert, and paired the sight of a rat with a loud, startling noise to teach Albert to fear rats.
E. L. Thorndike (1913) described the law of effect, which claims that behavior is influenced by the effects that follow it.
Skinner (1940s) later refined these ideas and described learning as operant conditioning. A person operates in the environment with a certain behavior, which results in a consequence: either reinforcement, which will increase the behavior, or punishment, which will decrease it.
Premack Principle (1959): using a preferred activity as a reward for a less preferred activity (Grandma's rule). "Eat your beans before you can have dessert."
Current reinforcement related ideas
We think of the events that precede a behavior as antecedents, the responses to them as behaviors, and the results that follow these behaviors as consequences (ABC). Consequences can be thought of as reinforcers, which maintain or increase behavior, or punishers, which decrease or eliminate behavior. They occur after a behavior and can either add something new to the environment (party, recess, free time, ... ) or remove something present from the environment (alarm squeal, person's presence, swarming bees, ...).
A positive reinforcer is usually perceived as something pleasant (attention, privileges, honors, social approval, free-time, freedom, grades, praise, tokens, trophy, food, candy, sticker, smile, star, getting to be with friends, being in class, socializing, learning, understanding ... ).
A negative reinforcer is usually perceived as something unpleasant or something a person wants removed or taken away (pain from a thorn, seat belt alarm, fire alarm, failing, fear, unhappiness, being under pressure, stress, nagging parent or teacher, attention, teasing, being bullied, ... ).
However, by definition, for something to be a reinforcer it needs to increase the frequency of a behavior. A positive reinforcer increases behavior when it is present. Likewise, by definition something is a negative reinforcer only if it increases the frequency of a behavior when it is removed.
A child may misbehave to get increased attention - positive reinforcement, or may misbehave to get his parents to stop arguing - negative reinforcement.
A reward is a gift, recognition, or something given for service, effort, or achievement. Rewards are often confused with positive reinforcers, which must increase behavior to actually be reinforcers. A reward may cause students to participate or put in effort to achieve something, but it does not require that any behavior be reinforced positively. In fact, many rewards actually decrease the likelihood of a behavior being repeated. For example, the offer of a reward can increase effort to achieve the reward, but after the reward is attained, there is less desire to repeat the behavior that won the award. It is also possible that a person is already reinforced by the personal pursuit of a goal for which a reward is being offered and competes to win the award anyway. In that case, it can't be known whether it was the medal or the personal pursuit of achievement that was the reinforcer.
Punishment is a consequence of a behavior that weakens or decreases the behavior when present (actual or imagined). It is often retribution for a behavior.
Because reinforcement always increases behavior, negative reinforcement is not the same as punishment. For example, a parent who spanks a child to make him stop misbehaving, and actually decreases the child's misbehavior, is using punishment. A parent who stops nagging once the child starts studying, and whose child actually studies more as a result, is using negative reinforcement, because something unpleasant is removed and the behavior increases.
Shaping is the gradual application of operant conditioning, in which closer and closer approximations of a behavior are reinforced. For example, an infant who learns that smiling elicits positive parental attention will smile at its parents more. Babies generally respond well to operant conditioning.
Behavioral systems are physiological processes in the brain that trigger behavior. Release of dopamine, adrenaline, ... will increase activity in the Behavioral activation system (BAS) and the Behavioral inhibition system (BIS). Dopamine causes feelings of joy, hope, and interest, and an optimistic urge to approach an object or event for personal gain.
Verbal prompts with reinforcement: When reinforcing an event, be careful about giving verbal or nonverbal prompts. The prompt should be removed as soon as possible so the person will associate something other than the prompt with the behavior. For example, a child enters a room and does not hang up his or her coat. If you prompt the child, then the prompt may become associated with hanging up the coat, rather than coming into the room or taking the coat off. Use a firm and direct prompt and do NOT add "OK?", because that implies a choice when none is intended. If there is resistance, you might repeat the directive once, and if compliance is not achieved, apply a brief time out and then return to the practice.
Tangible Rewards - Recess, money, coupons to exchange for gifts, free time, video, stickers, trophy, certificates, medals, token economy...
The first public school in New York City used a token economy of coupons for toys. They abandoned it because the school's trustees felt it fostered a mercenary spirit.
Alfie Kohn (1993) in his book Punished by Rewards cites thirty years of research showing the more people are rewarded for completing a task, the more they lose interest in that task in the long run.
Passage in Stargirl where one character asks the other why one would do a random act of kindness without ever being acknowledged.
Behavior increases - usually positive
Behavior decreases - usually negative
(+ reinforcer + behavior)
(+ punishment - behavior)
teacher stares till work
(- reinforcer + behavior)
(- punishment - behavior)
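The four combinations above can be sketched as a small lookup. This is only an illustration, assuming that the first sign records whether something is added (+) or removed (-) and the second records whether the behavior increases (+) or decreases (-). The names `Stimulus`, `Effect`, and `classify_consequence` are hypothetical and chosen for this sketch.

```python
from enum import Enum


class Stimulus(Enum):
    ADDED = "+"    # something is presented after the behavior
    REMOVED = "-"  # something is taken away after the behavior


class Effect(Enum):
    INCREASES = "+"  # the behavior becomes more frequent
    DECREASES = "-"  # the behavior becomes less frequent


def classify_consequence(stimulus: Stimulus, effect: Effect) -> str:
    """Name the consequence from what was done and what the behavior did."""
    table = {
        (Stimulus.ADDED, Effect.INCREASES): "positive reinforcement",
        (Stimulus.REMOVED, Effect.INCREASES): "negative reinforcement",
        (Stimulus.ADDED, Effect.DECREASES): "positive punishment",
        (Stimulus.REMOVED, Effect.DECREASES): "negative punishment",
    }
    return table[(stimulus, effect)]


# Something unpleasant is removed and the behavior increases.
print(classify_consequence(Stimulus.REMOVED, Effect.INCREASES))
```

Note that the label depends on the observed effect on behavior, not on whether the consequence was meant to be pleasant or unpleasant.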
When there are opposing behaviors, a change in one of them produces a change in the opposite direction in the other. Distracting behaviors decrease or stop as on-task behaviors increase.
Continuous - all responses deemed satisfactory are reinforced.
Intermittent - some responses deemed satisfactory are reinforced.
- Variable - meaning unpredictable
- Ratio - reinforcement is proportional (for example, 50% of the time), but the person does not know when it will come. It could be after 1, 3, 4, or more instances of appropriate behavior. It could also be several in a row. However, the overall rate of reinforced to unreinforced responses stays proportional. It could be determined by a coin flip after the response is deemed appropriate, a roll of a die with even numbers being reinforced and odd not, or some other random generator.
- Interval - based on the passage of time, at intervals that are irregular but proportional. Say, for example, you want to acknowledge a student on average about every minute. You walk by and smile after 1 1/2 minutes, then 1/2 a minute later make eye contact and smile, then a minute later say something positive, then two minutes later answer a question and follow up immediately with a positive statement about the last five minutes.
- Fixed - meaning predictable
- Ratio - is proportional, but predictable. For example, reinforcement might be given twenty-five percent of the time, or one time in four. People know that if they were just reinforced, there will be three more times without reinforcement before it comes again. It could also be thought of as "I have to do this four times before I get reinforced."
- Interval - is based on time and predictable. Reinforcement will occur every ten minutes if the behavior is appropriate, or it will be provided for the first occurrence within a ten-minute period if one happens; if not, then whenever the first occurrence happens. After reinforcement is provided, the clock resets and timing starts again.
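The schedules above can be compared with a short simulation. What follows is a minimal sketch under stated assumptions, not a standard implementation: every function name is hypothetical, responses are assumed to be already judged satisfactory, response times are in minutes and in ascending order, and the interval schedules assume the reading that the first response after the waiting period has passed is reinforced and the clock then resets. Drawing the variable interval uniformly between half and one and a half times the mean is likewise just one convenient choice.

```python
import random


def continuous(n_responses):
    """Continuous: every satisfactory response is reinforced."""
    return [True] * n_responses


def fixed_ratio(n_responses, every=4):
    """Fixed ratio: reinforce every `every`-th response (one in four here)."""
    return [(i + 1) % every == 0 for i in range(n_responses)]


def variable_ratio(n_responses, p=0.5, rng=None):
    """Variable ratio: reinforce each response with probability `p` (a coin flip at 0.5)."""
    rng = rng if rng is not None else random.Random()
    return [rng.random() < p for _ in range(n_responses)]


def _interval(response_times, next_wait):
    """Reinforce the first response once the current wait has passed, then reset the clock."""
    reinforced, clock_start, wait = [], 0.0, next_wait()
    for t in response_times:
        due = t - clock_start >= wait
        reinforced.append(due)
        if due:
            clock_start, wait = t, next_wait()
    return reinforced


def fixed_interval(response_times, interval=10.0):
    """Fixed interval: the wait is always `interval` minutes."""
    return _interval(response_times, lambda: interval)


def variable_interval(response_times, mean_interval=1.0, rng=None):
    """Variable interval: the wait varies around `mean_interval` minutes."""
    rng = rng if rng is not None else random.Random()
    return _interval(
        response_times,
        lambda: rng.uniform(0.5 * mean_interval, 1.5 * mean_interval),
    )


print(fixed_ratio(8))
print(fixed_interval([1, 5, 10, 12, 19, 20, 31]))
```

Each function returns one True or False per response, so the overall proportion reinforced can be compared across schedules while the predictability differs.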
- Ignore inappropriate behavior. (time out or extinction)
- Reinforcement should immediately follow the appropriate behavior. (reward)
- Reinforcement must be contingent on the specific behavior. (reward)
- Reinforcement should be individual. (A reinforcer for one may not be for another)
- Reinforce continually at first. (to associate behavior with consequences)
- Reinforce approximations of behavior. (shaping)
- Reinforce intermittently after behavior is established. (move from extrinsic to intrinsic)
When people use the phrase:
"I've told you a hundred times."
they need to realize that it is not the child who is dense.
Misbehavior is generally discouraged with punishment:
The behaviors in the chart are punishable misbehaviors identified in United States schools by Hyman (Hyman, pp. 13-14).
|Excessive talking in the classroom, hallway, lunchroom...||Indecent language or gestures|
|Insolence toward school staff||Stealing|
|Fighting or attacking school personnel||Defacing and vandalizing school property|
|Gambling||Throwing objects in class or around school grounds|
|Loitering in unauthorized places||Dishonesty|
|Rudeness||Not bringing required instructional materials to class|
|Absenteeism from class or school||Leaving class or school without permission|
|Disobeying requests of school staff||Not completing assignments|
|Inattention to classroom activities||Possession of weapons|
|Habitually breaking the dress code||Body odor|
|Cheating||Extortion of other students|
Type 1 punishment is application of an aversive event after a behavior.
Type 2 punishment is removal of a positive event after a behavior.
Technically, punishment is a consequence that produces a decrease in the rate of a behavior.
- If a child was spanked for running onto the road and stops running onto the road, then the spanking was punishment.
- If the child continues to run onto the road, then she was not punished.
In the classroom, if a child completes an assignment, the teacher says "very good," and the frequency of completion decreases because of the teacher's praise, then the student has been punished.
Again, punishment is technically defined by its effect on behavior.
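To make the point concrete, the sketch below labels a consequence only by comparing how often the behavior occurred before and after the consequence was applied. It is an illustration under assumptions: the function names and the `tolerance` cutoff are hypothetical, and a real assessment would also need to rule out other causes of the change.

```python
def behavior_rate(count, periods):
    """Occurrences of the behavior per observation period."""
    if periods <= 0:
        raise ValueError("number of observation periods must be positive")
    return count / periods


def functional_label(baseline_rate, followup_rate, tolerance=0.10):
    """Label a consequence by what the behavior did after it was applied.

    `tolerance` is a hypothetical cutoff: relative changes smaller than
    this are treated as no change.
    """
    if baseline_rate <= 0:
        raise ValueError("baseline rate must be positive")
    change = (followup_rate - baseline_rate) / baseline_rate
    if change <= -tolerance:
        return "punishment"
    if change >= tolerance:
        return "reinforcement"
    return "neither"


# Praise followed by fewer completed assignments: 8 in 10 days before, 4 in 10 days after.
before = behavior_rate(8, 10)
after = behavior_rate(4, 10)
print(functional_label(before, after))
```

In this example the praise is labeled punishment, because the label follows the change in the rate of the behavior, not the teacher's intent.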
Punishment can include sounds, smells, tastes, visual images, or physical sensation.
Research findings on both types of punishment are mixed: in some studies they work, and in others they do not.
Research also shows that punishment decreases the misbehavior of people who are not punished themselves but who observe or hear about the punishment of others (Foxx, 1982; Axelrod, 1983; Van Houten, 1983). By definition this is punishment, since it reduces the future probability of the behavior.
Baer (1971) argues that punishment is legitimate, commendable, and justifiable when it relieves persons of the even greater punishments that could result.
- Identify the rationale for the treatment.
- Identify techniques to use.
- Use the doctrine of the least restrictive alternative. This means that other, less intrusive procedures must be considered and/or tried before punishment is presented. This is based on the premise that the individual has the right to basic human freedoms. The intervention should not cause pain, tissue damage, humiliation, discomfort, or stigma as expected side effects accompanying the behavior change. Carr and Lovaas (1983) state that punishment by contingent presentation of a stimulus should not be the method of first choice, even when trying to reduce self-injurious behavior. First try 1) DRO (differential reinforcement of other behavior); 2) DRO with extinction; 3) time out from positive reinforcement, so that all environmental reinforcement is reduced; and 4) DRO combined with positive practice overcorrection, where the intent is to have the individual practice appropriate, alternative responses. There may be times when none of these is appropriate, but you should have considered them and know why they are not appropriate.
- Know whether the issue is related to cruel and unusual punishment. According to Longo (1981), cruel and unusual punishment serves no more effective purpose than a lesser punishment and is inflicted arbitrarily.
- The 8th Amendment provides protection from this, and the 14th protects individuals from harm.
- This protection has been upheld by the courts in several cases (Wheeler v. Glass, 1973; NY Association for Retarded Children v. Carey, 1975).
- Also, in Ingraham v. Wright (1977) the court upheld the notion that paddling (swatting a student on the buttocks) in the presence of witnesses does not violate constitutional protection against cruel and unusual punishment. To lessen the risk, three controls should be set up: 1) a review mechanism should be followed before, during, and after the punishment is administered; 2) staff should be properly trained and supervised; and 3) informed consent should be obtained from parents or legal guardians. Informed consent should include a review of the materials used to deliver the stimulus and a discussion of the nature and side effects of the program; all involved should experience the aversive stimulus themselves; and the public should be made aware of the proposed treatment. The person asked for consent should be capable of understanding the program (appropriate language, mentally competent, no jargon).
- Assess the severity: is the behavior harmful or too major to ignore?
- Rule out medical problems
- Does the behavior restrict the individual from earning reinforcement? If not, use another strategy.
- Is the behavior being maintained by reinforcers? If so, use another strategy.
- Could the environment of the classroom be changed? Reduce distractions, add sensory objects to have the subject engage in alternate behavior, ... see environmental.
- Could the curriculum be adapted?
- To what extent does an antecedent event affect performance?
If you decide punishment is the right procedure, then decide which procedure will be most effective.
Examine previous intervention records for other types and success of interventions and medical records.
If a least restrictive model is used, then the following should be considered, in this order: response cost, time out from positive reinforcement, and overcorrection.
The final decision should be based on:
- the individual characteristics of the child and the behavior problem,
- the likelihood of the program being implemented and carried out in a consistent manner,
- the probability of successfully eliminating the behavior, and
- the ethical and legal legitimacy of using the procedure. Then implement and evaluate.
- When any punishing event occurs, it should be accompanied by a verbal reprimand, e.g. "NO" or "STOP THAT", so that the verbal reprimand acquires an aversive property. Collect data to see if the program is working.
- If a teacher uses mild forms of punishment, e.g. repeatedly saying "please sit down," it may be less effective than saying "sit down" once and meaning it. Madsen, Becker, Thomas, Koser, and Plager (1968) showed that repeated use of a reprimand served to increase behavior rather than decrease it.
- To be effective, the aversive stimulus must be delivered immediately after the maladaptive behavior occurs. It is even more effective to apply the punishment at the start of the procedures, since this stops the secondary reinforcers from being initiated (Barlow, 1972).
- Be consistent and keep warnings to a minimum.
- Use a combination of procedures, especially DRI (differential reinforcement of incompatible behavior) and DRO. Use the FAIR PAIR RULE: for every behavior you decrease, increase one or more others. Walker (1979) found punishment was most successful if used with reinforcement of incompatible behavior.
- Holz, Azrin, and Ayllon (1963) found punishment ineffective in reducing psychotic behavior when the behavior was the only way to receive reinforcement.
- Watch for an increase in other inappropriate behaviors that may result from the punishment. Students who are reprimanded for leaving their seats may stay in their seats, but begin to shout or snap their fingers for the teacher's attention.
- Punishment creates negative emotions. Since punishment is an aversive event, it can cause fear, anger, anxiety, and withdrawal, and can undermine relationships.
- The punisher is negatively reinforced. That is, when the punisher applies the punishment and the behavior stops, the punisher is reinforced by the decrease of that behavior. The next time the behavior occurs, the punisher will likely use the same procedure to remove the negative effect, resulting in a belief that punishment works.
- Example: a teacher starts the day or week by telling students to "be quiet!" Later, the "be quiet!" does not get the same reaction, so s/he says "be quiet!" a little louder and gets results. The next time students do not respond the first time, s/he thinks, "Well, last time I was a little louder, so let's try it again." Before long s/he is yelling, resulting in an increase of the teacher's aggressive behavior.
- Severe punishment can produce aggressive behavior to terminate the punishment.
- Most of the time when people are being punished they are thinking about revenge. They may also resolve never to be caught again. If a person does decide not to repeat the behavior that caused the punishment, they do so because of fear and intimidation, not because they have developed a caring attitude, principles of right and wrong, a strong character, or strong moral values.
- Punishment can cause a person to withdraw. A person punished for being late to class may not come to class. Individuals can also escape through alcohol and drugs, tuning out, or suicide.
- People who observed aggressive behavior used the same behaviors (Bandura, 1969, Bobo doll study).
- Negative modeling: spanking a child while saying, "This will teach you not to hit anyone."
- Punishment is unpredictable.