What did the Skinner box experiment show?

Operant conditioning according to Skinner

Skinner

B. F. Skinner shared Thorndike's view that all behavior is influenced by previously experienced behavioral consequences. In 1930 Skinner described his experiments with the Skinner box (a variant of the puzzle box, named after him), in which he experimented with rats and pigeons. Skinner initially built on Thorndike's work and developed his theories further: while Thorndike concentrated on basic trial-and-error behavior, Skinner made the various types of reinforcement that follow a demonstrated behavior the focus of his research.

Example: Skinner placed a test rat in a cage that contained some signal lamps for testing discrimination and generalization (see below) and a feeding bowl that could be filled from the outside. The cage also contained a lever which, depending on the experimental animal and the experimental arrangement, produced a different consequence:

Operant conditioning

Image source: Lefrancois (1994, 36)

Rat 1 was given food when it operated the lever, rat 2 was able to switch off the current flowing through the floor grate (see graphic) by operating the lever, and rat 3 received an electric shock when it operated the lever.

After several attempts, rat 1 and rat 2 repeatedly operated the lever, while rat 3 no longer operated the lever.
The rats had learned to repeat behavior with positive consequences (getting food, switching off the current) and to avoid behavior with negative consequences (electric shock). Skinner called this learning effect 'learning through reinforcement' or 'learning from success': the behavior (e.g. switching off the current in the cage floor) satisfies the need, and this reinforces the behavior.

Skinner conducted further experiments involving the signal lamps:
For example, the animal was only fed when it operated the lever while the ceiling lamp was on.

The animal could thus be conditioned to additional requirements: to trigger the consequence, it is not enough to carry out an activity (operating the lever); a second condition (e.g. a lit lamp) must also be fulfilled.

The test animals (in the Skinner box) had learned to bring about positive or pleasant consequences ("satisfiers") through their own behavior and to avoid or reduce unpleasant consequences ("annoyers").

"From Skinner's perspective, the behavior of the animal can be explained completely by external experiences (stimuli from the environment) - by food deprivation and the use of food as reinforcement." Zimbardo & Gerrig (1999, 208)

In operant conditioning, reinforcement follows a demonstrated behavior.
A consequence counts as reinforcement if it determines whether the behavior shown is repeated.
Skinner's learning theory is based on applying reinforcement after a learning individual has shown a desired behavior.

Example: A student is repeatedly late for school. Since his classmates laugh in admiration and the teachers do nothing about the rule violation, the student experiences the consequences of his behavior as positive.
Because of this positive reinforcer, the student will probably be late again in the future in order to enjoy the reinforcer again.

Operant means to operate on or in one's environment (to intervene). By emitting a behavior, an individual can influence the environment. "Literally, operant means 'influencing the environment' or 'becoming effective in it' (Skinner 1938)." Zimbardo & Gerrig (1999, 219)
In operant conditioning, therefore, an individual is active of his own accord because he performs an 'operation' in the environment: a behavior is shown (performed, carried out) which causes a reaction of the environment (= consequence of the behavior). Operant behavior does not have to be planned: many operantly conditioned behaviors are emitted spontaneously.

Today we could certainly also discuss the interaction with cognitive processes: the mental anticipation of the state that is reached when the reinforcer occurs. In Skinner's view, however, thought processes remain unconsidered in the black box of behaviorism.
 

Can you think of a behavior that you yourself learned from operant conditioning?

 Food for thought / practical examples
Going to school,
Writing letters,
Answering questions,
Fishing,
Sailing or
Playing with a dog.
A student is to prepare a presentation using a specified literature source. He not only works through the specified literature but also draws on other sources in order to be excellently prepared.
[a] He is praised by the teacher and his classmates. To experience this positive consequence (satisfier) again, he will do the same in the future.
[b] His classmates laugh at him and call him a nerd. To avoid this negative consequence (annoyer), he will change his behavior in the future.
A crying and screaming child is picked up by its parents in the evening. The parents are trying to end an aversive stimulus (the unpleasant crying and screaming in the evening). However, the child is reinforced: by crying and screaming loudly, it can get the parents' attention and care in the evening. The parents may react differently during the day, so it has to be evening for the crying and screaming to achieve the positive consequence. [This example contains both reinforcement (attention and care) and the second condition (it must be evening)!]
"I'm doing this because ..."
"So that you'll be quiet, I'll do it for you!"
In a dialogue, F. acknowledges topics that he likes with an affirmative "mhm" or "yes". He thus indirectly controls the choice of topic, as his conversation partner is reinforced for the respective topic.

 

 

Quotes on operant conditioning:
"The reinforcement is contingent on the reaction." Hilgard & Bower (1973, 131)

"This dependence on the subsequent reinforcing stimulus gives the term "effective reaction" its significance ... The effective reaction ... gains a meaning for behavior and assumes an identifiable form when it acts on the environment in such a way that a reinforcing stimulus is produced (1938, p. 22)." Hilgard & Bower (1973, 132)

"We reward people, we reinforce behavior." Skinner (1986), quoted from: Mietzel (1998 a, 137)

Excursus on instrumental conditioning:
Operant and instrumental conditioning are related approaches that merge into one another (cf. Fisseni, 2003, 415). Instrumental conditioning according to Dollard & Miller focuses on the instrument of the conditioning process. If, for example, the administration of food has an effect on future behavior (= learning process), then the food is to be viewed as an instrument of the learning process. Instead of objects (e.g. food), behaviors can also be regarded as instruments in the sense of instrumental conditioning:
"A behavior thus becomes an "instrument" for bringing about a pleasant consequence and avoiding an unpleasant one. This formulation is the background of the term "instrumental conditioning": an activity is a means to achieve a certain consequence." Mietzel (1998 a, 134)

 

Features of Skinner's operant conditioning


Discrimination / learning to differentiate
In Skinner's experiment, the rat learned that food is only available when the lamp is on and the lever is pressed. The animal can differentiate between two states: light off = no possibility of getting food, or light on = possibility of getting food (by pressing the lever).
In contrast to generalization, behavior is only rewarded in certain (= distinguishable) cases.
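
The contingency in this experiment can be written as a simple rule: food is delivered only when both conditions hold at the same time. The following Python sketch is only an illustration; the function name `food_delivered` and its parameters are assumptions made here and are not part of Skinner's description.

```python
def food_delivered(lamp_on: bool, lever_pressed: bool) -> bool:
    """Return True only if the lever is pressed while the lamp is on."""
    return lamp_on and lever_pressed


# Print the outcome for all four combinations of the two conditions.
for lamp_on in (False, True):
    for lever_pressed in (False, True):
        outcome = food_delivered(lamp_on, lever_pressed)
        print(f"lamp_on={lamp_on}, lever_pressed={lever_pressed} -> food={outcome}")
```

Only the combination "lamp on and lever pressed" yields food, which is the difference the animal has to learn to discriminate.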


Extinction
If the rats (no longer) received reinforcement for a behavior shown, Skinner observed the behavior less and less often after a certain period of non-reinforcement.
The frequency of a behavior slowly decreases if the behavior is not reinforced.
A behavior shown is no longer reinforced, so the desired consequence does not materialize. The effort and duration of extinction depend on the learning history and the type of reinforcement (see below).
"In operant conditioning, extinction means the absence of positive consequences for a certain behavior that is controlled by those positive consequences." Linden & Hautzinger (1996, 223)

The absence of an expected reinforcer can also be viewed as a punishment. One difficulty is identifying the reinforcers, as these can be hidden and need not be rational.
If the positive consequences fail to occur, the behavior may initially be shown in a much stronger form.
Example: A child screams and receives its mother's attention. If the mother no longer gives the child attention when it screams, the child may scream even louder for a short time in order to still obtain the positive consequence (the affection).

"Skinner, for example, understands forgetting as the disappearance of behaviors due to a slow process of decay over time." Lefrancois (1994, 163)

"Extinction is the decrease in the frequency of a learned behavior due to non-reinforcement until it finally occurs only randomly." Hobmair (1996, 152)

Spontaneous recovery means that behaviors which have already been weakened by a lack of reinforcement are shown again. In the middle of the extinction process, the behavior reappears without any apparent reason, although it was no longer being reinforced (conditioned).


Reinforcement

Skinner derived the principle of reinforcement from his experiments: "Reinforcement is the process that leads to a spontaneously exhibited behavior occurring more frequently." Hobmair (1996, 149)

Examples of learning through reinforcement:
- A child helps its mother because it gets chocolate for it.
- The husband cooks because his wife praises him for it.
- B. fulfills his wishes through shoplifting because he cannot afford the goods.
- D. takes an active part in primary school education.

Reinforcement is a pleasant consequence
Examples of a pleasant consequence:
- Chocolate that a child receives for an achievement.
- Praise given to a husband for his cooking.
- Wishes fulfilled by shoplifting.
- Merit cards for particularly good performance in primary school.

"If a behavior is directly followed by something that is perceived as pleasant or that leads to success, then this behavior occurs more frequently in the future. It has been reinforced by the pleasant consequence. We therefore refer to pleasant consequences of a behavior as reinforcement. Reinforcement increases the frequency of a behavior or the strength of an attitude." Schmitt (1999, 4)

Instead of positive behaviors, negative behaviors can also be reinforced.
Example of reinforcement of negative behavior:
"A drunk man pesters someone in the pub and says he could buy him a beer. To calm him down, the man pays for his beer. This reinforces the behavior of "pestering others to get a beer". He is likely to use this behavior to get free beer more often in the future." Schmitt (1999, 4)

One strives to experience a pleasant consequence again and tries to adapt one's behavior based on past situations.

Reinforcers differentiated according to their effect

Positive reinforcement
A behavior leads to a desired positive consequence.
Examples of positive reinforcement:
- smile at someone who immediately smiles back;
- order something that I receive immediately;
- Mr N. often drives too fast. Despite the constantly excessive speed, he is neither stopped by the police nor does an accident happen to him, which means that Mr. N. continues to drive too fast.

Positive reinforcement means that a behavior is shown repeatedly in a certain situation because the previous reactions to this behavior brought positive consequences.
The behavior is followed by a positive event. From a pedagogical point of view, positive reinforcement is a useful method of increasing the frequency of a behavior through reward and success.

"1. A positive reinforcer is a stimulus that increases the likelihood of an effective reaction occurring when it is added to a situation. Examples of positive reinforcers are food, water, sexual contact, etc." Hilgard & Bower (1973, 135)
"Positive reinforcers are stimuli and events whose response-contingent (i.e. immediately following) presentation leads to an increase in the frequency of a behavior [...]" Linden & Hautzinger (1996, 63)

Negative reinforcement
A behavior leads to the absence of an unpleasant (aversive) consequence (escape).
Examples of negative reinforcement:
- Driving slowly to avoid being caught by a speed camera.
- Putting on sunglasses to avoid being dazzled.
- Mr M. drives very carefully and slowly. No accident happens to him and he does not have any problems with the police, which he also wants to prevent. Mr. M. will continue to drive carefully in order to prevent any negative consequences.
- A. has got used to countering his nervousness with autogenic training.
- B. has found that his test anxiety diminishes when he takes tranquilizers.

Negative reinforcement means that a behavior is shown in a certain situation because unpleasant consequences could be avoided or eliminated by this behavior in the past.
The behavior is followed by the absence of an unpleasant (aversive) event, i.e. not only is there no punishment for the behavior; punishment is actively avoided (through prophylaxis).

"2. A negative reinforcer is a stimulus that increases the likelihood of an effective reaction when it is removed from a situation. Examples of this are loud noise, very bright light, extreme heat or cold, an electric shock, etc." Hilgard & Bower (1973, 135)
"[...] negative reinforcers are understood to be stimuli and events whose response-contingent removal or termination leads to an increase in the frequency of a behavior [...]." Linden & Hautzinger (1996, 63)

Positive punishment (also called "Type I punishment")
A behavior leads to an unpleasant consequence.
Examples of positive punishment:
- G. runs in a wet hallway, falls down and sprains his ankle.
- Out of boredom, H. plays with a knife and cuts himself.

Negative punishment (also called "Type II punishment")
A behavior leads to the absence of a pleasant consequence.
Examples of negative punishment:
- J. was aggressive towards a colleague yesterday and is not greeted by her today (because of this).
- L. complains about his girlfriend's cooking, whereupon she refuses to cook for him at all in the coming weeks.
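
The four categories described in this section differ in only two respects: whether the consequence is pleasant or aversive, and whether it occurs or is absent. The following Python sketch makes this two-by-two scheme explicit. It is only an illustration; the function name `classify_consequence` and the labels "pleasant", "aversive", "added" and "removed" are assumptions chosen here, not terminology from the cited sources.

```python
def classify_consequence(stimulus: str, change: str) -> str:
    """Classify a consequence by stimulus quality and by whether it is added or removed.

    stimulus: "pleasant" or "aversive"
    change:   "added" (the consequence occurs) or "removed" (it is absent or withdrawn)
    """
    table = {
        ("pleasant", "added"): "positive reinforcement",
        ("aversive", "removed"): "negative reinforcement",
        ("aversive", "added"): "positive punishment (Type I)",
        ("pleasant", "removed"): "negative punishment (Type II)",
    }
    try:
        return table[(stimulus, change)]
    except KeyError:
        raise ValueError(f"unknown combination: {stimulus!r}, {change!r}") from None


print(classify_consequence("pleasant", "added"))    # a child receives chocolate
print(classify_consequence("aversive", "removed"))  # sunglasses stop the glare
print(classify_consequence("aversive", "added"))    # G. sprains his ankle
print(classify_consequence("pleasant", "removed"))  # J. is no longer greeted
```

The two reinforcement cases make the behavior more frequent, and the two punishment cases make it less frequent.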


Learning through punishment (See Schmitt (1999 a, 9 - 14))
If a behavior is followed by an unpleasant consequence, one speaks of punishment. Punishment is intended to reduce undesirable behaviors / attitudes, however undesirable behaviors / attitudes are not permanently eliminated, but only temporarily suppressed or weakened (behavior suppression).
Often one only learns to avoid punishment through more skillful behavior. But more appropriate, more favorable behavior is not learned.
Example behavior suppression as a result of punishment:
P. has a strong desire for fast cars. Since he cannot afford such a car, he keeps stealing sports cars. Eventually he is caught and sentenced by a court to a substantial fine. After six months he steals sports cars again, but this time he goes about it more skillfully so as not to be caught. His attitude towards this way of obtaining fast cars has not changed; the behavior was only weakened and suppressed in the meantime.

Instead of undesirable behavior, however, desirable behavior can also be punished.
Example of punishment for desirable behavior:
Employee R. appropriately criticizes an order from his superior that he considers unacceptable. Some time later he is transferred to another department with a flimsy explanation. The actually desirable behavior of 'openly voicing one's opinion' was punished by this transfer. The employee will now probably prefer to keep his criticism to himself and perhaps vent his anger at home.

Punishment can also consist in the fact that pleasant consequences that one expected from certain behaviors / attitudes no longer occur or are withdrawn. The withdrawal of reinforcers is therefore an unpleasant consequence (withdrawal of reinforcers = punishment).
A behavior that no longer yields the expected reinforcement is then given up as pointless and no longer shown.
Example of punishment by withdrawing the expected reinforcer:
The pupil M. tries to get attention from her classmates by showing off and bragging. However, since the classmates are fed up with her bragging, they ignore her behavior. Since M. does not receive any attention in this way, she stops bragging.

"Punishment, however, in no way leads
- to a reinforcement of desirable behavior,
but always only
- to a short-term weakening or suppression of an undesirable behavior." Heineken & Habermann (1994, 48)

The probability of a behavior occurring can be reduced by punishment, but more effectively by the withdrawal of reinforcers. Aversion therapy, for example, is derived from learning through punishment:
"Aversion treatment is a collective term for a number of different treatment methods which have in common that an aversive stimulus is linked in immediate temporal proximity to a clinically undesirable behavior. The aim of such treatment methods is to reduce the future occurrence of the undesired behavior." Linden & Hautzinger (1996, 93)

Krapp & Weidenmann (2001, 150) point out "[...] that a punishment only stops undesirable behavior, but does not yet build up a desirable alternative behavior."


Reinforcers differentiated according to the timing of their occurrence

Continuous reinforcement / Always reinforcement
Every occurrence of the desired behavior leads to a reward (reinforcer). Continuous reinforcement means reinforcing every time the desired behavior is shown.
New behaviors are acquired quickly, but their stability is low, i.e. the behavior is also unlearned more quickly.
Example of continuous reinforcement:
- Y. gets a small piece of chocolate for every chore done in the household.
- S. is praised for every form of politeness towards strangers.

Intermittent Reinforcement / Sometimes Reinforcement / Partial Reinforcement (English: 'intermittent reinforcement')
The desired behavior is reinforced every now and then. It is not reinforced after every desired behavior, but from time to time.
Modified behaviors are learned more slowly, but are retained for a longer period of time and are not (as with continuous reinforcement) dependent on permanent reinforcement.
Example of intermittent reinforcement:
- As he gets older, the dog A. is petted less and less often when he obeys the command 'Come!' and comes to his master. The reinforcement (petting) is given irregularly.
- The sentence "Your food always tastes good, but today it's great! What's in there?" praises an excellent, unusual achievement of the cook and motivates the cook to prepare such a meal more often.

"One of Skinner's most important discoveries was that initial learning is facilitated by continuous reinforcement, but that extinction time is increased by intermittent reinforcement. Although most of his experiments were carried out on animals, it is believed that these results are in general also transferable to human behavior." Lefrancois (1994, 211)

"Extinction takes longer if the behavior to be extinguished has been learned and maintained under changing, uneven (so-called intermittent) reinforcement conditions." Linden & Hautzinger (1996, 223)

Intermittent reinforcement means only reinforcing the behavior every now and then. New behavior is acquired more slowly, but is more firmly imprinted, i.e. the learned behavior is retained longer.

 

Subdivision into:
Quota reinforcement (English: 'fixed ratio reinforcement')
A desired behavior is reinforced according to a fixed numerical ratio.
E.g. a pigeon has to show the behavior ten times in order to be reinforced (rewarded) once ('10:1 frequency reinforcement'). See Angermeier (1991, 61)
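
The 10:1 rule can be sketched as a small counting procedure. The following Python example is only an illustration under the assumption that reinforcement is given exactly on every tenth response; the function name `fixed_ratio_schedule` and its parameters are hypothetical and not taken from Angermeier.

```python
def fixed_ratio_schedule(num_responses: int, ratio: int = 10) -> list[bool]:
    """Return, for each response in order, whether it is reinforced.

    With ratio=10, every tenth response is reinforced (10:1).
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [i % ratio == 0 for i in range(1, num_responses + 1)]


outcomes = fixed_ratio_schedule(30, ratio=10)
reinforced_at = [i for i, hit in enumerate(outcomes, start=1) if hit]
print(reinforced_at)  # [10, 20, 30]
```

With `ratio=1` the same function describes continuous reinforcement, since every response is then reinforced.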

Interval reinforcement (English: 'interval reinforcement')
Behavior is rewarded at certain time intervals. A fixed number of reinforcers is distributed over a certain period of time. E.g. a behavior (if it is shown) is to be reinforced 3 times per hour, not more often.
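
A fixed-interval schedule of this kind can be sketched as follows. The example assumes that "3 times per hour" corresponds to an interval of 20 minutes and that only the first response after the interval has elapsed is reinforced, with the interval restarting at each reinforcement. The function name `fixed_interval_schedule`, its parameters and the sample response times are hypothetical and serve only as an illustration.

```python
def fixed_interval_schedule(response_times: list[float], interval: float = 20.0) -> list[bool]:
    """Return, in chronological order, whether each response is reinforced.

    Times are given in minutes since the start of the session. Only the first
    response made at least `interval` minutes after the last reinforcement
    (or after the start) is reinforced; all earlier responses go unrewarded.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    outcomes = []
    last_reinforced = 0.0  # the session starts at minute 0
    for t in sorted(response_times):
        if t - last_reinforced >= interval:
            outcomes.append(True)
            last_reinforced = t
        else:
            outcomes.append(False)
    return outcomes


times = [5, 12, 21, 25, 39, 44, 58]
print(fixed_interval_schedule(times, interval=20.0))
# [False, False, True, False, False, True, False]
```

However often the animal responds within an interval, the number of reinforcers per hour cannot exceed three under these assumptions.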
"With the fixed interval reinforcement, the animal is required to peck after a specified time in order to be rewarded. All reactions within the specified period, however often they may be, are not rewarded." Angermeier (1991, 62)

Optimal behavior:
A mother gives her child a piece of candy and praises it every time it tries to dress itself. In this way, she builds up the behavior of "dressing oneself" through "always reinforcement". The better the child masters the behavior, the less often it gets a candy. Eventually the mother only needs to praise every now and then. The behavior has become a matter of course and the child is proud of its achievement. So the mother switches to sometimes reinforcement and thus encourages self-reinforcement by the child ('pride').

Three essential characteristics of self-reinforcement:
"- The individual administers the reinforcers to himself. In contrast to the examples discussed so far, subject and object are united in one person.
- The individual must be able to freely dispose of the reinforcers.
- The individual does not reinforce himself at will, but only after the appearance of specific behaviors." Edelmann (1996, 120)

Not all behaviors / attitudes are sustained through self-reinforcement. There are also behaviors / attitudes that are sustained through regular material rewards (e.g. professional practice) or social recognition (e.g. honorary positions). Reinforcement is therefore 'more valuable' if it is not received so often that its effect wears off.

Because the reinforcer is only given now and then, it loses less of its value and satiation does not set in so easily; the behavior becomes more independent of external consequences and is not given up so quickly if the reinforcement is absent every now and then. As external consequences of a behavior decrease in importance, the learner also becomes more independent of reinforcement by others (external reinforcement). This also promotes the transition to self-reinforcement. In self-reinforcement, internal consequences such as "pride" or "satisfaction" promote the behaviors / attitudes that led to these pleasant feelings. The control over, and also the responsibility for, a behavior lies with the person concerned and no longer with other people, as in external reinforcement. Thus, the promotion of self-reinforcement with its positive effects on personal development (independence, autonomy) is a desirable educational goal.


Types of reinforcers (See Schmitt (1999 a, 5) and Linden & Hautzinger (1996, 63))

Primary reinforcers are consequences that lead to the satisfaction of physiological (basic) needs (such as water, food, sexual contact, ...).
Secondary reinforcers are not essential to life (e.g. praise, grades, permission to play, ...).

Material reinforcers: As the term suggests, they consist of material things (e.g. money, flowers, candy, a music CD, ...); an employee's salary is also a material reinforcer.
These types of reinforcers are usually associated with a financial outlay and therefore also promote material dependency.

Action reinforcers consist of pleasant activities or actions (e.g. going to the cinema, watching TV, reading, swimming, playing games, listening to music, ...).

Social reinforcers (also: action reinforcers): These reinforcers are not always linked to costs and are often carried out in the presence of other people. They tend to prevent material dependency and encourage leisure activities and social behavior. Social reinforcers consist of pleasant interpersonal contact (e.g. praise, caresses, attentive listening, applause, ...).
"This form of reinforcement is the easiest and most direct to give and does not cost money. It has the most beneficial effects on the development of appropriate social behavior and a mature personality (self-esteem, self-confidence, behavioral awareness, etc.). The replacement of material reinforcement by social reinforcement should be the goal of any reinforcement planning. Newly learned behavior should be sustained as far as possible through the influence of natural human relationships." Schmitt (1999 a, 5)


Features of reinforcement (See Schmitt (1999 a, 6 - 8))
When reinforcing, it should be noted that not every individual can be rewarded with the same reinforcers: which reinforcers are effective is determined by the basic attitudes and preferences of the respective individual. E.g. a boy who does not like spinach cannot be reinforced with this food. Spinach is not a pleasant consequence for the boy.
Furthermore, one and the same reinforcer can have different effects in different situations: whether a consequence has a reinforcing effect depends on our needs, our demands and our expectations.
Examples of the suitability of a reinforcer:
- If K. has just eaten a large sundae, she is much less pleased about another sundae than she was before the first.
- When E. is alone and lonely, he is pleased to get a call. E. is annoyed about a call, on the other hand, if it interrupts him while he is watching an interesting TV show.

Furthermore, reinforcers are culture-dependent: In our culture, for example, prestige, social status, power, wealth, fame, strength and intelligence are highly regarded. See Lefrancois (1994, 37)

The person who reinforces a behavior also plays a role: significant people (e.g. boss, best friend, spouse, ...) have a higher reinforcement value than people who are unimportant or indifferent to the learner.
For example, praise for a good job from the supervisor is worth significantly more to the person concerned (in the sense of reinforcement) than the same praise from the company's trainee, who is below him in the company hierarchy.


Shaping (gradual approach)
In processes of adaptation, it has often been observed that the behaviors to be learned or acquired emerge only gradually. Complex behaviors often cannot be learned all at once (especially by children). Therefore, the (target) behavior must be divided into small sub-steps that are worked through one after the other. After a sub-step has been successfully completed, the requirements are increased slightly. Behavior shaping is understood as the successive (= one after the other, step-by-step) building up of a desired, complex behavior in which every approximation to the desired behavior is reinforced.

Skinner referred to the gradual modification of behavior as "shaping of behavior".
Example: Shaping of behavior
A toddler should learn to tie their own shoes with a bow:
Steps that are reinforced each time: the child holds on to a shoelace, the child places the shoelaces on top of each other, the child makes a loop for the bow, the child ties the shoe with a bow.
Once the target behavior has finally been achieved or built up, it still needs to be maintained and consolidated. Behavior is maintained most effectively when always reinforcement is replaced by occasional reinforcement (sometimes reinforcement).

"Trainers are well versed in this method. As a kind of bravura performance, inspired by recent studies on great apes, I taught a rat a complicated sequence of reactions. The behavior consists in the animal having to pull on a string in order to obtain a marble. Then the marble is to be picked up with the front paws, carried to a tube protruding two inches above the floor of the cage, and finally thrown in. Each phase of this process had to be practiced in several approximate steps, because the reactions it contained did not belong to the original repertoire of the rat (1938, pp. 339-340)." Hilgard & Bower (1973, 150)

"The gradual approach is a method in which successive approximations are differentially reinforced (Skinner, 1951). The experimenter reinforces every step that brings the animal closer to the final reaction ..." Lefrancois (1994, 41)

"Using what he called the method of successive approximations, or behavior shaping, Skinner succeeded in teaching pigeons to play ping-pong." Krech & Crutchfield (1992, volume 3, p. 36)

"His [Skinner's] program was designed to lead the learner through a series of carefully planned learning steps. The order of the individual learning steps is determined by the program. It consists of a larger number of learning units, or frames, as they are called following the English-language terminology. Within each frame, information is presented whose successful storage is checked before the next learning unit is processed." Mietzel (1998 a, 21)


Transfer of learning (See Schmitt (1999 a, 9))
One speaks of learning transfer when a behavior (an attitude) that has been reinforced under certain conditions also occurs in other, similar situations.
Example of learning transfer:
A client who has learned to deal verbally with conflict in a social worker's group lesson is likely to behave similarly in future conflict situations.