Extinction After Intermittent Reinforcement

Intermittent Reinforcement

When reinforcement is discontinued after a history of intermittent reinforcement (extinction), the amount and nature of responding depend on the specific history of the organism. After continuous reinforcement (every response is reinforced), extinction is quite rapid. A pigeon, for example, will peck anywhere from one to roughly 300 times before stopping. After some schedules of intermittent reinforcement, however, over 60,000 responses might occur without reinforcement during an 8-hour period before the frequency of pecking falls significantly. Other conditions of intermittent reinforcement will produce much less responding. How much behavior a particular history of reinforcement will generate is not intuitively apparent and depends on the specific schedules of reinforcement.

Figure 7 illustrates the extreme range of amounts of behavior that may occur in extinction after two schedules of reinforcement. After a history of reinforcement for every 50 responses (a fixed-ratio schedule), the bird responds rapidly at first but stops within several hundred responses, and for all practical purposes no further behavior will occur. After a history of variable-ratio reinforcement, in which reinforcements occurred every 300 responses on the average but variably, sometimes after only a few responses and sometimes after 500 or 600, extinction responding is much more prolonged: the bird emits many thousands of responses, and many hours elapse before behavior ceases. The previous schedule of reinforcement determines not only how much behavior will occur in extinction but also the particular manner in which it will be emitted. The amount and character of extinction after different kinds of reinforcement will vary by many orders of magnitude.
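The two delivery rules contrasted above can be stated precisely. The following sketch is only an illustration of the schedule definitions; the function names and the particular randomization are assumptions, not part of the experiments described. It lists which responses in a run would be reinforced under a fixed ratio of 50 and a variable ratio averaging 300:

```python
import random

def fixed_ratio_reinforcers(n_responses, ratio):
    """Fixed-ratio rule: every `ratio`-th response is reinforced."""
    return [r for r in range(1, n_responses + 1) if r % ratio == 0]

def variable_ratio_reinforcers(n_responses, mean_ratio, rng):
    """Variable-ratio rule: the number of responses required for each
    reinforcement varies (here drawn uniformly between 1 and
    2 * mean_ratio - 1, so the requirement averages `mean_ratio`)."""
    reinforced, next_due = [], 0
    while True:
        next_due += rng.randint(1, 2 * mean_ratio - 1)
        if next_due > n_responses:
            break
        reinforced.append(next_due)
    return reinforced

print(fixed_ratio_reinforcers(300, 50))  # → [50, 100, 150, 200, 250, 300]
print(variable_ratio_reinforcers(3000, 300, random.Random(0)))
```

Under the fixed-ratio rule the reinforced responses are evenly spaced; under the variable-ratio rule they come sometimes after a few responses and sometimes after several hundred, which is the property associated with prolonged responding in extinction.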

It is, perhaps, paradoxical that intermittent reinforcement, which in general weakens behavior, should prolong its elimination from the repertoire. Within limits, it is generally true that the amount of behavior in extinction is greater when the previous reinforcement has been infrequent. The examples which will be presented below illustrate the importance of this property of intermittent reinforcement in the practical control of behavior.

Parents frequently extinguish the night-time crying of their young children once they discover that the behavior is maintained by parental attention rather than physical discomfort. How long and how much the child cries during extinction depend upon how the parent previously attempted to stop the crying. If, in the past, the parent appeared each time the child cried, the sudden nonreinforcement of crying will lead to its rapid elimination. On the other hand, if there was a history of inconsistent attention in which the parents appeared variably, sometimes after a short period of crying and sometimes only after prolonged crying, there will be much more crying during the extinction period.

Intermittent Reinforcement In Human Behavior

A schedule of reinforcement specifies the arrangement between the performance being analyzed and its consequence. In extrapolating the effects of intermittent reinforcement to complex cases, it is important to distinguish between the technical use of the term reinforcement and the colloquial use of reward by specifying the exact performance that is reinforced and the exact stimulus that is the reinforcer. Consider, for example, an analysis of the reinforcers maintaining the behavior of a salaried worker. Superficially, the salary might be considered fixed-interval reinforcement, in the sense that money is a reinforcing event delivered every 14 or 30 days. In the technical sense of reinforcement, however, the money reinforces only the behavior of accepting the paycheck. Although the money may be a necessary condition for maintaining all of the behaviors associated with the person's employment, it has only indirect effects on the person's day-to-day activity. The reinforcers involved in the bulk of the performances of employment are the specific consequences of the person's work; examples of these are given below. A schedule of reinforcement could still have some function in the delivery of the salary check, however. Consider, for example, a situation in which the checks are delivered somewhat variably and without clear notice. If the check is badly needed, the individual may frequently look toward the desk where the checks are delivered, or telephone to ask whether it has arrived yet.

Strained Behavior Under Fixed-Ratio Schedules

Nearly all of the intermittent reinforcement of occupational behavior is on ratio schedules. A salesman will usually sell his product in proportion to the number of calls he makes; the probability of a sale does not change in any fundamental sense simply with the passage of time. A clear effect of a schedule of reinforcement is seen in the piecework pay of the factory worker. This is a fixed-ratio schedule of reinforcement, in which the employee is paid directly as a function of the amount of behavior he emits. It is well known that this type of incentive pay produces a high rate of activity compared with any other system. Like all forms of ratio reinforcement, however, these schedules may leave both the salesman and the pieceworker with a low disposition to continue behaving (fixed-ratio strain) that results from too much work per reinforcement. An increase in the size of the fixed ratio might produce a general abulia, in which the worker's disposition to do his job would be weakened considerably. Some of the objections to piecework pay systems are based on the fear that the employer will decrease the amount of pay per unit of work. The student is on a similar piecework schedule when he studies for examinations or writes an essay or long report. He must sustain his performance for a long time and emit a large amount of behavior in the form of words written, references read, and copy edited. The ratio strain appears as a low inclination to return to work just after the examination or after completing the term paper. As is typical of ratio schedules, the student works in fits and starts: once he begins, the behavior is sustained at high rates, but he is erratic as to when he actually works. The low inclination to begin work is an example of the same process responsible for the pause after reinforcement when we require a pigeon to peck a large number of times for each grain reinforcement.

Social Behavior Of Persuasion

The closest approximation to continuous reinforcement in social behavior is the execution of the social amenities. In our culture, reactions to "Hello," "Good morning," or "How are you" are almost inevitable. Almost all other verbal behavior, however, involves considerable intermittent reinforcement. The teacher is reinforced on a variable-ratio schedule by the responses of the students. If the instructor is reinforced by the students' mastery of the material, the amount of verbal behavior on the teacher's part per unit of effect on the student varies widely and depends, at least roughly, on the amount of the instructor's activity. In general, it is a variable schedule of reinforcement in which the changes in the student's behavior are somewhat unpredictable, particularly for the inexperienced teacher. When aversive control is used to produce behavior in another person, as in nagging, threatening, and cajoling, the person who applies the aversive stimuli is intermittently reinforced. The amount of behavior required to establish an aversive state of affairs specifies a ratio schedule of reinforcement; variability enters because different states of the individual being controlled require varying amounts of nagging, threatening, and so forth. This intermittent reinforcement of the controller's nagging and teasing by the controllee makes it very difficult to eliminate this kind of behavioral control.

Nonverbal Intermittent Reinforcement

The behavior of the gambler is, in general, a simple example of a variable-ratio schedule of reinforcement, and the slot machine is an even closer analogue of the pecking pigeon. Over the long run, the number of coins delivered is roughly proportional to the number of plays. Technically, the schedule is a variable-ratio schedule of reinforcement in which coins are delivered after varying numbers of slot-machine operations. The effect of the variable-ratio schedule of reinforcement on the pigeon and the gambler is identical: a very high rate of responding even at low net reinforcement frequencies.
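The slot machine's payout rule can be made concrete. Strictly speaking, such machines approximate a random-ratio schedule, a close relative of the variable-ratio schedule in which each play pays off independently with a fixed probability. The sketch below is only illustrative; the function name and the particular numbers are assumptions:

```python
import random

def slot_machine_payouts(n_plays, mean_ratio, rng):
    """Random-ratio rule: each play pays off independently with
    probability 1 / mean_ratio, so payouts arrive after irregular
    run lengths averaging `mean_ratio` plays."""
    return [play for play in range(1, n_plays + 1)
            if rng.random() < 1.0 / mean_ratio]

wins = slot_machine_payouts(10_000, 20, random.Random(42))
gaps = [b - a for a, b in zip([0] + wins, wins)]
print(f"{len(wins)} payouts in 10,000 plays; "
      f"mean gap of {sum(gaps) / len(gaps):.1f} plays between payouts")
```

The irregular gaps between payouts, with an unchanging average, are what make the player's next win unpredictable yet the house's take proportional to total play.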

Performances associated with waiting for a bus provide examples of interval schedules of reinforcement. Since the appearance of the bus depends only on the passage of time, the behavior of looking down the street is reinforced by sight of the bus on a fixed-interval schedule. The result is a low frequency of looking when the person first arrives at the bus stop and a continuous increase in the frequency of looking down the street as time passes, until it reaches a stable and maximum rate. A corresponding performance exists with the person waiting for a pot of water to boil. The frequency of looking at the pot when it is first put on the stove is very low. Just as in the pigeon pecking on the fixed-interval schedule, however, the frequency of responding increases as the time approaches when reinforcement can occur.

The typical fixed-interval scallop seen in the cumulative response curve of a pecking pigeon may be observed in human examples only when the conditions of reinforcement are of the proper order of magnitude. With extremely strong behavior, as for example waiting for an ambulance, there may be no scallop or pausing: the person looks down the street continuously and with a high frequency. When the pot of water is needed very badly, the person may hover over the stove. On the other hand, the housewife for whom the boiling water is simply a discriminative stimulus to prepare some other food which is not urgently needed, and who is occupied with other chores about the kitchen, is not likely to look at the pot very often and will probably notice it only after it has boiled for some period of time. These changes are of the same sort that can be produced in a pigeon by varying the level of food deprivation.
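The fixed-interval rule in the bus-stop example can be stated exactly: reinforcement (sight of the bus) becomes available only after a fixed time has elapsed, and the first look after that moment is the reinforced response. A minimal sketch, with assumed numbers; the accelerating spacing of the look times mirrors the scallop described above:

```python
def fixed_interval_session(interval, look_times):
    """Fixed-interval rule: reinforcement is 'set up' every `interval`
    seconds, and the first look at or after each setup is reinforced.
    Returns the reinforced look times."""
    reinforced, next_setup = [], interval
    for t in sorted(look_times):
        if t >= next_setup:
            reinforced.append(t)
            # as in the laboratory schedule, the next interval is
            # timed from the reinforced response
            next_setup = t + interval
    return reinforced

# looking becomes more frequent as the interval elapses (the 'scallop'):
looks = [55, 110, 150, 175, 185, 192, 196, 199, 201]
print(fixed_interval_session(200, looks))  # → [201]
```

All the early looks go unreinforced; only the look just after the 200-second setup pays off, which is why responding concentrates near the end of the interval.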
