Intermittent Reinforcement



A response will still occur for some time after it is no longer reinforced. If a further reinforcement occurs before responding ceases, responding will begin anew for another period of time. Such intermittent reinforcement will produce stable states of responding which will be maintained for as long as a schedule of reinforcement is continued. Literally thousands of different schedules of intermittent reinforcement are possible, many of which produce stable and specific behavioral effects. Schedules of reinforcement have been studied extensively in the laboratory, and a considerable technical literature of experimental investigations is available (Ferster and Skinner, 1957). Many of the performances produced by intermittent reinforcement are not intuitively apparent and emerge only from a technical analysis. To study these effects in the laboratory, a simple piece of behavior, such as the pecking response of the pigeon, is useful because it takes only a very brief time to emit, can be repeated easily, and is under the control of a known and manipulable reinforcer. Such performances provide a very sensitive dependent variable: the rate of pecking can vary from zero up to 30,000 pecks per hour. The arbitrary response is taken as a representative item in the bird's repertoire; any other performance would show the essential features of the relevant processes.



Fig. 1. Typical apparatus used to study the operant behavior of the pigeon and the monkey.

The arbitrariness of the performance measured in these experiments makes it relatively easy to compare different species. Figure 1 illustrates experimental techniques with a pigeon and a monkey. In each case, some performance natural to the species operates an electric switch, which defines the unit of response and the variations in response topography that are acceptable. The operation of the switch and the correlated delivery of food produce a class of responses with functional similarity. Because such experiments may be programmed automatically, they permit objective automatic recording and make possible long-term experiments in which thousands of reinforcements may be necessary to establish a particular performance. It is not unusual in such an experiment to record 5 to 10 million responses from a single animal. Figure 2 shows some performances on a fixed-interval schedule, illustrating the phylogenetic generality of the process and the kind of interspecies comparisons that can be made.

Changes in the frequency of the reinforced responses are the major effect of the various kinds of intermittent reinforcement. The schedule of reinforcement maintaining an organism's behavior is one of the major determiners of whether the animal will, in fact, behave at any one point in time and of the particular manner in which the behavior will be emitted. Conditions of reinforcement which are superficially similar and involve approximately equal frequencies of reinforcement may produce highly divergent dispositions to behave. The field of schedules of reinforcement is relevant to the maintenance of behavior rather than to learning. The question of a schedule of reinforcement arises after the organism has acquired (learned) the particular performance. The question, "Will the organism emit a response already in its repertoire on a given occasion?" arises even when we are considering how the organism develops new performances. The likelihood of action can vary over a wide range, and perhaps the largest single determining factor is the particular schedule of reinforcement by which the behavior is maintained.

Although some intermittent reinforcement occurs naturally in animal behavior, it is the social nature of human behavior which is the greatest source of intermittent reinforcement, because reinforcements are mediated by another individual. Reinforcements which depend upon a behavioral process in another organism, and which are therefore partially a function of variables not entirely under the control of the individual being reinforced, introduce a large measure of uncertainty in determining what effect a given performance might have. Intermittent reinforcement may occur even in such routine behaviors as a housewife cooking a meal. The eating behavior of the various members of the family is a function of many variables other than the quality of the cooking, and various members of the family may not eat the meal as a result of factors over which the housewife has no control.

The major dimension of a schedule of reinforcement is how the behavior produces a reinforcement: on the basis of elapsed time (interval reinforcement) or on the basis of the number of responses (ratio reinforcement, so called because the schedule fixes the ratio of responses to reinforcements). Ratio schedules of reinforcement (number schedules, piecework) produce very high rates of responding. This is the schedule of reinforcement which occurs predominantly in human behavior, where nearly all the important consequences occur as a result of a certain amount of behaving. Climbing stairs is reinforced on a fixed-ratio schedule: a fixed amount of behavior is required to get to the top. The same is true of digging a hole, turning a piece of metal in a lathe, writing a letter, shaving, telling a story, or persuading someone. In each of these cases, the final consequence maintaining the behavior does not become more probable with the passage of time but only with the emission of the necessary amount of behavior. Interval schedules occur somewhat less frequently in human affairs but are nonetheless of great theoretical importance. Under interval schedules of reinforcement, a response (the first response after the interval elapses) produces a reinforcement periodically. Here, the number of responses per reinforcement is not specified; only the passage of time makes the reinforcement possible. Looking into a pot of water is reinforced on a fixed-interval schedule by seeing it boil; looking down the street while watching for the bus is reinforced on an interval schedule by the sight of the bus; dialing a telephone number after a busy signal is reinforced on an interval schedule by the telephone being answered. The appearance of the bus is not hastened by the number of times one looks down the street, and it is proverbial that watching the pot does not make it boil. Nevertheless, the relevant response is reinforced after sufficient time has elapsed.
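The two contingencies can be stated compactly. The sketch below is a minimal Python rendering of the two rules as just described; the class names and the shared respond interface are illustrative assumptions of this sketch, not anything taken from the experimental literature.

    class FixedRatio:
        """Reinforce every nth response; elapsed time is irrelevant."""
        def __init__(self, n):
            self.n = n
            self.count = 0

        def respond(self, now):
            # 'now' is accepted only so both schedules share one interface.
            self.count += 1
            if self.count >= self.n:
                self.count = 0
                return True   # reinforcement delivered
            return False

    class FixedInterval:
        """Reinforce the first response after t seconds have elapsed."""
        def __init__(self, t):
            self.t = t
            self.last = 0.0   # time of the last reinforcement

        def respond(self, now):
            # Responses emitted before the interval elapses go
            # unreinforced, however many there are.
            if now - self.last >= self.t:
                self.last = now
                return True
            return False

Note that the interval rule does not deliver food merely because time has passed; the lapse of the interval only makes the next response effective, which is why the watched pot still has to be looked at.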

A major difference between the ratio and interval schedules of reinforcement becomes apparent when reinforcement is made less frequent, either by requiring a larger number of responses per reinforcement in the ratio schedule or by lengthening the interval between reinforcements in the interval schedule. In both types of schedule the over-all rate of responding bears an approximate relationship to the frequency of reinforcement: the less frequent the reinforcement, the lower the over-all rate. As the number of responses required per reinforcement is increased on the ratio schedule, longer and longer pauses, or periods of inactivity, begin to appear. At extreme values, the organism ceases to respond. Whenever the animal does behave, however, it responds at high rates, even though the number of responses required per reinforcement strains the animal considerably. Figure 3 illustrates the effect of increasing the number of pecks required per reinforcement in the pigeon.

Whenever the bird pecks at all, the rate of pecking is over 3 pecks per second. The pause after reinforcement grows longer as more responses are required. When the reinforcement is made variable, behavior can be sustained at frequencies of reinforcement which produce severe strain on a fixed-ratio schedule. Figure 4 illustrates a high sustained rate from a pigeon where reinforcement occurs, on the average, after every 375 responses, a piece rate that normally leads to long pauses and low over-all rates. Under the interval schedule of reinforcement, on the other hand, a more continuous relationship exists between the actual rate of responding and the frequency of reinforcement (the interval between reinforcements). Even at extremely low frequencies of reinforcement, e.g., every 120 minutes in a pigeon, the organism will continue to respond fairly continuously, although at a low over-all rate. Figure 5 illustrates variable-interval performances in a pigeon where reinforcement occurs on the average every 1, 2, 3, 6, and 10 minutes. With every decrease in the frequency of reinforcement, there is a corresponding decrease in the rate of pecking. The ratio and interval schedules also differ in that a special history is needed to maintain the performance at large values of the ratio schedule. If a large number of responses per reinforcement (ratio schedule) is required from the start, the organism will soon cease responding even though it is severely deprived of food. However, if a small number of responses is required for reinforcement at first and the requirement is increased gradually, the organism will be able to sustain behavior under schedules which would produce severe strain without this special history. In contrast, any interval schedule will sustain an organism's performance without any special history.
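The variable schedules described above fit the same sketch if, after each reinforcement, the required count or interval is redrawn at random around the mean, so that the animal cannot tell when the next reinforcement is due. The sampling distribution below is an illustrative assumption; the text does not specify how the laboratory schedules were constructed.

    import random

    class VariableRatio:
        """Reinforce after a count drawn around a mean, e.g. VR 375."""
        def __init__(self, mean_n):
            self.mean_n = mean_n
            self._rearm()

        def _rearm(self):
            # Illustrative choice of distribution only.
            self.required = max(1, round(random.expovariate(1 / self.mean_n)))
            self.count = 0

        def respond(self, now):
            self.count += 1
            if self.count >= self.required:
                self._rearm()
                return True
            return False

    class VariableInterval:
        """Reinforce the first response after a wait drawn around a mean."""
        def __init__(self, mean_t):
            self.mean_t = mean_t
            self.last = 0.0
            self.wait = random.expovariate(1 / mean_t)

        def respond(self, now):
            if now - self.last >= self.wait:
                self.last = now
                self.wait = random.expovariate(1 / self.mean_t)
                return True
            return False

Because the next requirement is unpredictable, the long pause after reinforcement seen under a large fixed ratio has nothing to attach to, which is one way to read the sustained rates in Figures 4 and 5.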

The sensitivity of fixed-ratio performances to the specific history by which the performance was developed is one of the reasons why apparently identical environments may differ so widely in their ability to maintain a person's behavior.

The loss of behavior produced by requiring too many responses per reinforcement illustrates that a schedule of intermittent reinforcement, in itself, can weaken behavior. This effect can be shown in a demonstration experiment by switching an animal from a fixed-interval to a fixed-ratio schedule of reinforcement. Under an interval schedule of reinforcement, the animal will emit a substantial but varying number of responses per reinforcement, even though the exact number of responses is not specified by the schedule. If we then establish that exactly this number of responses is required per reinforcement, by changing the schedule so that reinforcement now occurs as a result of the mean number of responses that had been emitted under the interval schedule, the animal pauses longer and longer after reinforcement until it virtually ceases responding altogether. The new schedule weakens the behavior even though the interval schedule for many months had stably produced this same number of responses per reinforcement. The additional factors operating in the ratio schedule produce the severe strain. Such factors, pertinent to particular schedules of reinforcement, have been experimentally analyzed, and many of the reasons for the effects are understood.
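Continuing the earlier sketch, the demonstration just described reduces to a small piece of arithmetic: measure the mean number of responses per reinforcement under the interval schedule, then require exactly that count as a fixed ratio. The steady once-per-second pecker below is a toy simulation, not data.

    # A toy session: a bird pecking once per second under FI 5 minutes,
    # using the FixedInterval and FixedRatio classes sketched above.
    fi = FixedInterval(300.0)
    counts, emitted = [], 0
    for second in range(1, 3 * 3600 + 1):     # three simulated hours
        emitted += 1
        if fi.respond(float(second)):
            counts.append(emitted)             # responses per reinforcement
            emitted = 0
    mean_n = sum(counts) / len(counts)
    fr = FixedRatio(round(mean_n))             # the same count, now required
    # Under 'fr' the post-reinforcement pause grows until responding
    # virtually ceases, even though the required count is unchanged.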

MULTIPLE SCHEDULES OF REINFORCEMENT

Under natural conditions, many different schedules of reinforcement maintain the various behaviors of a given individual. The particular occasions on which a given schedule of reinforcement is in force come to control the performance of the individual, so that, on each occasion, the performance is appropriate to the relevant schedule of reinforcement. It is for this reason that we so often see such a range of behavioral strengths among a person's various constituent repertoires. This process can be illustrated in an animal demonstration by presenting, alternately, two colored lights to a pigeon and reinforcing its pecking on two different schedules of reinforcement corresponding with the colored lights. The multiple schedule might consist of ratio reinforcement when the light is red: every fiftieth peck operates the food magazine. When the light is green, the delivery of food might be on a temporal (interval) basis: the first peck 5 minutes after the start of the green light produces food. The bird's performance soon conforms to the respective colors and schedules of reinforcement. When the light is red, the bird responds at 4 to 6 pecks per second until the fiftieth peck opens the food magazine. When the light changes to green, following the delivery of food, the bird does not peck the key for several minutes and then begins pecking as the 5-minute period elapses. Toward the end of the period, it reaches a stable but moderate rate of about one peck every second, until a response opens the magazine after 5 minutes. Figure 6 illustrates a multiple schedule in a pigeon. In effect, such a bird has two relatively separate repertoires (personalities) which can be strengthened independently simply by changing the color of a light. In a similar, although more complicated, way, the particular occasions upon which specific schedules of reinforcement operate during the life of an individual come to control its performance. Examples of the effects of schedules of reinforcement in complex situations will be discussed later.
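Continuing the sketch once more (the class names and interface remain illustrative assumptions), the multiple schedule is simply a pair of schedule objects keyed by the stimulus, with each peck routed to whichever contingency the light puts in force:

    class MultipleSchedule:
        """FR 50 under the red light, FI 5 minutes under the green."""
        def __init__(self):
            self.schedules = {"red": FixedRatio(50),
                              "green": FixedInterval(300.0)}

        def respond(self, light, now):
            # The stimulus selects the contingency, so the two
            # performances can be strengthened independently.
            return self.schedules[light].respond(now)

The bird's two "repertoires" correspond to the two entries in the dictionary; changing the light changes which contingency the pecking contacts.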
