All About Animal Training - Application of Philosophy

Foundation

Along with reinforcement, communication and target recognition are the basic building blocks of how animals are trained at SeaWorld and Busch Gardens. Trainers can use these basics in almost every training situation. Once they are established, an animal can learn new behaviors more quickly.

Communication

Communication is difficult between two people who don't speak the same language. So imagine what it's like to communicate with another species! It is the responsibility of the trainers to find some way for the animal to understand them. Reinforcers let an animal know when it has performed the desired behavior.

Trainers on the side of a pool prepare to give food to a dolphin in the water next to them.

Reinforcers let an animal know when it has performed the desired behavior.

Reinforcement must immediately follow the behavior in order to be effective. A delay of only a few seconds may accidentally reinforce the wrong behavior. But it's not always possible to instantly reinforce an animal while it's performing - it may be across the pool from the trainer. Trainers must have some other way to communicate to animals when they perform correctly. To do this, trainers use a signal.

This signal is called a bridge signal. The bridge signal “bridges” the gap of time that occurs between the behavior and its reinforcement. Bridge signals vary with species. For whales and dolphins, the bridge is usually a whistle or a light touch. Examples of verbal bridges include “okay,” “nice,” and “good.”

A Commerson's dolphin in a pool raises a pectoral flipper to touch a trainer's hand. The trainer has a whistle in his mouth.

The bridge signal, such as the whistle this trainer is using, "bridges" the gap of time that occurs between the behavior and its reinforcement.

Each animal is trained to recognize a bridge signal. Before reinforcing an animal, the trainer sounds a bridge signal. Through continual pairing with reinforcers, the bridge signal becomes a conditioned reinforcer. The animal quickly learns that when it hears the bridge signal, reinforcement is coming. The bridge signal can then be used to reinforce the animal the instant it performs the correct behavior. It also becomes the stimulus for the animal to return to the trainer.

Target Recognition

To train an animal, trainers usually break down the behavior into a series of small steps. Trainers use their hands as a focal point. Animals are trained to come to the trainer's hand, touch it, and await the next signal. This behavior is called "targeting." When a behavior takes place farther away, a tool called a target is used as an extension of the hand.

Just as a flagstick is a target that directs a golfer toward a golf hole, a target directs an animal toward a position or direction. For most animals, the target used is the trainer's hand or a long pole with a foam float or ball on one end. Other targets include a tap on the glass at the side of the pool or an ice cube tossed into the water.

Each animal is trained to follow the target.

Trainers teach an animal to "target" by touching the target gently to the animal. The bridge signal is sounded, and the animal is reinforced. The trainer repeats this several times.

A Commerson's dolphin raises its head out of the water to touch a trainer's outstretched hand.

The animals are trained to come to the trainer's hand, touch it, and await the next signal. This behavior is called "targeting."

The next step is to position the target a few inches away from the animal. The trainer waits for the animal to touch the target. By this time, the animal has learned that whenever it touches the target, it gets reinforced. When the animal moves toward and touches the target, the trainer immediately sounds the bridge signal, and the animal is reinforced.

After several successful repetitions, the target is moved still farther away. Each time the animal touches it, the trainer bridges and reinforces. Eventually the animal will follow the target. The target may then be used to lead the animal through a series of steps to gradually perform complex behaviors.

Shaping a Behavior

Most behaviors cannot be learned all at once, but develop in steps. This step-by-step learning process is called shaping.

Many human behaviors are learned through shaping. For example, when children learn to ride a bicycle, most begin by riding a tricycle. The child graduates to a two-wheeler with training wheels, and eventually masters a larger bicycle, perhaps one with multiple speeds. Each step toward the final goal of riding a bicycle is reinforcing.

Animals learn complex behaviors through shaping. Each step in the learning process is called an approximation. An animal may be reinforced for each successive approximation toward the final goal of the desired trained behavior.

Here is an example of how a dolphin might be trained to do a high jump. First, the dolphin is reinforced for touching a target on the surface of the water. The trainer raises the target a few inches above the water, and reinforces the animal for touching it. As the dolphin succeeds, the trainer continues to raise the target higher and higher above the water. Eventually the dolphin brings its entire body out of the water. Each time the dolphin touches the target, the trainer may reinforce the dolphin. The trainer continues to raise the target until it is at the high jump level. The dolphin is reinforced along each step toward the final goal of a high jump.
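The high-jump example above can be sketched as a list of successive target heights. This is only an illustration: the function name `shaping_schedule` and the heights in inches are hypothetical, not figures SeaWorld trainers use.

```python
def shaping_schedule(start_in, goal_in, step_in):
    """Return the target height (in inches) for each successive approximation.

    All numbers are illustrative assumptions; real trainers adjust each
    step to the individual animal.
    """
    if step_in <= 0:
        raise ValueError("step_in must be positive")
    heights = []
    height = start_in
    # Raise the target a little at a time until the goal is reached.
    while height < goal_in:
        heights.append(height)
        height += step_in
    heights.append(goal_in)  # the last approximation is the final goal
    return heights


# Hypothetical session plan: surface level up to a 100-inch jump.
for height in shaping_schedule(0, 100, 25):
    print(f"target at {height} in: dolphin touches it -> bridge and reinforce")
```

Each entry in the list is one approximation; the dolphin may be reinforced at every one of them, not only at the final height.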

A trainer holds a target pole over the water and a dolphin breaches to touch it.

To teach this dolphin the high jump, the trainer raises the target above the surface of the water. The dolphin must rise up to touch it.

A trainer holds a target pole high over the water and a dolphin jumps to meet it.

As the dolphin successfully masters each step, the trainer continues to raise the target higher and higher above the water.

A trainer gives a hand signal while a dolphin is airborne at the apex of a jump.

During the shaping process, the trainer introduces a hand signal paired with the target. Eventually the dolphin associates the signal with performing the high jump, or bow, and doesn't wait for the target before jumping. The target is then gradually phased out of use.

Stimulus Discrimination

During training sessions and shows, the trainer requests many different behaviors of an animal. The animal is trained to differentiate, or discriminate, among the situations. Discrimination is the tendency for learned behavior to occur in one situation, but not in other situations.

The animals are trained to associate signals with behaviors they have learned. A signal may be a visual, auditory, or tactile stimulus. The animals learn to discriminate between signals to determine which behavior the trainer expects. In the shows, many of the signals are built into the trainers' dialogue and stage gestures. The signals may be so subtle that the audience may not notice them.

A costumed trainer makes an exaggerated gasp gesture face-to-face with a sea lion during a show.

In the shows, many of the signals are built into the trainers' dialogue and gestures.

To an animal, a signal is a reinforcement opportunity. A correct behavior will be followed by a reinforcement. The animal comes to associate the signal with reinforcement. Eventually the signal itself may become reinforcing.

Chaining Behaviors

Sometimes an animal may perform several behaviors in a specific sequence before receiving reinforcement. This connected sequence of behaviors is called a chain.

For example, a raven may (1) fly from a perch onto the stage, (2) pick up a crumpled piece of paper, (3) fly to a nearby trash can, and (4) perch and drop the crumpled paper into the trash can. Each segment is trained separately, and then they are linked together. Once the animal has learned each segment, trainers may train behavior chains "forward" or "backward."

In "forward" training, the trainer gives the signal for the raven to perform the first behavior: flying from the perch to the stage. Instead of bridging and reinforcing right away, the trainer gives the signal for the second behavior: picking up the crumpled paper. The trainer bridges and reinforces the raven after the second step. In time, the raven will learn that it must complete both behaviors before it receives reinforcement.

Gradually the third step is added before reinforcing, and then the fourth. The completion of each behavior becomes the signal for the next. Eventually, the bird will link the steps together - they will become one behavior. Then, the trainer just gives one signal and the raven performs the entire chain before receiving reinforcement.

In "backward" training, trainers train the chain of behaviors by beginning with the final behavior in the chain. The trainer gives the signal for the raven to perch on a trash can and drop a crumpled piece of paper into the trash can. The raven is reinforced. Then the third step is connected prior to the fourth so that the raven now must fly from the stage to the trash can before it can drop in the paper and get reinforced. Next, the second step is connected prior to the third, and then the first step is connected prior to the second.

Each connection always ends up with dropping the paper as the final behavior that is bridged and reinforced. Again, the completion of each behavior becomes the stimulus for the next behavior. The chain is complete when the trainer is able to give an initial signal and the entire chain is performed.
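The difference between the two orders can be sketched by listing the partial chain that is reinforced at each stage of training. The step labels come from the raven example above; the function names `forward_stages` and `backward_stages` are hypothetical, chosen only for this illustration.

```python
RAVEN_CHAIN = [
    "fly from perch to stage",
    "pick up crumpled paper",
    "fly to trash can",
    "perch and drop paper in trash can",
]


def forward_stages(chain):
    """Forward training: the reinforced sequence grows from the first behavior."""
    return [chain[:k] for k in range(1, len(chain) + 1)]


def backward_stages(chain):
    """Backward training: the reinforced sequence grows from the final behavior."""
    return [chain[-k:] for k in range(1, len(chain) + 1)]


print("Forward:")
for stage in forward_stages(RAVEN_CHAIN):
    print("  " + " -> ".join(stage) + " => bridge and reinforce")

print("Backward:")
for stage in backward_stages(RAVEN_CHAIN):
    print("  " + " -> ".join(stage) + " => bridge and reinforce")
```

In the backward listing every stage ends with dropping the paper, which matches the description above; in the forward listing the final reinforced behavior changes as each step is added.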

An otter walks across a stage.

This Asian small-clawed otter was trained to perform three separate behaviors: run onto the stage,

An otter climbs onto a seated trainer's back.

crawl onto a trainer's back,

An otter waves while standing on a trainer's back.

and stand on the shoulders and wave. These behaviors were then combined in a sequence and appear to be one seamless behavior.

Least Reinforcing Scenario (LRS)

To eliminate undesired behavior, SeaWorld trainers have successfully developed a training technique called the Least Reinforcing Scenario (LRS). The LRS follows an undesired behavior. If a trainer requests a particular behavior and the animal responds with inappropriate behavior, the trainer must be careful not to reinforce the response. The trainer delivers an LRS - they stand still and do nothing. This way, they are least likely to deliver a reinforcer.

The LRS continues for 2 to 3 seconds. The trainer's relaxed demeanor is the stimulus for the animal to be calm and attentive. After the LRS, the trainer reinforces the animal for being calm. Then, the animal is given the opportunity to perform another behavior that will result in reinforcement.

The LRS is not a fixed posture, but instead a flexible system enabling the trainer to deliver the LRS in a variety of contexts. The trainer does not ignore the animal but must monitor the animal's behavior while taking care not to show a response to inappropriate behavior.

When used consistently, the LRS technique eventually decreases undesired behavior. Reinforcing the animal for calm, attentive behavior following the LRS helps reduce frustration that might result from the lack of reinforcement and teaches the animal to react in a non-aggressive way. An animal is never forced into a situation, nor is it ever punished.

SeaWorld Auditory Cueing System

With SeaWorld's Auditory Cueing System (SWACS), trainers use underwater tones as one way to initiate behavior. Groups of computer codes are organized to represent each animal's name, different behaviors, objects, and modifiers. A specific tone played into the water corresponds to each computer code. The tones are based on calls recorded from wild killer whales, and are at the same frequencies as whale vocalizations. Trainers use water-resistant keyboards to select tone codes, and underwater speakers project the tones into the water. For example, a trainer can transmit tones representing the words "Shamu" and "fast swim," signaling the whale Shamu to swim around the perimeter of the pool at high speed.
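As a rough sketch of the grouping described above, a lookup table can map each word to a code, one code per tone. The code numbers, the `CODEBOOK` layout, and the `encode` function are all invented for illustration; the article does not describe the actual SWACS codes or tones.

```python
# Hypothetical codebook: groups of codes for names and behaviors.
# Every number here is a made-up placeholder, not a real SWACS value.
CODEBOOK = {
    "name": {"Shamu": 101},
    "behavior": {"fast swim": 201},
}


def encode(words):
    """Translate (group, word) pairs into the list of codes to transmit."""
    codes = []
    for group, word in words:
        try:
            codes.append(CODEBOOK[group][word])
        except KeyError:
            raise ValueError(f"no code for {word!r} in group {group!r}") from None
    return codes


# The example from the text: the name followed by the behavior.
message = encode([("name", "Shamu"), ("behavior", "fast swim")])
print(message)  # [101, 201]
```

Each code in the returned list would correspond to one tone played through the underwater speakers.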

To train a killer whale to recognize a tone, a trainer pairs the tone with a hand signal and behavior the whale already knows. Eventually, the trainer stops using the hand signal, and the whale learns that the tone itself is the stimulus for the behavior.

SWACS is a useful tool to help researchers study learning in toothed whales. It is an ongoing, long-term research project.

Lexigram Reinforcement System

SeaWorld trainers have developed a system in which the animal may select its own reinforcer from an assortment of reinforcers.

This system, called the Lexigram Reinforcement System, involves training the animal to associate a symbol, or lexigram, with a known reinforcer. For example, the animal is trained that when it touches a triangle, it will receive a rub-down. When it touches a circle, it will receive a fish. Once the animal learns a certain symbol represents a specific reinforcer, theoretically, that animal will be able to indicate to trainers which reinforcer it prefers at that time.

The Lexigram Reinforcement System investigates the hypothesis that animals may learn at a faster rate, with better retention, when given a choice of reinforcing consequences. Using this system, trainers hope to gain information on different areas of animal learning, including learning rates, choice preference, and problem solving.

Observational Learning

Animals often learn through observation, that is, by watching other animals. Observational learning can occur with no outside reinforcement. The animal simply learns by observing and mimicking. Animals are able to learn individual behaviors as well as entire behavioral repertoires through observation.

At SeaWorld, killer whale calves continually follow their mothers and try to imitate everything they do. This includes show behaviors. By a calf’s first birthday, it may have learned more than a dozen show behaviors just by mimicking its mother.

Three orcas jump from the water during a show.

Killer whale calves continually follow their mothers and attempt to imitate everything they do.

Adult animals trained alongside experienced animals may learn at a faster rate than if they were trained without them.

Behavior Repertoire

Animals have the potential to learn extensive repertoires of behaviors. An experienced animal may learn as many as 200 behaviors.

Animal training is an ongoing process throughout an animal's life. Trainers and animals develop new behaviors and modify current behaviors to keep the animals physically and mentally challenged.

A trainer standing in a pool holds a dolphin's pectoral flippers while training.

Animal training is an ongoing process throughout an animal's life.

Maintaining existing behaviors is as important as training new behaviors. Trainers use the VRRV and LRS to maintain trained behaviors.

Interactive Sessions

Through decades of experience, SeaWorld trainers have learned that a variety of interactive sessions contributes to the enrichment and well-being of the animals. These interactive sessions fall into five different categories:

Learning Sessions

Learning sessions involve a formal training process for the animals, in which trainers condition specific behaviors. Learning sessions provide a series of challenges that enrich the animals' environment while yielding valuable information on learning processes. These sessions are important to the continued learning and mental stimulation of the animals, a process that never stops.

A dolphin, upright and surfaced, follows a trainer's hand during a learning session.

Learning sessions involve a formal training process for the animals, in which trainers condition specific behaviors.

Exercise Sessions

Exercise sessions are essential to an animal's health and well-being. Exercise sessions consist of high-energy behaviors of varying lengths. Each session is tailored to the species and is based on that species' daily activity levels.

A trainer gives hand signals while an orca is in midair during a jump.

Exercise sessions consist of high-energy behaviors of varying lengths.

Relationship Sessions

Relationship sessions allow time for a trainer and animal to develop reciprocal trust, which enhances the degree of learning. A trainer spends one-on-one quiet interactive time with the animal. A strong, rewarding relationship between trainer and animal is an important part of SeaWorld's animal training program foundation. In many instances trainers learn new reinforcers for the animals by observing the types of activities in which the animals frequently engage.

A trainer pets an opossum during a relationship session.

Relationship sessions allow time for a trainer and animal to develop reciprocal trust.

Play Sessions

Play sessions provide time in the day when trainers and animals interact with "games" and "toys." Trainers learn through experience which games and toys the animals appear to enjoy and which elicit enthusiastic participation.

Shows

Shows provide an opportunity for SeaWorld to educate the public about the behavior, physiology, and ecology of marine animals. Education is one of SeaWorld's main goals as a premier zoological institution. Information is shared with guests through entertaining presentations. While the shows follow a basic format, the behaviors, show segments, and reinforcers continually change, making each show different for the animals. The variability of each show integrates aspects of other types of sessions, providing stimulating and entertaining interactions.

A sea lion interacts with two costumed trainers during a show.

The variability of each show integrates aspects of other types of sessions, providing stimulating and entertaining interactions.

Environmental Enrichments

Trainers at SeaWorld aim to create an increasingly complex and stimulating environment to help foster the proper care and management of the animals, their habitat, and their behavior.

One enrichment technique is to create changes in an animal's daily activities, giving it variety. Animals are provided with activities they seem to find interesting and stimulating, including play sessions with trainers and other animals.

To further enhance the positive environment for the animals, trainers present "toys" to the animals. These environmental enrichment devices can be used during any type of interaction. Under trainer supervision, the animals interact visually or physically with these environmental enrichment devices. The many enrichment items vary depending upon the animal. Animals may interact visually with mirrors, brightly colored cones, balls, and animal-shape cut-outs. They may physically interact with floating plastic barrels, large plastic toy hoops, and rubber balls.