We define embodied evolution (EE) as evolution taking place in a population of
robots. Further, we stipulate that the evolutionary algorithm is to execute in a distributed and
asynchronous manner within the population, as in natural evolution. Thus, algorithms that centrally
maintain and manipulate the specifications of individual agents are not permitted. We wish to create
a population of physical robots that perform their tasks and evolve hands-free -- without human
intervention of any kind, as [Harvey 1995] articulates. Here, we introduce our conception of embodied
evolution and report the initial results of experiments that provide the first proof-of-concept for EE.
The purpose of embodying the agent inside physical hardware is, of
course, to escape the problems of poor simulation fidelity that prevent simulated robots and their
controllers from successfully transferring into the real world. In particular, rich multi-robot domains
that study group behavior are notoriously difficult to simulate. The purpose of distributing the
evolutionary algorithm amongst the population of robots is -- aside from appealing to biological
interest -- to enable scalability. As robot populations grow larger (on the order of hundreds or
thousands) and are deployed in more complex environments, a centralized evolutionary
algorithm becomes less tenable. EE has forced us to address the technological issues of achieving continuous,
untethered power and algorithm decentralization. Both of these issues will arise in any multi-agent
domain that uses a learning method where agents share knowledge, and must be addressed if
autonomous, situated robotics is ever to bloom into its envisioned potential.
Autonomous Evaluation and Reproduction
The principal components of any
evolutionary algorithm are evaluation and reproduction, and both of these must be carried out
autonomously by and between the robots in a distributed fashion for EE to scale effectively. Because the
process of evaluation is carried out autonomously by each robot, some metric must be programmed into
the robots with which they can measure their performance. This could be quite implicit, for example,
where failing to maintain adequate power results in "death." Or, it could be explicitly hard-coded, for
example, where fitness is a function of objects collected and time. Whatever metric is used, performance
against it must be monitored by the robot itself, as no external observer exists to measure a robot's performance explicitly.
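The two kinds of metric mentioned above can be sketched as follows. This is an illustrative example, not the paper's implementation; all names (`explicit_fitness`, `EnergyMonitor`, `objects_collected`) are ours.

```python
def explicit_fitness(objects_collected: int, elapsed_seconds: float) -> float:
    """Explicitly hard-coded metric: fitness as a function of objects
    collected and time (here, simply objects per second)."""
    return objects_collected / max(elapsed_seconds, 1.0)


class EnergyMonitor:
    """Implicit metric: a robot that fails to maintain adequate power
    'dies' and thus drops out of the reproductive pool."""

    def __init__(self, energy: float = 100.0):
        self.energy = energy

    def tick(self, drain: float) -> bool:
        """Apply one timestep of energy drain; False signals death."""
        self.energy -= drain
        return self.energy > 0.0
```

Either way, the metric is evaluated onboard, so no central observer is needed.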
Because a spatially distributed population of robots makes coordination of generational
reproduction difficult, reproduction in EE must be both asynchronous and distributed. Assuming that we
cannot really create new robots spontaneously, the offspring must be implemented using (other) robots of
the same population. And, assuming we do not have structurally reconfigurable bodies, reproduction
must simply mean the exchange of control program information. "Genetic" information thus travels via
local reproduction events, according to the locations and movements of the robots. Selection may be
realized by having more-fit individuals supply genes (i.e., be parents) or by having less-fit individuals
lose genes (i.e., be replaced by the offspring) or by a combination of both.
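One plausible form of such a local reproduction event is sketched below: the fitter robot is more likely to transmit a gene, and the less fit robot is more likely to accept it, realizing both selection pressures described above. The data layout and probability rules are our assumptions for illustration, not the paper's algorithm.

```python
import random


def local_reproduction(parent, child, rng=random.Random(0)):
    """One hypothetical local reproduction event between two robots in
    communication range. Each robot is a dict with 'fitness' (0..1) and
    'genes' (a list of controller weights); names are illustrative.
    Returns True if a gene was transferred."""
    # More-fit individuals supply genes: transmit with probability
    # proportional to the sender's fitness.
    if rng.random() < parent["fitness"]:
        i = rng.randrange(len(parent["genes"]))  # pick one gene to send
        # Less-fit individuals lose genes: accept with probability
        # inversely proportional to the receiver's fitness.
        if rng.random() < 1.0 - child["fitness"]:
            child["genes"][i] = parent["genes"][i]
            return True
    return False
```

Because each event is local and probabilistic, no global synchronization or generational bookkeeping is required.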
Our experiments use a population of eight custom-built robots that are based upon a
microcontroller board (courtesy of the MIT Media Laboratory). Each robot has a 12cm diameter
and is equipped with two light sensors, two motors, and an infra-red emitter/detector pair that
provides local communication. The control architecture is a small feed-forward artificial neural
network, the weights of which are evolved to perform phototaxis similar to that described in
[Braitenberg 1984]. The task environment consists of a 130cm by 200cm pen with a lamp located in the
middle; the lamp radiates light in all directions on the floor plane. The floor of the pen is electrified
and delivers continuous power to the robots. The robot task is to reach the light from any starting point
in the pen. Because the pen contains a multitude of robots, however, the de facto environment also
includes some amount of robot-to-robot interference [Schneider-Fontán, Mataric 1996]; therefore, the
task implicitly requires that each robot also successfully overcome this interference.
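A controller of the kind described above can be sketched as a single-layer network mapping two light sensors to two motors. The weight values shown are illustrative (crossed excitatory connections in the spirit of [Braitenberg 1984]), not the evolved ones.

```python
import math


def motor_outputs(left_light, right_light, weights, bias=(0.0, 0.0)):
    """Tiny feed-forward controller sketch: two light-sensor readings
    drive two motor speeds through a 2x2 weight matrix. In the paper the
    weights are evolved; this structure is our illustrative assumption."""
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    left_motor = sigmoid(weights[0][0] * left_light +
                         weights[0][1] * right_light + bias[0])
    right_motor = sigmoid(weights[1][0] * left_light +
                          weights[1][1] * right_light + bias[1])
    return left_motor, right_motor


# Crossed excitatory wiring: stronger light on the right drives the
# left motor harder, turning the robot toward the light.
w = [[0.0, 4.0], [4.0, 0.0]]
```

With this wiring, a robot sensing more light on one side turns toward that side, producing phototaxis.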
Figure 1 shows the frequency with which the light is successfully reached by the robot
population over time in each of three experiments. The main experiment evolves the neural-network
weights to perform the light-seeking task (from an initial condition where all network weights have
a value of zero). The other two experiments are controls where the robots do not evolve; in one case the
robots' weights are random values, in the other the robots use weights of a hand-designed solution. The
two controls clearly distinguish the quality of our hand-designed solution from random networks and
provide useful references against which to judge the success of the trials where evolution takes place.
We see that embodied evolution allows the population of robots to achieve performance superior to
that of our hand-designed solution. Interestingly, the evolved solutions exhibit behaviors that are
qualitatively different from our hand-designed solution; evolution appears to favor a spiralling
solution, whereas, with our hand-designed solution, the robot "swaggers" to the light.
Figure 1: Average Hit Rates Over Time.
Three curves show performance of the robot
population using hand-designed (non-evolving),
evolving, and random (non-evolving) solutions. The
data from the hand-designed and evolved experiments
are averaged over six runs, while the data from the
random-solution experiment are averaged over two
runs. Each run lasts 120 minutes and uses eight
robots. The X axis represents time during the run. The
Y axis represents the average rate (in hits per minute)
at which robots reach the light. A time window of 20
minutes is used to compute the instantaneous hit rate
for each data point on the graph. Random solutions
achieve less than one hit per minute; our hand-built
solution achieves approximately 10 hits per minute;
when the population evolves solutions, performance
eventually exceeds the hand-built hit rate.
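The windowed hit-rate computation described in the caption can be sketched as below; the function and parameter names are ours, and the paper's exact windowing details may differ.

```python
def windowed_hit_rate(hit_times_min, t_min, window_min=20.0):
    """Instantaneous hit rate at time t (minutes): number of light-hits
    falling in the preceding time window, divided by the window length,
    giving hits per minute."""
    in_window = [h for h in hit_times_min if t_min - window_min < h <= t_min]
    return len(in_window) / window_min
```

Averaging this quantity across runs yields one data point per time step on each curve.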
There exists in robotics an autonomy/physicality tradeoff, where the number of moving
parts in a machine is inversely proportional to its level of autonomy. Thus in robotics, we have low
physical complexity systems -- laptops on wheels -- with complete autonomy, but as more complex
robots are built, they can be controlled only by human operators and are little more than electronic
puppets. How can "brains," or controllers, be developed for the highly complex mechanisms necessary
to place intelligent machines in arbitrary real environments? That traditional AI theories do not work
inside robots is well known. Thus, we see a chicken-and-egg problem. We believe that the only solution
is through gradual adaptation of controllers coordinated with gradual deformation of their bodies --
nature never constructs a machine for which no controller exists. Though sound in concept, coevolving
brains and physical bodies incurs the currently prohibitive cost of constructing unique physical
machines; neither computer-aided manufacturing, nor self-assembly, nor lithographic "solid printing"
devices are ready to automatically build bodies to computer specification. While embodied evolution
does not currently include the evolution of bodies [Funes & Pollack 1997], it does provide the first step
towards reconfigurable hardware by enabling a substrate for evolving heterogeneous group behavior.
Full implementation details and discussions can be found in our forthcoming technical report,
"Embodied Evolution," soon to be available from the
DEMO report archive.
Braitenberg, V. (1984) Vehicles: Experiments in Synthetic Psychology. MIT Press.
Funes, P. & Pollack, J. (1997). "Computer Evolution of Buildable Objects." ECAL IV, Husbands & Harvey, eds.,
MIT Press, 358-367.
Harvey, I. (1995) University of Sussex, U.K., Personal communication.
Schneider-Fontán, M., & Mataric, M. (1996) "A Study of Territoriality: The Role of Critical Mass in Adaptive Task
Division." From Animals to Animats IV, Maes, Mataric, Meyer, Pollack & Wilson, eds., MIT Press, 553-561.