David Kirsh
Dept. of Cognitive Science, Univ. California, San Diego
La Jolla, CA 92093-0515
+1 858 534-3819
kirsh@ucsd.edu

Abstract

This paper examines some of the methods animals and humans have of adapting their environment. Because there are limits on how many different tasks a creature can be designed to do well in, creatures with the capacity to redesign their environments have an adaptive advantage over those who can only passively adapt to existing environmental structures. To clarify environmental redesign I rely on the formal notion of a task environment as a directed graph where the nodes are states and the links are actions. One natural form of redesign is to change the topology of this graph structure so as to increase the likelihood of task success or to reduce its expected cost, measured in physical terms. This may be done by eliminating initial states, hence eliminating choice points; by changing the action repertoire; by changing the consequence function; and lastly, by adding choice points. Another major method for adapting the environment is to change its cognitive congeniality. Such changes leave the state space formally intact but reduce the number and cost of mental operations needed for task success; they reliably increase the speed, accuracy or robustness of performance. The last section of the paper describes several of these epistemic or complementary actions found in human performance.

Keywords: adaptation, task environment, redesign, epistemic actions, complementary actions

Introduction

There are at least three logically distinct strategies a creature has for improving its fitness: it can adapt itself, changing its own physiology or behavioral strategies to better fit the structures of its environment; it can search for a new habitat that better fits the design it already has; or it can redesign its environment so that the environment better fits its design.
In this essay I shall examine the third strategy: redesigning the environment. The problem of when to adapt, when to redesign an environment, and when to search for a new habitat is broad enough to be treated as a general fact of life. Humans, no doubt, are one of the few creatures who (explicitly) reason about this problem; but it is certain that other animals tacitly face a similar problem and that evolution has wired in a partial strategy for solving it. The reason we may expect creatures to have built-in strategies for redesigning their environments is that creatures with some active control over the shape of their environment will have an adaptive advantage over those who can only passively adapt to existing environmental structures. This follows because there are limits on how many different tasks a creature can be designed to do well in. It is not likely that a design which allows a creature to be exceptionally fleet footed on flat terrains will also confer exceptional agility on rocky terrains. Design trade-offs must be made. If, however, a creature could somehow change the rocky terrain it visits (by making paths), or if it could augment its design with prostheses or tools (e.g. carry cups of water in dry regions, use `fishing sticks' to poke narrow crevices), its existing design becomes more adaptive.

We do not have to look far to find examples of how animals and humans modify their environments in adaptive ways. Beavers dam streams, birds build nests, ants farm aphids, chimps leave useful nutcracking stones in commonly used places [Kummer 1995], squirrels collect nuts for winter, and Egyptian vultures drop stones on ostrich eggs. In each case, the environment is warped to the creature's capacities rather than the other way around. This is to be expected within a classical adaptationist approach. For in the `struggle for existence' organisms with these favored behavioral tendencies will outreproduce their competitors with less favored behavioral tendencies; if these tendencies are heritable the distribution of tendencies will change over generational time, with the favored tendencies becoming more common. So classical adaptationism predicts that environmental modification will occur. [Butler, 1995]

But evolutionary arguments do not explain why these traits are adaptive. For that we must look to economic/ethological arguments. Thus, because foodstuffs are scarce in winter but plentiful and cheap in the fall, it is wise to stock up in the fall (standard inventory control principles). Because the probability of finding an ideal nesting site is small, at some point it becomes more cost-effective to build a nest than to continue searching for one (investment analysis). And because good nutcracking stones take time to find, it is better to leave them where they will be most useful for everyone concerned than to discard them in arbitrary ways (amortize the cost of search). When such explanations are offered, we feel we have understood the phenomena better. But if we look closely we note that behind such explanations there is an appeal to a notion of task environment that is distinct from the notion of the selective environment present in evolutionary arguments. It must be so, for the economic principles appealed to all show that a given behavior X is preferred to behavior Y with respect to a certain goal. They cannot show that behavior X is better than Y all things considered.
Thus leaving around a nutcracking stone is desirable for the task of nutcracking, but it may not be a good tactic overall since it may have a side effect of cueing predators. What is good for a task in isolation may not be good when all tasks are considered together. My objective here is to explain this notion of task environment in order to extend the range of analyses available to understand active redesign. I will draw on ideas from computer science, economics, the theory of problem solving, and the study of interactivity to clarify what environmental redesign is and suggest ways we might measure its benefits.

The paper is organized into three sections. In the first, I distinguish the notion of a task environment from that of the environment more broadly construed. Task environments are substructures within a more all-encompassing selective environment. This distinction supplies us with the descriptive apparatus to pose the problem of environmental redesign in a natural way: namely, how is a creature to change the structure of at least one task environment so that its overall performance, that is its performance summed over all task environments, is improved? This global improvement may be achieved by enhancing performance in one or more task environments without reducing performance in any other -- Pareto optimization -- or by enhancing performance in one or more environments enough to more than compensate for any reductions incurred in others.

Since a task environment is a substructure within a larger selective environment, it is necessary to provide effective criteria for deciding which parts of the larger environment fall within a given task environment and which parts fall outside it. This is the burden of section two. Using the language of computer science I define the structure of a task environment to be a directed graph where the nodes are states and the links are actions. Since not all actions available in the larger environment are allowable as moves in a particular task environment, we distinguish actions that are internal to a task from actions that are external. For example, modifications to the structure of a task can only be achieved by performing actions that are external to the task. This distinction makes it possible to claim that creatures perform actions for the sake of redesigning a task environment. The remainder of the paper then focuses on some of the different types of task-external actions available to both human and non-human animals for redesigning task environments. Two broad categories are distinguished: actions that deform the state space of a task, changing its topology; and actions that improve a task's cognitive congeniality, leaving the state space formally intact while reducing the mental effort needed for success.
Environments vs. Task environments

The term environment, in normal biological parlance, refers to everything exogenous to a creature which may affect its life, experience, or death. This includes the full range of external factors which may affect an agent's experience and motivation -- factors affecting its internal state broadly understood -- as well as all factors determining its possibilities for action, and the consequences of its actions. For our purposes here, it will be sufficient to use the term environment in a slightly narrower manner, to refer to the totality of cues, constraints, resources and dangers in a creature's world which determine its success, where success means its differential reproductive success. Environment, then, means the `selective' environment. [cf. Brandon 1990]

One feature of the notion of selective environment is that it abstracts from most of the micro-structure in a creature's niche. If the environment is the totality of external cues and resources that can make a difference to reproductive success, there is no distinction between those cues and resources that are relevant to one type of task, such as food collection, and which affect the success or failure of particular foraging strategies, and those cues and resources that are relevant to another task, such as predator avoidance, and which affect the success or failure of particular defensive strategies. Although there is only one world which the creature inhabits, for certain types of analysis it is useful to circumscribe that one world into sub-domains, each of which is relevant to the success or failure of particular task strategies. Following psychologists, I shall refer to these sub-domains or micro-environments embedded in the larger environment as task environments. A creature's selective environment, then, is a superposition of task environments.

Once we distinguish the selective environment broadly construed from particular task environments within it -- micro-environments -- it is easy to state one problem which evolution, learning or intelligence must `solve' for a creature. It must determine how to conform to the optimality principle.

Optimality Principle: efficiently allocate time and energy to the performance of different tasks so as to maximize overall `return'. [Lewontin 1978; cf. Horn 1979]

For example, we expect that a well adapted creature will efficiently divide its time between hunting, drinking, exploring the terrain, hiding from predators, finding/attracting mates, etc. Given the returns inherent in each task environment the creature must allocate its resources among its different behavioral strategies so as to maximize its overall yield. Ethologists have been tackling this resource allocation problem for some time now using the language and methods of economics. For instance, to decide how much time an `optimal' lizard ought to spend foraging vs. hiding from predators, an ethologist would study lizards in their natural habitat and then plot the cost-benefit curves of foraging and the cost-benefit curves of hiding. An optimal creature ought to allocate its time and energy between foraging and hiding so that at the margin it is equally profitable to spend the next minute foraging or hiding. This follows because if one activity, say foraging, were more profitable, an optimal creature would invest more time in foraging until the danger of being out in the open lowered the profitability of foraging, making hiding a more attractive option.
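To see how the marginal reasoning works in practice, here is a minimal sketch in Python. The payoff curves, their diminishing-returns shape, and the 12-hour time budget are all illustrative assumptions, not data from any field study.

```python
import numpy as np

# Hypothetical diminishing-returns payoff curves, in a common fitness
# currency, for t hours spent on each activity (illustrative only).
def forage_payoff(t):
    return 10.0 * np.sqrt(t)

def hide_payoff(t):
    return 6.0 * np.sqrt(t)

T = 12.0                                   # total time budget (hours)
ts = np.linspace(0.01, T - 0.01, 10_000)   # candidate foraging allocations
total = forage_payoff(ts) + hide_payoff(T - ts)
best = ts[np.argmax(total)]                # optimal split, about 8.8 h

# At the optimum, marginal returns to the two activities are equal:
eps = 1e-5
marginal_forage = (forage_payoff(best + eps) - forage_payoff(best)) / eps
marginal_hide = (hide_payoff(T - best + eps) - hide_payoff(T - best)) / eps
print(f"forage {best:.2f} h, hide {T - best:.2f} h")
print(f"marginal returns: forage {marginal_forage:.3f}, hide {marginal_hide:.3f}")
```

With these curves the optimum puts roughly three-quarters of the budget into foraging, and at that point the two marginal returns coincide, which is exactly the equal-profit-at-the-margin condition just described.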
In this way it is possible to identify an optimal allocation strategy [Koza 1992]. The calculations here are based on simple micro-economic principles of optimization. The hard work, of course, is to discover the underlying facts about the lizard environment that permit the cost-benefit curves to be constructed. For instance, it is necessary to determine both the probability of predation as a function of distance from the hiding spot, and the probability of catching prey as a function of distance. This is now a well accepted methodology in ethology. But once again, the entire enterprise assumes that activity-salient features of the environment can be identified. Features of the environment that are relevant to an activity can be distinguished from features that are irrelevant.

If we accept this micro-economic way of posing the problem of adaptation we can easily restate our question of environmental redesign. If a creature is already optimally allocating its time among all the task environments it operates in, then the only way it has of increasing its yield, assuming it does not change its physiology or one of its behavioral strategies -- a classical process of self adaptation that we assume requires generational time -- is to increase the yield of one of its behavioral strategies. This gives rise to a principle of super optimality:

Super Optimality Principle: if a creature is already optimally allocating its time among its different tasks, the only way its overall welfare can be increased is if the payoff function for one of its strategies increases.

Since the payoff function is determined by the environment, this principle means that one of its task environments must change. Assuming that the forces of change are not stemming from the global environment, this implies that either the creature migrates to a new habitat where it is easier to achieve one or more of its goals, or it redesigns its environment. (See also Bedau, this issue, for a related view.)

In microeconomics, individual agents cannot migrate to new economic environments, they cannot change their basic internal structure (physiology, technology) or behavioral capacities, and, significantly, they cannot alter the payoff function by redesigning their environment. This last constraint holds because it is assumed that individual economic agents are unable to alter the structure of their task environments. No one firm can change the price or supply of raw materials, the market price of its product, or the available methods of production. This is not to say that the totality of firms making up an industry is unable to influence prices or affect technology. Collectively they can. But this power to partially reshape the environment does not reside in individual agents. It is a macro-effect. The shape of the payoff function lies outside the control of any one agent. It is a given of the environment, and so not a manipulable variable. [Simon, 1955]

The same assumption applies in classical ethology. For ethological analysis to get off the ground, it is assumed that an individual predator cannot significantly change the prey population (input supply) -- although of course there are aggregate or macro effects of the entire population of predators which result in predator-prey cycles, trends and so on. Nor can individual predators change the metabolic benefits (price) they get from eating prey, nor the hunting techniques they have at their disposal. These are the givens of the adaptive context they find themselves in.
Evolution can change metabolism, culture can change hunting techniques, and in the course of a single lifespan predator-prey cycles may change the relative proportion of prey to predators. But all these changes must be regarded as macro-level changes, events which help to shape and define the micro-environment of individual predators. Since the behaviors I will be interested in often occur at a very short time-scale (relative to more gradual, smooth evolutionary changes), and often with limited effect (just affecting the environment of one creature, or possibly a few), the environmental changes brought on by populations are not my primary concern. This does not mean that such changes in task environment structure cannot be explained using the analyses I will propose. It means, rather, that there is a class of environmental redesigns that are likely to slip beneath the filter of evolutionary selection.

Despite the importance of the assumption that agents do not change their task environments, this idea is not enshrined in the theory of natural selection, which is concerned with population changes in the selective environment. There was nothing in Fisher's [1930] original mathematical analysis of natural selection that required agent-environment independence -- although Fisher himself relied on the assumption in order to prove his fundamental theorems. The reason Fisher assumed that environments would be the same both before and after adaptation was to prevent having to change the differential goodness of an attribute as it spread through the population. For example, if a gene for increased appetite and stomach size enters the population and confers an adaptive advantage on its owners, Fisher assumed that as the gene spread through the population the differential goodness of larger stomachs and appetites would not diminish, even though a consequence of more animals with larger stomachs might be that there is less vegetation available for other members of the population. Put differently, since every environment has a limited capacity to accommodate its population (its carrying capacity, K), we might well expect that increased appetite alters the carrying capacity of the environment, leading to a smaller sustainable population, and also to a reduced selective advantage (reproductive rate r) as the population nears K. Fisher did not alter r and K, but again there is no mathematical reason, short of simplicity, to require these to be constant. Not surprisingly, there have been several efforts to accommodate the density dependence of attributes. [See Royama 1992 for a good account of density-dependent parameter models] Such models permit the spread of an allele to alter the carrying capacity of the environment. This is clearly a step in the right direction. But one assumption limits all these models: they all assume that r and K vary smoothly with the spread of the attribute. Accordingly, there are no threshold effects or jumps in K or r as a result of reaching certain population densities. This is not entirely realistic. Such jumps might happen, for example, if greater appetite leads to greater defecation, which in turn either increases the vegetation yield non-smoothly, because of threshold effects, and so non-smoothly increases the carrying capacity; or decreases the yield non-smoothly, through threshold toxicity that suddenly kills off vegetation, and so non-smoothly decreases the carrying capacity.
Introducing agent-environment co-dependence is an important first step in allowing individual agents to have a significant effect on the structure of their environments. But, as noted, there remains an assumption in all these models that such co-dependence is highly constrained, leads to smooth changes in environmental structure, and, most importantly of all, is the kind of co-dependence that can be genetically selected for. If there exist examples of environmental redesign, as I believe, that are idiosyncratic to particular agents, and which sometimes result in non-smooth changes in fitness, research at the level of evolutionary selection will require complementary studies to explain these phenomena. Again, this does not imply that evolutionary selection is irrelevant; the capacity to opportunistically exploit elements of the environment to enhance task performance is a valuable capacity to pass on. It implies that the specific ways individual creatures have of redesigning their environments may be localized in time and space, and so not fully explicable from a selectionist standpoint. To make this case plausible it is necessary to define exactly what we mean by a task environment.

What is a Task environment?

The adaptationist program encourages an engineering approach to organisms: in the ideal case, well adapted creatures ought to converge on optimal designs for particular tasks. [McFarland 1981] The point of partitioning the global environment into a collection of task environments is to provide us with a theoretical abstraction, the task environment, that is sufficiently well defined that we can evaluate the performance of competing organism designs or competing algorithms for action by using concepts drawn from computer science. We want to be able to compare how efficient different algorithms are at carrying out particular tasks.

Formally, a task can be represented as a directed graph, where the nodes of the graph denote choice points (i.e. possible states), the links represent transitions or actions, and a privileged set of states represents the possible ways of completing the task. A successful effort at the task can then be understood as a trajectory, or path, through this graph structure starting from an initial state and ending at one of the states satisfying the goal condition. Typical tasks we might hope to represent as trajectories include caring for cubs; building a nest from local debris; collecting termites from a six-foot mound; damming a stream; avoiding a charging predator; mating; and so on. In the human world the tasks for which a directed graph representation might be constructed range from highly structured activities, such as playing solitaire, solving an algebraic problem, or making a curved surface in a graphics program, where the task supports a small number of possible actions at each choice point, to less formal tasks, such as cooking, cleaning up, driving to work, and even writing an essay, where the actions available at an arbitrary choice point are harder to enumerate and success is harder to measure. In studying behavior in these tasks, a researcher attempts to determine the topology of the state space, and to discover a plausible metric -- such as the minimum number of actions required to get from the current state to the goal state, or the amount of energy required to reach the goal, or the reliability of the paths to the goal -- to permit comparing the goodness of different plans or algorithms for performing the task.
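The directed-graph picture is compact enough to write down directly. Here is a minimal sketch in Python of a toy task environment, loosely modeled on the nest-building example discussed below; the states, actions, and costs are invented for illustration, and the metric is taken to be summed action cost.

```python
import heapq

# A task environment as a directed graph: nodes are states, links are
# actions, each labeled with a physical cost. All names are illustrative.
graph = {
    "site_chosen":     {"fetch_stick":   ("stick_in_beak", 2.0),
                        "fetch_feather": ("feather_in_beak", 1.5)},
    "stick_in_beak":   {"place_on_wall": ("wall_raised", 1.0),
                        "drop_item":     ("site_chosen", 0.5)},
    "feather_in_beak": {"line_nest":     ("nest_lined", 1.0),
                        "drop_item":     ("site_chosen", 0.5)},
    "wall_raised":     {"fetch_feather": ("feather_in_beak", 1.5)},
    "nest_lined":      {},
}
goal_states = {"nest_lined"}

def cheapest_trajectory(graph, start, goals):
    """Dijkstra search over the task graph: least-cost action path to a goal."""
    frontier = [(0.0, start, [])]
    visited = set()
    while frontier:
        cost, state, path = heapq.heappop(frontier)
        if state in goals:
            return cost, path
        if state in visited:
            continue
        visited.add(state)
        for action, (successor, c) in graph[state].items():
            heapq.heappush(frontier, (cost + c, successor, path + [action]))
    return float("inf"), None

print(cheapest_trajectory(graph, "site_chosen", goal_states))
# -> (2.5, ['fetch_feather', 'line_nest'])
```

Changing the topology of this graph -- deleting nodes, adding links, or re-weighting costs -- is exactly what the redesign strategies of the later sections amount to.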
It is then possible to rank different algorithms with respect to how much energy they require for task completion, or how reliable they are, or how many actions they use. It is thought that this methodology will scale to large tasks too, as shown by the vast literature on administrative behavior [Simon 1976], workplace design [Kroemer 1993], human factors [Teichner 1971], and human-computer interaction [Diaper 1989]. To be more precise, let us follow Simon [Simon 1973] and say that a task environment is well defined if:

(1) there is a determinate initial state;
(2) at each choice point there is a determinate option set of available actions, drawn from the action repertoire;
(3) there is a determinate consequence function specifying the state which results from taking an action at a choice point;
(4) there is a determinate goal condition for deciding when the task is complete; and
(5) there is a metric for comparing the costs and benefits of paths through the space.
Action selection can then be seen as the application of two successive filters. [Elster 1981] Given a choice point, filter the action repertoire to yield a feasible set, then filter the feasible set to yield a choice set by applying the metric to the consequences flowing from each feasible action, and invoking a decision rule to select the best. If the choice set is determined by a simple decision rule (filter two), such as maximize expected return, each action in the choice set will have the same expected utility. If the choice set is determined by a different decision rule, such as minimize the worst possible injury or cost, each action will serve the conservative function of leading to states that are not likely to be costly, even though the actions may have very different expected benefits. The action actually carried out is any one of those in the choice set, and is assumed to be arbitrarily chosen, although often there will only be one action in the choice set. (A minimal sketch of this two-filter scheme in code follows below.)

To see how we can use the directed graph notation to understand behavior, consider how we would explain a bird constructing its nest. We would begin by defining the goal condition as, say, the construction of a stable structure with certain shape, size, thermal and tensile properties. The initial state would be the site chosen to build on, such as a branch or small hole, as well as the distribution of useful resources or debris -- the local inventory -- to be found in the region the bird is willing to scavenge over: for instance, sticks, feathers, paper, saliva, dirt within a radius of 50 meters. Choice points are then introduced by assuming that there is a set of construction-relevant actions available at various points in the nest building process. For instance, if the nest walls need to be heightened and the bird's beak is empty, the bird faces the choice of selecting which of a small set of appropriate nearby inventory items should be fetched. If an item is already in the beak then the bird must choose between placing the item on one of a small set of points or surfaces on the emerging nest, rejecting the item and dropping it, or storing it for later use by placing it to one side. As each action is completed the state of the environment changes and the bird faces a new choice point determined by the demands of the task. The distance from the goal, meanwhile, might be estimated by how many items of the inventory will be required to finish the nest, or how much energy will be needed. And the consequence of any action is given by stating how the nest and bird each differ after tamping, or after the bird fetches a stick, etc. By using a directed graph to represent the nest building process we distinguish trajectories of nest building that are likely to be successful from those likely to be unsuccessful. When this analytical framework is used in conjunction with empirical observation of these trajectories we hope to infer the strategies, plans, and behavioral programs which animals actually use.

One advantage of developing an account of behavior that treats it as algorithmic is that we can consider some of the computational properties of the algorithm regulating it. [See Harel 1987 for a simple account of these properties.] How much memory is required to follow this algorithm? How many steps will be required in the worst case? In the average case? How robust is the algorithm to interference? To noise?
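The two-filter picture above is easy to render as a control structure whose computational properties can then be interrogated. A minimal sketch in Python; the actions, feasibility flags, and outcome lotteries are invented for illustration.

```python
# Two-filter action selection. Each candidate action carries a feasibility
# flag (filter one) and a lottery of (probability, utility) outcomes to
# which a decision rule is applied (filter two). Numbers are illustrative.
actions = {
    "attack_left":  {"feasible": True,  "lottery": [(0.5, 9.0), (0.5, 0.0)]},
    "attack_right": {"feasible": True,  "lottery": [(0.9, 4.0), (0.1, 1.0)]},
    "leap_ravine":  {"feasible": False, "lottery": [(1.0, 10.0)]},
}

def expected_utility(lottery):
    return sum(p * u for p, u in lottery)

def worst_case(lottery):
    return min(u for _, u in lottery)

def choice_set(actions, rule):
    # Filter one: discard infeasible actions.
    feasible = {a: d["lottery"] for a, d in actions.items() if d["feasible"]}
    # Filter two: keep the actions the decision rule ranks best; ties
    # survive, so the result is a choice set rather than a single action.
    best = max(rule(lot) for lot in feasible.values())
    return [a for a, lot in feasible.items() if rule(lot) == best]

print(choice_set(actions, expected_utility))  # ['attack_left']: EU 4.5 vs 3.7
print(choice_set(actions, worst_case))        # ['attack_right']: safer floor
```

A rule like worst_case picks the conservative option even though its expected benefit is lower, which is the contrast drawn above; and once a strategy is written this way we can count its steps and storage demands, the properties discussed next.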
If we can determine these properties of behavior control strategies we can explain the adaptive advantage of different strategies. The factor which complicates this simple picture is that we cannot simply infer these computational facts. The robustness of a program may depend on how it is implemented. A production system implementation [Newell 1973] of a skill or strategy, consisting of a set of rules that trigger only when the creature and environment are in a certain state, may have one level of robustness and performance, while a recurrent network implementation of the same skill or strategy may have another. So we cannot go directly from the behavioral analysis of programmed behavior to the computational properties of the `program' which regulates behavior, since our analysis will depend on how we think the creature is programmed. The reason to be hopeful about this line of research, though, is that there is a well established belief in computer science that, at an abstract level, we can discuss the absolute complexity of a task or problem in a manner that abstracts from arbitrary aspects of an algorithm and implementation. [Harel 1987] Thus even if we do not know whether a creature has a working memory capable of storing M items, we can still know that the creature must be capable of keeping track of M items if it is to perform the task, and that it will have to perform at least N (mental or physical) `manipulations' on those M items. How it does this `storing' is implementation dependent. It may somehow encode markers in working memory and manipulate them internally. Or it may have a less symbolic memory and rely on visual strategies for tracking or quickly isolating recently seen items in the environment, and then physically manipulate those external items. We cannot say in advance of detailed research. Nonetheless, we can rank different algorithms by their `memory' requirements, and we can rank different tasks by their complexity, in the sense that the most efficient algorithm capable of solving them will have certain complexity measures. [Papadimitriou 1995]

Justifying a choice of a state space representation

The concept of a task environment that I have been presenting is an abstraction laid over the interactions between an agent pursuing some goal, which we call its task, and the physical environment in which it is acting. The point of the abstraction is to reveal the constraints on goal-relevant activity inherent in agent-environment interaction, so that we can explore the sub-goal structure of the task and thereby explore the costs and benefits of pursuing different paths in the task state space. In order to justify a particular state space analysis of a task, it is necessary to first justify one's choice of action repertoire, choice points, option set, consequence function, metric, goal condition and initial state. Of these the most crucial choices are the action repertoire, option set, and consequence function, since these determine the non-metrical (or qualitative) topology of the space. How then do we choose the states and actions which allow us to interpret activity as a trajectory in a particular task state space? There are really two issues here. First, how do we decide how to classify the states and actions occurring in task performance?
Second, how do we decide whether an action observed while a creature is engaged in a task is the type of action to qualify as a move within that task space, as opposed to being an action that just happens to occur in the same time frame as the task?

A word about the first problem. Since a task environment is purely a theoretical construct, its adequacy depends entirely on its success in explaining behavior. For instance, in describing the task environment of ping pong we are free to choose a set of states and state transitions (actions) which we think will clarify the structure inherent in the game. Thus we might choose impacts as the states, and divide state transitions (actions) into two sorts: strokes, which are the transitions caused by players, and bounces, which are the transitions caused by the ball hitting a non-moving surface, such as the table top, or the player's body, or the floor. Each transition has two parameters: spin and non-angular momentum. Alternatively, though, we might characterize the game more qualitatively. There are several types of actions -- forehands, backhands, smashes -- each with parameters such as topspin, underspin, flat. The states are described from the perspective of the player currently stroking the ball, and qualitatively characterized by such terms as topspin deep court, smash down the line, etc. The advantage of such a qualitative account is that it coheres better with how players themselves think about the game and also lends itself to describing strategies we know players know. Nonetheless, such an account is no less a theoretical construct imposed on the continuous set of changes occurring in a real ping pong game than is the first description. The choice between descriptive languages must be based on the quality of the task analysis each permits. This is all the more true when the animals performing the task lack a system of descriptive concepts for events and situations.

This leaves us with problem two: how do we decide which actions of a creature's action repertoire belong to a task, and so are internal to the task environment, and which actions are, properly speaking, not part of performing the task, and so lie external to the task environment? Three natural criteria come forward as necessary: the action must be capable, at least in principle, of advancing or hindering progress toward the goal; the action must cause a transition between states of the task space; and the action must be selected on considerations internal to the task, rather than on considerations, such as whether to undertake the task at all, that are settled at another level.
Applying these criteria is not always an easy job. We cannot always have advance knowledge of all the actions which are potentially relevant to a goal, so it may be difficult to decide if an action is too idiosyncratic to count as an action within a task, or whether it is part of a novel strategy for task performance. Empirical studies should help, but it is a real problem. As formal criteria themselves, however, these conditions seem both necessary and sufficient to let us distinguish actions that are internal to a task from those that are external. I now want to show that if we stick to these natural criteria, all sorts of `external' or `meta-task' actions are possible that can have significant effects on performance. Actions taken outside a task may have side effects which affect performance within the task. If there are regularities in the way these side effects may be created they can be exploited by a creature so that, in effect, it may adapt the task to the strategies it has, rather than adapting its strategies to the task. We may distinguish two broad families of strategies creatures have of shaping their task environments: strategies that deform the state space of a task, changing its topology; and strategies that improve a task's cognitive congeniality.
Of these two, changing congeniality seems reserved for higher animals, particularly humans. So my examples of those strategies will be drawn from studies and analyses we have done with humans. But close observation of animal populations may yet reveal that even for these highly cognitive strategies, there are rudimentary analogous strategies occurring in the wild, particularly among creatures with `cultures'. Let us turn to the first category of task-external actions.

Strategies for Deforming the State Space

The structure of a task, I have been arguing, is given by its state space topology. It follows that to change a task environment is to change its topology. Since any change to the choice points in a creature's environment, to its action repertoire, or to the consequence function (or metric defined over that environment) will either add or subtract nodes, add or subtract links, or change the distance between nodes, a natural strategy of redesign is to alter one of these constituents.

To establish that a change is to a creature's advantage, let us assume that behavioral strategies can be analyzed as procedures, or algorithms. Relative to a particular task environment an algorithm will have a particular set of costs and benefits. These may be measured in computational terms, such as time (number of steps or actions), or space (amount of memory, items to be tracked in the environment) required to accomplish the task. But we may also use broader metrics, such as the amount of energy or labor required to complete the task, the robustness of the algorithm to interference and noise, etc. It is important to be clear that, at this stage of our inquiry, the term algorithm will be reserved for the program or control structure organizing the actual behaviors a creature reveals in task performance. Since every attempt at performing a task will be interpreted as a trajectory of actions through a task state space, we can ask what are the properties of a program responsible for these trajectories. For lower animals these programs may resemble a collection of highly constrained action routines -- tropes. Accordingly, action is largely reactive to local conditions. But as we ascend the great chain of being we expect the programs to become larger, more flexible, and sensitive to non-local features of the environment. When humans and perhaps the great apes are the agents, their programs become sufficiently complex that we cannot understand their effects unless we also understand the regulating role of their mental operations in the more general system consisting of both mental and physical acts. At that point we must expand our notion of algorithm to include the idea of a control structure regulating both mental and physical actions, and the tight coordination between the two. But more on this in section 4, where we discuss cognitive congeniality.

Since algorithms regulate activity within a task environment, we are assuming that they do not change a task; they cause transitions to new states within the task. This is an observation of some import. To change the task environment requires executing an action that lies outside the normal algorithm for the task. This makes creating a better task environment resemble selecting a better habitat. The similarity is only superficial. In environment redesign the creature remains in the same geographical region and is itself responsible for the change in environment.
The global environment does not present the creature with a range of `pre-existing' habitats, differing in salient respects, which the creature then chooses between. Rather, the creature itself actively creates a changed habitat out of the `pre-existing' one. Thus, in habitat selection, the environment is assumed to have its task characteristics independently of the creature, whereas in active redesign the environment has been forcibly changed, and may be expected to gradually return to its original state upon the creature's death.

Let us now review some of the different ways individual creatures have of altering the topologies of their task environments. These include eliminating initial states, hence eliminating choice points -- the Just Say No strategy and the method of Routine Maintenance; changing the action repertoire -- the methods of Routine Maintenance and Tool Use; changing the consequence function -- Scouting Ahead; and lastly, adding choice points -- Tool Use.

Just Say No

The first example of a strategy for deforming the task environment is best understood in terms of a filter which sifts out undesirable choice points, particularly the initial choice points which a creature faces upon entering a task. In the course of its life a creature comes across a variety of task situations -- a variety of initial states. Some of these initial states are easy to solve, some are hard. If a creature learns to avoid the hard situations, if it learns to refrain from attempting a task when it is hardest, it can ensure that certain regions of the complete state space are never visited, and so are effectively pruned from the space. This has the effect of improving the worst-case performance of its behavior routines, and hence reducing the average complexity of the tasks it actually attempts.

The Just Say No method bears closer scrutiny. Virtually every problem has hard and easy instances, in the sense that every algorithm designed to solve that problem will do worse on harder instances than on easier ones. The average complexity of a problem is given by the distribution of these hard and easy cases. See figure 1. Although in general it is not possible to identify in advance which cases are going to be hard and which are going to be easy, in particular problems there are often cues that signal hard cases, and of course, a creature may remember that a particular case is hard if it has tried it once before. As the creature learns more about the cases that are hard, it can create a personal filter that effectively reshapes the distribution of cases it will face, as the simulation sketch below illustrates. Without physically changing the environment it changes its task environment. In real life this strategy is intuitive, pervasive, and population-wide, although we also expect to find idiosyncratic instances of it, were we to observe individuals closely. If a beta male finds itself threatened by an alpha, it is likely to growl, then avert its eyes and avoid a confrontation. Since it lacks the strength and fighting skills (fighting algorithms) to survive the harshest competition, it adopts a policy of retreat. Given the less daunting environment defined by the world it retreats to, its fighting algorithms may suffice. This means, of course, that it will never have access to several females; but ex hypothesi we are assuming that in a competition it would lose and potentially incur serious injury.
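The effect of such a filter on average-case cost is easy to simulate. A minimal sketch, assuming a hypothetical heavy-tailed distribution of instance difficulty and a perfectly reliable cue for hard cases; both assumptions are illustrative.

```python
import random

random.seed(0)

# Hypothetical instance costs: mostly easy cases, with an occasional
# very hard one (a crude stand-in for the distribution in figure 1).
def instance_cost():
    if random.random() < 0.8:
        return random.expovariate(1.0)    # easy: mean cost 1
    return random.expovariate(0.1)        # hard: mean cost 10

instances = [instance_cost() for _ in range(100_000)]

# Without a filter the creature attempts every instance it meets.
mean_unfiltered = sum(instances) / len(instances)

# Just Say No: a cue flags instances costlier than a threshold,
# and the creature simply declines to attempt them.
THRESHOLD = 5.0
attempted = [c for c in instances if c <= THRESHOLD]
mean_filtered = sum(attempted) / len(attempted)

print(f"mean cost per attempted instance, no filter: {mean_unfiltered:.2f}")
print(f"mean cost per attempted instance, filtered:  {mean_filtered:.2f}")
print(f"fraction of instances declined: {1 - len(attempted) / len(instances):.1%}")
```

Nothing in the world has changed; only the distribution of cases the creature actually faces has been reshaped, which is the point of the strategy.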
Prudent retreat is not restricted to aggressive contexts. In looking for a stream to dam, a beaver `judges' whether the environment will prove hospitable and yield to its construction algorithm, or whether it is likely to prove a hard case. If it seems that the engineering skills required are beyond its level, it may simply reject the site and search for a more amenable one. The same applies to nesting birds. Sites for nests are chosen not just for their camouflage value, or for their protective features, but for their affordances for nest building. If one local area does not provide hospitable sites, it is best to continue the search for better sites. A second-rate site might be adequate for a first-rate nest if a bird has the requisite architectural skills. But if it does not, the easiest way to live within its means is to reject second-rate sites. In a similar way, predators select their prey carefully, usually stalking them until the `initial conditions' of the attack phase lie in a good region of their attack algorithm. If it is raining or stalking conditions are bad, rather than testing the excellence of its stalking algorithms, a predator may simply postpone hunting and wait for better conditions. Selectivity is an important component of competence. Even if a creature cannot guarantee that it will never have to face worst-case scenarios, it can strongly bias the distribution of cases and so improve the average case scenario it confronts.

A strictly human example of the Just Say No method of altering the task environment can be found in chess playing. It is sometimes maintained that chess players inhabit the same state space whether they are novices or experts. If the state space is determined by the rules of the game it is hard to see how it could be otherwise. Yet novice players never face most of the states which experts face. One reason is that they themselves are incapable of making the sort of moves necessary to reach certain states. This is a consequence of the competitive nature of chess. But another reason, more salient for us now, is the way chess partners self-select. Players prefer to play players of comparable rank. Hence novices face a slightly different chess environment than experts, because they never face states which only experts could put them in.

A final example of the Just Say No strategy can be found in the recent simulation studies on the iterated prisoner's dilemma by Batali and Kitcher [1996]. They found that when players are allowed a third action of waiting out a turn in the prisoner's dilemma task, they are able to achieve better aggregate outcomes. The average outcome for an agent playing the iterated prisoner's dilemma is greater if that player, along with its co-participants, can abstain from play whenever they like instead of having to make one of the two classic moves inside the game. By devising a policy of saying no under certain circumstances, players change their opponents' expectations of the costs and benefits of actions, leading everyone to choose actions with greater social welfare -- a classic example of the Just Say No strategy, applied in a game theoretic context.

This, then, is the Just Say No strategy for environment change. It is not always easy to decide when it is being followed, since it so often looks like an integral part of a creature's fighting, building, or hunting algorithm. But conceptually it is distinct.
Indeed, if our analysis has been correct so far, it must be distinct from the task algorithm itself, since such actions are task-external and often super-optimal. That is, when they don't have serious downstream effects, such as preventing learning (we are assuming the creature has completed its learning phase), they are ways the creature has of increasing the average payoff for one of its strategies. This leads to an increase in overall fitness.

To see why actions such as avoiding confrontation, sitting on the sidelines, or looking for more hospitable sites are task-external actions, and why they can at times be super-optimal, we need to remind ourselves how we decide on the action repertoire of a task. As was mentioned earlier, it is not always obvious what actions should be regarded as part of a task and what actions should be regarded as external to a task, because it is not always clear whether a given action may in principle advance or hinder progress toward a goal. It may seem that selective acceptance of initial conditions is just such a hard case. My reason for treating it as an external action is that entering and leaving a task are not literally part of a task. They represent decisions at another level. If we were to admit Just Say No actions to be part of hunting, or fighting, or nest building, we would have to admit a range of considerations into these capacities that really have nothing to do with them. For instance, the decision to accept a nesting site must be taken in light of the availability of sites, the lateness of the year, the prevalence of predators, etc. All these considerations have nothing to do with the job of building, which can be treated as a fairly modular skill. Hence Just Say No actions fall outside that modular skill. They are actions which affect the structure of a task without changing the state of the task. They do not cause transitions in that state space.

Showing that Just Say No actions are often super-optimal is more difficult. An action is super-optimal, by our definition, if it does not lower the return to actions in other tasks, and it raises the expected value of activity in the current task. Since Just Say No actions do not require the creature to do anything other than reject the opportunity of engaging in a task, we must show that there are times when `doing nothing' is the best thing a creature can do. To appreciate why sitting on the sidelines can increase the yield of the current task without decreasing the yield in any other task, we need to recognize that there are usually entry costs to shifting from one task to another. The opportunity cost of doing nothing depends not only on the return that might be available were the creature to do something else with its time; it also depends on how easy it is to leave one task and begin another. Assuming that a creature cannot instantaneously change gears, there are going to be moments when the best thing a creature can do is to be idle, particularly if by idling it increases the expected return of time spent on the same task later. The upshot is that by invoking a strategy of waiting, or deferring action, a creature can substantially improve its expected yield from its current skills. Without having physically altered the properties of its habitat, it has altered the structure of one of the tasks it faces in that habitat, and so improved its prospects for surviving.
Routine Maintenance

Learning to selectively say no to certain problem situations is one way of filtering out choice points to improve performance. Another way of achieving a similar outcome is to have a policy of maintaining one's environment so that undesirable choice points (i.e. states one must act on) rarely arise. This is a more active policy which actually alters physical attributes of the habitat. Proverbs such as `a stitch in time saves nine', `an ounce of prevention is worth a pound of cure', `scatter the stones before they make a pile' [Lao Tze] all reflect the idea that preventative measures cost little when compared to the costs they save later. They are often easier to perform too. It requires less knowledge and effort to do routine maintenance on a car than to fix it when it breaks.

The value of maintenance strategies was discussed in [Hammond 1995] in the context of activity management and planning. The authors noted that because certain resources tend to reside in specific places -- clean glasses, crockery, cutlery, cleaning equipment, for instance, are typically found in cupboards, drawers, and closets -- agents learn to count on these resources being in their appointed location. This is a useful feature, for it means that if one is cooking or setting the table it will not be necessary to systematically search the workspace to find the items one requires. But this savings does not happen magically. Entropy teaches us that objects tend to scatter. So a plan or program which depends on resources being in their expected places will be apt to fail unless someone in the agent's environment ensures that items find their way back to their `proper' place. A certain state of the environment must be maintained or enforced, either by the agent itself, by some automatic mechanism, or by other members of the agent's group.

We can generalize the notion of resource maintenance. To begin, we note that environments have a probability of moving into states that are undesirable for a creature. See figure 2. (A small model of this drift, and of the effect of maintenance on it, is sketched below.) These states vary from the manageable but difficult, to the impossible. In a world designed to make life easy for a creature they would not arise at all. But inevitably they do arise, because often they are side effects of the very actions taken by the creature. Thus eating leads to digestion, digestion leads to defecation, and defecation leads to soiling one's immediate locale. Unless, of course, the odious result is buried, as domestic cats do, or the creature leaves the immediate locale to defecate, as most creatures leave their nests. A related action is taking the garbage out of the burrow. Squirrels are known to remove rotting vegetation from their burrows, and occasionally the empty shells of their nuts. Undesirable states also arise because of exogenous factors. Winter snows bury nuts that otherwise could be counted on to be found. Other animals gather and consume nuts. The net result is that in wintertime the probability of finding a nut just in time is too low to be relied on. It is better to pay the storage costs and the up-front labor costs to build an in-house inventory. Hence storing nuts for winter is another instance of resource management. Given that resources are constantly in demand, and not themselves always well defined, it is difficult to know the extent to which creatures practice resource maintenance.
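The drift toward undesirable states, and the way maintenance counteracts it, can be sketched as a small Markov model. All transition probabilities and costs below are illustrative assumptions, not measurements.

```python
import numpy as np

# Two-state world: index 0 = orderly, 1 = disordered. Each step the world
# may decay into disorder and may be restored. Costs per step: a small
# upkeep cost while orderly, a large coping cost while disordered.
def long_run_cost(p_decay, p_restore, step_costs):
    P = np.array([[1 - p_decay, p_decay],
                  [p_restore,   1 - p_restore]])
    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    evals, evecs = np.linalg.eig(P.T)
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    pi = pi / pi.sum()
    return pi @ np.array(step_costs)

# No maintenance: frequent decay, slow recovery, no upkeep cost.
print(long_run_cost(0.20, 0.10, [0.0, 9.0]))   # ~6.0 per step
# Routine maintenance: pay a little every step; decay becomes rare
# and recovery fast.
print(long_run_cost(0.05, 0.50, [1.0, 9.0]))   # ~1.7 per step
```

The maintaining agent pays a constant small tax but spends most of its time in the orderly state, so its long-run cost per step is far lower; this is the proverb logic of `a stitch in time' in stationary-distribution form.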
The general notion of maintenance involves keeping parts or systems (or livestock, or cultivated goods) in good working order; having resources located where they are most convenient, and available when they are most needed; and, most complex of all, keeping resources in the form that is most useful. Thus, to address the last condition alone, tidying up a workspace may involve putting items in canonical locations, but it might equally involve redistributing the items in the same workspace to remove clutter. No wonder it is hard to know when an action has beneficial resource management consequences. Two general principles can be invoked to explain the virtues of maintenance.
According to standard investment theory, the decision about whether to invest time doing A right now, when one is not likely to need the results of A until later, depends on whether the expected benefits to be collected sometime in the future outweigh the current costs multiplied by some interest factor to compensate for risk. If there is no interest rate, there is no reason to prefer near-term returns on investment (instant gratification) to long-term returns (deferred gratification). Storing nuts for winter makes good investment sense because, effectively, the price of nuts goes up enough in winter, when they will be scarce and need more labor to be found, to compensate for present efforts. Accordingly, the current value of doing A now exceeds the current value of anything else that might be done now. This is the simple story. But it makes the idealizing assumption that the cost of labor is constant and arbitrarily divisible, when in fact we know that some demands on our time are more urgent than others, and we cannot always break with what we are doing to engage in other, more profitable, activities. Hence we cannot suppose that we will always be able to do A at the last minute, or at least not without incurring potentially exorbitant costs. This complicates the equation, for it means that not only is there risk on the benefits side, there is risk on the labor (i.e. cost) side. The upshot, on economic principles, is that a creature should not only engage in maintenance actions whenever it has spare time -- doing A now, when labor is cheap, being more profitable than doing anything else -- it should also engage in more maintenance actions whenever it is uncertain when it might next get the chance. (A worked example follows at the end of this section.)

Maintenance can also be justified on Artificial Intelligence principles. The reasoning is as follows. Expertise consists in knowing what to do in particular cases. The more an agent practices a skill, the broader the range of cases it learns to handle and the more compiled or chunked its skills become. [Newell 1993] The net effect is faster performance on a wider set of examples. Since the full space of possible situations is large, however, there are always cases which are novel but which will be treated as identical to known cases, with possibly unfortunate consequences, or else they will require on-line adaptation (reasoning). The virtue of maintenance is that it can bias the probability against the occurrence of these novel or unfamiliar cases. Since we know what to do in accustomed cases, the more cases that are familiar, the better our performance. If maintenance can create this biased sample it is adaptive. For both these reasons Routine Maintenance is an effective means of shaping the environment to help a creature circumvent performance-limiting circumstances and so increase the average yield on its actions. It is an important super-optimizing strategy.
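The investment reasoning above can be made concrete with a small worked example. Every number here is an illustrative assumption: the point is only the structure of the comparison, with risk on the cost side as well as the benefit side.

```python
# Invest in the maintenance act A now, or defer it until it is needed?
cost_now        = 1.0   # labor is cheap in a slack period
benefit         = 3.0   # value of A's result when it is finally needed
interest_factor = 1.2   # compensates current effort for risk / delay

# If we defer, there is some chance we are blocked at the last minute
# (busy, bad conditions) and must pay an exorbitant price instead.
p_blocked    = 0.3
cost_later   = 1.0      # last-minute cost if we are free
cost_blocked = 6.0      # last-minute cost if we are not

net_now   = benefit - cost_now * interest_factor
net_defer = benefit - ((1 - p_blocked) * cost_later + p_blocked * cost_blocked)

print(f"maintain now:  expected net value {net_now:.2f}")    # 1.80
print(f"defer:         expected net value {net_defer:.2f}")  # 0.50
```

Even after charging current effort an interest premium, acting now wins once the chance of being blocked later is appreciable; and the larger that uncertainty, the more maintenance the creature should do in its slack periods.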
Scouting Ahead

The strategies of Just Say No and Routine Maintenance are both key ways creatures have of taking back some degree of control over their environments. Because of such task-external strategies creatures do not have to be passive optimizers always striving to do their best in games posed by nature; they can be active participants, modifying the rules of the games they must play. A third technique for changing the terms of a task is to scout ahead and change one's knowledge of the consequences of actions.

Imagine a lion stalking a herd of wildebeest. Should it attack from the right flank, or should it circle around and attack from the left? If there is a hill nearby, a third action would be to defer attacking in order to secure extra or better information about the layout of the herd. In decision theory, value of information analysis (Howard 1966) provides a method for determining when it is worthwhile to pay the costs of acquiring information. As can be seen in figure 3, the expected utilities of the actions available to a lion at the same physical location can be plotted on two separate occasions. On the first occasion, before any special information seeking actions have been taken, the lion must operate with prior probabilities about the arrangement of large and small animals. On the basis of these priors the clear choice is to attack from the left flank, for the expected utility of that attack is 4.5. On the second occasion, the lion moves to a better viewing position and now has a commanding view of how this particular herd is distributed around the plain. Given this more informed idea of the organization of the herd it is possible to know with very high confidence -- virtual certainty, let us suppose -- what the payoffs will be from an attack from each flank. So once at t2 the lion will know whether it will catch a large, small, or no animal by attacking from a particular flank. However, since neither the creature nor we can know at t1 which situation will obtain at t2, all that can be known is that once the information gathering activity has been performed the creature will be in one of three states of knowledge. Either it will know that it can secure a large animal, or a small animal, or no animal at all. Thus, at t1 it knows that at t2 there is a 1/3 chance of taking a large animal, a 1/3 chance of taking a small animal, and a 1/3 chance of taking no animal. Since the value of being at t2 is 5 and the value of being at t1 is 4.5, any information gathering activity that costs less than .5 is worth undertaking.
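The same calculation can be written out explicitly. Since figure 3 is not reproduced here, the payoffs and prior probabilities below are assumptions chosen to reproduce the two values quoted in the text (4.5 at t1, 5 at t2); the value-of-information logic itself is standard.

```python
# Value of information for the lion's choice of flank.
# Payoffs for the three possible outcomes (illustrative assumptions):
U = {"large": 10.0, "small": 5.0, "none": 0.0}

# Prior outcome probabilities for each attack, before scouting:
priors = {
    "attack_left":  {"large": 0.3, "small": 0.3, "none": 0.4},   # EU 4.5
    "attack_right": {"large": 0.2, "small": 0.3, "none": 0.5},   # EU 3.5
}

eu = {a: sum(p * U[o] for o, p in dist.items()) for a, dist in priors.items()}
value_t1 = max(eu.values())   # act on priors alone: attack left, EU 4.5

# After scouting the lion knows exactly which outcome the best attack
# yields; the text puts the three knowledge states at 1/3 each.
value_t2 = (U["large"] + U["small"] + U["none"]) / 3   # 5.0

print(f"value at t1 (no scouting): {value_t1:.2f}")
print(f"value at t2 (scouted):     {value_t2:.2f}")
print(f"scouting is worth up to    {value_t2 - value_t1:.2f}")
```

Scouting pays whenever its cost is below the 0.5 difference, exactly as the text concludes.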
Scouting Ahead is an interesting way of increasing control over an environment, because we do not normally think of information gathering actions as ways of altering a state space. And indeed, often they are not. Exploration might be an action internal to a state space, in which case a scouting action cannot alter the space, for the topology of that space will resemble that given in figure 3. If exploration is part of a task, the action of moving to t2 would not alter the state space; it would merely be a rational move within that space. This reiterates the view that state spaces are relational constructs, defined relative to an action repertoire. But often exploration is not an action that is internal to a state space, in which case it is a way a creature has of altering the space. The reason Scouting Ahead, as an external action, would have the effect of altering the state space is that a state space is not just a topology of connected states; it contains a measure of the expected goodness of different states, and this measure assigns a distance between states. Any alteration in the distance between states counts as a change in the state space. An external action of scouting ahead has the effect of altering the distance between states. If this seems odd it may be because we are not used to regarding changes in expected utility as real changes in distance, hence as real changes in a state space.

An example that converts expected utility into expected travel time may help dispel this view. Every day when I leave the university I must choose between two routes to travel home, the coastal route or the highway. Normally the highway is faster (though less pleasant), and so if I am in a rush, I take it. But if I leave at rush hour there is a good chance of a traffic jam on the highway route -- right where there is a merge with another highway. Happily, it is possible to take a small detour and look out over the merge area. At rush hour this action, though taking a few minutes, ends up saving me time on average, because I am then able to take whichever route is faster. If I were always to take the coastal road at rush hour, my average travel time would be longer than if I take the coastal road only when there is a traffic jam on the highway. Similarly, if I were always to take the highway at rush hour, my average travel time would be longer than if I sometimes take the coastal road. By adding a Scouting Ahead action I can reduce my average travel time, and hence improve my average performance.

This ought to make clear that Scouting Ahead can alter the topology of a state space if it is an action external to the task. But why view Scouting Ahead as a behavioral routine that actually is external to such tasks as hunting (or driving home)? My argument is basically a slippery slope argument: if we are to use the notion of task environment as an explanatory construct in understanding animal behavior, we need to draw a line between actions that are part of the task -- part of hunting, in this case -- and actions that are not. If we do not draw this line, then because information gathering actions are so diverse, and their relevance so hard to determine, we may be forced to include actions that seem completely unconnected to hunting as part of the core competence of hunting. We can concretize this slippery slope argument by returning to the example of a lion scaling a hill to look out over the plains. Clearly scaling a hill is not part of the attack phase of hunting, which occurs on the plains.
If scaling hills is part of hunting at all, it must be part of a more inclusive task of hunting, say, hunting-and-hunting-related-scouting. Are there natural limits to this more inclusive task? I think not. Wherever we draw the boundaries of hunting-related-scouting there will always be new information gathering strategies which have a measurable impact on hunting performance and yet which fall outside hunting even thus broadly defined. For instance, who would call the action of going for a post-kill `exploratory' walk a part of hunting? For one thing it occurs after `hunting'. For another, it may involve walking to regions that are spatially distant from areas the lion ever, as a matter of fact, hunts in. Yet a predator with good knowledge of the lay of the land -- knowledge of where enclosures and open spaces are -- can often use that knowledge in trapping animals during the hunting phase. Since rats are known to engage in exploratory behavior unconnected to food search, other animals may be expected to do so as well. But then we face a dilemma: either we must conclude that there are actions that are temporally and often spatially distant from the normal actions of a task, but which, nonetheless, we must accept as part of the task -- a conclusion that is tantamount to saying we don't know what is involved in performing the task. Or we must accept that wherever we draw the boundary of a task's state space, it will be possible to find information gathering actions that lie outside that boundary, and which are capable of altering activity inside it. Sometimes these actions are super-optimal.

Creating new actions by using tools

Throughout our discussion we have often spoken of the task changing power of tools. Since a task environment is defined relative to an agent's action repertoire, any change in the actions the agent may perform changes the topology of the state space. Introducing a tool is one of the easiest ways of changing an agent's action repertoire, for it now becomes possible to do things previously unattainable, or unattainable in a single step. Take the case of New Caledonian crows recently discussed by Gavin Hunt [1996]. Many bird species use twigs, bits of bark, and in at least one case, cactus spines, lying on the ground to aid them in their search for tasty insects and spiders. The New Caledonian crows, though, actually fashion the probes they use. One tool is made from twigs bitten from living trees. It serves as a hook. Another is a pointed probe, 20 cm long, made from the tough barbed leaves of the screw pine. In certain cases, such tools just make it easier or more reliable to fish out insects. But more often, it would be impossible for the crows to search the interior of holes without the sticks. Owing to the different ways crows in different places use the tools, it is an interesting question how local the tool making cultures are. But the virtue of the tools, and in part of the cultures that sustain the skills to craft them, is that new actions become possible which alter the state space of insect search. These allow new and shorter paths to the goal of insect capture. It may seem that delineating the effect of tool use on the structure of a task is as easy as rewriting the state space with a slightly changed connectivity and possibly a few extra states thrown in. But this is an oversimplification. In fact, tool use can change almost every facet of a task. Tool use may require:
Given the profound impact which tool use can have on performance, it hardly needs justification as a strategy for super-optimization. Tool use, and its `cognate,' co-opting existing resources for new functions, are two of the most powerful ways of changing the state space of a task. With the act of introducing a tool, or with a behavior that imbues an existing resource with new functionality, a creature can leverage its existing capacities to new levels. Few super-optimizing strategies are as powerful.

Strategies for Improving Cognitive Congeniality

So far we have considered actions which prune the state space of a task environment, which introduce shortcuts, and which guarantee that action trajectories will lie in hospitable regions. These actions increase the likelihood of task success or reduce its expected cost, measured in physical terms. There is also a class of task external actions which leave the state space formally intact but which reduce the number and cost of mental operations needed for task success. These actions, which elsewhere I have called epistemic and complementary actions [Kirsh and Maglio 1995; Kirsh 1995], change the world in order to have useful cognitive effects on the agent. They reliably increase the speed, accuracy or robustness of performance. They are yet another way a creature -- usually a human creature -- has of improving its fit with its environment. Let us call the measure of how cognitively hospitable an environment is its cognitive congeniality. Different implementations of a state space have different degrees of cognitive congeniality. We can explore this notion using an old chestnut from the theory of human problem solving. According to Simon [1981a], tic-tac-toe and the game of 15 have isomorphic state spaces (see figure 4). Yet tic-tac-toe is a trivial game while the game of 15 is not.
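The isomorphism Simon points to can be made concrete. In the game of 15, players alternately claim the digits 1 through 9, and the first player holding three digits that sum to 15 wins. The short Python sketch below verifies that these winning triples coincide exactly with the rows, columns and diagonals of a 3x3 magic square (the standard Lo Shu arrangement), which is what licenses the translation of the game into tic-tac-toe.

```python
from itertools import combinations

# The Lo Shu magic square: every row, column and diagonal sums to 15.
MAGIC = [[2, 7, 6],
         [9, 5, 1],
         [4, 3, 8]]

lines = ([set(row) for row in MAGIC] +                    # 3 rows
         [set(col) for col in zip(*MAGIC)] +              # 3 columns
         [{MAGIC[i][i] for i in range(3)},                # main diagonal
          {MAGIC[i][2 - i] for i in range(3)}])           # anti-diagonal

# All triples of distinct digits 1-9 that sum to 15.
triples = [set(c) for c in combinations(range(1, 10), 3) if sum(c) == 15]

# The winning sets of the game of 15 are exactly the tic-tac-toe lines.
assert len(triples) == len(lines) == 8
assert all(t in lines for t in triples)
print("winning triples of 15 == lines of tic-tac-toe:", len(lines))
```

Playing the game of 15 on the square thus lets a player see threats and forks as spatial patterns instead of computing sums over candidate triples.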
Imagine now that we are playing the game of 15 but have been allowed to transform it into a magic square. On each turn player X chooses a card and flips it over on its place in the square, while player O chooses a card and takes it off the board (see figure 4). Clearly this new arrangement of 15 is cognitively more congenial than the first. It is not as congenial as tic-tac-toe itself, but it is a step in that direction. Like tic-tac-toe, the new arrangement allows us to extract much critical task information perceptually rather than by mental arithmetic, and so it encodes needed information more explicitly in the environment [Kirsh 1991] than the original, linear arrangement does. The result is that it saves mental computation, reduces the load on working memory, and makes the game relatively easy without long hours of practice. Using this notion of cognitive congeniality we can now consider a range of natural environments to see how creatures, most especially humans, adapt those environments to make them more cognitively congenial. Let me emphasize that I am not suggesting that non-human animals engage in environmental re-organization in ways that seriously resemble the rather sophisticated methods we find among humans. Even among humans, gross re-representation of the sort occurring in tic-tac-toe and the game of 15 is rare outside of paper and pencil contexts. So it would be a surprise to find close analogues to external re-representation among animals. But there is a fairly sizable class of less `cognitive' actions which also improve congeniality, and which I will discuss. And it is important to appreciate that even for animals, environments can be ranked along certain non-physical dimensions, such as their cognitive congeniality, and that this ranking may show up in patterns of habitat selection where it is evident that creatures are exercising a selective function over the various environments they pass through. The study of the cognitive congeniality of environments is at its earliest stage. Its central questions include: What is the maximum amount of working memory required to perform the task? What is the maximum amount of mental computation required to decide what to do next? How much task relevant information is encoded in the environment, and how easily is it recovered? There is, to date, no general theory to report. Nonetheless, it is apparent that cognitive congeniality is a key attribute in dozens of design fields, ranging from architecture and human computer interaction to product design and industrial engineering. As we learn more about animal cognition it may well have an important role to play in ethology too. In the remainder of this paper, I shall consider some of the techniques humans have for improving the cognitive congeniality of their environments. It is my belief that weak correlates to these can occasionally be found in the animal world.

Complementary Strategies

A complementary strategy is an interleaved sequence of physical and mental actions that results in a problem being solved -- a computation being performed -- more efficiently than if only mental or only physical actions were used. An example discussed in [Kirsh 1995a] is pointing at coins while counting them. If subjects are asked to count 30 coins made up of nickels, dimes and quarters without either touching the coins or pointing at them, over 50% of the time they give the wrong answer.
If they are allowed to point at the coins, the error rate drops to about 35%, and if they are allowed to move the coins freely it falls to about 20%. The hypothesis is that if an agent learns how to manipulate resources in its environment in a timely and constructive manner, it is able to solve cognitive tasks with less working memory, less visual-spatial memory, less control of attention, or less visual search than would otherwise be required. Complementary actions are part of a strategy for restructuring the environment to improve the speed, accuracy or robustness of cognitive processes. Recently I observed a nice example of a complementary strategy in an 18 month-old child. As shown in figure 5, ZeeZee was playing with a simple wooden puzzle. She had already played with this same puzzle twice before, and judging by the speed and accuracy with which she now assembled it, she had apparently memorized where each piece was to be placed. The little bird went in the lower left, the cat in the upper right, and the giraffe in the upper middle. I decided to test how she would solve the same puzzle when it was rotated 180 degrees, so that pieces that normally belong in the bottom row now belong, upside down, in the top row (see figure 5). Would she solve the puzzle this time by adapting her memory of where each piece went, would she fail to recognize the transformation and so use on-line reasoning to solve it as if from scratch, or would she do something different?
Her first action was to try the little bird in the lower left, its customary place had the board been right side up. When it didn't fit, she looked the board over and tried placing the bird in the middle space along the bottom row, once again in the wrong orientation, as if trying unsuccessfully to solve the puzzle while misled by her now incorrect memory. She then went straight to the upper right slot (the correct position), and placed the piece in its appropriate orientation after a little effort. At this point, she turned the entire puzzle 180 degrees, returning it to its normal orientation, and quickly placed the remaining pieces in their proper (and apparently memorized) positions. What type of action is this sort of board rotation? It is not a normal task-internal action, since no amount of re-orientation can bring us closer to or farther from the goal of having all the pieces in place. Nor is it a task external action of the sort we have already described, since the topology of the state space remains unchanged. Any state accessible before rotation is accessible after rotation. And the number of placement actions, and the physical energy required to place each piece in its position, is identical whatever the board's orientation, assuming arbitrary orientation of the pieces on the ground. So there has been no change in the physical distance separating states. Evidently it is a different type of action, one that re-organizes the environment for mental rather than physical savings. By performing the `meta-task' action of rotating the whole board, ZeeZee was able to stop her effortful on-line problem solving and return to her original rote strategy. This means that she could once again solve the puzzle using long-term memory rather than the working memory needed in on-line problem solving. She brought the world into conformity with her mental model rather than adapting her mental model to accommodate the upside down case. ZeeZee's board rotation is a simple example of a complementary action or strategy. Here is one that is slightly more complex. Once again it is an interactive technique that saves mental, not physical, effort. In figure 6 we see a scatter of 20 sticks of similar diameter but differing in length. The agent's task is to identify the longest. Because longest is a globally defined property, we cannot be sure that a given stick is the longest without checking all the others. But here we are faced with a choice: shall we move sticks as we visually check them, or shall we leave them untouched?
Many strategies are possible, but virtually all good ones involve moving the sticks. For instance, one algorithm is to pick up the first two sticks, compare their lengths, keep the longer and discard the shorter into a `reject' pile, then continue comparing and discarding until all sticks in the `resource' pile have been checked; the stick we are left holding must be the longest. (A code sketch of this strategy appears below, in the discussion of external encoding.) Clearly, this is an algorithmically effective strategy. We know that we will check all and only the sticks we need to compare. But it runs longer than necessary. A better algorithm exploits our ability to make good guesses about the longest stick remaining in the resource pile. This time we pick up the two longest looking sticks, and compare and discard until we see a major difference in length between the stick in our hand and the longest remaining stick in the resource pile. After a few sticks we are certain. An even more efficient strategy is to grasp the three or four longest looking sticks and push their bottoms against the table. The one stick which pokes out farthest is the longest. As theorists, what are we to make of all this activity? Should we regard these various actions of picking up candidate sticks and placing them in piles as actual moves within the state space of selecting the longest? Should we regard pushing the sticks against the table as a task internal action? Or should we rather see both as task external actions? Since one might just visually scan the sticks, mentally comparing each, there is nothing intrinsic to the task of selecting the longest that requires either the sort of organizational behavior we observe in creating distinct piles, or the analogue computation of pushing the sticks against the table top. Nonetheless, sticks must be picked up and then put down. So, intuitively, there is nothing task external about actions that involve moving sticks around and placing them in piles -- although the action of pushing them in groups against the table's surface does strain the intuition. The real problem is to decide how to describe these actions. For if we include only `objective' spatial elements in our action descriptions -- so that actions of picking up are different if they involve sticks in different locations and orientations, and similarly for actions of placing down -- then the state space will be large enough to describe, at a detailed level, every individual sequence of actions which subjects display, but the level of description will be so fine that we will be unable to meaningfully describe the strategies they are using. For instance, the interesting regularity at the strategic level is that subjects make meaningful piles of sticks, or intentionally exploit physical properties of the sticks and table top to perform an analogue computation. Such regularities would be lost if we confined our descriptions of actions to straightforwardly physical characteristics. Yet if we define the action repertoire using the concepts human subjects themselves use in describing the structure of their workspace, we create the problem that two subjects will have to be described as operating in different task environments if they conceptualize their environment in different ways. This is not acceptable.
Either we must give up the notion of task environment as a useful explanatory construct, or grant that there is a core task environment which all performers of a task share, and that there are a variety of actions which may be performed that do not fall within that task environment, narrowly construed, but which alter its cognitive congeniality. Once an environment has been so altered, strategies can be devised that exploit various of these `non-intrinsic' properties of the task environment. Thus, although stick sorting tasks cannot be performed without shifting sticks about, there is no need to manage the spatial organization of the environment in the sense of partitioning it into discard regions, candidate regions, and so on. Such actions make the task easier to accomplish, but they are undertaken for their effects on the agent's understanding of the task, and for the `cognitive affordances' they create, rather than to make literal progress in the task. They are complementary actions -- actions performed externally for their effects on internal computation. In interpreting certain actions in this way we are following a growing tradition of constructivism in learning theory [Duffy 1992] and situated cognition in cognitive science [Hutchins 1995; Norman 1988; Suchman 1986] by regarding many actions as having more to do with cognitive scaffolding than with step by step advancement to the goal. For instance, the point of creating a discard pile is to encode information about the state of our algorithm. It is to help us keep track of the sticks we have checked and the ones that remain. Indeed, the point of creating a discard pile can only be understood if we see the action as part of an algorithm that is being executed partly in the world and partly in the creature's head. Assuming, then, that there is a class of actions which may improve task performance and which lies outside a state space formalism -- a class of actions that improves judgment, decision making, planning and execution -- all that remains is to show that such `meta-task' actions are super-optimal. This is easy. Since the point of most complementary actions is to reduce cognitive loads, and so improve performance, it is no surprise that without such actions judgments tend to be error prone. In our pilot study, when subjects were not allowed to point at, touch or re-organize the sticks, and their task was to identify the longest stick of the twenty distributed as shown in figure 6a, they took more than 55% longer and made over three times as many errors as when they were allowed to manipulate the sticks any way they liked. Similar results hold when they were required to pick up the longest stick but otherwise leave the arrangement intact. When the prohibition against movement was lifted, subjects invariably relied on moving the sticks into piles, and their accuracy rose as indicated. This suggests that if accuracy is taken into account, the easiest way to improve performance is to allow certain task external actions.

Actions which encode information externally

If we attempt to be more specific about the function of complementary actions, we very soon distinguish a large family of such actions concerned with externally encoding information about ongoing mental activity. The stick sorting algorithm, for instance, uses the distinction between reject and resource piles to encode information about which sticks have been checked and which haven't.
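The role of the piles as external memory can be seen directly if the strategy is fixed in code. This is a minimal sketch, under the simplifying assumption that stick lengths are directly comparable numbers; the reject pile is an explicit structure rather than anything the agent holds in its head.

```python
import random

def longest_stick(resource_pile):
    """Comparison-and-discard: keep the longer stick, reject the shorter."""
    resource = list(resource_pile)   # sticks not yet checked
    reject = []                      # external record of checked sticks
    in_hand = resource.pop()         # pick up a first stick
    while resource:
        candidate = resource.pop()   # pick up the next unchecked stick
        if candidate > in_hand:
            reject.append(in_hand)   # discard the shorter into the pile
            in_hand = candidate
        else:
            reject.append(candidate)
    return in_hand                   # the stick still in hand is longest

sticks = [random.uniform(5.0, 30.0) for _ in range(20)]  # lengths in cm
assert longest_stick(sticks) == max(sticks)
```

Nothing in the loop remembers which sticks have been checked; that bookkeeping lives entirely in the division between the two piles.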
Having a reject pile lets us proceed in the algorithm without having to remember which sticks have already been checked. A second example of this same pervasive activity can be found in card games. In ordinary games, such as gin rummy, bridge and especially pinochle, it is common for players to continually organize their hands into arrangements that encode information about their game intentions. For instance, in figure 7 we see four different ways of encoding the same set of cards dealt in a game of 14-card gin. From a purely pragmatic viewpoint there is no reason to group cards. Grouping has no effect on the goals you can reach. So from a task environment perspective, each grouping designates the same state in the state space. Hence grouping must be a meta-action. Why do players bother? What function does it serve? The simplest analysis of grouping is that it is done to encode plan fragments. One of the key cognitive tasks a card player faces is to sketch out a rough plan of the sub-goals he or she will attempt to achieve. In gin rummy, for instance, as shown in figure 7, there are many possible completions one can aim for. An obvious explanation of the differences between the various organizations shown is that different players have chosen different strategies. Of course, we cannot be certain that we are right in our interpretation of what is encoded in each hand. We must guess at the encoding scheme each player is using, or believe what they say when we ask them. Moreover, players may make mistakes in encoding according to their own scheme. But within these limits, characteristic of an interpretive science, we believe we can tell from the way the hands are laid out (and from players' responses to questioning) when a player has overlooked certain possible continuations which others have noticed. In figure 7, player d has overlooked the fact that there are four queens, which players b and c noticed. Now the reason I am elaborating specific techniques we observe in card playing, counting sticks, and solving simple puzzles is that these actions are a central, if neglected, element of human activity. In card playing it is obvious what the advantages of continual re-sorting are. Since game intentions are encoded externally, the player need not remember them; they can be read off the cards. These savings reappear every time a change in intentions is made. Moreover, if the cards are well laid out, the time required to judge whether a target card can serve as a completion is shorter than if the cards are poorly laid out. Players using good layouts also produce fewer errors, leading one to suppose that effective goal encoding helps make execution of a plan more reliable and speedy. As long as we accept that card playing is not an unnatural task -- a task unlike any we might be called on to perform in non-card contexts -- we have reason to suspect that the kinds of complementary actions shown in card play have their counterparts throughout everyday life. To sum up, my point in discussing interactive strategies typical of human situated activity is that there is a second family of strategies which agents have for making their environments more hospitable. In addition to deforming the topology of their state spaces, and hence the physical effort required to traverse states, they may alter the cognitive properties of their environments and thus save mental effort.
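Though no formal model of card grouping is offered here, the flavor of the savings can be suggested in code. The following toy sketch is entirely hypothetical: the hand is invented, runs are ignored for brevity, and only rank-sets are treated as plan fragments. The point is simply that once cards are physically grouped by intended meld, deciding whether a drawn card helps becomes a lookup over the layout rather than a fresh scan of the whole hand.

```python
from collections import defaultdict

hand = ["7H", "7S", "7D", "4C", "5C", "6C", "QH", "QS", "KD", "2H"]

def group_by_plan(cards):
    """Arrange cards into meld-candidate groups (a player's layout).
    Toy version: only rank-sets are considered; runs are ignored."""
    by_rank = defaultdict(list)
    for card in cards:
        by_rank[card[:-1]].append(card)        # rank is all but the suit
    layout = {("set", r): cs for r, cs in by_rank.items() if len(cs) >= 2}
    grouped = {c for cs in layout.values() for c in cs}
    loose = [c for c in cards if c not in grouped]  # cards awaiting a plan
    return layout, loose

layout, loose = group_by_plan(hand)

def helps(card, layout):
    """Read the answer off the layout instead of rescanning the hand."""
    return ("set", card[:-1]) in layout

print(helps("QD", layout))   # True: extends the planned set of queens
print(helps("9H", layout))   # False: no planned group to extend
```

A player who keeps the layout current pays a small physical cost per rearrangement, and in exchange never has to recompute these intentions from scratch.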
In some of the examples just mentioned the method of changing the cognitive properties of environments was to redesign the appearance of the task sufficiently to change the complexity of the task. We know that how a problem is represented can have a major impact on the time and space required to solve it [Gigerenzer and Hoffrage 1995]. With a good representation a problem may be easy to solve, requiring little search; with a bad representation the same problem may be almost impossible to solve in that form, requiring inordinate amounts of search, calculation and recall of states. Once we view creatures as carrying out algorithms partly in their heads and partly in their environments we must recognize that particular environmental layouts permit algorithms that lead to major savings. Even if these savings are not always evident in the time necessary to complete the task, they will often show up as significant improvements in accuracy, robustness and reliability, all factors that matter to creatures.
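This representational effect can be illustrated with a diagnosis problem of the kind Gigerenzer and Hoffrage studied. The numbers below are the classic mammography example (a 1% base rate, an 80% hit rate, a 9.6% false-alarm rate), not figures taken from this paper; the two computations are mathematically equivalent, but people answer far more accurately when the information arrives as counts.

```python
# Probability format: the posterior requires an explicit Bayes' rule step.
p_disease = 0.01
p_pos_given_disease = 0.80
p_pos_given_healthy = 0.096
posterior = (p_pos_given_disease * p_disease) / (
    p_pos_given_disease * p_disease
    + p_pos_given_healthy * (1 - p_disease))

# Natural frequency format: the same information re-represented as counts.
# Of 1000 people, 10 have the disease; 8 of them test positive, and about
# 95 of the 990 healthy people also test positive.
sick_positive, healthy_positive = 8, 95
posterior_freq = sick_positive / (sick_positive + healthy_positive)

print(round(posterior, 3), round(posterior_freq, 3))   # both roughly 0.078
```

The frequency layout does for the reasoner roughly what the magic square does for the game of 15: it moves part of the computation into the representation.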
Conclusion

I have been arguing throughout this paper that organisms have two rather different ways of improving their fitness. The first, and most familiar, is by adapting themselves to their environments. In equilibrium this leads to an optimal allocation of time between the different adaptive activities that make up an organism's life. The second way of improving fitness is by redesigning the environment to make it more hospitable. That is, organisms can adapt the environment to fit their existing skills and capacities. This leads to super-optimization for a creature already in equilibrium. Super-optimization requires the environment to be altered so that existing skills and techniques of survival yield greater returns in at least one task environment without sacrificing returns in others -- Pareto optimization -- or so that any lower returns in some task environments are more than compensated for by increases elsewhere. In order to get the analysis based on super-optimization off the ground it was necessary to introduce the notion of a task environment (drawn from the theory of problem solving) and the notion of a behavioral strategy interpreted as an algorithmic process. Two broad methods for improving algorithmic performance in a task environment were then distinguished:
Both of these have the effect of altering the average complexity of task performance. It was suggested that the cognitive capacities needed to bring about changes at the level of cognitive congeniality exceed those of most animals, although there may exist analogues to such environment changing activity which naturalists may discover once they have more detailed models of the cognitive processes underlying animal skills. Chief among these strategies are complementary actions: actions performed for the sake of simplifying computation. In humans we find complementary actions everywhere. When a person uses a finger to point to a listing in the phone book, they are executing a complementary action, because the use of the hand saves them from having to remember the precise location of the target amidst a set of distracters. Without the sort of interactive help that comes from manipulating environmental resources, including our own hands, many of the tasks we easily perform would be beyond our abilities. Actions which deform the topology of a state space, by contrast with transformations of congeniality, are common in the animal kingdom, and no doubt will be more widely appreciated once naturalists begin explicitly looking for them. Some are obvious. Introducing a new tool or putting an existing object to a new use, for instance, are clearly actions which may enhance the performance of existing strategies and permit variations that increase the yield of activity undertaken in that task environment. Less obvious strategies have to do with filtering out hard initial states of a task -- the Just Say No strategy; maintaining the environment in a felicitous condition -- Routine Maintenance; and undertaking exploratory actions -- Scouting Ahead. The point of each of these strategies is to change the expected payoff of actions; it is to deform the task structure to make it easier or cheaper to complete the task successfully. In the end, the value of analyzing activity along task analysis lines will depend on the insight it gives us into the principles which structure behavior. I have suggested that one useful approach is to distinguish actions that are internal to a task from actions that are external to it. These task external actions are special in the sense that they force us, as researchers, to revise our models of behavior. Instead of assuming that most actions occurring in the time frame of a task are part of a strategy for solving the task, we may begin to consider whether some of those actions are external to the strategy, designed specifically to modify the task. If this proves to be a constructive way of looking at human and animal behavior, then evolution may select for both effective behavior control strategies and effective task redesign strategies.

Acknowledgements

It is a pleasure to thank Peter Todd for his excellent comments and generous efforts to help improve this essay. I also thank Eric Welton, Rik Belew, and Matt Brown for their thoughtful commentary. This research is funded by National Institute on Aging grant number AG11851.

Biography

David Kirsh is currently an Associate Professor in the Dept. of Cognitive Science at the University of California, San Diego.
He received his BA from the University of Toronto in Economics and Philosophy and his D.Phil. from Oxford University in the philosophy sub-faculty, doing research on the foundations of cognitive science. He then spent five years at the Artificial Intelligence Lab at MIT, first as a postdoctoral fellow and then as a research scientist. In addition to an enduring interest in foundational issues, he is working on a theory of interactivity in natural and artificial environments.

References