Tuesday, September 19, 2006

Causal learning: how different is it from normal learning?

I was browsing a write-up on Causal reasoning by Mixing Memory, and came across this article by Lagnado et al, regarding the Causal Structure underlying causal reasoning.

In brief, Causal reasoning refers to the human ability to classify some events as causes and others as effects, and to determine, either deterministically or probabilistically, which effects are produced by which causes. In simple words, it is the ability to assign causes to effects.

Historically, research on Causal reasoning has focused on statistical measures of covariation or correlation between two events, using the strength of the correlation to infer and predict the causal relation between them. This approach suffers from several drawbacks, such as the inability to determine the direction of causation and the inability to rule out a third, common cause of which the two observed events are both effects.

Lagnado et al., in their paper, present a refreshing new perspective on causal reasoning by differentiating between the qualitative Causal Structure relating two or more events and the quantitative Causal Strength of that relationship. For example, a causal structure may relate the presence of fever to bacterial infection, identifying bacterial infection as a cause of fever. The causal strength between bacterial infection and fever then determines what probability we assign to a particular case of fever having been caused by bacterial infection (diagnostic learning), or the probability that a person with a bacterial infection will develop fever (predictive learning).
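To make the predictive/diagnostic distinction concrete, here is a minimal sketch using Bayes' rule. The probabilities are made-up numbers chosen purely for illustration; they do not come from the paper, and the function name is my own.

```python
def diagnostic_probability(p_cause, p_effect_given_cause, p_effect_given_no_cause):
    """P(cause | effect) via Bayes' rule, for a single binary cause and effect."""
    # Total probability of seeing the effect, with or without the cause.
    p_effect = (p_effect_given_cause * p_cause
                + p_effect_given_no_cause * (1 - p_cause))
    return p_effect_given_cause * p_cause / p_effect

# Hypothetical numbers, assumed for illustration only.
p_infection = 0.1                 # base rate of bacterial infection
p_fever_given_infection = 0.9     # predictive: infection -> fever
p_fever_given_no_infection = 0.2  # fever from all other causes

predictive = p_fever_given_infection
diagnostic = diagnostic_probability(
    p_infection, p_fever_given_infection, p_fever_given_no_infection
)

print(f"predictive P(fever | infection) = {predictive:.2f}")
print(f"diagnostic P(infection | fever) = {diagnostic:.2f}")  # about 0.33 here
```

The same causal structure (infection causes fever) yields two quite different numbers depending on the direction of the question, which is why strength estimates only make sense once the structure has been assumed.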

The authors contend that the issues involved in causal strength learning and causal structure learning are different and should be addressed differently. Further, they contend that most of the historical research has been limited to causal strength learning, ignoring the prior and more fundamental stage of causal structure learning. In their theory, the causal strength of any relation can only be learned once one has some a priori qualitative assumptions about the underlying causal relationships. Their paper thus focuses on what cues/mechanisms are involved in the formation of the causal structure.

Causal-model theory was a relatively early, qualitative attempt to capture the distinction between structure and strength. According to this proposal causal induction is guided by top-down assumptions about the structure of causal models. These hypothetical causal models guide the processing of the learning input. The basic idea behind this approach is that we rarely encounter a causal learning situation in which we do not have some intuitions about basic causal features, such as whether an event is a potential cause or effect. If, for example, the task is to press a button and observe a light, we may not know whether these events are causally related or not, but we assume that the button is a potential cause and the light is a potential effect. Once a hypothetical causal model is in place, we can start estimating causal strength by observing covariation information. The way covariation estimates are computed and interpreted is dependent on the assumed causal model.


They list the cues that humans use to form their Causal structures as

  • Statistical relations
  • Temporal order
  • Intervention
  • Prior knowledge

Before discussing, in depth, each of these cues and how they may affect causal reasoning, it is instructive to note that the concept of a Causal Structure underlying a given set of phenomena is quite close to the idea of a Cognitive Map underlying a given environment (say the maze or the mouse trap). While the latter is a spatial mental map of the objects in the surrounding 3-D space, the former may be conceived as a causal mental map of events in the temporal dimension. The reason I am using this analogy is to contrast the cues used in formulating a Causal structure with the different learning mechanisms used by mice to form a cognitive map of the mouse trap. The contention is that the same cognitive mechanisms are involved and also that these mechanisms are structured and unfold in a developmentally guided and staged manner.

The first cue to form a Causal structure or link two or more events is that of statistical relations. Here, correlation information between the events, as well as their conditional independences, is used to arrive at a set of Markov-equivalent causal models, that is, models that imply the same pattern of dependence and independence and so cannot be told apart from observational data alone. Much of the learning is associative, probabilistic and maybe latent. It may not be accessible to consciousness, and the learning of causal structure is more implicit than explicit. For example, the regularities in the data may give rise to a fuzzy causal structure, where tentative causal relations are posited. Suppose that, from the data, it is determined that A and B are perfectly correlated. The person will have a strong sense of causation between A and B, but will be unable to determine the direction of causation. Similarly, if three events A, B and C are all correlated, we would not be able to determine the directions of causation among them. This mechanism is very similar to the latent learning mechanism exhibited by the mice in the mouse trap.
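The point that correlation alone cannot settle direction can be checked numerically. The sketch below (my own illustration, with assumed probabilities) builds the joint distribution of two binary events under an "A causes B" model, then refits the very same data as a "B causes A" model. Both factorizations reproduce the joint exactly, so no amount of purely observational data could favor one over the other.

```python
from itertools import product

def joint_a_causes_b(p_a, p_b_given_a):
    """Joint P(A, B) for the model A -> B; p_b_given_a maps a -> P(B=1 | A=a)."""
    joint = {}
    for a, b in product((0, 1), repeat=2):
        pa = p_a if a else 1 - p_a
        pb = p_b_given_a[a] if b else 1 - p_b_given_a[a]
        joint[(a, b)] = pa * pb
    return joint

def refit_as_b_causes_a(joint):
    """Re-express the same joint as P(B) * P(A | B), i.e. the model B -> A."""
    p_b = joint[(0, 1)] + joint[(1, 1)]
    p_a_given_b = {
        b: joint[(1, b)] / (joint[(0, b)] + joint[(1, b)]) for b in (0, 1)
    }
    rebuilt = {}
    for a, b in product((0, 1), repeat=2):
        pb = p_b if b else 1 - p_b
        pa = p_a_given_b[b] if a else 1 - p_a_given_b[b]
        rebuilt[(a, b)] = pb * pa
    return rebuilt

# Assumed parameters, for illustration only.
forward = joint_a_causes_b(p_a=0.5, p_b_given_a={0: 0.1, 1: 0.9})
backward = refit_as_b_causes_a(forward)

max_gap = max(abs(forward[k] - backward[k]) for k in forward)
print("largest difference between the two models:", max_gap)
```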

The second cue to form a causal structure that we consider here is that of Intervention. Here, a person acts on one of the events (the potential cause) and, on the basis of that intervention or exercised choice, experiments to find out what effect that variable has on the outcome (the effect). To define interventions more rigorously, let me quote from the paper.

Informally, an intervention involves imposing a change on a variable in a causal system from outside the system. A strong intervention is one that sets the variable in question to a particular value, and thus overrides the effects of any other causes of that variable. It does this without directly changing anything else in the system, although of course other variables in the system can change indirectly as a result of changes to the intervened-on variable. What is important for the purposes of causal learning is that an intervention can act as a quasi-experiment, one that eliminates (or reduces) confounds and helps establish the existence of a causal relation between the intervened-on variable and its effects.


Suppose A and B have been found to be correlated. Further suppose that the occurrence of events A and B is under the control of the human subject. Then one can intervene to cause A and observe whether B occurs. If it does, the direction of causation is A -> B. On the other hand, if the subject intervenes to make B happen and does not observe A, then one can conclude that B does not cause A. To make the example concrete, consider event A as 'Fire' and event B as 'Smoke'. We find that Fire and Smoke are correlated. By intervening and conducting experiments in which we control the occurrence of 'fire' or 'smoke', we can arrive at the correct causal relation, 'fire' -> 'smoke'.
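The fire and smoke example can be simulated. The following is a rough sketch, assuming a toy world in which fire really does cause smoke; the probabilities, the function names and the `do_fire` / `do_smoke` parameters are all my own inventions for illustration. Setting a variable by intervention overrides its usual causes, in the spirit of the "strong intervention" quoted above.

```python
import random

def sample_world(rng, do_fire=None, do_smoke=None):
    """One observation from a toy world whose assumed true structure is fire -> smoke."""
    # An intervention (do_*) overrides the variable's normal causes.
    fire = rng.random() < 0.2 if do_fire is None else do_fire
    p_smoke = 0.9 if fire else 0.05
    smoke = rng.random() < p_smoke if do_smoke is None else do_smoke
    return fire, smoke

def frequency(rng, variable, n=10_000, **intervention):
    """How often `variable` occurs across n trials under the given intervention."""
    index = {"fire": 0, "smoke": 1}[variable]
    hits = sum(sample_world(rng, **intervention)[index] for _ in range(n))
    return hits / n

rng = random.Random(0)  # fixed seed so the run is repeatable

# Intervene on fire and watch smoke.
smoke_if_fire_lit = frequency(rng, "smoke", do_fire=True)
smoke_if_fire_out = frequency(rng, "smoke", do_fire=False)

# Intervene on smoke and watch fire.
fire_if_smoke_made = frequency(rng, "fire", do_smoke=True)
fire_if_smoke_cleared = frequency(rng, "fire", do_smoke=False)

print("smoke when fire is lit vs put out:", smoke_if_fire_lit, smoke_if_fire_out)
print("fire when smoke is made vs cleared:", fire_if_smoke_made, fire_if_smoke_cleared)
```

Lighting the fire changes how often smoke appears, while producing smoke leaves the rate of fire at roughly its base rate. That asymmetry is exactly what passive observation of the correlation cannot reveal.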

Consider again a three-event situation in which the relation between two candidate causes (A and B) and an outcome (C) has to be ascertained. Specifically, by intervening to cause A at some times and B at others, and observing whether C happens, we can ascertain whether the causal structure is A -> C or B -> C. The situation is not too different from the vicarious trial-and-error learning exhibited by a mouse at a choice point. There, the mouse has to learn, by trial-and-error choosing between right/left or black/white turnings, which stimulus is associated with food (the outcome). Thus, the intervention mechanism is nothing but a refined form of vicarious trial-and-error learning.

The third, and perhaps the most important, mechanism that is used to form the Causal structure is Temporal ordering. This is a very simple mechanism whereby events that are occurring prior to some other event can be the cause of that event, but not vice versa.

The temporal order in which events occur provides a fundamental cue to causal structure. Causes occur before (or possibly simultaneously with) their effects, so if one knows that event A occurs after event B, one can be sure that A is not a cause of B. However, while the temporal order of events can be used to rule out potential causes, it does not provide a sufficient cue to rule them in. Just because events of type B reliably follow events of type A, it does not follow that A causes B. Their regular succession may be explained by a common cause C (e.g., heavy drinking first causes euphoria and only later causes sickness). Thus the temporal order of events is an imperfect cue to causal structure.


This mechanism is the same as the one used by mice in searching for a stimulus. When two events follow each other, an active search mechanism is used to identify the salient stimulus that may have been the cause of the event. The concept of temporal ordering implying causation is inherent in this learning mechanism, as are the concepts of spatial and temporal contiguity and proximity. This is the normal avoidance learning mechanism in mice, and in human causal structure learning it may be more engaged in, and more relevant to, identifying the causes of undesirable events.
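The way temporal order rules candidates out without ruling any in can be sketched in a few lines. The event names come from the drinking example in the quoted passage; the timestamps and the function name are assumptions of mine.

```python
def allowed_directions(event_times):
    """Cause -> effect pairs that temporal order does NOT rule out.

    A later event can never be the cause of an earlier one; simultaneous
    events are left in, since causes may occur together with their effects.
    """
    allowed = []
    for cause, t_cause in event_times.items():
        for effect, t_effect in event_times.items():
            if cause != effect and t_cause <= t_effect:
                allowed.append((cause, effect))
    return allowed

# Hypothetical timestamps (hours), for illustration only.
times = {"drinking": 0, "euphoria": 1, "sickness": 5}
candidates = allowed_directions(times)

for cause, effect in candidates:
    print(f"{cause} -> {effect} is still possible")
```

Note that 'euphoria -> sickness' survives the filter even though both are effects of the common cause 'drinking'. Temporal order has done its job of pruning, but it has not told us which of the surviving links are real.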

The fourth cue for identifying causal structure is one that the authors do not discuss directly, though they hint at it when highlighting the importance of causal mechanisms. I propose it nonetheless: the construction and elaboration of causal chains. This basically involves breaking the simple A -> B link into intermediate and competing events C, D, E, etc., and intervening and conducting experiments to arrive at the correct causal chain. Thus, A -> B may be refined as A -> C -> D -> B or as A -> E -> B, with experimentation used to narrow down to a particular causal chain.

This is similar to the hypothesis learning involved in mice and depends on a cognitive capacity to sequence events. It is also normally exhibited in approach behavior, and this elaboration of the causal chain may be more relevant to desirable outcomes: the ones that human subjects want to happen, together with all the small intermediate steps they need to cause in order to make the final outcome happen.
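Narrowing down competing causal chains by experiment can be sketched as a simple elimination procedure. This is only my own toy formalization of the idea, not anything from the paper: it assumes each chain is a single path from A to B, that an experiment consists of blocking some intermediate events, and that the "true" chain is A -> E -> B.

```python
def predicts_effect(chain, blocked):
    """A chain carries A through to B only if none of its intermediate links is blocked."""
    return all(node not in blocked for node in chain[1:-1])

def surviving_chains(candidates, true_chain, experiments):
    """Keep only the candidate chains whose predictions match every observed outcome."""
    survivors = list(candidates)
    for blocked in experiments:
        # What the world actually does when these events are blocked.
        observed = predicts_effect(true_chain, blocked)
        survivors = [c for c in survivors if predicts_effect(c, blocked) == observed]
    return survivors

# Two competing hypotheses about how A brings about B.
candidates = [("A", "C", "D", "B"), ("A", "E", "B")]
true_chain = ("A", "E", "B")   # assumed ground truth, hidden from the learner
experiments = [{"C"}]          # block C and see whether B still happens

survivors = surviving_chains(candidates, true_chain, experiments)
print("chains still consistent with the experiments:", survivors)
```

Blocking C leaves B intact in this toy world, so the chain running through C and D is discarded and only A -> E -> B remains.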

The fifth, and for now final, cue that is used in the formation of causal structure is prior knowledge. The authors define it as follows:

Regardless of when we observe fever in a patient, our world knowledge tells us that fever is not a cause but rather an effect of an underlying disease. Prior knowledge may be very specific when we have already learned about a causal relation, but prior knowledge can also be abstract and hypothetical. We know that switches can turn on devices even when we do not know about the specific function of a switch in a novel device. Similarly we know that diseases can cause a wide range of symptoms prior to finding out which symptom is caused by which disease. In contrast, rarely do we consider symptoms as possible causes of a disease.


My take on prior knowledge is close to that, but slightly different. The subject forms a general idea of which events are causes and which are effects, and also of the general relationship between a primary cause and a desired or undesired final outcome. Even though the small intervening steps of the causal chain may not be known, and thus no formal, data-based corroboration may exist, one can still deduce the causal relationship between the primal cause and the later final outcome, ignoring the minor intermediate events down the line. A case in point is food aversion learning, whereby a single episode of vomiting, following the consumption of, say, spoiled food eaten hours earlier, may result in a strong, automatic association that marks that food as the cause of the vomiting and leads to avoidance of (or escape from) that food.

To me this mechanism is the same as that exhibited by the mice when they learn the spatial orientation in the mouse trap and are able to exhibit novel escape learning.

This summarizes the analogy between causal learning and normal learning as of now. I will touch on the next three, qualitatively different, (causal) learning mechanisms later.
