TITLE: A Theorem for Prediction
AUTHOR: Jack Zlotnick
A collection of articles on the historical, operational, doctrinal, and theoretical aspects of intelligence.
All statements of fact, opinion or analysis expressed in Studies in Intelligence are those of the authors. They do not necessarily reflect official positions or views of the Central Intelligence Agency or any other US Government entity, past or present. Nothing in the contents should be construed as asserting or implying US Government endorsement of an article's factual statements and interpretations.
The application of probability mathematics to predictive estimates and its disciplinary potential.
A THEOREM FOR PREDICTION
Jack Zlotnick
Philosophy, wrote critic and educator Mortimer Adler, is the process of entertaining any idea as merely possible. This act of tentative acceptance is the good beginning in intelligence analysis. The end is the correct evaluation of the several hypotheses' merits.
Seldom is the evidence so determinative as to clinch the case for a single hypothesis. Usually, as it accumulates, it only changes the position of one hypothesis or another on the probability scale. An attack is more likely or less likely today than it was a week ago. A Sino-Soviet break in diplomatic relations is more probable or less probable now than [...]. It is becoming more doubtful or less doubtful that the Labor government's position against pound devaluation can withstand the next speculative run on sterling.
Since intelligence judgments are so often probabilistic, does it follow that the mathematical theory of probability offers intelligence valid pointers on logical method? Promising research with relevance to this question, some of it government-financed, has been done by psychology faculties in university laboratories. The main aim of the psychologists has been to compare intuitive judgments about hypotheses with the results that would be given by a mathematical model based on probability theory. Borrowing from these experiments, CIA's Office of Current Intelligence in the summer of 1967 began a mathematical simulation of predictive intelligence analysis in crisis situations of recent history.
The mathematical model derives from an equation, familiar to students of probability theory, named after Reverend Thomas Bayes, who first formulated it in the eighteenth century. The following explanation of Bayes' Theorem does not require mathematical sophistication of the reader; it assumes only that his learning blockages do not include an ingrained antipathy to any kind of numerative idea.
A good entry point for the discussion is the concept of probability as it is used in mathematics. In the absence of certainty, the probability that an event will occur (or has occurred, if past occurrence is the matter at issue) is a decimal or fractional value between zero and one. Thus the probability is 0.7 that a red poker chip will be selected in a random drawing from a box containing ten chips, seven red and three blue. A rational gambler would give no more than [...] for a raffle ticket that paid [...] upon the random drawing of a red chip from this box.
In the idiom of wagers, the term odds is often used instead of probability. The odds favoring the random selection of a red chip over the random selection of a blue one set the probability of the first event against the probability of the second. The odds of seven to three in this case are represented mathematically as the fraction obtained by dividing the 0.7 probability of a red poker chip by the 0.3 probability of a blue one.
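The relation between probability and odds described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and exact fractions are used so that the seven-to-three figure comes out without rounding.

```python
from fractions import Fraction


def probability_to_odds(probability: Fraction) -> Fraction:
    """Odds in favor of an event: its probability divided by the probability of its complement."""
    if not 0 <= probability < 1:
        raise ValueError("probability must be at least 0 and less than 1")
    return probability / (1 - probability)


def odds_to_probability(odds: Fraction) -> Fraction:
    """Inverse conversion: odds of a to b correspond to a probability of a / (a + b)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)


# Ten chips, seven red and three blue, as in the text's example.
red_probability = Fraction(7, 10)
red_odds = probability_to_odds(red_probability)
print(red_odds)  # 7/3, i.e. seven to three in favor of red
```

Converting back with `odds_to_probability(red_odds)` returns the original 7/10, which is a quick check that the two idioms carry the same information.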
New evidence changes a gambler's estimate of probabilities or odds. Suppose there are two large boxes filled with red and blue poker chips. In one the ratio of red chips to blue is 60-40; in the other it is 40-60. One of the boxes is set before a gambler, but he is not told which. He can therefore give no better than even money that its color mix is predominantly red or blue. Allow him to draw some of the chips, however, and he will then make a more confident choice between the two color-mix possibilities. The more chips he draws, the better the odds he will offer in favor of this choice.
This is precisely the setting of recent laboratory experiments at the University of Michigan and other centers. College students, serving as the test subjects, were required to give their gambler's judgments of the odds after successive drawings of poker chips, and these judgments were compared with the odds obtained by using Bayes' Theorem.
In more simplified notation than is commonly used in the textbooks, the equation of Bayes' Theorem can be written:

R = PL
R, standing for revised odds, represents the odds favoring one hypothesis over another after consideration of the latest evidence (in this case, the color of the poker chip most recently drawn). P stands for the prior odds, those prevailing before this evidence turned up. L, the weight of the evidence that changes the odds, stands for likelihood ratio (referred to sometimes in the literature as Bayes' Factor).
The likelihood ratio compares the probabilities of the occurrence of an event under alternative hypotheses. Suppose the evidence in the poker chip experiment is the selection of a red chip on the first drawing. There is a 0.6 probability of this happening under the hypothesis that 60 percent of the chips in the box are red. There is a 0.4 probability of its happening under the hypothesis that the drawing is from the other box, where only 40 percent of the chips are red. So the likelihood ratio for the occurrence of this red chip is 0.6 divided by 0.4, or 3/2.
The prior odds, 1/1 (even money), are multiplied by 3/2 to get the revised odds, 3/2, after the first drawing. The revised odds then become the prior odds on the second drawing, and so on. Suppose the gambler draws [...] red and [...] blue poker chips in the first [...] drawings, replacing the chip in the box after each drawing. Calculation will show that he could give better than [...] odds in favor of the hypothesis that he has been drawing from the box with the 60-40 red-blue color mix. If the first hundred drawings are [...] red and [...] blue he could give well over [...] odds in favor of this hypothesis.
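The drawing-by-drawing revision can be sketched as a short Python routine. It assumes the two boxes hold 60 percent and 40 percent red chips, as the 0.6 and 0.4 probabilities in the example imply, and it applies the rule that revised odds equal prior odds times the likelihood ratio. The function names and the particular draw counts (six red and four blue in ten drawings) are illustrative assumptions, not figures taken from the article.

```python
from fractions import Fraction

# Assumed chance of drawing red from each box (implied by the text's 0.6 and 0.4).
P_RED_MOSTLY_RED_BOX = Fraction(6, 10)
P_RED_MOSTLY_BLUE_BOX = Fraction(4, 10)


def likelihood_ratio(color: str) -> Fraction:
    """Probability of the drawn color under the mostly-red hypothesis, divided by
    its probability under the mostly-blue hypothesis."""
    if color == "red":
        return P_RED_MOSTLY_RED_BOX / P_RED_MOSTLY_BLUE_BOX
    if color == "blue":
        return (1 - P_RED_MOSTLY_RED_BOX) / (1 - P_RED_MOSTLY_BLUE_BOX)
    raise ValueError(f"unknown chip color: {color!r}")


def revise_odds(prior_odds: Fraction, draws: list[str]) -> Fraction:
    """Apply R = P * L once per drawing; each revised figure is the next prior."""
    odds = prior_odds
    for color in draws:
        odds = odds * likelihood_ratio(color)
    return odds


# Illustrative sequence only: six red and four blue chips, drawn with replacement.
illustrative_draws = ["red"] * 6 + ["blue"] * 4
final_odds = revise_odds(Fraction(1, 1), illustrative_draws)
print(final_odds)  # 9/4, a little better than two to one
```

Because each red chip multiplies the odds by 3/2 and each blue chip by 2/3, only the excess of red over blue matters in this setup; the order of the drawings does not.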
Significance for Intelligence
He could and he would if he reasoned like a mathematician and had the capital to finance many wagers of this sort. Otherwise he would probably shrink from the degree of certainty implied by such high odds. The students in the University of Michigan experiments did give more confident odds the more drawings they had to go on. They did not, however, move as far from their original one to one odds as Bayes' Theorem would have justified. They did not, in other words, make the most of their inconclusive data. Like intelligence estimators in some parallel situations, they hesitated to move very far very fast from prior norms.
Similar overly conservative estimates were obtained in University of Michigan experiments simulating intelligence analysis. A set of six hypotheses was set before the test subjects, five of different imminent war situations and a sixth of peace. A scenario of events provided successive increments of evidence bearing on these hypotheses. For each increment the test subject gave five likelihood ratios expressing his opinion of how much more likely the event would be under each of the war hypotheses than under the peace hypothesis.
The test subjects of course differed among themselves in their estimates of the proper likelihood ratios. But the most significant finding of the experiment was that their conclusions were not consistent with their own readings of the evidence. Like the students in the poker chip experiments, those working with intelligence scenarios were very conservative in their final estimates. When their likelihood ratios implied, according to Bayes' Theorem, odds of [...] to 1 in favor of a war hypothesis, their own blend of intuition and reasoning resulted typically in odds of [...]. When the evidence changed and mathematical calculations would have given odds of [...] favoring peace, they came up with odds of [...] to [...].
What Bayes' Theorem thus does for intelligence is to offer a test for internally consistent analysis. The rigor of mathematical logic is no indispensable aid when analysis is largely deductive, proceeding from such general propositions as "The USSR knows how dangerously provocative would be its shipment of strategic missiles to Cuba." The instructed intellect's naked eye, so to speak, is keen enough to follow the thread of deductive thought and to detect the more tenuous strands of the argument.
The case for mathematical assistance is stronger when analysis is a process of inductive inference, proceeding not from a few general propositions but from many particulars. Mere verbal logic is then less likely to ensure against fallacy and non sequitur. Intelligence on such occasions is well advised by Francis Bacon's injunction that "the mind itself be from the very outset not left to take its own course but guided at every step; and the business be done as if by machinery." Bayes' Theorem is the kind of mechanistic aid to the intellect that Bacon here idealized.
Using this aid, the intelligence analyst does not address himself directly to the merits of hypotheses. His procedures for estimation require him to postulate, not debate, the truth of opposing hypotheses. Bayes' Theorem thus helps him get around one of his most troublesome problems, the human tendency to hold fast to his prior estimate when uncommitted opinion would go along with a change. And it helps spare the estimator the labor of fighting other biases besides his own.
The Reliability Problem
In the university experiments the test subjects were in no doubt about the color of each chip they drew; nor did they have to question the evidence set before them in the intelligence scenarios. The CIA experiment, however, added a probability element to reflect the frequent uncertainties in the workaday intelligence world about the accuracy of reports from the field. The result was a modification of the Bayesian equation.
The modified equation was worked out by analogizing from the poker chip experiments. Suppose that the test subject, instead of drawing poker chips out of the box himself, turns his back and gets his information, sometimes accurate and sometimes not, from an assistant. Suppose also that he has some reasonable basis for estimating the probability of correct reporting, perhaps the assistant's past record.
Call this probability of correct reporting the reliability rating. A [...] percent reliability rating would mean that [...] percent of the reports with this rating are true, in the rater's opinion, and the other [...] percent are false.
False reports are of two kinds. One is bereft of any corresponding fact, the utter fabrication for example. Such a report would be the assistant's announcement of a red poker chip when he had actually picked nothing at all out of the box. If the report has a probability of being false in this sense, the required modification of the equation is only to make the reliability rating (r) an exponent of the likelihood ratio:

R = PL^r
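The rule just stated, with the reliability rating serving as an exponent of the likelihood ratio, can be sketched in Python as revised odds = prior odds × (likelihood ratio) ** r. The function name and the sample figures are hypothetical; the only thing taken from the text is the exponent rule itself.

```python
def revise_odds_with_reliability(prior_odds: float,
                                 likelihood_ratio: float,
                                 reliability: float) -> float:
    """Revised odds when the report may be a fabrication.

    reliability is the estimated probability (0 to 1) that the report is true.
    It is applied as an exponent of the likelihood ratio, per the rule in the text.
    """
    if prior_odds <= 0 or likelihood_ratio <= 0:
        raise ValueError("odds and likelihood ratio must be positive")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return prior_odds * likelihood_ratio ** reliability


# Hypothetical figures: even prior odds and a likelihood ratio of 4.
print(revise_odds_with_reliability(1.0, 4.0, 1.0))  # 4.0: a fully trusted report counts in full
print(revise_odds_with_reliability(1.0, 4.0, 0.5))  # 2.0: a half-trusted report counts for less
print(revise_odds_with_reliability(1.0, 4.0, 0.0))  # 1.0: a worthless report leaves the odds alone
```

The two limiting cases are what make the exponent form attractive: full reliability recovers the unmodified equation, and zero reliability leaves the prior odds untouched.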
The second kind of false report is one which deliberately or inadvertently confuses one event with another, for example the assistant's announcement of a red chip when it was in fact blue. For reports estimated to have a probability of being false in this sense, the required modification of the equation becomes perhaps too involved to explain in a non-mathematical journal, but the mathematics is not really difficult.
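The article does not give its equation for this second kind of false report, so the following Python sketch should not be read as the author's formula. It shows one standard way to handle the poker chip case, under the assumption that the assistant names the true color with probability r and the opposite color otherwise. The 0.6 and 0.4 chances of red and all function names are assumptions made for illustration.

```python
from fractions import Fraction


def probability_report_says_red(p_red: Fraction, reliability: Fraction) -> Fraction:
    """Chance the assistant announces red: a true red reported correctly,
    or a true blue reported as red by mistake."""
    return reliability * p_red + (1 - reliability) * (1 - p_red)


def likelihood_ratio_for_red_report(p_red_h1: Fraction,
                                    p_red_h2: Fraction,
                                    reliability: Fraction) -> Fraction:
    """Likelihood ratio of a 'red' announcement under hypothesis one versus hypothesis two."""
    return (probability_report_says_red(p_red_h1, reliability)
            / probability_report_says_red(p_red_h2, reliability))


# Assumed boxes: 60 percent red under hypothesis one, 40 percent red under hypothesis two.
h1, h2 = Fraction(6, 10), Fraction(4, 10)
print(likelihood_ratio_for_red_report(h1, h2, Fraction(1)))      # 3/2: perfect reporting
print(likelihood_ratio_for_red_report(h1, h2, Fraction(4, 5)))   # 14/11: some confusion
print(likelihood_ratio_for_red_report(h1, h2, Fraction(1, 2)))   # 1: the report tells nothing
```

Under this assumption an assistant who is right only half the time supplies no evidence at all, since his announcement is equally probable whichever box is on the table.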
The problem of the reliability rating does not enter into all kinds of evidence. Reliability ratings are unimportant for much of the evidence received through technical collection. Nor are they necessary in intelligence appraisals of propaganda evidence, provided the analysis turns on the reasons why statements were made rather than on their truth or falsity. But the problem may well loom large in the event of garbles from technical collection and in the evaluation of reports received from human sources; and so the analyst must be at special pains to understand the very restricted meaning of the rating. It is in no way affected by the content of a report but represents only an appraisal of source reliability, insofar as one can be made on the basis of such considerations as the amount of cloud cover in photography or the past record of clandestine human reporters. The pitfall to skirt with utmost care is the reliability rating that is nothing better than the analyst's prejudgment about the hypotheses. If his P or his R, in other words, affects his r, the analyst can find himself in a circular rut from which no mathematics can rescue him.
In real-life intelligence analysis perhaps no analyst can altogether separate his biases about the hypotheses from his appraisals of source reliability. When the credibility of some item of evidence is crucial for final conclusions, therefore, the analyst had best make a detour around the reliability issue. A case in point is the Cuban refugee report that alleges the sighting of strategic missiles near Havana. The intelligence estimator examining the hypothesis of imminent strategic missile shipments from the USSR to Cuba can hardly assign a reliability rating to this report. If he did, he would probably be putting into his analysis a judgment about credibility that is in effect the answer he wants to get out of his analysis.
To exclude altogether this refugee report and others like it from his body of evidence, however, would put the estimator into the untenable position of giving no more weight to a hundred such reports than to one. His recourse is to appraise such reports much as he appraises propaganda evidence, eschewing judgment about truth or falsity. His likelihood ratio then represents only his opinion of how much more likely it is that unsubstantiated evidence of this sort would appear under the hypothesis of strategic missile shipments than under another hypothesis. This way out of the difficulty is admittedly not the most elegant of solutions, and possibilities of other methodological options are being explored.
The Cuban Missile Estimate
One test of the CIA mathematical model, a simulation of analysis just before the Cuban missile crisis, has been completed. Two estimative exercises were simulated. One is an estimative study as of [...], when a National Intelligence Estimate on Cuba was in fact published. The other is an estimative review as of three weeks later.
The analysis sets up two mutually exclusive hypotheses. Hypothesis one is that the USSR will soon ship strategic missiles (MRBM, IRBM, or ICBM) to Cuba. Hypothesis two is that the USSR will not go so far as to ship strategic missiles, despite the sharp upsurge of military aid to Havana in the summer of 1962. The task calls for estimation of the odds favoring hypothesis one over hypothesis two.
The background of the missile crisis reaches back at least to [...], when a visit to Cuba by Soviet First Deputy Premier Mikoyan ended the year of Soviet reserve that followed Castro's seizure of power. In the wake of Mikoyan's visit, several economic assistance agreements were signed and Soviet deliveries of armaments commenced, giving the Cubans armored, artillery, anti-aircraft, and anti-tank capabilities appropriate for defensive and internal security purposes. The Soviets withheld the obsolescent [...] jet light bombers and more advanced weapons that they were supplying to other countries.
Up to [...] 1962, more than [...] agent and refugee reports alleged the presence of missiles in Cuba. Aerial photography failed to confirm any of these reports. The Soviet Union to this point had not shipped strategic missiles to any foreign country, Communist or non-Communist.
This background information is useful only for establishing the starting odds. As of [...], one to ten odds are postulated in favor of hypothesis one (in everyday parlance, ten to one against). The mathematical analysis then proceeds to determine and apply likelihood ratios and reliability ratings for the evidence appearing from [...] 1962 on. This process, carried out in 1967 with 1962 evidence, produces three to one odds as of [...] 1962 against Soviet emplacement of strategic missiles in Cuba.
The mathematical calculations in 1967 thus support the estimate published in [...] 1962. They also show, however, that the odds are changing rapidly in favor of the strategic missile hypothesis. The fall in odds against the hypothesis accelerates, and by the end of the first week in October enough new evidence is in hand to make strategic missile emplacements an even money bet.
The Shape of Evidence
A pioneering experiment is often as interesting for the problems encountered as for the results achieved. The principal technical problem encountered in this test trial with Bayesian method was the identification of units of evidence. In the poker chip experiment there is no doubt about the unit of evidence; it is the drawing of a poker chip of a particular color. The intelligence analyst, however, receives many reports of events. Can he make each report rather than each event his unit of evidence?
The answer is no; at least it is negative for the mathematical model used in the Cuba test. Other models may be developed, but this one can tell only the significance that events, not reports, have for hypotheses. To take reports as units of evidence would be to overweight events on which volume of reporting is high and underweight those on which it is low.
Several reports about the same event are therefore treated in effect as one, and volume of reporting is reflected only in the analyst's reliability ratings. These ratings represent the probability in the analyst's mind, in the light of all the reports available to him, that his evidence is accurate.
But an event, like an atom, is made up of smaller particles, and the analyst needs a working rule of reason to guide him in his segmentation of the evidence. The rule is to combine items of evidence so clearly associated in content that separate appraisals would virtually be double counting. Successive photography showing progress in the construction of a surface-to-air missile site can be taken as a single unit of evidence on the operational status of the site. Broadcasts on the same propaganda theme can logically be counted as one unit of evidence rather than entered broadcast by broadcast into the mathematical processing. The following two extracts from the simulated Cuba analysis illustrate the burden on the analyst to combine his evidence fairly. The italicized heading names the unit of evidence; the relevant reports are then described; the unit is appraised and given a likelihood ratio, an estimate of how much more (or less) likely it is that the unit would appear if hypothesis one (strategic missiles) is true than if hypothesis two (no strategic missiles) is right.
Cuban-Soviet Frictions: On [...] March, veteran Cuban Communist Anibal Escalante was ousted from party leadership. Soviet press commentary in April endorsed the removal of Escalante but also called for an end to divisions among Cuban revolutionaries. The commentary emphasized the virtues of collective leadership. The implication of the commentary was that the USSR was disturbed by the setback suffered by its proteges in Havana.
A [...] report from clandestine services, originating with a normally reliable Paris source, has Castro saying privately that he wanted to stay independent of [...]. Castro reportedly said he felt surrounded by orthodox Communists who would resort to anything to obtain control in Cuba, [...] temporary arrangement with Washington.
Fidel Castro's brother Raul, deputy premier and minister of armed forces, arrived in Moscow on [...] July. He was met at the airport by Marshal Malinovsky, the Soviet defense minister. Raul departed on [...] July without fanfare or final communique. This lack of red-carpet farewell suggested he did not get what he wanted out of the Soviets.
[...] allies. The frictions are evaluated as unlikely, given the assumption that the Soviets are about to ship strategic missiles to Cuba. On the other hand, the frictions seem little more likely under an alternative hypothesis that assumed sharply expanded military aid of any other sort. The evidence, therefore, carries only slight diagnostic value for contradicting the hypothesis of imminent strategic missile shipments to Cuba. Likelihood Ratio: [...] Reliability Rating: [...]

Hints of New Cuban Capabilities: A clandestine services report in July, sourced to a fairly reliable Cuban businessman with good contacts among Castro adherents, described Cuban naval officers as pessimistic about Cuban capabilities to repel a new invasion. Cuban army officers were said to agree but to feel that the principal danger would be over by September.
In another report, a knowledgeable Cuban was quoted as saying that the US was afraid to interfere with Soviet-flag vessels but "in September the Americans will also respect the Cuban flag."
At one point, the Cuban (Che Guevara according to one account) referred to the NATO nations' belt of bases surrounding the Soviet Union. He was reportedly "livid" as he added that "in September Cuba is going to be the buckle in this belt."
Simulated [...] 1962 Appraisal: The allusion to NATO bases implies a development consistent with the assumption of strategic missile installations in Cuba. The allusion could also have been expected, though less probably, given the assumption of expanded military aid to Cuba that stopped short of strategic missile emplacements. The accuracy of the reports is open to question. Likelihood Ratio: [...] Reliability Rating: [...]
As these two extracts indicate, the telescoping of reports sharply reduces the number of units of evidence available for mathematical processing. The reduction gravely complicates the analyst's task. The reason is that Bayesian analysis takes off from starting odds which may be more intuitive than grounded in evidence. If many units of evidence are available, these should in time outweigh the influence of the starting odds. The rub comes when there are not many units of evidence. The prospect is then that starting odds rather than evidence will constitute the predominating influence on the final odds.
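The point about starting odds can be illustrated with a small Python sketch. The one-to-ten starting odds echo the figure postulated in the Cuba test; the alternative starting odds of one to two and the uniform likelihood ratio of 3/2 per unit of evidence are purely hypothetical, chosen only to show the effect.

```python
from fractions import Fraction
from math import prod


def final_odds(starting_odds: Fraction, likelihood_ratios: list[Fraction]) -> Fraction:
    """Starting odds multiplied by the likelihood ratio of every unit of evidence."""
    return starting_odds * prod(likelihood_ratios, start=Fraction(1))


skeptical_start = Fraction(1, 10)   # one to ten, as postulated in the Cuba test
moderate_start = Fraction(1, 2)     # hypothetical alternative starting point

few_units = [Fraction(3, 2)] * 3    # hypothetical: three mildly favorable units
many_units = [Fraction(3, 2)] * 20  # hypothetical: twenty such units

# With few units, the two analysts land on opposite sides of even money.
print(final_odds(skeptical_start, few_units))   # 27/80, still against
print(final_odds(moderate_start, few_units))    # 27/16, now in favor

# With many units, both favor the hypothesis whatever their starting point.
print(float(final_odds(skeptical_start, many_units)) > 1)  # True
print(float(final_odds(moderate_start, many_units)) > 1)   # True
```

In the three-unit case the conclusion turns on the analyst's intuitive starting point rather than on the evidence, which is the difficulty the text describes.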
The Cuba test suggests that this problem will bedevil intelligence more often than not. Intelligence collection during the Cuban military buildup was massive, but the evidence touched on comparatively few subjects. The opportunities to increase or reduce the starting odds did not come thick and fast, and the analyst would be prudent to offer his choice of hypothesis with considerable reserve. Perhaps the best he could do would be to say how much the evidence had shifted the odds since the starting date of his analysis. While this interpretation might not justify confident predictions, it could alert policymakers to the implications of recent developments.
Working with the Bayesian model, intelligence is not a blend of deduction, insight, and inference from the body of evidence as a whole. It is a sequence of explicit judgments on discrete units of evidence. Bayesian analysis can carry conviction only if the evidence itself is persuasive. The analysis cannot apply the additional dialectic leverage of well-reasoned generalization cast in finely finished phrase.
This necessity to work from a hard base of evidence limits the prospective usefulness of Bayesian method. Current evidence in many situations carries little weight for longer-term estimation. Even for short-term prediction, the base of available evidence may be too narrow a foundation to support by itself the estimative structure that intelligence must often put together for the high councils of government. A forecast of foreign reaction to a postulated course of US action usually has some evidence to go on but not much, at least not until the United States gets nearer the decision to take the postulated action.
Would it be worth while, then, for estimates of the future to include such an interpretive tabulation of all units of evidence as in the Cuban simulation? There is much to be said for requiring such a tabulation in all cases. Bayesian method is helpful not only for its rules to assure valid induction but also for its duress on the analyst to separate fact from opinion. Even if the analyst does not follow through with mathematical processing, his analysis should be the better for his labor in poring over details of evidence and for the resulting higher level of explicitness in his working materials. Should the tabulation of relevant evidence be embarrassingly short, both analyst and reader are alerted to the weakness of the evidential base and to the pivotal position of a priori judgment in the estimate.
To argue for evidence, however, is to knock on an open door. Everyone would like to appeal to the verdict of evidence. The deeper doubts are not about the virtues of evidence but about the practicality of representing evidence with mathematical precision. It is one thing to work with probabilities of drawing a red poker chip from a box with a given color mix of chips. Is it not quite another thing to work with likelihood ratios and reliability ratings that are personal opinions about the probabilities? The underlying data in the one case are objective counts, and all the experts are agreed on the rules for assigning probability values to such data. In the other case, the probabilities are subjective judgments and tentative besides. If the intelligence analyst says that an event is twice as likely to happen if one hypothesis is true than if another hypothesis is true, does he really want that figure to be taken literally? And if he says the chances are only four out of ten that a source is reporting accurately, does he want precisely this opinion about the source and no other to count in the basis of his final conclusions?
The question is almost its own answer. The likelihood ratios and reliability ratings do no more than suggest roughly how the analyst is weighing evidence in his own mind. Mathematical processing in real-life intelligence analysis ought not, therefore, to restrict itself to one set of likelihood ratios and reliability ratings. It should rather involve several passes over the evidence with different sets of figures.
The processing would thus show the sensitivity of final conclusions to variation in appraisals of the evidence. Suppose one or two mixes of likelihood ratios and reliability ratings led to a conclusion that contradicted those given by the other passes over the evidence or that contradicted the intelligence consensus reached by conventional methods. It should then be incumbent on the analysts to determine the reason for this contradictory conclusion. They might decide in the end to rule against it on the ground that it was based on unreasonable weighting of the evidence. But if they felt the weighting was not beyond the bounds of reason, they might decide to rethink the whole subject.
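The idea of several passes over the evidence can be sketched in Python as follows. Each pass is one set of (likelihood ratio, reliability rating) pairs, and the odds are revised with the reliability rating as an exponent of the likelihood ratio, as in the modified equation. Every figure and name below is hypothetical; the sketch only shows how a dissenting pass might be singled out for scrutiny.

```python
from math import prod


def final_odds(starting_odds: float, appraisals: list[tuple[float, float]]) -> float:
    """Starting odds times L ** r for each (likelihood ratio, reliability) appraisal."""
    return starting_odds * prod(ratio ** reliability for ratio, reliability in appraisals)


def run_passes(starting_odds: float,
               passes: dict[str, list[tuple[float, float]]]) -> dict[str, float]:
    """Final odds for each named set of appraisals."""
    return {name: final_odds(starting_odds, appraisals)
            for name, appraisals in passes.items()}


def dissenting_passes(results: dict[str, float]) -> list[str]:
    """Names of the passes on the minority side of even money
    (on a tie, the passes that do not favor the hypothesis are returned)."""
    favoring = [name for name, odds in results.items() if odds > 1.0]
    opposing = [name for name, odds in results.items() if odds <= 1.0]
    return favoring if len(favoring) < len(opposing) else opposing


# Three hypothetical appraisals of the same three units of evidence.
passes = {
    "cautious": [(1.2, 0.5), (2.0, 0.5), (0.8, 1.0)],
    "central": [(1.5, 0.7), (3.0, 0.6), (0.9, 1.0)],
    "bold": [(3.0, 0.9), (6.0, 0.9), (1.0, 1.0)],
}
results = run_passes(0.1, passes)
for name, odds in results.items():
    print(f"{name}: {odds:.2f}")
print(dissenting_passes(results))  # ['bold']
```

Here only the "bold" weighting carries the odds past even money, so it is the one the analysts would have to examine and either rule out as unreasonable or take as grounds for rethinking.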
Mathematical processing will not become an alternative to present methods of intelligence analysis. It will be a reliability check on present methods. It will help show the plausibility of conclusions which the intelligence analyst would not otherwise recognize as consistent with the evidence and his own inner logic. It will tell the analyst: if you interpret the evidence in this way, then here is the conclusion you should probably reach. Often the mathematics will be persuasive.