A Model for Forecasting Soviet Grain Yields
For Official Use Only
TABLE OF CONTENTS
Formulation of the Model
Factors Influencing Yields
Availability of Data
The Prediction Model
The Pattern of the Regression Coefficients
Evaluation of the Model
Description of the Data Base
Coefficient Signs of Weather
USSR: Crop Regions and Regional Groupings
A MODEL FOR FORECASTING SOVIET GRAIN YIELDS
For many years, this Office has assessed the progress of the Soviet grain crop during the growing season. Independent assessments are necessary mainly because the USSR Ministry of Agriculture provides only the most general situation reports during the growing season. Crop results are reported several months after the completion of the harvest. The tautness of the world food supply and the wide swings in Soviet grain production have increased the need for timely forecasts of Soviet grain output.
This publication documents the development of a model to predict grain yields in crop regions (see the map). The predicted yields, based on time trends and a composite index of several weather variables, are combined with reported data on sown area to obtain crop estimates. With the help of the model, crop estimates can be made as early as April and revised to take account of additional information until the harvest is completed.
The model's structure, computer programs, and weather data base can be applied to any crop for which yield data are available. So far, the model has been used to forecast yields for all grain, winter wheat, and spring wheat. To estimate the Soviet grain crop, estimation of sown area and adjustments to reflect collateral information are also used.
A weather-yield model has proved to be useful in estimating Soviet crop yields. The predicted grain harvest was above the actual harvest. The model can produce reasonably reliable estimates early in the season and can then be revised every ten days as new weather data are received.1
Data on production, sown area, and yields have been collected for all grain, winter wheat, spring wheat, winter rye, and spring barley. A weather data base covering the period1 to the present has also been assembled for use with the model. Relying on this data base, the model uses a weather index (the influence of temperature and precipitation) and a time trend (technological change) as explanatory variables in a set of independent linear equations that predict crop yields.
* In this publication, the official Soviet definition of grain is used: wheat, rye, barley, oats, corn, rice, millet, buckwheat, and pulses.
The model was developed in3 and formed the basis of CIA estimates for that year. The model was modified slightly. This publication is limited to the current model, and all data refer to it unless otherwise noted.
The predictions generated by the model seem reasonable. The USSR has announced a total grain harvest and two republic totals, for the RSFSR and for the Ukraine. The model's final prediction (end of August) for the USSR was close to the total announced. In September, Politburo member Brezhnev said the crop would be "not bad" but admitted that areas in Siberia and Kazakhstan were having trouble. The announced crop total confirms these indications.
The prediction also seems reasonable in terms of its development over time and its forecast of regional trends. The prediction increased sharply during May and June, when the weather was favorable, and then declined as the weather deteriorated. The final predictions for the RSFSR and the Ukraine are slightly higher than the reported harvests. The prediction for the remaining portion of the USSR is below the actual harvest.
Although the model describes the historical data quite well, prediction errors can be significant. Tests show that some error in the predicted national yield should be expected. An error in estimates of the sown area can either compound or mitigate the effect of errors in predicted yields.
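Because production is the product of yield and sown area, the two errors combine multiplicatively. The following minimal Python sketch illustrates the point; the function name and the percentages are hypothetical and are not taken from the model's results.

```python
def production_error(yield_error, area_error):
    """Relative error in production given relative errors in yield and sown area.

    Production = yield * area, so the relative errors combine multiplicatively.
    Errors are fractions (0.05 means the estimate is 5% too high).
    """
    return (1.0 + yield_error) * (1.0 + area_error) - 1.0


# Hypothetical figures, for illustration only.
compounding = production_error(0.05, 0.02)    # both estimates too high
offsetting = production_error(0.05, -0.02)    # errors in opposite directions
```

In this illustration the two errors compound to about 7.1% when they have the same sign and partly cancel to about 2.9% when they have opposite signs.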
FORMULATION OF THE MODEL
Factors Influencing Yields
Weather
The importance of weather in determining grain yields is clear, but the causal relationships are difficult to define. Many individual factors affect the final yield. Moisture, for example, influences the number of plants per hectare, the number of stalks per plant, leaf area, the number of heads per plant, the number of kernels per head, and the weight per kernel. Temperatures speed up or curb growth and change the water balance; extreme temperatures may injure or kill the plant.
Other weather variables that influence yields include sunlight, humidity, hail, and wind. In particular, the hot sukhovey (dry winds) prevalent in the eastern grain-growing regions of the USSR greatly increase evaporation and at times even beat down the crop. Weather variables also affect yields indirectly because of the importance of good weather in fieldwork at planting and harvest times and because of their role in encouraging or retarding plant diseases, parasites, and weeds.
Specific patterns of interaction between weather variables and the requirements of plants (and hence yields) vary considerably according to the peculiarities of local climates. Nonetheless, certain common factors are important. In the early stages of growth, plants need at least a continuous supply of moisture in the upper layer of the soil. After tillering, which normally occurs a little less thanays after sowing, the period of rapid growth occurs. As stems and leaves develop, consumption of water by the plants increases greatly. The heading stage, which is critical to grain yields, usually
The data represent gross grain, that is, grain obtained from the harvesting machine in the field, including moisture, unripe and damaged kernels, weed seeds, and the losses in handling and transporting the grain between the field and storage facilities.
occurs a little more thanays after sowing. The relatively high temperatures that generally prevail in the USSR during this stage result in increased transpiration and in the more rapid depletion of soil moisture by evaporation. At that stage, therefore, the plants require more rainfall than during earlier stages of growth.
After heading, the plant's dependence on moisture and its sensitivity to temperature decrease. Excessive rainfall still may damage the crop, however, by causing lodging (matting) and by promoting diseases such as rust, scab, mildew, and leaf diseases, as well as weed growth. As in the earlier stages of growth, excessive temperatures may be injurious, especially if accompanied by dry winds. Too low a temperature preceding the harvest may delay ripening so that the crop is caught by early frost.
Links between weather factors and the stages of growth cannot be defined with precision because of annual variations in sowing dates and in the seasonal pattern of growth among different grains and areas. For example, the same grain in the same area may be planted at substantially different times in different years, and separate varieties planted in the same region may grow at different rates.
Crop yields have increased steadily in many regions of the USSR but very slowly in other regions (see the chart). The systematic improvement that has occurred probably reflects increased use of agricultural inputs such as fertilizer and lime, and better and more timely cultivation and harvesting. All of these changes are technological improvements. Unfortunately,
These stages of development refer to spring-sown small grains. Fall-sown winter grains tiller before entering dormancy.
Data for earlier years could not be used, because the official measure of the grain crop is believed to have changed to a new basis. For a discussion of this question, see Eberhard Schinke, "Soviet Agricultural Statistics," in Vladimir G. Treml and John P. Hardt, eds., Soviet Economic Statistics.
Centners per Hectare
information on such improvements is incomplete and, when available, represents national or republic totals rather than the individual crop regions required for the crop prediction model.
As a surrogate for technological change, a time trend was estimated for all crop regions and used where a high correlation between yields and time was found. In the districts where time was not correlated with crop yields, such as the spring wheat belt of Western Siberia, few higher yielding varieties have been introduced, and normal precipitation is inadequate for the use of most types of fertilizer.
Availability of Data
The weather data used in the prediction model include observations of precipitation and temperature for each of the crop regions. The data are monthly averages except during the growing season, when precipitation readings are available for ten-day periods. Weather data arefor crop.
The yield data available cover all grain, spring wheat, winter wheat, winter rye, and spring barley. Yields were calculated from published data for the oblasts in each crop region, except for those few crop regions that coincide with the areas for which the USSR reports yields.
The weather data have their shortcomings. In the early years of the period, the data were transcribed by hand from weather maps, leading to errors in transcription. Only the most obvious of these errors could be corrected. More important, the observation for a given crop region is simply an unweighted average of the observations from all of the weather reporting stations in that region because of the lack of data on sown area below the oblast level. Thus, the weather observations in the major crop areas of a crop region were not given proportionately greater weight.
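The difference between the unweighted average actually used and an average weighted by sown area can be illustrated with a short Python sketch. The station readings and weights below are hypothetical; no such weights were available for the model.

```python
def unweighted_mean(observations):
    """Simple average of station observations, as used for each crop region."""
    return sum(observations) / len(observations)


def weighted_mean(observations, weights):
    """Average in which each station counts in proportion to its weight."""
    total_weight = sum(weights)
    return sum(o * w for o, w in zip(observations, weights)) / total_weight


# Hypothetical precipitation readings and sown-area weights, for illustration only.
station_precip = [30.0, 10.0, 20.0]
station_sown_area = [6.0, 1.0, 3.0]

simple = unweighted_mean(station_precip)
weighted = weighted_mean(station_precip, station_sown_area)
```

Here the unweighted average is 20.0, while weighting by sown area gives 25.0 because the wettest station lies in the largest crop area.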
Errors in the yield data can arise from inconsistencies in official Soviet statistics, the need to estimate unpublished data, and the need to estimate the ratio of sown area to aggregate yields. Although inconsistencies appear in published Soviet sources, most involve only a small variance in centners per hectare. Wheat yields for the RSFSR are usually reported only for spring and winter wheat combined. By using additional information, yields for spring and winter wheat could be estimated separately. The estimates that could be checked were quite accurate; hence, any distortion is believed to be small. For some crops such as spring barley, yield data were available for each division of a crop region but sown areas were unknown. The yield data therefore had to be aggregated by using estimated weights.
The Prediction Model
The prediction model assumes that grain yield is a linear function of technology and weather. A linear time trend represents the influence of improved practices, for example, fallowing and increased use of fertilizers.
The estimated time trend coefficient was not significantly different from zero for many eastern grain-growing regions. Therefore, the value of this coefficient in the equation was set at zero for these regions.
In the second step, we regressed the residuals calculated in the first step above on the weather variables to obtain an index of the influence of weather. The available weather data for each crop region are limited. Therefore, in order to have enough observations to estimate the weather index, we arranged the crop regions into regional groups. We then computed the following regression equation:
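where the dependent variable is the estimated residual from the first step for crop region j in year i, each crop region is assigned to one of the regional groups, the independent variables are the weather variables k, and the remaining term is an unexplained residual.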
The aggregation of the data into groups of crop regions means that a single regression equation is used to express the response of yield to weather variables in a multi-region area, which implicitly assumes that weather variables have the same influence on yield in each of the regions. Thus, it was necessary to select groupings of crop regions with similar climate, soils, and cultivation practices.
In the third step, we calculated the weather index for each crop region. Given the parameters estimated from the second-step regression, the weather index was computed from the weather variables for crop region j in year i, where all crop regions (j) are assigned to one of the regional groups.
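In the fourth step, we estimated the coefficients of the time trend and the weather index in the model's grain yield prediction equation by performing a multivariate linear regression using time and the values for the weather index estimated in the third step.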
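The four steps can be sketched in Python with NumPy. This is only one plausible reading of the procedure: the original equations are not legible here, so the use of intercepts, the definition of the weather index as the fitted value of the group equation, and all function and variable names are assumptions made for illustration. The sketch also omits the setting of the time coefficient to zero in regions where it was not significant.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y ~ X (X already includes any constant column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def fit_group(years, yields, weather):
    """Fit one regional group (hypothetical helper).

    years:   (T,) array of year indices
    yields:  (R, T) array, yield for each of R crop regions
    weather: (R, T, K) array, K weather variables per region and year
    Returns per-region (intercept, time coef, weather-index coef) and the index.
    """
    R, T, K = weather.shape
    trend_X = np.column_stack([np.ones(T), years])

    # Step 1: time trend for each crop region; keep the residuals.
    residuals = np.empty((R, T))
    for j in range(R):
        residuals[j] = yields[j] - trend_X @ ols(trend_X, yields[j])

    # Step 2: pool the residuals of all regions in the group and regress
    # them on the weather variables (one equation for the whole group).
    pooled_X = np.column_stack([np.ones(R * T), weather.reshape(R * T, K)])
    group_coef = ols(pooled_X, residuals.reshape(R * T))

    # Step 3: weather index = fitted value of the group equation,
    # evaluated with each region's own weather (an assumption).
    index = group_coef[0] + weather @ group_coef[1:]

    # Step 4: regress yield on time and the weather index, region by region.
    final = np.empty((R, 3))
    for j in range(R):
        X = np.column_stack([np.ones(T), years, index[j]])
        final[j] = ols(X, yields[j])
    return final, index


# Synthetic check with made-up numbers: 2 regions, 12 years, 1 weather variable.
rng = np.random.default_rng(0)
years = np.arange(12, dtype=float)
weather = rng.normal(size=(2, 12, 1))
yields = 10.0 + 0.3 * years + 0.5 * weather[:, :, 0]
coefs, weather_index = fit_group(years, yields, weather)
```

With a single weather variable and noise-free synthetic yields, the fourth-step regression recovers the time coefficient of 0.3 for each region, which is a quick way to confirm that the steps are wired together consistently.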
The Pattern of the Regression Coefficients
The signs of the estimated coefficients of the weather equation are shown in the table. For October-March precipitation, preseason precipitation would be expected to have a positive effect on yield, especially in the winter grain areas. Nonetheless, many of the negative coefficients are in winter grain areas.
This procedure combines cross-section with time series data. Pooling the data increases the number of observations used in estimating the parameters in the equation and the number of degrees of freedom.
The types of grain grown within each crop region and in the nation vary over time because of weather, shifts in demand, and other changes. The coefficients of the weather equation reflect changes in the structure of the sown area and therefore are unlikely to apply fully in any particular year.
The use of the model to forecast grain yields began in April and continued throughout the summer months. The forecasts were updated every ten days, as new weather data were received. Two basic types of forecasts were made during the crop year. The first used only weather variables for which data had been received. For example, the forecast made as of the end of April did not include the influence of weather in May, June, July, and August. The second type used all weather variables, substituting long-run norm values for variables not yet reported. The results reported here are all from the second type of forecast.
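The second type of forecast amounts to filling every weather variable not yet reported with its long-run norm before the yield equation is evaluated. A minimal Python sketch follows; the function name, variable names, and values are hypothetical.

```python
def fill_with_norms(reported, norms):
    """Return a complete set of weather values for a forecast.

    reported: dict mapping variable name -> observed value (only those received so far)
    norms:    dict mapping every variable name -> long-run norm value
    """
    return {name: reported.get(name, norm) for name, norm in norms.items()}


# Hypothetical variable names and values, for illustration only.
norms = {"apr_precip": 30.0, "may_precip": 45.0, "jun_precip": 60.0}
reported_end_of_april = {"apr_precip": 38.0}
inputs = fill_with_norms(reported_end_of_april, norms)
```

As each ten-day report arrives, more entries in the reported set replace their norms, so successive forecasts rely less and less on assumed normal weather.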
The estimate of grain production varied during the season. The preseason estimate was based on long-run norm values for all weather variables. The tabulation below shows how the estimate changed as additional weather data were received.
The development of the harvest prediction becomes clearer when republic estimates are examined. The gains made in Kazakhstan through April were more than canceled out by poor weather in the rest of the country. In May, Kazakhstan fared poorly, but developments in the Ukraine easily compensated for Kazakhstan's diminished prospects and provided the first hope of a good year. In June, predicted yields in the Ukraine continued to increase, and those in the RSFSR made a very large jump as a result of abundant rains. Predicted harvests in all areas fell off in July and recovered slightly in August. The final line in the tabulation of grain yields below shows the production as reported. The predicted values for the RSFSR and the Ukraine are too high. Since the combined excess is more than the difference for the USSR prediction, the prediction for Kazakhstan and other areas must be too low.
May June July
This test suggests that an error within the range of the percentage errors shown on the diagonal should be expected. While the upper part of this range is too high, experience has shown that collateral information can be used to detect anomalies and reduce the error.
Next we constructed several naive models to determine whether our model provides better predictions. Even the best of these naive models did not perform well enough to merit consideration. In the best of the naive models, yield is a linear function of time, past yields, and selected weather variables. The data for all crop regions for which weather data were available were used in the same equation. This equation was able to describe historical yields fairly well, as measured by the coefficient of determination, but it did not capture the cyclical behavior of Soviet agriculture, and the average absolute percentage prediction error was almost double that of our disaggregated model.
In evaluating the model described in this publication (and especially the predictions), the fact that production forecasts depend on the accuracy of sown area estimates must be kept in mind. Although actual sown areas have not been reported, the errors in past estimates of total sown areas varied by crop, as shown below.
Spring wheat Winter wheat
Used in Prediction Model
Sown area data by crop region have been received only for the Ukraine. A comparison of actual sown areas in the Ukraine with those estimated on the basis of a laborious review of published information also shows substantial errors, as follows:
Figures were derived from unrounded data.
Because of rounding, components may not add to the totals.
Knowledge of the actual distribution of Ukrainian sown area would not have greatly improved the estimate of the overall Ukrainian yield, but the error in the predicted total harvest would have been sharply reduced. Thus, the effect on crop estimates of errors in the estimates of the regional distribution of sown area appears to be much less than the effect of errors in estimates of the total sown area.
"An error o! H% applied ti> ayield olenlneisa
illion hectares, result*arvest5 milium (nm.
DESCRIPTION OF THE DATA BASE
Four basic types of weather data covering the period1 to the present are currently maintained on computer files for use in weather-yield models: precipitation by decade for the months of April through September, total precipitation by month, average monthly temperature, and soil moisture as of the last day of each month.
Sources and Reliability
Basic precipitation and temperature data for individual weather stations in the USSR are broadcast two to four times daily by Soviet radio stations.* From these broadcasts, weather data are interpolated for grid points spaced at roughly regular intervals in a square pattern throughout the area studied. Periodically, total precipitation, mean temperature, and soil moisture are computed for each grid point and aggregated into values for crop districts.
Reliability of the weather data is a function of the density of coverage; that is, the number of individual weather stations from which reports emanate affects the interpolated data for a given grid point. The density of coverage is most critical for the precipitation values because there can be considerable variation in precipitation within a relatively small area, especially in hilly country. Temperature values, on the other hand, are considered to be more continuous.
Before5 the density of coverage averaged one reporting weather station per grid point, largely because of limitations imposed by the time-consuming, manual method of interpolation used. Since that date, computer analysis has made possible a more complex method of interpolation that permitted increasing the density of coverage to four or five reporting weather stations per grid point. Consequently, the interpolated values for precipitation and temperature since3 are considerably better indicators of weather conditions at the grid points than those for earlier years.
Storage and Handling of Weather Data
The basic weather data are received at approximately ten-day intervals and stored in a computer disk file. Six categories of weather variables are stored in the file in the following order:
1. First decade precipitation
2. Second decade precipitation
3. Third decade precipitation
4. Total monthly precipitation
5. Average monthly temperature
6. Soil moisture (last day of month)
* As a member of the World Meteorological Organization, the USSR shares this information.
The data bases for harvest and sown area have been assembled for five grain categories: all grain, winter wheat, spring wheat, spring barley, and winter rye. These data bases are computerized. Each file has a common structure; all are processed by the same computer program. The data sources are not the same, however, and some data have been estimated.
For each crop, data are required for all crop regions. The crop regions are composed of administrative divisions: Soviet Socialist Republics, Autonomous Soviet Socialist Republics, oblasts, and krays. For each of these divisions, data on the size of the harvested crop in thousands of tons and the harvested area in thousands of hectares are needed. The computer program computes yields for each of these divisions in centners per hectare, sums the harvested crop and the harvested area for all of the divisions in a crop region, and computes the yield for each crop region.
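The arithmetic performed by the program can be sketched in Python as follows. The function names and the sample divisions are hypothetical; the only fixed fact used is that one metric ton equals 10 centners.

```python
CENTNERS_PER_TON = 10  # 1 metric ton = 10 centners (1 centner = 100 kg)


def yield_centners_per_hectare(harvest_thousand_tons, area_thousand_hectares):
    """Yield in centners per hectare; the 'thousands' cancel out."""
    return harvest_thousand_tons * CENTNERS_PER_TON / area_thousand_hectares


def crop_region_yield(divisions):
    """divisions: list of (harvest in thousand tons, area in thousand hectares)."""
    total_harvest = sum(h for h, _ in divisions)
    total_area = sum(a for _, a in divisions)
    return yield_centners_per_hectare(total_harvest, total_area)


# Hypothetical divisions, for illustration only.
divisions = [(300.0, 200.0), (100.0, 100.0)]
division_yields = [yield_centners_per_hectare(h, a) for h, a in divisions]
region_yield = crop_region_yield(divisions)
```

Because harvest and area are summed before dividing, the crop region yield (about 13.3 centners per hectare here) is an area-weighted figure rather than the simple mean of the division yields (15.0 and 10.0).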
All Grain: Data on production and sown area for all grain are usually taken from Soviet statistical handbooks published for the area and year of interest. Although data for Kazakhstan for some years are missing, five-year moving averages for spring wheat in selected Kazakhstan oblasts were recently published. Since spring wheat is dominant in these areas, these data were used to estimate yields on the assumption that spring wheat yields were the same as for all grain.
No other all grain data are estimated except in the sense that the computer program computes the yields. Sometimes this causes a variation from the yields published in Soviet statistical handbooks.
Spring and Winter Wheat: Data for Estonia, Latvia, Lithuania, Belorussia, Moldavia, Kazakhstan, and the Ukraine are taken from various statistical abstracts. The statistical handbooks for the RSFSR normally publish data for all wheat only. In order to estimate data for winter and spring wheat separately, additional sources and estimation methods had to be used. Several estimation procedures were used, depending on the data available.
The best estimates are for the years for which the only data missing are the harvested crop by administrative division, which can be estimated
for each division. To adjust for errors introduced by the use of rounded data, published data on spring and winter wheat by economic region were used. The actual harvested crop for an economic region should equal the sum of the harvested crops for the divisions within that region, and the published data on total wheat for each division should equal the sum of the calculated winter wheat harvest and the calculated spring wheat harvest. The calculated harvest data were adjusted to meet these conditions while remaining within the range of possible rounding error.
The estimates61 use essentially the same method. Exact data on the harvested crop by economic region are not available. It was
this information, the average yields and average harvested crops could be calculated. The estimates0 are then given by
Y_i = yield in year i
A_i = sown area in year i
Ȳ = average yield
Ā = average sown area
These estimates were then refined by checking them against economic region data and against data for total wheat by administrative division. In most cases the estimates did not have to be substantially revised.
Spring Barley: Data on spring barley are not published consistently. The statistical handbooks for the Ukraine regularly publish data by oblast, but other handbooks normally do not. The data for the Baltic districts, Belorussia, and Moldavia come mostly from the agricultural handbook Sel'skoye) and are directly available as yields in centners per hectare. Data for Kazakhstan are not available for the
Data for the RSFSR are available at the administrative division level by yield only for a limited period. Weights were estimated in order to combine the yields of the divisions in a crop region. They are based on data on sown area derived from a Soviet journal article and a Soviet agricultural handbook. The only data available for the other years are for economic regions. They are usable for two regions, the Central and the Volga-Vyatsk, which are very close in definition to crop districts. The weighted averages for each crop district were computed and the results entered in the place of one of the divisions in each crop district.
Winter Rye: Data by oblast for the RSFSR, which accounts for a share of the area sown to winter rye, have been published only for certain years. Oblast data for the Ukraine (a smaller share of the total area sown to winter rye) are available only for certain years. Reasonable estimates can be made at the crop region level, however, from the data for economic regions, which are published annually. Various republic handbooks and the USSR handbook supply the data for the remaining crop regions.
'Zemoiflw iliOiuoujfio,S0at irfiioffo lUonjvijtiva SSSfi.