How can I assess my results?

(For basics, see Russell Congalton's review paper in Remote Sens.
Environ. 37:35-46 (1991). Think we should have a basics entry here
too!)

Michael Joy (mjoy@geog.ubc.ca) posted a question about contingency
table statistics and coefficients, and subsequently summarised the
replies:

A summary of responses to my posting about contingency table
statistics and coefficients. Basically, I need to come up with a
single statistic for an error matrix, along the lines of PCC or
Kappa, but which takes into account the fact that some
misclassifications are better or worse than others. Tom Kompare
suggested readings on errors of omission and commission. Chris
Hermansen suggested Spearman's rank correlation. Nick Kew suggested
information-theoretic measures. Others expressed interest in the
results; I'll keep them posted in future. The responses are
summarized below.

===============================================================================

Michael:

Your thinking is halfway there. Check out how to use an error matrix
to get errors of Omission and Commission. Good texts that explain it
are:

  Introduction to Remote Sensing, James Campbell, 1987, Guilford
  Press (start reading on page 342).

  Introductory Digital Image Processing, John Jensen, 1986,
  Prentice-Hall (start reading on page 228 or so).

These are the books where I learned how to use them. Sorry if you
don't have access to them; I don't know how Canadian libraries are.

Tom Kompare
GIS/RS Specialist
Illinois Natural History Survey
Champaign, Illinois, USA
email: kompare@sundance.igis.uiuc.edu
WWW: http://www.inhs.uiuc.edu:70/
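(Editor's note: pending a proper basics entry, here is a minimal
sketch in Python of the standard error-matrix measures covered in the
readings above: errors of omission and commission, PCC, and the Kappa
coefficient. The function name and example counts are illustrative,
not from any package.)

    import numpy as np

    def basic_measures(error_matrix):
        """Standard accuracy measures from an error (confusion) matrix.
        Rows = classified (map) categories, columns = reference data."""
        e = np.asarray(error_matrix, dtype=float)
        n = e.sum()
        diag = np.diag(e)
        pcc = diag.sum() / n                      # percent correctly classified
        omission = 1.0 - diag / e.sum(axis=0)     # per class, vs reference totals
        commission = 1.0 - diag / e.sum(axis=1)   # per class, vs map totals
        # Kappa: observed agreement corrected for chance agreement
        p_exp = (e.sum(axis=1) * e.sum(axis=0)).sum() / (n * n)
        kappa = (pcc - p_exp) / (1.0 - p_exp)
        return pcc, omission, commission, kappa

    # Made-up counts; rows/columns: deciduous, conifer, water
    e = np.array([[50, 10,  2],
                  [ 8, 40,  1],
                  [ 1,  2, 60]])
    print(basic_measures(e))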
============================================================================

Excerpt from my response to Tom Kompare (any comments welcome...):

These are useful readings describing error matrices and various
measures we can get from them, e.g. PCC, Kappa, omission/commission
errors. But from these readings I do not see a single statistic I can
use to summarize the whole matrix which takes into account the idea
that some misclassifications are worse than others (at least for me).
For example, if I have two error matrices with the same PCC, but with
tendencies to confuse different categories, I'd like to get a
statistic which selects the 'best' matrix (i.e. the best image).

One simple way I can think of to do this is to supply a matrix which
gives a 'score' for each classification or misclassification, and
then multiply each number in the error matrix by the corresponding
number in the 'score' matrix. A very simple example of such a score
matrix might look like this:

                Deciduous   Conifer   Water
    Deciduous      1.0        0.5      0.0
    Conifer        0.5        1.0      0.0
    Water          0.0        0.0      1.0

In this notation, the 'score' matrix for a PCC statistic would be a
diagonal matrix of 1s (the identity matrix). Obviously there are a
number of issues for me to think about in using such a matrix, e.g.
can you 'normalize' the score matrix? Can you use it to compare error
matrices with different numbers of categories? An obvious extension
to this would be to apply the same idea to the Kappa statistic as
well.
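(Editor's note: here is a minimal sketch, in Python, of the
score-matrix idea above, together with its extension to Kappa; with
agreement weights of this kind it amounts to Cohen's weighted Kappa.
The function names and example counts are illustrative, not from any
package.)

    import numpy as np

    def weighted_pcc(error_matrix, score_matrix):
        """Each cell of the error matrix is multiplied by its score;
        the identity score matrix recovers ordinary PCC."""
        e = np.asarray(error_matrix, dtype=float)
        w = np.asarray(score_matrix, dtype=float)
        return (e * w).sum() / e.sum()

    def weighted_kappa(error_matrix, score_matrix):
        """Cohen's weighted Kappa with agreement weights
        (1.0 = correct, 0.0 = worst misclassification)."""
        e = np.asarray(error_matrix, dtype=float)
        w = np.asarray(score_matrix, dtype=float)
        p = e / e.sum()                              # cell proportions
        p_obs = (w * p).sum()                        # weighted observed agreement
        chance = np.outer(p.sum(axis=1), p.sum(axis=0))
        p_exp = (w * chance).sum()                   # weighted chance agreement
        return (p_obs - p_exp) / (1.0 - p_exp)

    scores = np.array([[1.0, 0.5, 0.0],    # the deciduous/conifer/water
                       [0.5, 1.0, 0.0],    # score matrix from the posting
                       [0.0, 0.0, 1.0]])
    errors = np.array([[50, 10,  2],       # made-up counts
                       [ 8, 40,  1],
                       [ 1,  2, 60]])
    print(weighted_pcc(errors, scores), weighted_kappa(errors, scores))

With an identity score matrix both functions reduce to PCC and the
ordinary Kappa, which gives one answer to the 'normalization'
question above: the weighted statistics stay on the same 0-1 scale.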
=========================================================================

Hi Michael;

Spearman's rank correlation is often used to test correlation in a
situation where you are scoring multiple test results. You might be
able to adapt it to your problem.

Chris Hermansen                 Timberline Forest Inventory Consultants
Voice: 1 604 733 0731           302 - 958 West 8th Avenue
FAX:   1 604 733 0634           Vancouver B.C. CANADA
clh@tfic.bc.ca                  V5Z 1E5

C'est ma facon de parler.

=========================================================================

Hi,

Your question touches on precisely the field of research I'd like to
be pursuing, if only someone would fund it :)

> Hi,
> I'm comparing different datasets using contingency tables, and I would
> like to come up with summary statistics for each comparison. I am using
> the standard PCC and Kappa, but I'd also like to come up with a measure
> which somehow takes into account different 'degrees' of misclassification.
> For example, a deciduous stand misclassified as a mixed stand is not as
> bad as a deciduous stand misclassified as water.

I would strongly suggest you consider using information-theoretic
measures. The basic premise is to measure information (or entropy) in
a confusion matrix. I can send you a paper describing in some detail
how I did this in the not-totally-unrelated field of speech
recognition.

This does not directly address the problem of 'degrees of
misclassification'; just how well it can be used to do so is one of
the questions wanting further research. However, there are several
good reasons to use it:

1) It does address the problem to the extent that it reflects the
   statistical distribution of misclassifications. Hence of two
   classifications with the same percent correct, one in which all
   misclassifications are between deciduous and mixed stands will
   score better than one in which misclassifications are broadly
   distributed between all classes. Relative Information is probably
   the best general-purpose measure here.

2) By extension of (1), it will support detailed analysis of
   hierarchical classification schemes. This may be less relevant to
   you than it was to me, but consider two classifiers:

   A: Your classifier, which for the sake of argument I'll assume has
      deciduous, coniferous and mixed woodland classes.
   B: A coarser version of A, having just a single woodland class.

   Now using %correct, you will get a higher score for B than for A,
   so the comparison is meaningless. By contrast, using information
   (Absolute, not Relative in this case), A will score higher than B.
   You can directly measure the information in the refinement from B
   to A.

> In effect I guess I'm
> thinking that each type of misclassification would get a different 'score',
> maybe ranging from 0 (really bad misclassification) to 1 (correct
> classification).

I've thought a little about this, as have many others. The main
problem is that you're going to end up with a lot of arbitrary
numerical coefficients, and no objective way to determine whether
they are 'sensible'. Fuzzy measures can be used, but these are not
easy to work with, and have (AFAIK) produced little in the way of
results in statistical classification problems.

> I can invent my own 'statistic' to measure this, but if there are any such
> measures available I'd like to use them. Any ideas?

Take the above or leave it, but let me know what you end up doing!

Nick Kew
nick@mail.esrin.esa.it
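(Editor's note: a minimal sketch of the information-theoretic
measures Nick describes, on the usual reading that the "absolute"
information shared by map and reference labels is their mutual
information, and that "relative" information normalizes it by the
entropy of the reference classes; that reading, and the example
counts, are the editor's assumptions. The example also reproduces
Nick's A-versus-B point: the coarser classifier scores higher on
percent correct but lower on absolute information.)

    import numpy as np

    def information_measures(error_matrix):
        """Mutual ("absolute") information I(map; reference) of a
        confusion matrix, and its value relative to the entropy of
        the reference classes ("relative" information)."""
        p = np.asarray(error_matrix, dtype=float)
        p = p / p.sum()                       # joint distribution
        p_map = p.sum(axis=1, keepdims=True)  # marginal: map classes
        p_ref = p.sum(axis=0, keepdims=True)  # marginal: reference classes
        indep = p_map * p_ref                 # joint if no association
        nz = p > 0                            # skip log(0) terms
        absolute = (p[nz] * np.log2(p[nz] / indep[nz])).sum()
        h_ref = -(p_ref[p_ref > 0] * np.log2(p_ref[p_ref > 0])).sum()
        return absolute, absolute / h_ref

    # Classifier A: deciduous, conifer, water (made-up counts).
    a = np.array([[50, 10,  5],
                  [ 8, 40,  7],
                  [ 4,  6, 70]])
    # Classifier B: a coarser version of A, merging the two woodland
    # classes (rows/columns 0 and 1) into a single class.
    b = np.array([[a[:2, :2].sum(), a[:2, 2].sum()],
                  [a[2, :2].sum(),  a[2, 2]]])
    # B beats A on percent correct (0.89 vs 0.80) but carries less
    # absolute information, so A correctly scores higher.
    print(information_measures(a), information_measures(b))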
============================================================================

Michael Joy
mjoy@geog.ubc.ca
University of British Columbia, Vancouver, B.C., Canada
