Satellite Imagery FAQ - 3/5
Section - How can I assess my results?

    How can I assess my results?
   _(for basics, see Russell Congalton's review paper in Remote Sens.
   Environ. 37:35-46 (1991). Think we should have a basics entry here
   too!)_ Michael Joy posted a question about contingency table
   statistics and coefficients, and subsequently summarised the
   replies:

Here is a summary of responses to my posting about contingency table statistics
and coefficients. Basically, I need to come up with a single statistic for
an error matrix, along the lines of PCC or Kappa, but which takes into
account the fact that some misclassifications are better or worse than others.

Tom Kompare suggested readings on errors of omission or commission.
Chris Hermansen suggested Spearman's rank correlation.
Nick Kew suggested information-theoretic measures.

Others expressed interest in the results; I'll keep them posted in future.

The responses are summarized below.


Your thinking is halfway there. Check out how to use an error matrix to get
errors of Omission and Commission.

        Good texts that explain it are:

        Introduction to Remote Sensing, James Campbell, 1987, Guilford Press
        start reading on page 342

        Introductory Digital Image Processing, John Jensen, 1986, Prentice-Hall
        start reading on page 228 or so.

These are the books where I learned how to use them. Sorry if you don't have
access to them, I don't know how Canadian libraries are.

                                Tom Kompare
                                GIS/RS Specialist
                                Illinois Natural History Survey
                                Champaign, Illinois, USA
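
Tom's pointer above can be made concrete. A minimal sketch of per-class
omission and commission errors from an error matrix, using invented counts
and the thread's three example classes (not data from the discussion):

```python
import numpy as np

# Hypothetical error matrix: rows = reference (ground truth),
# columns = classified. Counts are illustrative only.
m = np.array([[50,  5,  2],   # Deciduous
              [ 8, 40,  1],   # Conifer
              [ 0,  2, 60]])  # Water

classes = ["Deciduous", "Conifer", "Water"]
total = m.sum()

for i, name in enumerate(classes):
    row = m[i].sum()       # reference total for the class
    col = m[:, i].sum()    # classified total for the class
    correct = m[i, i]
    omission = (row - correct) / row       # reference pixels missed
    commission = (col - correct) / col     # pixels wrongly assigned here
    print(f"{name}: omission {omission:.2f}, commission {commission:.2f}")
```

With the opposite row/column convention (rows = classified), the two error
types simply swap roles.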

Excerpt from my response to Tom Kompare (any comments welcome...)

These are useful readings describing error matrices and various measures we can
get from them, eg PCC, Kappa, omission/commission errors. But from these
readings I do not see a single statistic I can use to summarize the
whole matrix, which takes into account the idea that some misclassifications
are worse than others (at least for me). For example, if I have two error
matrices with the same PCC, but with tendencies to confuse different
categories, I'd like to get a statistic which selects the 'best' matrix (ie
the best image). One simple way I can think of to do this is to supply a
matrix which gives a 'score' for each classification or misclassification,
and then multiply each number in the error matrix by the corresponding number
in the 'score' matrix. So a very simple example of such a matrix might look
like this:

                   Deciduous    Conifer    Water
         Decid         1.0        0.5        0.0
         Conifer       0.5        1.0        0.0
         Water         0.0        0.0        1.0

In this notation, the 'score' matrix for a PCC statistic would be a diagonal
matrix of "1". Obviously there are a number of issues for me to think about
in using such a matrix, eg can you 'normalize' the score matrix? Can you
use it to compare different matrices with different numbers of categories?
An obvious extension to this would be to apply this idea to the Kappa
statistic as well.
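
The scoring idea above can be sketched numerically. Using the score matrix
from the text and an invented error matrix (rows = reference, columns =
classified; the counts are illustrative, not from the thread):

```python
import numpy as np

# Invented error matrix for the three example classes.
m = np.array([[50,  8,  2],
              [10, 45,  0],
              [ 1,  1, 60]], dtype=float)

# Score matrix from the text: full credit on the diagonal, partial credit
# for the 'mild' deciduous/conifer confusion, none for confusion with water.
score = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

n = m.sum()
pcc = np.trace(m) / n                 # percent correctly classified
weighted = (m * score).sum() / n      # score-weighted agreement

# Cohen's kappa for comparison: chance-corrected agreement.
p_o = pcc
p_e = (m.sum(axis=1) * m.sum(axis=0)).sum() / n**2
kappa = (p_o - p_e) / (1 - p_e)

print(f"PCC {pcc:.3f}, weighted score {weighted:.3f}, kappa {kappa:.3f}")
```

Correcting the weighted score for chance agreement in the same way kappa
corrects PCC is essentially the weighted-kappa extension mentioned above.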

Hi Michael;

Spearman's rank correlation is often used to test correlation in a situation
where you are scoring multiple test results.  You might be able to adapt
it to your problem.
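
For what it's worth, Spearman's rank correlation is easy to compute from
scratch. A minimal sketch (assuming no ties), applied hypothetically to two
summary scores computed over the same five test images (the scores are
invented for illustration):

```python
def ranks(xs):
    """Rank positions (1 = smallest), assuming no ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(x, y):
    """Spearman's rho via the classic 1 - 6*sum(d^2)/(n(n^2-1)) formula."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

pcc_scores   = [0.81, 0.76, 0.90, 0.68, 0.85]   # invented per-image PCC
kappa_scores = [0.74, 0.70, 0.86, 0.60, 0.79]   # invented per-image kappa

print(spearman(pcc_scores, kappa_scores))  # identical rankings -> 1.0
```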

Chris Hermansen                         Timberline Forest Inventory Consultants
Voice: 1 604 733 0731                   302 - 958 West 8th Avenue
FAX:   1 604 733 0634                   Vancouver B.C. CANADA V5Z 1E5

That's my way of speaking.


Your question touches on precisely the field of research I'd like to be
pursuing, if only someone would fund it. :)

> Hi,
> I'm comparing different datasets using contingency tables, and I would
> like to come up with summary statistics for each comparison. I am using
> the standard PCC and Kappa, but I'd also like to come up with a measure
> which somehow takes into account different 'degrees' of misclassification.
> For example, a deciduous stand misclassified as a mixed stand is not as
> bad as a deciduous stand misclassified as water.

I would strongly suggest you consider using information-theoretic measures.
The basic premise is to measure information (or entropy) in a confusion matrix.
I can send you a paper describing in some detail how I did this in the
not-totally-unrelated field of speech recognition.

This does not directly address the problem of 'degrees of misclassification' -
just how well it can be used to do so is one of the questions wanting further
research.   However, there are several good reasons to use it:

1) It does address the problem to the extent that it reflects the statistical
   distribution of misclassifications.   Hence in two classifications with
   the same percent correct, one in which all misclassifications are between
   deciduous and mixed stands will score better than one in which
   misclassifications are broadly distributed between all classes.
   Relative Information is probably the best general purpose measure here.

2) By extension of (1), it will support detailed analysis of hierarchical
   classification schemes.   This may be less relevant to you than it was
   to me, but consider two classifiers:

A: Your classifier - which for the sake of argument I'll assume has
   deciduous, coniferous and mixed woodland classes.
B: A coarser version of A, having just a single woodland class.

Now using %correct, you will get a higher score for B than for A - the
comparison is meaningless.   By contrast, using information (Absolute,
not Relative in this case), A will score higher than B.   You can
directly measure the information in the refinement from B to A.
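
Nick's absolute-vs-relative information point can be sketched numerically.
Below is a minimal mutual-information estimate from a confusion matrix, with
invented counts for a classifier A (deciduous/conifer/mixed/water) and its
coarser version B (all woodland merged into one class); relative information
would be this value divided by the entropy of the reference classes:

```python
import numpy as np

def mutual_info(m):
    """Mutual information (bits) between reference and classified labels,
    estimated from a confusion matrix of counts (rows = reference)."""
    p = m / m.sum()
    pr = p.sum(axis=1, keepdims=True)   # reference marginal
    pc = p.sum(axis=0, keepdims=True)   # classified marginal
    with np.errstate(divide="ignore", invalid="ignore"):
        t = p * np.log2(p / (pr * pc))  # zero cells become nan
    return np.nansum(t)                 # nan cells contribute 0

# Classifier A: invented counts, some confusion within the woodland classes.
A = np.array([[40,  6,  4,  0],
              [ 5, 42,  3,  0],
              [ 6,  4, 40,  0],
              [ 0,  0,  0, 50]], dtype=float)

# Classifier B: A with the three woodland classes merged into one.
B = np.array([[150,  0],
              [  0, 50]], dtype=float)

print(mutual_info(A), mutual_info(B))
```

B is "100% correct" yet carries less absolute information than A, which is
exactly the point about comparing classifiers across refinement levels.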

> In effect I guess I'm
> thinking that each type of misclassification would get a different 'score',
> maybe ranging from 0 (really bad misclassification) to 1 (correct
> classification).

I've thought a little about this, as have many others.   The main problem is,
you're going to end up with a lot of arbitrary numerical coefficients, and no
objective way to determine whether they are 'sensible'.   Fuzzy measures can
be used, but these are not easy to work with, and have (AFAIK) produced
little in the way of results in statistical classification problems.

> I can invent my own 'statistic' to measure this, but if there are any such
> measures available I'd like to use them. Any ideas?

Take the above or leave it, but let me know what you end up doing!

Nick Kew


Michael Joy                  
University of British Columbia, Vancouver, B.C., Canada

Last Update March 27 2014 @ 02:12 PM