Objective Coder Evaluation

Here we perform a more thorough comparison of RECON and SPIHT, based on an objective coder selection procedure. The tests reported here were performed on a dataset of 100 standard $512 \times 512$ grayscale test images.

Given a test image $I$, let $\{ I_{q(1)}^{spiht}, \cdots, I_{q(K)}^{spiht} \}$ be the set of images decoded at very low bit rates $q(1), \cdots, q(K)$ using SPIHT, and let $\{ I_{q(1)}^{recon}, \cdots, I_{q(K)}^{recon} \}$ be the set of images decoded at the same bit rates using RECON. The compound gain $CG$ may then be used to quantify the visual distinctness between the original image $I$ and the image decoded at each very low bit rate $q(i)$:

\begin{displaymath}
f(spiht, i) = CG ( I, I_{q(i)}^{spiht} ) \, ,
\end{displaymath}

and similarly for $f(recon, i) = CG ( I, I_{q(i)}^{recon} )$.

Once the distortion functions $f(\dag, i)$ have been calculated following the above equation, where $\dag$ stands for either coder, we apply an objective criterion for coder selection based on the overall difference between the two functions $f(spiht, i)$ and $f(recon, i)$, which can be measured by a Kolmogorov-Smirnov (K-S) test at a required level of significance.

Definition: Coder Selection Procedure. In the language of statistical hypothesis testing, the coding scheme RECON is significantly better than SPIHT for test image $I$ if the following two conditions are true:

(1) $f(recon, i) \leq f(spiht, i)$ for $i = 1, 2, \cdots, K$; and
(2) we reject, at a required level of significance, the null hypothesis of a Kolmogorov-Smirnov test that the two data sets $\{ f(recon, i) \mid i = 1, 2, \cdots, K \}$ and $\{ f(spiht, i) \mid i = 1, 2, \cdots, K \}$ are drawn from the same population distribution function.

Condition 1 reflects the fact that the better coder tends to produce the lower value of $f(\dag, i)$ across bit rates, and rejecting the null hypothesis in condition 2 indicates that the data sets $\{ f(recon, i) \mid i = 1, 2, \cdots, K \}$ and $\{ f(spiht, i) \mid i = 1, 2, \cdots, K \}$ come from different distributions. If both conditions hold, we can conclude that the data set $\{ f(recon, i) \mid i = 1, 2, \cdots, K \}$ is significantly better than the data set $\{ f(spiht, i) \mid i = 1, 2, \cdots, K \}$.
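The two conditions can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the experiments: the function names are hypothetical, the significance level of the K-S statistic is computed with a common asymptotic series approximation (the text does not say how it was obtained), and the sample values are toy numbers rather than measured compound gains.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample K-S statistic D: largest gap between the empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)  # fraction of a that is <= x
        fb = bisect_right(sb, x) / len(sb)  # fraction of b that is <= x
        d = max(d, abs(fa - fb))
    return d


def ks_pvalue(d, n1, n2, terms=100):
    """Approximate significance level of D (asymptotic series; an assumption)."""
    ne = math.sqrt(n1 * n2 / (n1 + n2))
    lam = (ne + 0.12 + 0.11 / ne) * d
    if lam < 0.2:
        return 1.0  # the alternating series converges poorly here; p is ~1
    s = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
            for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


def recon_significantly_better(f_recon, f_spiht, confidence=0.90):
    """Return (decision, p_value) for the two-condition selection procedure."""
    # Condition 1: RECON is never worse than SPIHT at any bit rate q(i).
    cond1 = all(r <= s for r, s in zip(f_recon, f_spiht))
    # Condition 2: the K-S null hypothesis is rejected at the required level.
    d = ks_statistic(f_recon, f_spiht)
    p = ks_pvalue(d, len(f_recon), len(f_spiht))
    cond2 = p < 1.0 - confidence
    return cond1 and cond2, p


# Toy values for K = 6 bit rates (illustrative only, not measured data).
f_recon = [0.10, 0.12, 0.15, 0.18, 0.20, 0.22]
f_spiht = [0.30, 0.33, 0.37, 0.40, 0.45, 0.50]
decision, p_value = recon_significantly_better(f_recon, f_spiht, confidence=0.99)
```

In this toy case the two samples do not overlap, so the K-S statistic reaches its maximum of 1 and both conditions hold at the 99 % level. Note that condition 1 alone is not sufficient: two identical data sets satisfy it trivially, but the K-S test then gives no evidence of a difference.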


Table I: CODER SELECTION PROCEDURE WITH % CONFIDENCE (SPIHT/RECON)

IMAGES                                                               Condition 1 (y/n)   Condition 2 (-/y/n)   Confidence
# 16, # 25, # 26, # 27, # 3... ...77, # 81, # 88, # 89, # 93, # 95   $y$                 $y$                   99 %
# 2                                                                  $y$                 $y$                   95 %
# 36, # 57, # 61, # 67                                               $y$                 $y$                   90 %


Tables I and II summarize the results of this experiment on the dataset of test images: twenty-five out of one hundred test images (25 %) passed conditions (1) and (2) of the coder selection procedure, and hence RECON is significantly better than SPIHT, with at least 90 % confidence, for twenty-five per cent of the test images, whereas SPIHT is significantly better than RECON for one per cent of the images.


Table II: PERCENTAGE OF IMAGES AT WHICH RECON/SPIHT IS SIGNIFICANTLY BETTER THAN SPIHT/RECON WITH AT LEAST 90 % CONFIDENCE

                          TOTAL
RECON better than SPIHT   25 %
SPIHT better than RECON   1 %
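As a rough illustration of how per-image outcomes could be rolled up into the confidence tiers of Table I and the totals of Table II, the following Python sketch assigns each image the highest tier at which the K-S null hypothesis is rejected. The function names and the input values are hypothetical and are not taken from the experiments.

```python
from collections import Counter

LEVELS = (99, 95, 90)  # confidence tiers reported in Table I, in percent


def confidence_tier(p_value):
    """Highest tier (in %) at which the null hypothesis is rejected, else None."""
    for level in LEVELS:
        if p_value < (100 - level) / 100:
            return level
    return None


def tally(results):
    """results: one (condition1_holds, p_value) pair per test image."""
    tiers = Counter()
    for cond1, p_value in results:
        # An image only counts if condition 1 holds and the K-S test rejects.
        tier = confidence_tier(p_value) if cond1 else None
        if tier is not None:
            tiers[tier] += 1
    percent = 100.0 * sum(tiers.values()) / len(results)
    return dict(tiers), percent


# Toy outcomes for five images (illustrative only, not measured data).
toy_results = [(True, 0.004), (True, 0.03), (True, 0.08),
               (False, 0.001), (True, 0.4)]
tiers, percent = tally(toy_results)
```

Here the fourth toy image is excluded despite its small p-value because condition 1 fails, and the fifth is excluded because the test does not reject at the 90 % level.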