Comparative analysis of alignment algorithms for macular optical coherence tomography imaging | International Journal of Retina and Vitreous


Qualitative results

Figure 2 was created to demonstrate visually the kinds of artifacts that can be introduced if no B-scan alignment is performed, and shows the surface maps from four sample control and four sample AMD OCT volumes. Each row represents a separate eye. Each column represents a different alignment algorithm: ANTs [16, 30, 31], FLIRT [32,33,34], ITK [35,36,37], OAR [38] and TOADS [39, 40]. The left-hand column shows the en face surface map generated from unregistered B-scans. In general, the alignment algorithms produced a smoother inner retinal surface for all eight eyes. In addition, we made the following observations. First, the horizontal streaks in the images (e.g. Control 2 and AMD 1) represented significant misalignment of the B-scans along the Z axis of the OCT volumes. Second, the signal intensity differences seen in Control 3 and AMD 3 were due to misalignment between the central and peripheral regions of the imaged macular area. Third, within the same eye, such as Control 3, there were significant variations in performance between the algorithms. Fourth, the black points typically seen near the edges were artifacts created by incorrect surface depth estimates (quantified as “edge errors” in Fig. 4). Of the five algorithms, ANTs appeared to produce both the smoothest surfaces and the fewest edge errors.

Fig. 2

En face surface maps of OCT volumes. Each row represents a different patient, and each column represents a different alignment algorithm. The leftmost column was created from unregistered B-scans. (AMD: Age-related Macular Degeneration)

Quantitative results

In this section, the alignment results are compared quantitatively. The inner retinal surface of a human macula, in the absence of surgical manipulation or significant trauma, should be smooth. Thus, the success of each alignment algorithm was quantified by the mean of the Laplace difference across the en face surface map: the smaller the mean Laplace difference, the smoother the surface map, and the better the alignment.
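The paper does not publish its smoothness computation; the following is a minimal sketch of a Laplace-difference metric, assuming a standard 4-neighbour discrete Laplacian whose absolute value is averaged over the interior of the map (the exact discretisation used by the authors is not specified):

```python
import numpy as np

def mean_laplace_difference(surface_map):
    """Mean absolute discrete Laplacian of an en face surface map.

    surface_map: 2D array of inner-retinal surface depths in pixels.
    A flatter (better aligned) surface yields a smaller value.
    NOTE: assumes a 4-neighbour Laplacian kernel; the paper does not
    state the exact discretisation it used.
    """
    s = np.asarray(surface_map, dtype=float)
    # 4-neighbour discrete Laplacian, evaluated on interior pixels only
    lap = (s[:-2, 1:-1] + s[2:, 1:-1]
           + s[1:-1, :-2] + s[1:-1, 2:]
           - 4.0 * s[1:-1, 1:-1])
    return float(np.abs(lap).mean())
```

On a perfectly flat map the metric is zero; B-scan misalignment along the Z axis appears as row-to-row depth jumps and inflates it.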

When all OCT volumes were included for analysis, the mean Laplace difference for the AMD and control groups without alignment was 27.0 ± 4.2 pixels and 26.6 ± 4.0 pixels, respectively (Fig. 3, top panel, right). Within the AMD group, ANTs and OAR performed the best, with a mean Laplace difference of 5.5 ± 2.7 pixels and 8.1 ± 4.3 pixels, respectively. The mean Laplace difference for the FLIRT, ITK and TOADS algorithms was 16.2 ± 7.4 pixels, 14.7 ± 8.0 pixels, and 15.3 ± 8.5 pixels, respectively. For the AMD group, the mean Laplace difference for all five algorithms was statistically smaller than without registration (p < 0.05). Within the control group, ANTs and OAR performed the best, with a mean Laplace difference of 4.3 ± 1.4 pixels and 6.5 ± 2.2 pixels, respectively. The mean Laplace difference for the FLIRT, ITK and TOADS algorithms was 12.9 ± 6.1 pixels, 11.8 ± 5.3 pixels, and 11.8 ± 4.7 pixels, respectively. Similarly for the control group, the mean Laplace difference for all five algorithms was statistically smaller than without registration (p < 0.05).

Fig. 3

Mean of the Laplace difference of the surface map over all OCT volumes for each of the registration algorithms for control and AMD patients (top). The bottom panel includes only OCT volumes with 61 B-scans. The number above each box is the p-value from a paired t-test. (The circles in the figure are outliers and are discussed at the end of the paper)

The data set used in this paper contained OCT volumes of varying B-scan density, from 19 to 61 line scans. The higher the number of B-scans within an OCT volume, the more information it contains. Hence, we repeated the quantitative analysis above to validate our approach specifically for the high-density OCT volumes (those with 61 B-scans) that are typically used in clinical practice and research (Fig. 3, bottom panel).

When only OCT volumes with 61 B-scans were included for analysis, the mean Laplace difference for the AMD and control groups without alignment was 23.1 ± 2.4 pixels and 25.1 ± 3.2 pixels, respectively. Within the AMD group, ANTs and OAR performed the best, with a mean Laplace difference of 3.5 ± 0.8 pixels and 4.2 ± 0.9 pixels, respectively. The mean Laplace difference for the FLIRT, ITK and TOADS algorithms was 11.7 ± 3.8 pixels, 9.6 ± 3.3 pixels, and 9.2 ± 2.3 pixels, respectively. For the AMD group, the mean Laplace difference for ANTs, OAR and TOADS was statistically smaller than without registration (p < 0.05). Within the control group, ANTs and OAR performed the best, with a mean Laplace difference of 4.0 ± 1.5 pixels and 6.0 ± 2.4 pixels, respectively. The mean Laplace difference for the FLIRT, ITK and TOADS algorithms was 12.4 ± 2.5 pixels, 9.6 ± 3.9 pixels, and 11.7 ± 2.7 pixels, respectively. For the control group, the mean Laplace difference for all five algorithms was statistically smaller than without registration (p < 0.05).

Figure 4 shows the edge errors for each algorithm for both the AMD and control groups. The top panel of Fig. 4 shows the results when all OCT volumes were included. The bottom panel of Fig. 4 shows the results when only OCT volumes with 61 B-scans were included. Overall, TOADS performed the best. For TOADS, the mean number of edge errors was 3586 ± 812, 3075 ± 829, 848 ± 346, and 701 ± 418 for the AMD (all OCT volumes), control (all OCT volumes), AMD (OCT with 61 B-scans only) and control (OCT with 61 B-scans only) group, respectively.
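The paper does not describe how edge errors were tallied; a hypothetical sketch follows, assuming failed surface depth estimates are stored as a sentinel value that renders as the black points near the map edges in Fig. 2:

```python
import numpy as np

def count_edge_errors(surface_map, invalid_depth=0):
    """Count pixels whose estimated surface depth is invalid.

    Hypothetical reconstruction of the paper's 'edge error' count:
    we assume failed depth estimates appear as a sentinel value
    (invalid_depth), which renders as black in the en face map.
    """
    return int((np.asarray(surface_map) == invalid_depth).sum())
```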

Fig. 4

Mean number of edge errors over all OCT volumes for each of the registration algorithms for control and AMD patients (top). The bottom panel includes only OCT volumes with 61 B-scans

Surface map validation results

To validate the accuracy of our method for automatic surface map generation, we calculated the difference between manually generated and automatically generated surface maps, using B-scan images from 6 eyes. The difference was represented by a histogram and measured in pixels (Fig. 5). The mean ± standard deviation of the difference for the six pairs of surface maps was 0.28 ± 0.1, 1.4 ± 1.9, 0.45 ± 0.61, 0.5 ± 0.44, 0.99 ± 1.18, and 1.21 ± 2.14 pixels, respectively.
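A minimal sketch of this validation step, assuming it reduces to a per-pixel absolute difference between each manual/automatic pair, summarised as a mean, standard deviation, and histogram (the authors' exact binning is not given; 20 bins is our assumption):

```python
import numpy as np

def compare_surface_maps(manual, auto, bins=20):
    """Per-pixel absolute difference between a manually and an
    automatically generated surface map.

    Returns (mean, std, histogram counts, bin edges). The mean and
    std correspond to the 'mean ± standard deviation in pixels'
    values reported in the text; the bin count is an assumption.
    """
    diff = np.abs(np.asarray(manual, dtype=float)
                  - np.asarray(auto, dtype=float))
    hist, edges = np.histogram(diff, bins=bins)
    return float(diff.mean()), float(diff.std()), hist, edges
```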

Fig. 5

(Left) Visualization and comparison of the surface maps created manually and automatically. (Right) Histograms showing the difference, in pixels, between each pair of surface maps

Algorithm speed

Performance was quantified by the mean time (in milliseconds) to register a pair of B-scan images, averaged over all B-scans within the same OCT volume (Fig. 6). The fastest algorithm was OAR, at approximately 2500 ms per pair; the slowest was FLIRT, at approximately 5500 ms.
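A generic sketch of such a benchmark, where `register` is our stand-in name for whichever alignment tool is being timed (an ANTs, FLIRT, ITK, OAR, or TOADS invocation); the wrapper is ours, not the paper's:

```python
import time

def mean_registration_time_ms(register, bscan_pairs):
    """Mean wall-clock time in milliseconds to register each pair of
    B-scans in an OCT volume.

    register: any callable (fixed, moving) -> aligned image; here a
    placeholder for the actual alignment tool being benchmarked.
    bscan_pairs: iterable of (fixed, moving) B-scan image pairs.
    """
    times_ms = []
    for fixed, moving in bscan_pairs:
        t0 = time.perf_counter()
        register(fixed, moving)
        # record elapsed wall-clock time for this pair, in ms
        times_ms.append((time.perf_counter() - t0) * 1000.0)
    return sum(times_ms) / len(times_ms)
```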

Fig. 6

Average registration time in milliseconds per pair of B-scans for each algorithm

Classification performance with and without alignment

We trained a 3D CNN to distinguish between normal and AMD OCT volumes. When the model was trained on B-scans aligned with ANTs, it demonstrated superior performance (AUC 0.95 aligned vs. 0.89 unaligned; Table 1).

Table 1 Comparison of 3D CNN model performance with and without B-scan alignment in distinguishing between normal and AMD OCT volumes


