Background: The reproducibility and repeatability of modern systems
for classification of thoracolumbar injuries have not been sufficiently
studied. We assessed the interobserver and intraobserver reproducibility of
the AO (Arbeitsgemeinschaft für Osteosynthesefragen) classification and
compared it with that of the Denis classification. Our purpose was to
determine whether the newer AO system had better reproducibility than the
older Denis classification.
Methods: Anteroposterior and lateral radiographs and computerized
tomography scans (axial images and sagittal reconstructions) of thirty-one
acute traumatic fractures of the thoracolumbar spine were presented to
nineteen observers, all trained spine surgeons, who classified the fractures
according to both the AO and the Denis classification systems. Three months
later, the images of the thirty-one fractures were scrambled into a different
order, and the observers repeated the classification. The Cohen kappa
(κ) test was used to determine interobserver and intraobserver
agreement, which was measured with regard to the three basic classifications
in the AO system (types A, B, and C) as well as the nine subtypes of that
system. We also measured the agreement with regard to the four basic types in
the Denis classification (compression, burst, seat-belt, and
fracture-dislocation) and with regard to the sixteen subtypes of that
system.
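As an illustration of the agreement statistic named above, the following is a minimal sketch of Cohen's kappa for two sets of ratings of the same fractures. The function name `cohen_kappa` and the example labels are hypothetical and are not taken from the study; how the per-pair values were averaged across the nineteen observers is not stated here, so that step is not shown.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)

    # Observed agreement: fraction of cases given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: for each category, the product of the two raters'
    # marginal proportions, summed over categories.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined,
        # and this sketch treats that case as perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical example: two observers assigning AO types to five fractures.
observer_1 = ["A", "A", "A", "B", "C"]
observer_2 = ["A", "A", "B", "B", "C"]
kappa = cohen_kappa(observer_1, observer_2)
```

The same function can be applied to one observer's first and second reading of the same fractures to obtain a chance-corrected measure of intraobserver agreement.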
Results: The AO classification was fairly reproducible, with an
average kappa of 0.475 (range, 0.389 to 0.598) for the agreement regarding the
assignment of the three types and an average kappa of 0.537 for the agreement
regarding the nine subtypes. The average kappa for the agreement regarding the
assignment of the four Denis fracture types was 0.606 (range, 0.395 to 0.702),
and it was 0.173 for agreement regarding the sixteen subtypes. The
intraobserver agreement (repeatability) was 82% and 79% for the AO and Denis
types, respectively, and 67% and 56% for the AO and Denis subtypes,
respectively.
Conclusions: Both the Denis and the AO system for the classification
of spine fractures had only moderate reliability and repeatability. The
tendency for well-trained spine surgeons to classify the same fracture
differently on repeat testing is a matter of some concern.