Severin Classification System for Evaluation of the Results of Operative Treatment of Congenital Dislocation of the Hip. A Study of Intraobserver and Interobserver Reliability*
W. TIMOTHY WARD, M.D.†; MOLLY VOGT, PH.D.†; JAN S. GRUDZIAK, M.D.†, PITTSBURGH; YÜCEL TÜMER, M.D.‡, ANKARA, TURKEY; P. CHRISTOPHER COOK, M.D., F.R.C.S.(C)†, PITTSBURGH, PENNSYLVANIA; ROBERT D. FITCH, M.D.§, DURHAM, NORTH CAROLINA
Investigation performed at the University of Pittsburgh, Pittsburgh, Bayindir Medical Center, Ankara, and Duke University Medical Center, Durham
The Journal of Bone & Joint Surgery.  1997; 79:656-63 

Abstract

The Severin classification system frequently is used to evaluate the radiographic results of operations performed for the treatment of congenital dislocation of the hip. However, the reliability of this classification scheme has not been established, to our knowledge. Ideally, a classification system should be validated before it is used to promote therapeutic guidelines or to compare results of treatment; the purpose of the present study was to establish the intraobserver and interobserver reliability of the Severin classification system.

Four blinded raters and the operating surgeon independently used the Severin system to evaluate the most recent radiographs of thirty-seven children (fifty-six hips) who had been managed, an average of nine years previously, with a medial open reduction for congenital dislocation of the hip. Three of the raters evaluated the same radiographs again under similar testing circumstances eight weeks later. Ten paired interobserver and three intraobserver comparisons then were analyzed with use of the Cohen kappa coefficient (κ).

The average kappa coefficient for the six pairwise comparisons between the four blinded raters was 0.15 (range, -0.05 to 0.42) when all Severin classes were analyzed independently. The average kappa coefficient for the four pairwise comparisons between the blinded raters and the operating surgeon was even lower (0.02). The kappa coefficients for the three intraobserver comparisons were 0.20, 0.38, and 0.44 (average, 0.34).

Kappa analysis demonstrated variable and low levels of agreement when the Severin system was used to rate the results of operations performed for the treatment of congenital dislocation of the hip. We believe that the unadjusted kappa coefficient should indicate excellent agreement (κ > 0.75) for all comparisons if this system is to be used for the evaluation of clinical results.
The unacceptably low levels of intraobserver and interobserver reliability call into question the clinical conclusions of reports in which the Severin system has been used as the basis of proof.

    The Severin classification system25 frequently is used to assess the radiographic results of operations performed for the treatment of congenital dislocation of the hip. Despite its widespread acceptance, the reliability of this system has not been established, to our knowledge.
    Severin first used this classification system in 1941 to describe the radiographic appearance of the hip after the closed treatment of congenital dislocation25. The system includes six main categories: class I (normal), class II (moderate deformity), class III (dysplasia with no subluxation), class IV (subluxation), class V (subluxation with a pseudoacetabulum), and class VI (redislocation) (Table I). Severin used no quantitative parameters other than the center-edge angle to determine the radiographic classification. Clinicians who use the Severin system appear to have reached a consensus that class I indicates an excellent result; class II, a good result; class III, a fair result; and classes IV, V, and VI, a poor result2-4,13,15. However, no measure of the accuracy or reliability of this system was included in the original investigations by Severin25,26 or has been reported since then, to our knowledge.
    Salter, in 1961, used the Severin system to assess both the maintenance of complete reduction and the osseous development of the acetabulum and the femoral head after innominate osteotomy22. In that study, twenty-three of twenty-five hips that had been rated as Severin class IV or V preoperatively were rated as class I or II postoperatively. In a later report on 325 subluxated or dislocated (class-IV or V) hips, Salter and Dubos noted a dramatic improvement in the Severin rating after innominate osteotomy with or without open reduction23. Those two studies, both of which involved the use of the Severin system, were instrumental in the widespread adoption of innominate osteotomy for the treatment of congenital subluxation or dislocation of the hip in children who are eighteen months to six years old.
    Since the 1960s, the Severin system also has been used for the evaluation of the results of a number of alternative procedures for the operative treatment of congenital hip disease10,15,24,34,35 as well as for the comparison of results across studies2. Many of the surgeons involved in those studies reported favorably on the utility of the Severin system.
    A clinical classification system is only as valuable as it is reliable and valid in the hands of expert clinicians. In order to validate such a system, it is necessary to demonstrate that the classifications are accurate and reproducible. However, as there is no universally accepted standard with which the Severin system can be compared, the accuracy of the ratings cannot be determined. Therefore, the consistency of the ratings among users is of the utmost importance. Specifically, the reproducibility of the interpretations made by the same observer on separate occasions (intraobserver reliability) or by multiple observers on a single occasion (interobserver reliability) is critical to the intrinsic value of the system. Ideally, the reliability of a classification system should be established before the system is used to promote therapeutic guidelines or to compare the results of alternative treatments. When such a system is shown to have unacceptable reliability, any conclusions drawn from its application may not be valid.
    The present study was designed to test both the interobserver and the intraobserver reliability of the Severin system as used by pediatric orthopaedic surgeons.

    *No benefits in any form have been received or will be received from a commercial party related directly or indirectly to the subject of this article. No funds were received in support of this study.

    †Children's Hospital of Pittsburgh, 3705 Fifth Avenue at DeSoto Street, Pittsburgh, Pennsylvania 15213.

    ‡Bayindir Medical Center, Selanik Cad. 35/3, Kisilay 06650, Ankara, Turkey.

    §Duke University Medical Center, Box 2911, Durham, North Carolina 27710.

     
    TABLE I SEVERIN CLASSIFICATION SYSTEM25*
    * The subclassifications were not used in this study.
    Class   Radiographic Appearance          Center-Edge Angle
    I       Normal
      Ia                                     >19° (6 to 13 years old); >25° (≥14 years old)
      Ib                                     >15 to 19° (6 to 13 years old); 20 to 25° (≥14 years old)
    II      Moderate deformity of femoral head, femoral neck, or acetabulum
      IIa                                    >19° (6 to 13 years old); >25° (≥14 years old)
      IIb                                    >15 to 19° (6 to 13 years old); 20 to 25° (≥14 years old)
    III     Dysplasia without subluxation    <15° (6 to 13 years old); <20° (≥14 years old)
    IV
      IVa   Moderate subluxation             ≥0°
      IVb   Severe subluxation               <0°
    V       Femoral head articulates with pseudoacetabulum in superior part of original acetabulum
    VI      Redislocation
     
    TABLE II SUMMATED RATINGS FOR EACH RATER*
    * The values are given as the number of hips assigned to each class at the first evaluation/second evaluation. NA = not applicable.
    Rater                    Class I     Class II    Class III   Class IV
    1                        8/16        22/27       22/10       4/3
    2                        9/11        26/20       14/17       7/8
    3                        3/1         34/13       16/30       3/12
    4                        44/NA       9/NA        3/NA        0/NA
    5 (operating surgeon)    46/NA       9/NA        0/NA        1/NA
     
    TABLE III ANALYSIS OF INTEROBSERVER AGREEMENT ACCORDING TO INDIVIDUAL SEVERIN CLASSES25
    * The data are given as the number of hips, with the percentage in parentheses.
    † The numbers in parentheses indicate the 95 per cent confidence intervals.
    Comparison    Observed Agreement (N = 56)*    Kappa Coefficient†       Weighted Kappa Statistic†
    Blinded raters
      1 and 2     34 (61)                         0.42 (0.23 to 0.61)      0.55 (0.37 to 0.73)
      2 and 3     32 (57)                         0.32 (0.16 to 0.48)      0.43 (0.26 to 0.60)
      1 and 3     27 (48)                         0.19 (-0.01 to 0.40)     0.34 (0.15 to 0.52)
      1 and 4     16 (29)                         0.11 (-0.04 to 0.26)     0.10 (-0.01 to 0.21)
      2 and 4     9 (16)                          -0.05 (-0.17 to 0.07)    0.04 (-0.04 to 0.12)
      3 and 4     8 (14)                          -0.01 (-0.12 to 0.1)     0.05 (-0.02 to 0.12)
    Blinded rater and operating surgeon
      1 and 5     10 (18)                         0 (-0.13 to 0.12)        0.05 (-0.02 to 0.12)
      2 and 5     10 (18)                         -0.02 (-0.15 to 0.1)     0.05 (-0.02 to 0.12)
      3 and 5     9 (16)                          0.02 (-0.09 to 0.13)     0.06 (0.0 to 0.11)
      4 and 5     40 (72)                         0.13 (-0.23 to 0.49)     0.16 (-0.05 to 0.37)
     
    TABLE IV COMPOSITE MULTI-RATER KAPPA SCORES ACCORDING TO INDIVIDUAL SEVERIN CLASSES
    * The data are given as the number of hips, with the percentage in parentheses.
    † The numbers in parentheses indicate the 95 per cent confidence intervals.
                       All Five Raters                         Four Blinded Raters
    Severin Class25    Observed      Multi-Rater               Observed      Multi-Rater
                       Agreement*    Kappa Score†              Agreement*    Kappa Score†
                       (N = 56)                                (N = 56)
    I                  5 (9)         0.02 (-0.10 to 0.06)      11 (20)       -0.02 (-0.13 to 0.08)
    II                 8 (14)        0.02 (-0.06 to 0.10)      12 (21)       0.09 (-0.01 to 0.20)
    III                24 (43)       0.10 (0.01 to 0.18)       24 (43)       0.16 (0.05 to 0.27)
    IV                 47 (84)       0.23 (0.15 to 0.31)       47 (84)       0.29 (0.18 to 0.40)
     
    TABLE V ANALYSIS OF INTEROBSERVER AGREEMENT ACCORDING TO DICHOTOMOUS GROUPS
    * The data are given as the number of hips, with the percentage in parentheses.
    † The numbers in parentheses indicate the 95 per cent confidence intervals.
                                         Class I and Classes II, III, and IV     Classes I and II and Classes III and IV
    Comparison                           Observed      Kappa                     Observed      Kappa
                                         Agreement*    Coefficient†              Agreement*    Coefficient†
                                         (N = 56)                                (N = 56)
    Blinded raters
      1 and 2                            50 (89)       0.56 (0.23 to 0.89)       44 (79)       0.56 (0.34 to 0.78)
      2 and 3                            47 (84)       0.11 (-0.42 to 0.64)      45 (80)       0.58 (0.39 to 0.77)
      1 and 3                            49 (88)       0.31 (-0.16 to 0.79)      37 (66)       0.31 (0.06 to 0.56)
      1 and 4                            20 (36)       0.09 (-0.09 to 0.27)      33 (59)       0.12 (-0.16 to 0.40)
      2 and 4                            16 (29)       -0.01 (-0.35 to 0.33)     37 (66)       0.16 (-0.14 to 0.46)
      3 and 4                            15 (27)       0.03 (-0.12 to 0.18)      38 (68)       0.10 (-0.24 to 0.44)
      Composite multi-rater kappa score  5 (9)         -0.02 (-0.13 to 0.08)     22 (39)       0.20 (0.12 to 0.28)
    Blinded raters and operating surgeon
      1 and 5                            18 (32)       0.08 (-0.05 to 0.21)      31 (55)       0.04 (-0.24 to 0.32)
      2 and 5                            18 (32)       0.07 (-0.09 to 0.23)      35 (63)       0.05 (-0.04 to 0.22)
      3 and 5                            13 (23)       0.02 (-0.12 to 0.16)      38 (68)       0.07 (-0.28 to 0.42)
      4 and 5                            42 (75)       0.21 (0.03 to 0.39)       52 (93)       -0.03 (-1.0 to 1.0)
      Composite multi-rater kappa score  5 (9)         -0.02 (-0.10 to 0.06)     25 (45)       0.29 (0.19 to 0.40)
     
    TABLE VI ANALYSIS OF INTRAOBSERVER AGREEMENT ACCORDING TO INDIVIDUAL SEVERIN CLASS25
    * The data are given as the number of hips, with the percentage in parentheses.
    † The numbers in parentheses indicate the 95 per cent confidence intervals.
    Rater    Observed Agreement* (N = 56)    Kappa Coefficient†       Weighted Kappa Statistic†
    1        32 (57)                         0.38 (0.19 to 0.57)      0.51 (0.35 to 0.67)
    2        34 (61)                         0.44 (0.26 to 0.62)      0.59 (0.41 to 0.77)
    3        25 (45)                         0.20 (0.01 to 0.39)      0.32 (0.18 to 0.46)
     
    TABLE VII ANALYSIS OF INTRAOBSERVER AGREEMENT ACCORDING TO DICHOTOMOUS GROUPS
    * The data are given as the number of hips, with the percentage in parentheses.
    † The numbers in parentheses indicate the 95 per cent confidence intervals.
             Class I and Classes II, III, and IV     Classes I and II and Classes III and IV
    Rater    Observed      Kappa                     Observed      Kappa
             Agreement*    Coefficient†              Agreement*    Coefficient†
             (N = 56)                                (N = 56)
    1        46 (82)       0.49 (0.44 to 0.53)       43 (77)       0.52 (0.29 to 0.75)
    2        45 (80)       0.31 (-0.05 to 0.68)      45 (80)       0.60 (0.39 to 0.81)
    3        54 (96)       0.49 (-0.2 to 1.19)       33 (59)       0.29 (0.07 to 0.51)
     
    Fig. 1 Anteroposterior radiograph, made five years after a bilateral open reduction through a medial approach. The hip on the left side of the radiograph was rated as class II by rater 1, class IV by rater 2, class III by rater 3, and class I by raters 4 and 5. The hip on the right side of the radiograph was rated as class III by rater 1, class IV by rater 2, class II by rater 3, and class I by raters 4 and 5.
    Thirty-seven children with an average age of eleven months (range, two to twenty-five months) were managed with open reduction through a medial approach for the treatment of congenital dislocation of the hip. Nineteen patients had a bilateral procedure and eighteen, a unilateral procedure; thus, a total of fifty-six hips were treated. The most recent follow-up radiographs (anteroposterior radiographs of the pelvis and frog-leg lateral radiographs of the involved hip or hips), which had been made an average of nine years (range, three to seventeen years) after the initial operation, were used in the present study. All of the operations and follow-up examinations were performed by one surgeon (Y. T.). None of the other surgeons who assessed the radiographs had participated in any aspect of the care of the thirty-seven patients.
    The radiographs were labeled sequentially, and all data regarding the identity of the patient and the side of the operation were blocked out. The operating surgeon and four other surgeons each independently rated the fifty-six hips with use of the Severin system; only the operating surgeon knew which hip or hips had been operatively treated. The four blinded raters were fellowship-trained pediatric orthopaedic surgeons; raters 1 and 4 had more than ten years of post-fellowship experience, rater 2 had just completed his fellowship, and rater 3 had more than five years of post-fellowship experience. The operating surgeon (rater 5) was not fellowship-trained but had more than twenty years of experience as a pediatric orthopaedic surgeon and is considered a regional expert on problems related to the hip in children. At the time of writing, all five raters were practicing pediatric orthopaedic surgeons who routinely treated congenital dislocation of the hip.
    Seven hips were treated with a femoral or pelvic osteotomy after the initial medial open reduction; evidence of these procedures often could be seen on the radiographs and was potentially obvious to the four blinded raters. However, the blinded raters were not specifically told which seven hips had been treated with another operation.
    The operating surgeon (rater 5) rated each of the fifty-six hips, under routine clinical conditions, before the initiation of the study. The other four raters were given a detailed description of the Severin classification scheme, as originally published, and then were asked to rate each hip. (The operating surgeon was not provided with the detailed description.) Because the four blinded raters did not know the ages of the patients, the subdivision of Severin classes according to age-adjusted measurements of the center-edge angle was not requested, and the interpretations were categorized simply as Severin class I, II, III, IV, V, or VI (Table I). Each blinded rater interpreted the radiographs independently and did not know how they had been interpreted by the other raters. The raters were allowed unrestricted time to rate each hip, and a goniometer was available for their use. Six pairwise comparisons were made between the four blinded raters, and four pairwise comparisons were made between the blinded raters and the operating surgeon.
    Eight weeks later, raters 1, 2, and 3 evaluated the same set of radiographs again under similar testing circumstances. At the time of the first evaluation, the raters had been told to keep no notes and to make no attempt to memorize the radiographs. At the time of the second evaluation, the radiographs were presented in a different numerical order to guard against any recall bias from the first interpretation.

    Statistical Analysis

    The Cohen kappa coefficient7 and the weighted kappa statistic8 were calculated to assess the interobserver and intraobserver reliability with use of the Stata software package (version 4.0; Stata, College Station, Texas). Kappa is a measure of the pairwise agreement between observers that reflects the proportion of observed agreement beyond that expected by chance alone6-8,29,36. (Agreement due to chance alone is indicated by a kappa value of zero.) For each of the ten interobserver and three intraobserver comparisons, calculations were made in order to determine the proportion of observed agreement (Po), the proportion of expected agreement (Pe), and the kappa value.
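    As an illustration of the calculation described above, the following minimal Python sketch derives Po, Pe, and the unweighted kappa coefficient for two raters. The function name cohen_kappa and the eight example ratings are hypothetical and are not taken from the study data, which were analyzed with the Stata package.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Return (Po, Pe, kappa) for two equal-length lists of class labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Po: proportion of hips assigned the same class by both raters
    po = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Pe: agreement expected by chance, from each rater's class frequencies
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    pe = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if pe == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return po, pe, (po - pe) / (1.0 - pe)


# Hypothetical Severin classes assigned to eight hips by two raters
rater_x = [1, 2, 2, 3, 3, 2, 4, 1]
rater_y = [1, 2, 3, 3, 2, 2, 4, 2]
po, pe, kappa = cohen_kappa(rater_x, rater_y)
print(f"Po = {po:.3f}, Pe = {pe:.3f}, kappa = {kappa:.2f}")
```

    For these illustrative ratings, Po is 0.625, Pe is approximately 0.30, and kappa is approximately 0.47.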
    Whereas the kappa coefficient reflects only complete agreement between raters, the weighted kappa statistic allows partial agreement to be taken into account. For example, when two surgeons rate a single radiograph as class II and III (or class II and I), the result is considered as non-agreement for the calculation of the kappa coefficient and as partial or close agreement for the calculation of the weighted kappa statistic.
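    The effect of partial credit can be sketched as follows. The text does not state which weighting scheme was used, so this example assumes linear weights, under which a disagreement between adjacent classes is penalized less than one between distant classes. The function name weighted_kappa and the ratings are hypothetical.

```python
def weighted_kappa(ratings_a, ratings_b, classes):
    """Linearly weighted kappa for two raters over an ordered list of classes."""
    n, k = len(ratings_a), len(classes)
    if n == 0 or n != len(ratings_b) or k < 2:
        raise ValueError("need equal-length ratings and at least two classes")
    index = {c: i for i, c in enumerate(classes)}
    # Joint proportions: observed[i][j] = share of hips rated i by A and j by B
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    disagree_obs = disagree_exp = 0.0
    for i in range(k):
        for j in range(k):
            # Assumed linear disagreement weight: 0 on the diagonal, 1 at the extremes
            w = abs(i - j) / (k - 1)
            disagree_obs += w * observed[i][j]
            disagree_exp += w * row[i] * col[j]
    if disagree_exp == 0.0:
        raise ValueError("undefined when both raters use a single class")
    return 1.0 - disagree_obs / disagree_exp


# Hypothetical ratings; every disagreement is between adjacent classes
rater_x = [1, 2, 2, 3, 3, 2, 4, 1]
rater_y = [1, 2, 3, 3, 2, 2, 4, 2]
kw = weighted_kappa(rater_x, rater_y, classes=[1, 2, 3, 4])
print(f"weighted kappa = {kw:.3f}")
```

    With these hypothetical ratings the weighted statistic is 0.625, compared with an unweighted coefficient of approximately 0.47 for the same ratings, because all three disagreements involve adjacent classes.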
    In addition, a composite multi-rater kappa score11 was calculated for each Severin class. This score provided a weighted measure of all of the paired kappa scores for each Severin class and reflected the over-all agreement among the raters.
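    The formula behind the composite multi-rater score is not given in the text. One widely used formulation for more than two raters is the category-specific kappa of Fleiss, and the sketch below assumes that formulation purely for illustration; it should not be read as the exact method of the study. The function name category_kappa and the ratings are hypothetical.

```python
def category_kappa(ratings_by_hip, category):
    """Fleiss-style kappa for one category.

    ratings_by_hip: list of lists; each inner list holds the classes that
    the raters assigned to a single hip (same number of raters for every hip).
    """
    n = len(ratings_by_hip)
    m = len(ratings_by_hip[0]) if n else 0
    if m < 2 or any(len(r) != m for r in ratings_by_hip):
        raise ValueError("every hip needs the same number (>= 2) of ratings")
    # x_i: number of raters who placed hip i in this category
    counts = [r.count(category) for r in ratings_by_hip]
    p = sum(counts) / (n * m)  # overall proportion of ratings in the category
    if p in (0.0, 1.0):
        raise ValueError("undefined when the category is never or always used")
    disagreement = sum(x * (m - x) for x in counts)
    return 1.0 - disagreement / (n * m * (m - 1) * p * (1.0 - p))


# Hypothetical ratings of four hips by three raters
hips = [[1, 1, 1], [1, 2, 2], [2, 2, 2], [1, 1, 2]]
k_class_1 = category_kappa(hips, 1)
print(f"kappa for class 1 = {k_class_1:.3f}")
```

    In this toy example the raters are unanimous on two of the four hips, and the category-specific kappa for class 1 is one-third.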
    The four Severin classes identified in the present study then were combined into two sets of dichotomous groups, and kappa values again were calculated for each of the ten interobserver and the three intraobserver comparisons. In addition, a composite multi-rater kappa score was calculated to provide an over-all assessment of agreement among all of the raters. One set of dichotomous groups consisted of Severin class I and the combination of classes II, III, and IV, and the other set consisted of the combination of classes I and II and the combination of classes III and IV. The analysis of the kappa values according to these dichotomous groups was an attempt to highlight any particular difficulty that the raters may have had in identifying a specific Severin class.
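    The two groupings described above can be sketched as a simple recoding step applied before agreement is recalculated. The helper names and the six example ratings are hypothetical.

```python
def dichotomize(ratings, first_group):
    """Recode each Severin class as 0 if it belongs to first_group, else 1."""
    return [0 if r in first_group else 1 for r in ratings]


def observed_agreement(a, b):
    """Proportion of hips given the same (recoded) rating by both raters."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical ratings of six hips by two raters
rater_a = [1, 2, 2, 3, 4, 1]
rater_b = [1, 2, 3, 3, 4, 2]

# Grouping 1: class I versus classes II, III, and IV
normal_a = dichotomize(rater_a, {1})
normal_b = dichotomize(rater_b, {1})

# Grouping 2: classes I and II versus classes III and IV
good_a = dichotomize(rater_a, {1, 2})
good_b = dichotomize(rater_b, {1, 2})

print(observed_agreement(rater_a, rater_b))    # individual classes
print(observed_agreement(normal_a, normal_b))  # grouping 1
print(observed_agreement(good_a, good_b))      # grouping 2
```

    In this example the class II versus class III disagreement on the third hip disappears under the first grouping but persists under the second, whereas the class I versus class II disagreement on the last hip does the reverse. This is the sense in which the dichotomous analyses can localize where raters have difficulty.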
    The level of agreement was assessed with use of the system described by Svanholm et al.31, in which a kappa value of 0.50 or less indicates poor agreement, a value of 0.51 to 0.75 indicates moderate agreement, and a value of 0.76 or more indicates excellent agreement. This method is more stringent than the more commonly used system defined by Landis and Koch18, in which a kappa value of 0.21 to 0.40 indicates fair agreement; 0.41 to 0.60, moderate agreement; 0.61 to 0.80, substantial agreement; and 0.81 to 1.0, excellent agreement. The 95 per cent confidence intervals were calculated as kappa ± 1.96 times the standard error.
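    The two interpretive scales and the confidence-interval formula can be sketched as follows. The cutoffs are those quoted in the preceding paragraph; the label returned for values below 0.21 on the Landis and Koch scale is a placeholder, as the text does not name that band. The standard error of 0.097 is an assumption, back-calculated from the interval reported for raters 1 and 2 in Table III, and is not a value reported in the article.

```python
def svanholm_level(kappa):
    """Agreement level on the scale attributed to Svanholm et al. in the text."""
    k = round(kappa, 2)
    if k >= 0.76:
        return "excellent"
    if k >= 0.51:
        return "moderate"
    return "poor"


def landis_koch_level(kappa):
    """Agreement level using the Landis and Koch bands as quoted in the text."""
    k = round(kappa, 2)
    if k >= 0.81:
        return "excellent"
    if k >= 0.61:
        return "substantial"
    if k >= 0.41:
        return "moderate"
    if k >= 0.21:
        return "fair"
    return "below fair"  # placeholder: band not named in the text


def confidence_interval(kappa, standard_error, z=1.96):
    """95 per cent interval as kappa +/- 1.96 times the standard error."""
    half_width = z * standard_error
    return kappa - half_width, kappa + half_width


kappa = 0.42            # raters 1 and 2, Table III
standard_error = 0.097  # assumed, back-calculated from the reported interval
low, high = confidence_interval(kappa, standard_error)
print(svanholm_level(kappa), landis_koch_level(kappa))
print(f"{low:.2f} to {high:.2f}")
```

    The same coefficient of 0.42 is rated poor on the first scale and moderate on the second, which illustrates why the first is described as the more stringent.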

    Interobserver Agreement

    Individual Classes

    No hip was rated as Severin class V or VI. The distributions among the four remaining Severin classes were very similar for raters 1 and 2, both of whom rated a much higher percentage of hips as classes II and III than as classes I and IV (Table II). In contrast, rater 3 demonstrated a distinct bias toward rating hips as class II at the time of the first evaluation and as class III at the time of the second evaluation. The total number of hips in each class was almost identical for raters 4 and 5, both of whom rated most hips as class I. However, summating the data in this manner does not reflect the true interobserver or intraobserver agreement, as the hips assigned to a given class by one rater may not be the same hips assigned to that class by another rater. Calculation of the kappa coefficient for pairwise agreement, however, does permit direct comparisons of the classifications assigned by the different raters.
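    The limitation of summated ratings can be made concrete with a small hypothetical example in which two raters assign exactly the same number of hips to each class yet never agree on an individual hip. The ratings below are invented for illustration only.

```python
from collections import Counter

# Hypothetical ratings of six hips; class totals are identical for both raters
rater_a = [1, 1, 2, 2, 3, 3]
rater_b = [2, 2, 3, 3, 1, 1]

same_totals = Counter(rater_a) == Counter(rater_b)
hips_in_agreement = sum(a == b for a, b in zip(rater_a, rater_b))

print(same_totals)        # identical distributions across classes
print(hips_in_agreement)  # number of hips rated identically
```

    Here the class totals match perfectly, but no hip receives the same class from both raters, so a table of totals alone would conceal complete disagreement.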
    The kappa coefficients for the six pairwise comparisons between the classifications assigned by the four blinded raters at the time of the first interpretation ranged from -0.05 to 0.42, with the average kappa value (0.15) only slightly greater than that due to chance alone (Table III). The pairwise comparisons between raters 1, 2, and 3 demonstrated better agreement than all of the comparisons between any one of these three raters and rater 4. However, none of the comparisons demonstrated a kappa coefficient that indicated even moderate agreement (κ > 0.50). All of the pairwise comparisons between the four blinded raters and the operating surgeon demonstrated poor agreement, as indicated by an average kappa coefficient of 0.02. Even the comparison between rater 4 and the operating surgeon (the two raters who had assigned almost identical ratings) (κ = 0.13) did not demonstrate substantial agreement (Table III). The weighted kappa statistics were either nearly equivalent to or slightly greater than the unadjusted kappa coefficients.
    The composite multi-rater kappa scores, which indicate the degree to which all raters agreed that a specific hip belonged to a specific Severin class, indicated poor agreement when all five raters were included in the analysis as well as when rater 5 was excluded (Table IV). The scores were modestly greater within the two classes that indicate a more severe abnormality (classes III and IV).

    Dichotomous Groups

    Class I and classes II, III, and IV: For the next set of analyses, the four Severin classes were divided into two dichotomous groups: the first group included class I (normal hips), and the second group included classes II, III, and IV (abnormal hips). Only the comparisons between raters 1 and 2 and raters 4 and 5 demonstrated a level of agreement beyond that expected by chance alone (that is, the 95 per cent confidence interval did not include zero) (Table V). Even these comparisons, however, demonstrated only poor or moderate agreement. Furthermore, the multi-rater kappa scores for the four blinded raters as well as for all five raters actually indicated that the level of agreement was less than that expected by chance alone (κ = -0.02 for both groups).
    Classes I and II and classes III and IV: For the final assessment of interobserver agreement, the four classes again were combined into two dichotomous groups: the first group included classes I and II (normal and mildly deformed hips), and the second group included classes III and IV (dysplastic and subluxated hips). The results were similar to those of the previous analysis, except for the comparisons between raters 2 and 3 (κ increased from 0.11 to 0.58), 2 and 4 (κ increased from -0.01 to 0.16), and 4 and 5 (κ decreased from 0.21 to -0.03) (Table V). The composite multi-rater kappa scores indicated poor over-all agreement (κ = 0.20 and 0.29).

    Intraobserver Agreement

    Intraobserver agreement was assessed by comparing the results of the evaluations performed by raters 1, 2, and 3. When the data were analyzed according to the individual Severin classes, the kappa coefficients indicated poor agreement and the weighted kappa statistics indicated poor or moderate agreement (Table VI). All of the kappa values were significantly greater than zero (p < 0.05).
    Analysis of the data according to the two sets of dichotomous groups demonstrated similar results compared with those for the individual classes (Table VII). The average intraobserver agreement was poor for both sets of dichotomous groups (κ = 0.43 and 0.47). The findings regarding intraobserver agreement mirrored those regarding interobserver agreement.
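    The two averages quoted above can be reproduced from the per-rater kappa coefficients listed in Table VII. The short check below is offered only as a worked example of that arithmetic; the variable names are hypothetical.

```python
from statistics import mean

# Per-rater intraobserver kappa coefficients from Table VII (raters 1, 2, and 3)
kappa_class_1_vs_rest = [0.49, 0.31, 0.49]  # class I versus classes II, III, and IV
kappa_1_2_vs_3_4 = [0.52, 0.60, 0.29]       # classes I and II versus classes III and IV

avg_first = round(mean(kappa_class_1_vs_rest), 2)
avg_second = round(mean(kappa_1_2_vs_3_4), 2)
print(avg_first, avg_second)
```

    Both averages fall at or below 0.50, which is the range that the Svanholm scale used in this study labels as poor agreement.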
    The results of the present study suggest that the Severin classification system is not a reliable tool for the evaluation of the radiographic results of operations performed for the treatment of congenital dislocation of the hip. There was wide interobserver variation between the four fellowship-trained pediatric orthopaedic surgeons and the operating surgeon (Table III and Fig. 1). In addition, the three surgeons who performed two evaluations eight weeks apart demonstrated poor or moderate ability to replicate their own radiographic ratings (Table VI). Although intraobserver agreement was somewhat better than interobserver agreement for raters 1, 2, and 3 across the various Severin classes, neither was excellent according to kappa analysis. It was not possible to assess the accuracy of the Severin system because there is no universally accepted standard available for comparative purposes.
    The wide variation in the ratings assigned by the five raters (Table II) reflected the very strong bias of each rater. Such bias is most likely due to either the inability of the raters to use the classification system correctly or the ambiguities within the system that prevent discrimination between varying postoperative appearances of the hip. The presence of substantial bias was confirmed by low unadjusted kappa coefficients and low composite multi-rater kappa scores. Although the inability to distinguish between Severin classes I and II or between classes III and IV is probably not clinically important, the inability to distinguish between classes II and III is. Whereas most authors have considered a class-II rating to indicate a good radiographic result, a class-III rating has, at best, been considered to indicate a fair radiographic result2-4,13,15.
    A number of previous investigators who have used kappa statistics to evaluate the reliability and reproducibility of other orthopaedic classification systems have reported findings that have contradicted commonly accepted orthopaedic teaching. Siebenrock and Gerber28, Sidor et al.27, and Brien et al.5 independently reported that the Neer system for the classification of fractures of the proximal part of the humerus was not reproducible enough to allow for a meaningful comparison of the results of studies in which that system was used. Sidor et al., for example, reported that the mean interobserver reliability of this system was only moderate (κ = 0.41 to 0.60) according to the system of Landis and Koch18. Frandsen et al.12 found poor interobserver agreement when the Gardner system was used to grade fractures of the femoral neck, and Andersen et al.1 found poor interobserver agreement when the Evans system was used to classify intertrochanteric fractures of the hip. Nielsen et al.20 as well as Thomsen et al.32 determined that there was less-than-acceptable agreement (κ < 0.51) when the Lauge-Hansen system was used to classify fractures of the ankle.
    One of the more interesting findings of the present study is the generally poor agreement between the operating surgeon and the other four raters. The operating surgeon (rater 5) classified forty-six of the fifty-six hips as normal (class I); although one blinded rater (rater 4) classified forty-four hips as normal, the other three raters classified only one to sixteen hips as normal (Table II). There are several possible explanations for these findings. First, because the four blinded raters knew that their interpretations were to be analyzed, they may have been more critical in their assessment than was the operating surgeon, who determined the ratings in the clinical setting. Such a situation may be expected to result in an increased level of agreement between blinded raters. Second, unintentional bias may have influenced the operating surgeon when he evaluated the results of operations that he had performed; blinded investigators generally are less susceptible to this type of methodological error. Therefore, whenever possible, radiographic results should be assessed by experts who are unaware of the operative treatment and other clinical information related to the patient. However, even when the ratings of the operating surgeon were excluded, the level of interobserver agreement between the blinded raters was unacceptably low.
    We attempted to control for a number of uncertainties that may have confounded the results. It has been suggested that the expertise of the raters can affect interobserver agreement17,21. For this reason, only surgeons who had similar training and clinical experience were asked to participate in this study. In addition, care was taken to ensure that the blinded raters did not have access to any data regarding the identity of the patient. A goniometer and the guidelines for the Severin system were available to all of the blinded raters. In addition, all of the blinded raters were given similar instructions, and only the operating surgeon may have been influenced by unintentional bias.
    Despite efforts to control for these potential areas of bias, agreement was less than acceptable when comparable experts attempted to apply the Severin system. This finding suggests a fundamental lack of clarity in the descriptive criteria and perhaps points to the need for a modified or newly designed system. The five raters in the present study obviously had different interpretations of what constitutes moderate deformity, dysplasia, and subluxation, and it is not clear what Severin himself meant by these terms. In short, the definitions of the terms in the Severin system are not specific enough to allow raters to achieve a substantial level of agreement.
    The single quantitative aspect of the Severin system is the measurement of the center-edge angle. The raters who participated in the present study expressed many concerns about the reliability of this measurement. In some cases, a rater estimated the center-edge angle to be normal but deemed the hip to be dysplastic. The reverse situation—an abnormal angle without obvious dysplasia—also was encountered. A limitation of our study lies in the various ways in which raters measured the center-edge angle. Although a goniometer was available, it was not always used in a standardized manner. Furthermore, raters frequently believed that the measurement of the center-edge angle did not completely reflect the condition of the hip; consequently, the Severin class was determined on the basis of subjective impressions as well as the measurement of the center-edge angle. However, this is how the Severin system usually is used in the clinical setting. We agree with Stulberg and Harris30, who reported that the center-edge angle may not be a useful index of acetabular development because it is affected by many factors.
    There is no uniform agreement on the definitions of dysplasia and subluxation. Coleman9 as well as Weinstein33 defined dysplasia as inadequate development of the acetabulum or the femoral head, or both, with an intact Shenton line, and defined subluxation as dysplasia with a broken Shenton line. Many variations of these definitions have been offered14,16,19. Although it is not our intent to present a new, validated rating system to grade the results of operations performed for the treatment of congenital dislocation of the hip, we believe that a more quantitative approach to grading is needed. The Severin classification scheme allows for too much subjectivity, which results in an assessment that is not sufficiently reproducible. Stulberg and Harris30 as well as Murphy et al.19 attempted to be more exact in describing the late appearance of congenital dysplasia of the hip. They discussed the use of several numerical indices, including the center-edge angle, the femoral-head extrusion index, the acetabular index of the weight-bearing zone, the measured amount of lateral and superior subluxation, the peak-to-edge distance, and the angle of the acetabular roof, among others. However, the intraobserver and interobserver agreement for each of these numerical indices must be determined before they are used to grade results of operative treatment of congenital dislocation of the hip.
    The present study indicates that the levels of intraobserver and interobserver reliability are unacceptably low when the Severin system is used to classify the radiographic results of operations performed for the treatment of congenital dislocation of the hip. Currently, both the decision to treat congenital hip disease and the evaluation of the postoperative results frequently are based on the Severin classification system.
    The findings of the present study will enable clinicians to place into more meaningful perspective the clinical conclusions of investigators who have used the Severin system as the basis of proof. Additional research is needed to construct a more precise system for the classification of the results of operations performed for the treatment of congenital dislocation of the hip.
    1. Andersen, E.; Jorgensen, L. G.; and Hededam, L. T.: Evans' classification of trochanteric fractures: an assessment of the interobserver and intraobserver reliability. Injury, 21: 377-378, 1990.
    2. Barrett, W. P.; Staheli, L. T.; and Chew, D. E.: The effectiveness of the Salter innominate osteotomy in the treatment of congenital dislocation of the hip. J. Bone and Joint Surg., 68-A: 79-87, Jan. 1986.
    3. Berkeley, M. E.; Dickson, J. H.; Cain, T. E.; and Donovan, M. M.: Surgical therapy for congenital dislocation of the hip in patients who are twelve to thirty-six months old. J. Bone and Joint Surg., 66-A: 412-420, March 1984.
    4. Blockey, N. J.: Derotation osteotomy in the management of congenital dislocation of the hip. J. Bone and Joint Surg., 66-B(4): 485-490, 1984.
    5. Brien, H.; Noftall, F.; MacMaster, S.; Cummings, T.; Landells, C.; and Rockwood, P.: Neer's classification system: a critical appraisal. J. Trauma, 38: 257-260, 1995.
    6. Byrt, T.; Bishop, J.; and Carlin, J. B.: Bias, prevalence and kappa. J. Clin. Epidemiol, 46: 423-429, 1993.
    7. Cohen, J. A.: A coefficient of agreement for nominal scales. Educat. and Psychol. Measure, 20: 37-46, 1960.
    8. Cohen, J.: Weighted kappa. Nominal scale agreement with provision for scaled disagreement or partial credit. Psychol. Bull., 70: 213-220, 1968.
    9. Coleman, S. S.: Congenital Dysplasia and Dislocation of the Hip. St. Louis, C. V. Mosby, 1978.
    10. Faciszewski, T.; Kiefer, G. N.; and Coleman, S. S.: Pemberton osteotomy for residual acetabular dysplasia in children who have congenital dislocation of the hip. J. Bone and Joint Surg., 75-A: 643-649, May 1993.
    11. Fleiss, J. L.: Statistical Methods for Rates and Proportions. Ed. 2, p. 217. New York, Wiley, 1981.
    12. Frandsen, P. A.; Andersen, E.; Madsen, F.; and Skjodt, T.: Garden's classification of femoral neck fractures. An assessment of inter-observer variation. J. Bone and Joint Surg., 70-B(4): 588-590, 1988.
    13. Galpin, R. D.; Roach, J. W.; Wenger, D. R.; Herring, J. A.; and Birch, J. G.: One-stage treatment of congenital dislocation of the hip in older children, including femoral shortening. J. Bone and Joint Surg., 71-A: 734-741, June 1989.
    14. Ganz, R.; Klaue, K.; Vinh, T. S.; and Mast, J. W.: A new periacetabular osteotomy for the treatment of hip dysplasia. Technique and preliminary results. Clin. Orthop., 232: 26-36, 1988.
    15. Kasser, J. R.; Bowen, J. R.; and MacEwen, G. D.: Varus derotation osteotomy in the treatment of persistent dysplasia in congenital dislocation of the hip. J. Bone and Joint Surg., 67-A: 195-202, Feb. 1985.
    16. Kim, Y.-H.: Acetabular dysplasia and osteoarthritis developed by an eversion of the acetabular labrum. Clin. Orthop., 215: 289-295, 1987.
    17. Kristiansen, B.; Andersen, U. L.; Olsen, C. A.; and Varmarken, J. E.: The Neer classification of fractures of the proximal humerus. An assessment of interobserver variation. Skel. Radiol., 17: 420-422, 1988.
    18. Landis, J. R., and Koch, G. G.: The measurement of observer agreement for categorical data. Biometrics, 33: 159-174, 1977.
    19. Murphy, S. B.; Ganz, R.; and Müller, M. E.: The prognosis in untreated dysplasia of the hip. A study of radiographic factors that predict the outcome. J. Bone and Joint Surg., 77-A: 985-989, July 1995.
    20. Nielsen, J. O.; Dons-Jensen, H.; and Sorensen, H. T.: Lauge-Hansen classification of malleolar fractures. An assessment of the reproducibility in 118 cases. Acta Orthop. Scandinavica, 61: 385-387, 1990.
    21. Rasmussen, S.; Madsen, P. V.; and Bennicke, K.: Observer variation in the Lauge-Hansen classification of ankle fractures. Precision improved by instruction. Acta Orthop. Scandinavica, 64: 693-694, 1993.
    22. Salter, R. B.: Innominate osteotomy in the treatment of congenital dislocation and subluxation of the hip. J. Bone and Joint Surg., 43-B(3): 518-539, 1961.
    23. Salter, R. B., and Dubos, J.-P.: The first fifteen years' personal experience with innominate osteotomy in the treatment of congenital dislocation and subluxation of the hip. Clin. Orthop., 98: 72-103, 1974.
    24. Schoenecker, P. L., and Strecker, W. B.: Congenital dislocation of the hip in children. Comparison of the effects of femoral shortening and of skeletal traction in treatment. J. Bone and Joint Surg., 66-A: 21-27, Jan. 1984.
    25. Severin, E.: Contribution to the knowledge of congenital dislocation of the hip joint. Late results of closed reduction and arthrographic studies of recent cases. Acta Chir. Scandinavica, Supplementum 63, 1941.
    26. Severin, E.: Congenital dislocation of the hip. Development of the joint after closed reduction. J. Bone and Joint Surg., 32-A: 507-518, July 1950.
    27. Sidor, M. L.; Zuckerman, J. D.; Lyon, T.; Koval, K.; Cuomo, F.; and Schoenberg, N.: The Neer classification system for proximal humeral fractures. An assessment of interobserver reliability and intraobserver reproducibility. J. Bone and Joint Surg., 75-A: 1745-1750, Dec. 1993.
    28. Siebenrock, K. A., and Gerber, C.: The reproducibility of classification of fractures of the proximal end of the humerus. J. Bone and Joint Surg., 75-A: 1751-1755, Dec. 1993.
    29. Spitznagel, E. L., and Helzer, J. E.: A proposed solution to the base rate problem in the kappa statistic. Arch. Gen. Psychiatry, 42: 725-728, 1985.
    30. Stulberg, S. D., and Harris, W. H.: Acetabular dysplasia and development of osteoarthritis of hip. In The Hip. Proceedings of the Second Open Scientific Meeting of The Hip Society, pp. 82-93. St. Louis, C. V. Mosby, 1974.
    31. Svanholm, H.; Starklint, H.; Gundersen, H. J.; Fabricius, J.; Barlebo, H.; and Olsen, S.: Reproducibility of histomorphologic diagnosis with special reference to the kappa statistic. APMIS, 97: 689-698, 1989.
    32. Thomsen, N. O. B.; Overgaard, S.; Olsen, L. H.; Hansen, H.; and Nielsen, S. T.: Observer variation in the radiographic classification of ankle fractures. J. Bone and Joint Surg., 73-B(4): 676-678, 1991.
    33. Weinstein, S. L.: Natural history of congenital hip dislocation (CDH) and hip dysplasia. Clin. Orthop., 225: 62-76, 1987.
    34. Williamson, D. M., and Benson, M. K. D.: Late femoral osteotomy in congenital dislocation of the hip. J. Bone and Joint Surg., 70-B(4): 614-618, 1988.
    35. Zionts, L. E., and MacEwen, G. D.: Treatment of congenital dislocation of the hip in children between the ages of one and three years. J. Bone and Joint Surg., 68-A: 829-846, July 1986.
    36. Zwick, R.: Another look at interrater agreement. Psychol. Bull., 103: 374-378, 1988.

    Fig. 1 Anteroposterior radiograph, made five years after a bilateral open reduction through a medial approach. The hip on the left side of the radiograph was rated as class II by rater 1, class IV by rater 2, class III by rater 3, and class I by raters 4 and 5. The hip on the right side of the radiograph was rated as class III by rater 1, class IV by rater 2, class II by rater 3, and class I by raters 4 and 5.

    TABLE I SEVERIN CLASSIFICATION SYSTEM [25]*

    Class | Radiographic Appearance | Center-Edge Angle
    I | Normal |
    Ia | | >19° (6 to 13 years old); >25° (≥14 years old)
    Ib | | >15 to 19° (6 to 13 years old); 20 to 25° (≥14 years old)
    II | Moderate deformity of femoral head, femoral neck, or acetabulum |
    IIa | | >19° (6 to 13 years old); >25° (≥14 years old)
    IIb | | >15 to 19° (6 to 13 years old); 20 to 25° (≥14 years old)
    III | Dysplasia without subluxation | <15° (6 to 13 years old); <20° (≥14 years old)
    IV | |
    IVa | Moderate subluxation | ≥0°
    IVb | Severe subluxation | <0°
    V | Femoral head articulates with pseudoacetabulum in superior part of original acetabulum |
    VI | Redislocation |

    * The subclassifications were not used in this study.
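
The age-dependent center-edge thresholds in Table I can be sketched as a small lookup. This is only an illustration of the quantitative part of the system: the function name and the band labels are hypothetical, the older age group is assumed to mean 14 years or older, and children between their 13th and 14th birthdays are assumed to belong to the younger group. As the discussion stresses, the angle alone does not determine the Severin class, since raters also rely on the radiographic appearance.

```python
def center_edge_band(angle_deg: float, age_years: float) -> str:
    """Return the Table I angle band for a measured center-edge angle.

    Band labels (hypothetical, for illustration only):
      "a"     -> range listed for subclasses Ia and IIa
      "b"     -> range listed for subclasses Ib and IIb
      "below" -> under the lower limit (the range listed for class III)
    """
    if age_years < 6:
        raise ValueError("Table I lists thresholds only from 6 years of age")
    # Upper and lower cut-offs in degrees for the two age groups.
    # Assumption: ages 6 to under 14 use 19/15; ages 14 and over use 25/20.
    upper, lower = (19.0, 15.0) if age_years < 14 else (25.0, 20.0)
    if angle_deg > upper:
        return "a"
    if angle_deg >= lower:
        return "b"
    return "below"


if __name__ == "__main__":
    # The same 22-degree angle falls in different bands at different ages.
    print(center_edge_band(22, 10))  # a
    print(center_edge_band(22, 15))  # b
```

The example shows why a standardized measurement matters: a difference of a few degrees between raters, or between age groups, is enough to move a hip across a band boundary.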

    TABLE II SUMMATED RATINGS FOR EACH RATER*

    Rater | Class I | Class II | Class III | Class IV
    1 | 8/16 | 22/27 | 22/10 | 4/3
    2 | 9/11 | 26/20 | 14/17 | 7/8
    3 | 3/1 | 34/13 | 16/30 | 3/12
    4 | 44/NA | 9/NA | 3/NA | 0/NA
    5 (operating surgeon) | 46/NA | 9/NA | 0/NA | 1/NA

    * The values are given as the number of hips assigned to each class at the first evaluation/second evaluation. NA = not applicable.

    TABLE III ANALYSIS OF INTEROBSERVER AGREEMENT ACCORDING TO INDIVIDUAL SEVERIN CLASSES [25]

    Comparison | Observed Agreement (N = 56)* | Kappa Coefficient† | Weighted Kappa Statistic†
    Blinded raters
    1 and 2 | 34 (61) | 0.42 (0.23 to 0.61) | 0.55 (0.37 to 0.73)
    2 and 3 | 32 (57) | 0.32 (0.16 to 0.48) | 0.43 (0.26 to 0.60)
    1 and 3 | 27 (48) | 0.19 (-0.01 to 0.40) | 0.34 (0.15 to 0.52)
    1 and 4 | 16 (29) | 0.11 (-0.04 to 0.26) | 0.10 (-0.01 to 0.21)
    2 and 4 | 9 (16) | -0.05 (-0.17 to 0.07) | 0.04 (-0.04 to 0.12)
    3 and 4 | 8 (14) | -0.01 (-0.12 to 0.1) | 0.05 (-0.02 to 0.12)
    Blinded rater and operating surgeon
    1 and 5 | 10 (18) | 0 (-0.13 to 0.12) | 0.05 (-0.02 to 0.12)
    2 and 5 | 10 (18) | -0.02 (-0.15 to 0.1) | 0.05 (-0.02 to 0.12)
    3 and 5 | 9 (16) | 0.02 (-0.09 to 0.13) | 0.06 (0.0 to 0.11)
    4 and 5 | 40 (72) | 0.13 (-0.23 to 0.49) | 0.16 (-0.05 to 0.37)

    * The data are given as the number of hips, with the percentage in parentheses.
    † The numbers in parentheses indicate the 95 per cent confidence intervals.
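
To make the relation between observed agreement and the kappa coefficient concrete, the following minimal sketch recomputes the unweighted Cohen kappa for three rater pairs from the published counts. It assumes that the first-evaluation class totals in Table II are the ratings behind the interobserver comparisons in Table III; the function and variable names are illustrative and not taken from the study.

```python
from typing import Sequence


def cohen_kappa(n_agree: int, totals_a: Sequence[int], totals_b: Sequence[int]) -> float:
    """Unweighted Cohen kappa from the agreement count and each rater's class totals."""
    n = sum(totals_a)
    if n != sum(totals_b):
        raise ValueError("both raters must have rated the same number of hips")
    p_observed = n_agree / n
    # Chance agreement: probability that both raters choose the same class
    # if each assigned classes independently at their own observed frequencies.
    p_chance = sum(a * b for a, b in zip(totals_a, totals_b)) / (n * n)
    return (p_observed - p_chance) / (1.0 - p_chance)


# First-evaluation totals for classes I, II, III, and IV (Table II).
TOTALS = {
    1: (8, 22, 22, 4),
    2: (9, 26, 14, 7),
    3: (3, 34, 16, 3),
    4: (44, 9, 3, 0),
}
# Number of hips on which each pair agreed (Table III).
AGREED = {(2, 3): 32, (1, 3): 27, (1, 4): 16}

if __name__ == "__main__":
    for (a, b), n_agree in AGREED.items():
        kappa = cohen_kappa(n_agree, TOTALS[a], TOTALS[b])
        print(f"raters {a} and {b}: kappa = {kappa:.2f}")
    # raters 2 and 3: kappa = 0.32
    # raters 1 and 3: kappa = 0.19
    # raters 1 and 4: kappa = 0.11
```

Under that assumption the three values agree with the kappa coefficients reported in Table III for the same pairs. The weighted kappa statistic cannot be recomputed this way, because it needs the full cross-tabulation of ratings rather than the class totals.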

    TABLE IV COMPOSITE MULTI-RATER KAPPA SCORES ACCORDING TO INDIVIDUAL SEVERIN CLASSES

    Severin Class [25] | All Five Raters: Observed Agreement* (N = 56) | All Five Raters: Multi-Rater Kappa Score† | Four Blinded Raters: Observed Agreement* (N = 56) | Four Blinded Raters: Multi-Rater Kappa Score†
    I | 5 (9) | 0.02 (-0.10 to 0.06) | 11 (20) | -0.02 (-0.13 to 0.08)
    II | 8 (14) | 0.02 (-0.06 to 0.10) | 12 (21) | 0.09 (-0.01 to 0.20)
    III | 24 (43) | 0.10 (0.01 to 0.18) | 24 (43) | 0.16 (0.05 to 0.27)
    IV | 47 (84) | 0.23 (0.15 to 0.31) | 47 (84) | 0.29 (0.18 to 0.40)

    * The data are given as the number of hips, with the percentage in parentheses.
    † The numbers in parentheses indicate the 95 per cent confidence intervals.

    TABLE V ANALYSIS OF INTEROBSERVER AGREEMENT ACCORDING TO DICHOTOMOUS GROUPS

    Comparison | Class I and Classes II, III, and IV: Observed Agreement* (N = 56) | Class I and Classes II, III, and IV: Kappa Coefficient† | Classes I and II and Classes III and IV: Observed Agreement* (N = 56) | Classes I and II and Classes III and IV: Kappa Coefficient†
    Blinded raters
    1 and 2 | 50 (89) | 0.56 (0.23 to 0.89) | 44 (79) | 0.56 (0.34 to 0.78)
    2 and 3 | 47 (84) | 0.11 (-0.42 to 0.64) | 45 (80) | 0.58 (0.39 to 0.77)
    1 and 3 | 49 (88) | 0.31 (-0.16 to 0.79) | 37 (66) | 0.31 (0.06 to 0.56)
    1 and 4 | 20 (36) | 0.09 (-0.09 to 0.27) | 33 (59) | 0.12 (-0.16 to 0.40)
    2 and 4 | 16 (29) | -0.01 (-0.35 to 0.33) | 37 (66) | 0.16 (-0.14 to 0.46)
    3 and 4 | 15 (27) | 0.03 (-0.12 to 0.18) | 38 (68) | 0.10 (-0.24 to 0.44)
    Composite multi-rater kappa score | 5 (9) | -0.02 (-0.13 to 0.08) | 22 (39) | 0.20 (0.12 to 0.28)
    Blinded raters and operating surgeon
    1 and 5 | 18 (32) | 0.08 (-0.05 to 0.21) | 31 (55) | 0.04 (-0.24 to 0.32)
    2 and 5 | 18 (32) | 0.07 (-0.09 to 0.23) | 35 (63) | 0.05 (-0.04 to 0.22)
    3 and 5 | 13 (23) | 0.02 (-0.12 to 0.16) | 38 (68) | 0.07 (-0.28 to 0.42)
    4 and 5 | 42 (75) | 0.21 (0.03 to 0.39) | 52 (93) | -0.03 (-1.0 to 1.0)
    Composite multi-rater kappa score | 5 (9) | -0.02 (-0.10 to 0.06) | 25 (45) | 0.29 (0.19 to 0.40)

    * The data are given as the number of hips, with the percentage in parentheses.
    † The numbers in parentheses indicate the 95 per cent confidence intervals.
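
The comparison of raters 4 and 5 shows how a high observed agreement can coexist with a kappa coefficient near zero. The sketch below collapses the Table II class totals into the two dichotomous groupings and recomputes kappa from the agreement counts in Table V. The helper names are hypothetical, and the calculation assumes the standard unweighted Cohen formula.

```python
def collapse(totals, split):
    """Collapse class totals (I, II, III, IV) into two groups at position `split`."""
    return (sum(totals[:split]), sum(totals[split:]))


def kappa_two_groups(n_agree, group_a, group_b):
    """Unweighted Cohen kappa for a two-category rating."""
    n = sum(group_a)
    p_observed = n_agree / n
    # Agreement expected by chance from the two raters' group frequencies.
    p_chance = (group_a[0] * group_b[0] + group_a[1] * group_b[1]) / (n * n)
    return (p_observed - p_chance) / (1.0 - p_chance)


RATER_4 = (44, 9, 3, 0)  # Table II, first evaluation
RATER_5 = (46, 9, 0, 1)  # Table II, operating surgeon

# Class I versus classes II, III, and IV: 42 of 56 hips agreed (Table V).
kappa_class_1 = kappa_two_groups(42, collapse(RATER_4, 1), collapse(RATER_5, 1))
# Classes I and II versus classes III and IV: 52 of 56 hips agreed (Table V).
kappa_class_1_2 = kappa_two_groups(52, collapse(RATER_4, 2), collapse(RATER_5, 2))

if __name__ == "__main__":
    print(f"{kappa_class_1:.2f}")    # 0.21
    print(f"{kappa_class_1_2:.2f}")  # -0.03
```

In the second grouping both raters placed almost every hip in classes I and II (53 and 55 of 56 hips), so the agreement expected by chance is already about 93 per cent and the observed 52 of 56 adds nothing beyond it. This sensitivity of kappa to very uneven class frequencies is the prevalence effect discussed in the statistical literature on kappa.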

    TABLE VI ANALYSIS OF INTRAOBSERVER AGREEMENT ACCORDING TO INDIVIDUAL SEVERIN CLASS [25]

    Rater | Observed Agreement* (N = 56) | Kappa Coefficient† | Weighted Kappa Statistic†
    1 | 32 (57) | 0.38 (0.19 to 0.57) | 0.51 (0.35 to 0.67)
    2 | 34 (61) | 0.44 (0.26 to 0.62) | 0.59 (0.41 to 0.77)
    3 | 25 (45) | 0.20 (0.01 to 0.39) | 0.32 (0.18 to 0.46)

    * The data are given as the number of hips, with the percentage in parentheses.
    † The numbers in parentheses indicate the 95 per cent confidence intervals.

    TABLE VII ANALYSIS OF INTRAOBSERVER AGREEMENT ACCORDING TO DICHOTOMOUS GROUPS

    Rater | Class I and Classes II, III, and IV: Observed Agreement* (N = 56) | Class I and Classes II, III, and IV: Kappa Coefficient† | Classes I and II and Classes III and IV: Observed Agreement* (N = 56) | Classes I and II and Classes III and IV: Kappa Coefficient†
    1 | 46 (82) | 0.49 (0.44 to 0.53) | 43 (77) | 0.52 (0.29 to 0.75)
    2 | 45 (80) | 0.31 (-0.05 to 0.68) | 45 (80) | 0.60 (0.39 to 0.81)
    3 | 54 (96) | 0.49 (-0.2 to 1.19) | 33 (59) | 0.29 (0.07 to 0.51)

    * The data are given as the number of hips, with the percentage in parentheses.
    † The numbers in parentheses indicate the 95 per cent confidence intervals.