Author manuscript; available in PMC: 2016 Sep 1.
Published in final edited form as: Arthritis Care Res (Hoboken). 2015 Sep;67(9):1304–1315. doi: 10.1002/acr.22585

Study for Updated Gout Classification Criteria (SUGAR): identification of features to classify gout

William J Taylor 1, Jaap Fransen 2, Tim L Jansen 2, Nicola Dalbeth 3, H Ralph Schumacher 4, Melanie Brown 1, Worawit Louthrenoo 5, Janitzia Vazquez-Mellado 6, Maxim Eliseev 7, Geraldine McCarthy 8,9, Lisa K Stamp 10, Fernando Perez-Ruiz 11, Francisca Sivera 12, Hang-Korng Ea 13,14, Martijn Gerritsen 15, Carlo Scire 16, Lorenzo Cavagna 17, Chingtsai Lin 18, Yin-Yi Chou 19, Anne-Kathrin Tausche 20, Ana Beatriz Vargas-Santos 21, Matthijs Janssen 22, Jiunn-Horng Chen 23,24, Ole Slot 25, Marco A Cimmino 26, Till Uhlig 27, Tuhina Neogi 28
PMCID: PMC4573373  NIHMSID: NIHMS692970  PMID: 25777045

Abstract

Objective

To determine which clinical, laboratory and imaging features most accurately distinguished gout from non-gout.

Methods

A cross-sectional study of consecutive rheumatology clinic patients with at least one swollen joint or subcutaneous tophus. Gout was defined by synovial fluid or tophus aspirate microscopy by certified examiners in all patients. The sample was randomly divided into a model development (2/3) and test sample (1/3). Univariate and multivariate association between clinical features and MSU-defined gout was determined using logistic regression modelling. Shrinkage of regression weights was performed to prevent over-fitting of the final model. Latent class analysis was conducted to identify patterns of joint involvement.

Results

In total, 983 patients were included. Gout was present in 509 (52%). In the development sample (n=653), the following features were selected for the final model (multivariate OR): joint erythema (2.13), difficulty walking (7.34), time to maximal pain < 24 hours (1.32), resolution by 2 weeks (3.58), tophus (7.29), MTP1 ever involved (2.30), location of currently tender joints: Other foot/ankle (2.28), MTP1 (2.82), serum urate level > 6 mg/dl (0.36 mmol/l) (3.35), ultrasound double contour sign (7.23), Xray erosion or cyst (2.49). The final model performed adequately in the test set with no evidence of misfit, high discrimination and predictive ability. MTP1 involvement was the most common joint pattern (39.4%) in gout cases.

Conclusion

Ten key discriminating features have been identified for further evaluation for new gout classification criteria. Ultrasound findings and degree of uricemia add discriminating value, and will significantly contribute to more accurate classification criteria.

Keywords: Gout, classification criteria, diagnostic ultrasound


Gout is the most common inflammatory arthritis in men and is increasing in prevalence (1, 2). Most gout worldwide is managed in primary care where disease identification seldom relies upon identification of monosodium urate (MSU) crystals because synovial fluid or polarizing light microscopy may not be easily obtainable (2). Therefore highly sensitive and specific classification criteria that do not require microscopic MSU crystal identification would be useful for clinical research conducted in epidemiological or primary care settings. Six classification criteria for gout have been developed, with the most widely used one being the 1977 American Rheumatism Association (ARA) criteria (3, 4).

However, there are significant limitations to the ARA criteria. These include the use of only patients with selected diagnoses (rheumatoid arthritis (RA), calcium pyrophosphate deposition disease (CPPD) and acute septic arthritis) as controls even though other rheumatic diseases may be amongst the group of patients or respondents for which such classification criteria for gout need to be applied. Only 47% of the controls had synovial fluid examination to confirm absence of MSU crystals.

The gold-standard chosen for the ARA classification criteria was physician diagnosis. In this way, the performance of synovial fluid microscopy could be examined. Only in 76 of 90 (84%) gout patients in whom synovial fluid was examined were MSU crystals identified. It is possible that excellent treatment of some patients led to clearance of MSU crystals, but the lower than expected sensitivity of synovial fluid analysis raises questions about the physician diagnosis (did the crystal-negative patients really have gout?) or the quality of the synovial fluid analysis. Many patients had incomplete data.

The ARA criteria were not tested in an external sample prior to publication, and thus were “preliminary” criteria. Subsequent external validation of the ARA survey criteria against a gold standard of synovial fluid analysis has been reported in two studies. In these studies the sensitivity was 70% and 80%, specificity was 78.8% and 64%, respectively (5, 6). Poor specificity is especially problematic in studies that seek to enroll patients into trials of new therapeutic agents, especially those with potential and unknown safety issues. It is important that patients accurately classified with gout are enrolled into studies to minimize inappropriate exposure to such drugs.

Due to the limitations of current criteria, ACR and EULAR have funded a project to update gout classification criteria (4). The Study for Updated Gout ClAssification CRiteria (SUGAR), undertaken as part of the gout classification project, was designed to determine the performance of possible items that could discriminate between gout and non-gout, including diagnostic ultrasound findings. This study represents the first phase of the effort to develop updated classification criteria, aiming to identify the pertinent items to be considered for further evaluation and to assist with decision making in that evaluation. In addition, we examined the distribution of joint involvement amongst gout and non-gout patients to help distinguish patterns of joint involvement that should be considered in developing new criteria. It is important to re-iterate the difference between classification criteria, which are designed for entry into clinical studies, and diagnosis, which is a process that involves clinical care. The distinctions are more fully explained in existing literature (7–9). The work in the present study concerns classification criteria rather than diagnosis.

The second phase of the project consists of an expert-based selection and weighting process, informed by the SUGAR study results, that will determine the new gout classification criteria, which we aim to have endorsed by both ACR and EULAR. The second phase together with the final criteria will be reported separately.

Patients and Methods

Consecutive patients attending a rheumatology clinic were enrolled into this cross-sectional study if they met all of the following: a main complaint of joint pain and swelling; a GP or rheumatologist judged that gout was a possible differential diagnosis because of joint pain and swelling; joint swelling currently or within the last 2 weeks, or a possible tophus; and no contraindications to arthrocentesis in the opinion of the clinical investigator. At the visit, clinical manifestations and a clinical diagnosis were recorded using standardized case record forms, prior to synovial fluid (or tissue) microscopy by a certified observer (see below for how observers were certified). Each centre received Ethics Committee Approval or Institutional Review Board approval according to local requirements. Informed consent was obtained from participants according to the requirements of local Ethics Committee or Institutional Review Board. The STAndards for the Reporting of Diagnostic accuracy studies (STARD) statement was used to guide reporting (10).

Gold-standard for gout (case definition)

All patients underwent arthrocentesis or tissue/tophus aspiration for polarizing microscopy to identify MSU crystals. All microscopic examination was undertaken by observers who had passed a 2-stage MSU-identification certification procedure, which consisted of a web-based crystal recognition test followed, for those who passed the online test, by examination of 5 to 8 vials of synovial fluid (SF) from the laboratories of Eliseo Pascual (European centres) or H. Ralph Schumacher (rest of the world). The web-based test was strict and had a high non-pass rate (61%) (11). Each SF sample in the second stage needed to be correctly identified as demonstrating MSU crystals or not to achieve certification. SF samples with at least 10 ml volume (to allow distribution of very small aliquots of the same specimens to all examinees) and typical findings for CPP crystals, MSU, and apatite (BCP, basic calcium phosphate) were selected. In addition, depot methylprednisolone crystals were added to a large osteoarthritic SF sample. Samples were pipetted into plastic tubes and sent with instructions to examine single-drop specimens in the examinee's microscope. Each SF sample was checked prior to shipping by express service to confirm stability of the crystals.

Cases of gout were defined as patients with MSU crystals identified by a certified observer. There were 42 certified observers involved in this study. Cases of non-gout were defined as patients without MSU crystals, irrespective of the clinical diagnosis. Synovial fluid/tissue microscopy by a certified observer was usually performed on the same day but could be done within 1 month of the visit to allow time to arrange an ultrasound guided aspiration. There was no restriction on the amount of fluid required from enrolled patients. Microbiological culture was sometimes not possible with small amounts of aspirated fluid, but this information was not mandatory. If arthrocentesis was not successful, investigators were able to repeat the procedure using US guidance. More than 1 joint could be aspirated according to the judgment of the clinical investigator, although this was uncommon. If no synovial fluid or tophus aspirate was obtained, the patient was not included in the study.

Potential classification items

Clinical data were collected at the index visit. Items were collected that included elements of existing classification criteria or had been ranked highly by physicians or patients in a previous Delphi exercise, which aimed specifically to identify potentially relevant diagnostic features of gout (12). Those items with a rating of 7 to 9 (agreed to be potentially discriminatory) from physicians or patients were selected.

All potential classification items (symptoms, signs, laboratory values and imaging results) were collected at the index visit and prior to synovial/tissue microscopy, using standardized case record forms. Currently swollen or tender joints were recorded on a homunculus. Advanced imaging items (magnetic resonance imaging, computed tomography, dual energy computed tomography) were recorded if available but were not mandatory. Ultrasound and conventional radiography were intended to be mandatory, but since data were usually collected within routine clinical care without specific funding, not every patient had radiography and ultrasound data available.

For participants with their first episode, items that referred solely to previous episodes were marked as non-applicable but predictors were constructed from items so as to include all episodes (current and previous).

Statistical analysis and sample size

The overall analytic plan is shown in supplementary Figure 1. Forty-five items were collected with the expectation of these reducing to 30 items for a multivariate analysis. It was planned that the dataset be randomly divided into development (2/3 of data) and test sets (1/3 of data), which was performed after all patients were included using a computerized random number generator. Using the rule of thumb for multivariate models of 10 participants per item divided by the smallest proportion of cases or controls (0.5) and 30 items for analysis, we estimated that 860 participants would be required for building a robust multivariate model (13).

Step 1: univariate analyses

In the development set, univariate logistic regression analysis was performed for each item, calculating the sensitivity (true positive rate, proportion of cases having the feature), 1-specificity (false positive rate, proportion of controls having the feature) and odds ratio (OR, odds of having the feature in cases, compared to controls) with its associated p-value.
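As a minimal sketch of the quantities described above: for a single dichotomous item, the odds ratio from a univariate logistic regression equals the cross-product ratio of the 2×2 table, so it can be approximated directly from the two proportions. The function names and the input values are illustrative assumptions rather than the study's own code, and rounded percentages will only approximate an OR estimated from raw counts.

```python
def odds(p: float) -> float:
    """Convert a proportion (0 < p < 1) to odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return p / (1.0 - p)


def univariate_summary(p_cases: float, p_controls: float) -> dict:
    """Summarise one dichotomous item.

    p_cases    -- proportion of cases with the feature (sensitivity)
    p_controls -- proportion of controls with the feature (1 - specificity)
    """
    return {
        "sensitivity": p_cases,
        "specificity": 1.0 - p_controls,
        # Odds of the feature in cases divided by odds in controls.
        "odds_ratio": odds(p_cases) / odds(p_controls),
    }


# Illustrative input: a feature present in 42% of cases and 10% of controls.
summary = univariate_summary(0.42, 0.10)
print(round(summary["odds_ratio"], 2))
```

The p-value reported alongside each OR comes from the fitted regression and is not reproduced by this sketch.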

After univariate analyses, 30 items were considered for inclusion in a multivariate model based on p-value (p<0.05; we did not use a more lenient p<0.1 at this stage due to the number of variables that were statistically significantly associated). Moreover, it was agreed that only items that were disease manifestations of gout would be included; thus risk indicators (as opposed to features of the disease itself) such as age, sex, diet and co-morbidities were not included. For items that were highly related (e.g., if there were >1 item regarding time to resolution or time to maximal pain), the item with the best clinical feasibility or, if equivalent, the item with the lowest p-value was selected.

Step 2: multivariate analyses

Two approaches were then taken to further develop the prediction model: firstly, a purely statistical approach starting with a full model containing all variables selected after univariate analyses, followed by backward selection using one-by-one deletion of items based on the size of the highest p-value of the regression coefficients until all p<0.2 (13). This was the ‘statistically optimal model’ used to gauge the performance of a clinical model. The second approach was a clinical hierarchical model using clinical reasoning and a forward selection procedure by collecting the major features of gout, sorted into domains of clinical information (joint inflammation, time course of symptoms, physical examination, laboratory test, and imaging). For that model, variables were selected from univariate analyses based on p-value (p<0.05), OR (stronger the better), fit to the domains (while avoiding duplication), and feasibility with regards to assessment. Variables in a domain were included all at once and backward selection was applied until all p-value<0.20; then the variables of the next domain were added all at once and the procedure was repeated for that domain. Variables from previously added domains were only removed again if their p-value became >0.50. We chose a final model from these two approaches based on statistical parameters indicating better model fit (Akaike Information Criterion, AIC) and predictive accuracy (c-statistic) to determine the strength of associations between the included factors and presence of crystal-proven gout.

Step 3: correcting for over-optimism

We applied a standard method of shrinkage to reduce error in predicted values when the model is applied to new data and to avoid over-estimating the predictive ability of the model. The regression coefficients of the final model were reduced by multiplication with a shrinkage factor determined by performing a bootstrap procedure with 300 repetitions (14, 15).
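The final step of this procedure, applying a uniform shrinkage factor, can be sketched as follows. This is an illustration only: the bootstrap estimation of the factor is not shown, the function name is hypothetical, and the coefficients and factor in the example are made-up values rather than study results.

```python
def shrink_coefficients(coefficients: dict, factor: float) -> dict:
    """Multiply every regression coefficient by one uniform shrinkage factor."""
    if not 0.0 < factor <= 1.0:
        raise ValueError("shrinkage factor is expected to lie in (0, 1]")
    return {name: beta * factor for name, beta in coefficients.items()}


# Hypothetical coefficients and factor, chosen only to show the arithmetic.
example = {"feature_a": 2.0, "feature_b": 0.5}
shrunk = shrink_coefficients(example, 0.8)
print(shrunk)
```

Because every coefficient is pulled toward zero by the same proportion, the ranking of predictors is unchanged while predicted probabilities become less extreme.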

Step 4: testing model performance

The performance of the ‘shrunken’ model in the test set was analyzed using the c-statistic, Nagelkerke’s R², the Hosmer-Lemeshow test and by plotting the predicted probabilities with the observed probabilities for groups classified according to deciles of the classification rule (calibration plot). The c-statistic is an estimate of discrimination, being equivalent to the area under the curve of a ROC curve, with values of 1 indicating perfect discrimination between gout and non-gout and 0.5 indicating discrimination no better than chance. The R² statistic is a measure of the difference between the observed class (gout/non-gout) and the class predicted by the prediction model (values of 0 indicating no explained variance and 1 indicating 100% of the variance in class membership are explained by the model). The Hosmer-Lemeshow test is a statistical test of model fit by assessing how well the observed class membership of subgroups corresponds to class membership predicted by the model. A non-significant test indicates good model fit. The optimal cut point that corresponded to the inflexion point of the ROC curve signifying maximum sensitivity and specificity was also determined for this shrunken prediction model.
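To make the c-statistic concrete, the sketch below computes it by direct pair counting: the proportion of all case-control pairs in which the case receives the higher predicted probability, with ties counted as one half. The function name and the small set of predicted probabilities are illustrative assumptions, not study data.

```python
def c_statistic(case_scores, control_scores):
    """Proportion of case-control pairs ranked correctly (ties count 0.5).

    This equals the area under the ROC curve for the same scores.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Made-up predicted probabilities for three cases and three controls.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]
auc = c_statistic(cases, controls)
print(round(auc, 3))
```

Pair counting is quadratic in sample size, which is acceptable for a sketch; rank-based formulas give the same value more efficiently.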

To further inform the decision-making process regarding which joints are most frequently involved during symptomatic episodes, patterns of joint involvement were analyzed in the development set of the SUGAR data. This was based upon currently swollen and tender joint involvement during the study visit as recorded on a homunculus following physician examination. First, we assessed the crude prevalence of each joint area with respect to reported tender vs. swollen joint involvement, and involvement of left-sided vs. right-sided joints in cases and controls separately. Once it was determined that prevalences were similar for both tender and swollen joints, and for left vs. right, tender and swollen joints were analyzed together (i.e., each joint was categorized as tender OR swollen), and both left-sided and right-sided joints were analyzed together in the same model.

We performed latent class analysis (LCA) on cases separately from the controls. LCA is a statistical approach that identifies underlying subgroups (i.e., latent classes or clusters) of individuals based upon their responses to a set of observed categorical variables. In this case, the observed variable was symptomatic involvement of each joint (yes vs. no) as recorded on a homunculus. This framework is the categorical analog to factor analysis, which is based on continuous variables (16–18).

We used LCA to model the symptomatic joints to identify clusters representing homogeneous subgroups with a similar pattern of joint involvement. We first modeled all joints of the homunculus together, then repeated analyses limited to the following 4 joint areas: i) lower extremity joints only, ii) foot and ankle joints only, iii) upper extremity joints only, and iv) hand and wrist joints only. For each model, the optimal number of clusters was determined by statistical parameters (likelihood ratio G² statistic, degrees of freedom, AIC and BIC). We obtained the prevalence of the identified clusters in each model, and the item-response probability of each item within each cluster.

Results

Twenty-five centres from 16 countries and 4 continents contributed data on 983 patients (509 cases, 474 non-cases), collected between January 2013 and April 2014. The majority, 702 (71.4%), were male. The mean (SD) age was 58.5 (17.2) years and median (interquartile range) duration of disease since first recalled symptoms was 4 (0.6 to 10) years (Table 1). In the development and test sets 30 (9%) and 17 (10%) of gout patients presented with their first episode of symptoms. Thirty-seven percent of the participants were non-Caucasian. Non-cases had a clinical diagnosis of acute calcium pyrophosphate arthritis (109), spondyloarthritis (71), rheumatoid arthritis (70), osteoarthritis (69), undifferentiated arthritis (60), clinical gout (but MSU crystals were not observed by microscopy, 49), septic arthritis (10), systemic lupus erythematosus (5), or other (31).

Table 1.

Participant characteristics

Gout (MSU positive, n=509) Non-gout (MSU negative, n=474)
Time since first episode of symptoms (years), median (IQR) 6 (2 to 13) 3 (0.23 to 8)
Male (%) 440 (86%) 262 (55%)
Age, mean (SD) 60 (15) 59 (16)
Clinical diagnosis* Gout 494 49
Calcium pyrophosphate deposition disease 4 109
Spondyloarthritis 1 71
Osteoarthritis 3 69
Rheumatoid arthritis 2 70
Undifferentiated arthritis 1 60
Septic arthritis 2 10
Systemic Lupus Erythematosus 0 5
Other 2 31
Ethnicity White/European/Caucasian 347 276
African/Black 12 11
Hispanic 23 21
South Asian 47 38
East Asian 67 113
Pacific Island 3 1
Other indigenous 3 4
Other 7 10
* Clinical diagnosis was independent of MSU crystal identification

MSU = monosodium urate; IQR = interquartile range; SD = standard deviation

Missingness

MSU crystal identification was available for all patients. Clinical data collection and arthrocentesis were on the same day for 91% of patients and within 11 days for 99% of patients. Most variables had 0 or <1% missing values; serum uric acid level was missing for 8%, and current location of tender joints and whether MTP1 was the first joint ever involved were missing in <10%. Radiographs and ultrasound scans were not available for 12.4% and 18.5% of gout cases and for 11.8% and 13.9% of non-gout cases, respectively. DECT and MRI were undertaken too infrequently to be usefully analyzed and were not considered further.

Univariate analysis

In the development sample (n=653), the univariate association of individual items with having MSU-proven gout is shown in Table 2. Overall, most items tested were associated with MSU-proven gout in univariate analyses, though some had weak associations in terms of magnitude of effect. The items with the highest OR were tophus on ultrasound (18.8), SUA >6mg/dl (0.36 mmol/l) (17.8), double contour sign on ultrasound (14.7), patient belief that he/she has gout (14.2), clinically evident tophus (10.0), dietary trigger (8.7), MTP1 ever involvement (8.1), radiographic erosion (6.7), male gender (6.4), joint erythema (6.1), marked tenderness (6.0), at least 2 episodes that start abruptly and resolve by 2 weeks (6.0), resolution of symptoms by 14 days (5.7), and difficulty walking or using joint (5.4). Additionally, we found a clear monotonically increasing relation of SUA levels with gout, with OR 5.9 for SUA 6 to 8mg/dl (0.36 to 0.48mmol/l) and 39.4 for SUA above 10mg/dl (0.6mmol/l). Nonetheless, a therapeutic cut-point of 6mg/dl (0.36mmol/l) was used in the final prediction model for this study to be more consistent with the dichotomous format of SUA in published criteria-sets and to be consistent with the data analysis that was presented to an expert panel in the second phase of the project.

Table 2.

Univariate association of each variable with classification as gout in the development data-set (n=653). Values are percentages for categorical items or mean (SD) for continuous variables.

Item Description Present in gout (N=346), % Present in non-gout (n=307), % OR p-value
Gender (male) 89% 55% 6.4 <0.0001
Time to max. pain (current episode) 0–4 hours 23% 19% 1.9 <0.0001
5–12 hours 28% 16% 2.0
13–24 hours 16% 12% 2.6
>24 hours 34% 52% ref.
Time to max. pain (previous episodes) 0–4 hours 31% 33% 1.9 <0.0001
5–12 hours 31% 14% 3.3
13–24 hours 16% 10% 4.5
>24 hours 22% 44% ref.
Pain level 0 to 10 (current episode) 8 (7–10) 8 (6–9) 1.1 0.0085
Pain level 0 to 10 (current episode) 7 or higher 79% 70% 1.6 0.015
Pain level 0 to 10 (previous episodes) 8 (7–10) 7 (4–9) 1.2 <0.0001
Pain level 0 to 10 (previous episodes) 7 or higher 76% 56% 2.6 <0.0001
At least 2 episodes that start abruptly and conclude by 2 weeks 83% 44% 6.0 <0.0001
Time for episodes to resolve 0–24 hours 7% 6% 6.1 <0.0001
1–7 days 56% 28% 10.3
8–14 days 19% 11% 8.7
More than 14 days 11% 19% 2.6
 Never resolved 7% 37% ref.
Time for episodes to resolve More than 14 days or never 18% 56% 0.17 <0.0001
Number of episodes 1 9% 22% ref. <0.0001
2 to 5 24% 28% 3.1
more than 5 66% 50% 2.0
Stereotypical progression* yes 33% 12% 3.4 <0.0001
no 61% 80% ref.
uncertain 6% 8% 1.0
Complete resolution of at least one episode 85% 56% 4.6 <0.0001
Hypertension or other CVS diseases 68% 45% 2.6 <0.0001
Tender joint count 1 (1–3) 1 (1–2) 1.04 0.06
Tender joint count 1 53% 61% 0.59 0.0066
2–4 30% 28% 0.74
> 4 17% 12% ref.
Swollen joint count 1 (1–3) 1 (1–2) 1.06 0.026
Swollen joint count 1 53% 65% 0.38 0.0008
2–4 32% 27% 0.55
> 4 15% 7% ref.
Involved joints take longer to heal 25% 29% 0.84 0.33
Patient belief he/she has gout 81% 23% 14.2 <0.0001
Told by doctor he/she has gout 84% 27% 14.3 <0.0001
Xray - erosion 42% 10% 6.7 <0.0001
Xray - asymmetric joint swelling 43% 24% 2.4 <0.0001
Xray - joint cyst without erosion 44% 17% 3.7 <0.0001
Xray – erosion and/or joint cyst 59% 21% 5.3 <0.0001
Any of these xray features present 66% 38% 4.7 <0.0001
US - double contour sign 58% 9% 14.7 <0.0001
US - tophus 45% 4% 18.8 <0.0001
US - snowstorm effusion 31% 9% 4.4 <0.0001
Any of these 3 US features present 75% 16% 15.3 <0.0001
Power Doppler (US) grade 0 32% 42% 3.0 0.0097
grade 1 25% 23% 2.1
grade 2 28% 29% 2.4
grade 3 15% 6% ref
Patient taking ULT at time of SUA level 34% 8% 5.6 <0.0001
Tophus present or previously present 34% 5% 10.0 <0.0001
MTP1 involvement was first ever symptom 54% 16% 6.0 <0.0001
MTP1 involvement ever occurred 74% 27% 7.6 <0.0001
MTP1 occurred in isolation ever 72% 24% 8.1 <0.0001
Maximal pain greater than 7 93% 81% 3.3 <0.0001
At least 1 episode resolved by 14 days 82% 44% 5.7 <0.0001
At least 1 episode involved 1–5 joints 72% 69% 1.2 0.37
At least 1 episode involved joint erythema 87% 53% 6.1 <0.0001
At least 1 episode involved marked tenderness to touch 92% 65% 6.0 <0.0001
At least 1 episode involved only 1 joint 95% 86% 3.2 0.0001
At least 1 episode involved tarsal joints 57% 30% 3.0 <0.0001
Patient has history of kidney stone 15% 9% 1.8 0.017
At least 1 episode responded to colchicine Yes 40% 10% ref. <0.0001
No 13% 10% 0.34
Never used colchicine 47% 79% 0.15
At least 1 episode involved difficulty walking 97% 84% 5.4 <0.0001
At least 1 episode it was hard to sleep 90% 69% 4.2 <0.0001
At least 1 episode pain present at rest 91% 78% 3.0 <0.0001
At least 1 episode joint was throbbing or burning 90% 72% 3.7 <0.0001
Disabled OR sleep OR rest pain OR throbbing 99% 90% 9.4 <0.0001
At least 1 episode joint was hot to touch 95% 80% 4.8 <0.0001
At least 1 episode triggered by dietary factor 53% 11% 8.7 <0.0001
At least 1 episode there was joint swelling 97% 97% 1.1 0.79
Current SUA (mmol/l) 0.47 (0.13) 0.34 (0.13)
Highest recorded SUA (mmol/l) 0.56 (0.14) 0.38 (0.14)
Highest SUA without ULT (mmol/l) 0.57 (0.13) 0.38 (0.13)
Current SUA > 0.36 mmol/l 79% 40% 5.5 <0.0001
Highest recorded SUA > 0.36 mmol/l 93% 51% 12.6 <0.0001
Highest SUA without ULT > 0.36 mmol/l 95% 51% 17.8 <0.0001
Highest SUA category <0.36 mmol/l 7% 51% Ref <0.0001
0.36 to <0.48 mmol/l 19% 26% 5.9
0.48 to <0.60 mmol/l 38% 16% 18.3
≥0.60 mmol/l 36% 7% 39.4
Currently tender any MTP1 33% 9% 4.8 <0.0001
Currently tender any big toe IP 4% 0% 12.2 0.016
Currently tender any joint lesser toes 12% 7% 1.8 0.033
Currently tender any midfoot joint 8% 5% 1.8 0.079
Currently tender any ankle joint 30% 17% 2.1 0.0001
Currently tender any knee joint 42% 67% 0.27 <0.0001
Currently tender any hip joint 1% 3% 0.18 0.026
Currently tender any shoulder joint 3% 6% 0.44 0.036
Currently tender any elbow joint 11% 7% 1.7 0.056
Currently tender any wrist joint 15% 14% 1.0 0.88
Currently tender any hand joint 20% 10% 2.1 0.0003
Currently tender any CMC 1% 1% 1.8 0.49
Currently tender joint location Any joint proximal to ankle 36% 68% ref. <0.0001
Ankle or foot not MTP1 28% 20% 2.6
MTP1 36% 12% 5.9
As above except these are for currently swollen joints 30% 10% 4.0 <0.0001
3% 0% 10.3 0.026
40% 48% 3.7 <0.0001
7% 4% 1.6 0.16
31% 17% 2.1 <0.0001
43% 67% 0.38 <0.0001
1% 4% 0.27 0.024
11% 5% 2.3 0.0084
14% 14% 0.99 0.96
20% 8% 2.8 <0.0001
1% 1% 1.8 0.49
Currently swollen joint location Any joint proximal to ankle 38% 71% ref. <0.0001
Ankle or foot not MTP1 30% 17% 3.2
MTP1 32% 12% 5.0
* A clear progression has been observed over a minimum period of 5 years, in which attacks initially affected only 1 joint at a time, then between 1 and 5 joints and then more than 5 joints.

SD = standard deviation; OR = odds ratio; US = ultrasound; ULT = urate lowering therapy; MTP1 = first metatarsophalangeal joint; SUA = serum urate; CVS = cardiovascular; IP = interphalangeal joint; CMC = thumb carpometacarpal joint.

Pattern of joint involvement

In the LCA model that included all currently involved (i.e., tender or swollen) joints among gout cases, 4 clusters emerged. The cluster with the highest prevalence was that of 1st MTP involvement, with a prevalence of 39.44% (Table 3). With just a slightly lower prevalence, the second most prevalent cluster was associated with predominantly knee or ankle/midfoot involvement (37.14%); the third was primarily elbow, wrist, and hand involvement (14.85%), and the fourth was polyarticular involvement, affecting virtually any joint (8.57%). This cluster pattern was in contrast to the non-gout patients, in which prevalence of knee involvement, predominantly monoarticular, appeared to be higher (56.83%), and no MTP pattern was seen (Table 3).

Table 3.

Prevalence of joint involvement clusters among gout and non-gout

Gout Nongout
Joint Pattern Prevalence Joint Pattern Prevalence
MTP1 39.4% Knee only 56.8%
Knee/ankle 37.1% Any lower extremity joint 30.1%
Elbow/wrist/hand 14.9% Wrist/hand + knee 8.0%
Polyarticular 8.6% Polyarticular 5.1%

MTP1 = first metatarsophalangeal joint

When limited to just lower extremity joints in the gout cases, monoarticular presentation of either knee, ankle, or 1st MTP was high (83.49%); limited to just the foot and ankle, 1st MTP monoarticular presentation was common (84.16%). For non-gout patients, when limited to the lower extremity, monoarticular involvement of the knee was most common (60.84%); when limited to the foot/ankle, 1st MTP monoarticular presentation was uncommon (6.36%). Upper extremity as well as hand/wrist clusters were similar for cases and controls.

Multivariate models

The multivariate model developed from backward removal of items based solely on statistical criteria is shown in Table 4 along with the hierarchical models that included clinical reasoning. Two items were removed again from the hierarchical models because of large p-values (p>0.50). The final hierarchical model parameters are shown in Table 5. This model was chosen over the statistically derived model because of smaller AIC (303 vs. 380) and larger c-statistic (0.93 vs. 0.91).

Table 4.

Multivariate model development. Two approaches are shown – by statistical selection only or by progressive addition of clinically informed groups of variables (see text for details).

Model name Variables AIC* c-statistic (AUC)
STATISTICAL SELECTION ONLY Sex, onset of maximal pain, complete resolution of episode, xray features, US DCS, tophus, MTP1 involved, disability, hyperuricemia, location of current joint involvement 380 0.91

HIERARCHICAL MODEL DEVELOPMENT
JOINT INFLAMMATION Joint erythema, marked tenderness, joint warm to touch, disability 746 0.75
+ COURSE + onset of maximal pain within 24 hours, attack resolves within 14 days, at least 2 episodes 669 0.82
+ GOUT FEATURES + tophus, MTP1 ever involved, current tender joint location 531 0.89
+ URATE + highest recorded SUA>6mg/dl (0.36mmol/l),§ 463 0.89
 + ULTRASOUND  + US features 354 0.92
 + US only DCS  + US DCS sign 355 0.92
 + XRAY  + xray features 396 0.90
 + XRAY one sign  + xray erosion or cyst,§ 392 0.90
+ US AND XRAY + US DCS, xray erosion or cyst,§ 303 0.93
* AIC, Akaike Information Criterion, is an index of model fit adjusted for parsimony (lower values indicate better fit);

AUC, area under the curve of a receiver operating characteristic curve (values of 0.5 indicate prediction of class membership no better than chance, values of 1.0 indicate perfect prediction of class membership);

Xray features were erosion, subcortical cyst or asymmetric soft tissue swelling within a joint;

§ Item shown as deleted because it was removed due to p=0.47 (at least 2 episodes), p=0.22 (marked tenderness).

US DCS = ultrasound double-contour sign; MTP1 = first metatarsophalangeal joint; SUA = serum urate

Table 5.

Item parameters of final model from development data-set (n=653)

| Variable | OR (95% CI) | p-value | Regression coefficient | Regression coefficient with shrinkage |
|---|---|---|---|---|
| Intercept | | | −5.70 | −4.05 |
| Joint erythema | 2.13 (1.06, 4.29) | 0.03 | 0.76 | 0.54 |
| At least 1 episode involved difficulty walking | 7.34 (1.17, 46.06) | 0.03 | 2.00 | 1.42 |
| Time to maximal pain less than 24 hours | 1.32 (0.71, 2.47) | 0.38 | 0.28 | 0.20 |
| Resolution by 2 weeks | 3.58 (1.85, 6.95) | 0.0002 | 1.28 | 0.91 |
| Tophus | 7.29 (2.42, 21.99) | 0.0004 | 1.99 | 1.41 |
| MTP1 ever involved | 2.30 (1.18, 4.49) | 0.01 | 0.83 | 0.59 |
| Location of currently tender joints*: Other foot/ankle | 2.28 (1.00, 5.19) | 0.01 | 0.21 | 0.14 |
| Location of currently tender joints*: MTP1 | 2.82 (1.37, 5.81) | | 0.42 | 0.30 |
| Serum urate level > 6mg/dl (0.36mmol/l) | 3.35 (1.57, 7.15) | 0.002 | 1.21 | 0.86 |
| US double contour sign | 7.23 (3.47, 15.04) | <0.0001 | 1.98 | 1.40 |
| Xray erosion or cyst | 2.49 (1.26, 4.90) | 0.009 | 0.91 | 0.65 |

* Reference category is joint proximal to ankle

Logistic regression model details: LR Chi-square 305 (df 11), p<0.001; c-statistic 0.93

The regression coefficients show the strength of the association between the variable and having gout (values > 0 indicate a positive association) in the original units of the variable. The shrinkage factor is applied to prevent over-estimating the predictive ability of the model (see text for more details).

OR = odds ratio; MTP1 = first metatarsophalangeal joint; US = ultrasound

We applied a shrinkage factor of 0.71, estimated using bootstrapping, to the regression coefficients of the hierarchical model to form the final prediction model. This model provides insight into the strength of association between the included items and the presence of MSU-proven gout (the regression coefficients, or weights, are shown in Table 5).
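The relationship between the columns of Table 5 can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the helper names (`coefficient_from_or`, `shrink`) are hypothetical, only a subset of items is shown, and the factor used here is simply the ratio of the shrunken to the unshrunken intercept in Table 5. The sketch shows that each regression coefficient is the natural logarithm of the odds ratio, and that uniform shrinkage multiplies every coefficient by the same factor.

```python
import math

# Odds ratios and unshrunken coefficients for a few items, as reported in Table 5.
TABLE5 = {
    "joint erythema": (2.13, 0.76),
    "difficulty walking": (7.34, 2.00),
    "resolution by 2 weeks": (3.58, 1.28),
    "tophus": (7.29, 1.99),
}


def coefficient_from_or(odds_ratio):
    """A logistic regression coefficient is the natural log of the odds ratio."""
    return math.log(odds_ratio)


def shrink(coefficient, factor):
    """Uniform shrinkage: multiply the coefficient by a factor in (0, 1]."""
    if not 0.0 < factor <= 1.0:
        raise ValueError("shrinkage factor must be in (0, 1]")
    return coefficient * factor


# Assumption for illustration: factor taken as the ratio of the shrunken to the
# unshrunken intercept in Table 5 (-4.05 / -5.70).
FACTOR = 4.05 / 5.70

for item, (odds_ratio, beta) in TABLE5.items():
    print(
        f"{item}: ln(OR)={coefficient_from_or(odds_ratio):.2f}, "
        f"reported={beta:.2f}, shrunken={shrink(beta, FACTOR):.2f}"
    )
```

Shrinking the weights toward zero makes the predictions less extreme, which is intended to compensate for the optimism of coefficients estimated and evaluated on the same development sample.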

In the multivariate analyses, the items regarding marked tenderness and having at least 2 episodes that start abruptly and resolve within 2 weeks were no longer significantly associated with presence of crystal-proven gout. There was conceptual similarity between the item “having at least 2 episodes that start abruptly and resolve within 2 weeks” and 2 other variables that remained in the model, that of “maximal pain within 24 hours” and “resolution by 14 days”. This redundancy led to removal of the first of these 3 items. Similarly, marked tenderness was likely analogous in concept to experiencing great difficulty in walking or inability to use affected joint. Further, while all ultrasound features were initially included in the hierarchical model, subsequently only DCS remained significant. Similarly, the only x-ray features that were retained for the multivariate model were erosions with sclerotic margins and overhanging cortical edges and/or subcortical cysts.

In the test data-set (n=329, 50.5% with gout), we fitted a logistic regression model in which gout/non-gout (based upon MSU presence) was the dependent variable and the model-derived ‘shrunken’ score (based upon the ‘shrunken’ regression coefficients derived in the development data-set; Table 5) was the independent variable. There was a close association between the score and the presence of gout, with R2 0.58, c-statistic 0.90 and Hosmer-Lemeshow Chi-square 7.11 (df=8), p=0.53. The corresponding values for the version without imaging were R2 0.52, c-statistic 0.87 and Hosmer-Lemeshow Chi-square 1.90 (df=9), p=0.99. Using a statistically derived cut-point at which the sum of sensitivity and specificity was maximized, the sensitivity of the model was 83.9% (88.1% for the model without imaging) and the specificity was 81.2% (71.6% for the model without imaging).
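The cut-point selection described above (maximizing the sum of sensitivity and specificity) can be sketched as follows. The function names and the toy scores are hypothetical and the data are not from the study; the sketch simply tries every observed score as a threshold and keeps the first one that gives the largest sum.

```python
def sensitivity_specificity(scores, labels, cut_point):
    """Classify score >= cut_point as gout; return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut_point)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cut_point)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut_point)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cut_point)
    return tp / (tp + fn), tn / (tn + fp)


def best_cut_point(scores, labels):
    """Return (cut_point, sensitivity, specificity) maximizing sens + spec."""
    best = None
    for cut in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, cut)
        if best is None or sens + spec > best[1] + best[2]:
            best = (cut, sens, spec)
    return best


# Hypothetical model scores and MSU-defined status (1 = gout, 0 = non-gout).
scores = [-2.1, -1.4, -0.3, 0.2, 0.4, 1.1, 1.8, 2.5]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

cut, sens, spec = best_cut_point(scores, labels)
print(f"cut-point {cut}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

A cut-point chosen this way weights false positives and false negatives equally; a different threshold would be appropriate if one type of misclassification mattered more.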

Discussion

In this study, we found that a large number of clinical and imaging parameters are highly associated with presence of MSU-crystal gout compared to other diseases associated with joint swelling. However, in multivariate analyses, some were no longer significantly associated with gout, while 10 factors remained independently predictive of the presence of gout, representing clinical, laboratory, and imaging features. Even so, the data-derived prediction model possessed sensitivity and specificity that were both less than 90%.

We also confirmed that MTP1 involvement was typical of gout, although the latent class analysis showed that knee/ankle/midfoot involvement was almost as common. Knee monoarthritis was more typical of non-gout, although also occurred frequently in gout cases. It should be acknowledged that this analysis was performed on currently involved joints, which does not necessarily capture the full spectrum of joint involvement that could ‘ever’ occur in a given patient.

In this study, we chose a therapeutic value of 6mg/dl (0.36mmol/l) as a cutoff to define high serum urate, but the association of serum urate with gout/non-gout varies continuously. This has been well described at a population level (19) but not previously in a diagnostic context. In the SUGAR data, we found that the odds ratio of having gout vs. non-gout increases non-linearly with quartiles of the highest ever recorded serum urate from nearly 6 for the lowest quartile (less than 6 mg/dl, 0.36mmol/l) to 39 for the highest quartile (more than 10 mg/dl, 0.6mmol/l). This is shown graphically in Figure 1. Future proposals for gout classification criteria should possibly incorporate levels of serum urate rather than simple presence vs. absence of a high serum urate level.

Figure 1.

Association of highest serum urate (mg/dl) with gout. The size of the bars represents the ratio of the proportion of gout/non-gout with each level of serum urate. The bar labels are the actual observed proportions in the gout or non-gout groups with each level of serum urate.

This study took several steps to minimize bias, including a requirement that all subjects underwent a rigorous gold-standard procedure, collection of pre-defined items that were previously included in existing criteria sets and additional ones identified through Delphi methods, application of ‘shrinkage’ to estimated regression coefficients to minimize over-fitting, and testing the derived criteria in a randomly selected subset of the data that did not contribute towards the development of the criteria. The sample size was large enough to test a range of possible items including key ultrasound features. Further, the subjects comprising our control group included a broader, relevant spectrum of diseases than prior classification criteria studies.

The main limitations to the interpretation of the results arise from the population that was studied. This was not a primary care population, and all patients needed to have synovial fluid or tissue aspiration in order to be included in the study. This confers unavoidable spectrum bias, whereby the studied population likely had more severe disease than would be seen in primary care, and does not necessarily represent the full spectrum of clinical gout. In addition, this sample is biased towards chronic disease by inclusion of people with more persistent or recurrent features. Patients with persistent disease have more opportunity to accumulate disease features. Therefore, the results of the study may not be generalizable to primary care settings.

It is possible that patients with large joint disease were preferentially selected because of the need for joint aspiration. However, this is unlikely to have been a major factor, since 74% of gout cases had MTP1 involvement at some time in their disease course, which is within the range observed in other gout cohorts (20), and furthermore 128/1004 (13%) joint aspirations were from the MTP1 joint. A currently tender MTP1 was present in 34% of gout cases.

Whether these same items are also predictive of presence of crystal-proven gout in a sample that may have milder disease that is not tophaceous or disease that involves only small joints is presently not known. This is one of the reasons that additional work has been planned to develop the new gout classification criteria. We also could not test the performance of presence of MSU crystals itself given that our case-control status was defined by their presence (i.e., gold standard).

We emphasize that the prediction model described in this study was intended to inform the next stages of classification criteria development and is not the final ACR-EULAR endorsed gout classification criteria. The study has demonstrated which items and which combinations of items are most strongly associated with gout. In particular, it has demonstrated that ultrasound features, even when unstandardized and performed according to local practice, can significantly contribute to classification of gout.

Nonetheless, in order to overcome the limitation of spectrum bias, the need for all patients to have undergone the diagnostic gold-standard procedure and to allow expert opinion to be integrated into new classification criteria, the second phase of the project incorporating a paper patient exercise and an expert clinician workshop was undertaken (reported separately).

Despite efforts to incorporate key diagnostic features of gout, classification of patients without searching for MSU crystals remains inaccurate: the sensitivity and specificity of the prediction model derived from these data were both under 90%. This highlights the importance of synovial fluid examination for diagnostic purposes in ordinary clinical care and cautions against the use of classification criteria for diagnosis. In situations where a high positive predictive value is required, a high suspicion of gout (pre-test probability), criteria with lower sensitivity (and correspondingly higher specificity), or a crystal diagnosis is needed. Ultrasound features also contribute strongly to classification of gout and may have sufficient specificity to be useful in such situations.
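The dependence of positive predictive value on pre-test probability can be made concrete with Bayes' rule. The sketch below is illustrative only: the function name is hypothetical, the sensitivity and specificity are the test-set values reported above for the full model, and the pre-test probabilities are assumed values rather than study data.

```python
def positive_predictive_value(sensitivity, specificity, pretest_probability):
    """Probability of gout given a positive classification (Bayes' rule)."""
    true_pos = sensitivity * pretest_probability
    false_pos = (1.0 - specificity) * (1.0 - pretest_probability)
    return true_pos / (true_pos + false_pos)


# Test-set sensitivity and specificity reported for the full model.
SENS, SPEC = 0.839, 0.812

# Assumed pre-test probabilities, chosen only to show the trend.
for pretest in (0.10, 0.25, 0.50, 0.75):
    ppv = positive_predictive_value(SENS, SPEC, pretest)
    print(f"pre-test probability {pretest:.2f} -> PPV {ppv:.2f}")
```

With sensitivity and specificity held fixed, the positive predictive value falls as the pre-test probability falls, which is why a low-prevalence setting calls for more specific criteria or crystal confirmation.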

Significance and innovation.

  • Accurate classification criteria for gout that do not require MSU identification would be useful for clinical research.

  • Key clinical, imaging and laboratory features that distinguish between gout and non-gout were identified in a cross-sectional, multi-national study.

  • The degree of uricemia and ultrasound findings contribute independently to the classification of gout and should be included in future classification criteria.

Acknowledgments

This study was supported by the American College of Rheumatology, European League against Rheumatism, Arthritis New Zealand, Association Rhumatisme et Travail, and Asociacion de Reumatologos del Hospital de Cruces.

We gratefully acknowledge the help of Joung-Liang Lan, Chien-Chung Huang, Po-Hao Huang, Hui-Ju Lin and Su-Ting Chang (China Medical University Hospital, Taiwan), Anne Madigan (Dublin, Ireland), Yi-hsing Chen (Taichung, Taiwan), Alain Sanchez-Rodríguez and Eduardo Aranda-Arreola (Mexico City, Mexico), Viktoria Fana (Copenhagen, Denmark), Panomkorn Lhakum and Kanon Jatuworapruk (Chiang Mai, Thailand), Dianne Berendsen (Nijmegen, Netherlands), Femke Lamers-Karnebeek (Amsterdam, Netherlands), Olivier Peyr (Paris, France), Heidi Lunøe and Anne Katrine Kongtorp (Oslo, Norway), Geraldo da Rocha Castelar-Pinheiro (Rio de Janeiro, Brasil), Fatima Kudaeva (Moscow, Russia), Angelo Gaffo (Birmingham AL), Douglas White (Hamilton, New Zealand), Giovanni Cagnotto (Pavia, Italy) and Juris Lazovskis (Sydney, Canada) with data collection, crystal examination or patient referral. We are grateful to Eliseo Pascual (Alicante, Spain) for help with MSU observer certification and to Na Lu (Boston, MA) for help with the latent class analysis.

Footnotes

Disclosures: William Taylor – consultancy, honoraria or speaker fees from Pfizer (<$10,000), educational grants from Abbvie, Roche, Pfizer (all <$10,000); Lisa Stamp – consultancy, honoraria or speaker fees from AstraZeneca (<$10,000); Fernando Perez-Ruiz – consultancy, honoraria or speaker fees from Menarini, Sobi, Pfizer, AstraZeneca, Novartis (all <$10,000); H. Ralph Schumacher – consultancy, honoraria or speaker fees from Novartis, Regeneron, Metabolex and AstraZeneca (all <$10,000); Martijn Gerritsen – speaking fees or consultancy from Sobi and Menarini (all <$10,000); Anne-Kathrin Tausche – speaking fees for Berlin Chemie Menarini, expert testimony for Novartis, AstraZeneca, Ardea Bioscience (all <$10,000); Ana Beatriz Vargas-Santos – speaking fees for AstraZeneca (<$10,000); Tim Jansen – consultancy, honoraria or speaker fees from Abbvie, AstraZeneca, Bristol-Myers Squibb, Menarini and Roche; Marco Cimmino – consultancy, speaking fees or honoraria for Janssen Cilag (<$10,000); Till Uhlig – honoraria for AstraZeneca and Sobi.

References

  • 1. Neogi T. Clinical practice. Gout. N Engl J Med. 2011;364(5):443–52. doi: 10.1056/NEJMcp1001124.
  • 2. Trifiro G, Morabito P, Cavagna L, Ferrajolo C, Pecchioli S, Simonetti M, et al. Epidemiology of gout and hyperuricaemia in Italy during the years 2005–2009: a nationwide population-based study. Ann Rheum Dis. 2013;72(5):694–700. doi: 10.1136/annrheumdis-2011-201254.
  • 3. Wallace SL, Robinson H, Masi AT, Decker JL, McCarty DJ, Yu TF. Preliminary criteria for the classification of the acute arthritis of primary gout. Arthritis Rheum. 1977;20(3):895–900. doi: 10.1002/art.1780200320.
  • 4. Dalbeth N, Fransen J, Jansen TL, Neogi T, Schumacher HR, Taylor WJ. New classification criteria for gout: a framework for progress. Rheumatology (Oxford). 2013;52(10):1748–53. doi: 10.1093/rheumatology/ket154.
  • 5. Malik A, Schumacher HR, Dinnella JE, Clayburne GM. Clinical diagnostic criteria for gout: comparison with the gold standard of synovial fluid crystal analysis. JCR: Journal of Clinical Rheumatology. 2009;15(1):22–4. doi: 10.1097/RHU.0b013e3181945b79.
  • 6. Janssens HJEM, Janssen M, van de Lisdonk EH, Fransen J, van Riel PLCM, van Weel C. Limited validity of the American College of Rheumatology criteria for classifying patients with gout in primary care. Ann Rheum Dis. 2010;69(6):1097–102. doi: 10.1136/ard.2009.123687.
  • 7. Johnson SR, Goek ON, Singh-Grewal D, Vlad SC, Feldman BM, Felson DT, et al. Classification criteria in rheumatic diseases: a review of methodologic properties. Arthritis Rheum. 2007;57(7):1119–33. doi: 10.1002/art.23018.
  • 8. Taylor WJ, Robinson PC. Classification Criteria: Peripheral Spondyloarthropathy and Psoriatic Arthritis. Curr Rheumatol Rep. 2013;15(317):1–7. doi: 10.1007/s11926-013-0317-3.
  • 9. Rudwaleit M, Taylor WJ. Classification criteria for psoriatic arthritis and ankylosing spondylitis/axial spondyloarthritis. Bailliere’s Best Practice and Research Clinical Rheumatology. 2010;24(5):589–604. doi: 10.1016/j.berh.2010.05.007.
  • 10. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. Towards complete and accurate reporting of studies of diagnostic accuracy: The STARD initiative. Clin Chem. 2003;49(1):1–6. doi: 10.1373/49.1.1.
  • 11. Berendsen D, Jansen TL, Taylor W, Neogi T, Fransen J, Pascual E, et al. A critical appraisal of the competence of crystal identification by rheumatologists (abstract). Ann Rheum Dis. 2013;72(Suppl 3):981.
  • 12. Prowse RL, Dalbeth N, Kavanaugh A, Adebajo AO, Gaffo AL, Terkeltaub R, et al. A delphi exercise to identify characteristic features of gout - opinions from patients and physicians, the first stage in developing new classification criteria. J Rheumatol. 2013;40(4):498–505. doi: 10.3899/jrheum.121037.
  • 13. Steyerberg EW, Eijkemans MJ, Harrell FE Jr, Habbema JD. Prognostic modelling with logistic regression analysis: a comparison of selection and estimation methods in small data sets. Stat Med. 2000;19(8):1059–79. doi: 10.1002/(sici)1097-0258(20000430)19:8<1059::aid-sim412>3.0.co;2-0.
  • 14. Steyerberg EW, Eijkemans MJC, Habbema JDF. Application of Shrinkage Techniques in Logistic Regression Analysis: A Case Study. Statistica Neerlandica. 2001;55(1):76–88.
  • 15. Rubin DB. Multiple Imputation for Nonresponse in Surveys. New York: John Wiley and Sons; 2004.
  • 16. Lanza ST, Collins LM, Lemmon DR, Schafer JL. PROC LCA: A SAS Procedure for Latent Class Analysis. Struct Equ Modeling. 2007;14(4):671–94. doi: 10.1080/10705510701575602.
  • 17. Collins LM, Lanza ST. Latent Class and Latent Transition Analysis. Hoboken, NJ: John Wiley & Sons, Inc; 2010.
  • 18. Lanza ST, Bray BC. Transitions in Drug Use among High-Risk Women: An Application of Latent Class and Latent Transition Analysis. Adv Appl Stat Sci. 2010;3(2):203–35.
  • 19. Hall AP, Barry PE, Dawber TR, McNamara PM. Epidemiology of gout and hyperuricemia. Am J Med. 1967;42:27–37. doi: 10.1016/0002-9343(67)90004-6.
  • 20. Roddy E. Revisiting the pathogenesis of podagra: why does gout target the foot? J Foot Ankle Res. 2011;4(1):13. doi: 10.1186/1757-1146-4-13.
