Identification of PMD subgroups using a myelination score for PMD

Background: The clinical spectrum of Pelizaeus-Merzbacher disease (PMD), a common hypomyelinating leukodystrophy, ranges from severe neonatal onset to a relatively stable presentation with later onset and mainly lower limb spasticity. In view of emerging treatment options and in order to grade severity and progression, we developed a PMD myelination score. Methods: Myelination was scored in 15 anatomic sites (items) on conventional T2- and T1w images in controls (n = 328) and 28 PMD patients (53 MRI; n = 5 connatal, n = 3 transitional, n = 10 classic, n = 3 intermediate, n = 2 PLP0, n = 3 SPG2, n = 2 female). Items included in the score were selected based on interrater variability, practicability of scoring, and importance of scoring items for discrimination between patients and controls and between patient subgroups. Bicaudate ratio, maximal sagittal pons diameter, and visual assessment of the midsagittal corpus callosum were recorded separately. Results: The resulting myelination score consisting of 8 T2- and 5 T1-items differentiates patients and controls as well as patient subgroups at first MRI. There was very little myelin and early loss in severely affected connatal and transitional patients, more, though still severely deficient, myelin in classic PMD, and ongoing myelination during childhood in classic and intermediate PMD. Atrophy, present in 50% of patients, increased with age at imaging. Conclusions: The proposed myelination score allows stratification of PMD patients and standardized assessment of follow-up. Loss of myelin in severely affected and PLP0 patients and progressing myelination in classic and intermediate PMD must be considered when evaluating treatment efficacy.


Compliance with Ethical Guidelines
Authors' contributions: I. Harting, A. Vanderver, and N.I. Wolf designed the study. I. Harting and N.I. Wolf wrote the initial draft of the manuscript. All authors examined patients and/or collected and/or analyzed data.
All authors revised the manuscript and approved the submission.

to the entire PMD spectrum. Loss-of-function variants (also called PLP null syndrome) lead to an initially less severe presentation, with subsequent clinical deterioration (7,13,14).
On imaging, lack of myelin is the hallmark of PMD. Lack of myelin has been shown to at least codetermine motor handicap in patients with hypomyelinating leukodystrophies (1,15), and functional disability in PMD correlates with white matter volume and degree of hypomyelination (16-18). In view of emerging therapeutic approaches, identification of distinct clinical cohorts and of biomarkers for grading severity and progression is crucial. To this end we developed an MRI severity score assessing the degree of myelin deficit and atrophy using conventional T2- and T1-weighted images, which are available for most patients and can thus be drawn on for both retrospective studies of natural history and routine clinical use.

Patients and Methods

Patient selection:
For development of the MRI score and analysis of temporal changes, patients with genetically proven Pelizaeus-Merzbacher disease imaged up to the age of 20 years were retrospectively identified using the database of VU Medical Center in Amsterdam. MRI scans had been acquired on different scanners (1.5 and 3T) using different protocols and were only included if axial T2- and T1-weighted images and a sagittal sequence were available. Patients were clinically assessed as belonging to specific subtypes of the PMD spectrum by a paediatric neurologist with long-time experience in hypomyelinating disorders (NIW).
For interrater reliability, six MRI scans of five genetically proven PMD patients were collected from the Myelin Disorders Bioregistry Project at the Children's Hospital of Philadelphia.
Controls: Normal myelination was retrospectively assessed in 374 MRI scans of 364 patients imaged between 0 and 20 years with normal findings on cranial MRI, excluding patients with significant developmental delay, known epilepsy, and a history of intraventricular or subarachnoid haemorrhage (163 female, 211 male, mean age at MRI 3.13 years, median 1.31 years). Sampling density was age-adapted in order to account for the time frame of myelination (Suppl. Figure 1).

MRI scoring:
Myelination of patients and controls was graded separately on T2- and T1-weighted images (T2w, T1w) as best myelination in a total of 15 specific anatomic regions ("items") by an experienced paediatric neuroradiologist blinded to the clinical diagnosis (IH). Scoring items represent supratentorial primary motor and visual pathways, early myelinating infratentorial structures, as well as later myelinating supratentorial white matter regions (Suppl. Table 1). Thirteen of these 15 items were scored according to their signal in relation to cortex, while the T2-signal of pyramidal tract and medial lemniscus in the pons was scored relative to the signal of surrounding brainstem white matter.
The validity of the items as a model of myelination was tested by performing principal component analysis (PCA) for T2- and T1-items in controls. As the first principal component explained 85.0% (T2w) and 85.4% (T1w) of the variation of the model and only the first principal component had an eigenvalue greater than 1, a unifactorial model was sufficient to explain the variance of all items. This is consistent with the hypothesis that changes of the items as surrogate parameters of myelination are primarily dependent on age, with only minor influences due to differences in imaging sequences and reading of MRIs.
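The unifactorial criterion can be illustrated with a minimal Python sketch. This is not the authors' analysis, which used the R function prcomp. It assumes that the PCA operates on the item correlation matrix (so that the eigenvalue-greater-than-1 rule applies), and the 3 x 3 matrix, the function names, and the use of power iteration with deflation are illustrative assumptions only.

```python
import math

def _matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def leading_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and unit eigenvector of a symmetric positive
    semi-definite matrix, approximated by power iteration."""
    n = len(matrix)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-uniform start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = _matvec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # start vector lies in the null space
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, _matvec(matrix, v)))  # Rayleigh quotient
    return eigenvalue, v

def unifactorial_summary(corr):
    """First two eigenvalues of a correlation matrix, the share of variance
    explained by the first component, and the eigenvalue > 1 check."""
    n = len(corr)
    lam1, v1 = leading_eigenpair(corr)
    # Deflation: remove the first component, then look for the next one.
    deflated = [[corr[i][j] - lam1 * v1[i] * v1[j] for j in range(n)]
                for i in range(n)]
    lam2, _ = leading_eigenpair(deflated)
    trace = sum(corr[i][i] for i in range(n))  # total variance
    return {
        "eigenvalue_1": lam1,
        "eigenvalue_2": lam2,
        "share_pc1": lam1 / trace,
        "unifactorial": lam1 > 1 and lam2 <= 1,
    }

# Hypothetical correlation matrix of three strongly correlated items.
corr = [[1.0, 0.8, 0.8],
        [0.8, 1.0, 0.8],
        [0.8, 0.8, 1.0]]
summary = unifactorial_summary(corr)
print(summary)
```

In this hypothetical matrix the first component carries most of the variance and the second eigenvalue stays below 1, which is the pattern the text describes for the control data.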
For investigation of inter-rater reliability, two further experienced paediatric neuroradiologists (AM, SR) were trained by the first rater before all three raters scored six MRI scans of five new PMD patients.
Concordance among raters for ordinal data was computed using linear weights in order not to be affected by kappa paradoxes, in particular by restricted range lowering the estimates of inter-rater reliability (19). The three T2-items ALIC (0.5), pyramidal tract in pons, and simplified scoring of genu (0.56) had only "fair concordance" between raters and were discarded, while, except for T2-scoring of peridentate white matter (0.62), concordance of all other items was excellent (≥ 0.75) and these items were retained. Importance of scoring items for discrimination between PMD patients and controls and between PMD subgroups was investigated visually and using receiver operating characteristic (ROC) curves.
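The idea of linear weighting for ordinal ratings can be sketched in Python. The example below computes Cohen's linearly weighted kappa for two raters. This is a swapped-in, simpler statistic and not the multi-rater concordance measure of the R package raters used in this study. The function name and the ratings are hypothetical.

```python
def linear_weighted_kappa(rater_a, rater_b, categories):
    """Cohen's kappa for two raters with linear agreement weights
    w_ij = 1 - |i - j| / (k - 1) on k ordered categories."""
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions of the two raters' scores.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Full credit on the diagonal, partial credit for near misses.
        return 1.0 - abs(i - j) / (k - 1)

    p_obs = sum(weight(i, j) * observed[i][j]
                for i in range(k) for j in range(k))
    p_exp = sum(weight(i, j) * row[i] * col[j]
                for i in range(k) for j in range(k))
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_obs - p_exp) / (1.0 - p_exp)

# Hypothetical scores (0 to 2) given by two raters to six scans.
rater_a = [0, 0, 1, 1, 2, 2]
rater_b = [0, 1, 1, 2, 2, 2]
kappa = linear_weighted_kappa(rater_a, rater_b, categories=[0, 1, 2])
print(round(kappa, 3))
```

With linear weights a disagreement by one score level is penalized half as much as a disagreement by two levels in this three-level example, which is why weighted measures suit ordinal myelination scores better than unweighted agreement.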

Surrogate parameters of brain volume
The bicaudate ratio (BCR), visual scoring of the corpus callosum on midsagittal images (thin y/n, e.g. Suppl. Figure 2), and the maximum anterior-posterior diameter of the pons on sagittal images were used as surrogate parameters of brain volume. BCR, defined as the minimum intercaudate distance divided by the transverse width of the inner table of the skull at the same level, i.e., the outer surface of CSF signal, was measured on axial T2-weighted images at the level where the caudate heads were most visible and closest to one another. Raw values of BCR and pons diameter were compared with age-adapted controls using z-scores (20).
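The BCR definition and the z-score comparison can be written out as a short Python sketch. The measurements and the age-matched control mean and standard deviation below are hypothetical placeholders, not values from reference (20), and the function names are chosen for illustration.

```python
def bicaudate_ratio(intercaudate_mm, inner_skull_width_mm):
    """Minimum intercaudate distance divided by the transverse width of the
    inner table of the skull, both measured at the same axial level."""
    if inner_skull_width_mm <= 0:
        raise ValueError("inner skull width must be positive")
    return intercaudate_mm / inner_skull_width_mm

def z_score(value, control_mean, control_sd):
    """Deviation from the age-matched control mean in units of control SD."""
    if control_sd <= 0:
        raise ValueError("control SD must be positive")
    return (value - control_mean) / control_sd

# Hypothetical measurements for one scan.
bcr = bicaudate_ratio(intercaudate_mm=18.0, inner_skull_width_mm=120.0)

# Hypothetical age-matched control statistics.
z = z_score(bcr, control_mean=0.10, control_sd=0.02)

# A BCR more than 2 SD above the control mean is read as increased.
print(f"BCR = {bcr:.3f}, z = {z:.2f}, increased = {z > 2}")
```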

Statistical analysis:
Statistical analysis was performed using the R environment for statistical computing and graphics (R, 2020). Principal component analysis (PCA) for validation of T2- and T1-scoring items as a model of myelination was implemented through the R function prcomp (21), and analysis of receiver operating characteristics (ROC) of differently composed scores through the R package pROC (22). Concordance among raters using linear weights was computed with the R package raters, version 2.0.1. Cut-offs were derived from Cicchetti: values less than 0.40 were considered poor, values between 0.40 and 0.59 fair, between 0.60 and 0.74 good, and between 0.75 and 1.00 excellent (23). Scores of subgroups were compared using the pairwise Wilcoxon rank-sum test with Holm adjustment for multiple testing as a non-parametric test through the R package rstatix (24).
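Two of the steps named above are simple enough to sketch in Python: the Cicchetti categories for concordance values and the Holm step-down adjustment of p-values. This is an illustrative re-implementation, not the R code used in the study. The function names and the three p-values are hypothetical, whereas the three concordance values are those reported for the items in the MRI scoring section.

```python
def cicchetti_category(concordance):
    """Map a concordance value to the categories quoted in the text."""
    if concordance < 0.40:
        return "poor"
    if concordance < 0.60:
        return "fair"
    if concordance < 0.75:
        return "good"
    return "excellent"

def holm_adjust(p_values):
    """Holm step-down adjustment: the i-th smallest p-value (i = 1..m) is
    multiplied by (m - i + 1), capped at 1 and made monotone non-decreasing."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):  # rank 0 is the smallest p-value
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

# Concordance values reported for ALIC, genu and peridentate white matter.
labels = [cicchetti_category(v) for v in (0.5, 0.56, 0.62)]
print(labels)

# Hypothetical raw p-values from three pairwise subgroup comparisons.
adjusted = holm_adjust([0.01, 0.04, 0.03])
print([round(p, 3) for p in adjusted])
```

The Holm procedure controls the family-wise error rate without assuming independence of the pairwise tests, which makes it a common choice when several subgroups are compared against each other.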

PMD myelination score
For score development, only the first MRI scans were used, as delineation of deficient myelination is more difficult at younger ages. Patients with mild hypomyelination and/or first MRI in adolescence were not excluded, as they are the most difficult ones to distinguish from controls. Items were retained or discarded based on practicability of scoring and importance for discrimination between patients and controls as well as between subgroups of PMD patients (Suppl. Material: Selection of T2- and T1-items for the myelination score).
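How well a candidate score separates patients from controls can be summarized by the area under the ROC curve. The Python sketch below uses the rank-based (Mann-Whitney) formulation of the AUC rather than the R package pROC employed in this study. The scores and the function name are hypothetical, and it is assumed that lower scores indicate less myelin.

```python
def auc_lower_in_patients(patient_scores, control_scores):
    """Probability that a randomly chosen patient has a lower score than a
    randomly chosen control, with ties counted as one half. This equals
    the area under the ROC curve of the score."""
    favourable = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p < c:
                favourable += 1.0
            elif p == c:
                favourable += 0.5
    return favourable / (len(patient_scores) * len(control_scores))

# Hypothetical myelination scores (higher means more myelin).
patients = [2, 5, 7]
controls = [6, 9, 10]

auc = auc_lower_in_patients(patients, controls)
print(round(auc, 3))
```

An AUC of 1 corresponds to complete separation of the two groups and 0.5 to no discrimination, so comparing the AUC of differently composed scores is one way to judge whether an additional item improves discrimination.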
The resulting score included eight T2- and six T1-items, combining the five T2- and T1-items of the supratentorial pyramidal and visual tracts (central region, centrum semiovale, PLIC, optic radiation, and primary visual area) with the T2-items of subcortical frontal white matter, medial lemniscus, and MCP and the T1-item of MCP (Table 1, Fig. 1, 2). While a basic T2-score of supratentorial pyramidal and visual tract items was sufficient for discrimination of patients and controls, the three additional T2-items increased differentiation of subgroups, as did the T1w-items, which were not primarily suited for differentiating patients and controls.
The resulting myelination score consisting of eight T2- and six T1-items allowed discrimination between patients and controls as well as between classic and connatal patients when plotted (Figure 3C). On

Changes of myelination score in patients with follow-up:
Follow-up MR scans for longitudinal assessment were available for 19 patients (Suppl. Table 2).
Myelination scores changed in 13 of 19 patients (e.g. Figure
BCR, as an indicator of supratentorial volume deficit, was increased to more than two standard deviations above age-normalized values (z-scores) in 13 patients, in 5 of them at first MRI. This was uncommon during the first year of life, present in a third of patients imaged during their second year of life, and present in more than half of the patients imaged after the age of two years. There was a weak but significant correlation of BCR z-score with age for all MRIs (r = 0.282, p = 0.041). The more frequent increase of BCR in classic compared to connatal type patients (6/10 vs. 2/5; Suppl. Table 5, Figure 4) is likely due to the higher age at imaging of classic patients.
The corpus callosum was thin for age in 21 patients, in 14 of them at first imaging, and this was observed earlier than increased BCR. A thin corpus callosum was present in 25% of patients imaged during the first year of life and in the majority of patients imaged afterwards (Suppl. Table 5). Similar to BCR, the more commonly thin corpus callosum in classic compared to connatal type patients (9/10 vs. 2/5) is likely related to differing age at imaging. Maximal sagittal pons diameter was below -2 SDS in only one classic patient at 11.95 years and became so on follow-up at 6.6 years in one transitional patient.

Discussion
In view of evolving treatment for PMD, knowledge of the natural course and reliable biomarkers are crucial for stratification of patients and assessment of treatment effects. To this end we have developed a visual, semiquantitative PMD myelination score as a tool for standardized initial assessment and follow-up. With this score, available imaging may be used retrospectively, allowing studies on large patient cohorts who did not undergo advanced quantitative imaging studies. Using reported that diffuse T2-hyperintensity in brainstem and a T1-myelination age "before birth" predicted severe forms, while patchy signal alterations or a pattern of diffuse hypomyelination with some T2-hypointensity of PLIC was associated with milder phenotypes as classified by best motor acquisition. The authors modified the patterns introduced by Nezu et al. by defining the presence of some T2-hypointensity in PLIC ("best myelination") as discriminator between the two patterns of diffuse hypomyelination, changing it from involvement ("worst myelination") of corticospinal tract to best myelination in PLIC as the specified location for assessment of corticospinal tract (26).
Comparison of these results is limited not only by differing patient numbers and clinical and MRI classifications, but also by use of involvement versus best myelination of structures and of age-normalized myelination scores versus myelination age. With the aim of establishing a myelination score for direct differentiation of PMD subgroups and monitoring on follow-up, we decided to forego age normalization, as this precludes direct inter- and intraindividual comparison of values (e.g. a stable myelin deficit at 3 and 9 months resulting in different age-normalized score values) and would obscure differences in patients with less myelin than a term neonate (in our group 16/28 at first MRI for T2-items).
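The effect of age normalization on a stable deficit can be made concrete with a small worked example in Python. It assumes a z-score style normalization against age-matched controls, which is only one possible form of age normalization, and all numbers (raw score, control means and standard deviations) are hypothetical.

```python
def age_normalized(raw_score, control_mean, control_sd):
    """Hypothetical z-score style normalization against age-matched controls."""
    return (raw_score - control_mean) / control_sd

# Hypothetical control statistics: the expected score rises with age.
controls_by_age_months = {
    3: {"mean": 10.0, "sd": 2.0},
    9: {"mean": 20.0, "sd": 2.0},
}

# A patient whose raw score is unchanged between 3 and 9 months.
raw_score = 4.0
normalized = {
    age: age_normalized(raw_score, stats["mean"], stats["sd"])
    for age, stats in controls_by_age_months.items()
}
print(normalized)
```

Although the raw score is identical at both time points, the normalized value falls further below the control mean at 9 months because the control reference has moved, which illustrates why raw scores are easier to compare within and between patients.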
The score was based on axial T2w and T1w images available as part of routine clinical imaging, but not on FLAIR, thereby avoiding the triphasic sequence of signal changes of deep white matter on FLAIR images as well as T1-relaxation effects due to the inversion pulse used (27). Myelination of specific structures was assessed as best myelination in order to be able to delineate partially myelinated from non-myelinated structures rather than completely from partially myelinated items.
For more objective assessment the signal of specific white matter regions was scored in a semiquantitative fashion with cortical grey matter as an internal reference (except for pontine structures), and for each item examples for each value of the score were documented for training of raters.
The items included in the score were selected from a larger number scored in patients and controls based on interrater variability, practicability of scoring and importance of items for differentiation.
Consequently, the score as a model of myelination neither reproduces the entire process of myelination nor samples all lobes and white matter structures as earlier myelination scores for PMD and 4H have done (15,17,18), but was tailored to PMD. As a result, it is "skewed" towards early myelinating, reliably discernible structures found to be central for differentiation of PMD and its subgroups in our cohort of 28 patients. Skewing toward early myelination and the extent of hypomyelination encountered in PMD is underscored by the fact that controls attain full scores between 3.5 and 4.6 months for T1-items, between 8.7 and 10.8 months for T2-items and thus also for the resulting myelination score of the combined T2- and T1-items. While this disease-orientated score development allowed us to include as few items as possible, it makes validation in a larger cohort necessary. Nevertheless, reviewing the literature on MRI changes reported in PMD, there is some support for the items upon which the proposed myelination score is built: Consistent with previous studies, T2-hyperintense PLIC was only found in severely affected patients with connatal or transitional PMD (26,28-30). Sumida et al. reported T1-myelination not equivalent to birth (defined by lack of T1-hyperintensity in PLIC and optic radiation) as predictive for severe forms of PMD (26), a finding reproduced in our patients, in whom T1-hypointense PLIC and optic radiation were only found in connatal or transitional PMD. In the largest study of myelination reported, the age-normalized myelination of arcuate fibers as well as of frontal white matter and ALIC was less in severely affected patients without motor acquisition or only achieving head control (18).
While we discarded ALIC due to larger interrater variability, which might be due to the craniocaudal temporospatial gradient of myelination in ALIC we observed in our controls, lobar white matter in centrum semiovale and myelination of subcortical white matter also differed between subgroups of our patients: At least T1-isointense myelin of central and primary visual subcortical white matter and T2-isointense myelin in centrum semiovale was only observed in classic and milder forms of PMD, and at least T2-isointense myelin in subcortical Rolandic and frontal white matter was only present in PMD subtypes milder than classic. Interestingly, subcortical white matter of the primary visual region was T2-hyperintense in all our patients, even if myelin was visible in the later myelinating subcortical frontal (not Rolandic) white matter, which occurred in two intermediate, one PLP0, one SPG2, and one female patient.
Brainstem involvement in the form of patchy myelination not further specified has been reported in patients with connatal PMD (30,31) and symmetric T2-hyperintensity of pontine pyramidal tract in classic patients (25). While we did not include T2-hyperintense pontine pyramidal tract due to lower acquisition, whereas partial hyperintensity with only pyramidal tract hyperintensity depicted in the example was found even in SPG patients (Fig. 1B, 2B in (26)), which is similar to our findings.

Temporal patterns and determinants of clinical disability in PMD
Previous studies in larger groups of patients have found that white matter atrophy is a major ). Later loss has been observed in a juvenile patient (18) as well as in late childhood in one of our female patients and the SPG2 patient with follow-up and may contribute to decline in patients. Atrophy years. Contrary to expectation, atrophy scores in our patients were lower in connatal than in classic patients, which most likely resulted from earlier imaging of connatal patients. Nevertheless, this finding illustrates that combination of myelination score and atrophy into one MRI score may be counterproductive for initial stratification of patients, as lower brain volume scores but higher myelination scores of classic compared to connatal type PMD patients resulted in blurring of score boundaries.
To summarize, the proposed PMD myelination score is based on routine MR images, differentiates clinical PMD subtypes, and detects and quantifies changes on follow-up. The score needs to be validated in a larger cohort with subgroups based on best motor function instead of the onset and rate of progression used in traditional clinical phenotyping, as do surrogate parameters of brain volume as markers of secondary degeneration, with a view to a more comprehensive, integrated MRI score for PMD. This will also aid in understanding differences in natural history, i.e. some progression of myelination in classic and intermediate PMD and loss in severely affected patients and PLP0, which need to be taken into account when assessing efficacy in treatment trials.