Usability and inter-rater reliability of the NeuroMotion app: A tool in General Movements Assessments

Background: Early intervention after perinatal brain insults requires early detection of infants with cerebral palsy (CP). General Movements Assessment (GMA) in the fidgety movement period has a high predictive value for CP. Aim: To investigate the NeuroMotion™ app's usability regarding film quality and user experience and to assess the inter-rater reliability of GMA in a neonatal risk group. Methods: GMA, inter-rater reliability and film quality were assessed in a cohort consisting of 37 infants enrolled in a multicentre study of GMA as part of the Swedish neonatal follow-up program for high-risk infants. Some of these infants were filmed twice. For evaluation of user experience, 95 parents of 52 infants were addressed with a web-based questionnaire. A GMA expert assessed film quality and performed GMA, and three on-site assessors individually performed GMA. Inter-rater reliability was computed using Krippendorff's alpha (k-alpha). Results: In all, 45 films showed good or excellent quality. The response rate of the questionnaire survey was 40%, and the responses revealed predominantly positive perceptions of the NeuroMotion™ app. GMA in 36 infants resulted in substantial agreement (k-alpha = 0.72, 95% CI = 0.3–1.0) between the three on-site assessors' consensus and the GMA expert. Inter-rater reliability for GMA between the on-site assessors was moderate (k-alpha = 0.48, 0.18–0.74). Conclusion: The NeuroMotion™ app produces films of good technical quality, and the app user experience was overall positive. High agreement was observed between the on-site assessors and the GMA expert. The study design is feasible for more extensive GMA studies in cohorts of infants at risk of CP. © 2021 The Authors. Published by Elsevier Ltd on behalf of European Paediatric Neurology Society. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
There is a need for early intervention in children at high risk of cerebral palsy (CP) and hence a necessity to identify these children earlier than is currently the case. Although there have been improvements due to early screening, most children are still diagnosed with CP after 1 year of age [1,2]. Prechtl's General Movement Assessment (GMA) is recommended as one of the most important tools for early detection of CP [3,4]. GMA is a qualitative analysis of an infant's spontaneous movements from birth until 20 weeks of corrected age [5]. Fidgety movements are continuous small-amplitude, moderate-speed movements of the neck, trunk, shoulders, wrists, hips, and ankles in all directions and of variable acceleration, seen at 3–5 months post term in typically developing infants when they are awake and lying supine with no interference [4]. Several studies have shown that absence of fidgety movements on GMA during the fidgety movement period (9–20 weeks) can identify infants who later develop CP with a sensitivity of 98% and a specificity of 91–94% [6–8].
Studies of GMA inter-rater reliability during the fidgety movement period have used different methods and reported varying results [9–12]. To our knowledge, there are no studies on inter-rater reliability of GMA films of high-risk infants evaluated by blinded assessors and recorded in a home setting by parents using a smartphone app. Film recordings for GMA are usually performed in an outpatient clinic, which is often not adapted to the infant's needs, and the unfamiliar surroundings may influence the infant's mood. Recording at home with a smartphone allows the situation to be adjusted to the child's alertness and wellbeing and can overcome inequality due to geographical distances. In Australia, the Baby Moves app has been successfully used for this specific aim with a positive user experience [13]. Nevertheless, it was not possible to use this app in Sweden to upload films to our secure server. As far as we know, there is no secure and efficient way for patients to upload films to Swedish health care providers.
Telemedicine and mHealth solutions are expanding and are important technological tools in the development of pediatric health care [14]. To enable nationwide studies using GMA in screening infants at an increased risk of CP, an mHealth app (NeuroMotion™) was developed to film and transfer film files to a secure server. However, mHealth tools need to be evaluated for usability in terms of accuracy, efficiency, effectiveness and user satisfaction [15].
Our study aims to investigate the usability of the NeuroMotion™ app and determine the inter-rater reliability of Prechtl's GMA in a neonatal high-risk group.

Study design and study population
This is a feasibility study using sub-cohorts from an ongoing prospective nationwide multicentre study on the use of the Prechtl GMA as part of the Swedish national neonatal follow-up programme for high-risk infants, see Appendix A.1.
GMA, inter-rater reliability and film quality were assessed in a cohort consisting of the first 37 infants enrolled in the multicentre study. Some of these infants were filmed twice, and all received films were used for film quality analysis.
The first 6 app users were contacted by the researchers to ensure there were no technical difficulties. The cohort used for evaluation of user experience consisted of parents of 56 infants consecutively enrolled in the multicentre study, starting with the 7th infant. Four of these infants were excluded due to participation in another study. Both parents of the remaining 52 infants were invited separately to participate. For details of included participants, see the flowchart in Fig. 1.

The mHealth app -NeuroMotion™
NeuroMotion™ was developed by the company Unitalent, owned by Linköping University, in cooperation with the first and last authors of this study. NeuroMotion™ is compatible with the iOS and Android operating systems and is available at the App Store and Google Play. Personal login credentials for each child can be used simultaneously on the different systems, which allows several films to be uploaded. The app gives reminders by push notices when to record the infants and gives instructions according to GMA [5], which are short and intended to be intuitive. When the app is in filming mode, a screen filter facilitates capturing the infant's whole body on the screen. Personal information is not handled or saved in the app. The films last for a minimum of 2 and a maximum of 3 min. This timeframe was chosen based on studies recommending at least 2 min [16,17], together with the difficulty of holding the smartphone steady and keeping an infant satisfied for much longer than 3 min. When recording is finished, the app user can upload the film to a secure web-based platform supporting research data management at Linköping University using REDCap (Research Electronic Data Capture) [18]. REDCap is a browser-based, metadata-driven electronic data capture software used to design clinical and translational research databases.
After consent, parents of the participating infants received a letter with login credentials, recording instructions and information about downloading the app. A push notice appeared in the app when the infant was 12 weeks corrected age, reminding the parents to film. Because the app was newly developed, the parents also received an email reminder in case the push notification messages were not functioning appropriately.

Film quality assessment
Quality assessment of the films was performed by a GMA expert (a licensed GMA tutor, advanced course) using a structured form with questions on sharpness, camera position, clothing, surrounding influences and the infants' position and mood, according to GMA guidelines for recording [5].

User experience of NeuroMotion™
A web link to the questionnaire about user experience was sent by email 1–7 months after recording to one or both parents, depending on available email addresses. Email reminders were sent twice. The web-based questionnaires were anonymous and not coupled to a specific film or infant. The questionnaire was supported by the tool "Survey&Report" [19], licensed for Linköping University, enabling answers customised for smartphones and computers. The survey investigated parents' experience of the app, perceptions of security, recording instructions and technique. Most questions were designed to fit a five-point Likert scale, with the possibility of adding individual comments (Table S2). Questionnaire items were adapted from the Baby Moves app survey described by Kwong et al. [13].

GMA

GMA assessors
There were three on-site assessors (two paediatric physiotherapists and one paediatric neurologist) with several years of experience with children suffering from neurological deficits. All assessors had certificates from the GM Trust advanced course but had limited clinical experience using the GMA. Moreover, a GMA expert performed the same assessments as the three on-site assessors.

GMA procedure
All on-site assessors were blinded to the infants' medical history, except for gestational age at birth and corrected age at assessment. The GMA expert, however, was totally blinded to medical history. The GMA scoring of fidgety movements was dichotomised into two categories: present (normal or abnormal (exaggerated)) or absent (sporadic or absent) [20,21]. The on-site assessors scored the films individually and in the same order as uploaded. After separate, individual scorings, a consensus decision was reached based on discussions between the on-site assessors. This approach was used to resemble a clinical setting where assessors may have varying degrees of experience and is in line with the recommendation by Crowle et al. [11]. The GMA expert scored the films individually on one occasion. A second film was requested if fidgety movements were lacking, the film quality was poor or instructions were not followed. This approach, rather than asking for longer film sequences, was chosen to secure the results, as it gave parents a better chance to film the infant in a good mood and to adjust their filming.
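The scoring rules in this procedure can be written down compactly. The following is a minimal sketch, not the software used in the study: the label strings, function names and parameters are hypothetical, and it assumes that "lacking" fidgety movements corresponds to the dichotomised category "absent".

```python
# Hypothetical label strings; the text only specifies the two-way grouping.
FIDGETY_DICHOTOMY = {
    "normal": "present",
    "abnormal_exaggerated": "present",
    "sporadic": "absent",
    "absent": "absent",
}


def dichotomise(score: str) -> str:
    """Map a detailed fidgety movement score to 'present' or 'absent'."""
    try:
        return FIDGETY_DICHOTOMY[score]
    except KeyError:
        raise ValueError(f"unknown fidgety movement score: {score!r}")


def second_film_needed(category: str, quality_ok: bool,
                       instructions_followed: bool) -> bool:
    """Return True if any of the three stated reasons for a new film applies.

    Assumption: 'fidgety movements were lacking' is read as category == 'absent'.
    """
    return category == "absent" or not quality_ok or not instructions_followed


if __name__ == "__main__":
    category = dichotomise("sporadic")
    print(category, second_film_needed(category, True, True))
```

Keeping the mapping in a single dictionary makes the dichotomisation explicit and easy to audit against the scoring categories named above.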

Inter-rater reliability
Inter-rater reliability was calculated both between the on-site assessors and between the on-site assessors and the GMA expert. In addition, the on-site assessors' consensus was compared to the assessment of the GMA expert. This analysis resulted in six alternative calculations for inter-rater reliability.
The on-site assessors discussed each film and their ratings, so they were no longer blinded in any further assessment of the same infant. In cases in which a second film was requested (n = 8), only the first film was used to analyse inter-rater reliability.
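To make the reliability measure concrete, the sketch below computes Krippendorff's alpha for nominal data from a coincidence matrix. It is an illustration under stated assumptions, not the implementation used in the study: the function name and the toy ratings are hypothetical, `None` is used here to mark a missing rating, and the bootstrapped confidence intervals reported in the paper are not reproduced.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(ratings):
    """Krippendorff's alpha for nominal labels.

    ratings: one list per rated film, holding one label per assessor;
             None marks a missing rating (illustrative convention).
    """
    coincidences = Counter()
    for unit in ratings:
        values = [v for v in unit if v is not None]
        m = len(values)
        if m < 2:
            continue  # a film rated by fewer than two assessors cannot be paired
        # Every ordered pair of ratings within a film contributes 1 / (m - 1).
        for i, j in permutations(range(m), 2):
            coincidences[(values[i], values[j])] += 1.0 / (m - 1)

    totals = Counter()
    for (label, _other), weight in coincidences.items():
        totals[label] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least two pairable ratings")

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b]
                   for a in totals for b in totals if a != b) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when all ratings are identical")
    return 1.0 - observed / expected


if __name__ == "__main__":
    # Toy data (not study data): two assessors, four films, one disagreement.
    toy = [["present", "present"], ["present", "present"],
           ["absent", "absent"], ["present", "absent"]]
    print(round(krippendorff_alpha_nominal(toy), 3))  # 8/15, i.e. 0.533
```

Each of the comparisons described above (for example, on-site consensus versus the GMA expert) would correspond to one call with the relevant columns of ratings.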

Ethics
The study was approved by the Regional Ethics Review Board in Linköping, Sweden (Dnr 2018/236-31). Written consent after information about the study was mandatory. By answering the questionnaire and returning it, the parents implicitly consented to participate in the questionnaire part of the study.

Film quality
In total there were 45 films available for film quality assessment, and most of these films (93.3–95.6%) showed excellent picture quality (Table 2). One film was defined as "not approved" because the "camera was not held in a stationary position". Infant setting was excellent in almost all films (89%–100%) (Table 2). Before filming, all infants were correctly placed in a supine position.

User experience
Altogether 95 parents of 52 infants were addressed by email with the web-based questionnaire, of whom 38 (40%) responded. As 31 of the respondents were mothers, we assume that at least one parent of 31 infants responded; hence the response rate at the level of individual infants is somewhere between 60 and 73%. Some 92% totally or strongly agreed that the app was easy to use, and 97% totally or strongly agreed that it provided instructions that were easy to understand and follow (Table 3). All parents experienced that the screen filter made it easy to understand how to position the infant. Using the app made three parents worried about their baby's development. A minority of the parents (11% totally or strongly agreed) preferred visits from health care personnel rather than sending films through the app. One parent found that the app did not facilitate sending films to health care personnel. Most parents (37 of 38) totally or strongly agreed that it felt good to be a part of their baby's assessment, and 79% reported that the assessment of their baby's movements using NeuroMotion™ felt safe and secure (Table 3).
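The response-rate figures can be checked with simple arithmetic. The sketch below is illustrative only; the variable and function names are hypothetical. The lower bound assumes, as the text does, that each responding mother belongs to a different infant, and the upper bound assumes that every respondent belongs to a different infant.

```python
PARENTS_INVITED = 95
INFANTS = 52
RESPONDENTS = 38
MOTHERS_AMONG_RESPONDENTS = 31


def response_rates():
    """Return (per-parent rate, lower and upper per-infant rate) as fractions."""
    per_parent = RESPONDENTS / PARENTS_INVITED
    # Lower bound: at least one parent answered for each of 31 infants.
    per_infant_low = MOTHERS_AMONG_RESPONDENTS / INFANTS
    # Upper bound: all respondents are parents of different infants.
    per_infant_high = min(RESPONDENTS, INFANTS) / INFANTS
    return per_parent, per_infant_low, per_infant_high


if __name__ == "__main__":
    for rate in response_rates():
        print(f"{rate:.1%}")  # 40.0%, 59.6%, 73.1%
```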

Inter-rater reliability of GMA
For the GMA phase of the study, 37 infants were initially included. One infant was excluded because the film was not possible to assess and the requested second film was not uploaded by the parents. Of the 36 remaining infants, 15 were born extremely preterm (<28 weeks), 4 very preterm (28–31 weeks), 4 moderately preterm (32–36 weeks) and 13 at term (≥37 weeks) (Table 1). Films lasted between 2 and 3 min (mean 2.54 min). The first films of the infants were recorded at a median age of 12 weeks post-term. A second film was required in eight infants and was sent at a median age of 14.5 weeks (Table 1). Absent fidgety movements were found in 3 of the 36 infants, as reported together with the individual GMA ratings and final consensus in the appendix (Supplementary Table S-3).
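The gestational age bands used to describe the cohort can be expressed as a small helper. This is a sketch for illustration; the function name is hypothetical and it assumes gestational age is given in completed weeks.

```python
def gestational_age_group(completed_weeks: int) -> str:
    """Classify gestational age at birth using the bands given in the text."""
    if completed_weeks < 28:
        return "extremely preterm"
    if completed_weeks <= 31:
        return "very preterm"
    if completed_weeks <= 36:
        return "moderately preterm"
    return "term"


# Group sizes as reported for the inter-rater reliability cohort.
REPORTED_COUNTS = {
    "extremely preterm": 15,
    "very preterm": 4,
    "moderately preterm": 4,
    "term": 13,
}

if __name__ == "__main__":
    print(gestational_age_group(30), sum(REPORTED_COUNTS.values()))
```

The reported group sizes add up to the 36 infants retained in the analysis.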

Discussion
Overall, the app user experience was very positive, and the film quality was excellent in most films the parents recorded. According to the Landis and Koch guidelines [23], this study showed a substantial agreement between the on-site assessors' consensus and the GMA expert for the GMA results in high-risk infants using the NeuroMotion™ app.

Film quality
The NeuroMotion™ app is well suited for GMA because of the excellent film quality recorded with the app, making it possible to detect even subtle movements. The films met the standardised settings for the infant and the surroundings required by the GMA guidelines [5]. Our findings may be compared to those in a study that used the Baby Moves app [13], although the two apps differ in some aspects. NeuroMotion™ has a minimalistic design with a simple colour scheme and brief step-by-step instructions intended to lead intuitively to the right actions and thereby optimise the films (Fig. 2). In the Baby Moves study, 21/180 films could not be analysed due to poor film quality and were non-scorable. In contrast, all films in our study could be scored based on film quality. A success factor could be the simplified instructions or better intuitive visual cues. On the other hand, our study is much smaller, which needs to be considered when comparing the results of the two studies. Another explanation may be that many people today are skilled and experienced smartphone users. Moreover, smartphone software is becoming more technically sophisticated.
a All 37 infants were included; 8 infants were filmed twice because of lacking fidgety movements, poor film quality or because instructions were not followed.

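Table 3
Results from the questionnaire on the usability of the NeuroMotion™ app. [Table body not recovered; surviving fragments: response columns "Strongly agree" and "Partly agree"; category "Reminders" with the item "The push notices were helpful as a reminder when to film."]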

User experience
The survey results reveal overall parental satisfaction with NeuroMotion™. To our knowledge, the Baby Moves app is the only app used for remote GMA with previously published results about app usability [13]. Parents found the instructions within NeuroMotion™, supported by the screen filter (for positioning of the infant), easy to understand and follow, which was also the case with the Baby Moves app [13]. The format of the screen filters in the two apps is somewhat different, and the filter in NeuroMotion™ accommodates infants' different positions. The NeuroMotion™ app serves the infants' needs, as parents can choose when and where to film and can do so when the child is awake and alert. Parents' experiences of filming at home and of being part of the assessment are important. Our findings of high parental satisfaction are consistent with a previous study [24] investigating parents' perspectives on taking part in early neurodevelopmental screening. Opinions varied on whether filming with the NeuroMotion™ app increased the parents' awareness of their babies' development. This variation was also found when evaluating the Baby Moves app [13]. Having a child who is at risk of permanent disability due to a brain insult and who is staying at a neonatal intensive care unit may influence parental anxiety and vulnerability. Parents' participation in early screening may increase their awareness of their infant's risk of neurodevelopmental disability, prepare them emotionally and motivate them to pursue early interventions [24]. Our results indicate that a minority of parents would have preferred visits from health care personnel instead of filming their infant with an app. With both NeuroMotion™ and the Baby Moves app, a minority of the parents felt worried about their baby's development [13]. Nevertheless, some parents' comments demonstrated the importance of combining use of the app with face-to-face meetings.
Telemedicine and mHealth solutions are expanding and are seen as important technological tools in pediatric health care development [14]. Parents of children with genetic diseases [25] view mHealth as a time saver and an essential health care component. Thus, research on usability and implementation of these tools is vital regarding utility issues such as accuracy, efficiency, effectiveness and user satisfaction [15]. The development of family-centred care, regardless of geographical distance, can also be facilitated using technical solutions (such as the NeuroMotion™ app and mHealth). Such solutions can provide specialized care opportunities, offer access to targeted resources and streamline health care resources.
Technical problems, such as push notices sometimes malfunctioning (an issue now resolved), can have an impact on usability. Approximately 13% of users had some difficulties uploading the films but eventually succeeded; these difficulties might be due to issues with the internet connection. Work is underway on a solution that will defer the film upload until the internet connection is stable. This is important so that the app can also be used where the internet connection is less reliable.

Inter-rater reliability in GMA
GMA is a qualitative analysis very much based on the assessor's visual perceptions of movements. Our results show that the final decision should be based on a consensus of at least two assessors. There was substantial agreement between consensus decisions but only moderate agreement between all assessors. Our on-site consensus results agree with two studies estimating GMA's inter-rater agreement in the fidgety period that showed substantial agreement [10,12], while a third showed almost perfect agreement [11]. The variability in the results of inter-rater agreement can be influenced by the statistical methods used for estimation, the study population, blinding and the experience of the assessors.
Having all on-site assessors blinded to medical history, except for gestational age, and a GMA expert totally blinded is unique to our inter-rater reliability study compared to other similar studies [10–12]. This blinded approach is important when studying the true reliability of an assessment. With background knowledge of an infant's medical history, there is a risk of unintentionally allowing other known risk factors for CP to influence decisions on the absence of fidgety movements.
For statistical analysis of inter-rater reliability, we chose the K-alpha method as it considers sample size and number of assessors, not just percentage of agreement. Many studies report the percentage of agreement between assessors, which is difficult to use when comparing studies of different cohort sizes and when predicting inter-rater reliability in future studies. Other statistical methods can be used to calculate the strength of agreement (reliability). We believe that K-alpha is best suited given that it can be used for two or more assessors and can handle missing data [22]. Results from Cohen's kappa and K-alpha can be compared in terms of strength of agreement for categorical data by using the cut-off values for kappa statistics described by Landis and Koch [23].
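As an illustration of how such cut-off values are applied, the sketch below maps a coefficient to the verbal labels commonly cited from Landis and Koch. The function name is hypothetical, and treating the bands as contiguous (for example, assigning a value of 0.205 to "fair") is an assumption of this sketch.

```python
def landis_koch_label(coefficient: float) -> str:
    """Return the Landis and Koch strength-of-agreement label for a coefficient."""
    if coefficient < 0.0:
        return "poor"
    if coefficient <= 0.20:
        return "slight"
    if coefficient <= 0.40:
        return "fair"
    if coefficient <= 0.60:
        return "moderate"
    if coefficient <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Point estimates reported in this study.
    print(landis_koch_label(0.72))  # substantial
    print(landis_koch_label(0.48))  # moderate
```

Applied to the point estimates reported here, 0.72 falls in the "substantial" band and 0.48 in the "moderate" band, matching the wording used in the text.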
Another difficulty in comparing studies is the difference in the expected prevalence of fidgety movements in the study cohorts. The children in the Swedish national neonatal follow-up programme for high-risk infants [26] have a future prevalence of 7–10% for CP [27,28]. Our cohort was recruited from this group of children and would therefore have a similar prevalence of CP and, consequently, of a lack of fidgety movements. The prevalence rate of CP in the abovementioned GMA studies on inter-rater reliability is most likely higher than we expect it to be in our multicentre study. We intend to include more than 1000 infants enrolled in the Swedish national neonatal follow-up programme for high-risk infants [26], allowing us to estimate inter-rater reliability in GMA for different subgroups.
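Why prevalence complicates comparisons can be shown with a small worked example. The sketch below uses Cohen's kappa for two raters because it is simple to compute by hand; it is not the K-alpha analysis of this study, the function name is hypothetical, and the two 2x2 tables are invented for illustration. Both tables have 90% raw agreement, yet the chance-corrected coefficient is much lower when one category dominates.

```python
def cohen_kappa_2x2(both_present: int, only_rater1_present: int,
                    only_rater2_present: int, both_absent: int) -> float:
    """Cohen's kappa for two raters and two categories (present / absent)."""
    n = both_present + only_rater1_present + only_rater2_present + both_absent
    observed = (both_present + both_absent) / n
    p1 = (both_present + only_rater1_present) / n  # rater 1 says "present"
    p2 = (both_present + only_rater2_present) / n  # rater 2 says "present"
    expected = p1 * p2 + (1 - p1) * (1 - p2)       # agreement expected by chance
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented tables, 100 films each, 90 agreements in both.
    balanced = cohen_kappa_2x2(45, 5, 5, 45)  # "absent" rated in half of films
    skewed = cohen_kappa_2x2(85, 5, 5, 5)     # "absent" rated in few films
    print(round(balanced, 2), round(skewed, 2))  # 0.8 0.44
```

This is one reason why coefficients from cohorts with different expected rates of absent fidgety movements are hard to compare directly.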

Strengths and limitations
A reason for the high usability of NeuroMotion™ might be that it was developed in close cooperation between a computer coding specialist, a cyber security engineer, specialists in the field of pediatric neurology and licensed GMA assessors. The inter-rater reliability analysis was based on a small sample size. Still, the statistical method used for this purpose was suitable for a small sample size.
Having information on how to make films in your own language is important. In the NeuroMotion™ app you can only choose between Swedish and English. This language restriction is a limitation because only participants mastering these languages can use it.
A limitation of the survey was the low response rate and the lack of sociodemographic data on the respondents, leading to a risk of response bias. Since respondents were anonymous, we can neither describe the respondents nor clarify reasons for not responding.
It might be that parents who decided to answer the questionnaire were more concerned about their infant's health and development than those who did not respond. Thus, the participating parents might be more motivated, positive and confident in using the app. Another reason for the low response rate could be that only one of the parents decided to answer. The wide time window between filming and receiving the questionnaire may have led to a response bias as well as influenced the response rate, although for the majority of infants the time window was less than 4 months. Another limitation might be that the on-site assessors were aware of the gestational age. However, according to the inclusion criteria, all participants have a high risk of neurological sequelae such as CP, including from causes other than prematurity. Awareness of gestational age should therefore not influence the assessment of fidgety movements significantly.
Further studies are needed to evaluate GMA in the neonatal high-risk infant population in light of early diagnosis supported by telemedicine, to improve knowledge about the assessment's accuracy in this specific setting and to strive for effective, high-quality health care. Assessment of NeuroMotion™ has covered important usability measures, such as accuracy (film quality), efficiency and effectiveness (recording instructions and technical issues) and user satisfaction.

Conclusion
The NeuroMotion™ app is an excellent tool for filming infants both at home and in a clinical setting, thanks to the good technical quality of the films as well as intuitive instructions according to standardised GMA guidelines. Furthermore, parents report a high degree of user satisfaction. The approach with GMA assessors reaching a consensus is recommended given the substantial agreement achieved with a GMA expert. Using the NeuroMotion™ app in a home setting makes it possible to study GMA in larger cohorts of neonatal infants.

Authors' contribution
KAS: Conception, study planning, literature search, study design, data collection, data analysis, data interpretation, writing (area of expertise: pediatric neurology). AFB: Data collection, critical review of the manuscript. A-CE: Interpreting results, critical review of the manuscript (area of expertise: pediatric neurology). MÖ: Study design, data collection, data interpretation, critical review of the manuscript (area of expertise: pediatric physiotherapy). HEK: Conception, study planning, study design, data collection, data analysis, data interpretation, critical review of the manuscript (area of expertise: pediatric neurology).

Writing Assistance
None.

Independence (role of the sponsors)
None of the funders had any role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript.

Fig. 1. Chart of included participants in the different analyses.
KAS: A private donation through the Knut and Alice Wallenberg Foundation for research on stroke and stroke-causing factors in infancy and a grant from Medical Research Council of Southeast Sweden (FORSS-859871). AFB: None. ACE: None. MÖ: None. HEKS: A private donation through the Knut and Alice Wallenberg Foundation for research on stroke and stroke-causing factors in infancy, a grant from the Region Östergötland Research Council, a grant from Medical Research Council of Southeast Sweden (FORSS-859871) and Stockholm County Council (grant number 2019-1138).
Funding source
A private donation through the Knut and Alice Wallenberg Foundation for research on stroke and stroke-causing factors in infancy, a grant from the Region Östergötland Research Council and a grant from Medical Research Council of Southeast Sweden (FORSS-859871).

Declaration of competing interest
K.A. Svensson and H.E.K. Sundelin developed the app NeuroMotion™ with financial support from a private donation through Knut och Alice Wallenbergs Stiftelse. For the purpose of further research and later implementation of NeuroMotion™, the company NeuroMotion AB was founded and is the owner of the app. NeuroMotion AB is owned by KAS and HEKS, has research as its purpose and is non-profit.

Table 1
Characteristics of study participants in the inter-rater reliability analysis a and film quality. One participant was excluded from the inter-rater reliability analysis due to inaccessibility and lack of a second film.
According to guidelines by Landis and Koch [23].
b Consensus was reached in eight cases based on a second film.