Wireless Surface Electromyography and Skin Temperature Sensors for Biofeedback Treatment of Headache: Validation Study with Stationary Control Equipment

Background: The use of wearables and mobile phone apps in medicine is gaining attention. Biofeedback has the potential to exploit the recent advances in mobile health (mHealth) for the treatment of headaches. Objectives: The aim of this study was to assess the validity of selected wireless wearable health monitoring sensors (WHMS) for measuring surface electromyography (SEMG) and peripheral skin temperature in combination with a mobile phone app. This proof of concept will form the basis for developing innovative mHealth delivery of biofeedback treatment among young persons with primary headache. Methods: Sensors fulfilling the following predefined criteria were identified: wireless, small size, low weight, low cost, and simple to use. These sensors were connected to an app and used by 20 healthy volunteers. Validity was assessed through the agreement with simultaneous control measurements made with stationary neurophysiological equipment. The main variables were (1) trapezius muscle tension during different degrees of voluntary contraction and (2) voluntary increase in finger temperature. Data were statistically analyzed using Bland-Altman plots, intraclass correlation coefficient (ICC), and concordance correlation coefficient (CCC). Results: The app was programmed to receive data from the wireless sensors, process them, and feed them back to the user through a simple interface. Excellent agreement was found for the temperature sensor regarding increase in temperature (CCC .90; 95% CI 0.83-0.97). Excellent to fair agreement was found for the SEMG sensor. The ICC for the average of 3 repetitions during 4 different target levels ranged from .58 to .81. The wireless sensor showed consistency in muscle tension change during moderate muscle activity. Electrocardiography artifacts were avoided through right-sided use of the SEMG sensors. Participants evaluated the setup as usable and tolerable. 
Conclusions: This study confirmed the validity of wireless WHMS connected to a mobile phone for monitoring neurophysiological parameters of relevance for biofeedback therapy. (JMIR Biomed Eng 2018;3(1):e1) doi:10.2196/biomedeng.9062


Introduction
In the emerging era of mobile health (mHealth) and technology, the use of wearable sensors and mobile phone health apps has recently gained attention. This has led to a subcategory of health informatics, labeled mHealth, encompassing the use of mobile phones for medical purposes [1]. In addition to these apps, there is also a wide array of wearable health monitoring sensors (WHMS) [2], which represent a means for patients to access real-time data from a broad range of physiological parameters at home [3][4][5], thus enabling extensive data acquisition [6]. mHealth is of special interest to the younger generation, which is constantly exposed to and familiarized with such technology. It is also increasing in popularity within the field of headache care and research. In particular, mobile phone-based headache diaries are frequently used [7]. However, there is a potential for extending this mobile technology into the preventive treatment of headache disorders, such as migraine. The bulk of current mHealth research focuses on chronic conditions and delivery of self-educational treatment [8], fitting the description of behavioral headache treatments. Biofeedback, one of the several behavioral headache treatments, is well established and empirically supported [9]. Systematic reviews with meta-analyses demonstrated that biofeedback is effective as a migraine prophylaxis in both the adult and pediatric populations [10,11]. However, the treatment is both time-consuming and costly and therefore not readily available for those in need. Thus, a more optimal approach for behavioral headache treatment has long been sought [12,13]. Biofeedback has the potential to exploit the recent advances in mHealth technology [14,15]. Meanwhile, biofeedback mHealth solutions for other purposes, such as exercise and postcancer swallowing exercises, are being developed [16,17].
Modalities proven effective in biofeedback treatment for headache disorders include surface electromyography (SEMG) and peripheral skin temperature. Both modalities are common in the current development of WHMS [2] and may serve as natural elements in the implementation of biofeedback solutions. Nevertheless, such WHMS sensors have not been validated for use in neurophysiological monitoring for the purpose of biofeedback therapy.
The aim of this study was to assess the validity of WHMS for measuring SEMG and peripheral skin temperature in combination with a mobile phone app. This proof of concept would form the basis for the development of a novel, innovative mHealth system for biofeedback therapy for young persons with primary headache.

Study Design
In the first phase of the study, we identified suitable WHMS and developed the preliminary software. In the second phase of the study, we recruited healthy volunteers to establish the validity of the chosen WHMS. The study was exploratory in nature, with the main aim to evaluate the validity of the chosen WHMS by assessing the agreement compared with stationary neurophysiological equipment following recommended guidelines for agreement studies [18].

Identification of Sensors
The inclusion criteria and requirements for suitable sensors were (1) wireless setup, (2) small size, (3) low weight, (4) simple to use compared with standard clinical equipment, and (5) low cost.

Software Development
The first version of the app was created as a minimal viable product (MVP). This preliminary version was programmed to serve as the starting point of iterative and incremental rounds of testing [19], allowing subsequent development and fine-tuning of the user interface and software components in an upcoming usability study.

Participants
We considered a sample size of 18 to be sufficient, based on the model for sample size determination in reliability studies presented by Bonett [20] (Multimedia Appendix 1). We set out to recruit 20 healthy volunteers to account for potential dropouts. Participants were recruited as a convenience sample by actively seeking out young individuals from the local research and student community. Exclusion criteria were reduced hearing, vision, or sensibility, and severe neurologic or psychiatric disease.

Equipment
The NeckSensor (EXPAIN, Oslo, Norway) was selected as the wireless WHMS to measure muscle tension. This is a small, compact bipolar SEMG sensor, with a single SR-R adhesive gel patch containing both electrodes (total patch area, 19.8 cm²), and no patient ground electrode. For wireless measurement of temperature, we selected the PASPORT Skin/Surface Temperature Probe, PS-2131, combined with PASPORT Temperature sensor, PS-2125, and AirLink, PS-3200 (Pasco, Roseville, CA, USA). Both sensors transmitted signals via Bluetooth Smart/4.0.
As the stationary equipment, the following AD Instruments (Dunedin, New Zealand) setup was used: (1)

Experimental Procedure
Participants were seated in a recliner at a 90 degree angle in the neurophysiological laboratory. The 2 electrodes from the NeckSensor were placed over the upper fibers of the right trapezius muscle midway along the line between the spinous process C7 and the acromion [21,22]. Since simultaneous registrations of SEMG signals from the same location with different sets of surface electrodes are not possible, one set of electrodes from the stationary equipment was placed 2 cm cranially of the NeckSensor, and the other set was placed 2 cm caudally. The interelectrode distance was 4 cm. The "patient ground" electrode for the stationary equipment was placed over the spinous process C7 (Figure 1). The skin beneath the stationary electrodes was washed with alcohol swabs. The 2 skin temperature sensors were attached, without touching each other, to the volar pad of the distal phalange on the second finger with sticky tape, with the stationary sensor placed radially of the 2 sensor electrodes.
Figure 1. Scheme of the electrode placements over the upper trapezius fibers. The wireless sensor electrode pair was placed first, midway in the line between the acromion and the spinous process C7. One of the two pairs of stationary sensor electrodes was placed cranially, whereas the other was placed caudally of the wireless sensor electrode pair. The interelectrode distance for each pair was 4 cm.
Initially, each participant was asked to relax for 5 min to allow the skin temperature to increase during relaxation. Relaxation was achieved by asking the participant to do nothing and sit still on the recliner. This served to give a baseline (relaxed) muscle tension measurement. Relaxed trapezius muscle tension (baseline) was recorded in the last 30 s of relaxation. Thereafter, the temperature sensors were detached to allow the measurement of room temperature for the remainder of the procedure. Subsequently, the participant was instructed to complete a series of exercises to activate the upper fibers of the trapezius muscle. Arbitrary angle isometric maximal voluntary contraction (MVC), through shoulder elevation, was completed in 3 repetitions, each lasting for 6 s [22][23][24][25]. The SEMG and force were simultaneously registered. The force was recorded by a dynamometer (Manual Muscle Tester, Lafayette Instruments, USA) attached to a fixed sling placed over the acromion. Subsequently, the participant was asked to complete similar sets of contractions at 50% (VC50) and 25% (VC25) of maximal contraction guided by a sound signal from the dynamometer elicited at a corresponding set force. Finally, the participant was asked to complete 3 repetitions of static contractions (15 s each) performed by abducting both shoulders to a 90 degree angle and holding them against gravity [22].
After completing the exercises, the participant was asked to answer a 5-item user evaluation questionnaire. Of these, 3 questions had reply options on a 5-point Likert scale, ranging from "Very dissatisfied" to "Very satisfied," while the remaining 2 questions were open for free comments (Table 1).
Table 1. User evaluation questionnaire.
2. To what degree did you feel that the use of shoulder-musculature reflected the feedback in the app?
3. Do you recognize the wireless sensors as safe to use?
4. Did you experience any undesirable harmful effects (if yes, please explain)?
5. Do you have any further comments (if yes, please explain)?

Data Management
The NeckSensor uses a 12-bit ADC resolution sampled at 1024 Hz with a third-order 10-480 Hz active bandpass filter. The sensor was programmed to calculate and transmit mean square values internally, with a window width of 40 ms, with no overlap, and a frequency of 25 Hz in order not to overload the Bluetooth capacity. The PowerLab sampled the SEMG signals at 2000 Hz with a fourth-order Bessel lowpass filter at 500 Hz and a first-order highpass filter at 10 Hz. In addition, a 50 Hz analog notch filter was applied [26]. All stationary recordings were evaluated visually for the presence of electrocardiography (ECG) artifacts. If found, these were to be corrected by removing the spike-correlated area in the SEMG signal and subsequently replacing the gap with surrounding SEMG activity.
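The windowing described for the wireless sensor can be illustrated with a minimal Python sketch. This is not the sensor's firmware: the function name `windowed_mean_square` is hypothetical, and because 1024 Hz x 40 ms gives 40.96 samples, the integer window of 41 samples used here is an assumption made only for illustration.

```python
from typing import List, Sequence


def windowed_mean_square(samples: Sequence[float], window: int) -> List[float]:
    """Mean of squared samples over consecutive, non-overlapping windows.

    A trailing partial window is dropped.
    """
    if window <= 0:
        raise ValueError("window must be a positive number of samples")
    n_windows = len(samples) // window
    return [
        sum(x * x for x in samples[i * window:(i + 1) * window]) / window
        for i in range(n_windows)
    ]


# Assumed values for illustration only (see the lead-in paragraph).
SAMPLING_RATE_HZ = 1024
WINDOW_SAMPLES = round(SAMPLING_RATE_HZ * 0.040)  # 40.96 rounded to 41 samples

# One second of a synthetic signal alternating between +0.5 and -0.5.
one_second = [0.5 if i % 2 == 0 else -0.5 for i in range(SAMPLING_RATE_HZ)]
packets = windowed_mean_square(one_second, WINDOW_SAMPLES)
print(len(packets), packets[0])
```

Transmitting one value per window instead of every raw sample reduces the data rate by roughly a factor of 40, which is the motivation given in the text for sparing the Bluetooth capacity.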
First, the stationary readings were root mean square (RMS) rectified and then averaged over the two sets of electrodes to avoid phase-cancellations. The RMS value was calculated from the mean square values of the wireless sensors. The RMS values for each muscle contraction exercise to be used in the analyses were calculated as the mean of the repetitions for both equipment sets. For the temperature measurements, we calculated the difference in temperature from the start to the end of relaxation and the difference between the temperature at the end of relaxation and room temperature.
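The data reduction steps above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the RMS for a period is the square root of the mean of equal-length mean-square windows; the text does not state the exact formula used.

```python
import math
from statistics import fmean
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """RMS of raw samples, as for one stationary electrode pair."""
    return math.sqrt(fmean([x * x for x in samples]))


def rms_from_mean_squares(mean_squares: Sequence[float]) -> float:
    """RMS over a period covered by equal-length mean-square windows (assumed)."""
    return math.sqrt(fmean(mean_squares))


def stationary_rms(cranial: Sequence[float], caudal: Sequence[float]) -> float:
    """Average of the RMS values of the two stationary electrode pairs.

    RMS is taken per pair before averaging, so signals of opposite
    phase cannot cancel each other out.
    """
    return (rms(cranial) + rms(caudal)) / 2


def exercise_value(repetition_rms: Sequence[float]) -> float:
    """Mean RMS across the repetitions of one exercise."""
    return fmean(repetition_rms)


# Two synthetic signals in opposite phase: averaging the raw samples would
# give zero, whereas averaging the per-pair RMS values preserves the amplitude.
cranial = [1.0, -1.0, 1.0, -1.0]
caudal = [-1.0, 1.0, -1.0, 1.0]
print(stationary_rms(cranial, caudal))
```

The synthetic example shows why the order of operations matters: rectifying each electrode pair before averaging is what avoids the phase cancellation mentioned in the text.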

Statistics
The means and SDs for the RMS values during trapezius muscle exercises and the chosen temperature data points were calculated. Systematic differences between stationary and wireless equipment were assessed with the Wilcoxon signed-rank test.
Mean difference (MD) and limits of agreement (LOA), together with Bland-Altman plots, were used as descriptive tools [27]. We calculated the intraclass correlation coefficient (ICC) with a two-way, mixed-effects consistency of agreement model. Coefficients for both individual and average agreement were presented. In addition, we calculated the Lin concordance correlation coefficient (CCC) [28][29][30]. For the ICC and CCC analyses, the data were first transformed to meet the assumptions of a two-way analysis of variance model: the natural logarithm was calculated after adding a constant of 0.1 to adjust for values close to zero. The ICC values were interpreted as suggested by Cicchetti et al [31], that is, unacceptable or poor (.00-.40), fair (.41-.60), good (.61-.75), and excellent (.75-1.00). All data were analyzed by using the statistical package Stata version 14 (StataCorp, College Station, TX, USA).
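For readers who want to see these agreement measures concretely, the following Python sketch computes the MD with 95% LOA and the Lin CCC for paired measurements. It is an illustration only, not the Stata code used in the study; the function names are hypothetical, the LOA use the conventional MD ± 1.96 SD of the differences, and the CCC uses the standard Lin formula with 1/n moments.

```python
import math
from statistics import fmean, stdev
from typing import List, Sequence, Tuple


def bland_altman(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean difference, lower LOA, upper LOA) for paired values a - b."""
    if len(a) != len(b):
        raise ValueError("paired measurements must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    md = fmean(diffs)
    sd = stdev(diffs)  # sample SD of the paired differences
    return md, md - 1.96 * sd, md + 1.96 * sd


def lin_ccc(a: Sequence[float], b: Sequence[float]) -> float:
    """Lin concordance correlation coefficient (absolute agreement)."""
    if len(a) != len(b):
        raise ValueError("paired measurements must have equal length")
    n = len(a)
    mean_a, mean_b = fmean(a), fmean(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    return 2 * cov / (var_a + var_b + (mean_a - mean_b) ** 2)


def log_transform(values: Sequence[float], constant: float = 0.1) -> List[float]:
    """Natural logarithm after adding a constant, as described in the text."""
    return [math.log(v + constant) for v in values]


# Synthetic example: the second method reads exactly 1 unit higher.
method_a = [1.0, 2.0, 3.0, 4.0]
method_b = [2.0, 3.0, 4.0, 5.0]
print(bland_altman(method_a, method_b))
print(lin_ccc(method_a, method_b))
```

In the synthetic example the two methods differ by a constant offset, so the LOA collapse onto the MD while the CCC falls below 1, because the CCC penalizes any systematic difference between methods.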

Sensors and Software
The WHMS fulfilling the predefined requirements were identified through pragmatic internet searches. The MVP version of the app used in the experimental procedure was programmed to receive data from the wireless sensors and feed raw data back to the user. The raw data were presented as two columns increasing in height with increase in muscle tension and temperature, respectively. The app was programmed to allow connection of any WHMS using Bluetooth.

Participants
A total of 20 healthy participants were recruited and completed the experimental procedure. Of these, 12 were male. The mean age of the participants was 24.7 years (SD 2.7, range 18-29 years).

Surface Electromyography Sensor Agreement
We observed no ECG artifacts in the SEMG recordings (Figure 2). Hence, the ECG-related elements were not removed from the SEMG recordings. The means and SDs of the RMS values during the trapezius muscle exercises are shown in Table 2. The wireless sensor showed a lower voltage than the stationary equipment during all contraction periods and at baseline. Table 3 summarizes the MD in millivolts (mV) between stationary and wireless equipment with corresponding LOA, for each of the exercises. Compared with the wireless equipment, the stationary equipment indicated a systematically higher voltage during MVC (0.25 mV), VC50 (0.11 mV), VC25 (0.06 mV), static hold (0.07 mV), and baseline (0.04 mV). A Bland-Altman plot, visually presenting the MD and LOA for VC25, is shown in Figure 3. Table 3 also summarizes the ICC and CCC values for the SEMG equipment comparisons.

Peripheral Skin Temperature Sensor Agreement
Means and standard deviations of the temperature measurements at the 3 selected time points are shown in Table 2. The start temperature between the 2 sets of equipment did not differ significantly (P=.46), but the wireless sensor indicated a higher temperature at the end of relaxation (P<.001) and at room temperature (P<.001; Table 2).
The between-equipment MDs for changes in the temperature are presented in Table 3, along with the LOA and agreement indices. A Bland-Altman plot visually representing the MD and LOA for temperature change during relaxation is depicted in Figure 5. Excellent agreement was found for the change in temperature during relaxation (CCC .90, 95% CI 0.83-0.97) and from end of relaxation to room temperature (CCC .98, 95% CI 0.96-1.0). A rise in temperature was detected among 17 participants on the stationary equipment, and among 18 participants on the wireless equipment. Moreover, a rise in temperature of more than 1°C was detected among 15 participants on both equipment sets (Figure 6).

Evaluation Questionnaire
In total, 19 of the 20 participants perceived the use of wireless sensors as practical (n=14) or very practical (n=5). Likewise, the absolute majority of participants reported that the app feedback reflected the use of shoulder musculature to a large (n=9) or a very large (n=9) degree. All participants regarded the use of wireless sensors as safe (n=2) or very safe (n=18). In contrast, 2 of the 20 participants reported undesirable, harmful effects, with both stating that the removal of the electrodes attached to the stationary equipment was unpleasant.

Principal Findings
This study aimed to provide a proof of concept for using a mobile phone and WHMS for biofeedback purposes, in a fashion similar to phase I-II development of new drug treatments [32]. We chose to investigate temperature and SEMG because they are the most commonly used biofeedback modalities [11] and are shown to be especially effective in adolescents [33]. We identified sensors fulfilling a set of predefined criteria that were considered necessary for the sensors to gain acceptance among patients, and thus these sensors were used [34]. The choice of sensors was arbitrary, as long as the predefined criteria were met. Even though the use of other temperature and SEMG sensors would not yield identical results, we argue that our approach has provided a proof of concept.
We found that the use of a wireless temperature sensor had almost perfect agreement regarding the change in finger temperature during relaxation. Furthermore, the use of a wireless SEMG sensor had a fair to excellent agreement for measuring tension in the trapezius muscle. We noted that the wireless SEMG consistently showed a lower voltage than the stationary equipment. The SEMG sensors showed excellent agreement during MVC and VC50, good agreement during VC25, and fair agreement during static hold and baseline. However, under the assumption that the stationary equipment was the most sensitive, it is not surprising that the calculated agreement decreased slightly at lower activity levels since random and equipment-generated noise constituted a larger part of the signal at low EMG-levels. Nonetheless, the wireless SEMG sensor registered consistent changes in muscle tension. We observed no ECG artifacts in the SEMG recordings. Therefore, it can be assumed that the ECG artifacts do not have a relevant influence on the SEMG recorded from closely placed bipolar electrodes on the right shoulder. Moreover, the safety and usability of the setup were highly satisfactory. In conclusion, the wireless sensors are well suited for biofeedback purposes.

Strengths and Limitations
The proper sample size for the study was assessed before recruiting participants (Multimedia Appendix 1). Traditionally, a sample of 15 to 20 participants is deemed sufficient for reliability studies [35]. However, the use of more precise calculations of sample sizes has been previously suggested [36]. Therefore, we used a CI estimation model suggested by Bonett [20] to determine the minimum sample size required. Due to the interindividual variation in our findings, the analyses would possibly have benefited from having a larger sample size because we did not obtain a predefined CI for all analyses.
There is a large degree of variability in individual human anatomical properties that may influence SEMG readings. This includes the thickness of fatty tissues, resting muscle length, velocity of contraction, muscle cross-sectional area, fiber type, posture change, interelectrode distance, skin impedance, age, and sex [22]. We chose to combine the recordings for the 2 pairs of stationary sensor electrodes to approximate the muscle activity of the wireless sensor placed in between. The relative spread of the electrode pairs may have led to EMG crosstalk, and muscle contraction exercises performed by untrained participants may have additionally resulted in movement artifacts, and suboptimal and varying performances [37]. The abovementioned factors may all have limited the precision of our measurements and contributed to a larger degree of interindividual differences, thus lowering individual ICC and CCC values for SEMG agreement. Likewise, the placement of the 2 temperature sensors beside each other on the finger might have led to differences in measurements. Figure 5 shows 1 outlier that displayed a larger increase in temperature by 1°C with the stationary equipment than with the wireless equipment. This differs from the majority that displayed the largest temperature increase with the wireless equipment. Nevertheless, LOA of ±1.5°C is still acceptable [38].
The SEMG signals usually have a frequency distribution with significant energy up to 400 to 500 Hz, requiring a sampling frequency of at least 1000 Hz (preferably 2000 Hz) to meet the Nyquist rate (twice the highest signal frequency) and avoid aliasing [39]. However, it is known that oversampling above this critical Nyquist rate does not significantly improve the signal quality [40] but will likely lead to higher cost and size of the sensor. The SEMG signals are usually bandpass filtered at 10 to 500 Hz [41], which we consequently chose to do for both setups. Furthermore, we observed that the notch filter, at 50 Hz, for the stationary equipment seemed to be saturated during recordings. After analog filtering, sine waves of 20 ms duration were still present. This may be explained by power-line noise, despite the use of a notch filter [42]. The wireless sensor also applies a notch filter at 50 Hz, which increases the signal-to-noise ratio. In total, we concluded that the wireless SEMG sensor applies appropriate signal processing settings.
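The sampling argument can be made concrete with a short Python sketch. The helper names are hypothetical; the sketch only applies the textbook relations that the Nyquist rate is twice the highest signal frequency, that a sinusoid sampled too slowly appears folded to a lower frequency, and that a 50 Hz component has a period of 20 ms.

```python
def nyquist_rate(max_signal_hz: float) -> float:
    """Minimum sampling frequency needed to represent max_signal_hz."""
    return 2.0 * max_signal_hz


def aliased_frequency(signal_hz: float, sampling_hz: float) -> float:
    """Apparent frequency of a sinusoid after sampling at sampling_hz."""
    folded = signal_hz % sampling_hz
    return min(folded, sampling_hz - folded)


def period_ms(frequency_hz: float) -> float:
    """Duration of one cycle in milliseconds."""
    return 1000.0 / frequency_hz


print(nyquist_rate(500.0))               # sampling rate needed for 500 Hz content
print(aliased_frequency(600.0, 1000.0))  # a 600 Hz component sampled at 1000 Hz
print(period_ms(50.0))                   # one cycle of 50 Hz power-line noise
```

Under these relations, SEMG content up to 500 Hz calls for at least 1000 Hz sampling, and the residual sine waves of 20 ms duration correspond to the 50 Hz power-line frequency.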
We chose different statistical methods for assessing agreement to evaluate different properties of the wireless sensors. The Wilcoxon signed-rank tests, together with the Bland-Altman plot and LOA, assess the degree of systematic differences and expected variance between measurements. A two-way, mixed-effect ICC model [43] ignores the element of rater variance (raters fixed as the 2 equipment sets), and the estimate can thus serve as an index of consistency [28,30,44,45]. This is useful to assess agreement when there are mean differences between 2 measurement methods. We reported both individual and average ICC values, as the average value becomes useful when a large degree of interindividual variance exists or if individual readings are considered unreliable [30]. On the other hand, we also calculated the CCC to evaluate the degree of absolute agreement, that is, the 2 measurement methods showing identical values.
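A minimal Python sketch of a consistency-type ICC may help to show why it tolerates a mean difference between equipment sets. It is not the Stata routine used in the study: the function name is hypothetical, and the formulas are the standard two-way mean-squares expressions for the single-measure and average-measure consistency ICC.

```python
from statistics import fmean
from typing import Sequence, Tuple


def icc_consistency(ratings: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Single- and average-measure consistency ICC from a two-way layout.

    ratings holds one row per participant and one column per equipment set.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = fmean([v for row in ratings for v in row])
    row_means = [fmean(row) for row in ratings]
    col_means = [fmean([row[j] for row in ratings]) for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    single = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    average = (ms_rows - ms_error) / ms_rows
    return single, average


# Synthetic data: the second equipment set reads exactly 1 unit higher.
offset_only = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0]]
print(icc_consistency(offset_only))

# Synthetic data with some disagreement in ranking between equipment sets.
noisy = [[1.0, 1.0], [2.0, 3.0], [3.0, 2.0], [4.0, 4.0]]
print(icc_consistency(noisy))
```

Because the column (equipment) sum of squares is removed before the error term is formed, a constant offset between the two equipment sets leaves the consistency ICC unchanged, whereas an absolute agreement index would be lowered by the same offset.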

Interpretation
We have compared the WHMS with a gold standard; however, this does not imply that the gold standard is without measurement error. Thus, some lack of agreement is inevitable [46]. As pointed out by Bland and Altman [47], one should keep in mind that correlation coefficients alone do not assess interchangeability of measurement methods. The acceptable level of agreement in order to claim validity is a clinical decision. Considering the intended use of the chosen sensors, a high degree of absolute agreement is not a necessity, but consistency of agreement is important. We certainly observed that there exists variance in the data, leading to a low degree of absolute agreement. On the other hand, SEMG readings changed similarly and as expected through the experimental procedure for each participant, despite dissimilarities between the 2 equipment sets. This consistency is indeed supported by excellent to fair agreement of ICC values. Furthermore, the wireless SEMG sensor was less reliable at lower voltage, at least in terms of absolute agreement, when compared with our gold standard. A well-designed SEMG setup usually produces a system noise of about 1% of the MVC [48]. Our stationary equipment baseline showed 7% of MVC, which means that there was some inherent noise in the gold standard setup. In contrast, the baseline readings of the wireless sensors amounted to 3% of MVC, which in part may explain the increasing deviation at lower voltages.
Although the SEMG sensor did not demonstrate excellent agreement in all analyses, both SEMG and temperature WHMS appear to be suited for app-based biofeedback. Interestingly, 15 out of 20 participants (75%) managed to raise their temperature by more than 1°C during a single naive session, indicating that the setup was simple to master. Moreover, all participants had similar changes in muscle tension through the sets of exercises. However, it is unlikely that the users will be able to decrease their muscle tension throughout the entire duration of a biofeedback session [49]. This means that detecting a change in tension is more important than the absolute values.
In line with this, it was recently shown that the feedback itself is more important than lowering muscle tension in the treatment of headache [50]. Taken together, these findings imply that perfect sensor agreement in itself is not a prerequisite for an app-based biofeedback platform. The main focus of app-based biofeedback should be directed at the development of high-quality feedback mechanisms and user interfaces.

Prospects for Future Research
This study confirmed the usability of WHMS in a biofeedback setting and established partial evidence for an upcoming biofeedback app. At any rate, the scientific validation of the sensor is of utmost importance for the value and effectiveness of a future treatment program. The choice to use an MVP app to assess agreement enables iterative and incremental developments. Future research should be carried out to establish further the basis for the use of WHMS for medical purposes in the emerging era of health informatics and mHealth. As an example, similar validation of heart rate variability measurements, which is of interest in biofeedback treatment, has been conducted [51,52,53]. We are currently exploring the user interface and assessing the usability of the app among adolescents with migraine.

Conclusions
This study confirmed the validity of wireless WHMS connected to a mobile phone for monitoring neurophysiological parameters of relevance for biofeedback therapy.