Published in Vol 3, No 1 (2018): Jan-Dec

Wireless Surface Electromyography and Skin Temperature Sensors for Biofeedback Treatment of Headache: Validation Study with Stationary Control Equipment

Original Paper

1Department of Neuromedicine and Movement Science, Norwegian University of Science and Technology, Trondheim, Norway

2Department of Neurology and Clinical Neurophysiology, St Olavs Hospital, Trondheim, Norway

3National Advisory Unit on Headaches, Section of Neurology, St Olavs Hospital, Trondheim, Norway

4Department of Psychology, Norwegian University of Science and Technology, Trondheim, Norway

5Department of Physical Medicine and Rehabilitation, St Olavs Hospital, Trondheim, Norway

Corresponding Author:

Anker Stubberud

Department of Neuromedicine and Movement Science

Norwegian University of Science and Technology

Nevro Øst, Edvard Griegs Gate 8

Trondheim, 7030


Phone: 47 73 59 20 20


Background: The use of wearables and mobile phone apps in medicine is gaining attention. Biofeedback has the potential to exploit the recent advances in mobile health (mHealth) for the treatment of headaches.

Objectives: The aim of this study was to assess the validity of selected wireless wearable health monitoring sensors (WHMS) for measuring surface electromyography (SEMG) and peripheral skin temperature in combination with a mobile phone app. This proof of concept will form the basis for developing innovative mHealth delivery of biofeedback treatment among young persons with primary headache.

Methods: Sensors fulfilling the following predefined criteria were identified: wireless, small size, low weight, low cost, and simple to use. These sensors were connected to an app and used by 20 healthy volunteers. Validity was assessed through the agreement with simultaneous control measurements made with stationary neurophysiological equipment. The main variables were (1) trapezius muscle tension during different degrees of voluntary contraction and (2) voluntary increase in finger temperature. Data were statistically analyzed using Bland-Altman plots, intraclass correlation coefficient (ICC), and concordance correlation coefficient (CCC).

Results: The app was programmed to receive data from the wireless sensors, process them, and feed them back to the user through a simple interface. Excellent agreement was found for the temperature sensor regarding increase in temperature (CCC .90; 95% CI 0.83-0.97). Excellent to fair agreement was found for the SEMG sensor. The ICC for the average of 3 repetitions during 4 different target levels ranged from .58 to .81. The wireless sensor showed consistency in muscle tension change during moderate muscle activity. Electrocardiography artifacts were avoided through right-sided use of the SEMG sensors. Participants evaluated the setup as usable and tolerable.

Conclusions: This study confirmed the validity of wireless WHMS connected to a mobile phone for monitoring neurophysiological parameters of relevance for biofeedback therapy.

JMIR Biomed Eng 2018;3(1):e1



In the emerging era of mobile health (mHealth) technology, the use of wearable sensors and mobile phone health apps has recently gained attention. This has led to a subcategory of health informatics, labeled mHealth, encompassing the use of mobile phones for medical purposes [1]. In addition to these apps, there is also a wide array of wearable health monitoring sensors (WHMS) [2], which represent a means for patients to access real-time data from a broad range of physiological parameters at home [3-5], thus enabling extensive data acquisition [6]. mHealth is of special interest to the younger generation, which is constantly exposed to and familiarized with such technology. It is also increasing in popularity within the field of headache care and research. In particular, mobile phone–based headache diaries are frequently used [7]. However, there is a potential for extending this mobile technology into the preventive treatment of headache disorders, such as migraine. The bulk of current mHealth research focuses on chronic conditions and delivery of self-educational treatment [8], fitting the description of behavioral headache treatments. Biofeedback, one of the several behavioral headache treatments, is well established and empirically supported [9]. Systematic reviews with meta-analyses demonstrated that biofeedback is effective as a migraine prophylaxis in both the adult and pediatric populations [10,11]. However, the treatment is both time-consuming and costly and therefore not readily available for those in need. Thus, a more optimal approach for behavioral headache treatment has long been sought [12,13]. Biofeedback has the potential to exploit the recent advances in mHealth technology [14,15]. Meanwhile, biofeedback mHealth solutions for other purposes, such as exercise and postcancer swallowing exercises, are being developed [16,17].

Modalities proven effective in biofeedback treatment for headache disorders include surface electromyography (SEMG) and peripheral skin temperature. Both modalities are common in the current development of WHMS [2] and may serve as natural elements in the implementation of biofeedback solutions. Nevertheless, such WHMS sensors have not been validated for use in neurophysiological monitoring for the purpose of biofeedback therapy.

The aim of this study was to assess the validity of WHMS for measuring SEMG and peripheral skin temperature in combination with a mobile phone app. This proof of concept would form the basis for the development of a novel, innovative mHealth system for biofeedback therapy for young persons with primary headache.

Study Design

In the first phase of the study, we identified suitable WHMS and developed the preliminary software. In the second phase of the study, we recruited healthy volunteers to establish the validity of the chosen WHMS. The study was exploratory in nature, with the main aim to evaluate the validity of the chosen WHMS by assessing the agreement compared with stationary neurophysiological equipment following recommended guidelines for agreement studies [18].

Identification of Sensors

The inclusion criteria and requirements for suitable sensors were (1) wireless setup, (2) small size, (3) low weight, (4) simple to use compared with standard clinical equipment, and (5) low cost.

Software Development

The first version of the app was created as a minimal viable product (MVP). This preliminary version was programmed to serve as the starting point of iterative and incremental rounds of testing [19], allowing subsequent development and fine-tuning of the user interface and software components in an upcoming usability study.


We considered a sample size of 18 to be sufficient, based on the model for sample size determination in reliability studies presented by Bonett [20] (Multimedia Appendix 1). We set out to recruit 20 healthy volunteers to account for potential dropouts. Participants were recruited as a convenience sample by actively seeking out young individuals from the local research and student community. Exclusion criteria were reduced hearing, vision, or sensibility, and severe neurologic or psychiatric disease.


The NeckSensor (EXPAIN, Oslo, Norway) was selected as the wireless WHMS to measure muscle tension. This is a small, compact bipolar SEMG sensor, with a single SR-R adhesive gel patch containing both electrodes (total patch area, 19.8 cm²), and no patient ground electrode. For wireless measurement of temperature, we selected the PASPORT Skin/Surface Temperature Probe, PS-2131, combined with PASPORT Temperature sensor, PS-2125, and AirLink, PS-3200 (Pasco, Roseville, CA, USA). Both sensors transmitted signals via Bluetooth Smart/4.0.

As the stationary equipment, the following AD Instruments (Dunedin, New Zealand) setup was used: (1) SEMG signals recorded with 5-Lead Shielded Lead Wires (MLA2505) and 5-Lead Shielded BioAmp cable (MLA2540) attached to Red Dot 2560 electrodes with a silver/silver-chloride 3.48 cm² sensor area (3M Health Care, Germany) fed through a Dual BioAmp, FE135, and PowerLab 8/35; (2) equivalent lead wires, cables, and electrodes for registration of an electrocardiogram (ECG) through a separate Dual BioAmp; and (3) temperature registered through Skin Temperature Pod and Probe, ML309 + MLT422/A fed through PowerLab. The recordings were visualized and analyzed using the LabChart 8 software (AD Instruments, Dunedin, New Zealand) installed on a Dell Latitude E4310 laptop.

Experimental Procedure

Participants were seated in a recliner at a 90 degree angle in the neurophysiological laboratory. The 2 electrodes from the NeckSensor were placed over the upper fibers of the right trapezius muscle midway along the line between the spinous process C7 and the acromion [21,22]. Since simultaneous registrations of SEMG signals from the same location with different sets of surface electrodes are not possible, one set of electrodes from the stationary equipment was placed 2 cm cranially of the NeckSensor, and the other set was placed 2 cm caudally. The interelectrode distance was 4 cm. The “patient ground” electrode for the stationary equipment was placed over the spinous process C7 (Figure 1). The skin beneath the stationary electrodes was washed with alcohol swabs. The 2 skin temperature sensors were attached, without touching each other, to the volar pad of the distal phalange on the second finger with sticky tape, with the stationary sensor placed radially of the 2 sensor electrodes.

Figure 1 shows the scheme of the electrode placements over the upper trapezius fibers. The wireless sensor electrode pair was placed first, midway in the line between the acromion and the spinous process C7. One of the two pairs of stationary sensor electrodes was placed cranially, whereas the other was placed caudally of the wireless sensor electrode pair. The interelectrode distance for each pair was 4 cm.

Initially, each participant was asked to relax for 5 min to allow the skin temperature to increase during relaxation. Relaxation was achieved by asking the participant to do nothing and sit still on the recliner. This served to give a baseline (relaxed) muscle tension measurement. Relaxed trapezius muscle tension (baseline) was recorded in the last 30 s of relaxation. Thereafter, the temperature sensors were detached to allow the measurement of room temperature for the remainder of the procedure. Subsequently, the participant was instructed to complete a series of exercises to activate the upper fibers of the trapezius muscle. Arbitrary angle isometric maximal voluntary contraction (MVC), through shoulder elevation, was completed in 3 repetitions, each lasting for 6 s [22-25]. The SEMG and force were simultaneously registered. The force was recorded by a dynamometer (Manual Muscle Tester, Lafayette Instruments, USA) attached to a fixed sling placed over the acromion. Subsequently, the participant was asked to complete similar sets of contractions at 50% (VC50) and 25% (VC25) of maximal contraction guided by a sound signal from the dynamometer elicited at a corresponding set force. Finally, the participant was asked to complete 4 repetitions of static contractions (15 s each) performed by abducting both shoulders to a 90 degree angle and holding them against gravity [22].

After completing the exercises, the participant was asked to answer a 5‐item user evaluation questionnaire. Of these, 3 questions had reply options on a 5‐point Likert scale, ranging from “Very dissatisfied” to “Very satisfied,” while the remaining 2 questions were open for free comments (Table 1).

Figure 1. Electrode placement.
Table 1. Evaluation questionnaire.
1. Did you perceive the wireless sensors as practical to use?
2. To what degree did you feel that the use of shoulder-musculature reflected the feedback in the app?
3. Do you recognize the wireless sensors as safe to use?
4. Did you experience any undesirable harmful effects (if yes, please explain)?
5. Do you have any further comments (if yes, please explain)?

Data Management

The NeckSensor uses a 12‐bit ADC resolution sampled at 1024 Hz with a third order 10‐480 Hz active bandpass filter. The sensor was programmed to calculate and transmit mean square values internally, with a window width of 40 ms, with no overlap, and a frequency of 25 Hz in order not to overload the Bluetooth capacity. The PowerLab sampled the SEMG signals at 2000 Hz with a fourth order Bessel lowpass filter at 500 Hz and a first order high pass filter at 10 Hz. In addition, a 50 Hz notch analog filter was applied [26]. All stationary recordings were evaluated visually for the presence of ECG artifacts. If found, these were to be corrected by removing the spike-correlated area in the SEMG signal and subsequently replacing the gap with surrounding SEMG activity.

First, the stationary readings were root mean square (RMS) rectified and then averaged over the two sets of electrodes to avoid phase-cancellations. The RMS value was calculated from the mean square values of the wireless sensors. The RMS values for each muscle contraction exercise to be used in the analyses were calculated as the mean of the repetitions for both equipment sets. For the temperature measurements, we calculated the difference in temperature from the start to the end of relaxation and the difference between the temperature at the end of relaxation and room temperature.
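The step from windowed mean square values to a single RMS value can be sketched as follows. This is a minimal illustration and not the sensor firmware or the authors' analysis script: the function names are hypothetical, the signal is synthetic, and a 1000 Hz sampling rate is assumed so that a 40 ms window is exactly 40 samples (the NeckSensor itself samples at 1024 Hz).

```python
import math

def window_mean_squares(samples, window):
    """Mean square of each consecutive, non-overlapping window of `window` samples."""
    n_windows = len(samples) // window
    return [
        sum(x * x for x in samples[i * window:(i + 1) * window]) / window
        for i in range(n_windows)
    ]

def rms_from_mean_squares(mean_squares):
    """RMS over an interval, recovered from equal-width window mean squares."""
    return math.sqrt(sum(mean_squares) / len(mean_squares))

# Synthetic stand-in for an SEMG trace: 1 s of a 100 Hz sine, 0.5 mV amplitude.
fs = 1000  # assumed sampling rate in Hz (illustration only)
signal = [0.5 * math.sin(2 * math.pi * 100 * n / fs) for n in range(fs)]

ms = window_mean_squares(signal, window=40)  # 40 samples = 40 ms at 1000 Hz
rms = rms_from_mean_squares(ms)              # RMS in mV over the whole second
```

Because the windows have equal width and do not overlap, averaging the mean square values before taking the square root gives the same RMS as computing it from the raw samples.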


The means and SDs for the RMS values during trapezius muscle exercises and for the chosen temperature data points were calculated. Systematic differences between stationary and wireless equipment were assessed with the Wilcoxon signed-rank test.

Mean difference (MD) and limits of agreement (LOA), together with Bland-Altman plots, were used as descriptive tools [27]. We calculated the intraclass correlation coefficient (ICC) with a two-way, mixed-effects consistency of agreement model. Coefficients for both individual and average agreement were presented. In addition, we calculated the Lin concordance correlation coefficient (CCC) [28-30]. For the ICC and CCC analyses, the data were first transformed to meet the assumptions of a two-way analysis of variance model. The transformation consisted of taking the natural logarithm after adding a constant of 0.1 to adjust for values close to zero. The ICC values were interpreted as suggested by Cicchetti et al [31], that is, unacceptable or poor (.00-.40), fair (.41-.60), good (.61-.75), and excellent (.75-1.00). All data were analyzed by using the statistical package Stata version 14 (StataCorp, College Station, TX, USA).
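As an illustration of the descriptive indices named above, the following sketch computes the mean difference, the 95% limits of agreement, and Lin's CCC for two paired series. The analyses in the study were run in Stata; this Python version is only a sketch with made-up values, the function names are hypothetical, and the CCC uses the 1/n variance and covariance form, which is an assumption about the estimator rather than a statement of what Stata reports.

```python
import math
import statistics

def bland_altman(a, b):
    """Mean difference and 95% limits of agreement (MD +/- 1.96 SD of the differences)."""
    diffs = [x - y for x, y in zip(a, b)]
    md = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return md, md - 1.96 * sd, md + 1.96 * sd

def lin_ccc(a, b):
    """Lin's concordance correlation coefficient with 1/n variances."""
    n = len(a)
    ma, mb = statistics.mean(a), statistics.mean(b)
    va = sum((x - ma) ** 2 for x in a) / n
    vb = sum((y - mb) ** 2 for y in b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
    # The (ma - mb)**2 term penalizes any systematic offset between the methods.
    return 2 * cov / (va + vb + (ma - mb) ** 2)

# Hypothetical paired readings (not study data).
stationary = [2.0, 3.0, 4.0, 5.0]
wireless = [1.0, 2.5, 3.0, 4.5]

md, lower, upper = bland_altman(stationary, wireless)
ccc = lin_ccc(stationary, wireless)
```

In this made-up example the two series move together closely, yet the CCC stays well below 1 because of the offset between them, which is the behavior that separates absolute agreement from consistency.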

Sensors and Software

The WHMS fulfilling the predefined requirements were identified through pragmatic Internet-searches. The MVP version of the app used in the experimental procedure was programmed to receive data from the wireless sensors and feed raw data back to the user. The raw data were presented as two columns increasing in height with increase in muscle tension and temperature, respectively. The app was programmed to allow connection of any WHMS using Bluetooth.


A total of 20 healthy participants were recruited and completed the experimental procedure. Of these, 12 were male participants, and their mean age was 24.7 years (SD 2.7, range 18‐29 years).

Surface Electromyography Sensor Agreement

We observed no ECG artifacts in the SEMG recordings (Figure 2). Hence, the ECG-related elements were not removed from the SEMG recordings.

Figure 2 shows the raw data of the SEMG activity for the wireless sensor (red), anterior stationary sensor (blue), and posterior stationary sensor (green) from a 24-year-old male participant. The marked areas indicate where the different exercises are performed. The figure exemplifies the absence of ECG artifacts and the similarity of the signals.

Means and standard deviations of the RMS values for the trapezius muscle exercises are presented in Table 2. The wireless sensor showed a lower voltage than the stationary equipment during all contraction exercises and at baseline.

Table 3 summarizes the MD in millivolts (mV) between stationary and wireless equipment with corresponding LOA, for each of the exercises. Compared with the wireless equipment, the stationary equipment indicated a systematically higher voltage during MVC (0.25 mV), VC50 (0.11 mV), VC25 (0.06 mV), static hold (0.07 mV), and baseline (0.04 mV). A Bland-Altman plot, visually presenting the MD and LOA for VC25, is shown in Figure 3. Table 3 also summarizes the ICC and CCC values for the SEMG equipment comparisons.

Figure 2. Raw surface electromyography (SEMG) data. ECG: electrocardiogram; MVC: maximal voluntary contraction; RMS: root mean square; VC50: voluntary contraction at 50% force; VC25: voluntary contraction at 25% force.
Table 2. Comparison of the means for stationary and wireless equipment.
Exercise | Stationary equipment, mean (SD) | Wireless equipment, mean (SD) | Z-value (P value)^a
MVC^b | 0.62^c (0.25) | 0.37 (0.15) | 3.73 (<.001)
VC50^d | 0.26 (0.11) | 0.15 (0.06) | 3.92 (<.001)
VC25^e | 0.15 (0.05) | 0.09 (0.05) | 3.73 (<.001)
Static hold | 0.16 (0.06) | 0.08 (0.03) | 3.85 (<.001)
Baseline | 0.045 (0.004) | 0.01 (0.002) | 3.92 (<.001)
Start temperature | 28.8^f (3.4) | 28.8 (3.3) | 0.75 (.46)
End temperature | 30.7 (3.6) | 31.5 (4.0) | 3.4 (<.001)
Room temperature | 23.0 (0.3) | 23.6 (0.4) | 3.9 (<.001)

a: Z-value from Wilcoxon signed-rank test.
b: MVC: maximal voluntary contraction.
c: Mean voltage in millivolts RMS.
d: VC50: voluntary contraction at 50% force.
e: VC25: voluntary contraction at 25% force.
f: Mean temperature in degrees Celsius.

Table 3. Indices of agreement between stationary and wireless equipment.
Exercise | Mean difference | Limits of agreement | ICC^a (95% CI) individual | ICC (95% CI) average | CCC^b (95% CI)
MVC^c | 0.25^f | −0.12 to 0.61 | .81 (0.57-0.92) | .89 (0.73-0.96) | .52 (0.30-0.73)
VC50^d | 0.11 | −0.04 to 0.27 | .81 (0.57-0.92) | .89 (0.73-0.96) | .44 (0.23-0.64)
VC25^e | 0.06 | −0.03 to 0.15 | .66 (0.31-0.85) | .79 (0.47-0.92) | .37 (0.14-0.60)
Static hold | 0.07 | −0.02 to 0.16 | .58 (0.19-0.81) | .73 (0.32-0.89) | .26 (0.06-0.45)
Baseline | 0.04 | 0.03 to 0.04 | .50 (0.09-0.77) | .67 (0.16-0.87) | .01 (0.00-0.01)
Start to end temperature | −0.77^g | −1.90 to 0.35 | .96 (0.91-0.99) | .98 (0.95-0.99) | .90 (0.83-0.97)
End to room temperature | −0.23 | −1.74 to 1.28 | .98 (0.95-0.99) | .99 (0.97-1.0) | .98 (0.96-1.0)

a: ICC: intraclass correlation coefficient.
b: CCC: concordance correlation coefficient.
c: MVC: maximal voluntary contraction.
d: VC50: voluntary contraction at 50% force.
e: VC25: voluntary contraction at 25% force.
f: Mean voltage in millivolts RMS.
g: Mean temperature in degrees Celsius.

Figure 3. Surface electromyography (SEMG) sensor agreement. mV: millivolts; RMS: root mean square.

Excellent agreement was found for MVC (ICC .81, 95% CI 0.57‐0.92) and VC50 (ICC .81, 95% CI 0.57‐0.92). Good agreement was found for VC25 (ICC .66, 95% CI 0.31‐0.85). Fair agreement was found for static hold (ICC .58, 95% CI 0.19‐0.81) and baseline (ICC .50, 95% CI 0.09‐0.77). All participants displayed a decrease in voltage from MVC to VC50, from VC50 to VC25, and from static hold to baseline for both sets of equipment, with the exception of one participant who had a small increase (0.03 mV) in voltage from VC50 to VC25 registered on the stationary equipment (Figure 4).

Figure 3 shows Bland-Altman plot assessing the agreement between stationary and wireless SEMG sensors during voluntary contraction at 25% force. The x-axis represents the average of the two parallel measurements. The y-axis represents the corresponding difference between the 2 measurements. The values are indicated in millivolt RMS.

Figure 4 is a line graph showing the SEMG readings for each participant during MVC, VC50, VC25, static hold, and baseline. The top panel indicates readings with the stationary equipment. The bottom panel indicates readings with the wireless equipment. The values are indicated in millivolt RMS.

Peripheral Skin Temperature Sensor Agreement

Means and standard deviations of the temperature measurements at the 3 selected time points are shown in Table 2. The start temperature between the 2 sets of equipment did not differ significantly (P=.46), but the wireless sensor indicated a higher temperature at the end of relaxation (P<.001) and at room temperature (P<.001; Table 2).

The between-equipment MDs for changes in the temperature are presented in Table 3, along with the LOA and agreement indices. A Bland-Altman plot visually representing the MD and LOA for temperature change during relaxation is depicted in Figure 5. Excellent agreement was found for the change in temperature during relaxation (CCC .90, 95% CI 0.83‐0.97) and from end of relaxation to room temperature (CCC .98, 95% CI 0.96‐1.0). A rise in temperature was detected among 17 participants on the stationary equipment, and among 18 participants on the wireless equipment. Moreover, a rise in temperature of more than 1°C was detected among 15 participants on both equipment sets (Figure 6).

Figure 5 is a Bland-Altman plot showing the agreement between stationary and wireless equipment for the change in temperature from start to end of relaxation. The x-axis represents the average of the 2 parallel measurements. The y-axis represents the corresponding difference in measurements. The values are in degrees Celsius.

Figure 6 is a line graph showing temperature readings for each participant at the start and end of relaxation and at room temperature. The upper panel represents readings with the stationary equipment. The lower panel represents readings with the wireless equipment. The values are in degrees Celsius.

Evaluation Questionnaire

In total, 19 of the 20 participants perceived the use of wireless sensors as practical (n=14) or very practical (n=5). Likewise, the absolute majority of participants reported that the app feedback reflected the use of shoulder musculature to a large (n=9) or a very large (n=9) degree. All participants regarded the use of wireless sensors as safe (n=2) or very safe (n=18). In contrast, 2 of the 20 participants reported undesirable, harmful effects, with both stating that the removal of the electrodes attached to the stationary equipment was unpleasant.

Figure 4. Surface electromyography (SEMG) sensor line graphs. mV: millivolts; MVC: maximal voluntary contraction; RMS: root mean square; VC50: voluntary contraction at 50% force; VC25: voluntary contraction at 25% force.
Figure 5. Temperature sensor agreement. mV: millivolts; RMS: root mean square.
Figure 6. Temperature sensor line graphs.

Principal Findings

This study aimed to provide a proof of concept for using a mobile phone and WHMS for biofeedback purposes, in a fashion similar to phase I-II development of new drug treatments [32]. We chose to investigate temperature and SEMG because they are the most commonly used biofeedback modalities [11] and are shown to be especially effective in adolescents [33]. We identified sensors fulfilling a set of predefined criteria that were considered necessary for the sensors to gain acceptance among patients, and thus these sensors were used [34]. The choice of sensors was arbitrary, as long as the predefined criteria were met. Even though the use of other temperature and SEMG sensors would not yield identical results, we argue that our approach has provided a proof of concept.

We found that the use of a wireless temperature sensor had almost perfect agreement regarding the change in finger temperature during relaxation. Furthermore, the use of a wireless SEMG sensor had a fair to excellent agreement for measuring tension in the trapezius muscle. We noted that the wireless SEMG consistently showed a lower voltage than the stationary equipment. The SEMG sensors showed excellent agreement during MVC and VC50, good agreement during VC25, and fair agreement during static hold and baseline. However, under the assumption that the stationary equipment was the most sensitive, it is not surprising that the calculated agreement decreased slightly at lower activity levels since random and equipment-generated noise constituted a larger part of the signal at low EMG-levels. Nonetheless, the wireless SEMG sensor registered consistent changes in muscle tension. We observed no ECG artifacts in the SEMG recordings. Therefore, it can be assumed that the ECG artifacts do not have a relevant influence on the SEMG recorded from closely placed bipolar electrodes on the right shoulder. Moreover, the safety and usability of the setup were highly satisfactory. In conclusion, the wireless sensors are well suited for biofeedback purposes.

Strengths and Limitations

The proper sample size for the study was assessed before recruiting participants (Multimedia Appendix 1). Traditionally, a sample of 15 to 20 participants is deemed sufficient for reliability studies [35]. However, the use of more precise calculations of sample sizes has been previously suggested [36]. Therefore, we used a CI estimation model suggested by Bonett [20] to determine the minimum sample size required. Due to the interindividual variation in our findings, the analyses would possibly have benefited from having a larger sample size because we did not obtain a predefined CI for all analyses.

There is a large degree of variability in individual human anatomical properties that may influence SEMG readings. This includes the thickness of fatty tissues, resting muscle length, velocity of contraction, muscle cross-sectional area, fiber type, posture change, interelectrode distance, skin impedance, age, and sex [22]. We chose to combine the recordings for the 2 pairs of stationary sensor electrodes to approximate the muscle activity of the wireless sensor placed in between. The relative spread of the electrode pairs may have led to EMG crosstalk, and muscle contraction exercises performed by untrained participants may have additionally resulted in movement artifacts, and suboptimal and varying performances [37]. The abovementioned factors may all have limited the precision of our measurements and contributed to a larger degree of interindividual differences, thus lowering individual ICC and CCC values for SEMG agreement. Likewise, the placement of the 2 temperature sensors beside each other on the finger might have led to differences in measurements. Figure 5 shows 1 outlier that displayed a larger increase in temperature by 1°C with the stationary equipment than with the wireless equipment. This differs from the majority that displayed the largest temperature increase with the wireless equipment. Nevertheless, LOA of ±1.5°C is still acceptable [38].

The SEMG signals usually have a frequency distribution with significant energy up to 400 to 500 Hz, requiring a sampling frequency of at least 1000 Hz (preferably 2000 Hz) to meet the Nyquist rate (twice the highest signal frequency) and avoid aliasing [39]. However, it is known that oversampling above this critical Nyquist rate does not significantly improve the signal quality [40] but will likely lead to higher cost and size of the sensor. The SEMG signals are usually bandpass filtered at 10 to 500 Hz [41], which we consequently chose to do for both setups. Furthermore, we observed that the notch filter, at 50 Hz, for the stationary equipment seemed to be saturated during recordings. After analog filtering, sine waves of 20 ms duration were still present. This may be explained by power-line noise, despite the use of a notch filter [42]. The wireless sensor also applies a notch filter at 50 Hz, which increases the signal-to-noise ratio. In total, we concluded that the wireless SEMG sensor applies appropriate signal processing settings.
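The sampling requirement can be checked against the settings reported under Data Management. The sketch below is only a worked check: the function name is hypothetical, and the filter cutoffs are treated as the highest signal frequency of interest, which is a simplification because real filters do not cut off sharply.

```python
def min_sampling_rate(highest_signal_hz):
    """Nyquist rate: twice the highest frequency present in the signal."""
    return 2 * highest_signal_hz

# Wireless sensor: sampled at 1024 Hz, bandpass upper cutoff at 480 Hz.
wireless_ok = 1024 >= min_sampling_rate(480)

# Stationary equipment: sampled at 2000 Hz, lowpass cutoff at 500 Hz.
stationary_ok = 2000 >= min_sampling_rate(500)
```

Under this simplification both setups meet the criterion, with the stationary equipment sampling at roughly twice the minimum rate and the wireless sensor staying close to it.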

We chose different statistical methods for assessing agreement to evaluate different properties of the wireless sensors. The Wilcoxon signed-ranks tests, together with the Bland-Altman plot and LOA, assess the degree of systematic differences and expected variance between measurements. A two-way, mixed-effect ICC model [43] ignores the element of rater variance (raters fixed as the 2 equipment sets), and the estimate can thus serve as an index of consistency [28,30,44,45]. This is useful to assess agreement when having mean differences between 2 measurement methods. We reported both individual and average ICC values, as the average value becomes useful when a large degree of interindividual variance exists or if individual readings are considered unreliable [30]. On the other hand, we also calculated the CCC to evaluate the degree of absolute agreement, that is, the 2 measurement methods showing identical values.
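The distinction between consistency and absolute agreement can be made concrete with a small sketch of the two-way consistency ICC. This is not the Stata routine used in the study: the function name and the data are hypothetical, and the formulas are the standard single-measure and average-measure consistency forms derived from the two-way analysis of variance mean squares.

```python
def icc_consistency(rows):
    """Two-way consistency ICC for single and average measures.

    rows: one list per participant, holding one value per measurement method.
    """
    n, k = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between methods
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    single = (msr - mse) / (msr + (k - 1) * mse)
    average = (msr - mse) / msr
    return single, average

# Hypothetical paired readings: [stationary, wireless] per participant.
data = [[1.0, 2.0], [2.0, 2.0], [3.0, 4.0], [4.0, 5.0]]
single, average = icc_consistency(data)

# Adding a constant offset to one method leaves the consistency ICC unchanged.
shifted = [[a, b + 10.0] for a, b in data]
single_shifted, average_shifted = icc_consistency(shifted)
```

The method (column) sum of squares is removed from the error term and never enters the ratio, so a systematic offset such as the lower voltage of the wireless sensor does not reduce this index, whereas it would reduce the CCC.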


We have compared the WHMS with a gold standard; however, this does not imply that the gold standard is without measurement error. Thus, some lack of agreement is inevitable [46]. As pointed out by Bland and Altman [47], one should keep in mind that correlation coefficients alone do not assess interchangeability of measurement methods. The acceptable level of agreement in order to claim validity is a clinical decision. Considering the intended use of the chosen sensors, a high degree of absolute agreement is not a necessity, but consistency of agreement is important. We certainly observed that there exists variance in the data, leading to a low degree of absolute agreement. On the other hand, SEMG readings changed similarly and as expected through the experimental procedure for each participant, despite dissimilarities between the 2 equipment sets. This consistency is indeed supported by excellent to fair agreement of ICC values. Furthermore, the wireless SEMG sensor was less reliable at lower voltage, at least in terms of absolute agreement, when compared with our gold standard. A well-designed SEMG setup usually produces a system noise of about 1% of the MVC [48]. Our stationary equipment baseline showed 7% of MVC, which means that there was some inherent noise in the gold standard setup. In contrast, the baseline readings of the wireless sensors amounted to 3% of MVC, which in part may explain the increasing deviation at lower voltages.
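The baseline-to-MVC percentages quoted above follow from the group means in Table 2. A minimal worked check, with a hypothetical helper name:

```python
def percent_of_mvc(baseline_mv, mvc_mv):
    """Baseline RMS voltage expressed as a percentage of the MVC RMS voltage."""
    return 100 * baseline_mv / mvc_mv

# Group means from Table 2 (millivolts RMS).
stationary_pct = percent_of_mvc(0.045, 0.62)  # stationary equipment
wireless_pct = percent_of_mvc(0.01, 0.37)     # wireless sensor
```

Rounded to whole numbers, these reproduce the 7% and 3% figures given in the text.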

Although the SEMG sensor did not demonstrate excellent agreement in all analyses, both SEMG and temperature WHMS appear to be suited for app-based biofeedback. Interestingly, 15 out of 20 participants (75%) managed to raise their temperature by more than 1°C during a single naive session indicating that the setup was simple to master. Moreover, all participants had similar changes in muscle tension through the sets of exercises. However, it is unlikely that the users will be able to decrease their muscle tension throughout the entire duration of a biofeedback session [49]. This means that detecting a change in tension is more important than the absolute values. In line with this, it was recently shown that the feedback itself is more important than lowering muscle tension in the treatment of headache [50]. Taken together, these findings imply that perfect sensor agreement in itself is not a prerequisite for an app-based biofeedback platform. The main focus of app-based biofeedback should be directed at the development of high-quality feedback mechanisms and user interfaces.

Prospects for Future Research

This study confirmed the usability of WHMS in a biofeedback setting and established partial evidence for an upcoming biofeedback app. Scientific validation of the sensors is of utmost importance for the value and effectiveness of a future treatment program. The choice to use an MVP app to assess agreement enables iterative and incremental developments. Future research should further establish the basis for the use of WHMS for medical purposes in the emerging era of health informatics and mHealth. As an example, similar validation of heart rate variability measurements, which is of interest in biofeedback treatment, has been conducted [51-53]. We are currently exploring the user interface and assessing the usability of the app among adolescents with migraine.


Conclusions

This study confirmed the validity of wireless WHMS connected to a mobile phone for monitoring neurophysiological parameters of relevance for biofeedback therapy.


Acknowledgments

The authors wish to thank all the volunteers for participating in the study. The study was funded by strategic seeding grants from the Faculty of Medicine, NTNU Norwegian University of Science and Technology. We would also like to thank Searis AS for the fruitful collaboration and for their help with programming the app, EXPAIN AS for supplying SEMG sensors for use in the study, and the personnel at the Department of Neurophysiology, St Olavs Hospital, for their support with the experimental procedures.

Conflicts of Interest

Anker Stubberud has participated as a nonpaid member of an Expert Panel advising EXPAIN AS during the final phases of product development. Should the research result in a commercially available product, the university and authors may benefit financially from future intellectual property rights.

Multimedia Appendix 1

Sample size determination.

PDF File (Adobe PDF File), 32KB

References

  1. Kay M, Santos J, Takane M. 2011. mHealth: new horizons for health through mobile technologies. [accessed 2018-02-13]
  2. Pantelopoulos A, Bourbakis NG. A survey on wearable sensor-based systems for health monitoring and prognosis. IEEE Trans Syst Man Cybern C Appl Rev 2010;40(1):1-12.
  3. Dobkin BH, Dorsch A. The promise of mHealth: daily activity monitoring and outcome assessments by wearable sensors. Neurorehabil Neural Repair 2011;25(9):788-798.
  4. Hanson MA, Powell Jr HC, Barth AT, Ringenberg K, Calhoun BH, Aylor JH, et al. Body area sensor networks: challenges and opportunities. Computer 2009;42(1):58-65.
  5. Schüll ND. Data for life: wearable technology and the design of self-care. BioSocieties 2016;11(3):317-333.
  6. Lupton D. Quantifying the body: monitoring and measuring health in the age of mHealth technologies. Crit Public Health 2013;23(4):393-403.
  7. Hundert AS, Huguet A, McGrath PJ, Stinson JN, Wheaton M. Commercially available mobile phone headache diary apps: a systematic review. JMIR Mhealth Uhealth 2014;2(3):e36.
  8. Fiordelli M, Diviani N, Schulz PJ. Mapping mHealth research: a decade of evolution. J Med Internet Res 2013;15(5):e95.
  9. Penzien DB, Irby MB, Smitherman TA, Rains JC, Houle TT. Well-established and empirically supported behavioral treatments for migraine. Curr Pain Headache Rep 2015 Jul;19(7):34.
  10. Stubberud A, Varkey E, McCrory DC, Pedersen SA, Linde M. Biofeedback as prophylaxis for pediatric migraine: a meta-analysis. Pediatrics 2016 Aug;138(2).
  11. Nestoriuc Y, Martin A. Efficacy of biofeedback for migraine: a meta-analysis. Pain 2007 Mar;128(1-2):111-127.
  12. Andrasik F. Behavioral treatment of headaches: extending the reach. Neurol Sci 2012 May;33 Suppl 1:S127-S130.
  13. Schwartz MS, Andrasik F, editors. Biofeedback: A Practitioner's Guide. New York City: Guilford Press; 2017.
  14. Minen MT, Torous J, Raynowska J, Piazza A, Grudzen C, Powers S, et al. Electronic behavioral interventions for headache: a systematic review. J Headache Pain 2016;17:51.
  15. Luxton DD, McCann RA, Mishkind MC, Reger GM, Bush NE. mHealth for mental health: integrating smartphone technology in behavioral healthcare. Prof Psychol Res Pr 2011;42(6):505-512.
  16. Constantinescu G, Loewen I, King B, Hodgetts W, Rieger J, Brodt C. Designing a mobile health app for patients with dysphagia following head and neck cancer: a qualitative study. JMIR Rehabil Assist Technol 2017;4(1):e3.
  17. O'Reilly M, Duffin J, Ward T, Caulfield B. Mobile app to streamline the development of wearable sensor-based exercise biofeedback systems: system development and evaluation. JMIR Rehabil Assist Technol 2017;4(2):e9.
  18. Kottner J, Audige L, Brorson S, Donner A, Gajewski BJ, Hróbjartsson A, et al. Guidelines for reporting reliability and agreement studies (GRRAS) were proposed. Int J Nurs Stud 2011 Jun;48(6):661-671.
  19. Larman C, Basili VR. Iterative and incremental developments: a brief history. Computer 2003;36(6):47-56.
  20. Bonett DG. Sample size requirements for estimating intraclass correlations with desired precision. Stat Med 2002 May 15;21(9):1331-1335.
  21. Hermens HJ, Freriks B, Disselhorst-Klug C, Rau G. Development of recommendations for SEMG sensors and sensor placement procedures. J Electromyogr Kinesiol 2000 Oct;10(5):361-374.
  22. Criswell E. Cram's Introduction to Surface Electromyography. 2nd edition. Sudbury, MA: Jones & Bartlett Publishers; 2010.
  23. Mathiassen SE, Winkel J, Hägg GM. Normalization of surface EMG amplitude from the upper trapezius muscle in ergonomic studies - A review. J Electromyogr Kinesiol 1995 Dec;5(4):197-226.
  24. Burden A. How should we normalize electromyograms obtained from healthy participants? What we have learned from over 25 years of research. J Electromyogr Kinesiol 2010 Dec;20(6):1023-1035.
  25. Knutson LM, Soderberg GL, Ballantyne BT, Clarke WR. A study of various normalization procedures for within day electromyographic data. J Electromyogr Kinesiol 1994;4(1):47-59.
  26. Chowdhury RH, Reaz MB, Ali MA, Bakar AA, Chellappan K, Chang TG. Surface electromyography signal processing and classification techniques. Sensors (Basel) 2013 Sep 17;13(9):12431-12466.
  27. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986 Feb 8;1(8476):307-310.
  28. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979 Mar;86(2):420-428.
  29. Watson PF, Petrie A. Method agreement analysis: a review of correct methodology. Theriogenology 2010 Jun;73(9):1167-1179.
  30. Barnhart HX, Haber MJ, Lin LI. An overview on assessing agreement with continuous measurements. J Biopharm Stat 2007;17(4):529-569.
  31. Cicchetti DV. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol Assess 1994;6(4):284-290.
  32. Schmidt B. Proof of Principle studies. Epilepsy Res 2006 Jan;68(1):48-52.
  33. Sarafino EP, Goehring P. Age comparisons in acquiring biofeedback control and success in reducing headache pain. Ann Behav Med 2000;22(1):10-16.
  34. Bergmann JH, McGregor AH. Body-worn sensor design: what do patients and clinicians want? Ann Biomed Eng 2011 Sep;39(9):2299-2312.
  35. Fleiss JL. Design and Analysis of Clinical Experiments. Hoboken, NJ: John Wiley & Sons; 2011.
  36. Walter SD, Eliasziw M, Donner A. Sample size and optimal designs for reliability studies. Stat Med 1998 Jan 15;17(1):101-110.
  37. Merletti R, di Torino P. Standards for reporting EMG data. J Electromyogr Kinesiol 1999;9(1):3-4.
  38. Kelechi TJ, Michel Y, Wiseman J. Are infrared and thermistor thermometers interchangeable for measuring localized skin temperature? J Nurs Meas 2006;14(1):19-30.
  39. Ives JC, Wigglesworth JK. Sampling rate effects on surface EMG timing and amplitude measures. Clin Biomech (Bristol, Avon) 2003 Jul;18(6):543-552.
  40. Durkin JL, Callaghan JP. Effects of minimum sampling rate and signal reconstruction on surface electromyographic signals. J Electromyogr Kinesiol 2005 Oct;15(5):474-481.
  41. van Boxtel A. Optimal signal bandwidth for the recording of surface EMG activity of facial, jaw, oral, and neck muscles. Psychophysiol 2001 Jan;38(1):22-34.
  42. De Luca CJ, Gilmore LD, Kuznetsov M, Roy SH. Filtering the surface EMG signal: movement artifact and baseline noise contamination. J Biomech 2010 May 28;43(8):1573-1579.
  43. Müller R, Büttner P. A critical discussion of intraclass correlation coefficients. Stat Med 1994;13(23-24):2465-2476.
  44. McGraw KO, Wong SP. Forming inferences about some intraclass correlation coefficients. Psychol Methods 1996;1(1):30-46.
  45. Bartko JJ. The intraclass correlation coefficient as a measure of reliability. Psychol Rep 1966 Aug;19(1):3-11.
  46. Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res 1999 Jun;8(2):135-160.
  47. Bland JM, Altman DG. A note on the use of the intraclass correlation coefficient in the evaluation of agreement between two methods of measurement. Comput Biol Med 1990;20(5):337-340.
  48. Clancy EA, Farry KA. Adaptive whitening of the electromyogram to improve amplitude estimation. IEEE Trans Biomed Eng 2000 Jun;47(6):709-719.
  49. Rausa M, Palomba D, Cevoli S, Lazzerini L, Sancisi E, Cortelli P, et al. Biofeedback in the prophylactic treatment of medication overuse headache: a pilot randomized controlled trial. J Headache Pain 2016 Dec;17(1):87.
  50. Rains JC. Change mechanisms in EMG biofeedback training: cognitive changes underlying improvements in tension headache. Headache 2008 May;48(5):735-6; discussion 736.
  51. Munster-Segev M, Fuerst O, Kaplan SA, Chan A. Incorporation of a stress reducing mobile app in the care of patients with type 2 diabetes: a prospective study. JMIR Mhealth Uhealth 2017;5(5):e75.
  52. Uddin AA, Morita PP, Tallevi K, Armour K, Li J, Nolan RP, et al. Development of a wearable cardiac monitoring system for behavioral neurocardiac training: a usability study. JMIR Mhealth Uhealth 2016;4(2):e45.
  53. Dooley EE, Golaszewski NM, Bartholomew JB. Estimating accuracy at exercise intensities: a comparative study of self-monitoring heart rate and physical activity wearable devices. JMIR Mhealth Uhealth 2017;5(3):e34.

Abbreviations

CCC: concordance correlation coefficient
ECG: electrocardiogram
ICC: intraclass correlation coefficient
LOA: limits of agreement
MD: mean difference
mHealth: mobile health
MVC: maximal voluntary contraction
MVP: minimal viable product
RMS: root mean square
SEMG: surface electromyography
VC50: voluntary contraction at 50% force
VC25: voluntary contraction at 25% force
WHMS: wearable health monitoring sensors

Edited by G Eysenbach; submitted 28.09.17; peer-reviewed by M Minen, YCP Arai; comments to author 09.12.17; revised version received 21.01.18; accepted 23.01.18; published 23.02.18


©Anker Stubberud, Petter Moe Omland, Erling Tronvik, Alexander Olsen, Trond Sand, Mattias Linde. Originally published in JMIR Biomedical Engineering, 23.02.2018.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Biomedical Engineering, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.