Published on in Vol 5, No 1 (2020): Jan-Dec

Video Cloud Services for Hospitals: Designing an End-to-End Cloud Service Platform for Medical Video Storage and Secure Access



1Poznan Supercomputing and Networking Center, Poznań, Poland

2AGH University of Science and Technology, Kraków, Poland

3Centre de Recherche en Automatique de Nancy, University of Lorraine, Nancy, France

4DeepRiver, Nancy, France

*all authors contributed equally

Corresponding Author:

Piotr Pawałowski, MA

Poznan Supercomputing and Networking Center

ul Z Noskowskiego 12/14

Poznań, 61-704


Phone: 48 693919937


The amount of medical video data that has to be securely stored has been growing exponentially. This rapid expansion is mainly caused by the introduction of higher video resolutions such as 4K and 8K to medical devices and the growing usage of telemedicine services, along with a general trend toward increasing transparency with respect to medical treatment, resulting in more and more medical procedures being recorded. Such video data, as medical data, must be maintained for many years, resulting in datasets at the exabyte scale that each hospital must be able to store in the future. Currently, hospitals do not have the required information and communications technology infrastructure to handle such large amounts of data in the long run. In this paper, we discuss the challenges and possible solutions to this problem. We propose a generic architecture for a holistic, end-to-end recording and storage platform for hospitals, define crucial components, and identify existing and future solutions to address all parts of the system. This paper focuses mostly on the recording part of the system by introducing the major challenges in the area of bioinformatics, with particular focus on three major areas: video encoding, video quality, and video metadata.

JMIR Biomed Eng 2020;5(1):e18139



In recent years, technologies and services in the areas of electronic health (eHealth) and telemedicine have been evolving more and more rapidly. In many European countries, the law is being changed so that remote medical treatments conducted using information and communications technology (ICT) systems are now recognized as valid medical procedures. The physical presence of the doctor and the patient at the same location is often no longer necessary, and private as well as state-owned insurance companies now reimburse doctors for performing remote consultations with the patient. A remote interaction between the doctor and the patient can be conducted via phone or internet chats, but as video plays a significant role in current communication, solutions based on the real-time transmission of audio-video are considered to provide better service with higher trustworthiness. In many such scenarios, there is a need to record remote treatment sessions for further reference, follow-up analysis, or for legal reasons.

Furthermore, in many European countries, there is a growing need for recording medical procedures conducted in hospitals or health care centers. There is a general trend toward increasing transparency when it comes to medical treatment, and thus to improve the quality of services provided to the patient. Medical errors do happen, but sometimes a dissatisfied patient decides to prosecute even when no mistake has been made. Therefore, hospitals need to have a way to settle such disputes based on proof. Some countries are preparing laws to make it obligatory for hospitals to record all treatment procedures. Moreover, incoming law changes will qualify such recordings as medical data, forcing hospitals to store these data for decades. Insurance companies also offer much lower liability insurance fees to hospitals recording their procedures, because in the case of a lawsuit, there is clear evidence to determine whether or not the hospital is at fault.

Taking this context into account, hospitals, health care centers, and doctors in Europe will soon need to implement recording of treatment and storage of these recordings for treatment conducted both locally and using telemedicine systems. However, most hospitals and health care centers are not ready to introduce such services as they do not have the necessary recording devices, are overwhelmed with the infrastructural needs of storing thousands of hours of recordings per year, and cannot guarantee the necessary security of the repositories.

To resolve these issues, an end-to-end solution is needed for the recording, secure storage, and access to records of medical procedures conducted both locally in hospitals and health care centers and remotely using telemedicine systems. This requires new approaches to video coding, image quality, transmission, and security.

In this paper, we propose a generic platform architecture for medical video recording and storage. We discuss current and upcoming challenges, and identify the possible technologies to tackle these challenges. We also outline the existing solutions that can be incorporated into a holistic platform. This paper focuses mostly on the recording part of the system, as it introduces the major challenges in the area of bioinformatics from three major areas: video encoding, video quality, and video metadata.

The remainder of this paper is organized as follows. The next section describes the current situation and upcoming challenges. This is followed by a proposal of the generic architecture and possible technologies to be used to resolve these challenges. The following section outlines how such a platform could be built based on existing solutions. Finally, we provide an overview and future prospects.

Telemedicine services, which allow for remote treatment and remote consultations using videoconferencing or other ICT systems, are increasingly being recognized as valid treatment solutions by national health care legislation. For example, as of 2018 in Germany [1], France, and Poland, the physical presence of the doctor and the patient at the same location is no longer required. This allows for new concepts of health care services to be introduced, and solutions providing such services are already available on the market. In the United States, such services have been in place for some time now, and European players who want to compete for market share need to provide solutions tuned to European legislation, taking into consideration the diversity of the health care system landscape in European countries. The worldwide telemedicine market is growing: it expanded from US $9.8 billion in 2010 to US $11.6 billion in 2011, and continued to grow to US $27.3 billion in 2016, representing an annual growth rate of 18.6% [2].

Telemedicine services can be divided into three basic types: (1) enabling patient-doctor contact, (2) supporting remote collaboration between medical professionals, and (3) enhancing medical education. These three types, although they involve different users and technical solutions, are substantially based on the transmission of video and thus require a secure solution to provide video recording and storage.

University clinics, which often use videoconferencing systems for medical education, also face the problem of how to store and publish recordings of educational sessions. A typical setup with H.323 videoconferencing equipment does not allow for recording sessions without the use of an additional dedicated recording device or cloud service. Additional devices introduce extra costs and require management by the hospital. Using an external cloud service may violate legal requirements, as most videoconferencing solutions currently in use are developed by US-based companies.

A similar problem applies to recording eHealth video connections between the patient and the doctor. Although the amount of data is much lower, web-based or mobile app–based solutions do not fulfil the legal requirements when it comes to the recording and storing of medical videos. Records of treatment procedures conducted remotely contain private patient data and need to be processed and stored accordingly. In many European countries, medical data cannot leave the given country; therefore, the medical videos must be stored securely by a cloud service that is located within the country.

In addition to telemedicine systems, medical videos can be produced by modern operating rooms (ORs), which contain several devices that are also video sources, including microscopes, endoscopes, surgical robots, and macroscopic cameras. Owing to the increasing complexity of the surgical working environment, technical solutions must be found to help relieve the surgeon [3]. Only a limited number of hospitals are equipped with integrated ORs that provide a common video management platform for all of these sources.

In most cases, medical devices record video on local hard drives, DVDs, or USB flash drives, if they provide that possibility at all. This situation is difficult to control and places a considerable burden on the information technology department, which has to manage and store the recordings. Even if there is an integrated video system available in the hospital, the infrastructure needed to save the videos, especially for many years, is not affordable for many hospitals, and storing the videos in the hospital is not cost-effective. With the development of video technologies, where full high-definition systems are now standard and 4K or even 8K resolution systems are available or upcoming, the storage space needed for medical video has grown rapidly. For example, radiological imaging such as computed tomography scans, magnetic resonance images, or positron emission tomography scan images can reach several hundreds of megabytes per exam, while stereoscopic video related to high-definition surgery is based on two streams of 1.5 GB per second each. In the United States, the storage of mammograms required 2.5 PB in 2009, and 30% of all the images stored in the world in 2010 were medical images [4].

Moreover, in many countries, the retention period of medical records is very long. In Poland and France, medical images associated with a patient must be kept for 20 years after their last visit under the current law. As an example, the Department of Radiology and Imaging of the University Hospital in Nancy, France produced 55 TB of image data in 2015. Regarding videos, a full high-definition endoscopic camera (used routinely in endoscopic surgery) can provide 2.6 TB of data during a 2-hour operation. These figures highlight the challenge to be faced in terms of archiving capacity.

Consequently, medical images and video should be encoded (ie, compressed), as long as the encoding does not affect the therapeutic quality of the data for regular use by health professionals. Regardless of whether the data need to be transmitted or stored, it is of high importance to ensure that the compression step introduces no significant loss (degradation, or any other processing such as watermarking) at the encoder stage. In other words, it is crucial to maintain an acceptable visual quality of the video stream for health care professionals.

There are many solutions on the market providing recording functionality in the hospital environment. Companies such as Storz, Olympus, or Stryker offer medical recorders. These systems integrate with the proprietary storage solutions provided by their producers, which have to be deployed within the information technology infrastructure of the hospital. However, this is not a scalable solution when all video recordings have to be kept as medical data over many years. Standard videoconferencing solutions, available from Cisco or Polycom, do not have a built-in recorder but rely on separate server-based recorders that must be installed in the information technology infrastructure of the hospital. This is also not a scalable solution, and these products are focused on videoconferencing in general. Therefore, they do not fulfill the requirements of medical video recording and storage when it comes to security and access rights.

Currently, European hospitals are not yet obliged to record all medical procedures; however, many do so for in-house usage or to increase surgeons’ efficiency [5]. If it becomes obligatory by law to record all surgical procedures, the amount of data produced will be enormous. Although it may constitute a burden, recording of medical procedures also brings new possibilities of utilizing the stored videos.

Our group, composed of research centers and companies from Poland, France, and Germany, has designed a generic architecture (Figure 1) for a holistic recording and storage platform for hospitals, which is outlined below. We have also identified components that can be used as parts of such a platform.

Figure 1. Generic architecture of the holistic recording and storage platform for hospitals. eHealth: electronic health.


The high-level scenarios that need medical video recording (marked with green in Figure 1) are: (1) patient eHealth, involving telemedical situations in which the patient is connected with the doctor using the web or mobile solutions providing videoconferencing services; (2) surgery and procedures, which include scenarios in which the hospital or health care center needs to record medical procedures or surgeries; and (3) real-time telemedicine, which includes scenarios focusing on hospitals and health care centers using telemedical solutions to provide remote consultations, surgery supervision, or medical education.

Technical Constituents

The two major technical constituents of the system (marked with blue in Figure 1) are recording and secure storage. Recording includes components of the system responsible for video acquisition, encoding, and uploading to the secure storage. These components should provide novel video encoding mechanisms, maximize video quality of experience for medical applications, be ready for 4K and 8K medical devices, allow recording of stereoscopic (three-dimensional) video, and provide an innovative approach to a metadata description of the videos. The second main conceptual component of our system is the secure storage platform providing storage, transcoding, and access to the medical video recordings. The storage should be centralized, in terms of the provider, for each country, but it should be possible to distribute it over various data centers. National regulations should be implemented for the deployment in each country, providing the needed security standards. The envisioned solution should provide measures for the secure upload of recorded videos from the devices and recording applications listed above. The platform will also offer various means of access to the videos depending on user privileges and use case scenarios.

Use Case Scenarios

The different use case scenarios of the recordings are marked with red in Figure 1. The access to the records should be granted through a web portal or an application programming interface (API).

Medical Analytics

The platform should provide access to recordings that have been marked as public and anonymized to enable medical analytics and research using the stored video content.

Medical Documentation

Videos are also an essential part of medical documentation. The recordings provided through the platform should be linked to medical documentation files.

Follow-Up Treatment

Recordings stored on the platform should be accessible by doctors to review the previous treatment or procedures to provide better follow-up treatment. The videos should either be available only to the doctor/hospital that created the recording or made available to other health care providers, according to varying laws in European countries.

Legal Cases

The recorded videos could also be used by courts or insurance companies in legal matters. As mentioned in the Introduction, with a general trend to increase transparency when it comes to medical treatment, recording medical procedures reduces the liability of hospitals. This is especially important as telemedical services, allowing remote treatment using videoconferencing, are being recognized as valid treatment solutions by national health care legislation.

Medical Education

The insufficient number of adequately educated medical personnel is one of the main obstacles to providing universal health care at the highest level. Medical education is especially time-consuming and costly in areas such as surgery, in which students and young doctors have to observe several operations during their training. However, because of spatial limitations in ORs, only a small number of students can be present during a given surgery, especially if they need to see the essential details from up close. This access scenario is an integral part of modern medicine, as audio-video content greatly increases the effectiveness of medical education, which in turn benefits society as a whole.
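The access rules stated for the medical analytics and follow-up treatment scenarios above can be sketched as a small policy check. This is only an illustration under stated assumptions: the `Recording` fields, the `Purpose` values, and the `may_access` function are hypothetical names and are not part of any defined platform API. The remaining scenarios (legal cases, education, documentation) depend on national law and are deliberately left undecided.

```python
from dataclasses import dataclass
from enum import Enum


class Purpose(Enum):
    """Hypothetical labels for two of the use case scenarios."""
    MEDICAL_ANALYTICS = "medical_analytics"
    FOLLOW_UP_TREATMENT = "follow_up_treatment"
    OTHER = "other"


@dataclass(frozen=True)
class Recording:
    """Hypothetical access-related attributes of a stored recording."""
    recording_id: str
    owner_org: str                       # doctor/hospital that created it
    is_public: bool = False              # marked as public
    is_anonymized: bool = False          # patient data removed
    shared_with: frozenset = frozenset() # other providers, where law allows


def may_access(rec: Recording, requester_org: str, purpose: Purpose) -> bool:
    """Return True if the request matches one of the sketched rules."""
    if purpose is Purpose.MEDICAL_ANALYTICS:
        # Analytics only on recordings that are both public and anonymized.
        return rec.is_public and rec.is_anonymized
    if purpose is Purpose.FOLLOW_UP_TREATMENT:
        # Creator always; other providers only if explicitly shared.
        return requester_org == rec.owner_org or requester_org in rec.shared_with
    # Country-specific rules are not modeled here, so deny by default.
    return False


if __name__ == "__main__":
    rec = Recording("rec-001", owner_org="hospital-a",
                    shared_with=frozenset({"clinic-b"}))
    print(may_access(rec, "clinic-b", Purpose.FOLLOW_UP_TREATMENT))
    print(may_access(rec, "lab-c", Purpose.MEDICAL_ANALYTICS))
```

Denying by default keeps the sketch conservative: any scenario whose rules vary by country has to be added explicitly rather than being allowed implicitly.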


Another critical aspect of medical procedures recording that must be considered is that the service delivered to the hospitals has to be an end-to-end service. This means that, from the hospital perspective, the service should not require additional infrastructure except for medical recorders. These recorders have to be able to acquire audio and video from the OR equipment, record and encode it, and then automatically transfer it to the secure cloud infrastructure outside the hospital. It is desired that a standard open API be designed to allow companies producing medical recorders and other video-based equipment to integrate with the secure cloud, avoiding vendor lock-in. Furthermore, the legal framework for each country may differ, so that the process of deployment of the service must be conducted with the participation of legal and health care system experts. There is a long list of norms that has to be taken into account, including ISO 13485 (Medical Devices) as a quality management system [6], ISO 14971:2007 (Part 1: Application of Risk Management to Medical Devices) [7], and IEC 60601-1-11:2015 (Medical Electrical Equipment, Part 1-11: General Requirements for Basic Safety and Essential Performance; Collateral Standard: Requirements for Medical Electrical Equipment and Medical Electrical Systems Used in the Home Health Care Environment) [8].

In the following subsections, we will focus on video encoding, video quality, and video metadata, as these are the areas introducing the greatest challenges with respect to the topic of bioinformatics.

Video Encoding

It is crucial to choose an appropriate set of codecs that will minimize the storage space requirements and maintain the required quality for medical purposes. We have created a model to calculate the amount of storage needed for surgery recording while fulfilling the requirements of being treated as medical data. In the model, we assumed that we record one full high-definition (1920×1080) video image from an OR, with the H.264/AVC video codec at 6 Mbps. This codec was chosen for the model as it is currently the most widespread video codec and provides adequate quality for medical purposes. This has been confirmed by carrying out subjective tests, as they are an excellent way to assess video/image quality for a group of users. These tests allowed medical specialists such as surgeons to give their opinions about specific encoded sequences [9]. The example calculations were performed for Poland. In 2017, there were 951 general hospitals in Poland. On average, there are 8 ORs in a hospital, with 3.2 surgeries being performed per day in each of them [10]. Based on this, we can calculate that one hospital recording all surgical procedures produces 189.8 GB of video data per day and 67.7 TB per year. Therefore, for the whole of Poland, this adds up to 62.9 PB per year. If these recordings are treated as medical data, they have to be stored for 20 years after the last visit of the patient. As an example, if surgery of a 20-year-old patient is recorded in 2019 and the patient lives until 80 years old, regularly going back to see the doctors, this recorded surgery will have to be stored until 2099. For calculations in our model, we used the average life expectancy for Poland, which is 75 years [11], and the average age of a patient of 56.87 years [12]. This means that, on average, a recorded surgery has to be stored for 38.13 years, and before any recorded video can be deleted, the storage needs of Polish hospitals will add up to 2325.3 PB.
Such volumes will be a severe challenge for the information technology infrastructures of the hospitals. The above calculations assume recording only one full high-definition video. However, there are already medical devices (eg, endoscopes) available on the market that provide 4K or even 8K resolution [13]. The usage of these devices will increase the amount of data produced by a factor of 4 or 16. To compensate for this, new ways of video encoding are needed that will provide higher compression, while maintaining the quality at a medical grade.
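The arithmetic behind part of this model can be retraced with a short sketch. It starts from the per-hospital daily volume quoted above and assumes that each unit step (GB to TB to PB) uses a factor of 1024; that factor is an assumption, chosen because it lands within rounding distance of the quoted yearly figures. The function names are illustrative only, and the accumulated national total is not recomputed here.

```python
# Inputs quoted in the text.
HOSPITALS_IN_POLAND = 951
DAILY_GB_PER_HOSPITAL = 189.8
RETENTION_AFTER_LAST_VISIT_YEARS = 20
LIFE_EXPECTANCY_YEARS = 75
AVERAGE_PATIENT_AGE_YEARS = 56.87

# Assumption: unit conversions use binary steps of 1024.
UNIT_STEP = 1024


def yearly_tb_per_hospital(daily_gb: float = DAILY_GB_PER_HOSPITAL,
                           days: int = 365) -> float:
    """Yearly video volume of one hospital in TB."""
    return daily_gb * days / UNIT_STEP


def yearly_pb_national(hospitals: int = HOSPITALS_IN_POLAND) -> float:
    """Yearly video volume of all hospitals in PB."""
    return yearly_tb_per_hospital() * hospitals / UNIT_STEP


def average_retention_years() -> float:
    """Remaining life expectancy plus the retention period after the last visit."""
    remaining_life = LIFE_EXPECTANCY_YEARS - AVERAGE_PATIENT_AGE_YEARS
    return remaining_life + RETENTION_AFTER_LAST_VISIT_YEARS


def pixel_scaling(width: int, height: int, base=(1920, 1080)) -> float:
    """How many times more pixels a resolution has than full HD."""
    return (width * height) / (base[0] * base[1])


if __name__ == "__main__":
    print(f"TB per hospital per year: {yearly_tb_per_hospital():.1f}")
    print(f"PB per year, nationwide:  {yearly_pb_national():.1f}")
    print(f"Average retention, years: {average_retention_years():.2f}")
    print(f"4K vs full HD pixels:     {pixel_scaling(3840, 2160):.0f}x")
    print(f"8K vs full HD pixels:     {pixel_scaling(7680, 4320):.0f}x")
```

The pixel-count ratios correspond to the factors of 4 and 16 mentioned for 4K and 8K devices, under the simplification that data volume scales with the number of pixels.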

Current video compression systems can be roughly divided into two categories: strong and lightweight compression. Strong compression is mainly used for the final distribution of the video content and provides compression ratios from 10× to more than 100×. This high compression is obtained at the expense of losing information in the video (lossy compression) and usually requires substantial computational effort on the encoder side. However, in the newest video codecs, the compression ratio and resulting visual quality can be controlled by the encoder for different application scenarios. For telemedicine, we can achieve a video quality that is visually not distinguishable from the source for humans. Video coding standards such as H.264 (ie, advanced video coding [AVC]) and H.265 (ie, high-efficiency video coding [HEVC]) are part of the strong compression category.

Lightweight compression is used during the production process to reduce the data size for transmission or storage while maintaining the quality as close as possible to the original (ie, lossless or visually lossless compression). Compression ratios between 2× and 6× are common, and implementations of encoders and decoders require far fewer computational resources compared to strong compression. Codecs such as JPEG-2000, JPEG-XS, and VC-2 are part of this group.

The H.265/HEVC video coding standard [14] ratified in 2013 represents the state of the art in strong video compression. Compared to H.264/AVC, H.265/HEVC is capable of reducing the bit rate by 50% while maintaining the same quality. Compared to uncompressed video, HEVC can achieve compression factors between 250× and 500×, depending on the target bit rate and video resolution, more specifically 249× for full high-definition (1920×1080), 30 frames/s at 3 Mbits/s; 478× for 4K (3840×2160), 60 frames/s at 25 Mbits/s; and 398× for 8K (7680×4320), 60 frames/s at 120 Mbits/s. Unlike the H.264 Main Profile (MP), which is limited to 8 bits, the HEVC Main 10 profile supports bit depths up to 10 bits, resulting in better color fidelity and a noticeable reduction of banding artefacts in uniform areas with a continuous gradation of the color tone and luminosity.
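The quoted compression factors can be retraced as the ratio of the uncompressed bit rate to the target bit rate. The sketch below is illustrative only; the bits-per-pixel values are assumptions that are not stated in the text (12 for full high-definition, as in 8-bit 4:2:0, and 24 for 4K and 8K, as in 8-bit 4:4:4 or RGB), chosen because they reproduce the quoted factors. One Mbit is taken as 10^6 bits.

```python
def raw_bit_rate(width: int, height: int, fps: int, bits_per_pixel: int) -> int:
    """Uncompressed video bit rate in bits per second."""
    return width * height * fps * bits_per_pixel


def compression_factor(width: int, height: int, fps: int,
                       bits_per_pixel: int, target_bps: float) -> float:
    """Ratio of the uncompressed bit rate to the encoded target bit rate."""
    return raw_bit_rate(width, height, fps, bits_per_pixel) / target_bps


# (label, width, height, frames/s, assumed bits per pixel, target bit rate)
CASES = [
    ("Full HD", 1920, 1080, 30, 12, 3_000_000),
    ("4K", 3840, 2160, 60, 24, 25_000_000),
    ("8K", 7680, 4320, 60, 24, 120_000_000),
]

if __name__ == "__main__":
    for label, w, h, fps, bpp, target in CASES:
        factor = compression_factor(w, h, fps, bpp, target)
        print(f"{label}: about {round(factor)}x")
```

Changing the assumed bits per pixel changes the factor proportionally, so such figures are only comparable when the source pixel format is the same.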

In 2014, version 2 of the HEVC standard (HEVCv2) was approved, which specifies, among others, the so-called range extensions (RExt) addressed toward video production and contribution applications that require high-quality color formats such as 4:2:2, 4:4:4, and RGB, up to 12 bits per pixel. HEVC RExt is suitable for medical applications as the extended chroma formats and high bit depths allow for better preservation of the original content.

A new generation of video codecs is being developed to cope with the demands of emerging applications such as UHD TV in 4K and 8K resolutions, 360° video, and new quality formats such as high dynamic range (HDR), high frame rate (HFR), and wide color gamut (WCG).

These next-generation video codecs include AV1 and H.266. The AV1 format has recently been ratified by the Alliance for Open Media (AOM) and is designed for web applications such as video on demand (VoD) [15]. The AV1 codec is intended to be open and royalty-free. Studies conducted in 2017 concluded that AV1 produces an average bit rate reduction between 17% and 22% compared to H.265/HEVC [16]. In terms of encoding speed, the AOM’s AV1 reference implementation seems to be too computationally heavy and, as far as we know, there is no efficient AV1 encoder implementation available at present. Therefore, the potential compression gain versus encoding speed of AV1 compared to HEVC does not justify the implementation and optimization of this codec for usage in the context of medical video storage.

The Joint Video Exploration Team on Future Video Coding (JVET) was founded in October 2015 by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) Video Coding Experts Group (VCEG) and the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Moving Picture Experts Group (MPEG) to analyze whether there is sufficient need for the development of a new codec with compression capabilities beyond HEVC. Test cases have been defined for various types of video, including HD, UHD, HDR, and 360° video, and investigations on advanced compression tools using a common software platform have been performed [17]. JVET plans to finalize a new codec standard by the end of 2020 (ITU-T will probably name this codec H.266).

Currently, there are optimized HEVC/H.265 software encoders available, such as the open-source x265 encoder [18], which has inherited its basic algorithms from its predecessor x264 [19] (an H.264/AVC encoder), and commercial encoders offered by companies. Independent comparisons have shown that x265 has better compression-quality-speed performance than other industrial encoders [20], especially for full high-definition video. Currently, x265 is commonly used as a baseline for benchmarking analyses performed by video codec companies.

Real-time 4K and 8K hardware encoders are also already available on the market. Nvidia released a new generation of GPUs called Pascal that supports 8K hardware-accelerated HEVC encoding using NVENC technology. Advantech is commercializing the VEGA 3300 Series for 4K/8K HEVC encoding, decoding, and transcoding. NEC also launched its 4K video codec for ultra-low delay applications. Intel with Quick Sync and AMD with VCE technology are quite active in the area of hardware-accelerated HEVC processing as well. There are also companies and universities providing field programmable gate array (FPGA) accelerators for HEVC-based solutions. However, hardware encoders are much more expensive to create and are often much less feature-rich than software encoders.

Although strong compression delivers a substantial reduction of the bandwidth used, it requires relatively high implementation complexity. In many usage scenarios, especially those closer to the video source, the quality is more important than the compression ratio. In these scenarios, the source videos are ideally stored in an uncompressed raw format. Nevertheless, due to the increase in resolution, frame rate, and pixel format quality, this is becoming more and more challenging from both a technical and economic point of view.

Lightweight compression, visually lossless compression, intermediate codecs, and mezzanine codecs are keywords that refer to codecs for these scenarios. The main characteristics of such codecs are a relatively small compression ratio, often configurable between 2× and 8× compared to the raw format; intra-only codec tools, to allow for uncomplicated integration in editing workflows; a focus on low latency and low-complexity real-time implementation; and a focus on multigeneration performance (ie, the codecs are designed to limit quality degradation over multiple transcoding steps).

JPEG XS is a recent promising effort to create a standardized intermediate codec. A strong focus is placed on achieving visually lossless quality with low-latency and low-complexity implementations. The standardization efforts also target the integration of file and transmission formats [21]. The target use cases are video transport over professional video links (SDI, IP, Ethernet), real-time video storage, memory buffers, omnidirectional video capture and rendering, and on-sensor compression.

In our opinion, considering the large amounts of data that will be produced by the recording of medical procedures, only strong compression should be taken into account. This is why it is crucial that, during the standardization of new strong compression codecs, quality for medical purposes is one of the main focus areas.

There are several publications presenting research results on storing compressed medical video sequences. However, it should be stressed that research on medical multimedia data compression is usually performed within narrow medical specializations such as oncology [22] (the EUROPATH project), radiology [22,23] (the EURORAD project), ophthalmology [24], pediatrics [25], surgery [26-31], nursing [32], dentistry [27], internal medicine [33-35], and emergency medicine [36-38]. The effects of this research have influenced the creation and development of standards for storing and compressing medical video sequences [22].

The most important among them is the Digital Imaging and Communications in Medicine (DICOM) standard [39]. The DICOM standard has become the primary standard used for the storage of compressed surgery multimedia data. DICOM allows for storing both still images and video sequences [39], although it has not been designed to be used for long video sequences. The choice of compression algorithms for reliable storage of source data was one of the critical aspects taken into account during the development of the standard. Currently, most of the codecs specified in the DICOM standard are lossless codecs. The use of lossless codecs explains the low compression ratios that are currently achieved when compressing and storing video sequences (eg, in the DICOM format). Therefore, using these codecs places excessive demands on both archive storage and the throughput of the streaming network.

Moreover, it is essential to note that the DICOM standard is continuously under development, and it is likely to include more lossy compression methods in the future. For example, the DICOM standard is usually not used for the storage of long bronchoscopic video recordings. However, the newest version of the standard introduces the first extensions toward storage of long medical recordings. For diagnostic purposes, the most popular format is AVI, and the most commonly used video codec is the (almost lossless) MJPEG, which results in high-bit rate (broadband) streams that are not suitable for streaming.

By contrast, lossy codecs generally produce desirable narrow-band video streams. Unfortunately, as mentioned above, lossy codecs are hardly ever used for compression of surgery multimedia data if the decompressed images are meant to support a diagnostic process. These codecs are usually designed with the assumption of introducing a significant loss if used for effective compression (the decompressed image may differ significantly from the original). In the case of using visibly degraded images for diagnostic purposes, there is a severe danger of an unacceptable influence on the diagnosis.

Usually, if a presentation of medical multimedia data does not have to support a diagnostic process, but is used for other purposes (eg, educational purposes), popular consumer (usually lossy) codecs are used. In the case of still images, the JPEG compression standard [40] and the JFIF format are used [41]. If video sequences are being compressed, popular codecs such as MPEG-1, MPEG-2, MPEG-4 [42], H.264, H.265, and others are typically used [43].

Video Quality

In many visual applications, the quality of the moving image is less important than the ability of the optical system to perform, based on these video sequences, the specific tasks for which it was created. Such sequences are called target recognition video (TRV). Regardless of how the concept of TRV quality is understood, verifying it requires dedicated quality testing. The basic premise of these tests is to find the TRV quality limits at which the task can still be achieved with the desired probability or accuracy. Such tests are usually subjective psychophysical experiments with a group of subjects. Unfortunately, owing to the complexity of the issue and the relatively poor understanding of human cognitive mechanisms, satisfactory computer modeling of TRV quality has not yet been achieved beyond a few narrow application areas.
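The notion of a TRV quality limit can be sketched in a few lines of Python. This is a minimal illustration rather than a standardized procedure: the function name `quality_limit`, the use of bit rate as the quality parameter, and the trial counts are all hypothetical.

```python
from typing import Dict, Optional, Tuple


def quality_limit(trials: Dict[int, Tuple[int, int]], target: float) -> Optional[int]:
    """Lowest tested bit rate (kbit/s) at which the task success rate still
    meets the target, provided every higher tested bit rate meets it too.

    `trials` maps a bit rate to (correct responses, total responses).
    Returns None if even the highest bit rate misses the target.
    """
    limit = None
    for bitrate in sorted(trials, reverse=True):
        correct, total = trials[bitrate]
        if correct / total < target:
            break  # quality is now too low for the task
        limit = bitrate
    return limit


# Hypothetical outcome of a subjective experiment with 30 responses per condition.
trials = {
    500: (12, 30),
    1000: (21, 30),
    2000: (28, 30),
    4000: (29, 30),
    8000: (30, 30),
}

print(quality_limit(trials, target=0.9))
```

A real experiment would also report confidence intervals for each success rate, because with a small group of subjects the estimated limit is itself uncertain.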

Given the use of TRV, quality tests do not focus on the subjects’ satisfaction with the video sequence quality but instead measure how well the subject uses TRV to accomplish specific tasks. Such tasks may include video surveillance (eg, recognition of vehicle license plate numbers), telemedicine/remote diagnostics (reaching a correct diagnosis), fire safety (fire detection), rear backup cameras (parking a car), and games (spotting and correctly reacting to a virtual enemy).

Since the human factor has a significant influence on the subjective assessment of TRV quality, the procedures to be followed when performing the assessment must be defined. In particular, questions arise concerning the method of selecting the source TRV on which the test TRV (with degraded quality) is based; the subjective testing methods and the general manner of conducting the psychophysical experiment; the process of selecting the group of subjects for the experiment, especially the identification of any prior task knowledge; the training of subjects before the start of the experiment; the conditions under which the test will be carried out; and the methods of statistical analysis and presentation of results [44].

Metrics for general quality of experience [45], both full-reference methods (eg, peak signal-to-noise ratio or structural similarity [46]) and no-reference methods (eg, video quality indicators developed by the AGH University research team [47]), which are conventionally used in video processing systems for video quality evaluation, are not appropriate for recognition tasks in video analytics (ie, TRV). Therefore, the correct assessment of video processing pipeline performance is still a significant research challenge.
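To make concrete what a full-reference metric measures, the following Python sketch computes the peak signal-to-noise ratio for two short lists of 8-bit samples. The sample values are invented for illustration. The metric compares the signals sample by sample and contains no notion of the recognition task, which is why such metrics are a poor fit for TRV.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], degraded: Sequence[int], max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized sample sequences."""
    if not reference or len(reference) != len(degraded):
        raise ValueError("sequences must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, degraded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)


# Invented 8-bit luma samples: an original and a slightly degraded copy.
reference = [52, 55, 61, 66]
degraded = [54, 55, 60, 65]

print(f"PSNR: {psnr(reference, degraded):.2f} dB")
```

Two degraded sequences with the same PSNR can differ widely in how well a clinician can still recognize a structure of interest, depending on where in the frame the errors fall.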

Video Metadata

Besides the medical data itself, some information (ie, related to the video, patient, pathology, etc) should be recorded in parallel. These metadata are essential to identify the hospital, surgeon, patient, and all information relevant to the surgery. These additional data increase the size of the total data to be recorded or transmitted. Moreover, these data are subject to specific constraints, as they must be processed using standard algorithms. They are also often sensitive data and must be handled in compliance with regulations such as the General Data Protection Regulation. Finally, the data come from different sources and suffer from a lack of integration, which in some cases makes it difficult for experts to access the data in their daily practice.

Despite all of these constraints, exploitation of these metadata offers several advantages and should be of interest for medicine at present and in the future. In particular, this would enable improvement of the everyday practice of experts during the identification of the disease, diagnosis, and follow up.

One way to circumvent the constraints mentioned above, especially the increase in data size and the lack of integration, could be to use data-hiding techniques to hide metadata in the medical video without increasing the total amount of data. In this way, the metadata would never become separated from the recording.

Data hiding can be used to hide different types of information such as text, signature, code, image, video, or audio. It is crucial to include in the video the patient metadata relevant to the recorded procedure, including preexamination results, disease history, and a link to the patient data in a national or hospital database. Such an approach would ensure that the data most important for follow-up treatment are always available with the video. Of course, as these are sensitive medical data, a second version of the recording with anonymized patient data must be prepared for research and educational purposes.

The data-hiding technique was initially conceived in the 1990s to fight against digital piracy and protect the intellectual property of identical copies [48] (especially after the explosion in the use of digital media) during storage and transmission via very open information networks susceptible to attacks. Since its creation, data hiding has been developed in several areas such as telecommunications, video coding, VoD services, medical imaging [49], and computer security.

Figure 2 presents an example of data hiding used to embed information in an ear, nose, and throat medical video during its transmission. A data-hiding procedure has two main parts: data insertion and data detection (or extraction). The mark is inserted into the medical image before transmission. On reception, the hidden mark is detected using a secret key and then extracted.
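The insertion and extraction steps can be sketched with a deliberately simple scheme. The Python example below hides each payload bit in the least significant bit of a sample whose position is chosen by a pseudo-random generator seeded with the secret key. This is an assumption made for illustration and not the method shown in Figure 2 or used in any product mentioned in this paper; least-significant-bit embedding does not survive lossy compression. The frame contents, payload, and key are hypothetical.

```python
import random
from typing import List


def _positions(n_samples: int, n_bits: int, key: int) -> List[int]:
    """Key-dependent pseudo-random sample positions, one per payload bit."""
    if n_bits > n_samples:
        raise ValueError("payload exceeds the capacity of one bit per sample")
    return random.Random(key).sample(range(n_samples), n_bits)


def _to_bits(payload: bytes) -> List[int]:
    """Payload bytes as a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]


def _from_bits(bits: List[int]) -> bytes:
    """Inverse of _to_bits."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def embed(frame: List[int], payload: bytes, key: int) -> List[int]:
    """Data insertion: return a copy of the frame carrying the payload."""
    bits = _to_bits(payload)
    marked = list(frame)
    for pos, bit in zip(_positions(len(frame), len(bits), key), bits):
        marked[pos] = (marked[pos] & ~1) | bit  # overwrite the lowest bit
    return marked


def extract(frame: List[int], n_bytes: int, key: int) -> bytes:
    """Data extraction: read the payload back using the same secret key."""
    positions = _positions(len(frame), n_bytes * 8, key)
    return _from_bits([frame[pos] & 1 for pos in positions])


# Stand-in for the 8-bit luma samples of one frame, and hypothetical metadata.
frame = [(i * 37) % 256 for i in range(1024)]
payload = b"ID:0001"

marked = embed(frame, payload, key=2020)
recovered = extract(marked, len(payload), key=2020)
print(recovered)
```

Without the key, a receiver does not know which samples carry the payload. Each marked sample differs from the original by at most one gray level, which keeps the mark invisible but offers no robustness against re-encoding.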

For each application, the choice of data-hiding method depends on three main constraints: insertion capacity (in bits), imperceptibility/invisibility, and robustness. The capacity is the amount of information that can be inserted or hidden in the media. Inserting information can distort video quality; it is therefore essential that the mark remains invisible so that the video stays as faithful as possible to the original. Data-hiding techniques are also sensitive to various distortions (attacks) that occur during transmission or coding. The required robustness to these attacks depends strongly on the application constraints, whereas capacity and invisibility are more related to the structure of the host video itself. Regardless, using a data-hiding technique inevitably leads to a compromise among capacity, invisibility, and robustness, taking into account the constraints of the application. For example, in the medical domain, the main effort must be placed on invisibility, especially for diagnostic applications.
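The compromise between capacity and invisibility can be quantified for a simple case. The Python sketch below assumes, purely for illustration, that the mark overwrites the k least significant bits of every 8-bit sample of a 1920x1080 luma frame; it reports the resulting capacity and a worst-case peak signal-to-noise ratio. Robustness is not modeled, and the scheme is not the one used by the solutions discussed in this paper.

```python
import math
from typing import Tuple


def lsb_tradeoff(n_samples: int, bits_per_sample: int, max_value: int = 255) -> Tuple[int, float]:
    """Capacity (bits) and worst-case PSNR (dB) when the lowest
    `bits_per_sample` bits of every sample are overwritten."""
    capacity_bits = n_samples * bits_per_sample
    # Largest possible change of one sample when its k lowest bits are replaced.
    max_error = 2 ** bits_per_sample - 1
    worst_case_psnr = 10 * math.log10(max_value ** 2 / max_error ** 2)
    return capacity_bits, worst_case_psnr


# Hypothetical host: the luma plane of one 1920x1080 frame.
for k in range(1, 5):
    capacity, psnr_db = lsb_tradeoff(1920 * 1080, k)
    print(f"{k} bit(s)/sample: {capacity / 8 / 1024:.0f} KiB per frame, "
          f"worst-case PSNR {psnr_db:.1f} dB")
```

Each additional bit per sample increases the capacity linearly while the worst-case distortion roughly doubles, which is why diagnostic applications would favor a small payload.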

Figure 2. Schematic depiction of a data-hiding procedure.

Our group is already providing services that can be treated as components of the envisioned platform or come close to it.

The medVC platform [50] is a remote collaboration tool for medical professionals that allows for real-time audio-video communication and the usage of specialized medical services. This platform is designed to be installed in ORs, conference rooms, and doctors’ offices. It can send multiple high-definition video streams coming from cameras, microscopes, endoscopes, and other medical equipment. The platform has a built-in recording system providing high-quality videos encoded with the H.264 or HEVC codec. The system can also automatically upload the recorded videos to cloud platforms such as Medtube [51] or the Interactive Medical University (IMU) [52]. These platforms are dedicated medical video portals, the content of which can be used for medical education. Medtube is a platform allowing any doctor to publish recordings of their surgeries, uploaded using a web browser. IMU is an approach to an end-to-end solution, in which videos recorded by dedicated medical recorders are automatically uploaded to a doctor’s account, where they can then be edited and published.

The deep RIVER [53] company is a spin-off of the University of Lorraine, providing libraries that enable metadata hiding in video content. This novel approach is currently being integrated into the medVC platform, which will allow embedding critical medical data into the video synchronously.

Spin Digital Video Technologies GmbH [54], a small and medium-sized enterprise based in Germany, is the developer of an H.265/HEVC codec implementation, achieving extremely high video quality with reasonable bit rates. Spin Digital software solutions enable media applications that require the latest image and video processing enhancements, including very high resolution (4K, 8K, and 16K), HDR, HFR, WCG, 360° video, and virtual reality. Based on this core technology, Spin Digital has developed a complete solution that includes applications (media player and transcoder) as well as a software development kit that is ready to be integrated into custom applications.

Researchers of the AGH University of Science and Technology [55] working in our group are experts on video quality assessment methods and members of the Video Quality Experts Group [56]. Thanks to their work, recommendations for video coding parameters for medical use have been created, and in the future, emerging codecs will be assessed in this context.

Poznań Supercomputing and Networking Center (PSNC) [57] is one of the leading research and computing centers in Poland. PSNC focuses on using, innovating, and providing high-performance computing, grid, cloud, portal, and network technologies to enable advances in science and engineering. PSNC is also a cloud infrastructure provider, and its network interconnects all clinical hospitals in Poland. Based on this experience and activity, PSNC can become the first operator to introduce secure cloud storage services on the Polish market.

Recording of medical procedures and the requirement to treat these recordings as medical data will be a significant challenge for health care systems around the world. Exabytes of data will be created and will have to be stored for decades. Most hospitals will not be able to handle this from an infrastructural point of view. Therefore, there is a need to provide solutions at the national or even international level. Standardization efforts are needed to define procedures for medical video storage in certified data centers supporting national health care systems. This must be done together with the specification of an API to be implemented by medical recording devices that will automatically upload the recordings to the repositories. In parallel, the standardization of high-compression codecs for medical purposes must progress, and the codec quality parameters required to ensure medical video quality must be defined by law. Data hiding can provide an answer to the problems of trustworthiness and integrity of medical data. Thanks to this technique, it is possible to send and store more digital metadata by hiding it in the media. In addition, data hiding offers further advantages for medical applications by enriching the medical data with useful metadata that can help practitioners improve their diagnosis during the identification of the disease or enhance the level of training for future doctors.

To some extent, there are technologies available or emerging that can be used to build the desired medical video recording and storage platform. With respect to video encoding, there is HEVC [14], which is maintained by the Joint Collaborative Team on Video Coding; the ISO/IEC and ITU-T are conducting ongoing work on new HEVC extensions that can be applied to high-quality medical content. Moreover, the codec being developed by the JVET [17] of the ITU-T VCEG and ISO/IEC MPEG, which is expected to be completed at the end of 2020, should provide extensions for medical usage. Automatic video quality assessment is more problematic, as conventional video quality assessment metrics are not appropriate for recognition tasks in video analytics; thus, there is still a significant research challenge in this area. New metrics and methods have to be defined, providing an adequate approach to assessing video quality for telemedicine. For video watermarking and trustworthiness of video metadata, novel approaches to data hiding within the video media itself are already available [49]. There are also commercial providers of solutions offering end-to-end services for hospitals, such as medVC [50] and IMU [52], enabling automatic recording, uploading, and storage of medical procedure recordings. However, as there is no legal definition of the quality parameters for highly compressed medical video, these solutions are used mainly for medical education and knowledge sharing at present. Having these standards defined would allow for a broader spectrum of usage scenarios such as getting a second opinion on the treatment, support for follow-up treatment, or analysis in legal cases.

Conflicts of Interest

None declared.

  1. Telemedizin in Deutschland - Fernbehandlung läuft schleppend. Krankenkassen-Zentrale.   URL: https:/​/www.​​magazin/​telemedizin-in-deutschland-fernbehandlung-laeuft-schleppend-97525 [accessed 2019-08-28]
  2. EHealth Action Plan 2012-2020 - Innovative healthcare for the 21st century. European Commission.   URL: [accessed 2019-08-28]
  3. Rockstroh M, Franke S, Neumuth T. Requirements for the structured recording of surgical device data in the digital operating room. Int J Comput Assist Radiol Surg 2014 Jan 21;9(1):49-57. [CrossRef] [Medline]
  4. Hey T, Trefethen A. The Data Deluge: An e-Science Perspective. In: Berman F, Fox G, Hey T, editors. Grid Computing: Making the Global Infrastructure a Reality. Hoboken, NJ: John Wiley & Sons; May 30, 2003:809-824.
  5. Bergström H, Larsson L, Stenberg E. Audio-video recording during laparoscopic surgery reduces irrelevant conversation between surgeons: a cohort study. BMC Surg 2018 Nov 06;18(1):92 [FREE Full text] [CrossRef] [Medline]
  6. ISO 13485: 2016 Medical Devices - Quality Management Systems - Requirements for Regulatory Purposes. International Organization for Standardization.   URL: [accessed 2019-08-28]
  7. ISO 14971: 2007 Medical Devices - Application of Risk Management to Medical Devices. International Organization for Standardization.   URL: [accessed 2019-08-28]
  8. IEC 60601-1-11:2015 Medical electrical equipment - part 1-11: general requirements for basic safety and essential performance - collateral standard: requirements for medical electrical equipment and medical electrical systems used in the home healthcare. International Organization for Standardization.   URL: [accessed 2019-08-28]
  9. Chaabouni A, Gaudeau Y, Lambert J, Moureaux J, Gallet P. H.264 medical video compression for telemedicine: A performance analysis. IRBM 2016 Feb;37(1):40-48. [CrossRef]
  10. In-patient health care. SWAID.   URL: http:/​/swaid.​​en/​ZdrowieOchronaZdrowia_dashboards/​Raporty_predefiniowane/​RAP_DBD_ZDR_3.​aspx [accessed 2019-08-28]
  11. Life expectancy. SWAID.   URL: [accessed 2019-08-28]
  12. Health and health care in 2016. Glowny Urzad Statystyczny (Central Statistical Office).   URL: https:/​/stat.​​files/​gfx/​portalinformacyjny/​pl/​defaultaktualnosci/​5513/​1/​7/​1/​zdrowie_i_ochrona_zdrowia_w_2016.​pdf [accessed 2019-08-28]
  13. Minai H. 8K endoscopes making surgery safer and less painful. Nikkei Asian Review. 2017.   URL: [accessed 2019-08-28]
  14. Sullivan GJ, Ohm JR, Han WJ, Wiegand T. Overview of the High Efficiency Video Coding (HEVC) Standard. IEEE Trans Circuits Syst Video Technol 2012 Dec 01;22(12):1649-1668. [CrossRef]
  15. The Alliance for Open Media Kickstarts Video Innovation Era with AV1. Alliance for Open Media. 2018.   URL: https://aomedia.org/press%20releases/the-alliance-for-open-media-kickstarts-video-innovation-era-with-av1-release/ [accessed 2019-08-28]
  16. MSU Codec Comparison 2017 Part V: High Quality Encoders. CS MSU Graphics & Media Lab, Video Group. 2018 Jan 17.   URL: [accessed 2019-08-28]
  17. Segall C, Baroncini V, Boyce J, Chen J, Suzuki T. JVET-H1002: Joint Call for Proposals on Video Compression with Capability beyond HEVC. 2018 Jul 20 Presented at: Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11 8th Meeting; October 18-24, 2017; Macao, CN.
  18. x265 HEVC/H.265 Encoder. x265.   URL: [accessed 2019-08-28]
  19. x264 H.264/AVC Encoder. VideoLAN.   URL: [accessed 2019-08-28]
  20. Zvezdakov S, Antsiferova A, Kondranin D, Kulikov D, Vatolin D, Erofeev M. HEVC Video Codecs Comparison 2018 (Part IV: FullHD Content, High-Quality Use Case). MSU Codec Comparisons. 2018 Apr 04.   URL: [accessed 2019-08-28]
  21. Overview of JPEG-XS. Joint Photographic Experts Group (JPEG).   URL: [accessed 2019-08-28]
  22. Rubis Project. 4th Research and Development Framework Programme 1994-1998. In: Healthcare Telematics Projects Final Report. Brussels: European Commission, DG XIII, Directorate B; 2001:17-19.
  23. Przelaskowski A. Wavelet-based Image Data Compression (Falkowe metody kompresji danych obrazowych) DSc dissertation. Warsaw: Publishing House of the Warsaw University of Technology (Oficyna Wydawnicza PW); Sep 01, 2002:149-190.
  24. Miron H, Blumenthal EZ. Bridging analog and digital video in the surgical setting. J Cataract Refract Surg 2003 Oct;29(10):1874-1877. [CrossRef] [Medline]
  25. Hamilton NM, Frade I, Duguid KP, Furnace J, Kindley AD. Digital video for networked CAL delivery. J Audiov Media Med 1995 Jun;18(2):59-63. [CrossRef] [Medline]
  26. Kumar A, Pal H. Digital video recording of cardiac surgical procedures. Ann Thorac Surg 2004 Mar;77(3):1063-1065; discussion 1065. [CrossRef] [Medline]
  27. Reynolds PA, Mason R. On-line video media for continuing professional development in dentistry. Comput Educ 2002 Aug 01;39(1):65-98. [CrossRef]
  28. Greene PS. Streaming Video for The Annals Internet Readers. Ann Thorac Surg 1998 Apr;65(4):1188-1189. [CrossRef]
  29. Gandsas A, McIntire K, Palli G, Park A. Live streaming video for medical education: a laboratory model. J Laparoendosc Adv Surg Tech A 2002 Oct;12(5):377-382. [CrossRef] [Medline]
  30. Malassagne B, Mutter D, Leroy J, Smith M, Soler L, Marescaux J. Teleeducation in surgery: European Institute for Telesurgery experience. World J Surg 2001 Nov;25(11):1490-1494. [CrossRef] [Medline]
  31. Rosser J, Herman B, Ehrenwerth C. An overview of videostreaming on the Internet and its application to surgical education. Surg Endosc 2001 Jun 19;15(6):624-629. [CrossRef] [Medline]
  32. Green SM, Voegeli D, Harrison M, Phillips J, Knowles J, Weaver M, et al. Evaluating the use of streaming video to support student learning in a first-year life sciences course for student nurses. Nurse Educ Today 2003 May;23(4):255-261. [CrossRef] [Medline]
  33. Strom J. Streaming Video: Overcoming Barriers for Teaching and Learning. 2002 Jan 01 Presented at: International Symposium Educational Conferencing; 2002; Banff, AB.
  34. Wiecha JM, Gramling R, Joachim P, Vanderschmidt H. Collaborative e-learning using streaming video and asynchronous discussion boards to teach the cognitive foundation of medical interviewing: a case study. J Med Internet Res 2003 Jun 27;5(2):e13 [FREE Full text] [CrossRef] [Medline]
  35. Zollo SA, Kienzle MG, Henshaw Z, Crist LG, Wakefield DS. Tele-education in a telemedicine environment: implications for rural health care and academic medical centers. J Med Syst 1999 Apr;23(2):107-122. [CrossRef] [Medline]
  36. Levitan RM, Goldman TS, Bryan DA, Shofer F, Herlich A. Training with video imaging improves the initial intubation success rates of paramedic trainees in an operating room setting. Ann Emerg Med 2001 Jan;37(1):46-50. [CrossRef] [Medline]
  37. Leung J. Apply Streaming Audio and Video Technology to Enhance Emergency Physician Education. Acad Emerg Med 2002 Oct 01;9:1059. [CrossRef]
  38. Gisondi MA. Emergency Department Orientation Utilizing Web-based Streaming Video. Acad Emerg Med 2003 Aug 01;10(8):920. [CrossRef]
  39. Digital Imaging and Communications in Medicine (DICOM). National Electrical Manufacturers Association.   URL: [accessed 2019-08-28]
  40. ISO/IEC 2022:1994: Information technology - Character code structure and extension techniques. International Organization for Standardization. 1994.   URL: [accessed 2019-08-29]
  41. Lowe HJ. The New Telemedicine Paradigm: Using Internet-Based Multimedia Electronic Medical Record Systems To Support Wide-Area Clinical Care Delivery. 2001 Jul 06 Presented at: Telemedicine and Telecommunications: Options for the New Century; March 13, 2001; Bethesda.
  42. Cuggia M, Mougin F, Le Beux P. Indexing method of digital audiovisual medical resources with semantic Web integration. Int J Med Inform 2005 Mar;74(2-4):169-177. [CrossRef] [Medline]
  43. Duplaga M, Leszczuk M, Papir Z, Przelaskowski A. Compression evaluation of surgery video recordings retaining diagnostic credibility (compression evaluation of surgery video). Opto Electron Rev 2008 Dec 01;16(4):428-438. [CrossRef]
  44. Leszczuk M. Revising and Improving the ITU-T Recommendation. J Telecom Inf Technol 2015 Jan 01:912.
  45. Le Callet P, Möller S, Perkis A. Qualinet White Paper on Definitions of Quality of Experience. 2013 Mar 01 Presented at: Fifth Qualinet Meeting European Network on Quality of Experience in Multimedia Systems and Services (COST Action IC 1003); March 12, 2013; Novi Sad.
  46. Wang Z, Bovik A, Sheikh H, Simoncelli E. Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 2004 Apr;13(4):600-612. [CrossRef] [Medline]
  47. Leszczuk M, Hanusiak M, Farias MCQ, Wyckens E, Heston G. Recent developments in visual quality monitoring by key performance indicators. Multimed Tools Appl 2014 Sep 6;75(17):10745-10767. [CrossRef]
  48. Petitcolas F, Anderson R, Kuhn M. Information hiding-a survey. Proc IEEE 1999 Aug 06;87(7):1062-1078. [CrossRef]
  49. Puech W, Rodrigues JM. A new crypto-watermarking method for medical images safe transfer. 2004 Sep 06 Presented at: 12th European Signal Processing Conference; September 6-10, 2004; Vienna p. 1481-1484.
  50. medVC Remote Collaboration Tool for Medical Professionals.   URL: [accessed 2019-08-28]
  51. MedTube.   URL: [accessed 2019-08-28]
  52. Interactive Medical University - A platform of medical education based on video materials. medVC.   URL: [accessed 2019-08-28]
  53. deep RIVER.   URL: [accessed 2019-08-28]
  54. Spin Digital Video Technologies GmbH.   URL: [accessed 2019-08-28]
  55. AGH University of Science and Technology.   URL: [accessed 2019-08-28]
  56. Video Quality Experts Group (VQEG).   URL: [accessed 2019-08-28]
  57. Poznan Supercomputing and Networking Center (PSNC).   URL: [accessed 2019-08-28]

AOM: Alliance for Open Media
API: application programming interface
AVC: advanced video coding
DICOM: digital imaging and communications in medicine
eHealth: electronic health
FPGA: field programmable gate array
HDR: high dynamic range
HEVC: high-efficiency video coding
HFR: high frame rate
ICT: information and communications technology
IEC: International Electrotechnical Commission
IMU: Interactive Medical University
ISO: International Organization for Standardization
ITU-T: International Telecommunication Union Telecommunication Standardization Sector
JVET: Joint Video Exploration Team on Future Video coding
MP: Main Profile
MPEG: Moving Picture Experts Group
OR: operating room
PSNC: Poznań Supercomputing and Networking Center
RExt: range extensions
TRV: target recognition video
VCEG: Video Coding Experts Group
VoD: video on demand
WCG: wide color gamut

Edited by G Eysenbach; submitted 05.02.20; peer-reviewed by M Johanson, E Moro-Rodríguez; comments to author 12.04.20; revised version received 31.05.20; accepted 13.06.20; published 29.07.20


©Piotr Pawałowski, Cezary Mazurek, Mikołaj Leszczuk, Jean-Marie Moureaux, Amine Chaabouni. Originally published in JMIR Biomedical Engineering, 29.07.2020.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.