This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication on http://biomedeng.jmir.org/, as well as this copyright and license information must be included.
The amount of medical video data that has to be securely stored has been growing exponentially. This rapid expansion is mainly caused by the introduction of higher video resolutions such as 4K and 8K in medical devices and the growing usage of telemedicine services, along with a general trend toward increasing transparency with respect to medical treatment, resulting in more and more medical procedures being recorded. Such video data, as medical data, must be maintained for many years, resulting in datasets at the exabyte scale that each hospital must be able to store in the future. Currently, hospitals do not have the required information and communications technology infrastructure to handle such large amounts of data in the long run. In this paper, we discuss the challenges and possible solutions to this problem. We propose a generic architecture for a holistic, end-to-end recording and storage platform for hospitals, define crucial components, and identify existing and future solutions to address all parts of the system. This paper focuses mostly on the recording part of the system by introducing the major challenges in the area of bioinformatics, with particular focus on three major areas: video encoding, video quality, and video metadata.
In recent years, technologies and services in the areas of electronic health (eHealth) and telemedicine have been evolving more and more rapidly. In many European countries, the law is being changed so that remote medical treatments conducted using information and communications technology (ICT) systems are now recognized as valid medical procedures. The physical presence of the doctor and the patient at the same location is often no longer necessary, and private as well as state-owned insurance companies now reimburse doctors for performing remote consultations with the patient. A remote interaction between the doctor and the patient can be conducted via phone or internet chats, but as video plays a significant role in current communication, solutions based on the real-time transmission of audio-video are considered to provide better service with higher trustworthiness. In many such scenarios, there is a need to record remote treatment sessions for further reference, follow-up analysis, or for legal reasons.
Furthermore, in many European countries, there is a growing need for recording medical procedures conducted in hospitals or health care centers. There is a general trend toward increasing transparency in medical treatment, and thus toward improving the quality of services provided to the patient. Medical errors do happen, but sometimes a dissatisfied patient decides to prosecute even when no mistake has been made. Therefore, hospitals need to have a way to settle such disputes based on proof. Some countries are preparing laws to make it obligatory for hospitals to record all treatment procedures. Moreover, incoming law changes will qualify such recordings as medical data, forcing hospitals to store these data for decades. Insurance companies also offer much lower liability insurance fees to hospitals recording their procedures, because in the case of a lawsuit, there is clear evidence to determine whether or not the hospital is at fault.
Taking this context into account, hospitals, health care centers, and doctors in Europe will soon need to implement recording of treatment and storage of these recordings for treatment conducted both locally and using telemedicine systems. However, most hospitals and health care centers are not ready to introduce such services as they do not have the necessary recording devices, are overwhelmed with the infrastructural needs of storing thousands of hours of recordings per year, and cannot guarantee the necessary security of the repositories.
To resolve these issues, an end-to-end solution is needed for the recording, secure storage, and access to records of medical procedures conducted both locally in hospitals and health care centers and via telemedicine systems. This requires new approaches to video coding, image quality, transmission, and security.
In this paper, we propose a generic platform architecture for medical video recording and storage. We discuss current and upcoming challenges, and identify the possible technologies to tackle these challenges. We also outline existing solutions that can be incorporated into a holistic platform. This paper focuses mostly on the recording part of the system, as it poses the major challenges in the area of bioinformatics, in three areas in particular: video encoding, video quality, and video metadata.
The remainder of this paper is organized as follows. The next section describes the current situation and upcoming challenges. This is followed by a proposal of the generic architecture and possible technologies to be used to resolve these challenges. The following section outlines existing solutions on which such a platform could be built. Finally, we provide an overview and future prospects.
Telemedicine services, which allow for remote treatment and remote consultations using videoconferencing or other ICT systems, are increasingly being recognized as valid treatment solutions by national health care legislation. For example, as of 2018 in Germany [
Telemedicine services can be divided into three basic types: (1) enabling patient-doctor contact, (2) supporting remote collaboration between medical professionals, and (3) enhancing medical education. These three types, although they involve different users and technical solutions, are substantially based on the transmission of video and thus require a secure solution to provide video recording and storage.
University clinics, which often use videoconferencing systems for medical education, also face the problem of how to store and publish recordings of educational sessions. A typical setup with H.323 videoconferencing equipment does not allow for recording sessions without the use of an additional dedicated recording device or cloud service. Such devices introduce extra costs and require management by the hospital. Using an external cloud service may violate legal requirements, as most of the videoconferencing solutions currently in use are provided by US-based companies.
A similar problem applies to recording eHealth video connections between the patient and the doctor. Although the amount of data is much lower, web-based or mobile app–based solutions do not fulfill the legal requirements when it comes to the recording and storing of medical videos. Records of treatment procedures conducted remotely contain private patient data and need to be processed and stored accordingly. In many European countries, medical data cannot leave the given country; therefore, the medical videos must be stored securely by a cloud service that is located within the country.
In addition to telemedicine systems, medical videos can be produced by modern operating rooms (ORs), which contain several devices that are also video sources, including microscopes, endoscopes, surgical robots, and macroscopic cameras. Owing to the increasing complexity of the surgical working environment, technical solutions must be found to help relieve the surgeon [
In most cases, medical devices record video on local hard drives, DVDs, or USB flash drives, if they provide that possibility at all. This makes the recordings very difficult for the information technology department to manage and store. Even if there is an integrated video system available in the hospital, the infrastructure needed to save the videos, especially for many years, is not affordable for many hospitals, and storing the videos in-house is not cost-effective. With the development of video technologies, where full high-definition systems are now standard and 4K or even 8K resolution systems are available or upcoming, the storage space needed for medical video has grown rapidly. For example, radiological imaging such as computed tomography scans, magnetic resonance images, or positron emission tomography scan images can reach several hundreds of megabytes per exam, while stereoscopic video related to high-definition surgery is based on two streams of 1.5 GB per second each. In the United States, the storage of mammograms required 2.5 PB in 2009, and 30% of all the images stored in the world in 2010 were medical images [
Moreover, in many countries, the retention period of medical records is very long. In Poland and France, medical images associated with a patient must be kept for 20 years after their last visit under the current law. As an example, the Department of Radiology and Imaging of the University Hospital in Nancy, France produced 55 TB of image data in 2015. Regarding videos, a full high-definition endoscopic camera (used routinely in endoscopic surgery) can provide 2.6 TB of data during a 2-hour operation. These figures highlight the challenge to be faced in terms of archiving capacity.
Consequently, medical images and video should be encoded (ie, compressed), as long as the encoding does not affect the therapeutic quality of the data for regular use by health professionals. Regardless of whether the data need to be transmitted or stored, it is of high importance to ensure that the compression step introduces no significant loss (degradation, or any other processing such as watermarking) at the encoder stage. In other words, it is crucial to maintain an acceptable visual quality of the video stream for health care professionals.
There are many solutions on the market providing recording functionality in the hospital environment. Companies such as Storz, Olympus, and Stryker offer medical recorders. These systems integrate with the proprietary storage solutions provided by their manufacturers, which have to be deployed within the information technology infrastructure of the hospital. However, this is not a scalable solution when all video recordings have to be kept as medical data over many years. Standard videoconferencing solutions, available from Cisco or Polycom, do not have a built-in recorder but rely on separate server-based recorders that must be installed in the information technology infrastructure of the hospital. This approach is not scalable either, and these products are designed for videoconferencing in general. Therefore, they do not fulfill the requirements of medical video recording and storage when it comes to security and access rights.
Currently, European hospitals are not yet obliged to record all medical procedures; however, many do so for in-house usage or to increase surgeons’ efficiency [
Our group, composed of research centers and companies from Poland, France, and Germany, has designed a generic architecture (
Generic architecture of the holistic recording and storage platform for hospitals. eHealth: electronic health.
The high-level scenarios that need medical video recording (marked with green in
The two major technical constituents of the system (marked with blue in
The different use case scenarios of the recordings are marked with red in
The platform should provide access to recordings that have been marked as public and anonymized to enable medical analytics and research using the stored video content.
Videos are also an essential part of the medical literature. The recordings provided through the platform should be linked to medical documentation files.
Recordings stored on the platform should be accessible by doctors to review the previous treatment or procedures to provide better follow-up treatment. The videos should either be available only to the doctor/hospital that created the recording or made available to other health care providers, according to varying laws in European countries.
The recorded videos could also be used by courts or insurance companies in legal matters. As mentioned in the Introduction, with a general trend to increase transparency when it comes to medical treatment, recording medical procedures reduces the liability of hospitals. This is especially important as telemedical services, allowing remote treatment using videoconferencing, are being recognized as valid treatment solutions by national health care legislation.
The insufficient number of adequately educated medical personnel is one of the main obstacles to providing universal health care at the highest level. Medical school is especially time- and money-consuming in areas such as surgery, in which students and young doctors have to observe several operations during their education. However, because of spatial limitations in ORs, only a small number of students can be present during a given surgery, especially if they need to see the essential details from up close. This access scenario is an integral part of modern medicine, as audio-video content highly increases the effectiveness of medical education, which in turn benefits the whole society.
Another critical aspect of medical procedures recording that must be considered is that the service delivered to the hospitals has to be an end-to-end service. This means that, from the hospital perspective, the service should not require additional infrastructure except for medical recorders. These recorders have to be able to acquire audio and video from the OR equipment, record and encode it, and then automatically transfer it to the secure cloud infrastructure outside the hospital. It is desired that a standard open API be designed to allow companies producing medical recorders and other video-based equipment to integrate with the secure cloud, avoiding vendor lock-in. Furthermore, the legal framework for each country may differ, so that the process of deployment of the service must be conducted with the participation of legal and health care system experts. There is a long list of norms that has to be taken into account, including ISO 13485 (Medical Devices) as a quality management system [
In the following subsections, we will focus on video encoding, video quality, and video metadata, as these are the areas introducing the greatest challenges with respect to the topic of bioinformatics.
It is crucial to choose an appropriate set of codecs that will minimize the storage space requirements and maintain the required quality for medical purposes. We have created a model to calculate the amount of storage needed for surgery recording while fulfilling the requirements of being treated as medical data. In the model, we assumed that we record one full high-definition (1920×1080) video image from an OR, with the H.264/AVC video codec at 6 Mbps. This codec was chosen for the model as it is currently the most widespread video codec and provides adequate quality for medical purposes. This has been confirmed by carrying out subjective tests, as they are an excellent way to assess video/image quality for a group of users. These tests allowed medical specialists such as surgeons to give their opinions about specific encoded sequences [
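The kind of storage estimate described above can be illustrated with a short calculation. The following is a minimal sketch and not the authors' actual model: only the 6 Mbps bitrate comes from the text, and the 20-year retention period follows the legal requirement cited earlier for Poland and France. The number of ORs, the recording hours per day, and the working days per year are hypothetical values chosen purely for illustration, as are the function names.

```python
def recording_size_bytes(bitrate_bps: int, duration_s: int) -> int:
    """Size of a constant-bitrate recording in bytes (8 bits per byte)."""
    return bitrate_bps * duration_s // 8


def archive_size_bytes(bitrate_bps: int, hours_per_day: int, days_per_year: int,
                       rooms: int, retention_years: int) -> int:
    """Steady-state archive size once a full retention period has accumulated."""
    per_room_per_day = recording_size_bytes(bitrate_bps, hours_per_day * 3600)
    return per_room_per_day * days_per_year * rooms * retention_years


if __name__ == "__main__":
    BITRATE_BPS = 6_000_000   # H.264/AVC at 6 Mbps (from the text)
    HOURS_PER_DAY = 8         # assumption: recorded hours per OR per day
    DAYS_PER_YEAR = 250       # assumption: working days per year
    ROOMS = 10                # assumption: number of ORs in the hospital
    RETENTION_YEARS = 20      # retention period cited for Poland and France

    per_hour = recording_size_bytes(BITRATE_BPS, 3600)
    total = archive_size_bytes(BITRATE_BPS, HOURS_PER_DAY, DAYS_PER_YEAR,
                               ROOMS, RETENTION_YEARS)
    print(f"One hour of video: {per_hour / 1e9:.1f} GB")
    print(f"Archive after {RETENTION_YEARS} years: {total / 1e15:.2f} PB")
```

Under these assumptions, a single video stream per OR already leads to a petabyte-scale archive; additional video sources per room or higher resolutions scale the result linearly with the bitrate.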
Current video compression systems can be roughly divided into two categories: strong and lightweight compression. Strong compression is mainly used for the final distribution of the video content, which provides compression ratios from 10× to more than 100×. High compression is obtained at the expense of losing the information in the video (lossy compression) and usually requires substantial computational effort on the encoder side. However, in the newest video codecs, the compression ratio and resulting visual quality can be controlled by the encoder for different application scenarios. For telemedicine, we can achieve a video quality that is visually not distinguishable from the source for humans. Video coding standards such as H.264 (ie, advanced video coding [AVC]) and H.265 (ie, high-efficiency video coding [HEVC]) are part of the strong compression category.
Lightweight compression is used during the production process to reduce the data size for transmission or storage while maintaining the quality as close as possible to the original (ie, lossless or visually lossless compression). Compression ratios between 2× and 6× are common, and implementations of encoders and decoders require much less computational resources compared to strong compression. Codecs such as JPEG-2000, JPEG-XS, and VC-2 are part of this group.
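To give a sense of scale for the two categories, the sketch below applies the compression ratios quoted above to the 2.6 TB figure cited earlier for a 2-hour full high-definition endoscopic recording. The function name is hypothetical, and the ratios are treated as fixed factors for illustration only; in practice, the achieved ratio depends on the content and the encoder settings.

```python
def compressed_size_gb(raw_size_gb: float, ratio: float) -> float:
    """Size after compression, given the raw size and a compression ratio (eg, 10 for 10x)."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return raw_size_gb / ratio


if __name__ == "__main__":
    RAW_2H_GB = 2600.0  # 2.6 TB for a 2-hour full HD endoscopic recording (from the text)

    categories = {
        "lightweight": (2, 6),   # ratios quoted for lightweight compression
        "strong": (10, 100),     # ratios quoted for strong compression
    }
    for name, (low, high) in categories.items():
        largest = compressed_size_gb(RAW_2H_GB, low)
        smallest = compressed_size_gb(RAW_2H_GB, high)
        print(f"{name}: {smallest:.0f} GB to {largest:.0f} GB per 2-hour recording")
```

Even at the upper end of the lightweight range, a single procedure still occupies hundreds of gigabytes, which is the practical argument for considering strong compression for long-term archiving.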
The H.265/HEVC video coding standard [
In 2014, version 2 of the HEVC standard (HEVCv2) was approved, which specifies, among others, the so-called range extensions (RExt) addressed toward video production and contribution applications that require high-quality color formats such as 4:2:2, 4:4:4, and RGB, with bit depths of up to 12 bits per sample. HEVC RExt is suitable for medical applications as the extended chroma formats and high bit depths allow for better preservation of the original content.
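The cost of the extended chroma formats and bit depths in terms of raw data can be estimated from the number of samples stored per pixel. The following sketch uses the standard subsampling factors (1.5 samples per pixel for 4:2:0, 2 for 4:2:2, and 3 for 4:4:4 or RGB); the function names and the 60 frames per second used in the example are assumptions made for illustration and do not come from the text.

```python
# Average number of samples stored per pixel for each chroma format.
SAMPLES_PER_PIXEL = {
    "4:2:0": 1.5,  # chroma subsampled by 2 horizontally and vertically
    "4:2:2": 2.0,  # chroma subsampled by 2 horizontally only
    "4:4:4": 3.0,  # no chroma subsampling (also applies to RGB)
}


def bits_per_pixel(chroma_format: str, bit_depth: int) -> float:
    """Average number of bits needed to store one uncompressed pixel."""
    return SAMPLES_PER_PIXEL[chroma_format] * bit_depth


def raw_bitrate_bps(width: int, height: int, fps: int,
                    chroma_format: str, bit_depth: int) -> float:
    """Uncompressed video bitrate in bits per second."""
    return width * height * fps * bits_per_pixel(chroma_format, bit_depth)


if __name__ == "__main__":
    FPS = 60  # assumption: frame rate of the source
    for chroma, depth in [("4:2:0", 8), ("4:2:2", 10), ("4:4:4", 12)]:
        rate = raw_bitrate_bps(1920, 1080, FPS, chroma, depth)
        print(f"1080p{FPS} {chroma} {depth}-bit: {rate / 1e9:.2f} Gbit/s raw")
```

Moving from 8-bit 4:2:0 to 12-bit 4:4:4 triples the raw data per pixel, so the better preservation of the original content comes with a correspondingly higher load on the encoder and on storage.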
A new generation of video codecs is being developed to cope with the demands of emerging applications such as UHD TV in 4K and 8K resolutions, 360° video, and new quality formats such as high dynamic range (HDR), high frame rate (HFR), and wide color gamut (WCG).
These next-generation video codecs include AV1 and H.266. The AV1 format has recently been ratified by the Alliance for Open Media (AOM) and is designed for web apps such as video on demand (VoD) [
The Joint Video Exploration Team on Future Video coding (JVET) was founded in October 2015 by the International Telecommunications Union Telecommunication Standardization Sector (ITU-T) Video Coding Experts Group (VCEG) and International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Moving Picture Experts Group (MPEG) to analyze whether there is sufficient need for the development of a new codec with compression capabilities beyond HEVC. Test cases have been defined for various types of video, including HD, UHD, HDR, and 360° video, and investigations on advanced compression tools using a common software platform have been performed [
Currently, there are optimized HEVC/H.265 software encoders available, such as the open-source x265 encoder [
Real-time 4K and 8K hardware encoders are also already available on the market. Nvidia released a new generation of GPUs called Pascal that supports 8K hardware-accelerated HEVC encoding using NVENC technology. Advantech is commercializing the VEGA 3300 Series for 4K/8K HEVC encoding, decoding, and transcoding. NEC also launched its 4K video codec for ultra-low delay applications. Intel with Quick Sync and AMD with VCE technology are quite active in the area of hardware-accelerated HEVC processing as well. There are also companies and universities providing field programmable gate array (FPGA) accelerators for HEVC-based solutions. However, hardware encoders are much more expensive to create and are often much less feature-rich than software encoders.
Although strong compression delivers a substantial reduction of bandwidth used, this requires relatively high implementation complexity to achieve. In many usage scenarios, especially those closer to the video source, the quality is more important than the compression ratio. In these scenarios, the source videos are ideally stored in an uncompressed raw format. Nevertheless, due to the increase in resolution, frame rate, and pixel format quality, this is becoming more and more challenging from both a technical and economic point of view.
Lightweight compression, visually lossless compression, intermediate codecs, and mezzanine codecs are keywords that refer to codecs for these scenarios. The main characteristics of such codecs are a relatively small compression ratio, often somewhat configurable between 2× and 8× compression compared to the raw format; intra-only codec tools, to allow for uncomplicated integration in editing workflows; focus on low latency and low complexity real-time implementation; and focus on multigeneration performance, that is, the codecs are designed to limit quality degradation across multiple transcoding steps.
JPEG XS is a recent promising effort to create a standardized intermediate codec. A strong focus is placed on achieving visually lossless quality with low latency and low-complexity implementations. The standardization efforts also target the integration of file and transmission formats [
In our opinion, considering the large amounts of data that will be produced by the recording of medical procedures, only strong compression should be taken into account. This is why it is crucial that during the standardization of new strong compression codecs, the aspect of quality for medical purposes should be one of the main focus areas.
There are several publications presenting research results on storing compressed medical video sequences. However, it should be stressed that research on medical multimedia data compression is usually performed within narrow medical specializations such as oncology [
The most important among them is the digital imaging and communication in medicine (DICOM) standard [
Moreover, it is essential to note that the DICOM standard is continuously under development, and it is likely to include more lossy compression methods in the future. For example, with respect to the storage of long bronchoscopic video recordings, the DICOM standard is usually not used. However, the newest version of the standard introduces the first extensions toward storage of long medical records. For diagnostic purposes, the most popular format is AVI. The most commonly used video codec is the (almost lossless) MJPEG, which produces high-bitrate streams that are not suitable for network transmission.
By contrast, lossy codecs generally produce the desired low-bitrate video streams. Unfortunately, as mentioned above, lossy codecs are hardly ever used for compression of surgery multimedia data if the decompressed images are meant to support a diagnostic process. These codecs are usually designed with the assumption of introducing a significant loss if used for effective compression (the decompressed image may differ significantly from the original). In the case of using visibly degraded images for diagnostic purposes, there is a severe danger of an unacceptable influence on the diagnosis.
Usually, if a presentation of medical multimedia data does not have to support a diagnostic process, but is used for other purposes (eg, educational purposes), popular consumer (usually lossy) codecs are used. In the case of still images, the JPEG compression standard [
In many visual applications, the perceived quality of the moving image is less important than the ability of the system to perform, on the basis of these video sequences, the specific tasks for which it was created. Such sequences are called Target Recognition Video (TRV). Regardless of how the concept of TRV quality is understood, verifying it requires dedicated quality testing. The basic premise of these tests is to find TRV quality limits for which the task can be achieved with the desired probability or accuracy. Such tests are usually subjective psychophysical experiments with a group of subjects. Unfortunately, due to the complexity of the issue and the relatively poor understanding of human cognitive mechanisms, satisfactory results of TRV quality computer modeling have not yet been achieved beyond very limited application areas.
Given the use of TRV, qualitative tests do not focus on the subjects’ satisfaction with the video sequence quality but instead measure how the subject uses TRV to accomplish specific tasks. The purposes of this may include: video surveillance such as recognition of vehicle license plate numbers, telemedicine/remote diagnostics for a correct diagnosis, fire safety and fire detection, rear backup cameras such as for parking the car, and games such as spotting and correctly reacting to a virtual enemy.
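The idea of finding the TRV quality limit at which a task can still be achieved with the desired probability can be sketched as follows. This is a simplified illustration under stated assumptions, not a procedure described in this paper: quality is represented by the encoding bitrate alone, task success is assumed not to improve as the bitrate decreases, and the trial outcomes are hypothetical data invented for the example.

```python
from typing import Dict, List, Optional


def lowest_acceptable_bitrate(results: Dict[int, List[bool]],
                              target: float) -> Optional[int]:
    """Return the lowest bitrate (kbps) down to which every tested bitrate
    reaches the target task success rate, or None if none does.

    `results` maps each tested bitrate to the per-trial task outcomes
    (True if the subject completed the task correctly).
    """
    acceptable = None
    # Walk from the highest quality downward and stop at the first failure.
    for bitrate in sorted(results, reverse=True):
        outcomes = results[bitrate]
        success_rate = sum(outcomes) / len(outcomes)
        if success_rate < target:
            break
        acceptable = bitrate
    return acceptable


if __name__ == "__main__":
    # Hypothetical outcomes of 10 trials per bitrate (not real measurements).
    trials = {
        8000: [True] * 10,
        4000: [True] * 9 + [False],
        2000: [True] * 7 + [False] * 3,
        1000: [True] * 4 + [False] * 6,
    }
    print(lowest_acceptable_bitrate(trials, target=0.9))
```

A real study would additionally need confidence intervals on the success rates and a controlled selection of subjects and source sequences.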
Since the human factor has a significant influence on the subjective assessment of TRV quality, it is necessary to ask which procedures must be followed when performing it. In particular, questions arise concerning: the method of selecting the TRV source on which the test TRV (with degraded quality) is based; subjective testing methods and the general manner of conducting the psychophysical experiment; the process of selecting a subjects group in the psychophysical experiment, especially the identification of any prior task knowledge; training subjects before the start of the test; conditions in which the test will be carried out; and methods of statistical analysis and presentation of results [
Metrics for general quality of experience [
Besides the medical data itself, some information (ie, related to the video, patient, pathology, etc) should be recorded in parallel. These metadata are essential to identify the hospital, surgeon, patient, and all information relevant to the surgery. These additional data increase the size of the total data to be recorded or transmitted. Moreover, these data are subject to specific constraints, as they must be processed using standard algorithms. These are also often sensitive data and must be compliant with regulations such as the General Data Protection Regulation. Finally, the data come from different sources and suffer from a lack of integration, which in some cases makes it difficult for experts to access the data in their daily practice.
Despite all of these constraints, exploitation of these metadata offers several advantages and should be of interest for medicine at present and in the future. In particular, this would enable improvement of the everyday practice of experts during the identification of the disease, diagnosis, and follow up.
One way to circumvent the constraints mentioned above, especially the increase in data size and the lack of integration, could be to use data-hiding techniques to hide metadata in the medical video without increasing the total amount of data. In this way, the metadata would never become separated from the recording.
Data hiding can be used to hide different types of information such as text, signature, code, image, video, or audio. It is crucial to include the patient metadata relevant to the recorded procedure in the video, including preexamination results, disease history, and linking to the patient data in a national or hospital database. Such an approach would enable having the data that are most important for follow-up treatment always available with the video. Of course, as these are sensitive medical data, a second version of the recording with anonymized patient data must be prepared for research and educational purposes.
The data-hiding technique was initially conceived in the 1990s to fight against digital piracy and protect the intellectual property of identical copies [
For each application, there is always a relationship between the method of data hiding and the following three main constraints: the capacity of insertion (in bits), imperceptibility/invisibility, and robustness. The capacity is the amount of information that can be inserted or hidden in the media. Hiding information can distort video quality; it is therefore essential that the mark remains invisible so that the video stays as faithful as possible to the original. Data-hiding techniques are sensitive to various distortions that occur during transmission or coding. Robustness to some of these attacks is strongly dependent on the application constraints, while the capacity and invisibility are more related to the structure of the host video itself. Regardless, using a data-hiding technique inevitably leads to a compromise among capacity, invisibility, and robustness, taking into account the constraints of the application. As an example, in the medical domain, we must place the main effort on invisibility, especially for diagnostic applications.
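The trade-off among capacity, invisibility, and robustness can be made concrete with a deliberately simple scheme. The sketch below uses least significant bit (LSB) substitution on uncompressed 8-bit samples; it is a textbook stand-in chosen for brevity, not the data-hiding method proposed in this paper. The function names and the payload are hypothetical. LSB substitution offers a capacity of 1 bit per sample and changes each sample by at most 1 (high invisibility), but it has essentially no robustness, as any lossy re-encoding would destroy the hidden bits.

```python
def embed_lsb(samples: bytes, payload: bytes) -> bytearray:
    """Hide `payload` in the least significant bits of `samples` (1 bit per sample)."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("payload exceeds capacity of 1 bit per sample")
    marked = bytearray(samples)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & 0xFE) | bit  # overwrite only the lowest bit
    return marked


def extract_lsb(samples: bytes, n_bytes: int) -> bytes:
    """Recover `n_bytes` of hidden payload from the least significant bits."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (samples[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    frame = bytes(range(256)) * 4           # stand-in for raw 8-bit luma samples
    metadata = b"exam=HYPOTHETICAL-001"     # hypothetical metadata string
    marked = embed_lsb(frame, metadata)

    recovered = extract_lsb(marked, len(metadata))
    max_change = max(abs(a - b) for a, b in zip(frame, marked))
    print(recovered, "max sample change:", max_change)
```

Schemes intended for archived medical video would need to embed the data in a way that survives the chosen codec, which typically reduces capacity in exchange for robustness.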
Schematic depiction of a data-hiding procedure.
Our group is already providing services that can be treated as components of the envisioned platform or come close to it.
The medVC platform [
The deep RIVER [
Spin Digital Video Technologies GmbH [
Researchers of the AGH University of Science and Technology [
Poznań Supercomputing and Networking Center (PSNC) [
Recording of medical procedures and the requirement to treat these recordings as medical data will be a significant challenge for health care systems around the world. Exabytes of data will be created and will have to be stored for decades. Most hospitals will not be able to handle this from an infrastructural point of view. Therefore, there is a need to provide solutions on national or even international levels. Standardization efforts are needed to define procedures of medical video storage in certified data centers supporting national health care systems. This must be done together with the specification of an API to be implemented by medical recording devices that will automatically upload the recordings to the repositories. In parallel, the standardization of high compression codecs for medical purposes must progress, and the quality parameters of the codecs, required to ensure medical video quality, must be defined by law. Data hiding can provide an answer to the problems of trustworthiness and integrity of medical data. Thanks to this technique, it is possible to send and store more digital metadata by hiding it in the media. In addition, data hiding offers further advantages for medical applications by enriching the medical data with useful metadata that can help practitioners improve their diagnosis during the identification of the disease or enhance the level of training for future doctors.
To some extent, there are technologies available or emerging that can be used to build the desired medical video recording and storage platform. With respect to video encoding, there is HEVC [
AOM: Alliance for Open Media
API: application programming interface
AVC: advanced video coding
DICOM: digital imaging and communication in medicine
eHealth: electronic health
FPGA: field programmable gate array
HDR: high dynamic range
HEVC: high-efficiency video coding
HFR: high frame rate
ICT: information and communications technology
IEC: International Electrotechnical Commission
IMU: Interactive Medical University
ISO: International Organization for Standardization
ITU-T: International Telecommunications Union Telecommunication Standardization Sector
JVET: Joint Video Exploration Team on Future Video coding
MP: Main Profile
MPEG: Moving Picture Experts Group
OR: operating room
PSNC: Poznań Supercomputing and Networking Center
RExt: range extensions
TRV: target recognition video
VCEG: Video Coding Experts Group
VoD: video on demand
WCG: wide color gamut
None declared.