Driving the Dramatic Advancements of Digital Video
Digital video technology continues to achieve dramatic advances. Most of the high-quality digital video we watch every day, such as high-definition television broadcasts, Blu-ray Discs, and videos viewed on PCs and smartphones (via 1SEG and YouTube), is created using an international video coding standard called H.264/AVC.
Akira Nakagawa (associate fellow at Fujitsu Laboratories Ltd.) is a leading figure in the development of H.264/AVC technology, and he has contributed to the global spread of high-definition video. In 2016, in recognition of his contributions, he was awarded Japan's Medal of Honor with Purple Ribbon. We talked with Nakagawa about how to spark innovation that meets the world's expectations.
The Video Encoding Technology That Made High-Quality Video a Part of Our Daily Lives
"Video data is generally very large, and left as it is, computers simply cannot handle it. For example, high-definition video captures about two million pixels 30 times every second. If this data were recorded on a DVD without compression, you could only record about 30 seconds of video. For this reason, you must compress the data, filtering out excess information in a way that the human eye cannot detect. This is what we call video encoding technology.
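The figures above can be checked with a back-of-the-envelope calculation. The resolution, bit depth, and disc capacity below are assumptions for illustration (1920x1080 pixels, 24 bits per pixel, a 4.7 GB single-layer DVD), not values stated in the interview:

```python
# Rough estimate of the uncompressed HD data rate vs. DVD capacity.
# Assumed values: 1920x1080 resolution, 3 bytes/pixel, 4.7 GB DVD.

PIXELS_PER_FRAME = 1920 * 1080        # ~2 million pixels per frame
BYTES_PER_PIXEL  = 3                  # 24-bit color, an assumption
FRAMES_PER_SEC   = 30                 # frames captured each second
DVD_BYTES        = 4.7e9              # single-layer DVD capacity

bytes_per_second = PIXELS_PER_FRAME * BYTES_PER_PIXEL * FRAMES_PER_SEC
seconds_on_dvd   = DVD_BYTES / bytes_per_second

print(f"Uncompressed rate: {bytes_per_second / 1e6:.0f} MB/s")
print(f"A DVD holds about {seconds_on_dvd:.0f} seconds of raw HD video")
```

Under these assumptions the raw rate is close to 190 MB per second, and a DVD fills up in well under a minute, which is consistent with the "only 30 seconds" figure in the interview.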
"Practical use of video encoding technology began in the 1980s with business video conferencing. In the years that followed, advancements in video encoding algorithms, semiconductors, and recording and transmission technologies helped increase image resolution, expanding the technology's applications to broadcasting and video recording."
After H.264/AVC's International Standardization, Companies Worldwide Began Raising the Stakes
"In 2001, a joint technical committee of the ITU-T and the ISO/IEC began a major effort to standardize H.264/AVC, an international video compression standard for high-definition video such as HDTV.
"In international standardization efforts, engineers from around the world gather to document specifications for the optimal procedure of each coding process. More than 100 participants attended these meetings, and each party made proposals based on its latest research results. The participants then held discussions and decided which technologies to adopt. Because this standard combined many efficient but complicated technologies, the specification document ended up exceeding 300 pages.
"Actually, I participated in international standardization for the first time during work on the previous standard, H.263, where I proposed a technology to improve subjective quality in video conferencing. However, I had to overcome great difficulties in getting the other members to understand the usefulness of my proposal so that it would be adopted. Learning from this experience, during the standardization of H.264/AVC I prepared several approaches focused on persuasive proposals and methods with clear results. This turned out to be well worth the effort: it helped my technological proposals get accepted, enabling me to contribute to improving the standard's functionality as well as to acquire essential patents."
Was Practical Application Too Difficult Despite the Market's Expectations? Dubbed a "Mammoth"
"In Japan, terrestrial digital broadcasting started just after this standard's formulation. This led to heightened expectations in the market and society for high-definition technology. H.264/AVC, which can compress video to the same image quality as the previous MPEG-2 standard at half the data volume, drew very high expectations for practical use. However, to achieve this, we needed to satisfy all the requirements in the standardization document, which exceeded 300 pages. As a result of implementing a vast number of functionalities to improve performance, the computational complexity of video encoding was 10 times that of MPEG-2. Due to this extreme complexity and the sheer volume of data processing, we were told that it would be too difficult to implement in any practical way. An article published in Nikkei Electronics at the time dubbed H.264/AVC the 'mammoth codec.'"
A Method for Encoding Beautiful Video at Low Data Volumes
"To make H.264/AVC suitable for practical use, we had to bring the high computational complexity under control. It became crucial to develop a low-complexity encoding method that drew out H.264/AVC's performance with the smallest possible data volume.
"In addition, the standard only specifies the decoding method (how to turn compressed data back into video data). As for the encoding method, which is how the video data is converted into compressed data in the first place, each company was left to develop its own. Even when companies used their encoding technologies to compress the same videos to the same data volume, there were significant differences in image quality. Thus, a competition emerged among companies to see who could produce the best image quality with the smallest data volume."
Subjective Image Quality: High-Quality Encoding of the Parts Where People Focus
"To overcome this challenge, we made use of special traits of human perception. If your eyes focus on the worst part of a particular scene, you will perceive the entire image to be of poor quality. We utilized this characteristic of human subjectivity to apply high-quality encoding to the parts that people focus on the most. The other parts are encoded with some information omitted in a way that is not very noticeable.
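The idea of spending bits where viewers look can be sketched as region-dependent quantization. This is a toy illustration of the general principle, not Fujitsu's actual algorithm: focus regions get a fine quantization step, peripheral regions a coarse one, and the reconstruction error shows the trade-off.

```python
# Toy sketch of subjective bit allocation via quantization step size.
# A finer step preserves detail (more bits); a coarser step discards
# detail (fewer bits) in regions the viewer is unlikely to notice.

import numpy as np

def quantize_block(block, qstep):
    """Uniform quantization: larger qstep discards more detail."""
    return np.round(block / qstep) * qstep

rng = np.random.default_rng(0)
frame = rng.uniform(0, 255, size=(8, 8))      # stand-in for pixel data

focus_q   = quantize_block(frame, qstep=4)    # fine step: focus region
periph_q  = quantize_block(frame, qstep=32)   # coarse step: periphery

err_focus  = np.abs(frame - focus_q).mean()   # small reconstruction error
err_periph = np.abs(frame - periph_q).mean()  # larger, but less noticed
print(err_focus, err_periph)
```

The focus region stays much closer to the original, while the coarsely quantized periphery trades visible fidelity for a smaller data volume.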
"Video compression basically works by sending incremental differences from the previous image. Because of this, elements that change over time, such as water, fire, movement in a forest, and falling flower petals, are particularly difficult to compress while retaining high image quality. We thought about how to handle difficult videos like these, looking for ways to allow quality degradation in the manner most acceptable to the human eye so that viewers could still watch the video at an overall high level of quality.
"Even after poring over books during our research, we could not find a way to measure perceptual subjectivity with equipment, so the only solution was to evaluate everything ourselves, with our own eyes. We thought about what kinds of visual distortion would bother us the most, and we came up with ways to control quality through strategic allotment of data volume. Judging everything with our own eyes became a process of accumulating insights from unexpected things that caught our attention."
"Nobody had heard of cloud technology at that time, but by using Fujitsu's grid computing technology, which linked multiple computers to perform parallel computations, we could efficiently run simulations totaling tens of thousands of hours in our search for ways to evaluate and improve video quality. As a result, in just one year we were able to develop a proprietary algorithm that achieved high image quality while maintaining low computational complexity, and we started developing an H.264/AVC product. However, the most difficult part still lay ahead."
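A present-day analogue of that parallel search, sketched with Python's standard multiprocessing module rather than the original grid middleware. The `simulate` function and its quantizer-noise model are hypothetical stand-ins for one encoding simulation run over a candidate parameter set:

```python
# Farming out encoding simulations in parallel, then picking the
# parameter set with the best (lowest-distortion) result. The
# distortion model here is a simple uniform-quantizer noise estimate,
# purely for illustration.

from multiprocessing import Pool

def simulate(qstep):
    """Hypothetical stand-in for one encoding simulation:
    returns (parameter, estimated distortion)."""
    distortion = qstep ** 2 / 12.0   # uniform-quantizer noise variance
    return qstep, distortion

if __name__ == "__main__":
    candidates = [2, 4, 8, 16, 32]          # parameter sets to try
    with Pool() as pool:                    # one worker per CPU core
        results = pool.map(simulate, candidates)
    best = min(results, key=lambda r: r[1])
    print("best qstep:", best[0])
```

Each worker evaluates one candidate independently, so the wall-clock time of a large parameter sweep shrinks roughly with the number of machines or cores, which is the effect the grid approach exploited.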
After celebrating the successful development of the encoding algorithm, the work of product development came next. In Part 2 of this interview, Nakagawa talks about the journey of developing an H.264/AVC product as well as his vision for the next generation.
- Akira Nakagawa
Fujitsu Laboratories Ltd.
- Akira Nakagawa graduated from the Department of Electrical and Electronic Engineering of the University of Tokyo in 1989, and he completed his master's course at the same university in 1991.
He joined Fujitsu Laboratories that same year. Since then, he has been involved in basic research related to video encoding technology, international standardization, and LSI/equipment development.
A Doctor of Engineering, he was awarded Japan's Medal of Honor with Purple Ribbon in spring 2016.