Massive Data Processing Is Overloading Deep Learning
The spread of the IoT, which connects various objects to the Internet, is causing a rapid increase in the number of networked devices, including home appliances, houses, and cars. Japan's Ministry of Internal Affairs and Communications predicts that the number of IoT devices will double from 15.4 billion to 30.4 billion between 2015 and 2020. (*)
Massive amounts of data produced by these devices, or "big data," contain various forms of data such as numbers, text, images, and sounds, from which value can be generated if the underlying patterns, rules, and insights can be discovered. That is why Deep Learning is attracting attention: it is a machine learning method that is expected to help create new businesses and services.
Training a Deep Learning model requires massive computation on the input training data. The more IoT devices are deployed, the more data is collected, and the larger the training set becomes. Higher-performance training servers are therefore needed, and enormous numbers of parameters are needed to reach near-human recognition accuracy. (Figure 1)
Available power capacity limits the maximum performance of any hardware, including training servers, so performance cannot simply be enhanced by scaling out; a technology for higher energy efficiency is needed. Common approaches include reducing the data bit width to 16 or 8 bits and using integer arithmetic instead of floating-point arithmetic. However, these approaches have problems: insufficient operational precision can cause training of a neural network to fail or degrade its recognition capability.
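To see why naive bit-width reduction can hurt precision, the sketch below uniformly quantizes float32 values onto a signed 8-bit integer range. The scale/zero-point scheme here is a generic textbook assumption for illustration, not Fujitsu's circuit design.

```python
# Sketch: naive uniform quantization of float32 values to 8-bit integers,
# illustrating the precision loss that can hamper neural network training.
import numpy as np

def quantize_int8(x):
    """Map float values onto the signed 8-bit range [-127, 127]."""
    scale = np.max(np.abs(x)) / 127.0
    q = np.round(x / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

weights = np.array([0.50, -0.25, 0.0031, 1.20], dtype=np.float32)
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Large values survive well, but small values (e.g. 0.0031, typical of
# gradients late in training) round down to zero and are lost entirely.
```

With only 255 representable levels spread over the full dynamic range, values far smaller than the maximum cannot be distinguished from zero, which is exactly the failure mode the text describes.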
Smaller Bit Width Reduces Power Consumption During Neural Network Training
Fujitsu Laboratories Ltd. has addressed these problems by developing a circuit technology that reduces the data bit width to improve energy efficiency while maintaining sufficient operational precision during Deep Learning.
A Deep Learning processor core that uses this new circuit technology analyzes data in real time during training and stores the results as statistical information in a database. This statistical information is used to create an optimized configuration for neural network training, minimizing the loss of operational precision and thereby solving the problem caused by bit-width reduction.
This circuit technology improves energy efficiency in two ways: by replacing floating-point arithmetic with integer arithmetic, and by reducing the data bit width from 32 bits to 8 bits.
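As a back-of-the-envelope check on the bit-width aspect alone: if power consumption scaled roughly linearly with the width of the data moved and processed, going from 32 bits to 8 bits would cut it to a quarter, consistent with the 75% reduction reported in this release.

```python
# Rough estimate assuming power scales linearly with data bit width.
full_bits = 32
reduced_bits = 8
saving = 1 - reduced_bits / full_bits
print(f"{saving:.0%} reduction")
```

This is only an order-of-magnitude sanity check; the actual saving also depends on the integer-versus-floating-point arithmetic and the circuit implementation.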
Recognition Rate Maintained with 75% Less Power Consumption, Widening the Scope of Deep Learning Applications
Fujitsu Laboratories Ltd. evaluated the new circuit technology using LeNet, a network commonly used for handwriting recognition, with MNIST, a handwriting-recognition training data set. The recognition rates were similar regardless of bit width: 98.90% at 32 bits, 98.89% at 16 bits, and 98.31% at 8 bits.
Improved energy efficiency can enhance the throughput of training servers at the same power consumption, or reduce power consumption at the same throughput. In addition, it allows neural network training to be performed not only on cloud servers but also on edge servers in private networks; when training runs on edge servers installed close to where data is generated, such as in factories, power consumption can be reduced by 75%. This contributes to expanding the application scope of AI technologies.
Fujitsu Laboratories Ltd. is currently evaluating 16-bit and 8-bit data widths and will proceed to smaller bit widths for even higher energy efficiency. We aim to make this technology commercially available in fiscal 2018 as part of Human Centric AI Zinrai, our systematized AI technologies, and will continue working on more advanced uses of AI.