About this project
In this project, we are studying processors for image recognition using deep learning.
Deep learning has developed rapidly thanks to drastic improvements in computer performance, the availability of large-scale data brought about by the spread of the Web and Internet-connected devices, and the results of many lines of research, and it now attracts attention in a wide range of fields. In image recognition in particular it boasts overwhelming performance: since deep learning entered ILSVRC, the competition for recognition accuracy on general object images, in 2012, accuracy has improved dramatically, and in 2015 it finally surpassed human recognition performance. Deep learning is expected to be applied to many more fields in the future and to continue producing results.
Although deep learning achieves overwhelming accuracy compared to other methods, obtaining that accuracy requires enormous training time, even on today's high-speed computers. For example, FaceNet, a convolutional neural network for human face recognition reported by Google Inc., required 1000 to 2000 hours of training to reach its very high accuracy of 99.63%. In the future, deep learning is expected to be applied to many more problems using even larger networks, and spending ever more training time on each problem is not realistic. This project therefore aims to drastically reduce training time by accelerating learning.
In this project, with "parallelization" as the keyword, we study algorithms that speed up deep learning on a new dataflow architecture using distributed memory. In addition, by realizing high-speed learning on a dedicated processor, we aim to achieve high throughput and high power efficiency by reducing the time granularity of control.
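As a rough illustration of the data-parallel idea behind this approach, the sketch below partitions a mini-batch across workers that each hold their own shard in local (distributed) memory, computes partial gradients independently, and then combines them into a single update. This is a minimal software sketch under illustrative assumptions (a 1-D linear model, round-robin sharding, sequential simulation of the workers), not the project's actual processor design.

```python
# Minimal sketch of data-parallel gradient descent: data is sharded
# across workers with local memory, each worker computes a partial
# gradient, and the partial results are combined (an "all-reduce").
# The model, data, and worker count are illustrative assumptions.

def local_gradient(w, samples):
    """Gradient of squared error for a 1-D linear model y = w*x,
    summed over this worker's local shard of the data."""
    g = 0.0
    for x, y in samples:
        g += 2.0 * (w * x - y) * x
    return g

def parallel_sgd_step(w, data, n_workers, lr):
    # Partition the mini-batch across workers (round-robin sharding).
    shards = [data[i::n_workers] for i in range(n_workers)]
    # Each worker computes a gradient over local memory only; in a
    # dataflow processor these would run concurrently.
    partial = [local_gradient(w, shard) for shard in shards]
    # Combine the partial gradients and apply one update.
    grad = sum(partial) / len(data)
    return w - lr * grad

# Example: fit y = 3x from a few samples using two workers.
data = [(x, 3.0 * x) for x in [1.0, 2.0, 3.0, 4.0]]
w = 0.0
for _ in range(100):
    w = parallel_sgd_step(w, data, n_workers=2, lr=0.05)
print(round(w, 3))  # converges to 3.0
```

Because the combined gradient equals the gradient over the whole mini-batch, the result is identical to sequential training; the speedup comes from the workers computing their shards in parallel.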