"Graph Structure" Used Frequently in Big Data Analysis
Recent advances in big data technology are remarkable, and various fields continue to accumulate big data. Do you know how such vast amounts of data are analyzed?
Big data contains a large amount of graph-structured data that expresses relationships between people and things. Graph-structured data is not uniform; it varies in size and cardinality. For example, the US currently has a huge graph of about 1.1 billion nodes* connected via the Internet, representing mobile phones, tablets, laptops, gaming consoles, and TVs.
To date, Fujitsu Laboratories has developed a wide variety of information retrieval and data analysis technologies that handle graph-structured data, including LOD platform technology.**
* Nodes, or vertices, in graph theory. They represent individual entities, each of which is related to others within the structure.
** Linked Open Data: a set of technologies for publishing collections of data in computer-friendly form so that they can easily be associated with one another, while allowing secondary use by anyone.
Automatically Extracting Features from Graph Data with Unique Technology Surpassing the Limits of Conventional Deep Learning
Fujitsu Laboratories has developed a brand-new technology that enables highly accurate learning of graph-structured data. It is Fujitsu's proprietary deep learning technology that can be applied to graph-structured data, extending beyond the scope of existing deep learning technology, which has reached a high level of performance in image and voice recognition.
This technology analyzes graph structures using cutting-edge mathematics and transforms them into a uniform expression called a tensor, which makes it applicable to a variety of everyday problems. A tensor is a mathematical expression of data that represents multidimensional arrays, a generalization of the concepts of vectors and matrices (Fig. 1). Graph-structured data is not uniform and varies in size and cardinality. Once it is transformed into this uniform tensor expression, however, the deep-learning-based technology can perform highly accurate machine learning on it.
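To make the idea of a uniform expression concrete, the following is a minimal sketch of one simple way to encode graphs of different sizes as tensors of the same shape. It is not Fujitsu's proprietary method, which the article does not describe in detail. The function name `graph_to_tensor`, the padding size `max_nodes`, and the edge labels are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical edge type: (source node, target node, edge label).
Edge = Tuple[str, str, str]


def graph_to_tensor(edges: List[Edge], max_nodes: int, labels: List[str]):
    """Encode a labeled directed graph as a 3-D array of 0/1 values.

    Shape: max_nodes x max_nodes x len(labels). Entry [i][j][k] is 1 when
    an edge with label k runs from node i to node j. Graphs with fewer
    than max_nodes nodes are zero-padded, so every graph gets the same shape.
    """
    nodes = sorted({n for src, dst, _ in edges for n in (src, dst)})
    if len(nodes) > max_nodes:
        raise ValueError("graph has more nodes than max_nodes")
    node_index = {name: i for i, name in enumerate(nodes)}
    label_index = {label: k for k, label in enumerate(labels)}

    tensor = [[[0] * len(labels) for _ in range(max_nodes)]
              for _ in range(max_nodes)]
    for src, dst, label in edges:
        tensor[node_index[src]][node_index[dst]][label_index[label]] = 1
    return tensor


def shape(tensor):
    """Return the (rows, columns, depth) of a nested-list tensor."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))


LABELS = ["calls", "messages"]  # illustrative edge labels
small_graph = [("A", "B", "calls")]
large_graph = [("A", "B", "calls"), ("B", "C", "calls"), ("C", "A", "messages")]

t_small = graph_to_tensor(small_graph, max_nodes=4, labels=LABELS)
t_large = graph_to_tensor(large_graph, max_nodes=4, labels=LABELS)
print(shape(t_small), shape(t_large))
```

Both graphs end up with the same shape even though they differ in node and edge counts. Note that this naive encoding depends on the order in which nodes are indexed, which is one reason practical approaches need something more sophisticated than plain padding.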
In a trial in which this technology was applied to virtual screening, which uses computers to explore candidate chemical compounds for drugs, the technology learned the relationship between molecular structure and chemical activity from hundreds of thousands of chemical compounds, about 100 times as many as previous technology could handle. As a result, we achieved 80% accuracy in predicting active compounds (a 10% improvement compared to existing technology). In addition, Fujitsu Laboratories conducted a trial to detect malicious access or attacks from graph-structured data that represents the communication relationships between hosts, and successfully reduced false alarms by more than 20% compared to existing methods.
Application to Other Fields Including Pharmaceuticals, Network Monitoring, and Finance
We expect that our deep learning technology for graph data will have positive effects, such as predicting drug efficacy or side effects while reducing drug development time and cost. It is also expected to dramatically reduce the labor required to monitor networks. In addition, the technology is expected to benefit various other fields, for example by enabling highly accurate detection of improper financial manipulation and by supporting sophisticated judgements of suitability for lending.
Whether the use of AI technology will expand further in the future depends on how graph-structured data is handled. Fujitsu Laboratories will continue to improve the accuracy of its categorization technology for graph-structured data, aiming to bring it into practical implementation as a core technology of Human Centric AI Zinrai within fiscal 2017. In addition, Fujitsu Laboratories will continue to expand the range of applications for deep learning technology to enable advanced data analysis in a variety of fields.