Fujitsu Labs With Technology Automatically Offering Analysis Scenarios for Big Data
By reusing scenarios developed by data analysts
This is a Press Release edited by StorageNewsletter.com on September 4, 2012 at 2:48 pm
Fujitsu Laboratories Limited announced the development of a technology that uses analysis scenario templates developed by data analysts to recommend applicable templates, and additional data that could be combined with them, based on the content and attributes of the data to be analyzed.
In recent years, anticipation has been mounting for applications in which big data is analyzed using machine learning, mining and other technologies, and then these analysis results are leveraged as part of the business and management decision-making processes of companies. In line with this trend, there has been rapid progress in the development of platforms, technologies and tools for collecting, storing and analyzing data. At the same time, driving the utilization of big data requires personnel that are knowledgeable about statistical- and mining-related analysis, while also possessing business knowledge about specific vertical industries and work processes. Training and securing this kind of personnel has become a challenge industry-wide.
With Fujitsu Laboratories’ newly developed analysis template recommendation feature, users can save and reuse analysis templates containing analysis scenarios developed by data analysts. These analysis scenarios provide guidelines about what kinds of data can be used in combination and the best ways to interpret and apply analysis results. Because the technology can recommend applicable analysis templates and helpful additional data based on the content and attributes of the data, users can leverage saved analysis templates to perform analysis and forecasting, even without advanced knowledge or know-how.
This technology is scheduled to be rolled out as part of Fujitsu Limited’s Interstage Business Analytics Modeling Server, a middleware package for building analytics solutions.
Background
In recent years, there has been increasing anticipation for big data applications that apply machine learning, mining and other analytics technologies to large volumes of data and then incorporate these results into the business and management decision-making processes of companies. Rapid progress has been made in the development of platforms, technologies and tools for the high-frequency and real-time collection, storage, and analysis of a variety of large-volume, unstructured data, such as social media and sensor data, in addition to already existing business data. Going forward, determining how to leverage big data in business will be a key to growing one’s business and gaining competitive advantage.
Technological Issues
Analysis scenarios, in addition to platforms, technologies and tools for data analysis, are crucial for gaining valuable business insight from big data. These scenarios provide guidelines about what kinds of data can be used in combination, how data for analysis should be pre-processed, what kinds of technologies and tools should be applied, and the best ways to interpret and take advantage of analysis results. To construct these kinds of analysis scenarios, it is essential to have a team of experts who possess business knowledge about specific vertical industries and work processes, as well as knowledge about statistical and mining-related analysis. Training and securing personnel to drive the use of big data has become a challenge industry-wide.
Newly Developed Technology
Fujitsu Laboratories has developed an analysis template recommendation feature that makes it possible to reuse the advanced knowledge and know-how of data analysts by designing and saving analysis processes in the form of templates. Based on the content and attributes of the data the user wants to analyze, the technology can automatically make recommendations about applicable analysis templates and additional data that might be helpful. This, in turn, makes it possible for users to leverage saved analysis templates to easily perform analyses, even without having advanced knowledge or know-how.
Reusing analysis scenarios via analysis template recommendations
Features of the newly developed technology
1. Data model-based analysis template management
With the technology, analysis processes (analysis processing procedures) are designed in the form of analysis templates that combine different analysis components containing pre-processing and analysis processes. These analysis templates represent combinations of analysis components and analysis component parameters that are based on specialized analyst know-how, such as what kinds of data can be used in combination, how data should be pre-processed, what kinds of technologies and tools should be applied, and the best ways to interpret and take advantage of analysis results. When creating a template with the new technology, vertical industry/work process categories, analysis objectives, and other metadata, as well as the content and attributes of the data to be analyzed, are coded to match standard data models, thereby enabling the recommendation of analysis templates.
Data model-based analysis template management
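As a rough illustration of the template structure described above, the following Python sketch models a template as a list of components with parameters, tagged with metadata and with the data-model items it requires. All class, field and template names here are hypothetical and chosen for illustration; the announcement does not describe Fujitsu Laboratories' actual data structures.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisComponent:
    """One pre-processing or analysis step with its tuned parameters."""
    name: str
    parameters: dict = field(default_factory=dict)


@dataclass
class AnalysisTemplate:
    """A saved analysis scenario, tagged with metadata and data-model items."""
    name: str
    industry: str              # vertical industry / work process category
    objective: str             # analysis objective
    required_items: frozenset  # standard data-model items the template needs
    components: list = field(default_factory=list)


def find_by_metadata(templates, industry, objective):
    """Return templates whose metadata matches the given category and objective."""
    return [t for t in templates
            if t.industry == industry and t.objective == objective]


# Hypothetical template capturing an analyst's choices of steps and parameters.
churn = AnalysisTemplate(
    name="customer_churn_forecast",
    industry="retail",
    objective="churn_prediction",
    required_items=frozenset({"customer_id", "purchase_date", "purchase_amount"}),
    components=[
        AnalysisComponent("fill_missing", {"strategy": "median"}),
        AnalysisComponent("logistic_regression", {"regularization": 1.0}),
    ],
)

matches = find_by_metadata([churn], "retail", "churn_prediction")
print([t.name for t in matches])
```

Storing the required data-model items alongside the metadata is what would allow a template to be looked up later from the data itself rather than only from its category.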
2. Data profiling-based analysis template recommendations
When starting a new analysis, it is possible to recommend analysis templates using metadata such as vertical industry/work process categories and analysis objectives. This approach, however, does not work for analysis problems that span multiple business areas or complex processes, and it cannot support cases in which the analysis objectives are not fully defined at the start of the analysis. To handle these situations, which until now have depended on the skill of analysts, Fujitsu Laboratories’ technology takes a new approach that recommends analysis templates based on the content and attributes of the data being analyzed.
By specifying the data to be analyzed and using data profiling technology, Fujitsu Laboratories’ new technology can extract and assess the data’s content (the meaning of data items) and attributes (properties such as the volume of the data and its distribution). By matching the profiled content and attributes against the data models associated with analysis templates, it is possible to make recommendations about which analysis templates are applicable to the data being analyzed.
Data profiling-based analysis template recommendations
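The profiling and matching steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the synonym table, the template catalog and the `min_rows` attribute are all hypothetical, and real data profiling would infer the meaning of data items from values and distributions rather than from column names alone.

```python
# Hypothetical mapping from raw column names to standard data-model items.
SYNONYMS = {
    "cust_id": "customer_id",
    "customer_id": "customer_id",
    "order_dt": "purchase_date",
    "amount": "purchase_amount",
    "temp_c": "temperature",
}

# Hypothetical catalog: required data-model items and a minimum row count.
TEMPLATES = {
    "customer_churn_forecast": {
        "required_items": {"customer_id", "purchase_date", "purchase_amount"},
        "min_rows": 3,
    },
    "weather_sales_correlation": {
        "required_items": {"purchase_date", "purchase_amount", "temperature"},
        "min_rows": 3,
    },
}


def profile(rows):
    """Extract content (recognized data-model items) and attributes (row count)."""
    columns = set().union(*(row.keys() for row in rows))
    items = {SYNONYMS[c] for c in columns if c in SYNONYMS}
    return {"items": items, "row_count": len(rows)}


def recommend(profiled, templates=TEMPLATES):
    """Return names of templates whose requirements the profiled data satisfies."""
    return sorted(
        name for name, spec in templates.items()
        if spec["required_items"] <= profiled["items"]
        and profiled["row_count"] >= spec["min_rows"]
    )


rows = [
    {"cust_id": 1, "order_dt": "2012-08-01", "amount": 120.0},
    {"cust_id": 2, "order_dt": "2012-08-02", "amount": 75.5},
    {"cust_id": 1, "order_dt": "2012-08-15", "amount": 42.0},
]
recommended = recommend(profile(rows))
print(recommended)
```

Here only the churn template is recommended, because the sample data contains no item that maps to `temperature`.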
3. Suggestions about possible usage combinations for additional data
When utilizing big data, it is a common practice to perform multifaceted analyses on combinations of diverse, heterogeneous data in order to discover and analyze hidden causal relationships (influences) that are undetectable from a single data set. Until now, discovering these kinds of data combinations has required specialized analyst know-how along with trial-and-error.
With the new technology, during the recommendation process for analysis templates, the system searches not only for analysis templates that apply to the specified data set, but also for analysis templates that would become applicable if additional data beyond the specified data set were supplied. By making suggestions about this kind of additional data, it is possible to discover a diverse range of data combination patterns.
Suggestions about additional data
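One simple way to picture the additional-data search is as a "near miss" lookup: for each template the current data cannot yet satisfy, report which data-model items are missing. The sketch below assumes a hypothetical catalog and a hypothetical `max_missing` threshold; the announcement does not say how the actual system ranks or limits its suggestions.

```python
def suggest_additional_data(available_items, templates, max_missing=1):
    """Map each near-miss template to the data-model items it still needs."""
    suggestions = {}
    for name, required in templates.items():
        missing = required - available_items
        # Skip templates that already apply or that need too much extra data.
        if 0 < len(missing) <= max_missing:
            suggestions[name] = sorted(missing)
    return suggestions


# Hypothetical catalog of templates and the data-model items each requires.
templates = {
    "customer_churn_forecast": {"customer_id", "purchase_date", "purchase_amount"},
    "weather_sales_correlation": {"purchase_date", "purchase_amount", "temperature"},
    "sensor_defect_detection": {"sensor_id", "reading", "timestamp"},
}

available = {"customer_id", "purchase_date", "purchase_amount"}
suggestions = suggest_additional_data(available, templates)
print(suggestions)
```

With sales data alone, the sketch suggests adding temperature data to unlock the weather correlation template, while the sensor template is ignored because it shares nothing with the available data.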
Results
The new technology enables users to reuse and revitalize the specialized know-how of data analysts. As long as data is available for analysis, data profiling-based recommendations make it possible to implement a cyclical analysis flow in which analysis templates are discovered from and applied to data sets, and then suggested additional data is added and the analysis templates are reapplied. This, in turn, will enable the creation of new analysis applications and uses for such applications, while also expanding the scope of analysis objectives and usable data.
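The cyclical flow described above can be sketched as a loop that alternates between finding applicable templates and adding data that would unlock further ones. The function names, the template catalog and the `data_catalog` of obtainable items are assumptions made for illustration, not part of the announced system.

```python
def applicable(items, templates):
    """Names of templates whose required items are all available."""
    return sorted(name for name, req in templates.items() if req <= items)


def cyclical_analysis(initial_items, templates, data_catalog, max_rounds=5):
    """Alternate between applying templates and adding suggested data."""
    items = set(initial_items)
    history = []
    for _ in range(max_rounds):
        history.append(applicable(items, templates))
        # Collect obtainable items that would make another template applicable.
        needed = set()
        for required in templates.values():
            missing = required - items
            if missing and missing <= set(data_catalog):
                needed |= missing
        if not needed:
            break  # nothing further can be unlocked
        items |= needed
    return history


templates = {
    "customer_churn_forecast": {"customer_id", "purchase_date", "purchase_amount"},
    "weather_sales_correlation": {"purchase_date", "purchase_amount", "temperature"},
    "sensor_defect_detection": {"sensor_id", "reading", "timestamp"},
}

history = cyclical_analysis(
    initial_items={"customer_id", "purchase_date", "purchase_amount"},
    templates=templates,
    data_catalog={"temperature"},
)
print(history)
```

Each entry in `history` lists the templates that applied in one round, so the growth of the list shows how added data widens the set of usable analyses.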
Future Developments
By employing an analysis system that implements the new technology, Fujitsu Laboratories is currently developing analysis components and analysis templates that are applicable to a range of work processes, including customer management in the logistics and manufacturing sectors, marketing, product recommendations, quality assurance, and risk management. This has enabled the company to execute analysis scenarios based on a host of proprietary technologies that Fujitsu Laboratories has developed to date, including technology for detecting signs of defects in markets and risk scenario analysis technology. Going forward, the company will apply the technology to real-world problems while expanding the range of vertical industries and work tasks that it supports.
The technology is scheduled to be rolled out as part of Fujitsu Limited’s Interstage Business Analytics Modeling Server, a middleware package for building analytics solutions.