Minesoft has always prided itself on facilitating and accelerating the rates of innovation achieved by our clients, but we take special pleasure in being able to assist more directly with research projects. Here Alessandro Comai of the University of Japan has produced a research paper on patent and scientific data analytics, making use of a case example to demonstrate how vast quantities of information can be mined from scientific literature, news articles and (here’s where PatBase comes in!) patent data.
The article covers the sources of the data sets, the queries and methods used to identify the body of text to be analysed, and the normalisation methods used to ensure data from the range of sources could be assessed as one set as well as separately.
Comai explains the issues surrounding the manipulation of large data volumes to extract insights, and reinforces throughout that user needs and questions should be clearly defined before valuable information can be gained. When working with data models of this size, meaningful interactions can be lost in the noise unless appropriate filters are put in place to narrow the display to only the items relevant to a viewer’s current needs or role.
Helpfully, the paper includes a table explaining the types of filters used in the case example, and how pairs of filters can be combined to draw different information from the data. For example, showing only the largest nodes from the earlier time periods within the set will reveal the important stakeholders and topics at the early stages of the development of a technology area. On the other hand, combining the smaller node sizes with the more recent time entries will show emerging players within the space, or emergent evolutions of the technology which may pose a threat or an opportunity to established stakeholders.
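To make the idea of pairing filters concrete, here is a minimal Python sketch. It is not taken from the paper or from PatBase: the `Node` fields, the `filter_nodes` helper, the organisation names and the thresholds are all hypothetical, and "size" simply stands in for whatever measure the map uses to scale its nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    size: int        # assumed measure, e.g. number of documents for this node
    first_year: int  # assumed: year the node first appears in the data set


def filter_nodes(nodes, *, min_size=None, max_size=None,
                 from_year=None, to_year=None):
    """Keep nodes that satisfy every bound supplied (None means no bound)."""
    kept = []
    for n in nodes:
        if min_size is not None and n.size < min_size:
            continue
        if max_size is not None and n.size > max_size:
            continue
        if from_year is not None and n.first_year < from_year:
            continue
        if to_year is not None and n.first_year > to_year:
            continue
        kept.append(n)
    return kept


# Invented sample data, for illustration only.
nodes = [
    Node("Org A", 120, 2001),
    Node("Org B", 95, 2003),
    Node("Org C", 8, 2019),
    Node("Org D", 110, 2018),
    Node("Org E", 5, 2002),
]

# Large nodes + early period: established stakeholders at the start.
early_leaders = filter_nodes(nodes, min_size=90, to_year=2005)

# Small nodes + recent period: emerging players.
emerging = filter_nodes(nodes, max_size=10, from_year=2015)

print([n.name for n in early_leaders])  # ['Org A', 'Org B']
print([n.name for n in emerging])       # ['Org C']
```

Each filter on its own would return a noisier list; it is the combination of a size bound with a time bound that isolates the group of interest.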
The challenges of working with and normalising large datasets to produce useful visualisations are a daily struggle for many information professionals, with difficulties compounding as the size of the dataset grows. To overcome these challenges, Minesoft has developed the PatBase Analytics V2 module. The primary goal when developing the module was to cut away as much of the time-consuming data cleaning as possible and give access to easy-to-use visualisations and charts drawn from highly complex datasets. PatBase Analytics V2 is fully integrated in PatBase, so searching, analysing and visualising can all take place in a single location.
The PatBase Analytics V2 module contains citation network maps similar to those in the paper, built using the same methods and offering the same easy-to-use filters to restrict the display by node or edge size. The controls are embedded in the interface as slider bars and value entry boxes, and can be used to track the citation networks within a particular technology area or centred on your own or a competitor’s portfolio.
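As a rough illustration of what restricting a citation network by node or edge size involves, the following Python sketch filters a small weighted edge list. It makes no claim about how PatBase Analytics V2 works internally: the `filter_network` function, the company names, the weights, and the choice to measure node size as the total weight of citations touching a node are all assumptions made for the example.

```python
from collections import defaultdict

# Invented (citing, cited, weight) triples; weight = number of citations.
citations = [
    ("Acme", "Borealis", 12),
    ("Acme", "Cobalt", 2),
    ("Borealis", "Cobalt", 7),
    ("Delta", "Acme", 5),
    ("Delta", "Echo", 1),
]


def filter_network(edges, *, min_edge_weight=0, min_node_size=0, centre=None):
    """Return the edges that pass the edge, node and (optional) centre filters."""
    # Assumed node size: total weight of all citations touching the node.
    size = defaultdict(int)
    for src, dst, weight in edges:
        size[src] += weight
        size[dst] += weight

    kept = []
    for src, dst, weight in edges:
        if weight < min_edge_weight:
            continue  # edge too thin
        if size[src] < min_node_size or size[dst] < min_node_size:
            continue  # one endpoint too small
        if centre is not None and centre not in (src, dst):
            continue  # not part of the chosen portfolio's network
        kept.append((src, dst, weight))
    return kept


# Strong links only, centred on one (hypothetical) portfolio.
print(filter_network(citations, min_edge_weight=5, centre="Acme"))
# [('Acme', 'Borealis', 12), ('Delta', 'Acme', 5)]
```

In an interactive tool, the slider bars and value boxes would play the role of the `min_edge_weight` and `min_node_size` arguments here.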
For more information about PatBase Analytics V2, or to request a free trial, click here.
The original article can be found here; all credit to Alessandro Comai and the University of Japan.