Google Scholar Citation Visualisation

Discussion


Once the final design had been implemented, we evaluated the strengths and weaknesses of our visualisation against three criteria:

  1. Whether it accurately visualises the given data
  2. Whether it effectively conveys the chosen dimensions of the given data (cited by, cited, earlier/recent, category)
  3. Whether the interface is interactive and usable

The interactive features of our visualisation are strengths because they follow Ben Shneiderman’s Visual Information-Seeking Mantra, which sets out interface design principles that help users answer visual queries.

Other strengths stem from design decisions based on the visual thinking principles described by Edward Tufte.


Strengths


The strengths of the visualisation fall into four main categories: Overview, Zoom, Filter and Details on Demand. Each category corresponds to a form of interactivity offered by the visualisation.


Overview:

Relating to the first principle of Ben Shneiderman’s Visual Information-Seeking Mantra, the global map overview allows simple visual queries to be answered, such as “Which region cited the paper the most/least?” In the example we implemented in our visualisation, it is clearly visible that most of the citations come from the same subcategory and that the article has fewer references than citations.


Zoom:

Zooming into a specific area of the map lets users see exactly where citations have come from (which institution). It also helps with the occlusion that may occur in the overview of a region: as the user zooms in, points disperse to their exact locations, allowing individual markers to be identified.


Filter:

Using Google Maps to implement our visualisation allows earlier/recent citations and references to be filtered. Removing points from the map lets users easily find the information they are looking for (e.g. only recent citations) and helps answer the visual queries they may have.
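The filtering behaviour amounts to a predicate over each marker's attributes. The following is a minimal sketch in Python, independent of the Google Maps API. The `Marker` record, its field names and the `visible_markers` helper are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Marker:
    """One point on the map (hypothetical record; field names are assumptions)."""
    institution: str
    kind: str      # "citation" (a paper citing ours) or "reference" (a paper we cite)
    recent: bool   # True for the recent group, False for the earlier group


def visible_markers(markers: Iterable[Marker],
                    kind: Optional[str] = None,
                    recent: Optional[bool] = None) -> List[Marker]:
    """Keep only markers that match the active filters; None means 'no filter'."""
    return [
        m for m in markers
        if (kind is None or m.kind == kind)
        and (recent is None or m.recent == recent)
    ]


# Placeholder data for illustration only.
markers = [
    Marker("University A", "citation", True),
    Marker("University B", "citation", False),
    Marker("University C", "reference", False),
]
recent_citations = visible_markers(markers, kind="citation", recent=True)
```

In a map-based implementation, the markers left out of the returned list would be hidden rather than deleted, so that clearing a filter restores them.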


Details on Demand:

Detailed information about authors is shown when a marker on the map is clicked. Because author information appears only when needed, the map remains clutter-free and more general visual queries (e.g. comparison of cited papers vs cited-by papers) can still be made.



Design Decisions


To support fast visual queries, map markers differ on two channels (shape and colour), which makes it easy to filter out irrelevant information when performing a visual query.

The final design uses different shades of colour instead of transparency to show older citations. This was an intentional design decision: we found that transparent symbols tended to fade and blend into the background. Two different shades of colour are therefore used to distinguish earlier from recent references and citations. The chosen shades are different enough to be easily told apart while still allowing the eye to group them together, as they are from the same colour range.
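The marker encoding can be sketched as a small lookup that derives two shades from one hue. This is an illustration only: which attribute each channel encodes, the hue and lightness values, and the names `MarkerStyle` and `marker_style` are all assumptions, not the values used in the project.

```python
import colorsys
from typing import NamedTuple


class MarkerStyle(NamedTuple):
    shape: str
    colour: str  # hex string of the form "#rrggbb"


# Assumed encoding: shape and hue follow the kind of marker,
# lightness follows recency (recent = darker, earlier = lighter).
HUE = {"citation": 0.60, "reference": 0.05}
SHAPE = {"citation": "circle", "reference": "square"}
LIGHTNESS = {True: 0.35, False: 0.70}
SATURATION = 0.80


def marker_style(kind: str, recent: bool) -> MarkerStyle:
    """Return the shape and an opaque colour for a marker."""
    r, g, b = colorsys.hls_to_rgb(HUE[kind], LIGHTNESS[recent], SATURATION)
    colour = "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))
    return MarkerStyle(SHAPE[kind], colour)


earlier_citation = marker_style("citation", recent=False)
recent_citation = marker_style("citation", recent=True)
```

Varying only the lightness keeps both shades within one hue, so the two groups stay visually related, and every marker remains fully opaque.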


Weaknesses


One of the major weaknesses is occlusion, both in busy areas (the east coast of America, for example) and at individual universities. We do not let a user quickly see how many times the paper was cited at Harvard, for example; to find out, they would need to click on the relevant (glowing) pin to see the data.

Another weakness concerns how the data was collected. Because Google Scholar does not have an API and automatically scraping results is against Google's Terms of Service, the data collection process was far more manual than the average computer scientist would deem acceptable. One issue with collecting data manually in this way is that it is sometimes hard to pinpoint the location of a paper, especially for foreign-language papers and unknown authors. As a result, it is probable that some of the location data points to an academic's current institution rather than where the academic was at the time of writing.


Conclusions


The visualisation benefited a lot from the initial presentation, as many of the suggestions provided were incorporated into our final design. One suggestion which we struggled to incorporate was to show the number of citations at each university. We decided not to include this, as it would simply add noise to the busy areas.

The final design is able to answer the previously mentioned visual queries: it displays the necessary information clearly and concisely and allows users to interact with it in useful ways.


Future work


The visualisation is still a prototype and its functionality has not been fully implemented. Possible future work includes implementing more advanced features (e.g. a dynamic sidebar with author abstract information, linking authors to their Google Scholar profiles, and showing multiple papers at a single institution as nodes clustered around a point) and pulling the data in automatically instead of having to collect it manually.
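One way to show several papers at a single institution as nodes clustered around a point is to place the markers evenly on a small circle around the institution's coordinates. The sketch below assumes this layout; the function name `spread_around`, the default radius and the flat-earth approximation are illustrative choices rather than part of the prototype.

```python
import math
from typing import List, Tuple


def spread_around(lat: float, lng: float, count: int,
                  radius_m: float = 50.0) -> List[Tuple[float, float]]:
    """Place `count` markers evenly on a circle of `radius_m` metres around a point.

    Uses a local flat-earth approximation, which is adequate for small radii
    away from the poles.
    """
    if count <= 1:
        return [(lat, lng)] * count
    metres_per_degree = 111_320.0  # approximate length of one degree of latitude
    d_lat = radius_m / metres_per_degree
    # A degree of longitude gets shorter as latitude increases.
    d_lng = d_lat / math.cos(math.radians(lat))
    positions = []
    for i in range(count):
        angle = 2 * math.pi * i / count
        positions.append((lat + d_lat * math.sin(angle),
                          lng + d_lng * math.cos(angle)))
    return positions


# Arbitrary coordinates, for illustration only.
nodes = spread_around(42.0, -71.0, count=4)
```

The radius would probably need to scale with the zoom level so that the nodes stay separable on screen without drifting far from the institution.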

Automatic integration would rely on a change to Google's Terms of Service or the creation of an API. Some time could also be spent exploring what summary data would be useful, possibly by continent, country or state/province.
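Summary data at any of these levels could be produced by counting records per region. The following is a minimal sketch, assuming each citation record carries one field per geographic level; the field names, the `summarise` helper and the sample rows are invented for illustration and are not project data.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def summarise(records: Iterable[Mapping[str, str]], level: str) -> Dict[str, int]:
    """Count records per region at the given level (e.g. "continent" or "country")."""
    return dict(Counter(record[level] for record in records))


# Invented rows, for illustration only.
records = [
    {"continent": "Europe", "country": "Germany"},
    {"continent": "Europe", "country": "France"},
    {"continent": "North America", "country": "Canada"},
]
by_continent = summarise(records, "continent")
by_country = summarise(records, "country")
```

Comparing the counts at each level on real data would be one way to judge which granularity is most useful before committing to a display.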

This application would be interesting for both students and academics. Students could use it to get more in-depth and contextualised information about their research, and academics could see where their work is being cited and by whom. The visualisation could also be generalised to other domains; a fairly uncreative example would be showing participation at international conferences.


Ideas from Second Presentation

Future work could include being able to compare two papers at the same time by having side-by-side visualisations. This would be useful for the user to see how closely one paper relates to another or to see how much one paper may have been influenced by the paper it is citing.

The pictures of the authors help the user put a face to the name. The value of this could be expanded, as it makes recurring authors easier to recognise.