In week 2, I focused on refining the visualizations I made in week 1 to better understand one of the three large datasets we have in the project so far. Thanks to the visualizations, I now have some sense of the information-seeking behavior of people who use institutional repositories (IR) to search for and download information: the devices they use, how device use differs by geolocation, when they search, and the factors that affect their clicks and clickthroughs.
To improve the aesthetics of the visualizations, I paid attention to color contrast, graphic resolution, color ramps, color transparency, shapes, and the scales of the x and y axes. To enhance readability, I tried not to present too much information in one visual, following Miller’s law (“The Magical Number Seven, Plus or Minus Two”), so that people will not feel overwhelmed when looking at a visual and processing its information.
Besides visualizing what struck me as interesting in the first dataset, I also tried to wrangle the other datasets. Nikolaus managed to harvest the metadata relevant to each URL, which means we can look into the metadata content related to each search. However, it also creates a challenge for me: how to turn unstructured string data into structured data. This is not something I often do, but I am excited to brush up on my skills in working with text data in the coming weeks.
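
To make the string-to-structure challenge concrete, here is a minimal sketch of one possible first step. It assumes, purely for illustration, that a harvested metadata record arrives as a single string of "key: value" pairs separated by semicolons. The real harvested format may well differ, and the function name `parse_metadata` and the sample record are made up for this example, not taken from the project data.

```python
from collections import defaultdict


def parse_metadata(raw):
    """Turn one 'key: value; key: value' string into a dict of lists.

    Assumption: fields are separated by ';' and each field has the form
    'key: value'. Repeated keys (e.g. several subjects) are collected in a list.
    """
    record = defaultdict(list)
    for chunk in raw.split(";"):
        # Split on the first colon only, so values may contain colons.
        key, sep, value = chunk.partition(":")
        if not sep:
            # No colon found: skip fragments that are not key/value pairs.
            continue
        key = key.strip().lower()
        value = value.strip()
        if key and value:
            record[key].append(value)
    return dict(record)


# Made-up sample record, for illustration only.
sample = (
    "Title: Wetland Birds of Ohio; Creator: Doe, Jane; "
    "Subject: ecology; Subject: wetlands; Date: 2015"
)
parsed = parse_metadata(sample)
print(parsed["subject"])
```

Storing every value in a list keeps repeated fields without special cases, and a list of such dictionaries can later be loaded into a table for analysis. Values that themselves contain semicolons would break this simple split and would need a more careful parser.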