LEADS Blog

Nikolaus Parulian, Week 1: Exploratory Data Analysis – What we can do to understand the data?

LEADS site: Repository Analytics & Metrics Portal

 
 
After getting some ideas about data science, data analytics, and data visualization in the boot camp (Sonia already posted an excellent review of what we learned at the boot camp), I started working on the Repository Analytics and Metrics Portal (RAMP) dataset provided by my mentors.
RAMP is a web service that improves the accuracy of institutional repository (IR) analytics.
It provides a persistent and accurate count of file downloads from IRs, which opens up a lot of potential for aggregating and comparing IR metrics across the organizations that join the project.
 
The first thing I did with the dataset was to understand it through exploratory data analysis. The RAMP dataset I am working on is derived from the Google Search Console and contains page_clicks, URL, average_positions, and impressions, merged with additional data that RAMP provides. I visualized and aggregated most of the categorical columns in the dataset and computed the correlation between each pair of numerical columns. I also computed summary statistics to check whether the dataset contains outliers.
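
The three steps above (aggregating a categorical column, correlating numerical columns, and checking for outliers) can be sketched roughly as follows. This is only a minimal illustration using the Python standard library, not the actual analysis code: the rows are toy values invented for the example, the `device` column and the helper names `category_counts`, `pearson`, and `iqr_outliers` are hypothetical, and the 1.5 × IQR rule is just one common way to flag outliers.

```python
from collections import Counter
from math import sqrt
from statistics import mean, quantiles

# Toy rows invented for illustration; they are NOT real RAMP data.
# "device" is a hypothetical categorical column; the other names follow the post.
rows = [
    {"url": "/item/1", "device": "DESKTOP", "page_clicks": 3, "impressions": 40, "average_positions": 8.0},
    {"url": "/item/2", "device": "MOBILE", "page_clicks": 1, "impressions": 15, "average_positions": 12.5},
    {"url": "/item/3", "device": "DESKTOP", "page_clicks": 5, "impressions": 60, "average_positions": 4.0},
    {"url": "/item/4", "device": "TABLET", "page_clicks": 2, "impressions": 30, "average_positions": 9.5},
    {"url": "/item/5", "device": "DESKTOP", "page_clicks": 4, "impressions": 55, "average_positions": 5.5},
    {"url": "/item/6", "device": "MOBILE", "page_clicks": 90, "impressions": 70, "average_positions": 3.0},
]


def category_counts(rows, column):
    """Count how many rows fall into each value of a categorical column."""
    return Counter(row[column] for row in rows)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


clicks = [row["page_clicks"] for row in rows]
impressions = [row["impressions"] for row in rows]

print(category_counts(rows, "device"))
print(pearson(clicks, impressions))
print(iqr_outliers(clicks))
```

On a dataset of realistic size, a dataframe library would be the more natural tool for the same steps; the plain-Python version is used here only to keep the sketch dependency-free.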
 
In the end, I found some interesting results through the visualization and correlation analysis, and we will discuss the findings at our meeting in the second week.
 
Overall, the RAMP project is pretty exciting and has a lot of potential. I am excited to keep working on it.
 
 
Nikolaus Parulian

1 thought on “Nikolaus Parulian, Week 1: Exploratory Data Analysis – What we can do to understand the data?”

  1. Wow, it looks like you really hit the ground running on this project! I’m really excited to see where this research goes over the summer. Looking forward to seeing your visualizations!
