To recap what I was trying to do: I wanted to find a species in Taiwan that is also mentioned in the Proceedings of the Academy of Natural Sciences.
We chose Taiwan as our geographic point of interest because its sovereignty has been historically complex, which should make it an interesting example of shifting geopolitical realities.
This whole week I have been refining the use cases for the example we gathered from the Biodiversity Heritage Library: a land snail species, “Pupinella swinhoei sec. H. Adams 1866”.
The idea is to bring the Natural History Museum (NHM) literature closer to real-life biodiversity occurrence datasets. I then gathered a dataset from GBIF by searching on the scientific name “Pupinella swinhoei”. The aggregated GBIF dataset contains 50 occurrence records from 18 institutions (18 datasets), dated from 1700 to the present.
(Different colors indicate different data sources.)
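For readers who want to try a similar search, here is a minimal Python sketch of how such a query and summary could look. It is not the exact script I used. It assumes the public GBIF occurrence search endpoint with its `scientificName`, `limit` and `offset` parameters, and it assumes each returned record carries `datasetKey` and `year` fields. The helper names (`build_query`, `fetch_occurrences`, `summarize`) are made up for this illustration.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the public GBIF occurrence search API.
GBIF_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def build_query(name, limit=100, offset=0):
    """Build the search URL for one page of results (hypothetical helper)."""
    params = {"scientificName": name, "limit": limit, "offset": offset}
    return GBIF_SEARCH + "?" + urlencode(params)


def fetch_occurrences(name):
    """Fetch a single page of occurrence records (needs network access)."""
    with urlopen(build_query(name)) as resp:
        return json.load(resp).get("results", [])


def summarize(records):
    """Count records and datasets, and find the span of collection years."""
    per_dataset = Counter(r.get("datasetKey") for r in records)
    years = [r["year"] for r in records if r.get("year") is not None]
    return {
        "n_records": len(records),
        "n_datasets": len(per_dataset),
        "year_range": (min(years), max(years)) if years else None,
    }
```

Calling `summarize(fetch_occurrences("Pupinella swinhoei"))` would give the record count, the number of distinct datasets, and the year range for one page of results; larger result sets would need paging with `offset`.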
Although the ‘countryCode’ field mostly indicates that the records are from TW (Taiwan), that code may not reflect the sovereignty at the time each record was collected. To merge these datasets with the sovereignty of the time, I first examined two of the 18 data sources: the MCZ dataset and the NSSM dataset.
In 1700, Taiwan was a county within Qing Dynasty China.
In the 1930s, Taiwan was a colony of Japan.
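To make the mismatch concrete, the two periods above can be written as a small year-based lookup. This is only a simplified stand-in for illustration, not the logic-based taxonomy alignment approach itself. The function name `sovereignty_at`, the labels, and the year boundaries (Qing rule from 1683, Japanese rule from 1895 to 1945) are my own simplifications at year granularity.

```python
# Simplified periods for Taiwan; boundaries are at year granularity and
# are an assumption made for this illustration only.
PERIODS = [
    (1683, 1895, "Qing Dynasty China"),
    (1895, 1945, "Empire of Japan"),
]


def sovereignty_at(country_code, year):
    """Return the historical sovereignty label for a TW record, or None.

    None means the record is not coded TW, has no year, or falls outside
    the two periods covered by this sketch.
    """
    if country_code != "TW" or year is None:
        return None
    for start, end, label in PERIODS:
        if start <= year < end:
            return label
    return None


# Two records with the same countryCode but different sovereignty.
records = [
    {"countryCode": "TW", "year": 1700},
    {"countryCode": "TW", "year": 1935},
]
labels = [sovereignty_at(r["countryCode"], r["year"]) for r in records]
```

Here `labels` holds one label per record, so two records that share the code TW end up with different sovereignty values. A flat table like this cannot express overlapping or disputed claims, which is part of why a more expressive alignment approach is attractive.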
I have some preliminary results from merging the sovereignty fields of these two datasets using the logic-based taxonomy alignment approach. However, since I am preparing a conference submission based on this use case, I don’t want to jinx anything! (Fingers crossed.)
If I am allowed to share more about the paper, I promise to discuss more in the next blog post!