With the first GIS map of Pompeii now available online, we are turning more of our attention to the problem of connecting our spatial data to our bibliographic data. While there is still some important spatial work to be done with the current map, the planning and documentation for the bibliographic integration serves as a worthwhile distraction. To that end and following a discussion last week with Alexander Stepanov, the PBMP’s GIS architect, I’ve decided to write up some very quick documentation for our data and their connections as a blog post. I’ve also decided to try something else new. Below is a Google Slide with the designs and discussions we drew on a whiteboard as the background. Over this are shapes representing files we need to link together with their names hyperlinked to their locations on the web (as hosted sites or Dropbox objects). In this way, the blog post operates in three different dimensions:
- As a public discussion
- As a living, internal document
- As an interface to the repository of files we’re using.
The files listed are as follows:
A single file of spatial data to start, the Properties by Eschebach (Prop_ESCH), representing all the buildings and occupied spaces in the city. Later this will expand to include other, more generalized features of the landscape, such as the City Blocks, Gates, and Fortification Walls.
Three files from the Nova Bibliotheca Pompeiana are given here:
- The first 10,000 citations (GYG Citations_BIBLIOGRAPHY) completed from the NBP as they were prepared for uploading to Zotero (and then to Omeka). This shows how the data were divided and might be recombined.
- A list of property addresses from the Spatial Index of the NBP (GYG Citations_INDEX). This is a one-to-many relationship: each property address is listed with the one or more citations that relate to it.
- A list of addresses per citation as extracted from the full text of the first two volumes of the NBP (GYG Citations_TEXT). This is the same kind of one-to-many relationship in the other direction: each bibliographic citation, as given by Garcia y Garcia, is listed with the one or more addresses that relate to it.
Naturally, there will be significant overlap between #2 and #3 (the INDEX and TEXT files), which will reduce the total number of connections but also offer a chance to perform quality-control tests on the data as extracted from the NBP.
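To make that overlap check concrete, here is a minimal sketch in Python. It assumes both files can be flattened into (address, citation) pairs; the addresses, citation numbers, and variable names are invented for illustration and do not reflect the actual layout of our files.

```python
# Hypothetical sample data: the addresses and citation numbers are made up,
# and the real INDEX and TEXT files may be laid out differently.
index_rows = {            # address -> citations (direction of GYG Citations_INDEX)
    "I.1.1": [101, 102],
    "II.2.2": [102],
}
text_rows = {             # citation -> addresses (direction of GYG Citations_TEXT)
    101: ["I.1.1"],
    102: ["II.2.2", "III.3.3"],
}

# Flatten both one-to-many tables into the same shape: (address, citation) pairs.
index_pairs = {(addr, cit) for addr, cits in index_rows.items() for cit in cits}
text_pairs = {(addr, cit) for cit, addrs in text_rows.items() for addr in addrs}

confirmed = index_pairs & text_pairs    # found in both sources
only_index = index_pairs - text_pairs   # in the Spatial Index only: check by hand
only_text = text_pairs - index_pairs    # in the full text only: check by hand

print(sorted(confirmed))
print(sorted(only_index))
print(sorted(only_text))
```

Pairs that appear in both sources corroborate each other, while pairs found in only one source are the natural candidates for proofing.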
If we think of this as merely a spatial data problem, the work to be done is non-trivial, but also not conceptually difficult. That is, if all we wanted to do was to connect the bibliographic data to the map so that users could click on it and access that information, the process would be straightforward: combine and proof tables #2 and #3 (the INDEX and TEXT files), then join them to the spatial data of Properties by Eschebach. Indeed, that *is* our primary goal, but we also want those bibliographic citations to be linked to their full references on our other platforms (i.e., Zotero and Omeka). Moreover, we want users to be able to use search functions in the map – beyond navigating and clicking – to both find and leverage bibliographic information. For example, we want people to be able to search for an author in the map and have the sites and buildings associated with that author appear highlighted. The user should also then be able to create a new search from this subset of data, using either additional bibliographic criteria or spatial definitions. To make these functions possible, however, the data stored in the map cannot be only reference numbers linked out to other resources; the map needs to hold searchable bibliographic fields, such as the author, itself. Finally, we would eventually like searches in our bibliography to be passed to, and reflected in, the map, so that the results of regular bibliographic searches might be visualized in the map as well as in the listing of citations.
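As a rough illustration of why the map needs more than reference numbers, the sketch below joins citation records onto property addresses and then runs an author search, followed by a second search restricted to the first result. Everything here is hypothetical: the field names, the sample values, and the `properties_for_author` helper are ours for the example, not part of any GIS platform or of the final design.

```python
from collections import defaultdict

# Hypothetical sample data; field names and values are placeholders.
citations = {
    101: {"author": "Author A", "title": "Title 1"},
    102: {"author": "Author B", "title": "Title 2"},
    103: {"author": "Author A", "title": "Title 3"},
}
# Proofed (address, citation) pairs, i.e. the combined INDEX and TEXT tables.
links = [("I.1.1", 101), ("I.1.2", 102), ("II.2.2", 102), ("II.2.2", 103)]
# Addresses present in the spatial layer (a stand-in for Prop_ESCH).
property_addresses = {"I.1.1", "I.1.2", "II.2.2", "II.2.3"}

# Join: store the citation fields with each property, not just the ID,
# so that the map itself can be searched by author.
by_address = defaultdict(list)
for address, cit_id in links:
    if address in property_addresses and cit_id in citations:
        by_address[address].append({"id": cit_id, **citations[cit_id]})

def properties_for_author(author, within=None):
    """Return addresses with at least one citation by `author`.

    `within` optionally limits the search to an earlier result set.
    """
    candidates = by_address.keys() if within is None else within
    return {
        addr for addr in candidates
        if any(c["author"] == author for c in by_address.get(addr, []))
    }

highlighted = properties_for_author("Author A")
refined = properties_for_author("Author B", within=highlighted)
print(sorted(highlighted))
print(sorted(refined))
```

The citation IDs are kept alongside the copied fields, so each record could still link out to its full reference in Zotero or Omeka.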
As you can see from the image, we’ve got an outline of how we’ll do this. Nonetheless, if you are a GIS architect, a digital collections librarian, a data designer, or an all-around smart person and have an opinion on how this might be done, in whole or in part, please do email me: Pompeiana[AT]gmail.com
– EP