The PBMP’s NEH White Paper

Delinquent, unfortunately, but now finalized is the PBMP’s last requirement from the National Endowment for the Humanities Digital Humanities Start-Up Grant, a white paper that will be published on the NEH website. As it was meant to be, writing this report was both difficult and rewarding. It was difficult because so much happened during the grant term that it was hard to remember it all and even harder to put what happened in the right context, or even in the right order. It was also difficult because it is hard to shine an honest light on both one’s successes and failures. For these very same reasons, of course, the process was deeply rewarding. At the same time that I had to admit where I failed, I was also required to point out and to try to quantify our successes. The latter greatly outweighed the former. Having the support of both the American Council of Learned Societies’ Digital Innovation Fellowship and the NEH DH Start-Up Grant, the PBMP genuinely was able to exceed the grant’s stated goal “to support the planning stages of digital projects that promise to benefit the humanities.” More than planning, we have built viable digital products being used by thousands of people in more than 120 countries around the world. Thanks to the NEH, the ACLS, and especially the dedicated team at the University of Massachusetts Amherst, we have significantly moved the study of Pompeii forward into the 21st century, and set the stage for still greater benefits to be realized.

Below is the text of our white paper, which I hope will document, inform, and inspire. But that’s asking a lot of a white paper.

NEHWhitePaper_PBMP

– EP

Recently I wrote a rather long guest blog post for OpenPompei, an Italian group interested in Pompeii, open data, and community engagement, about the PBMP. The post turned into a manifesto about what I’d like to do with the PBMP and for digital archaeology at Pompeii more generally. Naturally, it belongs here as well (reblogged):

Wishing for Data, Working for Data: a manifesto for an Open Pompeii


Post by Eric Poehler

An Open Pompeii, for me, has been a dream since 1998, since the first time I ever thought about the archaeology of the ancient city. That dream is recalled almost every day: each time I need to find an obscure book, make a map, access an archive, or visit a building on the site. The dream became a calling in 2007 when I discovered that a colleague and I had each, unaware of the other, been spending hundreds of hours digitizing the landscape of Pompeii to run our analyses. How much better might those hundreds of hours have been spent in the context of our research? How much more might we understand? That experience was the origin of the Pompeii Bibliography and Mapping Project and the beginning of a recognition for me that we are losing information and losing opportunity by failing to cooperate and failing to organize. I was therefore elated to learn of OpenPompei and the SCRIPTORIVM (and its impressive video announcement), not only because it represented a step towards wider collaboration, but also because it came from within the Italian community. I’d long wished for a like-minded Italian community, and now it seems that wish is coming true. Thus, it seems appropriate to share a few other wishes for Pompeii. What follows is a ‘wish list’ of projects that I’ve been interested to see started, pushed forward, and in some cases finished.

“Spatializing the city”: Attaching data to places and architectures.

The important explosions of research on Pompeii over the last 250 years have, in one way, been exactly that: intense fragmentations of a unified urban environment into categories of study (archaeology, art history, classics, epigraphy, history, etc.), the instruments of scholarly communication (articles, lithographs, manuscripts, and now 3D models), and in many cases literal separation from the city (into museums, private collections, and the pockets of visitors). It is time to bring the data back home. There are innumerable opportunities to put the representations of frescoes, mosaics, inscriptions, and objects back into their natural spatial environments and we are fortunate to have at our disposal remarkable works of aggregation to accomplish this: Pompei. Pitture e Mosaici, Corpus Inscriptionum Latinarum, Nova Bibliotheca Pompeiana, Studio sulle provenienze degli oggetti rinvenuti negli scavi borbonici del Regno di Napoli, Pompeii in Pictures, Fortuna Visiva, among others. Yet we seem to find ourselves in the paradox of standing at once on the shoulders of giants and in their shadows. We have thus far been unable or unwilling to take on both the genius and the failings of these scholars and projects in order to do something more. Happily, this is beginning to change.

The Ancient Graffiti Project, led by Rebecca Benefiel and Sara Sprenkle, has accepted the challenge to put the epigraphic record back in order, allowing users to search for graffiti by content, class, and location. Imagine (as at least Paavo Castren and Henrik Mouritsen must have done) what questions we might ask about the epigraphic landscape when the physical landscape is a companion rather than an obstacle. I wish this project all the greatest success.

We need to do this same work for the frescoes and mosaics. Recently Domenico Esposito has shown the great value of not only studying a specific wall painting style, but also considering its spread across the city. Surely geographically locating all the information about wall and floor decorations contained in Pompei: Pitture e Mosaici (PPM) – styles, subject matter, materials, etc. – will create a whole new universe of questions for art historians and others. For those who are interested in this topic, you’ll be happy to know that through the exceptional efforts of some great students (esp. Tess Brickley, Sarah Chen, and Ethan Liu) we have scanned the entire PPM and created a beta CAD file of almost every room in every building in Pompeii, each individually named with the room label given in the PPM. The CAD file is available here, but for copyright reasons, sections of the PPM scans are available by request.

Similarly, recent books and articles by Miko Flohr, Steven Ellis, Pia Kastenmeier, and Nicholas Monteix (among others) have all brought the economic questions of production, consumption, and retail activities down to the level of the individual room. These questions too are ready to be explored further and expanded upon by incorporating their data into a spatial frame. Much of this work has focused on chronological change as well as identification and description, which brings us to an even greater challenge to the spatial representation of Pompeian data: objects and stratigraphy. Like the difficulties the Ancient Graffiti Project faces in precisely relocating individual inscriptions, the definition of find spots from the early excavation reports can be a significant challenge. It is one, however, we should no longer avoid for its difficulty. With the example of Pim Allison’s work on the Casa del Menandro and the publication of the Bourbon era finds by Pagano and Prisciandaro, we are in a better place to consider the value of the finds record both at the level of the individual room or building and at the scale of the entire city. Once again with the help of fantastic students (Pompeii 492a, Spring 2015) I have been able to digitize and transform Pagano and Prisciandaro’s table of finds of the early excavations and they are now ready for the careful, hard work of attaching them to the urban landscape. The finds records from modern excavations – and more importantly, their stratigraphic descriptions – have even richer and denser evidence to offer. Many research projects have GIS and digital recording procedures incorporated into their fieldwork practices and I urge them to share those data as soon as they are able, perhaps even as part of their initial reports, but hopefully not long after final publication.

The Pompeii Quadriporticus Project will soon be doing just that: sharing in multiple formats the outputs of our research for scholars and the lay public alike. We’ve already begun sharing our images. For scholars, we anticipate soon sharing tabular and descriptive data accompanied by drawings and matrices of our interpretations, further supported by 3D renderings and GPR results. For the public, and especially for visitors to Pompeii, we are planning to create online a series of nested histories of the Quadriporticus – growing in detail and complexity as the user desires – that are geolocated so that, like targeted advertisements, people can read about the past on their own device without large placards detracting from the experience of the past. There are countless areas of the city that can benefit from the same kind of geolocated information.

Equally, there are countless documents that are waiting in archives and libraries that need to be online and available for use. Archives like those of Halstead van der Poel in the Getty (which draws on Warscher documents), or those collections of papers of former superintendents recently discussed in the Rivista di Studi Pompeiani hold a unique set of information and a unique perspective on the great excavations and research agendas of the 20th century. To that end, perhaps it is time to consider an oral history project of the many important Pompeian scholars – previous superintendents and directors of the site, heads of foreign research projects, independent researchers, etc. – whose work asked the questions that we in the 21st century are now trying to answer. For the 18th and 19th centuries, collections of artworks and maps can offer a great deal of evidence about the early excavations. Projects such as the excellent Fortuna Visiva are already in place, but the systematic and comprehensive collection and open sharing of these illustrative documents remains a desideratum. In exploring the question of early maps, once again with my students, we created a Zotero site of the maps known in the Corpus Topographicum Pompeianum and attempted to locate and, in some cases, digitize these maps. Combining a complete list of maps of the excavations with the CAD file of room-level spaces, it will be possible to create a map that shows the 250-year process, year-by-year and in some cases day-by-day, of Pompeii’s disinterment. The pace of excavation, the recovery of objects, the locations of work, all can be normalized and made comparable by giving them spatial properties in a GIS.

For the PBMP, there’s a short wish list of features and capabilities that I’m hopeful to realize in the future. The first is the creation of a means to discover, receive, and ingest new citations into the bibliography. This is a problem with many obvious open access and community-based solutions in addition to the need for the technical expertise to implement it. A second desired feature is a natural language processing procedure to parse the many full-text objects attached (or in the process of being attached) to our bibliography in order to find all – and all the meaningful – locations in Pompeii mentioned in the text. What is meaningfully discussed in a text and what is merely mentioned, and which does the user need, are issues to resolve. Next on the list is a flexible and intuitive design for a versioning archive of the spatial data with the purpose to serve not only the different versions of the PBMP base data, but also to make available the many different interpretations of the site that are represented by different shapes of space. That shops are attached to the shape of a house, or not, has important ramifications for how we interpret the entire ancient city from a number of perspectives: when attached one assumes economic dependence if not direct ownership of the shop by the house; when separate, one presents a landscape of far greater independence in property ownership and the assumption of an economic class to own those shops. The shape of space matters and it is crucial not to hard code historically meaningful assumptions into digital representations without making those assumptions explicit AND without offering at least the opportunity to choose a different set of data with a different set of assumptions.
Finally, I’m wanting and working toward a connection from the bibliography to the GIS such that when bibliographic search results are returned, the meaningful locations contained within those results are displayed on a map to accompany the list of citations. I’ve called this an “instant gazetteer” that will change and narrow with every new search.
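
The “instant gazetteer” can be sketched in a few lines of Python. This is only an illustration of the idea, not the PBMP’s implementation: the sample citations, the address strings, and the function names search_citations and instant_gazetteer are all hypothetical.

```python
from collections import Counter

# Hypothetical sample records: each citation carries the addresses it discusses.
CITATIONS = [
    {"id": 1, "author": "Author A", "title": "Shops of the via Stabiana",
     "addresses": ["I.1.1", "I.1.2"]},
    {"id": 2, "author": "Author B", "title": "Fountains and streets",
     "addresses": ["I.1.2", "VII.8"]},
    {"id": 3, "author": "Author A", "title": "The forum in context",
     "addresses": ["VII.8"]},
]


def search_citations(citations, term):
    """Return citations whose author or title contains the term (case-insensitive)."""
    needle = term.lower()
    return [
        c for c in citations
        if needle in c["author"].lower() or needle in c["title"].lower()
    ]


def instant_gazetteer(results):
    """Count how many of the returned citations mention each address."""
    counts = Counter()
    for citation in results:
        # Use a set so one citation counts at most once per address.
        counts.update(set(citation["addresses"]))
    return dict(counts)


hits = search_citations(CITATIONS, "author a")
gazetteer = instant_gazetteer(hits)
```

Every new search produces a new, narrower gazetteer; the addresses it returns are what a map would highlight alongside the list of citations.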

These are some of the more GIS-based items on my “wish list” for an Open Pompeii. Nearly all of these projects can start with the initiative of a single person who is fascinated by Pompeii and who believes that fascination will only grow through engagement and contribution. Anyone can find and add content to online platforms: citations, maps, artworks, etc. Secondary teachers and university professors can engage their students in projects that not only educate, but also don’t waste the effort expended in the process of learning. Build lessons that build things. Let me unpack that a bit further by analogy. Lifting weights builds muscles. What if the movement of those weights could be used to power the lights at the gym? Learning builds intelligence. What if the act of learning increased the total ‘weight’ of content for the next act of learning? Events like SCRIPTORIVM and community-building groups like OpenPompei are especially important right now as we regularly lose as much information about the ancient world into silos of data as we do from the continuous, if irregular, collapses of walls and crumbling of plasters. Even some conservation efforts conceal evidence in the name of preserving it: one needs only to look at the long tradition of covering walls with a mortar to prevent their erosion, but at the cost of hiding all the details of their history of construction. While the community of Pompeianisti cannot save the site directly, we can do incredible work to support the work of conservation by providing broad, dynamic, and most of all open sets of data that can be used in planning and research. Surely the worst thing we might do is to squander the resources we have by duplicating efforts the way my colleague and I did back in 2007.

As a researcher, I’ve realized it is not within my expertise to claim (though I’m not without opinions) what are the best ways to save and/or to use Pompeii. As a foreigner, moreover, it’s not my place to make demands. But, as one of many committed specialists and community members, I can make a difference. In my case, I can put aside the natural inclinations to retain control over the data and research products I’ve produced (and, importantly, paid for) and “give it away” to others in imperfect, incomplete, or “in progress” forms. My aims are not wholly altruistic, however. I do want and do claim authorship and credit for the digital products I’ve made and I do hope that in making them widely available those digital objects will build a legacy for the efforts that created them. At the same time, I also wish to be part of an on-going philosophical shift in the way that we create and share data, especially academic data. It is a shift that we desperately need. In my opinion, building a data set or even describing those data in a narrative is no longer enough; one needs to find ultimate validation in the number of people those data reach and how many choose to use them. For academics, such a shift is not merely structural; it’s not about how we peer-review databases or how we evaluate digital work. It’s also a cultural shift. We need to learn to be comfortable with the reality that data are messy and to share them anyway. We need to be willing to move the entire discipline ahead, along with our own specific publications, by sharing imperfect, incomplete, and in-progress works. And yes, we need to do this within the structures of academic labor and power (i.e., hiring, promotion, and tenure).

There’s so very much more to say on this topic and at another time I probably will. For this discussion I’ll leave it here by thanking OpenPompei and SCRIPTORIVM for letting me contribute to their noble cause and by taking my own advice, letting the most recent GIS data of the PBMP (imperfect, incomplete, and in-progress as they are) go off for the betterment of our Pompeian community.

A map for the Grande Progetto di Pompei and the Portale della Trasparenza

The underlying principle of the PBMP is that space is perhaps the single most powerful metaphor for structuring (at least) archaeological information. In our project, it is the shape of the ancient city and its parceling into modern, Roman, and pre-Roman geographies (or more abstractly, geometries) that serves as scaffold for a valuable set of research information: the bibliography. Much, much more can be done with this structure and one day we hope to do it: attach images of frescos to walls, mosaics to floors, artifacts to rooms and buildings, and even stratigraphy to trenches. Someday.

What we can do today is a little less grand, but no less important. In 2012 the Grande Progetto di Pompei was initiated with the hope that an infusion of interest, support, and above all funding would forestall both the normal deterioration of the site and the catastrophic collapses that made headlines in November 2010 (see the discussion of the collapse of the House of the Gladiators on Blogging Pompeii). Since then, the Grande Progetto has been the subject of intense interest and scrutiny.

In response to that interest and scrutiny, the Soprintendenza archeologico di Pompei (SAP) has created a web interface to the information about the work and progress of the Grande Progetto. The portale della trasparenza offers the public a glimpse into not only the variety of work that is being done to preserve Pompeii, but also how much is being spent and by and to whom. It is a spectacular step forward, though some have reasonably wanted still more (the OpenPompei group, for example, has asked for improved data structures).

The SAP had already done the hard work of getting these data together, but we realized that the PBMP could help more easily represent where the money was going, in both the figurative and literal senses. The map below shows the 13 major divisions of the Grande Progetto and clicking on each will return the information served by the SAP, translated into English. That is not to say that it is translated into common English, as the bureaucratic conventions of any culture will confound the average reader. Still, for the purpose of completeness, all data are preserved.

We at the PBMP make no claims to the value or the virtue of decision making reflected in these data. They are as they are, and we thought we might help interested people know more. That is all. If we have a goal, it is to foster discussion. Therefore, comments, corrections, additions, suggestions and poignant civil discussion are all welcome in the comment section below.



 

Documentation on Linking Data

With the first GIS map of Pompeii now available online, we are turning more of our attention to the problem of connecting our spatial data to our bibliographic data. While there is still some important spatial work to be done with the current map, the planning and documentation for the bibliographic integration serves as a worthwhile distraction. To that end and following a discussion last week with Alexander Stepanov, the PBMP’s GIS architect, I’ve decided to write up some very quick documentation for our data and their connections as a blog post. I’ve also decided to try something else new. Below is a Google Slide with the designs and discussions we drew on a whiteboard as the background. Over this are shapes representing files we need to link together with their names hyperlinked to their locations on the web (as hosted sites or Dropbox objects). In this way, the blog post operates in three different dimensions:

  1. As a public discussion
  2. As a living, internal document
  3. As an interface to the repository of files we’re using.

The files listed are as follows:

A single file of spatial data to start, the Properties by Eschebach (Prop_ESCH), representing all the buildings and occupied spaces in the city. Later this will expand to include other, more generalized features of the landscape, such as the City Blocks, Gates, and Fortification Walls.

Three files from the Nova Bibliotheca Pompeiana are given here:

  1. The first 10,000 citations (GYG Citations_BIBLIOGRAPHY) completed from the NBP as they were prepared for uploading to Zotero (and then to Omeka). This shows how the data were divided and might be recombined.
  2. A list of property addresses from the Spatial Index from the NBP (GYG Citations_INDEX). This gives as a one-to-many relationship the address of a property and the one or more citations that relate to it.
  3. A list of addresses per citation as extracted from the full-text of the first two volumes of the NBP (GYG Citations_TEXT). This gives as a one-to-many relationship the bibliographic citation as given by Garcia y Garcia and the one or more addresses that relate to it.

Naturally, there will be a significant overlap between #2 and #3, which will reduce the total number of connections, but also offer a chance to perform a quality control test on the data as extracted from the NBP.
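
As a rough illustration of that quality control test, both tables can be flattened into (address, citation) pairs and compared as sets. This is a minimal sketch under stated assumptions: the dictionaries INDEX and TEXT, the identifiers in them, and the function names are hypothetical stand-ins for the GYG Citations_INDEX and GYG Citations_TEXT files, whose real layout may differ.

```python
def pairs_from_index(index_table):
    """Flatten {address: [citation ids]} into a set of (address, citation) pairs."""
    return {(addr, cit) for addr, cits in index_table.items() for cit in cits}


def pairs_from_text(text_table):
    """Flatten {citation id: [addresses]} into a set of (address, citation) pairs."""
    return {(addr, cit) for cit, addrs in text_table.items() for addr in addrs}


def compare(index_table, text_table):
    """Split the connections into those both sources agree on and those only one has."""
    index_pairs = pairs_from_index(index_table)
    text_pairs = pairs_from_text(text_table)
    return {
        "confirmed": index_pairs & text_pairs,
        "index_only": index_pairs - text_pairs,
        "text_only": text_pairs - index_pairs,
    }


# Hypothetical sample data, not taken from the NBP.
INDEX = {"I.1.1": ["NBP-001", "NBP-002"], "VII.8": ["NBP-003"]}
TEXT = {"NBP-001": ["I.1.1"], "NBP-003": ["VII.8", "I.1.2"]}

report = compare(INDEX, TEXT)
```

Pairs in "confirmed" are the overlap; pairs found in only one source are the ones worth proofing by hand.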

If we think of this as merely a spatial data problem, the work to be done is non-trivial, but also not conceptually difficult. That is, if all we wanted to do was to connect the bibliographic data to the map so that users could click on it and access that information, the process would be straightforward: combine and proof tables #2 and #3, then join them to the spatial data of Properties by Eschebach. Indeed, that *is* our primary goal, but we also want those bibliographic citations to be linked to their full references on our other platforms (i.e., Zotero and Omeka). Moreover, we want users to be able to use search functions in the map – beyond navigating and clicking – to both find and leverage bibliographic information. For example, we want people to be able to search for an author in the map and have the sites and buildings associated with that author appear highlighted. The user should also then be able to create a new search off of this subset of data, using either additional bibliographic criteria or spatial definitions. To make these functions possible, however, the data stored in the map cannot only be reference numbers linked out to other resources. Finally, we would like to eventually have searches in our bibliography be (passed to and) responsive in the map, so that the results of regular bibliographic searches might be visualized in the map as well as in the listing of citations.
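
To make the join and the author search concrete, here is a small Python sketch. It is not our actual workflow or the ArcGIS tooling; the property attributes, citation records, link pairs, and function names are all hypothetical. It does show why the map needs more than reference numbers: the author’s name has to travel with the feature for the search to work.

```python
# Hypothetical stand-ins for Properties by Eschebach, the citations,
# and the combined, proofed tables #2 and #3.
PROPERTIES = {
    "I.1.1": {"function": "shop"},
    "I.1.2": {"function": "house"},
    "VII.8": {"function": "forum"},
}
CITATIONS = {
    "NBP-001": {"author": "Author A", "title": "Title 1"},
    "NBP-002": {"author": "Author B", "title": "Title 2"},
    "NBP-003": {"author": "Author A", "title": "Title 3"},
}
LINKS = {
    ("I.1.1", "NBP-001"),
    ("I.1.2", "NBP-001"),
    ("VII.8", "NBP-002"),
    ("I.1.2", "NBP-003"),
}


def join_citations_to_properties(properties, citations, links):
    """Copy full citation records (not just ids) onto each property."""
    joined = {addr: dict(attrs, citations=[]) for addr, attrs in properties.items()}
    for addr, cit_id in sorted(links):
        if addr in joined and cit_id in citations:
            joined[addr]["citations"].append(dict(citations[cit_id], id=cit_id))
    return joined


def addresses_for_author(joined, author):
    """Addresses to highlight: properties with at least one citation by the author."""
    needle = author.lower()
    return sorted(
        addr for addr, rec in joined.items()
        if any(needle in c["author"].lower() for c in rec["citations"])
    )


def refine_by_function(joined, addresses, function):
    """A second search run on the subset returned by the first."""
    return [addr for addr in addresses if joined[addr]["function"] == function]


joined = join_citations_to_properties(PROPERTIES, CITATIONS, LINKS)
highlighted = addresses_for_author(joined, "Author A")
houses = refine_by_function(joined, highlighted, "house")
```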

As you can see from the image, we’ve got an outline of how we’ll do this. Nonetheless, if you are a GIS architect, a digital collections librarian, data designer, or all around smart person and have an opinion on how this might be done, in all or in parts, please do email me: Pompeiana[AT]gmail.com

– EP

Zotero: the first 10,000 (almost) citations

The first 10,000 citations about Pompeii have now been prepared and 9,956 have been uploaded to our Zotero library. Users can search the library, reorder the display, export records, produce formatted citations, and add references to their own collections. These citations still have issues in need of correction due to both human error and text character translation. We hope to improve the citations and eventually add more using the PBMP Zotero Group. Please sign up and get in touch (PompeianaATgmail.com) if you are interested.

Most of the content in the Zotero library is self-explanatory, as the redundancy of the table below demonstrates. There are, however, two fields that need some clarification to be properly used or ignored. These are:

  1. Loc. In Archive: This is PBMP ID, a unique, sequential number assigned by the project.
  2. Call Number: This is the NBP ID, a (mostly) unique, alphanumeric designation assigned by L. Garcia y Garcia in his landmark three-volume work, Nova Bibliotheca Pompeiana. Use this number to discover further information about the author, editions of the publications, reprints, and reviews. Volumes I and II are still in print and volume III is newly available. We encourage you to encourage your library to purchase the remaining copies of these works.

 

Item header: Title of the work

Item Type: Publication format, such as book, journal article, artwork, etc.
Title: Title of the publication.
Author: Author(s) of the publication.
Series Editor: Name(s) of editor(s), or various authors (AA.VV) if there is no specific editor.
Series: Name of publication series, if any.
Place: Place of publication.
Date: Year of publication.
# Of Pages: Extent of pages in the publication.
Language: Language of publication.
URL: Link to Full-Text of the publication.
Loc. In Archive: PBMP ID
Call Number: NBP ID
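
For anyone working with exported records, the two identifier fields can be modeled as below. This is a hedged sketch only: the Citation class, the sample values, and the ID formats are hypothetical and do not reflect Zotero’s export schema. Because the NBP ID is only mostly unique, the lookup returns a list of PBMP IDs rather than a single one.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Citation:
    title: str
    author: str
    item_type: str
    date: str
    loc_in_archive: int  # PBMP ID: unique, sequential
    call_number: str     # NBP ID: (mostly) unique, alphanumeric


def index_by_nbp_id(citations):
    """Map each NBP ID to every PBMP ID that carries it."""
    index = defaultdict(list)
    for citation in citations:
        index[citation.call_number].append(citation.loc_in_archive)
    return dict(index)


def ambiguous_nbp_ids(citations):
    """NBP IDs shared by more than one record, which need a human look."""
    return sorted(k for k, v in index_by_nbp_id(citations).items() if len(v) > 1)


# Hypothetical records; the ID formats are invented for illustration.
SAMPLE = [
    Citation("Title 1", "Author A", "book", "1900", 1, "A-0001"),
    Citation("Title 2", "Author B", "journalArticle", "1950", 2, "B-0002"),
    Citation("Title 3", "Author B", "journalArticle", "1951", 3, "B-0002"),
]

nbp_index = index_by_nbp_id(SAMPLE)
duplicates = ambiguous_nbp_ids(SAMPLE)
```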

Pompeii: The First Navigation Map

The PBMP’s first full map for navigation is now online. You can start to explore Pompeii in the map embedded below, or go to the full site for more space and options. If you want to customize the map or make a presentation from it, sign in to / sign up for your ArcGIS Online account and save a copy to your own webspace. The link is at the upper right of the embedded map page. Below the map is additional information about the files, the information they contain, and their display.

The “Pompeii: Navigation Map” is essentially a set of nested tiles that change the display of the city as one zooms in and out to change the scale of the map. Overlying these are a series of vector-based files, which are used almost exclusively as invisible, data-rich layers. That is, the transparency for many of the files is set to 100% so that the information about Pompeii those files hold can be accessed (via a pop-up window), but their rendering does not slow the loading of the map.

Users may find the information in the following files to be of interest:

Data-Rich Layers

Elevations Points: This layer is turned off by default and set to not appear until the view scale reaches 1:2500. It holds elevation data (above sea level) at 5cm or 10cm resolution from multiple sources: Corpus Topographicum Pompeianum (1984); De Caro, S. (1979); Eschebach and Müller-Trollius (1993); Etani et al. (2003); Pompeii Archaeological Research Project: Porta Stabia.

Eschebach ALL: (West & East). Due to the number of features in the original file (Properties by Eschebach), the file was split in two along the via Stabiana. The user should notice little difference. There are, however, some significant issues to be aware of in the spatial consistency of properties for those interested in the area of individual features. Because the properties were drawn to express the functional categories assigned by Eschebach and not the contiguous physical boundaries of the building, there are overlaps, gaps, and duplications in the data. We are working to improve these data. For the moment, caveat emptor. These files do, however, contain information of importance to researchers, including:

  1. Address of the Primary Door according to Eschebach (1970; 1993).
  2. Functional Category according to Eschebach (1970; 1993).
  3. A link to an image of the property at Pompeii in Pictures.
  4. The Date(s) of excavation according to the Corpus Topographicum Pompeianum (1984).
  5. Area of the property in square meters.

PBMP CTP (Features): The 628 properties in this file represent the properties described in the “Structures” section of the Corpus Topographicum Pompeianum (1983), pars. II. This layer contains information of importance to researchers, including:

  1. Address of the Primary Door.
  2. Page number of the information in the Corpus Topographicum Pompeianum (1983), pars. II.
  3. All known names of the property: Name (1) – Name (15).
  4. Bibliographic reference for each known name of the property: Ref. Name (1) – Ref. Name (15).
  5. A link to an image of the property at Pompeii in Pictures.
  6. Area of the property in square meters.

Display Layers

Fortification Walls: Sixteen sections, between the defensive towers and city gates, of Pompeii’s extant fortifications are shown and named.

Defensive Towers: Eleven of the twelve known (by inscription) defensive towers surrounding Pompeii are shown and named.

Gates: The seven known gates to Pompeii are shown and named.

Unexcavated Areas: Three primary areas still not yet excavated (in Regions I, III, IV, V, and IX) are shown and named, as well as isolated areas along the interior of the fortification walls.

City Blocks: The excavated extent of the city blocks (insulae) is shown and labeled.

Streets: There are 97 streets and passage areas represented in this file, with the extent of each street and its name given according to modern conventional nomenclature (in Italian).

Alleys: Six passages within city blocks and disconnected from the street network are shown and named.

Sidewalks: The excavated extent of the pedestrian sidewalks is shown.

Stepping-Stones: The 316 known stepping-stones within the street network are shown and named.

Forum: The forum, though also given a designation as a city block (VII 8), is shown here as its own feature.

Water Towers: The twelve water towers are located and labeled according to the nomenclature established in Larsen, 1982.

Fountains: Thirty-four public fountains, including both the complete footprint of the fountain and its interior basin, are shown, symbolized to show the basin with water, and named.

Projected City Blocks (insulae): This layer is turned off by default and expresses 46 extrapolated city blocks that remain partially or completely unexcavated. Some areas are almost certainly accurate (Regions I, III, and IX), while others are somewhat more speculative (Regions IV and V).

PBMP CTP (Tiles): This layer is turned off by default and represents the location of the 628 properties described in the “Structures” section of the Corpus Topographicum Pompeianum (1983), pars. II. This layer visualizes the locations in the PBMP CTP (Features) layer, but does not contain that layer’s attribute data.
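
The layer behavior described above (invisible but clickable data-rich layers, layers turned off by default, and a scale threshold for the elevation points) can be modeled in a few lines of Python. This is a plain illustration, not ArcGIS Online’s API or our actual configuration: the Layer class, its field names, and the transparency value given to the elevation layer are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Layer:
    name: str
    transparency: int = 0              # percent; 100 means nothing is painted
    on_by_default: bool = True
    max_scale_denominator: Optional[int] = None  # hidden when zoomed out past 1:N

    def is_queryable(self, scale_denominator, enabled=None):
        """True if the layer is on and in range, so its pop-up data can be reached."""
        on = self.on_by_default if enabled is None else enabled
        if not on:
            return False
        if self.max_scale_denominator is None:
            return True
        return scale_denominator <= self.max_scale_denominator

    def is_visible(self, scale_denominator, enabled=None):
        """True if the layer is queryable and actually paints something."""
        return self.is_queryable(scale_denominator, enabled) and self.transparency < 100


# Transparency of 0 for the elevation layer is an assumption for this sketch.
elevations = Layer("Elevations Points", on_by_default=False, max_scale_denominator=2500)
eschebach = Layer("Eschebach ALL", transparency=100)
```

With these settings the Eschebach layer answers clicks at any scale without drawing anything, while the elevation points respond only once the user turns the layer on and zooms in to 1:2500 or closer.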

The PBMP Welcomes Daniel Armenti

As the PBMP closes in on its goals in 2014/2015, we welcome a new team to see them to completion. Chief among them is Daniel Armenti, the new graduate research assistant for the project who captains the bibliography component and supervises an excellent team of undergraduate assistants. Let’s meet Daniel:

 

I often resort to my knowledge of various languages as a positive accomplishment when I feel intimidated by an academic project. As students of Comparative Literature, we’re encouraged to learn several to do our research, and to not rely on other people’s translations. I’ve found my language knowledge to be comforting though because it gives me what I feel to be a concrete metric by which to measure my academic progress. Things that intimidate me academically: anything computer related beyond simple word processing (and, let’s be honest, that can get a bit dicey as well). So when I was encouraged by one of my professors to apply for the graduate assistant position with the Pompeii Bibliography and Mapping Project, I spent a long time wondering how my name had crossed his mind for a digital humanities project.

This is rather silly, because in reality my research and work experiences have dealt extensively with the digital humanities, even if I seem to have trouble reconciling them with my perception of computers as essentially wonder–boxes–where–the–magic–occurs: for the previous two years I have been working as an editorial assistant for the journal Digital Philology (Johns Hopkins UP), a medieval studies journal with a focus on the digital humanities. Most of my actual work was concerned with proofing, formatting, and bibliographic work, but it has put me in a position to encounter many of the new methods of approaching medieval text, from the creation of interactive databases of texts, to the use of software for authorial attribution. Before working on Digital Philology, I had worked off and on for several years on the Raymond J. Lord Collection (associated with the University of Massachusetts Renaissance Center), a digital library of medieval and renaissance combat treatises and fencing manuals—this work was primarily image editing and transcription work.

I don’t know why I have to work so hard to associate these projects with experience in the digital humanities (I mean, the word “digital” is in the title of the journal I edit), but for some reason, like many who do work in the humanities, I’ve pigeonholed myself as “computer illiterate,” which is demonstrably untrue. This is not to say that my languages haven’t been helpful on the Project: all of the biographical notes that we deal with are in Italian, and I’m thankful that I can read and understand the vast majority of titles, with the exception of those in Slavic or Asian languages. But the more I work on the Project (and the more I watch the undergraduates we hired do their work considerably faster than I do, despite their lack of Italian and other relevant languages), the more I realize that it’s my familiarity with other digital humanities projects, and my own thoughts on the problems and difficulty of doing research, that will really help me as our work on the Project moves forward.

Currently the digital humanities are a boon to those of us working on pre-modern subjects (I’m sure they are for those working on modern subjects as well, but I’m going to stick with what I’m familiar with). High-resolution scans and digital libraries save us the trouble of buying plane tickets and travelling overseas (did I mention that I have an inconvenient terror of flying?) to try to spend as much time with the objects of our research as the librarians will grant us. Combined with these scans, OCR software allows us to interact with the information of our texts in almost every way imaginable, from simple word searches to attribution studies, stylistics, etc. Bibliographical databases, if they’re set up well, not only provide a foundation for new research but also help avoid redundant research on subjects that sometimes have over two thousand years of academic history and tens of thousands of sources.

One of the things that Eric Poehler (the director of the PBMP, whose name you’re probably familiar with if you’re reading this blog) impressed on me was how well set up the bibliographical database will be once it’s established—we’ve got around thirty points of information that can be applied to any citation, including title, author/editor, publisher, and date, but also medium, language, authorial biographical information, series information, and we’re discussing the implementation of keywords. The inclusion of this information allows for not only simple searches of the database, but the use of the information of the database itself as a source for new historical studies. Furthermore, our goal is to link as many of these sources to digital copies of their texts as possible, as many of them now exist in the public domain. By linking the entries to the texts themselves the bibliography will act to a certain degree as an online library, providing the researcher with access to the research itself. Bibliographical searches will be refined even more when we introduce the mapping portion of the Project, which will link physical location and subject as intuitive search parameters.

It’s impressive, or rather it will be impressive when you see it. I’ve seen it every day for the past month, so I’m already impressed with the ambition, potential, and awesomeness of this project. When I give this last quality, I want it to imply not only how excellent a resource the PBMP will be, but also the enormity of the work that has already been accomplished, and that still lies ahead of us. We’re currently working with some twenty thousand sources, splitting them into points of data, editing citation entries, and preparing to shift all of this information online. It requires a lot of time, and a lot of hands, to get this much work (a considerable amount of it data entry) finished, and for me, a surprising amount of supervising as I work with our undergraduate assistants. And it is getting done. It is getting done far more rapidly than I would have expected, after reviewing the mass of work that lay ahead of us a month ago.

It’s exciting. And, for the most part, the work is interesting (I’m speaking for myself here, not necessarily for my undergraduate assistants, onto whom we’ve foisted most of the data entry). It’s interesting in part because I am able to use my languages (at last!) to do the work of editing and proofing our citation lists, but also because I’m learning an enormous amount about what goes into a project like this, and how to implement it in a way that will be useful to scholars who study Pompeii, certainly, but also, on a broader scale, as a model for scholars who would like to create similar projects in their own fields.

– Daniel Armenti

The PBMP seeks a graduate research assistant

Graduate Research Assistant Position

Academic Year, 2014-2015

Pompeii Bibliography and Mapping Project

 

The Pompeii Bibliography and Mapping Project (PBMP) seeks a graduate research assistant for the 2014-2015 academic year. The position has funding for 38 weeks at 15 hours per week and includes benefits and curriculum fee. The PBMP is a digital humanities project, funded through the ACLS and the NEH, working to construct a robust and interwoven spatial and bibliographic resource for all levels of research on the ancient city of Pompeii. More information can be found on our website and blog: http://digitalhumanities.umass.edu/pbmp/.

The PBMP is accepting applications immediately and hopes to fill the position with the very best candidate as soon as possible. Knowledge of classical archaeology or of Pompeii specifically is not expected, but an interest in these subjects is preferred. Similarly, no previous programming or web-authoring experience is required, though familiarity with multiple software packages, basic digital tools, and/or elemental scripting environments is desired. None of these preferred experiences should deter competent, capable, and energetic applicants from applying.

 

Duties and Responsibilities

The expected duties of the graduate research assistant will be to assist in the work to develop both the exhaustive bibliography of references to scholarly work on Pompeii and a complete online Geographical Information Systems map of the ancient city. Generally this work will include efforts to:

  • Maintain and expand bibliographic catalogs in Zotero.
  • Construct and elaborate bibliographic collections and exhibits in Omeka.
  • Capture and generate full-text bibliographic content.
  • Modify and create spatial data for ArcGIS / ArcGIS Online.
  • Assist in building the database and web-interface connection between the bibliographic and mapping components.
  • Supervise and direct the activities of undergraduate assistants and volunteers.
  • Liaise with university administration, faculty, librarians, and other employees on behalf of the project.

 

Requirements and Qualifications

Appropriate candidates will have some or all of the following characteristics:

  • Strong academic research capabilities with preference for familiarity with humanities and/or social science fields.
  • Knowledge of library catalogs and bibliographic platforms (e.g., WorldCat, Zotero, Omeka, RefWorks).
  • Spatial acumen, in data and programs (e.g., CAD, GIS, 3D modeling, etc.).
  • Experience with web-authoring platforms and programs (e.g., HTML, Dreamweaver, WordPress, blogs).
  • Exposure to scripting programs (e.g., Python).
  • Excellent written and verbal communication skills.
  • Ability to take initiative.

 

Applicants should send a CV and letter of interest by email to Prof. Eric Poehler (epoehler@classics.umass.edu) detailing their qualifications and interest in the position. Applications will be accepted on a rolling basis and interviews will be similarly scheduled. We hope to fill the position and for the graduate research assistant to begin as early as September 1st.

PBMP Bibliography: Excel → RIS → Zotero → Omeka.

The existence of an online, searchable, 13,000+ reference bibliography on Pompeii is tantalizingly close. With the expertise of two great UMass Librarians, Aaron Rubinstein (University and Digital Archivist) and Ron Peterson (Discovery and Integrated Systems Coordinator), the PBMP has moved our massive spreadsheet of citations into bibliographic formats readable by the content platforms we intend to use. Our first attempt to publish the bibliography is now available on our Zotero and Omeka sites. The process of migrating those citations to the web, although it appeared to be a simple one, has not been easy.

In no small part, this difficulty is the legacy of the ‘bootstrap’ beginnings of the PBMP. In 2009, before this project was funded by the NEH, ACLS, UMass DHI or CHFA (again I thank them all!), and before Garcia y Garcia partnered with Arbor Sapientiae to update his work and publish it online as PDFs, I began scanning the Nova Bibliotheca Pompeiana and correcting the terrible OCR transcripts in Microsoft Word. With the generous funding from UMass, it became possible to parse those Word docs into tabular form and hire students to continue to correct the data. Originally, I had intended to use Microsoft Access to produce easy-to-use forms for students to continue the process of correcting the raw citation text and splitting it into appropriate fields. Ironically, “Access” was not easily accessible for students (not included in Microsoft Office for Students). For this reason, we shifted to Excel.

Doubtless because I am not a librarian and am not educated in their best practices, I was surprised to learn that neither Zotero nor Omeka would import from Excel, .csv, .tsv, or .txt. Surely this is to protect the specifically structured contents from being regularly fed into the wrong fields. Our task was therefore to convert our spreadsheet-formatted data into one of the formats that our platforms would accept. Zotero will import from Zotero RDF, MODS, RIS, BibTeX, Refer/BibIX, and unqualified Dublin Core RDF, while Omeka, importantly, can import from Zotero. It therefore seemed appropriate to create a chain of transformations: Excel → RIS → Zotero → Omeka. Aaron, Ron, and I mapped the fields to be transferred from Excel to RIS and then Aaron wrote the scripts that processed that translation. He then imported them to Zotero with its native import tool, getting 12,804 records online. It was obvious at this point, however, that the encoding of special characters in Excel and their re-expression in Zotero was going to be problematic. Universal character and symbol recognition and translation is an endemic issue. For example, the title of this post was first translated into the body of this post by WordPress as “Excel à RIS à Zotero à Omeka”. Continuing our transformation chain, Aaron then applied the “Zotero Import” Plugin to import the Zotero records into Omeka. 10,479 records were imported before some error was introduced that halted the import.
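For readers curious what the Excel → RIS step might look like, here is a minimal sketch in Python. It is not Aaron's script: the column names in `FIELD_TO_RIS`, the semicolon-separated author convention, and the default `BOOK` type are all assumptions made for illustration. Only the RIS tags themselves (TY, AU, TI, PY, PB, CY, LA, ER) come from the RIS format.

```python
# Hypothetical mapping from spreadsheet column names to RIS tags.
# The column names are assumptions; the tags are standard RIS tags.
FIELD_TO_RIS = {
    "author": "AU",     # one AU line per author
    "title": "TI",
    "year": "PY",
    "publisher": "PB",
    "place": "CY",      # place of publication
    "language": "LA",
}


def row_to_ris(row, ref_type="BOOK"):
    """Turn one spreadsheet row (a dict of column -> text) into an RIS record."""
    lines = [f"TY  - {ref_type}"]  # every RIS record opens with its type
    for column, tag in FIELD_TO_RIS.items():
        value = (row.get(column) or "").strip()
        if not value:
            continue  # skip empty cells rather than emit blank tags
        # Assumption: multiple authors share one cell, separated by ";".
        parts = value.split(";") if tag == "AU" else [value]
        for part in parts:
            part = part.strip()
            if part:
                lines.append(f"{tag}  - {part}")
    lines.append("ER  - ")  # every RIS record closes with ER
    return "\n".join(lines)


def rows_to_ris(rows):
    """Join many records into the text of a single .ris file."""
    return "\n\n".join(row_to_ris(row) for row in rows) + "\n"
```

When writing the result to disk, opening the file with an explicit `encoding="utf-8"` is one way to keep accented characters from depending on the operating system's default encoding.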

For a first attempt, our process of translation and upload was remarkably successful, but these results are obviously not good enough. Beyond the problems already mentioned – special character issues and missing records in Omeka import – there are other issues to overcome. For example, we discovered that some elements of the field mapping were faulty. Sometimes this was a problem with the translation script, but more often it was a problem with the original data being inconsistent. In complex bibliographic citations (e.g., items with multiple authors in an edited volume that is part of a series of books), students were often excusably confused while working on the data, and some citations they parsed incorrectly. There are also the differences in Italian publishing standards and Garcia y Garcia’s own (understandable on such a large project) personal idiosyncrasies that meant information did not always go in the right places. One strange issue, however, is that the RIS field for “Place”, that is, the location where an item was published, just won’t read into Zotero’s related field. BibTeX seems to have a greater range of fields so we will try that format on our second attempt. Another item to overcome is the absence of a unique handle for each citation that our GIS system can use. That’s just a global application of a serial identifier, in this case, (e.g.) “PBMP_BIB_000001”.
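The serial identifier could be applied along these lines. This is a minimal sketch: the function name, the `id` key, and the six-digit zero padding are assumptions drawn from the example handle above, not the project's actual script.

```python
def assign_ids(citations, prefix="PBMP_BIB_", width=6, start=1):
    """Return copies of the citation dicts, each with a serial 'id' added.

    The prefix and zero-padded width follow the example handle
    "PBMP_BIB_000001"; the 'id' key is a hypothetical field name.
    """
    return [
        {**citation, "id": f"{prefix}{number:0{width}d}"}
        for number, citation in enumerate(citations, start)
    ]
```

Because the handles depend on row order, they would need to be assigned once and then stored with the data, rather than regenerated after every re-sort of the spreadsheet.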

To help overcome these issues, we are enlisting the help of one of my senior undergraduate students, Juliana van Roggen, whose Guardstones blog you should also check out for some rugged data analysis and visualization of street stones in Pompeii, a topic dear to my heart. Dedicated to fixing the bibliography, Juliana is working to resolve many of the inconsistencies in the data as well as preparing those data for remapping, multiple imports, and for life online. Her current tasks include:

  1. Using conditional formatting to assign the language of the work and to define its object type (i.e., book, journal, diss, etc.).
  2. Sorting out the journal number issues and preparing to map journal abbreviations to their full names.
  3. Joining the struggle to figure out how to keep the character encoding as citations move from Excel to online.
  4. Connecting to full-text objects online, including those 2953 items the PBMP has recently received from HathiTrust and others previously received from Internet Archive.
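On the third task, one common way accented characters get mangled (not necessarily what happened to our data) is that text saved as UTF-8 is later read as Latin-1. The sketch below simulates that misreading and reverses it where possible. The function names are hypothetical, and it would not explain every garbled symbol we have seen.

```python
def misread_as_latin1(text):
    """Simulate UTF-8 bytes being read back as Latin-1 (a common mix-up)."""
    return text.encode("utf-8").decode("latin-1")


def repair_latin1_misread(text):
    """Undo the misreading above; return the text unchanged if it does not apply."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not a Latin-1 misreading of UTF-8, so leave it alone.
        return text


garbled = misread_as_latin1("Pompéi")   # the é becomes two characters
restored = repair_latin1_misread(garbled)
```

A check like this can be run over a column of titles before import to flag rows that change under repair, which are the ones worth a human look.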

Once these corrections are made we will be well placed to run a second import into Zotero and Omeka. It is my hope that at this point the first part of this process – moving from Excel to Zotero – will be finished and not repeated. We should then be able to make changes online directly in Zotero as needed. This means that the second import into Zotero will probably not also be the final import into Omeka. It should be noted that Zotero is not merely a stepping stone in our process, but rather is envisioned as an integral tool in our larger bibliographic resource. Although we run the risk of redundancy and asynchronous parallel systems, the different functionalities of Zotero and Omeka make keeping them both the preferred option. For Omeka, this means a much more customizable experience of the data. Individual items can be more fully manipulated and groups can be cultivated not only as collections, but also curated as exhibits, turning the bibliography from mere catalog to platform to illustrate and even to make arguments from its contents. On the other hand, with the robustness and rigidity of Zotero’s design comes a greater ability to create and share individual citations and collections. Most importantly, however, it is a more collaborative space where the PBMP can find, collect, and incorporate new or previously unknown references to Pompeii.

Making the connection: a first functioning prototype of the PBMP

In a previous post I described the frustration of simultaneously succeeding in building multiple components of the PBMP and being unable to see how they fit together as “trying to grow a hand from the fingers in”.  Today, in a brief, but very effective meeting, all those working on the different parts of the project discussed how the GIS and bibliography, in their current states, will be joined together. What follows is a “meeting minutes” style documentation of our conversation, set out here for two purposes:

  1. To remind us of our thoughts and plans at this stage of the project
  2. To easily share those ideas with a wider community for comment and criticism.

The meeting began with a quick recitation by me (Poehler) on the status of each part of the project and what I thought needed to be accomplished by the end of the week. The deadline is not arbitrary: I will be presenting the PBMP as one part of a workshop I’ll give at the University of Texas at Austin’s Graduate Student Conference entitled “Digital Archaeology at an urban scale.” The poster and schedule can be found at the bottom of this post.

Bibliography Team: Ron Peterson and Aaron Rubinstein have been working to transform our massive spreadsheet of citation information (currently 13,040 completed references) into a format acceptable for importation by Zotero. Thus far the idea has been to condense the spreadsheet – originally designed for a MODS implementation – into Dublin Core and convert that to RIS format. We’ve wanted to have mirrored Zotero and Omeka citation databases, and since Omeka can import via Zotero, the challenge therefore is to get things into Zotero. Zotero can import from a number of formats, but RIS seems the best thus far. Aaron is working on that now.

Mapping Team: Alexander Stepanov and I have been working to publish online a solid, clean, and complete first map of Pompeii. This will be a basic map for navigation, with some important attribute information included. The navigation and attribute functions relate to the Phase One and Phase Two maps described earlier on this blog. In many ways, this first published map is nearly done, but a few pieces remain to clean up. The first of these is to design and configure what information will appear and what it will look like when a user clicks on a place in the map. Pop-up windows in ArcGIS and ArcGIS Online are dynamic and scriptable, but with only a subset of all the data available, we’ll need to strike a balance between what we can show now and what is possible eventually. A second pressing problem is the need to be able to zoom as far into the map as possible. Because the features of Pompeii are naturally at the human scale rather than geographical scale, users need to be able to zoom in to 1:50 scale or even lower to examine those features. Currently, ArcGIS Online only scales to about 1:1000. Finally, I am working feverishly to complete the integration of information from the Corpus Topographicum Pompeianum II (“Toponymy” section) with a new spatial dataset. What is driving this is the symmetry of presenting that data at the University of Texas at Austin, which published the CTP volumes thirty years ago.

After writing the above, I’m compelled to expand into a little editorializing: Though it is only the first of what will be many versions of our map and mapping data, the importance of publishing this map should not be underestimated. Because we will allow users to download the entire map package – map and data together – this will be the first time a standard, fully digital map of Pompeii has been available. The CAD plan that underlies our GIS and the effort that scholars and the superintendency put into it should not be dismissed, but it is not available publicly. Remarkably, this will be the first major cartographic advance on the topography of Pompeii since the publication of the RICA maps of the Corpus Topographicum Pompeianum thirty years ago in 1984.

Natural Language Processing Team: As we await a new batch of full-text objects from HathiTrust, Tiger Wu and David Smith have parsed the three volumes of the Nova Bibliotheca Pompeiana graciously published online by Arbor Sapientiae, extracting from each citation its number and all the addresses listed by L. Garcia y Garcia as being relevant to that citation. You can see that rough but valuable tabulation here as tab-separated values. We plan to use this as a first iteration of a joining table that will link bibliographic references to places in Pompeii with their digital spatial representations. Additionally, because we do have the basic information for works held by the Internet Archive – especially their permalinks – our plan is to integrate that information with our Zotero and Omeka catalogs so that, whenever possible, a researcher can go from finding a place of interest on the map to reading about that place in only a matter of seconds.
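To make the idea of the joining table concrete, here is a small sketch of how such a file could be turned into a lookup from address to citations. The layout is an assumption (one citation number per row, followed by any number of tab-separated addresses), and the function name and sample values in the docstring are hypothetical. The actual tabulation may be organized differently.

```python
import csv
import io
from collections import defaultdict


def build_join_index(tsv_text):
    """Map each address to the list of citation numbers that mention it.

    Assumed row layout (hypothetical): citation number, then zero or
    more addresses, all tab-separated, e.g. "101<TAB>VI.15.1<TAB>I.10.4".
    """
    index = defaultdict(list)
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row:
            continue  # ignore blank lines
        citation, *addresses = row
        for address in addresses:
            address = address.strip()
            if address:
                index[address].append(citation.strip())
    return dict(index)
```

With an index like this, a click on a building in the map only needs that building's address to retrieve every citation number attached to it.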

When this prototype is working later this week, I will post some links to it. Expect also to read what I learn from demoing the PBMP with the folks at UT.

– EP

Rebuilding the City Poster