Archivists face a bewildering array of technologies designed to help administer and provide access to archival collections. From free, open-source software such as ArchivesSpace to proprietary software such as Eloquent, archivists may choose from a wide variety of tools. CollectiveAccess, a web-based, free, and open-source system, offers many features to archivists who need a low-cost way to manage and offer online access to their collections. CollectiveAccess was developed and is maintained by the company Whirl-i-Gig and comprises two main software components: “Providence” and “Pawtucket.” Providence is the core of CollectiveAccess and provides secure user interfaces for data entry and editing, filtered and faceted searching, file management and upload, and general system administration. Pawtucket is an optional CollectiveAccess component that generates a public-facing website to provide access to files and metadata saved in the Providence database. This paper describes work completed at the American Alpine Club Library (AACL) in Golden, Colorado, to implement a CollectiveAccess instance, which was given the name “Explore.”
The AACL publicly launched Explore in February 2013, using both Providence and Pawtucket. Explore was created with many purposes in mind: to provide access to digital collections, to create and present online exhibits, to manage digital assets, to administer archival and museum collections, and to serve as a value-added benefit of membership in the Club. The flexibility of CollectiveAccess made it possible for the AACL to structure Explore for all of these purposes; however, its success in each area was inconsistent, and the system design was ultimately too ambitious to be sustainable. This article details the AACL’s experience launching Explore and provides a case study of how the software was used and customized in a small library and archives with limited staff and resources.
The features of CollectiveAccess have been well described in the literature, and the software has frequently been compared with similar collections management programs. Two of the most comprehensive reviews of the program are Lisa Spiro’s report on collections management software and Ruben Martinez’s article on CollectiveAccess. Spiro’s report provides a detailed evaluation and comparison of collections management software available in 2009 including CollectiveAccess, Archivists’ Toolkit, Eloquent, Cuadra Star, and Archon. The single other similarly detailed (and more recent) report on CollectiveAccess by Ruben Martinez is written in Catalan and only available in English translation via online services such as Google Translate. However, his article, based on a very recent release of the program, provides an excellent summary of the software and its features.
Several authors have considered CollectiveAccess as a digital asset management (DAM) tool for specific kinds of collections and have made recommendations about how best to select the technology suited to a particular use. In “Getting Oral History Online: Collections Management Applications,” Dean Rehberger first presents a series of questions and a framework on which to base decision-making for establishing an oral history DAM system. Rehberger considers CollectiveAccess to be best suited for large collections, and he remarks on CollectiveAccess’s features, including the option for users to upload and share resources, geotagging capabilities, and the software’s automated creation of optimized access copies for web delivery of media files. Similarly, in the abstract and slides for the presentation “Enhancing Educational Access to Art,” staff from the Blanton Museum of Art describe their pilot project to provide remote access to and tagging of digitized artwork via CollectiveAccess for faculty and students at the University of Texas at Austin. Based on the success of the pilot, the Museum planned to pursue the incorporation of additional features and development. Isabel Pedersen and Jeremiah Baarbe offer one final example of an article about CollectiveAccess as a tool for a specific kind of collection, in this case a digital humanities open repository at the University of Ontario. Although their implementation was still under development at the time of the article’s publication, Pedersen and Baarbe’s project exemplifies how CollectiveAccess may be extensively customized to meet local needs.
Three authors present cases in which CollectiveAccess was not considered the best choice for the project at hand. Rhonda Clark discusses the Titusville Historical Society’s project to provide online access to some of its historic images and indicates that they selected Omeka over CollectiveAccess because Omeka was the most cost-effective option with the lowest technology barriers for their organization. However, Clark notes CollectiveAccess’s metadata flexibility, its integration with the Library of Congress Authorities, the way its architecture connects with other digital repository technology, its mapping and geotagging features, and the fact that it is free. Likewise, Alexander Watkins considers potential tools for managing personal collections of photos and includes CollectiveAccess but finds it technically too difficult for personal use, suggesting that the software is aimed at larger institutions.
The third author, Juliet Hardesty, describes a project at Indiana University (IU) to provide online access to a variety of digital objects through one platform. For archivists and others considering CollectiveAccess, her analysis provides helpful information about how it compares to other software because she ends her article with a substantial evaluation of software with digital exhibit features that were designed with “GLAMs” in mind (i.e., galleries, libraries, archives, and museums) and considered at IU. In this evaluation she presents a chart comparing features of around twenty different platforms and then details the programs “that provide the closest equivalent to omeka.org, the software package that can be downloaded and installed,” including CollectiveAccess.
Explore Background and Institutional Context
The American Alpine Club Library (AACL), established in 1916, is a special library open to the public for research and to American Alpine Club members, who receive borrowing privileges for the circulating collection of mountaineering literature as a membership benefit. In addition to its circulating collections, the AACL holds extensive non-circulating special collections and archives, with books dating to the 1500s, a large collection of climbing gear and memorabilia, and significant archival and art collections. When I started full-time as the AACL’s Digitization Archivist in January 2011, I joined a library staff including the full-time Director, three hourly assistants, and a host of volunteers. The library was a division of the larger American Alpine Club (AAC), a non-profit membership organization that, when I started, had around 14,500 members, 17 full-time staff, and 3 part-time employees, not including library personnel. The AACL had phased out its use of PastPerfect Museum Software for managing its archival and realia collections, and the library needed a replacement tool to help manage and provide access to these collections. My duties included implementing a DAM system for the Club, which we named Explore, and overseeing a large project to digitize and create a database of 23,800 searchable articles from the combined 31,240 pages of all issues of The American Alpine Journal, Accidents in North American Mountaineering, and Alpina Americana, the Club’s flagship publications, started in 1907.
Prior to my start date, an AAC committee had completed most of the planning for selecting the most suitable DAM program to meet the Club’s needs. Using a matrix of decision points with features ranked as “necessary” through “nice to have,” the committee had narrowed the choices down to two programs: ResourceSpace and CollectiveAccess. I reviewed both software options in the context of the matrix, with a focus on the anticipated uses of the system. I also considered other potentially important factors such as data portability and extensibility and the need for archival management software at the AACL.
Ultimately, I thought CollectiveAccess would be better because it was designed with collections like those at the AACL in mind, whereas ResourceSpace was designed more for organizations to manage digital assets for in-house use. At the same time, CollectiveAccess offered the functionality of a DAM with better metadata options, more customization, and better reporting and searchability than ResourceSpace. I liked that CollectiveAccess came with a built-in, out-of-the-box, public-facing website program, Pawtucket, and a back-end data entry and management interface, Providence, both of which are free, open-source programs built on a foundation of Linux, Apache, MySQL, and PHP. I believed that the AACL needed software that would enable easy export of standards-based metadata, as CollectiveAccess provided. I saw this as an opportunity not only to help the AAC’s marketing team manage digital photos of Club programs, but also to expand the project to include born-digital and digitized materials and metadata for non-digital materials in the AACL’s collections, including creating online finding aids for its archival holdings. I also liked that CollectiveAccess could provide access to files in a variety of formats, whether image, document, or audiovisual; that it had geotagging capability through Google Maps; that it could support user-contributed resources and metadata; that it offered plugins for a variety of additional functions; and that it had an active community of users. Based on all of the above factors and more, I made the successful recommendation to adopt CollectiveAccess for the Club’s DAM, and I set about customizing an installation profile for the AACL.
Installing CollectiveAccess and getting a system into operation require some technical facility or working with someone who has the necessary knowledge. In my experience, installation and making decisions about customization were the most complicated aspects of using CollectiveAccess. Individuals and/or institutions considering CollectiveAccess should keep this complexity in mind and weigh it against their technology capacity when considering their own installation. However, customizing CollectiveAccess is not necessary to take full advantage of the system’s capabilities. Minimizing the complexity of installation is possible if users are comfortable with the available pre-loaded options. We greatly customized Explore, which required a lot more troubleshooting and code modification than if we had only used the software out of the box. Either way, setting up a system requires web server technology, and to customize an installation fully, it helps to have at least some familiarity with XML and XML metadata schemas, PHP, HTML, MySQL, working from a command line, and database architecture.
Using CollectiveAccess requires a dedicated server, either a web server or a local server with a web environment, to host the software and database. Using a web server is the only way to take advantage of Pawtucket to create a front-end website, but hosting on a local server would be a viable solution for an organization that only wanted to implement a collections management platform based on Providence. At the AACL, we set up a web server with the help of a contract technical consultant, and I was responsible for customizing the AACL’s XML installation profile. Having direct access to the server, or being able to work in tandem with the person who has this access, makes it much easier to solve problems promptly and to create a system that best meets local needs. The support of a software developer or systems administrator is also useful, and sometimes essential, when troubleshooting and making modifications. The AACL eventually hired the developers of CollectiveAccess, Whirl-i-Gig, to assist with the infrequent problems we could not solve in-house and to customize some plugins.
CollectiveAccess’s metadata can be customized in several ways. For one, modifying an installation profile is the easiest way to create system-wide customizations, but the software also provides the ability to customize metadata fields, with only slightly less flexibility, once the program is installed. I opted to create a custom installation profile for the AACL because I had four specific uses in mind: to upload and create metadata for digital objects (i.e., a DAM system), to record administrative collections metadata for all special and archival collections, to create finding aids for archival collections, and to record metadata about the AACL’s historic artifacts collections. Individuals and institutions interested in customizing a CollectiveAccess instance for very specific uses should plan on evaluating the installation profiles that come with the software to determine whether one of these would meet local needs or if more modification would be better.
CollectiveAccess comes with a variety of XML installation profiles created by different institutions that have developed their own customized instance of the software and contributed them to Whirl-i-Gig for inclusion in the software package. Many of these are based on professional standards, and I started by modifying one that had been created for EAD (Figure 1). Modifying an installation profile also makes it possible to create controlled vocabularies, which was helpful at the AACL because we wanted to focus on information that was important to climbers such as type and grade of climbing (Figure 2) as well as information that would improve the system’s usability as a DAM, such as a field to select photo orientation. Prepopulating as much metadata as possible with controlled vocabularies and taxonomies in the installation profile also helped increase consistency in metadata creation.
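As a rough illustration of prepopulating controlled vocabularies in an XML file, the following minimal Python sketch generates list and item elements from plain term lists. The element and attribute names (`lists`, `list`, `item`, `code`, `idno`, `rank`) and the sample terms are assumptions made for this example; they are not taken from the CollectiveAccess installation profile schema or from the AACL's actual vocabularies, so the bundled profiles and documentation remain the authority on the real structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabularies: the terms are examples, not the AACL's actual lists.
VOCABULARIES = {
    "climbing_type": ["Alpine", "Bouldering", "Ice", "Sport", "Traditional"],
    "photo_orientation": ["Landscape", "Portrait", "Square"],
}


def build_vocabulary_xml(vocabularies):
    """Return an XML string with one <list> per vocabulary and one <item> per term.

    Element and attribute names are illustrative only and are NOT the
    CollectiveAccess installation profile schema.
    """
    root = ET.Element("lists")
    for code, terms in vocabularies.items():
        list_el = ET.SubElement(root, "list", {"code": code})
        for rank, term in enumerate(terms, start=1):
            item = ET.SubElement(
                list_el,
                "item",
                {"idno": term.lower().replace(" ", "_"), "rank": str(rank)},
            )
            item.text = term
    return ET.tostring(root, encoding="unicode")


print(build_vocabulary_xml(VOCABULARIES))
```

Keeping the terms in one plain data structure and generating the XML from it is one way to keep a long vocabulary consistent while a profile is still being revised.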
CollectiveAccess offers a unique database model with fourteen core fields in the database, each with some unique attributes that impact the software’s full functionality. Learning how all of the pieces fit together was another challenge, but Whirl-i-Gig offers significant, free, and regularly updated documentation on a wiki and in a support forum on the CollectiveAccess software site, which was critical for troubleshooting, improving the system, taking advantage of plugins, and managing software upgrades. Plus, even before we contracted for support service, Whirl-i-Gig was generous in offering occasional free support to the AACL. Anyone who plans to implement a CollectiveAccess installation should carefully read as much of the available documentation as possible to get the most out of the software and set up the installation correctly.
To create a model for the AACL’s CollectiveAccess instance, I started by completing some research. First, I interviewed all AAC staff to learn about their needs for a DAM platform and what information would be most helpful to them, what kinds of digital assets they currently created and how they managed them, and what would get them to use the new system. Next, I reviewed all of the existing .csv files with metadata from the AACL’s former installation of PastPerfect Museum Software and reviewed other finding aids and metadata created about the AACL’s archival and special collections to make sure that the new system could ingest pre-existing information. I also developed climbing vocabularies and a climbing location taxonomy aimed at AAC members who might want to search for information about climbing topics. Before completing the installation profile, I compared all of these kinds of metadata with both the structure of CollectiveAccess and common metadata standards to make sure that future crosswalks, including to EAD, could be successful. All of the planning and research, from learning the software and setting up the server, to interviews with staff, to testing the installation profile, took roughly three months of full-time work.
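The check that legacy metadata can be carried into a new system can be sketched as a simple crosswalk test. The minimal Python example below assumes a hypothetical mapping from legacy export columns to new field names and uses an invented one-row sample; the column names, field names, and sample values are all illustrative and do not reflect the actual PastPerfect export or the AACL's profile.

```python
import csv
import io

# Hypothetical mapping from legacy export column names to new field names.
CROSSWALK = {
    "Object ID": "identifier",
    "Object Name": "title",
    "Date": "date",
    "Description": "description",
}

# Invented sample data, for illustration only.
SAMPLE = """Object ID,Object Name,Date,Description,Home Location
1999.1.1,Ice axe,1938,Wooden-shafted ice axe,Shelf 4
"""


def crosswalk_rows(csv_text, mapping):
    """Map legacy rows to new field names.

    Returns the mapped records and the set of legacy columns that have
    no target field, so gaps in the mapping surface before migration.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    unmapped = set(reader.fieldnames or []) - set(mapping)
    records = []
    for row in reader:
        records.append(
            {new: (row.get(old) or "").strip() for old, new in mapping.items()}
        )
    return records, unmapped


records, unmapped = crosswalk_rows(SAMPLE, CROSSWALK)
print(records)
print("Columns with no target field:", sorted(unmapped))
```

Reporting the unmapped columns is the useful part of the exercise: each one marks a decision about whether the new system needs an additional field or whether the legacy data can be dropped.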
Several steps were necessary to prepare for the initial system activation. After creating the installation profile using oXygen XML editor, I loaded it on the server. In addition to the installation profile, several configuration files need to be modified with information about the installation. At the AACL we did not have the luxury of a test and development server alongside a production server. When we modified our installation, we were changing the live version, which made data security and backup even more essential in case something did not go as planned during an upgrade. We needed to be prepared to restore a previous installation and protect our data if the modification caused major problems.
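Where no separate test server is available, one precaution is to snapshot configuration files before every change to the live installation. The following minimal Python sketch shows the general idea with two hypothetical helper functions, `snapshot` and `restore`, and placeholder directory names; it is an assumption-laden illustration, not the AACL's procedure, and it covers files only, so the MySQL database would need its own separate dump.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def snapshot(source_dir, backup_root):
    """Copy source_dir into a timestamped folder under backup_root; return the new path."""
    source = Path(source_dir)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{source.name}-{stamp}"
    shutil.copytree(source, target)
    return target


def restore(snapshot_dir, target_dir):
    """Replace target_dir with the contents of snapshot_dir."""
    target = Path(target_dir)
    if target.exists():
        shutil.rmtree(target)
    shutil.copytree(snapshot_dir, target)


# Demonstration in a throwaway directory; all paths are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp) / "conf"
    conf.mkdir()
    (conf / "settings.txt").write_text("original settings")

    saved = snapshot(conf, Path(tmp) / "backups")
    (conf / "settings.txt").write_text("modified settings")  # the risky change
    restore(saved, conf)  # roll back after a problem
    print((conf / "settings.txt").read_text())
```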
Digital Asset Management (DAM)
The first purpose for the AACL’s use of CollectiveAccess was as a DAM system. As a result, I focused on this functionality the most when we first launched the software. I knew that the key to its success as a DAM rested on whether my coworkers would use the system, so I wanted it to be easy to use and intuitive. To this end, I required only five metadata fields for digital objects: a unique identifier (the filename of the uploaded digital asset), a name or title for the object, a date, rights information, and whether the resource should be publicly available. I also created a larger but unrequired set of metadata fields that could be completed on additional data entry screens. (CollectiveAccess also makes it easy to modify the order of data entry, add or remove required fields, and separate fields into different tabs.) After completing the required fields and uploading the associated file, users were not required to create additional metadata or do anything further.
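The rule of five required fields can be expressed as a short validation check. This minimal Python sketch assumes hypothetical field names (`identifier`, `title`, `date`, `rights`, `public`) and a plain dictionary per record; it illustrates the rule only and is not how CollectiveAccess itself enforces required fields.

```python
# Hypothetical names for the five required fields described above.
REQUIRED_FIELDS = ("identifier", "title", "date", "rights", "public")


def missing_required(record):
    """Return the required fields that are absent or blank in a record.

    A value of False is treated as a valid answer, because "public" is a
    yes/no choice rather than free text.
    """
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


# Invented example record: complete except for rights information.
record = {
    "identifier": "example_0001.tif",
    "title": "Climbers at base camp",
    "date": "1963",
    "rights": "",
    "public": False,
}
print(missing_required(record))
```

Keeping the required set this small means a record can be saved with very little effort, and fuller description can follow later on the optional screens.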
CollectiveAccess automatically processed a wide variety of file types and created access copies on-the-fly that were available through the program’s image and document viewer (Figure 3). This was both a strength and a weakness of the program; instantaneous access copies were great to have, but they required a lot of digital storage. Plus, if the files were large, the load times were long and the system could sometimes appear to hang, because of both the limited bandwidth of the AACL’s internet connection and slow server performance.
Despite multiple trainings for all staff on the DAM and why it was important, use by staff was extremely low. In fact, outside of the library, only three staff members or interns used Explore with any frequency. There was no mandate from AAC leadership that staff adopt the system, and without such a mandate, using Explore did not offer AAC staff enough direct benefit to outweigh the time it could take to add resources. At this point, the AACL decided to shift the focus of Explore to online exhibits using materials in the library that would support other AAC events and campaigns.
Online Exhibits and Metadata
Given my responsibility for adding content to Explore, I ended up uploading the majority of the files, primarily digitized photographs from the AACL’s archives, with significant volunteer support to create and refine metadata. When Whirl-i-Gig developed easy-to-use batch editing tools, released in a later version than the one we first installed, these tools helped us populate the system with material very quickly. By the end of the first year after Explore’s launch, the AACL had metadata for approximately 5,000 objects in its system, most with an accompanying image and only a few records that were strictly metadata, although the level and quality of description varied widely. Our original workflow was more focused on upload of individual items one at a time with metadata for each, but we soon started batch uploading digitized collections with the advent of the much-needed capability to batch edit metadata. To improve the quality and granularity of the metadata, we relied on volunteers to describe individual images and complete the records for as many digital resources as possible.
In fact, volunteers were an essential component of Explore’s creation in many ways. Volunteers not only helped create metadata in Explore, they also created digital exhibits, training manuals and user guides, the geographic taxonomy based on climbing destinations, finding aids, and object records. One volunteer had extensive computer networking expertise, and he helped us troubleshoot on more than one occasion when we encountered errors. Volunteers also frequently digitized materials for the AACL, which we began to add to Explore. Many of our volunteers had a library background, which certainly helped them quickly grasp and use the system effectively. The user interface in CollectiveAccess is fairly easy to navigate even for non-experts, however, which mattered because many of our other volunteers did not have a library background.
The debut exhibit we created was about the first American ascent of Mount Everest in 1963, as the AAC celebrated the 50th anniversary of this climb as the theme of the Club’s annual dinner in February of 2013 (Figure 4). We soon found that the exhibit plugin did not meet the specifications of the marketing team, which wanted the plugin to facilitate the inclusion of long, multi-paragraph captions. Ultimately we were able to accomplish a workaround by modifying the code, creating a set of objects that comprised the exhibit, and using the object detail page for exhibit text entry in a special field meant for this purpose. After the Everest exhibit, the AACL curated two more exhibits using the same method: one on the centennial of the first ascent of Denali and one on the climbing history of Yosemite National Park.
Development of the Everest exhibit occurred while we prepared for the public launch of Explore’s Pawtucket-based front-end website (Figure 5). The announcement of the exhibit was the marketing “hook” for the launch, but we had also uploaded and created metadata for both a considerable amount of unrelated content and extensive supplemental digitized photographs to accompany the main Everest exhibit. Explore went live on February 6, 2013, and within five months, we recorded around 4,700 unique visitors using Google Analytics, whose tracking code we had pasted into the Pawtucket code to measure the site’s usage. Pawtucket users should consider a similar modification if they wish to implement analytics functionality, since the software does not include this feature.
After the Everest exhibit and launch of Explore, we started describing archival collections and creating finding aids in Providence, which was relatively straightforward based on the AACL’s customized installation profile. Having an intern who was able to develop excellent documentation and an instruction manual for describing archival collections in CollectiveAccess was a tremendous advantage, and he and two other interns entered finding aids into the system. CollectiveAccess comes with a finding aid plugin for Pawtucket, but unfortunately, the AACL never took advantage of this plugin in Explore because AAC leadership prioritized the implementation of other features.
However, the back-end interface to Providence was a good alternative to provide access to archival collections via our installation, as it enabled us to enter all of the information required by Describing Archives: A Content Standard, as well as to record storage location information (Figure 6). When researchers needed access to a finding aid, we could provide them with a user account for logging in to Providence with read-only privileges to navigate the finding aid’s hierarchy and find what they were looking for. Users considering CollectiveAccess and Pawtucket should keep in mind that its finding aid plugin provides the only means through which to provide public online access to hierarchically-rendered finding aids without additional code development and the creation of new user accounts.
The AACL implemented CollectiveAccess to accomplish several primary goals: the creation of digital exhibits, administration of archival and special collections, provision of access to digital collections, and management of digital assets. Although the use case at the AACL for CollectiveAccess shifted during the course of its implementation, the software was agile enough to allow for these changes. Its batch processing features, flexibility and customization options, accommodation of metadata standards, image processing, and low cost were of the most benefit to the AACL, while the learning curve for implementing a CollectiveAccess instance and not being able to take advantage of the finding aid plugin were the two biggest obstacles. Based on my experience at the AACL, I encourage archivists considering CollectiveAccess for generating online finding aids to evaluate carefully whether Providence and Pawtucket will provide archival researchers with the most user-friendly experience or if another platform would be better. Ultimately, archivists will find that the software is used most easily as a collections management system, as the front-end site generated through Pawtucket does not come preconfigured to display finding aids without turning on a plugin. Even so, CollectiveAccess is a good alternative to similar programs, especially for users who would like a highly customizable system.
I would like to recognize the following individuals for the essential role they played in the creation of Explore: Allison Bailey, Julia Blase, Luke Bauer, Dan Cohen, Alex Depta, Chris Feldbush, Donna Hamilton, Craig Hoffman, Chris Jaquet, Seth Kaufman, Matt Klick, Erik Lambert, Adam McFarren, Max Miller, Erich Purpur, Katie Sauter, Wendy Thomas, and especially Beth Heller for the tremendous leadership and vision she brought to the American Alpine Club Library as Library Director.