On July 26, 2015, the United States celebrated the 25th anniversary of the Americans with Disabilities Act of 1990 (ADA), one of the most significant civil rights laws of the 20th century and the result of decades of work on behalf of disability rights: “a clear and comprehensive national mandate for the elimination of discrimination against individuals with disabilities.” In recognition, the Dole Archives at the University of Kansas (KU) created an original exhibit on disability rights in the U.S. from the perspective of Senator Bob Dole’s experience. We supplemented the curated exhibit with over 12,000 thematically related pages of archival documents, implementing a hybrid approach of item-level and folder-level scans, and providing access via a SIMILE Exhibit interface embedded in a responsive web site.
This case study focuses on the digitization efforts of the project, the creation of the web exhibit, and relevant lessons learned through the process. It illustrates an example of the practical aspects of a smaller institution’s efforts toward (a) an MPLP-inspired approach to bulk digitization, including folder-level scanning and minimalist metadata creation, and (b) the use of open-source technology (specifically, Bootstrap for responsive web design and SIMILE Exhibit for an interactive digital collection) to facilitate discovery and access to a large amount of content in a way that is usable, accessible, and flexible.
Digitization: MPLP vs. Bulk Scanning
When Mark Greene and Dennis Meissner’s article, “More Product, Less Process: Revamping Traditional Archival Processing,” came out in 2005, the ideas and practices of MPLP were initially applied largely to improving efficiency in analog processing. In 2010, Greene pointed out that, rather than “a one-size-fits-all approach to processing,” the authors’ original intention for MPLP was to help archivists define a minimum level of archival functioning (including digitization and metadata creation), with the understanding that the level of detail would increase when warranted by specific materials. This is succinctly summarized in Greene and Meissner’s 2010 follow-up article in these three “bare essentials”:
- Make user access paramount: Get the most material available in a usable form in the briefest time possible.
- Establish an acceptable minimum level of work, and make it the processing benchmark.
- Embrace flexibility: Don’t assume all collections, or all collection components, will be processed to the same level.
Digital presentation of carefully selected and extensively described items, as found in “boutique” collections, provides an institution with the chance to highlight particular items from its collection. However, this requires a significant investment of resources to create the digital assets and metadata, and the result represents only a fraction of the total resources available in the analog collection. In 2007, OCLC released a report calling for “scaling up digitization of special collections,” valuing access over preservation and quantity over quality, “with the recognition that large quantities of digitized special collections materials will better serve our users.” Sutton (2012) notes the “ongoing shift away from resource-intensive digitization processes toward large-scale production models is being driven by both MPLP principles and the increasing need to maximize online access to collections.” With the advent of mass digitization projects, such as Google Books or the Internet Archive’s digitization initiative, users have come to expect large amounts of our materials to be available online.
In a study of academic historians’ use of digitized archival collections, Chassanoff (2013) suggests that bulk online collections may be both appreciated and distrusted by researchers. While the benefits of mass digitized archival materials are many (e.g., increased access, lower cost of production), it can be difficult for users to “discern both context and relevance,” as well as the “coverage and extensiveness.” Scholars also expressed dissatisfaction with incomplete online resources and “want assurance that the entirety of the archival collection is made available to them.” On the other hand, some testing has indicated that users may prefer more descriptive metadata over minimally described bulk digital collections.
Each of the extremes of the digitization spectrum carries its own benefits and challenges. Fortunately, options are available in the middle path. One such model is folder-level presentation of materials. In contrast with an item-level approach, folder-level scanning creates a greater volume of material at a lower per-item resource cost, while introducing the added benefit of retaining the document’s context and original order. This provides the researcher with the opportunity to mimic the experience of looking through the physical folder, increasing the potential of discovery through related documents. In the same vein, the application of minimal metadata standards to digitized items further helps to increase the amount of material available for researchers, while still providing a means for discoverability. Sutton describes the goal of an MPLP approach to metadata creation thus: “to provide enough description to get users ‘in the ballpark…’ then let them take over responsibility for finding the specific items that meet their needs.”
This case study will serve as an example of how a combination of these approaches can be used in the same digital collection, presenting an approach in the same vein as Sutton’s description of the digitization of the John Muir Papers: “not an either/or comparison of boutique vs. large-scale approaches, but rather an integration of them that embraces the MPLP tenant of adopting rapid, minimalist processes when possible and intensive, detailed processes when merited.”
Responsive Web Design
In 2000, long before adaptable web design was a reality, John Allsopp wrote, “It is the nature of the web to be flexible, and it should be our role as designers and developers to embrace this flexibility, and produce pages which, by being flexible, are accessible to all.” This mindset is firmly in line with archivists’ goal of providing “information in ways that meet their users’ needs, using systems and tools that users understand.” Responsive web design is not a new concept, yet little exists in the literature about its use for digital collections in a cultural heritage setting. The library field has produced some pertinent articles, however, and many content management platforms, such as WordPress, Drupal, or Omeka, have options for responsive themes.
Web developer Ethan Marcotte popularized the term “responsive web design” (RWD) in a 2010 article identifying an alternative to device-specific web design. Marcotte explains that, beyond the technical attributes of coding, RWD “requires a different way of thinking. Rather than quarantining our content into disparate, device-specific experiences, we can use media queries to progressively enhance our work within different viewing contexts.” RWD relies on a fluid grid framework, in combination with media queries and flexible images, to restyle the same web page to best fit the user’s screen. Properly executed RWD will display as if it had been designed for the screen that is being used, regardless of the dimensions. Users are increasingly accessing the Internet on smaller screens and mobile devices. In order to facilitate and encourage access to our collections, it is important to create interfaces that will work well with all devices, and RWD is a key component of this process.
There are a number of options available for cultural heritage institutions to create engaging online collections, such as Omeka or CollectiveAccess, and many of these have been explored and reported on. SIMILE Exhibit is one such open source framework that can be used by archives to create robust digital collections. As an option, however, it is largely missing from the archival literature.
Growing out of the Massachusetts Institute of Technology’s (MIT) SIMILE project, SIMILE Widgets is a collection of “open-source web widgets, mostly for data visualizations.” Exhibit (one of these widgets) allows users to create web pages that simulate database-driven web sites, with powerful sorting, filtering, search, and other customizations, by using only HTML and CSS. In 2011, David Karger, computer science professor at MIT, described the usefulness of the software like this:
Impressive data-interactive sites abound on the web, but right now you need a team of developers to create them. Exhibit demonstrated that authoring data-interactive sites can be as easy as authoring a static web page. With Exhibit 3.0 we can move from a prototype to a robust platform that anyone can use to author (not program) rich interactive information visualizations that effectively communicate with their users.
Although SIMILE applications are not widely used by cultural heritage institutions for presentation of archival documents, given their flexibility and lower technological barriers, they are well worth considering, especially for smaller institutions.
About the Dole Archives
The Dole Archives is a Congressional archives at the Robert J. Dole Institute of Politics at the University of Kansas. The collections consist primarily of the Dole Congressional papers, created during his 36-year career, as well as some smaller related collections. The majority of the Dole collections are processed, and folder-level finding aids are available via our Archon interface. At this time, we do not have an institutional repository that provides good support for digital objects, although we do have some assets hosted in Archon. Our digital materials are generally available through standalone collections, related either through provenance or through thematic relationships, such as the one described here. However, we are in the process of implementing a digital asset management system that will supplement our online presence by consolidating our resources for improved access and discoverability.
About the Exhibit
The exhibit, Celebrating Opportunity for People with Disabilities: 70 years of Dole Leadership, was created in conjunction with commemorateADA, a series of public programming events at the Dole Institute recognizing the 25th anniversary of the signing of the ADA and Senator Dole’s role in its passage. The physical exhibit is relatively large, with ten distinct sections, some with additional sub-sections, mounted on 2D graphic panels and consisting of archival documents and photographs and original interpretive text. The web site was created to reproduce this content, expanding it with additional archival materials and resources. It has been well received and was the recipient of a 2015 Kansas Museums Association Technology Award.
The web exhibit is composed of three primary parts. The first is an online representation of the physical exhibit: a semi-chronological exhibition on Bob Dole’s life-long involvement with disability rights. This “boutique-style” portion includes over 75 archival documents and photographs, tracing Dole’s advocacy efforts on behalf of people with disabilities – from his own struggles following his injuries in World War II to his current and continuing efforts with the Convention on the Rights of Persons with Disabilities (CRPD), wounded veterans, and other disability-related causes.
The second key part of the website is what we have called “ADA in the Dole Archives.” Drawing on a number of our archival collections, the Dole Archives digitized 12,392 pages of primary source documents related to the ADA. These documents were all digitized in-house and made available for download as PDF files via an interactive SIMILE Exhibit interface.
The third part of the website, “In the Classroom,” provides an educational resource for middle and high school teachers. Drawing on primary source materials, members of the KU Council for the Social Studies (KUCSS) developed a freely available lesson plan for government and history teachers, discussing bipartisanship using the passage and application of the ADA as a real-world example. As a service for visually impaired individuals, the Dole Archives partnered with the Kansas Audio-Reader Network to create an audio narration and description of the physical exhibit, accessible in-house via individual QR codes for each section. Although the layout and some of the physical characteristics of the physical exhibit differ from those of the web exhibit, this narration was also made available as streaming audio files on the website, providing another level of accessibility for users. The website in its entirety can be viewed here: http://dolearchivecollections.ku.edu/collections/ada/.
Selection and Description
The Dole collections include a wide variety of material related to disability issues, including constituent mail, legislation, speeches and press releases, in-office memos, notes and reports, and many other types of documents. From the earliest stages of the project, we knew that we wanted to include a large selection of this material to enrich the curated exhibit with a robust set of archival documents for researchers, as a way of “facilitating rather than controlling access.” As a department, we have discussed undertaking an MPLP-inspired approach to digitization, specifically through folder-level scans and minimal metadata descriptions around a theme or subject. With a good deal of overlapping thematic content between collections, the ADA materials lend themselves well to this type of approach, and this project provided a good opportunity to test our workflow.
To create an initial list of material for digitization, we searched the folder titles of all of our collections for keywords relating to disabilities and disability legislation. This first fairly exhaustive search yielded over 1,100 folders of material, plus several hundred individual items – an estimated volume of well over 100,000 pages. As this amount of material was infeasible for us to digitize in the allotted time frame (about 9 months), we narrowed the scope to include only material that dealt with the ADA specifically (93 folders). To this, we added a selection of speeches on disabilities that Dole gave between 1964 and 1996 (29 speeches), as well as nearly all of a smaller collection from Alec Vachon, Dole’s Legislative Assistant on disability issues from 1993 to 1995 (18 folders). The addition of the Vachon Collection served two purposes: first, it provided a glimpse of the state of disability issues just a few years after the passage of the ADA, and second, it allowed us to test a workflow that had previously been created for digitizing smaller collections in their entirety.
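The keyword pass over folder titles described above can be sketched roughly as follows. This is only an illustration: the search terms, the sample folder titles, and the function names are hypothetical and are not taken from our finding aids or from the search we actually ran.

```python
import re

# Hypothetical search terms. Word stems are matched case-insensitively,
# while the acronym must appear as a whole upper-case word so that
# titles such as "Trade - Canada" are not picked up by accident.
STEM_PATTERN = re.compile(r"disabilit|disabled|handicap", re.IGNORECASE)
ACRONYM_PATTERN = re.compile(r"\bADA\b")


def is_disability_related(folder_title):
    """Return True if the folder title contains any of the search terms."""
    return bool(
        STEM_PATTERN.search(folder_title) or ACRONYM_PATTERN.search(folder_title)
    )


def select_folders(folder_titles):
    """Keep only the folder titles that match, preserving their order."""
    return [title for title in folder_titles if is_disability_related(title)]


# Invented folder titles, for illustration only.
sample_titles = [
    "Americans with Disabilities Act - Correspondence",
    "Agriculture - Wheat Subsidies",
    "ADA Implementation, 1991",
    "Trade - Canada",
    "Handicapped Access to Federal Buildings",
]
selected = select_folders(sample_titles)
```

A title-only search of this kind is fast but will miss relevant folders whose titles do not mention the topic, which is one reason a broad first pass followed by manual narrowing was a practical fit for this project.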
Ultimately, we decided to undertake a hybrid approach to digitization and presentation, following the MPLP tenet of embracing a flexible approach to how materials are handled. The Vachon Collection and Dole Speeches (1,829 pages) were digitized and described at the document level, whereas the remainder of the material (10,563 pages) was selected, digitized, and described at the folder level. Integrating these two approaches maximizes the amount of digital content that we are able to offer, while allowing access to our collections from a variety of research perspectives.
We digitized all materials for this exhibit in accordance with our normal scanning workflow. The bulk of our in-house scanning is done by student workers on an Epson Expression 11000XL flatbed scanner. All documents are scanned with SilverFast as 300 dpi full-color JPEG images at 100% quality and bundled into PDF files with Photoshop. To ensure consistent handling of the PDF files, an action is set up in Adobe Acrobat and applied to all scanned documents, allowing the same processing steps to be quickly and uniformly applied to all digitized items. Among other steps, this action saves a copy of the full-size PDF as a PDF/A preservation copy, applies OCR to the file, and generates a smaller access copy (also PDF/A) for web presentation.
Metadata is captured at the time of scanning in a local Access database, which is mapped to Dublin Core elements and can be exported for a variety of uses. For folder-level scans, we transcribe the folder title as the item title, with the phrase “(Entire Contents)” appended to indicate that it is the complete folder. The date range is estimated based on the existing folder title and the student’s observations. The description field provides a broad overview to the contents of the folder (e.g., “Multiple document types related to the ADA, disabled Americans, advertisements for disability/accessibility technology, etc.”). One to three subject terms are attached to the record, and technical metadata is recorded. With the help of some automation, the metadata process generally takes less than two minutes per document/folder.
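A folder-level record of the kind described above might be assembled along the lines of the following sketch. The Dublin Core element names (title, date, description, subject) are standard, but the function, the dictionary layout, and the sample title, dates, and subject terms are assumptions made for illustration; they do not reproduce the structure of our Access database.

```python
def folder_level_record(folder_title, start_year, end_year, description, subjects):
    """Build a minimal Dublin Core-style record for a folder-level scan."""
    if not 1 <= len(subjects) <= 3:
        raise ValueError("expected one to three subject terms")
    # A single year is recorded as-is; otherwise an estimated range is used.
    if start_year == end_year:
        date = str(start_year)
    else:
        date = f"{start_year}-{end_year}"
    return {
        # The folder title is transcribed and marked as a complete folder.
        "dc:title": f"{folder_title} (Entire Contents)",
        "dc:date": date,
        "dc:description": description,
        "dc:subject": list(subjects),
    }


# Hypothetical folder title, dates, and subject terms.
record = folder_level_record(
    "Americans with Disabilities Act",
    1989,
    1990,
    "Multiple document types related to the ADA, disabled Americans, "
    "advertisements for disability/accessibility technology, etc.",
    ["Americans with Disabilities Act", "Disability rights"],
)
```

Keeping the record to a handful of fields is what makes the two-minutes-per-folder pace realistic; richer description can still be added later for items that merit it.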
Creating the Web Exhibit
When creating the web exhibit, we identified one seemingly simple outcome that we wanted to accomplish: using freely available technology, provide access to a large amount of content in a way that is usable, accessible, and flexible. With this one overarching aim in mind, there were several smaller goals that we wanted to achieve.
Goal #1: To interpret a relatively large physical exhibit and present it in a web-friendly format that would be attractive, easy to navigate, and blend well with the related content.
The layout of the content was the first challenge in presenting the curated exhibit. With ten distinct sections and six additional sub-sections, the design could easily become unwieldy, confusing, and difficult to navigate.
Following current trends in web design, we decided to present all of this content on a single page, using headings to create visual breaks between the sections. A fixed sidebar navigation menu lists all of the sections and indicates where the user is in the exhibit, serving to orient the user and to provide an easy way to move between sections. Similarly, a navigation menu fixed to the top of the page allows quick access to the other primary parts of the website (i.e., “ADA in the Dole Archives,” “In the Classroom,” and “About”).
Additional related primary source materials are included at the bottom of each section in a styled div called “From the Archives,” making their relative location and visual appearance consistent throughout the exhibit. Thumbnails of images and documents are linked to open in a popup lightbox, allowing quick navigation to all scans from the same section, and are accompanied by a direct link to download the JPEG or PDF file. This combination of features serves to make the content accessible and easy to navigate for all users.
Goal #2: To make use of free open source technology and resources to create the web exhibit and present the archival documents.
The second primary purpose of this web site is to disseminate a large volume of digitized archival material related to the ADA (i.e., “ADA in the Dole Archives”). The goal was to provide access to disparate materials from multiple collections in a way that is flexible and user-driven, without losing the context or allowing the presentation of the content to be overwhelming by its volume. Further, it was important for us to use open source technology as a means to provide access to the files themselves. We have used SIMILE Widgets in other document presentation projects and decided to make use of their Exhibit software.
Creating a digital collection with Exhibit begins with a simple HTML page containing links to MIT’s application programming interface (API) and to the data to include; the display is then customized to fit the needs of the data. The three main aspects of the Exhibit design are views, lenses, and facets:
- A view is the overall display of the whole collection, which could take the shape of a table, a timeline, a map, a bar chart, or a variety of other layouts. It is also easy to display the same content through different views on the same page, or in separate tabs.
- A lens allows customization of the way an individual item is displayed.
- Facets are different tools that allow for filtering, browsing, sorting, and searching on specific fields (or properties). A facet can be a multi-select list, a text box, a word cloud, or one of a number of other common options.
All of these aspects can be arranged in HTML and styled with CSS. The specifics of the web coding are beyond the scope of this article, but there are good resources and tutorials readily available for interested parties. The basic HTML code is fewer than 40 lines long, and, with a bit of experimenting, can be very easy to work with.
The data that populates the exhibit is pulled from a JSON file, where each entry is composed of a set of properties. The properties are completely flexible and can be anything the user wants to include, such as dates, names, geolocation coordinates, tags, collection titles, etc. While generating a JSON file by hand can be tedious, SIMILE offers a service called Babel that can convert data between formats. For the purposes of Exhibit, Babel can create a JSON file from an Excel or Google spreadsheet, a tab-delimited file, or an RDF/XML file, which makes the conversion simple and effective.
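To make the shape of the data file more concrete, the following sketch converts a small tab-delimited table into the kind of JSON structure Exhibit reads: a top-level "items" list in which each entry is a set of properties, including a "label". This is a stand-in written for illustration and is not Babel itself; the column names other than "label" and "type", the sample rows, the file paths, and the semicolon convention for multi-valued fields are all assumptions.

```python
import csv
import io
import json

# Invented sample rows; the columns after "label" and "type" are arbitrary
# properties chosen for this illustration.
SAMPLE_TSV = (
    "label\ttype\tdate\tsubject\turl\n"
    "Example Folder (Entire Contents)\tFolder\t1989-1990\t"
    "ADA; Legislation\tfiles/example_folder.pdf\n"
    "Example Speech\tDocument\t1969\tDisability\tfiles/example_speech.pdf\n"
)


def tsv_to_exhibit_json(tsv_text, multi_valued=("subject",)):
    """Convert tab-delimited text into a dict with an "items" list.

    Columns named in multi_valued are split on semicolons into lists,
    so that each value can be filtered on separately.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    items = []
    for row in reader:
        item = {}
        for key, value in row.items():
            if not value:
                continue  # leave empty cells out of the item entirely
            if key in multi_valued:
                item[key] = [part.strip() for part in value.split(";") if part.strip()]
            else:
                item[key] = value
        items.append(item)
    return {"items": items}


exhibit_data = tsv_to_exhibit_json(SAMPLE_TSV)
exhibit_json = json.dumps(exhibit_data, indent=2)
```

Because every item is just a bag of properties, adding a new field to the collection is a matter of adding a column to the spreadsheet and regenerating the file.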
The Exhibit framework provides the ability to include a customizable and interactive interface to all of these files as part of the digital collection. In combination with the integrated filter, sort, and search capabilities, the Dole Archives is able to provide access to these files in a way that is simultaneously curated and user-driven.
Goal #3: To utilize responsive web design, so that the web site would scale to meet the user’s technology, from mobile to desktop.
Creating the responsive layout is the work of a few basic steps. The first involves visualizing the content in its various layouts and applying Bootstrap’s CSS class system to achieve this end. Bootstrap works in a 12-column grid system, with four screen sizes identified in media queries (large, medium, small, and extra-small). While the details of the language are outside of the scope of this writing, the following basic example may help to visualize how this works. An element coded as
<div class="col-xs-12 col-md-6 col-lg-4"></div>
will display in 12 columns (or 100% width) on small and extra-small screens (“col-xs-12”), six columns (50% width) on medium screens (“col-md-6”), and four columns (33% width) on large screens (“col-lg-4”). These simple classes, properly applied, will allow the content to rearrange itself depending on the user’s screen with no additional work required on the part of the coder.
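The way these classes resolve at different screen widths can be modeled in a few lines. The sketch below is not part of Bootstrap; it simply mimics the mobile-first rule that a class applies at its own screen size and at every larger size until another class overrides it. The pixel breakpoints are the Bootstrap 3 defaults and are assumed here, as are the function names.

```python
import re

# Bootstrap 3 default minimum viewport widths in pixels (assumed unmodified).
BREAKPOINTS = [("xs", 0), ("sm", 768), ("md", 992), ("lg", 1200)]
GRID_COLUMNS = 12


def column_span(class_attr, viewport_px):
    """Return how many of the 12 grid columns an element occupies."""
    spans = {
        tier: int(count)
        for tier, count in re.findall(r"\bcol-(xs|sm|md|lg)-(\d+)\b", class_attr)
    }
    span = GRID_COLUMNS  # with no applicable class, a div is full width
    # Walk from the smallest tier upward; the last matching tier wins.
    for tier, min_width in BREAKPOINTS:
        if viewport_px >= min_width and tier in spans:
            span = spans[tier]
    return span


def width_percent(class_attr, viewport_px):
    """Return the element's width as a percentage of its row."""
    return 100 * column_span(class_attr, viewport_px) / GRID_COLUMNS


classes = "col-xs-12 col-md-6 col-lg-4"
phone = column_span(classes, 480)      # extra-small screen
tablet = column_span(classes, 800)     # small screen, inherits col-xs-12
laptop = column_span(classes, 1000)    # medium screen
desktop = column_span(classes, 1300)   # large screen
```

Note that the small-screen case needs no class of its own: because no "col-sm-" class is given, the extra-small setting carries upward until "col-md-6" takes over.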
In addition to the flexible grid system, another key component of responsive web design is formatting the content for the user. This may include things such as having different image sizes and/or ratios for different screens, collapsed navigation menus on small screens, and other techniques for creating the best experience possible for your user’s device.
In addition to reformatting the layout of the content, we were also able to reduce the size of the web exhibit for smaller screens. For example, the “From the Archives” sections that contain additional primary source materials are in collapsible elements that by default are open for larger screens and collapsed for smaller ones, reducing the size of the page for users on smaller screens, while allowing them to view the content on demand.
Since Bootstrap and Exhibit are both HTML frameworks, it is actually very easy to make the two of them work together. By assigning Bootstrap classes to the Exhibit HTML, the Exhibit layout becomes responsive, adjusting to the user’s screen.
Goal #4: To comply with web accessibility recommendations.
“Web accessibility refers to the inclusive practice of removing barriers that prevent interaction with, or access to websites, by people with disabilities.” In addition to simply being good practice for web design, creating a web site that meets accessibility recommendations is essential for an exhibit related specifically to disabilities. The Web Content Accessibility Guidelines 2.0 (WCAG) were released in 2008 and are currently the most widely accepted standards for creating accessible websites. To meet this goal, we explored a variety of ways to verify compliance and to identify accessibility problems on a page. A number of online services exist for this purpose, and, after testing several options, we found the well-established Web Accessibility Evaluation Tool (WAVE) to be easy to use and understand and the best fit for our needs. This tool examines a given URL and displays the web page with accessibility concerns flagged with color-coded icons and brief descriptions.
It should be noted that a large portion of the content in our web exhibit does not meet accessibility guidelines by its very nature (i.e., it consists of scanned documents in PDF format). PDF files can be problematic at the best of times, requiring a number of steps and guidelines to be followed to allow accessibility by all users. While OCR can provide readable text, the descriptive information, tagging, and other criteria required mean that creating fully accessible versions of scanned materials is extremely resource-intensive and time-consuming, something that is not feasible at this time for most bulk digitization projects.
Lessons Learned and Future Steps
The combination of Bootstrap and SIMILE Exhibit as a solution for presenting this digital collection proved beneficial. The flexibility and amount of customization possible allowed us to tailor the site to our needs and the specific character of this multi-part exhibit without having to use funds beyond our normal operating budget. Content included within the Exhibit layout is easy to update, allowing for potential future expansion of documents without the need for additional coding or design work. In addition, both of these frameworks have good community support for problem solving or other informational needs.
One notable issue that we had with Exhibit for this project was the inability to search within the full text of PDF files, a feature that is available in Omeka and other software. In previous digital collections, we developed workarounds for this, such as pasting the full text of a letter into a JSON property and making it searchable but not viewable, or embedding a Google custom search engine in the site. Neither of these options was feasible for this ADA exhibit, so for the time being, although we can search titles and descriptions, there is no full-text search of PDF files. However, we are actively exploring options and hope to add this feature in the future.
While the majority of the Exhibit layout is customizable and flexible, some sections are hardcoded and cannot be changed without editing the API itself. For example, Exhibit offers the option to include a sorting feature for the collection (e.g., the user can choose to sort by date or name, or to have the items grouped by location, etc.). However, the heading area which allows the user to select the sorting property is not a particularly intuitive design and is not editable. This limited flexibility is something that will be encountered in any software that is not custom-built for the specific purpose, and, for us, was something that we could live with.
There are a few things that we would do differently (and may indeed do for this exhibit in the future). One is to perform user testing with people with disabilities, which is part of any good web design strategy. It is one thing to have software evaluate a web site for compliance, but that is no replacement for actual usability testing. Prior to releasing the website, we did put out a call to the KU community (via KU’s accessibility office) for volunteer testers, but we did not have any respondents, possibly because the request came at the beginning of summer.
Also, this project did serve to reemphasize for us the relative inefficiency of bulk scanning on a flatbed scanner. While this method does provide excellent digital surrogates, the low rate of production is resource-intensive. Shifting away from flatbed scanning for documents (especially those that are created for access rather than preservation) to an overhead scanning method will greatly increase the rate of capture, making the process of bulk digitization more sustainable and efficient for future projects.
As the archival community moves forward with ever-increasing options for creating, presenting and accessing digital content online, finding the right approach for an institution takes time, and it will certainly continue to shift as the web and user expectations change. Through the creation of this exhibit, we have attempted to address several topics involved with creating and providing access to a relatively large-scale digital collection that works for our institution. It is our hope that providing a blend of item-level and folder-level access will allow researchers to access these materials in the way that best meets their needs, from students looking for individual documents to begin their research, to academic scholars wanting to see as much pertinent material as possible. Working with contemporary web design standards and open source technology to present this material provided a good opportunity to combine a curated exhibit with additional related materials in a way that will be useful for many years to come.
Robert J. Dole Senate Papers-Legislative Relations, 1969-1996, Box 753, Folder 10, Robert J. Dole Archives and Special Collections, University of Kansas, accessed November 3, 2015, http://dolearchivecollections.ku.edu/collections/ada/files/s-leg_753_010_002.pdf.
This work is licensed under a Creative Commons Attribution 4.0 International License.