Introduction and Narrative
The purpose of this section is to explain the associated costs and funding sources for this project. A discussion of scholarship related to funding digital archives and financial considerations for archivists follows. Some of the key considerations discussed are the overarching constraints of funding on each level of the archival process and the subsequent negotiations between the idealized archival design and the design that is possible within such constraints. Also discussed are issues related to the costs of Optical Character Recognition (OCR) software for making scanned images keyword searchable and the labor of securing copyright permissions for digitizing previously published materials. The ongoing challenge of maintaining digital data through long-term funding is also explored.
More practical than theoretical, the issue of funding is one that any digital project must confront in the planning stages. There is general consensus in scholarship that “funding is extremely important for digital projects…[and] it is important for project managers to think carefully about funding models and partnerships at the outset of any digital historiographic project, especially community projects” (Smith 126). This is because there are three distinct phases of funding needs to consider: costs associated with data collection, the archive’s initial design and construction, and finally its ongoing maintenance and preservation in the digital space. To obtain artifacts for the archive, there may be a need to travel to collection locations and to procure the appropriate technology to digitize and store collected artifacts. In terms of creating the archive, there is a large investment needed in human labor: securing copyright for possibly protected artifacts, correcting and editing text associated with scanned documents, and organizing, describing, and arranging the artifacts while designing and constructing the digital archive that houses them. There are also possible costs associated with the archive’s technological needs, such as software to manage records and any fees associated with registering a domain name, purchasing server space, and paying for web hosting services. These costs enable an archive to be brought into existence and to be made publicly available, but they are rather finite in scope. The costs of ongoing maintenance and preservation will need to be considered for the duration of the archive, as there are often annual fees associated with server space and web hosting along with the need to proactively counter problems of digital degradation and obsolescence.
While every archival project differs in scope and design, and will necessarily have differing funding needs, ensuring the project can be brought to fruition and sustained requires careful financial planning and fundraising strategies. Rhetoricians engaged in historiographic archival work will need to understand the material constraints of developing digital materials and make arrangements for the sustainability of any project prior to investing resources, including both time and money. Despite the urgency of recovering fragile and time-sensitive evidence of marginalized cultures, the development of archives requires careful consideration of the immediate and long-term resources needed to initiate and maintain any recovered materials. Undertaking a project without planning for the associated costs may result in the inability to ethically collect and manage cultural artifacts or to deliver on any promises or intentions for their care presented to participants. To invite and encourage participation without having the proper conditions to receive contributions can constitute a breach of trust, and the resulting distrust can endanger the success of future archival projects. However, understanding the constraints of funding, working realistically within them, and presenting them honestly to participants can lead to successfully realized historiographic archives.
I knew going into this project that funding would need to be addressed. In one of the doctoral courses I took in New Media studies, I had developed a website featuring digitized copies of the underground magazine Inquisition while documenting the process in a series of blog posts. Making the decision to convert the free WordPress site to a traditional website with greater functionality was something I discussed, along with the associated costs, and it resulted in a fairly dynamic site with the ability for users to engage with the content and with other users through an added group organization function. However, after several years of paying out of pocket for hosting, I was no longer able to afford the upgraded site and had to revert to the original free site that is still available, losing several aspects of the site that represented hours of development work (like the group function and an annotated bibliography with embedded content). This experience led me to add a funding category to an independent study I subsequently did in preparation for the dissertation, which allowed me to explore research related to the issues of funding digital projects and the methods for addressing the challenge. The experience and research together helped to prepare me for planning this project’s funding, or it at least made me aware that addressing funding would be a key aspect of the planning.
At the outset, I had a sense of what would need to be funded and an idea of the things I would like to have funded. I knew I would need to pay for travelling to Rhode Island to collect the interviews and artifacts, and I knew I would need to pay for a domain, storage, and web hosting for a 5-year commitment. Since I felt strongly, and to a large degree still do feel, that I wanted to have an archive that appeared as professional as possible, I wanted to ensure that the artifacts were represented by images of the highest quality. I knew this meant that the initial data collection would then require the use of a quality camera and scanner, which would also likely require that I gain some training in photography and videography to capture the best images. Once the artifacts were collected, I also wanted to be able to organize and manage the digital records with a professional system, namely the museum software PastPerfect. I felt this would not only give me some important practical experience using an industry standard in records management, but that it would potentially better enable the archive to be transferred to an appropriate institution if personal hosting became no longer viable. Although there are certainly ways to capture images and put them online for free, it was important to me that the archive project an authoritative and trustworthy image, along with doing justice and honor to the contributors, which I felt necessitated a financial investment in these various technologies.
At the time I was writing the prospectus, I knew about a grant opportunity through the Luso-American Education Foundation (LAEF) that, if awarded, would help support the project, specifically with travel and technological needs. However, I was writing the prospectus in the summer of 2016 and the grants would not be determined until February 2017. I submitted my application materials, but I would have to proceed with planning the project not knowing whether I would have these funds. I submitted my IRB application in January of 2017, which I needed to begin collecting artifacts, and I was planning to travel in March of 2017, coinciding with the spring semester break at the university where I work. I felt constrained by these dates because taking time off during the semester is problematic for an instructor, and I really wanted to have the data collection completed before the summer, when I would have more time to work with the artifacts and build the archive. Once I received IRB approval, I was in a position where I needed to book the flight and reserve lodging and transportation before hearing back from the LAEF, so I paid for them out of personal funds. Shortly after booking, I was fortunate enough to hear that I had been awarded a $2,500 grant, which I could use in part to reimburse the cost of travel, which, with the added cost of food and gas, came to almost half of the grant. I planned to use the remainder of the funds to purchase the desired technology.
In the planning stages, I determined that I wanted to purchase a Canon VIXIA HF R700 Flash Memory camcorder and tripod to both capture still images and video record the interviews, which would cost about $325. I also wanted to use a portable cube studio, a reflective fabric tent with directed LED lighting and colored background, to photograph the artifacts in the best light and clarity, which would be an additional $135. Also, I wanted to purchase an Epson Perfection V39 flatbed color image scanner to capture any photographs or documents related to the family that contributors wished to share, which would cost $100. I would also need a removable SD memory card for the camera and a 2TB external hard drive on which to store the digital artifacts securely at a cost of $85. The total cost of $645 was reasonable given the amount remaining from the grant; however, I would not receive the funds until late June 2017, which was well after the planned trip. Unfortunately, I had already paid out of pocket for the travel costs and was not able to make these purchases as well in advance of the grant funds. I also began to have logistical concerns about getting the larger items, like the scanner and tripod, to Rhode Island from Florida. I considered waiting until I got there and trying to purchase them there, although I had concerns about not being able to test and practice with them before going. I considered purchasing them in Florida and arranging for them to be shipped, but this was a considerable added cost with its own concerns about damage and timely arrival.
These questions persisted as I approached the travel dates, and another issue had begun to surface. I was having difficulty arranging for enough participants to meet during the week in March that I had scheduled. It was becoming clear as the weeks went by that it was likely going to require a second trip to Rhode Island during the summer to capture enough participants. I began weighing the desire for technological tools against the emerging need to save the remaining award to fund additional travel. Was it more important to have high-quality images or to include more participants? I began to think about even further costs that were to come, including web hosting and possibly purchasing PastPerfect software. For a WordPress site to have the dynamic functionality of a traditional website and to offer enough digital storage space to support the media-heavy archive, there is a required upgrade that will run just under $300 annually, which is a $1,500 commitment to maintain the archive for the original five-year plan. The cost of the PastPerfect software itself is $870, but it would require the purchase of an add-on to manage digital artifacts, which runs an additional $385. It also likely requires the purchase of the software upgrade to access the full features, as well as training and support services, putting the total just over $2,000. With only so much money available, I realized that I would need to adjust the plans.
The easiest decision about what to change had to do with the PastPerfect software. Due to the high cost, along with a better understanding of just how robust the system is (far more suited to the complexities and size of an institutional collection than to an archive with fewer than 200 artifacts), I knew this purchase would be beyond the scope of the project, despite the potential advantages to my own experience and issues of future compatibility. In examining the software features further, I learned that although it automates the labelling of artifacts, it follows the organizational system of Nomenclature 3.0, the field standard for naming artifacts based on Robert G. Chenhall’s 1978 System for Classifying Man-Made Objects. This work can be completed using the Nomenclature 3.0 system learned from the text and an Excel or Google spreadsheet, and with the comparatively low number of holdings in this initial phase of the archive, it is manageable without the software. PastPerfect can also automatically generate a finding aid, but I feel there would be significant benefits to learning how to write accession numbers and organize the archive manually.
From there, I recognized that whether I traveled a second time or not, I would need to have the funds available for web hosting through the committed years unless I wanted to create a far less robust and dynamic archive using WordPress free tools, which I knew would not give me the kind of look and function that I envisioned. If I held back the remaining award for web services, then I would have to pay out of pocket for a second research trip and the technology tools, so I would need to make a decision about how much of my personal resources I was willing to commit to the project. It is difficult with two small children and all the attending financial responsibilities of a home, child care, car payments, and tuition to also add supporting an archival project to the budget, especially since I had not yet received the grant and had just paid in advance for the flight, hotel, and car for the March trip. I began to think about ways to work around the technology costs, which also had those logistical issues to solve. In looking into how others have conducted field research, I began to realize that there were many free iPhone apps that could be used to produce digital records, including audio recordings and document scans, along with the built-in camera. Using the phone would not cost anything additional and would not be difficult to travel with, but I was really torn by the idea. On one hand, it was in my budget, and there was a feeling of almost doing something like a guerrilla archive, work that was radically accessible to any would-be archivist. It removed part of the financial barrier that can make archival projects so cost-prohibitive and took away a kind of professional mandate that makes it seem that archival work is only for those with specialized training and expertise. I began thinking, with some intrigue, about how I might be able to work in a way that subverts the “archival work requires tremendous funding” narrative that I had read and experienced.
However, on the other hand, professionalism is also often equated with authority and trustworthiness. While I am an amateur archivist in many ways, I do not want the archive to look amateurish. Without quality technology and the training to produce quality images, there was a real fear that the result would be poor, and I would embarrass myself, or worse, do a disservice to the participants. In the end, with the March trip rapidly approaching and a summer trip already being tentatively planned, I decided to forgo the camera, scanner, and studio and save my personal money for additional travel.
As I discuss further in the Data Collection and Management chapter, the quality of the images collected remains an area of the archive about which I am most self-conscious and regretful. It would be wonderful to have the artifacts professionally rendered as digital images and to offer archive users the best experience, capturing the details so effectively as to bring users the closest experience to actually handling the analog versions. However, with limited funding, decisions need to be made and the consequences of those decisions need to be addressed. I feel that for an initial phase, choosing to spend resources on travel to include more participants was a better decision than allocating the money for the technology. I also feel that the archive has the potential to grow into a larger community project in the future, and it is possible that in these expansions and subsequent collection activities, I could improve the technology with either new grant funds or in partnership with other organizations. I hope to get to a place where I can look at the quality of archival images not as a failure of the project, but rather as an important learning experience that will ultimately improve my ability to generate artifacts in the future.
The constraint of funding has been one of the more difficult limitations for me to accept because it is where the gap between what I want for the archive and what is possible is widest and most immutable. Despite any intellectual or physical efforts I might make, the available funds are finite at this time, and I have to make the best decisions I can, including compromises and concessions. Although the travel and data collection portions of the archive are completed for now, there are subsequent funding needs for hosting, which I will discuss in the Digital Exhibit and Archival Interface chapter, as that is where I am housing discussion regarding the work of making the archive digitally accessible. What I have learned at this point, though, is that decisions surrounding how to allocate available funding are necessary, ongoing, and difficult to make due to how they can shape the final product in unintended ways. I also find that any efforts an archivist can make to secure funding must be balanced against the constraints of time in which the activities need to take place. Waiting until there were enough funds to support the exact archive I had planned might have meant postponing real collection opportunities that had presented themselves or possibly failing to ever get started. I felt that getting the information out for people to experience was worth the risks of not presenting an idealized archive; it was better to have an imperfect archive than no archive at all. In terms of funding, as with all aspects of the archive, there is a necessity to balance the ideal with the real and negotiate these tensions as thoughtfully and honestly as possible.
The Overarching Constraint of Funding, or, The Perfect and the Possible
The issue of funding a digital archive is well-trod ground. All scholars working with projects like this acknowledge the challenge of securing the initial funds to begin such a project, to develop the archival materials, and to secure them for long-term sustainability. Alexis Ramsey observes that although all archives are constructed, requiring intentional structuring and development of the framing apparatus, the digital archive takes this created quality to a new level given its additional required layers of funding and labor. She notes, “The digital archive is just as, if not more, created then traditional archives because digitizing is expensive and time-consuming work” (88). The argument here is that although some physical archives come into being more passively, as documents and materials accumulate in institutional holdings over time, the digitization of any records requires a far more purposeful set of actions to bring it into being and is associated with deep costs in terms of both human labor and technology. In addition to the costs associated with digitization, there are also costs simply associated with conducting research and developing participant relationships to support archival processes. Lynée Gaillet also asserts that all “historical research is expensive and difficult—especially for beginning scholars or those working at institutions with limited research budgets” (38). As a beginning scholar myself, I find that this resonates as especially true. Were I to have unlimited funding, there are a great many things I would probably have done differently. I would have spent more time engaging with the community and Azorean organizations to develop greater trust and increase participation through additional travel, and I would have enhanced my recruitment efforts with targeted marketing.
I would certainly have invested in more advanced technology for data collection, possibly even hiring professional photographers or videographers to assist with collecting artifacts. I would have been able to afford any records management software available to professionalize the archive further as well as to construct the archive with the most dynamic and robust content management system. However, in each case, I needed to weigh these ideal methods against what was possible given the nature of my available funding. The experience exemplifies Mike Kastellec’s depiction of funding as a dominant constraint in any digital preservation project, represented by the diagram below.
Although Kastellec warns against his nested diagram being interpreted too rigidly as a hierarchy, there is some benefit in considering what it represents. In this diagram, costs delineate the scope of the project in its entirety and are represented by the outermost ring. Technology, access, law, and selection are all factors that will further limit the possible data until the final “limited subset” makes its way into the archive. Costs will limit the technology that the archivist can obtain for the project, just as I experienced, which can limit what kinds of artifacts can be added. For example, since I was not able to afford a video camera, I ultimately chose to use still images only along with audio-only recordings of interviews. In one case, I wanted to make a video of my Aunt Elsie making her version of Portuguese soup, along with possibly others also preparing family dishes. In the end, I was limited to transcribing the steps in my field journal while taking pictures when possible. Access to the materials from which an archive can be created is also limited by costs, which was true for my archive in terms of travel opportunities to collect artifacts. Legal costs often pertain to labor involved in clearing copyright: tracking down authors, drafting legal documents, and working with owners to obtain rights, and the budget will constrain how much can be put into such efforts. Selection, or what the archivist chooses to include, will further limit the items in the sense that larger collections require larger data storage and processing, both of which can have associated costs in technology and labor. Approaching archive building with a sense of this nested funding and subsequent limitations, archivists at the beginning of a project would have a stronger sense of how to allocate the funds depending upon their goals.
A small collection may not require a large budget for selection but may need to dedicate funds to technology that supports data mining across the artifacts, like Optical Character Recognition (OCR) for scanned documents to allow for keyword searching within the text. A collection that aims to be inclusive and comprehensive rather than searchable can afford to spend less on costs for technology with greater investment in access and law to increase the overall number of holdings. Having a clear sense of what the archive should be able to do is as crucial as knowing what artifacts should be in the archive, so funding decisions can be appropriately considered with respect to the purpose.
My experience in having to compromise my ideal vision due to the constraint of available funds is certainly not unique; in fact, it would probably be difficult to find any digital archival project that was funded to such an extent that such decisions were not inevitable. Despite our current societal appetite for widespread digital availability of information, there is generally a lack of understanding about the significant costs associated with the work, likely a result of the relative ease with which we can make digital images or scans and then post them online. This false sense of the work’s scope may be why “there has been relatively little discussion [in scholarship] of how we can ensure that digital preservation activities survive…or even how they might be supplied with sufficient resources to get underway at all” (Lavoie qtd. in Kastellec 68). It may also be why the availability of funding for digitization projects is “many orders of magnitude short of what is required to archive the amount of information desired by society over the long term” (Kastellec 65). Therefore, in addition to carefully planning for how a project will be funded and devoting significant effort to pursuing resources, archivists should also work to educate the public on the associated costs and to raise awareness about the importance of archival work to encourage greater public and institutional financial support.
Financial and Functional Costs of OCR and Copyright Permissions
There are many aspects of the archival process that are affected by funding, from participation to technology, but there are two specific areas related to handling artifacts that have less to do with actual dollar costs and more to do with the cost of time and human labor: managing OCR data and securing copyright permissions. While many may assume that it is the “digital” in a digital preservation project that drives costs the most, Kastellec reminds us that “it is useful to put the advances in digital preservation technology in perspective and to recognize that non-technical factors also play a large role in determining how much of our cultural heritage may be preserved for the benefit of future generations” (70). It is necessary for archivists to account for these other kinds of costs when planning a digital project and to recognize that they may present limitations to the final archive just as those associated with technical needs do.
The need to manage OCR data will vary from project to project depending on the type and number of scanned artifacts; however, the challenge of accuracy with OCR technology will remain a constant. The process itself refers to the automated, electronic conversion of text from an image into machine-encoded text, which enables users to search within the image for specific words or phrases by creating a layer of code with the text behind the image. Diana Kichuk explains, “Scanned print pages in raw bitmap, PDF, TIFF, JPEG, or other image files produced by mass digitization projects are not searchable. OCR software is used to convert image files into searchable text files” (74). For example, consider these artifacts from the Sousa and Alves collections respectively:
These images were created using a camera and are saved as JPG files, but each importantly features text. There is a handwritten recipe for apple pie from 1956 with a handwritten note explaining how the recipe was shared, and in the bottom image there is a poem written in both Portuguese and English by an Alves family relative for a wedding. Without applying OCR technology to these images, someone searching the archive for recipes using “cinnamon” or for examples of how Azoreans use the word “saudade” would receive a set of results that did not include these images despite their being relevant to the user’s search. For my archive, this functionality is important to me as I want to support users’ ability to search across artifacts for specific keywords of their own choosing and not only on descriptions or tags that I felt were significant enough to be attached to the image. This is the function that takes the database from being a collection of copies to a tool for gathering new knowledge.
It may seem that the discussion of OCR and archival artifacts is a function of data collection methods or curation activities related to the construction of the archival interface, and it certainly is related to these areas, but it is in large part an issue of funding as well, not only due to the cost of specialized software (such as Adobe Acrobat Pro for a minimum of $200) but also due to the high levels of human intervention necessary for accuracy in the encoded text; it is the cost of time and labor. The reason that OCR work is associated with a significant labor cost is that OCR technology can never be truly perfected, and typically the initial conversion is laden with errors, especially when the original documents use non-standard fonts and formats or are of poor quality, which is characteristic of many of the artifacts in cultural preservation efforts. This necessitates human corrections. Kichuk notes, “OCR text quality is dependent on a number of variables, including image spatial resolution, text-to-background contrast, and font quality and variability,” and there is added difficulty when trying to “parse complex or uncommon languages, old or worn fonts, and complex formulas” or to recognize “structural features such as text orientation, headings, images, tables, captions, and paragraphs” (72). In the case of the examples from my archive, which are representative of many of the images with text, there would be significant difficulty in capturing the various orientations of cursive handwriting in the recipe or the italicized text of the poem, arranged in columns and with two languages. The resulting layer of text, if it could be produced at all in the case of the handwriting, would require labor-intensive correction efforts, or I would need to forgo the OCR application altogether and create a handcrafted text file by manually transcribing the document, which is where the issue of OCR becomes one of funding.
In large-scale digitization projects, manually correcting OCR files “using paid labor and traditional proofreading and copyediting practices would be prohibitive” (Kichuk 60). In my archive, since I am working alone, the cost is not necessarily a question of paid labor, but rather one of time and of balancing the accuracy of the OCR-generated text files with the other tasks required to deliver the archive and dissertation in a timely fashion, so there is some question of whether applying OCR to the artifacts is a viable option in this initial phase. This strikes me as another example of balancing releasing the product for view against releasing a perfected project. I could release the archive without the searchable images, so the archive can begin to interact with users sooner, and then add the text files later. Unfortunately, it may not be a step that can be skipped without risking credibility in addition to limiting the desired functionality. Kichuk finds that “while a project may plan a future correction stage, chances are good the project will never implement it, due to funding or administration changes or inertia, and the errors will endure, perhaps forever” (74). Furthermore, these errors and omissions will ultimately negate the product’s usefulness, as users will not be able to get the information they need and will lose trust that the resource is reliable. To forestall irrelevance due to inaccuracies, “it is vitally important for preservation [of our]…cultural heritage to ensure the quality of the digital objects hosted in digital repositories” (Kichuk 61). It becomes incumbent upon archivists, then, to find solutions to the labor of OCR correction to secure archival authority and durability, and funding plans must account for the costs associated with accurate OCR.
One potential solution is using volunteers in an open, crowdsourced proofing process. Kichuk notes that “a combination of OCR-ed text and volunteer or professional proofreading can achieve [high] accuracy rates” and that “crowdsourced proofreading…provides a viable, cost-effective remedy for OCR and automatic correction weaknesses” (71). When I start working more intensively with the artifacts to curate the exhibit online, I will determine which images will be included, and thus which files will need OCR applications run and subsequently corrected. It may not be possible to complete OCR work on the artifacts in the first iteration of the site, but it is a goal I hope to work toward and something that may benefit from harnessing the power of crowdsourcing. I think that the individual contributors might have some interest in transcribing their own contributions, even furthering the participatory nature of the archive, and I could also reach across my network to others interested in digital humanities work who may find the experience beneficial to their own learning. With some volunteer efforts, I hope to be able to support the work of OCR correction without committing additional funds.
A similar issue to the time and labor cost of OCR work is that of securing copyright permissions for the digital reproduction of analog artifacts. While not all artifacts are subject to copyright law, the process of digitization and dissemination of entire documents opens the archivist to questions of copyright infringement. In this archive, I currently have two artifacts that were given to me in analog form for inclusion in the archive, pictured below.
The top two images are from a small cookbook that came from the Sousa family and the bottom two images are from a book detailing the history and aftermath of the 1957 volcanic eruptions in the Praia do Norte parish (municipality) on the island of Faial, which was donated by the Castros, relatives of the author. Having ownership of these complete texts forces me to confront the question of whether to include them in the open-access digital exhibit, and if so, how much of the original documents do I want to reproduce and make available? I feel it would be a detriment to not include them at all, or simply make mention of them in the finding aid as part of the archive but not included in the exhibit, but because these are published works, reproduction would require careful consideration of copyright law, especially if I wanted to reproduce the texts in their entirety.
Reproduction of a complete text can present some technological hurdles in scanning and digitally hosting such a large file, but more than that, the archivist must face the fact that “still problematic is the juridical side of this issue. To give public access to archival files evokes problems with publication rights, copyright and the individual protection of intellectual work” (Aumann et al. 112). Facing this hurdle is complicated by the well-documented difficulties in interpreting and navigating copyright law. Lloyd Weinreb notes, “The truth of the matter is that the [copyright] legislation was impenetrably opaque; and although there was a great deal of prior law that had to be faced, it pointed with equal force in opposite directions” (1152). Copyright law has a long, storied history and has been shaped by numerous legal cases in which various interested parties, often at odds with one another, seek protection. At times, valid arguments for copyright protection have been made based on the rights of authors, publishers, and even the general public through fair use. In other words, there is legal precedent to support an argument for archives to reproduce entire texts based on the constitutional mandate in Article I, Section 8, Clause 8 indicating that copyright restrictions should be for a limited time only, so the texts can be used freely “to promote the progress of science and useful arts.” As a result, the law seemingly provides allowances for potentially useful texts to be used without penalty of copyright infringement under the designation of fair use. However, there are also recent precedents and rulings, equally strong in their arguments, that have scholars noting how the current trend in issues of copyright dispute “has changed in even more radically pro-ownership ways” (Tushnet 543). 
This means that in a case between an archive arguing that inclusion of a text is socially beneficial and allowable under the doctrine of fair use, an author’s argument that the work is his or her intellectual property and thus protected against reproduction could, in the current legal climate, carry more weight.
This has led archivists to exercise an abundance of caution, rather than open themselves to potential infringement issues, and spend resources working to secure individual permissions (from authors or other owners like organizations or publishers) to digitally reproduce texts and make them accessible online. In one case study, the cost to obtain copyright permissions for a series of letters in the Thomas E. Watson archive project was excessive, and the effort ultimately resulted in verified copyright permissions for only a few artifacts. The study reports that “the project manager spent more than 450 hours over the course of 9 months to conduct this copyright investigation,” and “the total cost of the research was approximately $8,000,” resulting in “explicit permission to display online only 4 letters” (Dickson 631). With funding limited in many institutions, this case study clearly shows that pursuing copyright due diligence is burdensome in both the financial and labor requirements involved. I learned of another process to obtain copyright from Ken Wachsberger, who is affiliated with a large digitization project by Reveal Digital to create an open-access archive, Independent Voices, featuring the underground press papers of the Civil Rights Era. We first began discussing copyright through Facebook Messenger, as he was interested in getting in touch with writers and editors from Inquisition, the underground press publication from Charlotte that I had previously published about and digitized, in an effort to obtain their permission for inclusion in the project. He also shared with me elements from the copyright process that he was charged with managing. Essentially, explicit written copyright permissions would be obtained from as many of the individuals associated with the publications as Wachsberger could locate. 
This would be a labor-intensive project, taking many hours of research and communicating with individuals, coordinating the completion of consent forms, and in some cases requiring many miles of travel. Again, since I am working on my own, there would not be an actual cost in dollars to spend the time needed to secure copyright permissions to reproduce the two texts in the archive, but there would be an investment of time. Since there are only two, this would likely keep the task manageable, but there would be challenges in terms of determining who owns the copyright in each case. The cookbook appears to have been published by an organization, the Portuguese-American Federation, which appears to still exist in some form today. However, the cookbook contains recipes submitted by individuals, and the law is not clear whether this would make the text a “collective work” in which individuals retain the copyright for their own submissions or a “work for hire” in which individual copyright is relinquished to the publishing group. If it is the former, then I would theoretically need to secure permission from each recipe author, assuming I could find them, but if it is the latter, then I would only need permission from the Portuguese-American Federation, which is logistically more feasible. For the history booklet, it is not clear whether I would need author or publisher permission, which could be complicated given that the original publication took place outside the United States. It also appears that the booklet is part of a larger series, the Erupções da Memória Colectiva (Eruptions of Collective Memory) and Coleccão Piroclástica (Pyroclastic Collection), that may have been published by FaiAlentejo, a cultural organization headquartered in Faial. This raises the collective work versus work for hire question again, and the answer would determine who could grant permission. 
Even though these are only two texts, the work of determining copyright and then making the communicative efforts to obtain explicit permission to reproduce the texts would be significant and could ultimately result in delayed inclusion in the archive or even denied permission to reproduce. I wanted to include an introduction to the ideas here since in many archives the work of securing copyright is fraught with labor costs and the potential need to purchase copyright permissions from their owners. Archivists will need to weigh these costs against the potential benefits of including digitized texts in their entirety and will need to be prepared to support their decisions surrounding copyright with legally sound arguments.
The Challenge and Necessity of Ongoing Funding Needs
Several years ago, as part of a New Media course, I was assigned the task of taking stock of my digital footprint, specifically what I was storing in digital form and what technology I was using to store it. The result was overwhelming. I have media stored on hard drives, both internal and external, at work and at home, on computer hard drives and in my phone, in clouds, on flash drives, photos on SD cards and DVD, and even old coursework from my undergraduate program on floppy disks. What I realized is that although we tend to think that our digital storage is stable, it is actually far more prone to loss and degradation than analog objects. It is so easy to accidentally delete a file folder, misplace some small storage device like an SD card, leave some media behind on an old work computer or replaced laptop, find yourself with media made inaccessible by technological obsolescence (like those floppy disks that I no longer have any way to access), or go to plug in a flash drive only to discover the computer can no longer connect to the device. While my books and printed photos are neatly stored on a shelf or in albums, subject only to the slow degradation of time and climate or catastrophic loss in fire or flood, I have experienced my digital files being corrupted by age, obsolescence, loss, and theft. In many ways, the reality is that digital data is more vulnerable and fragile than its tangible counterparts, and it pushes me as a digital archivist to think of preserving digital data as an ongoing task; there will always be a need to oversee the data. Unlike the hard copies, where “long-term preservation was a simple matter of putting them on a shelf or in a climate-controlled room and recording them in the catalog,” digital data will “require more hands-on curation” (Perrin et al. 98). The archivist cannot simply place the electronic data “on a shelf” and expect the materials to remain protected. 
Technology preservation is still a changing field, moving faster than institutions tend to be able to adapt, and despite the excitement of producing terabytes of data through digitization efforts, the archivist must address issues of ongoing maintenance if he or she hopes to see those terabytes remain useful. These efforts will require funding strategies that go beyond the initial costs of collection, digitization, and posting to include preservation.
This long-term preservation work is especially vital to born-digital archives, in instances with “records that originate on computers, and there may (or may not) be an analog equivalent, such as a printout” (Cocciolo 239). These kinds of archival holdings have a significant difference “from collections created through digitization, which creates surrogates or access copies of materials that originate on paper, film or some other analog medium” (Cocciolo 239). This is because “there is no analog equivalent, [so] digital archiving also necessitates digital preservation” (Cocciolo 239). If there is an analog object from which the digital data is generated, there is some additional stability in the sense that a new digital file can be created if another digital copy is lost, corrupted, or becomes obsolete in its format. With a born-digital file, such as the images of artifacts I have taken while traveling that I do not have access to in the archive holdings or the audio files with the interviews, it would be incredibly difficult to reproduce the data if it became inaccessible.
There is general agreement, and a growing sense of urgency, in the scholarship that anyone working with digital data will need to focus on “securing the long-term persistence of information in digital form” due to the inherent fragility of digital data (Lavoie qtd. in Cocciolo 239). Greenwood notes that a failure to address “technical concerns about storage and maintenance…may render digital files useless as historical documents within a few years” as a result of “file corruption, media failure and technological obsolescence” (83). Other scholars argue that “the preservation of digital information is widely considered to require more constant, ongoing attention than the preservation of any other media. This constant demand of effort, time, and money to handle the rapid technological and organizational advances is considered a major stumbling block to the preservation of digital information” (Wang et al. 97). Research also suggests that even though “long-term preservation has also become one of the most important issues” related to digital information storage, there is an alarming problem in that “most institutions might have done little to prepare for the long-term preservation of these digital assets” (Perrin et al. 99, 98). It is clear that the work of archive creation does not end with the launch of a new digital collection, but rather is a continuous process of maintenance that requires continual funding.
One of the challenges in securing funding for ongoing preservation is a general lack of understanding about the need, much like the assumptions many of us make about the stability of our own digital files. Cocciolo explains that “although individuals are somewhat aware of challenges stemming from technological obsolescence, there is a lack of action to respond to this because the methods available for addressing this issue are not known nor is it viewed as a high priority by most staff” (Cocciolo 246-7). He continues to argue that many people assume that “the [digital] asset will always be available much in the same way that individuals assume that uploading a photo to Facebook means it ‘will always be there’…this view does not take into account the organizational and technological resources needed to ensure this continuation, as well as factors from the external environment that could impede such continuation” (247). There is a misunderstanding about the true nature of digitization: the belief that it is a way to extend the life of an artifact, as when a digitized photograph seems to create a more secure version of the content than the fading ink on disintegrating paper. It is easy to understand why the clean copy of the object in digital form gives a false sense of having further ensured the image’s safety, and in some real ways it has added a level of security, perhaps by reducing the amount the artifact needs to be handled. However, creating a sense of safety can undermine actual preservation given the instability of digital data over time, the ever-encroaching data rot. Researchers need to keep in mind that “‘digitization is not the same as preservation,’ nor does digital equal forever” (Ramsey 87). The problem is compounded by the false assumption that the systems that we use to make digital data available for public use are also employing strong preservation strategies. Perrin et al. 
explain that although these systems, like my WordPress site, may “have small features that help with some preservation, they are not designed to preserve the files for the long term” (103). They continue to argue that “systems primarily designed to publish electronic files are not preservation solutions” (103). Assuming that publishing tools can do the work of preservation leads to a failure to plan for and head off the risks of obsolescence, planning that must be embedded in a project from the time of inception. In other words, the biggest danger to digital archives’ long-term preservation is not solely an issue of technology, but rather an issue of human perception, and these humans are the same ones who often control how funding is allocated. People need to perceive the need to secure data and then act on that sense of urgency through financial support for ongoing maintenance.
For the would-be archivist, it is often easier to collect some funding to initiate a project, as when I received a small grant to launch the archive, than it is to secure continuous funding to support long-term preservation. An institution may be willing to allocate a sum to the library or department interested in developing an archive, but to make funding for that project a permanent feature of the budget seems, to anyone working in higher education, unfeasible. It reminds me of my own experience with an institutional budget through my role as a writing center administrator. When I first began, several of our consultants were funded through a 5-year grant for academic support services. When the grant was slated to run out of funds, the positions could no longer be supported since the university was unable to make them a permanent part of the regular budget. The same seems true for archives; a grant or temporary surplus may be dedicated to digitizing a collection, but how can we sustain an archive after its conception? Lynn Bloom suggests that one answer may be a dedicated advocate. She writes, “Every archive needs an advocate who can ensure uninterrupted institutional commitment of personnel, space, and funds to maintain any archive, of whatever size. And every archive needs an archival advocate, to scold, nag, lobby, and otherwise keep the collection visible and alive” (Bloom 287). Porter et al. also point to the problem of maintaining ongoing financial support in academic contexts. Although not discussing archives, they describe how the limited space on college campuses often means that initiatives, such as a writing lab, will need continual advocacy to sustain themselves over time. They write, “As each new monitor of departmental space questions the lab as wasted space, its use must be rejustified. This continuous rejustification process reminds us that our ‘rights’ to space are not given or unassailable” (630). 
In the same way, the “rights” to digital space on servers and to funding resources for archival projects are not guaranteed. Each time an administrative change takes place, someone will need to advocate for the archival space to be maintained. While the efficacy of the advocate is variable, and certainly constrained by the material realities of those in a position to support the project, it is crucial that an archival champion exists to keep the needs of maintenance in mind and considered in the institution’s financial planning. This question of permanence is the central question raised in almost every article related to preservation and maintenance. Planners must allow for this permanent funding in any project, and innovative strategies will need to be employed to generate self-sustaining revenue.
The work of preservation itself mainly addresses two areas: data loss and technological obsolescence. In terms of data loss, a “constant risk is physical deterioration” due to “the inherent properties of the medium and environment in which it is stored,” so that “when the physical medium of a digital file decays to the point where one or more bits lose their definition, the file becomes partially or wholly unreadable” (Kastellec 64). Digital data is also vulnerable to loss through “software bugs, human action (e.g., accidental deletion or purposeful alteration), and environmental dangers (e.g., fire, flood, war)” (Kastellec 64). The second main danger is technological obsolescence, which “occurs when either the hardware or software needed to render a bitstream usable is no longer available,” and “given the rapid pace of change in computer hardware and software, technological obsolescence is a constant concern” (Kastellec 64). Given my no-longer-accessible floppy disks, it is not hard to imagine a future in which the JPG or MP4 files that hold the media for the archive, or the external hard drive that stores them, would become obsolete or lost as newer formats and technologies are released, requiring my archival collections to continually switch to new formats in order to stay accessible and preserved.
There are four main approaches to ongoing maintenance to keep data preserved: migration, data redundancy, normalization, and emulation. Migration is the process of copying data from older file formats and storage media to current technologies, and normalization takes non-standard file formats and converts them into standardized formats, both of which help address issues of obsolescence. Emulation can also guard against obsolescence by “avoiding conversion and instead using virtualized hardware and software to simulate the original digital environment needed to access obsolete formats” (Kastellec 64). By emulating older or outdated digital environments, the data can remain accessible but in its original form. Data redundancy is a practice that can address data loss by maintaining several copies of data, often in separate locations, that are automatically and systematically compared against each other to find and correct any errors that might occur through digital degradation. However, regardless of the strategies chosen to address preservation, they are “intensive processes, in the sense that they normally require some level of human interaction,” which is costly as “trained staffs are a limited and expensive resource” (Kastellec 64). Even the strategy of redundancy, which is more automated, will have costs associated with the technology needed to maintain multiple copies, including “bandwidth, disk access, and processing speeds needed to perform parity checks (tests of each bit’s validity) of large datasets to guard against data loss” (Kastellec 64). For example, one way to achieve redundancy is having multiple separate servers to store data “in geographically distant locations,” so that in the event of data loss on one server through human error or environmental causes, there “would still be a copy of the files from which to recover” (Perrin et al. 101). 
However, even though redundancy is considered “the first step in any digital preservation process,” it is an issue of funding as it will require the cost of more than one server to be built into the financial plan (Perrin et al. 103).
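To make the idea of systematically comparing redundant copies more concrete, the following is a minimal sketch, not a description of any system cited above. It assumes that each copy of the archive is an ordinary folder of files (for example, one on an external drive and one on a mounted network share), and it compares SHA-256 checksums rather than the bit-level parity checks Kastellec describes. The function names sha256_of and compare_copies are hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copies(copy_roots: list[Path]) -> dict[str, str]:
    """Compare every file across redundant copies of an archive.

    Returns a mapping from a file's relative path to a short description
    of the problem. An empty mapping means all copies agree.
    """
    problems: dict[str, str] = {}

    # Collect every relative file path that appears in at least one copy.
    all_names: set[str] = set()
    for root in copy_roots:
        all_names.update(
            p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
        )

    for name in sorted(all_names):
        digests = set()
        for root in copy_roots:
            candidate = root / name
            if not candidate.is_file():
                problems[name] = f"missing from {root}"
                break
            digests.add(sha256_of(candidate))
        else:
            # Reached only when the file exists in every copy.
            if len(digests) > 1:
                problems[name] = "checksums differ between copies"
    return problems
```

A check like this only reports that copies disagree; it cannot say which copy is the correct one. Deciding that still requires human judgment or a checksum recorded when the file was first created, which is one reason even the more automated strategy of redundancy carries labor costs.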
For the size of this archive, I can more easily maintain redundant copies of the data to ward off loss or degradation because the amount of data that would need to be stored on multiple servers is relatively small, although I will need to ensure that I do not operate in violation of the IRB procedures that have been approved for securely and properly maintaining media files that relate to individual research subjects. The true test will be how I steward the archive over time, and how I manage (or fail to manage) the media files as the current technologies that support their accessibility change, so that the files avoid obsolescence. At this point, my plan is to possibly donate the archive to an institution, such as the Ferreira-Mendes Archive at UMass Dartmouth, making the long-term preservation less of a pressing issue for my own management plans; however, I would need to proactively work against the risk of archival dissolution should such a transfer not take place.
It is important to note that there is another digital area that is vulnerable: the archival metadata. Metadata, or information about the primary digital files, is often the information added to the archive to describe the artifact files and facilitate searches or storage organization. This data can include the artifact descriptions, the OCR text files behind images, any alt-text embedded in the image codes, tags, categories, and navigation functions for the site. As in the discussions above, preservation in digital archives often focuses on the artifact files themselves, the actual PDF or JPG files that contain the artifact images, and there is a failure to secure metadata. This tendency to concentrate on the actual files, while obviously important, does not take into account that it can be more time-consuming to recover or rebuild the metadata surrounding a file than the content file itself. Perrin et al. conclude that “institutions can expect much more work to recover lost metadata than to recover a lost digital item, but in most content management systems the metadata is not as protected as the item” (103). For me, this is especially problematic because this archive, and specifically the online exhibit, is not being built to create only a gallery of images. The descriptions and arrangements, the ways that I am working to actively frame the artifacts and guide users toward specific conclusions, are exclusively held in the metadata, which exists solely in the WordPress site constructed around the archival artifacts. Determining how to store the metadata requires deliberate strategies and actions so the archival exhibit as a whole entity can be recovered in the event the WordPress site is no longer accessible. In general, though, it is a conversation that must also be connected to those around funding, since these processes generally require an investment in both labor and technologies.
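One deliberate strategy for keeping metadata recoverable outside of the publishing system is to save a plain-text "sidecar" file next to each artifact file, so that the description travels with the artifact into every redundant copy. The following is a minimal sketch under that assumption and is not the approach described by Perrin et al. The field names (title, description, tags), the file-naming convention, and the functions write_sidecar and read_sidecar are all hypothetical, and the sketch does not address how the metadata would be pulled out of WordPress in the first place.

```python
import json
from pathlib import Path

# Hypothetical naming convention: "cookbook.jpg" -> "cookbook.jpg.metadata.json"
SIDECAR_SUFFIX = ".metadata.json"


def sidecar_path(artifact_path: Path) -> Path:
    """Return the location of the sidecar file for a given artifact file."""
    return artifact_path.with_name(artifact_path.name + SIDECAR_SUFFIX)


def write_sidecar(artifact_path: Path, metadata: dict) -> Path:
    """Save an artifact's metadata as human-readable JSON beside the artifact."""
    target = sidecar_path(artifact_path)
    target.write_text(
        # ensure_ascii=False keeps accented characters readable in the file.
        json.dumps(metadata, ensure_ascii=False, indent=2, sort_keys=True),
        encoding="utf-8",
    )
    return target


def read_sidecar(artifact_path: Path) -> dict:
    """Load the metadata previously saved for an artifact."""
    return json.loads(sidecar_path(artifact_path).read_text(encoding="utf-8"))
```

Because the sidecar is plain text in a widely used format, it can be opened without WordPress or any other specific software, which addresses obsolescence as well as loss. The tradeoff is that the sidecar must be updated by hand, or by some additional tool, whenever a description changes on the site.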
A Focus on Funding Solutions
So far, this section has discussed the need for ongoing preservation and provided an overview of what that work entails and its associated costs; there is also a large body of research that suggests solutions to the issue of ongoing funding. Wang et al. argue that one solution may be to approach the work from a marketing perspective and consider how the artifacts themselves can be used to generate revenue. They argue that digital artifacts should be viewed as a kind of “durable good” that “will deteriorate and depreciate in value in the absence of these [preservation] services” (97). Without preservation, the value of the archive depreciates, so care must be taken to ensure its maintenance, and the funding needed to support this can come from monetizing the collection. They offer examples such as developing an educational DVD that could be purchased by other institutions, providing access to the physical archive materials for a fee, or offering the rights to reproduce elements from the archive for a licensing fee (102-3). They also identify several business models for generating revenue from archives, including image licensing, which allows users an option to purchase a license for the images they wish to reproduce and use for their own purposes (102). This suggestion, though, may be problematic for maintaining legal copyrights for the artifacts if the archive claims permission through fair use exemptions granted to non-profit endeavors with educational purposes. If an institution generates revenue from an archive, it could detract from this claim. Furthermore, while any barriers to artifact access established by paywalls or per-image fees “may support preservation,” they are likely to “discourage access or provision of content, respectively” (Kastellec 68). The fees may help cover preservation costs, but if they deter people from using the archive, the project runs the risk of rendering itself irrelevant.
Like the suggestion to tap into crowdsourcing to assist with OCR correction activities, there are also numerous scholars who argue for the benefits of collaboration to help defray the costs of preservation work, and of archiving in general. In one very basic way, collaboration can reduce costs through shared knowledge. Antonella Fresa argues that “preserving digital cultural heritage content is common to all the cultural institutions, but still it is addressed by individual projects, without really any shared approaches” (110). This means that in terms of preservation work “the same problems are studied repeatedly and successful solutions are unknown by the others working on the same issues” (Fresa 110). The result is that archivists may work and produce content in isolation, left to continually reinvent the wheel, covering ground others have covered and failing to take advantage of tested and effective preservation practices, which can lead to unnecessary mistakes that can be costly to correct later. Knowledge sharing is just one kind of collaborative effort, but partnerships are also important ways to develop sustainability. Kevin Smith notes that through a partnership with an official civic institution, the Boston City Archives, the community participatory archive project gathering narratives surrounding the Boston Marathon bombing, Our Marathon, was able to establish more support, both in participation and funding. He concludes that “partnerships with outside organizations enable the archive to grow…and implicitly privilege the participatory public memory creation by endowing them with institutional and professional associations” (126). Fresa also advocates for such partnerships, arguing that “the industrial sector, creative industry and publishers, should participate in managing and sustaining the infrastructure because they recognise value in the potential for the profitable exchanges of data that are offered” (109). 
Punzalan and Caswell also argue more indirectly for collaborations with commercially interested sources, finding that “community archives that resist corporate funding models are left struggling to raise money from individual donors” (34). Because formal funding partnerships are likely to involve accepting input from those stakeholders, they are likely to limit the archivist’s and community’s agency in terms of design, marketing, and the direction of growth. However, accepting the challenges of balancing the limits on agency with secure and ample funding may be the difference between having an archive and not having one. Such a partnership should not be entered into without considering whether those constraints would require any unethical treatment of the community and its say in its own representation, but it may provide an opportunity not otherwise available. Perrin et al. make a similar argument about collaboration, but rather than joining with organizations that might have a stake in the specific content of the archive, they advocate for participation in the Digital Preservation Network (DPN). They explain, “DPN’s stated goal is to gather academic institutions together to provide a location to preserve academic digital content for long-term preservation” (101). The DPN works by pooling resources to create a trusted and safe repository for long-term storage that becomes more affordable, especially for smaller institutions, because they do not have to independently purchase or maintain entire data storage systems.
Utilizing shared storage is an especially promising strategy to reduce the cost of preservation. Digital librarian Gretchen Gueguen stresses that an essential aspect of maintaining data is making sure that “the individual components and data behind digital projects” are stored “in a shared centralized repository” (qtd. in Clement et al. 120-1). Aside from the DPN, many archival projects are taking advantage of DSpace (http://www.dspace.org/), which functions like a library in how it provides access to materials but also “as a preservation archive, keeping this material accessible, and often immediately usable, far into the future” (Cocciolo 239). DSpace is generally regarded as a “trustworthy” place “for researchers to deposit copies of their work for long-term preservation and access” (Cocciolo 239). With annual subscription costs ranging from $4,000 to $10,000 depending on size, however, this is still not a viable option for small projects like this archive. For a smaller institution, though, this option would still be more cost effective and reliable than creating a system entirely independently. Cocciolo acknowledges that although using shared repositories “may be worthy long-term goals for manuscript and special collection repositories, the requirements may be too great” for smaller projects and suggests instead “starting small with something as simple as network file storage,” which can often be obtained from companies like Carbonite for as little as $72 a year (240). While not as robust or secure as the more expensive repositories, network file storage is an effective solution for preservation through redundancy.
One of the more abstract solutions to funding has to do with ensuring that the archival project will have an engaged audience. If the archive has strong usership, it is more likely to find revenue streams from which it can fund preservation activities. An ongoing demand for the archive would aid arguments in favor of ongoing financial support. One way to determine the audience for an archive is to “conduct a market survey” prior to “introducing its products or services related to digital archives into the market” (Wang et al. 98). The goal here seems to be to determine how much demand there would be for the product, to avoid bringing an archive to the field if few scholars are interested, since a lack of interest would undermine its long-term viability. If an archive does not see traffic from researchers or other users, then its long-term survival may be threatened by future claims of irrelevance. Why should an institution maintain a digital archive, investing in its funding and updating, if it does not serve the needs of the community? Cocciolo argues that some projects are ultimately unsuccessful because they are created to serve “library goals (such as collecting and preserving scholarly work)” and not in response to the needs of those in the community, a mismatch that spends valuable resources on digitization of records without a widespread interest or demand to drive usage (240). Greenwood also notes the importance of creating an archive to meet a need rather than just to take advantage of technological advancements: “Organizations may seek innovations to address an identified need, or organizations may become aware of an innovation and seek a perceived need it might address, possibly without evaluating how well the innovation fulfills the need” (83). Ultimately, without an audience, long-term funding and maintenance attached to perceived usefulness are threatened. 
Perhaps once this archive is released and begins to engage with the community, it will develop the usership, interest, and potential scholarly purposes that can open new arguments for funding based on utility and demand.
An archive can also reduce preservation costs by adopting standard practices, which ensure continuity in the event that archive management changes hands. Such changes in organizational administration are common, especially over the time span of preservation work, which may extend many years beyond the initiation of the archive. If each individual data manager has his or her own methods for working and does not take careful measures to document procedures, then the archival data is more likely to be misinterpreted, overlooked, or even deleted by the next manager who takes control. Perrin et al. argue that “maintaining written procedures and documentation for all aspects of digital collections is vital,” meaning that “preserving institutional memory and the reasons why decisions are made are as important as preserving digital assets” (103). Fresa makes a similar claim, explaining that preservation is “an ongoing action, to be periodically revised, in order to update data sets and metadata formats. These are time consuming activities, especially if carried out independently by each cultural institution. Common procedures and workflows, shared internationally, would reduce the cost both in terms of time and money to be allocated to this task and would contribute to the general interoperability and openness of data” (108). Common procedures would allow for open access to data and a decrease in the overall costs associated with maintenance. Specialized knowledge would not be required if a common practice were employed in the archive’s operations. If a manager left a position with one institution, he or she would not take away the ability to operate and understand the supporting technology.
One such standardized practice is the Open Archival Information System (OAIS), which does not dictate a set of technologies to use but rather standardizes the activities associated with preservation; it is the “what” of preservation work in archives and not the “how.” In the model, there are six services that an archive must offer, including accepting, storing, managing, and preserving data along with making it accessible and providing administrative support. Most relevant here is the fundamental idea that archives following the model address preservation needs in the very architecture of the archive, with the model mandating that a preservation planning service “monitors the external environment for changes and risks that could impact the OAIS’s ability to preserve and maintain access to the information in its custody, such as innovations in storage and access technologies…then develops recommendations for updating the OAIS’s policies and procedures to accommodate these changes” (Lavoie 13). The standardized inclusion of preservation practices in this model constitutes a “safeguard against a constantly evolving user and technology environment” (Lavoie 13). The model “has been adopted as a best practice in preserving digital information from a wide variety of institutions, including libraries and archives within colleges, universities and governments” (Cocciolo 239). An archive drawing on this model will not only ensure that preservation is fully incorporated into the archival design, reducing the costly need to recover data after a loss event, but will also align with other archival projects. That alignment increases the ability of human actors to navigate inherited systems with greater efficacy, which can help an archive avoid time-intensive and expensive reorganizational efforts in the future.
Adequately funding an archive requires an understanding of how costs are embedded in nearly every aspect of the project. Generating participation, collecting artifacts, and bringing the collections online are only the initial investment; there will be considerable costs to ensure the long-term survivability of digital data and to maintain accessibility. Failure to plan for all costs risks the archive becoming irrelevant, as the project may fail to reach an audience of engaged users, or obsolete, as technology outpaces the archive’s ability to reestablish itself in newer generations of media formats. Archivists should look carefully at models of standardization, shared storage solutions, and potential partnerships or collaborations as ways to meet the difficult challenge of securing continual funding for digitized archives.
Works Cited
Aumann, Stefan, et al. “From Digital Archive to Digital Edition.” Historical Social Research/Historische Sozialforschung, vol. 24, no. 1, 1999, pp. 101-44.
Bloom, Lynn Z. “Deep Sea Diving: Building an Archive as the Basis for Composition Studies Research.” Working in the Archives: Practical Research Methods for Rhetoric and Composition, edited by Alexis E. Ramsey et al., Southern Illinois UP, 2010, pp. 278-89.
Clement, Tanya, Wendy Hagenmaier, and Jennie Levine Knies. “Toward a Notion of the Archive of the Future: Impressions of Practice by Librarians, Archivists, and Digital Humanities Scholars.” The Library Quarterly: Information, Community, Policy, vol. 83, no. 2, April 2013, pp. 112-130.
Cocciolo, Anthony. “Challenges to Born-Digital Institutional Archiving: The Case of a New York Art Museum.” Records Management Journal, vol. 24, no. 3, 2014, pp. 238-250.
Dickson, Maggie. “Due Diligence, Futile Effort: Copyright and the Digitization of the Thomas E. Watson Papers.” The American Archivist, vol. 73, no. 2, Fall/Winter 2010, pp. 626-636.
Fresa, Antonella. “Digital Cultural Heritage Roadmap for Preservation.” Journal of Humanities & Arts Computing: A Journal of Digital Humanities, vol. 8, Mar. 2014, pp. 107-123.
Gaillet, Lynée Lewis. “Archival Survival: Navigating Historical Research.” Working in the Archives: Practical Research Methods for Rhetoric and Composition, edited by Alexis E. Ramsey et al., Southern Illinois UP, 2010, pp. 28-39.
Greenwood, Keith. “Digital Photo Archives Lose Value as Record of Community History.” Newspaper Research Journal, vol. 32, no. 3, Summer 2011, pp. 82-96.
Kastellec, Mike. “Practical Limits to the Scope of Digital Preservation.” Information Technology & Libraries, vol. 31, no. 2, June 2012, pp. 63-71.
Kichuk, Diana. “Loose, Falling Characters and Sentences: The Persistence of the OCR Problem in Digital Repository E-Books.” portal: Libraries and the Academy, vol. 15, no. 1, Jan. 2015, pp. 59-91.
Lavoie, Brian. The Open Archival Information System (OAIS) Reference Model: Introductory Guide, 2nd ed., Digital Preservation Coalition, 2014, http://www.dpconline.org/docs/technology-watch-reports/1359-dpctw14-02/file. Accessed 19 Jun. 2018.
Perrin, Joy M., Heidi M. Winkler, and Le Yang. “Digital Preservation Challenges with an ETD Collection—A Case Study at Texas Tech University.” The Journal of Academic Librarianship, vol. 41, no. 1, Jan. 2015, pp. 98-104.
Porter, James E., Patricia Sullivan, Stuart Blythe, Jeffrey T. Grabill, and Libby Miles. “Institutional Critique: A Rhetorical Methodology.” College Composition and Communication, vol. 51, no. 4, June 2000, pp. 610-42.
Punzalan, Ricardo L., and Michelle Caswell. “Critical Directions for Archival Approaches to Social Justice.” The Library Quarterly: Information, Community, Policy, vol. 86, no. 1, 2016, pp. 25-42.
Ramsey, Alexis E. “Viewing the Archives: The Hidden and the Digital.” Working in the Archives: Practical Research Methods for Rhetoric and Composition, edited by Alexis E. Ramsey et al., Southern Illinois UP, 2010, pp. 79-90.
Smith, Kevin G. “Negotiating Community Literacy Practice: Public Memory Work and the Boston Marathon Bombing Digital Archive.” Computers and Composition, vol. 40, 2016, pp. 115-130.
Tushnet, Rebecca. “Copy This Essay: How Fair Use Doctrine Harms Free Speech and How Copying Serves It.” The Yale Law Journal, vol. 114, no. 3, 2004, pp. 535-590.
US Constitution. Art. I, Sec. 8, Cl. 8.
Wang, Ping, In-Lin Hu, and Chen-Chi Chang. “Exploring the Value and Innovative Pricing Strategy of Digital Archives.” Electronic Library, vol. 32, no. 1, 2014, pp. 96-105.
Weinreb, Lloyd L. “Copyright for Functional Expression.” Harvard Law Review, vol. 111, no. 5, 1998, pp. 1149-254.