Crowdsourcing Digital Archives… A Long Road to a New World for History

In this digital age, crowdsourcing has become popular in many fields. From creating films to border control, crowdsourcing provides organizations with an easy way to enlist the help of interested parties without the expense of hiring them (http://en.wikipedia.org/wiki/List_of_crowdsourcing_projects). History is one area of study in which crowdsourcing has the potential to become extremely valuable. Several crowdsourcing projects already exist for history, often funded by government grants and sponsored by museums, and they provide great opportunities for the field. Firstly, crowdsourcing makes primary sources available more efficiently. All it takes for a source to become available to the public is for it to be digitized; once scanned into the database, it is available. Secondly, these crowdsourcing projects provide documents to the public for free, allowing virtually anyone to access these valuable pieces. This benefits students and historians alike who need access to these sources but might not be able to travel to see the originals. Also, many colleges and universities possess useful sources that, once put into a digital archive, will make research much easier. Finally, crowdsourcing allows the general public to take a greater interest in history. One of the best ways to generate interest in any subject is to allow people to take an active role in it. Therefore, opening these archives up and asking for help will generate interest in the field, because people feel more connected to the subject.
Despite these positive elements of crowdsourcing digital historical archives, there are many kinks that need to be worked out. Though it is wonderful that these documents are available to the public and anybody can contribute to transcribing them, that openness in and of itself poses the biggest issues. Because anybody can edit these sources, the quality of the edits is often questionable. In my personal experience, some of the transcriptions had so many grammatical errors that it would have been better if the document had been left alone. Also, the general public may not know the importance of tags, and of accurate tags in particular. Therefore, some sources, though now transcribed, are as good as lost because they are not tagged, which means a researcher or historian must wade through all the articles to find what they are looking for.
I looked at four digital archive sites that allow crowdsourcing. The first one I looked at was the California Digital Newspaper Collection. This site provides Californian newspapers from 1846 to the present. Upon reaching the site and finding an article, the researcher finds the newspaper clip on the right of the page and the transcription of the article on the left. This serves to provide an easy-to-read version of what is in the article. This site is sponsored by the University of California, Riverside.
http://cdnc.ucr.edu/cgi-bin/cdnc?a=p&p=home&e=——-en–20–1–txt-txIN——
The second site I reviewed was the Historic Cambridge Newspaper Collection, sponsored by the Cambridge Public Library. Similar to the California collection, the site provides newspaper articles based out of Cambridge, Massachusetts, from 1846-1923. This site primarily draws its articles from the Cambridge Chronicle (1846-1923), the Cambridge Press (1887-1889), the Cambridge Sentinel (1903-1912), and the Cambridge Tribune (1887-1923). This site is structured like the Californian newspaper digital archive, with actual articles on the right and a clear text of the article on the left.
http://cambridge.dlconsulting.com/cgi-bin/cambridge?a=p&p=home&e=——-en-20–1–txt-txIN——
In my opinion, the DIY History project sponsored by the University of Iowa was the most beneficial and well done American archive I found. This site, though focusing on Iowa natives and citizens, provides researchers and historians with letters and diary entries. Also, this site allows readers to look at cookbooks from the 19th century. This site is also structured with the text followed by a transcription.
http://diyhistory.lib.uiowa.edu/about.php
The final site I reviewed was not an American history site, but interesting nonetheless. The archive was Europeana 1914-1918. This site was created so that World War I survivors and families could tell their story through pictures, official documents, films, diaries, and letters of correspondence. This site was extremely fascinating. The site is structured like the Martha Berry Digital Archive site, with a document or picture and then a general description next to it as opposed to an exact transcription.
http://www.europeana1914-1918.eu/en
I contributed to the Cambridge Library project and the Martha Berry Digital Archives. Contributing to the Cambridge project was an exercise in frustration because it consisted of editing what others had already transcribed, making the process tedious.
All in all, crowdsourced digital archives have great potential for the field of history. They allow historians and researchers to have ready access to more primary sources. As with all new systems, some problems need to be worked out before they can become fully beneficial.

On Crowdsourcing: If Only Ease of Access Were Not Directly Proportional to Level of Ineptitude

Crowdsourcing is officially a “thing.” It exists regardless of personal opinion or scholarly objection. The consequences of crowdsourcing history threaten traditional historiography, which may or may not be a good thing. History has long been told through the highlight reel of time. Historians have typically studied the larger things of life, the high points and the low troughs, and tended not to mention most of the details. This approach certainly has its limitations. Many times the average life is overlooked or the exceptions are ignored, but overall the general flavor of things is communicated effectively. We have come upon an age, though, where the smallest thing can be recorded digitally and then easily found by anyone with a connection to the internet. Crowdsourcing digital archives allows for rapid cataloging and dissemination of bits and pieces of the recorded world. The ease both of recording and cataloging such minutiae makes small-scale local history far more accessible and widespread. The pervasive quality of small-scale history seems like it would be more accurate for any specific, given area, but less applicable or important for the world at large. In a sense, the globalization of archival knowledge through crowdsourcing might cause the interesting effect of circumscribing the purview of any given historical work because of the sheer volume of information on each small segment of time and place.

As with all things, crowdsourcing exhibits some positive and negative characteristics. On the negative side, those who contribute to history and catalog it through crowdsourcing systems are often less qualified. In many cases contributors do not even need to register for an account, so there can be no accountability for whatever they may choose to do on the archive. Secondly, crowdsourcing allows a wide range of minds to work on a single project without any communication between members. Such a system does not bode well for a unified or systematically organized body of knowledge. Thirdly, crowdsourcing is unreliable at best in terms of interest and thus volunteer labor. It is never guaranteed that there will be progress or work each day, as crowdsourcing necessitates public interest and involvement. Positively, though, crowdsourcing is a cheap way of garnering a theoretically limitless labor force. It also engages the public in historical work and gets people involved with important stories and events. Finally, crowdsourcing allows researchers to search through vast reaches of data that might never have been looked at otherwise.

The sites that I looked at were:

1. DIY History. This site allows users to transcribe handwritten letters from a variety of library holdings. The project began with only papers from the Civil War but has been expanded to include other collections as well. http://diyhistory.lib.uiowa.edu/index.php

2. Citizen Archivist. This site allows users to transcribe, tag, and contribute documents for the National Archives. Users can search for a topic they are interested in and deal with documents relating to that area. This site contains a huge array of sources dealing with anything related to U.S. national archival interests. http://www.archives.gov/citizen-archivist/

3. Brooklyn Museum. This site aims to make the Brooklyn Museum more searchable online. To that end it allows users to tag or challenge tags on photographs of artwork. Users can earn points to watch reward videos and are ranked according to the number of tags they have contributed or censored. http://www.brooklynmuseum.org/opencollection/tag_game/start.php

4. MBDA. This site seeks to digitize and catalog the correspondence of Martha Berry, founder of the Berry Schools. Users can edit/catalog the documents through entering information about the document, tagging, and summarizing the contents of the document. Users can also earn ranks and badges the more that they edit. https://mbda.berry.edu/

The two websites I looked at most extensively were MBDA and the Brooklyn Museum. The MBDA website was fairly user-friendly, and users had options for how much responsibility they took on depending on how much work they wanted to do with each letter. The Brooklyn Museum site only afforded users two options: tagging or challenging tags. MBDA’s reward system seemed more complex but less immediately gratifying than the Brooklyn Museum’s more up-front encouragement of beating other people’s rankings. At the Brooklyn Museum project it seemed like the user accomplished relatively little, as most of the tags were so general as to render them relatively useless for research purposes, while MBDA cataloging might allow more detailed searches. MBDA was ultimately less finished than the Brooklyn Museum project, as it has fewer people involved and requires more commitment from its volunteers.

iHistory

Digital history and crowdsourcing appear to be the way of the future when it comes to the transcription of historical documents. Due to the development of this phenomenon, the public has been able to take a more active role in uncovering their personal histories or the histories of places and times they have an interest in. What was before a topic that one could only experience through TV screens or the pages of a book has become something anyone can include themselves in. One can imagine that there are a multitude of benefits to this type of archival process: more documents can be transcribed and organized more quickly, more resources become available to scholars, and much more. However, downsides exist to this practice as well; just as a person has the power to create, they also have the power to destroy. Because it would be nearly impossible for an expert to review all crowdsourced documents, there is always a possibility that a person could have transcribed one incorrectly. These incorrect transcriptions have their own negative results, as a historian could then cite the incorrect transcription and ruin whatever work they were doing.

For the purposes of this project and to become more familiar with the topic and process of crowdsourcing, I have researched a few crowdsourcing websites and done my part to contribute to a couple of them. The first of these was the Martha Berry Digital Archives, which is the crowdsourcing website run by Berry College. This site deals solely with documents associated with Martha Berry herself and other documents associated with Berry College.

The next site was a crowdsourcing site run by the University of Iowa called DIYHistory. This site has a far broader range of documents, from Iowa-specific documents to pioneer diaries and cookbook entries. This site gives the public the opportunity to transcribe works in their entirety, including whole books.

The NYPL also has its own crowdsourcing archive, which differs slightly from the previous two archives I became involved with. Theirs was What’s On the Menu? and dealt exclusively with transcribing historical restaurant menus. It also gives more detailed information on the various dishes that are listed on these menus, showing on a timeline when they were most common.

The National Archives also has a crowdsourcing project where the public can help transcribe the content of documents, much like Iowa’s DIYHistory project. It is quite interesting due to the variety of documents that it makes available, which range from typical historical documents to acts of theatrical workings.

The two projects that I contributed to were the MBDA and DIYHistory. Both were very easy to become involved in, as the account creation process was simple and straightforward. DIYHistory didn’t actually require an account for certain elements of the site, but creating one was encouraged in order to become fully involved in the transcription process. Both are quite valuable in terms of the documents that they have made available, but DIYHistory could be a bit more useful because it has documents covering more varied topics and document types. DIYHistory also gave participants more choice in the documents they transcribed, and it makes it easy to see which documents need more attention and how much work has been put into them.

Open Call: The Shortcomings of Crowdsourcing Digital History

When the once-popular “reality” TV show American Idol first aired, it seemed the personification of the American Dream for the 21st century. From the vantage point of the television set, one could watch singers belt their way to the top through hard work, good looks, and bubbly personalities. However, American Idol was perhaps most famous for its first weeks of “open calls,” the portion of the show’s season where any American could audition for the competition. These weeks were the most memorable because most of the people attending the open calls could not sing worth a nickel, sending the populace of this great nation into peals of condescending laughter.

Crowdsourcing digital history is a lot like those first few episodes in American Idol’s seasons: People who have a genuine interest in history are overjoyed that they have the opportunity for an “open call” where they can show their amateur chops at decoding and adding to a historical database. However, their attempts at this are often chock-full of errors because they have not been trained to interpret the subtleties of old documents. These well-meaning contributors inadvertently allow their biases and sloppiness as unpaid volunteers to taint these history projects. Nevertheless, just as open calls gave American Idol a lot of toll-free popular appeal, crowdsourced history projects harness the power of enthusiastic volunteers with very little financial cost involved.

One crowdsourcing project is Religion in American History, a site manned by Loyola University students. Their dilemma, similar to many in the crowdsourcing field, was that they had too many random documents in the school basement and not enough funds. Using their Flickr account, the students and their professors set out to pass some of their digitized library to helpers on the internet. Upon examination, the problem with Loyola University’s crowdsourcing project seems to be that not enough people are interested in helping out, and when they are, they write only the most cursory things. One comment for a picture that contained a full page of text merely named the heading of the text and nothing else.

Another site is a Flickr page curated by the University of Pennsylvania. Their work deals mostly with documents that are not American in origin but somehow made their way to the U.S. during the 17th-19th centuries (there’s an entire gallery devoted to an abecedarium). Called the Penn Provenance Project, it has tons of comments made by informed researchers. Everything is very well-organized (unlike Loyola’s Flickr) and well-photographed. This site seems to have flourished on crowdsourcing because it is obviously popular with scholars who have time on their hands. Contributing to these Flickr compilations is easy: all you need is a Yahoo account and you are set to comment to your heart’s content.

One non-success story of the digital crowdsourcing movement is that of heritagecrowd.org. Once a flourishing site, it was created “to encourage the crowdsourcing of local cultural heritage knowledge for a community that does not have particularly good internet access or penetration.” It was even set up so that people could contribute by text message and voicemail. However, in 2012, the site was maliciously hacked and destroyed. In a blog post about Heritage Crowd’s downfall, its founder Shawn Graham detailed why the site was susceptible, citing poor record keeping, too much free rein given to computer systems, and security loopholes.

Lastly, the Martha Berry Digital Archives was probably the most professionally operated crowdsourcing site I came upon. Boasting a great and organized layout for contribution, the MBDA site made it easy for contributors to tag, describe, and categorize material. In contrast to the above Flickr sites that I tried to contribute to, MBDA gave me instructions and a framework in which to operate. I would say that it was a superior site to most digitized history initiatives out there. However, I would still say that MBDA is just as susceptible to bad categorization as the Flickr sites are.

Crowdsourced digital history is appealing to the masses because of its inclusive nature and accessibility, but its chances of changing, adding to, and defining the scholarly historical field are slim. This is because the people contributing to crowdsourced history projects aren’t professional historians. Although there are of course some exceptions to this, the unrestricted, no-filter nature of digital crowdsourcing allows all types of faceless internet-surfers to place their stamp on the project at hand. Even though crowdsourced digital history might seem like an easy and cheap way to synthesize information, this “open call” makes historical documents vulnerable to butchering.

Crowd Sourcing and its Historical Applications

Crowd sourcing and its effectiveness are highly debated in the realm of history. The idea of crowd sourcing is that you get interested people from wherever they are to look at documents and transcribe, or decipher, what they mean. There are some sites that allow you to contribute information regarding a subject, while others are merely translations. Crowd sourcing will only become more widespread as access to the internet grows and outsourcing large amounts of work becomes easier. This concept has many benefits to add to the field of history, as well as a few drawbacks, but the key is to learn from the mistakes that get made and to try to minimize the number of problems that these types of sites have, because the process is a much needed one.

Crowd sourcing lets large numbers of people look at documents and decipher what they say, or review already transcribed works, so the information is easily accessible to the public for research needs or for finding specific information on something. It allows the staff of a library or an archive to sort through massive amounts of information fairly quickly with the help of people who are interested in browsing items that were previously unattainable. It allows people to use the internet to bridge the gaps of distance and see sources they otherwise would never see. Crowd sourcing allows for a more comprehensive and specific informational search, which allows for more thorough investigations of topics.

Not everything about these sites is good. In just looking through the four I browsed, and where I contributed, I found many errors that can really hinder the usefulness of the sites for historical purposes. There is the issue of people with no real historical background translating the documents. Depending on how you interpret what is important in the document, you may not tag it in the right spot, or you may find different parts important and tag those. This can leave a document that is really important to a certain topic out of the search results. Many of the sites let anyone help with the effort so that it is as open as possible, which increases the opportunity for mistakes. People of any education level and of any background can translate. You also have no way to know, unless someone reviews it, whether the translation is actually correct or whether it is full of spelling errors. In certain cases you may not even know where the added information came from or how true it is. Here are a few of the sites and how they compare to the Martha Berry Digital Archives.

The one I chose to contribute to was the DIY History website. This site is run by the University of Iowa, and it looks at Civil War diaries, transcontinental railroad documents, cookbooks, accounts of pioneer lives in America, and all sorts of primary documents that you would never see unless they were transcribed on the internet. A lot of the sources, while available to those who know about them, would go untouched due to the general lack of knowledge of their existence. I was concerned, however, with the ease with which I was able to just jump right in without any sort of background information given. There was no request to see why I was there or if I knew anything about the history of any of the categories that were there. The handwriting in the letters was really hard to read in places, and I am sure that on a few translations I made some mistakes that a trained eye for certain handwriting styles would not, but those are the trade-offs we make. Overall the site was really easy to navigate, the documents that I could read got translated fairly easily, and now those people who want to know about specific accounts of the people in Iowa during the Civil War will have access to do just that. When compared to the Martha Berry site, I feel like it was organized better into subgroups, but the ability to tag, as in the Berry site, is a crucial part of the effectiveness of a site, and DIY just didn’t have that.

Another site that I looked at was the New York Public Library’s What’s on the Menu? site. I actually thought that this site was really cool. The idea of crowd sourcing things like recipes is great. This alleviates the issue of interpretation, since the recipes are going to be the same across all backgrounds. The website itself was a little bit of a letdown, since you couldn’t review anything (when I tried to, anyway), but the idea behind it is appealing as a way to share recipes throughout history, or to look up dishes you want to try when you don’t have room to store all the old-fashioned cookbooks. These types of resources are what you would want.

Another site that many forget uses crowd sourcing is Wikipedia. Wikipedia shows, in my opinion, some of the best and the worst elements of the crowd sourcing argument. It allows anyone to contribute whatever they want to the site. Usually the information has a corresponding endnote if the user isn’t lazy, but you never know if you can trust what is put on a site like that. It is great for using the footnotes and going to find the original source if that is your goal. It is great for general sourcing, but you can’t trust where the information came from just by looking at the article. It does, however, in allowing so many people to post, have the greatest selection of topics and articles in the world. You can start a lot of projects just by going to the site and looking in the footnotes for possible sources to use.

Now we come to the Martha Berry Digital Archives. After looking at the other sites and seeing what they have to offer, I prefer the layout of the DIY History website over this one. The breakdown into subgroups makes it easier to find things. The tagging that this site has is something that I think all of the other sites should have but don’t. Archival sites, though, are different from other crowd sourcing sites in that they are merely transcribing, and people can’t start their own topics and add false information. There are bound to be errors, though, with anything that requires interpretation. It is important to have a system of rechecks to reduce the possibility of misreading or misinterpretation, and I think the archives do a good job of that. Hopefully in the future we will see more concern for the background of the individual, to make sure that they have some experience in the field of history. As it stands now, though, having a way to get through large amounts of information that would otherwise take years to sort is a much better plan than just allowing that information to sit gathering dust, when it could provide insight into a much debated topic, much like this one.

Crowdsourcing Blog Project

Since about 2005 the term “crowd-sourcing” has been used to describe a means of researching information in our society. The term is defined as “the practice of obtaining needed services, ideas, or content by soliciting contributions from a large group of people, and especially from an online community, rather than from traditional employees or suppliers.” In layman’s terms, crowdsourcing is a type of volunteer labor for obtaining information on specific subjects. Websites differ in how they allow this process to happen. Some websites will only allow certain aspects of documents to be transcribed, while other sites give editors a lot more freedom in their edits. Basically, these sites guide average people through the steps needed to contribute to cataloguing and improving documents. Now how could this practice benefit academics, specifically in the field of historical studies?

When someone thinks of how the facts we study in history are gathered, they probably think of experienced historians. These historians will devote their entire lives to one subject in order to learn more about it and spread that information. Crowdsourcing changes this process of obtaining historical information in many very positive and significant ways. The first positive is that crowdsourcing allows for community involvement in historical matters. Another benefit is the number of eyes on a document being edited. The more people who analyze a document, the greater the chance that errors will be spotted and corrected. Yes, there are trained professionals who analyze these documents for a living, but even professionals can make mistakes that need correction. Crowdsourcing allows documents to be reviewed numerous times for a relatively low cost. Crowdsourcing also allows the processes of researching and cataloguing information to be sped up tremendously.

Whenever a product has a positive impact, chances are negatives are also nearby. Crowdsourcing is no different. The negatives are probably very obvious when looking at a vast number of people contributing information. Not everyone who uses these websites is a reliable source of information. This ignorance can lead to documents being edited with incorrect information and, in turn, becoming very difficult to search for. Overall it seems the majority of contributors do have good intentions, which allows crowdsourcing to be a valuable tool in studies.

Here are four sites I visited that use crowdsourcing as a means of obtaining information:

  1. Martha Berry Digital Archives: This website uses crowdsourcing to catalog personal writing of Martha Berry. The crowdsourcing site includes a collection of personal letters, photographs, receipts, and other documents involving Martha Berry and Berry College. The site also includes other important documents that contribute to the history of Berry College. Contributing on this website was fairly easy, and only requires registering to begin transcribing. The ability to cycle through documents to edit is also a very helpful and user-friendly tool of this website.
  2. Collection of Civil War Diaries: This website uses crowdsourcing to catalog and edit firsthand accounts of war written by soldiers to their families during the Civil War. Editors can contribute here by editing transcriptions, correcting other editors, and transcribing handwritten pages. Overall this is a very well put together crowdsourcing website. Personally, I feel that the Martha Berry Digital Archives is a much simpler and more user-friendly website, and switching from document to document is more exciting with its random selections. This website, by contrast, gives you more specific documents to analyze in their entirety if you wish to do so.
  3. National Archives Transcription Pilot Project: This is the website I chose to contribute to. I loved how easy the site was to use, since it does not require registration at all. This allowed for quick and easy access. When using the site you are allowed to choose the difficulty level of the document you edit. Unfortunately, all the easier documents were already edited, and the intermediate levels were almost all done as well. This required me to look at a few of the intermediate documents, but mostly harder ones. I dealt with documents ranging from letters to my personal favorite, a leaded Plantation report. I helped to transcribe some of the content on a few of its pages. Another good thing about this site was its organization of documents by date and completion percentage. This helped me find documents.
  4. What’s on the Menu?: (Done by New York Public Library) This website was a very interesting crowdsourcing site. The site requires no registration to participate. I messed around with the site some, but could not find any substantial contributions to make besides labeling a price or two. I liked the site, but would like to see more things to edit. The problem is probably the fact that there is no registration required, which in turn attracts a lot of contributors to catalogue and transcribe the numerous historical restaurant menus.

Having mentioned my number three crowdsourcing site, I do have to say the Martha Berry Digital Archives was a much simpler choice for me. Although it did require registration, I believe it to be more helpful for anyone seeking to contribute. For one, the National Archives Transcription Pilot Project made it very difficult to find articles that I could easily contribute to. I had to dig for a few minutes to find one that needed contributions. MBDA avoids this by simply guiding you to the material that needs work. Overall I would choose Martha Berry’s crowdsourcing site for both research and contributions.

Digital History and the Effects of Crowdsourcing-Marissa Fulton

Technology has become a vital part of today’s society. Therefore, it only makes sense that technology would ultimately invade the realm of history. Digital archives are making documents more accessible than ever before. People who were previously limited by geography are now able to view documents they would not have had access to. Many digital archives utilize crowdsourcing to transcribe documents and make them available to the public. This gives people the opportunity to become involved in their history. This also means that various archives will not have to use funds to pay people to transcribe the documents they strive to make available. However, there are drawbacks to crowdsourcing. When the untrained public is in charge of editing documents things are likely to go wrong. Those editing might tag a document incorrectly, making it difficult to find. They may also not see the importance of certain documents and overlook something of great value. While these are serious obstacles, I believe the pros of crowdsourcing outweigh the potentially negative consequences.

In furthering my knowledge about crowdsourcing I perused four digital archives, two of which I contributed to. In the next few paragraphs I will review my experiences with each site.

  1. The Citizens’ Archivist Dashboard is a branch of the National Archives. One does not have to register to contribute to this website. However, it can be difficult to contribute to. Because of the popularity of the National Archives, almost all of the available documents have been transcribed.
  2. The University of Iowa’s DIY History has documents on the lives of pioneers, women in Iowa, the Transcontinental Railroad, and Civil War diaries. Anyone can begin editing by choosing a document, or they can register to better track their edits. On this site, as opposed to tagging and summarizing, you transcribe the entire document. The site shows which documents have not been started, which ones need review, and the percentage completed for each document that has been edited.
  3. One of the sites I contributed to was The Cambridge Public Library. On this site one edits the transcriptions of the Historic Cambridge Newspaper Collection. These newspapers have been uploaded onto the website. They have all been transcribed, but those who register on the site can edit the transcriptions comparing their accuracy to the uploaded versions of the articles. The existing transcriptions possess many typos, so there is much to do for those registered.
  4. Finally, I contributed to the Martha Berry Digital Archives. After registering I was able to browse documents. After choosing a document, the editor can write a short description and tag the document based on the subject matter and location.

While both the Cambridge Public Library and the Martha Berry Digital Archives were user-friendly, I found the Cambridge Public Library a far easier experience. However, the Martha Berry Digital Archives does offer more interesting and varied possibilities. With the Cambridge Public Library you choose a newspaper, month, date, and year. You then select an article and edit the transcription provided for the article. This was a fairly brainless experience. Therefore, while it was easier, it was not as interesting as editing for the Martha Berry Digital Archives. There, one can select a document from a diverse collection, summarize, tag, and provide a location for the document. Thus, while the Cambridge Public Library was an easier experience, it was not as enjoyable as getting to know the varied documents that the Martha Berry Digital Archives possesses.

History: No Longer a Thing of the Past

Now that we’ve moved into a digital age and almost everything uses technology, it is no surprise that archives and anything historical have started being digitized. Digitization modernizes the research of history while still incorporating the feel of digging through documents in an archive. Using crowdsourcing for digital history projects brings anyone into the mix; historians and non-historians can involve themselves in these projects as never before. It’s the combination of information and help from any person with internet access that makes crowdsourcing so important to modern historical presentation.

Crowdsourcing brings many benefits to the availability of history while also causing problems within digital history projects and any digital archive. Some benefits of crowdsourcing come from the interaction with the community. Almost all of the contributors are purely volunteers taking time out of their own lives to enjoy delving into historical information. This is an easy way to gain the help needed without the monetary costs. Crowdsourcing also brings dozens of viewpoints to broaden the opinions and abilities of its editors and contributors. Literally anyone can contribute to an online archive without having to leave their homes or without having a degree in history, but that does not always bring the benefits these archives desire. While transcribing or editing a document in a digital archive, the mistakes can add up. The lack of an official set of rules regarding editing and tagging could cause important continuity issues. Giving editors the freedom to provide their own interpretations of the information, albeit important and exciting, could bring some problems regarding the historical accuracy.

To research crowdsourcing on digital history projects, I found three websites that use the public to gather and edit historical information:

  1. The DIY History transcription project from the University of Iowa contains 19th-century documents and diary entries discussing the daily life of the pioneers and life during the Civil War, among other topics. This site exemplifies transcription through crowdsourcing. I found the site’s ease of access wonderful for any user interested in transcription, but the handwriting and the quality of the documents would be tough for a novice editor to navigate.
  2. The National Archives provides many crowdsourcing options through their Citizen Archivist program, like Transcription and Tagging, that allow anyone to involve themselves in history. The transcription option doesn’t need a login and is incredibly similar to the DIY History website, but it contains a much larger variety of documents that would attract a larger audience. The tagging option, on the other hand, requires a login and uses a review process before the National Archives accepts your tags. Both programs are put together well and are full of rich historical documents.
  3. The last website I discovered was the Cambridge Public Library’s Historic Cambridge Newspaper Collection. This site allows for correcting transcriptions on Cambridge newspapers from the 19th to early 20th centuries. You can choose specific articles within the entire newspaper and correct the transcriptions already in place. The task is a bit tedious, but it still brings an interesting aspect to transcription.

Alongside these other websites, I have contributed copious amounts of time to the Martha Berry Digital Archive. This website is devoted to the letters and images within Martha Berry’s correspondence, and the site includes thousands of these documents for editing and tagging purposes. Not only does it allow the editor to contribute descriptions, dates, and titles to the documents, it also provides options for tagging and for mapping the location of the letter. I also created an account on the Tagging site from the National Archives and the Cambridge Newspaper Collection, but I spent the most time with the National Archives. Tags on both MBDA and the National Archives are reviewed before they are officially locked, in the case of MBDA, or posted by staff, in the case of the National Archives. The searching ability on the National Archives website is much more organized than that of MBDA, but one must take into consideration the amount of help from volunteers the National Archives receives compared to the small scale of MBDA. For such a new endeavor and one from such a small school, the Martha Berry Digital Archive is a step above the rest. The number of specific options for editing a document and the ability to choose a random document create an ease of access that pushes past the cumbersome tagging system and the variations in editing techniques and styles.

Caveat Emptor: The Rise of the Bored Cheetos-Fingered WWII/Civil War/Vietnam Expert Armchair History Buffs with a Computer

Crowdsourcing digital history has been a rapidly growing aspect of the field of historical preservation. Utilizing the resources of the public, including personal effects such as old letters, photographs, and other documents and objects, has made a large amount of material, especially primary documents, readily available to a wide audience. Furthermore, many institutions have opened up crowdsourcing initiatives to the general public, where the ordinary person can take part in identifying, transcribing, and tagging scanned documents, photographs, paintings, maps, etc. This has allowed many archival collections to expedite the process of labeling the countless (and expanding) number of documents that are being scanned beyond the capacity of the archives. This allows collections to be completed in a very short time and at zero cost to the archives, libraries, or organizations, which are often already short of funds due to budgetary cuts. While this new initiative, a part of the general “digitalization of the humanities,” could serve as a great tool for the field, there are a great number of inherent and long-term risks that could be devastating to the historians’ and archivists’ craft.

There are benefits to this growing movement. Digitization of many primary sources is a means of preservation, especially for paper products that were originally made using an acid treatment method. While there is a risk of losing electronic data permanently, it is easy to create back-up copies, and essentially the information will be forever preserved. The almost universal availability of the Internet also allows wider access to this information for everybody, including professional historians, undergraduate or graduate students, and the WWII/Civil War/Vietnam armchair expert history buffs we all know and love… Labor-wise, having a ready pool of volunteers, students, and assistantships allows this kind of work to be done at a lower cost than hiring a professional archivist and at a much higher volume of productivity. Whereas some projects might take years to accomplish, the mobilization of several thousand students and WWII/Civil War/Vietnam armchair expert history buffs with nothing else to do but sit in front of a computer screen munching on Funyuns between episodes of Band of Brothers could accomplish them in a matter of months with the utmost enthusiasm.

Even with these benefits, I find the negative aspects of this movement to be outright disturbing. Perhaps I am a history major Luddite at heart, but I see this as cheapening and damaging a field that already seems to be anathema to the vocationally minded STEM wackjobs who have a propensity to dictate the terms of value to particular areas of study. If it wasn’t for the presence and advocacy of the great tycoons of Wall Street, history would certainly have been relegated further into the “worthless” degrees of philosophy and anthropology. Bah…

If I put it into these terms, would you want the world champion of Call of Duty or Halo to be the next head of the Army, or Dick Vitale to be the next coach of Purdue University’s basketball team? Absolutely not! It is disturbing to hand the responsibility of tagging potentially valuable documents to someone who has zero training in the field of history: the Cheetos-fingered WWII/Civil War/Vietnam armchair expert history buff. The ability to properly interpret primary sources is an advanced skill that is not even expected of undergraduate students; in short, it can be dangerous to wrongly interpret documents or improperly weigh their value. Even as a senior history major, I do not feel comfortable analyzing some of these documents, and my poor tagging skills have already testified to that fact. Employing these volunteers as a workforce can also undermine the value of historians and archivists. If there is a vast pool of labor willing to work for free, it incentivizes institutions to cut back on funding such positions. We have seen this in other occupations, librarians for example.

Beyond that, I would close with a caveat emptor. Progress is not always something that can be stopped, and the perception of its unanimous benefits is subjective. As a (hopefully) future historian, I would say this: do not let the WWII/Civil War/Vietnam armchair expert history buff define this field of study. Otherwise, we may be on the inevitable path of becoming like the History Channel: nothing but drivel that caters to the great unlearned masses in a Kardashian-bedazzled, pimped-out Escalade to hell.

As for my contributions to digital archives, I focused on the Martha Berry Archives, the New York Public Library Map Warper Project, the Smithsonian Digital Volunteers, and the US National Archives. Overall, the Martha Berry Archives and the Smithsonian Digital Volunteers are the most intuitive and user-friendly sites for newcomers (although I would argue the Martha Berry Archives is the best). Their layouts are reasonably similar and do not require much time to learn. The New York Public Library Map Warper Project is very interesting in that it uses old maps dating from the American Revolution through the contemporary era, and the volunteer “warps” the old maps using a polygon interface system to superimpose the old map onto a modern satellite image of the same area. This allows urban historians to track changes in city planning, street development, and lots, especially in the years after 1870 as cities began to expand. While I think this project is incredibly interesting, the many steps in the process make it complicated, and anyone unfamiliar with maps will have an exceptionally difficult time with it. Many of my attempts came out skewed enough when superimposed that I simply scrapped them and left the work to someone with much more practice. I did find that the New York Public Library was the easiest to sign up for, and there are numerous tutorial videos, as opposed to just a written list of directions. If you like maps and are eager for a challenge, I would urge you to use this as a way to get involved in helping urban history. There is not much to misinterpret; the only risk is being unable to properly locate streets or use the polygon sketching software.

The Basics of Crowdsourcing

Many hands make light work. This old saying sums up the theory and benefit behind crowdsourcing projects. It holds true whether applied to transcribing projects or those driven by crowd submission. The beauty of crowdsourcing is that organizations can put their materials online or post a request for submissions and someone halfway across the country could help fulfill the goal of the project despite the distance.

For example, the creators of the Martha Berry Digital Archive uploaded scanned images of the documents in the Archives at Berry College and asked the public to help transcribe and tag the images to make them searchable and available to a widespread audience. Similarly, the New York Public Library has a project titled “What’s on the Menu?” which asks for the public’s help transcribing one of the largest restaurant menu collections in the world, with some documents dating as far back as the 1840s. Another way projects can utilize the contributions of the public is by letting users add their own historical documents for others to view. The National Museum of American History has a project called the Agricultural Innovation and Heritage Archive which does just that by letting its users upload stories about experiences they, their family, or their friends have had with agriculture and its many challenges and rewards. Some projects are a combination of both types of contribution opportunities, like the Edgerton Digital Collections Project. This online project was created to honor Harold “Doc” Edgerton, an innovative MIT professor of engineering who passed away in 1990, by allowing friends and former students to post stories and memories they have of Doc as well as helping to transcribe and tag digital copies of his research notebooks.

I contributed to both the MBDA and the Doc Edgerton projects. While both were reasonably easy to navigate and use, each had its own pros and cons. On the Martha Berry site it is easy to filter and find documents that have not yet been worked on. This is not the case for Doc’s notebooks, which have to be clicked through until the contributor finds one that has not been completed yet. However, once you find a document, the written text is automatically enlarged to make it easy to read alongside the fields that need to be filled out. The MBDA site, on the other hand, allows you to zoom in, but sometimes you have to leave the bigger picture in order to type the text. Overall, these complaints are small; the two sites are not difficult to use, and they make crowdsourcing quick and easy.

While letting so many people contribute to these projects does make the work of digitizing or compiling go much quicker, it also takes away the consistency that would come from one person or a small number of people working on the tasks in a uniform manner. This is especially true when it comes to tagging a piece: what one user thinks is unimportant in a document might be the thing that someone else is searching for, which could leave research incomplete. However, projects like the Agricultural Innovation and Heritage Archive would not be as complete if it were not so easy and convenient for contributors to simply upload their stories in order to preserve them. The main benefit of the digital databases created by these crowdsourcing sites is that more people can access archives that they traditionally would have needed to travel to see. Someone can access files housed in Washington, DC from London, which can help research grow in ways that would previously have cost vast amounts of time and money. One negative aspect on the other side of this coin, however, is that traditional research allows the evidence to speak for itself. When searchable text is the way someone looks through documents, they do not see every piece from a folder as they would if they were sifting through the documents in person. This could lead to a biased interpretation of the sources and false conclusions because of an incomplete picture. Overall, every method of research will have positive and negative aspects. Crowdsourcing is a convenient and fast way to create databases that can be accessed from a distance by many people.