Monthly Archives: March 2014

Is Crowdsourcing Digital History Useful?

The big question is: what can crowd-sourcing accomplish that professional historians, many of whom have devoted their entire livelihoods to a subject, cannot? Crowd-sourcing projects are apparently looked upon with disdain by many academic historians. But I believe there is some value in crowd-sourced digital history projects. There are only so many professional historians in the world, and even if they are skilled at their jobs, there are things they miss, and they make errors of their own. Admittedly, not everyone from the public who participates is going to be correct either; that may be inevitable. But for history as a subject to thrive, it must get fresh blood every now and then. I believe crowd-sourcing is a good way to get such people involved. It may even convert some people to an interest in history.

There are pros and cons to everything, and crowd-sourcing digital history is no different. The pros involve allowing new people and fresh ideas into a historical project. Maybe they will catch something the original creators of the archive did not; maybe they will be able to improve upon it. Making historical sources more accessible to the public could go a long way toward opening people’s eyes to the importance of history. The cons come from the same place: opening a project to the public will bring in all kinds of people. Inevitably there will be some who get something flat out wrong or are not even remotely qualified to help with such a project. But I figure this would be a small issue: who would even get involved with such a project without at least a passing interest in the subject?

I’ve found four crowd-sourcing projects on recent US history:

  1. Cambridge Public Library Historic Newspaper Collection: In this project, public volunteers correct the OCR text of scanned historic Cambridge newspapers and magazines, including those dating from the first quarter of the 20th century. Members of the public must register on the site before they can contribute.
  2. Brooklyn Museum Collections: In this project, the game is to vote on the relevance of identification tags on the museum’s collection, which includes several pieces made in the 20th century. You earn points for each round of voting. You must register with the site before you are allowed to play.
  3. Citizen Archivist, National Archives: It is essentially what the title says. It allows the public to tag images and records, transcribe historical documents, help index the 1940 census, contribute to articles, and share photographs. The problem is that you must register on the site before you can edit, and you must wait for the site administrator to approve your registration.
  4. Korean War Mysteries: A crowdsourcing project built around Korean War documents on POWs who went missing in that war. It is less about editing and more about sharing information to help close these mysteries. It is doubtful that you will be considered for it unless you demonstrate proficiency in the subject.

I contributed to the Brooklyn Museum Collections, and compared to the Martha Berry Digital Archive, it’s relatively simple. All you have to do is register, start playing the game “Tag you’re it!”, and vote on the relevance of the tags already assigned to a collection piece. It’s purely art, with no documents, unlike the MBDA. It’s laughably easy; the only reason I imagine they make you register with the site is to prevent people from messing with the tags. Anyone not really interested, or there just to cause trouble, is likely not going to bother with registering.

Crowd-sourcing has a few cons, but overall I see the potential contribution to the digital history field as very good.

-Dexter T. Thomas

 

 

 

Digital Archives and Crowdsourcing

In the age of technology, most people are used to having information at their fingertips. More and more digital archives are emerging on the internet. These archives rely on the public, through crowdsourcing, to transcribe and edit documents, records, and pictures en masse. Digital archive sites are beneficial to members of the public who are unable to travel to the physical location of these documents. Crowdsourcing is useful because it gets the public involved and the institutions do not have to pay people to transcribe or edit the sources. If done correctly and administered properly, digital archives and crowdsourcing can be tremendously useful for the public. Unfortunately, this is not always the case.

Crowdsourcing can be seen as a tool that allows large numbers of the public to edit and transcribe sources, work that otherwise could not be done. The problem occurs when the public edits a source incorrectly or over-edits it. Information can be lost through too much editing or overlooked through too little. Tagging a document makes it easier for the public to find. If a member of the public tags an item incorrectly, or uses different wording to tag it, the item can become even more difficult to find. Also, the public sometimes does not understand the importance of some of the sources, so something of extreme significance could be overlooked. The purpose of digital archives is to let the public see sources that they otherwise might never see. If the public cannot find the sources that they need, then that purpose is unmet.

The Martha Berry Digital Archives is a collection of articles, letters, and pictures that pertain to Martha Berry and the Berry Schools. The site is user friendly and gives the user options for searching and editing the sources. When editing the scanned documents, the user can tag the source with key words and record where the source originated. The site allows the user to write a summary of the document but not to transcribe it fully. Documents can be edited multiple times by other users until the digital archives administrator closes the document to edits.

Another site is the National Archives’ Citizen Archivist Dashboard. This site allows the public to contribute to some of the digitized National Archives’ records, documents, and pictures. The site can be limiting to the user. It looks user friendly at first, but when a member of the public wants to begin editing or tagging the sources, the process is difficult. The directions seem simple, yet they prove hard to follow in practice. This is a good site for members of the public searching for documents or pictures from certain time periods or events. Since it is the National Archives, many people have used this site and have provided great editing for the sources. Its popularity also means that the user is more likely to be editing what someone else has already edited.

The New York Public Library’s What’s on the Menu? site has historical restaurant menus. The public helps by transcribing all of the dishes from the menus. Over a million dishes have been transcribed from over 17,000 menus. There are descriptions of the dishes and even recipes for the public. The site is friendly to users and the public, but its administration is lacking. Either the project is done or the administrators cannot keep up with the number of contributors, because there are no more menus to transcribe at this time. Overall, the site is great for those who are searching for a certain dish or menu from a restaurant.

The University of Iowa Libraries’ DIY History digital archives site is an assortment of diaries from the Civil War, correspondence about the transcontinental railroad, and diaries about women and prairie life. This is one of the most user-friendly sites that I visited. There are still plenty of items that need to be edited, and the directions are clear and understandable. The administration appears to be actively monitoring the site, since many of the documents are marked “complete”. On this site, unlike the Martha Berry Digital Archives, the editor can fully transcribe the text. The public can still review others’ edits, but there are more “complete” documents than “needs review” documents. This site was just as easy to navigate as the Martha Berry Digital Archives site, but it is harder to search for certain documents because there are no tags on the DIY History site.

Digital History–Ree Palmer

Considering the scholarly disdain for crowdsourced websites such as Wikipedia, one would assume the same negative response to the crowdsourcing of historical archives online. However, the combination of decreased funding for historical preservation and the push for digitization of documents allows for a unique solution: crowdsourcing archives. Projects that allow crowdsourcing use volunteer labor from the general public to label, catalogue, and sort through the scanned files of physical archives. Some websites go so far as to allow participants to completely transcribe documents. Crowdsourcing websites that guide online volunteers through the process of correctly cataloguing documents can be a creative solution to limited funding and pressures for online access to archives.

With any solution, however, there are positives and negatives. On the positive side, crowdsourcing allows online archives to quickly sort through thousands of documents and label them so that researchers can search them easily. Crowdsourcing also allows the general public to become involved in the historical process and feel a sense of connection to the documents. With the exception of the expense of creating the website software, crowdsourcing is low cost for the archive or sponsor, since the contributors work as volunteers. On the negative side, crowdsourcing gives the untrained public access to unreviewed documents, which allows for errors in transcription and tagging. Searches cannot locate documents with incorrect tags, and documents may lose their usefulness if they cannot be accessed.

In researching crowdsourcing history, I have looked over five digital archives that allow crowdsourcing, four of which are reviewed below.

The University of Iowa’s DIY History Project allows participants to transcribe documents from the Iowa Digital Library and features collections of 19th-century documents. DIY History is rich in its collection of personal diaries and letters, including selections from pioneers, the Transcontinental Railroad, and the Civil War. Although the project should be noted for its ease of access (no registration is required), ease of transcription is another matter. All documents are handwritten and require complete transcription for text search capabilities, making the project difficult for those untrained in 19th-century script, spelling, and shorthand.

The Citizen Archivist Project, a division of the National Archives, provides three crowdsourcing opportunities: tagging, transcription, and contribution. For beginners, the project allows tagging of photographs and artwork from the past two hundred years of American history. Participants, after a quick registration, can attach tags to files to support future searches. Participants can also transcribe a collection of American history documents, mainly public records, from the 18th through the late 20th century. Transcribers can select easy, medium, or advanced documents, though the easy and medium sections are complete. In addition, participants can contribute their own historical documents, or those of family members, to the site.

The War Department Papers allow transcribers to contribute to over 45,000 papers from the United States War Department from 1784 to 1800. The site caters more to academic contributors, as it matches participants to documents based on their research interests. Documents on this site include public records, speeches, and letters from key figures in the Early Republic, and the archive itself is a goldmine for research. That being said, this is not a crowdsourcing site created for the casual editor looking for a quick project: documents require full transcription of 18th-century script, which can be tedious.

The Smithsonian’s Digital Volunteers Program allows participants to review and label a plethora of materials from different collections across the Smithsonian Museums. This review, however, will focus on the collections under the National Museum of American History, which include certified currency proofs from Alabama, Alaska, Arizona, Arkansas, and California. The Smithsonian’s interface guides participants through a series of questions about each proof, makes editing easy, and includes a check that catches incorrect entries. Although the “A” states are nearly complete, California’s collection allows for both labeling and reviewing descriptions created by others.

In addition to reviewing the four sites above, I worked as a contributor on both the Smithsonian’s project and the Martha Berry Digital Archive. The contribution process is similar: both require a simple registration and present a set series of questions about each document, including author, date, and tags. The Smithsonian’s search capabilities for editors are a strong point; contributors can search for and select the exact document they wish to edit, while MBDA’s site only lets editors “view a random unedited document” and hit next until they find one they want to edit. MBDA does, however, offer a community of editors and a system of badges for editing achievements.

 

 

Digital History for use by the public

The use of digital history, along with crowd sourcing, has become a growing field of interest in recent years. Digital archives provide a great benefit to the public in that they allow greater access to historical documents: people are no longer restricted to the hours an archive is open on weekdays, with perhaps no access at all on weekends. In addition, with the use of tags, people can find documents related to a subject rather than having to dig through individual documents by hand. This, however, raises one of the issues of digital history: documents need to be tagged correctly in order to be found by the public.

While crowd sourcing is not a new concept, it definitely provides a helpful push for digital archives, giving them a larger pool of editors to work through the large number of documents that need to be brought into a digital archive. It also gives the general public a sense of involvement in history, particularly if it is history of local importance, as with the Martha Berry project here at Berry. However, while it makes use of a willing public and gives the community a sense that they are involved with their own history, it creates problems in maintaining consistency in editing and documentation, though this can be limited somewhat by the forms that a project requires editors to fill in online. Another negative is that even when archives are opened up to the public, some documents may still go unseen because an agency does not wish to give up the image that a particular person or institution wants to portray of itself through these documents.

 

The crowd sourcing sites that I found included the Martha Berry Digital Archive, the New York Public Library’s (NYPL) What’s on the Menu archive, and two projects from the University of Iowa: a collection of Civil War diaries and letters between soldiers and their family members, and the correspondence of railroad baron Thomas Durant during the construction of the transcontinental line. Each project has its own positives and negatives. Both of the projects from the University of Iowa provide a bar showing the percentage of each document that has been looked at, and they mark documents that need review, so people know which documents to target first. However, beyond that there is little indication of the progress of each document. It is, though, very easy to make changes to someone’s edits, particularly if they have left a section in brackets to indicate that they are only guessing at a word; this lets other users come along and confirm what the word or even the letter is.

The “What’s On The Menu?” archive from NYPL is an attempt to archive menus from the New York area dating all the way back to the 1850s. While it gives users the chance to transcribe entire menus, including dishes and prices, it has a problem: the Library has uploaded very few new documents, so those who wish to contribute are often stuck simply looking over others’ work.

Between the Martha Berry Archive and the Civil War Diaries, the two that I spent the most time contributing to, both are easy to use. However, Martha Berry requires that one be logged into the site in order to make changes to documents, while the Civil War Diaries were completely open. The login requirement provides a bit of a buffer against malicious changes to transcriptions. In terms of capabilities, Martha Berry made much greater use of tags for future searches and also recorded authors and date ranges. It could also be considered much easier, as it did not ask users to transcribe entire documents; it asked only for a brief summary of the document’s contents, its location, and its title. And when a document is typewritten, these edits are much easier to make than for a handwritten diary.

Paul Shamblin

With the rise of the Internet era, the way we study and research history has changed. One example of this change is the rising interest in digital archives that rely on crowdsourcing. There are many benefits to using crowdsourcing: the potential for transcription and donations increases as more people sign up to give their time to an archival project. Having lots of people helping out can let a museum digitally archive its collections in a fraction of the time it would take a dedicated team, and because crowdsourced work is done for free, the funds can be allocated elsewhere. Crowdsourcing also opens up a whole new way for historians to get access to materials; by allowing contributors to post their own content instead of just editing what has already been posted, historians can gain access to a vast world of historical material that otherwise would never have seen the light of day. However, crowdsourcing is not all positive. It gives the public a chance to sabotage efforts to reliably digitize sources, which can lead to inaccurate work that harms the progress of the project. Crowdsourcing projects can combat this by requiring users to register before they contribute, but adding that extra hurdle can often deter people who would otherwise have contributed.

For this project, I visited four different sites that rely on crowdsourcing and explored what they have to offer.

  1. The Martha Berry Digital Archives offers a wide variety of Martha Berry’s personal documents, including her letters, photographs, and other such material. It also has a variety of thematic collections that document a specific time period in Berry history, like WW1, or the family history of Martha Berry. Contributing only requires registering as an editor and then transcribing documents as they are assigned, a fairly painless process.
  2. The First World War Poetry Digital Archive features a collection of wartime poetry by many poets, some of whom would later become famous. The site also features over 6,000 items submitted by the public since 2008; a brief journey through the submissions will turn up anything from a Valentine’s card to an autograph book from a Red Cross nurse. Contributing to this site is more challenging, because it can only be done at special times when they ask for assistance.
  3. The Texas Tech Vietnam Center and Archive includes a wide variety of documents relating to the Vietnam War; they have oral histories, photographs, home movies, letters, and more. They accept any kind of contribution as long as it meets their criteria for inclusion, and after it is processed they will include it in their digital archives.
  4. The National Archives Transcription Project includes a wide variety of documents from American history that are found in the National Archives, and relies on visitors’ submissions to transcribe the documents for research.

The two that I contributed to were the Martha Berry Digital Archives and the National Archives Transcription Project. Submitting to these two was fairly easy; the only difference was that the Martha Berry archives required registration while the other didn’t. The National Archives had its documents organized by how hard they were to transcribe, which made the process more user friendly; however, the Martha Berry Archives offered the transcriber many more options, like the ability to tag the document or give it a position on a Google map. In general, these two sites were easy to get involved with and contribute to, and they offer a wide variety of materials to choose from.

 

The Beast Inside the Machine: the Downfalls of Crowdsourcing

The creation and maintenance of digital archives is very valuable to those wishing to do historical research. Being able to simply go online and have thousands of primary sources at your fingertips can greatly expedite the research process. One has the opportunity to browse and cite sources that are located hundreds of miles away without having to worry about travel and its expenses. With this said, there are still a lot of drawbacks both to allowing the public to edit online resources and to having resources based online.

For instance, one major problem I came across when searching for various crowdsourcing websites involved the Heritage Crowd Project (http://heritagecrowd.org/). If you go to the link, you can see that the website was hacked. This could potentially lead to several problems. If you are using resources on a site and the site is hacked, it can obviously disrupt the continuation of your research.

Another problem with crowdsourcing is that anyone can contribute. While you do not need a BA in history to figure out how to transcribe a document, it does help to have some prior historical knowledge of the period, as well as some basic editing skills, in order to transcribe a document accurately. It helps that a document can be edited and fixed by other contributors; however, there will still generally be a high frequency of errors. These errors can occasionally lead to the misrepresentation of a document and thus undermine the legitimacy of research.

The first site that I looked at, though not really recent US history, was on the University of Iowa’s website (http://diyhistory.lib.uiowa.edu/transcribe/scripto/transcribe/3110/70425). This website had a large collection of letters written between 1850 and 1900. Many of the letters were written by soldiers who fought in the Civil War. You can go onto the website, search for letters, and immediately begin editing right on the page. While this is very easy, no tags are provided. Tags make it much easier to find letters that actually pertain to your research, and they would greatly improve navigation on this site.

The other two sites that I looked at worked with the digitization of newspapers. The University of California created the California Digital Newspaper Collection (http://cdnc.ucr.edu/cgi-bin/cdnc). The documents on this website have been scanned and transcribed onto the site. You can easily go online and begin editing. You pick a paper, highlight a section, and begin correcting the text. The Cambridge Public Library has a similar editing process on its website (http://cambridge.dlconsulting.com/cgi-bin/cambridge). This process is actually one of the better ones I have seen; however, the Martha Berry Digital Archives is the best of the four that I looked at.

The Martha Berry Digital Archives is better than the other sites that I looked at for several reasons. MBDA is easier to navigate and has a more user-friendly editing process. There is also an easy way to keep track of the documents you have edited (https://mbda.berry.edu/items/browse). In addition to this, there are tags for all of the documents, which can really help you during the research process. The map that allows you to record a document’s location is also very helpful and sets the MBDA apart from the other sites I looked at.

Overall, crowdsourcing is very helpful in historical research. Despite the fact that it’s helpful, there are still a lot of negative aspects that must be addressed before it can truly be a reliable resource.
