Category Archives: E. Rayburn

The Basics of Crowdsourcing

Many hands make light work. This old saying sums up the theory and benefit behind crowdsourcing projects. It holds true whether applied to transcription projects or those driven by crowd submission. The beauty of crowdsourcing is that organizations can put their materials online or post a request for submissions, and someone halfway across the country can help fulfill the goal of the project despite the distance.

For example, the creators of the Martha Berry Digital Archive (MBDA) uploaded scanned images of the documents in the Archives at Berry College and asked the public to help transcribe and tag the images to make them searchable and available to a widespread audience. Similarly, the New York Public Library has a project titled “What’s on the Menu?” which asks for the public’s help transcribing one of the largest restaurant menu collections in the world, with some documents dating as far back as the 1840s. Another way projects can use the contributions of the public is by letting users add their own historical documents for others to view. The National Museum of American History has a project called the Agricultural Innovation and Heritage Archive, which does just that by letting its users upload stories about experiences they, their family, or their friends have had with agriculture and its many challenges and rewards. Some projects combine both types of contribution, like the Edgerton Digital Collections Project. This online project was created to honor Harold “Doc” Edgerton, an innovative MIT professor of engineering who passed away in 1990, by allowing friends and former students to post stories and memories they have of Doc as well as to help transcribe and tag digital copies of his research notebooks.

I contributed to both the MBDA and the Doc Edgerton projects. While both were reasonably easy to navigate and use, each had its own pros and cons. On the Martha Berry site, it is easy to filter and find documents that have not yet been worked on. This is not the case for Doc’s notebooks, which have to be clicked through until the contributor finds one that has not been completed yet. However, once you find a document on the Edgerton site, the written text is automatically enlarged so that it is easy to read alongside the fields that need to be filled out. The MBDA site, on the other hand, allows you to zoom in, but sometimes you have to leave the enlarged image in order to type the text. Overall, these complaints are small; neither site is difficult to use, and both make crowdsourcing quick and easy.

While letting so many people contribute to these projects does make the work of digitizing or compiling much quicker, it also takes away the consistency that would come from one person or a small group working on the tasks in a uniform manner. This is especially true when it comes to tagging a piece: what one user thinks is unimportant in a document might be the very thing that someone else is searching for, which could leave research incomplete. However, projects like the Agricultural Innovation and Heritage Archive would not be as complete if it were not so easy and convenient for contributors to simply upload their stories in order to preserve them. The main benefit of the digital databases created by these crowdsourcing sites is that more people can access archives that they traditionally would have needed to travel to see. Someone in London can access files housed in Washington, DC, which can help research grow in ways that would previously have cost vast amounts of time and money. The other side of this coin, however, is that traditional research allows the evidence to speak for itself. When someone looks through documents by searching text, they do not see every piece from a folder as they would if they were sifting through the documents in person. This could lead to a biased interpretation of the sources and false conclusions drawn from an incomplete picture. Overall, every method of research will have positive and negative aspects. Crowdsourcing is a convenient and fast way to create databases that can be accessed from a distance by many people.