All content in digital formats can be characterized as structured or unstructured data. In actuality, all data is structured: even typing on a keyboard “structures” a text as a file of alphabetic characters, with each keystroke tied to a character code such as ASCII. The distinction of one letter from another, or of a letter from a number, structures the data at the primary level. But the term “structured data” refers to a second level of organization, one that allows data to be managed or manipulated through that extra structure. Common ways to structure data are to introduce markup using tags, to use comma-separated values (CSV), or to use other data structures. The distinction between structured and unstructured data has ramifications for the ways information can be used, analyzed, and displayed.
Structured data is given explicit formal properties by means of the secondary level of organization, or encoding, referred to above. This encoding uses extra elements (e.g., tags), data structures (tables, spreadsheets, database collections), or other means to add an extra level of interpretation or value to the data. The term unstructured data is generally used to refer to texts, images, sound files, or other digitally encoded information that has not had a secondary structure imposed upon it.
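To make the distinction concrete, here is a minimal sketch in Python that shows the same information three ways: as an unstructured sentence, as text with markup tags, and as comma-separated values. The sentence, the tag names (`author`, `title`, `place`, `date`), and the column names are made up for illustration; they do not follow any particular encoding standard.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Unstructured: a plain sentence. A program sees only a run of characters.
unstructured = "Mary Shelley published Frankenstein in London in 1818."

# Structured with markup: illustrative (hypothetical) tags label what each
# piece of the text is, so a program can ask for "the author" or "the date".
marked_up = (
    "<sentence><author>Mary Shelley</author> published "
    "<title>Frankenstein</title> in <place>London</place> in "
    "<date>1818</date>.</sentence>"
)
root = ET.fromstring(marked_up)
author = root.find("author").text
year = int(root.find("date").text)

# Removing the tags gives back the original, unstructured sentence.
plain = "".join(root.itertext())

# Structured as comma-separated values: the first row names the columns,
# and each later row is one record.
csv_text = "author,title,place,year\nMary Shelley,Frankenstein,London,1818\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

print(author, year)
print(plain == unstructured)
print(rows[0]["title"])
```

In both structured versions the words themselves are unchanged; what is added is a second layer that tells software how to interpret them. Note that the CSV reader returns every value as text, so the year stays a string until a program converts it.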
How are online documents encoded in order to be machine-readable? In the humanities, why is this done? Regardless of whether most people notice them, what are some everyday examples of markup? What encoding guidelines or standards might be relevant to projects in this class? Furthermore, what are the possible ethical implications of structured data forms?
- Big Data + Old History https://youtu.be/tp4y-_VoXdA
- Humanities + Digital Tools: Text Technologies https://www.youtube.com/watch?v=wQKmktxYcsc
- The Truth about Algorithms https://www.youtube.com/watch?v=heQzqX35c9A
- Age of the Algorithm https://player.fm/series/ninety-nine-percent-invisible/274-the-age-of-the-algorithm
- Annelise Dowd – Video 3: Metadata, Markup, and Discoverability https://youtu.be/zyFvMslPRW0
- Example of a Project Proposal from ENGL201 Spring 2018
- Student Explanation of Digital Humanities Project https://youtu.be/nt3F55zHJHs
- DIRT (Digital Research Tools) http://dirtdirectory.org/
Metadata Games leverages the power of play and crowdsourcing to help manage digital collections, Tiltfactor Director Mary Flanagan explains. “Metadata Games is a collection of games that features archival materials in unique ways and that engages the public to play with the materials in order to learn new things about them,” Flanagan notes. “As people choose or type or do other kinds of interactions, it becomes part of the history of that image or text file so that we know more about our archives.” When archives contain materials from minority and non-dominant communities, the risk of loss or misidentification of digital objects is especially high. “There are massive amounts of things being digitized by libraries, archives, and cultural institutions, for which we just don’t have information or we have culturally skewed information,” Flanagan says.
Take some time to play some of the Metadata Games. Profile two of the games in a post on your website. Please note: use a post, not a page. In your post, reflect on the types of metadata you were challenged to work with in the games. What did you enjoy about the games? Did you encounter difficulties? Tag your post: week3