All content in digital formats can be characterized as structured or unstructured data. Strictly speaking, all data is structured: even typing on a keyboard “structures” a text as an alphabetic file in which each keystroke is linked to an ASCII character code. The distinction of one letter from another, or of a letter from a number, structures the data at the primary level. But the term “structured data” refers to a second level of organization that allows data to be managed or manipulated through that extra structure. Common ways to structure data are to introduce markup using tags, to use comma-separated values, or to use other data structures. The distinction between structured and unstructured data has ramifications for the ways information can be used, analyzed, and displayed.
Structured data is given explicit formal properties by means of the secondary levels of organization, or encoding, referred to above. These use extra elements (e.g., tags), data structures (tables, spreadsheets, database collections), or other means to add an extra level of interpretation or value to the data. The term unstructured data is generally used to refer to texts, images, sound files, or other digitally encoded information that has not had a secondary structure imposed upon it.
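To make the distinction concrete, here is a minimal Python sketch that expresses the same statement three ways: as an unstructured sentence, as comma-separated values, and as tagged markup. The sentence, the column names, and the tag names (`book`, `author`, `title`, `year`) are illustrative assumptions chosen for this example, not part of any standard.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Unstructured: a person can read this, but a program sees only a string.
unstructured = "Mary Shelley published Frankenstein in 1818."

# Structured as comma-separated values: a header row names each field.
# (Column names are hypothetical, chosen for this illustration.)
csv_text = "author,title,year\nMary Shelley,Frankenstein,1818\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Structured with markup: tags label each part of the same statement.
# (Tag names are hypothetical, not drawn from a real encoding standard.)
xml_text = (
    "<book>"
    "<author>Mary Shelley</author>"
    "<title>Frankenstein</title>"
    "<year>1818</year>"
    "</book>"
)
book = ET.fromstring(xml_text)

# With the second level of structure in place, a program can ask for
# one field by name instead of guessing where it sits in a sentence.
csv_year = rows[0]["year"]
xml_title = book.find("title").text
print(csv_year, xml_title)
```

In both structured versions, the year or the title can be retrieved by its label. Getting the same information out of the plain sentence would require guessing at its grammar.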
How are online documents encoded in order to be machine-readable? In the humanities, why is this done? Regardless of whether most people notice them, what are some everyday examples of markup? What encoding guidelines or standards might be relevant to projects in this class? Furthermore, what are the possible ethical implications of structured data forms?
Reading
Media
- Annelise Dowd – Metadata, Markup, and Discoverability https://youtu.be/zyFvMslPRW0
- Big Data + Old History https://youtu.be/tp4y-_VoXdA
- The Truth about Algorithms https://www.youtube.com/watch?v=heQzqX35c9A
- Age of the Algorithm https://player.fm/series/ninety-nine-percent-invisible/274-the-age-of-the-algorithm
Exercises
- Use Hypothes.is to annotate the two required readings listed above. Add metadata to your annotations by tagging them: engl201week3
- Read the page of the Omeka manual outlining Dublin Core Metadata. Add five files to the Omeka installation you set up in Week 2 and fill in as many metadata fields as are relevant to the media items. In Week 2 you added some photos to your Omeka installation. This week let’s add:
- 2 small sound files (MP3)
- 2 video files (MP4)
- 1 image file (JPG or PNG)
- The metadata you add to your MP3 item and MP4 item in Omeka will provide us all with more context and insight into your items. You can record your own sound and video for this exercise with your mobile phone. There are many online applications and free mobile apps you can use to convert mobile phone media to MP3 and MP4. If you are unable to find one that works for you, please ask in https://chat.opended.ca. If you find one that works really well for you, please consider sharing it in https://chat.opended.ca so others may benefit from your discovery.
- When that is done, create a post at your website with a link to those items in your Omeka installation. Briefly outline why you chose those items and how you think the metadata helps visitors to your site better understand them. Did you encounter difficulties? If so, how did you overcome them?
- Add metadata to this week’s post at your site by tagging your post: week3
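As a rough illustration of what the Dublin Core exercise above asks for, the following Python sketch describes one sound file with a handful of the fifteen Dublin Core elements and writes the description out as tagged markup. All field values are hypothetical placeholders, the `record` wrapper element is an assumption made for this example, and the output is not Omeka's own export format. In Omeka you would enter these values through the item form instead.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical metadata for a phone recording converted to MP3.
# The keys are real Dublin Core element names; the values are invented.
item = {
    "title": "Morning birdsong in a campus garden",
    "creator": "A. Student",
    "date": "2024-01-22",
    "type": "Sound",          # a term from the DCMI Type Vocabulary
    "format": "audio/mpeg",   # the media type for MP3 files
    "description": "Thirty seconds of birdsong recorded on a mobile phone.",
    "rights": "CC BY 4.0",
}


def to_dublin_core_xml(fields):
    """Wrap each field in a dc: element inside a generic <record>."""
    record = ET.Element("record")  # wrapper name is an assumption
    for name, value in fields.items():
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(record, encoding="unicode")


xml_record = to_dublin_core_xml(item)
print(xml_record)
```

Each element answers a question a visitor might ask about the file: what it is, who made it, when, in what format, and under what terms it may be reused.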