Tagged week 1

Making HTML5 More Approachable: Writing for the Web, Accessibility, and Cyberinfrastructure

The white paper below is my final assignment for Week 1, Digitization Fundamentals, as part of the Graduate Certificate in Digital Humanities at DHSI 2015.


Learning to write for the web with HTML5 can be a daunting task. The general procedure is for students to view the raw data of an HTML document, learn about the meanings of tags and the process of encoding, and then hopefully begin writing HTML. This approach, however, relies on other embedded factors and assumptions, such as high computer literacy and facility with web environments. Especially in the Humanities where these skills are not necessarily required, newcomers to HTML are often anxious and afraid to share their projects with others. Discussing the basics of cyberinfrastructure[1] can help students visualize how documents circulate on the Internet, and using visual materials instead of raw code can help lower the barriers to writing for the web.

Defining HTML5

HTML5 is commonly described as having three parts: HTML (Hypertext Markup Language), CSS (Cascading Style Sheets, often referred to as CSS3, the third iteration), and JavaScript (a programming language).[2] HTML is the markup language that adds semantic meaning to content. HTML tags, such as <p>paragraph</p>, wrap around and describe content; each tag is enclosed in angle brackets, and most elements have both an opening tag and a closing tag. The fifth version of HTML, or HTML5, contains new tags and markup for modern web development, such as <video> and <audio>. CSS contains the aesthetic markers for altering the look and feel of webpages. CSS can be encoded within the HTML page or, as is most often the case, referenced as a separate .css file or “style sheet.” The cascading nature of CSS refers to the way style rules are layered: a later or more specific rule can override an earlier one without the earlier rule needing to be removed or deleted. HTML and CSS are what the majority of people will work with when learning HTML5. Both HTML and CSS are methods of encoding content, not of programming or writing code. JavaScript is the third portion of the HTML5 trio, and while essential for the web it can be perplexing for beginners. JavaScript is the code-writing or programming portion of HTML5, and it can be tackled later on, once a good foundation in HTML and CSS has been achieved.

Accessibility in Web environments

The word “accessible” can have multiple meanings when working with web environments. For many, the term accessibility relates to the openness of data on the web, making information clear and jargon-free, and also ensuring that links and important information are easy to find. However, accessibility also has a deeper meaning. Accessibility refers to the need to create digital environments that are inclusive, responsive, and universal in their design.[3] Web sites and digital projects must be created with all users in mind so that people with disabilities, whether physical, cognitive, or temporary, are not excluded from interaction.[4] Unfortunately, the process of learning HTML often skips past both usages of the word accessible, resulting in web environments that are complicated, and frustrating or impossible to use. Accessibility is often thought of after a project is completed, many times as a result of legal action.[5] Websites or other digital projects that have features of accessibility added to them after the fact can suffer from poor implementation and design, while also incurring additional costs for development. All digital projects, from the smallest personal webpage to the largest corporate website, should be designed from the very beginning with accessibility as a primary feature. The best place to start this process is with those just beginning to write for the web, whether in HTML5 or through other forms of web creation.

Text editors

A major stumbling block in the process of learning HTML5 is the text editor. Many new students learning HTML are unfamiliar with text editor programs, even if there’s already one installed on their computer. The text editor is generally less complicated to use than Word or other word-processing programs, but the unfamiliarity of the program can make students uneasy. An explanation of how word-processing programs incorporate code and other markup behind the visible surface of the document can help students see the reason for text editors. Spending time on the installation and usage of the text editor, along with ensuring that .html and other file extensions open by default, can help reduce student frustration. Instructors should also use the same programs that students are using, especially when projecting their computer screen to the class. It can be disconcerting to see a highly modified text editor running on the projector, and using the command line should be avoided for the sake of clarity as well. Students approaching HTML5 or programming for the first time can feel overwhelmed when watching an expert deftly maneuver through the command line interface with custom keystroke settings and other time-saving shortcuts.

Working from the outside-in

HTML is a method of encoding text with semantic meaning for the web. It’s common for instructors to begin with “lorem ipsum” text, and there are some entertaining lorem ipsum generators on the web, such as https://baconipsum.com and http://www.cupcakeipsum.com. These lorem ipsum generators can add a splash of fun, but they are also another layer of abstraction for new students to navigate. Instead of working with placeholder text, it would be best to work with materials that are familiar to the student, perhaps their own writing or text from a favorite book, poem, or website. The idea here is to help the student maintain a sense of control over the process and reduce anxiety. By using content that’s familiar, students will have a mental image of how the information should be structured and presented. Using as many familiar points of contact and reference as possible can help students overcome some of the initial fear, and work toward questions of design and presentation. A discussion using visual materials and avoiding raw code can help students see the logic behind HTML. A printed page from a simple website could be used for students to write on, where they might draw arrows pointing to different sections of the page and their corresponding HTML tags and elements.

Hierarchies and structure

Along with using familiar content for teaching HTML, a good overview of the hierarchies and structures of a webpage can be helpful.[6] Building accessibility into a webpage begins with the general layout and order of content. One of the most important aspects to review is how headings operate in a hierarchical structure, while paragraphs and most other tags do not. The heading tags <h1>, <h2>, <h3>, and so on, can be quite confusing to new students, in part because of the changing font size associated with each tag. Students often choose a heading tag because it changes the visual design, not because it presents information in a logical order. Screen readers and other assistive devices rely on heading tags to move through web documents, and this hierarchical order is essential for navigation and meaning. Beginning with a discussion of heading and paragraph tags, and how they’re used for semantic rather than aesthetic purposes, is a great place to discuss accessibility and good web design.
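One way to make the heading hierarchy visible is to let a short program read a page the way an assistive device might. The sketch below is only an illustration, not a description of how any particular screen reader works: it uses Python's standard html.parser module to list the headings of a hypothetical student page as an indented outline, and it flags any heading that jumps more than one level deeper than the heading before it. The sample page, the class name HeadingOutline, and the function skipped_levels are all invented for this example.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) for every h1-h6 element, in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []   # list of (level, text) tuples
        self._level = None   # level of the heading currently open, if any
        self._text = []      # text pieces seen inside the open heading

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so "H2" arrives as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def skipped_levels(headings):
    """Return the headings that jump more than one level below the previous heading."""
    problems = []
    previous = 0
    for level, text in headings:
        if level > previous + 1:
            problems.append((level, text))
        previous = level
    return problems


# Hypothetical student page: the h4 was picked for its font size,
# not for its place in the outline.
SAMPLE_PAGE = """
<h1>My Favorite Poems</h1>
<p>An introduction.</p>
<h2>Wallace Stevens</h2>
<h4>The Emperor of Ice-Cream</h4>
<p>Notes on the poem.</p>
"""

parser = HeadingOutline()
parser.feed(SAMPLE_PAGE)

# Print the outline, indenting each heading by its level.
for level, text in parser.headings:
    print("  " * (level - 1) + f"h{level}: {text}")

print("Skipped levels:", skipped_levels(parser.headings))
```

In this sample the outline moves from <h2> straight to <h4>, so the last heading is reported as a skipped level. Paragraphs never appear in the outline at all, which is the point made above: headings carry the structure, and paragraphs do not.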

Writing for humans, and machines too

Once students have some content on their webpage, they inevitably look toward styling, which then leads to CSS, or cascading style sheets.[7] Inline styling, or adding CSS markup within the HTML document, is often the first step. However, CSS markup is usually kept in another file, referred to by a link in the head of the .html document. Inline styling is an older and simpler styling practice, and even if separate style sheets were never used, students would still be able to construct a basic webpage with some design features. Yet, there’s an opportunity with CSS files to show students how documents on the web can hyperlink to one another behind the scenes. Even though the links or referenced files in the head portion of the .html file are not visible to the user of the webpage, this information is essential for interoperability on the web. The head portion of the HTML document is important because it’s meant primarily for computers to read, not people. Constructing documents in HTML5 is a process of writing for multiple audiences, including machines, and discussion about this can help students visualize how information moves through the Internet.
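To show what “writing for machines” looks like in practice, the following sketch reads only the head of a hypothetical page and reports what a program would find there: the title and any linked style sheets. It is a minimal illustration using Python's standard html.parser module, not a model of how browsers actually load pages. The sample page, the file name style.css, and the class name HeadReader are assumptions made for the example.

```python
from html.parser import HTMLParser


class HeadReader(HTMLParser):
    """Record the <title> text and any linked style sheets found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.in_title = False
        self.title = ""
        self.stylesheets = []   # href values of <link rel="stylesheet"> elements

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            self.in_head = True
        elif self.in_head and tag == "title":
            self.in_title = True
        elif self.in_head and tag == "link":
            # rel may hold several space-separated keywords.
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel and attrs.get("href"):
                self.stylesheets.append(attrs["href"])

    def handle_data(self, data):
        if self.in_title:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False
        elif tag == "head":
            self.in_head = False


# Hypothetical student page; style.css is an assumed file name.
SAMPLE_PAGE = """<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>My Favorite Poems</title>
  <link rel="stylesheet" href="style.css">
</head>
<body>
  <h1>My Favorite Poems</h1>
  <p>An introduction.</p>
</body>
</html>
"""

reader = HeadReader()
reader.feed(SAMPLE_PAGE)
print("Title:", reader.title)
print("Style sheets:", reader.stylesheets)
```

Nothing the program reports from the head is displayed in the body of the page, yet the link to style.css is what tells a browser where to find the design. Showing students this output next to the rendered page can make the idea of a second, machine audience concrete.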

The basic workflow

Once the discussion of writing for machines is complete, moving toward a workflow that includes a referenced CSS style sheet can help put theory into practice. Generally, students will work locally on their own computers, and this can result in a lack of understanding about how documents are made visible on the web. There’s nothing inherently wrong with working locally, but without discussing the storage and retrieval of documents over the Internet students will be unable to put their new skills into practice. Online server space is needed for this portion of the class, and availability will depend on institutional resources. Even if there are no resources available, a discussion of the process is essential for students, as this will help them more fully understand what “working locally” means. This process of learning HTML and CSS, locally or otherwise, will require two separate programs to be open (the text editor and a browser), and three separate windows (.html file, .css file, and the browser window). If students can see how these three components relate to one another, they are more likely to understand how documents are stored, retrieved, and revised on the web. A more advanced discussion about FTP (File Transfer Protocol) programs, including the need for SFTP (SSH File Transfer Protocol, which runs over Secure Shell), document permission settings, and online security, would be wonderful if time allows.


Writing for the web in HTML5 can be complicated to learn, but focusing on visual methods and having directed class discussions can increase student understanding. With HTML and CSS, students can construct static webpages and share information more easily across software platforms and hardware devices. By avoiding proprietary or closed formats and programs, such as Word and .pdf files, students will be engaging in the 5 Star Open Data plan.[8] Most important for 5 Star Open Data is getting the materials online, and a basic understanding of HTML and CSS empowers students to contribute to the Internet in meaningful ways. With their new HTML5 skills, students will be able to interact more fully with programs that they may already be using, such as an LMS (Learning Management System, like Blackboard or Canvas) or a CMS (Content Management System, such as WordPress or Drupal). With accessibility as a primary feature from the very beginning, students will be creating documents that are easier for people to interact with, and by using the logics inherent in well-structured data they will be making the materials easier for machines to read as well. Learning the basics of HTML5 is an opportunity to publish materials online, as well as an occasion to learn about the fundamentals of cyberinfrastructure and digital literacy.


[1] Gardner Campbell, “A Personal Cyberinfrastructure,” EDUCAUSE Review, September 2009 http://www.educause.edu/ero/article/personal-cyberinfrastructure

[2] A good overview of HTML5 and how its development has progressed is available in HTML5: The Missing Manual. Matthew MacDonald, HTML5: The Missing Manual, Nan Barber ed. (Sebastopol, CA: O’Reilly Media, 2013).

[3] George H. Williams, “Disability, Universal Design, and the Digital Humanities,” Debates in Digital Humanities, online open access edition (Minneapolis, MN: University of Minnesota Press, 2013) http://dhdebates.gc.cuny.edu/debates/text/44

[4] The Accessible Future project has a good list of readings about accessibility in web environments: http://www.accessiblefuture.org/readings/

[5] Although accessibility in web environments is a moral, as well as a technical question, legal action is often the course that must be taken for change to occur. A list of recent lawsuits can be found at The University of Minnesota, Duluth, “Higher Ed Accessibility Lawsuits,” http://blog.lib.umn.edu/itsshelp/news/2013/10/higher-ed-accessibility-lawsuits.html

[6] Dive into HTML5 is a free and online resource, and best of all the website itself is built entirely in HTML5. Mark Pilgrim, Dive into HTML5, http://diveintohtml5.info

[7] As students begin to learn how CSS shapes the design of the webpage, a peek at CSS Zen Garden can show them how powerful CSS can be: http://csszengarden.com

[8] 5 Star Open Data, http://5stardata.info – a clear and easy to follow website discussing Timothy Berners-Lee’s ideas about linked data, http://www.w3.org/DesignIssues/LinkedData.html

Journal 5 – Digitization Fundamentals

Association of Research Libraries (ARL) Code of Best Practices for Fair Use infographic

Our fifth and final day of class was split into two parts: a class session in the morning, and “show and tell” in the afternoon. There was also a lecture on the ethics of digital humanities research in the afternoon. In the morning we discussed the problem of copyright in digital humanities projects, and also the question of code literacy.

Copyright is certainly a tricky thing to figure out, and sadly there are no solid answers except for court judgements. There are, however, some guidelines that can be followed for fair use, or fair dealing practices. The notion that no one can really tell you what is, or what is not, a copyright violation can have a chilling effect on academic scholarship. Important for our discussion was that academics can rely on fair use legally. Following a set of best practices can help ensure that works under copyright remain protected, while also allowing for new and innovative scholarship.

The second portion of our discussion, on code literacy, was even more contentious than the questions of fair use. The night before we watched, or rather listened to, a roundtable discussion posted to Rhizome’s Vimeo page. Although the audio was terrible, the discourse was quite interesting. At the heart of the matter was defining “code literacy.” Is it a scientific or technical goal, focusing on engineering and programming aspects? Or is the objective humanistic, centering on the idea that code is everywhere in our modern lives and that we have the power to direct our own futures, digital or otherwise?

There was no clear answer, of course; it was more a point of reflection on digital technology and DH overall. As we worked through the week in Digitization Fundamentals, we learned technical skills and we also learned how to use those skills for creative and meaningful production. Balancing these two facets of code literacy, the scientific and the humanistic, will remain a central feature of our digital projects to come.

Journal 4 – Digitization Fundamentals

A study in video compression: one video of a red moving ball shown in three different video formats. In this photo, the red ball in the lower left shows fewer artifacts than the other two. However, its higher quality and file size make it a more difficult video to share online.

On the fourth day of class we moved into video editing. This was an interesting class because video editing seems more approachable than working with audio, but its processing and distribution are also more complex. Frame rates, variable screen sizes, color reproduction, and compression of data are just some of the variables that affect video production. Video is also a very powerful medium, in that it captures the mind in a way that other media might not.

Video production is a time-consuming process with many stages of development. The steps of pre-production, production or shooting, and post-production each have their own components and processes to consider. Equipment, from cameras and tripods to computers and video monitors, is necessary to ensure a quality film, not to mention actors, scripts, storyboards, financing, and distribution. With all of these things in mind, it’s easy to see how producing a short film would mirror many of the project management considerations within DH.

At its most basic level, video is a form of storytelling. Most videos or films are linear, with the author or director of the film taking control of the form and movement of the narrative. There are some new tools for non-linear storytelling with video, such as Korsakow, but unfortunately Korsakow relies on Flash, which is rapidly being replaced by HTML5 video on the web. YouTube makes the distribution of video seem easy, but the process is actually quite complicated. Films must be compressed with a codec, with clips and audio bound together in a single file, and that file must also be small enough to be streamed and shared.

Our final class project for the show and tell was a video produced by the entire class. I contributed my audio file, a song titled “Ice Cream Dubstep,” which I made in class. Other students worked on filming the class itself, some filmed students for individual interviews, others made audio clips, and two students, Heather and Rachel, worked together to edit all of these disparate artifacts into one final video.

Digital Imaginary – Tweets from Session 1 Colloquium, Week 1

Tweets regarding my DHSI Colloquium talk on my dissertation research (The Digital Imaginary), “Sharing the Digital Imaginary: Dissertation Blogging and the Companion Website” (colloquium schedule PDF archive):

Digital Zombies – Tweets from Session 1 Colloquium, Week 1

Tweets regarding Professor Juliette Levy’s DHSI Colloquium talk with me on Digital Zombies (http://zombies.digital), “Bringing DH into the library: pedagogy, games and online ed” (colloquium schedule PDF archive):

Ice Cream Dubstep

[SoundCloud player: https://api.soundcloud.com/tracks/208789296]

After the class created Ice Cream Original, I downloaded the audio files that we were working with in Audacity.

From here I took the tracks in a new direction, creating “Ice Cream Dubstep.”

The vocals are the same as the original, just modified with “stretch and pitch” using Adobe Audition. The gong sound and the ice cream truck melody are also altered with Audition. I removed the applause, and I added in two dubstep tracks from Freesound, “dubstep_drumloop_crunch” and “dubstep growl.”

Adding the tracks to Audition was easy, but it was a little tricky to make them melodic. Within Audition I zoomed in and examined the waveforms in order to line up the tracks so they hit on the proper beats.

I also uploaded Ice Cream Dubstep to SoundCloud. Using Adobe Illustrator, I made a quick graphic for the track from an ice cream poster pack purchased from Creative Market.


Ice Cream Original

[SoundCloud player: https://api.soundcloud.com/tracks/208789291]

On the third day of class in Digitization Fundamentals we worked with audio files, and it was pretty amazing.

The instructor, Robin Davies, worked through the differences between audio frequency and sampling rate, and we got our feet wet with Audacity. One of the students recorded a short line of poetry, and Robin tweaked the settings and microphone placement. This showed us how important it is to get a good capture of the original audio; many times the mic is too far away and there’s too much noise in the background.

The line of poetry recorded was from Wallace Stevens’s “The Emperor of Ice-Cream”:

Let be be finale of seem.
The only emperor is the emperor of ice-cream.

With this track running in Audacity, Robin added some sound clips from Freesound, and we extracted some audio from a YouTube clip. We ended up with a strange melange of sounds: the poetry line, a gong, an ice cream truck, and applause. I’ve called this track “Ice Cream Original,” and I posted it to SoundCloud.

Journal 3 – Digitization Fundamentals

Working with audio files in Audacity: These files were mixed in class to produce the original clip “Ice Cream,” which included poetry read aloud by a student in the class.

Day three was all about audio, and I’ll have to admit, this was my favorite subject of the week. I had worked with audio in the past, but I never really understood the processes that were taking place. The numbers and waveforms in programs like Audacity and Adobe Audition can be a bit daunting to understand. Thankfully, Robin presented the material in a clear and compelling way, taking us step by step through the process. This class tied in with the first day, as we worked more directly with the numbers and mathematical concepts of bits and bytes. Because waveforms are visual representations of sound, it was easier to see how they related directly to mathematical data points.

To help us work through the process of editing audio, Robin played a single track by Radiohead in a variety of ways. Reducing the sampling rate had a direct impact on the playability of the track, moving it from clean and crisp to dull and distant in tone. Since they’re both measured in Hertz or Hz, it was difficult at first to see how audio frequency and sampling rate differed from one another.

Audio frequency is the vibration itself, the wave that is the actual sound. Sampling rate is the number of times per second the wave is measured. Humans can hear audio frequencies between roughly 20 and 20,000 Hz. In order to turn these waves into digital data that is representative of the sound, the waves must be measured at a rate of at least twice the highest frequency to be captured, or around 40,000 Hz. This high sampling rate is necessary to capture the waves at both crests and troughs, giving a full picture of the sound. Capturing a wave at the same rate as its audio frequency would measure it at the same point in every cycle, producing data points only for crests, or only for troughs, and the resulting sound would not be dynamic or representative of the original.
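The difference between the two measurements is easier to see with numbers. The sketch below is my own illustration rather than anything shown in class: it measures a hypothetical 1,000 Hz tone at three sampling rates, starting each measurement at a crest. The tone frequency, the starting phase, and the function name sample_sine are assumptions chosen to keep the arithmetic simple.

```python
import math


def sample_sine(frequency_hz, sampling_rate_hz, count, phase=math.pi / 2):
    """Measure a sine wave `count` times, at `sampling_rate_hz` measurements per second.

    The default phase of pi/2 starts the first measurement on a crest.
    Values are rounded to three decimal places for readability.
    """
    return [
        round(math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz + phase), 3)
        for n in range(count)
    ]


TONE_HZ = 1000  # hypothetical test tone, well inside the range of human hearing

# Same rate as the tone: every measurement lands on the same point of the wave.
same_rate = sample_sine(TONE_HZ, 1000, 6)

# Twice the tone's frequency: measurements alternate between crest and trough.
double_rate = sample_sine(TONE_HZ, 2000, 6)

# Around 40,000 Hz, as discussed above: many measurements per cycle.
high_rate = sample_sine(TONE_HZ, 40_000, 6)

print("1,000 Hz sampling: ", same_rate)
print("2,000 Hz sampling: ", double_rate)
print("40,000 Hz sampling:", high_rate)
```

At 1,000 Hz every measurement is the same crest, so the data describes a flat line rather than a wave. At 2,000 Hz the crests and troughs both appear, but only because the measurements happen to start on a crest; in practice the sampling rate needs to be somewhat more than twice the highest frequency, which is why a rate of roughly 40,000 Hz or above is used for audio that spans the range of human hearing.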

Journal 2 – Digitization Fundamentals

Our second day of class in Digitization Fundamentals covered images, data file formats, HTML and CSS. One of the instructors, Robin Davies, discussed how libraries were increasingly advertising for positions that included DH skills, such as Digital Preservation Librarian. This job description called for the management of “emerging digital preservation practices,” and also for expertise with “all phases of the life cycle of digital content with the aim of long-term retention and access.” Important here is not only the requirement for preexisting skills with digital artifacts, but the idea that digital practices change often, and digital humanists must stay ahead of the curve to ensure that DH projects stand the test of time.

Digitizing materials must occur in such a way that the digital capture is as true to the original as possible. This means, of course, that no photo filters should be used, or other aesthetic tools so common in social media. However, there’s also the possibility that some tweaking of the device’s default settings might be necessary to create a digital file that’s true to the original. For example, if a page of text is black ink on white paper, scanning this with black and white settings might seem appropriate. If the text has faded a bit though, and the paper is no longer bright white but yellowish or greenish, a color scan or photograph would more accurately convey the age and texture of the document.

We had an interesting conversation about the difference between a scan and a photograph. The class really had a tough time drawing a solid line between the two, as there was much overlap between the processes. Book scanning can be especially difficult. The spine of the book should remain unharmed and unbroken, making flat-bed scanners problematic. Using an apparatus with two cameras pointing at each page from an angle can help retain the shape of the book, while also enabling digitization. But is this a scan, or a photograph? Does the file format, whether pdf or jpg, shape the definition? And are higher resolutions always necessary, and which process might produce the most faithful digital artifact? These questions are difficult to answer, and perhaps the best practice is to use the methods most fit according to time, materials, and resources.

Non-destructive apparatus at DIY Book Scanner. A scanner like this was described in the book Mr. Penumbra’s 24-Hour Bookstore, which we discussed in class.