Self Study (part 1): Introducing XML and Markup

I’m occasionally asked what people should read and do if they want to teach themselves TEI P5 XML. Where should they start? This depends, obviously, on how much time and what resources they have. I tend to recommend directed intensive training, such as the Digital.Humanities@Oxford Summer School, as a good way to get an introduction to such topics.

However, some people are unable to participate in such training and prefer self-directed learning. What should they do? There are lots of resources online such as TEI By Example and the TEI Guidelines. Where to start?

When people are taking an Introduction to TEI workshop I usually introduce markup but move on to TEI and XML very quickly, because time is limited in such intensive workshops. When people are undertaking self-directed learning, by contrast, I think they should use the time they have to learn more about HTML and then XML before starting to learn about the TEI vocabulary of XML itself.

There is a great deal of reading one could suggest for an initial exploration of XML and markup. I would suggest at least looking at:

as a good start.

If I were to suggest a series of assignments someone might undertake based on this reading it would be to do the following, writing up answers to the questions.

  1. Read the W3Schools HTML basic section and XHTML section, do the HTML and XHTML quizzes
  2. Read the W3Schools XML basic section and XML Namespaces page, do the XML quiz
  3. Read the TEI Guidelines Gentle Introduction to XML and the Wikipedia article on XML.
  4. How does XML differ from HTML? Why might it be more powerful to describe what some piece of data is, rather than say how it should be presented?
  5. Download and install the oXygen XML editor (a one-month free trial license is available; otherwise it costs $64 USD)
  6. Choose a very short (1 page) sample of a document you are interested in.
  7. Create a list of the overall structural aspects you feel define this sort of document. Create a list of any data-like entries (like names or dates) in the document. Create a list of presentational aspects of the document that you think are important to record.
  8. Funding challenge part 1: Hypothetically, imagine you had funding to mark up several thousand pages of this material. Look at the list of aspects you would like to record. Why is each one important? What benefit does recording each of these things give those wanting to use or understand the text (or the culture from which it originates)? Which would you choose to mark up? How consistently can you mark up each feature? Such document analysis should be done long before any project starts (or asks for funding).
  9. Funding challenge part 2: An uncaring government has slashed its funding for higher education research projects and has reduced your project’s funding by 50%! What would you do? Will you mark up only 50% of the material? If so, how do you decide which parts? Will you only mark up certain aspects? If so, which ones and why?
  10. Using the ‘Text’ (code view) mode of the oXygen XML editor, create a well-formed XML file of your sample document with elements and attributes that you have invented yourself. What difficulties do you encounter doing this?
  11. Why might it be better for communities of users to agree on elements, what they mean, and how they should be used?
  12. What are the central ideas of Michael Wesch’s YouTube video? How do they relate to the nature of XML and how it is used?
  13. Read the Wikipedia article on RSS, and find an RSS feed to subscribe to in Google Reader to see its application.
  14. Does order really matter in an XML document? What is the difference between:

    <list><item n="1">item 1</item><item n="2">item number 2</item></list>  and
    <list><item n="2">item number 2</item><item n="1">item 1</item></list>

    And how much difference does this make when viewing XML as a data storage format rather than a presentational one?

  15. Join the TEI-L mailing list and start lurking.
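
For anyone who wants to poke at question 14 programmatically, here is a minimal sketch using Python's standard xml.etree.ElementTree module. Python is my own choice for illustration (the exercises above only assume an XML editor), and the function names are hypothetical. It reads the two lists once as a document, where position counts, and once as data keyed on the n attribute, where it does not.

```python
import xml.etree.ElementTree as ET

# The two lists from question 14, differing only in the order of their items.
LIST_A = '<list><item n="1">item 1</item><item n="2">item number 2</item></list>'
LIST_B = '<list><item n="2">item number 2</item><item n="1">item 1</item></list>'


def items_in_document_order(xml_text):
    """Return (n, text) pairs in the order the items appear in the file."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if not well-formed
    return [(item.get("n"), item.text) for item in root.findall("item")]


def items_keyed_by_n(xml_text):
    """Return a mapping from the n attribute to the item text, ignoring position."""
    return dict(items_in_document_order(xml_text))


if __name__ == "__main__":
    # Read as a document: the sequence of items differs between the two lists.
    print(items_in_document_order(LIST_A) == items_in_document_order(LIST_B))
    # Read as data keyed on n: the two lists carry the same information.
    print(items_keyed_by_n(LIST_A) == items_keyed_by_n(LIST_B))
```

The first comparison is false and the second is true, which is one way of framing the question: whether order matters depends on whether the application treats the file as a text to be read in sequence or as a store of records. As a side effect, ET.fromstring will also reject anything that is not well-formed, which may be handy when checking the file from assignment 10.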

This certainly isn’t exhaustive, but with a bit of support, I suggest someone undertaking this would be much better placed to start learning about TEI P5 XML from the online sources available.

The next post in this series is an Introduction to the Text Encoding Initiative Guidelines.


2 Responses to “Self Study (part 1): Introducing XML and Markup”

  1. Kevin says:

    Having grumbled for years that there was no good place for someone to start, I wrote my own introduction ( http://www.ultraslavonic.info/intro-to-xml/ ) and then learned that David Birnbaum did the same thing ( http://clover.slavic.pitt.edu/humcomp/what-is-xml.php ), though his is more in-depth. At the end of mine, there are links to both David’s and to TEI by Example.

  2. James Cummings says:

    Yes, I’ve written basically those introductions a few times myself. I’m planning to do a series of entries with the next one introducing basic TEI and moving on from there. *fingers crossed*