
RDFa/Schema.org: Adding a machine readable page about myself

on Friday, 9th December 2011 - 20:39

Exercise for the evening: make a page for myself that provides semantic information to pass to the machines.  *Insert creepy music here*

The why:  Steph (a.k.a. Scor) has been working on several projects over the years that have resulted in the addition of RDFa to Drupal 7, plus a module (or two) that uses this RDFa to provide information that the machines can read.  Since I usually need to explain to people what it is my husband does (and Christmas IS around the corner), trying to follow the instructions in his recent screencast should suffice for this year's training.

First five minutes... Fail! Fail! Fail!  I got distracted getting a glass of water and with a trip to the washroom and missed the first half of the screencast.  Oh wait, I've seen this before!  It seems easy enough.

1.  Downloaded and installed (using Drush and 'modules') the "Schemaorg" module (or Schema.org module if your username is scor).  

2.  Since my site is mostly composed of blog posts, I went into the content type 'blog' and added the schema.org setting 'Blogpost'.  This of course is for future reference, when and if the search engines start recognizing rich snippets.  I also edited my fields so that the 'image' was tagged as a schema.org 'image'.  Body has automatically been labelled 'articlebody'.  Tags need to be labelled 'keywords'.

Note: I don't know the difference between 'blog' and 'blogpost'.  When you start typing 'bl..' the schema.org settings will automatically try to pattern match the word to what is in its library.
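For what it's worth, schema.org lists both a Blog type (the blog as a whole) and a BlogPosting type (a single post), which is probably what the two suggestions were.  The pattern matching itself can be pictured with a few lines of Python.  This is only a sketch: the short list of names and the `suggest` function are made up for illustration, and the module's real lookup may work quite differently.

```python
# A tiny, hand-picked subset of schema.org type names (not the module's real list).
KNOWN_TYPES = ["Article", "Blog", "BlogPosting", "Person", "Place"]


def suggest(prefix, names=KNOWN_TYPES):
    """Return the names that start with `prefix`, ignoring case."""
    needle = prefix.lower()
    return [name for name in names if name.lower().startswith(needle)]


if __name__ == "__main__":
    # Typing 'bl' narrows the list down to the blog-related types.
    print(suggest("bl"))
```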

3.  Back to the task - "Create new content type".  In his example he uses "Person".  To avoid confusion, and since there's only one user on my site, I'll call my content type "Aboutme".  I scroll down to the bottom of the page and add the Schema.org settings - "Person".  

Note: Since the date of his screencast, scor has changed it so you only have to enter one field in the settings.  The second field was essentially redundant.

4.  Now the important bit... On the drupal.org post on Structured Data and Data Interoperability there are links to certain 'recipes' giving schema.org settings for various content types.  I will copy and paste them here:

Field          Schema.org mapping
type           Person
Image          image
Affiliation    affiliation
Job title      jobTitle
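To picture what this recipe is for, here is a minimal Python sketch of the kind of RDFa markup such a mapping could end up producing.  The `render_person` function, the example values and the exact HTML layout are my own assumptions for illustration; the Schema.org module generates its own markup, which may well look different.

```python
from html import escape

# Field label -> schema.org property, copied from the recipe.
PERSON_MAPPING = {
    "Image": "image",
    "Affiliation": "affiliation",
    "Job title": "jobTitle",
}


def render_person(values, mapping=PERSON_MAPPING):
    """Return an RDFa snippet for one Person (hypothetical helper).

    `values` maps field labels to plain-text values; empty fields are skipped.
    """
    lines = ['<div vocab="http://schema.org/" typeof="Person">']
    for label, prop in mapping.items():
        value = values.get(label)
        if not value:
            continue
        if prop == "image":
            # For an image, the property sits on the <img> tag and points at its src.
            lines.append(f'  <img property="{prop}" src="{escape(value)}" alt="">')
        else:
            lines.append(f'  <span property="{prop}">{escape(value)}</span>')
    lines.append("</div>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example values.
    print(render_person({"Job title": "Teacher", "Affiliation": "Example Retreat Centre"}))
```

The point is simply that each field's value gets wrapped in a tag carrying the mapped property name, inside a container that declares the type.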

5.  BUT - if I go to http://schema.org/Person  I find there are far more attributes that I could potentially add to my 'Aboutme' content type.  This list is still a work in progress and so I'm sure if you're following along that it may have changed six months from now.  I believe that once these start being commonplace they will no longer change (or at least, nothing should disappear from the list).  I could add the following:

  • description (*body field)
  • image 
  • additionalname (Found out that the module needs an update! Not in module library - BUT, it doesn't stop me from entering in the text manually.)
  • affiliation 
  • alumniOf 
  • children (name of)
  • familyName (lastname) 
  • givenName (firstname) 
  • jobTitle
  • memberOf 
  • nationality 
  • worksFor 

6.  Here goes nothing.. create the additional fields for my content type.  Save. 

7.  Go to "Manage Display" of the content type and hide everything.  The reason for this is I don't want a page turning up on my site with random information posting - I'd like a little ordered sense.  I'm actually thinking of an 'ad lib' style view.  But I digress.  As a default I do have the body showing. 

8.  Time to see if it works!  I create a page 'dcor'.  (Strike through #7: the fields must all be visible and NOT hidden if I want this particular page to show up with Rich Snippets.)  Alternatively, you can embed the RDFa information in the templates of your theme layer (if it's a personal site), but this is not what I want to do.  
Use case: I want to set up about 20 profiles of various teachers on my retreat centre site.  How would I display them? As a page in views!  NO!  I would have to set up individual pages for each teacher.  So the data would have to be pushed out the way that the node would display.  

So you can't use views and you're stuck with ugly field data on your otherwise pristine website.
The final page: ""   When tested here... comes up as...

Tee hee he.. I broke it! ;)  Well, it started out working fine but then I backtracked and deleted all the fancy schema fields I had created.  K.I.S.S. All that Google Webmaster Tools will show you are your jobTitle, affiliation, person, and image.  So may as well stick to the scope of the tutorial.
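If you'd rather peek at what a page exposes before handing it to a testing tool, a rough local check is possible with nothing but Python's standard library.  The sketch below only collects `property` attributes and assumes each tagged element contains plain text (or is an `<img>`); it is not a real RDFa parser and does not reproduce what Google's tools report.  The class and function names are made up.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from RDFa `property` attributes."""

    def __init__(self):
        super().__init__()
        self.found = []     # (property, value) pairs in document order
        self._open = None   # property whose text is currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if tag == "img":
            # For images the value is the src URL, not text content.
            self.found.append((prop, attrs.get("src", "")))
        else:
            self._open = prop
            self._text = []

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: the first closing tag ends the current property.
        if self._open is not None:
            self.found.append((self._open, "".join(self._text).strip()))
            self._open = None


def extract_properties(html):
    """Return every (property, value) pair found in `html`."""
    parser = PropertyCollector()
    parser.feed(html)
    parser.close()
    return parser.found
```

Feeding it the HTML of the 'Aboutme' node would list every property in the markup, including repeated ones such as a second affiliation.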
SO... after this whole exercise.. I suggest to my dear beloved scor... 

Simplify the experience.  In the long run (when and IF machines can read all of the schema.org mappings) I don't need to be creating all of these magical fields if I can simply select the text in the body of my node and add an 'annotation' that will configure the schema.org setting.  This way, I can select my name and annotate as 'person'.  I can select some text and annotate as 'description'... and so on and so forth. 
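To show the sort of thing I mean, here is a small Python sketch of such an annotation step: take the body text, pick a piece of it, and wrap that piece in a tag carrying the schema.org property.  The `annotate` function is purely hypothetical; nothing like it exists in the module as far as I know.

```python
from html import escape


def annotate(text, selection, prop):
    """Wrap the first occurrence of `selection` in `text` in an RDFa property span.

    Hypothetical helper: if the selection is not found, the text is returned
    HTML-escaped but otherwise unchanged.
    """
    start = text.find(selection)
    if start == -1:
        return escape(text)
    end = start + len(selection)
    return (
        escape(text[:start])
        + f'<span property="{escape(prop)}">{escape(selection)}</span>'
        + escape(text[end:])
    )


if __name__ == "__main__":
    # Made-up body text and selection.
    print(annotate("By day she works as a teacher.", "teacher", "jobTitle"))
```

With something like this behind a 'select and annotate' button, the extra fields would not be needed at all.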

Other notes:

Affiliation shows up alongside the job title.  So it may be a good idea to use that instead of 'worksFor'.  It also only picks up the first entry (i.e. putting in two affiliations only popped out the one).  I'm removing the excessive data that doesn't show up on the example... Oops and there I went and broke it.  Apologies for no screenshot, but it was working at one point as per the example on the tutorial!