This one is for the CMS designers out there. If you’re in the business of building platforms that people create content in, you have undoubtedly run into the problem of storing metadata. It may seem easy at first (“just put it in a database!”), but then you start running into predictable problems: the context is hard to store, references are hard to keep valid and up to date, and what happens when you export?
Databases + Metadata = Unsolved Problem
Metadata in databases loses its context quickly. Let’s say Jen uploads an image and titles it “My pet puppy.” Jen’s friend Steve wants to use the image, so he selects it in the image library and wants to change the title to “Jen’s pet puppy.” Where do you store that title now? What happens when Jen renames her image? What if you use a couple of copies of the same image, each with a different title? It’s a bit of a mess, but usually the solution is to store the metadata in context: keep the metadata with each use of the image, in that HTML page. Problem is, the img tag gives you little more than the alt and title attributes, which isn’t much room for richer metadata like an author.
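To make the Jen and Steve problem concrete, here is a minimal sketch of the two storage choices. Everything in it is a hypothetical stand-in: the dictionaries play the role of database tables, and the page names and function names are made up for illustration.

```python
# Hypothetical stand-in for a database table with one title per image.
shared_titles = {"puppies.png": "My pet puppy"}


def rename_shared(image, title):
    """One title per image: every page that uses the image sees the change."""
    shared_titles[image] = title


# Hypothetical per-use storage: each placement of the image has its own title.
placements = [
    {"page": "jen.html", "image": "puppies.png", "title": "My pet puppy"},
    {"page": "steve.html", "image": "puppies.png", "title": "My pet puppy"},
]


def rename_in_context(page, image, title):
    """Change the title only where the image is used on the given page."""
    for placement in placements:
        if placement["page"] == page and placement["image"] == image:
            placement["title"] = title


def title_on(page, image):
    """Look up the title shown for an image on one specific page."""
    for placement in placements:
        if placement["page"] == page and placement["image"] == image:
            return placement["title"]
    return None
```

With the shared table, Steve calling rename_shared changes Jen’s title too. With per-use storage, rename_in_context("steve.html", ...) leaves Jen’s page alone, which is the behavior "metadata in context" is after.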
The other issue is maintaining those goddamn references between the database, the HTML file, and the image file. Odds are you’ll be using a database-backed file system of some kind, so now you have to manage deletions, renames, and metadata edits in three different linked places. Those links are fragile, so things fall out of sync, especially if users can edit their HTML source code, work offline, import/export, anything like that. So make sure to keep one authoritative copy of that data.
Lastly, there’s the issue of exporting/sharing this content. The platform that I work on has a strict requirement for being exportable without ruining everything, in order to keep a very important ($$) industry certification. So when we export that web page, we don’t want to lose all of that image metadata. We will if it’s in the database, unless we do a lot of non-standard hackery. And we’d rather avoid non-HTML shit just to pass data around; what we want is a standard solution.
Microformats: Metadata, Inline, Bam
Really, a good way to do this is to just store the metadata inline, in the HTML content. The best solution we found for this setup is using microformats. The idea is that you surround your object (an image, an object, a text block, whatever) with span tags, each one carrying a class name that identifies one piece of metadata. There’s a much more verbose explanation on the microformats.org site. The hCalendar format is a good place to look for examples of this concept fully embraced.
So for our image example above, the HTML would look something like this.
<img src="puppies.png" width="500" height="169" />
<span class="imagetitle">The puppies are st00pid fly.</span>
<span class="author hidden">Jen</span>
It’s pretty ingenious. You have the semantic relationship between the image and the imagetitle, and you can easily extend it with other information like the author, keep that specific item hidden, or whatever.
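Since the metadata now travels inside the page, anything that can parse HTML can read it back after an export. Here is a rough sketch of that using Python’s standard html.parser module. The class and function names are made up for this example, and it assumes the flat layout shown above: each img is followed by sibling spans, the first class name on a span is the field name, and "hidden" is only a display hint. Nested spans are not handled.

```python
from html.parser import HTMLParser


class InlineImageMetadata(HTMLParser):
    """Collect the class-labelled <span> siblings that follow each <img>."""

    def __init__(self):
        super().__init__()
        self.images = []    # one dict per <img>, in document order
        self._field = None  # name of the metadata field being read, if any
        self._text = []     # text chunks collected for the current field

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            self.images.append({"src": attrs.get("src")})
        elif tag == "span" and self.images:
            classes = (attrs.get("class") or "").split()
            # Assumption: "hidden" only controls display, so skip it.
            names = [name for name in classes if name != "hidden"]
            self._field = names[0] if names else None
            self._text = []

    def handle_data(self, data):
        if self._field is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._field is not None:
            # Attach the finished field to the most recent image.
            self.images[-1][self._field] = "".join(self._text).strip()
            self._field = None


def extract_image_metadata(html):
    """Return a list of dicts, one per image, with its inline metadata."""
    parser = InlineImageMetadata()
    parser.feed(html)
    parser.close()
    return parser.images
```

Feeding it the three lines of HTML above should give back a single entry with the src, imagetitle, and author fields filled in, with no database lookup involved.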