To be perfectly honest, I was admitted into my Master’s program with zero funding. In retrospect, starting a two-year-plus project with no guaranteed income wasn’t the greatest idea, for a variety of reasons.
First, every semester I hope/pray to get a Graduate Teaching Assistant job, which luckily gets easier and easier as I accumulate “seniority points.”
Second, no funding means no specified project, which means freedom to choose any research topic I please, as long as my (very lenient/forgiving) advisor is OK with it. Well, it’s been about eight months since I came back from India all ready to start researching, and only two days ago did I actually settle on a topic.
Eight months is a long time to pay tuition and follow dead ends in literature reviews. Those months are also expensive if you waste your time on partying, girls, video games, Union involvement, student government, keggers, new housemates, motorcycles, trips to Mississippi, Vancouver, Ottawa, and so on. Well... maybe it wasn’t a complete waste, per se.
Finally, I’ve settled on a topic that I’m truly interested in.
Usability and Image Search
The topic: usability of image retrieval interfaces for systems based on image content, rather than user-provided textual metadata.
What does this mean? Image searching (like Google Images or Flickr search) is actually really, really complicated stuff. The systems on the back end of the process use a variety of properties of an image to catalogue it in their vast library: the file name, its dimensions, data size, human-contributed tags and categories, and a whole slew of other things. That specific information is called the metadata. Most of the major image search engines (Yahoo, Google, MSN, etc.) also scour the web page content that surrounds an image to get more metadata that can be searched.
Human-contributed metadata is particularly meaningful: tags, categories, and file names point to the content of the image, not just its properties. The problem with human-contributed metadata is that it’s:
- very incomplete
- very subjective, thus possibly inaccurate
- very time-consuming if you do want to make it accurate or complete
- not standardized: there are no standard descriptive elements in markup languages like XHTML or HTML for people to tag their images on the web
If you have a photo of a man riding a bicycle in Greece, unless you’ve tagged that picture with “man”, “bicycle”, “greece” (in four or five of the most common languages in the world), the likelihood of that image being catalogued so it can be found by an interested party is quite low.
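To make the tagging problem concrete, here is a minimal sketch in Python of a purely tag-based image search. Everything in it is hypothetical (the file names, the tags, and the `build_tag_index` and `search` helpers); it is not how any real search engine works, just an illustration of why an image with no tags, or with tags in another language, never shows up.

```python
from collections import defaultdict


def build_tag_index(images):
    """Map each lowercase tag to the set of image names that carry it."""
    index = defaultdict(set)
    for name, tags in images.items():
        for tag in tags:
            index[tag.lower()].add(name)
    return index


def search(index, query):
    """Return the images whose tags include every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    matches = [index.get(word, set()) for word in words]
    return set.intersection(*matches)


# Hypothetical library: three photos of the same kind of scene.
images = {
    "IMG_0001.jpg": ["man", "bicycle", "greece"],
    "IMG_0002.jpg": ["mann", "fahrrad", "griechenland"],  # tagged in German
    "IMG_0003.jpg": [],                                    # never tagged
}

index = build_tag_index(images)
print(search(index, "man bicycle"))  # {'IMG_0001.jpg'}
```

Only the English-tagged photo comes back. The other two are invisible to the search, even though a person looking at them would say they show the same thing.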
Since the mid-1990s, researchers have spent a lot of time and energy in a field that hopes to automate more and more of that descriptive process, so that images can be searched on more than their properties. The field is called Content-Based Image Retrieval (CBIR). Rather than relying on metadata to describe images, one can have a computer “see” the image and record information about it: colour, texture, salient regions, shapes, layout, etc. This is the content of the image. Cataloguing all of the useful content of a very large number of images accurately, completely, and in a way that can be easily searched, is the holy grail of this field. We are decades away from such a system, but major strides in that direction have already taken place.
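As a rough illustration of what “recording the content” can mean, here is a small Python sketch of one of the simplest content descriptors, a coarse colour histogram, compared with histogram intersection. The pixel lists, the function names, and the choice of two bins per channel are all assumptions made for the example; real CBIR systems combine many more features than this.

```python
def colour_histogram(pixels, bins_per_channel=2):
    """Return a normalized colour histogram for a list of (r, g, b) pixels.

    Each channel (0-255) is split into `bins_per_channel` ranges, so the
    histogram has bins_per_channel ** 3 entries that sum to 1.
    """
    if not pixels:
        raise ValueError("an image needs at least one pixel")
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    for r, g, b in pixels:
        # value * n // 256 maps 0..255 onto bin numbers 0..n-1
        idx = ((r * n // 256) * n + (g * n // 256)) * n + (b * n // 256)
        hist[idx] += 1
    return [count / len(pixels) for count in hist]


def histogram_intersection(h1, h2):
    """Similarity between two normalized histograms: 0 (disjoint) to 1 (identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy "images" as flat pixel lists (made-up data, four pixels each).
BROWN, GREEN, SKY = (139, 69, 19), (34, 139, 34), (135, 206, 235)
dog_on_grass = [BROWN, BROWN, BROWN, GREEN]
horse_in_field = [BROWN, BROWN, GREEN, GREEN]
empty_sky = [SKY, SKY, SKY, SKY]

query = colour_histogram(dog_on_grass)
print(histogram_intersection(query, colour_histogram(horse_in_field)))  # 0.75
print(histogram_intersection(query, colour_histogram(empty_sky)))       # 0.0
```

No tags are involved here: the two brown-and-green pictures score as similar purely on colour. The sketch also shows the limitation, since colour alone cannot tell a dog from a horse.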
Humans + CBIR
A smaller piece of all of this is: how is a person supposed to search for something based on content? As in, I know a dog is fuzzy, I know that the computer can find fuzzy textures, but how do I describe to a search engine that I want to find all images of a brown, fuzzy dog? It sounds simple, should be simple, but is not simple with the current state of the art.
Usability is a field that concerns itself with how things are used, especially interfaces of computer programs. I’m very interested in the usability of these image search systems: both the ones that already exist, and proposed systems that will exist once we have the technology to automatically identify brown fuzzy dogs in your Flickr photos from the cottage.
So that’s where I’m at. I have a general idea of the area I’m interested in, so I’m in the process of reading some review papers about the state of CBIR. Hopefully, a literature review is coming next.