[semanticweb] Re: Newbie frustrations
Luca Mascaro
info at lucamascaro.info
Wed 4 Jan 2006 07:53:49 CET
I'm forwarding a mail that arrived last night on the official W3C
semantic web list; I find it more than symptomatic of some of the problems
in applying semantic metadata markup at a basic level.
What do you think?
Bye
Luca
On 1/3/06, wollman+semantic-web at bimajority.org
<wollman+semantic-web at bimajority.org> wrote:
>
> I have a small application which I use to generate photo galleries for
> my Web site. I've been meaning for some time to add some semantic
> metadata to the galleries, as having that information
> would greatly assist a search application. I thought this would be
> easy to accomplish -- the gallery structure and exposition are already
> in an XML representation of my own devising, and the individual pages
> in the gallery are generated using make and XSLT; it would not be
> difficult to add a <metadata> element to the DTD and write another
> XSLT script to extract the properties of each image and format the
> result as RDF/XML.
>
> It turned out to be far more difficult than I had expected.
>
> My photo galleries are fairly typical affairs: for each gallery, there
> is an index page, with a description of the gallery as a whole. Each
> photo is provided in multiple resolutions, and for each pair (photo
> number, resolution) there is a photo description page in HTML which
> embeds the photo and contains navigational links.
>
> My first difficulty was in how to contain the information explosion.
> In a typical 75-photo gallery, there are 376 distinct resources: one
> index page, 75 thumbnails, and a description page and image file for
> each of two resolutions. But in the abstract "photo gallery"
> semantics, there are only 151 actual *things*: an index page, possibly
> with extended narrative, 75 photos, and 75 photo captions. It's not
> at all clear how to represent this. I ended up (after trying several
> alternatives that were unsatisfactory) representing each abstract
> photo and each abstract caption as named blank nodes, with a
> dcterms:hasFormat property indicating each of the available sizes. (I
> never figured out how to use rdf:Bag or rdf:Alt for this last bit, so
> each hasFormat property is written separately.) Then the abstract
> photo can refer to the abstract caption for its dc:description
> property, and at least some of the implicit semantic structure is made
> explicit.
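
A minimal sketch of the counting and of the named-blank-node layout described above, in Python rather than the XSLT the original poster used. The resolution names, the URL pattern, and the node labels are assumptions made for illustration; only the property names (dcterms:hasFormat, dc:description) and the figures 376 and 151 come from the message.

```python
# Hypothetical names for the two resolutions; the message does not name them.
RESOLUTIONS = ("medium", "large")


def count_resources(photos, resolutions):
    """Count distinct Web resources in a gallery."""
    index_pages = 1
    thumbnails = photos
    # One description page plus one image file per (photo, resolution) pair.
    pages_and_images = photos * len(resolutions) * 2
    return index_pages + thumbnails + pages_and_images


def count_things(photos):
    """Count abstract things: the index, the photos, and their captions."""
    return 1 + photos + photos


def photo_triples(number, resolutions):
    """Describe one abstract photo as a named blank node.

    Each available size gets its own dcterms:hasFormat triple, and the
    abstract caption is attached through dc:description.
    """
    photo = f"_:photo{number}"
    caption = f"_:caption{number}"
    triples = [(photo, "dc:description", caption)]
    for res in resolutions:
        # Assumed URL pattern, for illustration only.
        triples.append((photo, "dcterms:hasFormat", f"<{res}/{number}.jpg>"))
    return triples
```

Calling count_resources(75, RESOLUTIONS) and count_things(75) reproduces the two totals in the message, and photo_triples shows why the hasFormat statements multiply with the number of sizes.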
>
> The next problem was also an information explosion, and I see from the
> archives of this list that it's a well-traveled road. Each photo
> description in my source file has a photographer attribute.
> Originally, this was only used to automatically generate copyright
> notices on each page, so all I have is the name of the photographer in
> conversational order. This is an obvious candidate for inclusion in
> the gallery metadata. My first implementation simply output the
> photographer name as a literal in the dc:creator property, but I felt
> like I ought to be able to do better. I knew, in particular, that users
> might want to use the foaf vocabulary to describe individuals depicted
> in their photos, so I decided to represent photographers as instances
> of foaf:Person. The naive approach of using a foaf:Person as a value
> of the dc:creator property failed to represent the important
> underlying expectation that two photographers with the same name in
> the same gallery are the same person. So again I used the
> named-blank-node approach (which obviously only works because I
> keep the metadata for the whole gallery in a single document). But in
> this case it's rather less than satisfactory; I was forced to use the
> generate-id() XPath function to create the names, which means that the
> author of the gallery has no way of adding additional properties to
> one of these automatically-generated foaf:Person instances. I may end
> up removing this function entirely and requiring the user to handle
> this herself -- which I already do for the textual elements, since
> my DTD doesn't represent authorship of the descriptions.
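
One possible way around the generate-id() limitation, sketched in Python under the assumption that a label derived from the photographer's name is acceptable: the same name always yields the same blank-node label, so the gallery author can predict it and attach further properties. This is not the original poster's method. The helper names are hypothetical, and two different names that reduce to the same slug would collide.

```python
import re


def person_label(name):
    """Derive a predictable blank-node label from a photographer name."""
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return f"_:person_{slug}"


def creator_triples(photos):
    """Build dc:creator triples for (photo_label, photographer_name) pairs.

    Each distinct photographer is declared once as a foaf:Person; every
    photo by that name then points at the same node.
    """
    seen = set()
    triples = []
    for photo, name in photos:
        label = person_label(name)
        if label not in seen:
            seen.add(label)
            triples.append((label, "rdf:type", "foaf:Person"))
            triples.append((label, "foaf:name", f'"{name}"'))
        triples.append((photo, "dc:creator", label))
    return triples
```

As with the approach in the message, this only works while the metadata for the whole gallery lives in a single document, since blank-node labels have no meaning outside it.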
>
> Having made an initial proof-of-concept hack, I started to annotate an
> existing photo gallery with metadata, and quickly ran aground. There
> are four obvious categories of metadata one might be interested in for
> an individual photograph:
>
> a) Technical: how the photo was taken, at what resolution, in what
> orientation, etc. I am mostly not concerned with this, since it is of
> no value to my application.
>
> b) Temporal: when the photo was taken. This is easy to accomplish and
> the choice of representation is obvious. (I used dcterms:created and
> represent the date in DTF.)
>
> c) Geographic: where the photo was taken. This was much more
> difficult; the obvious schemas all took a very computer-oriented
> approach to geocoding, representing locations as grid coordinates --
> information I do not have. I searched for hours looking for a good
> representation of an ordinary street address (the only kind of
> geographic location I might have access to for my photos) and didn't
> find anything I would describe as "good".
>
> d) Subject matter: what is this a photo of? It seems that here I have
> to develop my own ontology, since I don't tend to take photos of
> airports and only rarely take photos of people (where foaf provides
> everything that's required).
>
> I take pictures of radio towers. Most towers have a number, assigned
> by the FCC, but some don't. All I wanted was a way to represent
> "photo X shows tower Y" in a way which would allow me to answer
> queries like "show me all the photos of tower Y in temporal order".
> It shouldn't be this hard!
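
A rough sketch of the "photo X shows tower Y" query, assuming the statement is expressed with foaf:depicts and the date with dcterms:created in a single ISO 8601 form (so that string order matches date order). The tower identifiers and photo labels are made up for illustration, and a real setup would more likely use an RDF store and query language than plain tuples.

```python
# Triples as (subject, predicate, object) tuples. All identifiers are
# hypothetical; "tower:1234567" stands in for whatever URI names a tower.
TRIPLES = [
    ("_:photo1", "foaf:depicts", "tower:1234567"),
    ("_:photo1", "dcterms:created", "2005-08-14"),
    ("_:photo2", "foaf:depicts", "tower:7654321"),
    ("_:photo2", "dcterms:created", "2004-03-02"),
    ("_:photo3", "foaf:depicts", "tower:1234567"),
    ("_:photo3", "dcterms:created", "2003-11-30"),
]


def photos_of(triples, tower):
    """Return the photos depicting `tower`, oldest first.

    Photos with no dcterms:created value are left out, since they
    cannot be placed in temporal order.
    """
    shown = {s for s, p, o in triples if p == "foaf:depicts" and o == tower}
    taken = {s: o for s, p, o in triples
             if p == "dcterms:created" and s in shown}
    return sorted(taken, key=taken.get)
```

The query itself is short once the vocabulary is settled; the hard part described in the message is choosing how to name the tower, especially for towers with no FCC number.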
>
> Sorry for the long-winded rant. Am I being unreasonable or is it
> really expected to be this difficult?
>
> -GAWollman
--
Luca Mascaro
CTO Phiware Engineering Sagl
User Experience and Interface Designer / Engineer
W3C HTML, WAF, WAPIs and WCAG Working Group Member
for International Webmaster Association
Invited expert of ISO/TC 159/SC 4/WG 5
"Software ergonomics and human-computer dialogues"