[mdx] what to do about duplicates

Leif Johansson leifj at mnt.se
Tue May 12 14:07:03 PDT 2009


On Tuesday 12 May 2009 04:19:37 pm Ian Young wrote:
> On 12 May 2009, at 13:59, Leif Johansson wrote:
> > On Tuesday 12 May 2009 02:33:53 pm Scott Cantor wrote:
> >> Leif Johansson wrote on 2009-05-12:
> >>> Couldn't you merge attributes?
> >>
> >> How do you mean? The tagging extension? That probably depends on what they
> >> mean, so I would say no in general, but probably so in most cases.
> >
> > My feeling from this thread is that the safe course is this:
> >
> > The aggregator treats the entityID as non-unique, but the combination of the
> > entityID and the origin location can be treated as unique (for instance, the
> > xs:ID attribute could be sha1(entityID+origin URL)). This means that if an
> > entity is received from two different "peers" then both are stored by the
> > aggregator under the same entityID but separate xs:ID. When the entity is
> > requested by entityID (or hash of same) then the aggregator trust evaluation
> > decides which entity to expose.
>
> That seems rather complicated.  You'd want to make sure you had a real
> use case before you decided that you needed to do all of that.
>
> I'm still inclined to say what document 2 currently says: at a point
> where you merge multiple sources of metadata, there's some (pluggable)
> policy to resolve clashes.  There's no attempt to resolve clashes by
> merging metadata, because I don't think that is possible.  You can
> imagine the policy being very complicated and looking into the
> incoming metadata to decide which one is "better", but really I think
> for 100-epsilon% of real cases just saying "pick stream A over stream
> B if there is a clash" will be what you want.
>
> The business with the ID seems like an internal detail, I certainly
> wouldn't want to expose something like that as part of what an
> aggregator exposes outside, so there's not really any need for it to
> be expressed in terms of the XML representation of an EntityDescriptor.
>
> 	-- Ian

I actually meant it to describe what my aggregator would do internally,
and it seems I almost follow doc2, except that I delay the evaluation of
which metadata is best until it is requested...
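
To make the idea concrete, here is a rough sketch in Python of the internal
bookkeeping described above. It is only an illustration under assumptions,
not an existing implementation: the names Entry, internal_id, prefer_origins
and Aggregator are hypothetical, the metadata is kept as an opaque string,
and the "pick stream A over stream B" rule stands in for whatever pluggable
policy or trust evaluation an aggregator would really use.

```python
import hashlib
from typing import Callable, Dict, List, NamedTuple, Optional


class Entry(NamedTuple):
    """One copy of an entity as received from one peer (hypothetical type)."""
    entity_id: str
    origin: str    # URL of the stream the metadata came from
    metadata: str  # the EntityDescriptor, treated as opaque here


def internal_id(entity_id: str, origin: str) -> str:
    """Internal key: sha1 over entityID + origin URL, as suggested in the thread."""
    return hashlib.sha1((entity_id + origin).encode("utf-8")).hexdigest()


# A policy picks one winner from all stored copies of the same entityID.
Policy = Callable[[List[Entry]], Entry]


def prefer_origins(order: List[str]) -> Policy:
    """Assumed policy: earlier origins in `order` win; unknown origins rank last."""
    rank = {origin: i for i, origin in enumerate(order)}

    def choose(candidates: List[Entry]) -> Entry:
        return min(candidates, key=lambda e: rank.get(e.origin, len(rank)))

    return choose


class Aggregator:
    """Stores every copy on receipt and resolves clashes only on lookup."""

    def __init__(self, policy: Policy) -> None:
        self._policy = policy
        self._by_id: Dict[str, Entry] = {}            # internal id -> entry
        self._by_entity: Dict[str, List[str]] = {}    # entityID -> internal ids

    def store(self, entry: Entry) -> str:
        key = internal_id(entry.entity_id, entry.origin)
        if key not in self._by_id:
            self._by_entity.setdefault(entry.entity_id, []).append(key)
        self._by_id[key] = entry  # a newer copy from the same origin replaces the old one
        return key

    def lookup(self, entity_id: str) -> Optional[Entry]:
        keys = self._by_entity.get(entity_id, [])
        if not keys:
            return None
        # The clash is resolved here, at request time, not when merging.
        return self._policy([self._by_id[k] for k in keys])
```

With this shape the internal id never leaves the aggregator; a caller only
asks lookup() for an entityID and gets back whichever copy the policy chose.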

	Cheers Leif


