[mdx] Joe on section 3.2.1

Thu Sep 26 07:37:41 PDT 2013

On 9/26/13 9:35 AM, "Ian Young" <ian at iay.org.uk> wrote:

>Scott commented:
>
>> There really is no clean way to address URL length that I've ever seen.
>
>My remarks:
>
>I don't think this arises in practice, because the "+" operator in a
>query is equivalent to set intersection (it's defined as asking for the
>entity or entities that have id1 AND id2 AND id3).  It doesn't seem
>likely that you'd want to query for the intersection of a thousand named
>sets.

Even if it did (or we added OR, or whatever), I still don't think there's
anything you can do in a spec. That's purely implementation guidance.

Put another way, show me a REST API that mentions URL length and I'll
retract my statement. It's no different than talking about body length in
a POST. You simply accept that security dictates limits will be imposed,
but nobody knows what they will be.

At best, maybe a brief mention in Security Considerations I guess.

>Joe:
>
>>   In 3.2.1, are the curly braces around the IDs literal curly braces
>>   or is that just a semantic representation issue?

>Scott:
>
>> It's a specification of the actual character, after any decoding.
>
>My remarks:
>
>I'm not sure that's true.  I think we're talking about this:
>
><base_url>/entities/{ID}+{ID}+...
>
>Judging from the example, where we see "{ID}" in the above example, the
>intention is actually to include an appropriately encoded identifier not
>surrounded by braces; if a literal opening brace appears it is taken as a
>signal for a transformation indicator.
>
>So this actually looks like a typo to me, and should probably be more
>like:
>
><base_url>/entities/<ID>+<ID>+...

Yes, I was really addressing a different point that I thought Joe was
raising. My point was that when the text in 3.1 says:

   An identifier MAY contain any URL-encodable character but MUST NOT
   start with '{' (ASCII 0x7B) as this character has a special meaning
   in the first position (see below).

That's referring to the decoded character. So encoding the '{' in the URL
doesn't get you around the rule.
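
To illustrate the decoded-character point, here is a minimal Python sketch.
The function name and the sample identifiers are made up for illustration
and are not taken from the spec. It only assumes that the check runs after
percent-decoding:

```python
from urllib.parse import unquote


def starts_with_brace(raw_id: str) -> bool:
    """Return True if the identifier begins with '{' once percent-decoded.

    Hypothetical helper: the check is applied to the decoded form, so a
    leading "%7B" is treated exactly like a literal "{".
    """
    return unquote(raw_id).startswith("{")


# A literal brace and an encoded brace are indistinguishable after decoding.
print(starts_with_brace("{md5}abc"))        # literal '{'
print(starts_with_brace("%7Bmd5%7Dabc"))    # percent-encoded '{'
print(starts_with_brace("https%3A%2F%2Fidp.example.org%2Fshibboleth"))
```

Either spelling of the brace lands in the same branch, which is all the
rule in 3.1 is saying.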

But you're right, the example in 3.2.1 is wrong.

>
>Of course, it might be better to have a proper grammar for this.  I don't
>think the current spec even says that <base_url> is a non-terminal, for
>example, the meaning of <X> in that context is completely implicit.
>
>Comments?

I think we need a grammar or we need only examples that are concrete and
don't introduce placeholders that will be confused with a grammar.

>The only cases I can think of where you can have a syntactic error as
>things stand would be:
>
>* <base_url>/entities/FOO+        (the second ID is missing)
>
>* <base_url>/entities/{md5f3678248a29ab8e8e5b1b00bee4060e0      (no '}')
>
>These both seem like natural 400 status (malformed request) cases to me,
>but do we need to make that explicit?

I think making it explicit that a 400 should be used is useful.
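
As a rough sketch of what that could look like on the responder side (the
function name is hypothetical, and splitting on the raw '+' before decoding
is my assumption, not something the current text specifies):

```python
from urllib.parse import unquote


def parse_entities_query(raw: str):
    """Parse the path portion after "<base_url>/entities/".

    Returns (status, identifiers). Hypothetical sketch: it covers only the
    two malformed cases discussed in this thread and reports both as 400.
    """
    identifiers = []
    # Split on the literal '+' first, so an encoded "%2B" stays inside an ID.
    for part in raw.split("+"):
        ident = unquote(part)
        if not ident:
            # Empty operand, e.g. "FOO+" (the second ID is missing).
            return 400, []
        if ident.startswith("{") and "}" not in ident:
            # Leading '{' with no closing '}' for the transformation indicator.
            return 400, []
        identifiers.append(ident)
    return 200, identifiers


print(parse_entities_query("FOO+"))
print(parse_entities_query("{md5f3678248a29ab8e8e5b1b00bee4060e0"))
print(parse_entities_query("{md5}f3678248a29ab8e8e5b1b00bee4060e0"))
```

The first two calls are the two cases listed above and both come back as
400; the third is the same MD5 example with the brace closed.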

-- Scott