[mdx] metadata verification; ingress filtering
Ian Young
ian at iay.org.uk
Wed May 20 09:08:26 PDT 2009
I mentioned this in another context to the Shibb team earlier in the
week, but it seems tangentially relevant to aggregation engines so I'm
reposting here with some modifications.
Historically, we've done a few things in our build process:
* metadatatool is used for signing, and it performs the same checks as
the 1.3 IdP does when you run it on its own output
* one output of the build process is the notorious HTML "stats page",
generated by a huge and difficult-to-maintain XSLT stylesheet; this
was tweaked to include a section at the top flagging known problems,
which the person doing the signing run needed to eyeball every time
* we run a cron job that uses both metadatatool and siterefresh to
download the production metadata
Until recently, this collection of hacks has been sufficient to detect
everything important we've encountered in real life. Then we found
that a particular typo in a KeyName element in the right place could
take down all of our 140+ 1.3 SPs and I started thinking about getting
more serious about verification.
What I've done to address this for now is to migrate the XSLT-based
checking out of the normal Xalan-driven stats page generation to a
much earlier point in the pipeline, where we build the metadata to be
signed. I still use XSLT, but instead of using command-line Xalan, I
have written a little app to support the checking process:
http://svn.ca.iay.org.uk/ukfedmeta/trunk/src/uk/org/ukfederation/apps/mdcheck/MetadataCheck.java
This currently just runs a given XSLT stylesheet against the provided
metadata file and then discards the output; the stylesheet
communicates with the application using <xsl:message> elements to
signal errors. It may be possible to extend this by having the
application pull in other relevant information and pass it to the
checking stylesheet as parameters.
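To give a flavour of the shape such a stylesheet takes, here is a
sketch of the pattern (not the real file, which is linked below; the
entityID check is purely illustrative):

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
    xmlns:ds="http://www.w3.org/2000/09/xmldsig#">

  <!-- illustrative check: flag entityID values that don't look like
       absolute URIs -->
  <xsl:template match="md:EntityDescriptor[not(contains(@entityID, ':'))]">
    <xsl:message>suspect entityID: <xsl:value-of select="@entityID"/></xsl:message>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:template>

  <!-- default: walk the whole document, emit nothing; the calling
       application discards the output anyway -->
  <xsl:template match="@*|node()">
    <xsl:apply-templates select="@*|node()"/>
  </xsl:template>

</xsl:stylesheet>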
The current checking stylesheet is here:
http://svn.ca.iay.org.uk/ukfedmeta/trunk/build/check.xsl
This is a lot smaller and simpler to change than the HTML-generating
version I had before, and it's also possible to add new checks far
more easily because you don't need to care about generating valid
output.
Here's the check for that 1.3 SP issue:
<xsl:template match="ds:KeyInfo/*[namespace-uri() != 'http://www.w3.org/2000/09/xmldsig#'
]">
<xsl:call-template name="fatal">
<xsl:with-param name="m">ds:KeyInfo child element not in ds
namespace</xsl:with-param>
</xsl:call-template>
</xsl:template>
I've used a callable template to report errors so that I can add
extra information about the entity context to the error message
without duplicating a lot of XSLT all over the place.
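The template itself is along these lines (a sketch rather than a
verbatim copy of what's in check.xsl; the message format here is
illustrative):

<!-- report a problem, prefixed with the enclosing entityID if any -->
<xsl:template name="fatal">
  <xsl:param name="m"/>
  <xsl:message>
    <xsl:text>*** </xsl:text>
    <xsl:value-of select="ancestor-or-self::md:EntityDescriptor/@entityID"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="$m"/>
  </xsl:message>
</xsl:template>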
One final wrinkle is that I plan to make fairly serious use of the
ability of the Xalan processor in particular to call out to custom
Java classes. At present, for example, there is a call out to this
class:
http://svn.ca.iay.org.uk/ukfedmeta/trunk/src/uk/org/ukfederation/xalan/Members.java
The main check.xsl passes this an auxiliary metadata document we
maintain in a custom format that describes federation members; the
class digests that to extract the canonical names of those members
into a Set<String> so that we can do this later on:
<xsl:template match="md:EntityDescriptor[md:Organization/
md:OrganizationName]
[not(ukfxm:isOwnerName($members, md:Organization/
md:OrganizationName))]">
<xsl:call-template name="fatal">
<xsl:with-param name="m">unknown owner name: <xsl:value-of
select="md:Organization/md:OrganizationName"/></xsl:with-param>
</xsl:call-template>
</xsl:template>
(Note the call to ukfxm:isOwnerName)
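For anyone who hasn't seen Xalan's Java extension mechanism: you bind
a namespace prefix to the class, construct the helper object once,
and the extension function calls become instance method calls on it.
Roughly like this (a sketch; the parameter name and the constructor
signature are illustrative, not what's actually in check.xsl):

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
    xmlns:ukfxm="xalan://uk.org.ukfederation.xalan.Members">

  <!-- location of the members document; the name is illustrative -->
  <xsl:param name="membersFile" select="'members.xml'"/>

  <!-- construct a Members instance once, assuming a constructor that
       takes the document node; ukfxm:isOwnerName($members, ...) then
       calls the corresponding instance method on it -->
  <xsl:variable name="members" select="ukfxm:new(document($membersFile))"/>

  <!-- checking templates, as above -->

</xsl:stylesheet>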
This makes some kinds of check that would be almost impossible (and
very CPU-intensive) in plain XSLT pretty trivial. Ideally, I'd start
using OpenSAML in this role a lot more, once I get round the fact that
some of my other XSLT (used in command-line Xalan calls elsewhere)
apparently breaks if you hand it the same endorsed libraries as
OpenSAML 2 needs :-(
Relevance to MDX follows...
The arch document talks about transformation pipelines (T blocks) in
the aggregation engine. Cases like this make me think that:
1) most aggregation engines, certainly ones publishing to end
entities, are going to want to check for known-evil metadata
constructs [1]
2) one simple way of doing this is to allow one of the things in a T
block pipeline to be something that is explicitly there to check
-- rather than transform -- the metadata as it goes by. It probably
makes most sense for this to be on an entity-by-entity basis, and in
fact that's how I've been assuming the transformation pipelines would
work anyway. Checking is different mainly in that the output is
discarded, and replaced with the input if nothing bad happened during
the check.
-- Ian
[1] ... or to be able to process incoming metadata into a form that is
known to be non-evil by construction before handing metadata along to
consumers. This is harder in general but is the sort of thing you
want to do if you want to enforce a particular set of conventions,
dropping everything that falls outside them "just in case". I'm not
interested in this case here.