ColdFusion XML Parsing
The "Content is not allowed in prolog" XmlParse() error in ColdFusion usually appears when you try to parse XML that has data or white space before the XML declaration or root node. This often happens when an XML feed does not trim its return value. Usually, passing the content through ColdFusion's Trim() function before calling XmlParse() does the trick; however, in one case, Trim() did not help.
I ran into this while working with Authorize.NET's API, which returns XML responses. If you look at the FileContent, you will see that an XML document was returned, and from what you can see, the first piece of data returned appears to be the XML declaration with its encoding.
Yet even when the response is run through ColdFusion's Trim() function before parsing, which would usually take care of any prolog data issues, XmlParse() still throws the following error:
An error occured while Parsing an XML document. Content is not allowed in prolog.
It turns out that the character causing the problem is a Byte-Order-Mark, or BOM: an invisible character at the very start of the content that signals the encoding of the XML document. Trim() apparently does not treat it as white space, so it survives trimming, and ColdFusion's XmlParse() then rejects it as content in the prolog. To get this kind of XML feed to play nicely with ColdFusion, we have to remove the BOM before we parse the document. Luckily, this requires nothing more than a simple regular expression that strips out all characters before the first opening angle bracket.
Here is the code:
REReplace( objGet.FileContent, "^[^<]*", "", "all" )
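To see the whole thing end to end, here is a minimal sketch of the check and the fix. It assumes ColdFusion 9 or later (or Lucee) for the cfscript function syntax. The helper name stripXmlProlog and the sample response are made up for illustration and are not part of Authorize.NET's API. Chr( 65279 ) is the BOM character, U+FEFF.

```cfm
<cfscript>
// Hypothetical helper (name is an assumption): drop everything before the
// first "<", which covers a leading BOM as well as stray white space.
function stripXmlProlog( required string rawXml ) {
    return REReplace( arguments.rawXml, "^[^<]*", "", "one" );
}

// Simulated response: a BOM (U+FEFF) followed by a small XML document.
// In real code this would be objGet.FileContent from the CFHTTP call.
rawResponse = Chr( 65279 ) & '<?xml version="1.0" encoding="utf-8"?><response><code>1</code></response>';

// Quick diagnostic: a character code of 65279 here means the content
// starts with a BOM rather than with the "<" it appears to start with.
firstCharCode = Asc( Left( rawResponse, 1 ) );

// Clean the content, then parse it.
xmlDoc = XmlParse( stripXmlProlog( rawResponse ) );

writeOutput( "First character code: " & firstCharCode & "<br />" );
writeOutput( "Root node name: " & xmlDoc.XmlRoot.XmlName );
</cfscript>
```

Because the pattern is anchored to the start of the string, it can only match once, so a scope of "one" behaves the same as "all" here. Keep in mind that the pattern removes anything that comes before the first "<", not just the BOM, which is fine for a well-formed XML response but would also silently discard other unexpected leading data.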
