Home > Java > New lines in XML attributes

New lines in XML attributes


If you have an attribute in xml that spans multiple lines, e.g.

 q2="2
B"

you might expect the newline literal to be encoded in the resulting string when the attribute is parsed. Instead, the above example will be parsed as “2 B”, at least with Java’s SAX parser implementation. In order to have the new line literal included, you should insert the entity & #10; instead (this entity keeps getting eaten by wordpress, so ignore the space) This StackOverflow answer by Tomalak gives some more insight:

Bottom line is, the value string is saved verbatim. You get out what you put in, no need to interfere.
However… some implementations are not compliant. For example, they will encode & characters in attribute values, but forget about newline characters or tabs. This puts you in a losing position since you can’t simply replace newlines with
beforehand.

Upon parsing such a document, literal newlines in attributes are normalized into a single space (again, in accordance to the spec) – and thus they are lost.
Saving (and retaining!) newlines in attributes is impossible in these implementations.

Advertisement
Categories: Java Tags: , ,
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: