Tuesday, 15 June 2010

java - When do I need to escape an HTML string?




In my legacy project I can see the usage of escapeHtml before a string is sent to the browser.

StringEscapeUtils.escapeHtml(stringBody);

I know from the API doc what escapeHtml does. Here is the illustration given:

For example: "bread" & "butter" becomes: &quot;bread&quot; &amp; &quot;butter&quot;.

My understanding is that when we send the string after escaping the HTML, it is the browser's responsibility to convert it back to the original characters. Is that right?

But I am not getting why and when it is required, and what happens if we send the string body without escaping the HTML. What is the cost if we don't call escapeHtml before sending it to the browser?

I can think of several possibilities to explain why a string is sometimes not escaped:

Perhaps the original programmer was confident that at those places the string had no special characters. (However, in my opinion this is bad programming practice; it costs very little to escape the string as protection against future changes.)

The string was already escaped at that point in the code. You don't want to escape a string twice; the user will end up seeing the escape sequence instead of the intended text.

The string was actual HTML itself. You don't want to escape the HTML; you want the browser to process it!
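To make the double-escaping hazard concrete, here is a minimal sketch. The escapeHtml method below is a hypothetical stand-in that handles only four characters; it is not the Commons Lang implementation, which covers many more entities.

```java
// Hypothetical helper for illustration only: escapes just & < > and ".
// The real StringEscapeUtils.escapeHtml handles far more characters.
final class DoubleEscapeSketch {
    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "\"bread\" & \"butter\"";
        String once = escapeHtml(raw);   // what the browser should receive
        String twice = escapeHtml(once); // escaped again by mistake
        System.out.println(once);
        System.out.println(twice);
    }
}
```

The browser unescapes only once, so the second string would show up on the page as the literal text &quot;bread&quot; &amp; &quot;butter&quot; rather than the quoted words.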

Edit - The reason for escaping is that special characters like & and < can end up causing the browser to display something other than what you intended. A bare & is technically an error in HTML. Browsers try to deal intelligently with such errors and display them correctly in most cases. (This will probably happen with your example text if the string is text in a <div>, for instance.) However, because it is bad markup, some browsers may not work well; assistive technologies (e.g., text-to-speech) may fail; and there may be other problems.

There are several cases that will fail despite the best efforts of the browser to recover from bad markup. If your sample string were an attribute value, escaping the quote marks would be absolutely required. There's no way a browser is going to correctly handle something like:

<img alt=""bread" & "butter"" ... >
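A corrected version of that tag could be built along these lines. This is only a sketch: escapeAttribute and imgTag are hypothetical helpers, the src value in the usage note is made up, and the escaper assumes the attribute is always wrapped in double quotes.

```java
// Hypothetical helpers for illustration; assumes double-quoted attributes.
final class AttributeEscapeSketch {
    // Escape & first so the entities added for quotes are not re-escaped.
    static String escapeAttribute(String s) {
        return s.replace("&", "&amp;").replace("\"", "&quot;");
    }

    static String imgTag(String src, String alt) {
        return "<img src=\"" + escapeAttribute(src)
             + "\" alt=\"" + escapeAttribute(alt) + "\">";
    }
}
```

Calling imgTag("toast.png", "\"bread\" & \"butter\"") yields <img src="toast.png" alt="&quot;bread&quot; &amp; &quot;butter&quot;">, which the browser can parse unambiguously.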

The general rule is that any character that is not markup but might be confused with markup needs to be escaped.

Note that there are several contexts in which text can appear within an HTML document, and they have separate requirements for escaping.

Within attribute values, you need to escape quote marks and the ampersand (but not <). You must also escape characters that have no representation in the character set of the document (unlikely if you are using UTF-8, but that's not always the case).

Within text nodes, only & and < need to be escaped.

Within href values, characters that need escaping in a URL must be escaped (and sometimes doubly escaped, so that they are still escaped after the browser unescapes them once).

Within a CDATA block, generally nothing should be escaped (at the HTML level).

Finally, aside from the hazard of double-escaping, the cost of escaping all text is minimal: a tiny bit of extra processing and a few extra bytes on the network.
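The href case is the one with two layers, so here is a rough sketch of it. The /search URL, its q and lang parameters, and the helper names are all invented for the example. It assumes Java 10 or later for the URLEncoder.encode(String, Charset) overload, and note that URLEncoder does form-style encoding (a space becomes +), which suits query-string values but not path segments.

```java
// Hypothetical example: URL-encode the value first, then HTML-escape the
// whole URL for the attribute. Requires Java 10+ for this encode overload.
final class HrefEscapeSketch {
    // Minimal escaper for a double-quoted attribute value.
    static String escapeAttribute(String s) {
        return s.replace("&", "&amp;").replace("\"", "&quot;");
    }

    static String searchLink(String query) {
        // Layer 1: URL level. The & inside the value becomes %26.
        String encoded = java.net.URLEncoder.encode(
                query, java.nio.charset.StandardCharsets.UTF_8);
        String url = "/search?q=" + encoded + "&lang=en";
        // Layer 2: HTML level. The & separating parameters becomes &amp;.
        return "<a href=\"" + escapeAttribute(url) + "\">results</a>";
    }
}
```

With the query "bread & butter" this produces <a href="/search?q=bread+%26+butter&amp;lang=en">results</a>. The browser undoes the HTML layer, and the server undoes the URL layer.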

java escaping stringescapeutils
