Regex replace results in strange characters
Posted: Sat Nov 08, 2008 4:29 pm
I have the following situation I cannot solve. I have:
Line1
Line2
I want to end up with:
Line1
Line2
That is, I want to reduce any run of 2 or more empty lines down to just 1.
I tried this regular expression:
(\r\n *){3,}
and replace with:
\r\n\n
In the Regular Expression Tester it looks as expected. When I apply it to a file, it looks OK in the TextCrawler preview window. Opening the file in Word also looks as expected. However, opening it in an editor like Windows Notepad (or TEDNotepad) results in:
Line1
Line2
Further investigation with different text editors (PSPad, KDiff3, WinMerge, Crimson Editor, Plato3, TextPad, RJ TextEd, UnicEdit, WordPad), which all show it correctly, tells me I have to represent graphically what happens.
With Windows Notepad (or TEDNotepad) it looks like:
Line1[**][*]
Line2
Where [**] represents two rectangles that nevertheless appear to be only one character, and [*] represents a single rectangle that is also one character. These are probably the [\r\n] and [\n] from the replacement string.
Obviously I want it to look "right" in all cases. Can you help me?
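The guess about [\r\n] and [\n] can be checked outside TextCrawler. Below is a minimal sketch in Python; it assumes Python's re module behaves like TextCrawler's engine for this pattern (which may not hold exactly), and the sample text and the helper bare_lf_count are made up for illustration. It shows that the replacement \r\n\n leaves one LF without a preceding CR, which older versions of Notepad do not treat as a line break, while \r\n\r\n keeps every line ending as a full CRLF pair.

```python
import re

# Hypothetical sample: Line1 and Line2 separated by three empty lines, CRLF endings.
text = "Line1\r\n\r\n\r\n\r\nLine2\r\n"

# The pattern from the post: three or more CRLFs, each optionally followed by spaces.
pattern = re.compile(r"(\r\n *){3,}")

# Replacement from the post: one CRLF followed by a bare LF.
mixed = pattern.sub("\r\n\n", text)

# Alternative replacement: two full CRLF pairs.
consistent = pattern.sub("\r\n\r\n", text)


def bare_lf_count(s):
    """Count LF characters that are not preceded by a CR."""
    return len(re.findall(r"(?<!\r)\n", s))


print(repr(mixed))
print(repr(consistent))
print(bare_lf_count(mixed), bare_lf_count(consistent))
```

If the same holds in TextCrawler, replacing with \r\n\r\n instead of \r\n\n may be enough to make the file look the same in every editor.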
Juergen
P.S. If this is confusing, since this forum window also shows it correctly, I can send screen dumps as PDF files if you tell me how.