Matching beginning of line
Posted: Thu Feb 16, 2017 8:46 am
Hello!
I am trying to write a regex that deletes all lines that do not begin with an HTML tag. In my text editor (Editpad) I would write it like this:
^[^<].+?\r\n
But in TextCrawler ^ only matches the beginning of the file, not every line.
I found a previous thread here (https://www.digitalvolcano.co.uk/board/ ... ?f=7&t=602) that suggests using:
\r\n[^<].*
But that option has two problems: 1) for some reason the .* matches the final \r too, so it leaves the file with a mix of Windows and Unix-style line endings, and 2) for that reason it only deletes half of the offending lines. If two consecutive lines don't begin with <, it only removes the first one. The second one no longer matches because it is now preceded by only \n, not \r\n.
I improved that expression to:
\r\n[^<].*\r\n
and replaced each match with \r\n
That way it no longer leaves Unix-style line endings, but it still removes only half of the lines, so I have to repeat the same search several times to get rid of all of them. I also have no way of making sure that all the lines have been removed, so I have to check manually.
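To make the two behaviors above concrete, here is a minimal sketch in Python rather than TextCrawler. The sample text is made up for illustration, and Python's `re` module is only a stand-in, so TextCrawler's engine may differ in details. The last pattern uses a lookahead so the trailing \r\n is checked but not consumed. Whether TextCrawler supports lookahead is an assumption I have not verified.

```python
import re

# Hypothetical sample: two consecutive lines that do not start with a tag.
text = "<p>keep</p>\r\nfoo\r\nbar\r\n<b>keep</b>\r\n"

# '.' excludes only \n, so '.*' also swallows the \r that ends the line.
first = re.sub(r"\r\n[^<].*", "", text)
# first == "<p>keep</p>\nbar\r\n<b>keep</b>\r\n"  (mixed endings, "bar" survives)

# The match consumes the trailing \r\n, so the next line has no
# leading \r\n left to match and is skipped on this pass.
second = re.sub(r"\r\n[^<].*\r\n", "\r\n", text)
# second == "<p>keep</p>\r\nbar\r\n<b>keep</b>\r\n"

# A lookahead requires the following \r\n without consuming it,
# so consecutive lines are all removed in a single pass.
third = re.sub(r"\r\n[^<][^\r\n]*(?=\r\n)", "", text)
# third == "<p>keep</p>\r\n<b>keep</b>\r\n"
```

Like the original patterns, this sketch cannot remove the very first line of the file, because nothing precedes it, and it leaves empty lines alone.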
This would all be easily resolved if there were an operator that matched the beginning of a line. Is there anything like that? Any suggestions?
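For comparison, this is roughly what such an operator looks like in an engine that has a multiline option. It is a sketch in Python, where the `re.MULTILINE` flag makes ^ match at the start of every line. I do not know whether TextCrawler offers an equivalent setting or accepts an inline flag, so treat that part as an open question. The sample text is again made up.

```python
import re

# Hypothetical sample text with Windows-style line endings.
text = "<p>keep</p>\r\nfoo\r\nbar\r\n<b>keep</b>\r\n"

# With re.MULTILINE, ^ matches at the start of the string and after every \n.
# [^<\r\n] makes sure the line starts with something other than '<'
# and is not empty; [^\r\n]* takes the rest of the line without its ending.
pattern = re.compile(r"^[^<\r\n][^\r\n]*\r\n", re.MULTILINE)

cleaned = pattern.sub("", text)
# cleaned == "<p>keep</p>\r\n<b>keep</b>\r\n"
```

Because each match starts at a line start and ends just after that line's own \r\n, consecutive offending lines are all caught in one pass and the line endings stay consistent.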
Thanks!