DigitalVolcano Software Support

Posted: **Thu Feb 16, 2017 8:46 am**

Hello!

I am trying to write a regex that deletes every line that does not begin with an HTML tag. In my text editor (EditPad) I would write it like this:
^[^<].+?\r\n

But in TextCrawler, ^ only matches the beginning of the file, not the beginning of every line.

I found a previous thread here (https://www.digitalvolcano.co.uk/board/ ... ?f=7&t=602) that suggests using:
\r\n[^<].*

But that option has two problems: 1) for some reason the * also matches the final \r, so it leaves the file with a mix of Windows- and Unix-style line endings, and 2) because of that it only deletes half of the offending lines. If two consecutive lines don't begin with <, it removes only the first one; the second no longer matches because it is now preceded by just \n rather than \r\n.

I improved that expression to:
\r\n[^<].*\r\n
and replaced each match with \r\n

That way it no longer leaves Unix-style line endings, but it still only removes half of the lines, so I have to repeat the same search several times to get rid of them all. And I still have no way of making sure that all the lines have been removed, so I have to check manually.

This would all be easily resolved if there were an operator that matched the beginning of a line. Is there anything like that? Any suggestions?

Thanks!
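
For readers who want to reproduce the two symptoms described in this post, here is a minimal sketch in Python. It assumes Python's `re` module behaves like TextCrawler's engine for these patterns (in particular, that `.` matches `\r` but not `\n`); the sample text and the helper `remove_until_stable` are made up for illustration and are not part of TextCrawler.

```python
import re

# Sample text with Windows (CRLF) line endings, modelled on the post.
text = (
    "<keep>Keep this line</keep>\r\n"
    "Remove this line\r\n"
    "Remove this line too\r\n"
    "<keep>Keep this line</keep>\r\n"
)

# Pattern from the earlier thread: "." matches "\r" (but not "\n"), so the
# match swallows the "\r" that ends the removed line and leaves a bare "\n".
first_try = re.sub(r"\r\n[^<].*", "", text)

# Improved pattern: the trailing "\r\n" is consumed by the match, so it
# cannot also act as the leading "\r\n" of the next offending line.
pattern = r"\r\n[^<].*\r\n"
one_pass = re.sub(pattern, "\r\n", text)


def remove_until_stable(s):
    """Repeat the replacement until nothing matches (hypothetical helper).

    Returns the cleaned text and the number of passes that changed it.
    """
    passes = 0
    while True:
        s, count = re.subn(pattern, "\r\n", s)
        if count == 0:
            return s, passes
        passes += 1


cleaned, passes = remove_until_stable(text)
```

Looping until `re.subn` reports zero replacements is one way to avoid the manual check, although a pattern anchored on the previous line ending like this one will never match an offending first line of the file.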

Posted: **Thu Feb 16, 2017 9:13 am**

This is an example to make it clearer what I want to do:

I have this text:

<keep>Keep this line</keep>
<keep>Keep this line</keep>
Remove this line
Remove this line too
<keep>Keep this line</keep>

And I want this:

<keep>Keep this line</keep>
<keep>Keep this line</keep>
<keep>Keep this line</keep>

Thank you!

Posted: **Thu Feb 16, 2017 2:38 pm**

OK, I think I found the solution by reading other topics.

I need to have Multiline anchors enabled.

Now it works. :-)

Thanks!
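
As an illustration of what a multiline-anchor option changes, here is a small Python sketch using the example text from this thread. Python's `re.MULTILINE` flag is used as a stand-in for TextCrawler's "Multiline anchors" setting; that the two behave the same way is an assumption, not something confirmed here.

```python
import re

# Example text from the thread, with Windows (CRLF) line endings.
text = (
    "<keep>Keep this line</keep>\r\n"
    "<keep>Keep this line</keep>\r\n"
    "Remove this line\r\n"
    "Remove this line too\r\n"
    "<keep>Keep this line</keep>\r\n"
)

# The expression from the first post.
expression = r"^[^<].+?\r\n"

# Without the flag, "^" matches only at the very start of the text, and the
# text starts with "<", so nothing is removed.
without_flag = re.sub(expression, "", text)

# With re.MULTILINE, "^" also matches right after every "\n", so each
# offending line is matched and removed in a single pass.
with_flag = re.sub(expression, "", text, flags=re.MULTILINE)
```

Because each match here starts at a line start and ends with that line's own `\r\n`, consecutive offending lines no longer compete for the same line ending.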

Matching beginning of line
