Howdy:
If I have files with SGML tags and text like this--
<para>Text and more text.</para>
<block>
<tag1>
<tag2>
<tag3>text</tag3>
</tag2>
</tag1>
</block>
<para>More text.</para>
--and I want to extract all the "blocks"--that is, every instance of <block>...</block> (and everything between those tags)--why doesn't the expression <block>.*</block> work? ("Dot matches newline" is checked.) If there is more than one block in a file, this grabs everything from the first <block> to the very last </block> in the file. What expression should I use to limit each match to a single, discrete <block>?
Many thanks
Grab blocks of text
- DigitalVolcano
- Site Admin
Re: Grab blocks of text
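You need to make the asterisk in the regular expression non-greedy (lazy) by putting a question mark after it. A greedy .* matches as much as possible, so it runs on to the last </block> in the file. The non-greedy .*? matches as little as possible, so each match stops at the first </block> that follows its <block>.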
<block>.*?</block>
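If you want to see the difference outside the tool, here is a minimal sketch in Python. It assumes Python's re module treats these two patterns the same way the tool's regex engine does; re.DOTALL stands in for the "Dot matches newline" checkbox, and the sample text is made up for illustration.

```python
import re

# Made-up sample containing two separate <block> sections.
sample = """<para>Text and more text.</para>
<block>
<tag1>first</tag1>
</block>
<para>More text.</para>
<block>
<tag1>second</tag1>
</block>
"""

# re.DOTALL lets "." match newlines, like "Dot matches newline".
# Greedy: runs from the first <block> to the very last </block>.
greedy = re.findall(r"<block>.*</block>", sample, flags=re.DOTALL)

# Non-greedy: each match stops at the nearest following </block>.
lazy = re.findall(r"<block>.*?</block>", sample, flags=re.DOTALL)

print(len(greedy))  # 1
print(len(lazy))    # 2
```

The greedy pattern returns one match that swallows both blocks and the <para> between them, while the non-greedy one returns each block separately.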
Re: Grab blocks of text
Beautiful! Thanks a million. So simple.