First attempt doesn't seem to work

SafeTex
Posts: 3
Joined: Tue May 28, 2013 6:41 pm

Post by SafeTex »

Hello

I'm not very good with regular expressions, but someone gave me one that works in a program called Xbench:

<([[:digit:][:letter:]]+_)+[[:digit:][:letter:]]+>

It should find reference numbers like

ABC_123_DEFG-4567

etc., of any length.

The problem is that when I run it in TC (Text Crawler, which I'm not really familiar with either), the search takes only a few seconds (instead of a few hours) on a VERY big txt file, and it finds 0 matches (Xbench finds just over 121,000).

The reason I can't use Xbench is that it doesn't have an extract function, and I need the matches extracted to a txt file.

I've tried testing the expression in TC's tester, but that doesn't seem to work for me either.

The problem might not be the regex at all; I may be using TC wrongly. (I've indicated the path, put the expression in the box and asked it to 'extract'.)

Can anyone help, please?

Thanks in advance
SafeTex
Posts: 3
Joined: Tue May 28, 2013 6:41 pm

Re: First attempt doesn't seem to work

Post by SafeTex »

Hello again

No hero to come to help me? :(

OK, let's start with something simple.

Why does [:digit:] find all the letters D, I, G, I, T in my file and not the digits 0-9? (No surprise, then, that my regular expression does not work. :D)

What flavour of regular expressions should I try to use in Text Crawler?

Is there a manual?

Is there anyone there please?

Regards
Steve_L
Posts: 4
Joined: Sat May 25, 2013 12:59 am

Re: First attempt doesn't seem to work

Post by Steve_L »

Hi SafeTex... I am new to regular expressions and TextCrawler too, but did you try this?

\D+\d+\D+\d+

\D matches a character that is not a digit
\D+ matches one or more characters that are not digits

\d matches any single digit
\d+ matches one or more digits
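
To see what this suggestion actually matches, here is a minimal sketch in Python (not TextCrawler's own engine, which may differ in details). The helper name `first_match` is made up for illustration. It suggests the pattern does match the sample reference number, but also ordinary prose that happens to contain two numbers, because \D accepts spaces and punctuation as well as letters.

```python
import re

# Steve_L's suggested pattern, copied from the post above.
PATTERN = re.compile(r"\D+\d+\D+\d+")

def first_match(text):
    """Return the first substring matched by PATTERN, or None (hypothetical helper)."""
    m = PATTERN.search(text)
    return m.group(0) if m else None

# The sample reference number matches in full:
# \D+ = "ABC_", \d+ = "123", \D+ = "_DEFG-", \d+ = "4567"
print(first_match("ABC_123_DEFG-4567"))         # ABC_123_DEFG-4567

# Ordinary text with two numbers matches as well, since \D also
# accepts spaces and punctuation.
print(first_match("see page 12 of chapter 3"))  # see page 12 of chapter 3

# Text without any digit does not match.
print(first_match("ABC_DEF"))                   # None
```

So on a large file this pattern would probably return many matches that are not reference numbers.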
SafeTex
Posts: 3
Joined: Tue May 28, 2013 6:41 pm

Re: First attempt doesn't seem to work

Post by SafeTex »

Hello Steve L

Thanks for the possible answer, but I eventually figured out for myself how to simplify the original expression for Text Crawler.

I used [0-9] because the engine does not seem to recognise [:digit:], or any other named set for that matter.
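
The behaviour described in this thread can be reproduced in Python's `re` module, which also has no POSIX named classes. This is only an illustration under that assumption, not a statement about TextCrawler's engine. The final expression was not posted, so the `REFERENCE` pattern below is a guess at what a simplified version with explicit ranges might look like, and `extract_references` is a hypothetical helper name.

```python
import re

# In an engine without POSIX classes, [:digit:] is read as an ordinary
# set containing the literal characters ':', 'd', 'i', 'g', 't'.
LITERAL_SET = re.compile(r"[:digit:]")
print(LITERAL_SET.findall("Digit 42: ok"))  # ['i', 'g', 'i', 't', ':']

# Explicit ranges work everywhere.
print(re.findall(r"[0-9]", "Digit 42: ok"))  # ['4', '2']

# Assumed simplified pattern (not taken from the thread): alphanumeric
# groups joined by underscores, with \b standing in for Xbench's < and >.
REFERENCE = re.compile(r"\b(?:[0-9A-Za-z]+_)+[0-9A-Za-z]+\b")

def extract_references(text):
    """Return every underscore-joined reference found in text."""
    return REFERENCE.findall(text)

print(extract_references("ref ABC_123_DEFG and 12_ab, plain word"))
# ['ABC_123_DEFG', '12_ab']

# The hyphen is not part of either set, so the match stops before it.
print(extract_references("ABC_123_DEFG-4567"))  # ['ABC_123_DEFG']
```

If references such as ABC_123_DEFG-4567 should be captured whole, the separator would need to allow a hyphen as well, for example `[_-]` in place of `_`.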

Regards

SafeTex