Find & Replace All Genre in XML file

Tool for Search and Replace across multiple files.
badelman
Posts: 4
Joined: Sun Apr 28, 2013 5:03 pm

Post by badelman »

Hello, I have a folder called "Movie", and underneath that folder I have category folders like "Comedy", "Drama", etc. Inside those folders I have individual movie folders like "Caddy Shack".

So the directory structure is: Movie\Comedy\Caddy Shack\*.*

Inside each individual movie folder is an XML file with a ton of info in it. The main content I want to modify is the "Genre". Most movies have several genres, but the main genre is not always listed 1st or 2nd, and those leading entries are what my media player picks up to categorize the movie.

So the section in each *.xml file looks exactly like this (sometimes there are up to 5 or 6 genres):
<genre>
<name>Comedy</name>
<name>Action</name>
</genre>

What I want to do is automatically find the <genre> section in each file inside a category directory (Comedy, including all of its subfolders) and replace everything between <genre> and </genre> with 1 entry, which for the example above could be <name>Comedy</name>

The genre I want as the single entry may or may not already be there. Everything else between the genre start and end tags should be removed, leaving only the genre specified in TextCrawler. In addition, the <genre> start tag is not on the same line in every file; its position depends on how much data comes before it.

How can I set up TextCrawler to do this?

Thanks!
DigitalVolcano
Site Admin
Posts: 1863
Joined: Thu Jun 09, 2011 10:04 am

Re: Find & Replace All Genre in XML file

Post by DigitalVolcano »

I think this is what you want.

In the Regular Expression tab, with 'dot matches newline' selected, use the following.

Regex:

<genre>.*?</genre>
Replace:

<genre><name>Comedy</name></genre>

BEFORE:
<genre>
<name>Comedy</name>
<name>Action</name>
</genre>
<test>
blah
</test>
<genre>
<name>bfabfababf</name>
</genre>


AFTER:
<genre>
<name>Comedy</name>
</genre>
<test>
blah
</test>
<genre>
<name>Comedy</name>
</genre>

Re: Find & Replace All Genre in XML file

Post by badelman »

Thank you for your quick reply! I am still a bit confused by the BEFORE/AFTER output example. The reason is that in the *.XML files I have in each sub-directory the organization of text seems to be the same in every file (not the number of entries per tag but just the organization).

1. I say this because I'm not sure why there are 2 <genre>/</genre> blocks in each example, as there is only 1 at top and 1 at bottom.
2. Also, what do the 2 <test>/</test> tags in the example do? Does TextCrawler insert those, or is that just showing that other stuff may be in there?
3. Also as your example also shows <name>Comedy</name> twice is that to mean that the current way of doing it will

Thx again!

Barry

Re: Find & Replace All Genre in XML file

Post by badelman »

Hello,

OK. I tested it, and after also adding *.xml to the file filter under "Input" and putting in your syntax I was able to get great results. So thank you! I still have a couple of questions:
1. Is it OK that the opening and closing tags are all on the same line? Or is there a way to force TextCrawler to place a carriage return at a specific point? The output I got was like this: <genre><name>Comedy</name></genre>
All of my XML files were organized like this:
<genre>
<name>Comedy</name>
</genre>

2. Is there a way to leave the other genre entries and just put the new genre in the #1 spot, and possibly delete any subsequent duplicate genre entries?
For example, if I add <name>Comedy</name>

AND the current file looks like this:
BEFORE
<genre>
<name>Action</name>
<name>Comedy</name>
<name>Sci-Fi</name>
</genre>

AFTER
<genre>
<name>Comedy</name>
<name>Action</name>
<name>Sci-Fi</name>
</genre>

3. Also, is there a way in the replacement to specify 2 or more genres?

Thanks Again!!
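
The reordering described in the questions above (put a chosen genre first, keep the rest, drop duplicates) goes beyond a single find-and-replace pattern, so here is a rough Python sketch of one way it could be scripted. This is an assumption-laden illustration, not a TextCrawler feature: `promote_genres` is a hypothetical helper, it accepts one or more preferred genres, it compares names case-sensitively, and it writes each <name> entry on its own line.

```python
import re

# Capture the inside of each <genre> block; DOTALL lets "." span lines.
GENRE_BLOCK = re.compile(r"<genre>(.*?)</genre>", re.DOTALL)
NAME_ENTRY = re.compile(r"<name>(.*?)</name>", re.DOTALL)


def promote_genres(xml_text: str, preferred: list[str]) -> str:
    """Move `preferred` genres to the top of each <genre> block.

    Existing entries are kept in their original order after the preferred
    ones, and any repeated name is written only once.
    """

    def rebuild(match: re.Match) -> str:
        existing = [name.strip() for name in NAME_ENTRY.findall(match.group(1))]
        ordered = []
        for name in [*preferred, *existing]:
            if name not in ordered:  # case-sensitive duplicate check
                ordered.append(name)
        lines = [f"<name>{name}</name>" for name in ordered]
        return "<genre>\n" + "\n".join(lines) + "\n</genre>"

    return GENRE_BLOCK.sub(rebuild, xml_text)
```

With the BEFORE block from question 2 as input, `promote_genres(text, ["Comedy"])` should yield the AFTER layout, and passing a longer list such as `["Comedy", "Drama"]` places several genres at the top in that order.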