Wrong encoding

Tool for Search and Replace across multiple files.
ebulerdo
Posts: 18
Joined: Mon Dec 09, 2013 8:54 am

Wrong encoding

Post by ebulerdo »

Hi

Thank you very much for making TextCrawler freely available. It's a great piece of software, and one of the few that support Unicode well. Thanks!

I was having a problem with encoding, but while I waited for the activation message I found the solution, so I thought I'd share it. I have seen other posts in this forum regarding Japanese and Cyrillic encodings that received no answer and are possibly related to the same issue.

I have an XML file encoded as UTF-8, and a .txc file with a list of replacements. Basically, what I want is to replace HTML entities with the characters themselves (I mean, replace &aacute; with á, and so on). The problem is that the file's encoding is wrongly detected as ANSI, so after the replacements it's full of weird characters.
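(Not how TextCrawler does it internally, just an illustration: Python's standard library can perform the same entity-to-character replacement in one call.)

```python
import html

# Decode named HTML entities back to their Unicode characters,
# e.g. "&aacute;" -> "á" -- the same mapping the .txc replacement list encodes.
text = "Se&ntilde;or &aacute;rbol"
decoded = html.unescape(text)
print(decoded)  # Señor árbol
```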

After trying a lot of different things, I realized the problem is the Unicode signature. My text editor was saving files as UTF-8 without a BOM. After I changed them from "no BOM" to "BOM", TextCrawler was able to identify the encoding correctly. I would like to suggest that TextCrawler allow the encoding to be set manually in case it isn't detected correctly. Anyway, changing the BOM signature is easy and fixes the problem for good.
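For anyone with many files to fix, here is a minimal Python sketch that prepends the UTF-8 BOM (bytes EF BB BF) to a file that lacks it; the helper name `add_utf8_bom` is my own, not part of any tool mentioned here:

```python
import codecs

def add_utf8_bom(path):
    """Prepend a UTF-8 BOM to the file at `path` if it doesn't already have one."""
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):  # BOM_UTF8 == b"\xef\xbb\xbf"
        with open(path, "wb") as f:
            f.write(codecs.BOM_UTF8 + data)
```

Running it twice is safe: the second call sees the BOM and leaves the file untouched.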

Hope this is useful for someone.

Thanks!
DigitalVolcano
Site Admin
Posts: 1733
Joined: Thu Jun 09, 2011 10:04 am

Re: Wrong encoding

Post by DigitalVolcano »

Thanks for the info! TextCrawler does attempt to detect UTF-8 with no BOM, but this isn't always successful. Having the BOM ensures the file is recognized as the correct type.
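(To illustrate why BOM-less detection is unreliable, here is a hypothetical sniffing sketch, not TextCrawler's actual detector: a BOM is unambiguous, but "decodes as UTF-8" is only a heuristic, since pure-ASCII data also passes that test, and some ANSI byte sequences happen to form valid UTF-8.)

```python
import codecs

def sniff_encoding(data: bytes) -> str:
    """Rough encoding guess, similar in spirit to what an editor might do."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"   # BOM present: unambiguous
    try:
        data.decode("utf-8")
        return "utf-8"       # valid UTF-8 -- but ASCII/ANSI data can pass too
    except UnicodeDecodeError:
        return "ansi"        # not valid UTF-8: fall back to the system code page
```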