We built a new translation system for a project in which the text
lives in HTML files that are sent off for translation. This lets us
change the layout where needed to fit a specific language.
The translation company needs a word count to give us a price quote,
and we don’t want them to count the tags.
Here is a one-liner in fish shell that should get the word count for
all the .phtml files:
cat (find . -iname "*.phtml" ) | w3m -dump -T text/html | wc -w
In bash you would replace the parentheses with backticks or $(...),
but file names containing spaces would then be split apart.
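For bash users, here is a sketch of the same pipeline that sidesteps command substitution by letting find run cat itself. The function name count_phtml_words and the sed fallback are assumptions of this example, not part of our setup. The fallback only runs when w3m is not installed, and it is much cruder than w3m's rendering.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the words in every .phtml file under a
# directory (default: the current one), ignoring HTML tags.
count_phtml_words() {
    # find runs cat itself, so file names with spaces are passed intact.
    find "${1:-.}" -iname '*.phtml' -exec cat {} + |
        if command -v w3m >/dev/null 2>&1; then
            # Render the HTML as plain text, as in the fish one-liner.
            w3m -dump -T text/html
        else
            # Assumed fallback: replace each single-line tag with a space.
            sed -e 's/<[^>]*>/ /g'
        fi |
        # Strip the padding some wc implementations put around the number.
        wc -w | tr -d '[:space:]'
}
```

Calling `count_phtml_words .` from the project root should print a single number covering all the .phtml files below it.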
We are also using PO files for short text throughout the site. The
translator we are using doesn’t know how to work with PO files, so we
used po2csv to create CSV files that they can open in a spreadsheet,
although it makes me wonder what type of translation company can’t
use PO files.
Here is some fish shell code to make the CSV files, bundle them with the PO files into a zip, and send them off:
# for each two-letter lang directory, make a csv dir and put the new csv's in there
for lang in ??
    echo $lang
    mkdir -p $lang.csvs
    po2csv $lang $lang.csvs
end
# zip up our csv and po files
find . -name "??_messages.*" | sort | zip csvsAndPos -@
# mail it off
mutt -a csvsAndPos.zip
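
The conversion loop translates to bash along these lines. This is a sketch under the same assumptions as the fish version: two-letter language directories in the current directory, and po2csv on the PATH. The function name make_csv_dirs is made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert each two-letter language directory to CSV.
make_csv_dirs() {
    for lang in ??; do
        # Skip two-character names that are not directories
        # (including the literal "??" when nothing matches).
        [ -d "$lang" ] || continue
        echo "$lang"
        mkdir -p "$lang.csvs"
        po2csv "$lang" "$lang.csvs"
    done
}
```

Run `make_csv_dirs` from the directory that holds the language directories. The zip and mutt lines work unchanged in bash.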