Getting word count from template files

Recently I had to get an approximate word count of an entire site for estimating translation time. To do this I processed the template files to get all the non-html tag/logic content using find and sed, then counted the words using wc.

# from views directory

# create .out files with HTML tags stripped
find . -name '*.html' -exec sh -c 'sed -e "s/<[^>]*>//g" $1 > $1.out' -- {} \;

# create .out.bout files with nunjucks/jinja2 tags stripped
find . -name '*.html.out' -exec sh -c 'sed -e "s/{[^}]*}//g" $1 > $1.bout' -- {} \;

# word count
find . -name '*.html.out.bout' | xargs wc -w