QuarkXpress, Appscript & Python
Yesterday I was asked to help an organisation out.
They had previously published a number of books. Having just done a deal to have the books published in the UK, they needed to send the original text over.
Unfortunately all they had were the QuarkXpress files of the final layouts. Contractually they needed to send Word files.
The layouts for each book contained over 500 separate chunks of text. So a simple select all and copy was out of the question.
After trying a couple of tactics I decided a scripting solution was the only way to sanely achieve their goal.
I found some relevant AppleScripts, but they didn't work. They did, however, give me enough insight into the QuarkXpress script dictionary to create a Python-based solution using the Appscript Apple event bridge.
The result was seven simple lines of code that saved someone days of copying and pasting. I share this simple script in case anyone else in the googlable world needs a quick solution to extract all the text from a QuarkXpress document.
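from appscript import *
import codecs

# Ask QuarkXpress for every story in the first open document
stories = app('QuarkXpress').documents[1].stories()
# Write the stories to a UTF-8 encoded text file
f = codecs.open('text_extracted.txt', 'w', 'utf-8')
for s in stories:
    f.write(s)
f.close()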
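The script above writes the stories back to back, so with 500-odd chunks the text of one story can run straight into the next. Below is a minimal sketch of one way around that, assuming (as the script does) that each story comes back as a plain string. The names `write_stories` and `quark_stories` and the blank-line separator are my own choices for illustration, not part of Appscript or QuarkXpress.

```python
def write_stories(stories, path, separator="\n\n"):
    """Write the non-empty stories to `path` as UTF-8, joined by `separator`.

    Hypothetical helper: assumes every item in `stories` is a string.
    Returns the number of stories actually written.
    """
    # Drop stories that are empty or contain only whitespace
    kept = [s for s in stories if s and s.strip()]
    with open(path, "w", encoding="utf-8") as f:
        f.write(separator.join(kept))
    return len(kept)


def quark_stories():
    """Fetch the stories with the same call the original script uses."""
    # Imported here so write_stories can be tried without appscript installed
    from appscript import app
    return app('QuarkXpress').documents[1].stories()
```

With QuarkXpress running and a document open, `write_stories(quark_stories(), 'text_extracted.txt')` would stand in for the last four lines of the original script.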