I happened to already have a copy of the book, so I unzipped the archive and poked around. Whatever program they're using makes some questionable decisions. I identified two problems. First, I ran the HTML files through a linter and found incorrectly closed link element tags. Second, instead of semantic HTML elements (paragraphs, italics, bold text, etc.), they made everything a span element and styled each one as if it were the semantic element. Very weird ¯\_(ツ)_/¯
All that to say, I think I can write a pretty simple Python program to extract everything. The reason I wanted to reach out is that the post mentions this being true for AK Press books in general, so I figure there's a good chance they're all broken in the same way. If I had a few reference examples I could prove that out, and it would be worth making the script generic rather than a one-off.
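To give a rough idea of what I mean, here's a minimal sketch using only the standard library's `html.parser`, which tolerates badly closed tags. The class names in `CLASS_TO_TAG` and the `convert` helper are hypothetical; the real class names would have to be read out of the book's stylesheet, and I'm assuming all the body text lives inside span elements.

```python
from html.parser import HTMLParser

# Hypothetical mapping from styling classes to the semantic tags they imitate.
# The real class names would have to be taken from the book's CSS.
CLASS_TO_TAG = {"para": "p", "ital": "em", "bold": "strong"}


class SpanToSemantic(HTMLParser):
    """Rewrite styled <span> elements as semantic tags, keeping their text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # output fragments, joined at the end
        self.stack = []  # mapped tag (or None) for each currently open span

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return  # ignore all other markup, including the broken link tags
        classes = (dict(attrs).get("class") or "").split()
        mapped = next((CLASS_TO_TAG[c] for c in classes if c in CLASS_TO_TAG), None)
        self.stack.append(mapped)
        if mapped:
            self.out.append(f"<{mapped}>")

    def handle_endtag(self, tag):
        if tag == "span" and self.stack:
            mapped = self.stack.pop()
            if mapped:
                self.out.append(f"</{mapped}>")

    def handle_data(self, data):
        # Assumption: real content only appears inside spans, so text outside
        # any span (stylesheets, stray whitespace) is dropped.
        if self.stack:
            self.out.append(data)


def convert(html: str) -> str:
    parser = SpanToSemantic()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling `convert()` on each HTML file from the unzipped archive would give back markup with real paragraph and emphasis tags. Spans with an unrecognized class keep their text but get no wrapper, which makes it easy to spot classes missing from the mapping.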
Findings