Legality and PDF help

I would like to upload “Means and Ends: The Revolutionary Practice of Anarchism in Europe and the United States” by Zoe Baker. Libcom.org has the PDF, approved by Zoe (see: https://x.com/anarchopac/status/1812382509079920895).

Can I run into legal trouble with the publishing company by uploading this book, given I don’t have direct permission?

Also, I can’t remember where I read it, but somewhere on The Anarchist Library site it said to ask the forums how to upload PDFs. Is there a preferred method? The PDF is a direct copy of the print book.

Sorry for the formatting, I’m on my phone and paragraph breaks aren’t showing for some reason.

magsalin Sun, 11/03/2024 - 11:51

We have permission from Zoe Baker as well for uploading her book to The Anarchist Library. However, there are some technical issues: the EPUB needs to be converted to MUSE markup, but the irregular HTML syntax used by whatever program AK Press uses to make EPUBs makes the conversion difficult, in effect deleting all emphasis such as italics or bold. So no, Pandoc is insufficient for this project. We need the skills of a good programmer to parse the weird HTML syntax of the EPUB and convert it to MUSE markup. This has been a pending task for a while.

As for AK Press legally, they haven't come after us for the books of theirs we host on the site, but that doesn't mean they couldn't. Uploading to the website is largely anonymous, so the liability lies with The Anarchist Library, not with the uploader.

No, we can't just attach the PDF to a library upload and call it a day. That's not how we do things. That's fine for Libcom, because they have different conventions and standards, but it won't do for us.

There is no reliable way of converting PDFs to MUSE. It is possible to use a modern edition of Microsoft Word to convert the PDF of Zoe's book into a DOCX file and then use either Pandoc or the library's copy-paste-to-MUSE importer to convert the text into MUSE markup. This was how a lot of other books were imported in the past. However, in our current capacity, we are unable to commit to working on such a conversion project. Paragraphing would need to be fixed, as would footnotes. It is time-consuming and labor-intensive. Several other PDF-to-MUSE projects have been pending in the Library's queue for several years now, which shows that we really don't have the capacity for these kinds of conversions anymore.

So that leaves us with two ways of getting Zoe's book onto the library:

  1. EPUB to MUSE conversion: Construct a parsing program to convert AK Press EPUBs into AMuseWiki-ready outputs. Fine-tune and prepare the output for publication.
  2. PDF to DOCX to MUSE conversion: Do the labor-intensive and time-consuming work of converting the PDF to MUSE manually.

One could volunteer for one or the other. Contact us in the library's chatroom for more specific details and advice.

EDIT

I am of the opinion that solution 1 is the better course of action to take. PDF to DOCX is a highly lossy process rife with conversion errors. Meanwhile, the AK Press EPUBs are already neatly, if unorthodoxly, marked up. I believe a good parser could provide a lossless conversion from EPUB to MUSE. The trick, however, is writing the parser well. A task I cannot do.

deadpoet Sat, 05/17/2025 - 00:40
I happened to already have a copy of the book, so I unzipped the archive and poked around. Whatever program they're using makes some questionable decisions. I identified two problems. First, I ran the HTML files through a linter and found there were incorrectly closed link element tags. Second, instead of semantic HTML elements (like paragraphs, italics, bolded characters, etc.) they made everything span elements and styled them as if they were the semantic elements. Very weird ¯\_(ツ)_/¯ All that to say I think I can write a pretty simple Python program to extract everything out. The reason I wanted to reach out is that the post mentions this being true for AK Press books in general, so I figure there's a good chance they're all broken in the same way. If I had a few reference examples I could prove that out, and it would be worth making the script generic rather than a one-off.
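
For anyone picking this up, here is a minimal sketch of the kind of script described above. It is not the actual program and it has not been run against a real AK Press EPUB. The class names (`para`, `italic`, `bold`) are hypothetical placeholders; the real ones would have to be read from the EPUB's stylesheet. It assumes each paragraph is a span wrapping inline spans, and it emits `*italic*` and `**bold**` with blank lines between paragraphs. Python's standard `html.parser` is used because it tolerates stray closing tags such as the badly closed link elements.

```python
from html.parser import HTMLParser

# Hypothetical class names, for illustration only. The real ones would
# come from the EPUB's CSS.
INLINE_MARKERS = {"italic": "*", "bold": "**"}
PARAGRAPH_CLASSES = {"para"}


class SpanToMuse(HTMLParser):
    """Turn span-based pseudo-semantic HTML into Muse-style text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []       # one (kind, marker) entry per open <span>
        self.paragraphs = []  # finished paragraphs
        self.current = []     # text pieces of the paragraph in progress

    def flush(self):
        # Collapse whitespace and store the paragraph if it is not empty.
        text = " ".join("".join(self.current).split())
        if text:
            self.paragraphs.append(text)
        self.current = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return  # other tags (e.g. <link>) are ignored
        classes = (dict(attrs).get("class") or "").split()
        if PARAGRAPH_CLASSES.intersection(classes):
            self.flush()
            self.stack.append(("para", ""))
            return
        # A span that is both bold and italic gets "***".
        marker = "".join(m for c, m in INLINE_MARKERS.items() if c in classes)
        self.stack.append(("inline", marker))
        self.current.append(marker)

    def handle_endtag(self, tag):
        if tag != "span" or not self.stack:
            return  # stray closing tags are skipped, not treated as errors
        kind, marker = self.stack.pop()
        if kind == "para":
            self.flush()
        else:
            self.current.append(marker)

    def handle_data(self, data):
        self.current.append(data)


def html_to_muse(html):
    parser = SpanToMuse()
    parser.feed(html)
    parser.close()
    parser.flush()  # keep any trailing text outside a paragraph span
    return "\n\n".join(parser.paragraphs)


sample = (
    '<link rel="stylesheet" href="style.css"></link>\n'
    '<span class="para">An <span class="italic">emphasised</span> word '
    'and a <span class="bold">strong</span> one.</span>\n'
    '<span class="para">Second paragraph.</span>\n'
)
print(html_to_muse(sample))
```

Since an EPUB is a ZIP archive, the HTML files could be read with the standard `zipfile` module and passed through `html_to_muse` one at a time. Footnotes, headings, and block quotes are not handled here and would each need their own class-to-markup rule.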
