How do I turn a collection of XHTML files into a PDF?
Irelephant@lemm.ee to Piracy: ꜱᴀɪʟ ᴛʜᴇ ʜɪɢʜ ꜱᴇᴀꜱ@lemmy.dbzer0.com · English · 2 days ago · 12 comments
deegeese@sopuli.xyz (2 days ago): There are a ton of options depending on your tech level. How are you with basic Python scripts?
Irelephant@lemm.ee (OP, 2 days ago): I made the script to rip them in Bash. I know Python, Lua, JS, Bash and PowerShell, so anything using these works.
Daniel Quinn@lemmy.ca (2 days ago): I’ve used pdfkit to considerable success. It has a few system-level dependencies, but the instructions are pretty straightforward:
    # apt-get install wkhtmltopdf
    $ pip install pdfkit
deegeese@sopuli.xyz (2 days ago): Surely you can figure out how to use existing libraries for this task, or is there something you’re stuck on?
Irelephant@lemm.ee (OP, 2 days ago): Can’t really find many good ones. Google isn’t returning much, just PDFs about Python libraries and the odd abandoned GitHub repo.
deegeese@sopuli.xyz (2 days ago): I’d start with wkhtmltopdf/pdfkit.
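For anyone landing here later, the wkhtmltopdf/pdfkit suggestion could look roughly like the sketch below. It is only a sketch under some assumptions that are not from the thread: the ripped files all sit in one folder, end in `.xhtml`, and their filenames sort into reading order once the numbers are compared numerically. The helper names (`natural_key`, `collect_pages`, `build_pdf`) and the default `book.pdf` output name are made up for illustration. `pdfkit.from_file` accepts a list of input files, and `enable-local-file-access` is passed because newer wkhtmltopdf builds may otherwise refuse to load local images and stylesheets.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that orders 'ch2.xhtml' before 'ch10.xhtml'."""
    # re.split with a capture group alternates text, digits, text, ...
    # so every odd index holds a run of digits.
    parts = re.split(r"(\d+)", path.name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def collect_pages(folder):
    """Return the .xhtml files in `folder` in natural (human) order."""
    return sorted(Path(folder).glob("*.xhtml"), key=natural_key)


def build_pdf(folder, output="book.pdf"):
    """Render every .xhtml file in `folder` into one PDF via wkhtmltopdf."""
    # Imported here so the helpers above work without pdfkit installed.
    import pdfkit

    pages = [str(p) for p in collect_pages(folder)]
    if not pages:
        raise FileNotFoundError(f"no .xhtml files found in {folder}")

    # Flag options take None as their value in pdfkit's options dict.
    options = {"enable-local-file-access": None}
    pdfkit.from_file(pages, output, options=options)
    return output
```

Calling `build_pdf("ripped_book")` (a hypothetical folder name) should leave a single `book.pdf` in the current directory. If the reading order does not follow the filenames, build the list by hand and pass it to `pdfkit.from_file` directly.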
undefined@lemmy.hogru.ch (2 days ago, edited): In a production web app I use Gotenberg. It’s definitely overkill for the task at hand, but if you find yourself doing this often I would highly recommend it. It’s dead easy to convert HTML (and I imagine XHTML) to PDF.
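A rough idea of what the Gotenberg route might look like from Python, for a single file. The assumptions here are mine, not the commenter's: a Gotenberg instance is already running locally on port 3000, it exposes the Chromium HTML conversion route, and the `requests` package is installed. The function names and the base URL are illustrative. That route expects the main document to be uploaded under the name `index.html`, so the XHTML is sent under that name and will likely be parsed as HTML rather than XML.

```python
from pathlib import Path

# Route assumed for a locally running Gotenberg instance.
CONVERT_ROUTE = "/forms/chromium/convert/html"


def convert_url(base_url):
    """Join the server base URL and the conversion route."""
    return base_url.rstrip("/") + CONVERT_ROUTE


def xhtml_to_pdf(source, output, base_url="http://localhost:3000"):
    """Upload one XHTML file to Gotenberg and save the returned PDF."""
    # Third-party dependency, imported lazily: pip install requests
    import requests

    markup = Path(source).read_bytes()
    # The main document has to be named index.html in the multipart form.
    files = {"files": ("index.html", markup, "text/html")}
    response = requests.post(convert_url(base_url), files=files, timeout=60)
    response.raise_for_status()
    Path(output).write_bytes(response.content)
    return output
```

This produces one PDF per input file, so a whole book would still need the per-file PDFs merged afterwards, which is part of why it is overkill for a one-off job.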