Convert Web Pages into Kindle “Books” (Documents)

The script below accepts a URL as its argument, downloads the HTML, converts it to a .mobi file with kindlegen, and copies the file onto your Kindle. It works on Ubuntu but can be adapted to other environments. It’s written in Perl and requires kindlegen and wget. You can get kindlegen from Amazon’s website, and wget is available from your distribution’s package repository.

The only “trick” it has is reading the document’s title, and using that as the document’s filename. That should help avoid problems with files overwriting each other, to some extent.
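The title-to-filename step can be looked at in isolation. Below is a minimal sketch of it as a standalone function. The name title_to_basename is hypothetical (the script does this inline), and the second fallback to 'index' when nothing survives the cleanup is an added assumption, not something the original script does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): turn a page
# title into a short, filesystem-friendly base name.
sub title_to_basename {
    my ($title) = @_;
    $title = 'index' unless defined $title && length $title;
    $title =~ s/[^a-zA-Z0-9 ]//g;    # keep only ASCII letters, digits and spaces
    $title =~ s/\s/-/g;              # each space becomes a hyphen
    $title = lc $title;
    $title = substr $title, 0, 30;   # cap the name at 30 characters
    # Assumption: also fall back when the cleanup removed every character.
    $title = 'index' unless length $title;
    return $title;
}

print title_to_basename('Hello, World!'), "\n";   # prints "hello-world"
```

Because the name is cut to 30 characters, two pages whose titles share the same first 30 characters still map to the same file, which is why overwriting is only avoided "to some extent".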

$KINDLE is the documents directory on your Kindle. On other Linux distributions, the Kindle may be mounted at a different path. $KINDLEGEN is the path to the kindlegen command.
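If you are unsure where the Kindle is mounted, a short check like the one below can try a few likely locations before you hard-code $KINDLE. This is only a sketch: find_kindle_dir is a hypothetical helper, and the candidate paths other than /media/Kindle/documents are assumptions about where some desktops mount removable drives, so adjust them for your system.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the first candidate that is an existing,
# writable directory, or undef when none qualifies.
sub find_kindle_dir {
    my @candidates = @_;
    for my $dir (@candidates) {
        return $dir if -d $dir && -w $dir;
    }
    return;
}

my $user = $ENV{USER} // 'user';
my $kindle = find_kindle_dir(
    '/media/Kindle/documents',              # path used in this post
    "/media/$user/Kindle/documents",        # assumed per-user mount point
    "/run/media/$user/Kindle/documents",    # assumed mount point on some other distros
);

print defined $kindle ? "Kindle found at $kindle\n" : "Kindle not mounted\n";
```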

#! /usr/bin/perl

$KINDLE = '/media/Kindle/documents';
$KINDLEGEN = '/home/johnk/bin/kindlegen';

use File::Copy;

# The URL to fetch is the first command-line argument.
$url = $ARGV[0] or die "Usage: $0 URL\n";

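# Pass the arguments as a list so the shell never interprets characters such as & or ? in the URL.
system('/usr/bin/wget', '-O', '/tmp/kindle.html', $url) == 0 or die "wget failed: $?";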

# Read the downloaded page into memory.
open(my $fh, '<', '/tmp/kindle.html') or die "Cannot open /tmp/kindle.html: $!";
@lines = <$fh>;
close $fh;

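# Take the title from the first line that contains a <title> tag.
@titles = grep { /<title>/i } @lines;
($text) = @titles ? $titles[0] =~ m#<title>(.+?)</title>#i : ();
# Fall back to a generic name when no title is found.
$text = 'index' if (! $text);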

print "title is $text\n";

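# Build a safe filename: strip punctuation, hyphenate whitespace, lowercase, cap at 30 characters.
$text =~ s/[^a-zA-Z0-9 ]//g;
$text =~ s/\s/-/g;
$text = lc($text);
$text = substr $text,0,30;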

$filename = $text.'.html';
$mobifilename = $text.'.mobi';
print "filename is $filename\n";
print "mobifilename is $mobifilename\n";

rename("/tmp/kindle.html", "/tmp/$filename") or die "Rename failed: $!";

system("$KINDLEGEN /tmp/$filename");

copy("/tmp/$mobifilename", "$KINDLE/$mobifilename") or die "Copy failed: $!";