Ubuntu 5.10, archive.org, and .torrent files
I'm ready to say goodbye to my copy of Ubuntu 5.10 for i386 on CD, after nearly 2 decades of keeping it as a combination keepsake and software supply chain anchor. I donated it to archive.org:
While brainstorming about Merkle trees for file access, I noticed that archive.org not only OCRs the PDF I gave them of the cover and supports browsing the contents of Ubuntu 5.10 i386.iso, but also provides ubuntu-5.10-pc_archive.torrent, which means I can have high-performance access to the full contents of the CDs for just 29k of storage. And Brave supports .torrent files natively with WebTorrent (the WebTorrent Tutorial looks pretty straightforward).
But what's in that .torrent file? Aha! bencode from BEP 3! I've heard of it in OCapN discussion but didn't realize it comes from BitTorrent. BitTorrent bencode format tools is really handy, including stopping in a debugger to see the details.
BCode.hs from haskell-torrent has a crisp specification:
-- | BCode represents the structure of a bencoded file
data BCode = BInt Integer                       -- ^ An integer
           | BString B.ByteString               -- ^ A string of bytes
           | BArray [BCode]                     -- ^ An array
           | BDict (M.Map B.ByteString BCode)   -- ^ A key, value map
  deriving (Show, Eq)
...
-- | Return the hash of the info-dict in a torrent file
hashInfoDict :: BCode -> IO Digest
hashInfoDict bc =
    do ih <- case info bc of
                Nothing -> fail "Could not find infoHash"
                Just x -> return x
       let encoded = encode ih
       digest $ L.fromChunks $ [encoded]
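To make the four BCode cases concrete, here is a minimal decoder sketch. It is written in Python rather than Haskell purely for illustration and is not taken from haskell-torrent; the name `bdecode` is hypothetical. It maps BInt, BString, BArray, and BDict onto Python's int, bytes, list, and dict, and it skips the strictness checks a real parser would probably want (rejecting leading zeros, enforcing sorted keys).

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                       # BInt: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                       # BArray: l<item>...e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                       # BDict: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():                     # BString: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    # Also reached on truncated input, where lead is empty
    raise ValueError(f"unexpected byte at offset {pos}")
```

Calling `bdecode(open(path, "rb").read())` on a .torrent file should return a dict whose keys are byte strings such as `b"announce"` and `b"info"`, plus the offset where parsing stopped.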
Playing with parse-torrent in a parse-torrent-ubuntu-5.10 project on StackBlitz is handy in that it shows the infoHash, b890d2e1174a809d1cd0437de30400c542e0a939, but its JSON output misled me about the real structure: there is no infoHash key in the file; there's an info dictionary whose bencoded form gets hashed with SHA-1 to produce it.
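The relationship between the info dictionary and the infoHash can be sketched as follows. This is an illustration under assumptions, not parse-torrent's implementation: `bencode` and `info_hash` are hypothetical names, the input is an already-decoded dict with byte-string keys, and the `toy` metainfo is invented sample data. Re-encoding only reproduces the original bytes if the file was canonically encoded (keys sorted, no extra padding), which is what BEP 3 asks for.

```python
import hashlib


def bencode(value) -> bytes:
    """Encode ints, bytes, lists, and dicts (with bytes keys) as bencode."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # BEP 3 requires dictionary keys in sorted order (as raw bytes)
        body = b"".join(bencode(k) + bencode(value[k]) for k in sorted(value))
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(metainfo: dict) -> str:
    """SHA-1 of the bencoded info dictionary, as 40 hex characters."""
    return hashlib.sha1(bencode(metainfo[b"info"])).hexdigest()


# Invented sample data: note there is no infoHash key anywhere in it
toy = {
    b"announce": b"http://tracker.example/announce",
    b"info": {
        b"name": b"demo.iso",
        b"length": 5,
        b"piece length": 16384,
        b"pieces": b"\x00" * 20,
    },
}
print(info_hash(toy))
```

Because only the info dictionary is hashed, changing the announce URL or other top-level keys leaves the infoHash unchanged.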
Say... Ubuntu offers BitTorrent as a download option; maybe they keep a 5.10 .torrent file around? I didn't find one from them, but I did find:
- Ubuntu 5.10 (Breezy Badger) : Canonical Ltd., Ubuntu community : Internet Archive
  Source: torrent:urn:sha1:329a357ebd51db73417e1ad767b041291f598ae8
  Addeddate: 2017-06-20 14:16:31
  Identifier: Ubuntu-5.10
Note the Source; yes, Internet Archive ingests BitTorrents.
Somehow my Ubuntu 5.10 i386.iso is 632,262 KB, which is 300 KB larger than theirs (631,962 KB). Maybe some unused space captured by gnome-disk-utility when I ripped the CD?
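A quick back-of-the-envelope check on that difference, assuming the sizes are in units of 1024 bytes and that the image uses the usual 2048 data bytes per CD-ROM sector (both are assumptions, not something the file listings confirm):

```python
SECTOR_BYTES = 2048           # data bytes per sector in a typical ISO 9660 image
SECTORS_PER_SECOND = 75       # CD sectors read per second at 1x speed

mine_kib, theirs_kib = 632_262, 631_962
extra_bytes = (mine_kib - theirs_kib) * 1024   # assumes 1 KB = 1024 bytes

extra_sectors, leftover = divmod(extra_bytes, SECTOR_BYTES)
extra_seconds = extra_sectors / SECTORS_PER_SECOND
print(extra_sectors, leftover, extra_seconds)
```

Under those assumptions the extra data is a whole number of sectors, 150 of them, or two seconds of disc. That matches the customary two-second gap length on CDs, which is at least consistent with the rip having captured trailing padding rather than different file contents, though it does not prove it.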