Wednesday 26 July 2017
Let's Never Create An Ad-Hoc Text Format Again
Recently I needed code to store a small amount of data in a file. Instinctively I started doing what I've always done before, which is create a trivial custom text-based format using stdio or C++ streams. But at that moment I had an epiphany: since I was using Rust, it would actually be more convenient to use the serde library. I put the data in a custom struct (EpochManifest), added #[derive(Serialize, Deserialize)] to EpochManifest, and then just had to write:
    let f = File::create(manifest_path).expect("Can't create manifest");
    serde_json::to_writer_pretty(f, &manifest).unwrap();
and
    let f = File::open(&p).expect(&format!("opening {:?}", &p));
    let manifest = serde_json::from_reader(f).unwrap();
This is more convenient than hand-writing even the most trivial text (un)parser. It's almost guaranteed to be correct. It's more robust and maintainable. If I decided to give up on human readability in exchange for smaller size and faster I/O, it would only take a couple of changed lines to switch to bincode's compact binary encoding. It prevents the classic trap where the stored data grows in complexity and an originally simple ad-hoc text format evolves into a baroque monstrosity.
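For contrast, here is a rough sketch of what even a trivial hand-written format tends to involve, using only the standard library. The fields `epoch` and `name` are hypothetical (the real contents of EpochManifest aren't shown here), and the `key=value` layout and the helper names `to_text` and `from_text` are likewise just assumptions for illustration.

```rust
// Hypothetical manifest; the real EpochManifest fields are not shown in the post.
#[derive(Debug, PartialEq)]
struct EpochManifest {
    epoch: u64,
    name: String,
}

// Hand-written "unparser": one `key=value` pair per line.
fn to_text(m: &EpochManifest) -> String {
    format!("epoch={}\nname={}\n", m.epoch, m.name)
}

// Hand-written parser for the same format. Every new field needs another
// match arm, and a newline inside `name` breaks the round trip.
fn from_text(s: &str) -> Result<EpochManifest, String> {
    let mut epoch = None;
    let mut name = None;
    for line in s.lines() {
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("missing '=' in line {:?}", line))?;
        match key {
            "epoch" => {
                epoch = Some(value.parse::<u64>().map_err(|e| e.to_string())?);
            }
            "name" => name = Some(value.to_string()),
            other => return Err(format!("unknown key {:?}", other)),
        }
    }
    Ok(EpochManifest {
        epoch: epoch.ok_or("missing epoch")?,
        name: name.ok_or("missing name")?,
    })
}
```

Even at two fields this is already more code than the serde version, and it has no story for escaping, nesting, or optional fields, which is exactly where ad-hoc formats start to grow.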
There are libraries to do this sort of thing in C/C++ but I've never used them, perhaps because importing a foreign library and managing that dependency is a significant amount of work in C/C++, whereas cargo makes it trivial in Rust. Perhaps that's why the ISO C++ wiki page on serialization provides lots of gory details about how to implement serialization rather than just telling you to use a library.
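To make the "trivial" part concrete: with current serde releases, pulling in the library typically amounts to a few lines in Cargo.toml. This is a sketch that assumes the 1.x release lines and the `derive` feature; exact versions and feature names may differ for a given project or for the serde of 2017.

```toml
[dependencies]
# "derive" enables #[derive(Serialize, Deserialize)]
serde = { version = "1", features = ["derive"] }
serde_json = "1"
```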
As long as I get to keep using Rust I should never create an ad-hoc text format again.