I am implementing JSON export functionality in my project. Previously, I built the JSON in memory (by storing the data in structs) and wrote it to the target file using serde_json::to_string(data). But now the data is getting too big to fit in memory, so I am using database streaming to process one row at a time.

The JSON I want to generate has a structure like this:

{
   "measurements": [
      { "timestamp": "2023-10-23", "value": 123.54 },
      { "timestamp": "2023-10-24", "value": 123.54 },
      // a lot more similar entries
   ],   
   "media": [
      { "name": "Ugly Love", "author": "Colleen Hoover" },
      { "name": "Harry Potter", "author": "JK Rowling" },
      // a lot more similar entries
   ]
}

How can I stream this data into a json file as it comes from the database?

2 Answers


  1. Streaming a large amount of JSON from a database into a file without holding it all in memory takes some care, particularly when the data is nested as in your case. You have to write the JSON to the file incrementally as rows arrive from the database, while emitting the surrounding brackets, member names, and commas yourself so that the output stays valid JSON.

    Here is an example in Rust, assuming you have a database function called stream_from_db that yields rows, and using serde_json::to_string to serialize each row:

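    use std::fs::File;
    use std::io::{BufWriter, Write};

    let file = File::create("output.json").unwrap();
    let mut writer = BufWriter::new(file);

    // Open the top-level object and the first array.
    // The quotes inside the byte string literal must be escaped.
    writer.write_all(b"{\"measurements\": [").unwrap();

    let mut first = true;
    for measurement in stream_from_db("measurements") {
        // Emit a separator before every element except the first.
        if !first {
            writer.write_all(b", ").unwrap();
        }
        first = false;

        // Only the current row is serialized, so memory use stays small.
        let json = serde_json::to_string(&measurement).unwrap();
        writer.write_all(json.as_bytes()).unwrap();
    }

    // Close the first array and open the second one.
    writer.write_all(b"], \"media\": [").unwrap();

    let mut first = true;
    for media in stream_from_db("media") {
        if !first {
            writer.write_all(b", ").unwrap();
        }
        first = false;

        let json = serde_json::to_string(&media).unwrap();
        writer.write_all(json.as_bytes()).unwrap();
    }

    // Close the second array and the top-level object.
    writer.write_all(b"]}").unwrap();
    // Flush explicitly: errors are ignored if the BufWriter is only flushed on drop.
    writer.flush().unwrap();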
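
    The comma handling above is repeated for each array, so it can be factored into a small helper. The following is a minimal sketch using only the standard library and is not part of the original answer: write_json_array and its write_item callback are hypothetical names, and in a real export the callback would typically serialize one row, for example with serde_json::to_string as above.

    ```rust
    use std::io::{self, Write};

    /// Writes `items` as a single JSON array without collecting them first.
    /// A comma is emitted before every element except the first one.
    /// `write_item` (hypothetical callback) must write exactly one valid JSON value.
    fn write_json_array<W, I, F>(writer: &mut W, items: I, mut write_item: F) -> io::Result<()>
    where
        W: Write,
        I: IntoIterator,
        F: FnMut(&mut W, I::Item) -> io::Result<()>,
    {
        writer.write_all(b"[")?;
        for (index, item) in items.into_iter().enumerate() {
            if index > 0 {
                writer.write_all(b",")?;
            }
            write_item(writer, item)?;
        }
        writer.write_all(b"]")
    }
    ```

    Because the helper accepts any Write implementation, it can be pointed at the BufWriter from the example, or at a Vec<u8> when testing the output.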
    
  2. If the enclosing JSON data is as simple as shown in your example and does not include dynamic member names, then the simplest solution is probably to manually construct parts of the JSON data, as shown in the other answer.
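
    Dynamic member names matter because a hand-written name must itself be a valid JSON string: quotation marks, backslashes, and control characters (U+0000 through U+001F) have to be escaped. As a rough sketch of what manual construction would then involve, the hypothetical helper json_quote below (standard library only, not taken from any crate) applies those escapes:

    ```rust
    /// Returns `name` as a quoted JSON string.
    /// Escapes the characters that JSON does not allow unescaped inside strings:
    /// the quotation mark, the backslash, and control characters below U+0020.
    fn json_quote(name: &str) -> String {
        let mut out = String::with_capacity(name.len() + 2);
        out.push('"');
        for c in name.chars() {
            match c {
                '"' => out.push_str("\\\""),
                '\\' => out.push_str("\\\\"),
                '\n' => out.push_str("\\n"),
                '\r' => out.push_str("\\r"),
                '\t' => out.push_str("\\t"),
                // Remaining control characters use the generic \uXXXX form.
                c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
                c => out.push(c),
            }
        }
        out.push('"');
        out
    }
    ```

    Every dynamic name would have to pass through such a function before being written, which is the kind of bookkeeping a JSON writer library handles for you.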

    However, if the enclosing JSON data is more complex or contains dynamic member names which might have to be escaped, then you could use the Struson library and its optional Serde integration for this.

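    use std::fs::File;
    // The writer types live in the `writer` module, not `reader`.
    // `serialize_value` needs Struson's optional `serde` feature to be enabled.
    use struson::writer::*;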
    
    let file = File::create("data.json")?;
    let mut json_writer = JsonStreamWriter::new(file);
    
    json_writer.begin_object()?;
    
    json_writer.name("measurements")?;
    json_writer.begin_array()?;
    while let Some(item) = measurements_stream.try_next().await? {
        json_writer.serialize_value(&item)?;
    }
    json_writer.end_array()?;
    
    json_writer.name("media")?;
    json_writer.begin_array()?;
    while let Some(item) = media_stream.try_next().await? {
        json_writer.serialize_value(&item)?;
    }
    json_writer.end_array()?;
    
    json_writer.end_object()?;
    json_writer.finish_document()?;
    

    Or alternatively using the experimental ‘simple API’:

    use std::fs::File;
    use struson::writer::simple::*;
    
    let file = File::create("data.json")?;
    let json_writer = SimpleJsonWriter::new(file);
    
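    // Note: the closures below are not async, so `.await` does not compile inside them
    // as written. Treat the loops as schematic, or fetch the rows in a blocking way.
    json_writer.write_object(|object| {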
        object.write_array_member("measurements", |array| {
            while let Some(item) = measurements_stream.try_next().await? {
                array.write_serialize(&item)?;
            }
            Ok(())
        })?;
    
        object.write_array_member("media", |array| {
            while let Some(item) = media_stream.try_next().await? {
                array.write_serialize(&item)?;
            }
            Ok(())
        })
    })?;
    

    (This is written for Struson version 0.4.0)


    Disclaimer: I am the author of Struson, and currently it is still experimental (but feedback is highly appreciated!).
