
Is there a way to parse partial/incomplete JSON strings into map/interface values? For example: {"name": "foo becomes map[string]any{"name": "foo"}

The need for this arises when using OpenAI’s chat completion API with the stream option. If you request JSON responses through functions (or tools), you get back SSE messages like the ones below:

{"f
iel
d":
"val
ue"}

This means the messages have to be concatenated on the consumer side, and the result is a valid JSON string only once the stream finishes, which makes it impossible to update the UI as the response comes in.
There is a library for this in JS and also in Python, but I could not yet find one for Go. Any ideas are appreciated!

2 Answers


  1. You can declare a struct to contain a json.RawMessage, for example:

    var jsonStruct struct {
        Field1 string `json:"field1"`
        Field2 int    `json:"field2"`
        json.RawMessage
    }
    

    and only parse the fields that are already complete (or received):

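    // get decodes the raw value stored under key s into to, which must be a
    // pointer, and removes the key from m so only unparsed fields remain.
    func get(to interface{}, m map[string]json.RawMessage, s string) {
        raw, ok := m[s]
        if !ok {
            return // field not received yet
        }
        if err := json.Unmarshal(raw, to); err != nil {
            panic(err)
        }
        delete(m, s)
    }
    
    get(&jsonStruct.Field1, m, "field1")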
    

    and then marshal the remaining JSON back into the RawMessage:

    // m now holds only the fields that were not consumed by get.
    var err error
    jsonStruct.RawMessage, err = json.Marshal(m)
    if err != nil {
        panic(err)
    }
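
For reference, the pieces of this first answer can be put together into one runnable sketch. The names Message, Rest, take and parse are made up for illustration (Rest stands in for the embedded json.RawMessage), and the map m is assumed to come from unmarshalling the top-level object into a map[string]json.RawMessage. Note that json.Unmarshal rejects truncated input, so this only works on text that is already a complete JSON object.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is a hypothetical target type. Rest keeps every top-level field
// that was not decoded explicitly, still in encoded form.
type Message struct {
	Field1 string
	Field2 int
	Rest   json.RawMessage
}

// take decodes m[key] into dst (a pointer) and deletes the key from m.
// A missing key is not an error; dst is simply left untouched.
func take(dst any, m map[string]json.RawMessage, key string) error {
	raw, ok := m[key]
	if !ok {
		return nil
	}
	if err := json.Unmarshal(raw, dst); err != nil {
		return err
	}
	delete(m, key)
	return nil
}

// parse splits a complete JSON object into known fields and a raw remainder.
func parse(data []byte) (Message, error) {
	var msg Message

	// First pass: split the object into raw, still-encoded values.
	m := map[string]json.RawMessage{}
	if err := json.Unmarshal(data, &m); err != nil {
		return msg, err
	}
	if err := take(&msg.Field1, m, "field1"); err != nil {
		return msg, err
	}
	if err := take(&msg.Field2, m, "field2"); err != nil {
		return msg, err
	}

	// Re-encode whatever was not consumed above.
	rest, err := json.Marshal(m)
	if err != nil {
		return msg, err
	}
	msg.Rest = rest
	return msg, nil
}

func main() {
	msg, err := parse([]byte(`{"field1":"a","field2":2,"extra":[1,2]}`))
	fmt.Println(msg.Field1, msg.Field2, string(msg.Rest), err)
}
```

Returning errors instead of panicking lets the caller decide what to do with a payload that is not yet complete.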
    
  2. The exact parsing you want isn’t possible, because who says {"name": "foo is {"name": "foo" ... and not {"name": "foobar" ...?

    As mentioned by coxley in comments, you can use json.Decoder.Token() to iterate over the JSON stream. The jsoniter package provides a wrapper around it (and the underlying code shows how you could do this yourself as well – it isn’t that hard). You could also look into something like bytedance/sonic which provides a whole bunch of JSON encoding/decoding tools, though that may well be overkill.

    The most basic variant, which parses each top-level node one at a time, would look something like this:

    
    package main
    
    import (
        "bytes"
        "encoding/json"
        "io"
        "log"
    )
    
    func main() {
        // Get some io.Reader, this could be a streaming buffer
        r := getDataReader()
        dec := json.NewDecoder(r)
    
        // Read opening { - needed to "enter" the root level
        if _, err := dec.Token(); err != nil {
            log.Fatalf("error: %s", err)
        }
    
        // Loop over the root level
        for dec.More() {
            key, err := dec.Token()
            if err != nil {
                log.Fatalf("error: %s", err)
            }
    
            // Read the content of the current key
            // Important: if the content of the current key isn't read like this, and there isn't any recursion,
            // dec.More() will move into the data structure, but it stops as soon as it reaches the end of the first node that
            // is a simple value
            var val any
            err = dec.Decode(&val)
            if err != nil {
                log.Fatalf("error: %s", err)
            }
    
            log.Printf("%s => %#v", key, val)
        }
    }
    
    func getDataReader() io.Reader {
        data := `{
            "first": {
                "name": "The First Thing",
                "count": 1
            },
            "second": {
                "name": "The 2d Thing",
                "count": 2
            }
    }`
    
        buf := bytes.NewBufferString(data)
    
        return buf
    }
    
    

    This will do what you’ve described in the question. If your needs are more complicated (e.g. you need to get chunks at a deeper level), you’ll need to add some recursion. jsoniter shows a nice and simple way to do that (or just use that library).

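
If a best-effort reading of the partial text is acceptable for updating a UI, one possible approach is to close whatever string, arrays and objects are still open and then hand the result to json.Unmarshal. The sketch below is an illustration of that idea only, with made-up names (completeJSON, parsePartial). It is not how any particular JS or Python library works, and it is subject to the ambiguity pointed out above: a value such as "foo" may turn into "foobar" once more chunks arrive. Dangling keys, trailing commas and half-written literals are not repaired; in those cases Unmarshal returns an error and the caller can keep the previous snapshot.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// completeJSON appends the closing quote, brackets and braces needed to
// balance a truncated JSON prefix. It does nothing else to the text.
func completeJSON(partial string) string {
	var stack []byte // closers still owed, innermost last
	inString, escaped := false, false

	for i := 0; i < len(partial); i++ {
		c := partial[i]
		switch {
		case escaped:
			escaped = false // this byte belongs to an escape sequence
		case inString:
			if c == '\\' {
				escaped = true
			} else if c == '"' {
				inString = false
			}
		case c == '"':
			inString = true
		case c == '{':
			stack = append(stack, '}')
		case c == '[':
			stack = append(stack, ']')
		case c == '}' || c == ']':
			if len(stack) > 0 {
				stack = stack[:len(stack)-1]
			}
		}
	}

	out := []byte(partial)
	if escaped {
		out = out[:len(out)-1] // drop a trailing lone backslash
	}
	if inString {
		out = append(out, '"')
	}
	for i := len(stack) - 1; i >= 0; i-- {
		out = append(out, stack[i])
	}
	return string(out)
}

// parsePartial returns a best-effort value for a truncated JSON document,
// or an error when balancing alone does not yield valid JSON.
func parsePartial(partial string) (any, error) {
	var v any
	err := json.Unmarshal([]byte(completeJSON(partial)), &v)
	return v, err
}

func main() {
	v, err := parsePartial(`{"name": "foo`)
	fmt.Println(v, err)
}
```

Calling parsePartial on the concatenated text after every SSE chunk gives a value that grows over time, at the price of re-parsing from the start each time.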