
I have the following unnormalized CSV file:

user_id,nickname,joinDate,product_id,price
1,kmh,2023-07-24,P131,3000
1,kmh,2023-07-24,P132,4000
1,kmh,2023-07-24,P133,7000
1,kmh,2023-07-24,P134,9000
2,john,2023-07-24,P135,2500
2,john,2023-07-24,P136,6000
3,alice,2023-07-25,P137,4500
3,alice,2023-07-25,P138,8000

I want to convert it to the following JSON format (or an equivalent Java object):

[
    {
        "user_id": 1,
        "nickname": "kmh",
        "joinDate": "2023-07-24",
        "orders": [
            {
                "product_id": "P131",
                "price": 3000
            },
            {
                "product_id": "P132",
                "price": 4000
            },
            {
                "product_id": "P133",
                "price": 7000
            },
            {
                "product_id": "P134",
                "price": 9000
            }
        ]
    },
    {
        "user_id": 2,
        "nickname": "john",
        "joinDate": "2023-07-24",
        "orders": [
            {
                "product_id": "P135",
                "price": 2500
            },
            {
                "product_id": "P136",
                "price": 6000
            }
        ]
    },
    {
        "user_id": 3,
        "nickname": "alice",
        "joinDate": "2023-07-25",
        "orders": [
            {
                "product_id": "P137",
                "price": 4500
            },
            {
                "product_id": "P138",
                "price": 8000
            }
        ]
    }
]

I’ve been searching for quite a long time and haven’t found a library or tool that does this.

I have many different kinds of CSV files, so I need a tool or library that can handle all of them. Are there any libraries or tools that make this possible?

2 Answers


  1. All you need is a way to parse the CSV to a Java object. You can do this manually, or by using an existing library.

    For example, you can use Jackson with the CSV data format like this:

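    import com.fasterxml.jackson.annotation.JsonProperty;
    import com.fasterxml.jackson.databind.MappingIterator;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.dataformat.csv.CsvMapper;
    import com.fasterxml.jackson.dataformat.csv.CsvSchema;
    import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;
    import java.io.IOException;
    import java.time.LocalDate;

    class MyRecord {
    
        @JsonProperty("user_id")
        private int userId;
        
        private String nickname;
        
        private LocalDate joinDate;
    
        @JsonProperty("product_id")
        private String productId;
    
        // needed for the "price" column; it also appears in the printed output
        private int price;
    
        // getters and setters
    
        // a meaningful toString method
    
    }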
    
    public class Main {
    
        public static void main(String[] args) throws IOException {
            String csv = "user_id,nickname,joinDate,product_id,price\n" +
                    "1,kmh,2023-07-24,P131,3000\n" +
                    "1,kmh,2023-07-24,P132,4000\n" +
                    "1,kmh,2023-07-24,P133,7000\n" +
                    "1,kmh,2023-07-24,P134,9000\n" +
                    "2,john,2023-07-24,P135,2500\n" +
                    "2,john,2023-07-24,P136,6000\n" +
                    "3,alice,2023-07-25,P137,4500\n" +
                    "3,alice,2023-07-25,P138,8000";
    
            CsvSchema schema = CsvSchema.emptySchema().withHeader(); // uses CSV header to read the schema
            ObjectMapper mapper = new CsvMapper().registerModule(new JavaTimeModule()); // to deserialise Java 8 LocalDate
            MappingIterator<MyRecord> resultIterator = mapper.readerFor(MyRecord.class).with(schema).readValues(csv);
            while (resultIterator.hasNext()) {
                System.out.println(resultIterator.next());
            }
            resultIterator.close();
        }
    } 
    

    Which prints:

    MyRecord[userId=1, nickname='kmh', joinDate=2023-07-24, productId='P131', price=3000]
    MyRecord[userId=1, nickname='kmh', joinDate=2023-07-24, productId='P132', price=4000]
    MyRecord[userId=1, nickname='kmh', joinDate=2023-07-24, productId='P133', price=7000]
    MyRecord[userId=1, nickname='kmh', joinDate=2023-07-24, productId='P134', price=9000]
    MyRecord[userId=2, nickname='john', joinDate=2023-07-24, productId='P135', price=2500]
    MyRecord[userId=2, nickname='john', joinDate=2023-07-24, productId='P136', price=6000]
    MyRecord[userId=3, nickname='alice', joinDate=2023-07-25, productId='P137', price=4500]
    MyRecord[userId=3, nickname='alice', joinDate=2023-07-25, productId='P138', price=8000]
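
The flat records printed above still have to be grouped per user to reach the nested structure in the question. The following is a minimal sketch of that step using only the standard library (Java 16+ for records). The `FlatRow`, `Order`, and `User` types and the `group` method are hypothetical names introduced for this illustration, and the sketch assumes that `nickname` and `joinDate` are the same on every row of a given `user_id`, as they are in the sample data.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupOrdersSketch {

    // Hypothetical types for this sketch; they are not part of the answer above.
    record FlatRow(int userId, String nickname, LocalDate joinDate, String productId, int price) {}
    record Order(String productId, int price) {}
    record User(int userId, String nickname, LocalDate joinDate, List<Order> orders) {}

    // Groups flat rows by userId, keeping the first-seen order of users and of their rows.
    // Assumption: nickname and joinDate do not vary between rows of the same user.
    static List<User> group(List<FlatRow> rows) {
        Map<Integer, User> byUser = new LinkedHashMap<>();
        for (FlatRow row : rows) {
            User user = byUser.computeIfAbsent(row.userId(),
                    id -> new User(id, row.nickname(), row.joinDate(), new ArrayList<>()));
            user.orders().add(new Order(row.productId(), row.price()));
        }
        return new ArrayList<>(byUser.values());
    }

    public static void main(String[] args) {
        List<FlatRow> rows = List.of(
                new FlatRow(1, "kmh", LocalDate.parse("2023-07-24"), "P131", 3000),
                new FlatRow(1, "kmh", LocalDate.parse("2023-07-24"), "P132", 4000),
                new FlatRow(2, "john", LocalDate.parse("2023-07-24"), "P135", 2500));
        for (User user : group(rows)) {
            System.out.println(user);
        }
    }
}
```

A `LinkedHashMap` is used so that users come out in the order they first appear in the CSV. The resulting `List<User>` could then be handed to a JSON serializer of your choice.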
    
  2. For your case, you can deserialize the CSV directly into JsonNode objects without creating a POJO class, and then use the JSON library Josson to reshape the result with its group() function.

    String csv = "user_id,nickname,joinDate,product_id,price\n" +
            "1,kmh,2023-07-24,P131,3000\n" +
            "1,kmh,2023-07-24,P132,4000\n" +
            "1,kmh,2023-07-24,P133,7000\n" +
            "1,kmh,2023-07-24,P134,9000\n" +
            "2,john,2023-07-24,P135,2500\n" +
            "2,john,2023-07-24,P136,6000\n" +
            "3,alice,2023-07-25,P137,4500\n" +
            "3,alice,2023-07-25,P138,8000";
    ArrayNode arrayNode = Josson.createArrayNode();
    CsvSchema schema = CsvSchema.emptySchema().withHeader();
    try (MappingIterator<JsonNode> it = new CsvMapper().readerFor(JsonNode.class).with(schema).readValues(csv)) {
        arrayNode.addAll(it.readAll());
    }
    Josson josson = Josson.create(arrayNode);
    JsonNode grouped = josson.getNode(
            "group(map(user_id, nickname, joinDate), map(product_id, price))" +
            ".map(**:key, orders:elements)");
    System.out.println(grouped.toPrettyString());
    

    Function group()

    1. Group by "key" of {user_id, nickname, joinDate}
    2. With "elements" of {product_id, price}

    Function map()

    1. Extract values inside object "key"
    2. Add field "orders" copy from "elements"

    Output

    [ {
      "user_id" : "1",
      "nickname" : "kmh",
      "joinDate" : "2023-07-24",
      "orders" : [ {
        "product_id" : "P131",
        "price" : "3000"
      }, {
        "product_id" : "P132",
        "price" : "4000"
      }, {
        "product_id" : "P133",
        "price" : "7000"
      }, {
        "product_id" : "P134",
        "price" : "9000"
      } ]
    }, {
      "user_id" : "2",
      "nickname" : "john",
      "joinDate" : "2023-07-24",
      "orders" : [ {
        "product_id" : "P135",
        "price" : "2500"
      }, {
        "product_id" : "P136",
        "price" : "6000"
      } ]
    }, {
      "user_id" : "3",
      "nickname" : "alice",
      "joinDate" : "2023-07-25",
      "orders" : [ {
        "product_id" : "P137",
        "price" : "4500"
      }, {
        "product_id" : "P138",
        "price" : "8000"
      } ]
    } ]
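
Note that every value in the output above is a string ("1", "3000"), whereas the question asks for numeric `user_id` and `price`. As a rough illustration of the same group-by-key, collect-elements idea with numeric values preserved, here is a sketch that uses only the standard library and is driven by column names, so it can be reused for CSV files with different headers. The class and method names (`GenericCsvGrouping`, `group`, `pick`, `coerce`) are hypothetical, and the parsing is deliberately naive: it assumes no quoted fields, no embedded commas, and that every row has as many cells as the header.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GenericCsvGrouping {

    // Keeps integer-looking values as numbers; everything else stays a String.
    static Object coerce(String value) {
        return value.matches("-?\\d{1,18}") ? (Object) Long.valueOf(value) : value;
    }

    // Picks the requested columns out of one CSV row, in the order requested.
    static Map<String, Object> pick(List<String> header, String[] cells, List<String> columns) {
        Map<String, Object> picked = new LinkedHashMap<>();
        for (String column : columns) {
            picked.put(column, coerce(cells[header.indexOf(column)]));
        }
        return picked;
    }

    // Groups rows by keyColumns; elementColumns are collected into a list stored under childName.
    static List<Map<String, Object>> group(String csv, List<String> keyColumns,
                                           List<String> elementColumns, String childName) {
        String[] lines = csv.strip().split("\\R");
        List<String> header = Arrays.asList(lines[0].split(","));
        Map<Map<String, Object>, List<Map<String, Object>>> groups = new LinkedHashMap<>();
        for (int i = 1; i < lines.length; i++) {
            String[] cells = lines[i].split(","); // naive: no quoted fields
            groups.computeIfAbsent(pick(header, cells, keyColumns), k -> new ArrayList<>())
                  .add(pick(header, cells, elementColumns));
        }
        List<Map<String, Object>> result = new ArrayList<>();
        groups.forEach((key, elements) -> {
            Map<String, Object> parent = new LinkedHashMap<>(key); // copy the key fields
            parent.put(childName, elements);                       // attach the grouped rows
            result.add(parent);
        });
        return result;
    }

    public static void main(String[] args) {
        String csv = "user_id,nickname,joinDate,product_id,price\n"
                + "1,kmh,2023-07-24,P131,3000\n"
                + "1,kmh,2023-07-24,P132,4000\n"
                + "2,john,2023-07-24,P135,2500";
        System.out.println(group(csv,
                List.of("user_id", "nickname", "joinDate"),
                List.of("product_id", "price"),
                "orders"));
    }
}
```

For real input, a proper CSV parser should replace the `split(",")` call; the grouping logic itself would stay the same.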
    