You’re working on an e-commerce app with 10M+ mobile DAUs. The home feed endpoint returns product cards as JSON, but cellular users in emerging markets are sensitive to payload size: every extra kilobyte increases P95 load time and hurts conversion.
To reduce payload, your backend team proposes a columnar JSON format: instead of repeating the same keys for every item, you send a single fields array and a rows matrix of values. Additionally, to further shrink the payload, you must deduplicate repeated string values (like brand names) into a dict array and replace them with integer indices.
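To make the motivation concrete, the following sketch compares the minified JSON size of a row-oriented payload with a hand-written columnar equivalent. The data (1,000 cards sharing one brand and one title) and the helper name payload_size are assumptions made for illustration, not part of the problem.

```python
import json


def payload_size(obj) -> int:
    """Size in bytes of the minified JSON encoding of obj (hypothetical helper)."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


# Hypothetical feed: 1,000 cards that all share one brand and one title.
row_oriented = [{"id": i, "brand": "Acme", "title": "Socks"} for i in range(1000)]

# The same data in columnar form, written out by hand:
# keys appear once in "fields", strings once in "dict".
columnar = {
    "fields": ["id", "brand", "title"],
    "dict": ["Acme", "Socks"],
    "rows": [[i, 0, 1] for i in range(1000)],
}

print(payload_size(row_oriented), payload_size(columnar))
```

On very small inputs the fixed overhead of the fields and dict arrays can outweigh the savings, so the benefit shows up mainly on long lists with repeated keys and values.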
Implement a function that converts a list of JSON-like objects into a compact representation.
Given items (a list of dictionaries) and fields (the ordered list of keys to include), produce a dictionary with:
- fields: exactly the input fields.
- dict: an array of unique strings encountered in the selected fields, in order of first appearance.
- rows: a list of rows, one per item, where each row is a list aligned to fields:
  - A string value is replaced by its index in dict.
  - A missing key is represented as None.
  - Only keys listed in fields are considered (ignore other keys).
  - Keep True/False as booleans (do not convert to strings).
Input:
fields = ["id", "brand", "title"]
items = [{"id": 1, "brand": "Acme", "title": "Socks"}, {"id": 2, "brand": "Acme", "title": "Hat"}]
Output:
{"fields": ["id","brand","title"], "dict": ["Acme","Socks","Hat"], "rows": [[1,0,1],[2,0,2]]}
Explanation:
"Acme" repeats, so it appears once in dict and is referenced by index 0 in both rows.
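One way to assign indices in order of first appearance is to keep a mapping from each string to the position it was first seen at. The sketch below traces this for the strings of the example above; the variable names are illustrative assumptions.

```python
# Strings in the order they are encountered while scanning the rows.
seen_in_order = ["Acme", "Socks", "Acme", "Hat"]

index = {}  # string -> position of its first appearance
for s in seen_in_order:
    # setdefault only inserts when s is new; len(index) is evaluated
    # before the insertion, so it equals the next free position.
    index.setdefault(s, len(index))

# Python dicts preserve insertion order, so the keys form the dict array.
dict_array = list(index)
print(index, dict_array)
```

The repeated "Acme" does not get a new entry, which is why both rows reference index 0.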
Input:
fields = ["id", "brand", "price", "in_stock"]
items = [{"id": 7, "brand": "Zen", "price": 1999}, {"id": 8, "price": 2499, "in_stock": False}]
Output:
{"fields": ["id","brand","price","in_stock"], "dict": ["Zen"], "rows": [[7,0,1999,null],[8,null,2499,false]]}
Explanation:
The first item is missing in_stock and the second item is missing brand, so None (shown as null in the JSON output) is placed in those positions.
1 <= len(items) <= 2 * 10^5
1 <= len(fields) <= 50
fields are unique
Values are str, int, bool, or None
10^6
Return the compact representation efficiently (linear in the number of selected values).
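A minimal sketch of one possible approach follows, assuming a single pass over the items with a hash map for string deduplication. The function name compact_items is hypothetical; the problem does not prescribe one.

```python
from typing import Any, Dict, List


def compact_items(items: List[Dict[str, Any]], fields: List[str]) -> Dict[str, Any]:
    """Convert row-oriented items into the fields/dict/rows form (sketch)."""
    index: Dict[str, int] = {}   # string -> its position in `strings`
    strings: List[str] = []      # unique strings in order of first appearance
    rows: List[List[Any]] = []

    for item in items:
        row: List[Any] = []
        for key in fields:       # keys outside `fields` are never read
            value = item.get(key)  # a missing key yields None
            if isinstance(value, str):
                pos = index.get(value)
                if pos is None:  # first time this string is seen
                    pos = len(strings)
                    index[value] = pos
                    strings.append(value)
                row.append(pos)
            else:
                # int, bool, and None pass through unchanged
                row.append(value)
        rows.append(row)

    return {"fields": list(fields), "dict": strings, "rows": rows}


example = compact_items(
    [{"id": 7, "brand": "Zen", "price": 1999},
     {"id": 8, "price": 2499, "in_stock": False}],
    ["id", "brand", "price", "in_stock"],
)
print(example)
```

Each selected value is visited once and each dictionary lookup is expected constant time, so the work is linear in the number of selected values. Only str values are routed through the lookup table, so booleans and integers are stored as they are.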