I'm trying to stream through a large JSON file using ijson in Python. This is my first time trying this.
My code is really simple right now:
import ijson

with open('file.json', 'rb') as f:
    j = ijson.items(f, 'item')
    for item in j:
        print('x')
This raises a "trailing garbage" error. Essentially, the second item in the file is considered garbage, I think because of the file format.
My JSON file is this one from Kaggle, and is formatted like this:
{"_id":{"$oid":"6457879fd1187d621cbbba9c"},"sourceCC":"us",...etc...}
{"_id":{"$oid":"6457879fd1187d621cbddd8a"},"sourceCC":"us",...etc...}
It is about 3 GB in size, so I'm unable to open it.
If I use 'multiple_values=True', I believe it treats all the items as multiple values for the same item, so it does not raise any error, but it also does not return anything else.
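Since each line of the file looks like a complete JSON object, one thing that might work is to skip ijson and parse the file line by line with the standard json module. A minimal sketch, assuming every record sits on its own line with no line breaks inside it (the function name iter_records is made up for illustration):

```python
import json


def iter_records(path):
    """Yield one parsed object per non-empty line of the file.

    Assumption: the file holds one complete JSON object per line.
    """
    # Iterating over the file object reads one line at a time,
    # so the whole file is never loaded into memory at once.
    with open(path, 'r', encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if not line:  # skip blank lines
                continue
            yield json.loads(line)
```

It could be used as "for record in iter_records('file.json'): ...". With ijson itself, my understanding is that top-level values have an empty prefix rather than 'item', so ijson.items(f, '', multiple_values=True) may be the equivalent call, though I have not verified that against this file.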
What can I do?
Thanks.