
Python - how to retain quotes in any field of a file as-is when reading and writing with pandas?

2024-03-12 14:00:05

I have tried many options, but I cannot retain the quotes present in the input file when writing the output file.

Reproducible code:

from io import StringIO

import pandas as pd

# Input file
csv_data = '''A,B,C,D,E
234,mno,C22,U,
567,pqr,"C3""",U,5555
999,abc,"C99",D,9999
'''

# Load CSV data into a dataframe, keeping every value as a string
df = pd.read_csv(StringIO(csv_data), header=0, dtype=str, keep_default_na=False, engine='python', sep=',')

# Write the dataframe back out
df.to_csv('output.txt', sep=',', index=False, header=True)

The resulting output.txt looks like this (the quotes around C99 are gone):

A,B,C,D,E
234,mno,C22,U,
567,pqr,"C3""",U,5555
999,abc,C99,D,9999

Expected output:

A,B,C,D,E
234,mno,C22,U,
567,pqr,"C3""",U,5555
999,abc,"C99",D,9999

I just don't want to lose anything present in my input data while saving (including the quotes).

Solution:

Pass quoting=3 (csv.QUOTE_NONE) to both pd.read_csv and df.to_csv, so that quote characters are treated as ordinary data rather than as field quoting:

# Load CSV data into dataframes
df = pd.read_csv(StringIO(csv_data), 
                 header=0, 
                 dtype=str,
                 keep_default_na=False, 
                 engine='python', 
                 sep=',', 
                 quoting=3)
print(df)
     A    B       C  D     E
0  234  mno     C22  U      
1  567  pqr  "C3"""  U  5555
2  999  abc   "C99"  D  9999
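
Why this keeps the quotes can be seen with the standard csv module, whose quoting constants are the ones the pandas quoting argument accepts. The following is only a minimal sketch on one row of the sample data; the variable names are made up for illustration.

```python
import csv
from io import StringIO

# One row from the sample input
line = '567,pqr,"C3""",U,5555\n'

# Default quoting: the outer quotes are treated as syntax and ""
# is read as one escaped quote, so the raw text is not preserved.
default_fields = next(csv.reader(StringIO(line)))

# QUOTE_NONE: the quote character has no special meaning and stays in the data.
raw_fields = next(csv.reader(StringIO(line), quoting=csv.QUOTE_NONE))

print(default_fields[2])  # C3"
print(raw_fields[2])      # "C3"""
print(csv.QUOTE_NONE)     # 3
```

Writing csv.QUOTE_NONE instead of the bare 3 makes the intent easier to read.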

print(df.to_csv(sep=',', index=False, header=True, quoting=3))
A,B,C,D,E
234,mno,C22,U,
567,pqr,"C3""",U,5555
999,abc,"C99",D,9999

df.to_csv('output.txt', sep=',', index=False, header=True, quoting=3)
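
To confirm that nothing is lost, the written text can be compared with the original input. This is a small self-contained check, assuming the same sample data as in the question; round_trip_ok is a name chosen here for illustration.

```python
import csv
from io import StringIO

import pandas as pd

csv_data = '''A,B,C,D,E
234,mno,C22,U,
567,pqr,"C3""",U,5555
999,abc,"C99",D,9999
'''

# Read every value as a string and treat quotes as ordinary characters
df = pd.read_csv(StringIO(csv_data), dtype=str, keep_default_na=False,
                 quoting=csv.QUOTE_NONE)

# With no path given, to_csv returns the CSV text as a string
out = df.to_csv(index=False, quoting=csv.QUOTE_NONE)

# Normalise line endings before comparing, in case \r\n was written
round_trip_ok = out.replace('\r\n', '\n') == csv_data
print(round_trip_ok)
```

One caveat: with QUOTE_NONE a comma inside a value cannot be protected by quotes. Reading splits on every comma, and writing a value that contains the separator should fail unless an escapechar is also passed to to_csv.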