I have the following code:
import os
# Separate filename from extension
sep = os.sep
# Change the casing
for n in os.listdir("staging"):
    if os.path.isfile("staging" + sep + n):
        filename_one, extension = os.path.splitext(n)
        os.rename("staging" + sep + n, "staging" + sep + filename_one.lower() + extension)
# Show the new file names
print('\n--------------------------------\n')
for n in os.listdir("staging"):
    print(n)
# Remove the blanks, -, %, and /
for n in os.listdir("staging"):
    print(n)
    if os.path.isfile("staging" + sep + n):
        filename_zero, extension = os.path.splitext(n)
        os.rename("staging" + sep + n, "staging" + sep + filename_zero.replace(' ', '_').replace('-', '_').replace('%', 'pct').replace('/', '_') + extension)
# Show the new file names
print('\n--------------------------------\n')
for n in os.listdir("staging"):
    print(n)
To fix all of the column headers, solve the encoding issues, and remove nulls,
I first read all of the CSVs into Python as dataframes, then make the changes and rewrite the old files:
import os
import glob
import pandas as pd
files = glob.glob(os.path.join("staging", "*.csv"))
# Create an empty dictionary to hold the dataframes from csvs
dict_ = {}
# Write the files into the dictionary
for file in files:
    dict_[file] = pd.read_csv(file, header=0, dtype=str, encoding='cp1252').fillna('')
In the dictionary, the dataframes are named "folder/name(csv)". What I would like to do is remove the "staging/" prefix from the dictionary keys.
How can I do that?
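One possible approach, as a minimal sketch rather than a definitive answer: rebuild the dictionary with os.path.basename applied to each key. The sample keys and the placeholder string values below are made up for illustration; in the real code the values would be the dataframes read from the CSVs.

```python
import os

# Hypothetical stand-in for dict_: keys shaped like the glob results,
# with placeholder strings instead of real dataframes.
dict_ = {
    os.path.join("staging", "sales_2020.csv"): "dataframe 1",
    os.path.join("staging", "costs_pct.csv"): "dataframe 2",
}

# Build a new dictionary whose keys keep only the file name part of each path.
stripped = {os.path.basename(path): df for path, df in dict_.items()}

print(sorted(stripped))
```

Note that os.path.basename drops every directory component, not just "staging". If two files in different folders shared a name, the later one would overwrite the earlier one; with a single staging folder that cannot happen.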