2. csv files
2.1. Header rows
Month,Abbr,Numeric
January,Jan,1
...
2.2. Opening files: newline=’’
newline=''
is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write an extra \r will be added.newline=''
, since the csv module does its own (universal) newline handling.2.3. csv.reader
csv.reader
function to read a csv file.Syntax:
- csv.reader(f, dialect='excel', **fmtparams)
- Parameters:
f – a string for the file path to the csv file from the current directory.
dialect – Use a set of parameters specific to a particular CSV dialect; Defaults to excel; one of [‘excel’, ‘excel-tab’, ‘unix’]
fmtparams – optional fmtparams keyword arguments can be given to override individual formatting parameters in the current dialect.
Return a reader object which is an iterable that behaves like a generator allowing iteration over lines in the given f.Each row read from the csv file is returned as a list of strings.No automatic data type conversion is performed unless the QUOTE_NONNUMERIC format option is specified(in which case unquoted fields are transformed into floats).For fmtparams see: https://docs.python.org/3/library/csv.html#dialects-and-formatting-parameters’
2.4. Reading files
months.csv
import csv
csv_path = 'files/months.csv'
with open(csv_path, 'r', newline='') as csv_file:
csv_reader = csv.reader(csv_file)
for row in csv_reader:
print(row)
['Month,Abbr,Numeric']
['January,Jan,1']
['February,Feb,2']
...
['December,Dec,12']
2.5. Reading files with non comma delimiter
months_tabbed.csv
Month Abbr Numeric
January Jan 1
February Feb 2
...
December Dec 12
csv_reader = csv.reader(csv_file, delimiter="\t")
uses the delimiter argument since the csv file used is tab delimited.import csv
csv_path = 'files/months_tabbed.csv'
with open(csv_path, 'r', newline='') as csv_file:
csv_reader = csv.reader(csv_file, delimiter="\t")
for row in csv_reader:
print(row)
['Month,Abbr,Numeric']
['January,Jan,1']
['February,Feb,2']
...
['December,Dec,12']
2.6. next function
- next(iterable, default)
- Parameters:
iterable – An iterable object.
default – Optional. An default value to return if the iterable has reached its end.
The next() function returns the next item in an iterator.
2.7. Using an index with the row lists
next(csv_reader)
to skip the first row which has the headings: “Month,Abbr,Numeric”.import csv
csv_path = 'files/months.csv'
with open(csv_path, 'r', newline='') as csv_file:
csv_reader = csv.reader(csv_file)
next(csv_reader)
for row in csv_reader:
print(row[1], end=", ")
Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec,
2.8. Padded strings
| Month | Abbr | Numeric |
| January | Jan | 1 |
| February | Feb | 2 |
...
| December | Dec | 12 |
padded_row = [str(i).ljust(14) for i in row]
, builds a list which pads the strings with spaces on the right to reach 14 characters in length.import csv
csv_path = 'files/months.csv'
with open(csv_path, 'r', newline='') as csv_file:
csv_reader = csv.reader(csv_file)
for row in csv_reader:
padded_row = [str(i).ljust(14) for i in row]
print("| " + '| '.join(padded_row) + "|")
2.9. Reading a csv file to a list
import csv
csv_path = 'files/months.csv'
data_list = []
with open(csv_path, 'r', newline='') as csv_file:
csv_reader = csv.reader(csv_file, delimiter=",")
for row in csv_reader:
data_list.append(row)
print(data_list)
[
["Month", "Abbr", "Numeric"],
["January", "Jan", "1"],
["February", "Feb", "2"],
["March", "Mar", "3"],
["April", "Apr", "4"],
["May", "May", "5"],
["June", "Jun", "6"],
["July", "Jul", "7"],
["August", "Aug", "8"],
["September", "Sep", "9"],
["October", "Oct", "10"],
["November", "Nov", "11"],
["December", "Dec", "12"],
]
2.10. csv writer
csv.writer
function to write to a csv file.Syntax:
- csv.writer(f, dialect='excel', **fmtparams)
- Parameters:
f – a string for the file path to the csv file from the current directory.
dialect – Use a set of parameters specific to a particular CSV dialect; Defaults to excel; one of [‘excel’, ‘excel-tab’, ‘unix’]
fmtparams – optional fmtparams keyword arguments can be given to override individual formatting parameters in the current dialect.
Return a writer object responsible for converting the user’s data into delimited strings on the given file-like object.Non-string data are stringified with str() before being written.
2.11. Converting comma delimited to tab delimited files.
delimiter="\t"
is one of the optional fmtparams arguments that allows the delimiter to be set.import csv
csv_path = 'files/months.csv'
csv_path2 = 'files/months_tabbed.csv'
with open(csv_path, 'r', newline='') as csv_file:
csv_reader = csv.reader(csv_file)
# next(csv_reader)
with open(csv_path2, 'w', newline='') as new_csv_file:
csv_writer = csv.writer(new_csv_file, delimiter="\t")
for row in csv_reader:
csv_writer.writerow(row)
Month,Abbr,Numeric
January,Jan,1
February,Feb,2
...
December,Dec,12
Month Abbr Numeric
January Jan 1
February Feb 2
...
December Dec 12
2.12. Saving a list of sublists to a csv file
import csv
data_list = [
[87574, 208, 876, -12],
[87677, 388, 900, 1992],
[87780, -236, 540, -2040],
[87884, 208, 1888, -1136],
[87989, 560, 1284, 280],
[88092, 1660, 1772, -1040],
]
myheaders = ["time", "x", "y", "z"]
filename = "files/microbit_data.csv"
with open(filename, "w", newline="") as myfile:
writer = csv.writer(myfile)
writer.writerow(myheaders)
writer.writerows(data_list)
2.13. DictReader
Syntax:
- class csv.DictReader(f, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)
- Parameters:
f – a string for the file path to the csv file from the current directory.
fieldnames – a sequence of keys that identify the order in which values in the dictionary passed to the writerow() method are written to file f.
restval – specify the value to be written if the dictionary is missing a key in fieldnames.
dialect – Use a set of parameters specific to a particular CSV dialect; Defaults to excel; one of [‘excel’, ‘excel-tab’, ‘unix’]
**kwds (*args,) –
other optional or keyword arguments are passed to the underlying writer instance.
Create an object that operates like a regular reader but maps the information in each row to a dict whose keys are given by the optional fieldnames parameter.The fieldnames parameter is a sequence.If fieldnames is omitted, the values in the first row of file f will be used as the fieldnames.The dictionary preserves their original ordering.If a row has more fields than fieldnames, the remaining data is put in a list and stored with the fieldname specified by restkey (which defaults to None).If a non-blank row has fewer fields than fieldnames, the missing values are filled-in with the value of restval (which defaults to None).
2.14. DictReading files
months.csv
import csv
csv_path = 'files/months.csv'
with open(csv_path, 'r', newline='') as csv_file:
csv_reader = csv.DictReader(csv_file)
for row in csv_reader:
print(row)
{'Month': 'January', 'Abbr': 'Jan', 'Numeric': '1'}
{'Month': 'February', 'Abbr': 'Feb', 'Numeric': '2'}
...
{"Month": "December", "Abbr": "Dec", "Numeric": "12"}
2.15. Using a key with the row dictionaries
import csv
csv_path = 'files/months.csv'
with open(csv_path, 'r', newline='') as csv_file:
csv_reader = csv.DictReader
(csv_file)
for row in csv_reader:
print(row["Abbr"], end=", ")
Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec,
2.16. DictWriter
Syntax:
- class csv.DictWriter(f, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds)
- Parameters:
f – a string for the file path to the csv file from the current directory.
fieldnames – a sequence of keys that identify the order in which values in the dictionary passed to the writerow() method are written to file f.
restval – specify the value to be written if the dictionary is missing a key in fieldnames.
extrasaction – indicates what action to take if the dictionary passed to the writerow() method contains a key not found in fieldnames. If it is set to ‘raise’, the default value, a ValueError is raised. If it is set to ‘ignore’, extra values in the dictionary are ignored.
dialect – Use a set of parameters specific to a particular CSV dialect; Defaults to excel; one of [‘excel’, ‘excel-tab’, ‘unix’]
**kwds (*args,) –
other optional or keyword arguments are passed to the underlying writer instance.
Create an object which operates like a regular writer but maps dictionaries onto output rows.
2.17. DictWriter.writeheader
Syntax:
- class DictWriter.writeheader
- Write a row with the field names (as specified in the constructor) to the writer’s file object.Return the return value of the csvwriter.writerow() call used internally.
2.18. DictWriter with fieldnames
letter_frequency.csv
import csv
csv_path = 'files/letter_frequency.csv'
csv_path2 = 'files/letter_frequency_tabbed.csv'
with open(csv_path, 'r') as csv_file:
csv_reader = csv.DictReader(csv_file)
with open(csv_path2, 'w', newline='') as new_file:
fieldnames = ['letter', 'frequency']
csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter='\t')
csv_writer.writeheader()
for line in csv_reader:
csv_writer.writerow(line)
letter frequency
A 0.08167
B 0.01492
...
Z 0.00074
2.19. DictWriter with selected fieldnames
afl_premiers_2000s.csv
fieldnames_dict = {key: line[key] for key in fieldnames}
, on each row, to produce a new dictionary for writing to the new csv file.import csv
csv_path = 'files/afl_premiers_2000s.csv'
csv_path2 = 'files/afl_premiers_2000s_tabbed.csv'
with open(csv_path, 'r') as csv_file:
csv_reader = csv.DictReader(csv_file)
with open(csv_path2, 'w', newline='') as new_file:
fieldnames = ["Year","Premiership team"]
csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter='\t')
csv_writer.writeheader()
for line in csv_reader:
fieldnames_dict = {key: line[key] for key in fieldnames}
csv_writer.writerow(fieldnames_dict)
Year Premiership team
2024 Brisbane Lions
2023 Collingwood
2022 Geelong Cats
2021 Melbourne
...
2000 Essendon