In my last few stories, we mostly retrieved financial data from an API. In this one, I would like to show you how to read CSV files in Python using Pandas read_csv, since it is another useful way to load financial data for our analysis.
Why use Pandas read_csv?
The Pandas read_csv method is very useful for loading any data in CSV format into a Pandas DataFrame. For example, I have a CSV file containing the last year of Apple's historical prices. I downloaded it from Yahoo Finance, where you can download historical prices in CSV format for any company. The file is named 'AAPL.csv'.
Would it not be great to load the data from a CSV file into a Pandas DataFrame? That way we would be able to use all of Pandas' capabilities to work with, analyse and plot the data.
How to read CSV files in Python with Pandas?
Reading a CSV file in Python could not be easier thanks to Pandas. We can read a CSV file into Pandas with only a couple of lines of code, following these steps:
- First of all, import pandas
- Call the Pandas read_csv method and pass the name of the file as an argument (ensure that the file is in the same folder as the Python script)
- Finally, pass any additional parameters that are needed
import pandas as pd
# Load the Yahoo Finance export and use the 'Date' column as the index
apple = pd.read_csv('AAPL.csv', index_col='Date')
What parameters can we use with Pandas read_csv?
And just like that, we have our Pandas DataFrame containing Apple historical prices. Is it not faster to use the read_csv method in Pandas than to open the file in Excel?
Certainly, reading a CSV file into Python with Pandas is very fast. But that is not all. You may have noticed that I passed index_col as an argument. index_col lets us select the column to use as the index of our DataFrame. Besides index_col, there are plenty of other arguments that can be used with the Pandas read_csv method to control how the data is loaded. An extended description can be found in the Pandas documentation.
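As a minimal sketch of what index_col changes, the snippet below reads the same small CSV twice, once without and once with index_col. The three rows are made-up sample values laid out like a Yahoo Finance export (they are not real Apple prices), and io.StringIO is only used so that the example runs without a file on disk.

```python
import io
import pandas as pd

# Made-up sample rows in the layout of a Yahoo Finance export (not real prices)
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2020-05-01,10.0,11.0,9.5,10.5,10.5,1000
2020-05-04,10.5,12.0,10.0,11.5,11.5,1500
2020-05-05,11.5,12.5,11.0,12.0,12.0,1200
"""

# Without index_col, Pandas adds a default integer index and keeps Date as a column
default_index = pd.read_csv(io.StringIO(csv_text))

# With index_col='Date', the Date column becomes the index
date_index = pd.read_csv(io.StringIO(csv_text), index_col='Date')

print(default_index.index.tolist())  # integer positions
print(date_index.index.tolist())     # date strings, since parse_dates is not set

# With Date as the index, a row can be looked up by its date label
print(date_index.loc['2020-05-04', 'Close'])
```

With a real file, the io.StringIO(csv_text) argument would simply be replaced by the file name, such as 'AAPL.csv'.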
Below are a few of the Pandas read_csv arguments that I use the most:
- usecols: Return a subset of the columns. For instance, usecols=[0, 1, 2, 3] would only load the first four columns of the CSV file
- skiprows: Number of lines to skip at the start of the CSV file
- nrows: Number of rows of the file to read
- skip_blank_lines: If True, blank lines are skipped instead of being read as NaN values
- index_col: Name or position of the column to be used as the index
- parse_dates: If True, Pandas tries to parse the dates in the index
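To see how these arguments work together, here is a small sketch that combines all of them in one call. The CSV content is made-up sample data (not real prices), with an extra title line and a blank line added on purpose so that skiprows and skip_blank_lines have something to do.

```python
import io
import pandas as pd

# Made-up sample data (not real prices): a title line, a header,
# one blank line and four data rows
csv_text = """Exported sample file
Date,Open,High,Low,Close,Volume
2020-05-01,10.0,11.0,9.5,10.5,1000

2020-05-04,10.5,12.0,10.0,11.5,1500
2020-05-05,11.5,12.5,11.0,12.0,1200
2020-05-06,12.0,13.0,11.5,12.5,1300
"""

prices = pd.read_csv(
    io.StringIO(csv_text),
    skiprows=1,                           # skip the title line above the header
    usecols=['Date', 'Close', 'Volume'],  # load only these columns
    nrows=3,                              # read only the first three data rows
    skip_blank_lines=True,                # ignore the empty line (the default)
    index_col='Date',                     # use Date as the index
    parse_dates=True,                     # parse the index into datetimes
)

print(prices)
print(prices.index.dtype)
```

Note that usecols also accepts column names instead of positions, and that the index column has to be part of the selected columns.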
Although it may be faster to use an API to read financial data into Pandas, the read_csv method is very useful as well and is worth knowing, especially when we have access to tabular data in Excel. In just a few lines of code, we have the data loaded into Python and Pandas, ready for analysis.
Originally published at https://codingandfun.com on May 6, 2020.