Reading/parsing Excel (xls) files with Python

Reading/parsing Excel (xls) files with Python

I highly recommend xlrd for reading .xls files. But there are some limitations(refer to xlrd github page):

Warning

This library will no longer read anything other than .xls files. For
alternatives that read newer file formats, please see
http://www.python-excel.org/.

The following are also not supported but will safely and reliably be
ignored:

- Charts, Macros, Pictures, any other embedded object, including embedded worksheets.
- VBA modules
- Formulas, but results of formula calculations are extracted.
- Comments
- Hyperlinks
- Autofilters, advanced filters, pivot tables, conditional formatting, data validation

Password-protected files are not supported and cannot be read by this
library.

voyager mentioned the use of COM automation. Having done this myself a few years ago, be warned that doing this is a real PITA. The number of caveats is huge and the documentation is lacking and annoying. I ran into many weird bugs and gotchas, some of which took many hours to figure out.

UPDATE: For newer .xlsx files, the recommended library for reading and writing appears to be openpyxl (thanks, Ikar Pohorsk√Ĺ).

Using pandas:

import pandas as pd

xls = pd.ExcelFile(ryourfilename.xls) #use r before absolute file path 

sheetX = xls.parse(2) #2 is the sheet number+1 thus if the file has only 1 sheet write 0 in paranthesis

var1 = sheetX[ColumnName]

print(var1[1]) #1 is the row number...

Reading/parsing Excel (xls) files with Python

You can choose any one of them http://www.python-excel.org/
I would recommended python xlrd library.

install it using

pip install xlrd

import using

import xlrd

to open a workbook

workbook = xlrd.open_workbook(your_file_name.xlsx)

open sheet by name

worksheet = workbook.sheet_by_name(Name of the Sheet)

open sheet by index

worksheet = workbook.sheet_by_index(0)

read cell value

worksheet.cell(0, 0).value    

Leave a Reply

Your email address will not be published.