Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I am working with Excel using Python and have couple of questions:

  1. Loading Excel Sheet into 2d Array.
    In VBA I'd simply do:
dim arrData as Variant 
arrData = shtData.Range("A1:E2500") 

I would get array(1 to 2500, 1 to 5), which i can easily access for example arrData(1,5) -> row 1 column 5

In Python what i managed to do is:

#declare list
excel_data=[]

#loop to load excel spreadsheet data into 2d Array   
#basically I am looping through every row and append it to list
for row in shtData.iter_rows(min_row=5, max_row=50,values_only =True):
    excel_data.append(row)
  1. Is there a way to assign row to list, starting from index 1 not 0?
    In VBA there is an option Option Base 1.
    https://docs.microsoft.com/en-us/office/vba/language/reference/user-interface-help/option-base-statement

  2. Is it the fastest way to operate on Excel Dataset?
    I am planning then to loop through let's say 2500 rows and 5 columns -> 12'500 cells.
    With VBA it was very efficient to be honest (operating on array in memory).

  3. As I understand, functionsof OpenPyXL:

load_workbook       

#only creates REFERENCE TO EXCEL WORKBOOK - it does not open it? or is it "loaded" into memory but what is on the HD is actually intact?

shtData = wkb.worksheets[0]                         

#again only reference?

shtReport = wkb.create_sheet(title="ReportTable")       

#it adds sheet but it adds it in excel which is loaded into memory, only after saving, Excel on HD is actually overwritten?

question from:https://stackoverflow.com/questions/65649467/python-and-excel-openpyxl

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
174 views
Welcome To Ask or Share your Answers For Others

1 Answer

You can used Pandas and create a dataframe (2D table) from the Excel spreadsheat.

import pandas as pd

df = pd.read_excel("data.xls")
print(df)
print("____________")
print(f'Average sales are: {df["Gross"].values.mean()}')
print(f'Net income for April: {df.at[3, "Net"]}')
print("____________")
df_no_header = pd.read_excel("data.xls",skiprows=1, header=None)
print(df_no_header)
print("____________")
print(f'Net income for April: {df_no_header.at[3, 2]}')

Output:

output

The Pandas dataframe has many methods that will allow you to access rows and columns and do much more. Setting skiprows=1, header=None will skip the header row. See here.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...