Operating Excel with openpyxl
overview
When working with Excel files using Python, you can use a library called openpyxl. openpyxl is a library for reading and writing Excel files from Python, allowing you to convert Excel files into Python data structures. This time, I will explain the basic usage of openpyxl.
version confirmation
python: v3.11.2
openpxl: v3.1.1
install openpyxl
First, install openpyxl. Execute the following command.
pip install openpyxl
whole source code
from openpyxl import load_workbook
# read excel file
wb = load_workbook('input.xlsx')
# get sheet
ws = wb.worksheets[0]
# get cell
cell = ws.cell(row=1, column=1)
# set the value in the cell
cell.value = 'Hello, world!'
# save excel file
wb.save('output.xlsx')
Load Excel file
To load an Excel file, use openpyxl’s load_workbook() method. Write the code as below.
from openpyxl import load_workbook
# read excel file
wb = load_workbook('input.xlsx')
Specify the path of the Excel file in the load_workbook() method.
get sheet
An Excel file may contain multiple sheets. To get the sheets, use the worksheets attribute from the wb object. Write the code as below.
# get sheet
ws = wb.worksheets[0]
The worksheets attribute returns a list of sheets. In the code above I am getting the first sheet in the list.
Manipulating cells
# get cell
cell = ws.cell(row=1, column=1)
# set the value in the cell
cell.value = 'Hello, world!'
Specify the row number and column number in the cell() method. The above code gets the cell in row 1, column 1. Also, to set the value in the obtained cell, assign it to the value attribute.
Saving an Excel file
To save an excel file, use the save() method from the wb object. Write the code as below.
# save excel file
wb.save('output.xlsx')
The save() method specifies the name of the file to save. The above is the basic flow of working with Excel files using openpyxl. These operations can be combined to achieve various processing.
Manageable project structure (bonus)
The above method works well for Excel files, but it is inconvenient when sharing with other project members. So write npm scripts to be more concise. Also, I don’t want to pollute the global pip, so I install it in the current directory and use it.
- Install node.js to use the supplemental npm scripts
Create requipments for version control
//requipments.txt
openpyxl >= 3.1.1
// package.json
{
"name": "openpyxl",
"version": "1.0.0",
"description": "",
"main": "index.js",
"scripts": {
"install": "pip install -r requirements.txt -t site-packages",
"start": "python main.py"
},
"author": "",
"license": "ISC"
}
After creating the following files, install pyinstaller using npm scripts.
npm run install
After the installation is complete, modify it to load the site-packages in the current directory.
import os, sys
sys.path.append(os.path.join(os.path.dirname(__file__), 'site-packages'))
from openpyxl import load_workbook
# read excel file
wb = load_workbook('input.xlsx')
# get sheet
ws = wb.worksheets[0]
# get cell
cell = ws.cell(row=1, column=1)
# set the value in the cell
cell.value = 'Hello, world!'
# save excel file
wb.save('output.xlsx')
After modifying the source, start python using npm scripts.
npm run start
Please see below for the project https://github.com/wiblok/python/tree/main/openpyxl