1. Preface
In the last article , We talked about using xlrd、xlwt、xlutils This combination of operations Excel Methods
The most comprehensive summary | Chat Python Office automation Excel( On )
This article will continue to talk about another way , namely :openpyxl
Have to say ,openpyxl More powerful !
It supports xlsx Format of table file , And support Numpy、Pandas Such as package , Can be used to draw graphs
2. Get ready
First , We need to install dependency packages
# Install dependency packages
pip3 install openpyxl
3. Reading data
Use openpyxl Medium load_workbook(filepath) Load the local one Excel file , The returned result is a workbook object
import openpyxl
# Load local Excel file
wb = openpyxl.load_workbook(file_path)
Use workbook objects , You can get all of Sheet Name and Sheet list
def get_all_sheet_names(wb):
"""
Get all sheet The name of
:param wb:
:return:
"""
# sheet Name list
sheet_names = wb.sheetnames
return sheet_names
def get_all_sheet(wb):
"""
Get all sheet
:param wb:
:return:
"""
# sheet Name list
sheet_names = get_all_sheet_names(wb)
# all sheet
sheets = []
for sheet_name in sheet_names:
sheet = wb[sheet_name]
sheets.append(sheet)
return sheets
Workbook objects provide active attribute , For quick access to the current selection Sheet
def get_current_sheet(wb):
"""
Get the current selection of sheet, The default is the last sheet
:param wb:
:return:
"""
# Currently selected sheet
current_sheet = wb.active
return current_sheet
in addition , It can also be done through Sheet Get a specific name to Sheet object
def get_sheet_by_name(wb, sheet_name):
"""
adopt sheetname To find a certain sheet
:param wb:
:param sheet_name:
:return:
"""
sheet_names = get_all_sheet_names(wb)
if sheet_name in sheet_names:
result = wb[sheet_name]
else:
result = None
return result
Use sheet.max_row and sheet.max_column You can get the current Sheet The number of data rows and columns in the
def get_row_and_column_num(sheet):
"""
obtain sheet The number of rows and columns
:param sheet:
:return:
"""
# Row number
row_count = sheet.max_row
# Number of columns
column_count = sheet.max_column
return row_count, column_count
# Number of rows and columns
row_count, column_count = get_row_and_column_num(sheet)
print(' The number of rows and columns is :', row_count, column_count)
openpyxl Provide 2 How to locate a cell , Namely :
-
Digital index , from 1 Start
Digital index : Row number index 、 Column number index
such as :row_index=1,column_index=1
-
A string index consisting of rows and columns
String index : Columns are made up of letters + Row index
such as :A1 For the first line 、 Cell in the first column
also ,openpyxl.utils Provides methods , Easy Column index Switch between the two
from openpyxl.utils import get_column_letter, column_index_from_string
def column_num_to_str(num):
"""
Excel Index columns are converted from numbers to letters
:param num:
:return:
"""
return get_column_letter(num)
def column_str_to_num(str):
"""
Excel Index columns , From letter to number
:param str:
:return:
"""
return column_index_from_string(str)
Cell acquisition , You can also go through the top 2 Index to get
def get_cell(sheet, row_index, column_index):
"""
Get the cell
:param sheet:
:param row_index:
:param column_index:
:return:
"""
# openpyxl The indexes are all from 1 Start counting , This is related to xlrd Somewhat different
# Get a cell ( A choice )
# such as : obtain A1 Cell data , The first line 、 The first column of data
# cell_one = sheet['A1']
cell_one = sheet.cell(row=row_index, column=column_index)
return cell_one
In daily processing Excel In the data process , You may need to determine the cell data type , and openpyxl There's no ready-made way
here , We can use the cell object's value Property gets the value , Then use isinstance Methods to judge the data type
def get_cell_value_and_type(cell):
"""
Get a certain one cell The content and data type of
:param cell:
:return:
"""
# The value of the cell
cell_value = cell.value
# The type of cell
cell_type = get_cell_value_type(cell_value)
return cell_value, cell_type
def get_cell_value_type(cell_value):
"""
Get data type
:param cell_value:
:return:
"""
# among
# 0: empty
# 1: Numbers
# 2: character string
# 3: date
# 4: other
if not cell_value:
cell_type = 0
elif isinstance(cell_value, int) or isinstance(cell_value, float):
cell_type = 1
elif isinstance(cell_value, str):
cell_type = 2
elif isinstance(cell_value, datetime.datetime):
cell_type = 3
else:
cell_type = 4
return cell_type
Get a line alone [ Column ] The data of , You can use the following methods :
def get_row_cells_by_index(sheet, row_index):
"""
By row index , Get the cell of a row
:param row_index:
:return:
"""
# Be careful : The first column is from 1 Start
row_cells = sheet[row_index]
return row_cells
def get_column_cells_by_index(sheet, column_index):
"""
Through the column index , Get the cell of a column
"""
# Number to letter
column_index_str = column_num_to_str(column_index)
# Get the data of a column
column_cells = sheet[column_index_str]
return column_cells
It should be noted that , To get the data of a row, you need to pass in a numeric index ; And for the acquisition of column data , Must pass in string index
and Python List range values are similar to ,openpyxl It also supports the use of : The symbol gets a range of data rows [ Column ]
def get_rows_by_range(sheet, row_index_start, row_index_end):
"""
Select the line range through the range
such as : Select the first 2 Go to the first place 4 All the data of the row , The return value is a tuple
:param sheet:
:param row_index_start:
:param row_index_end:
:return:
"""
rows_range = sheet[row_index_start:row_index_end]
return rows_range
def get_columns_by_range(sheet, column_index_start, column_index_end):
"""
Select the column range through the range
such as : Select the first 2 Column to the first 4 All the data in the column , The return value is a tuple
:param sheet:
:param column_index_start:
:param column_index_end:
:return:
"""
columns_range = sheet[column_num_to_str(column_index_start):column_num_to_str(column_index_end)]
return columns_range
4. Write data
To write data to Excel form
First , Use openpyxl.Workbook() Create a Excel Workbook object
next , Using the workbook object create_sheet() Create a new one Sheet
# Create a Excel workbook
# Be careful : One new one at a time Excel file , Will default to generate a name of 【Sheet】 The worksheet for Sheet
wb = openpyxl.Workbook()
# Create a new sheet, Default is inserted into the tail
# new_sheet = wb.create_sheet(' new Sheet')
# You can also use the second parameter :index To specify the insertion location
# such as : Insert at the beginning
new_sheet = wb.create_sheet(' new Sheet', 0)
Created by default Sheet To be inserted in the last position , The first 2 Parameters can be specified Sheet Insertion position
Sheet The background color of the tag also supports modification , Use sheet_properties.tabColor Appoint RGB Color value
such as , To set up a certain Sheet The background color of is red , You just need to find the corresponding Sheet, Then specify the color value as FF0000 that will do
def set_sheet_bg_color(sheet, rgb_value):
"""
Set up Sheet Label color
:param rgb_value:
:return:
"""
# Set up Sheet The color of the bottom button (RRGGBB)
sheet.sheet_properties.tabColor = rgb_value
# Set up Sheet Background color of ( Red )
set_sheet_bg_color(new_sheet, 'FF0000')
openpyxl Support row and column digital index 、 String index with this 2 There are three ways to write data to cells
def write_value_to_cell_with_num(sheet, row_index, column_index, value):
"""
Index by row 、 Column index writes data
:param shell:
:param row_index: Row index
:param column_index: Column index
:param value:
:return:
"""
# A choice
sheet.cell(row=row_index, column=column_index, value=value)
# shell.cell(row=row_index, column=column_index).value = value
def write_value_to_cell_with_index_str(sheet, index_str, value):
"""
By letter position , Write data to the corresponding cell
:param shell:
:param index_str: The cell position of the letter
:param value:
:return:
"""
sheet[index_str] = value
It's also easy to insert pictures into cells ,openpyxl Provided add_image() Method
Parameters have 2 individual , Namely : Image objects 、 Cell string index
For ease of use , We can transform the column index , And then it's packaged into two ways to insert images
from openpyxl.drawing.image import Image
def insert_img_to_cell_with_num(sheet, image_path, row_index, column_index):
"""
Insert a picture into a cell
:param sheet:
:param image_path:
:param row_index:
:param column_index:
:return:
"""
# By row index 、 Column index , Get letter index
index_str = column_num_to_str(column_index) + str(row_index)
insert_img_to_cell_with_str(sheet, image_path, index_str)
def insert_img_to_cell_with_str(sheet, image_path, index_str):
"""
Insert a picture into a cell
:param sheet:
:param image_path:
:param index_str:
:return:
"""
sheet.add_image((image_path), index_str)
Last , Calling workbook object save() Method , Write the data to the Excel In file
# Be careful : You have to write , In order to truly save to the file
wb.template = False
wb.save('new.xlsx')
5. Modifying data
Modify the data to include : Modification of cell data 、 Cell style changes
For cell data modification , Just read the workbook object first , Query to the Sheet object , Then we call the above method to modify cell data , Last call save() Function to save the overlay
def modify_excel(self, file_path):
"""
Modify local Excel Data in the file
:param file_path:
:return:
"""
# Read local Excel file
wb = openpyxl.load_workbook(file_path)
# Read a sheet
sheet = wb[' first Sheet']
print(sheet)
# Modify the data of a cell directly
write_value_to_cell_with_num(sheet, 1, 1, ' full name 1')
# Save and cover
wb.save(file_path)
Cell style contains : Font style 、 Cell background style 、 Border style 、 Alignment, etc
In common font styles 、 For example, alignment
First , Use openpyxl Medium Font Class creates an object , Specify the font name 、 font size 、 Is it bold 、 Is it italicized 、 Color 、 Underline, etc
from openpyxl.styles import Font
# Font format
# Specify the font type 、 size 、 Is it bold 、 Color, etc.
font0 = Font(name='Calibri',
size=20,
bold=False,
italic=False,
vertAlign=None,
underline='none',
strike=False,
color='FF00FF00')
next , Construct a Alignment object , Specify the alignment of cells
from openpyxl.styles import Font,Alignment
# Cell alignment
alignment0 = Alignment(horizontal='center',
vertical='bottom',
text_rotation=0,
wrap_text=False,
shrink_to_fit=False,
indent=0)
Last , Using cell objects font/alignment attribute , Set the font style and alignment
# Set attribute style ( typeface 、 Alignment mode )
sheet['A1'].font = font0
sheet['A1'].alignment = alignment0
6. Advanced usage
Next , Let's talk about some common advanced usage
1、 Get visible and hidden Sheet
By judgment Sheet Object's sheet_state Property value , You can judge the current Sheet Show or hide
The duty of visible when , representative Sheet It shows
On duty hidden when , On behalf of the Sheet It's hidden
def get_all_visiable_sheets(wb):
"""
Get all visible sheet
:param wb:
:return:
"""
return [sheet for sheet in get_all_sheet(wb) if sheet.sheet_state == 'visible']
def get_all_hidden_sheets(wb):
"""
Get all hidden sheet
:param wb:
:return:
"""
return [sheet for sheet in get_all_sheet(wb) if sheet.sheet_state == 'hidden']
2、 Get hidden / List of row indexes displayed 、 Column index list
Limited by length , Here to get all the displays / For example, a list of hidden row indexes
Traverse Sheet Object's row_dimensions Property value , By judging the row properties of hidden value , Determine whether the current line is hidden or displayed
def get_all_rows_index(sheet, hidden_or_visiable):
"""
Get all hidden / The displayed line
:param hidden_or_visiable: True: hide ;False: Show
:param sheet:
:return:
"""
# Traversal line
# Hidden index
hidden_indexs = []
# All hidden row indexes
for row_index, rowDimension in sheet.row_dimensions.items():
if rowDimension.hidden:
hidden_indexs.append(row_index)
# All displayed row indexes
visiable_indexs = [index + 1 for index in range(get_row_and_column_num(sheet)[0]) if index + 1 not in hidden_indexs]
# A list of hidden or displayed row indexes
return hidden_indexs if hidden_or_visiable else visiable_indexs
3、 Get cell font color and cell background color
Cell object font.color.rgb、fill.fgColor.rgb Attribute values represent font color values respectively 、 Cell background color
def get_cell_font_color(sheet, row_index, column_index):
"""
Get the color of the cell font
:param sheet:
:param row_index: Row index
:param column_index: Column index
:return:
"""
cell_color = sheet.cell(row_index, column_index).font.color
if cell_color:
return sheet.cell(row_index, column_index).font.color.rgb
else:
# Color doesn't exist , Maybe the cell has no data
return None
def get_cell_bg_color(sheet, row_index, column_index):
"""
Get the color of the cell background
:param sheet:
:param row_index: Row index
:param column_index: Column index
:return:
"""
return sheet.cell(row_index, column_index).fill.fgColor.rgb
7. Last
You can find ,openpyxl comparison xlrd/xlwt, A lot of practical API, More powerful , And perfectly support xlsx!
Limited by length , This paper only shows some functions and codes , More complex functions , such as : Cell merge 、 Cell full style operation , I've encapsulated it as a method to upload to the background
To get all the source code , Official account 「 AirPython 」, The background to reply 「 excel 」 You can get all the source code
If you think the article is good , Please give the thumbs-up 、 Share 、 Leaving a message. Next , Because this will be the strongest driving force for me to continue to output more quality articles !
Recommended reading
The most comprehensive summary | Chat Python Office automation Excel( On )
The most comprehensive summary | Chat Python Data processing family barrel ( Configuration article )
I use a few lines Python The automated script perfectly solves the little sister's wechat anxiety