writing | Leisure and joy
source ：Python technology 「ID: pythonall」
Why are more and more white-collar non programmers learning Python ？ They may not want to learn Python Go to some websites and get a cool sense of accomplishment , It's a lot of data analysis and processing problems , use Python It can be solved simply and efficiently . This article will show you through a practical example Python How to solve complex problems efficiently in practical work .
Xiao Ming works for an outdoor sports company , Their company has many brands , And it involves a lot of subdivided industries . Xiao Ming works as a data analyst in this company , Usually through Excel To do data analysis . The boss gave him a task today ： Select the top five brands and sales volume of the group company in the latest year before work .
about Excel For the boss , Isn't that a matter of minutes ？ Xiao Ming didn't pay attention to it , Until colleagues in the marketing department sent him the original data file , He realized that it wasn't that simple ：
Form file （ Data from network ）
It's not the top five in the imaginary order . There are 90 File , In the normal way of thinking , He either copies the contents of all the files into a table to sort them out , Or you can sort and summarize each form , Then the final results are classified and summarized .
Think about the workload , Think about the deadline , Xiao Ming scratched his head , I feel like I'm going bald .
This kind of physical work , It's the easiest thing to write a program . Xiao Ming thought of his programmer friend Duan at this time , So he left the question to the short paragraph .
A little wisp of his little hair , say ：so easy, Just find master pan .
Xiao Ming said ： You're not sure ？ We need to find someone else ！
A little bitter smile said ： No no no , Master pan is Python It's a data processing library , It's called Pandas , Be commonly called Master pan .
Xiao Ming said ： I don't care what master, master , Just say how long it will take .
On the short paragraph ： Give me a few minutes to program , Just a few more seconds ！
Xiao Mingfa's face of worshiping the big man .
A little bit of thinking , Sort out the program ideas ：
- Calculate the sales of each row in each table , use “ Number of visitors Conversion rate Customer unit price ” Just go .
- Put each table together by brand sales .
- Sum up the results of all the tables into a general table
- Summarize and sort sales by brand in the general table
Step zero , Read Excel :
import pandas as pd df = pd.read_excel("./tables/" + name)
First step , Calculate the sales in each form ：
df[' sales '] = df[' Number of visitors '] * df[' Conversion rate '] * df[' Customer unit price ']
The second step , Put each table together by brand sales ：
df_sum = df.groupby(' brand ')[' sales '].sum().reset_index()
The third step , Sum up the results of all the tables into a general table ：
result = pd.DataFrame() result = pd.concat([result, df_sum])
Step four , Summarize and sort sales by brand in the general table ：
final = result.groupby(' brand ')[' sales '].sum().reset_index().sort_values(' sales ', ascending=False)
Last , Let's take a look at the complete program ：
import pandas as pd import os result = pd.DataFrame() for name in os.listdir("./tables"): try: df = pd.read_excel("./tables/" + name) df[' sales '] = df[' Number of visitors '] * df[' Conversion rate '] * df[' Customer unit price '] df_sum = df.groupby(' brand ')[' sales '].sum().reset_index() result = pd.concat([result, df_sum]) except: print(name) pass final = result.groupby(' brand ')[' sales '].sum().reset_index().sort_values(' sales ', ascending=False) pd.set_option('display.float_format', lambda x: '%.2f' % x) print(final.head())
The end result is this ：
brand sales 15 brand -5 1078060923.62 8 brand -17 1064495314.96 4 brand -13 1038560274.21 3 brand -12 1026115153.00 13 brand -3 1006908609.07
You can see that the final top five have come out , The whole program runs very fast .
In a few minutes , Xiaoduan sent the result to Xiaoming , Xiaoming was moved to the heart , Call for dinner some other day , Apprenticeship ！
This article is to show you master pan through a practical case （Pandas） The charm of , Especially for this kind of table processing , It's very convenient . Those who have written programs may feel a little familiar with it , It's a little bit like SQL Query statement . Master pan can not only make our program more simple and efficient , It's also very friendly to non programmers who need to deal with tables frequently , It's easy to get started .