Python Data Analysis
As the saying goes, "a craftsman who wants to do good work must first sharpen his tools." Python is by far the most commonly used programming language for data analysis. By standing on the shoulders of giants, we can analyze data efficiently.
Let's first look at the history of Python. The language was born in the 1980s and was developed by the Dutch programmer Guido van Rossum, whom we call the father of Python. The origin of the name is worth mentioning: "python" is the name of a snake, but Guido's choice had nothing to do with snakes. While he was implementing Python, he was also reading the scripts of Monty Python's Flying Circus, a BBC comedy series from the 1970s. Guido thought he needed a name that was short, unique, and slightly mysterious, so he decided to call the language Python.
Python 1.0 was released in January 1994. The main new features of this version were lambda, map, filter, and reduce, although Guido was reportedly not fond of these additions.
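A minimal sketch of the four features named above, written for a modern Python 3 interpreter (an assumption, since the paragraph describes Python 1.0). In Python 3, reduce is no longer a builtin and has to be imported from functools. The list of numbers is made up for illustration.

```python
from functools import reduce  # reduce moved to functools in Python 3

numbers = [1, 2, 3, 4, 5]  # illustrative sample data

# lambda: a small anonymous function
square = lambda x: x * x

# map: apply a function to every element
squares = list(map(square, numbers))

# filter: keep only the elements that satisfy a condition
evens = list(filter(lambda x: x % 2 == 0, numbers))

# reduce: fold a sequence into a single value
total = reduce(lambda acc, x: acc + x, numbers, 0)

print(squares)  # [1, 4, 9, 16, 25]
print(evens)    # [2, 4]
print(total)    # 15
```

In everyday code, list comprehensions and the builtin sum often replace map, filter, and reduce, but the functional forms are still available.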
Six and a half years later, in October 2000, Python 2.0 was released. The main new features of this release were improved memory management, a garbage collector that can detect reference cycles, and support for Unicode. The most important change, however, was in the development process: Python now had a more transparent community.
In December 2008, Python 3.0 was released. Python 3.x is not backward compatible with Python 2.x, which means that Python 2.x code may not run under Python 3.x. Python 3 represents the future of the language.
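Two well-known incompatibilities can illustrate the point. The sketch below assumes it is run under Python 3. It checks that the Python 2 print statement no longer compiles, and it shows that the / operator on integers now performs true division.

```python
# A Python 2 style print statement, kept as a string so this file stays valid Python 3.
python2_source = 'print "hello"'

try:
    compile(python2_source, "<python2-example>", "exec")
    compiles_in_python3 = True
except SyntaxError:
    # Python 3 requires print("hello"); print is now a function.
    compiles_in_python3 = False

true_division = 3 / 2    # 1.5 in Python 3; Python 2 returned 1 for two integers
floor_division = 3 // 2  # 1; the explicit floor-division operator

print(compiles_in_python3, true_division, floor_division)
```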
Today Python has entered the 3.0 era, and the Python community is thriving. When you raise a Python question, there is almost always someone who has run into the same problem and already solved it.
Characteristics of the Python language:
Python is a fully object-oriented language. Functions, modules, numbers, and strings are all objects; in Python, everything is an object. The language supports operator overloading as well as generic design.
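The following sketch illustrates both claims. Vector2D is a hypothetical class invented for this example, not part of any library. It overloads the + operator through the special method __add__, and the isinstance check shows that a function, a module, a number, and a string are all objects.

```python
import math


class Vector2D:
    """Hypothetical 2D vector, used only to illustrate operator overloading."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called when evaluating: vector + other
        return Vector2D(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        return isinstance(other, Vector2D) and (self.x, self.y) == (other.x, other.y)

    def __repr__(self):
        return f"Vector2D({self.x}, {self.y})"


# A function, a module, a number, and a string are all instances of object.
things = [len, math, 42, "text"]
all_objects = all(isinstance(thing, object) for thing in things)

# The same + operator now works on a user-defined type.
total = Vector2D(1, 2) + Vector2D(3, 4)

print(all_objects)  # True
print(total)        # Vector2D(4, 6)
```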
Python has a powerful standard library. The core of the language contains only common types and functions such as numbers, strings, lists, dictionaries, and files, while the standard library provides additional functionality such as system administration, network communication, text processing, database interfaces, graphics systems, and XML processing.
The Python community provides a large number of third-party modules, which are used in much the same way as the standard library. Their functionality covers scientific computing, artificial intelligence, machine learning, web development, database interfaces, graphics systems, and many other fields.
Because Python is powerful, easy to use, and easy to get started with, we often hear people say, "Life is short, I use Python." In the October 2020 report from the research firm TIOBE, Python ranked third, as it had for two consecutive years. In the latest data for November 2020, Python overtook Java to take second place.
Choosing a suitable programming language is particularly important. Python is simple, easy to learn, fast, free, and open source. It lets you focus on how to solve the problem, and it offers a free and open community along with a rich set of third-party libraries, so there is no need to waste time reinventing the wheel: web frameworks, crawler frameworks, data analysis frameworks, and machine learning frameworks are all ready to use. In terms of popularity, Python has been on the rise.
To use Python for data analysis, there are two things to consider:
First: which development tools to choose.
Second: what knowledge is needed to solve data analysis problems.
For development tools, I recommend Anaconda. The software can be downloaded from Tsinghua University's open source mirror site (https://mirror.tuna.tsinghua.edu.cn/help/anaconda/). Download the installer that matches your computer's hardware and software environment. After installation, type jupyter notebook in the console to get started.
The official account has a detailed article on the Anaconda installation process:
"Python and Anaconda Installation", by Brother Dabin, official account Data Valley.
The Python knowledge points and common scientific computing libraries used in data analysis are listed below:
Basic syntax: variables, data types, conditionals, loops.
Data structures: sets, tuples, dictionaries.
Input and output.
Scientific computing libraries: NumPy, Pandas, Matplotlib, Seaborn.
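The first three items in the list above can be sketched with the standard library alone. The example below is only an illustration: the city and temperature values are made up, and an in-memory io.StringIO object stands in for a real CSV file on disk.

```python
import csv
import io

# Input: CSV text read from an in-memory file (a stand-in for a file on disk).
raw = io.StringIO("city,temp\nBeijing,21\nShanghai,25\nBeijing,19\n")
rows = list(csv.DictReader(raw))

cities = set()       # set: unique city names
temps_by_city = {}   # dictionary: city name -> list of temperatures

for row in rows:                                  # loop
    city, temp = row["city"], int(row["temp"])    # data types: str converted to int
    cities.add(city)
    temps_by_city.setdefault(city, []).append(temp)

summary = []         # list of tuples: (city, mean temperature, label)
for city in sorted(cities):
    temps = temps_by_city[city]
    label = "warm" if max(temps) >= 25 else "mild"   # conditional
    summary.append((city, sum(temps) / len(temps), label))

# Output: one formatted line per city.
for city, mean_temp, label in summary:
    print(f"{city}: mean {mean_temp:.1f} ({label})")
```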
Data analysis in Python is mainly about solving data cleaning and data visualization problems. Mastering the basic rules of Python syntax and learning to call third-party modules are both very important for improving your data analysis skills. NumPy and Pandas are the best tools for data cleaning, while Matplotlib and Seaborn are toolkits for data visualization. We can learn Python from a practical point of view to improve both the quality and the efficiency of our data analysis.
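As a small illustration of data cleaning, the sketch below assumes NumPy and Pandas are installed (both ship with Anaconda). The table is made-up sample data containing two typical problems, a duplicated row and a missing value.

```python
import numpy as np
import pandas as pd

# Made-up sample data: one duplicated row and one missing temperature.
raw = pd.DataFrame({
    "city": ["Beijing", "Shanghai", "Shanghai", "Guangzhou"],
    "temp": [21.0, 25.0, 25.0, np.nan],
})

clean = (
    raw.drop_duplicates()          # remove the repeated Shanghai row
       .dropna(subset=["temp"])    # drop rows that have no temperature
       .reset_index(drop=True)     # renumber the remaining rows from 0
)

mean_temp = clean["temp"].mean()

print(clean)
print(mean_temp)  # 23.0
```

With Matplotlib installed, a call such as clean.plot(x="city", y="temp", kind="bar") is one possible way to turn the cleaned table into a bar chart.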
This article comes from the WeChat official account Data Valley (BigDataValley). Author: Wooden Yi
Original publication date: 2020-11-09