The significance of performance testing

After finishing a Python project, we often need to think about optimizing the software's performance. To do that we need a plan, and the first step is to identify the bottlenecks in the code and its functions. Ideally we would have a tool that evaluates the performance of every line of a target function, so that we can find the worst-performing part of the code and optimize it specifically. The open-source library line_profiler does exactly this. Its source is at github.com/rkern/line_profiler. Let's look at how to install and use the tool.

Installing line_profiler

line_profiler can be installed from source or with pip. Here we only cover the pip route, which is easier; for a source install, please refer to the project's open-source repository.

[dechin@dechin-manjaro line_profiler]$ python3 -m pip install line_profiler
Collecting line_profiler
Downloading line_profiler-3.1.0-cp38-cp38-manylinux2010_x86_64.whl (65 kB)
|████████████████████████████████| 65 kB 221 kB/s
Requirement already satisfied: IPython in /home/dechin/anaconda3/lib/python3.8/site-packages (from line_profiler) (7.19.0)
Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (3.0.8)
Requirement already satisfied: backcall in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (0.2.0)
Requirement already satisfied: pexpect>4.3; sys_platform != "win32" in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (4.8.0)
Requirement already satisfied: setuptools>=18.5 in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (50.3.1.post20201107)
Requirement already satisfied: jedi>=0.10 in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (0.17.1)
Requirement already satisfied: decorator in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (4.4.2)
Requirement already satisfied: traitlets>=4.2 in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (5.0.5)
Requirement already satisfied: pygments in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (2.7.2)
Requirement already satisfied: pickleshare in /home/dechin/anaconda3/lib/python3.8/site-packages (from IPython->line_profiler) (0.7.5)
Requirement already satisfied: wcwidth in /home/dechin/anaconda3/lib/python3.8/site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->IPython->line_profiler) (0.2.5)
Requirement already satisfied: ptyprocess>=0.5 in /home/dechin/anaconda3/lib/python3.8/site-packages (from pexpect>4.3; sys_platform != "win32"->IPython->line_profiler) (0.6.0)
Requirement already satisfied: parso<0.8.0,>=0.7.0 in /home/dechin/anaconda3/lib/python3.8/site-packages (from jedi>=0.10->IPython->line_profiler) (0.7.0)
Requirement already satisfied: ipython-genutils in /home/dechin/anaconda3/lib/python3.8/site-packages (from traitlets>=4.2->IPython->line_profiler) (0.2.0)
Installing collected packages: line-profiler
Successfully installed line-profiler-3.1.0

As an aside, pip can be pointed at a different package index for a single install. Here we use the PyPI mirror provided by Tencent:

python3 -m pip install -i https://mirrors.cloud.tencent.com/pypi/simple line_profiler

To make a mirror permanent, edit the ~/.pip/pip.conf file. A reference example follows (using the Huawei Cloud mirror):

[global]
index-url = https://mirrors.huaweicloud.com/repository/pypi/simple
trusted-host = mirrors.huaweicloud.com
timeout = 120

Using line_profiler in the code to be optimized

Let's go straight to an example:

# line_profiler_test.py
from line_profiler import LineProfiler
import numpy as np

@profile
def test_profiler():
    for i in range(100):
        a = np.random.randn(100)
        b = np.random.randn(1000)
        c = np.random.randn(10000)
    return None

if __name__ == '__main__':
    test_profiler()

In this example we define a function to be tested, test_profiler, which contains several lines calling the numpy.random.randn routine that we want to analyze. To use the profiler, first import LineProfiler, then apply the decorator named profile to the function; that is all the configuration line_profiler needs. (In this workflow the profile name is supplied by kernprof when it runs the script, so launching the file with plain python would fail with a NameError.) For how Python decorators are used and how they work, see this blog post. Also note that line_profiler only analyzes the body of the decorated function: if that function calls other functions, the profiler does not descend into them, except for nested functions defined inside it.
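Because `profile` only exists while kernprof is driving the script, it can be convenient to keep the file runnable on its own as well. The sketch below shows one common workaround, assuming you want to run the file with plain python too: it falls back to a do-nothing decorator when `profile` has not been injected. The workload is a stand-in that uses the standard library's random module instead of numpy, and the name `test_profiler` simply mirrors the example above.

```python
import builtins
import random

# `kernprof -l` places `profile` in builtins; fall back to a no-op otherwise.
try:
    profile = builtins.profile
except AttributeError:
    def profile(func):
        """Do-nothing stand-in so the script also runs without kernprof."""
        return func


@profile
def test_profiler():
    # Stand-in workload (hypothetical): build lists of Gaussian samples.
    for _ in range(100):
        a = [random.gauss(0.0, 1.0) for _ in range(100)]
        b = [random.gauss(0.0, 1.0) for _ in range(1000)]
    return None


if __name__ == '__main__':
    test_profiler()
```

With this guard the same file works both under kernprof, where it is profiled, and under plain python, where the decorator leaves the function unchanged.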

A simple performance analysis with line_profiler

line_profiler is also easy to use. There are two main steps: first run the script through kernprof, then use python to display the analysis results.

  1. After defining the function to be analyzed, use kernprof to run the script and write the results to a binary lprof file:
[dechin-manjaro line_profiler]# kernprof -l line_profiler_test.py
Wrote profile results to line_profiler_test.py.lprof

After the command finishes, an lprof file appears in the current directory:

[dechin-manjaro line_profiler]# ll
total 8
-rw-r--r-- 1 dechin dechin 304 Jan 20 16:00 line_profiler_test.py
-rw-r--r-- 1 root root 185 Jan 20 16:00 line_profiler_test.py.lprof
  2. Use python3 to read the binary lprof file:
[dechin-manjaro line_profiler]# python3 -m line_profiler line_profiler_test.py.lprof
Timer unit: 1e-06 s

Total time: 0.022633 s
File: line_profiler_test.py
Function: test_profiler at line 5

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     5                                           @profile
     6                                           def test_profiler():
     7       101         40.0      0.4      0.2      for i in range(100):
     8       100        332.0      3.3      1.5          a = np.random.randn(100)
     9       100       2092.0     20.9      9.2          b = np.random.randn(1000)
    10       100      20169.0    201.7     89.1          c = np.random.randn(10000)
    11         1          0.0      0.0      0.0      return None

This gives us the line-by-line performance analysis directly. The columns are, in order: the line number in the source file, the number of times the line was executed, the total time spent on the line (in timer units), the time per execution, the share of the function's total time, and finally the code itself. That is really all there is to introducing line_profiler, but we would also like to examine what it can do through another, more practical case. Interested readers can read on.
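To make the relationship between the columns concrete, the short sketch below recomputes Per Hit and % Time from the Hits and Time values in the report above. It assumes that Per Hit is Time divided by Hits and that % Time is the line's share of the summed line times, which is consistent with the numbers shown. The helper name `derived_columns` is made up for this illustration.

```python
# (line number, hits, time in timer units) copied from the report above.
TIMER_UNIT = 1e-06  # seconds per timer unit, from the "Timer unit" header
rows = [
    (7, 101, 40.0),
    (8, 100, 332.0),
    (9, 100, 2092.0),
    (10, 100, 20169.0),
    (11, 1, 0.0),
]

# The function's total time is the sum of the per-line times.
total_time = sum(time for _, _, time in rows)


def derived_columns(hits, time):
    """Return (Per Hit, % Time) for one report row, rounded to one decimal."""
    per_hit = time / hits
    percent = 100.0 * time / total_time
    return round(per_hit, 1), round(percent, 1)


print(f"Total time: {total_time * TIMER_UNIT:.6f} s")
for line_no, hits, time in rows:
    per_hit, percent = derived_columns(hits, time)
    print(f"{line_no:>6} {hits:>6} {time:>10.1f} {per_hit:>8.1f} {percent:>8.1f}")
```

The recomputed values match the report, and the summed line times reproduce the 0.022633 s total in its header.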

Using line_profiler to compare the efficiency of the sine function sin across libraries

We want to test the sine function as implemented in several libraries, including one we build ourselves on top of Fortran's built-in SIN function.

Before demonstrating the line_profiler performance test, let's first see how to turn a Fortran f90 file into a dynamic link library that Python can call.

  1. First, install gfortran on the Manjaro Linux platform:
[dechin-manjaro line_profiler]# pacman -S gcc-fortran
Resolving dependencies ...
Looking for conflicting packages ...

Packages (1) gcc-fortran-10.2.0-4

Total Download Size:    9.44 MiB
Total Installed Size:  31.01 MiB

:: Proceed with installation? [Y/n] Y
:: Getting package ......
gcc-fortran-10.2.0-4-x86_64 9.4 MiB 6.70 MiB/s 00:01 [#######################################################################################] 100%
(1/1) Checking the key in the key ring [#######################################################################################] 100%
(1/1) Checking package integrity [#######################################################################################] 100%
(1/1) Loading package file [#######################################################################################] 100%
(1/1) Checking for file conflicts [#######################################################################################] 100%
(1/1) Checking available storage [#######################################################################################] 100%
:: Processing package changes ...
(1/1) Installing gcc-fortran [#######################################################################################] 100%
:: Running post transaction hook function ...
(1/2) Arming ConditionNeedsUpdate...
(2/2) Updating the info directory file...
  2. Create a simple Fortran file, fmath.f90, whose only job is to return the value of the sine function:
subroutine fsin(theta,result)
implicit none
real*8::theta
real*8,intent(out)::result
result=SIN(theta)
end subroutine
  3. Use f2py to compile the Fortran file into a dynamic link library named fmath:
[dechin-manjaro line_profiler]# f2py -c -m fmath fmath.f90
running build
running config_cc
unifing config_cc, config, build_clib, build_ext, build commands --compiler options
running config_fc
unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
running build_src
build_src
building extension "fmath" sources
f2py options: []
f2py:> /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fmathmodule.c
creating /tmp/tmpup5ia9lf/src.linux-x86_64-3.8
Reading fortran codes...
Reading file 'fmath.f90' (format:free)
Post-processing...
Block: fmath
Block: fsin
Post-processing (stage 2)...
Building modules...
Building module "fmath"...
Constructing wrapper function "fsin"...
result = fsin(theta)
Wrote C/API module "fmath" to file "/tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fmathmodule.c"
adding '/tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fortranobject.c' to sources.
adding '/tmp/tmpup5ia9lf/src.linux-x86_64-3.8' to include_dirs.
copying /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/f2py/src/fortranobject.c -> /tmp/tmpup5ia9lf/src.linux-x86_64-3.8
copying /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/f2py/src/fortranobject.h -> /tmp/tmpup5ia9lf/src.linux-x86_64-3.8
build_src: building npy-pkg config files
running build_ext
customize UnixCCompiler
customize UnixCCompiler using build_ext
get_default_fcompiler: matching types: '['gnu95', 'intel', 'lahey', 'pg', 'absoft', 'nag', 'vast', 'compaq', 'intele', 'intelem', 'gnu', 'g95', 'pathf95', 'nagfor']'
customize Gnu95FCompiler
Found executable /usr/bin/gfortran
customize Gnu95FCompiler
customize Gnu95FCompiler using build_ext
building 'fmath' extension
compiling C sources
C compiler: gcc -pthread -B /home/dechin/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC

creating /tmp/tmpup5ia9lf/tmp
creating /tmp/tmpup5ia9lf/tmp/tmpup5ia9lf
creating /tmp/tmpup5ia9lf/tmp/tmpup5ia9lf/src.linux-x86_64-3.8
compile options: '-I/tmp/tmpup5ia9lf/src.linux-x86_64-3.8 -I/home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include -I/home/dechin/anaconda3/include/python3.8 -c'
gcc: /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fmathmodule.c
gcc: /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fortranobject.c
In file included from /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822,
from /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
from /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fortranobject.h:13,
from /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fmathmodule.c:15:
/home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
17 | #warning "Using deprecated NumPy API, disable it with " \
| ^~~~~~~
In file included from /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822,
from /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
from /home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fortranobject.h:13,
from /tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fortranobject.c:2:
/home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
17 | #warning "Using deprecated NumPy API, disable it with " \
| ^~~~~~~
compiling Fortran sources
Fortran f77 compiler: /usr/bin/gfortran -Wall -g -ffixed-form -fno-second-underscore -fPIC -O3 -funroll-loops
Fortran f90 compiler: /usr/bin/gfortran -Wall -g -fno-second-underscore -fPIC -O3 -funroll-loops
Fortran fix compiler: /usr/bin/gfortran -Wall -g -ffixed-form -fno-second-underscore -Wall -g -fno-second-underscore -fPIC -O3 -funroll-loops
compile options: '-I/tmp/tmpup5ia9lf/src.linux-x86_64-3.8 -I/home/dechin/anaconda3/lib/python3.8/site-packages/numpy/core/include -I/home/dechin/anaconda3/include/python3.8 -c'
gfortran:f90: fmath.f90
/usr/bin/gfortran -Wall -g -Wall -g -shared /tmp/tmpup5ia9lf/tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fmathmodule.o /tmp/tmpup5ia9lf/tmp/tmpup5ia9lf/src.linux-x86_64-3.8/fortranobject.o /tmp/tmpup5ia9lf/fmath.o -L/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/../../../../lib -L/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/../../../../lib -lgfortran -o ./fmath.cpython-38-x86_64-linux-gnu.so
Removing build directory /tmp/tmpup5ia9lf

A few warnings appear along the way, but they do not affect normal use. After compilation you can see a .so file in the current directory (on Windows it may be a different type of dynamic link library file):

[dechin-manjaro line_profiler]# ll
total 120
-rwxr-xr-x 1 root root 107256 Jan 20 16:40 fmath.cpython-38-x86_64-linux-gnu.so
-rw-r--r-- 1 root root 150 Jan 20 16:40 fmath.f90
-rw-r--r-- 1 dechin dechin 304 Jan 20 16:00 line_profiler_test.py
-rw-r--r-- 1 root root 185 Jan 20 16:00 line_profiler_test.py.lprof
  4. Test the dynamic link library with ipython:
[dechin-manjaro line_profiler]# ipython
Python 3.8.5 (default, Sep 4 2020, 07:30:14)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.19.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from fmath import fsin

In [2]: print (fsin(3.14))
0.0015926529164868282

In [3]: print (fsin(3.1415926))
5.3589793170057245e-08

This shows that the Fortran-based sine function works. Next, let's formally compare the performance of several sine implementations (some of them may well share the same underlying implementation; for this performance test we treat them as black boxes).

First, we again create a Python file to be tested, sin_profiler_test.py:
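Before timing the implementations against each other, it can be worth confirming that they agree numerically. The sketch below is one way to do that, assuming math.sin as the reference and sampling inputs in [0, 1) as the later benchmark does. The helper `max_abs_diff` is a name invented for this illustration, and the fmath module is only checked if it has been built and is importable.

```python
import cmath
import math
import random


def max_abs_diff(func, reference=math.sin, samples=1000, seed=0):
    """Largest absolute difference between func and reference on random inputs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x = rng.random()
        # abs() also handles the complex values returned by cmath.sin.
        worst = max(worst, abs(func(x) - reference(x)))
    return worst


candidates = {"cmath.sin": cmath.sin}
try:
    # Only available if the f2py build described above succeeded.
    from fmath import fsin
    candidates["fmath.fsin"] = fsin
except ImportError:
    pass

for name, func in candidates.items():
    print(f"{name}: max |difference| = {max_abs_diff(func):.3e}")
```

A difference at or near zero suggests the implementations are interchangeable for these inputs, so any timing gap comes from call overhead and not from differing accuracy.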

# sin_profiler_test.py
from line_profiler import LineProfiler
import random
from numpy import sin as numpy_sin
from math import sin as math_sin
# from cupy import sin as cupy_sin
from cmath import sin as cmath_sin
from fmath import fsin as fortran_sin

@profile
def test_profiler():
    for i in range(100000):
        r = random.random()
        a = numpy_sin(r)
        b = math_sin(r)
        # c = cupy_sin(r)
        d = cmath_sin(r)
        e = fortran_sin(r)
    return None

if __name__ == '__main__':
    test_profiler()

Here line_profiler is set up exactly as in the previous example. The main test subjects are the sine implementations of the open-source libraries numpy, math, and cmath, plus our own Fortran sine function, which plugs into Python seamlessly through the f2py-built dynamic link library described above. The cupy library could not be installed successfully, so it cannot be tested for now and is commented out. As before, we then run the script through kernprof:

[dechin-manjaro line_profiler]# kernprof -l sin_profiler_test.py
Wrote profile results to sin_profiler_test.py.lprof

Finally, display the results with python3:

[dechin-manjaro line_profiler]# python3 -m line_profiler sin_profiler_test.py.lprof
Timer unit: 1e-06 s

Total time: 0.261304 s
File: sin_profiler_test.py
Function: test_profiler at line 10

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    10                                           @profile
    11                                           def test_profiler():
    12    100001      28032.0      0.3     10.7      for i in range(100000):
    13    100000      33995.0      0.3     13.0          r = random.random()
    14    100000      86870.0      0.9     33.2          a = numpy_sin(r)
    15    100000      33374.0      0.3     12.8          b = math_sin(r)
    16                                                   # c = cupy_sin(r)
    17    100000      40179.0      0.4     15.4          d = cmath_sin(r)
    18    100000      38854.0      0.4     14.9          e = fortran_sin(r)
    19         1          0.0      0.0      0.0      return None
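As a quick way to read relative speed off the Time column of this report, the sketch below sorts the four sine calls by total time and expresses each as a multiple of the fastest. The numbers are copied from the report; the dictionary keys are just labels chosen for this illustration.

```python
# Total time per line, in timer units (microseconds here), read from the report.
profile_time = {
    "numpy": 86870.0,
    "math": 33374.0,
    "cmath": 40179.0,
    "fortran": 38854.0,
}

fastest = min(profile_time.values())
# Labels ordered from least to most total time.
ranking = sorted(profile_time, key=profile_time.get)

for name in ranking:
    ratio = profile_time[name] / fastest
    print(f"{name:<8} {profile_time[name]:>9.1f}  {ratio:.2f}x the fastest")

numpy_vs_fortran = profile_time["numpy"] / profile_time["fortran"]
print(f"numpy / fortran = {numpy_vs_fortran:.2f}")
```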

From this result we can see that, of the four libraries tested, math is the most efficient and numpy the least. The Fortran interface function we wrote ourselves is even about twice as fast as numpy (38854 versus 86870 microseconds in total), second only to the math implementation. Since this test only involves the performance of single function calls, we can also measure them with IPython's built-in timeit:

[dechin-manjaro line_profiler]# ipython
Python 3.8.5 (default, Sep 4 2020, 07:30:14)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.19.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from fmath import fsin

In [2]: import random

In [3]: %timeit fsin(random.random())
145 ns ± 2.38 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [4]: from math import sin as math_sin

In [5]: %timeit math_sin(random.random())
107 ns ± 0.116 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [6]: from numpy import sin as numpy_sin

In [7]: %timeit numpy_sin(random.random())
611 ns ± 4.28 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [8]: from cmath import sin as cmath_sin

In [9]: %timeit cmath_sin(random.random())
151 ns ± 1.01 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Here the ranking is consistent with the previous result. However, because each measurement times the random module call together with the sine call, the absolute time values differ somewhat from those in the profile.
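One way to estimate how much of each measurement is due to random.random() is to time that call on its own and subtract it. The sketch below does this with the standard library's timeit module instead of the IPython magic. It only covers math and cmath so that it runs without numpy or the compiled fmath module, and `per_call_ns` is a helper name invented here. The subtraction is a rough estimate, not an exact decomposition.

```python
import timeit


def per_call_ns(stmt, setup="pass", number=200_000, repeat=5):
    """Best-of-`repeat` time for one execution of `stmt`, in nanoseconds."""
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number * 1e9


setup = (
    "import random; "
    "from math import sin as math_sin; "
    "from cmath import sin as cmath_sin"
)

# Cost of generating the random argument alone.
baseline = per_call_ns("random.random()", setup)
print(f"random.random(): {baseline:.0f} ns")

for name in ("math_sin", "cmath_sin"):
    total = per_call_ns(f"{name}(random.random())", setup)
    print(f"{name}: {total:.0f} ns total, "
          f"about {total - baseline:.0f} ns without random.random()")
```

Absolute numbers will vary by machine and Python version, so only the relative ordering should be compared with the results above.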

Summary

This article introduced line_profiler, a line-by-line performance analysis tool for Python. With a simple decorator we can locate a program's performance bottleneck and then optimize it specifically. Along the way, the tests also showed that different implementations of the sine function differ in performance; the difference just goes unnoticed when the function is called infrequently. It is worth knowing that even the sine function has many possible implementations, series expansion being one example, and that the most popular and highest-performing approach today is in fact still table lookup. Different algorithms and different language implementations can therefore give quite different performance results. In these tests, the measured ranking is math < fortran < cmath < numpy, with running time increasing from left to right.
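To illustrate the two implementation styles mentioned above, here is a minimal sketch of a series-expansion sine and a table-lookup sine with linear interpolation. Both are teaching examples written for this article, not the algorithms actually used by math, numpy, or gfortran. The names `taylor_sin` and `table_sin`, the number of terms, and the table size are arbitrary choices.

```python
import math


def taylor_sin(x, terms=12):
    """Sine via the Taylor series x - x^3/3! + x^5/5! - ..."""
    # Reduce the argument to [-pi, pi] so the series converges quickly.
    x = math.remainder(x, 2.0 * math.pi)
    term = x
    total = x
    for n in range(1, terms):
        # Each term is the previous one times -x^2 / ((2n)(2n+1)).
        term *= -x * x / ((2 * n) * (2 * n + 1))
        total += term
    return total


TABLE_SIZE = 1024
STEP = 2.0 * math.pi / TABLE_SIZE
# One extra entry so that interpolation at the last interval stays in range.
TABLE = [math.sin(i * STEP) for i in range(TABLE_SIZE + 1)]


def table_sin(x):
    """Sine via a precomputed table with linear interpolation."""
    pos = (x % (2.0 * math.pi)) / STEP
    index = min(int(pos), TABLE_SIZE - 1)  # guard against float rounding
    frac = pos - index
    return TABLE[index] + frac * (TABLE[index + 1] - TABLE[index])


for x in (0.5, 3.14, -7.5):
    print(f"x={x}: math={math.sin(x):.10f} "
          f"taylor={taylor_sin(x):.10f} table={table_sin(x):.10f}")
```

The series version trades more arithmetic for accuracy, while the table version does very little arithmetic per call but is only as accurate as its table spacing allows. That is the kind of trade-off behind the performance differences discussed here.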

Copyright notice

This article was first published at: https://www.cnblogs.com/dechinphy/p/line-profiler.html

Author ID: DechinPhy

For more original articles, see: https://www.cnblogs.com/dechinphy/
