In machine learning, we often use classes and functions to define the parts of a model: a function to read data, functions to preprocess it, functions for the model architecture and the training process, and so on. So what makes a function elegant, and what does good code look like? In this article, Jeff Knupp discusses how to write great functions, covering everything from naming to function length.
As in most modern programming languages, functions in Python are one of the basic means of abstraction and encapsulation. You have probably written hundreds of functions as a developer, but not all functions are created equal. Writing "bad" functions directly hurts the readability and maintainability of your code. So what is a "bad" function? More importantly, how do you write a "good" one?
Brief review
Mathematics is full of functions, although we may not remember them. Let's think back to everyone's favorite topic: calculus. You may remember an equation like f(x) = 2x + 3. This is a function called "f" that takes an argument x and "returns" 2*x + 3. It may not look much like the functions we write in Python, but the basic idea is the same as that of a function in a programming language.
Functions have a long history in mathematics, but they are even more powerful in computer science. Even so, functions have their pitfalls. Next, we'll discuss what makes a function "good" and which warning signs indicate that a function needs refactoring.
Keys to a Good Function
What distinguishes a good Python function from a lousy one? You would be surprised how many definitions of "good" there are. For our purposes, I'll define a good Python function as one that follows most of the rules in the list below (some are harder to satisfy than others):
Is sensibly named
Has a single responsibility
Includes a docstring
Returns a value
Is no longer than 50 lines
Is idempotent and, if possible, pure
To many people, this list may seem overly strict. But I promise that if your functions follow these rules, your code will be a pleasure to read. I'll go through the rules one by one and then sum up how they combine to produce "good" functions.
Naming
My favorite quote on this subject (from Phil Karlton, often misattributed to Donald Knuth) is:
There are only two hard things in computer science: cache invalidation and naming things.
It sounds a bit silly, but good naming really is hard. Here is a badly named function:
def get_knn(from_df):
I've seen bad names almost everywhere, but this example comes from data science (or, if you prefer, machine learning), where practitioners typically write code in a Jupyter notebook and then try to turn the various cells into a comprehensible program.
The first problem with this name is its use of acronyms and abbreviations. Full English words are better than abbreviations and little-known acronyms. The only reason to abbreviate is to save typing, but modern editors have autocompletion, so you only have to type the full name once. Abbreviations are a problem because they tend to be domain specific. In the code above, knn refers to "K-Nearest Neighbors" and df refers to "DataFrame", the ubiquitous pandas data structure. A programmer who is unfamiliar with these abbreviations will be confused when reading the code.
There are two other minor problems with the function name. First, the word "get" is superfluous: for most well-named functions it is obvious that something is returned, and the name reflects that. Second, from_df is also unnecessary: if the parameter name does not describe it clearly enough, the function's docstring or a type annotation will describe the parameter's type.
So how should we rename this function? For example:
def k_nearest_neighbors(dataframe):
Now even a layperson knows what this function computes, and the parameter name (dataframe) makes clear what kind of argument should be passed.
Single Responsibility Principle
The "Single Responsibility Principle" comes from a book by "Uncle" Bob Martin, and it applies to functions just as well as to classes and modules (Martin's original targets). The principle states that a function should have a single responsibility; in other words, it should do one thing and only one thing. One big reason is that if each function does only one thing, the only time a function needs to change is when the way it does that thing must change. It also becomes clear when a function can be deleted: if changes elsewhere mean its single responsibility is no longer needed, simply remove it.
An example will help. Here is a function that does more than one "thing":
import statistics

def calculate_and_print_stats(list_of_numbers):
    # Named `total` so the built-in sum() is not shadowed.
    total = sum(list_of_numbers)
    mean = statistics.mean(list_of_numbers)
    median = statistics.median(list_of_numbers)
    mode = statistics.mode(list_of_numbers)
    print('-----------------Stats-----------------')
    print('SUM: {}'.format(total))
    print('MEAN: {}'.format(mean))
    print('MEDIAN: {}'.format(median))
    print('MODE: {}'.format(mode))
This function does two things: it calculates a set of statistics about a list of numbers, and it prints them to STDOUT. It therefore violates the rule that a function should have only one reason to change. There are two obvious reasons this function could change: new or different statistics may need to be calculated, or the output format may need to change. It is better written as two separate functions: one that performs the calculations and returns the results, and another that takes those results and prints them. A dead giveaway that a function has more than one responsibility is the word "and" in its name.
This separation also makes the function's behavior much easier to test. The two parts need not just be separate functions in one module; where appropriate, they can live in different modules. That, too, leads to cleaner tests and easier maintenance.
Functions that do exactly two things are actually rare. Far more often, a function is responsible for many, many tasks. Again, for the sake of readability and testability, such "jack-of-all-trades" functions should be broken up into smaller functions, each responsible for a single task.
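One way the split described above might look is sketched here. The names calculate_stats and format_stats, and the choice of a dict to carry the results, are illustrative assumptions rather than anything prescribed by the article.

```python
import statistics


def calculate_stats(list_of_numbers):
    """Return the sum, mean, median, and mode of the numbers as a dict."""
    return {
        'sum': sum(list_of_numbers),
        'mean': statistics.mean(list_of_numbers),
        'median': statistics.median(list_of_numbers),
        'mode': statistics.mode(list_of_numbers),
    }


def format_stats(stats):
    """Return a printable report for the dict produced by calculate_stats."""
    lines = ['-----------------Stats-----------------']
    for name in ('sum', 'mean', 'median', 'mode'):
        lines.append('{}: {}'.format(name.upper(), stats[name]))
    return '\n'.join(lines)
```

A caller that wants the old behavior can write print(format_stats(calculate_stats(numbers))). Each function now has one reason to change: calculate_stats when the statistics change, format_stats when the layout changes.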
Docstrings
Many Python developers know about PEP-8, which defines the style guide for Python code, but far fewer know PEP-257, which defines the conventions for docstrings. I won't go into PEP-257 in detail here; readers can consult the guide itself for the docstring conventions it prescribes.
PEP-8: https://www.python.org/dev/peps/pep-0008/
PEP-257: https://www.python.org/dev/peps/pep-0257/
First of all, a docstring is the string literal that appears as the first statement in the definition of a module, function, class, or method. It should clearly describe what the function does, its input parameters, its return value, and so on. The main points of PEP-257 are:
Every function needs a docstring;
Use proper grammar and punctuation, and write in complete sentences;
Begin with a one-sentence summary of what the function does;
Use prescriptive rather than descriptive language.
These rules are easy to follow when writing a function. We just need to get into the habit of writing docstrings, and of writing them before the function body. If you cannot clearly describe what a function does, you need to think more about why you are writing it.
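As a rough illustration of the points above, here is a docstring written in that spirit. The function mean_of is a made-up example, not taken from the article or from PEP-257; note the one-sentence summary and the prescriptive "Return ..." rather than the descriptive "Returns ...".

```python
def mean_of(numbers):
    """Return the arithmetic mean of *numbers*.

    *numbers* must be a non-empty sequence of ints or floats.
    Raise ValueError if the sequence is empty.
    """
    if not numbers:
        raise ValueError('numbers must not be empty')
    return sum(numbers) / len(numbers)
```

Writing the docstring first forces a decision about the empty-sequence case before any code exists, which is exactly the kind of thinking the habit is meant to encourage.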
Return value
Functions can and should be thought of as small self-contained programs. They take some input in the form of parameters and return some output. Parameters are, of course, optional. Return values, however, are not optional as far as Python's internals are concerned. Even if you try to write a function that doesn't return a value, you cannot: the Python interpreter makes any function without a return statement return None. Skeptical readers can test this with the following code:
$ python3
Python 3.7.0 (default, Jul 23 2018, 20:22:55)
[Clang 9.1.0 (clang-902.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def add(a, b):
...     print(a + b)
...
>>> b = add(1, 2)
3
>>> b
>>> b is None
True
Run the code above and you'll see that the value of b really is None. So even a function with no return statement still returns something. And it should return something, because a function is also a small program. How useful is a program that produces no output, and how would we test it?
I would even go so far as to say that every function should return a useful value, even if that value is only used for testing. The code we write should be tested, and a function with no return value is hard to test; the function above would require redirecting I/O in order to be tested. In addition, returning a value makes method chaining possible, as the following code shows:
with open('foo.txt', 'r') as input_file:
    for line in input_file:
        if line.strip().lower().endswith('cat'):
            # ... do something useful with these lines
            pass
The line if line.strip().lower().endswith('cat'): works because each string method returns a value as the result of the call: strip() and lower() return strings, and endswith() returns a boolean.
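To make the earlier testing point concrete, the sketch below compares testing a function that only prints with testing one that returns its result. The name print_sum is invented for the comparison; contextlib.redirect_stdout and io.StringIO are standard-library tools, used here as one possible way to capture output.

```python
import contextlib
import io


def print_sum(a, b):
    """Print a + b; the caller gets None back."""
    print(a + b)


def add(a, b):
    """Return a + b."""
    return a + b


# Testing the printing version means capturing standard output first.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    result = print_sum(1, 2)
assert result is None
assert buffer.getvalue() == '3\n'

# Testing the returning version is a single comparison.
assert add(1, 2) == 3
```

The second test is shorter, and it checks the value itself rather than its textual rendering.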
Here are some common reasons people give when asked why they wrote a function that doesn't return a value:
"All the function does is something like an I/O operation, such as saving a value to a database. It has no useful output to return."
I disagree: the function can return True when the operation completes successfully.
"I need to return multiple values, and no single value I could return would make sense."
You can, of course, return a tuple containing multiple values. In short, returning a value from a function is a good idea even in an existing code base, and it is unlikely to break anything.
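Both answers can be sketched in a few lines. Everything here is hypothetical: save_setting uses a plain dict as a stand-in for a database, and value_range is an invented example of a function with two natural results.

```python
def save_setting(settings, key, value):
    """Store *value* under *key* in *settings* and return True on success.

    Return False, and store nothing, if *key* is not a non-empty string.
    """
    if not isinstance(key, str) or not key:
        return False
    settings[key] = value
    return True


def value_range(numbers):
    """Return the smallest and largest of *numbers* as a (low, high) tuple."""
    return min(numbers), max(numbers)
```

A caller can unpack the tuple directly, for example low, high = value_range([3, 1, 2]), and a test can assert on the boolean from save_setting without inspecting the store at all.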
Function length
The length of a function directly affects readability, and therefore maintainability. So keep your functions short. To me, 50 lines is a reasonable upper limit.
If a function follows the Single Responsibility Principle, it will usually be quite short. If it is pure or idempotent (discussed below), it will likely be short as well. These ideas work together to produce concise code.
So what do you do if a function is too long? Refactor! Refactoring is probably something you do all the time when writing code, even if you aren't familiar with the term. It simply means changing a program's structure without changing its behavior. Extracting a few lines from a long function and turning them into a function of their own is therefore a kind of refactoring. It is also the fastest and most common way to shorten a long function. As long as the new functions are named appropriately, the code becomes easier to read.
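A small sketch of this kind of extraction follows. Imagine that average_score once did its parsing and averaging inline; the two helper functions are the extracted pieces. All three names and the one-score-per-line input format are assumptions made for the example.

```python
def parse_scores(lines):
    """Return the integer scores found in *lines*, skipping blank lines."""
    return [int(line) for line in lines if line.strip()]


def average(scores):
    """Return the mean of *scores*, or 0.0 if there are none."""
    return sum(scores) / len(scores) if scores else 0.0


def average_score(lines):
    """Return the average of the scores listed one per line in *lines*."""
    return average(parse_scores(lines))
```

The behavior of average_score is the same as it would be with the code inlined, but each step now has a name and can be tested separately.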
Idempotency and function purity
An idempotent function returns the same value given the same set of arguments, no matter how many times it is called. Its result does not depend on nonlocal variables, the mutability of its arguments, or data from any I/O stream. The following add_three(number) function is idempotent:
def add_three(number):
    """Return *number* + 3."""
    return number + 3
No matter when add_three(7) is called, it returns 10. Here is an example of a function that is not idempotent:
def add_three():
    """Return 3 + the number entered by the user."""
    number = int(input('Enter a number: '))
    return number + 3
This function is not idempotent because its return value depends on I/O, namely the number entered by the user. Each call may return a different value. If it is called twice, the user might enter 3 the first time and 7 the second, so that the two calls to add_three() return 6 and 10.
Why Is Idempotency Important?
Testability and maintainability. Idempotent functions are easy to test because they return the same result for the same arguments. Testing amounts to checking that the values returned by various calls to the function are as expected. These tests are also fast, which matters a great deal in unit testing but is often overlooked. Refactoring around idempotent functions is simple too: no matter how you change the code outside the function, calling it with the same arguments returns the same value.
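A test for the idempotent add_three(number) can be as small as the sketch below. The test function name and the use of bare assert statements (rather than any particular test framework) are choices made for this illustration.

```python
def add_three(number):
    """Return *number* + 3."""
    return number + 3


def test_add_three():
    """Check that repeated calls with the same argument agree."""
    results = [add_three(7) for _ in range(5)]
    assert results == [10] * 5
    assert add_three(-3) == 0
    return True


test_add_three()
```

No setup, no captured input, and no cleanup are needed, which is why tests like this run quickly.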
What Is a "Pure" Function?
In functional programming, a function is pure if it is idempotent and has no observable side effects. Remember that an idempotent function always returns the same result for a given set of arguments and uses no external factors to compute that result. This does not mean, however, that an idempotent function cannot affect nonlocal variables, I/O streams, and the like. For example, if the idempotent version of add_three(number) printed the result before returning it, it would still be idempotent, because although it accesses an I/O stream, that access has no effect on the return value. The call to print() is a side effect: an interaction with the rest of the program or system other than returning a value.
Let's extend the add_three(number) example. The following snippet keeps track of how many times add_three(number) has been called:
add_three_calls = 0

def add_three(number):
    """Return *number* + 3."""
    global add_three_calls
    print(f'Returning {number + 3}')
    add_three_calls += 1
    return number + 3

def num_calls():
    """Return the number of times *add_three* was called."""
    return add_three_calls
We now print to the console (one side effect) and modify a nonlocal variable (another side effect), but because neither side effect influences the return value, the function is still idempotent.
A pure function has no side effects. Not only does it avoid using any "outside data" to compute its value, it also does not interact with the rest of the system or program in any way other than computing and returning that value. So although our new definition of add_three(number) is still idempotent, it is no longer pure.
Pure functions contain no logging statements or print() calls, make no database or internet connections, and do not access or modify nonlocal variables. They do not call any other impure functions.
In short, pure functions are incapable of what Einstein called "spooky action at a distance" (in a computer science setting). They do not modify the rest of the program or system in any way. In imperative programming (which is what you are doing when you write Python code), they are the safest functions of all. They are very easy to test and maintain, even more so than functions that are merely idempotent. Testing a pure function takes barely longer than executing it. The tests are simple, too: there are no database connections or other external resources, no setup code is required, and there is nothing to clean up afterwards.
Clearly, idempotency and purity are aspirations rather than requirements. That is, we would like to write pure or idempotent functions because of the benefits above, but it is not always possible. The key is that we start to think instinctively about removing side effects and external dependencies as we write code. That makes every line we write easier to test, even when the result is not a pure or idempotent function.
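One common way to act on that instinct is to keep the computation pure and push the I/O out to a thin outer layer. The sketch below is an assumption about how that might look for the add_three example; read_number, its read parameter, and main are invented names, and the injectable read argument is just one possible way to make the impure part testable.

```python
def add_three(number):
    """Return *number* + 3."""
    return number + 3


def read_number(prompt, read=input):
    """Return the integer obtained by calling *read* with *prompt*.

    *read* defaults to the built-in input(); tests can pass a stand-in.
    """
    return int(read(prompt))


def main():
    """Read a number from the user, print it plus three, and return it."""
    result = add_three(read_number('Enter a number: '))
    print(result)
    return result
```

Calling main() runs the interactive version. The pure add_three can be tested with no setup at all, and read_number can be tested by passing a substitute for input, such as a lambda that returns a fixed string.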
Summary
The secret to writing good functions is no longer a secret. It comes down to following a few well-established best practices and rules of thumb. I hope this article helps.
Original article:
https://hackernoon.com/write-better-python-functions-c3a9a36382a6