Compared with the commonly used os.path, pathlib handles directory and file paths more concisely and in a more Pythonic way. It does more than simplify path operations, though.
pathlib is part of the Python standard library. The Python documentation describes it as: "The pathlib module – object-oriented filesystem paths".
pathlib provides classes that represent filesystem paths, with semantics appropriate for different operating systems.

For more details, see the official documentation:
https://docs.python.org/3/library/pathlib.html#methods
1. Basic use of the Path class
The following shows how to get the file name, the file name without its suffix, the file suffix, a sequence of all parent directories, and so on.
from pathlib import Path

path = r'D:\python\pycharm2020\program\pathlib Basic use of modules .py'
p = Path(path)
print(p.name)     # file name
print(p.stem)     # file name without the suffix
print(p.suffix)   # file suffix
print(p.parent)   # equivalent to os.path.dirname
print(p.parent.parent.parent)
print(p.parents)  # a sequence of all parent directories
for i in p.parents:
    print(i)
print(p.parts)    # the path split into a tuple of components
The output is as follows:
pathlib Basic use of modules .py
pathlib Basic use of modules
.py
D:\python\pycharm2020\program
D:\python
<WindowsPath.parents>
D:\python\pycharm2020\program
D:\python\pycharm2020
D:\python
D:\
('D:\\', 'python', 'pycharm2020', 'program', 'pathlib Basic use of modules .py')
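The same attributes can be tried on any operating system with PureWindowsPath, which parses Windows-style paths without touching the disk. This is a minimal sketch; the file name demo.tar.gz is a hypothetical example, and suffixes is an extra attribute not used above.

```python
from pathlib import PureWindowsPath

# PureWindowsPath only parses the string; nothing has to exist on disk.
# The file name below is a hypothetical example.
p = PureWindowsPath(r'D:\python\pycharm2020\program\demo.tar.gz')

name = p.name              # full file name
stem = p.stem              # file name with only the last suffix removed
suffix = p.suffix          # last suffix
suffixes = p.suffixes      # every suffix, in order
parent = p.parent          # the containing directory
parents = list(p.parents)  # all ancestors, nearest first
parts = p.parts            # path components as a tuple

print(name, stem, suffix, suffixes)
print(parent)
for ancestor in parents:
    print(ancestor)
print(parts)
```

Note that stem strips only the final suffix, so a name with two suffixes keeps the first one.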
- Path.cwd(): returns a path object representing the current working directory
- Path.home(): returns a path object representing the user's home directory
- Path.expanduser(): returns a new path with the ~ and ~user constructs expanded
from pathlib import Path

path_1 = Path.cwd()   # current working directory
path_2 = Path.home()  # user's home directory
p1 = Path('~/pathlib Basic use of modules .py')
print(path_1)
print(path_2)
print(p1.expanduser())
The output is as follows:
D:\python\pycharm2020\program
C:\Users\Administrator
C:\Users\Administrator\pathlib Basic use of modules .py
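As a small sketch of how these three calls relate: expanding a path that starts with ~ should give the same result as joining onto Path.home(). The file name notes.txt is hypothetical and does not need to exist.

```python
from pathlib import Path

cwd = Path.cwd()      # current working directory (not necessarily the script's folder)
home = Path.home()    # the current user's home directory

# expanduser() replaces a leading ~ with the home directory.
# 'notes.txt' is a hypothetical file name; it does not have to exist.
expanded = Path('~/notes.txt').expanduser()

print(cwd)
print(home)
print(expanded)
print(expanded == home / 'notes.txt')
```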
Path.stat() returns an os.stat_result object containing information about this path.
from pathlib import Path
import datetime

p = Path('pathlib Basic use of modules .py')
print(p.stat())           # full file details
print(p.stat().st_size)   # file size in bytes
print(p.stat().st_ctime)  # creation time on Windows; last metadata change on Unix
print(p.stat().st_mtime)  # last modification time
create_time = datetime.datetime.fromtimestamp(p.stat().st_ctime)
st_mtime = datetime.datetime.fromtimestamp(p.stat().st_mtime)
print(f'The file was created on: {create_time}')
print(f'The file was last modified on: {st_mtime}')
The output is as follows:
os.stat_result(st_mode=33206, st_ino=3659174698076635, st_dev=3730828260, st_nlink=1, st_uid=0, st_gid=0, st_size=543, st_atime=1597366826, st_mtime=1597366826, st_ctime=1597320585)
543
1597320585.7657475
1597366826.9711637
The file was created on: 2020-08-13 20:09:45.765748
The file was last modified on: 2020-08-14 09:00:26.971164
The timestamps returned by the various .stat().st_* attributes are the number of seconds since 1 January 1970 (the epoch). datetime.fromtimestamp converts them into a readable date and time.
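A self-contained sketch of the same idea, assuming a throwaway file in a temporary directory (sample.txt is a hypothetical name, and write_bytes is used here only to create it). It calls stat() once and reuses the result instead of calling it for every attribute.

```python
import datetime
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # 'sample.txt' is a hypothetical file created only for this demo.
    sample = Path(tmp) / 'sample.txt'
    sample.write_bytes(b'hello pathlib')

    info = sample.stat()     # one stat() call, reused below
    size = info.st_size      # size in bytes
    modified = datetime.datetime.fromtimestamp(info.st_mtime)

print(f'size: {size} bytes')
print(f'last modified: {modified:%Y-%m-%d %H:%M:%S}')
```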
- Path.exists(): whether the path points to an existing file or directory
- Path.resolve(strict=False): makes the path absolute, resolving any symbolic links, and returns a new path object
from pathlib import Path

p1 = Path('pathlib Basic use of modules .py')  # file
p2 = Path(r'D:\python\pycharm2020\program')    # folder
absolute_path = p1.resolve()
print(absolute_path)
print(Path('.').exists())
print(p1.exists(), p2.exists())
print(p1.is_file(), p2.is_file())
print(p1.is_dir(), p2.is_dir())
print(Path('/python').exists())
print(Path('non_existent_file').exists())
The output is as follows:
D:\python\pycharm2020\program\pathlib Basic use of modules .py
True
True True
True False
False True
True
False
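The checks above depend on files that exist only on the author's machine. A portable sketch, assuming a temporary directory with one hypothetical file (demo.txt), might look like this:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    file_path = folder / 'demo.txt'     # hypothetical file name
    file_path.touch()                   # create an empty file
    missing = folder / 'no_such_file'   # never created

    # (exists, is_file, is_dir) for each kind of path
    checks = {
        'folder': (folder.exists(), folder.is_file(), folder.is_dir()),
        'file': (file_path.exists(), file_path.is_file(), file_path.is_dir()),
        'missing': (missing.exists(), missing.is_file(), missing.is_dir()),
    }

    # resolve() returns an absolute path with '..' and symlinks removed
    detour = folder / '..' / folder.name / 'demo.txt'
    same_file = detour.resolve() == file_path.resolve()

for label, (exists, is_file, is_dir) in checks.items():
    print(f'{label}: exists={exists}, is_file={is_file}, is_dir={is_dir}')
print(same_file)
```

A path that does not exist reports False for all three checks rather than raising an exception.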
- Path.iterdir(): when the path points to a directory, yields path objects for the directory's contents
from pathlib import Path

p = Path('/python')  # list the entries in the python directory
for child in p.iterdir():
    print(child)
The output is as follows:
\python\Anaconda
\python\EVCapture
\python\Evernote_6.21.3.2048.exe
\python\Notepad++
\python\pycharm-community-2020.1.3.exe
\python\pycharm2020
\python\pyecharts-assets-master
\python\pyecharts-gallery-master
\python\Sublime text 3
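iterdir() lists only the immediate children of a directory and makes no promise about order. The sketch below assumes a temporary directory with a few hypothetical entries, and sorts the result to get stable output:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical entries created only for this demo
    (root / 'docs').mkdir()
    (root / 'a.py').touch()
    (root / 'b.txt').touch()
    (root / 'docs' / 'nested.md').touch()   # not listed: iterdir() is not recursive

    # iterdir() yields entries in arbitrary order, so sort for stable output
    entries = sorted(root.iterdir())
    dir_names = [e.name for e in entries if e.is_dir()]
    file_names = [e.name for e in entries if e.is_file()]

print(dir_names)
print(file_names)
```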
- Path.glob(pattern): globs the given relative pattern in the directory represented by this path, yielding all matching files (of any kind). The ** pattern means "this directory and all subdirectories, recursively"; in other words, it enables recursive globbing.
- Note: using the ** pattern in large directory trees may consume an inordinate amount of time.
With a ** pattern, glob recursively walks everything under this directory, collects every file that matches the pattern, and returns a generator.
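To make the effect of ** concrete, here is a small sketch that compares a plain pattern, a ** pattern, and rglob() on a temporary tree. The directory and file names are hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical tree: one .py file at each level plus one .txt file
    (root / 'pkg' / 'sub').mkdir(parents=True)
    for rel in ('top.py', 'pkg/mid.py', 'pkg/sub/deep.py', 'pkg/readme.txt'):
        (root / rel).touch()

    shallow = sorted(p.name for p in root.glob('*.py'))       # this directory only
    recursive = sorted(p.name for p in root.glob('**/*.py'))  # this directory and below
    via_rglob = sorted(p.name for p in root.rglob('*.py'))    # shorthand for the line above

print(shallow)
print(recursive)
print(via_rglob)
```

rglob(pattern) behaves like glob() with '**/' added in front of the pattern, so the last two results match.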
Here are some common snippets, ready to use.
Get all .py files in a directory:
from pathlib import Path

path = r'D:\python\pycharm2020\program'
p = Path(path)
file_name = p.glob('**/*.py')
print(type(file_name))  # <class 'generator'>
for i in file_name:
    print(i)
Get all .jpg images in a directory:
from pathlib import Path

path = r'D:\python\pycharm2020\program'
p = Path(path)
file_name = p.glob('**/*.jpg')
print(type(file_name))  # <class 'generator'>
for i in file_name:
    print(i)
Get all .txt files, .jpg images and .py files in a given directory:
from pathlib import Path


def get_files(patterns, path):
    all_files = []
    p = Path(path)
    for item in patterns:
        # rglob() already searches recursively, so no '**/' prefix is needed
        file_name = p.rglob(f'*{item}')
        all_files.extend(file_name)
    return all_files


path = input('>>> Please enter the file path: ')
results = get_files(['.txt', '.jpg', '.py'], path)
print(results)
for file in results:
    print(file)
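As an alternative sketch, the tree can be walked once and filtered by suffix, rather than running one recursive search per extension. find_by_suffix is a hypothetical helper, not part of pathlib, and the test files are made up; the case-insensitive match is a design choice of this sketch.

```python
import tempfile
from pathlib import Path


def find_by_suffix(root, suffixes):
    """Return files under root whose suffix is in suffixes (hypothetical helper)."""
    wanted = {s.lower() for s in suffixes}
    return sorted(
        p for p in Path(root).rglob('*')
        if p.is_file() and p.suffix.lower() in wanted
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'img').mkdir()
    # Hypothetical files; 'd.png' should be filtered out
    for rel in ('a.txt', 'b.PY', 'img/c.jpg', 'img/d.png'):
        (root / rel).touch()
    matches = find_by_suffix(root, ['.txt', '.jpg', '.py'])
    found = [p.relative_to(root).as_posix() for p in matches]

print(found)
```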
Notes on Path.mkdir(mode=0o777, parents=False, exist_ok=False):
- Creates a new directory at the given path. If mode is given, it is combined with the process's umask value to determine the file mode and access flags. If the path already exists, FileExistsError is raised.
- If parents is True, any missing parents of this path are created as needed; they are created with the default permissions, without taking mode into account (mimicking the POSIX mkdir -p command).
- If parents is False (the default), a missing parent raises FileNotFoundError.
- If exist_ok is False (the default), FileExistsError is raised if the target directory already exists.
- If exist_ok is True, FileExistsError is ignored (the same behavior as the POSIX mkdir -p command), but only if the last path component is not an existing non-directory file.
Changed in version 3.5: the exist_ok parameter was added.
Path.rmdir(): removes this directory. The directory must be empty.
from pathlib import Path

p = Path(r'D:\python\pycharm2020\program\test')
p.mkdir()
p.rmdir()

from pathlib import Path

p = Path(r'D:\python\test1\test2\test3')
p.mkdir(parents=True)  # if parents is true, any missing parents of this path are created as needed
p.rmdir()              # removes only the test3 folder

from pathlib import Path

p = Path(r'D:\python\test1\test2\test3')
p.mkdir(exist_ok=True)
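The interaction between parents, exist_ok and the two exceptions can be traced in one run. This sketch assumes a temporary directory so it leaves nothing behind; the folder names mirror the examples above.

```python
import tempfile
from pathlib import Path

outcomes = []
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'test1' / 'test2' / 'test3'

    try:
        target.mkdir()                  # test1 and test2 are missing
    except FileNotFoundError:
        outcomes.append('FileNotFoundError')

    target.mkdir(parents=True)          # creates test1, test2 and test3
    outcomes.append(target.is_dir())

    try:
        target.mkdir()                  # test3 already exists
    except FileExistsError:
        outcomes.append('FileExistsError')

    target.mkdir(exist_ok=True)         # no exception this time
    target.rmdir()                      # removes only the empty test3
    outcomes.append(target.exists())
    outcomes.append(target.parent.is_dir())   # test2 is still there

print(outcomes)
```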
- Path.unlink(missing_ok=False): removes this file or symbolic link. If the path points to a directory, use Path.rmdir() instead. If missing_ok is False (the default), FileNotFoundError is raised if the path does not exist. If missing_ok is True, FileNotFoundError is ignored. Changed in version 3.8: the missing_ok parameter was added.
- Path.rename(target): renames this file or directory to the given target and returns a new Path instance pointing to target. On Unix, if target exists and is a file, it is replaced silently if the user has permission. target can be either a string or another path object.
- Path.open(mode='r', buffering=-1, encoding=None, errors=None, newline=None): opens the file pointed to by the path, like the built-in open() function does.
from pathlib import Path

p = Path('foo.txt')
with p.open(mode='w') as f:  # the with block closes the file before it is renamed
    f.write('some text')
target = Path('new_foo.txt')
p.rename(target)
with target.open(mode='r') as f:
    content = f.read()
print(content)
target.unlink()
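The following sketch repeats the write, rename, read, delete sequence inside a temporary directory and then shows what happens when unlink() is called on a file that is already gone. It assumes Python 3.8 or later for missing_ok.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'foo.txt'
    dst = Path(tmp) / 'new_foo.txt'

    with src.open(mode='w', encoding='utf-8') as f:
        f.write('some text')

    src.rename(dst)                     # the file now lives at dst
    moved = (src.exists(), dst.exists())

    with dst.open(encoding='utf-8') as f:
        content = f.read()

    dst.unlink()                        # first delete succeeds
    try:
        dst.unlink()                    # second delete: the file is already gone
        second_delete = 'no error'
    except FileNotFoundError:
        second_delete = 'FileNotFoundError'
    dst.unlink(missing_ok=True)         # Python 3.8+: silently does nothing

print(moved, content, second_delete)
```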
2. Comparison with the os module
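A rough side-by-side sketch of a few common os / os.path calls and their pathlib counterparts. The selection of calls is an assumption about what is most useful to compare, and report.tar.gz is a hypothetical file created in a temporary directory.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file created only so both APIs have something to inspect
    raw = os.path.join(tmp, 'report.tar.gz')
    open(raw, 'w').close()
    p = Path(raw)

    # Each entry: (os / os.path result, pathlib result)
    pairs = {
        'join': (os.path.join(tmp, 'data', 'x.csv'), str(Path(tmp) / 'data' / 'x.csv')),
        'basename / name': (os.path.basename(raw), p.name),
        'dirname / parent': (os.path.dirname(raw), str(p.parent)),
        'splitext / suffix': (os.path.splitext(raw)[1], p.suffix),
        'exists': (os.path.exists(raw), p.exists()),
        'isfile / is_file': (os.path.isfile(raw), p.is_file()),
        'getsize / stat': (os.path.getsize(raw), p.stat().st_size),
    }

for label, (os_value, pathlib_value) in pairs.items():
    print(f'{label}: {os_value!r} | {pathlib_value!r}')
```

The os.path functions take and return strings, while pathlib keeps everything on one object and joins paths with the / operator.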

3. A practical case
With the os module, reading files nested several folders deep usually means going through the folders level by level with more than one for loop (or using os.walk), which is cumbersome. Path.glob('**/*') handles this in a single call. Let's work through a practical example to see it in action.
The test folder is laid out as follows:

The data in the .md files looks like this:

The goal is to extract the data from all the .md files, clean it, and write it to a CSV file.
# -*- coding: UTF-8 -*-
from pathlib import Path
import re
import pandas as pd

# The input path
p = Path(r'.\ Microblog hot search data \ Hot search data /')
# Get all .md files under the directory
file_list = list(p.glob('**/*.md'))
print(f'Number of md files read: {len(file_list)}')
for item in file_list:
    print(item)

# Hot searches are captured twice a day, at 11:00 and 23:00, so the data overlaps.
# Keep only the 23:00 files to remove the duplicates.
filelist = list(filter(lambda x: str(x).find('23 spot ') >= 0, file_list))
sum_list = []
i = 0
for file in filelist:
    # Go through each md file and read its data
    with file.open(encoding='utf-8') as f:
        lines = f.readlines()
    lines = [line.strip() for line in lines]  # strip surrounding whitespace
    data = list(filter(None, lines))          # drop empty strings
    data = data[1:101]
    con = data[::2]     # hot search content
    rank = data[1::2]   # degree of heat
    date = re.findall(' year (.+)2', str(file)) * len(con)
    for m in range(len(con)):
        con[m] = con[m].split('、')[-1]       # keep the text after the numbering
    for n in range(len(rank)):
        rank[n] = re.findall(r'\d+', rank[n])[0]
    con_dic = {'date': date, 'Hot search content': con, 'degree of heat': rank}
    df = pd.DataFrame(con_dic)
    # Write the header only for the first file
    if i == 0:
        df.to_csv('weibo1.csv', mode='a+', index=False, header=True)
    else:
        df.to_csv('weibo1.csv', mode='a+', index=False, header=False)
    # Each md file contains 50 records
    i += 50

print('A total of {} records written to csv'.format(i))
The result is as follows:


As you can see, the data from all the .md files in this directory was successfully extracted, cleaned, and written to the CSV file.
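The cleaning step can be tried without the original data. The sketch below runs the same slicing and splitting on a made-up, in-memory sample; the layout of sample_text (a title line followed by alternating "number、topic" and heat lines) is an assumption inferred from the code above, not the real file contents.

```python
import re

# Hypothetical sample standing in for one .md file; the real layout is an assumption.
sample_text = """
# Hot search list

1、first topic
heat: 4821356

2、second topic
heat: 3710442
"""

lines = [line.strip() for line in sample_text.splitlines()]
data = [line for line in lines if line]   # drop blank lines
data = data[1:]                           # skip the title line

# Even positions hold the topics, odd positions hold the heat values
topics = [entry.split('、')[-1] for entry in data[::2]]
heat = [int(re.findall(r'\d+', entry)[0]) for entry in data[1::2]]
rows = list(zip(topics, heat))

print(rows)
```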