1. Requirements
Write a Python program that, every time a compressed file is downloaded, automatically extracts its contents into the current folder and deletes the compressed package. What we can learn from this case:
- Combined use of the os and glob modules
- Decompressing files with the gzip, zipfile, rarfile, and tarfile modules
2. Step analysis and prerequisite knowledge
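Before writing any code, the complex problem needs to be broken down into a few clear requirements. The logic of the program is: list the files in the download folder, decide from each file's suffix whether it is a compressed file, decompress it with the matching method and delete the compressed package, and repeat this check at regular intervals.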
When it comes to compressed files, the different compression formats need to be discussed separately. There are four main ones:
- .gz: gzip. It usually compresses only a single file.
- .tar: strictly a packaging tool rather than a compression tool. It can be combined with .gz to form the packaged and compressed .tar.gz format.
- .zip: like .tar.gz, it can package and compress multiple files.
- .rar: packages and compresses files. It was originally used on DOS.
Therefore, the logic for judging whether a file is a compressed file can be as follows:
- Create a list of compressed-file suffixes: compressed_lst = ['gz', 'tar', 'zip', 'rar']
- Get the suffix with filename.split('.')[-1] and check whether it is in compressed_lst. If it is, run the subsequent decompression code.
- For a file ending in .gz, after decompressing we need to check again whether the result ends with .tar, and handle it accordingly.
The decompression code differs between formats. The specific operations are expanded in the code below.
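The suffix check described above can be sketched as a small helper. This is only a minimal sketch: the function name `is_compressed` is an assumption made for illustration and is not part of the article's code.

```python
compressed_lst = ['gz', 'tar', 'zip', 'rar']


def is_compressed(filename):
    """Return True if the text after the last '.' is a known archive suffix.

    is_compressed is a hypothetical helper name, not part of the article's code.
    """
    if '.' not in filename:
        # No suffix at all, so it cannot be one of the four formats
        return False
    return filename.split('.')[-1] in compressed_lst


print(is_compressed('data.tar.gz'))  # True: the last suffix is 'gz'
print(is_compressed('notes.txt'))    # False
print(is_compressed('README'))       # False
```

Note that only the last suffix is inspected, which is why a .tar.gz file is first treated as a gz file and checked for .tar afterwards.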
3. Code implementation
The first step is to get the file names of all files in the download folder:
    import glob
    import os

    path = r'C:\xxxx\download'
    file_lst = glob.glob(path + '/*')
    # List comprehension: keep only the file name part of each path
    filename_lst = [os.path.basename(i) for i in file_lst]
    print(filename_lst)
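To see what the two lists look like without touching a real download folder, the same calls can be tried on a temporary directory. This is a sketch under assumptions: the file names `a.zip` and `b.txt` are made up for the demo.

```python
import glob
import os
import tempfile

# Build a throwaway folder with two empty files (hypothetical names)
with tempfile.TemporaryDirectory() as folder:
    for name in ('a.zip', 'b.txt'):
        open(os.path.join(folder, name), 'w').close()

    # glob returns full paths; os.path.basename strips the folder part
    file_lst = glob.glob(os.path.join(folder, '*'))
    filename_lst = sorted(os.path.basename(p) for p in file_lst)

print(filename_lst)  # ['a.zip', 'b.txt']
```

The names are sorted here only to make the printed order predictable, since glob does not promise any particular order.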
Next, use the suffix of each file name to decide whether the file needs to be decompressed. Let's look at the code framework first:
    for i in filename_lst:
        if '.' in i:
            # Get the suffix
            suffix = i.split('.')[-1]
            # Match the suffix against each compression format in turn
            if suffix == 'gz':
                pass
                if ...:
                    # Check again whether the new file name ends with .tar
                    pass
            if suffix == 'tar':
                pass
            if suffix == 'zip':
                pass
            if suffix == 'rar':
                pass
Two points to note here:
- Some file names contain no '.', in which case filename.split('.')[-1] returns the whole file name rather than a suffix, so we first check that the name contains a '.'.
- For a file ending in .gz, after decompressing we need to check again whether the result ends with .tar.
Then we can write the decompression code for the four kinds of compressed files as functions, each called when its condition is met. First, gz:
    import gzip

    def ungz(filename):
        # Decompressing a single-file gz amounts to dropping the trailing .gz from the name
        new_filename = filename[:-3]
        # Read from the original .gz file and write the raw bytes to the new file
        with gzip.open(filename, 'rb') as gz_file, open(new_filename, 'wb') as file:
            file.write(gz_file.read())
        # Return the new name so it can be passed on to the untar function
        return new_filename
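A quick round trip helps check the gz step in isolation. The sketch below is an illustration under assumptions: `ungz_sketch` and the file name `hello.txt.gz` are hypothetical. It opens the output in binary mode because gzip yields bytes, and it reads from the original .gz path rather than the shortened name.

```python
import gzip
import os
import tempfile


def ungz_sketch(gz_path):
    """Decompress gz_path next to itself and return the new path.

    ungz_sketch is a hypothetical name; it assumes gz_path ends with '.gz'.
    """
    out_path = gz_path[:-3]  # drop the trailing '.gz'
    with gzip.open(gz_path, 'rb') as src, open(out_path, 'wb') as dst:
        dst.write(src.read())
    return out_path


with tempfile.TemporaryDirectory() as folder:
    # Create a small gz file to decompress
    gz_path = os.path.join(folder, 'hello.txt.gz')
    with gzip.open(gz_path, 'wb') as f:
        f.write(b'hello')

    out_path = ungz_sketch(gz_path)
    restored_name = os.path.basename(out_path)
    with open(out_path, 'rb') as f:
        restored = f.read()

print(restored_name, restored)
```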
As mentioned several times above, a gz file may be combined with tar, so after decompressing a gz file we need to check whether a tar file still has to be unpacked. The function for tar files can be written first:
    import tarfile

    def untar(filename):
        tar = tarfile.open(filename)
        names = tar.getnames()
        # A tar is a package of files; unpacking produces many files,
        # so create a folder to store them
        if not os.path.isdir(filename + "_dir"):
            os.mkdir(filename + "_dir")
        for name in names:
            tar.extract(name, filename + "_dir/")
        tar.close()
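The tar step can be tried on a throwaway archive. This is a sketch rather than the article's function: the names `demo.tar` and `a.txt` are made up, and it uses `extractall` in place of the per-member loop.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create a small file and pack it into demo.tar (hypothetical names)
    member = os.path.join(folder, 'a.txt')
    with open(member, 'w') as f:
        f.write('inside tar')
    tar_path = os.path.join(folder, 'demo.tar')
    with tarfile.open(tar_path, 'w') as tar:
        tar.add(member, arcname='a.txt')

    # Unpack into "<archive>_dir", the folder naming used in this article
    out_dir = tar_path + '_dir'
    os.makedirs(out_dir, exist_ok=True)
    with tarfile.open(tar_path) as tar:
        tar_names = tar.getnames()
        tar.extractall(out_dir)

    extracted = os.listdir(out_dir)

print(tar_names, extracted)
```

Archives from untrusted sources can contain member paths that point outside the target folder, so it is worth checking the tarfile documentation on extraction filters before running this on arbitrary downloads.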
The function for zip files follows the same pattern:
    import zipfile

    def unzip(filename):
        zip_file = zipfile.ZipFile(filename)
        # As with tar unpacking, create a folder to store the extracted files
        if not os.path.isdir(filename + "_dir"):
            os.mkdir(filename + "_dir")
        for names in zip_file.namelist():
            zip_file.extract(names, filename + "_dir/")
        zip_file.close()
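The zip step can be checked the same way on a small archive built in memory. This is a sketch with made-up names (`demo.zip`, `a.txt`, `sub/b.txt`); it uses `extractall`, which creates the target folder and any nested folders on its own.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as folder:
    # Write two members straight into a new zip (hypothetical names)
    zip_path = os.path.join(folder, 'demo.zip')
    with zipfile.ZipFile(zip_path, 'w') as zf:
        zf.writestr('a.txt', 'inside zip')
        zf.writestr('sub/b.txt', 'nested')

    # Extract into "<archive>_dir", the folder naming used in this article
    out_dir = zip_path + '_dir'
    with zipfile.ZipFile(zip_path) as zf:
        zip_names = zf.namelist()
        zf.extractall(out_dir)

    with open(os.path.join(out_dir, 'sub', 'b.txt')) as f:
        nested_text = f.read()

print(zip_names, nested_text)
```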
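And the function for rar files:
    import rarfile

    def unrar(filename):
        rar = rarfile.RarFile(filename)
        if not os.path.isdir(filename + "_dir"):
            os.mkdir(filename + "_dir")
        # Pass the target folder to extractall instead of calling os.chdir,
        # so the working directory stays the same for the files that follow
        rar.extractall(filename + "_dir")
        rar.close()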
As you can see, the four decompression functions differ slightly. I recommend trying them out to get a feel for the differences. Combining decompression with os.remove() deletes the compressed package afterwards. Now let's look at the logic framework with the decompression functions added:
    # Iterate over file_lst (full paths) so the files are found
    # regardless of the current working directory
    for filename in file_lst:
        if '.' in os.path.basename(filename):
            suffix = filename.split('.')[-1]
            if suffix == 'gz':
                new_filename = ungz(filename)
                os.remove(filename)
                # A .tar.gz leaves a .tar behind, which still needs unpacking
                if new_filename.split('.')[-1] == 'tar':
                    untar(new_filename)
                    os.remove(new_filename)
            elif suffix == 'tar':
                untar(filename)
                os.remove(filename)
            elif suffix == 'zip':
                unzip(filename)
                os.remove(filename)
            elif suffix == 'rar':
                unrar(filename)
                os.remove(filename)
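One alternative to the chain of suffix comparisons is a dictionary that maps each suffix to its function. This is a design sketch, not the article's approach: `pick_handler` is a hypothetical helper, and the handlers are stand-ins that only return a label, where the real script would register ungz, untar, unzip and unrar.

```python
def pick_handler(filename, handlers):
    """Return the function registered for filename's suffix, or None.

    pick_handler is a hypothetical helper name used only for this sketch.
    """
    if '.' not in filename:
        return None
    return handlers.get(filename.split('.')[-1])


# Stand-in handlers for the demo; they only describe what would be called
handlers = {
    'gz': lambda name: 'ungz ' + name,
    'tar': lambda name: 'untar ' + name,
    'zip': lambda name: 'unzip ' + name,
    'rar': lambda name: 'unrar ' + name,
}

results = []
for name in ['a.zip', 'b.tar.gz', 'c.txt', 'README']:
    handler = pick_handler(name, handlers)
    results.append(handler(name) if handler else 'skipped ' + name)

print(results)
```

Adding a new format then means adding one dictionary entry instead of another branch.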
To keep checking for newly downloaded files, the simpler way is to set up a while True infinite loop combined with time.sleep(). The framework is as follows:
    import time

    while True:
        func()  # placeholder for the task to repeat
        time.sleep(5)  # The sleep time can be set larger to avoid using too many resources
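For testing, an endless loop is awkward because it never returns. A minimal sketch of the same polling idea with an optional stop condition is shown below; the function `poll` and its `max_rounds` parameter are assumptions added for illustration.

```python
import time


def poll(task, interval, max_rounds=None):
    """Call task() repeatedly, sleeping `interval` seconds after each call.

    poll and max_rounds are hypothetical names for this sketch.
    max_rounds=None loops forever, like `while True`;
    a number stops after that many calls.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        task()
        rounds += 1
        time.sleep(interval)
    return rounds


calls = []
done = poll(lambda: calls.append(time.time()), interval=0.01, max_rounds=3)
print(done, len(calls))  # 3 3
```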
Finally, turn the implementation code of the second step into a function and put it into the loop framework to complete the requirement. The complete code is as follows:
    import glob
    import os
    import gzip
    import tarfile
    import zipfile
    import rarfile
    import time

    path = r'C:\xxxx\download'

    def ungz(filename):
        # Drop the trailing .gz to get the name of the decompressed file
        new_filename = filename[:-3]
        with gzip.open(filename, 'rb') as gz_file, open(new_filename, 'wb') as file:
            file.write(gz_file.read())
        return new_filename

    def untar(filename):
        tar = tarfile.open(filename)
        names = tar.getnames()
        if not os.path.isdir(filename + "_dir"):
            os.mkdir(filename + "_dir")
        for name in names:
            tar.extract(name, filename + "_dir/")
        tar.close()

    def unzip(filename):
        zip_file = zipfile.ZipFile(filename)
        if not os.path.isdir(filename + "_dir"):
            os.mkdir(filename + "_dir")
        for names in zip_file.namelist():
            zip_file.extract(names, filename + "_dir/")
        zip_file.close()

    def unrar(filename):
        rar = rarfile.RarFile(filename)
        if not os.path.isdir(filename + "_dir"):
            os.mkdir(filename + "_dir")
        # Extract into the target folder without changing the working directory
        rar.extractall(filename + "_dir")
        rar.close()

    def unzip_files():
        # Re-scan the folder on every pass so newly downloaded files are seen
        # and files already deleted are not processed again
        file_lst = glob.glob(path + '/*')
        for filename in file_lst:
            # Skip folders (such as the *_dir output folders) and names without a suffix
            if os.path.isfile(filename) and '.' in os.path.basename(filename):
                suffix = filename.split('.')[-1]
                if suffix == 'gz':
                    new_filename = ungz(filename)
                    os.remove(filename)
                    if new_filename.split('.')[-1] == 'tar':
                        untar(new_filename)
                        os.remove(new_filename)
                elif suffix == 'tar':
                    untar(filename)
                    os.remove(filename)
                elif suffix == 'zip':
                    unzip(filename)
                    os.remove(filename)
                elif suffix == 'rar':
                    unrar(filename)
                    os.remove(filename)

    while True:
        unzip_files()
        time.sleep(5)
This article is from the WeChat official account Get up early Python (zaoqi-python). Author: Chen Xi
Original publication time : 2020-11-07