How I used Python to build a "magic toy" for a child (one session can keep them busy all day)

TrueDei 2020-12-05 01:19:34

One. Finally, a reliable spoken-English teacher for the kids

As the saying goes, "however poor you are, don't skimp on education; however hard life gets, don't let the children suffer." Parents give more than hard work and material support: we also keep a close eye on how our children are learning, and we are constantly afraid that they will "lose at the starting line". The trouble is that children now have too many starting lines: English, all kinds of artistic specialties, even rope skipping. It keeps everyone very busy, and parents are not all-rounders. My sister, for example, has recently been worrying about her daughter's spoken English. Her own pronunciation is not accurate, she does not know which tutoring option is reliable, and she is afraid the child will fall behind her classmates. When I heard this, I took out my English textbook, remembered that my own English exams always scraped by with about 60 points, put it back, and picked up my real weapon: code.


In recent years, natural language processing has been applied in many fields, and the cost of intelligent speech assessment has long been affordable for the general public. Since the goal was to correct a child's pronunciation, I chose a reliable big vendor and used the Youdao Zhiyun API to develop a simple speech evaluation program, or, as I like to call it, an intelligent speaking teacher!

Two. Preparation

First, on your Youdao Zhiyun personal page, create an instance, create an application, and bind the application to the instance to obtain the application's ID and key. The registration and application-creation steps are described in detail in the article "Share a batch file translation development process".


Three. The development process in detail

The following describes the specific code development process .

First, study the API input and output specification given in the official documentation. The API communicates over HTTPS. Simply put, you Base64-encode the pre-recorded audio file, sign the request, submit it to the API, and parse the JSON that the API returns to get the scores.

Interface address:

HTTPS interface:

The input parameters required by the API are listed in the table below:

| Field name | Type | Meaning | Required | Remarks |
| --- | --- | --- | --- | --- |
| q | text | Base64-encoded string of the audio file to be evaluated | True | Must be Base64-encoded |
| text | text | The text corresponding to the audio file to be evaluated | True | have a good day |
| langType | text | Source language | True | See the supported languages |
| appKey | text | Application ID | True | Can be viewed under Application management |
| salt | text | UUID | True | UUID |
| curtime | text | Timestamp (seconds) | True | TimeStamp |
| sign | text | Signature, generated as sha256(application ID + input + salt + curtime + application key); see the note under the table for how input is generated | True | sha256(application ID + input + salt + curtime + application key) |
| signType | text | Signature type | True | v2 |
| format | text | Format of the audio file: wav | true | wav |
| rate | text | Sampling rate; 16000 is recommended | true | 16000 |
| channel | text | Number of channels; only mono is supported, so use the fixed value 1 | true | 1 |
| type | text | Upload type; only Base64 upload is supported, so use the fixed value 1 | true | 1 |
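To make the audio constraints in the table concrete (wav format, mono, Base64-encoded), here is a minimal sketch that uses only the Python standard library. The helper names `load_wav_as_base64` and `make_silent_wav` are hypothetical and are not part of the original demo; the silent clip exists only so the sketch can be run without a microphone.

```python
import base64
import io
import wave


def load_wav_as_base64(wav_bytes):
    """Check WAV bytes against the table's constraints and Base64-encode them.

    Hypothetical helper: returns (q, rate, channels).
    """
    with wave.open(io.BytesIO(wav_bytes), 'rb') as wav_info:
        rate = wav_info.getframerate()
        channels = wav_info.getnchannels()
    if channels != 1:
        raise ValueError('only mono audio is supported, got %d channels' % channels)
    q = base64.b64encode(wav_bytes).decode('utf-8')
    return q, rate, channels


def make_silent_wav(rate=16000, seconds=0.1, channels=1):
    """Build a short silent 16-bit WAV in memory, just for trying the helper."""
    buf = io.BytesIO()
    with wave.open(buf, 'wb') as out:
        out.setnchannels(channels)
        out.setsampwidth(2)  # 2 bytes per sample = 16-bit
        out.setframerate(rate)
        out.writeframes(b'\x00\x00' * channels * int(rate * seconds))
    return buf.getvalue()


if __name__ == '__main__':
    q, rate, channels = load_wav_as_base64(make_silent_wav())
    print(rate, channels, len(q))
```

Rejecting non-mono audio before sending the request is a design choice of this sketch; it simply surfaces the table's "only mono is supported" rule earlier.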

The signature sign is generated as follows:
signType=v2; sign=sha256(application ID + input + salt + curtime + application key).
Note how input is calculated: input = the first 10 characters of q + the length of q + the last 10 characters of q (when q is longer than 20 characters), or input = the whole string q (when the length of q is 20 or less).
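The signing rule above can be sketched in a few lines. The name `truncate` matches the helper that the demo's connect() calls later in the article, but its body is not shown there, so this is a reconstruction from the rule rather than the author's code; `make_sign` is a hypothetical name, and hex encoding of the SHA-256 digest is an assumption.

```python
import hashlib
import time
import uuid


def truncate(q):
    """Build `input`: all of q if it is short, else first 10 chars + length + last 10 chars."""
    if len(q) <= 20:
        return q
    return q[:10] + str(len(q)) + q[-10:]


def make_sign(app_key, app_secret, q, salt, curtime):
    """sha256(application ID + input + salt + curtime + application key).

    Assumption: the digest is sent as a hex string.
    """
    sign_str = app_key + truncate(q) + salt + curtime + app_secret
    return hashlib.sha256(sign_str.encode('utf-8')).hexdigest()


if __name__ == '__main__':
    q = 'A' * 25  # stand-in for a Base64 audio string
    salt = str(uuid.uuid1())
    curtime = str(int(time.time()))
    print(truncate(q))
    # Placeholder credentials, for illustration only
    print(make_sign('demo-app-id', 'demo-app-secret', q, salt, curtime))
```

Because the audio string can be very long, only its two ends and its length enter the signature, which keeps the string to be hashed short.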

The output parameters of the interface are as follows:

| Field | Meaning |
| --- | --- |
| errorCode | Result error code; always present. See the error code list for details |
| refText | The text of the request |
| start | Sentence start time in the audio, in seconds |
| end | Sentence end time in the audio, in seconds |
| integrity | Sentence integrity score |
| fluency | Sentence fluency score |
| pronunciation | Sentence accuracy score |
| speed | Speaking speed, in words per minute |
| overall | Overall sentence score |
| words | Array of word scores |
| -word | The word |
| -start | Word start time, in seconds |
| -end | Word end time, in seconds |
| -pronunciation | Word accuracy score |
| -phonemes | Array of phonemes |
| --phoneme | Phonetic symbol |
| --start | Phoneme start time, in seconds |
| --end | Phoneme end time, in seconds |
| --judge | Whether the phoneme was pronounced correctly: true means correct, false means mispronounced, in which case calibration gives a hint |
| --calibration | If the pronunciation is wrong, tells the user what the pronunciation sounded like |
| --prominence | Degree of stress; the higher the score, the more likely the current phoneme is stressed. Range [0, 100] |
| --stress_ref | Stress reference for vowels (the standard answer): if true, the vowel should be stressed; meaningless for consonants |
| --stress_detect | Whether the user pronounced the phoneme as stressed within the word |
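As an illustration of how the nested words and phonemes fields can be used, the sketch below walks a parsed response and collects every phoneme whose judge flag is false, together with its calibration hint. The function name `find_mispronounced` is hypothetical, and the sample response is made up for the example; it is not real API output.

```python
import json


def find_mispronounced(response):
    """Return (word, phoneme, calibration) for every phoneme whose judge flag is false."""
    problems = []
    for word in response.get('words', []):
        for ph in word.get('phonemes', []):
            if not ph.get('judge', True):
                problems.append((word['word'], ph['phoneme'], ph.get('calibration')))
    return problems


if __name__ == '__main__':
    # Made-up response fragment for illustration only
    sample = json.loads('''
    {"errorCode": "0", "overall": 70.0,
     "words": [
       {"word": "ok", "pronunciation": 60.0, "phonemes": [
         {"phoneme": "o", "judge": true, "calibration": "o"},
         {"phoneme": "k", "judge": false, "calibration": "g"}
       ]}
     ]}
    ''')
    for word, phoneme, heard in find_mispronounced(sample):
        print('In "%s": expected %s, sounded like %s' % (word, phoneme, heard))
```

A list like this could feed a "practice these sounds" view for the child, which the sentence-level scores alone cannot provide.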

(One) Demo development:

This demo is developed in Python 3 and consists of three files, which respectively contain the demo's user interface, the recording and other processing logic, and a wrapper around the calls to the intelligent speech evaluation API.

**1. Interface part:**

The UI is divided into three areas: the article processing area, the recording area, and the score display area.


The layout code is as follows :

import tkinter as tk

# The callbacks get_file, start_rec, stop_rec and start_score are defined elsewhere in this file.
root = tk.Tk()
root.title("youdao ise test")
frm = tk.Frame(root)
frm.grid(padx='50', pady='50')
# Select the article
btn_get_file_path = tk.Button(frm, text='Choose article:', command=get_file)
text1 = tk.Text(frm, width='70', height='2')
# Display the article content
text2 = tk.Text(frm, width='70', height='5')
# Start and stop recording
btn_start_rec = tk.Button(frm, text='Record', command=start_rec, width=10)
lb_Status = tk.Label(frm, text='Ready', anchor='w', fg='green')
btn_stop_rec = tk.Button(frm, text='Stop recording', command=stop_rec)
# Scoring button and result display
btn_score = tk.Button(frm, text='Score', command=start_score, width=10)
text3 = tk.Text(frm, width='70', height='10')

The click event of the scoring button btn_score is bound to start_score(), which collects all the selected files, starts the evaluation, and prints the results:

def start_score():
    # ... (the rest of the body is not shown in the article)
    for r in result:
        ...

**2.**
This part mainly implements file handling, recording, and processing of the API's return value. First, define an Audio_model class:

class Audio_model():
    def __init__(self, audio_path, is_recording):
        self.current_file = ''  # Original path of the current recording
        self.is_recording = is_recording  # Recording status flag
        self.audio_chunk_size = 1600  # This and the following are parameters required for recording

The record_and_save() method records audio and saves it under the project's record path. The recording gets the same file name as the original text file, so the two are easy to match up.

    def record_and_save(self):
        self.is_recording = True
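The article does not show how record_and_save() captures and stores the audio, so the following is only a sketch of the saving half under stated assumptions: raw 16-bit PCM chunks (here silent stand-ins of 1600 frames each, echoing audio_chunk_size) are written to a mono 16 kHz WAV file with the standard wave module. The function name `save_chunks_as_wav` is hypothetical, and no microphone access is attempted.

```python
import os
import tempfile
import wave


def save_chunks_as_wav(chunks, out_path, rate=16000, sample_width=2):
    """Write raw PCM chunks to a mono WAV file at the given sampling rate."""
    with wave.open(out_path, 'wb') as out:
        out.setnchannels(1)             # the API only supports mono
        out.setsampwidth(sample_width)  # bytes per sample; 2 = 16-bit
        out.setframerate(rate)          # 16000 is the recommended rate
        out.writeframes(b''.join(chunks))


if __name__ == '__main__':
    # Ten chunks of 1600 silent 16-bit frames stand in for microphone data.
    chunks = [b'\x00\x00' * 1600 for _ in range(10)]
    path = os.path.join(tempfile.mkdtemp(), 'demo.wav')
    save_chunks_as_wav(chunks, path)
    with wave.open(path, 'rb') as check:
        print(check.getframerate(), check.getnchannels(), check.getnframes())
```

Writing the header values explicitly means the file already satisfies the format, rate, and channel fields that the API request has to declare.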

The get_score() method calls the wrapper methods in the utility file and parses the return values:

    def get_score(self, dict):
        for path in dict:
            # Process the result and add it to the result set
            result.append(score_result)
        return result

3. The third file contains the methods directly related to requesting the Youdao Zhiyun API. The core is the connect() method, which assembles the parameters the API requires, calls do_request() to execute the request, and then, according to what the UI needs to display, processes the API's return value and concatenates the result string.

import base64
import json
import os
import time
import uuid
import wave

# APP_KEY, APP_SECRET, truncate(), encrypt() and do_request() are defined elsewhere in this file.

def connect(audio_file_path, audio_text):
    lang_type = 'en'  # Currently only English is supported
    extension = audio_file_path[audio_file_path.rindex('.') + 1:]
    if extension != 'wav':
        print('Unsupported audio type')
        return None
    # Read the sampling rate and channel count from the WAV header
    wav_info = wave.open(audio_file_path, 'rb')
    sample_rate = wav_info.getframerate()
    nchannels = wav_info.getnchannels()
    wav_info.close()
    # Base64-encode the whole audio file
    with open(audio_file_path, 'rb') as file_wav:
        q = base64.b64encode(file_wav.read()).decode('utf-8')
    data = {}
    data['text'] = audio_text
    curtime = str(int(time.time()))
    data['curtime'] = curtime
    salt = str(uuid.uuid1())
    signStr = APP_KEY + truncate(q) + salt + curtime + APP_SECRET
    sign = encrypt(signStr)
    data['appKey'] = APP_KEY
    data['q'] = q
    data['salt'] = salt
    data['sign'] = sign
    data['signType'] = "v2"
    data['langType'] = lang_type
    data['rate'] = sample_rate
    data['format'] = 'wav'
    data['channel'] = nchannels
    data['type'] = 1
    # Process the return value
    response = do_request(data)
    j = json.loads(str(response.content, encoding="utf-8"))
    # Build the one-line summary shown in the UI
    contextIntegrity = "Sentence integrity: " + str(round(j["integrity"], 2)) + " "
    pronunciation = "Pronunciation accuracy: " + str(round(j["pronunciation"], 2)) + " "
    fluency = "Fluency: " + str(round(j["fluency"], 2)) + " "
    speed = "Speed: " + str(round(j["speed"], 2)) + " "
    # Assumption: the article does not show how recordname is set; the file name is used here
    recordname = os.path.basename(audio_file_path)
    recordAndResult = recordname + " " + contextIntegrity + pronunciation + fluency + speed + "\n"
    return recordAndResult
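connect() reads the score fields straight from the parsed JSON, which would raise a KeyError if the request failed and the scores were absent. One possible way to guard against that is sketched below. The function name `format_result` is hypothetical, and treating errorCode '0' as success is an assumption based on the sample output in this article, where '0' appears alongside valid scores.

```python
def format_result(recordname, j):
    """Build the one-line summary for the UI, or an error line if the call failed."""
    error_code = str(j.get('errorCode', ''))
    if error_code != '0':  # assumption: '0' means success
        return recordname + " request failed, errorCode: " + error_code + "\n"
    parts = [
        "Sentence integrity: " + str(round(j["integrity"], 2)),
        "Pronunciation accuracy: " + str(round(j["pronunciation"], 2)),
        "Fluency: " + str(round(j["fluency"], 2)),
        "Speed: " + str(round(j["speed"], 2)),
    ]
    return recordname + " " + " ".join(parts) + "\n"


if __name__ == '__main__':
    # Sentence-level values taken from the sample output in this article
    scores = {'errorCode': '0', 'integrity': 100, 'pronunciation': 67.108101,
              'fluency': 83.554047, 'speed': 55.555557}
    print(format_result('demo.wav', scores), end='')
```

Keeping the formatting separate from the network call also makes it easy to try out with saved responses.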

(Two) Results

Here is how the program behaves on a recording of my pure "Chinglish" (the score itself doesn't matter; what matters is the objective evaluation :P)

First, how to operate it:

  • 1) Click "Choose article" and select the article to be evaluated;

  • 2) Click the "Record" and "Stop recording" buttons to record your voice;

  • 3) To evaluate more than one article, repeat steps 1) and 2);

  • 4) Click "Score" to run the intelligent speech evaluation and display the scores. The detailed scoring results are also saved in the result directory under the path of this code.
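Step 4) mentions that the detailed results are stored in a result directory. The article does not show that code, so here is a minimal sketch of one way to do it, assuming one JSON file per recording named after the recording. The function name `save_result` and the file-naming rule are assumptions.

```python
import json
import os
import tempfile


def save_result(result_dir, recording_name, response):
    """Write one API response to <result_dir>/<recording name>.json and return the path."""
    os.makedirs(result_dir, exist_ok=True)
    base = os.path.splitext(os.path.basename(recording_name))[0]
    out_path = os.path.join(result_dir, base + '.json')
    with open(out_path, 'w', encoding='utf-8') as f:
        # ensure_ascii=False keeps phonetic symbols such as ɝ readable in the file
        json.dump(response, f, ensure_ascii=False, indent=2)
    return out_path


if __name__ == '__main__':
    demo_dir = os.path.join(tempfile.mkdtemp(), 'result')
    path = save_result(demo_dir, 'lesson1.wav', {'errorCode': '0', 'overall': 83.89})
    print(path)
```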


The results

Interface part: the UI shows the scores for sentence integrity, pronunciation accuracy, and fluency, as well as the speaking speed:


Result files: each recording is evaluated separately, and the detailed results returned are saved as JSON in the result folder.

Sample output:


{
    'integrity': 100,  // Sentence integrity
    'refText': "Are you ok? ",  // Text corresponding to the speech being evaluated
    'pronunciation': 67.108101,  // Sentence pronunciation accuracy
    'start': 0.030000,  // Sentence start time in the audio, in seconds
    'words': [{  // List of word information
        'pronunciation': 50.640327,  // Word accuracy score
        'start': 0.73,  // Word start time, in seconds
        'end': 0.76,  // Word end time, in seconds
        'word': 'Are',  // Word text
        'phonemes': [{  // List of phoneme information
            'stress_ref': False,  // Stress reference for vowels (standard stress); if true, the vowel should be stressed; meaningless for consonants
            'pronunciation': 50.640331,  // Phoneme accuracy score
            'stress_detect': False,  // Within the word, the user did not pronounce this phoneme as stressed
            'phoneme': 'ɝ',  // Phoneme name
            'start': 0.73,  // Phoneme start time, in seconds
            'end': 0.76,  // Phoneme end time, in seconds
            'judge': True,  // Whether the phoneme was pronounced correctly: true means correct, false means mispronounced, in which case calibration gives a hint
            'calibration': 'ɝ',  // If the pronunciation is wrong, what the user's pronunciation sounded like
            'prominence': 1  // Degree of stress; the higher the score, the more likely the phoneme is stressed; range [0, 100]
        }]
    }, {
        'pronunciation': 76.810608,
        'start': 0.77,
        'end': 1.08,
        'word': 'you',
        'phonemes': [{
            'stress_ref': False,
            'pronunciation': 79.084282,
            'stress_detect': False,
            'phoneme': 'j',
            'start': 0.77,
            'end': 0.86,
            'judge': True,
            'calibration': 'j',
            'prominence': 0.944885
        }, {
            'stress_ref': True,
            'pronunciation': 74.536934,
            'stress_detect': True,
            'phoneme': 'u',
            'start': 0.87,
            'end': 1.08,
            'judge': True,
            'calibration': 'u',
            'prominence': 1
        }]
    }, {
        'pronunciation': 66.129013,
        'start': 1.14,
        'end': 1.8,
        'word': 'ok',
        'phonemes': [{
            'stress_ref': True,
            'pronunciation': 69.046341,
            'stress_detect': True,
            'phoneme': 'o',
            'start': 1.14,
            'end': 1.27,
            'judge': True,
            'calibration': 'o',
            'prominence': 1
        }, {
            'stress_ref': False,
            'pronunciation': 65.357841,
            'stress_detect': False,
            'phoneme': 'k',
            'start': 1.28,
            'end': 1.42,
            'judge': True,
            'calibration': 'k',
            'prominence': 0.838557
        }, {
            'stress_ref': True,
            'pronunciation': 63.982838,
            'stress_detect': True,
            'phoneme': 'e',
            'start': 1.43,
            'end': 1.8,
            'judge': True,
            'calibration': 'e',
            'prominence': 0.956448
        }]
    }],
    'fluency': 83.554047,  // Sentence fluency
    'overall': 83.885124,  // Overall sentence score
    'errorCode': '0',  // Result error code; always present
    'end': 1.8,  // Sentence end time, in seconds
    'speed': 55.555557  // Sentence speed (words per minute)
}

Four. Summary

The documentation for Youdao Zhiyun's intelligent speech evaluation API is clear, I hit no pitfalls while calling it, and the development experience was very friendly. The scoring results are objective and fair and make a valuable reference, so I plan to study and improve together with my little niece!

Project address :

