Written by | doug

Source: Python technology (ID: pythonall)


At the end of 2019, a sudden epidemic swept across the whole country. It gave us the longest Spring Festival holiday ever, caused heavy losses, and made us realize how small human beings are in the face of nature. Fortunately, under strong and effective government leadership, and through the unremitting efforts of people across the country, the epidemic was finally contained and this war without gunpowder smoke was won.

However, while the situation in China has clearly improved and zero growth in confirmed cases has been achieved, the epidemic has begun to spread around the world. So today, I used Python to build a dashboard of global epidemic data. Let's first take a look at the overall result.


The screen is divided into two modules, global data and domestic data, and each module has three parts: detailed data for each region on the left, an epidemic map in the middle, and charts plus the latest news on the right.

Project structure

The structure of the whole project is as follows.


The crawler module fetches data from Tencent News and stores it in Redis. Flask is a web framework responsible for mapping URLs to background functions and for passing data around: it reads the raw data from Redis, organizes it into the required format, and hands it to the front-end page, which then calls Baidu's ECharts to render the charts.

First, import all the modules the project needs.

import requests
import json
import redis
from flask import Flask, render_template
import datetime
import pandas as pd

# Request header and Redis client used by the snippets below
# (the original article does not show these; the values are assumptions)
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'}
r = redis.Redis(host='localhost', port=6379, db=0, decode_responses=True)

Data acquisition

Before we start, let's sort out what data we need. For China, we need detailed data for each province, the national totals, the latest news, and the TOP 10 provinces by cases imported from abroad. For the rest of the world, we need detailed data for each country, the overall totals, the latest news, and the TOP 10 countries by new cases in the last 24 hours.

The epidemic data comes from Tencent News. Open the URL (https://news.qq.com/zt2020/page/feiyan.htm#/?nojump=1), press F12 to bring up the developer tools, switch to the Network tab, and after analyzing the requests one by one, we find that all the data we want is returned by a handful of interfaces:

  • Domestic statistics (national totals and detailed data for each province): https://view.inews.qq.com/g2/getOnsInfo?name=disease_h5
  • Detailed data for each foreign country: https://api.inews.qq.com/newsqa/v1/automation/foreign/country/ranklist
  • Foreign totals: https://api.inews.qq.com/newsqa/v1/automation/modules/list?modules=FAutoGlobalStatis
  • Latest news: https://api.inews.qq.com/newsqa/v1/automation/modules/list?modules=FAutoNewsArticleList

Once the interfaces are found, the rest is simple: call them directly and crawl the data down. Note that the domestic statistics interface does not return standard JSON; its data field is itself a JSON string, so a simple extra decoding step is needed.
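That double decoding can be sketched as follows; the sample payload here is made up, but it mirrors the envelope shape of the disease_h5 interface.

```python
import json

# The disease_h5 endpoint wraps the statistics as a JSON string inside JSON,
# so the payload must be decoded twice (this sample payload is made up)
raw = '{"ret": 0, "data": "{\\"chinaTotal\\": {\\"confirm\\": 84180}}"}'
outer = json.loads(raw)            # first decode: the outer envelope
inner = json.loads(outer['data'])  # second decode: the actual statistics
print(inner['chinaTotal']['confirm'])  # → 84180
```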

For convenience in the follow-up steps, we wrap the requests call that fetches the data into a small helper.

def pull_data_from_web(url):
    response = requests.get(url, headers=header)
    return json.loads(response.text) if response.status_code == 200 else None

For the latest news, we only keep the publication time, title, and link.

# Get the latest news articles
def get_article_data():
    data = pull_data_from_web('https://api.inews.qq.com/newsqa/v1/automation/modules/list?modules=FAutoNewsArticleList')
    if data is None:
        return ''
    return [[item['publish_time'], item['url'], item['title']] for item in data['data']['FAutoNewsArticleList']]

For China, we need both the national totals and the detailed data for each province.

# Get domestic statistics: national totals (current confirmed, confirmed, cured, dead, imported from abroad)
# plus the same detailed data for each province
def get_china_data():
    data = pull_data_from_web('https://view.inews.qq.com/g2/getOnsInfo?name=disease_h5')
    if data is None:
        return ''
    china_dict = json.loads(data['data'])  # the 'data' field is itself a JSON string
    province_res = []
    for province in china_dict['areaTree'][0]['children']:
        name = province['name']
        now_confirm = province['total']['nowConfirm']
        confirm = province['total']['confirm']
        heal = province['total']['heal']
        dead = province['total']['dead']
        import_abroad = 0
        for item in province['children']:
            if item['name'] == '境外输入':  # the API's Chinese label for "imported from abroad"
                import_abroad = item['total']['confirm']
        province_res.append([name, import_abroad, now_confirm, confirm, heal, dead])
    return {'chinaTotal': china_dict['chinaTotal'], 'chinaAdd': china_dict['chinaAdd'], 'province': province_res}

Next, obtain the detailed data for foreign countries and regions.

# Get per-country data: new confirmed, confirmed, cured, dead
def get_rank_data():
    data = pull_data_from_web('https://api.inews.qq.com/newsqa/v1/automation/foreign/country/ranklist')
    if data is None:
        return ''
    return [[item['name'], item['confirmAdd'], item['confirm'], item['heal'], item['dead']] for item in data['data']]

Then the overall foreign totals.

# Get foreign statistical totals: current confirmed, confirmed, cured, dead
def get_foreign_data():
    data = pull_data_from_web('https://api.inews.qq.com/newsqa/v1/automation/modules/list?modules=FAutoGlobalStatis')
    if data is None:
        return ''
    return data['data']['FAutoGlobalStatis']

Storing the data in Redis finishes this step.

article_data = get_article_data()
r.set('article_data', json.dumps(article_data))

rank_data = get_rank_data()
r.set('rank_data', json.dumps(rank_data))

china_data = get_china_data()
r.set('china_data', json.dumps(china_data))

foreign_data = get_foreign_data()
r.set('foreign_data', json.dumps(foreign_data))

Data processing

After getting the source data, we need to organize it to meet the display requirements of the front page.

We use the pandas library to organize the data. It was introduced earlier in this 100-day series; if you have forgotten, you can look back through the previous articles.

The latest-news data is already fairly regular; there's not much to do, so we use it directly.


Look at the domestic statistics and the detailed data of each province .


As you can see, the current national totals are in chinaTotal, the day's new cases are in chinaAdd, and the per-province details are in province, where each province's row stores imported from abroad, current confirmed, cumulative confirmed, cured, and dead.

For the per-province details, let's sort them in descending order by the number of current confirmed cases.
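A minimal sketch of that sort with pandas; the rows and figures below are made up, and the column names follow the list layout built in get_china_data.

```python
import pandas as pd

# Each row: name, imported from abroad, current confirmed,
# cumulative confirmed, cured, dead (sample figures are made up)
province = [
    ['Hubei', 0, 3, 68135, 63616, 4512],
    ['Heilongjiang', 386, 403, 939, 471, 13],
    ['Guangdong', 191, 25, 1578, 1545, 8],
]
df = pd.DataFrame(province, columns=['name', 'importAbroad', 'nowConfirm',
                                     'confirm', 'heal', 'dead'])
df = df.sort_values('nowConfirm', ascending=False)  # descending by current confirmed
print(df['name'].tolist())  # → ['Heilongjiang', 'Guangdong', 'Hubei']
```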


As you can see, the number of current confirmed cases in Hubei is now very small, and Heilongjiang has jumped to first place, becoming the TOP 1 province by current confirmed cases in China.

Finally, let's look at the foreign data. The statistics are regular; we sort each country's details in descending order by cumulative confirmed cases.


Lastly, we still need to prepare the data for "TOP 10 provinces by cases imported from abroad" and "TOP 10 countries by new cases in 24 hours". Both can be derived directly from the per-province and per-country details.
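Extracting such a TOP N list from the detail rows can be sketched with pandas `nlargest`; the sample rows are made up, and we take the top 2 for brevity.

```python
import pandas as pd

# Sample per-country rows: name and new confirmed cases in 24 hours (made up)
rows = [['US', 25000], ['Spain', 6000], ['Italy', 4000], ['Germany', 3500]]
df = pd.DataFrame(rows, columns=['name', 'confirmAdd'])

# nlargest keeps the n rows with the biggest 'confirmAdd' values
top = df.nlargest(2, 'confirmAdd')
print(top['name'].tolist())  # → ['US', 'Spain']
```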


Chart display

Let's briefly look at how to use ECharts. The first step is to include the corresponding js file and write a div tag to hold the chart.

<script type="text/javascript" src="https://assets.pyecharts.org/assets/echarts.min.js"></script>
<div id="top10" style="width: 300px;height:300px"></div>

Finally, write the js code for the chart.

<script type="text/javascript">
    var top10 = echarts.init(document.getElementById('top10'));
    // Specify configuration items and data for the chart
    var option = {
        title: {
            text: 'Test chart'
        },
        grid: {},
        xAxis: {
            type: 'value'
        },
        yAxis: {
            type: 'category',
            data: ['Apple', 'Orange', 'Banana', 'Pomegranate']
        },
        series: [{
            type: 'bar',
            data: [66, 88, 90, 20]
        }]
    };
    top10.setOption(option);
</script>



But the data here is fixed, while the dashboard's data is dynamic. What to do? The front end can fetch data from a background interface with Ajax, similar to the following.

var top10_result = $.ajax({type: "GET", url: '', data: null, async: false});
top10_result = JSON.parse(top10_result.responseText);

// Set the data
yAxis: {
    data: top10_result.country,
},

Now we've finished a simple chart whose data can be set dynamically. All that's left is to lay out all the charts on the big screen and feed them the data we prepared earlier.

First we initialize Flask and set up the route mapping, then return all the statistics the front page needs in one go.

app = Flask(__name__)

@app.route('/')  # route path is an assumption; the original omitted the decorator
def global_index():
    context = {
        'date': get_date(),
        'statistics_data': json.loads(r.get('foreign_data')),
        'country_data': get_rank_data(),
        'article_data': json.loads(r.get('article_data'))
    }
    return render_template('global.html', **context)

The map data and the TOP 10 data are served in the shape the charts need, and the front page calls them directly via Ajax.

def get_global_top10():
    df = pd.DataFrame(json.loads(r.get('rank_data')), columns=['name', 'confirmAdd', 'confirm', 'heal', 'dead'])
    top10 = df.sort_values('confirmAdd', ascending=True).tail(10)
    result = {'country': top10['name'].values.tolist(), 'data': top10['confirmAdd'].values.tolist()}
    return json.dumps(result)

def get_global_map():
    df = pd.DataFrame(json.loads(r.get('rank_data')), columns=['name', 'confirmAdd', 'confirm', 'heal', 'dead'])
    records = df.to_dict(orient="records")
    china_data = json.loads(r.get('china_data'))
    # '中国' ("China") must match the Chinese country names used by the map
    result = {
        'confirmAdd': [{'name': '中国', 'value': china_data['chinaAdd']['confirm']}],
        'confirm': [{'name': '中国', 'value': china_data['chinaTotal']['confirm']}],
        'heal': [{'name': '中国', 'value': china_data['chinaTotal']['heal']}],
        'dead': [{'name': '中国', 'value': china_data['chinaTotal']['dead']}]
    }

    for item in records:
        result['confirmAdd'].append({'name': item['name'], 'value': item['confirmAdd']})
        result['confirm'].append({'name': item['name'], 'value': item['confirm']})
        result['heal'].append({'name': item['name'], 'value': item['heal']})
        result['dead'].append({'name': item['name'], 'value': item['dead']})

    return json.dumps(result)
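Wiring such helpers to URLs so the front page can call them via Ajax might look like the following sketch; the route path and the stubbed helper are assumptions for illustration, not the article's actual code.

```python
import json
from flask import Flask

app = Flask(__name__)

# Stub standing in for the real Redis-backed get_global_top10 (made up data)
def get_global_top10():
    return json.dumps({'country': ['US'], 'data': [25000]})

# Hypothetical route path exposing the helper as an Ajax endpoint
@app.route('/global_top10')
def global_top10_view():
    return get_global_top10()

# Flask's test client lets us check the wiring without a browser
client = app.test_client()
resp = client.get('/global_top10')
print(json.loads(resp.get_data(as_text=True))['country'])  # → ['US']
```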

With that, we have walked through the whole process: fetching the data, organizing it, and displaying it on the front page. It touches quite a few areas of knowledge.


Today we completed a dashboard for global epidemic data. The steps are clear and the difficulty is not great; the fiddly part is debugging the position and style of each chart component on the page, but that is not the focus of this article.

What you should focus on is how front-end URLs are mapped to background functions, how data is transferred and bound, and how the background logic and data processing work; that is the essence of web development.

Finally, you can get the source code from the official account and modify the program to fetch data from the source on a schedule and update the front-end charts without manual intervention.
