Free real-time stock data: a hands-on Python crawler for Sina stock quotes

2021-10-29

The importance of real-time stock data

Of the four main tradable asset classes (stocks, futures, options, and digital currencies), futures, options, and digital currencies all offer real-time market data through exchange APIs. Quantitative trading interfaces for stocks, however, are not open to ordinary investors, which makes real-time stock data hard to obtain. At the same time, real-time stock data is extremely important for live trading.

For manual traders, real-time data helps with watching the market, and a simple program can turn it into a price alert: when a stock reaches a given price, the trader enters or exits the position.
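
A price alert of this kind can be sketched in a few lines. The function name, the alert format, and the prices below are made up for illustration; a real program would feed in the latest prices it has fetched.

```python
def check_price_alerts(latest_prices, alerts):
    """Return a message for every alert whose trigger price has been reached.

    latest_prices: dict mapping stock code -> latest price
    alerts: list of (code, direction, trigger) tuples; direction is "above" or "below"
    """
    messages = []
    for code, direction, trigger in alerts:
        price = latest_prices.get(code)
        if price is None:
            continue  # no quote for this code yet
        if direction == "above" and price >= trigger:
            messages.append(f"{code} is at {price}, at or above {trigger}")
        elif direction == "below" and price <= trigger:
            messages.append(f"{code} is at {price}, at or below {trigger}")
    return messages


# Made-up prices and alert levels, for illustration only.
prices = {"sh600000": 10.5, "sz000001": 12.0}
alerts = [("sh600000", "above", 10.0), ("sz000001", "below", 11.0)]
triggered = check_price_alerts(prices, alerts)
for message in triggered:
    print(message)
```

Only the first alert fires here, because the second stock is still above its "below" trigger.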

For quantitative traders, real-time quotes matter even more. Once quotes arrive, we need them to compute strategy signals, and when a signal calls for placing an order on a stock, we also need that stock's latest price and order book data, so that we can choose a suitable order price based on both. In addition, real-time data that has been saved to disk can later serve strategy backtesting.

Crawlers explained in plain terms

A crawler imitates what happens when we look something up on a web page. For example, when we enter Baidu's address in the browser, the browser returns Baidu's home page. This is really a request followed by a response: we request an address, and what comes back is data. (What we see is the Baidu home page, but behind it is a series of data that the browser then renders visually as a web page.)

Requesting stock data works the same way. For example, to request the data of one stock (taking 600000, Shanghai Pudong Development Bank, as an example), we enter the query address in the browser; the stock code in the address can be changed freely. The browser returns the corresponding data set and displays it.

  Querying several stocks at once is also possible. We again enter the address in the browser, for example: ,sz000001,sz000002, which queries Shanghai Pudong Development Bank, Ping An Bank, and Vanke A at the same time.


So how do we imitate this web query from Python? We need a third-party Python library: the requests library.

requests is an HTTP library released under the Apache2 license. It supports HTTP keep-alive and connection pooling, sessions that persist cookies, file uploads, automatic decoding of response content, internationalized URLs, and automatic encoding of POST data. requests is a high-level wrapper around Python's built-in modules, which makes network requests in Python far more human-friendly; with requests it is easy to send the same kinds of requests a browser sends. Within a session, requests also keeps connections alive automatically (keep-alive).

These advantages, together with its ease of use, make requests the preferred tool for Python crawlers. Repeating the web page query above with requests takes only a few simple steps. First we query a single stock and look at what is returned.

  Then we query several stocks. The result returned through requests is exactly the same as what we saw in the browser, which shows that requests faithfully reproduces the browser's request.
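
As a rough sketch, the two queries could look like this in Python. The query address is missing from the text above, so the host and path below are an assumption (the commonly cited Sina quote endpoint), as are the Referer header and the GBK encoding; the function names are made up for illustration.

```python
# Assumption: this host and path are the commonly cited Sina quote endpoint;
# the article itself does not show the address.
SINA_QUOTE_URL = "http://hq.sinajs.cn/list="


def build_quote_url(codes):
    """Join exchange-prefixed codes (e.g. "sh600000") into one query address."""
    return SINA_QUOTE_URL + ",".join(codes)


def fetch_raw_quotes(codes, timeout=5):
    """Send one GET request for all codes and return the response text."""
    import requests  # third-party library: pip install requests

    # Assumption: the service may reject requests that carry no Referer header.
    headers = {"Referer": "https://finance.sina.com.cn"}
    response = requests.get(build_quote_url(codes), headers=headers, timeout=timeout)
    response.raise_for_status()
    response.encoding = "gbk"  # assumption: the endpoint answers in GBK
    return response.text


# Building the address needs no network access.
single_url = build_quote_url(["sh600000"])
multi_url = build_quote_url(["sh600000", "sz000001", "sz000002"])
print(single_url)
print(multi_url)
```

Calling fetch_raw_quotes(["sh600000"]) would send the single-stock request, and passing several codes sends the multi-stock request in one round trip. Whether the endpoint still answers in this form is not something the article confirms.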


Hands-on: a Python crawler for Sina stock quotes

Step 1: the core function, calling the Sina API with the requests library

We query stock prices in real time by calling the Sina stock API. We use the multi-stock query and send the request with requests.

Core function logic: pass the stock codes in through code, send the query with the requests library, and parse the response to get the key fields of each stock we queried, such as the latest price, the day's change, and the previous close.
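
The parsing half of such a core function might look like the sketch below. The response format and the field positions (name, open, previous close, latest price) are assumptions about the API, not something the article shows, and the sample text is made up rather than real market data.

```python
import re

# Assumed response shape: one line per stock, such as
#   var hq_str_sh600000="name,open,previous close,latest price,...";
QUOTE_LINE = re.compile(r'var hq_str_(\w+)="([^"]*)";')


def parse_quotes(text):
    """Parse the raw response text into a dict keyed by stock code."""
    quotes = {}
    for code, payload in QUOTE_LINE.findall(text):
        fields = payload.split(",")
        if len(fields) < 4:
            continue  # empty payload, e.g. for an unknown code
        prev_close = float(fields[2])
        price = float(fields[3])
        # Day's change in percent relative to the previous close.
        change_pct = (price - prev_close) / prev_close * 100 if prev_close else 0.0
        quotes[code] = {
            "name": fields[0],
            "price": price,
            "prev_close": prev_close,
            "change_pct": round(change_pct, 2),
        }
    return quotes


# Made-up sample text in the assumed format, not real market data.
sample = (
    'var hq_str_sh600000="Stock A,10.00,10.00,10.50";\n'
    'var hq_str_sz000001="Stock B,20.00,20.00,19.00";\n'
    'var hq_str_sz999999="";\n'
)
parsed = parse_quotes(sample)
for code, quote in parsed.items():
    print(code, quote["price"], quote["prev_close"], quote["change_pct"])
```

Lines with an empty payload are skipped, so a mistyped code does not break the whole batch.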

Step 2: query with multiple threads through threading, and implement a thread pool through Queue

First, a brief introduction to threading and Queue. The threading module offers a rich set of thread-related facilities: common thread functions, thread objects, lock objects, recursive lock objects, event objects, condition variable objects, semaphore objects, timer objects, and barrier objects. threading.Thread is the thread object, and its important methods are start() and run(). start() starts the thread's activity and arranges for run() to be called in a separate thread of control; note that start() can be called only once per thread object, and calling it again raises RuntimeError. run() is the method that represents the thread's activity.
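
A minimal illustration of start() and run(), using a made-up Greeter class: run() executes in the new thread, and a second start() on the same object fails.

```python
import threading


class Greeter(threading.Thread):
    """Hypothetical thread subclass whose run() records which thread ran it."""

    def __init__(self):
        super().__init__()
        self.ran_in = None

    def run(self):
        # Executed in the new thread once start() has been called.
        self.ran_in = threading.current_thread().name


worker = Greeter()
worker.start()  # arranges for run() to be called in a separate thread
worker.join()   # wait until run() has finished
print("run() executed in:", worker.ran_in)

try:
    worker.start()  # a second start() on the same object is not allowed
    second_start_error = None
except RuntimeError as exc:
    second_start_error = exc
    print("second start() failed:", exc)
```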

Python's Queue module (named queue in Python 3) provides synchronized, thread-safe queue classes: the FIFO (first in, first out) queue Queue, the LIFO (last in, first out) queue LifoQueue, and the priority queue PriorityQueue. These queues all implement the required locking, so they can be used directly from multiple threads and can serve to synchronize threads. Queue.put(item) writes an item to the queue; Queue.get([block[, timeout]]) takes an item from the queue.
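
The difference between the three queue classes shows up in the order in which get() hands items back. The short sketch below uses the Python 3 module name queue; the drain helper is made up for illustration.

```python
import queue


def drain(q):
    """Remove and return every item currently in the queue, in get() order."""
    items = []
    while not q.empty():
        items.append(q.get())
    return items


fifo, lifo, prio = queue.Queue(), queue.LifoQueue(), queue.PriorityQueue()
for item in (2, 1, 3):
    fifo.put(item)
    lifo.put(item)
    prio.put(item)

fifo_order = drain(fifo)  # insertion order
lifo_order = drain(lifo)  # reverse insertion order
prio_order = drain(prio)  # smallest item first
print(fifo_order, lifo_order, prio_order)

# With block=False, get() raises queue.Empty instead of waiting for an item.
try:
    fifo.get(block=False)
    got_empty = False
except queue.Empty:
    got_empty = True
    print("queue is empty")
```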

Step 3: modular implementation with the Worker class and the Stock class

Worker class, main functions:

  1. Takes in the thread instance object, calls the init and start methods, and overrides the run function

  2. work_queue stores the queries to run, which are taken out one by one (FIFO); the query results are saved to the result_queue queue, and when that queue is full, all results in it are retrieved and printed
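
The two points above could be sketched as follows. This is not the article's own code: the fetch and on_result parameters are hypothetical hooks added so that the example runs without network access, and the demo uses a stand-in fetch function.

```python
import queue
import threading


class Worker(threading.Thread):
    """Takes stock codes from work_queue and puts results on result_queue."""

    def __init__(self, work_queue, result_queue, fetch, on_result=print):
        super().__init__(daemon=True)
        self.work_queue = work_queue
        self.result_queue = result_queue
        self.fetch = fetch          # hypothetical callable: code -> result
        self.on_result = on_result  # what to do with each result (print by default)

    def run(self):
        while True:
            code = self.work_queue.get()  # blocks until a code arrives (FIFO)
            try:
                self.result_queue.put(self.fetch(code))
                if self.result_queue.full():
                    self.flush_results()
            finally:
                self.work_queue.task_done()

    def flush_results(self):
        """Take out everything currently in the result queue."""
        while True:
            try:
                self.on_result(self.result_queue.get(block=False))
            except queue.Empty:
                break


collected = []
work_queue = queue.Queue()
result_queue = queue.Queue(maxsize=3)
for code in ("sh600000", "sz000001", "sz000002"):
    work_queue.put(code)

# Stand-in fetch function that only upper-cases the code; no network involved.
Worker(work_queue, result_queue, fetch=str.upper, on_result=collected.append).start()
work_queue.join()  # returns once every code has been processed
print(collected)
```

The worker is a daemon thread, so it does not keep the program alive once the main thread is done.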



Stock class, main functions:

  1. Initializes the producer and consumer queues; the producer queue starts empty, and the maximum size of the consumer queue is the number of stocks queried

  2. Initializes the thread pool according to the preset number of threads and binds it to the Worker class

  3. Adds every stock to be queried to the producer queue

  4. Defines the function the crawler uses to fetch data
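
A compact sketch of these four points follows. To stay self-contained it replaces the Worker class with a plain thread target and takes the fetch function as a parameter instead of defining it inside the class; both choices, and the query_once name, are assumptions made for illustration.

```python
import queue
import threading


class Stock:
    """Thread-pool front end: a producer queue of codes and a bounded result queue."""

    def __init__(self, codes, thread_num, fetch):
        self.codes = list(codes)
        self.fetch = fetch                                        # hypothetical: code -> result
        self.work_queue = queue.Queue()                           # producer queue, starts empty
        self.result_queue = queue.Queue(maxsize=len(self.codes))  # consumer queue
        for _ in range(thread_num):
            threading.Thread(target=self._work, daemon=True).start()

    def _work(self):
        # Stand-in for the Worker class: fetch one code at a time.
        while True:
            code = self.work_queue.get()
            try:
                result = self.fetch(code)
            except Exception as exc:  # keep the thread alive if one query fails
                result = exc
            self.result_queue.put((code, result))
            self.work_queue.task_done()

    def query_once(self):
        """Queue every code, wait for the pool, and return results keyed by code."""
        for code in self.codes:
            self.work_queue.put(code)
        self.work_queue.join()
        results = {}
        while not self.result_queue.empty():
            code, value = self.result_queue.get()
            results[code] = value
        return results


# Made-up prices served by a stand-in fetch function; no network involved.
fake_prices = {"sh600000": 10.5, "sz000001": 12.0}
stock = Stock(fake_prices, thread_num=2, fetch=fake_prices.get)
snapshot = stock.query_once()
print(snapshot)
```

Because the result queue is sized to the number of stocks and drained after every round, put() never blocks in this arrangement.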


Finally, we call the Worker and Stock classes to get the results we want.

Run results

The program can fetch index data as well as individual stock data. By default we subscribe to four data streams: the Shanghai index, the Shenzhen Composite Index, Vanke A, and Pudong Development Bank. In the default run we print key data such as the latest price, the day's change, and the previous close, fetching and printing the data at a fixed time interval.

  Extending the program: at the main program entry you can change the stocks to query (several can be queried at once), the query interval, and the number of query threads.
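
The query interval can be handled by a small polling loop such as the one below. The poll function and its parameters are made up for illustration; query stands for whatever function fetches one round of quotes, and the demo passes a counter instead of a real query.

```python
import time


def poll(query, interval_seconds, rounds, sleep=time.sleep):
    """Call query() `rounds` times, waiting interval_seconds between calls."""
    snapshots = []
    for i in range(rounds):
        snapshots.append(query())
        if i < rounds - 1:
            sleep(interval_seconds)  # no wait after the last round
    return snapshots


# Stand-in query that just counts up; a real one would fetch quotes.
counter = iter(range(100))
snapshots = poll(lambda: next(counter), interval_seconds=0.01, rounds=3)
print(snapshots)
```

Passing sleep as a parameter keeps the loop easy to try out without actually waiting.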

  The crawler can of course be extended further, for example to compute quantitative strategy signals or to save the data to disk for use in backtesting.

