When I was preparing for autumn recruitment, every time I saw the experts on Niuke holding offers from seven or eight big tech companies, I couldn't help envying them. At that time my machine learning skills were nothing to show off: at most I had taken a few courses and entered two competitions, without placing in either. I became genuinely anxious. Many friends around me had already started internships at big companies, while I, to be honest with myself, had almost no internship experience, had never systematically derived the algorithms, and only occasionally practiced on LeetCode, having solved about twenty problems. It is no exaggeration to call my state at the time a disaster, and I started to panic. Then I calmed down and went looking for all kinds of interview-experience posts. Niuke is where the experts gather; I read about ten posts there and several more elsewhere, and summed up the key areas examined for algorithm posts in autumn recruitment: self-introduction, project introduction, algorithm details, and data structures and algorithms. Below I go through these four areas one by one.

  • Self-introduction

A good self-introduction leaves a deep impression on the interviewer. This part is entirely in your own words, and you control which points to highlight, so from your narrative the interviewer can hear how familiar you are with your projects and how deeply you have thought about them. That makes preparing in advance particularly important. During the interview, describe each project following a clear logic. For an algorithm project the most important things are the data, the features, the model and the results. If you explain the project clearly along this framework, the interviewer can follow easily and the rest of the interview will go more smoothly, because the interviewer will pick up key words from your self-introduction and ask a series of questions about them in the Q&A session that follows. This implies a tip: whatever you mention, make sure you understand it better than the interviewer does. Don't bring in things you only half understand, or sooner or later it will show.

  • Project introduction

This is the top priority: the project reflects a candidate's overall ability. What if you have no internship experience? That is easy to handle: enter competitions, try to place in them, and then become thoroughly familiar with the algorithms you used. The interviewer is bound to dig into the details of your project and ask you to clear up their doubts, for example how positive and negative samples were chosen, how features were processed, and the details of the model. As another example, if you used a tree model in a competition, you need to know every knowledge point related to tree models. Here are a few examples: Why is XGBoost not sensitive to missing values? Compared with ordinary GBDT, how does XGBoost handle missing values? Why can xgboost/gbdt reach high accuracy with very shallow trees when tuning parameters? And so on. Once you answer such detailed questions vaguely, the interviewer is bound to deduct points, so don't take chances and assume the interviewer won't ask. Murphy's law tells us that anything that can go wrong will go wrong. All of this can be prepared; it is almost an open-book exam, so why not think about how to answer in advance instead of racking your brains for an answer in the exam room? As for competition platforms, you surely know them already: Ali Tianchi, Kaggle and so on are all platforms where anyone can take part.

  • Algorithm details

Besides examining the details of the algorithms that appear in your projects, the interviewer will also ask about machine learning fundamentals. Let me summarize the more important topics. Traditional algorithms: logistic regression, naive Bayes, tree models (random forest/Adaboost/xgboost/lightgbm), SVM, PageRank, clustering. Machine learning theory: class imbalance, overfitting, cross-validation, model selection. Recommender systems: collaborative filtering, FM/FFM, LS-PLM, Wide&Deep, DeepFM, DIN, DIEN, ESMM, Embedding, recall, EE (exploration and exploitation), performance evaluation. These are the core of an algorithm post. In addition, some interviewers place great value on programming languages and tools such as C++/python/Spark. To prepare for all this, I turned every question I could think of into a question-and-answer item and prepared by quizzing myself. I list all the questions at the end of this article. You can prepare from the whole list or pick some of them. Because my job is in recommendation algorithms, the list leans toward that area; later, if I have time, it can be extended to other fields such as natural language processing and computer vision. The recommended books, also drawn from the experience of the experts on Niuke, are 《Statistical Learning Methods》, 《Baimian Machine Learning》, 《Deep Learning》, 《Deep Learning Recommender Systems》 and Zhou's 《Machine Learning》. Of course, just reading these books is definitely not enough; having read them does not mean you have mastered them. Follow the list of questions I made, answer each one in your head or write the answer down. This works extremely well: during my own autumn recruitment I could answer almost instantly, with the material at my fingertips.
I will also post these questions on my official account; they are basically finished. You can also visit my website. The website and official account are introduced below, and you are welcome to get in touch.

  • Data structures and algorithms

This is also quite critical. Some companies will even decide whether you pass based on your performance here. Toutiao, for example, is famous for its love of dynamic programming and always likes to pick medium or hard problems, which is a headache. Many students from non-computer-science backgrounds have no foundation, and without enough practice you may not be able to think clearly on the spot. My suggestion is to practice by topic first, such as dynamic programming, sliding window, two pointers, fast and slow pointers, topK and so on. After working through about 200 problems you can practice at random. Be sure to solve many problems; this is an interview rule that cannot be overemphasized. The recommended book is 《Sword Pointing at Offer》 (剑指 Offer), and for a website you can use the Chinese LeetCode site.
Enough talk. The best time to plant a tree was ten years ago; the second best time is now. If you feel your algorithm skills are not yet enough to win offers in autumn recruitment, start now and conquer the topics one by one. It is not that hard. I will also keep updating my website and official account with algorithm questions and answers, interview experience, referral information and so on. If you are interested, feel free to follow. I intend to keep this account going for a long time and to put real effort into it, and I hope it brings real help to students working on algorithms.

Machine learning Q&A

Logistic regression

  • Derive the loss function of logistic regression and explain what it means.
  • In an advertising LR model, why do we need feature combinations?
  • Why does LR use the sigmoid function? What is the mathematical principle behind it? Why not use other functions?
  • Why can LR be used to predict click-through rate?
  • What conditions should the data meet for LR to work best? In other words, how should the data be processed so that LR works better?
  • Can logistic regression solve nonlinear classification problems?
  • For a data set with m samples and n-dimensional features, what is the dimension of the gradient in LR?
  • Why does the logistic regression loss function come from maximum likelihood estimation rather than least squares?
  • How are the parameters of logistic regression solved?
  • What are the similarities and differences between SVM and LR? When should each be used?
  • Why doesn't LR use MSE?
  • Why does logistic regression need features to be discretized first?
  • How is parallel LR implemented?
  • What are the applications of logistic regression in the financial field?
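
As a study aid for the loss-function and gradient-dimension questions above, here is a minimal sketch in plain Python. The toy data, the learning rate of 0.5 and the helper name `loss_and_grad` are illustrative assumptions, not part of the original list. It computes the mean negative log-likelihood and its gradient, then runs a few steps of batch gradient descent.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_grad(w, X, y):
    """Mean negative log-likelihood of logistic regression and its gradient.

    X holds m samples with n features each; labels in y are 0 or 1.
    """
    m, n = len(X), len(w)
    loss, grad = 0.0, [0.0] * n
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        loss -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
        for j in range(n):
            grad[j] += (p - yi) * xi[j]  # d(loss)/d(w_j) = (p - y) * x_j
    return loss / m, [g / m for g in grad]

# Toy data (hypothetical): m = 200 samples, n = 3 features.
random.seed(0)
m, n = 200, 3
true_w = [1.5, -2.0, 0.5]
X = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
y = [1 if random.random() < sigmoid(sum(a * b for a, b in zip(true_w, x))) else 0
     for x in X]

w = [0.0] * n
initial_loss, grad = loss_and_grad(w, X, y)
for _ in range(300):  # plain batch gradient descent
    _, grad = loss_and_grad(w, X, y)
    w = [wj - 0.5 * gj for wj, gj in zip(w, grad)]
final_loss, _ = loss_and_grad(w, X, y)
```

Note that `grad` has one component per feature (n entries), however many samples m the data set contains.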

Naive Bayes

  • What is Bayesian decision theory?
  • Do you know what naive Bayes is?
  • A company has 60 men and 40 women. Among the men, 25 wear leather shoes and 35 wear sneakers. Among the women, 10 wear leather shoes and 30 wear high heels. All you know is that a certain person is wearing leather shoes, and you need to guess that person's gender. If the inferred probability of being a man is greater than that of being a woman, conclude the person is a man; otherwise conclude the person is a woman.
  • Can you describe the advantages and disadvantages of naive Bayes?
  • The "naive" assumption is the weakness of naive Bayes in prediction. Given such an obviously flawed assumption, why can naive Bayes still achieve good prediction results?
  • What is Laplace smoothing?
  • Does naive Bayes have any hyperparameters that can be tuned?
  • How many kinds of naive Bayes models are there?
  • What applications of naive Bayes do you know?
  • Is naive Bayes a high-variance or a low-variance model?
  • What are the assumptions of naive Bayes? What are its advantages and disadvantages?
  • How does naive Bayes estimate its parameters?
  • What is the difference between the Bayesian school and the frequentist school?
  • What is the difference between logistic regression and naive Bayes?
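
The leather-shoes question above can be checked with Bayes' rule. The following is a small sketch using exact fractions; the dictionary layout and the function name `posterior` are my own illustrative choices, while the counts come from the question itself.

```python
from fractions import Fraction

# Head counts taken from the question: (gender, shoe type) -> people.
counts = {
    ("male", "leather shoes"): 25,
    ("male", "sneakers"): 35,
    ("female", "leather shoes"): 10,
    ("female", "high heels"): 30,
}

def posterior(gender, shoe):
    """P(gender | shoe) = P(gender) * P(shoe | gender) / P(shoe)."""
    total = sum(counts.values())
    n_gender = sum(c for (g, _), c in counts.items() if g == gender)
    n_shoe = sum(c for (_, s), c in counts.items() if s == shoe)
    prior = Fraction(n_gender, total)
    likelihood = Fraction(counts.get((gender, shoe), 0), n_gender)
    evidence = Fraction(n_shoe, total)
    return prior * likelihood / evidence

p_male = posterior("male", "leather shoes")      # 25 / 35 = 5/7
p_female = posterior("female", "leather shoes")  # 10 / 35 = 2/7
guess = "male" if p_male > p_female else "female"
```

With these counts the posterior for "male" is 5/7, so the decision rule in the question picks male.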

Tree model

  • Explain your understanding of entropy, information gain and information gain ratio.
  • What is the splitting criterion of the ID3 algorithm?
  • What are the shortcomings of the ID3 algorithm? How does C4.5 address them? What shortcomings do ID3 and C4.5 share?
  • How does C4.5 handle missing values?
  • What is the splitting criterion of C4.5?
  • What are the shortcomings of the C4.5 algorithm?
  • What is the definition of the Gini index, and what are its advantages?
  • How does CART select the splitting feature when feature values are missing?
  • Once the splitting feature is selected, how should CART handle samples that are missing that feature's value?
  • Why do decision trees overfit, and how can this be addressed?
  • What are the pruning strategies for decision trees? What are their advantages and disadvantages?
  • What pruning method does C4.5 use?
  • How does CART handle class imbalance?
  • How does CART handle continuous values?
  • Explain the differences among ID3, C4.5 and CART.
  • Why does the CART algorithm choose the Gini index?
  • How does the C4.5 algorithm handle continuous values?
  • How do decision trees handle missing values?
  • How is the importance of each feature computed in a decision tree?
  • If there are many features, is a feature that the final decision tree never uses necessarily useless?
  • Do decision trees need normalization?
  • Since neural networks can also solve classification problems, what is the point of algorithms such as SVM and decision trees?
  • What is the relationship between decision trees and conditional probability distributions?
  • What is CART's pruning strategy?
  • What impact do outliers or an uneven data distribution have on a decision tree?
  • What are the advantages of decision trees compared with other models?
  • What is the difference between decision trees and logistic regression?
  • What is the difference between a classification tree and a regression tree?
  • How should the loss function of a decision tree be understood?
  • Should one-hot encoding be used with the decision trees in sklearn?
  • Briefly describe the steps of random forest.
  • Can a random forest overfit?
  • Why doesn't random forest require splitting the data into a training set and a test set?
  • How does random forest handle missing values?
  • What is the difference between random forest and GBDT?
  • Compare random forest and SVM.
  • Describe the advantages and disadvantages of random forest.
  • Briefly describe AdaBoost's weight update method.
  • Derive the sample weight update formula of AdaBoost.
  • During training, every round still misclassifies some samples, so why can AdaBoost as a whole converge quickly?
  • What are the advantages and disadvantages of AdaBoost?
  • What are the similarities and differences between AdaBoost and GBDT?
  • Briefly introduce the principle of GBDT.
  • Why can regression trees serve as the base learners in GBDT's iterations?
  • How is GBDT used for classification problems?
  • Why does GBDT fit m CART regression trees that are each a small binary tree (only two leaf nodes per tree), instead of a single binary tree with m+1 levels (at most 2^m leaf nodes)?
  • How is GBDT regularized?
  • Why does GBDT fit negative gradients instead of residuals?
  • What are the advantages of GBDT?
  • What is the role of shrinkage in GBDT?
  • Why is residual-based GBDT not a good choice?
  • Why is it said in gradient boosting trees that the negative gradient of the objective function with respect to the current model approximates the residual?
  • Why can xgboost/gbdt reach high accuracy with very shallow trees when tuning parameters?
  • Why do GBDT and Random Forest perform so well in real Kaggle competitions?
  • How is GBDT used in click-through rate prediction?
  • How is the gradient computed in GBDT? It is the gradient of what with respect to what?
  • For an m×n data set, what is the dimension of the gradient when using GBDT? Does it depend on the depth of the tree, or on the number of leaf nodes?
  • What are the differences and similarities between random forest and GBDT?
  • What are the differences and connections between GBDT and AdaBoost?
  • Introduce the principle of XGBoost.
  • What is the difference between XGBoost and GBDT?
  • What are the differences and similarities between RF and GBDT?
  • Why does XGBoost use a second-order Taylor expansion?
  • How is the parallelized part of XGBoost implemented?
  • Why is XGBoost fast?
  • How are leaf node weights computed in XGBoost? Why can leaf node scores be used to measure the complexity of a tree?
  • What are the conditions for a tree to stop growing in XGBoost?
  • Derive XGBoost.
  • What methods does XGBoost have for preventing overfitting?
  • How does XGBoost handle imbalanced data?
  • Compare LR and GBDT. In what situations is GBDT worse than LR?
  • How are trees pruned in XGBoost?
  • When training a model with XGBoost, how do you adjust the parameters if it overfits?
  • How does XGBoost choose the best split point?
  • How is XGBoost's scalability reflected?
  • How does XGBoost evaluate feature importance?
  • What are the general steps of XGBoost parameter tuning?
  • How do you fix an XGBoost model that overfits?
  • Why is XGBoost not sensitive to missing values? Compared with ordinary GBDT, how does XGBoost handle missing values?
  • How is regularization implemented in XGBoost?
  • What is the difference between XGBoost and LightGBM?
  • How does XGBoost compute the inverse of the Hessian matrix?
  • How does the approximate algorithm in xgboost find split points?
  • What are the advantages and disadvantages of LightGBM compared with XGBoost?
  • Introduce the common ensemble learning frameworks: boosting/bagging/stacking.
  • Why is ensemble learning better than a single learner?
  • Briefly explain what the variance and bias of a model mean.
  • Must the base models in ensemble learning be weak models?
  • Compute the overall expectation and overall variance of the ensemble model.
  • Why must the base models in Bagging be strong models?
  • Why must the base models in the Boosting framework be weak models?
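
For the questions on entropy, information gain, gain ratio and the Gini index, a minimal sketch may help when answering out loud. The function names and the toy labels are assumptions chosen for illustration. Each split is represented as a list of label groups, one group per branch.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index: probability that two random draws have different labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy reduction from splitting labels into groups (ID3 criterion)."""
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups)

def gain_ratio(labels, groups):
    """Information gain divided by the split's own entropy (C4.5 criterion)."""
    n = len(labels)
    split_info = -sum(len(g) / n * math.log2(len(g) / n) for g in groups if g)
    return information_gain(labels, groups) / split_info if split_info > 0 else 0.0

# Toy example (hypothetical labels): 4 positives and 4 negatives.
labels = ["yes"] * 4 + ["no"] * 4
perfect_split = [["yes"] * 4, ["no"] * 4]
mixed_split = [["yes", "yes", "yes", "no"], ["yes", "no", "no", "no"]]

ig_perfect = information_gain(labels, perfect_split)
ig_mixed = information_gain(labels, mixed_split)
gr_mixed = gain_ratio(labels, mixed_split)
gini_root = gini(labels)
```

The perfect split recovers the full 1 bit of entropy at the root, while the mixed split gains only about 0.19 bits.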

Feature Engineering

  • In machine learning, what engineering methods are there for feature selection?
  • In ad click-through rate models, what are the advantages and disadvantages of LR, GBDT+LR, FM and DNN? How do they perform in practice?
  • For multi-label data learning problems, what classifiers or classification strategies are commonly used?


SVM

  • How do you understand the constant C in SVM?
  • In SVM, why can the functional margin be set to 1?
  • Kernel functions come up a lot in machine learning. What is the definition and role of a kernel function?

Optimization algorithms

  • What is gradient descent?
  • What problems arise when training an SVM with gradient descent?
  • What is the difference between least squares, maximum likelihood and gradient descent?
  • In optimization problems, why does Newton's method need fewer iterations than gradient descent?
  • A major problem with neural networks is getting stuck in local optima, so why aren't convex functions chosen as activation functions?

Loss function

  • Explain the definition of a loss function.
  • Explain your understanding of the logistic regression loss function.
  • Explain your understanding of the squared loss function.
  • Explain your understanding of the exponential loss function.
  • Explain your understanding of the hinge loss function.
  • Compare the loss functions of logistic regression and SVM.
  • For logistic regression, why is the squared loss function non-convex?
  • How can the derivation of SVM be related to a loss function?
  • How do you design a custom loss function for a neural network? If you need to modify or design your own loss, what rules should you follow?
  • What is the relationship between softmax and cross-entropy?
  • Why is the loss function of a neural network non-convex?
  • What loss functions (optimization objectives) are commonly used in deep learning?
  • What techniques are there for designing a loss function in neural networks?
  • In neural networks, why not set the partial derivatives of the loss function to zero and solve for the optimal weights directly, instead of computing the weights iteratively with gradient descent?
  • When using the cross-entropy loss, I only want to penalize ambiguous predictions in the range 0.4~0.6. How should the loss be changed?
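
To compare the losses named above side by side, it helps to write each one as a function of the margin y·f(x), with labels y in {-1, +1}. This is a sketch under that convention; the function names are my own.

```python
import math

def hinge(margin):
    """SVM hinge loss: zero once the margin reaches 1."""
    return max(0.0, 1.0 - margin)

def logistic(margin):
    """Logistic loss: log(1 + exp(-margin)), never exactly zero."""
    return math.log1p(math.exp(-margin))

def exponential(margin):
    """Exponential loss, the one minimized by AdaBoost."""
    return math.exp(-margin)

def squared(margin):
    """Squared loss (1 - margin)^2: also penalizes margins larger than 1."""
    return (1.0 - margin) ** 2

# Loss values for a wrong, an undecided and a confidently correct prediction.
table = {m: (hinge(m), logistic(m), exponential(m), squared(m))
         for m in (-1.0, 0.0, 2.0)}
```

At margin 2 the hinge loss is already 0 while the squared loss is 1, which is one way to see why squared loss is a poor fit for classification.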


Regularization

  • Explain the meaning of regularization.
  • What is the relationship between regularization and the prior distribution of the data?
  • Why does L1 yield sparse solutions more readily than L2?
  • Why can L1 regularization drive coefficients to 0? How is the non-differentiability of L1 at 0 handled?
  • How does deep learning prevent overfitting?
  • How do you solve an objective function that contains multiple L1 and L2 regularization terms at the same time?
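
One way to build intuition for the L1 sparsity questions is the one-dimensional case, where both penalties have closed-form minimizers. The sketch below assumes the objectives 0.5·(w − a)² + λ·|w| for L1 and 0.5·(w − a)² + 0.5·λ·w² for L2; the function names are illustrative.

```python
import math

def soft_threshold(a, lam):
    """Minimizer of 0.5*(w - a)**2 + lam*abs(w): shrink by lam, clip at zero."""
    return math.copysign(max(abs(a) - lam, 0.0), a)

def ridge_shrink(a, lam):
    """Minimizer of 0.5*(w - a)**2 + 0.5*lam*w**2: scale by 1/(1 + lam)."""
    return a / (1.0 + lam)

lam = 0.5
unregularized = [2.0, 0.3, -0.1, -1.2]
l1_solution = [soft_threshold(a, lam) for a in unregularized]
l2_solution = [ridge_shrink(a, lam) for a in unregularized]
```

Under L1, every coefficient whose unregularized value is within λ of zero becomes exactly zero; under L2, coefficients shrink but stay nonzero.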


Evaluation metrics

  • Explain AUC.
  • Are AUC and accuracy necessarily positively correlated? Is there any intrinsic relationship between them?
  • What are the advantages and disadvantages of accuracy, recall, F1, ROC and AUC?
  • Why can accuracy, precision, f1-score and recall all be high while AUC is low?
  • In machine learning, how are F1 and ROC/AUC evaluated for multi-class classification?
  • How do you handle inconsistency between offline AUC and online click-through rate?
  • Why is AUC insensitive to the ratio of positive to negative samples?
  • How high does AUC need to be to count as good?
  • Give a probabilistic interpretation of AUC.
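
The probabilistic interpretation of AUC can be turned directly into code: AUC is the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The labels and scores below are made up for illustration, and this O(P·N) pair count is a sketch rather than an efficient implementation.

```python
def auc(labels, scores):
    """AUC as P(score of a random positive > score of a random negative)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auc_value = auc(labels, scores)  # 5 winning pairs out of 6

# Duplicating every negative changes the class ratio but not the AUC.
labels_more_neg = labels + [0, 0, 0]
scores_more_neg = scores + [0.8, 0.3, 0.1]
auc_more_neg = auc(labels_more_neg, scores_more_neg)
```

The second computation shows the insensitivity to the positive/negative ratio: each positive still beats the same fraction of negatives.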

Unbalanced data

  • What methods are there for handling imbalanced data sets in machine learning?
  • Briefly describe how the SMOTE sampling method handles imbalanced data.
  • What are the problems with the original SMOTE algorithm? How can it be improved?
  • Briefly introduce the Tomek Links undersampling method.
  • Briefly introduce the NearMiss method.
  • How does the EasyEnsemble algorithm address imbalanced data?
  • How does the BalanceCascade algorithm address imbalanced data?
  • Can SMOTE oversampling and Tomek Links undersampling be combined?
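
The core idea of SMOTE, interpolating between a minority sample and one of its nearest minority neighbours, fits in a few lines. This is a simplified sketch with brute-force neighbour search; the function name `smote`, its parameters and the toy points are assumptions for illustration, not a reference implementation.

```python
import random

def smote(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic minority points by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        t = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Hypothetical minority-class points in two dimensions.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
synthetic = smote(minority, k=2, n_new=5)
```

Every synthetic point lies on a segment between two real minority points, so it never leaves the region they span.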

Recommender system Q&A

Shallow models

  • Briefly describe the recommendation process of user-based collaborative filtering (UserCF).
  • In user-based collaborative filtering, how is user similarity computed?
  • What are the shortcomings of user-based collaborative filtering (UserCF)?
  • Briefly describe the recommendation process of item-based collaborative filtering (ItemCF).
  • Briefly describe the offline engineering implementation of the item-based collaborative filtering algorithm (spark).
  • What are the drawbacks of the existing spark solution (multiply) for the large-scale sparse matrix multiplication in collaborative filtering? How can it be improved?
  • Briefly describe the advantages and disadvantages of collaborative filtering.
  • In real business, what should we pay attention to when using collaborative filtering so that it brings more value to the business?
  • Collaborative filtering suffers from the cold-start problem. Where does it mainly show up?
  • Which recommendation scenarios can collaborative filtering be used for?
  • Briefly describe the engineering implementation of near-real-time collaborative filtering.
  • What is the principle of matrix factorization? What are the main methods for solving it?
  • How can the matrix factorization model be understood from the perspective of deep learning models?
  • In matrix factorization, how does the latent vector length k affect performance and engineering cost?
  • Briefly describe the process of singular value decomposition. What are its shortcomings? Why is it not suitable for Internet scenarios?
  • Briefly describe how gradient descent is used to solve for the user and item latent vectors.
  • How does matrix factorization handle rating bias of users and items?
  • Compared with collaborative filtering, what is the biggest advantage of the logistic regression model? What does its recommendation process look like?
  • Derive the mathematical form of logistic regression.
  • Derive the parameter update of logistic regression under gradient descent.
  • What are the advantages and disadvantages of logistic regression as a CTR prediction model?
  • In industry, continuous values are rarely fed to an LR model as features; instead, continuous features are discretized into a series of 0/1 features. What are the advantages?
  • In CTR estimation, what are the shortcomings of crossing all features pairwise and assigning a weight to every combination?
  • What is the principle of FM? How is it connected to matrix factorization?
  • Why does FM generalize better than POLY2? What are its engineering advantages?
  • What does FFM improve over FM?
  • What is the training complexity of FM, and how is it derived? What is the training complexity of FFM?
  • Why can GBDT be used for feature selection and feature combination?
  • In the GBDT+LR combined model, how does GBDT generate feature vectors?
  • What are the advantages and disadvantages of GBDT+LR?
  • Briefly describe the principle and mathematical form of the LS-PLM model proposed by Alimama.
  • What are the advantages of the LS-PLM model?
  • What is the relationship between the LS-PLM model and deep learning models?
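
For the FM complexity question, the key step is the identity that rewrites the pairwise interaction sum so it costs O(kn) instead of O(kn²). The sketch below checks the identity numerically; the feature vector `x` and latent matrix `V` are made-up values and the function names are my own.

```python
def fm_pairwise_naive(x, V):
    """Sum over i < j of <V[i], V[j]> * x[i] * x[j], computed pair by pair."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(vi * vj for vi, vj in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total

def fm_pairwise_fast(x, V):
    """Same quantity via 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    n, k = len(x), len(V[0])
    total = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        total += s * s - s_sq
    return 0.5 * total

# Hypothetical input: n = 4 features, k = 2 latent dimensions.
x = [1.0, 0.0, 2.0, 1.0]
V = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.5, 0.1]]
naive_value = fm_pairwise_naive(x, V)
fast_value = fm_pairwise_fast(x, V)
```

Both functions return the same number; the second only loops once over the features for each latent dimension.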

Deep models

  • Briefly describe the network structure of Deep Crossing.
  • What is the role of the residual unit in Deep Crossing?
  • What improvements does NeuralCF make on top of the matrix factorization model?
  • What does PNN improve compared with NeuralCF and Deep Crossing? What are its advantages?
  • What feature-crossing methods does PNN have? How do they differ?
  • What are the advantages and limitations of the PNN model?
  • How do you understand Memorization in the Wide&Deep model?
  • What are the disadvantages of Memorization in Wide&Deep?
  • How do you understand Generalization in the Wide&Deep model?
  • What are the disadvantages of Generalization in Wide&Deep?
  • Briefly describe the model structure of Wide&Deep.
  • Wide&Deep uses Joint Training. What are the benefits?
  • How did the authors of wide&deep apply it to make recommendations?
  • Why does the wide&deep model use two optimization methods, ftrl and adagrad?
  • What are the innovations and advantages of the Wide&Deep model?
  • In application scenarios, which features suit the Wide side and which suit the Deep side? Why?
  • Why should continuous features be discretized in the wide&deep model?
  • What does Deep&Cross improve compared with Wide&Deep? How does the Cross network in Deep&Cross operate?
  • What problem does the FNN model mainly aim to solve? How does it solve it?
  • What is the model structure of FNN?
  • In the FNN model, how is FM used to initialize the parameters of the Embedding layer?
  • What was the motivation for proposing DeepFM?
  • In DeepFM, what are the benefits of the FM layer and the NN layer sharing feature embeddings?
  • What does DeepFM improve compared with Wide&Deep? Why was it improved in this way?
  • What does NFM improve compared with Wide&Deep? Why was it changed in this way?
  • What are the characteristics of users' interest in items? How does DIN capture these characteristics?
  • What methods are usually used to capture user interests?
  • How does DIN handle its input?
  • How is DIN's activation unit designed?
  • Why add the cross product as an input?
  • Why implement the activation unit (AU) with a simple MLP?
  • DIN replaces the classic PReLU activation function with Dice. What are the advantages?
  • DIN uses an adaptive regularization algorithm. What is its motivation?
  • The DIN paper uses GAUC as an evaluation metric. What are its benefits?
  • What is the motivation for DIEN to introduce sequence information?
  • Draw the structure of each layer of DIEN, including the interest extraction layer and the interest evolution layer.
  • What does CVR estimation estimate?
  • Unlike CTR estimation, what are the data sparsity and sample selection bias problems faced in cvr estimation?
  • How does ESMM solve the sample selection bias problem?
  • How does ESMM solve the data sparsity problem?
  • Why is the structure of ESMM designed around the "multiplication" relationship rather than division?
  • Explain the objective function of ESMM.


Embedding

  • Why is Embedding technology important for deep learning recommender systems?
  • Briefly describe the principle and structure of Word2Vec.
  • What methods are used to speed up Word2Vec training?
  • Describe your understanding of Item2Vec. What are its limitations?
  • Briefly describe the structure of the two-tower model. What is the function of the item tower?
  • Is a more complex model structure always better? Are more features always better?
  • What is the main idea of DeepWalk? Describe the steps of the algorithm.
  • What do homophily and structural equivalence mean in Node2Vec? How do they correspond to DFS and BFS?
  • Write out the transition probability formula between nodes in Node2Vec.
  • Give examples of the intuitive meaning of homophily and structural equivalence in Node2Vec for a recommender system.
  • EGES was proposed mainly to make up for the shortcomings of DeepWalk. How does it do so?
  • Briefly describe the structure of the EGES model and what each layer does.
  • What are the applications of Embedding in deep learning recommender systems? List three directions.
  • What problems arise when Embedding is trained as part of the deep learning model?
  • What pre-training methods are there for Embedding? Describe each.
  • Briefly describe the process of using Embedding as the recall layer.
  • Briefly describe the principle of locality-sensitive hashing and its role in recommender systems.


Recall

  • Suppose the item inventory reaches the millions. How would you design a method to recommend the top 10 items to users from a pool of this size while also reducing the computational load?
  • Why does ranking receive more attention than recall?
  • What characteristics of the recall model differ significantly from the ranking model?
  • Why not directly use "exposed but not clicked" samples as negative samples for the recall model?
  • How does the recall model randomly sample negative samples?
  • What are the drawbacks of using random samples as negative samples? How can they be addressed?
  • Why does a recommender system need a recall stage? What are the similarities and differences between recall and ranking?
  • How is item suppression implemented in the recall stage of a recommender system?
  • What is the gap between CTR prediction and the goals of a recommender system?
  • Do real-world recommender systems rank only by predicted CTR?
  • Why does CTR estimation apply only to scenarios that have "true negative" samples?
  • Which scenarios in a recommender system cannot obtain true negative samples? How can this be solved?
  • Briefly describe embedding-based recall methods. What are their advantages?
  • How does Airbnb's recall algorithm select positive and negative samples for listing embedding?
  • How does Airbnb's recall algorithm select positive and negative samples for user/listing-type embedding?
  • How does Facebook's EBR algorithm select positive and negative samples?
  • Why does recall require isolating and decoupling user and item features? How is the decoupling done?
  • In recall scenarios, why is Pairwise LearningToRank often used to model relative ranking accuracy?
  • What forms of loss function can be used to optimize Pairwise LearningToRank for recall?
  • How does Pinterest's PinSAGE construct positive samples?
  • Briefly describe the principle of the DSSM model.
  • What problems arise when DSSM's input layer maps text into a low-dimensional vector space?
  • Briefly describe how DSSM is applied to recall. What is its structure?
  • Briefly describe the advantages and disadvantages of DSSM.
  • Why are DSSM's negative samples randomly sampled instead of taken from "exposed but not clicked" samples?
  • Briefly describe Baidu's twin-tower model.
  • Briefly describe the structure and principle of YoutubeDNN.
  • What is the motivation behind the multi-interest network MIND? Describe its structure.
  • How does SDM combine users' long-term and short-term interests? What is its structure?

Feature Engineering

  • If you were asked to design the feature engineering for a recommender system, how would you do it? Cover user-side, item-side and contextual features.
  • During feature processing, how do you handle continuous features?
  • During feature processing, how do you handle categorical features?

Explore and use

  • When a new user registers or a new item is added to the catalog, how do you give that user satisfactory recommendations, and how do you recommend the new item to the users who will like it?
  • Briefly describe the meaning of exploration and exploitation.
  • What is the principle of the greedy algorithm? What are its drawbacks?
  • Briefly describe the principle and steps of Thompson sampling.
  • How does UCB address the exploration-exploitation problem in cold start?
  • Briefly describe the principle and practical implementation of LinUCB.
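
To make the UCB question above concrete, here is a minimal sketch of the classic UCB1 rule, assuming each candidate item is an "arm" and the reward is 1 for a click and 0 otherwise. The class name `UCB1`, the helper `simulate`, and the click rates passed to it are all made up for illustration.

```python
import math
import random


class UCB1:
    """UCB1 bandit: pick the arm with the highest mean reward plus bonus."""

    def __init__(self, n_arms: int):
        self.counts = [0] * n_arms    # how many times each arm was played
        self.values = [0.0] * n_arms  # running mean reward of each arm

    def select_arm(self) -> int:
        # Play every arm once first, so each has an estimate (cold start).
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        # The bonus shrinks as an arm is played more, so rarely tried
        # arms keep getting explored.
        scores = [
            value + math.sqrt(2.0 * math.log(total) / count)
            for value, count in zip(self.values, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm: int, reward: float) -> None:
        self.counts[arm] += 1
        # Incremental update of the mean reward.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(click_rates, rounds=2000, seed=0):
    """Run UCB1 against hypothetical click rates and return the play counts."""
    rng = random.Random(seed)
    bandit = UCB1(len(click_rates))
    for _ in range(rounds):
        arm = bandit.select_arm()
        reward = 1.0 if rng.random() < click_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit.counts
```

A plain greedy policy would drop the square-root bonus and could lock onto whichever arm happened to get lucky early, which is the drawback the greedy question is asking about.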

Characteristic evaluation

  • How does the real-time performance of the model affect the effectiveness of a recommendation system?
  • How does the client side use real-time features for real-time recommendation?
  • How does a stream computing platform perform near-real-time feature processing?
  • What roles do the distributed storage system HDFS and distributed batch processing platforms play in recommendation?
  • Briefly describe the offline/nearline/online training methods and steps.
  • Name several metrics used for offline evaluation.
  • Explain the meaning of the P-R curve, the ROC curve, and AUC, and how they relate to each other.
  • What is a neat way to draw an ROC curve?
  • Does an improvement in offline AUC necessarily lead to an improvement in online metrics? Why?
  • Why run A/B tests? What advantages do they have over offline evaluation?
  • How should the layering and traffic-splitting mechanism of an A/B test be designed?
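
For the ROC and AUC questions above, one way to see how the curve is built is to sort the samples by score and sweep the threshold from high to low. This is a minimal sketch, assuming binary labels (1 for positive, 0 for negative) with at least one sample of each class; `roc_points` and `auc` are hypothetical helper names.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Samples with the same score share one threshold, so count them together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
```

With labels `[1, 0, 1, 0]` and scores `[0.9, 0.8, 0.7, 0.1]`, three of the four positive-negative pairs are ordered correctly, and the area works out to 0.75, which matches the pairwise reading of AUC.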

Deep learning

  • Write down the commonly used loss functions: squared loss, cross-entropy loss, softmax loss, and hinge loss.
  • Why is training a deep neural network so difficult? What are the main reasons?
  • Illustrate forward propagation and backpropagation with an example.
  • What is the purpose of introducing nonlinear activation functions in deep learning?
  • Name the commonly used activation functions and sketch their graphs.
  • How do you choose an activation function? Describe the characteristics of the various activation functions.
  • What are the advantages of the ReLU activation function?
  • Explain the definition and role of the softmax activation function. How is softmax applied to multi-class classification?
  • In deep model training, why use a batch size? How do you choose a suitable batch size, and how does it affect the results?
  • Explain the principle of BN. Why use batch normalization?
  • What is model fine-tuning? Explain the three states of fine-tuning a model and the characteristics of each.
  • Why can unsupervised pre-training help deep learning?
  • What methods are there for initializing weights and biases? Explain their characteristics.
  • What is the role of the learning rate? What are the common learning rate decay methods? Explain the characteristics of each.
  • What methods prevent overfitting in deep learning?
  • Name several common optimization algorithms and their characteristics.
  • How do you balance variance and bias in deep learning? What should you do if the bias is too large? If the variance is too large?
  • Explain the principle of dropout. How does dropout differ between training and testing?
  • What data augmentation methods are commonly used in deep learning?
  • How should Internal Covariate Shift be understood?
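
Several of the questions above (softmax, cross-entropy, dropout at training versus test time) are easier to answer after writing them out by hand once. The sketch below uses plain Python lists instead of a deep learning framework, and it assumes the "inverted dropout" variant, where the kept activations are rescaled during training so nothing needs to change at test time. The function names are my own.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[target_index])


def dropout(activations, p_drop, training, rng=random):
    """Inverted dropout; assumes 0 <= p_drop < 1."""
    if not training or p_drop == 0.0:
        # At test time every unit is kept and nothing is rescaled.
        return list(activations)
    keep = 1.0 - p_drop
    # Each unit survives with probability `keep`; survivors are scaled by
    # 1/keep so the expected activation matches the test-time value.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

For multi-class classification, the softmax output is fed into `cross_entropy` with the index of the true class, which is what is usually meant by "softmax loss".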

One hundred C++ questions and answers


  • What is the role of a variable? What is the syntax for creating one?
  • What is the role of constants in C++? Write down two ways to define a constant.
  • Give a few examples of reserved keywords in C++.
  • How much memory do the short, int, long, and long long types each occupy?
  • What does the sizeof keyword do?
  • How much memory does a character variable occupy? What is distinctive about how it is stored?
  • Give some examples of escape characters in C++.
  • What is the difference between pre-increment and post-increment in C++?
  • Write an example of the ternary operator and explain it.
  • What is the role of break in a switch-case statement?
  • In a for loop, what is the execution order of the start expression, the condition expression, the end-of-iteration expression, and the loop body?
  • What do the break and continue statements do?


  • What are the characteristics of arrays? How do you define an array?
  • What is the relationship between the name of a one-dimensional array and its memory address?
  • How do you define a two-dimensional array? What is the relationship between the name of a two-dimensional array and its memory address?


  • Explain the meaning of formal parameters and actual arguments.
  • What does pass-by-value mean? What effect does it have on the formal parameters and the arguments?
  • What does a function declaration do?

Pointers

  • What does a pointer do? What is the difference between a pointer variable and an ordinary variable?
  • How much memory does a pointer occupy?
  • What is the difference between a pointer to a constant and a constant pointer?
  • What is the difference between pass-by-value and pass-by-address?


  • How do you create a struct? Write down two methods.
  • How do you create an array of structs?
  • How does a struct pointer access the members of a struct?
  • How are structs nested inside other structs? Give an example.
  • Can a struct be passed to a function as a parameter?


  • Briefly describe the function and characteristics of each memory region (code area, global area, stack, heap) while a C++ program is running.
  • What does the new operator do? How is it used?


  • What is the role of a reference? What is it in essence?
  • When a reference is used as a function parameter, how does it differ from pass-by-value and pass-by-address?
  • What are the purpose and syntax of const references?
  • What should you watch out for when writing default function parameters?

Overloading

  • What conditions must be met for function overloading?


  • What does encapsulation mean?
  • What access permissions are there for the attributes and behaviors of a class? What is the difference between them?
  • What is the difference between a class and a struct?
  • What are the advantages of making member attributes private?

Initialization and cleanup

  • What do constructors and destructors do?
  • What is the syntax of a constructor? What are the characteristics of constructors?
  • What is the syntax of a destructor? What are the characteristics of destructors?
  • What are the rules for constructor calls?
  • Explain deep copy and shallow copy in C++.
  • What is the syntax of the member initializer list in C++?
  • Class B has an object of class A as a member. When a B object is created, in what order are A and B constructed and destructed?
  • What are the characteristics of static members?
  • Are the member variables and member functions of a class stored separately? Do non-static member variables occupy space in the object?
  • What does the this pointer do?
  • What effect does const have on a member function? What is the role of the keyword mutable?
  • What is the role of friends in C++? How are global functions, classes, and member functions declared as friends?
  • What kinds of inheritance are there? What access permissions does each give?
  • Can a subclass inherit the private members of its parent class?
  • In what order are the constructors and destructors of parent and child classes called?
  • When a subclass and its parent class have a member with the same name, how do you use a subclass object to access the member in the subclass or in the parent class?
  • What problems does diamond inheritance cause? How does C++ solve them?
  • What is the difference between static polymorphism and dynamic polymorphism?
  • What conditions must hold for polymorphism to exist, and what is required to use it?
  • What are the advantages of polymorphism?
  • What is a pure virtual function? What is its syntax? How does it relate to abstract classes?
  • Explain the meaning, syntax, and differences of virtual destructors and pure virtual destructors.
  • How do you create a function template? What does it do? What should you watch out for?
  • What is the difference between an ordinary function and a function template? What are the calling rules?
  • What problem does specializing a function template solve?
  • What is the role of class templates? What is the syntax? How do they differ from function templates?
  • When are the member functions of a class template created?
  • Explain containers, algorithms, and iterators in the STL.

One hundred Python questions and answers

  • What are the differences among list, tuple, dict, set, and the other types in Python?
  • What forms can function arguments take? What are the characteristics of each?
  • Explain the default-argument trap in Python.
  • Give an example that illustrates the difference between shallow copy and deep copy.
  • What are generators and iterators?
  • Briefly describe how the built-in function zip is used. How does it handle iterables of different lengths, and is there an alternative?
  • How are the higher-order functions map/reduce/filter/sorted used? Illustrate with examples.
  • What is a closure? Illustrate with examples.
  • What are the benefits of anonymous functions? Give an example of their usage.
  • What is a decorator? How is it used?
  • What is a partial function? How is it used?
  • What advantages does enumerate have over range?
  • What is a factory function? Illustrate with examples.
  • Illustrate the difference between class attributes and instance attributes.
  • Explain the concepts of inheritance and polymorphism with examples.
  • How do you restrict access to attributes in a class?
  • How is __slots__ used?
  • What are __str__, __iter__, __getitem__, __getattr__, and __call__ each used for in a custom class?
  • What is the difference between static methods, class methods, and instance methods?
  • What are @classmethod, @staticmethod, and @property?
  • What is the difference between __init__ and __new__?
  • What is Python introspection?
  • How does Python manage memory?
  • What is the GIL?
  • Briefly describe Python's exception handling mechanism.
  • How do you locate bugs in a Python program? How do you step through code in Python?
  • What are assert statements used for?
  • What are the built-in attributes of a class?
  • How do you turn a list of strings into a single space-delimited string?
  • How does the is operator compare objects in Python?
  • Write a regular expression that matches an email address.
  • How do you pass command-line arguments in Python?
  • How should threads in Python be understood?
  • Briefly describe multiprocessing in Python.
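
The default-argument trap in the list above is the one I would practice explaining with code. A minimal sketch follows; the function names `append_buggy` and `append_safe` are made up for illustration.

```python
def append_buggy(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` shares the same list object.
    bucket.append(item)
    return bucket


def append_safe(item, bucket=None):
    # Use None as a sentinel and build a fresh list on each call instead.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling `append_buggy(1)` and then `append_buggy(2)` returns `[1]` and then `[1, 2]`, because the second call sees the item left behind by the first. The same two calls to `append_safe` return `[1]` and `[2]`.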

One hundred Spark questions and answers

  • Briefly describe the concept of an RDD. How do you create one?
  • What operations do RDDs support?
  • What operations do RDDs support? What are the characteristics of each?
  • Give examples of RDD transformations and actions.
  • Explain the lazy evaluation mechanism of RDDs.
  • Explain the function and usage of the transformations map, flatMap, filter, and distinct.
  • Explain the function and usage of the transformations union, intersection, subtract, and cartesian.
  • Explain the usage of, and the differences among, the actions reduce, fold, and aggregate.
  • Why persist an RDD? Briefly describe the different storage levels of the caching mechanism.
  • Briefly describe the roles of, and the differences among, reduceByKey, groupByKey, and combineByKey.
  • What do wide and narrow dependencies of RDDs mean, and how do they differ?
  • What do Client, Master, Worker, Driver, and Executor each mean when running RDD jobs?
  • What is the main difference between DataFrame and RDD?
  • Why is Spark faster than MapReduce?
  • What components does Spark have?
  • What is the basic principle of Spark Streaming?
  • How do you solve data skew in Spark?
  • Explain the meaning of, and the differences among, the three SparkSQL joins: 1. Broadcast Join 2. Shuffle Hash Join 3. Sort Merge Join.
  • Can Spark replace Hadoop?
  • What is the role of the Executor?
  • What is the role of the Driver?
  • How are RDDs cached in Spark?
  • What parameters does Spark's spark-submit script take?
  • Wide and narrow dependencies.
  • Summarize the RDD operators (more than 30).
  • What is the difference between coalesce and repartition?
  • What is the difference between reduceByKey and groupByKey?
  • What is the difference between union and intersection?
  • What are the commonly used wide-dependency and narrow-dependency operators?
  • How is a DAG divided into stages?
  • How are jobs divided?
  • How do you choose a persistence level in Spark?
  • What are the application scenarios of persistence and fault tolerance?
  • What is an accumulator?
  • What are broadcast variables?
  • What is the relationship between nodes and task execution?
  • How do you view logs in cluster mode?
  • How do you optimize Spark?
  • What is a DataFrame?
  • What are the differences among RDD, DataFrame, and Dataset?
  • The RDD cache mechanism at the core of Spark: application scenarios, how to use it, and how to clear the cache.
  • DAGs (directed acyclic graphs) and stage division.

