When I was preparing for autumn recruitment, every time I saw the top candidates on Niuke holding seven or eight offers from big companies, I couldn't help envying them. At that time my machine learning skills were nothing to show off: at most I had gone through a few courses and entered two competitions, and I hadn't placed in either. I really started to worry. Many friends around me had already begun internships at big companies, while I had almost zero internship experience, had never systematically derived the algorithms, and only occasionally practiced on LeetCode, with about twenty problems solved. Calling my state at the time a disaster is no exaggeration, and I began to panic. Then I calmed down and went looking for all kinds of interview experience posts. Niuke is where the experts gather; I read about ten posts here and several more elsewhere, and summed up the four key areas that autumn recruitment examines for algorithm positions: self-introduction, project introduction, algorithm details, and data structures and algorithms. Below I go through these four areas one by one.
- Self-introduction
A good self-introduction leaves a deep impression on the interviewer. This part is entirely in your own words, and what to highlight is under your control. From your account the interviewer can tell how familiar you are with your projects and how deeply you have thought about them, so preparing in advance is especially important. During the interview, describe each project following a consistent logic. For an algorithm project the most important things are the data, the features, the model, and the results. When I laid things out along this framework, the interviewer could follow easily and the rest of the interview went more smoothly, because the interviewer picks up the keywords in your self-introduction and asks a series of questions about them in the Q&A that follows. This implies a tip: whatever you mention, make sure you understand it better than the interviewer does. Don't bring up things you only vaguely understand, or sooner or later it will show.
- Project introduction
This is the top priority; the projects reflect a candidate's overall quality. What if you have no internship experience? Easy: enter competitions, try to place well, and get thoroughly familiar with the algorithms you used. The interviewer is bound to dig into the details of your project and have you resolve their doubts, for example how you chose positive and negative samples, how you processed features, and the details of the model. If you used a tree model in a competition, you need to know every knowledge point related to tree models. A few examples: Why is XGBoost insensitive to missing values? Compared with ordinary GBDT, how does XGBoost handle missing values? Why can xgboost/gbdt reach high accuracy with very shallow trees during parameter tuning? With detailed questions like these, once you answer vaguely the interviewer is bound to deduct points. So don't gamble on the interviewer not asking; Murphy's law tells us that anything that can go wrong will. These questions can all be prepared in advance, so it is almost an open-book exam. Why not think through your answers beforehand, rather than racking your brains on the spot? As for competition platforms, you surely know them: Ali Tianchi, Kaggle, and so on are platforms where anyone can compete.
- Algorithm details
Besides the details of the algorithms that appear in your projects, interviewers will also ask about your machine learning fundamentals. Let me summarize the more important topics. Traditional algorithms: logistic regression, naive Bayes, tree models (random forest/Adaboost/xgboost/lightgbm), SVM, PageRank, clustering. Machine learning theory: class imbalance, overfitting, cross-validation, model selection. Recommendation systems: collaborative filtering, FM/FFM, LS-PLM, Wide&Deep, DeepFM, DIN, DIEN, ESMM, Embedding, recall, EE (exploration and exploitation), performance evaluation. These are the core of an algorithm position. Some interviewers also place a lot of weight on programming languages and tools such as C++/python/Spark. To prepare for all this, I turned everything I could think of into questions and answers and drilled myself by asking and answering them. I list all the questions at the end of this post; you can prepare with the whole list or pick parts of it. Because my role is in recommendation algorithms, the list leans toward that area. Later, if I have time, it may be extended to other fields such as natural language processing and computer vision. Recommended books, again drawn from the experience of Niuke's experts: 《Statistical Learning Methods》, 《Baimian Machine Learning》, 《Deep Learning》, 《Deep Learning Recommendation System》, and Zhou's 《Machine Learning》. Of course, reading these books is definitely not enough, and having read them doesn't mean you have mastered them. Go through the question list I made, answering each question in your head or writing the answer down. This works extremely well; during my own autumn recruitment I could answer almost instantly, with everything under control.
I am also publishing these questions on my official account, and that is basically done. You can also visit my website. The website and official account are introduced below; feel free to get in touch.
- Data structures and algorithms
This is also quite critical; some companies will even decide your fate based on your performance here. Toutiao, for example, is famous for its love of dynamic programming and always likes to pick medium or hard problems, which is a headache. Many students from non-computer-science backgrounds lack the foundation, and without enough practice you may not think clearly on the spot. My suggestion is to practice by topic first, such as dynamic programming, sliding window, two pointers, fast and slow pointers, topK, and so on. After about 200 problems you can practice at random. Be sure to do plenty of problems; this is an interview rule that cannot be overemphasized. The recommended book is 《剑指 Offer》, and for a website you can use the Chinese LeetCode site.
Okay, enough talk. The best time to plant a tree was ten years ago; the second best time is now. If you feel your algorithm skills are not yet enough to win offers in autumn recruitment, start now and conquer the topics one by one; it's not hard. I will keep updating my website and official account with algorithm Q&A, interview experience, referrals, and so on. If you're interested, give them a follow. My hope is to keep this account going for a long time, and I will put my heart into it. I hope it brings real help to students working on algorithms.
- Website: http://ml-union.cn
- Official account: Opiate algorithm
Below are the questions I mentioned above. I will keep updating this list. From now on I will mostly write on the official account and probably won't visit Niuke often. Feel free to add me on WeChat so we can talk.
- Personal WeChat: ayao-algo
Machine learning Q&A collection
Logistic regression
- Derive the loss function of logistic regression and explain what it means.
- In an advertising LR model, why build feature combinations?
- Why does the LR model use the sigmoid function? What is the mathematical principle behind it? Why not use another function?
- Why can LR be used to predict click-through rate?
- What conditions should the data satisfy for LR to work best? In other words, what should be done to the data so that LR works better?
- Can logistic regression solve nonlinear classification problems?
- For a dataset with m samples and n-dimensional features, what is the dimension of the gradient in the LR algorithm?
- Why does the logistic regression loss function use maximum likelihood estimation instead of least squares?
- How are the parameters of logistic regression solved for?
- What are the similarities and differences between SVM and LR? Under what circumstances should each be used?
- Why does LR not use MSE?
- Why does logistic regression need the features to be discretized first?
- How is parallel LR implemented?
- What are the applications of logistic regression in the financial field?
Naive Bayes
- What is Bayesian decision theory?
- Do you know what naive Bayes is?
- A company has 60 men and 40 women. Among the men, 25 wear leather shoes and 35 wear sneakers; among the women, 10 wear leather shoes and 30 wear high heels. All you know is that a person is wearing leather shoes, and you need to guess that person's gender. If the probability that the person is a man is greater than the probability that the person is a woman, conclude the person is male; otherwise conclude female.
- Can you describe the advantages and disadvantages of naive Bayes?
- The "naive" assumption is a weakness of naive Bayes in prediction. Given such an obvious flaw in the assumption, why can naive Bayes still predict well?
- What is Laplace smoothing?
- Does naive Bayes have any hyperparameters that can be tuned?
- How many model variants does naive Bayes have?
- Do you know what applications naive Bayes has?
- Is naive Bayes a high-variance or low-variance model?
- What are the assumptions of naive Bayes? What are its advantages and disadvantages?
- How does naive Bayes estimate its parameters?
- What is the difference between the Bayesian school and the frequentist school?
- What is the difference between logistic regression and naive Bayes?
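The leather-shoes question in the list above can be worked through directly with Bayes' rule. The following is only a minimal sketch, assuming the counts stated in that question; the names `counts` and `posterior_given_leather` are made up for illustration and do not come from the original post.

```python
from fractions import Fraction

# Counts taken from the question: 60 men (25 in leather shoes),
# 40 women (10 in leather shoes).
counts = {
    "male": {"total": 60, "leather": 25},
    "female": {"total": 40, "leather": 10},
}

def posterior_given_leather(counts):
    """Return P(gender | leather shoes) for each gender via Bayes' rule."""
    n_people = sum(c["total"] for c in counts.values())
    # Unnormalized posterior: prior P(gender) times likelihood P(leather | gender).
    joint = {
        g: Fraction(c["total"], n_people) * Fraction(c["leather"], c["total"])
        for g, c in counts.items()
    }
    evidence = sum(joint.values())  # P(leather shoes)
    return {g: p / evidence for g, p in joint.items()}

posterior = posterior_given_leather(counts)
prediction = max(posterior, key=posterior.get)
print(posterior, prediction)
```

With a single observed feature this is plain Bayes' rule. The "naive" independence assumption only comes into play when several features are multiplied together as if they were conditionally independent given the class.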
Tree models
- Talk about your understanding of entropy, information gain, and information gain ratio.
- What is the splitting criterion of the ID3 algorithm?
- What are the shortcomings of the ID3 algorithm? How does C4.5 address them? What shortcomings do ID3 and C4.5 share?
- How does C4.5 handle missing values?
- What is the splitting criterion of C4.5?
- What are the shortcomings of the C4.5 algorithm?
- What is the definition of the Gini index, and what are its advantages?
- How does CART select the split feature when feature values are missing?
- Once the split feature is selected, how should CART handle the samples that are missing that feature's value?
- What causes decision trees to overfit, and what are the solutions?
- What pruning strategies are there for decision trees? What are their advantages and disadvantages?
- What pruning method does C4.5 use?
- How does CART handle class imbalance?
- How does CART handle continuous values?
- Describe the differences among ID3, C4.5, and CART.
- Why does the CART algorithm choose the Gini index?
- How does the C4.5 algorithm handle continuous values?
- How do decision trees handle missing values?
- How is the importance of each feature computed in a decision tree?
- If there are many features, are the features left unused in the final decision tree necessarily useless?
- Do decision trees need normalization?
- Since neural networks can also solve classification problems, what is the point of algorithms such as SVM and decision trees?
- What is the relationship between decision trees and conditional probability distributions?
- What is CART's pruning strategy?
- If there are outliers or the data distribution is uneven, what impact does this have on a decision tree?
- What are the advantages of decision trees compared with other models?
- What is the difference between decision trees and logistic regression?
- What is the difference between classification trees and regression trees?
- How do you understand the loss function of a decision tree?
- Should decision trees in sklearn use one-hot encoding?
- Briefly describe the steps of random forest
- Can a random forest overfit?
- Why does a random forest not need the data split into training and test sets?
- How do random forests handle missing values?
- The differences between random forest and GBDT.
- A comparison of random forest and SVM.
- Talk about the advantages and disadvantages of random forest.
- Briefly describe Adaboost's weight update method.
- Derive the Adaboost sample weight update formula.
- During training, every round still misclassifies some samples, so why does Adaboost as a whole converge quickly?
- What are the advantages and disadvantages of Adaboost?
- What are the similarities and differences between AdaBoost and GBDT?
- Please briefly introduce the principle of GBDT.
- Why can regression trees serve as the base learners in GBDT's iterations?
- How is GBDT used for classification problems?
- Why does GBDT use m CART regression trees that are each a binary tree with only two leaf nodes, rather than a single binary tree with m+1 levels (at most 2^m leaf nodes)?
- How is GBDT regularized?
- Why does gbdt use negative gradients instead of residuals?
- What are the advantages of GBDT?
- What is the role of shrinkage in GBDT?
- Why is residual-based GBDT not a good choice?
- Why, in gradient boosting trees, is the negative gradient of the objective function with respect to the current model said to approximate the residual?
- Why can xgboost/gbdt reach high accuracy with very shallow trees during parameter tuning?
- Why do GBDT and Random Forest work so well in real Kaggle competitions?
- How is GBDT used in click-through rate prediction?
- How is the gradient computed in GBDT? The gradient of what with respect to what?
- For an m×n dataset, if you use GBDT, how many dimensions does the gradient have? Is it related to the depth of the tree, or to the number of leaf nodes?
- The differences and similarities between random forest and GBDT.
- What are the differences and connections between GBDT and Adaboost?
- Introduce the principle of XGBoost.
- What are the differences between XGBoost and GBDT?
- The differences and similarities between RF and GBDT.
- Why does XGBoost use a second-order Taylor expansion?
- How is the parallelization in XGBoost implemented?
- Why is XGBoost fast?
- How are leaf node weights computed in XGBoost? Why can leaf node scores be used to measure the complexity of a tree?
- What are the conditions under which a tree stops growing in XGBoost?
- Please derive XGBoost.
- What methods does XGBoost have to prevent overfitting?
- How does XGBoost handle imbalanced data?
- Compare LR and GBDT: in what situations is GBDT worse than LR?
- How are trees pruned in XGBoost?
- When training a model with XGBoost, how do you adjust the parameters if it overfits?
- How does XGBoost choose the best split point?
- How is XGBoost's scalability reflected?
- How does XGBoost evaluate feature importance?
- What are the general steps of XGBoost parameter tuning?
- If an XGBoost model overfits, how do you fix it?
- Why is XGBoost insensitive to missing values? Compared with ordinary GBDT, how does XGBoost handle missing values?
- How is regularization implemented in XGBoost?
- The differences between XGBoost and LightGBM.
- How does XGBoost compute the inverse of the Hessian matrix?
- In xgboost, how does the approximate algorithm obtain split points?
- What are the advantages and disadvantages of LightGBM compared with XGBoost?
- Please introduce the common ensemble learning frameworks: boosting/bagging/stacking.
- Why is ensemble learning better than a single learner?
- Please briefly describe what a model's variance and bias mean.
- Must the base models in ensemble learning be weak models?
- Please calculate the overall expectation and overall variance of the model.
- Why must the base models in Bagging be strong models?
- Why must the base models in the Boosting framework be weak models?
Feature engineering
- In machine learning, what engineering methods are there for feature selection?
- In ad click-through rate models, what are the advantages and disadvantages of LR, GBDT+LR, FM, and DNN? How do they perform in practice?
- For multi-label data learning problems, what classifiers or classification strategies are commonly used?
SVM
- How do you understand the constant C in SVM?
- In SVM, why can the functional margin be set to 1?
- Machine learning often mentions kernel functions. What is the definition and role of a kernel function?
Optimization algorithms
- What is the gradient descent method?
- What problems arise when training an SVM with gradient descent?
- What are the differences among least squares, maximum likelihood, and gradient descent?
- In optimization problems, why does Newton's method need fewer iterations than gradient descent?
- Given that a major problem with neural networks is getting stuck in local optima, why not choose a convex function as the activation function?
Loss function
- Please explain the definition of a loss function.
- Please talk about your understanding of the logistic regression loss function.
- Please talk about your understanding of the squared loss function.
- Please talk about your understanding of the exponential loss function.
- Please talk about your understanding of the hinge loss function.
- Please compare the loss functions of logistic regression and SVM.
- For logistic regression, why is the squared loss function non-convex?
- How does the derivation of SVM relate to its loss function?
- How should a neural network's loss function be designed? If you need to modify or design your own loss, what rules should you follow?
- What is the relationship between softmax and cross-entropy?
- Why is the loss function of a neural network non-convex?
- What loss functions (optimization objectives) are commonly used in deep learning?
- In neural networks, what techniques are there for designing a loss function?
- In neural networks, why not set the partial derivatives of the loss function to zero and solve for the optimal weights directly, instead of computing the weights iteratively with gradient descent?
- When using the cross-entropy loss function, I only want to penalize ambiguous values such as 0.4~0.6. How should the loss be changed?
Regularization
- Please explain the meaning of regularization.
- What is the relationship between regularization and the prior distribution of the data?
- Why is L1 more likely than L2 to give sparse solutions?
- Why can L1 regularization drive coefficients to 0? How does L1 handle the non-differentiable point at 0?
- How does deep learning prevent overfitting?
- When the objective function contains multiple L1 and L2 regularization terms at once, how do you solve it?
AUC
- Please explain AUC.
- Are AUC and accuracy necessarily positively correlated? Is there any intrinsic relationship between them?
- What are the advantages and disadvantages of accuracy, recall, F1, ROC, and AUC?
- Why can accuracy, precision, f1-score, and recall all be high while AUC is low?
- In machine learning, how are F1 and ROC/AUC used to evaluate multi-class classification?
- How do you resolve inconsistencies among offline AUC, online AUC, and online click-through rate?
- Why is AUC insensitive to the ratio of positive to negative samples?
- AUC How much does it take ?
- Give a probabilistic interpretation of AUC.
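For the question on the probabilistic interpretation of AUC, one standard reading is that AUC equals the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative sample, with ties counted as half. The sketch below is only an illustration of that reading: the function name `auc_pairwise` and the toy labels and scores are made up, and the quadratic pairwise loop is chosen for clarity rather than efficiency.

```python
from itertools import product

def auc_pairwise(labels, scores):
    """AUC as P(score of a positive > score of a negative), ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # Compare every (positive, negative) pair and count how often the positive wins.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))

# Toy data, invented for illustration only.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.2]
print(auc_pairwise(labels, scores))
```

Because the value depends only on how positives are ordered relative to negatives, this view is a common starting point for the question above about the ratio of positive to negative samples.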
Imbalanced data
- What methods are there for handling imbalanced datasets in machine learning?
- Please briefly introduce how the SMOTE sampling method handles imbalanced data.
- What are the problems with the original SMOTE algorithm? How can it be improved?
- Please briefly introduce the Tomek Links undersampling method.
- Please briefly introduce the NearMiss method.
- How does the EasyEnsemble algorithm address imbalanced data?
- How does the BalanceCascade algorithm address imbalanced data?
- Can SMOTE oversampling and Tomek Links undersampling be combined?
Recommendation system Q&A collection
Shallow models
- Please briefly describe the recommendation process of user-based collaborative filtering (UserCF).
- In user-based collaborative filtering, how is user similarity computed?
- What are the shortcomings of user-based collaborative filtering (UserCF)?
- Please briefly describe the recommendation process of item-based collaborative filtering (ItemCF).
- Please briefly describe the offline engineering implementation of the item-based collaborative filtering algorithm (Spark).
- What are the drawbacks of Spark's existing solution (multiply) for the large-scale sparse matrix multiplication in collaborative filtering? How can it be improved?
- Please briefly describe the advantages and disadvantages of collaborative filtering.
- In real business, what issues need attention when using collaborative filtering so that it delivers more business value?
- Collaborative filtering suffers from the cold start problem. Where does this mainly show up?
- What recommendation scenarios can collaborative filtering be used for?
- Please briefly describe the engineering implementation of near-real-time collaborative filtering.
- What is the principle of matrix factorization? What are the main methods for solving it?
- How can the matrix factorization model be understood from the perspective of deep learning models?
- In matrix factorization, how does the latent vector length k affect the results and the engineering cost?
- Please briefly describe the process of singular value decomposition. What are its defects? Why is it not suitable for Internet-scale scenarios?
- Please briefly describe the process of solving for the user and item latent vectors with gradient descent.
- How is the rating bias of users and items handled in matrix factorization?
- Compared with collaborative filtering, what is the biggest advantage of predicting with a logistic regression model? What does the recommendation process look like?
- Please derive the mathematical form of logistic regression.
- Please derive the gradient descent parameter update for logistic regression.
- What are the advantages and disadvantages of logistic regression as a CTR prediction model?
- In industry, continuous values are rarely used directly as feature inputs to an LR model; instead, continuous features are discretized into a series of 0/1 features. What are the advantages?
- In CTR estimation, what are the shortcomings of crossing all features pairwise and assigning a weight to every combination?
- What is the principle of FM? How is it connected to matrix factorization?
- Why does FM generalize better than POLY2? What are its engineering advantages?
- What does FFM improve over FM?
- What is the training complexity of FM? How is it derived? What is the training complexity of FFM?
- Why can GBDT be used for feature selection and feature combination?
- In the GBDT+LR combined model, how does GBDT generate feature vectors?
- What are the advantages and disadvantages of GBDT+LR?
- Please briefly describe the principle and mathematical form of the LS-PLM model proposed by Alimama.
- What are the advantages of the LS-PLM model?
- What is the relationship between the LS-PLM model and deep learning models?
Deep models
- Please briefly describe the Deep Crossing network structure.
- What is the role of the residual unit in Deep Crossing?
- What improvements does NeuralCF make on top of the matrix factorization model?
- What does PNN improve over NeuralCF and Deep Crossing? What are its advantages?
- What are the ways of crossing features in PNN? How do they differ?
- What are the advantages and limitations of the PNN model?
- How do you understand Memorization in the Wide&Deep model?
- What are the disadvantages of Memorization in Wide&Deep?
- How do you understand Generalization in the Wide&Deep model?
- What are the disadvantages of Generalization in Wide&Deep?
- Please briefly describe the structure of the Wide&Deep model.
- Wide&Deep uses Joint Training. What are the benefits?
- How did the Wide&Deep authors apply the model to make recommendations?
- Why does the Wide&Deep model use two optimization methods, FTRL and AdaGrad?
- What are the innovations and advantages of the Wide&Deep model?
- In application, which features suit the Wide side and which suit the Deep side, and why?
- Why should continuous features be discretized in the Wide&Deep model?
- What does Deep&Cross improve over Wide&Deep? How does the Cross network in Deep&Cross operate?
- What problem is the FNN model mainly meant to solve? How does it solve it?
- What is the model structure of FNN?
- In the FNN model, how is FM used to initialize the Embedding layer parameters?
- What was the motivation for proposing DeepFM?
- In DeepFM, what are the benefits of the FM layer and the NN layer sharing feature Embeddings?
- What does DeepFM improve over Wide&Deep? Why was it improved this way?
- What does NFM improve over Wide&Deep? Why was it changed this way?
- What are the characteristics of users' interest in items? How does DIN capture these characteristics?
- What methods are generally available for capturing user interests?
- How does DIN handle its input?
- How is DIN's activation unit designed?
- Why is the cross product added as an input?
- Why is a simple MLP used to implement the AU?
- DIN replaces the classic PReLU activation function with Dice. What are the advantages?
- DIN uses an adaptive regularization algorithm. What is the motivation?
- The DIN paper uses GAUC as an evaluation metric. What are its benefits?
- What is the motivation for DIEN introducing sequence information?
- Please draw the structure of each DIEN layer: the interest extraction layer and the interest evolution layer.
- What does CVR estimation estimate?
- Unlike CTR estimation, what are the data sparsity and sample selection bias problems faced in CVR estimation?
- How does ESMM solve the sample selection bias problem?
- How does ESMM solve the data sparsity problem?
- Why is ESMM's structure designed around a multiplication relationship rather than a division relationship?
- Explain the objective function of ESMM.
Embedding
- Why is Embedding technology important for deep learning recommendation systems?
- Please briefly describe the principle and structure of Word2Vec.
- What methods are used to speed up Word2Vec training?
- Describe your understanding of Item2Vec. What are its limitations?
- Please briefly describe the structure of the two-tower model. What is the role of the item tower?
- Is a more complex model structure always better? Are more features always better?
- What is the main idea of DeepWalk? Describe the algorithm steps.
- What do homophily and structural equivalence mean in Node2Vec? How do they correspond to DFS and BFS?
- Please write out Node2Vec's transition probability formula between nodes.
- Use examples to give an intuitive explanation of homophily and structural equivalence in Node2Vec within a recommendation system.
- EGES was proposed mainly to make up for DeepWalk's shortcomings. How does it do so?
- Please briefly describe the structure of the EGES model and what each layer does.
- What are the applications of Embedding in deep learning recommendation systems? List three directions.
- What are the problems with training Embeddings as part of a deep learning model?
- What are the pre-training methods for Embedding? Describe each.
- Please briefly describe the process of using Embedding as the recall layer.
- Please briefly describe the principle of locality-sensitive hashing and its role in recommendation systems.
Recall
- Suppose the item inventory reaches the millions. How would you design a method to recommend the top 10 items to a user from a pool of that size while reducing the computational load?
- Why does ranking receive more attention than recall?
- What characteristics of recall models differ markedly from ranking models?
- Why not simply use "exposed but not clicked" items as negative samples for the recall model?
- How does a recall model randomly sample negative samples?
- What are the drawbacks of using random samples as negatives? How can they be addressed?
- Why does a recommendation system need a recall stage? What are the similarities and differences between recall and ranking?
- Recommend how system recall is implemented item The pressure of ?
- What is the gap between CTR prediction and the goals of a recommendation system?
- Do real recommendation systems rank by predicted CTR alone?
- Why does CTR estimation apply only to scenarios that have "true negative" samples?
- In which recommendation scenarios are true negative samples unavailable? How can this be handled?
- Please briefly describe embedding-based recall methods. What are their advantages?
- How does Airbnb's listing embedding recall algorithm select positive and negative samples?
- How does Airbnb's user-type/listing-type embedding recall algorithm select positive and negative samples?
- How does Facebook's EBR algorithm select positive and negative samples?
- Why does recall require user and item features to be decoupled? How is this done?
- In recall scenarios, why is Pairwise LearningToRank often used to model relative ranking accuracy?
- What forms of loss function can be used to optimize Pairwise LearningToRank for recall?
- How does Pinterest's PinSAGE construct positive samples?
- Please briefly describe the principle of the DSSM model.
- What problems arise when DSSM's input layer maps text into a low-dimensional vector space?
- Please briefly describe how DSSM is applied to recall. What is its structure?
- Please briefly introduce the advantages and disadvantages of DSSM.
- Why does DSSM randomly sample its negatives rather than use "exposed but not clicked" items as negative samples?
- Please briefly describe Baidu's twin-tower model.
- Please briefly describe the structure and principle of YoutubeDNN.
- What is the motivation behind the multi-interest network MIND? Describe its structure.
- How does SDM combine users' long-term and short-term interests? What is its structure?
Feature engineering
- If you were asked to design the feature engineering for a recommendation system, how would you design it? Include user-side, item-side, and contextual features.
- During feature processing, how are continuous features handled?
- During feature processing, how are categorical features handled?
Exploration and exploitation
- When a new user registers or a new item is added, how do you provide the user with satisfactory recommendations, and how do you get new items recommended to the users who will like them?
- Briefly describe the meaning of exploration and exploitation.
- What is the principle of the Greedy algorithm? What are its drawbacks?
- Please briefly describe the principle and steps of Thompson Sampling.
- How does UCB solve the exploration and exploitation problem in cold start?
- Briefly describe the principle and concrete procedure of LinUCB.
Characteristic evaluation
- How does the real-time performance of the model affect the effectiveness of the recommendation system?
- How are real-time features on the client side used for real-time recommendation?
- How does a stream computing platform perform near-real-time feature processing?
- What roles do the distributed storage system HDFS and distributed batch processing platforms play in recommendation?
- Please briefly describe the offline/nearline/online training methods and steps.
- Please name several offline evaluation metrics.
- Please explain the meaning of the P-R curve, the ROC curve, and AUC, and how they relate to each other.
- What is a quick way to draw an ROC curve?
- If offline AUC improves, will online metrics necessarily improve? Why?
- Why run A/B tests? What advantages do they have over offline evaluation?
- How do you design the layering and traffic-splitting mechanism for A/B tests?
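For the AUC question above, one common interpretation is that AUC equals the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one. A minimal sketch of that definition follows; the function name `auc` is hypothetical, ties are counted as half a win, and the O(P x N) double loop is chosen for clarity rather than speed.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct pair
    return wins / (len(pos) * len(neg))
```

Because this value depends only on the relative order of scores, it also hints at why an offline AUC gain does not automatically carry over to online metrics.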
Deep learning
- Please write down the commonly used loss functions: squared loss, cross-entropy loss, softmax loss, and hinge loss.
- Why is training deep neural networks so difficult? What are the main reasons?
- Please illustrate forward propagation and backpropagation with an example.
- What is the purpose of introducing nonlinear activation functions in deep learning?
- Please name the commonly used activation functions and draw their graphs.
- How do you choose an activation function? Describe the characteristics of the various activation functions.
- What are the advantages of the ReLU activation function?
- Please explain the definition and purpose of the Softmax activation function. How is Softmax applied to multi-class classification?
- In deep model training, why is a batch size needed? How do you choose a suitable batch size, and how does it affect the results?
- Please explain the principle of BN. Why use batch normalization?
- What is model fine-tuning? Please explain the three states of a fine-tuned model and the characteristics of each.
- Why can unsupervised pre-training help deep learning?
- What methods are there for initializing weights and biases? Explain their characteristics.
- What is the role of the learning rate? What are the common learning-rate decay methods? Explain the characteristics of each.
- What methods are there to prevent overfitting in deep learning?
- Please name several common optimization algorithms and their respective characteristics.
- How do you balance variance and bias in deep learning? What should you do if the bias is too large? What if the variance is too large?
- Please explain the principle of Dropout. How does dropout differ between training and testing?
- What data augmentation methods are commonly used in deep learning?
- How do you understand Internal Covariate Shift?
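For the Softmax and cross-entropy questions above, here is a minimal sketch in plain Python. It assumes a single sample with a list of logits and an integer class index as the target; the function names are hypothetical, and subtracting the maximum logit is a standard trick to avoid overflow.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy of softmax(logits) against the true class index `target`."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]
```

In multi-class classification the predicted class is simply the index of the largest softmax probability, and the loss shrinks as the logit of the true class grows relative to the others.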
One Hundred C++ Questions and Answers
Basics
- What is the role of variables? What is the syntax for creating a variable?
- What is the purpose of constants in C++? Please write down two ways to define a constant.
- Please give a few examples of reserved keywords in C++.
- How much memory do the short, int, long, and long long types each occupy?
- What does the sizeof keyword do?
- How much memory does a character variable occupy? What is notable about how it is stored?
- Please give some examples of escape characters in C++.
- What is the difference between pre-increment and post-increment in C++?
- Write an example of the ternary operator and explain it.
- What is the role of break in a switch-case statement?
- In a for loop, what is the execution order of the initialization expression, the condition expression, the end-of-loop expression, and the loop body?
- What do the break and continue statements do?
Arrays
- What are the characteristics of arrays? How do you define an array?
- What is the relationship between the name of a one-dimensional array and its memory address?
- How do you define a two-dimensional array? What is the relationship between the name of a two-dimensional array and its memory address?
Functions
- Explain the meaning of formal parameters and arguments.
- What does pass-by-value mean? How does it affect the formal parameters and the arguments?
- What does a function declaration do?
Pointers
- What does a pointer do? What is the difference between a pointer variable and an ordinary variable?
- How much memory does a pointer occupy?
- What is the difference between a pointer to a constant and a constant pointer?
- What is the difference between pass-by-value and pass-by-address?
Structures
- How do you create a structure? Please write down two methods.
- How do you create an array of structures?
- How does a structure pointer access the members of a structure?
- How can a structure be nested inside another structure? Give an example.
- Can a structure be passed to a function as a parameter?
Memory
- Please briefly describe the function and characteristics of each memory region (code area, global area, stack area, heap area) when a C++ program executes.
- What does the new operator do? How do you use it?
References
- What is the purpose of a reference? What is it in essence?
- When a reference is used as a function parameter, how does it differ from pass-by-value and pass-by-address?
- What is a constant reference used for, and how is it written?
- What do you need to watch out for when writing default function parameters?
Overloading
- What conditions must be met for function overloading?
Encapsulation
- What is the meaning of encapsulation?
- What access levels are there for a class's members and behaviors? What is the difference between them?
- What is the difference between a class and a structure?
- What are the advantages of making member attributes private?
Initialization and cleanup
- What are the functions of constructors and destructors?
- What is the syntax of a constructor? What are the characteristics of constructors?
- What is the syntax of a destructor? What are the characteristics of destructors?
- What are the rules for calling constructors?
- Please explain deep copy and shallow copy in C++.
- What is the syntax of the initializer list in C++?
- Class B has an object of class A as a member. When a B object is created, in what order are A and B constructed and destructed?
- What are the characteristics of static members?
- Are member variables and member functions stored separately in a class? Do non-static member variables occupy space in the object?
- What does the this pointer do?
- What effect does const have when it qualifies a member function? What is the role of the mutable keyword?
- What is the role of friends in C++? How are global functions, classes, and member functions declared as friends?
- What are the modes of inheritance? What access permissions does each give?
- Can a subclass inherit the private members of the parent class?
- In what order are the constructors and destructors of parent and child classes called?
- When a subclass and its parent class have members with the same name, how do you access the subclass's or the parent's member through a subclass object?
- What problems does diamond inheritance cause? How does C++ solve them?
- What is the difference between static polymorphism and dynamic polymorphism?
- What conditions must be met for polymorphism, and how is it used?
- What are the advantages of polymorphism?
- What is a pure virtual function? What is its syntax? How does it relate to abstract classes?
- Explain the meaning, syntax, and differences of virtual destructors and pure virtual destructors.
- How do you create a function template? What does it do? What should you watch out for?
- What is the difference between a normal function and a function template? What are the calling rules?
- What problem does explicit specialization of a function template solve?
- What is the role of class templates? What is the syntax? How do they differ from function templates?
- When are the member functions of a class template created?
- Please explain containers, algorithms, and iterators in the STL.
One Hundred Python Questions and Answers
- What are the differences between list, tuple, dict, set, and other types in Python?
- What forms can function arguments take? What are the characteristics of each?
- Please explain the default-parameter trap in Python.
- Please give an example to illustrate the difference between shallow copy and deep copy.
- What are generators and iterators?
- Please briefly describe the usage of the built-in function zip. How does it handle iterables of different lengths, and is there an alternative?
- How are the higher-order functions map/reduce/filter/sorted used? Illustrate with examples.
- What is a closure? Illustrate with examples.
- What are the benefits of anonymous functions? Please give an example to illustrate their usage.
- What is a decorator? How is it used?
- What is a partial function? How is it used?
- What advantages does enumerate have over range?
- What is a factory function? Illustrate with examples.
- Illustrate the difference between class attributes and instance attributes.
- Please explain the concepts of inheritance and polymorphism with examples.
- How do you set access restrictions on attributes in a class?
- How is __slots__ used?
- What are __str__, __iter__, __getitem__, __getattr__, and __call__ each used for in a custom class?
- What is the difference between static methods, class methods, and instance methods?
- What are @classmethod, @staticmethod, and @property?
- What is the difference between __init__ and __new__?
- What is Python introspection?
- How does Python manage memory?
- What is the GIL?
- Please briefly describe Python's exception handling mechanism.
- How do you locate bugs in a Python program? How do you single-step through code in Python?
- What are assert statements used for?
- What are the built-in attributes of a class?
- How can a list whose elements are strings be converted into a space-delimited string?
- How does the is operator compare objects in Python?
- Please write a regular expression that matches an email address.
- How do you pass command-line arguments in Python?
- How should threads in Python be understood?
- Please briefly describe multiprocessing in Python.
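The default-parameter trap asked about in the list above is easy to show in a few lines. This is a minimal sketch; the function names `append_buggy` and `append_safe` are hypothetical.

```python
def append_buggy(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` shares the same list object.
    bucket.append(item)
    return bucket


def append_safe(item, bucket=None):
    # Using None as a sentinel gives each call its own fresh list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling `append_buggy` twice without a second argument returns the same list object both times, with both items in it, while `append_safe` returns a new one-element list on each call.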
One Hundred Spark Questions and Answers
- Please briefly describe the concept of an RDD. How do you create one?
- What operations does an RDD support?
- What operations does an RDD support? What are the characteristics of each?
- Please give examples of RDD transformation operations and action operations.
- Explain RDD's lazy evaluation mechanism.
- Explain the function and usage of the transformations map, flatMap, filter, and distinct.
- Explain the function and usage of the transformations union, intersection, subtract, and cartesian.
- Explain the usage of and differences between the actions reduce, fold, and aggregate.
- Why persist an RDD? Please briefly describe the different storage levels of the caching mechanism.
- Please briefly describe the roles of and differences between reduceByKey, groupByKey, and combineByKey.
- What do wide and narrow dependencies of an RDD mean, and how do they differ?
- In Spark, what do Client, Master, Worker, Driver, and Executor each mean?
- What are the main differences between DataFrame and RDD?
- Why is Spark faster than MapReduce?
- What components does Spark have?
- What is the basic principle of Spark Streaming?
- How do you solve the problem of data skew in Spark?
- Please explain the meaning of and differences between the three kinds of SparkSQL join: (1) Broadcast Join, (2) Shuffle Hash Join, (3) Sort Merge Join.
- Can Spark replace Hadoop?
- What is the role of the Executor?
- What is the role of the Driver?
- What is RDD caching in Spark?
- What parameters does the spark-submit script take?
- Wide dependencies and narrow dependencies.
- Summarize the RDD operators (more than 30).
- What is the difference between coalesce and repartition?
- What is the difference between reduceByKey and groupByKey?
- What is the difference between union and intersection?
- What are the commonly used wide-dependency operators and narrow-dependency operators?
- How is a DAG divided into stages?
- How are jobs divided?
- How do you choose a persistence level in Spark?
- What are the application scenarios for persistence and fault tolerance?
- What is an accumulator?
- What are broadcast variables?
- What is the relationship between nodes and task execution?
- How do you view logs in cluster mode?
- How do you optimize Spark?
- What is a DataFrame?
- What are the differences between RDD, DataFrame, and DataSet?
- Spark's underlying core, the RDD: the caching mechanism, application scenarios, how to use it, and how to clear the cache.
- DAG (directed acyclic graph) and stage division.
You are welcome to follow my official account
- Website: http://ml-union.cn
- official account : Opiate algorithm
You are also welcome to add me on WeChat so we can exchange ideas.
- Personal WeChat: ayao-algo