Microsoft Data Scientist


The role of a Microsoft Data Scientist

Microsoft, long a technology giant, has emerged as a major player in the data science industry since the launch of Azure. Its suite of machine learning tools, including those offered as part of its cloud computing services, is widely regarded as industry-leading. Microsoft has multiple teams working on speech and language, AI, machine learning infrastructure, cloud computing solutions, and more. As a result, Microsoft has become one of the largest recruiters for the Data Scientist profile.

Data Scientist salary at Microsoft:

  • Entry-level salary: USD 151,000
  • Senior positions: USD 360,000
  • Median salary: USD 200,000 (base USD 157,000, stock USD 21,000, bonus USD 22,000)

Microsoft Data Scientist Requirements:

  • Bachelor's degree or higher in Statistics/Math/Computer Science or related field
  • 3+ years of industry work experience in SQL, R, Python, machine/deep learning, and analysis (Recommenders, Prediction, Classification, Clustering, etc.) in a big data environment
  • Experience with large-scale computing systems like Hadoop, MapReduce and/or similar systems preferred
  • Experience with programming languages such as Java or C# is a plus
  • Familiarity with deep learning toolkits, e.g. PyTorch, TensorFlow, etc.
  • Self-driven, with the ability to deliver on ambiguous projects with incomplete or dirty data

Role of a Data Scientist at Microsoft

  • The role of a data scientist at Microsoft varies with the team one is assigned to. In some teams the role is analytics-centric, while in others it involves heavy use of machine learning and AI.
  • A Data Scientist at Microsoft may be asked to identify a business or engineering problem and interpret it by applying data science concepts.
  • They may also have to uncover sources of information, lead the analysis that yields actionable insights, and help engineering teams implement the solution.
  • Data Scientists at Microsoft also have to collaborate with a wide range of engineers and program managers to deliver solutions.

Interview Guide

The interview process consists of 3 rounds as follows:

  1. Initial Screen (30 minutes-1 hour)
  2. Technical Screen (45 minutes-1 hour)
  3. Onsite Interview (full day interview)

Initial Screen


Once you submit your application, you will receive a phone call from one of Microsoft's hiring managers. This phone interview usually lasts under an hour and has two parts. The first focuses on your fit for the company based on experience: expect questions on your background, CV, previous experience, and past projects. The second covers technical questions, which are mostly theoretical. In the past, candidates have been asked about machine learning theory, along with a few questions on statistics and probability.

Interview Questions

Most asked interview questions in the Initial Screen:

  • How do lasso and ridge regression differ?
  • Tell us how you would explain the workings of a deep learning model to a business person.
  • Explain the meaning of a p-value. How would you explain it differently to customers?
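
To make the lasso-versus-ridge question above concrete, here is a minimal sketch for the special case of an orthonormal design matrix, where both estimators have closed forms in terms of the ordinary least-squares coefficient. It assumes the objectives ½‖y − Xβ‖² + (λ/2)‖β‖² for ridge and ½‖y − Xβ‖² + λ‖β‖₁ for lasso. The function names are illustrative and do not come from any library.

```python
def ridge_shrink(beta_ols: float, lam: float) -> float:
    """Ridge, orthonormal design (assumed): scale the OLS coefficient toward zero."""
    return beta_ols / (1.0 + lam)


def lasso_shrink(beta_ols: float, lam: float) -> float:
    """Lasso, orthonormal design (assumed): soft-threshold the OLS coefficient."""
    magnitude = max(abs(beta_ols) - lam, 0.0)
    if magnitude == 0.0:
        return 0.0  # small coefficients are set exactly to zero
    return magnitude if beta_ols > 0 else -magnitude


if __name__ == "__main__":
    # Hypothetical OLS coefficients, chosen only for illustration.
    for b in [3.0, -1.5, 0.4, -0.2]:
        print(b, ridge_shrink(b, 0.5), lasso_shrink(b, 0.5))
```

In this setting ridge shrinks every coefficient proportionally and never sets one exactly to zero, while lasso subtracts a fixed amount and zeroes out small coefficients, which is why it is often described as performing variable selection.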

Technical Screen


After the initial screen, the recruiter will schedule a technical screen. This interview is also conducted by phone, lasts between 45 minutes and an hour, and is more technical than the previous round. Based on past interviews, it consists of around three questions covering data science algorithms, SQL, and the fundamentals of probability and statistics.

Interview Questions

Most asked interview questions in the Technical Screen:

  • You have a bag with 6 marbles, one of which is white. You reach into the bag 100 times, replacing the marble after each draw. What is the probability of drawing the white marble at least once?
  • Given a box of dimensions W and H and the coordinates of points inside that box, find the largest area that is free of any of these points.
  • Why do neural networks work, and why is this a booming field?
  • What is statistical power? How would you explain it to a non-statistician? What are a false positive and a false negative?
  • Write a function to check whether a particular word is a palindrome or not.
  • Find the maximum-sum subsequence in an integer list.
  • Generate a fair coin from a biased one.
  • Generate one of 7 integers with equal probability, given a function that returns 1 or 0 with probability p and (1-p) respectively.
  • What is the ROC curve, and what do sensitivity, specificity, and the confusion matrix mean?
  • Given a time series dataset, how will you predict future values?
  • How can you compute a matrix inverse faster using computational tricks?
  • Describe how gradient boosting works.
  • Describe the steps for data wrangling and cleaning before applying machine learning algorithms.
  • How do you deal with unbalanced binary classification?
  • How do you detect if a new observation is an outlier? What is the bias-variance trade-off?
  • Explain the purpose of a Support Vector Machine (SVM). Under which circumstances would you use one?
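
The three probability puzzles in the list above (the marbles, the fair coin from a biased one, and the seven integers) can be sketched as follows. This is one possible set of answers rather than the expected ones: the fair coin uses the von Neumann pairing trick, and the seven-integer question is interpreted here as "return one of the integers 1 to 7 uniformly", solved by rejection sampling. All function names are hypothetical.

```python
import random
from typing import Callable


def p_white_at_least_once(n_marbles: int = 6, n_white: int = 1, draws: int = 100) -> float:
    """P(at least one white) = 1 - P(no white on any draw), drawing with replacement."""
    p_not_white = (n_marbles - n_white) / n_marbles
    return 1.0 - p_not_white ** draws


def fair_bit(biased_bit: Callable[[], int]) -> int:
    """Von Neumann trick: draw pairs, map 01 -> 0 and 10 -> 1, discard 00 and 11."""
    while True:
        first, second = biased_bit(), biased_bit()
        if first != second:
            return first


def uniform_1_to_7(biased_bit: Callable[[], int]) -> int:
    """Rejection sampling: build a 3-bit number from fair bits and retry on 0."""
    while True:
        value = 0
        for _ in range(3):
            value = (value << 1) | fair_bit(biased_bit)
        if value != 0:
            return value


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for the biased source; p = 0.8 is an arbitrary assumption.
    biased = lambda: 1 if rng.random() < 0.8 else 0
    print(p_white_at_least_once())
    print([uniform_1_to_7(biased) for _ in range(10)])
```

The pairing trick works because the outcomes 01 and 10 both occur with probability p(1-p), whatever p is, as long as the draws are independent and p is neither 0 nor 1.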


Onsite Interview


The onsite interview for Data Scientist at Microsoft is a full-day event. The candidate is interviewed by five current Microsoft data scientists.

The onsite loop has a separate interview panel for each of these five topics:

  • Probability and statistics
  • Data structures and algorithms
  • Modelling and machine learning systems
  • Hiring manager and behavioural interview
  • Data manipulation

During the one-hour lunch break, the candidate gets 1:1 time with one or two data scientists to learn more about Microsoft, the teams, and the work they do. The onsite round delves deeper into the technical concepts covered in the technical screen, so prepare statistics, probability, data science, and machine learning algorithms thoroughly. It may also require whiteboarding solutions, especially writing code and functions in SQL and Python, so practise plenty of those before the onsite loop. Microsoft interviews also include many open-ended questions whose solutions are open to interpretation, and many questions based on data presentation and visualization. This sets them apart from interviews at other companies for similar roles.
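
As a rough idea of what a combined SQL and Python whiteboard exercise might look like, the sketch below counts events per user with a `GROUP BY` query run through Python's standard `sqlite3` module. The `events` table, its columns, and the function name are hypothetical and chosen only for illustration; they do not reflect any actual interview question.

```python
import sqlite3


def events_per_user(rows):
    """Load (user_id, event) rows into an in-memory table and count events per user."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, assumed for this example only.
        conn.execute("CREATE TABLE events (user_id INTEGER, event TEXT)")
        conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
        query = """
            SELECT user_id, COUNT(*) AS n_events
            FROM events
            GROUP BY user_id
            ORDER BY n_events DESC, user_id ASC
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [(1, "click"), (2, "view"), (1, "view"), (3, "click"), (1, "buy"), (2, "click")]
    print(events_per_user(sample))
```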


3 quick tips to crack the onsite interview round:

  1. Think out loud

Narrate your approach as you work through the problem so that the interviewer has insight into your thought process.

  2. Use hints

Course-correct mid-answer if your interviewer signals that you're heading in the wrong direction.

  3. Ask questions or seek clarification on the questions asked

Interview Questions

Most asked interview questions in the Onsite round:

  • Merge k (in this case k=2) arrays and sort them.
  • What is the best approach to selecting a representative sample of search queries from 5 million?
  • Three friends in Seattle told you it’s rainy. Each has a probability of 1/3 of lying. What’s the probability of Seattle being rainy?
  • Explain the fundamentals of Naive Bayes. How do you set the threshold?
  • Can you explain what MapReduce is and how it works?
  • Can you explain SVM?
  • How do you detect if a new observation is an outlier? What is the bias-variance trade-off?
  • Discuss how to randomly select a sample from a product user population.
  • How do you implement autocomplete?
  • Describe how gradient boosting works.
  • Find the maximum-sum subsequence in an integer list.
  • What would you do to summarize a Twitter feed?
  • Explain the steps for data wrangling and cleaning before applying machine learning algorithms.
  • How do you deal with unbalanced binary classification?
  • How do you measure the distance between data points?
  • Define variance.
  • What is the difference between a box plot and a histogram?
  • How do you solve the L2-regularized regression problem?
  • How can you compute a matrix inverse faster using computational tricks?
  • How would you perform a series of calculations without a calculator? Explain the logic behind the steps.
  • What is the difference between good and bad data visualization?
  • How do you calculate percentile? Write the code for it.
  • Find the maximum sum subsequence from a sequence of values.
  • What are the L1 and L2 regularization penalties, and how do they differ?
  • Write code to check if a word is a palindrome.
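
Several of the questions above ask for code or a short calculation. The sketch below gives one possible answer to four of them: the palindrome check, the maximum-sum subsequence (read here as a contiguous run and solved with Kadane's algorithm), the percentile (using linear interpolation between sorted neighbours, which is only one of several conventions), and the Seattle rain puzzle. The rain question does not state a prior, so the 50% prior and the independence of the friends are assumptions. All function names are hypothetical.

```python
def is_palindrome(word: str) -> bool:
    """Case-insensitive check that a word reads the same in both directions."""
    normalized = word.casefold()
    return normalized == normalized[::-1]


def max_subarray_sum(values):
    """Kadane's algorithm: largest sum over non-empty contiguous runs."""
    if not values:
        raise ValueError("values must be non-empty")
    best = current = values[0]
    for v in values[1:]:
        current = max(v, current + v)  # extend the run or start a new one at v
        best = max(best, current)
    return best


def percentile(values, q: float) -> float:
    """q-th percentile (0-100), linearly interpolated between sorted neighbours."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= q <= 100:
        raise ValueError("q must be between 0 and 100")
    ordered = sorted(values)
    rank = (q / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def rain_posterior(prior: float = 0.5, p_truth: float = 2 / 3, n_friends: int = 3) -> float:
    """Bayes' rule when all friends independently say 'rain' (prior is assumed)."""
    like_rain = p_truth ** n_friends        # all tell the truth
    like_dry = (1 - p_truth) ** n_friends   # all lie
    return prior * like_rain / (prior * like_rain + (1 - prior) * like_dry)


if __name__ == "__main__":
    print(is_palindrome("Level"))
    print(max_subarray_sum([-2, 1, -3, 4, -1, 2, 1, -5, 4]))
    print(percentile([1, 2, 3, 4], 50))
    print(rain_posterior())
```

Under the assumed 50% prior, the rain posterior works out to 8/9. A different prior changes the answer, which is worth stating out loud in the interview.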


Frequently Asked Questions

How many rounds are there in the Microsoft Data Scientist Interview?

There are 3 rounds in the Microsoft Data Scientist Interview: the Initial Screen, the Technical Screen, and the Onsite Interview.

What is the length of each round of the interview?

The rounds vary in duration. The initial screen can last anywhere between 30 minutes and an hour, while the technical screen runs between 45 minutes and an hour. The onsite round is the longest of them all and can take a full day.