Twitter Data Scientist

Difficulty: hard | Rounds: 3

The role of a Twitter Data Scientist

Twitter is an American microblogging and social networking service on which users post and interact with short, character-limited messages known as "tweets". With an active user base of 353 million in 2021, Twitter is one of the world's largest technology companies.

Data Scientist salary at Twitter:

  • Entry-level salary: USD 176,000.
  • Senior positions: USD 386,000. 
  • Median salary: USD 225,000, comprising a base of USD 160,000, stock of USD 50,000, and a bonus of USD 15,000.

Role 

The exact role of a data scientist at Twitter depends on the team one is working for. Here are some of the data science teams that are part of Twitter:

Data Science teams at Twitter:

  • Ads 
  • Scaled enforcement heuristics 
  • Consumer Product 
  • Home and Explore 

Here are some of the common responsibilities Twitter data scientists across the board are expected to fulfil.

  • Delivering actionable results through a combination of data science solutions, product thinking, statistical knowledge, and a deep understanding of data.
  • Leading from the front in developing new hypotheses and connecting observed customer behaviour to potential product solutions
  • Mastering a fluid, open-ended mandate and bringing a 0-to-1 mentality to crafting solutions
  • Collaborating with product managers, engineers, designers, and user research to drive product impact
  • Distilling complex analytical results into presentable, digestible, and actionable feedback for product and engineering teams
  • Leading and prioritizing projects
  • Creating a culture of rigour and scientific inquiry
  • Understanding consumer products

Skills/Qualifications preferred

  • 2+ years (4+ years for senior data scientist) of experience in data science and quantitative analysis (preferably in an engineering or product reviews context)
  • Demonstrable first principle problem-solving skills
  • Strong programming skills (Python, R, SQL) and experience using common analysis tools (Hive, Presto, Scalding)
  • Strong bias to action, creative problem-solving mindset, and proactive communication
  • A degree in a quantitative domain such as Computer Science, Machine Learning, Statistics, Operations Research, or similar. A Master's or PhD is a plus but not required
  • Proficiency with ML and data analytics technologies such as Spark, Airflow, TensorFlow, etc.
  • Prior work experience building India-focused technology products is a plus.

Interview Guide

Twitter follows a 3-stage interview process for selecting candidates for its Data Scientist role. The process comprises an initial phone screen by a hiring manager, followed by a technical screen, and concludes with an onsite interview. The onsite round is a full-day event and consists of five to six 1:1 interviews.

Initial Screen

Overview

The initial phone interview is a 30-minute get-to-know-you conversation conducted by a recruiter or the hiring manager. The questions are typically based on your resume: if you have previous work experience, the recruiter will ask about it and about your past projects. The recruiter will also explain the scope of the different roles and answer any questions you have about them. Based on this conversation, they assess whether your experience aligns with the role's requirements. The recruiter may also ask a few questions to gauge your motivation for working at Twitter.

Interview Questions

Most asked questions in the Initial Screen:

  • Why do you want to work for Twitter?
  • Tell us about a project you enjoyed the most 
  • What motivates you to be a Data Scientist?

Technical screen

Overview

After the initial phone screen, you will be set up for a technical phone screen with one of Twitter's data scientists. It usually lasts between 30 minutes and an hour and tests your skills in coding, statistics, machine learning, A/B testing, and product sense. Typically, there are at least two coding questions: one involving SQL or Python and the other an algorithm problem.

It is important to let the interviewers know your approach to solving the questions. Sometimes, interviewers focus exclusively on discussing your “approach”, detailing how you got to the solution and why you used the steps you used.

Tips

  • Brush up your SQL and Python skills.
  • Revise core statistics concepts and their applications.
  • Revise machine learning theory and its application to real-life problems, which may have a business dimension.
  • Get a firm conceptual hold on Twitter's existing products, and upcoming products, if any. Product sense questions focus mostly on Twitter's products.
  • Follow the "think out loud" approach when attempting questions.

Interview Questions

Most important interview questions:

Machine learning

  • How can bounding box regression be used in object detection?
  • How would you implement batch normalization using NumPy?
  • How does a logistic regression model know what the coefficients are?
  • Is random weight assignment better than assigning the same weights to the units in the hidden layer?
  • Given a bar plot, imagine you are pouring water onto it from the top. How would you quantify how much water the bar chart can hold?
  • What is overfitting?
  • How would the change of prime membership fee affect the market?
  • Why is gradient checking important?
  • Describe Tree, SVM, Random forest and boosting. Talk about their advantages and disadvantages.
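
Although it is listed under machine learning, the bar-plot water question above is usually treated as an algorithm problem. The following is a minimal sketch of one common approach (two pointers moving inward), assuming the bars are given as a list of non-negative heights of unit width; the function name `trapped_water` is illustrative and not taken from any interview.

```python
def trapped_water(heights):
    """Return the units of water held between bars of unit width.

    Assumption: `heights` is a list of non-negative numbers, one per bar.
    """
    left, right = 0, len(heights) - 1
    left_max = right_max = 0
    total = 0
    while left < right:
        # The shorter side bounds the water level at that position,
        # so it is safe to settle that side and move inward.
        if heights[left] < heights[right]:
            left_max = max(left_max, heights[left])
            total += left_max - heights[left]
            left += 1
        else:
            right_max = max(right_max, heights[right])
            total += right_max - heights[right]
            right -= 1
    return total
```

This runs in a single pass with constant extra memory. An interviewer may equally accept a simpler version that precomputes the tallest bar to the left and right of every position.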

SQL and Python

  • Write a SQL query to compute the month-to-month user retention rate
  • In Python, what is negative indexing? Why is it needed? Give an example.
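
As a rough illustration of the two questions above, here is a sketch that runs a retention query through Python's built-in `sqlite3` module and then shows negative indexing on a list. The `activity` table, its columns, and the integer `month_idx` counter are assumptions made for this example; a real schema would likely store dates and need date arithmetic instead.

```python
import sqlite3

# Hypothetical schema: one row per (user, month) in which the user was active.
# month_idx is an integer month counter (0, 1, 2, ...) to keep date math out of the sketch.
RETENTION_SQL = """
SELECT cur.month_idx                AS month_idx,
       COUNT(DISTINCT cur.user_id)  AS active_users,
       COUNT(DISTINCT nxt.user_id)  AS retained_users,
       1.0 * COUNT(DISTINCT nxt.user_id) / COUNT(DISTINCT cur.user_id) AS retention_rate
FROM activity AS cur
LEFT JOIN activity AS nxt
       ON nxt.user_id = cur.user_id
      AND nxt.month_idx = cur.month_idx + 1
GROUP BY cur.month_idx
ORDER BY cur.month_idx
"""


def monthly_retention(rows):
    """rows: iterable of (user_id, month_idx) pairs.

    Returns a list of (month_idx, active_users, retained_users, retention_rate),
    where "retained" means the user was also active in the following month.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE activity (user_id INTEGER, month_idx INTEGER)")
    conn.executemany("INSERT INTO activity VALUES (?, ?)", rows)
    result = conn.execute(RETENTION_SQL).fetchall()
    conn.close()
    return result


# Negative indexing counts from the end of a sequence, which avoids
# writing len(seq) - k by hand.
tweets = ["first", "second", "third", "latest"]
last_tweet = tweets[-1]       # the final element
last_two_tweets = tweets[-2:]  # a slice holding the final two elements
```

The self-join pairs each user-month with the same user's following month, so users with no match contribute to the active count but not to the retained count.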

Statistics

  • Explain the math behind the principal component analysis
  • Give an example of a data set with a non-Gaussian distribution.
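
For the principal component analysis question, a compact outline of the standard derivation may help. The notation is an assumption of this sketch: X is the n × d data matrix with column means already subtracted, S is the sample covariance matrix, and m is the number of components kept.

```latex
% Assumed notation: X is n x d with centred columns; m <= d components are kept.
\begin{align}
S &= \frac{1}{n-1} X^\top X \\
w_1 &= \arg\max_{\|w\|_2 = 1} \; w^\top S w \\
\mathcal{L}(w, \lambda) &= w^\top S w - \lambda \,(w^\top w - 1)
  \;\Rightarrow\; \nabla_w \mathcal{L} = 2 S w - 2 \lambda w = 0
  \;\Rightarrow\; S w = \lambda w \\
\operatorname{Var}(X w_k) &= w_k^\top S w_k = \lambda_k, \qquad
  \text{explained variance ratio of component } k = \frac{\lambda_k}{\sum_{j=1}^{d} \lambda_j} \\
Z &= X W_m, \qquad W_m = [\, w_1 \;\cdots\; w_m \,]
\end{align}
```

In words: the principal components are the eigenvectors of the covariance matrix ordered by decreasing eigenvalue, each eigenvalue is the variance captured along its direction, and projecting onto the first m eigenvectors gives the reduced representation Z.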

Coding

  • Given a string with a word, write a function to delete all the duplicates in it.
  • Given an integer n and an array of numbers, output a histogram of the array divided into n bins.
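
Both coding questions above leave details open, so the sketch below fixes some assumptions: "deleting duplicates" is read as keeping the first occurrence of each character in order, and the histogram uses n equal-width bins spanning the minimum to the maximum value. The function names are illustrative.

```python
def remove_duplicates(word):
    """Keep the first occurrence of each character, preserving order."""
    # dict preserves insertion order, so its keys are the unique characters.
    return "".join(dict.fromkeys(word))


def histogram(values, n):
    """Count `values` in n equal-width bins spanning [min(values), max(values)].

    The maximum value is placed in the last bin.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    counts = [0] * n
    if not values:
        return counts
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are identical: put everything in the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / n
    for v in values:
        # Clamp so that v == hi falls into the last bin instead of bin n.
        idx = min(int((v - lo) / width), n - 1)
        counts[idx] += 1
    return counts
```

It is worth stating these assumptions aloud in the interview, since a different reading (for example, dropping every character that appears more than once) changes the answer.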

A/B Testing

  • Some A/B tests fail to yield insights, outcomes, or value. What reasons can you think of for the same? When do A/B tests provide the most value to a business?
  • Design 3 A/B tests to improve the results for Twitter's promoted tweet feature.

Product sense

  • If Twitter were to get into payments, how easy or hard would it be?
  • How would you make Twitter's news feed more relevant to a particular age group?
  • What data-driven insights can help Twitter filter out fake news from its platform?

Onsite round

Overview

After clearing the technical phone screen, the recruiter will call you for an onsite interview at one of Twitter's campuses. The onsite loop comprises 5-6 interviews during the day. The interview panel consists of three members: a data engineer, a data scientist, and an HR manager. The duration of the onsite round is about 6 hours.

The following are the different interviews you will face, in no particular order, as part of the onsite round:

  • Machine learning and modelling interview with a data scientist
  • Data manipulation and A/B testing interview
  • Product sense interview
  • Interview with a data scientist to test your SQL skills
  • Data structures, Algorithms and/or system design interview
  • Behavioural interview with an HR manager to assess your cultural and experiential fit for Twitter

Tips

  • Brush up on machine learning and data manipulation as it's a core focus.
  • Practice a good number of A/B testing questions.
  • Be aware of Twitter's products and services.
  • Include personal stories from previous work experience(s) in your answers to behavioural questions

Interview Questions

Most important questions asked in the onsite round:

  • Suppose you have unbalanced data where the ratio of positive to negative examples is heavily skewed. How would you deal with such data?
  • Write a Python function that displays the first n Fibonacci numbers.
  • Given a large string and a smaller string, write code to determine if the smaller string can be generated from letters of the larger string.
  • Implement the union and intersection of two arrays (in an efficient way). Note that elements of the two given arrays may be repeated but cannot be repeated in union and intersection arrays.
  • Suggest some changes to make the Twitter app more user-friendly. How would you test whether a proposed change is effective?
  • How would you design a system to find the top ten Twitter hashtags in the most recent 1 min, 10 min, 1 hr?
  • How would you measure user engagement given all of Twitter’s analytics and tracking data?
  • Write a query in SQL to measure the number of ads that were viewed in moments versus the news feed.
  • Given a two-column file with user codes and counts, retrieve the top-k users based on a score that is a function of the number of times they appear on the file and these counts.
  • If you got the job at Twitter and got access to all of its data what kind of data analysis would you like to perform?
  • Illustrate a tree-based system with a SQL query.
  • How would you modify a table with over a billion rows?
  • Give an instance of a project when you were on a short deadline and you finished it.
  • Given a list of all followers in the format: 123, 345; 234, 678; 345, 123; … where the first column contains the ID of the follower and the second the ID of the user being followed, find all mutual follows (the pair 123, 345 in the example above). Do the same for the case where the list does not fit into memory.
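
Several of the onsite coding questions above have short hash-based solutions. The following is a hedged sketch of three of them (the string-building check, union and intersection, and mutual follows for the in-memory case). All function names are illustrative, and the input format for follows is assumed to be exactly the semicolon-separated pairs shown in the question.

```python
from collections import Counter


def can_build(small, large):
    """True if `small` can be formed from the letters of `large`,
    using each letter of `large` at most once."""
    # Counter subtraction drops non-positive counts, so an empty result
    # means `large` covers every letter that `small` needs.
    return not (Counter(small) - Counter(large))


def union_and_intersection(a, b):
    """Return (union, intersection) with no repeated elements, each sorted."""
    set_a, set_b = set(a), set(b)
    return sorted(set_a | set_b), sorted(set_a & set_b)


def parse_edges(text):
    """Parse '123, 345;234, 678' into [(123, 345), (234, 678)]."""
    edges = []
    for chunk in text.split(";"):
        if chunk.strip():
            follower, followed = chunk.split(",")
            edges.append((int(follower), int(followed)))
    return edges


def mutual_follows(edges):
    """edges: iterable of (follower_id, followed_id).

    Returns the set of mutual pairs, each stored as (smaller_id, larger_id).
    """
    seen = set()
    mutual = set()
    for follower, followed in edges:
        if follower == followed:
            continue  # ignore self-follows in this sketch
        if (followed, follower) in seen:
            mutual.add((min(follower, followed), max(follower, followed)))
        seen.add((follower, followed))
    return mutual
```

For the variant where the list does not fit into memory, one option is to write each pair to a partition file chosen by hashing the normalised pair (smaller ID, larger ID), so that both directions of a follow land in the same partition, and then run `mutual_follows` on each partition separately. An external sort on the normalised pair is another reasonable answer.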

So, this was a quick round-up of the Twitter data scientist interview prep strategy. If you follow the tips and strategy in this guide rigorously, we are sure you will crack the Twitter data scientist interview.

Thanks for reading!

All the best!

Frequently Asked Questions

How many rounds are there in the Twitter Data Scientist interview?

There are 3 rounds: the initial screen, the technical screen, and the onsite round.