TikTok, the popular short-video app and social networking platform, lets anyone share their creativity and talent from a phone. It helps creators reach a broad audience by sharing short videos on its platform.
A data scientist gathers, examines, and interprets vast amounts of data. The position of data scientist evolved from several traditional technical responsibilities, such as those held by mathematicians, scientists, statisticians, and computer specialists.
You can apply for the TikTok Data Scientist job on the career page or get an internal employee reference through Prepfully's Company Referral services.
Role of a TikTok Data Scientist
- Identify new product features and evaluate their potential impact with the product team.
- Establish product metrics and quantitative measurement structures based on product stages.
- Provide insights to internal teams by designing, implementing, and reporting dashboards and data pipelines.
- Enhance product adoption and performance-driven growth by partnering closely with key stakeholders.
- Provide optimization suggestions to improve operating efficiency by measuring and visualizing the effectiveness of product updates, conducting research, and analyzing various business processes.
- Visualize data to understand its benefits and challenges
- Understand market solutions on a fundamental level
- Analyze data using best practices and techniques
- Provide dashboards or self-service applications for sharing results
- Experience in working directly with data analysis, processing, and visualization programs
- Analyze data with algorithms or programs
- API-based data collection and preparation
TikTok Data Scientist Salary
At TikTok, data scientists earn between $80,000 and $180,000 annually.
The interview process for the TikTok Data Scientist role consists of 3 stages:
- Take-Home Product Analytics
- Technical Screening
- Virtual Onsite Interview
Below is a detailed description of the interview process!
In the take-home product analytics round, you are given problem statements and must answer each question in fewer than 150 words.
- As a data scientist, how would you evaluate the following feature? Name one primary metric and other additional metrics.
- Propose an experimentation and rollout plan, keeping in mind the speed of innovation, statistical rigor, and risk mitigation in case of unforeseen issues.
- Let's say that you work at TikTok. The goal for the company next quarter is to increase the daily active users metric (DAU).
Executive A believes the best way to increase DAU is to improve the recommendation algorithm in TikTok's "For You" feature (TikTok news feed).
Executive B believes the best way to increase DAU is to acquire more new users.
Executive C believes the best way to increase DAU is to improve the creator tools.
The engineering team must prioritize one feature at a time.
How do you figure out which executive is right?
What data points and metrics would help validate your choice?
- Consider a two-dimensional matrix that can be traversed only one adjacent position at a time and never diagonally. Create an algorithm that traverses the matrix from its upper-left corner to its lower-right corner along the shortest possible path, as efficiently as possible.
- Two people are each stuck on their own island, connected only by a ferryman who carries a lockable box. Each person has their own lock and key but can't send the key along with the box. One person wants to send the other a diamond, but it must be locked into the box, or the ferryman will steal it. How do you ship the diamond without the ferryman stealing it?
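For the matrix-traversal question above, one common approach is breadth-first search (BFS), which visits cells in order of distance from the start, so the first time it reaches the lower-right corner it has found a shortest path. The sketch below is only one possible answer. The function name `shortest_path` and the convention that cells equal to 1 are blocked are assumptions made for illustration; the question itself does not mention obstacles.

```python
from collections import deque


def shortest_path(grid):
    """Return a shortest list of (row, col) cells from the upper-left to the
    lower-right corner, or None if the corner cannot be reached.

    Assumption (not part of the interview question): a cell value of 1
    marks a blocked cell; any other value is passable.
    """
    if not grid or not grid[0]:
        return None
    rows, cols = len(grid), len(grid[0])
    start, goal = (0, 0), (rows - 1, cols - 1)
    if grid[0][0] == 1 or grid[rows - 1][cols - 1] == 1:
        return None

    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        # Only the four adjacent positions are allowed, never diagonals.
        for dr, dc in ((1, 0), (0, 1), (-1, 0), (0, -1)):
            nr, nc = r + dr, c + dc
            nxt = (nr, nc)
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != 1 and nxt not in parent):
                parent[nxt] = cell
                queue.append(nxt)
    return None
```

BFS touches each cell at most once, so the running time is proportional to the number of cells. If the matrix has no blocked cells at all, it is worth telling the interviewer that any path made only of right and down moves is already shortest, with rows + columns - 2 steps.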
The second round, the technical screening, consists of medium-level SQL questions in SQLPad, one of which must be solved with window functions.
What the interviewer will assess
- Knowledge of the core aspects of your field
- Coding skills
- Programming efficiency in multiple languages
Know SQL's built-in functions well. Be deliberate about what you say while you write your code: explain each step as you take it, and keep the conversation engaging for your interviewer.
- A customer who has made at least ten movie rentals is a 'happy customer.' Write a query to return the dates on which the following customers became happy customers: customer_id in (1,2,3,4,5,6,7,8,9,10). You can skip a customer who never became a 'happy customer.'
- Write a query to return the shortest movie from each category. The order of your results doesn't matter. If there are ties, return just one of them. Return the following columns: film_id, title, length, category, row_num
- Write a query to return each film's percentage of revenue within its category, for the following films: film_id <= 10. Formula: revenue (film_id x) * 100.0 / revenue of all movies in the same category. The order of your results doesn't matter. Return 3 columns: film_id, category name, and percentage.
- Write a query to return the average customer spend by month. Definition: average customer spend is the total customer spend divided by the number of unique customers for that month. Use EXTRACT(YEAR from ts_field) and EXTRACT(MONTH from ts_field) to get the year and month from a timestamp column. The order of your results doesn't matter.
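To practice the window-function pattern behind the "shortest movie from each category" question, you can run a query locally with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: the single `film` table with a `category` column and the sample rows are hypothetical simplifications, not the schema used in the interview, and the bundled SQLite must be version 3.25 or later to support window functions.

```python
import sqlite3

# Hypothetical, simplified schema: one table that already carries the category.
# The real exercise may spread this across several tables that you must join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE film (
    film_id  INTEGER PRIMARY KEY,
    title    TEXT,
    length   INTEGER,
    category TEXT
);
INSERT INTO film VALUES
    (1, 'Alpha',   90,  'Action'),
    (2, 'Bravo',   75,  'Action'),
    (3, 'Charlie', 120, 'Comedy'),
    (4, 'Delta',   60,  'Comedy'),
    (5, 'Echo',    110, 'Drama'),
    (6, 'Foxtrot', 110, 'Drama');
""")

# ROW_NUMBER() restarts at 1 inside each category and numbers films from
# shortest to longest. Ordering by film_id as well breaks ties, so exactly
# one film per category gets row_num = 1.
SHORTEST_PER_CATEGORY = """
WITH ranked AS (
    SELECT film_id, title, length, category,
           ROW_NUMBER() OVER (
               PARTITION BY category
               ORDER BY length, film_id
           ) AS row_num
    FROM film
)
SELECT film_id, title, length, category, row_num
FROM ranked
WHERE row_num = 1
ORDER BY category;
"""

shortest_films = conn.execute(SHORTEST_PER_CATEGORY).fetchall()
for row in shortest_films:
    print(row)
```

ROW_NUMBER() is used here instead of RANK() because the question asks for just one film when there are ties; RANK() would give tied films the same number and return both. The same partition-and-number idea applies to the 'happy customer' question, where the tenth rental per customer is the row of interest.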
The virtual onsite interview consists of 3-4 rounds, including a behavioral round, a product sense round, and a stats and A/B testing round.
Behavioral/Culture fit
- Why do you want to work at TikTok?
- What makes you want to find a new job?
- How would you describe your ideal team?
- Which app is your favorite?
- Can you compare its recommendation engine with TikTok's?
- What is the best way to show a customer an ad?
- To what extent should advertisers be informed of the value of an advertising campaign?
Stats & A/B testing:
- Let's say you're the data scientist for your company's marketing/advertising division. The marketing executive wants to test multiple new channels, including:
Google search ads
Direct mail campaigns
Given these new marketing channels, how would you design an A/B test to use the marketing budget as efficiently as possible?
- What is a p-value, and what are Type I and Type II errors?
- How do you decide whether to launch a new feature if the p-value is > 0.05?
- Should we launch if we are running multiple tests and the p-value is < 0.05?
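When discussing the p-value and multiple-testing questions above, it can help to have the arithmetic at your fingertips. The sketch below computes a two-sided p-value for a two-proportion z-test and applies a Bonferroni correction. Both are common textbook choices rather than methods prescribed by TikTok, and the function names and sample numbers are hypothetical.

```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for H0: groups A and B share one conversion rate.

    Uses the pooled two-proportion z-test, which assumes samples large
    enough for the normal approximation to hold.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni_significant(p_values, alpha=0.05):
    """Flag which tests remain significant after a Bonferroni correction.

    Each p-value is compared with alpha divided by the number of tests,
    which keeps the chance of any false positive at or below alpha.
    """
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Hypothetical experiment: 100 of 1000 users convert in control,
# 130 of 1000 convert in treatment.
p = two_proportion_p_value(100, 1000, 130, 1000)
print(f"p-value: {p:.4f}")

# Hypothetical p-values from three tests that were run at the same time.
print(bonferroni_significant([0.01, 0.04, 0.03]))
```

With three simultaneous tests, the Bonferroni threshold drops from 0.05 to about 0.0167, so a p-value of 0.04 that looks significant in isolation no longer clears the bar. A Type I error is launching on a false positive like that one; a Type II error is missing a real effect, which becomes more likely as the threshold tightens.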