ML Knowledge
Can you walk me through the process you take to prepare data for machine learning algorithms?
Machine Learning Engineer
Palo Alto Networks
Microsoft
Honeywell
ByteDance
Tableau
Bosch
Answers
Anonymous
5 months ago
The data should be cleaned, preprocessed, and converted into the format required by the specific ML algorithm being used. For example, numeric or categorical data may contain missing values that should be imputed. Values in certain columns may need to be bucketed. Unstructured data has to be converted into the required input format. For text data, we may have to tokenize, lemmatize, and perhaps also create an embedding of the text. The possibilities are many and depend on the specific task at hand.
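To make a few of these steps concrete, here is a minimal sketch in plain Python covering imputation, bucketing, and tokenization. The toy records, column names, cut points, and helper functions are all made up for illustration. A real pipeline would more likely use a library such as pandas or scikit-learn, and lemmatization and embeddings are left out because they need an NLP library.

```python
import re
from bisect import bisect_right
from statistics import median

# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 23, "city": "Pune", "review": "Great battery life!"},
    {"age": None, "city": "Austin", "review": "Screen cracked after a week."},
    {"age": 41, "city": None, "review": "Works as described."},
    {"age": 67, "city": "Austin", "review": None},
]


def impute(records, column, fill_value):
    """Return copies of the records with missing values in `column` filled."""
    return [
        {**r, column: fill_value if r[column] is None else r[column]}
        for r in records
    ]


def bucket(value, edges, labels):
    """Map a number to a label; `edges` holds the lower bound of every bucket after the first."""
    return labels[bisect_right(edges, value)]


def tokenize(text):
    """Lowercase the text and keep runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


# 1. Impute: median for the numeric column, a sentinel for the others.
age_median = median(r["age"] for r in rows if r["age"] is not None)
rows = impute(rows, "age", age_median)
rows = impute(rows, "city", "unknown")
rows = impute(rows, "review", "")

# 2. Bucket ages into coarse ranges (assumed cut points).
AGE_EDGES = [30, 60]
AGE_LABELS = ["under_30", "30_to_59", "60_plus"]

# 3. Tokenize the free-text column and assemble the prepared records.
prepared = [
    {
        "age_bucket": bucket(r["age"], AGE_EDGES, AGE_LABELS),
        "city": r["city"],
        "tokens": tokenize(r["review"]),
    }
    for r in rows
]

for record in prepared:
    print(record)
```

One design note: the median here is computed over all rows for brevity. When the data is split into training and test sets, statistics used for imputation are normally computed on the training split only, so that no information from the test set leaks into training.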