There’s a ton of variance in how data science interviews are designed – sometimes even within the same company.
Different groups from the same company could be looking for completely different things – and that variance could be due to more than just managerial-style differences.
On the one hand, one group’s biggest bottleneck could be more on the data engineering side, where just getting reliable and efficient access to relatively clean data is a substantial challenge at the moment – so pretty much any even rudimentary analysis done on the data is quite valuable.
On the other hand, you could have groups who are already world-class at ingesting and accessing their specific dataset – and the data scientist is more of a pure consumer of the data (with minimal data wrangling). In that case, what passes for ‘state-of-the-art’ analytics is way different for this group than for groups that are struggling to even get clean access to their data.
With that in mind, below are a couple thoughts I’ve had recently about preparing for a data science interview – regardless of the type of data science work the particular group is doing:
Be ready to explain why you’re interested in data science
I’m not (yet) talking about why you’re interested in the particular company – I’m talking about why you even wanted to be a data scientist at all.
I know my opinion might differ here from some others, but to me, this is one of the most critical aspects of getting hired as a data scientist – especially if it’s your first professional role. So much of data science is being passionate, having that deep driving desire to keep poking around in the data, sometimes for no other reason than you’re curious and you just need to know.
If you have this intrinsic motivation, where you’re always looking to learn more and improve your knowledge of the world – I think that makes up for so many other potential weaknesses.
You can teach someone technical skills; heck, the market for educating data scientists will soon be a multi-billion-dollar industry.
However… teaching someone to care about finding new insights? To be motivated to keep learning about the small details of the business domain that are probably 90% likely* to be ultimately irrelevant? I think that’s much harder to teach.
Learn what you can about their business domain
Again, I probably differ here from others giving thoughts on data science interviews. I’m not the only one recommending that you learn about the company’s business; I know other people recommend that too.
Where I differ is in how relatively important I think this point is. There is such a huge variance between industries and datasets about what the specific, particular challenges are for the data science team – and these challenges themselves change all the time.
If you’re looking at a biotech startup using AI to help simulate protein folding, vs a company doing autonomous driving with computer vision, vs a company trying to model and prevent credit card fraud – these companies will have completely different perspectives on data science.
As in, when you ask one director from company X to define data science, you’d possibly get a completely different answer from a comparable director at company Y (and company Z). It’s not that their definitions would necessarily conflict; it’s more that these people could be focusing on completely different aspects of data science.
This relates back a bit to the previous point; different companies are at different stages in even getting their data ready for serious analysis. Whether it’s a governance or security issue (like with financial data) or just enormous data sets and computational bottlenecks (protein folding), the real-world pain points could be completely different.
If you go into an interview with a company and demonstrate at least some understanding of their current struggles, you’re ahead of probably 80% of data science candidates. Put another way: if you can have a reasonably informed discussion about an industry-specific, real-world struggle the group is currently having – that could be quite impressive.
It’s much more common to see a data scientist who essentially thinks their technical skillset is highly generalizable and immediately applicable across different industries. While this is generally true… when you’re doing cutting-edge stuff, that generalizability starts mattering a lot less. The actual time-consuming struggles you run into require a more nuanced, customized approach.
If the company (or anyone associated with their data) has given a public talk or presentation, it would probably be worth your time to check it out. You’ll get a feel pretty quickly for what their industry-specific struggles are, and then you’d be able to much more directly speak to how you could help.
What about technical skills?
I haven’t talked much about that yet. For one, there are a ton of resources out there already discussing the technical aspects of data science interviews. Also, this article is getting pretty long, so I’ll probably write a different article about this later.
Wrapping up
What do you think? Am I right, wrong, way off? Let me know – feel free to email me or connect on LinkedIn.
*Note (on the comment that most of the little details you learn about an industry are probably irrelevant): Some might say, well, if 90% of these details about the business domain are irrelevant, why learn about them at all? Or, why not just focus on the 10% that matter?
Great point. However, I would say that it’s nearly impossible to know beforehand which little details will turn out later to actually matter. Especially when you’re hanging out at the cutting edge, and no one really knows what’s going on. If you’re a student of the domain and just keep picking at it and learning more, over time you’ll probably become the go-to person who just seems to ‘know’ the right questions to ask.