How hard is hiring? With one sheet of paper and maybe a few hours of conversation, a company is supposed to assess how well someone could contribute what will hopefully be thousands of hours of work.
The hiring manager has to assess how well a prospect’s skills match the job requirements, how quickly they could learn new skills, how well they might interact with their future coworkers, how much they would enjoy the job, and how well it would fit into their desired growth path, not to mention how well they might fit as the company’s needs evolve.
One might imagine that in the field of data science, where technical expertise can be evaluated objectively and the interviewers are professionally trained to analyze facts, hiring might be relatively straightforward: less of an art and more of a science. My experience, as both interviewer and interviewee, could not be further from that description. I believe that if data scientists examined their hiring records the way they examine their classification algorithms, most would be embarrassed by their performance.
I’ve failed interviewees I should have passed, passed interviewees I should have failed. On the other side of the table, I believe I’ve also been unreasonably rejected many times, though I say that with imperfect information and imperfect humility.
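To make the analogy concrete, here is a minimal sketch, with entirely hypothetical numbers, of what scoring a hiring record like a classifier might look like. The interview decision plays the role of the prediction, and eventual on-the-job success plays the role of the label (with the caveat that, in practice, the labels for rejected candidates are unknowable, which is part of what makes hiring so hard to evaluate):

```python
# Hypothetical hiring record: one entry per candidate.
# "hired" is the interview decision; "succeeded" is how the candidate
# performed (or would have performed) on the job. Both are invented.
hired     = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
succeeded = [1, 0, 1, 0, 1, 1, 0, 0, 1, 0]

tp = sum(h and s for h, s in zip(hired, succeeded))      # good hires
fp = sum(h and not s for h, s in zip(hired, succeeded))  # bad hires
fn = sum(s and not h for h, s in zip(hired, succeeded))  # missed talent

print(f"precision: {tp / (tp + fp):.2f}")  # of those passed, how many worked out: 0.50
print(f"recall:    {tp / (tp + fn):.2f}")  # of the good ones, how many were passed: 0.40
```

A model with those numbers would be sent back for retraining; an interview process with those numbers usually goes unexamined.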
What sorts of questions are causing these ineffective interviews? The most common are:
- Questions that test specific knowledge
Sure, knowledge is power; how can you not assess how much a candidate knows? Gauging someone's knowledge level so you understand what they would come in with is necessary. Where an interviewer errs is in using knowledge as a proxy for ability.
Data scientists nowadays are expected to cover an enormous breadth of knowledge, ranging from probability theory to infrastructure monitoring to natural language processing to database technologies. In addition, because data science is a fairly new discipline, many practitioners enter from other domains, and many are largely self-taught. A specific question about log-loss or regularization techniques might fall into an interviewee's knowledge gap on that day, particularly if it is more theoretical and doesn't crop up in day-to-day application. Even more insidiously, an interviewer might fault someone for not knowing a fact that they themselves know, without giving the same scrutiny to the converse: whether the interviewee knows facts that the interviewer doesn't.
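For instance, the entire "fact" behind a log-loss question fits in a few lines. As a sketch using the standard binary cross-entropy definition, a candidate who blanked on it in the interview could reconstruct and verify it in minutes at their desk:

```python
import math

def log_loss(y_true, y_prob):
    """Standard binary cross-entropy, averaged over samples."""
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y, p in zip(y_true, y_prob)) / len(y_true)

# The one insight the fact encodes: confident wrong predictions
# are punished far more heavily than hedged ones.
print(round(log_loss([1], [0.9]), 3))  # 0.105
print(round(log_loss([1], [0.1]), 3))  # 2.303
```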
A takeaway: if reading one Wikipedia article would have dramatically altered the outcome of the interview, it was not a well-structured interview.
- Brainteasers with a trick answer
Brainteasers have an infamous association with software engineering interviews, especially at Google. But these questions have been dropped from standard Google interviews (source) because the company’s data showed they were poor indicators of future employee success. Still, coding-related brainteasers continue to get asked, often with the premise that they “show how someone thinks through a problem.”
There is merit in evaluating whether a candidate knows fundamental computer science concepts, such as a stack or a binary search tree. The question devolves into a tortuous riddle when it hinges on a trick, forcing the interviewee to play "Guess what I'm thinking?" These may be fun, but do they really assess how good an employee this person will make? At best, brainteasers are a roundabout way to view someone's cognitive work; at worst, they are infuriating.
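For contrast, a fundamentals question needs no hidden trick. A minimal sketch of the classic balanced-brackets exercise (a hypothetical example, not drawn from any particular interview) shows what a fair stack question can look like, where the candidate's reasoning is visible without any "aha" moment required:

```python
def is_balanced(s: str) -> bool:
    """Return True if every bracket in s closes in the right order:
    push openers onto a stack, pop and match on each closer."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in s:
        if ch in "([{":
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack

assert is_balanced("([{}])")
assert not is_balanced("([)]")
```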
- Measuring effort, not talent, with a take-home assignment
Take-home assignments can be very revealing because they can genuinely emulate a work problem: a real-world task where the candidate can show their solution and their code. However, the trap to avoid is handing out an assignment that takes too long. Those who are good at their job and motivated to keep learning will not have time to do a long assignment just to get an interview. Instead, this type of interview process will attract those who hate their jobs and are willing to put in tremendous effort to leave. Effort is a good signal, but it is gained at the expense of potentially great, but busy, candidates. Make sure the take-home assignment is short and clearly scoped to avoid losing good candidates.
- Over-indexing on communication skills
Some customer-facing roles require top communication skills, and all data science roles require some. For many data scientists, though, much of the job will be spent coding alone, sometimes solving technical problems that no one else ever learns about.
In an interview, the candidate's communication skills are front and center, while their work abilities are not yet clear. On the job, however, the ability to analyze data may matter more than the ability to talk about it. It is therefore tempting, post-interview, to weigh communication skills disproportionately. And especially for junior data scientists, communication skills can be coached on the job far more easily than data science skills.
— —
With all these pitfalls to avoid, what questions should a good interviewer ask?
First, it depends on the context of the job opening. Some companies have an urgent, specific need to fill, and the new hire must contribute very quickly. In these cases, assessing the candidate's current knowledge is more important, and the first pitfall above applies less.
In many other cases, however, companies want to find and nurture great talent over time. Here the starting point is less important than the candidate's capacity for growth. How good is the candidate at learning new subjects? How curious are they? How have they demonstrated resilience when their programs seemed hopelessly buggy?
A good data scientist usually cannot control their company's environment. If there was no need for text modeling at their previous jobs, the candidate will not have professional experience in natural language processing. Instead of focusing on where they lack experience, delve into what they have experienced. Ask the candidate to walk through a model that they built. What were they solving for? Where did the data come from? How many algorithms did they consider? How was the model productionized? Did they test any theories or research findings? Did they pick up any new topics to apply to the project?
Some recruiting departments like to standardize their interviews to a script, often under the guise of equity. But candidates’ experiences are full of inequity, and improvising questions off of their experience is the fairest way to assess their aptitude and work characteristics.
With regard to productionizing, did they just shrug and plead that this was the domain of the data engineers? Did they provide informed opinions on their chosen architecture, or did they reveal a tendency to copy and paste? Did they truly understand why their model worked, or didn't? And what areas of weakness can they identify in themselves that this next role could potentially help them improve?
Finding great data scientists is harder than ever as demand continues to climb and supply is slow to catch up. Hiring based on a laundry list of technologies and buzzwords will lead to frustration. Seeking out those who have the makeup to grow, and allowing them to flourish, will be a win-win for interviewers and interviewees.