I Know Something You Don’t Know — How Not to Interview

Cal Lee
5 min read · Jan 25, 2021

How hard is hiring? With one sheet of paper and maybe a few hours of conversation, a company is supposed to assess how well someone could contribute over what will hopefully be thousands of hours of work. The hiring manager has to assess how well a prospect’s skills match the requirements of the job, how quickly they could learn new skills, how well they might interact with their future coworkers, how much they would enjoy this job and how well it would fit into their desired growth path, not to mention how they might fit as the company’s future needs evolve. One might imagine that in the field of data science, where technical expertise can be evaluated objectively and interviewers are professionally trained to analyze facts, hiring would be relatively straightforward — less of an art and more of a science.

My experience, as both interviewer and interviewee, could not be further from that imagination. I believe that if data scientists scrutinized their hiring records the way they examine their classification algorithms, most would be embarrassed by their performance. I’ve failed interviewees I should have passed and passed interviewees I should have failed. As an interviewee, I believe I’ve also been a false negative many a time, though I say this with imperfect information and imperfect humility.

What sort of interview questions are causing these ineffective interviews? The most common are:

  • Questions that test specific knowledge

Knowledge is power — how can you not assess how much a candidate knows? Assessing someone’s level is useful for understanding how much they might come in with, but an interviewer errs when they use knowledge as a proxy for ability. Data scientists nowadays are expected to cover an enormous breadth of knowledge, ranging from probability theory to infrastructure monitoring, from natural language processing to database technologies. In addition, because data science is a fairly new discipline, many practitioners enter from other domains, and many are largely self-taught. A specific question about log-loss or regularization techniques might fall into an interviewee’s knowledge gap on that day, particularly if it is more theoretical and doesn’t crop up in day-to-day application. Even more insidiously, an interviewer might penalize someone for not knowing a fact that they themselves know, without applying the same scrutiny to the converse: whether the interviewee knows facts that the interviewer doesn’t.

A takeaway: if reading one Wikipedia article would have dramatically altered the interview, it is not a well-structured interview. The sketch below shows just how little such a fact amounts to.
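
To make that concrete, here is log-loss, the kind of fact I mean, written out in a few lines of plain Python (a minimal sketch of my own, not code from any interview):

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clamp so log(0) never occurs
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

print(log_loss([1, 0, 1], [0.9, 0.2, 0.6]))  # ≈ 0.28
```

Whether a candidate can recite this on a given Tuesday says very little about whether they can build, debug, and ship a model.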

  • Brainteasers with a trick answer

Brainteasers have an infamous association with software engineering interviews, especially at Google. But these questions have been dropped from standard Google interviews (source), because their data showed that they were poor indicators of future employee success. Yet coding-related brainteasers still get asked, often with the premise that they “show how someone thinks through a problem.” There is merit in evaluating whether a candidate knows fundamental computer science concepts, such as a stack or a binary search tree; a fundamentals check can be as simple as the sketch below. The question devolves into a tortuous riddle when it hinges on a trick, forcing the interviewee to play “guess what I’m thinking?” These may be fun, but do they really assess how good an employee this person will make? At best brainteasers are a roundabout way to view someone’s cognitive work; at worst they are infuriating.
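
For contrast, a trick-free fundamentals question can be as simple as checking balanced brackets with a stack. This is my own illustrative example, not a question taken from any particular company’s loop:

```python
def is_balanced(s):
    """Return True if every bracket in s is properly matched and nested."""
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in s:
        if ch in '([{':
            stack.append(ch)  # open bracket: remember it
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False  # close bracket with no matching open
    return not stack  # leftover opens mean unbalanced

print(is_balanced('([{}])'))  # True
print(is_balanced('([)]'))    # False
```

The candidate either knows what a stack is for or they don’t; no mind-reading required.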

  • Measuring effort, not talent, with a take-home assignment

Take-home assignments can be very revealing, because they can truly emulate a work problem: here is a real-world problem where a candidate can show their solution and their code. The trap to avoid, however, is handing out an assignment that takes too long. Those who are good at their job and motivated to keep learning will not have time to do a long assignment just to get an interview. Instead, this type of interview process will attract those who hate their jobs and are willing to put in tremendous effort to leave. Effort is a good signal, but it is gained at the expense of potentially great, but busy, candidates. Make sure the take-home assignment is clean and clearly scoped to avoid losing good candidates.

  • Over-indexing on communication skills

Some customer-facing roles require top communication skills, and all data science roles require some. For many data scientists, though, much of the job will be spent coding alone, sometimes solving technical problems that no one else ever learns about. In an interview, the candidate’s communication skills are front and center while their work abilities are not yet visible; on the job, the ability to analyze data may matter more than the ability to speak. It is thus natural, post-interview, to weight communication skills disproportionately. Especially for junior data scientists, communication skills can be coached on the job far more easily than data science talent can.

· · ·

With all these pitfalls to avoid, what questions should a good interviewer ask?

First, it depends on the context of the opening. Some companies have an urgent, specific need to fill, and the ideal new hire must be able to contribute very quickly. In these cases, assessing the candidate’s current state of knowledge is more important, and the first pitfall above is less applicable. In many other cases, however, companies want to find and nourish great talent over time. Here the starting point matters less than one’s growth qualities. How good is the candidate at learning new subjects? How curious are they? How have they demonstrated resilience when their programs seemed hopelessly buggy?

A good data scientist may not be able to control the environment of their company. If there is no need to do text modeling, the candidate might not get professional experience in natural language processing. Instead of focusing on where they lack experience, delve into what they have experienced. Ask the candidate to go through a model that they built. What were they solving for? Where did the data come from? How many algorithms did they consider? How was the model productionized? Did they test any theories or research any new topics to apply to this project?

Some recruiting departments like to standardize their interviews to a script, often under the guise of equity. But candidates’ experiences are full of inequity, and improvising questions off of their experience is the fairest way to assess their aptitude and work characteristics. With regard to productionizing, did they just shrug and say they handed the model off to the data engineers? Did they provide strong opinions on their chosen architecture, or did they reveal a tendency to copy and paste? Did they truly understand why their model worked — or didn’t? And what areas of weakness can they identify in themselves that this next role could potentially help improve?

Finding great data scientists is harder than ever as demand continues to climb and supply is slow to catch up. Hiring based on a laundry list of technologies and buzzwords will lead to frustration; finding people who have the talent and ability to grow, and allowing them to flourish, will be a win-win.


Cal Lee

Fairly good writer for a data scientist, fairly multilingual for an American, fairly empathetic for a Patriots fan