Some traits of the best data science communicators

Something I’ve been thinking about lately is how many data scientists seem to be underemphasizing the importance of non-technical skills.

Specifically, if you can’t effectively convey the value of your work to a non-technical audience…it’s going to be hard to be successful.  Your client or user needs to understand why your analysis is valuable, and oftentimes the work might not clearly speak for itself.

Below are some thoughts about a few common traits I’ve seen from good data science communicators.


Begin discussions and presentations by clearly stating the context

Although pretty much every data scientist would agree with the above statement, many seem to have difficulty with consistently putting it into practice.

I can certainly understand why: in the time leading up to a discussion, you were probably deep in the weeds on a highly technical concept within the analysis.  Big picture thinking wasn’t exactly a priority at that point – the details were.

However, one of the quickest ways to lose your audience is to start by immediately throwing technical concepts at them without first providing context – especially if they are non-technical.

For example, a data scientist might just immediately jump into the deep end: “Here’s a correlation for variable X vs variable Y, within context Z and assumptions W and V.”

This sounds great to someone who was just previously immersed in this problem space – but there’s a good chance your audience was not, and they’ll have no idea what you’re talking about or why they should care.

Taking thirty seconds to clearly establish why this discussion is happening is a relatively easy step, but in practice many data scientists don’t do it.


Openly address limitations and key assumptions

The more complex your analysis is, the more likely it is you had to make multiple critical assumptions.  Depending on your client’s preferences, some of these assumptions have probably been made without explicit approval from the client or user.

This itself isn’t a problem, as in practice it would be way too much overhead to immediately check every ongoing assumption with the client, and you’d probably start annoying them.

However, when you are presenting your analysis (even if it’s just an iteration and not the final product), you now need to clearly state what these assumptions and limitations are.

This is the point of the discussion (or presentation) where things can get a bit tense.  If you’re presenting highly complex analysis, it’s likely that you’ve made at least one key assumption that the client either disagrees with or was not aware of.

And even if you don’t think the assumption is that impactful (“it’s just an edge case”), there’s a good chance that the client, who probably knows way more about the business domain than you, would feel otherwise.  And maybe strongly.

If you’ve previously worked towards building up the relationship with your client, this is where it could start to really pay off.  If you have a good relationship with your client, this is a smooth conversation where limitations are openly discussed, as among respected peers.  If not – this could be a tough conversation.


Actively encourage questions and discussion

This is a point that newer data scientists might not appreciate as much: you want your client to be actively engaged.

Put another way, if you throw some sweet analysis at them, and they don’t have much to say…90%+ of the time, that’s a really bad sign for you.

You want them to be asking questions, asking why certain decisions were made, commenting on a somewhat obscure aspect of one of your graphs, telling you that they have a slightly different interpretation.  If they’re not really saying anything, it’s probably because they’re about to essentially throw your analysis into the trash.

If the discussion or presentation is in front of fewer than ten people, and you’re the only one talking for more than five minutes straight…something’s probably wrong.

It’s way too easy to lose your audience – and without little checkpoints to make sure they’re still engaged, you run a strong risk of them not understanding your analysis at all.  And it can be hard for data scientists to remember that it’s their responsibility to keep the audience engaged.


Wrapping up

It’s easy for a data scientist (especially me) to just kind of passively assume their analytics work will stand on its own – and that the perceived value won’t be very dependent on the quality of how it’s discussed or presented.

In my opinion, you have to keep actively reminding yourself of how that probably isn’t true, and continue to make the effort to build skill in better selling your work.


The views expressed on this site are my own and do not represent the views of any current or former employer or client. 
