Data science and the art of listening to your clients

Note: This article isn’t just about data science consulting – I use the terms ‘client’ and ‘end-users’ interchangeably, as in either case someone is entrusting you to find insights in their data.

One trend that I’ve started noticing recently is that you can roughly divide up data scientists into two categories: those who really, really care about what their client wants – and those who don’t.

As you can probably guess by my tone, I think it’s pretty important to listen to your clients – and apparently, way more important than many data scientists seem to think it is.

 

Data scientists are getting caught up in their own hype

I think we’re at peak hype for data science.  Not that the hype can’t increase from here; more that it’s probably never been higher.

We’re seeing all-time highs in big promises, rosy predictions, fancy buzzwords, substanceless (but fancy) graphs, self-congratulatory conferences, frothy startup & VC activity in the data science space…

Is this hype legit?  I don’t know.  What I do know is that it seems many more people are starting to think that data science will magically solve all their problems.  And if you happen to be a data scientist – well then you’re God’s gift to business!

The problem is, when you think you’re awesome, you stop listening to feedback and constructive criticism.  After all, you’re already pretty much perfect – why should you make a change?

Of course, no data scientist would say they think like this.  However, if you look at how some of them act, sometimes blatantly ignoring feedback and suggestions from clients…it seems they don’t really care what the client has to say.

 

Data scientists are starting to listen less to their users

It’s not that data scientists don’t care what the user has to say.  It’s that they seem increasingly likely to brush off a user’s concern with “yeah, well they just don’t understand.”

With all the cool new analytics tools and methods available to data scientists today, it’s easy for them (and especially me) to get caught up in all the flash – and lose sight of what actually matters to the client.

What users are not asking for: fancier graphs, flashier ML algos, bigger buzzwords.

What users are asking for: help finding insights in their data.

 

You’re probably not spending enough time listening to your users

This will sneak up on you – as it has for me, multiple times.  Most of us would like to think that we do a good job listening to users.

It can show up in multiple subtle ways, at least initially: brushing off a user’s concern about a data issue as “just an edge case,” leaving errors in a graph because “they can still understand it without me fixing it,” not taking the time to sanity check output because “they’ll still get the big picture.”

And then, one day, you get taken off the project – quietly.  You’ve been replaced by somebody who does actually care about the “little things” – because when you add them up, they start to be really big things.

Some of us get so caught up in our own hype, we’re forgetting why users were asking us to do this work in the first place.

 

A note for newer data scientists

Especially for newer data scientists, it can be hard to appreciate these points.  It’s not that they would say “don’t listen to your users.”  Instead, it’s more like they’ve never really been in a situation where they might have to distinguish between their own personal preferences – vs the client’s.

Many newer data scientists have simply never had a client.  Even if they’ve done a ton of analytics on their own, the whole time they were essentially their own client.

Once you have to start keeping a client happy, it’s a completely different ballgame.  They don’t get to see all the sweet analytics work you’re doing in the background to try to find them insights – all they see is the end product.  And if your end product isn’t that great, it doesn’t really matter how much effort you put into it.

 

Conclusion

Many people, including myself, would like to believe that all the little “imperfections” in our analysis aren’t a big deal – and often, they aren’t.

However, users look at this stuff in a completely different way than we do – and eventually, the extra mental willpower needed for a user to power through these cumulative “little” issues…can just be too much for them.  And you get taken off the project.

 

The views expressed on this site are my own and do not represent the views of any current or former employer or client. 

Artificial intelligence, food waste, and computer vision

There’s a lot of hype in AI right now.

It’s hard (especially for me) to distinguish what’s a real-world, legitimate application, vs what’s just pure fantasy – at least right now.

One area I’ve been looking into lately is an interesting application of computer vision within artificial intelligence: using cameras and machine learning to monitor discarded food, with food waste reduction being the end-goal.

For context, about one-fifth of the trash in our landfills is wasted food, and up to one-third of the world’s food supply ends up in the trash.

There are a couple of startups doing interesting work here (Winnow and Zero Foodwaste), and I wanted to talk through some stuff they’re working on.

 

Actually taking action

First, what I like about these startups is they are actually doing something about a societal problem – or at least trying.

In the age of peak slacktivism (10k likes = 1 cancer cure), it’s way more common for us to just hope our retweet is sufficient for addressing the world’s problems.  Raising awareness!

The problem is…if everyone’s ‘raising awareness,’ and no one’s actually doing the work – nothing gets done.  But I guess at least everyone gets to feel better about themselves.

When you cofound a startup, you’re taking huge personal risk – mentally, socially, and professionally.  These guys are putting themselves out there and actually going for it.

 

What do they actually do?

Winnow and Zero Foodwaste use computer vision and essentially weight scales (‘smart meters’) to identify different food types put into waste bins.

This image recognition and weight data is processed by their AI and then reported back to the kitchen manager – to give them a much more specific idea of exactly what type of food (and how much) they’re wasting.

 

A more realistic philosophy for being sustainable

A lot of startups within the impact and sustainability category don’t really make fundamental economic sense – many of them are floating along on the vague notion that “we’ll help some bigger profitable company hit their ESG goals.”

With these startups, it’s not like that.  For example, with Winnow, they are focused on putting actual dollar-amount food waste numbers in front of hotel kitchen managers.

When you start showing business owners the black-and-white dollars they’re losing due to food waste, it becomes less about “am I feeling responsible today” and more about “I want to stop throwing away money.”

 

A concrete, real-world application example that people can easily relate to

Kind of contrary to my ‘AI is all hype’ sentiment from earlier, I do think that it’s pretty important to get people excited about the cool things AI can do – especially when it’s for something that will benefit society in general, as opposed to just a few investors/CEOs.

When people hear stuff like ‘autonomous trucks’ or ‘AI chatbots,’ I think the usual initial reaction is not “hey that’s pretty cool.”  I think it’s more “uh, wouldn’t that be dangerous?” (trucks) or “that’s kind of creepy” (chatbots).

However – using AI computer vision to monitor and reduce food waste – I think that’s an application that everyone can get behind.

It’s an example of AI that many people can more unambiguously appreciate, since we all know what it’s like to throw away perfectly good food – it’s not just some abstract concept.

 

Conclusion

As a former cofounder of a socially-conscious/impact startup, I appreciate the effort of these startups.  It’s really, really hard to do an impact startup – and even more difficult to propose something that actually could be economically viable.

I have no idea if these guys will be successful, but I applaud the effort (slacktivism!).

 


What most students don’t get about the data science industry

With how fast the data science industry is changing, it’s hard (especially for me) to keep up with all that’s going on.

For students, it’s probably even more difficult, as they likely wouldn’t have a base of knowledge to draw upon when sifting through what’s hype vs legit in data science (spoiler: there’s a ton of data science/AI hype right now…but that’s a topic for another day).

With that in mind, below are some thoughts about the industry that some data science students might not yet appreciate:

 

It’ll probably take at least three months for you to meaningfully contribute at your first job

And in many cases, it could be way longer.

It’s hard for a newer data scientist to grasp just how critical business domain knowledge is – often as a prerequisite before you can reasonably expect to start digging through the data and producing meaningful insights.

For example, assuming you just got your first data science job in finance: if you don’t understand the basics of how the stock market works – it’ll be almost impossible for you to contribute any meaningful insights, until you have some baseline of knowledge.

Yes you’ll be able to throw out some cool graphs and maybe some spicy buzzwords – but it won’t actually be helpful to the business.

I understand the excitement, particularly with advances in deep learning, that maybe it’s becoming less important for you to actually understand the data you’re looking at – before deploying the coolest new ML algorithm and just letting the algo start pumping out the insights.

However…I just don’t think we’re there yet.

And especially if this is your first data science job, it’s pretty unlikely that you’re skilled enough in deep learning to produce something that will rapidly and legitimately benefit the business, without first being proficient in understanding the domain.

This onboarding process of learning takes time, and I think it’s something that most companies are getting more comfortable with openly acknowledging.

For example, in discussing with my little industry group of data science buddies, the general consensus is that, on average, you’re looking at six months before a new data science hire is meaningfully contributing.

 

You don’t need to already know the business domain to get hired

It’s hard enough to find quality data scientists to hire these days – and when you throw in the restriction that they also have to already be fairly knowledgeable in your domain, you could start running into major recruiting problems.

Which brings me to my next point…

 

Humility and curiosity are probably the most underrated traits in data science

It’s not really about what you currently know about the business domain – it’s more about how motivated you are to learn more about the domain.

In other words, if you’re inherently interested in learning more about the particular industry/business area, you’re probably in pretty good shape.  Lots of would-be data scientists…just don’t really care.

For newer data scientists, this willingness and ability to learn is huge.  Technical skills are great, but in my opinion it’s way harder for a manager to find ways to get an unmotivated person to really learn about a new domain (beyond just surface-level knowledge) – vs a manager helping someone get trained in a new technical skill.

If you’re inherently curious and realistic about what you know (and don’t know), that could set you miles apart from many wannabe data scientists.

Not just in data science, but in fintech and analytics in general: right now, there are a ton of ‘experts’ walking around with high concentrations of confidence and buzzwords, but lacking any actual understanding of what they’re talking about.  Even worse, many of these people apparently have no interest in learning.

Put another way: it’s less about where you are currently, and more about what’s your trajectory, what’s your progress, are you actively taking steps to learn and improve.

If you’re more interested in data science as a ‘job,’ and less interested in it as a way ‘to work on cool stuff and learn about cool things,’ then I think you’re missing out on the things that make data science most rewarding as a career.

 

Wrapping up

If you’re a student interested in data science as a career, it might be helpful to stay mindful about just how fast the industry is changing – and how learning (vs your current knowledge or skillset) is probably going to be the more strategically important long-term trait.

Put another way: to stay relevant and not become obsolete as a career data scientist, it’s probably a good idea to stay mindful of how important it is to keep adapting – even when you think you’re irreplaceable.

 


Philosophy on meetings: a data scientist’s perspective

One thing I’m currently struggling with is balancing two very different types of workloads.

On the one hand, there are items that are more maintenance, budgeting, planning – basically, the not-fun parts of a data scientist’s job, but stuff that kind of has to be done (especially when you start managing people).

On the flip-side, there are the tasks that other people would more readily associate with data science: digging into the data, communicating with users, making cool visualizations, finding insights – basically being innovative.

It’s not really a time-balance issue.  Although currently my time is pretty constrained, that’s not the core issue.

For me, it’s an energy issue – specifically, the huge amount of energy needed to keep context switching back and forth between one type of thinking (maintenance) and another (innovation).  It’s just a completely different way of looking at the world.

Accompanying those less-fun maintenance tasks is every data scientist’s ‘favorite’ way to spend time: meetings.  They are a necessary evil, so below are some philosophical thoughts:

 

The larger your team grows, the more time you’ll have to spend in meetings

This…is something I’m struggling to accept, but it’s seeming more and more like it’s my new reality.

The fact is, as you add more and more people to your team, working on interconnected items, you will need more coordination to keep everyone on track.

For me, my biggest concern is to make sure that I’m not the bottleneck.  For that reason alone, status meetings are helpful – especially when I’ve been so overwhelmed with other items that I might have lost track that someone is waiting on me for something.

However…something else to consider, especially when switching between ‘maintenance’ and ‘innovative’ modes of thinking:

Intermittent meetings can cause huge context-switching costs

There’s a major difference in mentality between exploring data to find insights that no one else has ever seen – vs the mentality needed to explain to another team why their maintenance estimates were just way too optimistic, and now we need to discuss escalating the issue to upper management.  And how we’re going to handle the ‘messaging.’

All while playing nice, keeping everyone happy, and not seeming like you’re out here just making excuses and being a Debbie Downer.

Some people are really good at that maintenance perspective, while some are better at the innovative perspective – but when you have to actively straddle between both (as many of us do), just a couple intermittent meetings could absolutely kill your productivity for the rest of the day.  We just can’t afford the energy sink.

 

It can be tempting to start scheduling meetings at the first sign of ambiguity

Someone in my little industry group was recently telling me how they were getting flooded with half-hour meeting requests involving 4+ people – when in reality, these things could probably have been resolved in three-minute conversations involving only two people.

Why is this happening?  One theory: especially for people who maybe come from the more ‘maintenance’ background, they just don’t recognize the cost of having an additional meeting – because, dealing with maintenance and meetings is pretty much all they do, all day.

To them, scheduling another meeting is no big deal.  For others (like data scientists), it is a huge deal – but we have a hard time conveying that message while still playing nice.

 

Especially for data scientists, some meeting types are way more draining than others

Not all meetings are bad – in fact, I’ve been in multiple meetings recently that were a net positive, in that I had more energy after the meeting than before.  Some traits and commonalities I noticed:

Uplifting: presenting my analysis, having working sessions to discuss a visualization, discussing next steps with end-users, discussing strategy.

Draining: having conflict about deliverable dates, explaining why an initial estimated timeline was too optimistic, coordinating work between multiple groups, gently trying to explain to a confident person why they are mistaken.

 

Conclusion

Unless you’re somehow doing data science while living under a rock, you’re going to have to spend substantial time and energy in meetings.  At least for me, I’m currently working on having a better relationship with meetings – but it’s of course a work in progress 😀

 


Why is career advice becoming less useful?

One thing that people love to do is give career advice.  It feels great; when someone asks you for advice, they’re essentially giving you a strong affirmation of, and compliment to, nearly your entire worldview.

Something discussed within my little group of industry buddies is not just us sharing our own career-related thoughts with each other – it’s also whether, as a concept, career advice is actually helpful.  Some thoughts below:

 

Most career advice is highly autobiographical

Unfortunately, lost within all the self-congratulatory glow of giving career advice is that advice is largely autobiographical.

The person giving advice is essentially saying what advice would have been most useful for themselves years ago.  While also giving subtle backhanded compliments to themselves about their previous life choices.

That’s great and all – except, everyone’s individual situation is highly complex and contextual, where even slight differences in someone’s immediate circumstances could completely change whether a piece of advice is good – or garbage.

 

Some fundamental career-related changes in the past 25 years

And not everyone seems to appreciate that variance – or even really acknowledge the possibility.  For example, say a 50 year old is advising a 25 year old.

Twenty-five years ago, when that advisor was maybe 25 years old, consider for example: the gig economy did not really exist, employer loyalty was still a thing, and student loan burdens didn’t really exist.

Regarding advice for what worked 25 years ago vs what would work now…any single one of those factors could be a complete game changer.

There are other major changes, but just considering these factors alone – the contextual career advice would already be way different.  Unfortunately, sometimes it seems that not everyone has recognized this reality – or thinks that we’re still living in 1995.

Gig economy: Depending on who you ask, the rise of the gig economy is either a great thing or the death knell of the middle class.  Regardless, given how many people take part in it now vs a generation ago, it’s a huge factor.

Staying at a job for an entire career: This used to be way more common.  ‘Blame’ it on whoever you want, whether it’s employees for not sticking enough to one company or companies taking existing employees and gradually squeezing them dry – but the fact remains, people are changing jobs way more often than they used to.

College: Back in the day, most entry-level jobs didn’t require a bachelor’s degree, and it was extremely rare to see an undergrad degree cost the same as a small house (if not more).  The result is way higher student debt than at any other time in our country’s history, which has massive implications for career decisions (among other things).

 

…and things are still changing, except more quickly than before

With the rise (and acceleration) of advanced analytics and artificial intelligence, things are changing really, really fast.  And very few people, including myself, are successfully grasping just how fundamental some of these changes and implications actually are.

As a side note, I’m not very political, but the only semi-mainstream politician who seems to fundamentally understand this is Andrew Yang – at least right now.

I don’t know enough about his policies to say whether I support him, but I do agree with his outlook regarding how AI is causing massive changes to the employment landscape – and with probably much more disruption to come (aka industrial revolution).

We’re kind of at an unprecedented time in history, where job types are being created (and effectively destroyed) at a faster rate than we’ve pretty much ever seen before.

Data science: For example, my field of data science is going through some huge changes; specifically I can’t keep track of all the related startups being funded or acquired, many of which are ultimately trying to make the role of data scientist a relic of the past.

Of course, from a startup/VC PR perspective, that’s not how they’ll frame it though – “Innovation, progress, everyone wins!!!”

 

Conclusion

As usual, this isn’t a complete list, and generally advice is still pretty useful – if for no other reason than to just get another perspective; something else to think about.

However, I do wonder whether people are mindful enough about some of the fundamental limitations of career advice – and whether people recognize just how much some things have changed (and are still changing).

 


 

How to get your entire data science team to quit within one year

There’s a ton of hype in the data science industry right now, and the general consensus is that it’s currently quite difficult to hire quality data scientists.

Given this tight hiring environment, something that has been confusing my group of industry buddies is how some managers are still apparently looking for ways to alienate nearly their entire data science team.

With that in mind, below are some notes we discussed about the best ways to ensure that your entire data science team quits within one year.  Thankfully, I’m not personally dealing with these issues at my job…but some of my friends aren’t so fortunate.

 

Focus on ‘tightening up the process’ with more status meetings, organization

It can be hard for some data science managers – especially those who have maybe never actually done data science themselves – to fully appreciate how disruptive intermittent meetings can be.

I certainly understand the desire to keep track of people’s work, ‘keeping everybody on the same page,’ and whatever other Agile/Scrum-like phrasing you want to use.  However…the problem with this thinking is that you’re essentially assuming that data science follows a generally known and linear process.

I do think that Scrum boards can be useful when trying to do cutting edge data science; after all you do need at least some rough way of tracking progress.

However…when managers get religious about this and demand tickets for every item, and meticulous grouping/maintaining of all the tickets, in spite of how unknown and dynamic the future process is…that’s a great way to get a data scientist to quit.

At the end of the day, that type of ‘make the creative thought process linear’ perspective certainly has its place – it’s just a completely different way of thinking from how most data scientists think.  And that mental cost of context-switching is huge.

Put another way: if you’re a data science manager, are you worshipping the process – or are you more interested in the result?

 

Promoting people who specialize in buzzwords

If you (as a manager) have ever done this before, you almost certainly didn’t think of it this way.

For example, for a hypothetical “Josh”, your thought process might have gone something like:

“Hey Josh over here is well-spoken, carries himself well, has great leadership qualities, and always seems to have a cool tidbit to talk about from the latest analytics conference.  And everybody seems to like him.”

There’s nothing wrong with that – this ‘Josh’ seems like a great guy, and would probably be a valuable manager for many different team types.

 

However…the way this might look to a data scientist:

“Josh seems alright…but has he ever actually delivered something?  He seems to be throwing around lots of impressive words, and speaks quite confidently – but, some of the things he says, half the time I’m not actually sure he has any idea what he’s talking about.”

If someone like ‘Josh’ gets promoted over a data scientist who actually produces – especially with little to no explanation from upper management of why that promotion decision was made – that could make some people quite upset.

 

Hire non-technical management consultants to give ‘feedback’ on technical matters

I have nothing against consultants – I used to be a consultant, and I know there are some truly world-class technical consultants out there.

However, when companies bring in consultants, something my industry group has noticed is that upper management sometimes does a terrible job of communicating with employees about what exactly is going on.

As in, ‘hey there’s going to be some consultants talking to you’ …and that’s the extent of the communication.

One of my buddies was relaying a horror story of how this high-energy, relatively fresh MBA consultant (with apparently little technical skill and zero data science experience) came in and essentially started grilling my buddy’s data science team about why they weren’t moving faster.

Long story short, that’s a demoralizing and devaluing experience.  This consultant effectively had no idea what they were talking about when it came to data science – yet they had no issue going off on the team about how they needed more ‘change agents’ or a better ‘greenfield vs. blue ocean strategic approach.’

To be fair, I don’t remember the exact details of what my buddy was saying – but he was not happy.

 

Conclusion

This isn’t a comprehensive list; these items are just the most recent hot takes that my industry buddies had about some great ways to get your data science team to quit.

Granted, thankfully most of this behavior doesn’t seem that common (at least in our tiny sample set) – but when it does happen, it can have quick ramifications on your data scientist retention efforts.

 

 


 

Building credibility as a new data scientist

For many people (including me), it can be tricky knowing exactly how to build credibility in data science land – especially if you’re a newer data scientist.

This topic has a lot of depth; what’s mentioned below is only a subset of potentially useful considerations.

For context, something my little group of data science industry buddies generally agree on is that it’s pretty easy to find advice on the internet about the technical aspects of data science – but harder to find thoughts (hopefully useful) about the non-technical aspects of the industry.  So here’s my little attempt…

 

False confidence is strategically a really weak way to start

This is a killer, but apparently you wouldn’t know it talking to probably 80% of MBA analytics programs.  Maybe it’s an American thing, maybe we’re at peak data science hype, but it does seem that we are reaching record numbers of data science ‘experts’ – many of whom have never actually worked in industry, let alone actually accomplished anything.

The kind of trickle-down effect here is, even if you have no idea what you’re talking about, they don’t have to know that.  Fake it till you make it bro – chest out, buzzwords ready, just spin up a cluster and you’re all set!  Your graphs will blow them away – even if they’re essentially meaningless!

The problem is, many of us have real deliverables to meet, and the last thing we need is an inexperienced data scientist coming in and apparently already knowing everything.  You can’t learn if you already have all the answers – after all, if you already know everything, what else is there to possibly learn?

Strategically, it’s way easier to be open with your team about your legit strengths and weaknesses – chances are, if you got hired, they will value you for your strengths and want to help you develop your weaker areas.  No one is perfect, but pretending you are is just not that useful in data science.

 

Most ‘dumb’ questions are actually not that dumb 

One thing we’ve noticed is that, the closer your company is philosophically to an early-stage startup (vs a government job), the more ‘dumb’ questions are appreciated.

When you work for a team that is hanging out at the cutting edge and actually responsible for producing something valuable (vs maintaining bureaucracy) – no one has all the answers.

When you ask a dumb question, chances are the audience you’re asking will be appreciative of you asking the question.  Especially when we’re new to an industry or domain, the end-user or boss absolutely knows that we are not a day-zero expert – and to implicitly pretend otherwise is a strategic career killer.

Many managers, end-users, and clients have been previously burned by highly-confident (and polished) data scientists who were too confident to ask questions – who then completely botched the analysis because of a dumb little thing that could have been easily addressed if they had just asked.

Additionally, there’s a good chance that the audience would assume you have very little initial knowledge of the domain, so by you not asking dumb questions – they now have to decide whether (a) you’re already an expert (unlikely), (b) you’re too scared to ask a question (normal), or (c) you never even considered that your domain knowledge might be lacking (scary).

 

Be open about your present limitations – but have a plan for getting up to speed

By just straight up admitting that you don’t know something yet, you’re already ahead of probably 90% of new data scientists – who, for maybe the reasons above, just didn’t request the help that they actually need.

I’ve discussed this with my little group, and we all pretty much agree – especially if you’re new to the team, your manager wants you to ask questions and tell them your present limitations, tell them what you want to learn.  They can’t help you if they don’t know what you need.

However, as mentioned above, you additionally should then have a somewhat clear plan for improving.  The plan doesn’t have to be very precise; the bigger factor comes down to whether you have the inherent motivation and discipline to acquire this new skill or knowledge, in a realistic timeframe.

This topic of learning on the job and acquiring new data science skills is a much larger topic to cover later – but essentially your ability to learn and adapt is way more important than whatever your present knowledge or skillset is.

 

The views expressed on this site are my own and do not represent the views of any current or former employer or client. 

How to think like a data scientist

I’ve been doing data science for a few years now, and recently moved into a kind of unofficial ‘team lead’ role.  This means recruiting, interviewing, and all that fun stuff.

In the informal discussions I have with my data science industry buddies, this topic has been coming up more frequently (in not so many words): how exactly do data scientists think?

As more of us move into positions where we help interview data scientists, this is becoming a more relevant concern.

With that in mind, here are some notes about what the group was thinking:

 

The mental cost of context-switching is really, really high

The mindset of trying to find new, valuable insights in the data is completely different from the mindset needed for standard administrative tasks.

Another way to look at it: if you’re a manager and want to kill a data scientist’s productivity for a day, schedule administrative/planning/check-in meetings – and spread them throughout the day.

In talking to my little industry group, we’ve found it can be a struggle to effectively communicate this concept to managers.

Specifically, the mindset of making sure your paperwork/bureaucracy is just right to satisfy the unofficial ‘auditors’ is completely different from the mindset of being curious and finding a new correlation that no one else has ever even thought of.

To the manager, it’s not ‘administrative burden’ or ‘bureaucracy’ – it’s just ‘stuff we have to get done,’ which is both true and completely understandable.  It’s just really mentally expensive to keep switching from one form of thinking to another, each a completely different way of looking at the world.

 

It’s more about managing your energy than managing your time

This is a subtly tough one to fully appreciate, especially for newer data scientists.  Long story short: especially as a data scientist, where you’ll be doing stuff that requires inspiration and creative thinking, your productivity really doesn’t scale well with time.

In other words, you could have a really productive half-hour (kind of a mini-breakthrough), and what you accomplish in that half-hour could far exceed everything else you produce that day – or over an even longer period.

This is also a difficult concept for some managers to fully appreciate – especially if their background isn’t data science.  To some of them, data scientists are doing nothing more than what a standard data analyst would do: just implementing tightly defined, precise business requirements.

In that case, time spent would indeed be much more highly correlated with productivity – but as pretty much anyone who has done data science (for more than a year) can attest, that’s not at all what a data scientist actually does.

 

There can be a big difference between having fancy analysis vs actually finding useful insights

This also ties into a concept I have strong opinions about; namely that there’s a ton of excessive hype (and buzzwords) in the industry right now.

My little group sees this trickling down into some real-world internal team meetings, where some data scientists spend most of their time implementing the coolest new ML algo, the fanciest graphing library, or using the most hip buzzwords – and then are quite confused when their end-user or client essentially says their work is garbage.

The problem in these situations is that there’s more focus on the ‘flashy’ part of data science – trying to be impressive – vs actually sitting down and taking the time to understand what the user wants, the struggles they’re having, what their rough hypotheses might be, what they’re looking for, what the data actually means.

Some data scientists think, ‘hey, I’ll just throw the coolest stuff I have at this problem’…and then wonder what happened when they get quietly taken off the project.

 

Long-term, it’s better to not get excited about early, premature ‘discoveries’

We’ve probably all run into this – digging through some data, running some analysis, stepping through the output, and then…wait, that’s odd.  And then, ok this could be useful, it’s probably not a data issue – hmm, this could be huge.

The problem is…when you’re hanging out on the cutting edge, when you’re looking at data sets (or combinations of data sets) that not many other people have ever really looked at, it’s still highly likely that what you’re seeing is not a true insight.  And that’s expected; it’s something all of us would probably logically acknowledge.

However, if you’re like me and you get excited when you think you found something cool, you run into a long-term energy problem: the emotional trip down hurts more than the trip up feels good.

And if you’re looking for insights all the time, while getting excited and disappointed in rapid succession – that’s a tried-and-true recipe for burnout.

 

My group of buddies from various parts of the data science industry

One thing I’ve come to appreciate more clearly in the past year or so is just how fast the data science/artificial intelligence industry is changing.

Of course, before that I still would have said that ‘being adaptable is key’ or whatever – but I completely didn’t grasp just how much change we’re apparently going through.  For example, VC funding for AI/analytics-related startups is at an all-time high, and for the same sectors there are more and more accelerators and incubators popping up all over the place.

Long story short, there’s a ton of new data science/AI startups entering the game, and depending on who you ask, they’re either looking to make the data scientist’s job easier or replace the data scientist altogether.  My thought is the motivation is generally somewhere in between – but in any case, there are about to be huge changes to the role of the data scientist.

I have no idea how to keep up with all this, or how to tell the legit trends from whatever is just pure hype.  So, against my strongly introverted preferences, I started making it a point to have semi-regular conversations with people about what’s going on in their little corner of the industry.

These people – some old college buddies, some found through LinkedIn – currently work in all sorts of different domains (mainly fintech, biotech, social media).  Additionally, I have a few data science/AI recruiter buddies as well, and someone in VC.  The roles include both individual contributors and managers (although sometimes not so clearly defined).

I could be completely wrong, but I do feel that this group of industry buddies gives me a pretty decent feel of what’s going on – at least relative to me not talking to these people.  Of course, it’s a challenge for me to keep finding the motivation to have these semi-regular conversations – but so far, I think it’s been worth it.

Anyways, that’s all – so if you hear me referring to a group of people in industry that I frequently talk to, this is what I’m talking about.

 

Career Advice for Students from 2017 Data Science Leaders

As a student interested in data science, it’s not always straightforward to know exactly what you should focus on to get that first data scientist position. Also, since data science is one of the faster-changing career paths, it’s not always clear when certain pieces of career advice have become a bit dated, or whether they’re even applicable in 2017.

I reached out to several data science leaders of various backgrounds to get their thoughts on this; below are their responses to this question:¹

What advice would you give to students that are interested in becoming data scientists?

Simon Petit, Cofounder at dataroots:

“To my opinion, students willing to become a data scientist or to work in this specific area should be naturally curious in life, eager to learn new things and not scared of thinking out of the box.”

“Secondly, they must develop a strong background in applied statistics and mathematics combined with programming skills to be at ease in the different techniques of analytics later on.”

“Third, they need to go beyond their expertise and be able to understand and adapt to any business they work for. Last element, communication and social skills are very important when talking to clients (business owners) and explaining the models and their benefits.”

 

Dan Valente, Head of Data Science at Knotch:

“Be curious, be skeptical, never stop asking questions (even when you think you have the answer!), learn as much math as you can, and get very comfortable programming.”

 

Note: Dr. Saigal is giving advice that is specific to high school students.¹

Dr. Sanjay Saigal, Executive Director, Master of Science in Business Analytics at UC Davis:

“I tend to find that high school students don’t suffer from a lack of technical education as much as a lack of curiosity. That is to say, being creatures of culture (like all the rest of us) they don’t very often seek to cultivate the scientific temperament.”

“Were I advising high schoolers, I’d recommend that they look for opportunities to investigate truth – whether be it using methods of analytics or chemistry or any other science. Of course, learning statistics and computing helps discover truth too!”

 

Emily Glassberg Sands, Director of Data Science at Coursera:

“The technical skills are just one piece of the puzzle – necessary to be good but not sufficient to be great. Push beyond the technical. Think deeply about the product and business sides of the challenges you’re trying to solve.”

“Find a company where you care deeply about the end goal – the product fascinates you, the social mission speaks to you, whatever. You deserve to wake up every day excited about both the how and the why.”

 

Anahita Hassanzadeh, Data Science Manager at The Climate Corporation:

“Get your hands dirty with open-source data challenge questions such as the ones on Kaggle. This will help you gauge your strengths and weaknesses and also will make your resume stronger.”

 

Bill Vorhies, Editorial Director at DataScienceCentral:

“If you’re interested in becoming a data scientist you should look specifically for a college that has a data science curriculum, not just computer science.”

“Think also about taking business courses or getting specific business experience in the industry you’d be most interested in since data science is about solving business problems and creating business value, not about math or computers.”

“Mastery of predictive analytics will get you 8 out of 10 data science jobs but if you want to work at the cutting edge, prepare for a Masters that focuses on deep learning using neuromorphic or quantum techniques. Both those will be coming strongly on line over the next four years and will be in great demand.”

 

In Closing

Thanks to all our contributors for sharing their thoughts!

 

 

  1. Note: The contributors quoted for this article were asked slightly different variants of this question (e.g., advice targeted to high school vs college students).  After going through the responses, I thought it’d make more sense to combine them all in one article.  Any errors and omissions are my own.