What most students don’t get about the data science industry

With how fast the data science industry is changing, it’s hard (especially for me) to keep up with all that’s going on.

For students, it’s probably even more difficult, as they likely wouldn’t have a base of knowledge to draw upon when sifting through what’s hype vs legit in data science (spoiler: there’s a ton of data science/AI hype right now…but that’s a topic for another day).

With that in mind, below are some thoughts about the industry that some data science students might not yet appreciate:

 

It’ll probably take at least three months for you to meaningfully contribute at your first job

And in many cases, it could be way longer.

It’s hard for a newer data scientist to grasp just how critical business domain knowledge is – often as a prerequisite before you can reasonably expect to start digging through the data and producing meaningful insights.

For example, assuming you just got your first data science job in finance: if you don’t understand the basics of how the stock market works – it’ll be almost impossible for you to contribute any meaningful insights, until you have some baseline of knowledge.

Yes, you’ll be able to throw out some cool graphs and maybe some spicy buzzwords – but it won’t actually be helpful to the business.

I understand the excitement, particularly with advances in deep learning, that maybe it’s becoming less important for you to actually understand the data you’re looking at – before deploying the coolest new ML algorithm and just letting the algo start pumping out the insights.

However…I just don’t think we’re there yet.

And especially if this is your first data science job, it’s pretty unlikely that you’re skilled enough in deep learning to produce something that will rapidly and legitimately benefit the business, without first being proficient in understanding the domain.

This onboarding process of learning takes time, and I think it’s something that most companies are getting more comfortable with openly acknowledging.

For example, in discussing with my little industry group of data science buddies, the general consensus is that you’re looking at an average of six months before a new data science hire is meaningfully contributing.

 

You don’t need to already know the business domain to get hired

It’s hard enough to find quality data scientists to hire these days – and when you throw in the restriction that they also have to already be fairly knowledgeable in your domain, you could start running into major recruiting problems.

Which brings me to my next point…

 

Humility and curiosity are probably the most underrated traits in data science

It’s not really about what you currently know about the business domain – it’s more about how motivated you are to learn more about the domain.

In other words, if you’re inherently interested in learning more about the given particular industry/business area, you’re probably in pretty good shape.  Lots of would-be data scientists…just don’t really care.

For newer data scientists, this willingness and ability to learn is huge.  Technical skills are great, but in my opinion it’s way harder for a manager to find ways to get an unmotivated person to really learn about a new domain (beyond just surface-level knowledge) – vs a manager helping someone get trained in a new technical skill.

If you’re inherently curious and realistic about what you know (and don’t know), that could set you miles apart from many wannabe data scientists.

Not just in data science, but in fintech and analytics in general: right now, there are a ton of ‘experts’ walking around with high concentrations of confidence and buzzwords, but without actually understanding what they’re talking about.  Even worse, many of these people apparently have no interest in learning.

Put another way: it’s less about where you are currently, and more about what’s your trajectory, what’s your progress, are you actively taking steps to learn and improve.

If you’re more interested in data science as a ‘job,’ and less interested in it as a way ‘to work on cool stuff and learn about cool things,’ then I think you’re missing out on the things that make data science most rewarding as a career.

 

Wrapping up

If you’re a student interested in data science as a career, it might be helpful to stay mindful about just how fast the industry is changing – and how learning (vs your current knowledge or skillset) is probably going to be the more strategically important long-term trait.

Put another way: to stay relevant and not become obsolete as a career data scientist, it’s probably a good idea to stay mindful of how important it is to keep adapting – even when you think you’re irreplaceable.

 

The views expressed on this site are my own and do not represent the views of any current or former employer or client. 

Philosophy on meetings: a data scientist’s perspective

One thing I’m currently struggling with is balancing two very different types of workloads.

On the one hand, there are items that are more maintenance, budgeting, planning – basically, the not-fun parts of a data scientist’s job, but stuff that kind of has to be done (especially when you start managing people).

On the flip-side, there are the tasks that other people would more readily associate with data science: digging into the data, communicating with users, making cool visualizations, finding insights – basically being innovative.

It’s not really a time-balance issue.  Although currently my time is pretty constrained, that’s not the core issue.

For me, it’s an energy issue – specifically, the huge amount of energy needed to keep context switching back and forth between one type of thinking (maintenance) vs another (innovation).  It’s just a completely different way of looking at the world.

Accompanying those less-fun maintenance tasks is every data scientist’s ‘favorite’ way to spend time: meetings.  They are a necessary evil, so below are some philosophical thoughts:

 

The larger your team grows, the more time you’ll have to spend in meetings

This…is something I’m struggling to accept, but it’s seeming more and more like it’s my new reality.

The fact is, as you add more and more people to your team, working on interconnected items, you will need more coordination to keep everyone on track.

For me, my biggest concern is to make sure that I’m not the bottleneck.  For that reason alone, status meetings are helpful – especially when I’ve been so overwhelmed with other items that I might have lost track that someone is waiting on me for something.

However…something else to consider, especially when switching between ‘maintenance’ and ‘innovative’ modes of thinking:

Intermittent meetings can cause huge context-switching costs

There’s a major difference in mentality between exploring data to find insights that no one else has ever seen – vs the mentality needed to explain to another team why their maintenance estimates were just way too optimistic, and now we need to discuss escalating the issue to upper management.  And how we’re going to handle the ‘messaging.’

All while playing nice, keeping everyone happy, and not seeming like you’re out here just making excuses and being a Debbie Downer.

Some people are really good at that maintenance perspective, while some are better at the innovative perspective – but when you have to actively straddle between both (as many of us do), just a couple intermittent meetings could absolutely kill your productivity for the rest of the day.  We just can’t afford the energy sink.

 

It can be tempting to start scheduling meetings at the first sign of ambiguity

Someone in my little industry group was recently telling me how they were getting flooded with half-hour meeting requests involving 4+ people – when in reality, these things could probably have been resolved in three-minute conversations involving only two people.

Why is this happening?  One theory: especially for people who maybe come from the more ‘maintenance’ background, they just don’t recognize the cost of having an additional meeting – because dealing with maintenance and meetings is pretty much all they do, all day.

To them, scheduling another meeting is no big deal.  For others (like data scientists), it is a huge deal – but we have a hard time conveying that message while still playing nice.

 

Especially for data scientists, some meeting types are way more draining than others

Not all meetings are bad – in fact, I’ve been in multiple meetings recently that were a net positive, in that I had more energy after the meeting than before.  Some traits and commonalities I noticed:

Uplifting: presenting my analysis, having working sessions to discuss a visualization, discussing next steps with end-users, discussing strategy.

Draining: having conflict about deliverable dates, explaining why an initial estimated timeline was too optimistic, coordinating work between multiple groups, gently trying to explain to a confident person why they are mistaken.

 

Conclusion

Unless you’re somehow doing data science while living under a rock, you’re going to have to spend substantial time and energy in meetings.  At least for me, I’m currently working on having a better relationship with meetings – but it’s of course a work in progress 😀

 


Why is career advice becoming less useful?

One thing that people love to do is give career advice.  It feels great; when someone asks you for advice, they’re essentially giving you a strong affirmation of, and compliment on, nearly your entire worldview.

Something discussed within my little group of industry buddies is not just us sharing our own career-related thoughts with each other – it’s also discussing, as a concept, is career advice actually helpful?  Some thoughts below:

 

Most career advice is highly autobiographical

Unfortunately, lost within the self-congratulatory glow of giving career advice is the fact that advice is largely autobiographical.

The person giving advice is essentially saying what advice would have been most useful for themselves years ago.  While also giving subtle backhanded compliments to themselves about their previous life choices.

That’s great and all – except, everyone’s individual situation is highly complex and contextual, where even slight differences in someone’s immediate circumstances could completely change whether a piece of advice is good – or garbage.

 

Some fundamental career-related changes in the past 25 years

And not everyone seems to appreciate that variance – or even really acknowledge the possibility.  For example, say a 50 year old is advising a 25 year old.

Twenty-five years ago, when that advisor was maybe 25 years old, consider for example: the gig economy did not really exist, employer loyalty was still a thing, and student loan burdens didn’t really exist.

Regarding advice for what worked 25 years ago vs what would work now…any single one of those factors could be a complete game changer.

There are other major changes, but just considering these factors alone – the contextual career advice would already be way different.  Unfortunately, sometimes it seems that not everyone has recognized this reality – or thinks that we’re still living in 1995.

Gig economy: Depending on who you ask, the rise of the gig economy is either a great thing or the death knell of the middle class.  Regardless, given how many people take part in it now vs a generation ago, it’s a huge factor.

Staying at a job for an entire career: This used to be way more common.  ‘Blame’ it on whoever you want, whether it’s employees for not sticking enough to one company or companies taking existing employees and gradually squeezing them dry – but the fact remains, people are changing jobs way more often than they used to.

College: Back in the day, most entry-level jobs didn’t require a bachelor’s degree, and it was extremely rare to see an undergrad degree cost the same as a small house (if not more).  The result being, way higher student debt than any other time in our country’s history, which has massive implications on career decisions (among other things).

 

…and things are still changing, except more quickly than before

With the rise (and acceleration) of advanced analytics and artificial intelligence, things are changing really, really fast.  And very few people, including myself, are successfully grasping just how fundamental some of these changes and implications actually are.

As a side note, I’m not very political, but the only semi-mainstream politician who seems to fundamentally understand this is Andrew Yang – at least right now.

I don’t know enough about his policies to say whether I support him, but I do agree with his outlook regarding how AI is causing massive changes to the employment landscape – and with probably much more disruption to come (aka industrial revolution).

We’re kind of at an unprecedented time in history, where job types are being created (and effectively destroyed) at a faster rate than we’ve pretty much ever seen before.

Data science: For example, my field of data science is going through some huge changes; specifically I can’t keep track of all the related startups being funded or acquired, many of which are ultimately trying to make the role of data scientist a relic of the past.

Of course, from a startup/VC PR perspective, that’s not how they’ll frame it though – “Innovation, progress, everyone wins!!!”

 

Conclusion: As usual, this isn’t a complete list, and generally advice is still pretty useful – if for no other reason than to just get another perspective; something else to think about.

However, I do wonder whether people are mindful enough about some of the fundamental limitations of career advice – and whether people recognize just how much some things have changed (and are still changing).

 


 

How to get your entire data science team to quit within one year

There’s a ton of hype in the data science industry right now, and the general consensus is that it’s currently quite difficult to hire quality data scientists.

Given this tight hiring environment, something that has been confusing my group of industry buddies is how some managers are still apparently looking for ways to alienate nearly their entire data science team.

With that in mind, below are some notes we discussed about the best ways to ensure that your entire data science team quits within one year.  Thankfully, I’m not personally dealing with these issues at my job…but some of my friends aren’t so fortunate.

 

Focus on ‘tightening up the process’ with more status meetings, organization

It can be hard for some data science managers – especially those who have maybe never actually done data science themselves – to fully appreciate how disruptive intermittent meetings can be.

I certainly understand the desire to keep track of people’s work, ‘keeping everybody on the same page,’ and whatever other Agile/Scrum-like phrasing you want to use.  However…the problem with this thinking is that you’re essentially assuming that data science follows a generally known and linear process.

I do think that Scrum boards can be useful when trying to do cutting edge data science; after all you do need at least some rough way of tracking progress.

However…when managers get religious about this and demand tickets for every item, and meticulous grouping/maintaining of all the tickets, in spite of how unknown and dynamic the future process is…that’s a great way to get a data scientist to quit.

At the end of the day, that type of ‘make the creative thought process linear’ perspective certainly has its place – it’s just a completely different way of thinking from how most data scientists think.  And that mental cost of context-switching is huge.

Put another way: if you’re a data science manager, are you worshipping the process – or are you more interested in the result?

 

Promoting people who specialize in buzzwords

If you (as a manager) have ever done this before, you almost certainly didn’t think of it this way.

For example, for a hypothetical “Josh”, your thought process might have gone something like:

“Hey Josh over here is well-spoken, carries himself well, has great leadership qualities, and always seems to have a cool tidbit to talk about from the latest analytics conference.  And everybody seems to like him.”

There’s nothing wrong with that – this ‘Josh’ seems like a great guy, and would probably be a valuable manager for many different team types.

 

However…the way this might look to a data scientist:

“Josh seems alright…but has he ever actually delivered something?  He seems to be throwing around lots of impressive words, and speaks quite confidently – but, some of the things he says, half the time I’m not actually sure he has any idea what he’s talking about.”

If someone like ‘Josh’ gets promoted over a data scientist who actually produces – especially with little to no explanation from upper management of why that promotion decision was made – that could make some people quite upset.

 

Hire non-technical management consultants to give ‘feedback’ on technical matters

I have nothing against consultants – I used to be a consultant, and I know there are some truly world-class technical consultants out there.

However, when companies bring in consultants, something my industry group has noticed is that upper management sometimes does a terrible job of communicating with employees about what exactly is going on.

As in, ‘hey there’s going to be some consultants talking to you’ …and that’s the extent of the communication.

One of my buddies was relaying a horror story of how this high-energy, relatively fresh MBA consultant (with apparently little technical skill and zero data science experience) came in and essentially started grilling my buddy’s data science team about why they weren’t moving faster.

Long story short, that’s a demoralizing and devaluing experience.  This consultant effectively had no idea what they were talking about when it came to data science – yet they had no issue going off on the team about how they needed more ‘change agents’ or a better ‘greenfield vs. blue ocean strategic approach.’

To be fair, I don’t remember the exact details of what my buddy was saying – but he was not happy.

 

Conclusion

This isn’t a comprehensive list; these items are just the most recent hot takes that my industry buddies had about some great ways to get your data science team to quit.

Granted, thankfully most of this behavior doesn’t seem that common (at least in our tiny sample set) – but when it does happen, it can have quick ramifications on your data scientist retention efforts.

 

 


 

Building credibility as a new data scientist

For many people (including me), it can be tricky knowing exactly how to build credibility in data science land – especially if you’re a newer data scientist.

This topic has a lot of depth; what’s mentioned below is only a subset of potentially useful considerations.

For context, something my little group of data science industry buddies generally agree on is that it’s pretty easy to find advice on the internet about the technical aspects of data science – but harder to find thoughts (hopefully useful) about the non-technical aspects of the industry.  So here’s my little attempt…

 

False confidence is strategically a really weak way to start

This is a killer, but apparently you wouldn’t know it talking to probably 80% of MBA analytics programs.  Maybe it’s an American thing, maybe we’re at peak data science hype, but it does seem that we are reaching record numbers of data science ‘experts’ – many of whom have never actually worked in industry, let alone actually accomplished anything.

The kind of trickle-down effect here is, even if you have no idea what you’re talking about, they don’t have to know that.  Fake it till you make it bro – chest out, buzzwords ready, just spin up a cluster and you’re all set!  Your graphs will blow them away – even if they’re essentially meaningless!

The problem is, many of us have real deliverables to meet, and the last thing we need is an inexperienced data scientist coming in and apparently already knowing everything.  You can’t learn if you already have all the answers – after all, if you already know everything, what else is there to possibly learn?

Strategically, it’s way easier to be open with your team about your legit strengths and weaknesses – chances are, if you got hired, they will value you for your strengths and want to help you develop your weaker areas.  No one is perfect, and pretending you are just isn’t that useful in data science.

 

Most ‘dumb’ questions are actually not that dumb 

One thing we’ve noticed is that, the closer your company is philosophically to an early-stage startup (vs a government job), the more ‘dumb’ questions are appreciated.

When you work for a team that is hanging out at the cutting edge and actually responsible for producing something valuable (vs maintaining bureaucracy) – no one has all the answers.

When you ask a dumb question, chances are the audience you’re asking will be appreciative of you asking the question.  Especially when we’re new to an industry or domain, the end-user or boss absolutely knows that we are not a day-zero expert – and to implicitly pretend otherwise is a strategic career killer.

Many managers, end-users, and clients have been previously burned by highly-confident (and polished) data scientists who were too confident to ask questions – who then completely botched the analysis because of a dumb little thing that could have been easily addressed if they had just asked.

Additionally, there’s a good chance that the audience would assume you have very little initial knowledge of the domain, so by you not asking dumb questions – they now have to decide whether (a) you’re already an expert (unlikely), (b) you’re too scared to ask a question (normal), or (c) you never even considered that your domain knowledge might be lacking (scary).

 

Be open about your present limitations – but have a plan for getting up to speed

By just straight up admitting that you don’t know something yet, you’re already ahead of probably 90% of new data scientists – who, for maybe the reasons above, just didn’t request the help that they actually need.

I’ve discussed this with my little group, and we all pretty much agree – especially if you’re new to the team, your manager wants you to ask questions and tell them your present limitations, tell them what you want to learn.  They can’t help you if they don’t know what you need.

However, as mentioned above, you additionally should then have a somewhat clear plan for improving.  The plan doesn’t have to be very precise; the bigger factor comes down to whether you have the inherent motivation and discipline to acquire this new skill or knowledge, in a realistic timeframe.

This topic of learning on the job and acquiring new data science skills is a much larger topic to cover later – but essentially your ability to learn and adapt is way more important than whatever your present knowledge or skillset is.

 


How to think like a data scientist

I’ve been doing data science for a few years now, and recently moved into a kind of unofficial ‘team lead’ role.  This means recruiting, interviewing, and all that fun stuff.

In the informal discussions I have with my data science industry buddies, this topic has been coming up more frequently (in not so many words): how exactly do data scientists think?

As more of us move into positions where we help interview data scientists, this is becoming a more relevant concern.

With that in mind, here are some notes about what the group was thinking:

 

The mental cost of context-switching is really, really high

The mindset of trying to find new, valuable insights in the data is completely different than the mindset needed for doing standard administrative burden tasks.

Another way to look at it: if you’re a manager and want to kill a data scientist’s productivity for a day, schedule administrative/planning/check-in meetings – and spread them throughout the day.

In talking to my little industry group, we’ve found it can be a struggle to effectively communicate this concept to managers.

Specifically, the mindset of making sure your paperwork/bureaucracy is just right to satisfy the unofficial ‘auditors’ is a completely different mindset from being curious and finding a new correlation that no one else has ever even thought of.

To the manager, it’s not ‘administrative burden’ or ‘bureaucracy’ – it’s just ‘stuff we have to get done,’ which is both true and completely understandable.  It’s just – really mentally expensive to keep having to switch the mental perspective from one form of thinking to another.  A completely different way of looking at the world.

 

It’s more about managing your energy than managing your time

This is a subtly tough one to fully appreciate, especially for newer data scientists.  Long story short: especially as a data scientist, where you’ll be doing stuff that requires inspiration and creative thinking, your productivity really doesn’t scale well with time.

In other words, you could have a really productive half-hour (kind of a mini-breakthrough), and what you accomplish in that half-hour would way exceed everything else you produce that day – or even more.

This is also a difficult concept for some managers to fully appreciate – especially if their background isn’t data science.  To some of them, data scientists are doing nothing more than what a standard data analyst would do, and just having to implement tightly defined, precise business requirements.

In that case, time spent would indeed be much more highly correlated to productivity – but as pretty much anyone who has done data science (for more than a year) can attest, that’s not at all what a data scientist actually does.

 

There can be a big difference between having fancy analysis vs actually finding useful insights

This also ties into a concept I have strong opinions about; namely that there’s a ton of excessive hype (and buzzwords) in the industry right now.

My little group sees this trickling down into some real-world internal team meetings, where some data scientists spend most of their time implementing the coolest new ML algo, the fanciest graphing library, or using the most hip buzzwords – and then are quite confused when their end-user or client essentially says their work is garbage.

The problem in these situations is that there’s more focus on the ‘flashy’ part of data science – trying to be impressive – vs actually sitting down and taking the time to understand what the user wants, the struggles they’re having, what their rough hypotheses might be, what they’re looking for, what the data actually means.

Some data scientists think that, hey I’ll just throw the coolest stuff I have at this problem…and then wonder what happened when they get quietly taken off the project.

 

Long-term, it’s better to not get excited about early, premature ‘discoveries’

We’ve probably all run into this – digging through some data, running some analysis, stepping through the output, and then…wait, that’s odd.  And then, ok this could be useful, it’s probably not a data issue – hmm, this could be huge.

The problem is…when you’re hanging out on the cutting edge, when you’re looking at data sets (or combinations of data sets) that not many other people have ever really looked at, it’s still highly likely that what you’re seeing is not a true insight.  And that’s expected; it’s something all of us would probably logically acknowledge.

However, if you’re like me and you get excited when you think you found something cool, you run into a long-term energy problem: the emotional trip down is way less fun than the trip up (in magnitude).

And if you’re looking for insights all the time, while getting excited and disappointed in rapid succession – that’s a tried-and-true recipe for burnout.

 
