Aspiring analysts want to know how data analysis is done. Beyond the technical skills required to be an analyst, they assume that, surely, learning the discipline of data analysis includes some sort of repeatable process: a step-by-step method to apply to any analytical problem. Such a method doesn't exist. How seasoned analysts do their work is a mixture of technical and interpersonal skills, sprinkled with intuition built up through years of just analysing data. This kind of intuition is hard to put down on paper, which for a student is an unsatisfying thing to hear.
I'm a professional data analyst and educator, and an amateur jazz enthusiast. Hear me out: as it turns out, there are similarities between learning how to analyse data and learning how to play music, especially jazz.
Improvisation
Jazz has multiple components that make it distinct from other musical genres. Chief among them is improvisation, where musicians can put their own creative stamp on a piece. Because of this, when you hear a performance of a jazz "standard", a popular song that's part of the jazz canon, you never hear the same performance twice. The individual notes aren't as important as the general structure. Jazz musicians aren't given an exact score like classical musicians. In a classical piece, every note is mapped out exactly as the composer intended, and musicians tend to follow this as closely as possible. In jazz, you often just get what's called a lead sheet, which contains a song's main melody, its structure, and its chords. However, even this loose structure is flexible, and jazz musicians take many liberties to make a piece their own. This has parallels in data analysis.
Two analysts attempting the same problem will arrive at different results. The problem, the "analytical lead sheet" if you will, is the same, but because their methods, assumptions, and choices will be different, they end up playing a different tune. There is enough uncertainty in data analysis that this doesn't mean either one is wrong. In fact, the two analysts would likely learn something by comparing notes afterwards.
Just like jazz musicians don't look back and check whether every note they played was perfect, aspiring analysts shouldn't get bogged down with whether every individual piece of their analysis was the perfect choice.
Chances are there were other, equally acceptable, options at multiple points in the analysis.
How does one get to a point where they can confidently improvise? Music and data education are both historically very method-driven. They focus on drilling the fundamentals before moving on to practical applications. However, because jazz is so different from other musical genres, the best way to learn it is also different, and the same should be true of data analysis.
Songs before scales
Rigorous music theory lessons will no doubt make you a better jazz musician. Knowing how notes, scales, and chords interrelate builds a foundation you can use to improve. However, knowing theory doesn't, in and of itself, make you better at the practice. You need two things: to apply your knowledge by playing lots of jazz and to learn from the greats. The first is obviously accomplished by sitting down at your instrument and practising. There are no shortcuts here, but one important point is that you want to practise playing songs. Anyone who's ever had lessons in an instrument has had to practise scales and arpeggios. Music lessons typically include these technical exercises to hone your skill at the instrument. These are clearly important, but they should not override the purpose of the lessons, which is to make music. If you focus on songs first, learning scales makes more sense since you have the right context for them.
Just like songs should be the first-class citizen of jazz practice, projects should be the first-class citizen of learning data analysis. Once you have the basics under your belt, that is, once you know how to load, clean, explore, and visualise a dataset, you shouldn't go and learn 10 more tools or algorithms for the sake of it.
More technical training will not give you the best return on investment on your learning time.
It is much better to start solving problems as soon as possible, even if you don't have all the necessary skills yet. Attempting an actual data analysis problem will force you to learn exactly what you need, and you will have the right context for those technical components, which you can always learn when you need them.
There is an important caveat here. You shouldn't learn songs or complete analytical projects in isolation. The way to improve is to do the work, get feedback, and see how the experts do it.
Learning from experts
The way to get closer to the greatest jazz artists is to listen to and analyse how they play. Rather than guessing which notes you could have played differently to make your solo sound a bit better, it's more effective to listen to a recording of Miles Davis, John Coltrane, or Duke Ellington, and hear what they did. One piece of advice given to aspiring jazz musicians is to try to transcribe your favourite solos. The point of attempting to figure out every note that was played isn't to replicate the solo exactly but to see a master at work and pick up some new tricks. Why couldn't we do the same for data analysis?
When teaching programming, a great tool for the educator is the code-along. Students get a lot of value from an expert instructor talking through every line of code not just to explain the syntax, but to talk about the wider context. On the surface level, the instructor explains the exact command used to remove missing values, but on a deeper level, they reveal the justification for doing so in the first place. This sort of narrated, granular deep dive is incredibly powerful at filling in the gaps in students' knowledge.
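As a rough illustration of what that narration might look like, here is a minimal sketch in plain Python. The survey data, the column names, and the `drop_missing` helper are all made up for this example; in a real code-along the instructor would most likely use whatever data library the course teaches. The comments play the role of the instructor's commentary: first the command, then the reasoning behind it.

```python
from statistics import mean

# Hypothetical survey responses; None marks a missing answer.
responses = [
    {"respondent": "a", "age": 34, "hours_practised": 5.0},
    {"respondent": "b", "age": None, "hours_practised": 2.5},
    {"respondent": "c", "age": 41, "hours_practised": None},
    {"respondent": "d", "age": 29, "hours_practised": 7.5},
]


def drop_missing(rows, columns):
    """Keep only the rows that have a value in every listed column."""
    return [row for row in rows if all(row[col] is not None for col in columns)]


# Surface level: the command itself.
# The question is about practice time, so only that column has to be present.
# Respondent "b" has no age recorded but still tells us something useful.
complete = drop_missing(responses, ["hours_practised"])

# Deeper level: the justification.
# Dropping rows is a choice, not a default. Report how many rows were lost
# so that anyone reading the result can judge whether the choice matters.
dropped = len(responses) - len(complete)
print(f"Dropped {dropped} of {len(responses)} rows")
print(f"Mean hours practised: {mean(r['hours_practised'] for r in complete):.2f}")
```

A different analyst might have filled in the missing value instead of dropping the row, and that would have been a defensible choice too. Hearing why the instructor picked one option over the other is where most of the learning happens.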
Learning from experts by actually seeing them work is underutilised in tech education.
To put my money where my mouth is, I have in the past streamed my attempts at an analytical problem on Twitch, and plan to start doing more of this again. I find that narrating my own thought process helps me learn a lot, too.
Jazz up your data analysis learning
If you want to get better at analysing data, think like a jazz musician. Don't spend long hours reading dry textbooks on yet another analytical tool or algorithm. Play songs instead. Improvise to put your own stamp on a problem, observe the experts, and, most importantly, just make some sweet, sweet data jazz.
About David
I'm a freelance data science consultant and educator with an MSc in Data Science and a background in software and web development. My previous roles have spanned data science, software development, team management, and software architecture.