A couple of years ago I was perusing a forum on LinkedIn when I noticed a post that outlined a number of statistical methodologies in a nice tree format (also known as a dendrogram). I thought that the post was a pretty nice summary of statistical methodologies, so I printed it out and tucked it away in my drawer at work for future reference.
The chart does a nice job of summarizing some of the main statistical techniques used in both descriptive and predictive analysis. I often find that people have a hard time summarizing these concepts. I remember recently posting a question on LinkedIn asking specifically which statistical methodologies a hiring manager should expect candidates responding to a “data scientist” requisition to be knowledgeable in. The responses linked to blog posts that talked a bit about how a data scientist should be familiar with “advanced statistics”. But most posts don’t go any deeper than that, and they never explore the specific methodologies that one would be expected, at a minimum, to know.
While filing away some old papers this weekend I found the printout tucked between some administrative papers and I gave it a good look again. I like this chart because it serves as a sort of training guide, in some ways, giving any budding statistician a road map for skills development. I thought it would be fun to convert this printout into a D3 visualization, and perhaps to add a little interactivity, allowing users to pull definitions from Wikipedia by clicking on each node in the visualization.
I’ve managed to embed the D3 component below, but instead of trying to figure out the nuances and hacks required to get jQuery UI to operate nicely in WordPress, I decided to link instead to the final product here:
I’ve hooked the visualization up to the Wikipedia pages for each topic using a jQuery UI dialog window and the MediaWiki API. I’ve pasted an image below, but take a look first-hand at the link above! I hope to explore some of these concepts more deeply in this blog as time goes on. For now, you can take a look at my post on decision trees, which discusses some interesting topics such as information gain and entropy.
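For anyone curious about the Wikipedia hookup, here is a minimal sketch of how a node's definition might be pulled from the MediaWiki API. It is not the exact code behind the visualization: the helper names (`buildExtractUrl`, `parseExtract`, `fetchDefinition`) are made up for illustration, and I'm assuming the English Wikipedia endpoint and the `extracts` property, which returns the plain-text introduction of a page.

```javascript
// Illustrative sketch only: helper names are hypothetical, and the endpoint
// (English Wikipedia) is an assumption, not necessarily what the live page uses.
const WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php";

// Build a query URL asking for the plain-text intro of a single page.
function buildExtractUrl(title) {
  const params = new URLSearchParams({
    action: "query",
    prop: "extracts",   // page extracts (TextExtracts extension)
    exintro: "1",       // only the section before the first heading
    explaintext: "1",   // plain text instead of HTML
    redirects: "1",     // follow redirects such as alternate spellings
    format: "json",
    origin: "*",        // allow anonymous cross-origin requests from the browser
    titles: title,
  });
  return `${WIKIPEDIA_API}?${params.toString()}`;
}

// Pull the first page's title and extract out of the JSON response.
// Returns null when the page is missing or has no extract.
function parseExtract(response) {
  const pages = (response && response.query && response.query.pages) || {};
  const first = Object.values(pages)[0];
  if (!first || typeof first.extract !== "string" || first.extract === "") {
    return null;
  }
  return { title: first.title, extract: first.extract };
}

// Fetch and parse in one step; meant to be called from a node's click handler.
async function fetchDefinition(title) {
  const res = await fetch(buildExtractUrl(title));
  if (!res.ok) {
    throw new Error(`Wikipedia request failed: ${res.status}`);
  }
  return parseExtract(await res.json());
}
```

In a setup like mine, a node's click handler would call something like `fetchDefinition` with the node's label, then drop the returned text into the dialog element before opening it. Keeping the URL building and the response parsing in separate functions also makes it easy to check each piece without hitting the network.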