R
In a previous post I showed how to use data science tools to find hidden features in unstructured text and analyzed how the complexity of the lyrics of Beatles songs changed over time. In this post I do a short follow-up and compare the complete works of The Beatles with those of two other musicians, using the same methodology and metrics. Comparing The Beatles with other musicians may help put the original numbers into perspective.
Stargazer is a neat tool for presenting model estimates. It accepts a fairly large number of object types and creates nice-looking, ready-to-publish tables of their main parameters. In many cases, however, the default settings do not report the numbers we actually need, and customizing the output is not that straightforward. This is part one of a two-part series on how to customize stargazer.
The first time I used stargazer I already had a problem with the model outputs the package created: in cross-sectional data the observations often differ in size, which leads to heteroskedastic model residuals, and in that case the conventional standard errors are unreliable for judging variable significance.
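To make the point about heteroskedastic residuals concrete, here is a small numerical sketch. It is written in Python with NumPy purely as an illustration and is not the stargazer workflow from the post. The simulated data, the function name `ols_standard_errors`, and the choice of the HC1 (White) correction are all assumptions made for this example.

```python
import numpy as np


def ols_standard_errors(x, y):
    """Fit y ~ 1 + x by OLS; return coefficients, classical SEs and HC1 robust SEs."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])  # design matrix with an intercept column
    k = X.shape[1]
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Classical SEs assume one common error variance for every observation.
    sigma2 = resid @ resid / (n - k)
    se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

    # HC1 sandwich estimator: each observation contributes its own squared residual.
    meat = X.T @ (X * resid[:, None] ** 2)
    cov_hc1 = n / (n - k) * XtX_inv @ meat @ XtX_inv
    se_robust = np.sqrt(np.diag(cov_hc1))
    return beta, se_classical, se_robust


# Hypothetical data: the noise grows with x, so the residuals are heteroskedastic.
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x + rng.normal(0, 1, n) * x**2

beta, se_classical, se_robust = ols_standard_errors(x, y)
print("coefficients:", beta)
print("classical SE:", se_classical)
print("robust SE:   ", se_robust)
```

In this setup the robust standard error of the slope comes out larger than the classical one, so a table that reports only the classical figures would overstate how precisely the slope is estimated.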
The Beatles became a hit through their sometimes simple but always powerful music, but the band has never been famous for its poetry. The group’s lyrics, however, did change during the band’s short existence, and we can use text analysis to track these changes. This post is about measuring the change in the complexity of the group’s lyrics, from the Please Please Me album to Abbey Road, and it shows how we can use basic data science tools to find really fancy patterns in unstructured text data.