AskHistorians leads the way in online engagement between historians and the public. More than 1.5 million users frequent the community, putting over 4,000 questions a month to its Flaired Members and other knowledgeable users. This project aimed to discover trends in the questions the public asks: which topics are most popular and how they change over time. I looked at the 2018 calendar year and only at threads that were not deleted.
Sorting the topics raised challenges and exposed the ways we, as historians, categorize topics. The most obvious example is that Georgia (the state) and Georgia (the country) cannot be differentiated using Named Entity Recognition; they are lumped together into a single hybrid topic covering both the state and the country. The other challenge involved time periods. Questions about Egypt and questions about Ancient Egypt are very different things, yet they are again hard to separate when all you have is “Egypt” in a box. In other areas it was imperative to keep topics apart as well as possible, even if that created overlap or obfuscation: the Soviet Union versus Russia and Nazis versus Germany are two cases that remained split. I therefore tried to keep topics distinct, even when they shared a geographical location, unless it was technologically or linguistically impossible to do so.
[INSERT BLOCK ABOUT INTERESTING TRENDS AND SO FORTH HERE]
[BLOCK ABOUT CONCLUSIONS]
I assume part of your intent is to eventually post this to Reddit? For the final course version, it would help to include a brief section on Reddit, the function of subreddits, and what a flaired user is (you can cut it for later posting), as well as a brief outline of which rules get a post deleted. More methodological discussion of what NER is would also help. You don’t need to discuss the scraping process, but since your categories come from somewhere outside the threads themselves, you should discuss that a bit.
You should think about breaking out different topic streams: for example, one with all places together, one with all religion topics together, and one with all wars/events together. Tableau and human eyeballs don’t do well comparing more than twenty categories (or really more than ten), so you want either to narrow the display down to the top 10 or 20 topics, or to give more thematic views of subcategories.
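If it helps to do the narrowing before the data reaches Tableau, here is a minimal Python sketch of one way to keep the top N topics and fold everything else into a single “Other” bucket. The function name `top_n_with_other` and the counts are hypothetical, made up purely for illustration; they are not taken from your workbook.

```python
from collections import Counter

def top_n_with_other(topic_counts, n=10):
    """Keep the n most frequent topics; sum the remainder into 'Other'."""
    kept = dict(Counter(topic_counts).most_common(n))
    remainder = sum(topic_counts.values()) - sum(kept.values())
    if remainder:
        kept["Other"] = remainder
    return kept

# Hypothetical counts, for illustration only (not real AskHistorians data).
example_counts = {
    "World War II": 300,
    "Rome": 120,
    "Egypt": 80,
    "Vikings": 40,
    "Japan": 25,
}
top3 = top_n_with_other(example_counts, n=3)
```

Keeping the “Other” bucket means the totals still add up, so a reader can see how much of the question volume the displayed topics actually cover.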
Looking through your workbook, you’ve got a lot of items like “Christ, christ child, Virgin Mary, Bible, Pope,” etc. that fall under topic headings you’re already displaying, like Christianity. How are you handling these? They’re small topics on their own, but they could be grouped using Tableau’s group function to give a fuller picture of the topics, though it would be a big PITA.
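As an alternative to grouping by hand in Tableau, the same roll-up could be sketched in Python before the data is loaded. This is only an illustration under assumptions: the `PARENT_TOPIC` mapping, the `roll_up` function, and the counts are all hypothetical, and the mapping would still have to be built by hand from your own entity list.

```python
from collections import Counter

# Hypothetical mapping from small entities to the parent topic already
# displayed; extend it by hand for other themes (wars, places, and so on).
PARENT_TOPIC = {
    "christ": "Christianity",
    "christ child": "Christianity",
    "virgin mary": "Christianity",
    "bible": "Christianity",
    "pope": "Christianity",
}

def roll_up(entity_counts):
    """Add each entity's count to its parent topic, if it has one."""
    rolled = Counter()
    for entity, count in entity_counts.items():
        # Lookup is case-insensitive; unmapped entities keep their own name.
        parent = PARENT_TOPIC.get(entity.lower(), entity)
        rolled[parent] += count
    return dict(rolled)

# Hypothetical counts, for illustration only.
example_counts = {"Christianity": 50, "Pope": 7, "Bible": 5, "Egypt": 30}
rolled_example = roll_up(example_counts)
```

The mapping is the tedious part either way, but writing it down once as data makes the grouping repeatable if you rerun the analysis for another year.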