Here are links from my Spring 2019 talks on computational history at the Fields Institute and MIT.
Sites that can be used with no prior programming experience:
- Gavagai Living Lexicon
- IFTTT (If This Then That) for automating workflow
- MemeTracker and NIFTY for visualizing the 24-hour news cycle
- The Programming Historian for novice-friendly, peer-reviewed tutorials to get started with programming
- Interactive TF-IDF at Wolfram Demonstrations
- Webrecorder.io to capture a website in a WARC file that can be browsed later
- Wolfram Alpha for natural language queries of a computable knowledge database
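The Wolfram demonstration linked above lets you explore TF-IDF interactively; as a rough orientation, here is a minimal stdlib-only sketch of one common variant (raw term frequency times log inverse document frequency — other weighting schemes exist):

```python
import math
from collections import Counter

def tf_idf(docs):
    """TF-IDF weights for a list of tokenized documents.

    tf = term count / document length; idf = log(N / df).
    This is one common variant among several in the literature.
    """
    n = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        length = len(doc)
        weights.append({
            term: (count / length) * math.log(n / df[term])
            for term, count in counts.items()
        })
    return weights

docs = [
    "the archived web".split(),
    "the web crawl".split(),
    "archived news events".split(),
]
w = tf_idf(docs)
# "the" appears in two of three documents, so it is down-weighted
# relative to a rarer term such as "crawl".
```

Terms that occur in many documents get small weights; terms concentrated in few documents stand out — which is why TF-IDF is a useful first pass over a historical corpus.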
If you are comfortable with scripting:
- Build a Mini Search Engine with Apache Nutch and Solr
- Build an Elasticsearch Search Engine for E-books in a Docker Container
- CommonCrawl.org provides access to years of free web crawl data
- YAGO provides structured access to ~120M facts concerning ~10M entities, derived from Wikipedia, WordNet and GeoNames
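At the heart of the Nutch/Solr and Elasticsearch tutorials above is an inverted index: a map from each term to the documents that contain it. A toy sketch of the idea (illustrative names, whitespace tokenization, AND-queries only, no ranking):

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """AND-query: ids of documents containing every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = index.get(terms[0], set()).copy()
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

docs = {
    1: "the archived web",
    2: "web crawl data",
    3: "archived news cycle",
}
index = build_index(docs)
hits = search(index, "archived web")  # only doc 1 has both terms
```

Real engines layer tokenization rules, ranking (e.g. TF-IDF or BM25), and distributed storage on top of this basic structure, but the lookup is the same.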
Technical sources:
- Achlioptas, “Database-Friendly Random Projections”
- Jurgens & Stevens, “Event Detection in Blogs using Temporal Random Indexing”
- Kanhabua, Nguyen & Niederée, “What Triggers Human Remembering of Events”
- Leskovec, Rajaraman & Ullman, Mining of Massive Datasets
- Schmidt, “Stable Random Projection”
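A sketch of the idea behind the Achlioptas paper above: a dense Gaussian projection matrix can be replaced by a sparse one whose entries are +1 with probability 1/6, 0 with probability 2/3, and -1 with probability 1/6, scaled by sqrt(3/k), while still approximately preserving pairwise distances (a Johnson–Lindenstrauss-style guarantee). A stdlib-only illustration, not a production implementation:

```python
import math
import random

def achlioptas_matrix(d, k, seed=0):
    """k x d projection with entries sqrt(3/k) * {+1, 0, -1},
    drawn with probabilities 1/6, 2/3, 1/6 respectively,
    per Achlioptas's database-friendly construction."""
    rng = random.Random(seed)
    scale = math.sqrt(3 / k)
    # choice over (1, 0, 0, 0, 0, -1) gives the 1/6, 2/3, 1/6 split
    return [
        [scale * rng.choice((1, 0, 0, 0, 0, -1)) for _ in range(d)]
        for _ in range(k)
    ]

def project(matrix, vec):
    """Reduce vec from d dimensions down to k."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

# Project a 1000-dimensional vector down to 50 dimensions;
# squared norms are preserved in expectation.
vec = [1.0] * 1000
R = achlioptas_matrix(d=1000, k=50, seed=42)
low = project(R, vec)
```

Because two thirds of the entries are zero and the rest are ±1 up to a common scale, the projection needs only additions and subtractions — the "database-friendly" part of the title.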
Historiography:
- Brügger, The Archived Web (2019)
- Hartog, Regimes of Historicity (2016)
- Milligan, History in the Age of Abundance? (2019)
- Snyder, The Road to Unfreedom (2018)
- Tooze, Crashed (2018)