Historical research now crucially involves the acquisition and use of digital sources. In History 9877A, students learn to find, harvest, manage, excerpt, cluster and analyze digital materials throughout the research process, from initial exploratory forays through the production of an electronic article or monograph that is ready to submit for publication.
In this course you will learn to apply techniques that are currently used by fewer than one percent of working historians. Computation won’t magically do your research for you, but it will make you much more efficient. You can focus on close reading, interpretation and writing, and use machines to help you find, summarize, organize and visualize sources.
Prerequisites, Workload, Blogging and Evaluation
There are no prerequisites for the course other than a willingness to learn new things and the perseverance to keep working when you’re confused or when you realize that you could spend a lifetime learning about the topics and technologies that we will cover in class, and still not master them all. Students will come into the course with very different levels of experience and expertise. Some, probably most, will be familiar only with the rudiments of computer and internet use. A few may already be skilled programmers.
This course also requires that you spend at least a little bit of time each day (say 20-30 minutes) practicing your new skills. It’s a lot like learning a new language, learning to play a musical instrument or going to the gym. It is going to be hard at first, but be patient with yourself and ask a lot of questions. With daily practice, you will soon find ways to do your research and coursework faster and more efficiently. If you can’t commit to regular practice, however, you should probably not take this course. The techniques that you learn in this class build cumulatively week by week. In addition to regular practice, it is essential that you attend every meeting of the class and study the slides carefully.
Every student in the class will have an academic blog and will be required to make weekly posts to it. These entries do not have to be long (300-500 words per week is ample). The purpose of blogging is to encourage you to engage in ‘reflective practice,’ that is, to make you think about your learning and research as you are doing it. It also gives me feedback on how the course is going. You can use each week’s blog entry to talk about what you learned, things that were clear or not, things you would like to know how to do, and so on.
If you do not already have a professional blog you will need one. Before the first class you should go to either WordPress or Blogger (not both) and create an account and a blog. If possible, create the blog under your own name; if not, choose something professional sounding. Post an introductory message about yourself and then send me the URL of your blog so that I can add you to the course blogroll for History 9877A.
You will be graded on your participation in class (20%) and on your reflective blogging (80%). There will be no midterm or final examinations, and no final paper.
Required Software
To get the most out of this class, you will need a Windows, Mac or Linux laptop, which you should bring to every class.
You will also need a desktop student license for Wolfram Research’s Mathematica software. (Don’t let the name scare you: you won’t need any particular experience with mathematics to do well in this course.) A license for the software is US $45 per semester, or US $70 for a one-year subscription.
http://www.wolfram.com/mathematica/pricing/students-individuals.php
You don’t have to purchase anything else for the course.
All slides will be made available on the course website.
Schedule
- Th Sep 8 – 1A: Introduction to Digital History
- Tu Sep 13 – 1B: Word Frequency
- Th Sep 15 – 2A: Text Search
- Tu Sep 20 – 2B: N-gram Frequency
- Th Sep 22 – 3A: KWIC
- Tu Sep 27 – 3B: Pattern Matching
- Th Sep 29 – 4A: Capitalized Phrases
- Tu Oct 04 – 4B: Collocations
- Th Oct 06 – 5A: Associations
- Tu Oct 11 – 5B: Named Entities
- Th Oct 13 – 6A: Timelines
- Tu Oct 18 – 6B: Maps
- Th Oct 20 – 7A: Batch Downloading
- Tu Oct 25 – 7B: Corpus Search
- Th Oct 27 – NO CLASS – FALL STUDY BREAK
- Tu Nov 01 – 8A: Document Vectors
- Th Nov 03 – 8B: TF-IDF
- Tu Nov 08 – 9A: Markup Languages
- Th Nov 10 – 9B: Scraping
- Tu Nov 15 – 10A: Page Images and OCR / 10B: Image Processing
- Tu Nov 22 – 11A: Identifying and Classifying Images / 11B: Photogrammetry and Georectification
- Tu Nov 29 – 12A: Application Program Interfaces (APIs) / 12B: Entity Network Spidering
- Tu Dec 06 – NO CLASS