MiddLab Data Workshops: Introduction to Text Mining in R
Virtual Middlebury
Closed to the Public

Text mining is the process of transforming unstructured texts of all kinds (literary, scholarly, journalistic, scientific, etc.) into a structured form in which the language of the documents can be analyzed. Applying tidy data principles makes these tasks easier, more efficient, and more interoperable with other tools, and R has packages that support the whole process inside the R environment.
In this lesson, participants will learn:
* Some basic text mining/analysis concepts
* How to transform texts (e.g. a novel) into a structured dataset ready to use in R
* How to use tidy data packages (such as dplyr and tidyr) to manipulate text data
* How to perform basic sentiment analysis and word count tasks in R
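For a sense of what these tasks look like in practice, here is a minimal sketch rather than the workshop's actual material. It assumes the tidytext package (not named in this description) alongside dplyr and tidyr, and it uses a few made-up sentences in place of a real novel. The object names (text_df, tidy_words, word_counts, sentiment_by_line) are hypothetical.

```r
library(dplyr)
library(tidyr)
library(tidytext)  # assumption: this package is not named in the workshop description

# Made-up sample text standing in for a novel (illustration only)
text_df <- tibble(
  line = 1:3,
  text = c("It was a happy and wonderful day.",
           "The storm was terrible and the road was bad.",
           "A happy ending followed the terrible storm.")
)

# Transform the text into a tidy dataset: one row per word,
# lowercased and with punctuation removed
tidy_words <- text_df %>%
  unnest_tokens(word, text)

# Word counts, after dropping common stop words such as "the" and "and"
word_counts <- tidy_words %>%
  anti_join(stop_words, by = "word") %>%
  count(word, sort = TRUE)

# Basic sentiment analysis: match words against the Bing lexicon bundled
# with tidytext, then count positive and negative words for each line
sentiment_by_line <- tidy_words %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(line, sentiment) %>%
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0)

print(word_counts)
print(sentiment_by_line)
```

Keeping one word per row is what lets ordinary dplyr verbs such as count() and the join functions work directly on text.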
Participants should have basic familiarity with R. If you are completely new to R, please be sure to attend the Introduction to R workshop on June 14, 2022. Attendees will also benefit from familiarity with the material covered in our Data wrangling in R with dplyr and tidyr and Creating high quality graphics in R with ggplot2 workshops.
Please click here to learn more and to register for this workshop.
Sponsored by: College Libraries
Contact Organizer
Kemp, Jonathan
jkemp@middlebury.edu
(802) 443-2265