Library

Summer Data Workshops: Introduction to text mining in R

Sponsored by:
College Libraries
Text mining is the process of transforming unstructured texts of all kinds (literary, scholarly, journalistic, scientific, etc.) into a form in which the language of the documents can be analyzed. Using tidy data principles can make these tasks easier, more efficient, and more interoperable with other tools. R has packages that support this whole process inside the R environment.

In this lesson, participants will learn:
* Some basic text mining/analysis concepts
* How to transform texts (e.g. a novel) into a structured dataset ready to use in R

Virtual Middlebury

Closed to the Public

Summer Data Workshops: Data wrangling in R with dplyr and tidyr

Sponsored by:
College Libraries
dplyr is a package that makes tabular data manipulation easier by providing a limited set of functions that can be combined to extract and summarize insights from your data. It pairs nicely with tidyr, which enables you to swiftly convert between different data formats (long vs. wide) for plotting and analysis.

In this lesson, participants will learn:
* How to select specific observations, variables, and/or values from a dataset
* How to combine multiple commands into a single command
* How to create new columns or remove existing columns from a dataset

Virtual Middlebury

Closed to the Public

Summer Data Workshops: Creating high quality graphics in R with ggplot2

Sponsored by:
College Libraries
ggplot2 is a plotting package for R that makes it simple to create complex plots from data stored in a data frame. It provides a programmatic interface for specifying which variables to plot, how they are displayed, and general visual properties. As a result, researchers need only minimal changes if the underlying data change or if they decide to switch from a bar plot to a scatterplot. This helps in creating publication-quality plots with minimal adjustment and tweaking.

In this lesson, participants will learn:
* What the components of a ggplot are

Virtual Middlebury

Closed to the Public

Summer Data Workshops: The Unix Shell

Sponsored by:
College Libraries
The Unix shell has been around longer than most of its users have been alive. It has survived so long because it’s a power tool that allows people to do complex things with just a few keystrokes. More importantly, it helps them combine existing programs in new ways and automate repetitive tasks so they aren’t typing the same things over and over again. Use of the shell is fundamental to using a wide range of other powerful tools and computing resources (including “high-performance computing” supercomputers). This lesson will start you on a path towards using these resources effectively.
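
As a rough illustration of the two ideas above (combining existing programs and automating repetitive tasks), here is a minimal sketch. It is not taken from the workshop materials: the file names and their contents are made up for the example, and it assumes only standard tools (mktemp, printf, cat, sort, uniq, head, awk, wc).

```shell
# Work in a throwaway directory so the sketch leaves nothing behind.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: three small text files, one word per line.
printf 'apple\nbanana\napple\n' > a.txt
printf 'cherry\napple\n' > b.txt
printf 'banana\n' > c.txt

# Combining programs: cat joins the files, sort groups identical lines,
# uniq -c counts each group, sort -rn puts the largest count first,
# head keeps the top line, and awk prints the word without its count.
top_word=$(cat a.txt b.txt c.txt | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
echo "Most common word: $top_word"

# Automating repetition: one loop instead of typing wc once per file.
for f in *.txt; do
  lines=$(wc -l < "$f")
  # $((lines)) strips any padding some wc implementations add.
  echo "$f has $((lines)) lines"
done
```

Each program in the pipeline does one small job, and the `|` symbol passes the output of one to the next. The loop applies the same command to every matching file, however many there are.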

Virtual Middlebury

Closed to the Public

MiddLab Coffee Break: What is Research Data Management?

Sponsored by:
College Libraries
Our first MiddLab Coffee Break of Summer 2021 will focus on research data management: what it is, why it’s an important skill, and why funding agencies care about it so much. The discussion will be facilitated by Wendy Shook (Science Data Librarian) and Ryan Clement (Data Services Librarian). Our facilitators have experience in working with faculty and students from a variety of disciplines on designing and executing data management plans, and will give a short presentation introducing data management concepts and principles.

Virtual Middlebury

Closed to the Public

midd.data lightning talks: Big Data in the Crocker Neuroscience Research Lab and the Classroom

Sponsored by:
College Libraries
Neuroscience has recently achieved a new understanding of the role single-cell gene transcription plays in determining neurons’ physiological properties. We are exploring this research frontier in our labs and classrooms at Middlebury College. Access to public data sets has allowed undergraduates both in research labs and classes to explore how behavior, physiology, and gene expression tie together. In my research lab, we use Drosophila to ask what genes play a role in stress behavior, traumatic brain injury, and learning and memory.

Virtual Middlebury

Open to the Public

midd.data lightning talks: Lisa Gates, Phil Murphy, and Netta Avineri: The Middlebury Social Science Research Modules Pr

Sponsored by:
College Libraries
The Online Survey Research Module is the first educational resource developed as part of the larger Middlebury Social Science Research Modules (MSSRM) project. Ideally, this project will continue to grow into a set of interlinking modules that will guide a user through the entire research process, a wide variety of data collection methods, and the analyses that accompany them.

Virtual Middlebury

Open to the Public

midd.data lightning talks: Writing Centers as Data Repositories and Research Sites: Genie Giaimo

Sponsored by:
College Libraries
Writing Centers are complicated spaces masquerading as simple ones. Over the past century, they have developed and adapted on many occasions to fit educational trends and the changing make-up of higher education. They have changed from faculty-led instructional spaces to peer educational ones. Writing centers are currently transforming, once again, into peer-focused and professional spaces where empirical research—frequently interdisciplinary and student-led—takes place. This lightning talk will showcase part of the new and developing research program at the Middlebury Writing Center.

Virtual Middlebury

Closed to the Public

Data Trouble, by Miriam Posner

Sponsored by:
College Libraries
Digital humanists have no particular problem talking about data. We use it, trade it, and think about it constantly. Many “traditional” humanists, though, bristle at the notion that their sources constitute “data.” And yet humanists work with evidence, and they speak of proving their claims. So is this just a problem of terminology? I’ll argue in this talk that our data trouble is more substantial than we’ve acknowledged.

Virtual Middlebury

Open to the Public