Alex Lyford
Office
Warner 210
Tel
(802) 443-5564
Email
alyford@middlebury.edu
Office Hours
Spring 2024: Tuesday 1--2, Wednesday 3--Infinity, Thursday 10--11, and by appointment.

Alex Lyford is an Assistant Professor of Statistics, and he has been at Middlebury College since 2017. He received a Ph.D. in Statistics from the University of Georgia, and his research areas of interest are machine learning, text analysis, statistics education, and math games. Alex’s hobbies include sports, hiking, and playing board games. Alex also hosts Board Game Night in the Math department once a month on Mondays.

Students interested in doing research with Alex should stop by his office any time or contact him via email.

Courses Taught

Course Description

Data Science Across Disciplines
In this course we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating meaningful visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, students will attend a combined lecture where they will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will break out into smaller groups to apply these tools to domain-specific research projects in Art History, Biology, Economics, or Japanese and Linguistics.
Students enrolled in Professor Abe’s (Japanese) afternoon section will use the tools of data science to create visualizations of social and emotive meanings that surface through Japanese language/culture materials. Participants will use these visualizations to engage in various theoretical and pedagogical topics pertaining to (educational) linguistics.
Students enrolled in Professor Allen’s (Biology) afternoon section will use the tools of data science to investigate the drivers of tick abundance and tick-borne disease risk. To do this students will draw from a nation-wide ecological database.
Students enrolled in Professor Anderson’s (History of Art and Architecture) afternoon section will use the tools of data science to create interactive visualizations of the Dutch textile trade in the early eighteenth century. These visualizations will enable users to make connections between global trade patterns and representations of textiles in paintings, prints, and drawings.
Students enrolled in Professor Myers’ (Economics) afternoon section will use the tools of data science to create an interactive visualization of the landscape of abortion policy and access in the United States. This visualization will allow users to explore how abortion access varies across the country and how this variation is in turn correlated with demographic, health, and economic outcomes.
This course will utilize the R programming language. No prior experience in statistics, data science, programming, art history, biology, economics, or Japanese is necessary.

Terms Taught

Winter 2021

Requirements

DED, SCI, WTR

Course Description

Data Science Across Disciplines
In this course we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating meaningful visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, students will attend a combined lecture where they will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will break out into smaller groups to apply these tools to domain-specific research projects in Art History, Biology, Economics, or Japanese and Linguistics.
Students enrolled in Professor Abe’s (Japanese) afternoon section will use the tools of data science to create visualizations of social and emotive meanings that surface through Japanese language/culture materials. Participants will use these visualizations to engage in various theoretical and pedagogical topics pertaining to (educational) linguistics.
Students enrolled in Professor Allen’s (Biology) afternoon section will use the tools of data science to investigate the drivers of tick abundance and tick-borne disease risk. To do this students will draw from a nation-wide ecological database.
Students enrolled in Professor Anderson’s (History of Art and Architecture) afternoon section will use the tools of data science to create interactive visualizations of the Dutch textile trade in the early eighteenth century. These visualizations will enable users to make connections between global trade patterns and representations of textiles in paintings, prints, and drawings.
Students enrolled in Professor Myers’ (Economics) afternoon section will use the tools of data science to create an interactive visualization of the landscape of abortion policy and access in the United States. This visualization will allow users to explore how abortion access varies across the country and how this variation is in turn correlated with demographic, health, and economic outcomes.
This course will utilize the R programming language. No prior experience in statistics, data science, programming, art history, biology, economics, or Japanese is necessary.

Terms Taught

Winter 2021

Requirements

DED, SOC, WTR

Course Description

Data Science Across Disciplines
In this course, we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating engaging visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, we will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will work in small groups with one of several faculty members on domain-specific research projects in Sociology, Neuroscience, Animation, Art History, or Environmental Science. This course will utilize the R programming language. No prior experience with R is necessary.
ENVS: Students will engage in research within environmental health science—the study of reciprocal relationships between human health and the environment. High-quality data and the skills to make sense of these data are key to studying environmental health across diverse spatial scales, from individual cells through human populations. In this course, we will explore common types of data and analytical tools used to answer environmental health questions and inform policy.
FMMC: Students will explore how to make a series of consequential decisions about how to present data and how to make it clear, impactful, emotional or compelling. In this hands-on course we will use a wide range of new and old art making materials to craft artistic visual representations of data that educate, entertain, and persuade an audience with the fundamentals of data science as our starting point.
NSCI/MATH: Students will use the tools of data science to explore quantitative approaches to understanding and visualizing neural data. The types of neural data that we will study consist of electrical activity (voltage and/or spike trains) measured from individual neurons and can be used to understand how neurons respond to and process different stimuli (e.g., visual or auditory cues). Specifically, we will use this neural data from several regions of the brain to make predictions about neuron connectivity and information flow within and across brain regions.
SOCI: Students will use the tools of data science to examine how experiences in college are associated with social and economic mobility after college. Participants will combine sources of "big data" with survey research to produce visualizations and exploratory analyses that consider the importance of higher education for shaping life chances.
HARC: Students will use the tools of data science to create interactive visualizations of the Dutch textile trade in the early eighteenth century. These visualizations will enable users to make connections between global trade patterns and representations of textiles in paintings, prints, and drawings.

Terms Taught

Winter 2022

Requirements

DED, SCI, WTR

Course Description

Data Science Across Disciplines
In this course, we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating engaging visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, we will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will work in small groups with one of several faculty members on domain-specific research projects in Sociology, Neuroscience, Animation, Art History, or Environmental Science. This course will utilize the R programming language. No prior experience with R is necessary.
ENVS: Students will engage in research within environmental health science—the study of reciprocal relationships between human health and the environment. High-quality data and the skills to make sense of these data are key to studying environmental health across diverse spatial scales, from individual cells through human populations. In this course, we will explore common types of data and analytical tools used to answer environmental health questions and inform policy.
FMMC: Students will explore how to make a series of consequential decisions about how to present data and how to make it clear, impactful, emotional or compelling. In this hands-on course we will use a wide range of new and old art making materials to craft artistic visual representations of data that educate, entertain, and persuade an audience with the fundamentals of data science as our starting point.
NSCI/MATH: Students will use the tools of data science to explore quantitative approaches to understanding and visualizing neural data. The types of neural data that we will study consist of electrical activity (voltage and/or spike trains) measured from individual neurons and can be used to understand how neurons respond to and process different stimuli (e.g., visual or auditory cues). Specifically, we will use this neural data from several regions of the brain to make predictions about neuron connectivity and information flow within and across brain regions.
SOCI: Students will use the tools of data science to examine how experiences in college are associated with social and economic mobility after college. Participants will combine sources of "big data" with survey research to produce visualizations and exploratory analyses that consider the importance of higher education for shaping life chances.
HARC: Students will use the tools of data science to create interactive visualizations of the Dutch textile trade in the early eighteenth century. These visualizations will enable users to make connections between global trade patterns and representations of textiles in paintings, prints, and drawings.

Terms Taught

Winter 2022

Requirements

ART, DED, WTR

Course Description

Mathematics of Board Games
People have been playing games since as early as 2000 B.C. Since then, avid players have devised strategies to maximize their chances of winning. In this course we will dissect a variety of modern board games and analyze various strategies for each game using mathematics, computers, and intuition. We will further discuss whether an optimal strategy exists for each game and propose modifications to existing rules and scoring schemes. The course will culminate with a project to construct a board game. All are welcome regardless of mathematical background. 3 hrs. sem.

Terms Taught

Fall 2020

Requirements

CW, DED

Course Description

Data Science Across Disciplines
In this course, we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating engaging visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, we will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will work in small groups with one of several faculty members on domain-specific research projects in Geography, Political Science, Restorative Justice, or Healthcare. This course will use the R programming language. No prior experience with R is necessary.

GEOG 1230: In this section, students will use data science tools to explore the ways migration systems in the United States changed during the COVID-19 pandemic. We will draw on data collected from mobile phones recording each phone’s monthly place of residence at the census tract level. The dataset includes monthly observations from January 2019 through December 2021 allowing the analysis to compare migration systems pre-pandemic with those during the pandemic.

INTD 1230A: Data is a powerful tool for improving health outcomes by making programmatic choices to support justice. In this afternoon section of Data Across the Disciplines, students will be working with Addison County Restorative Justice (ACRJ) on understanding patterns in the occurrence of driving under the influence. ACRJ has over 1,000 cases and would like to better understand their data and come up with ways to access information. We will explore how identity, geography, and support impact outcomes from DUI cases. Using statistical analysis and data visualizations, along with learning about ethical data practices, we will report our findings.

INTD 1230B: Let’s dive into the minutes and reports of local towns to develop an accessible news and history resource. Could this be a tool for small newspapers to track local news more easily? Can we map this fresh data for a new look across geographies? Do you want to help volunteer town officials make decisions and better wrangle with their town’s history and data?
In this course we will develop a focused database of documents produced by several municipal boards and commissions. We will engage in conversation with local officials, researchers, and journalists. This course aims to introduce students to making data from real world documents and the people that make them to generate useful information that is often open but frequently difficult to sift through.

MATH/STAT 1230: Students will explore pediatric healthcare data to better understand the risks correlated with various childhood illnesses through an emphasis on the intuition behind statistical and machine learning techniques. We will practice making informed decisions from noisy data and the steps to go from messy data to a final report. Students will become proficient in R and gain an understanding of various statistical techniques.

PSCI 1230: How do candidates for U.S. national office raise money? From whom do they raise it? In this section we will explore these questions using Federal Election Commission data on individual campaign contributions to federal candidates. Our analysis using R will help us identify geographic patterns in the data, as well as variations in funds raised across types of candidates. We will discuss what implications these patterns may have for the health and functioning of democracy in the U.S.

Terms Taught

Winter 2024

Requirements

DED, SOC, WTR

Course Description

Data Science Across Disciplines
In this course, we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating engaging visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, we will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will work in small groups with one of several faculty members on domain-specific research projects in Sociology, Neuroscience, Animation, Art History, or Environmental Science. This course will utilize the R programming language. No prior experience with R is necessary.
ENVS: Students will engage in research within environmental health science—the study of reciprocal relationships between human health and the environment. High-quality data and the skills to make sense of these data are key to studying environmental health across diverse spatial scales, from individual cells through human populations. In this course, we will explore common types of data and analytical tools used to answer environmental health questions and inform policy.
FMMC: Students will explore how to make a series of consequential decisions about how to present data and how to make it clear, impactful, emotional or compelling. In this hands-on course we will use a wide range of new and old art making materials to craft artistic visual representations of data that educate, entertain, and persuade an audience with the fundamentals of data science as our starting point.
NSCI/MATH: Students will use the tools of data science to explore quantitative approaches to understanding and visualizing neural data. The types of neural data that we will study consist of electrical activity (voltage and/or spike trains) measured from individual neurons and can be used to understand how neurons respond to and process different stimuli (e.g., visual or auditory cues). Specifically, we will use this neural data from several regions of the brain to make predictions about neuron connectivity and information flow within and across brain regions.
SOCI: Students will use the tools of data science to examine how experiences in college are associated with social and economic mobility after college. Participants will combine sources of "big data" with survey research to produce visualizations and exploratory analyses that consider the importance of higher education for shaping life chances.
HARC: Students will use the tools of data science to create interactive visualizations of the Dutch textile trade in the early eighteenth century. These visualizations will enable users to make connections between global trade patterns and representations of textiles in paintings, prints, and drawings.

Terms Taught

Winter 2021, Winter 2022

Requirements

ART, DED, EUR, WTR

Course Description

Data Science Across Disciplines
In this course, we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating engaging visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, we will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will work in small groups with one of several faculty members on domain-specific research projects in Geography, Political Science, Restorative Justice, or Healthcare. This course will use the R programming language. No prior experience with R is necessary.

INTD 1230 A: Data is a powerful tool for improving health outcomes by making programmatic choices to support justice. In this afternoon section of Data Across the Disciplines, students will be working with Addison County Restorative Justice (ACRJ) on understanding patterns in the occurrence of driving under the influence. ACRJ has over 1,000 cases and would like to better understand their data and come up with ways to access information. We will explore how identity, geography, and support impact outcomes from DUI cases. Using statistical analysis and data visualizations, along with learning about ethical data practices, we will report our findings.

INTD 1230 B: Let’s dive into the minutes and reports of local towns to develop an accessible news and history resource. Could this be a tool for small newspapers to track local news more easily? Can we map this fresh data for a new look across geographies? Do you want to help volunteer town officials make decisions and better wrangle with their town’s history and data? In this course we will develop a focused database of documents produced by several municipal boards and commissions. We will engage in conversation with local officials, researchers, and journalists. This course aims to introduce students to making data from real world documents and the people that make them to generate useful information that is often open but frequently difficult to sift through.

GEOG 1230: In this section, students will use data science tools to explore the ways migration systems in the United States changed during the COVID-19 pandemic. We will draw on data collected from mobile phones recording each phone’s monthly place of residence at the census tract level. The dataset includes monthly observations from January 2019 through December 2021 allowing the analysis to compare migration systems pre-pandemic with those during the pandemic.

MATH/STAT 1230: Students will explore pediatric healthcare data to better understand the risks correlated with various childhood illnesses through an emphasis on the intuition behind statistical and machine learning techniques. We will practice making informed decisions from noisy data and the steps to go from messy data to a final report. Students will become proficient in R and gain an understanding of various statistical techniques.

PSCI 1230: How do candidates for U.S. national office raise money? From whom do they raise it? In this section we will explore these questions using Federal Election Commission data on individual campaign contributions to federal candidates. Our analysis using R will help us identify geographic patterns in the data, as well as variations in funds raised across types of candidates. We will discuss what implications these patterns may have for the health and functioning of democracy in the U.S.

Terms Taught

Winter 2024

Requirements

DED, SCI, WTR

Course Description

Data Science Across Disciplines
In this course we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating meaningful visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, students will attend a combined lecture where they will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will break out into smaller groups to apply these tools to domain-specific research projects in Art History, Biology, Economics, or Japanese and Linguistics.
Students enrolled in Professor Abe’s (Japanese) afternoon section will use the tools of data science to create visualizations of social and emotive meanings that surface through Japanese language/culture materials. Participants will use these visualizations to engage in various theoretical and pedagogical topics pertaining to (educational) linguistics.
Students enrolled in Professor Allen’s (Biology) afternoon section will use the tools of data science to investigate the drivers of tick abundance and tick-borne disease risk. To do this students will draw from a nation-wide ecological database.
Students enrolled in Professor Anderson’s (History of Art and Architecture) afternoon section will use the tools of data science to create interactive visualizations of the Dutch textile trade in the early eighteenth century. These visualizations will enable users to make connections between global trade patterns and representations of textiles in paintings, prints, and drawings.
Students enrolled in Professor Myers’ (Economics) afternoon section will use the tools of data science to create an interactive visualization of the landscape of abortion policy and access in the United States. This visualization will allow users to explore how abortion access varies across the country and how this variation is in turn correlated with demographic, health, and economic outcomes.
This course will utilize the R programming language. No prior experience in statistics, data science, programming, art history, biology, economics, or Japanese is necessary.

Terms Taught

Winter 2021

Requirements

AAL, DED, NOA, SOC

Course Description

Math and Board Games
Have you ever spent minutes agonizing over which move to make in a board game? Out of all the possible options, how could you possibly determine which move is best? Was there even an objectively best decision? In this course, we will explore the mathematics and underlying gameplay structures of several modern board games. In addition to playing these games during class, we’ll use math and logic to assess and quantify the value of a range of possible in-game decisions. Using formal mathematical proofs, papers, and in-class discussions, we’ll analyze the fairness and equity of strategies across a wide variety of games. We’ll finish the course by designing our own board game based on what we’ve learned! (Students who have completed FYSE 1216 are not eligible to enroll in MATH 0106.)

Terms Taught

Fall 2022

Requirements

CW, DED

Course Description

Introduction to Statistical Science
A practical introduction to statistical methods and the examination of data sets. Computer software will play a central role in analyzing a variety of real data sets from the natural and social sciences. Topics include descriptive statistics, elementary distributions for data, hypothesis tests, confidence intervals, correlation, regression, contingency tables, and analysis of variance. The course has no formal mathematics prerequisite, and is especially suited to students in the physical, social, environmental, and life sciences who seek an applied orientation to data analysis. (Credit is not given for MATH 0116 if the student has taken ECON 0111 (formerly ECON 0210) or PSYC 0201 previously or concurrently.) 3 hrs. lect./1 hr. computer lab.

Terms Taught

Fall 2022

Requirements

DED

Course Description

Introduction to Data Science
In this course students will gain exposure to the entire data science pipeline: forming a statistical question, collecting and cleaning data sets, performing exploratory data analyses, identifying appropriate statistical techniques, and communicating the results, all the while leaning heavily on open source computational tools, in particular the R statistical software language. We will focus on analyzing real, messy, and large data sets, requiring the use of advanced data manipulation/wrangling and data visualization packages. Students will be required to bring a laptop (owned or college-loaned) to class as many lectures will involve in-class computational activities. (formerly MATH216) 3 hrs lect./disc. (Not open to students who have taken BIOL 1230, ECON 1230, ENVS 1230, FMMC 1230, HARC 1230, JAPN 1230, LNGT 1230, NSCI 1230, MATH 1230, SOCI 1230, PSCI 1230, WRPR 1230, or GEOG 1230.)

Terms Taught

Fall 2021

Requirements

DED

Course Description

Introduction to Data Science
In this course students will gain exposure to the entire data science pipeline: forming a statistical question, collecting and cleaning data sets, performing exploratory data analyses, identifying appropriate statistical techniques, and communicating the results, all the while leaning heavily on open source computational tools, in particular the R statistical software language. We will focus on analyzing real, messy, and large data sets, requiring the use of advanced data manipulation/wrangling and data visualization packages. Students will be required to bring their own laptops as many lectures will involve in-class computational activities. 3 hrs lect./disc.

Terms Taught

Fall 2020, Spring 2021

Requirements

DED

Course Description

Statistical Learning
This course is an introduction to modern statistical, machine learning, and computational methods to analyze large and complex data sets that arise in a variety of fields, from biology to economics to astrophysics. The theoretical underpinnings of the most important modeling and predictive methods will be covered, including regression, classification, clustering, resampling, and tree-based methods. Student work will involve implementation of these concepts using open-source computational tools. (MATH 0118, or MATH 0216, or BIOL 1230, or ECON 1230, or ENVS 1230, or FMMC 1230, or HARC 1230, or JAPN 1230, or LNGT 1230, or NSCI 1230, or MATH 1230 or SOCI 1230) 3 hrs. lect./disc.

Terms Taught

Fall 2021

Requirements

DED

Course Description

Statistical Inference
An introduction to the mathematical methods and applications of statistical inference using both classical methods and modern resampling techniques. Topics will include: permutation tests, parametric and nonparametric problems, estimation, efficiency and the Neyman-Pearson lemma. Classical tests within the normal theory such as F-test, t-test, and chi-square test will also be considered. Methods of linear least squares are used for the study of analysis of variance and regression. There will be some emphasis on applications to other disciplines. This course is taught using R. (MATH 0310) 3 hrs. lect./disc.

Terms Taught

Spring 2021, Spring 2022

Requirements

DED

Course Description

Advanced Study
Individual study for qualified students in more advanced topics in algebra, number theory, real or complex analysis, topology. Particularly suited for those who enter with advanced standing. (Approval required) 3 hrs. lect./disc.

Terms Taught

Fall 2020, Winter 2021, Spring 2021, Fall 2021, Winter 2022, Spring 2022, Fall 2022, Winter 2023, Spring 2023, Fall 2023, Winter 2024, Spring 2024, Fall 2024, Winter 2025, Spring 2025

Course Description

Statistics Capstone Seminar
In this course we will work with community partners to solve real-world problems using modern statistical and data science techniques. Students will work in small groups to translate research questions into actionable analysis and visualizations. Students will select a project of interest from a subset of community partners, maintain contact and collaboration with the community partner, and present their findings in a final symposium. (MATH 0218, MATH 0311, or by approval) 3 hrs. sem.

Terms Taught

Spring 2022

Requirements

DED

Course Description

Data Science Across Disciplines
In this course, we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating engaging visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, we will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will work in small groups with one of several faculty members on domain-specific research projects in Sociology, Neuroscience, Animation, Art History, or Environmental Science. This course will utilize the R programming language. No prior experience with R is necessary.
ENVS: Students will engage in research within environmental health science—the study of reciprocal relationships between human health and the environment. High-quality data and the skills to make sense of these data are key to studying environmental health across diverse spatial scales, from individual cells through human populations. In this course, we will explore common types of data and analytical tools used to answer environmental health questions and inform policy.
FMMC: Students will explore how to make a series of consequential decisions about how to present data and how to make it clear, impactful, emotional or compelling. In this hands-on course we will use a wide range of new and old art making materials to craft artistic visual representations of data that educate, entertain, and persuade an audience with the fundamentals of data science as our starting point.
NSCI/MATH: Students will use the tools of data science to explore quantitative approaches to understanding and visualizing neural data. The neural data that we will study consist of electrical activity (voltage and/or spike trains) measured from individual neurons and can be used to understand how neurons respond to and process different stimuli (e.g., visual or auditory cues). Specifically, we will use these neural data from several regions of the brain to make predictions about neuron connectivity and information flow within and across brain regions.
SOCI: Students will use the tools of data science to examine how experiences in college are associated with social and economic mobility after college. Participants will combine sources of "big data" with survey research to produce visualizations and exploratory analyses that consider the importance of higher education for shaping life chances.
HARC: Students will use the tools of data science to create interactive visualizations of the Dutch textile trade in the early eighteenth century. These visualizations will enable users to make connections between global trade patterns and representations of textiles in paintings, prints, and drawings.

Terms Taught

Winter 2022

Requirements

DED, SCI, WTR

Course Description

Data Science Across Disciplines
In this course, we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating engaging visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, we will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will work in small groups with one of several faculty members on domain-specific research projects in Geography, Political Science, Restorative Justice, or Healthcare. This course will use the R programming language. No prior experience with R is necessary.

PSCI 1230: How do candidates for U.S. national office raise money? From whom do they raise it? In this section we will explore these questions using Federal Election Commission data on individual campaign contributions to federal candidates. Our analysis using R will help us identify geographic patterns in the data, as well as variations in funds raised across types of candidates. We will discuss what implications these patterns may have for the health and functioning of democracy in the U.S.

INTD 1230A: Data is a powerful tool for improving health outcomes by making programmatic choices to support justice. In this afternoon section of Data Science Across Disciplines, students will be working with Addison County Restorative Justice (ACRJ) on understanding patterns in the occurrence of driving under the influence. ACRJ has over 1,000 cases and would like to better understand their data and come up with ways to access information. We will explore how identity, geography, and support impact outcomes from DUI cases. Using statistical analysis and data visualizations, along with learning about ethical data practices, we will report our findings.

INTD 1230B: Let’s dive into the minutes and reports of local towns to develop an accessible news and history resource. Could this be a tool for small newspapers to track local news more easily? Can we map this fresh data for a new look across geographies? Do you want to help volunteer town officials make decisions and better wrangle with their town’s history and data?
In this course we will develop a focused database of documents produced by several municipal boards and commissions. We will engage in conversation with local officials, researchers, and journalists. This course aims to introduce students to making data from real-world documents, and to the people who make them, to generate useful information that is often open but frequently difficult to sift through.

GEOG 1230: In this section, students will use data science tools to explore the ways migration systems in the United States changed during the COVID-19 pandemic. We will draw on data collected from mobile phones recording each phone’s monthly place of residence at the census tract level. The dataset includes monthly observations from January 2019 through December 2021, allowing the analysis to compare migration systems pre-pandemic with those during the pandemic.

MATH/STAT 1230: Students will explore pediatric healthcare data to better understand the risks correlated with various childhood illnesses through an emphasis on the intuition behind statistical and machine learning techniques. We will practice making informed decisions from noisy data and the steps to go from messy data to a final report. Students will become proficient in R and gain an understanding of various statistical techniques.

Terms Taught

Winter 2024

Requirements

DED, SOC, WTR

Course Description

Data Science Across Disciplines
In this course, we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating engaging visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, we will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will work in small groups with one of several faculty members on domain-specific research projects in Sociology, Neuroscience, Animation, Art History, or Environmental Science. This course will utilize the R programming language. No prior experience with R is necessary.
ENVS: Students will engage in research within environmental health science—the study of reciprocal relationships between human health and the environment. High-quality data and the skills to make sense of these data are key to studying environmental health across diverse spatial scales, from individual cells through human populations. In this course, we will explore common types of data and analytical tools used to answer environmental health questions and inform policy.
FMMC: Students will explore how to make a series of consequential decisions about how to present data and how to make it clear, impactful, emotional, or compelling. In this hands-on course, we will use a wide range of new and old art-making materials to craft artistic visual representations of data that educate, entertain, and persuade an audience, with the fundamentals of data science as our starting point.
NSCI/MATH: Students will use the tools of data science to explore quantitative approaches to understanding and visualizing neural data. The neural data that we will study consist of electrical activity (voltage and/or spike trains) measured from individual neurons and can be used to understand how neurons respond to and process different stimuli (e.g., visual or auditory cues). Specifically, we will use these neural data from several regions of the brain to make predictions about neuron connectivity and information flow within and across brain regions.
SOCI: Students will use the tools of data science to examine how experiences in college are associated with social and economic mobility after college. Participants will combine sources of "big data" with survey research to produce visualizations and exploratory analyses that consider the importance of higher education for shaping life chances.
HARC: Students will use the tools of data science to create interactive visualizations of the Dutch textile trade in the early eighteenth century. These visualizations will enable users to make connections between global trade patterns and representations of textiles in paintings, prints, and drawings.

Terms Taught

Winter 2022

Requirements

DED, SOC, WTR

Course Description

Advanced Introduction to Statistical and Data Sciences
An introduction to statistical methods and the examination of data sets for students with a background in calculus. Topics include descriptive statistics, elementary distributions for data, hypothesis tests, confidence intervals, and regression. Students develop skills in data cleaning, wrangling, visualization, and model fitting using the statistical software R. Emphasis will be placed on reproducibility. (MATH 0121 or APAB 4 or APBC 3, or by waiver) (Not open to students who have taken MATH 0116, MATH 0118, ECON 0111 (formerly ECON 0210), PSYC 0201, STAT 0116, STAT 0118, BIOL 1230, ECON 1230, ENVS 1230, FMMC 1230, HARC 1230, JAPN 1230, LNGT 1230, NSCI 1230, MATH 1230, SOCI 1230, PSCI 1230, WRPR 1230, or GEOG 1230.)

Terms Taught

Fall 2023, Spring 2024

Requirements

DED

Course Description

Statistical Learning (formerly MATH 0218)
This course is an introduction to modern statistical, machine learning, and computational methods to analyze large and complex data sets that arise in a variety of fields, from biology to economics to astrophysics. The theoretical underpinnings of the most important modeling and predictive methods will be covered, including regression, classification, clustering, resampling, and tree-based methods. Student work will involve implementation of these concepts using open-source computational tools. (MATH 0118, or MATH 0216, or BIOL 1230, or ECON 1230, or ENVS 1230, or FMMC 1230, or HARC 1230, or JAPN 1230, or LNGT 1230, or NSCI 1230, or MATH 1230, or SOCI 1230) 3 hrs. lect./disc.

Terms Taught

Fall 2023, Spring 2024, Fall 2024

Requirements

DED

Course Description

Independent Study
Individual study for qualified students in more advanced topics in statistics. Particularly suited for those who enter with advanced standing. (Approval required) 3 hrs. lect./disc.

Terms Taught

Spring 2024, Fall 2024, Winter 2025, Spring 2025

Course Description

Data Science Across Disciplines
In this course, we will gain exposure to the entire data science pipeline—obtaining and cleaning large and messy data sets, exploring these data and creating engaging visualizations, and communicating insights from the data in a meaningful manner. During morning sessions, we will learn the tools and techniques required to explore new and exciting data sets. During afternoon sessions, students will work in small groups with one of several faculty members on domain-specific research projects in Geography, Political Science, Restorative Justice, or Healthcare. This course will use the R programming language. No prior experience with R is necessary.

MATH/STAT 1230: Students will explore pediatric healthcare data to better understand the risks correlated with various childhood illnesses through an emphasis on the intuition behind statistical and machine learning techniques. We will practice making informed decisions from noisy data and the steps to go from messy data to a final report. Students will become proficient in R and gain an understanding of various statistical techniques.

GEOG 1230: In this section, students will use data science tools to explore the ways migration systems in the United States changed during the COVID-19 pandemic. We will draw on data collected from mobile phones recording each phone’s monthly place of residence at the census tract level. The dataset includes monthly observations from January 2019 through December 2021, allowing the analysis to compare migration systems pre-pandemic with those during the pandemic.

INTD 1230A: Data is a powerful tool for improving health outcomes by making programmatic choices to support justice. In this afternoon section of Data Science Across Disciplines, students will be working with Addison County Restorative Justice (ACRJ) on understanding patterns in the occurrence of driving under the influence. ACRJ has over 1,000 cases and would like to better understand their data and come up with ways to access information. We will explore how identity, geography, and support impact outcomes from DUI cases. Using statistical analysis and data visualizations, along with learning about ethical data practices, we will report our findings.

INTD 1230B: Let’s dive into the minutes and reports of local towns to develop an accessible news and history resource. Could this be a tool for small newspapers to track local news more easily? Can we map this fresh data for a new look across geographies? Do you want to help volunteer town officials make decisions and better wrangle with their town’s history and data?
In this course we will develop a focused database of documents produced by several municipal boards and commissions. We will engage in conversation with local officials, researchers, and journalists. This course aims to introduce students to making data from real-world documents, and to the people who make them, to generate useful information that is often open but frequently difficult to sift through.

PSCI 1230: How do candidates for U.S. national office raise money? From whom do they raise it? In this section we will explore these questions using Federal Election Commission data on individual campaign contributions to federal candidates. Our analysis using R will help us identify geographic patterns in the data, as well as variations in funds raised across types of candidates. We will discuss what implications these patterns may have for the health and functioning of democracy in the U.S.

Terms Taught

Winter 2024

Requirements

DED, SCI, WTR
