DSI seminar 2/19: Managing data from multiple tables using SQL

learning
seminar
workshop
event
Author

Robin Donatello

Published

February 13, 2019

Logistics / Setup

  • Presenter: Robin Donatello, Assistant Professor of Statistics
  • Date: Tuesday February 19th, 2019
  • Time: 3-3:50 pm
  • Location: Tehama 116
  • RSVP: http://goo.gl/forms/BnjV0y5zoz09tUU83

Description

You’ve got person level data (like country of origin) in one data set, annual data (like income) for each person across multiple years in another data set, but each person is on a team and team characteristics are on yet a THIRD data set. How do you link all this information together to gain any kind of insights? SQL is a powerful database language that can be used in it’s own program, but also inside many other programs like R, SAS and Python. Come learn the fundamentals of how to link information together across multiple tables, and how to create summary information using this powerful SQL language.

Workshop Files

I will be demonstrating SQL commands using R, but this talk is aimed at everyone.

Additional potential helpful resources

  • Intro to SQL course on Data Camp https://www.datacamp.com/courses/intro-to-sql-for-data-science
  • Software Carpentry lesson on Databases and SQL: http://swcarpentry.github.io/sql-novice-survey/
  • Data Management with SQL for Ecologists: https://datacarpentry.org/sql-ecology-lesson/
  • Using SQL in R
  • Using SQL in Python
    • https://towardsdatascience.com/sql-and-etl-an-introduction-to-sqlalchemy-in-python-fc66e8be1cd4
    • https://datatofish.com/how-to-connect-python-to-sql-server-using-pyodbc/
  • DB Browser for SQLite (standalone program) https://sqlitebrowser.org/