This is the introduction to Stata I wish I had.
Suitable for all levels, this course is designed to make you a better economist. The goal is simple: improve your efficiency, save you hours of headaches, and eliminate errors through automation. Business students use Excel; economists use software (R/Stata/Python).
These scripts provide the hacks necessary to move from "manual" analysis to replicable, professional code.
While vibecoding with LLMs is great, it is prone to errors and hallucinations. These repositories provide verified, working code that gives the AI the context it needs. Use these templates to ground your LLM, then ask it how to add specific features and to explain the logic. This reduces errors and gets you to the right result faster.
| Video Title | Skill Learned | Script |
|---|---|---|
| 1. Automated Import, Convert, Combine (Watch Here) | Reproducible data loading | 01_Clean_Data_Automated.do |
| 2. Debug Like a Pro (Watch Here) | Solve 95+% of coding errors | 02_Debug_Like_A_Pro.do |
| 3. 5 Essential Time Series Skills (Watch Here) | Visualize, Model, Forecast | 03_Essential_TS_Skills.do |
| 4. Monte Carlo Simulations (Watch Here) | Simulations, Extracting p-values (Output) | 04_Monte_Carlo_Simulations.do |
| 5. The Copy-Paste Intervention (Watch Here) | Exporting Results to Word/Excel/LaTeX | 05_Copy_Paste_Intervention.do |
- Download this repository (green "Code" button > Download ZIP).
- Unzip the folder. Keep all files in the same directory.
- Open any `.do` file by double-clicking it directly from the folder. This sets your working directory automatically, so you never need to set paths.
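
As a minimal sketch of what "never set paths" looks like in practice, the lines below assume the do-file was opened from the unzipped folder, so that `CDataQ.csv` (one of the course datasets) sits in the working directory. The check-then-import sequence is an illustration, not necessarily how the course scripts are written.

```stata
* Sketch only: assumes this do-file was opened from the course folder
* Show the folder Stata is currently working in
pwd

* Stop early with a clear error if the file is not where we expect it
confirm file "CDataQ.csv"

* Relative path: no machine-specific folders, so the script runs anywhere
import delimited "CDataQ.csv", clear

* Look at what was loaded before doing anything else
describe
```

If `confirm file` fails, the working directory is not the course folder; close Stata and reopen the do-file from the folder rather than hard-coding a path.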
The best economists know the data.
The datasets used in this course (CDataQ.csv, CDataM.csv) are real Canadian macroeconomic data.
Want to learn how to fetch this data yourself? I have a separate guide on how to build this exact dataset from scratch using official Statistics Canada sources.
When your code crashes (and it will), do not panic. Follow this 3-step workflow from Video 2 before trying to fix it:
- Read the error: The Output window is your friend. Read the red text to understand why the code failed. Click the blue error code (e.g., `r(109)`).
- Find out where the error occurred: Don't guess. Look at the line number. Run the code line by line to find the specific command that caused the stop.
- Check the output after each command: Code can run without crashing and still be wrong. Always double-check the output and verify calculations in the data viewer.
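
The workflow above can be sketched on Stata's built-in `auto` dataset. The misspelled variable `not_a_variable` and the derived variable `price_k` are hypothetical names chosen for illustration; they do not come from the course scripts.

```stata
* Sketch only: built-in data and made-up variable names
sysuse auto, clear

* Step 1: read the error. capture noisily prints the red error text
* but lets the script continue; the return code is stored in _rc
capture noisily summarize not_a_variable
display "Return code: " _rc

* Step 2: locate the problem by running one command at a time
* (here, the failing command is the summarize line above)

* Step 3: verify results even when nothing crashes
generate double price_k = price / 1000
assert !missing(price_k)
summarize price price_k
```

`assert` stops the script the moment a check fails, so a silent mistake becomes a visible error at the line that caused it.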
You should care deeply about the quality of your figures and tables. In this course, we adhere to the Stand-Alone Principle: A stranger should be able to pick up your graph or table and understand it perfectly without reading your text.
- Clean Scripts: Files include only the commands used and descriptions.
- No Raw Output: Never show raw Stata output in a report. Everything must be summarized in words or formatted into a suitable table.
- Notes are Mandatory: Every figure and table note must describe the data source, date range, transformations, and seasonal adjustments.
- Titles: It is best to handle titles and notes in LaTeX.
- Real Names: Always use the actual name of the series (e.g., "Real GDP Growth"), never the Stata syntax code (e.g., `dy_var`).
- Axis Labels: Use units only (e.g., "Percent", "Billions of Dollars").
- Legends: Label the series clearly without syntax.
- No Borders: Remove borders around legends and figures.
- Stationarity: Graph what you are modelling. Show non-stationary data only when the point is to illustrate trends.
- Visuals: No "Stata Blue" backgrounds. Format figures to look like they belong in a journal.
- Relevance: Only describe stationary data. Do not show variables that were not asked for.
- Precision: No more than three significant digits (e.g., `0.752`, not `0.75165`).
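
Several of these rules can be applied directly in the graph command. The sketch below uses Stata's built-in `gnp96` dataset rather than the course data, and the variable name `growth` and the chosen scheme are assumptions for illustration. It is meant to be run from a do-file, since `///` line continuation does not work when typed interactively.

```stata
* Sketch only: built-in U.S. data, not the Canadian course files
sysuse gnp96, clear
tsset date

* Graph what is modelled: annualized quarterly growth, in percent
generate growth = 400 * (ln(gnp96) - ln(L.gnp96))
label variable growth "Real GNP Growth"

* White background, units-only axis label, real series name in the
* legend, and no borders around the legend or plot region.
* No title() here: titles and notes are handled in LaTeX.
twoway (tsline growth), ///
    ytitle("Percent") xtitle("") ///
    legend(on order(1 "Real GNP Growth") region(lstyle(none))) ///
    graphregion(color(white)) plotregion(lstyle(none)) ///
    scheme(s1color)

* Report numbers with roughly three significant digits
quietly summarize growth
display "Mean growth: " %9.3g r(mean)
```

The first observation of `growth` is missing because the lag does not exist there; that is expected and worth stating in the figure note alongside the source and date range.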
Stephen Snudden, PhD