class: center, middle, inverse, title-slide

# Lec08: Time
## Stat41: Data Viz
### Prof Amanda Luby
### Swarthmore College

---
class: center, middle

# Today:

(1) Return to Regression Discontinuity (5+5)

(2) The Great Debate Pt 1 (15+5)

(3) Interlude: The negative y-axis (5)

(4) The Great Debate Pt 2 (10+5)

---
class: center

<blockquote class="twitter-tweet"><p lang="und" dir="ltr"><a href="https://t.co/nDIJnSNAn7">pic.twitter.com/nDIJnSNAn7</a></p>— Alex Selby-Boothroyd (@AlexSelbyB) <a href="https://twitter.com/AlexSelbyB/status/1325782481174466562?ref_src=twsrc%5Etfw">November 9, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

---
# Announcements

(1) "Office Hours" tomorrow

(2) Final Project Milestones

---
# Regression Discontinuity

Return to jamboards/regression discontinuity with the "time" chapter in mind.

+ What is the point of the graph?

--

+ How does it fail?

--

+ What would be a better way?

---
class: center, middle

# Recap

---
class: inverse, center, middle

# The Great Debate Part 1
### Should y axes start at zero?

---
# Examples on jamboards

+ Pike County COVID Cases
+ Average Global Temperature
+ Fox News new cases
+ Fox News Obamacare enrollment

---
class: inverse, center, middle

# Recap

---
# When is it ok to *not* start at zero?

+ When small changes really matter

--

+ When the scale is distorted

--

+ When values close to zero are impossible!

---
### When small changes really matter

.pull-left[  ]

--

.pull-right[  ]

---
# When scale is distorted

.pull-left[  ]

--

.pull-right[  ]

---
# When values close to zero are impossible

.pull-left[  ]

--

.pull-right[  ]

---
# We **should** start at zero when

+ Area matters (bar chart, histogram)

--

+ Distance from zero is important

---
# Not everyone agrees on this!

Proof from an email [thread](images/start-at-zero-thread.pdf) of stats professors.
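---
# Sketch in R: the same data, with and without zero

A minimal sketch of the choice debated above, assuming `ggplot2` is installed. The `temps` values are invented for illustration and are not the data behind the jamboard examples. `expand_limits(y = 0)` forces zero onto the axis, while `coord_cartesian(ylim = ...)` zooms in without dropping any data.

```r
library(ggplot2)

# Hypothetical yearly temperatures, made up for this illustration
temps <- data.frame(
  year   = 2011:2020,
  temp_f = c(57.1, 57.3, 57.2, 57.5, 57.8, 58.0, 57.9, 57.8, 58.1, 58.2)
)

base_plot <- ggplot(temps, aes(x = year, y = temp_f)) +
  geom_line()

# Version 1: stretch the y-axis so that it includes zero
p_zero <- base_plot + expand_limits(y = 0)

# Version 2: zoom the view to a narrow window around the data
p_zoom <- base_plot + coord_cartesian(ylim = c(57, 58.5))
```

Print `p_zero` and `p_zoom` next to each other: the line looks nearly flat in the first and climbs steeply in the second, although the data are identical.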
---
### Interlude: the inverted y axis

+ COVID graph on jamboard = misleading

--

+ FL Gun Death chart = misleading

--

+ But what if it's done well?

---
class: center

---
class: inverse, middle, center

# The Great Debate Pt 2
### The double y-axis

---
# Example

---
# Your jamboards:

+ Spurious Correlations
+ COVID Sewage Study
+ Daily High Temperatures
+ UK Kennel Club

---
class: inverse, center, middle

# Recap

---
class: inverse, middle, center

### When you choose the start and end y-values, you can force the trends to line up however you want!

---
# When can we do it?

+ When two axes measure the same thing

--

+ When you want to stress me out

---
# Example in R

Don't make me regret this

.pull-left[
```r
# dplyr: %>%, group_by(), summarize(); ggplot2: plotting;
# xaringanthemer: theme_xaringan()
library(dplyr)
library(ggplot2)
library(xaringanthemer)
library(palmerpenguins)

# Count penguins per species
penguin_counts <- penguins %>%
  group_by(species) %>%
  summarize(total = n())
total_penguins <- sum(penguin_counts$total)

ggplot(penguin_counts, aes(x = species, y = total, fill = species)) +
  geom_col() +
  scale_y_continuous(
    # Secondary axis: the same counts rescaled to a share of all penguins
    sec.axis = sec_axis(~ . / total_penguins, labels = scales::percent)
  ) +
  guides(fill = "none") +  # "none" replaces the deprecated fill = FALSE
  theme_xaringan()
```
]

.pull-right[  ]

---
# Preview of next week: Interactivity

+ [Animation](https://preview.redd.it/dw0pdfw5icj51.gif?format=mp4&s=92dced4a3f9f774dbaf7d84a8108a966d12f697f)