This post is a translation of three parts of the series on data acquisition in R from my English-language blog. The original series was planned in four parts, and three of them form the basis of this post, including access to general statistical databases and demographic data. The last part, not yet written, will focus on working with spatial data.

R is geared toward reproducible results. There are many good solutions that guarantee compatibility of system versions and packages and help apply the principles of literate programming. Here, however, I want to show how to make the whole process fully reproducible by using R itself to find, download and extract data easily and efficiently, documenting every step. Of course, I do not set myself the task of listing every conceivable data source; I focus mainly on demographic data. If your interests lie outside demography, it is worth looking at the magnificent Open Data Task View project.
To illustrate the use of each source, I show an example visualization of the acquired data. Each code example is designed to be self-contained: copy it and run it. Of course, you first need to install the required packages. The complete code is available here.
Built-in datasets
Many packages include small, handy datasets for illustrating their methods. In fact, base R itself ships with the datasets package, which contains a large number of small, diverse and sometimes very famous datasets. A detailed list of the built-in datasets from various packages is available on Vincent Arel-Bundock's website.
A nice feature of the datasets in the datasets package is that they are "always there". The unique name of a dataset can be used as easily as an object in the global environment. Let's look at a charming small dataset, swiss: Swiss Fertility and Socioeconomic Indicators (1888). Below I show how fertility differs across Swiss provinces depending on the share of the agricultural population and the prevalence of Catholicism.
    library(tidyverse)

    swiss %>%
      ggplot(aes(x = Agriculture, y = Fertility, color = Catholic > 50)) +
      geom_point() +
      stat_ellipse() +
      theme_minimal(base_family = "mono")
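To make the "always there" point concrete, here is a minimal base-R sketch that lists what the datasets package ships with and inspects swiss without any import step. The object names `available` and `swiss_dim` are only illustrative.

```r
# Character matrix describing every dataset in the `datasets` package
available <- data(package = "datasets")$results
head(available[, c("Item", "Title")])

# `swiss` can be used directly, with no import step
str(swiss)
swiss_dim <- dim(swiss)
```

The same `data(package = ...)` call should work for any installed package that bundles datasets.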

Gapminder
Some packages are designed specifically to give R users easy access to a particular dataset. A good example is gapminder, which contains a sample of the data from Hans Rosling's Gapminder project.
    library(tidyverse)
    library(gapminder)

    gapminder %>%
      ggplot(aes(x = year, y = lifeExp, color = continent)) +
      geom_jitter(size = 1, alpha = .2, width = .75) +
      # `fun` replaces the deprecated `fun.y` in ggplot2 >= 3.3.0
      stat_summary(geom = "path", fun = mean, size = 1) +
      theme_minimal(base_family = "mono")

Getting a dataset by URL
If a dataset is stored somewhere online and has a direct download link, R can read it from the link alone. As an example, let's take the famous Galton dataset on the heights of fathers and children from the HistData package. All we need is the direct link from Vincent Arel-Bundock's list:
    library(tidyverse)

    galton <- read_csv("https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/HistData/Galton.csv")

    # if these columns are not found, inspect names(galton):
    # HistData documents the Galton columns as `parent` and `child`
    galton %>%
      ggplot(aes(x = father, y = height)) +
      geom_point(alpha = .2) +
      stat_smooth(method = "lm") +
      theme_minimal(base_family = "mono")

Download and unzip an archive
Datasets are often archived before being published online. An archive cannot be read directly, but the solution in R is straightforward. The logic is simple: first, create a directory for the unpacked data; next, download the archive to a temporary file; finally, unzip the archive into the directory created earlier. As an example, let's download Historical New York City Crime Data, a New York crime dataset kindly provided by the Government of the State of New York and hosted in the open data.gov repository.
    library(tidyverse)
    library(readxl)

    # create a directory for the unzipped data
    if (!dir.exists("unzipped")) dir.create("unzipped")

    # specify the URL of the archive
    url_zip <- "http://www.nyc.gov/html/nypd/downloads/zip/analysis_and_planning/citywide_historical_crime_data_archive.zip"

    # store the archive in a temporary file
    # (mode = "wb" keeps the binary file intact on Windows)
    f <- tempfile()
    download.file(url_zip, destfile = f, mode = "wb")
    unzip(f, exdir = "unzipped/.")
The downloaded archive can be quite large, and you may not want to download it again every time the code is rerun. In that case it makes sense to keep the original archive instead of using a temporary file.
    # if we want to keep the .zip file
    path_unzip <- "unzipped/data_archive.zip"
    if (!file.exists(path_unzip)) {
      download.file(url_zip, path_unzip, mode = "wb")
    }
    unzip(path_unzip, exdir = "unzipped/.")
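The three steps (create a directory, download only if needed, unzip) can be wrapped into one small function. This is a sketch, not part of any package: `fetch_archive` and its argument names are hypothetical.

```r
# Hypothetical helper: download an archive only when it is not already
# on disk, then unzip it. Names and defaults are illustrative assumptions.
fetch_archive <- function(url, zip_path, exdir = dirname(zip_path)) {
  # make sure both target directories exist
  for (d in unique(c(dirname(zip_path), exdir))) {
    if (!dir.exists(d)) dir.create(d, recursive = TRUE)
  }
  if (!file.exists(zip_path)) {
    # mode = "wb" keeps binary files intact on Windows
    download.file(url, destfile = zip_path, mode = "wb")
  }
  # unzip() returns the paths of the extracted files
  unzip(zip_path, exdir = exdir)
}

# Usage (needs network access), with `url_zip` defined as in the post:
# files <- fetch_archive(url_zip, "unzipped/data_archive.zip", "unzipped")
```

Keeping the check inside the function means that rerunning a script does not download the archive again.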
Finally, let's import and visualize the downloaded and unzipped data.
    murder <- read_xls(
      "unzipped/Web Data 2010-2011/Seven Major Felony Offenses 2000 - 2011.xls",
      sheet = 1, range = "A5:M13"
    ) %>%
      filter(OFFENSE %>% substr(1, 6) == "MURDER") %>%
      gather("year", "value", 2:13) %>%
      mutate(year = year %>% as.numeric())

    murder %>%
      ggplot(aes(year, value)) +
      geom_point() +
      stat_smooth(method = "lm") +
      theme_minimal(base_family = "mono") +
      labs(title = "Murders in New York")
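The gather() step above turns one column per year into year/value pairs. For readers who have not met it, here is a base-R sketch of the same reshaping on a tiny made-up table. The numbers are invented for illustration and are not taken from the NYPD file.

```r
# Made-up wide table: one row per offense, one column per year
wide <- data.frame(
  OFFENSE = c("MURDER", "ROBBERY"),
  `2000` = c(10, 200),
  `2001` = c(12, 180),
  check.names = FALSE
)

year_cols <- names(wide)[-1]

# Stack the year columns into long format: one row per offense and year
long <- data.frame(
  OFFENSE = rep(wide$OFFENSE, times = length(year_cols)),
  year    = as.numeric(rep(year_cols, each = nrow(wide))),
  value   = unlist(wide[year_cols], use.names = FALSE)
)
```

The long layout is what ggplot2 expects when year goes on one axis and the value on the other.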

Figshare
In academia, the reproducibility of results is an increasingly pressing issue, so openly published datasets are becoming faithful companions of scientific papers. There are many specialized repositories for such datasets, and Figshare is one of the most widely used. It has an R wrapper, rfigshare. As an example, let's download my own dataset on the height of ice hockey players, which I collected for one of my earlier posts on Habr. The first time you use the rfigshare package, it asks for your Figshare login and password to access the API; a dedicated web page opens in the browser.
There is a dedicated search function, fs_search, but in my experience it is easier to find the dataset of interest in the browser and copy its unique identifier for downloading into R. The fs_download function turns the id into a direct link for downloading the file.
    library(tidyverse)
    library(rfigshare)

    url <- fs_download(article_id = "3394735")
    hockey <- read_csv(url)

    hockey %>%
      ggplot(aes(x = year, y = height)) +
      geom_jitter(size = 2, color = "#35978f", alpha = .1, width = .25) +
      stat_smooth(method = "lm", size = 1) +
      ylab("height, cm") +
      xlab("year of competition") +
      scale_x_continuous(breaks = seq(2005, 2015, 5), labels = seq(2005, 2015, 5)) +
      theme_minimal(base_family = "mono")

Eurostat
Eurostat publishes an enormous amount of statistics on European countries and their regions in the public domain. Naturally, there is a dedicated eurostat package for tapping into all this wealth. Unfortunately, the built-in search_eurostat function is poor at finding relevant datasets. For example, a search for life expectancy returns only two results, although there are in fact dozens of such datasets. The most convenient approach is therefore to go to the Eurostat website, find the dataset of interest, copy only its code, and then download it with the eurostat package. Note that all Eurostat regional statistics are kept in a separate database.
Let's download the data on life expectancy in European countries; the dataset code is demo_mlexpec.
    library(tidyverse)
    library(lubridate)
    library(eurostat)

    # download the selected dataset
    e0 <- get_eurostat("demo_mlexpec")
Depending on the size of the dataset, the download can take a while. This example is a medium-sized dataset of about 400K observations. If for some reason the automatic download fails (this has happened before), the dataset can be downloaded manually from the separate Eurostat Bulk Download Service website.
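Because slow downloads are painful to repeat, one option is to cache the result on disk after the first run. The helper below is a generic base-R sketch; `cache_rds` is a hypothetical name, and the stand-in "download" returns made-up numbers so that the example runs offline.

```r
# Hypothetical helper: return the cached object if the file exists,
# otherwise call `fetch()`, save its result and return it.
cache_rds <- function(path, fetch) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- fetch()
  saveRDS(result, path)
  result
}

# Offline demonstration with a stand-in for a slow download
calls <- 0
slow_fetch <- function() {
  calls <<- calls + 1
  # made-up values, for illustration only
  data.frame(geo = c("DE", "FR"), values = c(18.1, 19.4))
}

cache_file <- tempfile(fileext = ".rds")
first  <- cache_rds(cache_file, slow_fetch)  # calls slow_fetch()
second <- cache_rds(cache_file, slow_fetch)  # reads the cached file instead
```

In a real script the second argument could be, for example, `function() get_eurostat("demo_mlexpec")`.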
Let's look at remaining life expectancy at age 65 in a few selected European countries, separately for men and women. Age 65 is the most common conventional retirement age, and the trend in remaining life expectancy at this age is particularly interesting in light of the debates on pension reform. From the downloaded data we keep only the estimates of life expectancy at age 65, keep only the separate estimates for men and women (dropping the totals), and finally keep just a few countries: Germany, France, Italy, Russia, Spain and the United Kingdom.
    e0 %>%
      filter(
        sex != "T",
        age == "Y65",
        geo %in% c("DE", "FR", "IT", "RU", "ES", "UK")
      ) %>%
      ggplot(aes(x = time %>% year(), y = values, color = sex)) +
      geom_path() +
      facet_wrap(~ geo, ncol = 3) +
      labs(y = "Life expectancy at age 65", x = NULL) +
      theme_minimal(base_family = "mono")

World Bank
Several packages provide access to World Bank data from R, but probably the best of them is the fairly recent wbstats. Its wbsearch function is very good at finding relevant datasets. For example, wbsearch("fertility") returns a list of 339 indicators with their descriptions in a convenient table.
    library(tidyverse)
    library(wbstats)

    # search for a dataset of interest
    wbsearch("fertility") %>% head
|      | indicatorID    | indicator |
|------|----------------|-----------|
| 2479 | SP.DYN.WFRT.Q5 | Total wanted fertility rate (births per woman): Q5 (highest) |
| 2480 | SP.DYN.WFRT.Q4 | Total wanted fertility rate (births per woman): Q4 |
| 2481 | SP.DYN.WFRT.Q3 | Total wanted fertility rate (births per woman): Q3 |
| 2482 | SP.DYN.WFRT.Q2 | Total wanted fertility rate (births per woman): Q2 |
| 2483 | SP.DYN.WFRT.Q1 | Total wanted fertility rate (births per woman): Q1 (lowest) |
| 2484 | SP.DYN.WFRT    | Wanted fertility rate (births per woman) |
Let's look at the indicator Lifetime risk of maternal death (%) (code SH.MMR.RISK.ZS), the lifetime risk that a woman dies from causes related to childbirth. Along with country data, the World Bank provides several different options for grouping countries. One notable grouping is by the stage of the demographic transition. Below, the selected indicator is shown for (1) countries that have already completed the demographic transition, (2) countries that have not yet completed it, and (3) the whole world.
    # fetch the selected dataset
    df_wb <- wb(indicator = "SH.MMR.RISK.ZS", startdate = 2000, enddate = 2015)

    # have a look at the data for one year
    df_wb %>% filter(date == 2015) %>% View

    df_wb %>%
      filter(iso2c %in% c("V4", "V1", "1W")) %>%
      ggplot(aes(x = date %>% as.numeric(), y = value, color = country)) +
      geom_path(size = 1) +
      scale_color_brewer(NULL, palette = "Dark2") +
      labs(x = NULL, y = NULL, title = "Lifetime risk of maternal death (%)") +
      theme_minimal(base_family = "mono") +
      theme(panel.grid.minor = element_blank(), legend.position = c(.8, .9))

OECD
The Organisation for Economic Co-operation and Development (OECD) publishes a great deal of data on the economic and demographic development of its member countries. The OECD package greatly simplifies working with this data in R: the search_dataset function helps find the required data by keyword, and get_dataset loads the selected dataset. In the following example we download data on the duration of unemployment and display it as a heatmap for the male population of EU16, EU28 and the United States.
    library(tidyverse)
    library(viridis)
    library(OECD)

    # search by keyword
    search_dataset("unemployment") %>% View

    # download the selected dataset
    df_oecd <- get_dataset("AVD_DUR")

    # turn variable names to lowercase
    names(df_oecd) <- names(df_oecd) %>% tolower()

    df_oecd %>%
      filter(country %in% c("EU16", "EU28", "USA"), sex == "MEN", age != "1524") %>%
      ggplot(aes(obstime, age, fill = obsvalue)) +
      geom_tile() +
      scale_fill_viridis("Months", option = "B") +
      scale_x_discrete(breaks = seq(1970, 2015, 5) %>% paste) +
      facet_wrap(~ country, ncol = 1) +
      labs(x = NULL, y = "Age groups",
           title = "Average duration of unemployment in months, males") +
      theme_minimal(base_family = "mono")

WID
The World Wealth and Income Database provides harmonized time series on income inequality and wealth. The database developers have taken care to create a dedicated R package, which so far is available only on GitHub.
    library(tidyverse)

    # install.packages("devtools")
    devtools::install_github("WIDworld/wid-r-tool")
    library(wid)
The function for downloading data is download_wid(). To specify its arguments correctly, you need to dig a little into the package documentation and get to grips with the rather complex system of variable codes.
    ?wid_series_type
    ?wid_concepts
The example below is adapted from the package vignette. It shows the share of wealth owned by the richest 1% and 10% of the population in France and the United Kingdom.
    df_wid <- download_wid(
      indicators = "shweal",           # shares of personal wealth
      areas = c("FR", "GB"),           # in France and Great Britain
      perc = c("p90p100", "p99p100")   # top 10% and top 1%
    )

    df_wid %>%
      ggplot(aes(x = year, y = value, color = country)) +
      geom_path() +
      labs(title = "Top 1% and top 10% personal wealth shares in France and Great Britain",
           y = "top share") +
      facet_wrap(~ percentile) +
      theme_minimal(base_family = "mono")

Human Mortality Database
If you want to examine big questions about the regularities of human population dynamics, there is no more reliable source than the Human Mortality Database. It is designed and maintained by demographers who use state-of-the-art methodology to harmonize the source data. The HMD Methods Protocol is a masterpiece of demographic data methodology. The downside is that source data of sufficiently high quality are available only for a relatively small number of countries. To get acquainted with the available data, I heartily recommend the Human Mortality Database Explorer created by Jonas Schöley.
Thanks to Tim Riffe, there is the HMDHFDplus package, which loads HMD data directly into R with a few lines of code. Accessing the data requires a free account at mortality.org. As the package name suggests, it can also download data from the equally beautiful Human Fertility Database.
The example below is taken from an earlier post and slightly updated. I think it shows the full power of automated data loading into R quite well: we simply download the one-year age structures, separately for women and men, for all HMD countries and all available years. Keep in mind that running this code downloads several dozen megabytes of data. Then we calculate and plot the sex ratio at every age for every country in 2012. The sex ratio reflects two major demographic regularities: (1) more boys than girls are always born, and (2) male mortality is higher than female mortality at all ages. With a few well-studied exceptions, the sex ratio at birth is 105 to 106 boys per 100 girls. Substantial differences in the age profiles of the sex ratio therefore mainly reflect sex differences in mortality.
    # load required packages
    library(HMDHFDplus)
    library(tidyverse)
    library(purrr)

    # ik_user_hmd and ik_pass_hmd must hold your mortality.org login and password

    # help function to list the available countries
    country <- getHMDcountries()

    # remove optional populations
    opt_pop <- c("FRACNP", "DEUTE", "DEUTW", "GBRCENW", "GBR_NP")
    country <- country[!country %in% opt_pop]

    # temporary function to download HMD data for a single country (dot = input)
    tempf_get_hmd <- . %>% readHMDweb("Exposures_1x1", ik_user_hmd, ik_pass_hmd)

    # download the data iteratively for all countries using purrr::map()
    exposures <- country %>% map(tempf_get_hmd)

    # data transformation to apply to each country dataframe
    tempf_trans_data <- . %>%
      select(Year, Age, Female, Male) %>%
      filter(Year %in% 2012) %>%
      select(-Year) %>%
      transmute(age = Age, ratio = Male / Female * 100)

    # perform transformation
    df_hmd <- exposures %>%
      map(tempf_trans_data) %>%
      bind_rows(.id = "country")

    # summarize all ages older than 90 (too jerky)
    df_hmd_90 <- df_hmd %>%
      filter(age %in% 90:110) %>%
      group_by(country) %>%
      summarise(ratio = ratio %>% mean(na.rm = T)) %>%
      ungroup() %>%
      transmute(country, age = 90, ratio)

    # insert summarized 90+
    df_hmd_fin <- bind_rows(df_hmd %>% filter(!age %in% 90:110), df_hmd_90)

    # finally, plot
    df_hmd_fin %>%
      ggplot(aes(age, ratio, color = country, group = country)) +
      geom_hline(yintercept = 100, color = "grey50", size = 1) +
      geom_line(size = 1) +
      scale_y_continuous(limits = c(0, 120), expand = c(0, 0), breaks = seq(0, 120, 20)) +
      scale_x_continuous(limits = c(0, 90), expand = c(0, 0), breaks = seq(0, 80, 20)) +
      facet_wrap(~country, ncol = 6) +
      theme_minimal(base_family = "mono", base_size = 15) +
      theme(legend.position = "none",
            panel.border = element_rect(size = .5, fill = NA, color = "grey50")) +
      labs(x = "Age",
           y = "Sex ratio, males per 100 females",
           title = "Sex ratio in all countries from Human Mortality Database",
           subtitle = "HMD 2012, via HMDHFDplus by @timriffe1",
           caption = "ikashnitsky.imtqy.com")
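The core of the transformation above is small: compute males per 100 females at each age, then replace the noisy ages 90 and over with one averaged value. Here is a base-R sketch on made-up exposures (the numbers are invented, not HMD data), which runs without an HMD account.

```r
# Made-up exposures for illustration only (not HMD data)
toy <- data.frame(
  Age    = c(0, 50, 90, 100),
  Female = c(1000, 950, 200, 10),
  Male   = c(1050, 900, 100, 2)
)

# Sex ratio: males per 100 females
toy$ratio <- toy$Male / toy$Female * 100

# Collapse ages 90+ into a single averaged value stored at age 90
old <- toy$Age >= 90
sex_ratio <- rbind(
  toy[!old, c("Age", "ratio")],
  data.frame(Age = 90, ratio = mean(toy$ratio[old], na.rm = TRUE))
)
```

Note that, as in the post, the 90+ value is the plain mean of the age-specific ratios rather than a ratio of summed exposures.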

United Nations World Population Prospects
The United Nations Population Division publishes high-quality estimates and population projections for all countries of the world. The calculations are updated every two to three years and published as the interactive World Population Prospects reports. These reports contain fairly general descriptive analysis and, of course, a wealth of data. In R the data are available in dedicated packages with names like wpp20xx. So far, data from the 2008, 2010, 2012, 2015 and 2017 revisions are available. Here is an example that uses wpp2015 data, taken from a previous post.
Using the ridgeline plot, a visualization type that became popular this summer thanks to Claus Wilke's wonderful ggridges package (formerly known under the now-abandoned name ggjoy), I show the impressive global convergence of male life expectancy at birth since 1950.
    library(wpp2015)
    library(tidyverse)
    library(ggridges)
    library(viridis)

    # get the UN country names
    data(UNlocations)
    countries <- UNlocations %>% pull(name) %>% paste

    # data on male life expectancy at birth
    data(e0M)

    e0M %>%
      filter(country %in% countries) %>%
      select(-last.observed) %>%
      gather(period, value, 3:15) %>%
      ggplot(aes(x = value, y = period %>% fct_rev())) +
      geom_density_ridges(aes(fill = period)) +
      scale_fill_viridis(discrete = T, option = "B", direction = -1, begin = .1, end = .9) +
      labs(x = "Male life expectancy at birth",
           y = "Period",
           title = "Global convergence in male life expectancy at birth since 1950",
           subtitle = "UNPD World Population Prospects 2015 Revision, via wpp2015",
           caption = "ikashnitsky.imtqy.com") +
      theme_minimal(base_family = "mono") +
      theme(legend.position = "none")

European Social Survey (ESS)
The European Social Survey publishes unique, detailed data on the values of Europeans that are nationally representative and comparable across countries. A new round of the survey is carried out in each participating country every two years. The data are available after free registration. The datasets are distributed in formats ready for analysis in SAS, SPSS or Stata. Thanks to Jorge Cimentada, the data can now be fetched into R, ready to use, with the ess package. Below I show how respondents rate their level of trust in the police in all countries that took part in the latest wave of the survey.
    library(ess)
    library(tidyverse)

    # help function to see the available countries
    show_countries()

    # check the available rounds for a selected country
    show_country_rounds("Netherlands")

    # get the full dataset of the last (8) round
    # (ik_email must hold the email registered with the ESS)
    df_ess <- ess_rounds(8, your_email = ik_email)

    # select a variable and calculate mean value
    df_ess_select <- df_ess %>%
      bind_rows() %>%
      select(idno, cntry, trstplc) %>%
      group_by(cntry) %>%
      mutate(avg = trstplc %>% mean(na.rm = T)) %>%
      ungroup() %>%
      mutate(cntry = cntry %>% as_factor() %>% fct_reorder(avg))

    df_ess_select %>%
      ggplot(aes(trstplc, fill = avg)) +
      geom_histogram() +
      scale_x_continuous(limits = c(0, 11), breaks = seq(2, 10, 2)) +
      scale_fill_gradient("Average\ntrust\nscore", low = "black", high = "aquamarine") +
      facet_wrap(~cntry, ncol = 6) +
      theme_minimal(base_family = "mono") +
      labs(x = "Trust score [0 -- 10]",
           y = "# of respondents",
           title = "Trust in police",
           subtitle = "ESS wave 8 2017, via ess by @cimentadaj",
           caption = "ikashnitsky.imtqy.com")
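The group_by() and mutate() step above attaches each country's mean score to every respondent row, and the factor is then reordered by that mean so that the facets appear sorted. A base-R sketch of the same idea on made-up answers (country codes and scores are invented):

```r
# Made-up survey answers, for illustration only
toy <- data.frame(
  cntry   = c("AA", "AA", "BB", "BB", "BB"),
  trstplc = c(8, 9, 3, NA, 5)
)

# Attach the country mean to every row, ignoring missing answers
toy$avg <- ave(toy$trstplc, toy$cntry,
               FUN = function(x) mean(x, na.rm = TRUE))

# Order the country factor by its mean score (lowest first)
toy$cntry <- reorder(factor(toy$cntry), toy$avg)
```

Here "BB" ends up before "AA" in the factor levels because its mean score is lower.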

American Community Survey and Census
Several packages provide access to data from the US Census and the regular American Community Survey. Perhaps the most elegant implementation is tidycensus, recently created by Kyle Walker. The killer feature of this package is the ability to download spatial data along with the statistics. The spatial data are downloaded as simple features, a revolutionary approach to geodata in R that was recently introduced in the sf package by Edzer Pebesma. This approach speeds up the processing of spatial data many times over and greatly simplifies the code. Details will follow in the final post of the series. Note that at the time of writing, drawing simple features required the development version of the ggplot2 package.
The map below shows the median age of the population in the census tracts of Chicago according to the 2015 ACS data. To extract these data with tidycensus, you first need to obtain an API key, which is easy to do by registering at https://api.census.gov/data/key_signup.html.
    library(tidycensus)
    library(tidyverse)
    library(viridis)
    library(janitor)
    library(sf)

    # geom_sf is included in ggplot2 >= 3.0.0; older installations needed
    # the development version:
    # devtools::install_github("tidyverse/ggplot2", "develop")
    library(ggplot2)

    # you need a personal API key, available free at
    # https://api.census.gov/data/key_signup.html
    # normally, this key is to be stored in .Renviron (here: ik_api_acs)

    # see state and county codes and names
    fips_codes %>% View

    # the available variables
    load_variables(year = 2015, dataset = "acs5") %>% View

    # data on median age of population in Chicago
    df_acs <- get_acs(
      geography = "tract",
      county = "Cook County",
      state = "IL",
      variables = "B01002_001E",
      year = 2015,
      key = ik_api_acs,
      geometry = TRUE
    ) %>%
      clean_names()

    # map the data
    df_acs %>%
      ggplot() +
      geom_sf(aes(fill = estimate %>% cut(breaks = seq(20, 60, 10))), color = NA) +
      scale_fill_viridis_d("Median age", begin = .4) +
      coord_sf(datum = NA) +
      theme_void(base_family = "mono") +
      theme(legend.position = c(.15, .15)) +
      labs(title = "Median age of population in Chicago\nby census tracts\n",
           subtitle = "ACS 2015, via tidycensus by @kyle_e_walker",
           caption = "ikashnitsky.imtqy.com",
           x = NULL, y = NULL)
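Several examples in this post use credentials such as ik_api_acs without showing where they come from; the code comment above suggests keeping them in .Renviron. A minimal sketch of that pattern follows. The helper `get_secret` and the variable name `MY_ACS_KEY` are hypothetical, and the dummy value is set only so that the example runs.

```r
# Hypothetical helper: read a secret from an environment variable and
# fail loudly when it is missing. The names used here are assumptions.
get_secret <- function(name) {
  value <- Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable ", name, " is not set; add it to ~/.Renviron")
  }
  value
}

# Demonstration with a dummy value set for this session only;
# in practice the line MY_ACS_KEY=... would live in ~/.Renviron
Sys.setenv(MY_ACS_KEY = "dummy-key")
ik_api_acs <- get_secret("MY_ACS_KEY")
```

Keeping keys out of the script makes it safe to share the code, which supports the reproducibility goal of the post.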

Enjoy your journey into the world of open data with R!