Fetching data from GHO
[1]:
from ghoclient import GHOSession, index
import pandas as pd
%pylab inline
Populating the interactive namespace from numpy and matplotlib
[2]:
GC = GHOSession()
Listing indicator codes
[4]:
codes = GC.get_data_codes(format='dataframe')
print(f"Number of available indicators: {len(codes)}")
codes.head()
Number of available indicators: 3466
[4]:
| | @Label | @DisplaySequence | @URL | Attr | Display |
|---|---|---|---|---|---|
| 0 | MDG_0000000001 | 2 | https://www.who.int/data/gho/indicator-metadat... | [{'@Category': 'DISPLAY_FR', 'Value': {'Displa... | Infant mortality rate (probability of dying be... |
| 1 | MDG_0000000003 | 5 | https://www.who.int/data/gho/indicator-metadat... | [{'@Category': 'DEFINITION_XML', 'Value': {'Di... | Adolescent birth rate (per 1000 women aged 15-... |
| 2 | MDG_0000000005 | 10 | https://www.who.int/data/gho/indicator-metadat... | [{'@Category': 'CATEGORY', 'Value': {'Display'... | Contraceptive prevalence (%) |
| 3 | MDG_0000000006 | 5 | https://www.who.int/data/gho/indicator-metadat... | [{'@Category': 'CATEGORY', 'Value': {'Display'... | Unmet need for family planning (%) |
| 4 | MDG_0000000007 | 5 | https://www.who.int/data/gho/indicator-metadat... | [{'@Category': 'DISPLAY_FR', 'Value': {'Displa... | Under-five mortality rate (probability of dyin... |
Searching by keyword
The indicator codes are not mnemonic, so it helps to search their descriptions by keyword. For example, we can look up all codes related to tuberculosis.
[15]:
index.build_index(codes)
results = index.search('tuberculosis')
results = pd.DataFrame(results)
results
[15]:
| | code | description |
|---|---|---|
| 0 | TB_1 | Tuberculosis treatment coverage |
| 1 | UHC_TB_DT | Tuberculosis detection and treatment |
| 2 | WHS3_522 | Number of reported cases of tuberculosis |
| 3 | TB_e_prev_num | Number of prevalent tuberculosis cases |
| 4 | TB_e_inc_num | Number of incident tuberculosis cases |
| 5 | TB_tot_newrel | Tuberculosis - new and relapse cases |
| 6 | TB_newinc | Tuberculosis - new and relapse cases |
| 7 | TB_c_newinc | Tuberculosis - new and relapse cases |
| 8 | TB_effective_treatment_coverage | Tuberculosis effective treatment coverage (%) |
| 9 | MDG_0000000022 | Tuberculosis detection rate under DOTS (%) |
| 10 | MDG_0000000024 | Tuberculosis treatment success under DOTS (%) |
| 11 | WHS3_54 | Number of reported cases of tuberculosis (DOTS) |
| 12 | MDG_0000000023 | Prevalence of tuberculosis (per 100 000 popula... |
| 13 | MDG_0000000030 | Smear-positive tuberculosis case-detection rat... |
| 14 | MDG_0000000031 | Smear-positive tuberculosis treatment-success ... |
| 15 | TB_e_mort_exc_tbhiv_num | Number of deaths due to tuberculosis, excludin... |
| 16 | TB_8_c_cdr | Case detection rate for all forms of tuberculosis |
| 17 | TB_e_inc_tbhiv_num | Number of incident tuberculosis cases, (HIV-p... |
| 18 | TB_e_inc_num_014 | Number of incident tuberculosis cases in child... |
| 19 | MDG_0000000020 | Incidence of tuberculosis (per 100 000 populat... |
| 20 | TB_e_inc_tbhiv_100k | Incidence of tuberculosis (per 100 000 populat... |
| 21 | TB_c_lab_cul_5m | Laboratories providing tuberculosis diagnostic... |
| 22 | TB_c_new_snep_tsr | Treatment success rate for new pulmonary smear... |
| 23 | MDG_0000000017 | Deaths due to tuberculosis among HIV-negative ... |
| 24 | MDG_0000000018 | Deaths due to tuberculosis among HIV-positive ... |
| 25 | TB_c_lab_sm_100k | Laboratories providing tuberculosis diagnostic... |
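
The internals of `index.search` are not shown in this notebook. As a rough illustration of the same idea, a case-insensitive substring match over the descriptions can be written with plain pandas. This is a minimal sketch, not ghoclient's implementation: the `search_descriptions` helper is hypothetical, and the sample rows are copied from the tables above.

```python
import pandas as pd

# A few (code, description) pairs copied from the tables above.
sample = pd.DataFrame(
    [
        ("TB_1", "Tuberculosis treatment coverage"),
        ("WHS3_522", "Number of reported cases of tuberculosis"),
        ("TB_8_c_cdr", "Case detection rate for all forms of tuberculosis"),
        ("MDG_0000000005", "Contraceptive prevalence (%)"),
    ],
    columns=["code", "description"],
)


def search_descriptions(frame, keyword):
    """Hypothetical helper: rows whose description contains `keyword`, ignoring case."""
    mask = frame["description"].str.contains(keyword, case=False, regex=False)
    return frame[mask].reset_index(drop=True)


hits = search_descriptions(sample, "tuberculosis")
print(hits)
```

A plain substring match returns rows in their original order, so it does not reproduce whatever ordering `index.search` applies to its results.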
Let’s look at “Incidence of tuberculosis (per 100 000 population per year)”, code MDG_0000000020, restricted to the WHO African Region (AFR):
[14]:
data = GC.fetch_data_from_codes(code='MDG_0000000020')
data = data[(data.REGION=='AFR')]
data
[14]:
| | GHO | PUBLISHSTATE | YEAR | REGION | WORLDBANKINCOMEGROUP | COUNTRY | Display Value | Numeric | Low | High | Comments |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | MDG_0000000020 | PUBLISHED | 2006 | AFR | NaN | NaN | 356 [312-404] | 356.0 | 312.0 | 404.0 | NaN |
| 16 | MDG_0000000020 | PUBLISHED | 2013 | AFR | NaN | NaN | 291 [258-326] | 291.0 | 258.0 | 326.0 | NaN |
| 17 | MDG_0000000020 | PUBLISHED | 2015 | AFR | NaN | NaN | 270 [240-302] | 270.0 | 240.0 | 302.0 | NaN |
| 40 | MDG_0000000020 | PUBLISHED | 2004 | AFR | NaN | AGO | 350 [227-500] | 350.0 | 227.0 | 500.0 | NaN |
| 41 | MDG_0000000020 | PUBLISHED | 2013 | AFR | NaN | AGO | 376 [243-537] | 376.0 | 243.0 | 537.0 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4056 | MDG_0000000020 | PUBLISHED | 2018 | AFR | NaN | UGA | 200 [118-304] | 200.0 | 118.0 | 304.0 | NaN |
| 4073 | MDG_0000000020 | PUBLISHED | 2013 | AFR | NaN | ZAF | 1110 [770-1500] | 1110.0 | 770.0 | 1500.0 | NaN |
| 4074 | MDG_0000000020 | PUBLISHED | 2016 | AFR | NaN | ZAF | 805 [561-1090] | 805.0 | 561.0 | 1090.0 | NaN |
| 4075 | MDG_0000000020 | PUBLISHED | 2013 | AFR | NaN | ZMB | 437 [283-625] | 437.0 | 283.0 | 625.0 | NaN |
| 4076 | MDG_0000000020 | PUBLISHED | 2010 | AFR | NaN | ZWE | 416 [324-518] | 416.0 | 324.0 | 518.0 | NaN |
949 rows × 11 columns
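The filtered frame mixes two kinds of rows: regional aggregates, where COUNTRY is NaN, and per-country estimates. A sketch of separating the two, using a handful of rows copied from the table above and only the columns needed here (the variable names are illustrative):

```python
import numpy as np
import pandas as pd

# Rows copied from the output above: (YEAR, REGION, COUNTRY, Numeric, Low, High).
rows = [
    (2006, "AFR", np.nan, 356.0, 312.0, 404.0),
    (2013, "AFR", np.nan, 291.0, 258.0, 326.0),
    (2015, "AFR", np.nan, 270.0, 240.0, 302.0),
    (2004, "AFR", "AGO", 350.0, 227.0, 500.0),
    (2013, "AFR", "AGO", 376.0, 243.0, 537.0),
    (2013, "AFR", "ZAF", 1110.0, 770.0, 1500.0),
    (2016, "AFR", "ZAF", 805.0, 561.0, 1090.0),
]
tb = pd.DataFrame(rows, columns=["YEAR", "REGION", "COUNTRY", "Numeric", "Low", "High"])

# Regional aggregates carry no country code.
regional = tb[tb["COUNTRY"].isna()].sort_values("YEAR")
by_country = tb[tb["COUNTRY"].notna()]

# One column per country, indexed by year, for side-by-side comparison.
wide = by_country.pivot(index="YEAR", columns="COUNTRY", values="Numeric")

print(regional[["YEAR", "Numeric", "Low", "High"]])
print(wide)
```

Years that are missing for a country show up as NaN in the pivoted frame, which makes gaps in coverage easy to spot.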
Now let’s find indicators related to water:
[17]:
water_codes = index.search('water')
pd.DataFrame(water_codes)
[17]:
| | code | description |
|---|---|---|
| 0 | WAS_0000000001 | Access to improved drinking water sources |
| 1 | EQ_HANDWASHING | Households with soap and water at a handwashin... |
| 2 | WSH_10_WAT | Number of diarrhoea deaths from inadequate water |
| 3 | WSH_20_WAT | Attributable fraction of diarrhoea to inadequa... |
| 4 | WSH_30_WAT | Number of diarrhoea DALYs from inadequate water |
| 5 | RADON_Q405 | Radon in national drinking-water regulations |
| 6 | WHS5_122 | Population using improved drinking-water sourc... |
| 7 | WSH_5 | Water, sanitation and hygiene attributable DAL... |
| 8 | EQ_WATER | Population using improved drinking-water sourc... |
| 9 | EQ_WATERIMPROVED | Households using an improved drinking-water so... |
| 10 | EQ_WATERPIPED | Households using a piped drinking-water source... |
| 11 | WSH_09 | Water, sanitation and hygiene - population att... |
| 12 | WSH_WATER_SAFELY_MANAGED | Population using safely managed drinking-water... |
| 13 | WSH_WATER_BASIC | Population using at least basic drinking-water... |
| 14 | WSH_10 | Number of diarrhoea deaths from inadequate wat... |
| 15 | WSH_40_WAT | Diarrhoea deaths from inadequate water in chil... |
| 16 | WSH_50_WAT | Diarrhoea DALYs from inadequate water in child... |
| 17 | WSH_20 | Attributable fraction of diarrhoea to inadequa... |
| 18 | WSH_30 | Number of diarrhoea DALYs from inadequate wate... |
| 19 | NLIS_NU_CA_017 | NLIS: Population using improved drinking-water... |
| 20 | EQ_WATERPREMISES | Households using a piped onto premises drinkin... |
| 21 | WSH_2 | Water, sanitation and hygiene attributable dea... |
| 22 | WSH_3 | Water, sanitation and hygiene attributable dea... |
| 23 | WSH_6 | Water, sanitation and hygiene attributable DAL... |
| 24 | WSH_7 | Water, sanitation and hygiene attributable DAL... |
| 25 | WSH_40 | Diarrhoea deaths from inadequate water, sanita... |
| 26 | WSH_50 | Diarrhoea DALYs from inadequate water, sanitat... |
| 27 | WSH_4 | Water, sanitation and hygiene attributable de... |
| 28 | WSH_8 | Water, sanitation and hygiene attributable DA... |
| 29 | SDGODAWS | Amount of water- and sanitation-related offici... |
Not every indicator has recent data for every country, so let’s check, for each water-related code, the latest year available in the African region (AFR) and whether it appears to cover all countries.
[24]:
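for c in water_codes:
    print(f"Checking {c['code']}: {c['description']}")
    data = GC.fetch_data_from_codes(code=c['code'])
    try:
        # Keep only AFR rows from the latest year present in the dataset
        data = data[(data.REGION=='AFR')&(data.YEAR==data.YEAR.max())]
        print(f"\tLatest year available for {c['code']} in Africa: {data.YEAR.max()}")
    except AttributeError as e:
        # Some codes return a frame without a REGION (or YEAR) column
        print("\tno data available:\n\t ", e)
        continue
    # Row-count heuristic: at least 54 rows is taken as full regional coverage
    if len(data) >=54:
        print(f"\tCode available on all countries for {data.YEAR.max()}")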
Checking WAS_0000000001: Access to improved drinking water sources
no data available:
'DataFrame' object has no attribute 'REGION'
Checking EQ_HANDWASHING: Households with soap and water at a handwashing facility (%)
no data available:
'DataFrame' object has no attribute 'REGION'
Checking WSH_10_WAT: Number of diarrhoea deaths from inadequate water
Latest year available for WSH_10_WAT in Africa: 2016
Code available on all countries for 2016
Checking WSH_20_WAT: Attributable fraction of diarrhoea to inadequate water
Latest year available for WSH_20_WAT in Africa: 2016
Checking WSH_30_WAT: Number of diarrhoea DALYs from inadequate water
Latest year available for WSH_30_WAT in Africa: 2016
Code available on all countries for 2016
Checking RADON_Q405: Radon in national drinking-water regulations
Latest year available for RADON_Q405 in Africa: 2019
Checking WHS5_122: Population using improved drinking-water sources (%)
no data available:
'DataFrame' object has no attribute 'REGION'
Checking WSH_5: Water, sanitation and hygiene attributable DALYs ('000)
Latest year available for WSH_5 in Africa: 2004
Checking EQ_WATER: Population using improved drinking-water sources (%)
no data available:
'DataFrame' object has no attribute 'REGION'
Checking EQ_WATERIMPROVED: Households using an improved drinking-water source (%)
no data available:
'DataFrame' object has no attribute 'REGION'
Checking EQ_WATERPIPED: Households using a piped drinking-water source (%)
no data available:
'DataFrame' object has no attribute 'REGION'
Checking WSH_09: Water, sanitation and hygiene - population attributable fractions
no data available:
'DataFrame' object has no attribute 'REGION'
Checking WSH_WATER_SAFELY_MANAGED: Population using safely managed drinking-water services (%)
Latest year available for WSH_WATER_SAFELY_MANAGED in Africa: 2017
Checking WSH_WATER_BASIC: Population using at least basic drinking-water services (%)
Latest year available for WSH_WATER_BASIC in Africa: 2017
Code available on all countries for 2017
Checking WSH_10: Number of diarrhoea deaths from inadequate water, sanitation and hygiene
Latest year available for WSH_10 in Africa: 2016
Code available on all countries for 2016
Checking WSH_40_WAT: Diarrhoea deaths from inadequate water in children under 5 years
no data available:
'DataFrame' object has no attribute 'REGION'
Checking WSH_50_WAT: Diarrhoea DALYs from inadequate water in children under 5 years
no data available:
'DataFrame' object has no attribute 'REGION'
Checking WSH_20: Attributable fraction of diarrhoea to inadequate water, sanitation and hygiene
Latest year available for WSH_20 in Africa: 2016
Checking WSH_30: Number of diarrhoea DALYs from inadequate water, sanitation and hygiene
Latest year available for WSH_30 in Africa: 2016
Code available on all countries for 2016
Checking NLIS_NU_CA_017: NLIS: Population using improved drinking-water sources (%)
Latest year available for NLIS_NU_CA_017 in Africa: 2017
Checking EQ_WATERPREMISES: Households using a piped onto premises drinking-water source (%)
no data available:
'DataFrame' object has no attribute 'REGION'
Checking WSH_2: Water, sanitation and hygiene attributable deaths ('000) in children under 5 years
Latest year available for WSH_2 in Africa: 2004
Checking WSH_3: Water, sanitation and hygiene attributable deaths per 100'000 capita
Latest year available for WSH_3 in Africa: 2004
Checking WSH_6: Water, sanitation and hygiene attributable DALYs ('000) in children under 5 years
Latest year available for WSH_6 in Africa: 2004
Checking WSH_7: Water, sanitation and hygiene attributable DALYs per 100'000 capita
Latest year available for WSH_7 in Africa: 2004
Checking WSH_40: Diarrhoea deaths from inadequate water, sanitation and hygiene in children under 5 years
no data available:
'DataFrame' object has no attribute 'REGION'
Checking WSH_50: Diarrhoea DALYs from inadequate water, sanitation and hygiene in children under 5 years
no data available:
'DataFrame' object has no attribute 'REGION'
Checking WSH_4: Water, sanitation and hygiene attributable deaths per 100'000 children under 5 years
Latest year available for WSH_4 in Africa: 2004
Checking WSH_8: Water, sanitation and hygiene attributable DALYs per 100'000 children under 5 years
Latest year available for WSH_8 in Africa: 2004
Checking SDGODAWS: Amount of water- and sanitation-related official development assistance that is part of a government-coordinated spending plan (current US$ millions)
Latest year available for SDGODAWS in Africa: nan
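Three details of the loop above are worth noting. Some codes return a frame without a REGION column. The year filter uses the latest year across all regions, which can leave an empty frame for AFR (possibly what happened with SDGODAWS, which reports `nan`). And `len(data)` counts rows, not countries. Below is a sketch of a helper that handles these cases, assuming only the YEAR, REGION and COUNTRY columns seen in the outputs of this notebook. The `latest_for_region` name is hypothetical, and the toy frame contains made-up rows for illustration, not GHO data.

```python
import pandas as pd


def latest_for_region(data, region="AFR"):
    """Return (year, n_countries) for the latest year with data in `region`.

    Returns None when the frame lacks the expected columns or has no
    usable rows for the region.
    """
    if not {"YEAR", "REGION", "COUNTRY"}.issubset(data.columns):
        return None
    in_region = data[data["REGION"] == region]
    if in_region.empty or in_region["YEAR"].isna().all():
        return None
    year = in_region["YEAR"].max()  # latest year within the region itself
    latest = in_region[in_region["YEAR"] == year]
    # nunique() ignores NaN, so regional aggregate rows are not counted.
    return int(year), latest["COUNTRY"].nunique()


# Made-up rows for illustration only (not GHO data).
toy = pd.DataFrame(
    {
        "YEAR": [2017, 2017, 2017, 2016, 2018],
        "REGION": ["AFR", "AFR", "AFR", "AFR", "EUR"],
        "COUNTRY": ["DZA", "DZA", "AGO", "AGO", "FRA"],
    }
)

print(latest_for_region(toy))
print(latest_for_region(pd.DataFrame({"GHO": ["X"]})))
```

In the toy frame the overall latest year (2018) has no AFR rows, so filtering on the global maximum would return nothing, while the helper takes the maximum within the region.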
[23]:
data = GC.fetch_data_from_codes(code='WSH_WATER_BASIC')
data = data[(data.REGION=='AFR')&(data.YEAR==data.YEAR.max())]
data
[23]:
| | GHO | PUBLISHSTATE | YEAR | REGION | COUNTRY | RESIDENCEAREATYPE | Display Value | Numeric | Low | High | Comments |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 159 | WSH_WATER_BASIC | PUBLISHED | 2017 | AFR | DZA | RUR | 89.0 | 88.69096 | NaN | NaN | NaN |
| 160 | WSH_WATER_BASIC | PUBLISHED | 2017 | AFR | DZA | TOTL | 94.0 | 93.55589 | NaN | NaN | NaN |
| 161 | WSH_WATER_BASIC | PUBLISHED | 2017 | AFR | DZA | URB | 95.0 | 95.44293 | NaN | NaN | NaN |
| 267 | WSH_WATER_BASIC | PUBLISHED | 2017 | AFR | AGO | RUR | 27.0 | 27.44429 | NaN | NaN | NaN |
| 268 | WSH_WATER_BASIC | PUBLISHED | 2017 | AFR | AGO | TOTL | 56.0 | 55.84290 | NaN | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 10420 | WSH_WATER_BASIC | PUBLISHED | 2017 | AFR | ZMB | TOTL | 60.0 | 59.96376 | NaN | NaN | NaN |
| 10421 | WSH_WATER_BASIC | PUBLISHED | 2017 | AFR | ZMB | URB | 84.0 | 83.86312 | NaN | NaN | NaN |
| 10473 | WSH_WATER_BASIC | PUBLISHED | 2017 | AFR | ZWE | RUR | 50.0 | 49.80476 | NaN | NaN | NaN |
| 10474 | WSH_WATER_BASIC | PUBLISHED | 2017 | AFR | ZWE | TOTL | 64.0 | 64.05123 | NaN | NaN | NaN |
| 10475 | WSH_WATER_BASIC | PUBLISHED | 2017 | AFR | ZWE | URB | 94.0 | 93.99767 | NaN | NaN | NaN |
141 rows × 11 columns
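Each country appears up to three times here, once per RESIDENCEAREATYPE value (RUR, TOTL and URB in the rows shown), so 141 rows do not correspond to 141 countries. A sketch of reshaping to one row per country, using rows copied from the table above; the `URB_minus_RUR` column is an illustrative addition:

```python
import pandas as pd

# Rows copied from the output above: (COUNTRY, RESIDENCEAREATYPE, Numeric).
rows = [
    ("DZA", "RUR", 88.69096),
    ("DZA", "TOTL", 93.55589),
    ("DZA", "URB", 95.44293),
    ("ZWE", "RUR", 49.80476),
    ("ZWE", "TOTL", 64.05123),
    ("ZWE", "URB", 93.99767),
]
basic = pd.DataFrame(rows, columns=["COUNTRY", "RESIDENCEAREATYPE", "Numeric"])

# One row per country, one column per residence area type.
wide = basic.pivot(index="COUNTRY", columns="RESIDENCEAREATYPE", values="Numeric")

# Gap between urban and rural coverage, in percentage points.
wide["URB_minus_RUR"] = wide["URB"] - wide["RUR"]

print(wide.round(1))
print("countries:", basic["COUNTRY"].nunique())
```

Counting distinct COUNTRY values rather than rows gives a more direct measure of coverage than the row count alone.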