About Dataset¶
This dataset offers a global view of education across countries and regions. It covers out-of-school rates, completion rates, proficiency levels, literacy rates, birth rates, and enrollment in primary and tertiary education, which makes it a useful resource for researchers, educators, and policymakers who want to assess and improve education systems.
Dataset Link -Kaggle link
Data Dictionary¶
- Countries and Areas: Name of the countries and areas.
- Latitude: Latitude coordinates of the geographical location.
- Longitude: Longitude coordinates of the geographical location.
- OOSR_Pre0Primary_Age_Male: Out-of-school rate for pre-primary age males.
- OOSR_Pre0Primary_Age_Female: Out-of-school rate for pre-primary age females.
- OOSR_Primary_Age_Male: Out-of-school rate for primary age males.
- OOSR_Primary_Age_Female: Out-of-school rate for primary age females.
- OOSR_Lower_Secondary_Age_Male: Out-of-school rate for lower secondary age males.
- OOSR_Lower_Secondary_Age_Female: Out-of-school rate for lower secondary age females.
- OOSR_Upper_Secondary_Age_Male: Out-of-school rate for upper secondary age males.
- OOSR_Upper_Secondary_Age_Female: Out-of-school rate for upper secondary age females.
- Completion_Rate_Primary_Male: Completion rate for primary education among males.
- Completion_Rate_Primary_Female: Completion rate for primary education among females.
- Completion_Rate_Lower_Secondary_Male: Completion rate for lower secondary education among males.
- Completion_Rate_Lower_Secondary_Female: Completion rate for lower secondary education among females.
- Completion_Rate_Upper_Secondary_Male: Completion rate for upper secondary education among males.
- Completion_Rate_Upper_Secondary_Female: Completion rate for upper secondary education among females.
- Grade_2_3_Proficiency_Reading: Proficiency in reading for grade 2-3 students.
- Grade_2_3_Proficiency_Math: Proficiency in math for grade 2-3 students.
- Primary_End_Proficiency_Reading: Proficiency in reading at the end of primary education.
- Primary_End_Proficiency_Math: Proficiency in math at the end of primary education.
- Lower_Secondary_End_Proficiency_Reading: Proficiency in reading at the end of lower secondary education.
- Lower_Secondary_End_Proficiency_Math: Proficiency in math at the end of lower secondary education.
- Youth_15_24_Literacy_Rate_Male: Literacy rate among male youths aged 15-24.
- Youth_15_24_Literacy_Rate_Female: Literacy rate among female youths aged 15-24.
- Birth_Rate: Birth rate in the respective countries/areas.
- Gross_Primary_Education_Enrollment: Gross enrollment in primary education.
- Gross_Tertiary_Education_Enrollment: Gross enrollment in tertiary education.
- Unemployment_Rate: Unemployment rate in the respective countries/areas.
Installing dependencies¶
👉 Skip this step if the packages are already installed
1. !pip install numpy
2. !pip install pandas
3. !pip install matplotlib
4. !pip install seaborn
Step 1: Data Preprocessing and Cleaning¶
Importing Required Libraries¶
# Numerical operations
import numpy as np
# Data manipulation
import pandas as pd
#Data Visualization
import matplotlib.pyplot as plt
import seaborn as sns
# Remove warnings
import warnings
warnings.filterwarnings('ignore')
#Load the dataset
education = pd.read_csv(r"C:\Users\Lenovo\Downloads\content\World Education Data Analysis\Global_Education.csv", encoding='ISO-8859-1')
# Print top 5 rows
education.head()
Countries and areas | Latitude | Longitude | OOSR_Pre0Primary_Age_Male | OOSR_Pre0Primary_Age_Female | OOSR_Primary_Age_Male | OOSR_Primary_Age_Female | OOSR_Lower_Secondary_Age_Male | OOSR_Lower_Secondary_Age_Female | OOSR_Upper_Secondary_Age_Male | ... | Primary_End_Proficiency_Reading | Primary_End_Proficiency_Math | Lower_Secondary_End_Proficiency_Reading | Lower_Secondary_End_Proficiency_Math | Youth_15_24_Literacy_Rate_Male | Youth_15_24_Literacy_Rate_Female | Birth_Rate | Gross_Primary_Education_Enrollment | Gross_Tertiary_Education_Enrollment | Unemployment_Rate | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Afghanistan | 33.939110 | 67.709953 | 0 | 0 | 0 | 0 | 0 | 0 | 44 | ... | 13 | 11 | 0 | 0 | 74 | 56 | 32.49 | 104.0 | 9.7 | 11.12 |
1 | Albania | 41.153332 | 20.168331 | 4 | 2 | 6 | 3 | 6 | 1 | 21 | ... | 0 | 0 | 48 | 58 | 99 | 100 | 11.78 | 107.0 | 55.0 | 12.33 |
2 | Algeria | 28.033886 | 1.659626 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 21 | 19 | 98 | 97 | 24.28 | 109.9 | 51.4 | 11.70 |
3 | Andorra | 42.506285 | 1.521801 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 7.20 | 106.4 | 0.0 | 0.00 |
4 | Angola | 11.202692 | 17.873887 | 31 | 39 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 40.73 | 113.5 | 9.3 | 6.89 |
5 rows × 29 columns
education
Countries and areas | Latitude | Longitude | OOSR_Pre0Primary_Age_Male | OOSR_Pre0Primary_Age_Female | OOSR_Primary_Age_Male | OOSR_Primary_Age_Female | OOSR_Lower_Secondary_Age_Male | OOSR_Lower_Secondary_Age_Female | OOSR_Upper_Secondary_Age_Male | ... | Primary_End_Proficiency_Reading | Primary_End_Proficiency_Math | Lower_Secondary_End_Proficiency_Reading | Lower_Secondary_End_Proficiency_Math | Youth_15_24_Literacy_Rate_Male | Youth_15_24_Literacy_Rate_Female | Birth_Rate | Gross_Primary_Education_Enrollment | Gross_Tertiary_Education_Enrollment | Unemployment_Rate | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Afghanistan | 33.939110 | 67.709953 | 0 | 0 | 0 | 0 | 0 | 0 | 44 | ... | 13 | 11 | 0 | 0 | 74 | 56 | 32.49 | 104.0 | 9.7 | 11.12 |
1 | Albania | 41.153332 | 20.168331 | 4 | 2 | 6 | 3 | 6 | 1 | 21 | ... | 0 | 0 | 48 | 58 | 99 | 100 | 11.78 | 107.0 | 55.0 | 12.33 |
2 | Algeria | 28.033886 | 1.659626 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 21 | 19 | 98 | 97 | 24.28 | 109.9 | 51.4 | 11.70 |
3 | Andorra | 42.506285 | 1.521801 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 7.20 | 106.4 | 0.0 | 0.00 |
4 | Angola | 11.202692 | 17.873887 | 31 | 39 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 40.73 | 113.5 | 9.3 | 6.89 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
197 | Venezuela | 6.423750 | 66.589730 | 14 | 14 | 10 | 10 | 15 | 13 | 28 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 17.88 | 97.2 | 79.3 | 8.80 |
198 | Vietnam | 14.058324 | 108.277199 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 55 | 51 | 86 | 81 | 98 | 98 | 16.75 | 110.6 | 28.5 | 2.01 |
199 | Yemen | 15.552727 | 48.516388 | 96 | 96 | 10 | 21 | 23 | 34 | 46 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 30.45 | 93.6 | 10.2 | 12.91 |
200 | Zambia | 13.133897 | 27.849332 | 0 | 0 | 17 | 13 | 0 | 0 | 0 | ... | 0 | 0 | 5 | 2 | 93 | 92 | 36.19 | 98.7 | 4.1 | 11.43 |
201 | Zimbabwe | 19.015438 | 29.154857 | 60 | 58 | 0 | 0 | 0 | 0 | 45 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 30.68 | 109.9 | 10.0 | 4.95 |
202 rows × 29 columns
education.shape
(202, 29)
The cell above shows that the dataset contains 202 observations and 29 columns.
# Check the info of each column
education.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 202 entries, 0 to 201
Data columns (total 29 columns):
 #   Column                                   Non-Null Count  Dtype
---  ------                                   --------------  -----
 0   Countries and areas                      202 non-null    object
 1   Latitude                                 202 non-null    float64
 2   Longitude                                202 non-null    float64
 3   OOSR_Pre0Primary_Age_Male                202 non-null    int64
 4   OOSR_Pre0Primary_Age_Female              202 non-null    int64
 5   OOSR_Primary_Age_Male                    202 non-null    int64
 6   OOSR_Primary_Age_Female                  202 non-null    int64
 7   OOSR_Lower_Secondary_Age_Male            202 non-null    int64
 8   OOSR_Lower_Secondary_Age_Female          202 non-null    int64
 9   OOSR_Upper_Secondary_Age_Male            202 non-null    int64
 10  OOSR_Upper_Secondary_Age_Female          202 non-null    int64
 11  Completion_Rate_Primary_Male             202 non-null    int64
 12  Completion_Rate_Primary_Female           202 non-null    int64
 13  Completion_Rate_Lower_Secondary_Male     202 non-null    int64
 14  Completion_Rate_Lower_Secondary_Female   202 non-null    int64
 15  Completion_Rate_Upper_Secondary_Male     202 non-null    int64
 16  Completion_Rate_Upper_Secondary_Female   202 non-null    int64
 17  Grade_2_3_Proficiency_Reading            202 non-null    int64
 18  Grade_2_3_Proficiency_Math               202 non-null    int64
 19  Primary_End_Proficiency_Reading          202 non-null    int64
 20  Primary_End_Proficiency_Math             202 non-null    int64
 21  Lower_Secondary_End_Proficiency_Reading  202 non-null    int64
 22  Lower_Secondary_End_Proficiency_Math     202 non-null    int64
 23  Youth_15_24_Literacy_Rate_Male           202 non-null    int64
 24  Youth_15_24_Literacy_Rate_Female         202 non-null    int64
 25  Birth_Rate                               202 non-null    float64
 26  Gross_Primary_Education_Enrollment       202 non-null    float64
 27  Gross_Tertiary_Education_Enrollment      202 non-null    float64
 28  Unemployment_Rate                        202 non-null    float64
dtypes: float64(6), int64(22), object(1)
memory usage: 45.9+ KB
The cell above shows that the dataset has 1 object column, 22 integer columns, and 6 float columns.
# Checking null values
education.isnull().sum()
Countries and areas                        0
Latitude                                   0
Longitude                                  0
OOSR_Pre0Primary_Age_Male                  0
OOSR_Pre0Primary_Age_Female                0
OOSR_Primary_Age_Male                      0
OOSR_Primary_Age_Female                    0
OOSR_Lower_Secondary_Age_Male              0
OOSR_Lower_Secondary_Age_Female            0
OOSR_Upper_Secondary_Age_Male              0
OOSR_Upper_Secondary_Age_Female            0
Completion_Rate_Primary_Male               0
Completion_Rate_Primary_Female             0
Completion_Rate_Lower_Secondary_Male       0
Completion_Rate_Lower_Secondary_Female     0
Completion_Rate_Upper_Secondary_Male       0
Completion_Rate_Upper_Secondary_Female     0
Grade_2_3_Proficiency_Reading              0
Grade_2_3_Proficiency_Math                 0
Primary_End_Proficiency_Reading            0
Primary_End_Proficiency_Math               0
Lower_Secondary_End_Proficiency_Reading    0
Lower_Secondary_End_Proficiency_Math       0
Youth_15_24_Literacy_Rate_Male             0
Youth_15_24_Literacy_Rate_Female           0
Birth_Rate                                 0
Gross_Primary_Education_Enrollment         0
Gross_Tertiary_Education_Enrollment        0
Unemployment_Rate                          0
dtype: int64
The cell above shows that no column contains null (NaN) values.
# check for duplicate
education.duplicated().sum()
0
The cell above shows that the dataset contains no duplicate rows.
# Compute summary statistics for the numeric columns
education.describe()
Latitude | Longitude | OOSR_Pre0Primary_Age_Male | OOSR_Pre0Primary_Age_Female | OOSR_Primary_Age_Male | OOSR_Primary_Age_Female | OOSR_Lower_Secondary_Age_Male | OOSR_Lower_Secondary_Age_Female | OOSR_Upper_Secondary_Age_Male | OOSR_Upper_Secondary_Age_Female | ... | Primary_End_Proficiency_Reading | Primary_End_Proficiency_Math | Lower_Secondary_End_Proficiency_Reading | Lower_Secondary_End_Proficiency_Math | Youth_15_24_Literacy_Rate_Male | Youth_15_24_Literacy_Rate_Female | Birth_Rate | Gross_Primary_Education_Enrollment | Gross_Tertiary_Education_Enrollment | Unemployment_Rate | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 202.000000 | 202.000000 | 202.000000 | 202.000000 | 202.000000 | 202.000000 | 202.000000 | 202.000000 | 202.000000 | 202.000000 | ... | 202.000000 | 202.000000 | 202.000000 | 202.000000 | 202.000000 | 202.000000 | 202.000000 | 202.000000 | 202.000000 | 202.000000 |
mean | 25.081422 | 55.166928 | 19.658416 | 19.282178 | 5.282178 | 5.569307 | 8.707921 | 8.831683 | 20.292079 | 19.975248 | ... | 10.717822 | 10.376238 | 25.787129 | 24.450495 | 35.801980 | 35.084158 | 18.914010 | 94.942574 | 34.392574 | 6.000000 |
std | 16.813639 | 45.976287 | 25.007604 | 25.171147 | 9.396442 | 10.383092 | 13.258203 | 14.724717 | 21.485592 | 23.140376 | ... | 24.866101 | 22.484423 | 33.181384 | 31.965467 | 45.535186 | 45.249643 | 10.828184 | 29.769338 | 29.978206 | 5.273136 |
min | 0.023559 | 0.824782 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
25% | 11.685062 | 18.665678 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.250000 | 0.250000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 10.355000 | 97.200000 | 9.000000 | 2.302500 |
50% | 21.207861 | 43.518091 | 9.000000 | 7.000000 | 1.000000 | 1.000000 | 2.000000 | 2.000000 | 15.000000 | 12.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 17.550000 | 101.850000 | 24.850000 | 4.585000 |
75% | 39.901792 | 77.684945 | 31.000000 | 30.000000 | 6.000000 | 6.750000 | 12.750000 | 10.750000 | 32.750000 | 30.000000 | ... | 0.000000 | 0.000000 | 56.750000 | 50.750000 | 94.000000 | 96.750000 | 27.692500 | 107.300000 | 59.975000 | 8.655000 |
max | 64.963051 | 178.065032 | 96.000000 | 96.000000 | 58.000000 | 67.000000 | 61.000000 | 70.000000 | 84.000000 | 89.000000 | ... | 99.000000 | 89.000000 | 89.000000 | 94.000000 | 100.000000 | 100.000000 | 46.080000 | 142.500000 | 136.600000 | 28.180000 |
8 rows × 28 columns
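The summary above reports a minimum of 0 for columns such as Birth_Rate and a median of 0 for the youth literacy columns, which hints that 0 may stand in for unreported values even though `isnull()` finds nothing. The sketch below is one way to check how common zeros are and to mask them before averaging. It runs on four rows copied from the `head()` output; treating 0 as missing is an assumption, and `suspect_cols` is a hypothetical name for the columns you choose to check.

```python
import pandas as pd

# Four rows copied from the head() output above.
sample = pd.DataFrame({
    "Countries and areas": ["Afghanistan", "Albania", "Andorra", "Angola"],
    "Youth_15_24_Literacy_Rate_Male": [74, 99, 0, 0],
    "Youth_15_24_Literacy_Rate_Female": [56, 100, 0, 0],
    "Unemployment_Rate": [11.12, 12.33, 0.00, 6.89],
})

# Hypothetical selection: columns where an exact 0 is implausible.
suspect_cols = [
    "Youth_15_24_Literacy_Rate_Male",
    "Youth_15_24_Literacy_Rate_Female",
    "Unemployment_Rate",
]

# Share of rows that are exactly 0 in each suspect column.
zero_share = (sample[suspect_cols] == 0).mean()

# ASSUMPTION: 0 means "not reported", so turn it into NaN.
masked = sample.copy()
masked[suspect_cols] = sample[suspect_cols].mask(sample[suspect_cols] == 0)

# pandas skips NaN by default, so this mean uses reported values only.
naive_mean = sample["Youth_15_24_Literacy_Rate_Male"].mean()
masked_mean = masked["Youth_15_24_Literacy_Rate_Male"].mean()
print(zero_share)
print(naive_mean, masked_mean)
```

On the full dataset the same pattern would apply to `education`; whether a given column can legitimately be 0 (an out-of-school rate might be) is a judgment call per column.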
education.columns
Index(['Countries and areas', 'Latitude ', 'Longitude', 'OOSR_Pre0Primary_Age_Male', 'OOSR_Pre0Primary_Age_Female', 'OOSR_Primary_Age_Male', 'OOSR_Primary_Age_Female', 'OOSR_Lower_Secondary_Age_Male', 'OOSR_Lower_Secondary_Age_Female', 'OOSR_Upper_Secondary_Age_Male', 'OOSR_Upper_Secondary_Age_Female', 'Completion_Rate_Primary_Male', 'Completion_Rate_Primary_Female', 'Completion_Rate_Lower_Secondary_Male', 'Completion_Rate_Lower_Secondary_Female', 'Completion_Rate_Upper_Secondary_Male', 'Completion_Rate_Upper_Secondary_Female', 'Grade_2_3_Proficiency_Reading', 'Grade_2_3_Proficiency_Math', 'Primary_End_Proficiency_Reading', 'Primary_End_Proficiency_Math', 'Lower_Secondary_End_Proficiency_Reading', 'Lower_Secondary_End_Proficiency_Math', 'Youth_15_24_Literacy_Rate_Male', 'Youth_15_24_Literacy_Rate_Female', 'Birth_Rate', 'Gross_Primary_Education_Enrollment', 'Gross_Tertiary_Education_Enrollment', 'Unemployment_Rate'], dtype='object')
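The column index above lists `'Latitude '` with a trailing space, so `education['Latitude']` would raise a `KeyError`. A minimal sketch of stripping whitespace from all column names follows; it uses an empty frame with three of the real column names, and `raw` and `cleaned` are illustrative variable names.

```python
import pandas as pd

# Empty frame with three of the column names shown above,
# including the trailing space on 'Latitude '.
raw = pd.DataFrame(columns=["Countries and areas", "Latitude ", "Longitude"])

# rename() accepts a callable that is applied to every column label.
cleaned = raw.rename(columns=str.strip)

print(list(cleaned.columns))
```

Applied to the real data, this would be `education = education.rename(columns=str.strip)` right after loading.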
Step 2: Data Analysis¶
Let's work through some practice questions.
grouped_data = education.groupby('Countries and areas')
# Compute average completion rates for each country and education level
completion_rates = grouped_data[['Completion_Rate_Primary_Male', 'Completion_Rate_Primary_Female',
'Completion_Rate_Lower_Secondary_Male', 'Completion_Rate_Lower_Secondary_Female',
'Completion_Rate_Upper_Secondary_Male', 'Completion_Rate_Upper_Secondary_Female']].mean()
top_10_countries_completion_rates=completion_rates.sort_values(by=['Completion_Rate_Primary_Male', 'Completion_Rate_Primary_Female',
'Completion_Rate_Lower_Secondary_Male', 'Completion_Rate_Lower_Secondary_Female',
'Completion_Rate_Upper_Secondary_Male', 'Completion_Rate_Upper_Secondary_Female'],ascending=False).head(10)
top_10_countries_completion_rates
Completion_Rate_Primary_Male | Completion_Rate_Primary_Female | Completion_Rate_Lower_Secondary_Male | Completion_Rate_Lower_Secondary_Female | Completion_Rate_Upper_Secondary_Male | Completion_Rate_Upper_Secondary_Female | |
---|---|---|---|---|---|---|
Countries and areas | ||||||
North Korea | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
Kazakhstan | 100.0 | 100.0 | 100.0 | 100.0 | 95.0 | 96.0 |
Belarus | 100.0 | 100.0 | 100.0 | 100.0 | 91.0 | 94.0 |
Turkmenistan | 100.0 | 100.0 | 99.0 | 100.0 | 93.0 | 95.0 |
Georgia | 100.0 | 100.0 | 98.0 | 98.0 | 79.0 | 83.0 |
Ukraine | 100.0 | 99.0 | 100.0 | 100.0 | 97.0 | 97.0 |
Cuba | 100.0 | 98.0 | 95.0 | 98.0 | 85.0 | 86.0 |
Kyrgyzstan | 99.0 | 100.0 | 99.0 | 99.0 | 89.0 | 85.0 |
Serbia | 99.0 | 100.0 | 99.0 | 99.0 | 71.0 | 81.0 |
Bosnia and Herzegovina | 99.0 | 100.0 | 97.0 | 97.0 | 92.0 | 92.0 |
# Plot the data
plt.figure(figsize=(14, 5))
sns.heatmap(top_10_countries_completion_rates, annot=True, cmap="YlGnBu", fmt=".0f", cbar_kws={'label': 'Average Completion Rate'})
plt.title('Average Completion Rates of top 10 countries and Education Levels')
plt.show()
As the heatmap shows, the 10 highest-ranked countries are North Korea, Kazakhstan, Belarus, Turkmenistan, Georgia, Ukraine, Cuba, Kyrgyzstan, Serbia, and Bosnia and Herzegovina. Because the sort uses male primary completion first and the remaining columns only as tie-breakers, the ranking mainly reflects primary completion. All ten countries report primary completion rates of 98% or higher for both sexes, while upper secondary completion varies more widely (71% to 100%).
least_10_countries=completion_rates.sort_values(by=['Completion_Rate_Primary_Male', 'Completion_Rate_Primary_Female',
'Completion_Rate_Lower_Secondary_Male', 'Completion_Rate_Lower_Secondary_Female',
'Completion_Rate_Upper_Secondary_Male', 'Completion_Rate_Upper_Secondary_Female'],ascending=True).head(10)
least_10_countries
Completion_Rate_Primary_Male | Completion_Rate_Primary_Female | Completion_Rate_Lower_Secondary_Male | Completion_Rate_Lower_Secondary_Female | Completion_Rate_Upper_Secondary_Male | Completion_Rate_Upper_Secondary_Female | |
---|---|---|---|---|---|---|
Countries and areas | ||||||
Andorra | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Anguilla | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Antigua and Barbuda | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Australia | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Austria | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Azerbaijan | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Bahrain | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Belgium | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Bolivia | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
British Virgin Islands | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
# Plot the data
plt.figure(figsize=(14, 5))
sns.heatmap(least_10_countries, annot=True, cmap="YlGnBu", fmt=".0f", cbar_kws={'label': 'Average Completion Rate'})
plt.title('Average Completion Rates of least 10 countries and Education Levels')
plt.show()
The bottom 10 countries in this ranking are Andorra, Anguilla, Antigua and Barbuda, Australia, Austria, Azerbaijan, Bahrain, Belgium, Bolivia, and the British Virgin Islands. Every value for them is 0 and the list is simply alphabetical, which suggests that 0 is most likely a placeholder for unreported data rather than a true completion rate (a 0% completion rate for Australia, Austria, or Belgium is not plausible). These countries should therefore be read as lacking data, not as the lowest performers.
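One way to get a ranking that is less sensitive to all-zero rows and to column order is to mask zeros and rank by the mean across the six completion columns. The sketch below does this for five rows copied from the two tables above. Treating 0 as unreported is an assumption, and ranking by the row mean is a swapped-in alternative to the notebook's multi-column sort, not the original method.

```python
import pandas as pd

cols = [
    "Completion_Rate_Primary_Male", "Completion_Rate_Primary_Female",
    "Completion_Rate_Lower_Secondary_Male", "Completion_Rate_Lower_Secondary_Female",
    "Completion_Rate_Upper_Secondary_Male", "Completion_Rate_Upper_Secondary_Female",
]

# Rows copied from the top-10 and bottom-10 tables above.
rates = pd.DataFrame(
    [
        [100, 100, 100, 100, 100, 100],  # North Korea
        [100, 100, 98, 98, 79, 83],      # Georgia
        [100, 99, 100, 100, 97, 97],     # Ukraine
        [0, 0, 0, 0, 0, 0],              # Andorra
        [0, 0, 0, 0, 0, 0],              # Australia
    ],
    index=["North Korea", "Georgia", "Ukraine", "Andorra", "Australia"],
    columns=cols,
)

# ASSUMPTION: an exact 0 means the rate was not reported.
reported = rates.mask(rates == 0)

# Drop countries with no reported value at all, then rank by row mean.
ranking = (
    reported.dropna(how="all")
    .mean(axis=1)
    .sort_values(ascending=False)
)
print(ranking)
```

With this rule Ukraine ranks above Georgia, the reverse of the multi-column sort, because its upper secondary rates are much higher.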
Are there noticeable differences in literacy rates between countries in different geographical locations?¶
grouped_data = education.groupby('Countries and areas')
avg_literacy_rates = grouped_data[['Youth_15_24_Literacy_Rate_Male', 'Youth_15_24_Literacy_Rate_Female']].mean()
top_10_countries_by_literacy=avg_literacy_rates.sort_values(by=['Youth_15_24_Literacy_Rate_Male', 'Youth_15_24_Literacy_Rate_Female'],ascending=False).head(10)
top_10_countries_by_literacy
Youth_15_24_Literacy_Rate_Male | Youth_15_24_Literacy_Rate_Female | |
---|---|---|
Countries and areas | ||
Belarus | 100.0 | 100.0 |
Brunei | 100.0 | 100.0 |
China | 100.0 | 100.0 |
Indonesia | 100.0 | 100.0 |
Italy | 100.0 | 100.0 |
Kazakhstan | 100.0 | 100.0 |
Kyrgyzstan | 100.0 | 100.0 |
Latvia | 100.0 | 100.0 |
Lebanon | 100.0 | 100.0 |
Portugal | 100.0 | 100.0 |
top_10_countries_by_literacy.plot(kind='bar', figsize=(10, 6), title='Average Literacy Rates of top 10 countries')
plt.xlabel('Countries')
plt.ylabel('Literacy Rate')
plt.show()
According to the preceding analysis, the top 10 countries—Belarus, Brunei, China, Indonesia, Italy, Kazakhstan, Kyrgyzstan, Latvia, Lebanon, and Portugal—stand out for their commendable average literacy rates. This evaluation is based on the analysis of the 'Youth_15_24_Literacy_Rate_Male' and 'Youth_15_24_Literacy_Rate_Female' columns.
least_10_countries_by_literacy=avg_literacy_rates.sort_values(by=['Youth_15_24_Literacy_Rate_Male', 'Youth_15_24_Literacy_Rate_Female'],ascending=False).tail(10)
least_10_countries_by_literacy
Youth_15_24_Literacy_Rate_Male | Youth_15_24_Literacy_Rate_Female | |
---|---|---|
Countries and areas | ||
Turks and Caicos Islands | 0.0 | 0.0 |
Tuvalu | 0.0 | 0.0 |
Ukraine | 0.0 | 0.0 |
United Arab Emirates | 0.0 | 0.0 |
United Kingdom | 0.0 | 0.0 |
United States | 0.0 | 0.0 |
Vatican City | 0.0 | 0.0 |
Venezuela | 0.0 | 0.0 |
Yemen | 0.0 | 0.0 |
Zimbabwe | 0.0 | 0.0 |
least_10_countries_by_literacy.plot(kind='bar', figsize=(10, 5), title='Average Literacy Rates of least 10 countries')
plt.xlabel('Countries')
plt.ylabel('Literacy Rate')
plt.show()
The last 10 countries in the literacy ranking are Turks and Caicos Islands, Tuvalu, Ukraine, United Arab Emirates, United Kingdom, United States, Vatican City, Venezuela, Yemen, and Zimbabwe. All of them show 0 in both the 'Youth_15_24_Literacy_Rate_Male' and 'Youth_15_24_Literacy_Rate_Female' columns. Since the United Kingdom and the United States are among them, these zeros most likely mean the figures were not reported rather than that youth literacy is actually zero.
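The question for this section asks about geographical location, which the rankings above do not use. A minimal sketch of one approach is to bin countries by latitude and compare mean literacy per band. It uses five rows with non-zero literacy values copied from the tables above; the band edges (23.5 and 66.5 degrees), the band labels, and the `Latitude_Band` name are illustrative assumptions, and the sample uses `Latitude` without the trailing space found in the raw column name.

```python
import pandas as pd

male = "Youth_15_24_Literacy_Rate_Male"
female = "Youth_15_24_Literacy_Rate_Female"

# Rows copied from the displayed data (countries with non-zero literacy).
sample = pd.DataFrame({
    "Countries and areas": ["Afghanistan", "Albania", "Algeria", "Vietnam", "Zambia"],
    "Latitude": [33.939110, 41.153332, 28.033886, 14.058324, 13.133897],
    male: [74, 99, 98, 98, 93],
    female: [56, 100, 97, 98, 92],
})

# Hypothetical bands by distance from the equator.
bands = pd.cut(
    sample["Latitude"].abs(),
    bins=[0, 23.5, 66.5, 90],
    labels=["tropical", "temperate", "polar"],
    include_lowest=True,
).rename("Latitude_Band")

# observed=True keeps only bands that actually contain countries.
by_band = sample.groupby(bands, observed=True)[[male, female]].mean()
print(by_band)
```

Five countries are far too few to draw conclusions; the point is only the grouping pattern, which would run on the full `education` frame in the same way.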
education['Gender_Gap_Primary'] = education['Completion_Rate_Primary_Male'] - education['Completion_Rate_Primary_Female']
education['Gender_Gap_Lower_Secondary'] = education['Completion_Rate_Lower_Secondary_Male'] - education['Completion_Rate_Lower_Secondary_Female']
education['Gender_Gap_Upper_Secondary'] = education['Completion_Rate_Upper_Secondary_Male'] - education['Completion_Rate_Upper_Secondary_Female']
overall_gender_gap = education[['Gender_Gap_Primary', 'Gender_Gap_Lower_Secondary', 'Gender_Gap_Upper_Secondary']].mean().mean()
print(f'Overall Gender Gap in Completion Rates: {overall_gender_gap}')
Overall Gender Gap in Completion Rates: -0.40759075907590764
gender_gap_means = education[['Gender_Gap_Primary', 'Gender_Gap_Lower_Secondary', 'Gender_Gap_Upper_Secondary']].mean()
gender_gap_means.plot(kind='bar', figsize=(10, 6), title='Gender Gap in Completion Rates by Education Level')
plt.xlabel('Education Level')
plt.ylabel('Gender Gap')
plt.show()
Completion rates show a small gender gap at every education level, where each gap is computed as the male rate minus the female rate. Read from the bar chart, the gap is about 0.4 percentage points in magnitude in primary education, widens to about 0.5 in lower secondary education, and stays below 0.4 in upper secondary education, the narrowest of the three.
The comprehensive assessment of completion rates across all education levels reveals an overall gender gap of approximately -0.41. This negative value implies that, on average, completion rates are slightly higher for females compared to males. The negative sign indicates a marginal advantage for females in overall educational attainment.
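To make the arithmetic behind these figures concrete, the sketch below repeats the calculation on four rows copied from the top-10 completion table: a male-minus-female gap per level, the mean gap per level, and the mean of those three means. The variable names are illustrative, and the resulting numbers describe only these four countries.

```python
import pandas as pd

# Rows copied from the top-10 completion table above.
sample = pd.DataFrame(
    {
        "Completion_Rate_Primary_Male": [100, 100, 100, 100],
        "Completion_Rate_Primary_Female": [100, 100, 100, 98],
        "Completion_Rate_Lower_Secondary_Male": [100, 100, 98, 95],
        "Completion_Rate_Lower_Secondary_Female": [100, 100, 98, 98],
        "Completion_Rate_Upper_Secondary_Male": [95, 91, 79, 85],
        "Completion_Rate_Upper_Secondary_Female": [96, 94, 83, 86],
    },
    index=["Kazakhstan", "Belarus", "Georgia", "Cuba"],
)

levels = ["Primary", "Lower_Secondary", "Upper_Secondary"]

# Gap = male rate minus female rate; negative means females are ahead.
gaps = pd.DataFrame({
    level: sample[f"Completion_Rate_{level}_Male"]
    - sample[f"Completion_Rate_{level}_Female"]
    for level in levels
})

gap_by_level = gaps.mean()         # one mean gap per education level
overall_gap = gap_by_level.mean()  # mean of the three level means
print(gap_by_level)
print(overall_gap)
```

Because every level has the same number of rows, the mean of the level means equals the mean of all individual gaps.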
Are there countries where the gender gap in literacy rates is particularly high or low?¶
education['Gender_Gap_Literacy'] = education['Youth_15_24_Literacy_Rate_Male'] - education['Youth_15_24_Literacy_Rate_Female']
high_gender_gap_countries = education.sort_values(by='Gender_Gap_Literacy', ascending=False).head(5)
low_gender_gap_countries = education.sort_values(by='Gender_Gap_Literacy').head(5)
print(f'Countries with High Gender Gap in Literacy Rates:\n{high_gender_gap_countries[["Countries and areas", "Gender_Gap_Literacy"]]}')
print(f'Countries with Low Gender Gap in Literacy Rates:\n{low_gender_gap_countries[["Countries and areas", "Gender_Gap_Literacy"]]}')
Countries with High Gender Gap in Literacy Rates:
          Countries and areas  Gender_Gap_Literacy
73                     Guinea                   27
34   Central African Republic                   19
0                 Afghanistan                   18
19                      Benin                   18
109                      Mali                   15
Countries with Low Gender Gap in Literacy Rates:
    Countries and areas  Gender_Gap_Literacy
149              Rwanda                   -5
65                Gabon                   -3
78             Honduras                   -3
60             Eswatini                   -3
178          East Timor                   -3
# Alternatively, create a bar chart to visualize the gender gaps
plt.figure(figsize=(10, 6))
sns.barplot(x="Countries and areas",y="Gender_Gap_Literacy",data=high_gender_gap_countries)
plt.xlabel('Country')
plt.ylabel('Gender Gap')
plt.title('Top five countries literacy rates by gender gap')
plt.xticks(rotation=90)
plt.show()
Gender Gap in Literacy Rates Across Countries
The analysis of literacy rates among youth aged 15-24 reveals distinct gender gaps in various countries. The metric, calculated as the male literacy rate minus the female literacy rate, provides insights into gender-based disparities in education. Here are the key findings:
Countries with High Gender Gap in Literacy Rates:
- Guinea
- Central African Republic
- Afghanistan
- Benin
- Mali
# Alternatively, create a bar chart to visualize the gender gaps
plt.figure(figsize=(10, 6))
sns.barplot(x="Countries and areas",y="Gender_Gap_Literacy",data=low_gender_gap_countries)
plt.xlabel('Country')
plt.ylabel('Gender Gap')
plt.title('least five countries literacy rates by gender gap')
plt.xticks(rotation=90)
plt.show()
Most Negative Gender Gap in Literacy Rates (female rate above male rate):
- Rwanda: -5
- Gabon: -3
- Honduras: -3
- Eswatini: -3
- East Timor: -3
Because the gap is male minus female and the sort is ascending, these are the countries where young women's literacy exceeds young men's by the widest margin, not the countries with the smallest gap. The data alone does not show which policies lie behind these figures.
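To find the countries where the gap is smallest in either direction, one option is to rank by the absolute gap after excluding rows without data. The sketch below uses six rows copied from the displayed data. Treating 0 in either literacy column as unreported is an assumption, and `Abs_Gap` is a hypothetical helper column.

```python
import pandas as pd

male = "Youth_15_24_Literacy_Rate_Male"
female = "Youth_15_24_Literacy_Rate_Female"

# Rows copied from the displayed data; Andorra reports 0 for both sexes.
sample = pd.DataFrame({
    "Countries and areas": ["Afghanistan", "Albania", "Algeria",
                            "Andorra", "Vietnam", "Zambia"],
    male: [74, 99, 98, 0, 98, 93],
    female: [56, 100, 97, 0, 98, 92],
})

# ASSUMPTION: a 0 in either column means the rate was not reported.
has_data = (sample[male] > 0) & (sample[female] > 0)

# Signed gap (male minus female), NaN where data is missing.
gap = (sample[male] - sample[female]).where(has_data)

ranked = (
    sample.assign(Gender_Gap_Literacy=gap, Abs_Gap=gap.abs())
    .dropna(subset=["Abs_Gap"])
    .sort_values("Abs_Gap", kind="stable")  # smallest gap first
)
print(ranked[["Countries and areas", "Gender_Gap_Literacy", "Abs_Gap"]])
```

Without the `has_data` mask, Andorra would appear to have a perfect gap of 0 simply because both of its values are missing.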
Is there a correlation between proficiency in reading and math at the end of primary education?¶
import seaborn as sns
import matplotlib.pyplot as plt
reading_column = 'Primary_End_Proficiency_Reading'
math_column = 'Primary_End_Proficiency_Math'
# Create a scatter plot
plt.figure(figsize=(10, 6))
sns.scatterplot(x=reading_column, y=math_column, data=education)
# Add labels and title
plt.title(f'Correlation between {reading_column} and {math_column}')
plt.xlabel(reading_column)
plt.ylabel(math_column)
# Display the plot
plt.show()
# Calculate the correlation coefficient
correlation_coefficient = education[reading_column].corr(education[math_column])
print(f'Correlation Coefficient: {correlation_coefficient}')
Correlation Coefficient: 0.7736977468586856
Correlation Between Proficiency in Reading and Math at the End of Primary Education: The scatter plot depicts a strong positive correlation between proficiency in reading and math at the end of primary education. As the proficiency in reading increases, there is a corresponding increase in proficiency in math. This indicates a significant relationship between the two variables.
Insights:
- When primary end proficiency in reading is high, there is a noticeable trend of elevated proficiency in math.
- The Correlation Coefficient of 0.7737 reinforces the robust positive correlation between reading and math proficiency.
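Conclusion: Countries whose students read well at the end of primary education also tend to perform well in math. A correlation does not show that one skill causes the other, and countries with 0 in both columns (likely unreported data) may inflate the coefficient, but the result is consistent with reading and math proficiency developing together.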
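Rows that hold 0 in both proficiency columns all sit at the origin of the scatter plot and can pull the Pearson coefficient upward. The sketch below illustrates the effect on made-up scores (they are not taken from the dataset) by computing the coefficient with and without the all-zero rows; treating those rows as unreported is an assumption.

```python
import pandas as pd

# Made-up illustrative scores: four reporting countries, four all-zero rows.
scores = pd.DataFrame({
    "reading": [13, 55, 40, 70, 0, 0, 0, 0],
    "math": [11, 51, 60, 45, 0, 0, 0, 0],
})

# Pearson correlation over every row, zeros included.
r_all = scores["reading"].corr(scores["math"])

# ASSUMPTION: rows with a 0 in either column are unreported; drop them.
reported = scores[(scores["reading"] > 0) & (scores["math"] > 0)]
r_reported = reported["reading"].corr(reported["math"])

print(f"all rows: {r_all:.3f}, reported rows only: {r_reported:.3f}")
```

In this toy example the coefficient drops from roughly 0.91 to roughly 0.68 once the zero rows are removed, so it is worth repeating the notebook's calculation on the filtered `education` frame before reading too much into 0.77.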
Is there a relationship between the birth rate and gross enrollment in primary education?¶
birth_rate_column = 'Birth_Rate'
enrollment_column = 'Gross_Primary_Education_Enrollment'
# Create a scatter plot
plt.figure(figsize=(10, 6))
sns.scatterplot(x=birth_rate_column, y=enrollment_column, data=education)
# Add labels and title
plt.title(f'Relationship between {birth_rate_column} and {enrollment_column}')
plt.xlabel(birth_rate_column)
plt.ylabel(enrollment_column)
# Display the plot
plt.show()
# Calculate the correlation coefficient
correlation_coefficient = education[birth_rate_column].corr(education[enrollment_column])
print(f'Correlation Coefficient: {correlation_coefficient}')
Correlation Coefficient: 0.39586202769648277
A coefficient of about 0.40 indicates a weak-to-moderate positive linear relationship: countries with higher birth rates tend to report somewhat higher gross primary enrollment, but birth rate accounts for only a small share of the variation (r² ≈ 0.16).
Recommendations and Conclusions¶
Recommendations:¶
Targeted Interventions for Lower-Performing Countries:
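- Direct resources and support towards countries with lower completion rates. The bottom-10 list (Andorra, Anguilla, Antigua and Barbuda, Australia, Austria, Azerbaijan, Bahrain, Belgium, Bolivia, and the British Virgin Islands) consists entirely of zeros, which most likely reflect unreported data, so the lowest performers should be identified after excluding zero values.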
Promoting Gender Equality in Education:
- Develop strategies to address the gender gap in completion rates, particularly in lower secondary education, to ensure equal educational opportunities for both genders.
Enhancing Literacy Programs:
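- Implement literacy programs in countries with lower literacy rates. The countries at the bottom of the literacy ranking (Turks and Caicos Islands, Tuvalu, Ukraine, United Arab Emirates, United Kingdom, United States, Vatican City, Venezuela, Yemen, and Zimbabwe) all show 0, which most likely reflects unreported data, so priorities should be set using countries with non-zero reported rates.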
Focused Proficiency Improvement:
- Tailor educational programs to improve proficiency in reading, which has a positive correlation with math proficiency. This will contribute to a more well-rounded academic foundation.
Conclusions:¶
Educational Disparities Across Countries:
- There are significant disparities in completion rates and literacy levels among countries. Understanding these variations is crucial for designing targeted interventions.
Gender Gap Dynamics:
- Gender gaps in completion rates vary across education levels. While primary and lower secondary education show wider gaps, upper secondary education sees a narrower divide. The overall gender gap favors females.
Literacy Rates as Indicators:
- Literacy rates serve as crucial indicators of a country's educational health. Efforts should be directed towards sustaining or improving literacy rates for comprehensive educational development.
Correlation between Reading and Math Proficiency:
- The positive correlation between proficiency in reading and math at the end of primary education underscores the interconnectedness of these skills. Educational programs should emphasize both to foster holistic development.
Top-Performing and Improving Countries:
- Recognize and learn from the success stories of countries with high completion rates (North Korea, Kazakhstan, Belarus, Turkmenistan, etc.) and commendable literacy rates (Belarus, Brunei, China, etc.).
Identification of Challenges:
- Countries facing challenges, such as high gender gaps in literacy (Guinea, Central African Republic, Afghanistan, etc.), require targeted interventions and international cooperation to address these disparities.
Importance of Inclusive Education:
- Inclusive education practices, coupled with literacy programs, contribute significantly to minimizing gender gaps and fostering equal opportunities for all students.
Continuous Monitoring and Evaluation:
- Implement continuous monitoring and evaluation mechanisms to track the effectiveness of interventions over time. Regular assessments will guide adjustments to policies and strategies.
By implementing these recommendations and recognizing the nuanced dynamics observed in the dataset, education systems can move towards more inclusive, equitable, and effective practices, fostering holistic development across diverse countries.
Practice Questions¶
- Are there specific age groups where the out-of-school rates are notably high?
- How does the out-of-school rate vary between males and females across different education levels?
- How does unemployment correlate with tertiary education enrollment?
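As a possible starting point for the first two questions, the sketch below reshapes the out-of-school columns into long form and averages them by education level and sex. It uses three rows copied from the displayed data and omits the upper secondary columns because the female values are hidden in the truncated output. The regular expression and the `tidy`, `level`, and `sex` names are illustrative assumptions, and zeros would still need separate handling on the full data.

```python
import pandas as pd

# Rows copied from the displayed data (Albania, Venezuela, Yemen).
sample = pd.DataFrame({
    "Countries and areas": ["Albania", "Venezuela", "Yemen"],
    "OOSR_Pre0Primary_Age_Male": [4, 14, 96],
    "OOSR_Pre0Primary_Age_Female": [2, 14, 96],
    "OOSR_Primary_Age_Male": [6, 10, 10],
    "OOSR_Primary_Age_Female": [3, 10, 21],
    "OOSR_Lower_Secondary_Age_Male": [6, 15, 23],
    "OOSR_Lower_Secondary_Age_Female": [1, 13, 34],
})

# Wide to long: one row per country and OOSR column.
tidy = sample.melt(
    id_vars="Countries and areas", var_name="column", value_name="rate"
)

# Split names such as 'OOSR_Primary_Age_Male' into level and sex.
parts = tidy["column"].str.extract(
    r"^OOSR_(?P<level>.+)_Age_(?P<sex>Male|Female)$"
)
tidy = pd.concat([tidy, parts], axis=1)

# Mean out-of-school rate per education level and sex.
by_level = tidy.pivot_table(
    index="level", columns="sex", values="rate", aggfunc="mean"
)
print(by_level)
```

The same reshaping applied to `education` would give one table covering all four levels, which can then be plotted or compared directly.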