Homework

Объедините таблицы городов и стран, добавив к таблице городов наименование страны. В итоговом датафрейме должны быть следующие столбцы:

  • ID (ID города)
  • City (Название города)
  • CountryCode (Код страны)
  • Country (Название страны)
  • Population (Кол-во населения города)
  • Итоговый датафрейм сохраните в файл Result.xlsx
In [1]:
import pandas as pd
In [2]:
df_city = pd.read_csv('city.csv', sep=';')
df_city
Out[2]:
ID Name CountryCode District Population
0 1 Kabul AFG Kabol 1780000
1 2 Qandahar AFG Qandahar 237500
2 3 Herat AFG Herat 186800
3 4 Mazar-e-Sharif AFG Balkh 127800
4 5 Amsterdam NLD Noord-Holland 731200
... ... ... ... ... ...
4074 4075 Khan Yunis PSE Khan Yunis 123175
4075 4076 Hebron PSE Hebron 119401
4076 4077 Jabaliya PSE North Gaza 113901
4077 4078 Nablus PSE Nablus 100231
4078 4079 Rafah PSE Rafah 92020

4079 rows × 5 columns

In [3]:
df_country = pd.read_csv('country.csv', sep=';')
df_country
Out[3]:
Code Name Continent Region SurfaceArea IndepYear Population LifeExpectancy GNP GNPOld LocalName GovernmentForm HeadOfState Capital Code2
0 ABW Aruba North America Caribbean 193.0 NaN 103000 78.4 828.0 793.0 Aruba Nonmetropolitan Territory of The Netherlands Beatrix 129.0 AW
1 AFG Afghanistan Asia Southern and Central Asia 652090.0 1919.0 22720000 45.9 5976.0 NaN Afganistan/Afqanestan Islamic Emirate Mohammad Omar 1.0 AF
2 AGO Angola Africa Central Africa 1246700.0 1975.0 12878000 38.3 6648.0 7984.0 Angola Republic José Eduardo dos Santos 56.0 AO
3 AIA Anguilla North America Caribbean 96.0 NaN 8000 76.1 63.2 NaN Anguilla Dependent Territory of the UK Elisabeth II 62.0 AI
4 ALB Albania Europe Southern Europe 28748.0 1912.0 3401200 71.6 3205.0 2500.0 Shqipëria Republic Rexhep Mejdani 34.0 AL
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
234 YEM Yemen Asia Middle East 527968.0 1918.0 18112000 59.8 6041.0 5729.0 Al-Yaman Republic Ali Abdallah Salih 1780.0 YE
235 YUG Yugoslavia Europe Southern Europe 102173.0 1918.0 10640000 72.4 17000.0 NaN Jugoslavija Federal Republic Vojislav Koštunica 1792.0 YU
236 ZAF South Africa Africa Southern Africa 1221037.0 1910.0 40377000 51.1 116729.0 129092.0 South Africa Republic Thabo Mbeki 716.0 ZA
237 ZMB Zambia Africa Eastern Africa 752618.0 1964.0 9169000 37.2 3377.0 3922.0 Zambia Republic Frederick Chiluba 3162.0 ZM
238 ZWE Zimbabwe Africa Eastern Africa 390757.0 1980.0 11669000 37.8 5951.0 8670.0 Zimbabwe Republic Robert G. Mugabe 4068.0 ZW

239 rows × 15 columns

In [6]:
pd.merge(df_city, df_country, how='left', left_on='CountryCode', right_on='Code')\
    [['ID', 'Name_x', 'CountryCode', 'Name_y', 'Population_x']]\
    .rename(columns={'Name_x': 'City', 'Name_y': 'Country', 'Population_x': 'Population'})
Out[6]:
ID City CountryCode Country Population
0 1 Kabul AFG Afghanistan 1780000
1 2 Qandahar AFG Afghanistan 237500
2 3 Herat AFG Afghanistan 186800
3 4 Mazar-e-Sharif AFG Afghanistan 127800
4 5 Amsterdam NLD Netherlands 731200
... ... ... ... ... ...
4074 4075 Khan Yunis PSE Palestine 123175
4075 4076 Hebron PSE Palestine 119401
4076 4077 Jabaliya PSE Palestine 113901
4077 4078 Nablus PSE Palestine 100231
4078 4079 Rafah PSE Palestine 92020

4079 rows × 5 columns

In [7]:
res = pd.merge(df_city, df_country, how='left', left_on='CountryCode', right_on='Code')\
    [['ID', 'Name_x', 'CountryCode', 'Name_y', 'Population_x']]\
    .rename(columns={'Name_x': 'City', 'Name_y': 'Country', 'Population_x': 'Population'})
In [9]:
res.to_excel('Result.xlsx', index=False)
In [ ]: