{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "(assignment-3)=\n", "# Home assignment 3" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [ "remove-input" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Last updated: 2022-05-11 11:29:46\n" ] } ], "source": [ "!echo Last updated: `date +\"%Y-%m-%d %H:%M:%S\"`" ] }, { "cell_type": "code", "execution_count": null, "id": "9be78d54", "metadata": { "tags": [ "remove-input" ] }, "outputs": [], "source": [ "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "pd.options.display.max_rows = 10\n", "pd.options.display.max_columns = 10\n", "pd.options.display.max_colwidth = 35\n", "plt.rcParams[\"figure.figsize\"] = (6, 6)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*****" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Question 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> * Read the `world_cities.csv` file into a `DataFrame` object.\n", "> * Calculate and print a new table, where each row represents a *country*, with the following columns:\n", "> * `\"country\"`—Country name\n", "> * `\"capital\"`—The name of the capital city\n", "> * `\"pop_total\"`—The total population (population in all cities summed)\n", "> * Note that for some countries there is more than one value marked as the capital! The resulting table still needs to have one row per country: capital name (`\"capital\"`) should be the *first* if there is more than one, while total population (`\"pop_total\"`) needs to be the sum of all cities (regardless of duplicates)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [ "remove-input" ] }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>country</th><th>capital</th><th>pop_total</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>Afghanistan</td><td>Kabul</td><td>7543856</td></tr>\n",
"    <tr><th>1</th><td>Albania</td><td>Tirana</td><td>1536232</td></tr>\n",
"    <tr><th>2</th><td>Algeria</td><td>Algiers</td><td>20508642</td></tr>\n",
"    <tr><th>3</th><td>American Samoa</td><td>Pago Pago</td><td>58021</td></tr>\n",
"    <tr><th>4</th><td>Andorra</td><td>Andorra la Vella</td><td>69031</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>223</th><td>Wallis and Futuna</td><td>Mata'utu</td><td>11380</td></tr>\n",
"    <tr><th>224</th><td>Western Sahara</td><td>al-'Ayun</td><td>338786</td></tr>\n",
"    <tr><th>225</th><td>Yemen</td><td>San'a</td><td>5492077</td></tr>\n",
"    <tr><th>226</th><td>Zambia</td><td>Lusaka</td><td>4032170</td></tr>\n",
"    <tr><th>227</th><td>Zimbabwe</td><td>Harare</td><td>4231859</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>228 rows × 3 columns</p>\n",
"</div>
" ], "text/plain": [ " country capital pop_total\n", "0 Afghanistan Kabul 7543856\n", "1 Albania Tirana 1536232\n", "2 Algeria Algiers 20508642\n", "3 American Samoa Pago Pago 58021\n", "4 Andorra Andorra la Vella 69031\n", ".. ... ... ...\n", "223 Wallis and Futuna Mata'utu 11380\n", "224 Western Sahara al-'Ayun 338786\n", "225 Yemen San'a 5492077\n", "226 Zambia Lusaka 4032170\n", "227 Zimbabwe Harare 4231859\n", "\n", "[228 rows x 3 columns]" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "dat = pd.read_csv(\"data/world_cities.csv\")\n", "capitals = dat[dat[\"capital\"] == 1][[\"country\", \"city\"]].groupby(\"country\").first().reset_index()\n", "capitals = capitals.rename(columns = {\"city\": \"capital\"})\n", "populations = dat[[\"country\", \"pop\"]].groupby(\"country\").sum().reset_index()\n", "populations = populations.rename(columns = {\"pop\": \"pop_total\"})\n", "result = pd.merge(capitals, populations, on = \"country\")\n", "result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Question 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> * The text file named `bgu.wkt` (see {ref}`sample-data`) contains a WKT string representing the geometry of the BGU logo. \n", "> * Read the WKT string from the `bgu.wkt` file, using the `open` and `.readline` methods (see {ref}`working-with-files`). Convert the string into a `shapely` geometry\n", "> * Note: Do not copy and paste the WKT string into your code! You need to read it from the `bgu.wkt` file.\n", "> * Display the logo graphically." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [ "remove-input" ] }, "outputs": [ { "data": { "image/svg+xml": "", "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import shapely.geometry\n", "import shapely.wkt\n", "\n", "# Read the WKT string from the first line of the file\n", "with open(\"data/bgu.wkt\", \"r\", encoding = \"utf-8\") as f:\n", "    text = f.readline()\n", "# Convert the WKT string into a shapely geometry\n", "logo = shapely.wkt.loads(text)\n", "logo" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> * Calculate the area of the logo." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [ "remove-input" ] }, "outputs": [ { "data": { "text/plain": [ "0.21286694897980007" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "logo.area" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> * Calculate a circle that encloses the logo. To do that, first calculate the average of the x-axis bounds and the average of the y-axis bounds, then construct a point from those x and y values, and finally buffer the point by a distance of your choice so that the logo is completely within the buffer.\n", "> * Calculate the geometry of the *difference* between the logo and the bounding circle you calculated, then *plot* it." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "tags": [ "remove-input" ] }, "outputs": [ { "data": { "image/svg+xml": "", "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Center of the logo's bounding box\n", "xmin, ymin, xmax, ymax = logo.bounds\n", "x = (xmin + xmax) / 2\n", "y = (ymin + ymax) / 2\n", "pnt = shapely.geometry.Point(x, y)\n", "# Buffer by a distance of choice; the circle must contain the whole logo\n", "circ = pnt.buffer(0.7)\n", "circ.difference(logo)" ] } ], "metadata": { "interpreter": { "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 4 }