Frequently Asked Questions (FAQ)

Last updated: 2022-06-22 18:54:34

pandas

Split DataFrame to parts

Sample data

import pandas as pd
dat = pd.read_csv('output/stations.csv')
dat
                    name        city  lines  piano        lon        lat
0      Beer-Sheva Center  Beer-Sheva      4  False  34.798443  31.243288
1  Beer-Sheva University  Beer-Sheva      5   True  34.812831  31.260284
2                 Dimona      Dimona      1  False  35.011635  31.068616

Function definition

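import math

def split_dataframe(df, chunk_size=1_000_000):
    """Split df into consecutive chunks of at most chunk_size rows."""
    # Round up so that a final, shorter chunk is included
    num_chunks = math.ceil(len(df) / chunk_size)
    # .iloc makes the positional (row-number) slicing explicit
    return [df.iloc[i * chunk_size:(i + 1) * chunk_size] for i in range(num_chunks)]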

Example

split_dataframe(dat, 2)
[                    name        city  lines  piano        lon        lat
 0      Beer-Sheva Center  Beer-Sheva      4  False  34.798443  31.243288
 1  Beer-Sheva University  Beer-Sheva      5   True  34.812831  31.260284,
      name    city  lines  piano        lon        lat
 2  Dimona  Dimona      1  False  35.011635  31.068616]
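
The slicing arithmetic used by split_dataframe can be checked without pandas. The sketch below uses a hypothetical helper, chunk_bounds (not part of the original FAQ), that returns the (start, stop) row positions of each chunk:

```python
import math

def chunk_bounds(n_rows, chunk_size):
    """Return (start, stop) row positions for each chunk (hypothetical helper)."""
    num_chunks = math.ceil(n_rows / chunk_size)
    # min() clips the last chunk, mirroring how slicing past the end behaves
    return [(i * chunk_size, min((i + 1) * chunk_size, n_rows))
            for i in range(num_chunks)]

print(chunk_bounds(3, 2))  # [(0, 2), (2, 3)]
print(chunk_bounds(4, 2))  # [(0, 2), (2, 4)]
print(chunk_bounds(0, 2))  # []
```

For the three-row sample table and a chunk size of 2, this gives one chunk with rows 0-1 and one with row 2, matching the output above.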

Shift column(s) to beginning

Sample data

import pandas as pd
dat = pd.read_csv('output/stations.csv')
dat
                    name        city  lines  piano        lon        lat
0      Beer-Sheva Center  Beer-Sheva      4  False  34.798443  31.243288
1  Beer-Sheva University  Beer-Sheva      5   True  34.812831  31.260284
2                 Dimona      Dimona      1  False  35.011635  31.068616

Example

cols = ['lon', 'lat']
dat = dat[cols + [col for col in dat.columns if col not in cols]]
dat
         lon        lat                   name        city  lines  piano
0  34.798443  31.243288      Beer-Sheva Center  Beer-Sheva      4  False
1  34.812831  31.260284  Beer-Sheva University  Beer-Sheva      5   True
2  35.011635  31.068616                 Dimona      Dimona      1  False
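
The reordering logic can also be wrapped in a small function. This is a minimal sketch, assuming a hypothetical helper named move_to_front (not part of pandas or of the original FAQ) that works on any sequence of column names and fails early if a requested column is missing:

```python
def move_to_front(columns, first):
    """Return column names with `first` leading and the rest in original order."""
    columns = list(columns)
    missing = [col for col in first if col not in columns]
    if missing:
        # Fail early rather than silently inventing a column order
        raise KeyError(f"Columns not found: {missing}")
    return list(first) + [col for col in columns if col not in first]

columns = ['name', 'city', 'lines', 'piano', 'lon', 'lat']
print(move_to_front(columns, ['lon', 'lat']))
# ['lon', 'lat', 'name', 'city', 'lines', 'piano']
```

With the sample data, it would be applied as dat = dat[move_to_front(dat.columns, ['lon', 'lat'])].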

geopandas

Calculating distances in WGS84

Question

How can we calculate distances over large regions given lon/lat points?

Sample data

Two (lon,lat) points:

pnt1 = (0, 0)
pnt2 = (1, 0)

Reference distance between them, in meters (one degree of longitude along the equator, according to Wikipedia):

dist = 111320
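
As a rough check, the reference value can be reproduced by hand. Assuming the WGS84 equatorial radius of 6,378,137 m (a value added here, not stated in the original), one degree of longitude along the equator is 1/360 of the equatorial circumference:

```python
import math

WGS84_EQUATORIAL_RADIUS = 6378137.0  # meters (WGS84 semi-major axis; assumption)

# Arc length of one degree of longitude along the equator
one_degree = 2 * math.pi * WGS84_EQUATORIAL_RADIUS / 360
print(round(one_degree, 2))  # 111319.49
```

The figure of 111320 is consistent with this value rounded to the nearest 10 m.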

Using the Haversine formula (less accurate)

See Example: distance function. The formula treats the Earth as a sphere, here with a radius of 6,371,000 m:

import math

def distance(origin, destination):
    """Haversine (great-circle) distance in meters between two (lon, lat) points."""
    lon1, lat1 = origin
    lon2, lat2 = destination
    radius = 6371000  # Mean Earth radius, in meters
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2 +
         math.cos(math.radians(lat1)) * math.cos(math.radians(lat2)) *
         math.sin(dlon / 2) ** 2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))  # Central angle, in radians
    return radius * c
result = distance(pnt1, pnt2)
result = round(result)
result
111195
dist - result  ## Error of 125 meters
125
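
For this particular pair of points, most of the 125 m discrepancy seems to come from the choice of sphere radius rather than from the formula itself. As a sketch, the variant below (named haversine here, a hypothetical name) takes the radius as a parameter, so the mean radius can be compared with the WGS84 equatorial radius (6,378,137 m, an assumption not stated in the original):

```python
import math

def haversine(origin, destination, radius=6371000):
    """Great-circle distance in meters between two (lon, lat) points on a sphere."""
    lon1, lat1 = origin
    lon2, lat2 = destination
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2 +
         math.cos(math.radians(lat1)) * math.cos(math.radians(lat2)) *
         math.sin(dlon / 2) ** 2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))  # Central angle, in radians
    return radius * c

pnt1, pnt2 = (0, 0), (1, 0)
print(round(haversine(pnt1, pnt2)))                  # 111195 (mean radius)
print(round(haversine(pnt1, pnt2, radius=6378137)))  # 111319 (equatorial radius)
```

Matching the radius only helps along the equator. No single sphere reproduces the ellipsoid everywhere, so a geodesic calculation is still the safer choice when accuracy matters.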

Using the geodesic distance function from geopy (most accurate)

See geopy documentation. Note that geopy expects points in (lat, lon) order, so the (lon, lat) tuples are reversed:

import geopy.distance
result = geopy.distance.distance(tuple(reversed(pnt1)), tuple(reversed(pnt2))).meters
result = round(result)
result
111319
dist - result  ## Error of 1 meter
1

Beyond distance: The S2 Geometry Library

The S2 Geometry Library by Google can be used for more complicated calculations in WGS84, such as polygon area. Part of the library is available in Python through the s2sphere package.

arcpy

Listing all layers on current map

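import arcpy
# "CURRENT" refers to the project open in the running ArcGIS Pro session
aprx = arcpy.mp.ArcGISProject("CURRENT")
m = aprx.listMaps()[0]  # First map in the project (avoids shadowing the built-in map)
layers = m.listLayers()
# Not every layer type (e.g., group layers) has a data source
print([lyr.dataSource for lyr in layers if lyr.supports("DATASOURCE")])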

Fig. 84 Listing layers on current map with arcpy