Python References & Coding Strategy
This is a collection of Python references that I have found useful. These were originally in the landmapy repository. See related material in Environmental Systems. Suggestions for improvement are welcome.
- Python References & Coding Strategy (Quarto Slideshow)
- Python Overview
- Python Coding Strategy
- Earth Data Analytics (EDA) Workbook
- Useful Python Libraries
- Plot Libraries and Systems
- IPython Methods
byandell.github.io/Documentation
Python Overview
- Python Tutorial
- Pandas Library
- Packaging Python Projects
- Try and Except code for NetCDF
- Documenting Python Code: A Complete Guide
- Python Developer’s Guide
- Integrating Python Code With R
Earth Data Analytics (EDA) Workbook
- EDA Scientific Data Structures in Python
- Subtract One Raster from Another and Export a New GeoTIFF in Open Source Python
- Earth Analytics Python Env
Useful Python Libraries
- pandas
- numpy
- math
- time
- glob
- os
- re
- csv
- pathlib
- zipfile
- warnings
- datetime
- scipy
- scikit-fuzzy
- sklearn (scikit-learn)
- seaborn
- statsmodels
- pystac_client
- pystac
Lists of Python Libraries
Plot Libraries and Systems
See landmapy Plot Functions.
Spatial Libraries
- geopandas
- rasterio
- xarray
- xrspatial
- earthaccess
- getpass
- earthpy
- tqdm
- regionmask
- cartopy
- folium
- geopy
- pyproj
Interactive Plots
- Creating an Animated GIF with Python
- HoloViews Overview
- matplotlib
- matplotlib.widgets (%matplotlib widget)
- ipympl
- Using %matplotlib widget
IPython Methods
AI overview: IPython methods enhance interactive computing in Python, offering features beyond the standard interpreter. Some key methods include:
- Tab Completion: Simplifies code writing by suggesting attributes and methods of objects or modules as you type.
- Introspection: Provides detailed information about objects, functions, or modules using ? or ??.
- Magic Commands: Special commands prefixed with % for tasks like timing code execution (%timeit), running external scripts (%run), or accessing shell commands (!).
- Input Caching: Stores previous commands and outputs, accessible via _, __, ___ for outputs and _i, _ii, _iii or In[n] for inputs.
- Rich Display: Enables richer object representations using _ipython_display_() or _repr_*_() methods for custom display formats like HTML or images.
- History: Allows browsing and reusing previous commands across sessions.
These methods streamline development, debugging, and exploration in interactive Python environments.
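As a small illustration of the Rich Display item, the sketch below defines a class with a `_repr_html_()` method, which HTML-capable IPython front ends call when the object is the last expression in a cell. The class `ColorSwatch` and its attributes are hypothetical names made up for this example.

```python
class ColorSwatch:
    """Hypothetical example class: a named color with an HTML representation."""

    def __init__(self, name, hex_code):
        self.name = name
        self.hex_code = hex_code

    def __repr__(self):
        # Plain-text fallback used by the standard interpreter.
        return f"ColorSwatch({self.name!r}, {self.hex_code!r})"

    def _repr_html_(self):
        # HTML-capable front ends (for example Jupyter) call this method.
        return (
            f'<span style="background:{self.hex_code};padding:0 1em"></span> '
            f"{self.name}"
        )


swatch = ColorSwatch("prairie green", "#6b8e23")
html_text = swatch._repr_html_()
```

In a notebook, evaluating `swatch` on its own line should show the colored block; in a plain terminal the `__repr__` text is shown instead.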
References:
Data
Data can be explicitly stored in a file using read and write methods, or implicitly using the pickle module. Store Magic and caching are two ways to store data using pickle.
Read and Write
Many projects read and write files. Following course guidelines, we use the data_dir variable to store data in a consistent location, which for EDA is ~/earth-analytics/data. The module landmapy/initial.py has a function create_data_dir() that creates a subdirectory of this location if it does not already exist and returns its path.
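def create_data_dir(new_dir):
    import os
    import pathlib

    # Build ~/earth-analytics/data/<new_dir>.
    data_dir = os.path.join(
        pathlib.Path.home(),
        'earth-analytics',
        'data',
        new_dir
    )
    # Create the directory (and any parents) if it does not exist.
    os.makedirs(data_dir, exist_ok=True)
    return data_dir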
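The same idea can be sketched with pathlib alone. This is a variant for illustration, not the landmapy code: the function name `create_data_dir_sketch` and the `base_dir` parameter are assumptions, added so the example can run in a temporary directory instead of the home directory.

```python
import tempfile
from pathlib import Path


def create_data_dir_sketch(new_dir, base_dir=None):
    """pathlib variant; `base_dir` is a hypothetical parameter for testing."""
    base = Path(base_dir) if base_dir is not None else Path.home()
    data_dir = base / "earth-analytics" / "data" / new_dir
    # parents=True creates intermediate folders; exist_ok=True avoids an error
    # when the directory is already there.
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


with tempfile.TemporaryDirectory() as tmp:
    data_dir = create_data_dir_sketch("grasslands", base_dir=tmp)
    created = data_dir.is_dir()
    relative = data_dir.relative_to(tmp).as_posix()
```

Returning a `Path` rather than a string lets callers join file names with `/`, at the cost of differing from the string returned by the original function.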
Store Magic Data
Store Magic stores user data on demand: the magic command %store blah saves the object blah in a named file in ~/.ipython/profile_default/db/autorestore/, and %store -r blah retrieves it. This is useful for keeping data between sessions or projects.
The following code tries to retrieve the object buffalo_gdf if it was previously stored. The try statement checks whether buffalo_gdf exists; if it does not, a NameError exception is raised, which leads to the code that creates the object and stores it with %store.
%store -r buffalo_gdf
try:
    buffalo_gdf
except NameError:
    import geopandas as gpd
    # Assume `data_dir` is defined and `geojson` file is saved there.
    # Read all grasslands GeoJSON into `grassland_gdf`.
    grassland_url = f"{data_dir}/National_Grassland_Units_(Feature_Layer).geojson"
    grassland_gdf = gpd.read_file(grassland_url)
    # Subset to desired locations.
    buffalo_gdf = grassland_gdf.loc[grassland_gdf['GRASSLANDNAME'].isin(
        ["Buffalo Gap National Grassland", "Oglala National Grassland"])]
    %store buffalo_gdf
    print("buffalo_gdf created and stored")
else:
    print("buffalo_gdf retrieved from StoreMagic")
Cached Data via Decorator
The landmapy/cached.py decorator caches data in the jars directory ~/earth-analytics/data/jars/. The decorator @cached is used to cache the results of a function. See examples in clustering.qmd using functions in the landmapy/reflect.py module. Some explanation of decorators is in the next section. There is no need to use Store Magic with this decorator, as it already caches the data in the jars directory.
Decorators
Code for a caching decorator is in landmapy/cached.py, which you can use in your code. This decorator will pickle the results of running a do_something() function, and only run the code if the results do not already exist. To override the caching, for example temporarily after making changes to your code, set override=True. Note that to use the caching decorator, you must write your own function to perform each task. See examples in landmapy/delta.py and landmapy/reflectance.py.
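To make the idea concrete, here is a minimal sketch of a pickle-based caching decorator. It is not the code in landmapy/cached.py: the name `cached_sketch`, the `cache_dir` parameter, and the choice to have the wrapper consume `cache_key` are all assumptions made for illustration.

```python
import functools
import pickle
import tempfile
from pathlib import Path


def cached_sketch(func_key, override=False, cache_dir=None):
    """Hypothetical stand-in for a pickle-based caching decorator."""
    cache_dir = Path(cache_dir or tempfile.gettempdir())

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, cache_key="", **kwargs):
            # Build the pickle file name from the two keys.
            name = f"{func_key}_{cache_key}" if cache_key else func_key
            path = cache_dir / f"{name}.pickle"
            if path.exists() and not override:
                # Reuse the stored result instead of recomputing.
                with open(path, "rb") as file:
                    return pickle.load(file)
            result = func(*args, **kwargs)
            with open(path, "wb") as file:
                pickle.dump(result, file)
            return result
        return wrapper
    return decorator


calls = []  # records how often the function body really runs

with tempfile.TemporaryDirectory() as tmp:
    @cached_sketch("squares", cache_dir=tmp)
    def do_something(n):
        calls.append(n)
        return [i * i for i in range(n)]

    first = do_something(4, cache_key="n4")   # computed and pickled
    second = do_something(4, cache_key="n4")  # loaded from the pickle
```

The second call returns the pickled result without running the function body, which is why each cached task needs to be its own function.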
- Clustering Project
- Decorators in Python (Geeks4Geeks)
- Primer on Python Decorators (RealPython)
- PEP 318 – Decorators for Functions and Methods
- Python Decorators with Examples (Programiz)
- Practical Decorators by Reuven M Lerner
- Python Workout by Reuven M Lerner
- Decorators with Parameters (StackOverflow)
One way of thinking about decorators with arguments is
@decorator
def foo(*args, **kwargs):
    pass
translates to
foo = decorator(foo)
So if the decorator had arguments,
@decorator_with_args(arg)
def foo(*args, **kwargs):
    pass
translates to
foo = decorator_with_args(arg)(foo)
That is, decorator_with_args() is a function which accepts a custom argument and which returns the actual decorator (that will be applied to the decorated function).
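A runnable instance of this pattern may help. The decorator factory `repeat` and the functions `greet` and `greet_plain` are hypothetical names chosen for the example; the point is only that the `@` form and the explicit call give the same behavior.

```python
import functools


def repeat(times):
    """Hypothetical decorator factory: call the function `times` times."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Collect the result of each call in a list.
            return [func(*args, **kwargs) for _ in range(times)]
        return wrapper
    return decorator


@repeat(3)
def greet(name):
    return f"hello {name}"


def greet_plain(name):
    return f"hello {name}"


# Equivalent to the @ syntax above: first build the decorator, then apply it.
greet_manual = repeat(3)(greet_plain)
```

Using `functools.wraps` keeps the decorated function's name and docstring, which helps when debugging or generating documentation.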
A decorator with arguments can be used directly in a notebook or document. However, passing those arguments from within a module takes a bit more care. For instance, landmapy/reflect.py uses the @cached decorator from landmapy/cached.py to cache the results of a function. The original static use of the decorator was
from landmapy.cached import cached

@cached('wbd_08')
def read_wbd_file(wbd_filename, huc_level, cache_key):
    ...

def read_delta_gdf(huc_level=12, huc_region='08', watershed='080902030506'):
    wbd_gdf = read_wbd_file(
        f"WBD_{huc_region}_HU2_Shape", huc_level, cache_key=f'hu{huc_level}')
    ...
Note that read_delta_gdf() passes the keyword argument cache_key when calling the decorated function read_wbd_file(), so data are cached in the jars directory as f'wbd_08_hu{huc_level}.pickle'. The HUC level (default 12) can be changed, but the HUC region is fixed at '08' in the cache name. To make this more flexible, the code was changed as follows:
from landmapy.cached import cached

def read_wbd_file(wbd_filename, huc_level, cache_key,
                  func_key='wbd_08', override=False):
    # Apply the decorator at call time so its arguments can vary.
    @cached(func_key, override)
    def read_wbd_cached(wbd_filename, huc_level, cache_key):
        ...
    wbd_gdf = read_wbd_cached(wbd_filename, huc_level, cache_key=cache_key)
    return wbd_gdf

def read_delta_gdf(huc_level=12, huc_region='08', watershed='080902030506',
                   func_key='wbd_08', override=False):
    wbd_gdf = read_wbd_file(
        f"WBD_{huc_region}_HU2_Shape", huc_level,
        cache_key=f'hu{huc_level}',
        func_key=func_key, override=override)
    ...
The revised function read_delta_gdf() has two added arguments, func_key and override, which are passed through to the @cached decorator. In addition, read_wbd_file() is now an undecorated function that calls the internal decorated function read_wbd_cached(). The decorator @cached now sits inside read_wbd_file() and is called with the arguments func_key and override. The keyword argument cache_key is still set in read_delta_gdf() and, importantly, is passed in the call to the decorated function read_wbd_cached() from within read_wbd_file(). Data are now cached in the jars directory as f'{func_key}_hu{huc_level}.pickle', a name that changes with both the HUC level (default 12) and, through func_key, the HUC region.
Classes
A class can be thought of as a function that returns an object with its own methods, which are in turn functions defined in the class. In addition, the @property decorator defines computed attributes for an object. The main uses of classes are to:
- add functionality to an existing class
- streamline different functions with the same parameters to keep track of metadata
AI overview: In Python, a class serves as a blueprint for creating objects, which are instances that encapsulate data (attributes) and behavior (methods). Classes facilitate object-oriented programming (OOP) principles, enabling code reusability, modularity, and organization. A class is defined using the class keyword, followed by the class name and a colon. Inside the class block, attributes and methods are defined. The __init__ method is a special method, known as the constructor, which is automatically called when an object of the class is created to initialize the object's attributes.
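A short sketch ties these pieces together: `__init__` stores metadata once, a `@property` exposes a derived attribute, and a method reuses the stored values. The class `Watershed` and its members are hypothetical, loosely modeled on the HUC arguments used earlier on this page.

```python
class Watershed:
    """Hypothetical class that keeps HUC metadata together."""

    def __init__(self, huc_region, huc_level=12):
        # __init__ runs when Watershed(...) is called.
        self.huc_region = huc_region
        self.huc_level = huc_level

    @property
    def cache_key(self):
        # Read like an attribute: ws.cache_key, with no parentheses.
        return f"hu{self.huc_level}"

    def filename(self):
        # A method reuses the stored metadata instead of repeating arguments.
        return f"WBD_{self.huc_region}_HU2_Shape"


ws = Watershed("08")
key = ws.cache_key
name = ws.filename()
```

Grouping the parameters in one object is one way to avoid passing the same `huc_region` and `huc_level` through several functions.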