Project Design¶

Welcome to the design guide for geoenv! This page provides an overview of the architecture, components, and principles that shape the project. Our goal is to make geoenv intuitive to understand and easy to contribute to.

Have suggestions or questions? Open a GitHub issue. We’d love your feedback!

Project Overview¶

geoenv resolves geographic locations (as geometries) into meaningful environmental descriptions using spatial datasets 🌍 and semantic vocabularies 📖.

To make this possible, we designed geoenv around a clear set of priorities:

Functional Goals¶

Resolve spatial geometries to detailed environmental descriptions
Support key, high-value spatial data sources
Efficiently iterate over large numbers of geometries
Enable dynamic data source selection
Maintain traceability via location identifiers

Interoperability¶

Use open, widely adopted standards
Map terms to multiple ontologies and vocabularies
Output data in GeoJSON-compatible format
Convert to Science-On-Schema.Org Spatial Coverage

Efficiency¶

Process multiple data sources in parallel using asynchronous requests
Cache responses to avoid redundant queries
Cache data sources locally where possible

Sustainability¶

Use pluggable DataSource implementations
Promote community-driven growth

Architecture¶

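The system is composed of a few core classes that collaborate through clearly defined contracts. The architecture 🏗 follows the strategy pattern: each DataSource is an interchangeable strategy that the Resolver calls through a common interface, which keeps the system modular and extensible.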

Resolver¶

The Resolver is the main entry point. You pass in a Geometry and a list of DataSource instances, and get back a structured Response:

Calls all configured data sources concurrently using asynchronous I/O
Wraps results into Environment objects
Maps terms to ENVO by default
Returns a Response object with the result set
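The concurrent fan-out described above can be sketched with asyncio. This is a minimal illustration under stated assumptions, not geoenv's actual implementation: `FakeClimateSource`, `FakeSoilSource`, the `resolve` coroutine, and `resolve_all` are hypothetical stand-ins, and the sleep call stands in for a network request.

```python
import asyncio


class FakeClimateSource:
    """Hypothetical data source; a real one would issue an HTTP request."""

    name = "climate"

    async def resolve(self, geometry: dict) -> dict:
        await asyncio.sleep(0.01)  # stands in for network latency
        return {"dataSource": self.name, "properties": {"temperature": "Warm Temperate"}}


class FakeSoilSource:
    """Second hypothetical source, queried at the same time as the first."""

    name = "soil"

    async def resolve(self, geometry: dict) -> dict:
        await asyncio.sleep(0.01)
        return {"dataSource": self.name, "properties": {"moisture": "Dry"}}


async def resolve_all(geometry: dict, sources: list) -> list:
    """Query every source concurrently; results come back in source order."""
    return await asyncio.gather(*(source.resolve(geometry) for source in sources))


point = {"type": "Point", "coordinates": [-122.6, 37.9]}
environments = asyncio.run(resolve_all(point, [FakeClimateSource(), FakeSoilSource()]))
print([env["dataSource"] for env in environments])
```

Because `asyncio.gather` preserves the order of its arguments, each result can be matched back to the source that produced it.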

Response¶

The Response structures results as a GeoJSON Feature, with the environmental descriptions stored under properties.environment.

You can:

Map terms to other vocabularies
Convert the response to Science-On-Schema.Org
Save or load the result for reuse
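To make the conversion and save/load ideas concrete, here is a rough sketch for the simplest case, a Point. The helper name `to_spatial_coverage` is hypothetical, and the output shape (a schema.org Place with GeoCoordinates) is an assumption about what a Science-On-Schema.Org spatial coverage for a point could look like. It is not geoenv's actual output.

```python
import json


def to_spatial_coverage(feature: dict) -> dict:
    """Convert a GeoJSON Point Feature to a schema.org Place (hypothetical helper)."""
    geometry = feature["geometry"]
    if geometry["type"] != "Point":
        raise ValueError("This sketch only handles Point geometries")
    lon, lat = geometry["coordinates"][:2]  # GeoJSON order is [longitude, latitude]
    return {
        "@context": "https://schema.org/",
        "@type": "Place",
        "geo": {"@type": "GeoCoordinates", "latitude": lat, "longitude": lon},
    }


# Placeholder feature; the values are illustrative only.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-122.6, 37.9]},
    "properties": {"description": "placeholder", "environment": []},
}

# Round-trip through JSON text, as saving and loading a result would.
restored = json.loads(json.dumps(feature))
print(json.dumps(to_spatial_coverage(restored), indent=2))
```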

DataSource (ABC)¶

Defines the interface for any data source:

Standard methods and properties for consistency
Custom behaviors for data source-specific needs
May implement fallback behavior (e.g., point approximation for polygons)

Returns an Environment for each query.
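A contract of this kind could look roughly like the sketch below. All names here (`DataSourceSketch`, `PointOnlySource`, `supported_types`, `query`, `fallback`, `resolve`) are hypothetical and chosen for illustration. The real DataSource ABC may differ. The fallback shown approximates a polygon by the mean of its exterior-ring vertices.

```python
from abc import ABC, abstractmethod


class DataSourceSketch(ABC):
    """Hypothetical interface: shared resolve() logic plus source-specific hooks."""

    supported_types = ("Point", "Polygon")

    @abstractmethod
    def query(self, geometry: dict) -> dict:
        """Return raw environmental values for a supported geometry."""

    def fallback(self, geometry: dict) -> dict:
        """Default fallback: refuse geometries the source cannot handle."""
        raise ValueError(f"Unsupported geometry type: {geometry['type']}")

    def resolve(self, geometry: dict) -> dict:
        """Standard entry point: apply fallback if needed, then wrap the result."""
        if geometry["type"] not in self.supported_types:
            geometry = self.fallback(geometry)
        return {"type": "Environment", "properties": self.query(geometry)}


class PointOnlySource(DataSourceSketch):
    """Hypothetical source that only accepts points."""

    supported_types = ("Point",)

    def fallback(self, geometry: dict) -> dict:
        if geometry["type"] != "Polygon":
            return super().fallback(geometry)
        ring = geometry["coordinates"][0][:-1]  # drop the closing vertex
        x = sum(vertex[0] for vertex in ring) / len(ring)
        y = sum(vertex[1] for vertex in ring) / len(ring)
        return {"type": "Point", "coordinates": [x, y]}

    def query(self, geometry: dict) -> dict:
        lon, lat = geometry["coordinates"]
        return {"queried_at": [lon, lat]}


polygon = {
    "type": "Polygon",
    "coordinates": [[[0, 0], [4, 0], [4, 2], [0, 2], [0, 0]]],
}
environment = PointOnlySource().resolve(polygon)
print(environment)
```

Keeping `resolve` on the base class and the hooks on the subclass is one way to give every source the same entry point while leaving room for source-specific behavior.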

Environment¶

Encapsulates the returned values from a data source:

Lightweight, minimal post-processing
Includes original terms

Geometry¶

Handles all client-supplied geometries in GeoJSON:

Identifies type (Point, Polygon)
Converts points to polygons
Transforms to formats required by a data source

Supports GeoJSON Point and Polygon types for now, with plans for GeometryCollections.
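As an illustration of type detection and point-to-polygon conversion, the sketch below turns a point into a small square polygon. The function names and the `half_width` parameter are hypothetical. How geoenv actually sizes or shapes the polygon is not specified here.

```python
def geometry_type(geometry: dict) -> str:
    """Return the GeoJSON type, rejecting anything other than Point or Polygon."""
    kind = geometry.get("type")
    if kind not in ("Point", "Polygon"):
        raise ValueError(f"Unsupported geometry type: {kind!r}")
    return kind


def point_to_polygon(geometry: dict, half_width: float = 0.5) -> dict:
    """Build a square polygon centered on a point (half_width in coordinate units)."""
    if geometry_type(geometry) != "Point":
        raise ValueError("Expected a Point geometry")
    x, y = geometry["coordinates"][:2]
    ring = [
        [x - half_width, y - half_width],
        [x + half_width, y - half_width],
        [x + half_width, y + half_width],
        [x - half_width, y + half_width],
        [x - half_width, y - half_width],  # repeat the first vertex to close the ring
    ]
    return {"type": "Polygon", "coordinates": [ring]}


square = point_to_polygon({"type": "Point", "coordinates": [10.0, 20.0]})
print(square["coordinates"][0])
```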

Response Data Format¶

The output is a GeoJSON Feature with nested environmental data. 📦

Top Level:

type (string): always “Feature”
identifier (string): unique ID for the query
geometry (object): the original geometry
properties (object): extra metadata, including environments

Properties:

description (string): the geometry description
environment (array): the resolved environments

Environment Object:

type (string): always “Environment”
dataSource (object): ID and name of the source
dateCreated (string): timestamp of the query
properties (object): key/value pairs of environmental properties
mappedProperties (array): label/uri pairs for semantic mappings

Example

{
  "type": "Feature",
  "identifier": "...",
  "geometry": {...},
  "properties": {
    "description": "...",
    "environment": [
      {
        "type": "Environment",
        "dataSource": {
          "identifier": "...",
          "name": "..."
        },
        "dateCreated": "...",
        "properties": {
          "temperature": "Warm Temperate",
          "moisture": "Dry"
        },
        "mappedProperties": [
          {"label": "temperate", "uri": "..."},
          {"label": "arid", "uri": "..."}
        ]
      }
    ]
  }
}
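A client can walk this structure with plain dictionary access. The sketch below uses only the standard library. The identifier, source name, and URIs are placeholders, since the example above elides them, and `summarize` is a hypothetical helper.

```python
import json

# Placeholder document following the format above; elided values are stand-ins.
feature = json.loads("""
{
  "type": "Feature",
  "identifier": "placeholder-id",
  "geometry": {"type": "Point", "coordinates": [0, 0]},
  "properties": {
    "description": "placeholder description",
    "environment": [
      {
        "type": "Environment",
        "dataSource": {"identifier": "placeholder-source-id", "name": "Example Source"},
        "dateCreated": "placeholder-timestamp",
        "properties": {"temperature": "Warm Temperate", "moisture": "Dry"},
        "mappedProperties": [
          {"label": "temperate", "uri": "placeholder-uri-1"},
          {"label": "arid", "uri": "placeholder-uri-2"}
        ]
      }
    ]
  }
}
""")


def summarize(feature: dict) -> list:
    """Return (source name, original terms, mapped labels) for each environment."""
    rows = []
    for env in feature["properties"]["environment"]:
        labels = [mapping["label"] for mapping in env["mappedProperties"]]
        rows.append((env["dataSource"]["name"], env["properties"], labels))
    return rows


for source, terms, labels in summarize(feature):
    print(source, terms, labels)
```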

Semantic Mapping¶

We use SSSOM to link data source terminology to semantic vocabularies. 🧠

Mapping logic lives in Response.apply_term_mapping
Each data source has SSSOM files for each ontology/vocabulary
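To show the general idea, the sketch below reads a tiny SSSOM-style TSV and looks up a data source's original terms. It is not the logic in Response.apply_term_mapping. The functions `load_mappings` and `map_terms` are hypothetical, and the `SRC:` and `EX:` identifiers are placeholders rather than real ontology terms. The column names follow the SSSOM TSV convention.

```python
import csv
import io

# Tiny SSSOM-style table; lines starting with "#" carry the metadata header.
SSSOM_TSV = (
    "# curie_map:\n"
    "#   EX: http://example.org/vocab/\n"
    "subject_id\tsubject_label\tpredicate_id\tobject_id\tobject_label\tmapping_justification\n"
    "SRC:1\tWarm Temperate\tskos:closeMatch\tEX:0000001\ttemperate\tsemapv:ManualMappingCuration\n"
    "SRC:2\tDry\tskos:closeMatch\tEX:0000002\tarid\tsemapv:ManualMappingCuration\n"
)


def load_mappings(text: str) -> dict:
    """Index mappings by the data source's own label, skipping metadata lines."""
    rows = [line for line in io.StringIO(text) if not line.startswith("#")]
    reader = csv.DictReader(rows, delimiter="\t")
    return {
        row["subject_label"]: {"label": row["object_label"], "uri": row["object_id"]}
        for row in reader
    }


def map_terms(properties: dict, mappings: dict) -> list:
    """Return label/uri pairs for every property value that has a mapping."""
    return [mappings[value] for value in properties.values() if value in mappings]


mappings = load_mappings(SSSOM_TSV)
mapped = map_terms({"temperature": "Warm Temperate", "moisture": "Dry"}, mappings)
print(mapped)
```

Values without a mapping are skipped silently in this sketch. A real implementation might instead record them so that gaps in the mapping files are visible.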

Error Handling¶

Error Propagation¶

Errors are raised at the layer where they occur 🚨
Error messages always include actionable information ✅
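One way to follow these two principles is to catch a low-level failure in the layer that understands it and re-raise it with context. The sketch below is an assumption about how that could look: `DataSourceError`, `fetch`, and `query_source` are hypothetical names, and the URL and identifiers are placeholders.

```python
class DataSourceError(Exception):
    """Hypothetical error type raised by the data source layer."""


def fetch(url: str) -> dict:
    # Stand-in for an HTTP call that fails.
    raise TimeoutError("no response after 10 s")


def query_source(name: str, url: str, identifier: str) -> dict:
    """Translate a transport failure into an error that says what to do next."""
    try:
        return fetch(url)
    except TimeoutError as exc:
        raise DataSourceError(
            f"Data source '{name}' timed out for geometry '{identifier}'. "
            "Retry later or remove this source from the resolver."
        ) from exc


try:
    query_source("climate", "https://example.org/api", "site-001")
except DataSourceError as err:
    message = str(err)
    cause = err.__cause__  # the original TimeoutError stays attached
    print(message)
```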

Logging with daiquiri¶

Supports the DEBUG, INFO, WARNING, and ERROR levels
Logs include relevant metadata

Testing¶

We ensure test 🧪 coverage through:

Geometry tests – validation, conversions, type detection
DataSource tests – standard contract + edge cases
Response tests – semantic mapping and transformation checks
Mock tests – generated from real HTTP requests
Integration tests – Resolver end-to-end scenarios
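A mock test in this spirit replaces the network call with a previously recorded payload. The sketch below uses only unittest and unittest.mock. `TinySource`, its `_fetch` method, and the payload shape are hypothetical and do not reflect geoenv's test suite.

```python
import json
import unittest
from unittest import mock


class TinySource:
    """Hypothetical data source whose network call is isolated in _fetch."""

    def _fetch(self, geometry: dict) -> dict:
        raise RuntimeError("network access is disabled in tests")

    def resolve(self, geometry: dict) -> dict:
        payload = self._fetch(geometry)
        return {"type": "Environment", "properties": payload["attributes"]}


# Stand-in for a response recorded from a real HTTP request.
RECORDED = json.loads('{"attributes": {"moisture": "Dry"}}')
POINT = {"type": "Point", "coordinates": [0, 0]}


class TestTinySource(unittest.TestCase):
    def test_resolve_uses_recorded_payload(self):
        with mock.patch.object(TinySource, "_fetch", return_value=RECORDED):
            env = TinySource().resolve(POINT)
        self.assertEqual(env["properties"], {"moisture": "Dry"})

    def test_unmocked_fetch_fails_loudly(self):
        with self.assertRaises(RuntimeError):
            TinySource().resolve(POINT)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestTinySource)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```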

Adding a New Data Source¶

Data Source

Add a module under data_sources/
Register it in data_sources/__init__.py
Implement the DataSource ABC
Support all required geometry types
Document special behaviors or config options
Keep data source-specific utilities scoped to the module

Semantic Mappings

Create SSSOM files for your vocabularies
Follow filename conventions for discovery

Tests

Create mock geometries
Use create_mock_data.py to record responses
Add tests for both valid and invalid inputs
Test both expected and edge-case behavior

We’re building geoenv to be sustainable, useful, and open. Your input helps shape its future 💚