
Add bbox and datetime to /collections/{collectionId}/items #477


Merged (8 commits) on Nov 7, 2022
1 change: 1 addition & 0 deletions CHANGES.md
@@ -5,6 +5,7 @@
### Added

* Add support in pgstac backend for /queryables and /collections/{collection_id}/queryables endpoints with functions exposed in pgstac 0.6.8
* Add `bbox` and `datetime` query parameters to `/collections/{collection_id}/items`. [476](https://github.com/stac-utils/stac-fastapi/issues/476) [380](https://github.com/stac-utils/stac-fastapi/issues/380)
* Update pgstac requirement to 0.6.10

### Changed
3 changes: 3 additions & 0 deletions stac_fastapi/api/stac_fastapi/api/models.py
@@ -13,6 +13,7 @@
    APIRequest,
    BaseSearchGetRequest,
    BaseSearchPostRequest,
    str2list,
)


@@ -123,6 +124,8 @@ class ItemCollectionUri(CollectionUri):
    """Get item collection."""

    limit: int = attr.ib(default=10)
    bbox: Optional[str] = attr.ib(default=None, converter=str2list)

@m-mohr (Contributor) commented on Oct 14, 2022:

Would it make sense to convert this directly to e.g. Optional[List[float]] for bbox and Optional[List[Union[datetime, None]]] for datetime so that you don't need to handle the conversion issues downstream?
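
A minimal sketch of the kind of converter suggested here; the helper and class names are hypothetical, not code from this PR:

```python
from typing import List, Optional

import attr


def bbox_converter(value: Optional[str]) -> Optional[List[float]]:
    # Turn "minx,miny,maxx,maxy" into [minx, miny, maxx, maxy] as floats.
    if value is None:
        return None
    return [float(part) for part in value.split(",")]


@attr.s
class ItemCollectionUriSketch:
    """Hypothetical variant of ItemCollectionUri with a typed bbox."""

    collection_id: str = attr.ib()
    limit: int = attr.ib(default=10)
    bbox: Optional[List[float]] = attr.ib(default=None, converter=bbox_converter)
```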

PR author (Contributor):

Yes, I mentioned that in my comment above, in point 3.

I held off on changing it until I get some feedback, as the current behavior matches the similar routes.

PR author (Contributor):

datetime is a little more complicated, because it also supports ranges like 2020-01-01/2022-01-01 or 2020-01-01/.., so it must allow str input (or a clever pydantic type enforcing either datetime or something with a slash in it).
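
To illustrate the range handling being described, here is a rough sketch of parsing such a string into a pair of optional datetimes; this is an assumed helper, not what the PR implements:

```python
from datetime import datetime
from typing import Optional, Tuple


def parse_datetime_interval(value: str) -> Tuple[Optional[datetime], Optional[datetime]]:
    # Accepts "start/end", "start/..", "../end", or a single RFC 3339 datetime.
    def parse_one(part: str) -> Optional[datetime]:
        if part in ("", ".."):
            return None  # open end of the range
        return datetime.fromisoformat(part.replace("Z", "+00:00"))

    if "/" in value:
        start, end = value.split("/", 1)
        return parse_one(start), parse_one(end)
    exact = parse_one(value)
    return exact, exact  # single value treated as an exact instant


# parse_datetime_interval("2020-01-01T00:00:00Z/..") -> (datetime(2020, 1, 1, tzinfo=utc), None)
```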

Contributor:

Ah, I did not see that comment. Still, a list of datetime/None should work. None could be used for open date ranges, as is also specified for STAC extents in collections...

PR author (Contributor):

Oh, sorry, I misunderstood you. Yes, that makes sense, but as I've looked into this I've found some inconsistency.

For sqlalchemy, I added code based on the post_search implementation: it has some complicated logic for deciding how to treat the datetime parameter, and it treats a single datetime as an exact datetime. It's also more lenient about datetime formatting.

For pgstac, I just used post_request_model, which treats a single value as an end date and is stricter about enforcing RFC 3339.

So a few questions:

  1. Should we update the sqlalchemy post_search and item_collection routes to use post_request_model?
  2. Should a single datetime be considered as an end date? It seems much more useful (if less obvious) than the unlikely case you're able to hit on the exact datetime of an item...
  3. If the latter, datetime could be typed as (being pedantic): None | tuple[datetime, datetime] | tuple[datetime, None] | tuple[None, datetime]. This is all already validated in post_request_model (see the last link above). I think actually enforcing this typing would break the pgstac implementation, as it passes the raw datetime/datetime string to the DB's search function. (A sketch of this typing follows after this list.)
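
Spelled out, the typing proposed in point 3 could look roughly like this (an illustrative alias; the name is not from the codebase):

```python
from datetime import datetime
from typing import Optional, Tuple, Union

# None, a closed range, or a half-open range in either direction.
DateTimeInterval = Optional[
    Union[
        Tuple[datetime, datetime],  # start/end
        Tuple[datetime, None],      # start/..
        Tuple[None, datetime],      # ../end
    ]
]
```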

Collaborator:

> Should we update the sqlalchemy post_search and item_collection routes to use post_request_model?

I think the sqlalchemy backend would be better if, like pgstac, it had one function (_search_base) which contains most of the important query logic and is wrapped by all of the other methods.

> Should a single datetime be considered as an end date? It seems much more useful (if less obvious) than the unlikely case you're able to hit on the exact datetime of an item...

My interpretation of the OAF datetime parameter is that a single datetime is an exact datetime (==), while datetime ranges are either ../end_date or start_date/... I agree that == is arguably less useful when searching than a datetime range, but the distinction is there to let users do both.

> If the latter, datetime could be typed as (being pedantic): None | tuple[datetime, datetime] | tuple[datetime, None] | tuple[None, datetime]. This is all already validated in post_request_model (see the last link above). I think actually enforcing this typing would break the pgstac implementation, as it passes the raw datetime/datetime string to the DB's search function.

Lol, I should have read this thread first; I proposed the same thing in a PR comment below. We shouldn't be too concerned about breaking backend implementations with changes to the API layer as long as those changes make the API layer stronger and improve separation of concerns (I think here they do). Backend implementations shouldn't have to guess what information is included in a request; that information should be presented in a way that is clear and easy to use, and I'm not in love with the current solution for datetime. Pgstac can always coerce these back to a string (a rough sketch of that coercion follows below).
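
As a sketch of that last point, coercing a parsed (start, end) pair back into the interval string pgstac expects could look like this (hypothetical helper name, assuming the tuple typing discussed above):

```python
from datetime import datetime, timezone
from typing import Optional, Tuple


def interval_to_pgstac_string(interval: Tuple[Optional[datetime], Optional[datetime]]) -> str:
    # (start, None) -> "start/..", (None, end) -> "../end", (start, end) -> "start/end"
    def fmt(dt: Optional[datetime]) -> str:
        if dt is None:
            return ".."
        return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    start, end = interval
    return f"{fmt(start)}/{fmt(end)}"


# interval_to_pgstac_string((datetime(2020, 1, 1, tzinfo=timezone.utc), None)) -> "2020-01-01T00:00:00Z/.."
```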

    datetime: Optional[str] = attr.ib(default=None)


class POSTTokenPagination(BaseModel):
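
For context, str2list is the converter applied to bbox above; its behavior is presumably along these lines (a simplified sketch, not necessarily the library's exact implementation):

```python
from typing import List, Optional


def str2list(x: Optional[str]) -> Optional[List[str]]:
    # Split a comma-separated query-string value ("1,2,3,4") into a list of strings.
    if x:
        return x.split(",")
    return None
```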
17 changes: 16 additions & 1 deletion stac_fastapi/pgstac/stac_fastapi/pgstac/core.py
@@ -248,6 +248,8 @@ async def _get_base_item(collection_id: str) -> Dict[str, Any]:
    async def item_collection(
        self,
        collection_id: str,
        bbox: Optional[List[NumType]] = None,
        datetime: Optional[Union[str, datetime]] = None,
        limit: Optional[int] = None,
        token: str = None,
        **kwargs,
@@ -267,8 +269,21 @@ async def item_collection(
        # If collection does not exist, NotFoundError will be raised
        await self.get_collection(collection_id, **kwargs)

        base_args = {
            "collections": [collection_id],
            "bbox": bbox,
            "datetime": datetime,
            "limit": limit,
            "token": token,
        }

        clean = {}
        for k, v in base_args.items():
            if v is not None and v != []:
                clean[k] = v

        req = self.post_request_model(
-           collections=[collection_id], limit=limit, token=token
+           **clean,
        )
        item_collection = await self._search_base(req, **kwargs)
        links = await CollectionLinks(
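
With this in place, the new query parameters can be exercised end-to-end; a hypothetical request against a locally running instance (base URL and collection id are placeholders) might look like:

```python
import httpx

resp = httpx.get(
    "http://localhost:8080/collections/my-collection/items",
    params={
        "bbox": "100,-50,170,-20",
        "datetime": "2020-01-01T00:00:00Z/..",
        "limit": 10,
    },
)
resp.raise_for_status()
print(len(resp.json()["features"]))  # items matching both the spatial and temporal filters
```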
48 changes: 48 additions & 0 deletions stac_fastapi/pgstac/tests/api/test_api.py
@@ -417,6 +417,54 @@ async def test_collection_queryables(load_test_data, app_client, load_test_colle
assert "eo:cloud_cover" in q["properties"]


@pytest.mark.asyncio
async def test_item_collection_filter_bbox(
    load_test_data, app_client, load_test_collection
):
    coll = load_test_collection
    first_item = load_test_data("test_item.json")
    resp = await app_client.post(f"/collections/{coll.id}/items", json=first_item)
    assert resp.status_code == 200

    bbox = "100,-50,170,-20"
    resp = await app_client.get(f"/collections/{coll.id}/items", params={"bbox": bbox})
    assert resp.status_code == 200
    resp_json = resp.json()
    assert len(resp_json["features"]) == 1

    bbox = "1,2,3,4"
    resp = await app_client.get(f"/collections/{coll.id}/items", params={"bbox": bbox})
    assert resp.status_code == 200
    resp_json = resp.json()
    assert len(resp_json["features"]) == 0


@pytest.mark.asyncio
async def test_item_collection_filter_datetime(
    load_test_data, app_client, load_test_collection
):
    coll = load_test_collection
    first_item = load_test_data("test_item.json")
    resp = await app_client.post(f"/collections/{coll.id}/items", json=first_item)
    assert resp.status_code == 200

    datetime_range = "2020-01-01T00:00:00.00Z/.."
    resp = await app_client.get(
        f"/collections/{coll.id}/items", params={"datetime": datetime_range}
    )
    assert resp.status_code == 200
    resp_json = resp.json()
    assert len(resp_json["features"]) == 1

    datetime_range = "2018-01-01T00:00:00.00Z/2019-01-01T00:00:00.00Z"
    resp = await app_client.get(
        f"/collections/{coll.id}/items", params={"datetime": datetime_range}
    )
    assert resp.status_code == 200
    resp_json = resp.json()
    assert len(resp_json["features"]) == 0


@pytest.mark.asyncio
async def test_bad_collection_queryables(
    load_test_data, app_client, load_test_collection
49 changes: 44 additions & 5 deletions stac_fastapi/sqlalchemy/stac_fastapi/sqlalchemy/core.py
@@ -99,25 +99,64 @@ def get_collection(self, collection_id: str, **kwargs) -> Collection:
        return self.collection_serializer.db_to_stac(collection, base_url)

    def item_collection(
-       self, collection_id: str, limit: int = 10, token: str = None, **kwargs
+       self,
+       collection_id: str,
+       bbox: Optional[List[NumType]] = None,
+       datetime: Optional[str] = None,
+       limit: int = 10,
+       token: str = None,
+       **kwargs,
    ) -> ItemCollection:
        """Read an item collection from the database."""
        base_url = str(kwargs["request"].base_url)
        with self.session.reader.context_session() as session:
-           collection_children = (
+           query = (
                session.query(self.item_table)
                .join(self.collection_table)
                .filter(self.collection_table.id == collection_id)
                .order_by(self.item_table.datetime.desc(), self.item_table.id)
            )
            # Spatial query
            geom = None
            if bbox:
                bbox = [float(x) for x in bbox]
                if len(bbox) == 4:
                    geom = ShapelyPolygon.from_bounds(*bbox)
                elif len(bbox) == 6:
                    # Shapely doesn't support 3d bounding boxes, so use the 2d portion
                    bbox_2d = [bbox[0], bbox[1], bbox[3], bbox[4]]
                    geom = ShapelyPolygon.from_bounds(*bbox_2d)
            if geom:
                filter_geom = ga.shape.from_shape(geom, srid=4326)
                query = query.filter(
                    ga.func.ST_Intersects(self.item_table.geometry, filter_geom)
                )

            # Temporal query
            if datetime:
                # Two tailed query (between)
                dts = datetime.split("/")
                # Non-interval date ex. "2000-02-02T00:00:00.00Z"
                if len(dts) == 1:
                    query = query.filter(self.item_table.datetime == dts[0])
                # is there a benefit to between instead of >= and <= ?
                elif dts[0] not in ["", ".."] and dts[1] not in ["", ".."]:
                    query = query.filter(self.item_table.datetime.between(*dts))
                # All items after the start date
                elif dts[0] not in ["", ".."]:
                    query = query.filter(self.item_table.datetime >= dts[0])
                # All items before the end date
                elif dts[1] not in ["", ".."]:
                    query = query.filter(self.item_table.datetime <= dts[1])

            count = None
            if self.extension_is_enabled("ContextExtension"):
-               count_query = collection_children.statement.with_only_columns(
+               count_query = query.statement.with_only_columns(
                    [func.count()]
                ).order_by(None)
-               count = collection_children.session.execute(count_query).scalar()
+               count = query.session.execute(count_query).scalar()
            token = self.get_token(token) if token else token
-           page = get_page(collection_children, per_page=limit, page=(token or False))
+           page = get_page(query, per_page=limit, page=(token or False))
            # Create dynamic attributes for each page
            page.next = (
                self.insert_token(keyset=page.paging.bookmark_next)
46 changes: 46 additions & 0 deletions stac_fastapi/sqlalchemy/tests/api/test_api.py
@@ -434,3 +434,49 @@ def test_app_search_response_duplicate_forwarded_headers(
    for feature in resp.json()["features"]:
        for link in feature["links"]:
            assert link["href"].startswith("https://testserver:1234/")


def test_item_collection_filter_bbox(load_test_data, app_client, postgres_transactions):
    item = load_test_data("test_item.json")
    collection = item["collection"]
    postgres_transactions.create_item(
        item["collection"], item, request=MockStarletteRequest
    )

    bbox = "100,-50,170,-20"
    resp = app_client.get(f"/collections/{collection}/items", params={"bbox": bbox})
    assert resp.status_code == 200
    resp_json = resp.json()
    assert len(resp_json["features"]) == 1

    bbox = "1,2,3,4"
    resp = app_client.get(f"/collections/{collection}/items", params={"bbox": bbox})
    assert resp.status_code == 200
    resp_json = resp.json()
    assert len(resp_json["features"]) == 0


def test_item_collection_filter_datetime(
    load_test_data, app_client, postgres_transactions
):
    item = load_test_data("test_item.json")
    collection = item["collection"]
    postgres_transactions.create_item(
        item["collection"], item, request=MockStarletteRequest
    )

    datetime_range = "2020-01-01T00:00:00.00Z/.."
    resp = app_client.get(
        f"/collections/{collection}/items", params={"datetime": datetime_range}
    )
    assert resp.status_code == 200
    resp_json = resp.json()
    assert len(resp_json["features"]) == 1

    datetime_range = "2018-01-01T00:00:00.00Z/2019-01-01T00:00:00.00Z"
    resp = app_client.get(
        f"/collections/{collection}/items", params={"datetime": datetime_range}
    )
    assert resp.status_code == 200
    resp_json = resp.json()
    assert len(resp_json["features"]) == 0
16 changes: 14 additions & 2 deletions stac_fastapi/types/stac_fastapi/types/core.py
@@ -489,7 +489,13 @@ def get_collection(self, collection_id: str, **kwargs) -> stac_types.Collection:

    @abc.abstractmethod
    def item_collection(
-       self, collection_id: str, limit: int = 10, token: str = None, **kwargs
+       self,
+       collection_id: str,
+       bbox: Optional[List[NumType]] = None,
+       datetime: Optional[Union[str, datetime]] = None,
+       limit: int = 10,
+       token: str = None,
+       **kwargs,
    ) -> stac_types.ItemCollection:
        """Get all items from a specific collection.

@@ -684,7 +690,13 @@ async def get_collection(

    @abc.abstractmethod
    async def item_collection(
-       self, collection_id: str, limit: int = 10, token: str = None, **kwargs
+       self,
+       collection_id: str,
+       bbox: Optional[List[NumType]] = None,
+       datetime: Optional[Union[str, datetime]] = None,
+       limit: int = 10,
+       token: str = None,
+       **kwargs,
    ) -> stac_types.ItemCollection:
        """Get all items from a specific collection.
