Skip to content

Add bbox and datetime to /collections/{collectionId}/items #477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Nov 7, 2022

Conversation

carderne
Copy link
Contributor

@carderne carderne commented Oct 7, 2022

Related Issue(s):

Description:
Add bbox and datetime query parameters to the /collections{collection_id}/items route.
Updated base classes, request models and pgstac and sqlalchemy clients.

PR Checklist:

  • Code is formatted and linted (run pre-commit run --all-files)
  • Tests pass (run make test)
  • Documentation has been updated to reflect changes, if applicable, and docs build successfully (run make docs)
  • Changes are added to the CHANGELOG.

Imported str2list function from stac_fastapi.types.search
as already have dependency on it.
Using same type signature as get_search
Used the same base_args -> clean from the search routes
Probably not necessary in this situation but thought
I'd keep it similar (and extensible).
@carderne carderne force-pushed the master branch 2 times, most recently from debac82 to 158e2c4 Compare October 7, 2022 13:20
@carderne
Copy link
Contributor Author

carderne commented Oct 8, 2022

Some comments:

  1. Updated the pgstac code to use the base_args dict approach from the search route. Bit of duplication but it seems okay.
  2. For sqlalchemy, I did the same thing with the filtering code from the search route. Could be worth pulling some of that filtering code out into two separate functions, as it's pretty self-contained?
  3. Haven't added any tests, but could add some similar to the search tests, just checking bbox and datetime functionality? Added some basic tests.

Then three separate points that probably need attention:

  1. There's some type inconsistency in datetime parameters. In the type defs it's str but in the clients, it's Union[str, datetime]. This causes a type issue here calling .split() on it. It should probably just be str everywhere, as it's supposed to allow /-separated ranges. Could also type it more cleverly and check before spliting.
  2. datetime parameters shadow from datetime import datetime eg here. Should probably rename to datetime_range or something.
  3. bbox is alternately types as List[NumType], str and BBox. Should probably be BBox everywhere, and the str2list converter could be a bit more aggressive in converting it into a BBox (a tuple of floats, not just a list of anything). For now I did this but it's not great.

@carderne carderne marked this pull request as ready for review October 8, 2022 10:59
@@ -123,6 +124,8 @@ class ItemCollectionUri(CollectionUri):
"""Get item collection."""

limit: int = attr.ib(default=10)
bbox: Optional[str] = attr.ib(default=None, converter=str2list)
Copy link
Contributor

@m-mohr m-mohr Oct 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to convert this directly to e.g. Optional[List[float]] for bbox and Optional[List[Union[datetime, None]]] for datetime so that you don't need to handle the conversion issues downstream?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I mentioned that in my comment above point 3.

I held off on changing it until I get some feedback, as that is the current behavior in the similar routes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

datetime is a little more complicated, because it also supports ranges like 2020-01-01/2022-01-01 or 2020-01-01/.., so it must allow str input (or a clever pydantic type enforcing either datetime or something with a slash in it).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I did not see that comment. Still, a list of datetime/None should work. None could be used for open date ranges as specified also for STAC extents in collections...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry I misunderstood you. Yes that makes sense, but as I've looked into this I've found some inconsistency.

For sqlalchemy, I added code based on the post_search implementation: it has some complicated logic on deciding how to treat the datetime parameter, and treats a single datetime as an exact datetime. It's also more lenient on datetime formatting.

For pgstac, I just used post_request_model, which treats a single value as an end-date, and is stricter on enforcing RFC3339.

So a few questions:

  1. Should we update the sqlalchemy post_search and item_collection routes to use post_request_model?
  2. Should a single datetime be considered as an end date? It seems much more useful (if less obvious) than the unlikely case you're able to hit on the exact datetime of an item...
  3. If the latter, datetime could be typed as (being pedantic): None | tuple[datetime, datetime] | tuple[datetime, None] | tuple[None, datetime]. This is all already validated in post_request_model (see the last link above). I think actually enforcing this typing would break the pgstac implementation, as it passes the raw datetime/datetime string to the DB's search function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we update the sqlalchemy post_search and item_collection routes to use post_request_model?

I think the sqlalchemy backend would be better if, like pgstac, it had one function (_search_base) which contains most of the important query logic and is wrapped by all of the other methods.

Should a single datetime be considered as an end date? It seems much more useful (if less obvious) than the unlikely case you're able to hit on the exact datetime of an item...

My interpretation of the OAF datetime parameter is a single date time is an exact date time (==) while date time ranges are either ../end_date or start_date/... I agree that == is arguably less useful when searching than a datetime range but the distinction is there to let users do both.

If the latter, datetime could be typed as (being pedantic): None | tuple[datetime, datetime] | tuple[datetime, None] | tuple[None, datetime]. This is all already validated in post_request_model (see the last link above). I think actually enforcing this typing would break the pgstac implementation, as it passes the raw datetime/datetime string to the DB's search function.

Lol I should have read this thread first, I proposed the same thing in PR comment below. We shouldn't be too concerned about breaking backend implementations w/ changes to the API layer as long as those changes make the API layer stronger and improve separation of concerns (I think here they do). Backend implementations shouldn't have to guess what information is included in a request, that information should be presented in a way that is clear and easy to use and I'm not in love with the current solution for datetime. Pgstac can always coerce these back to a string.

@geospatial-jeff
Copy link
Collaborator

Sorry its taken me a while to get around to this.

There's some type inconsistency in datetime parameters. In the type defs it's str but in the clients, it's Union[str, datetime]. This causes a type issue here calling .split() on it. It should probably just be str everywhere, as it's supposed to allow /-separated ranges. Could also type it more cleverly and check before spliting.

I agree that the Union[str, datetime] is messy. I think str is more ambiguous because the spec allows both single dates and date time ranges. If we were to make changes to datetime handling I would prefer for the API layer to coerce the datetime parameters to a tuple of datetime objects. Maybe (None, datetime.datetime), (datetime.datetime, None) for single tailed date ranges and (datetime.datetime, datetime.datetime) for two tailed date ranges.

datetime parameters shadow from datetime import datetime eg here. Should probably rename to datetime_range or something.

Parameters aim to use the same language as what is in the API spec for clarity. I'm fine changing this but I don't think its too straight forward. Pydantic does support aliasing but I don't believe attrs does.

bbox is alternately types as List[NumType], str and BBox. Should probably be BBox everywhere, and the str2list converter could be a bit more aggressive in converting it into a BBox (a tuple of floats, not just a list of anything). For now I did this but it's not great.

I agree with all of this, bbox should be consistently typed as BBox and str2list should be more aggressive.

@geospatial-jeff geospatial-jeff self-requested a review October 17, 2022 03:04
Copy link
Collaborator

@geospatial-jeff geospatial-jeff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, thanks for the PR! We can continue discussion in this PR, but if its okay with you @carderne I think any other changes should become new issues / new PRs.

@carderne
Copy link
Contributor Author

carderne commented Oct 17, 2022

Sounds good!

To sort-of summarise the other issues:

  1. Coerce datetime to a well-typed tuple at the API surface, fix in the backends.
  2. Same for bbox.
  3. Datetime treatment should follow the spec. Single date is ==; intervals don't need the .. before/after the /.
  4. Update sqlalchemy backend to use a single search route.
  5. datetime shadowing. Easiest way would be to from datetime import datetime as dt_lib (or something less horrible). As it is now, this if statement (and probably others) is broken.

@carderne
Copy link
Contributor Author

@geospatial-jeff I don’t think I can merge this

@carderne
Copy link
Contributor Author

Fixed conflicts introduced by #482.

@carderne
Copy link
Contributor Author

carderne commented Nov 5, 2022

Fixed the linting issue from the merge above

@carderne
Copy link
Contributor Author

carderne commented Nov 5, 2022

@geospatial-jeff @bitner

Just bumping, any reason not to merge?

Copy link
Collaborator

@bitner bitner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, didn't see this until just now. Looks good to me.

@bitner bitner merged commit 06218c5 into stac-utils:master Nov 7, 2022
@carderne carderne mentioned this pull request Nov 9, 2022
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants