-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Add GeoJson style map to counter style duplication #1046
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GeoJson style map to counter style duplication #1046
Conversation
@sgvandijk thanks for putting together this PR. The bloated styling embedded in GeoJson is something I've wanted to address for a while too, but hadn't thought of a good way to approach it - I was initially trying to think of a way to directly translate a python function -> JS function. I think your style dictionary mapper approach is really good. I have a couple of additional ideas on possible functionality.
Maybe wrap this ^ into a function so that the how I could envision it looking:
Let me know what you think about these ideas. Thanks for your contribution to the project! |
Thanks for the feedback @jtbaker! I'll address your points:
I also didn't like this, but it was because I wanted to make it so that a feature's ID is used by default to determine the value, unless a property is given, so the default value is
I'm hesitant about this, at least for this PR. I feel this would allow the user to break things too easily by providing some operator that is not valid (in JS), or makes the output more difficult to predict; for instance if
So if the value is greater or equal than 40, the other At the moment it is always possible to encode the more complex logic in an additional property (like I did in the notebook) and use equality on this property. I like it that that way it's all explicit on the Python side. I also have an idea how to more easily support different common style functions, by using different objects, I think I will build that out as an example here.
That's a good idea, this can be easily done on the JS side, I will add that! Regarding your comment about
|
I have broken up the class into a base and derived class, with another derived class, I thought about introducing more examples, such as a binning function that could be used for defining styles for ranges of values to implement @jtbaker's example, but decided that would be too much feature creep for this PR. I have also added docstrings and basic tests, please have a look and let me know any further feedback! |
Updated to fix merge conflicts. Possibly something for @talbertc-usgs and @Conengmo to look at too, because there was a conflict with 75a44c3, which has new features, cleanups and stylistic changes squashed together into one commit. Tests and notebooks seem fine, but it's more difficult to see what all changes are for and whether I've resolved conflicts correctly. |
Hi @sgvandijk, thanks for making this PR! I have some comments I hope we can discuss.
You're right, the current way is not very smart. But it's not that big of a problem I would say. I think usability is more important than filesize, so I'm critical of adding new ways of working, new syntax. Preferable there is only one way to style a GeoJson object. I think the current way a user provides a (lambda) style function is pretty straight forward, so I wouldn't want to make that more complicated without good reason. What would be awesome if we could make embedding styling in the html more efficient without placing the burden of that on the user. Can we run the function a users passes to the
So in short I'm certainly not against this, but I want to keep it simple for the user (and ourselves as maintainers). I hope to hear your thoughts on this. |
Hi @Conengmo, thanks for the input, I think those are indeed good points. I also didn't want to take away the very convenient option to just pass in a lambda, this PR leaves that intact so there is no added complexity for those who don't want it. My aim is only to give more options to those who do care about the output size, like in our case where we regularly create and store maps from GeoJson with 1000s of features with a range of style properties set. This really adds to the size, into the megabytes; in the very simple US states example in the GeoJson example notebook, using I think your idea of creating a map automatically would be pretty cool. The work in this PR supports that as it captures some of those steps already. I would not add that to this PR though. As a final note, part of my motivation for this work is also to be able to reinstate the ability to load GeoJson from an external source (in a separate PR), rather than having to embed it. With Related to that, having a separate class (hierarchy) responsible for styling, away from the main Let me know what you think! |
That's a more important goal than skimming of style declaration size IMO. They are related though; if we can keep the style out of the geojson data, we don't need to embed the geojson data. (Except for in Jupyter notebooks.)
I feel that the way the PR is now the added complexity doesn't outweigh the benefits. For me that's mainly due to 1. the new styling syntax and 2. the added classes. I think changing or expanding the syntax is a no-go for me unless really necessary or warranted. If we agree that the goal is to have html files without embedded geojson data, than I'd rather regard this problem as a whole:
So I would like to invite you to think about how we can achieve 1 (and optionally 3 at the same time) without changing the syntax for the users. I appreciate you taking the time to help discuss this topic, getting this right is maybe not as straight forward as a simple plugin but will be rewarding I think. |
Thanks again for the feedback. I definitely see your point about aiming to minimise complexity. However. I'm not sure I really see the difference between this and the new GeoJsonToolTip for instance, where for both you can still pass in a simple lambda/string, or you can use the bit more complex classes to achieve more complex things. Maybe because the added tool tip complexity adds more visual features? In our use cases, where we use folium to do more than show a one-off map for a single user in Jupyter, significantly reducing data bloat is at least as relevant as more nice visual features. But if that's not a priority for this project for now, that's fine. Though if you agree being able to avoid embedding is a priority, I again think you can't do without these extensions. Because I don't think you will be able to generate HTML files without embedding while still be able to define styling without changing the syntax: if you restrict the user to only pass in a function you will always need to fetch the data in Python to create the right function in JS, which would defeat the purposes of not having the data embedded; if you only can pass in styling that is a function of the data, they cannot be separate and you won't achieve 1. Except maybe if you a) transpile the Python function to JS which is a nonstarter I'd think b) define a subset of functions that can generate the JS function in a deterministic way without having to see the data, which I've gone for, for which I don't see how you can get around having to extend the syntax. I'd be very happy to be proven wrong though if you can think of a smart way to achieve it still :) |
Definitely a goal! Not first priority though, but important, let’s do this. Your comment is very helpful. Your premise is that not embedding the data means not seeing it at all. I don’t agree with that, I think we should always see the data, even when we don’t embed it in the html. It’s possible, load the data in Python but don’t put it in the html. The reason is that we should do all our parsing and error checking in Python, that’s where our users are and that’s where we can provide meaningful error messages. A lot of the issues here and on other channels are from users seeing blank maps, not knowing how to debug the JavaScript. I want to prevent that as much as possible, also to reduce the strain on the team having to handle the issues. That means checking whether a field name provided by a user actually matches the geojson data. We’ve not done that fully in Choropleth for example and you can see a lot of questions about that on Stackoverflow. So what I’m trying to say is that I think it’s a given we should see the GeoJson data in Python, even when not embedding in the html. It may be more efficient not to load the data, but I think a more important goal is to have any rendered html be error free, so all errors are shown to the user in Python. I hope to hear your thoughts on this! I’ve also worked on a demonstrator which I’ll share later, might be interesting to discuss it. |
Fair enough, that's a good reason yeah, and can definitely understand your point as a maintainer to make it impossible to use things incorrectly. Our priorities are a bit different, where we'd like the instance running the Python app generating the map not having to be granted access to the datasource, or where generating the GeoJson is quite complex, so we want to present the map as fast as possible with other info and have the GeoJson loaded separately. I will have a look at putting this kind of things into a separate plugin, |
Thanks for understanding. Can I ask what your use case is exactly? Interesting to know what folium is used for in practice. It would be cool if we can accomodate more advanced usage as well. But I wonder, if speed is really important and you don’t have your data in Python, why not use Leaflet natively in JavaScript. You of course have your reasons so I’m not judging. Interesting idea to create a sort of geojson class for experts without error checking, we could consider that. As a plugin, or possibly in the core. Alternatively, we could consider adding a sort of “don’t do any checks” flag for the regular geojson class, after we applied the style embedding and data embedding changes. But whether that’s possible depends on how the code turns out. |
53546b8
to
9f2299a
Compare
Closing this since we tackled the style duplication and enabling not embedding geojson data in #1058. Note that we chose to always load the geojson data in Python for error checking. If your use case requires speed, I reckon you can put together a custom class by inheriting from |
I saw the work on on the deduplication, looks nice 👍 :)
Not meaning to drag this out further :) but this PR did do just that, it has a class based on |
Often when I use a GeoJson, feature styling is based on a categorical/ordinal property with only a small number of distinct values. Currently when supplying a style function, a style dictionary is added to each feature, which possibly results in a lot of repetition and unnecessarily large output, especially when the GeoJson contains many more features than number of styles.
This PR introduces the
GeoJsonStyleMap
class to help combat that: it takes a dictionary with styles and creates a JavaScript style function that applies the styles on the client side, rather than adding the styles to each feature beforehand.I still want to add some better input handling, tests, proper docstring, etc, but already wanted to put this forward to see if the approach is acceptable and get any feedback early :) it was inspired by the
GeoJsonToolTip
implementation.