Skip to content

Update files to chunk #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 27, 2024
Merged

Conversation

justmobilize
Copy link

No description provided.

if isinstance(boundary_object, str):
self._send_as_bytes(socket, boundary_object)
else:
chunk_size = 32
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is assuming the file is opened in binary mode, should probably add something to allow both

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I assumed that reading in bytes mode was a requirement because all of the examples I found and referenced used that.

Looking further into it now I found that the requests docs here https://requests.readthedocs.io/en/latest/user/quickstart/#post-a-multipart-encoded-file contain a red warning that binary mode is strongly recommended.
image

I checked a few of the top results on google for "requests upload file" and all of the ones I saw used rb mode even a few that were .txt and .csv which could have supported text mode.

While it does seem that CPython can work with a simple example of a .txt file using mode "r", I think I'm okay just not supporting that in the circuitpython version. It'll keep the code a little leaner and less complex, plus will force the adoption of the "strong recommendation" that requests documentation makes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. I was going to look into this one you reviewed... We should still see if we can detect and error if it's not binary

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume it would result in some kind of encoding error, but I'm certain and don't know where it might raise from off the top of my head.

It should also be possible to detect the difference with type() (or isinstance())

>>> f = open("something.txt", "rb")
>>> type(f)
<class 'FileIO'>
>>> f.close()
>>> f = open("something.txt", "r")
>>> type(f)
<class 'TextIOWrapper'>
>>> f.close()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read the first byte. Seemed the safest way, incase someone built a different file handler class

@@ -399,10 +447,31 @@ def _send(socket: SocketType, data: bytes):
# Not EAGAIN; that was already handled.
raise OSError(errno.EIO)
total_sent += sent
return total_sent
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was missing

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it related to the rest of these changes or just something you noticed while working on this?

Is there some reference somewhere that documents it's need to return here? I only had a quick look but I didn't see any spots where self._send() is called that do anything with a return value from it so I'm guessing it won't make any actual difference as it is used now?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's won't. When I was building this out, I had a flow I needed to know how much was sent. I figured out another way, but this should still be there.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reference to somewhere that shows it should be there? like a link to the same thing inside of cpython requests, or any documentation somewhere? I'm not really doubting you, but I don't have any real insight into it so I have no way to really know whether it should or shouldn't.

In any case if it is added, the type annotations should be updated to show the return type ideally I think.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will remove for now

@@ -46,6 +46,8 @@

from adafruit_connection_manager import get_connection_manager

SEEK_END = 2
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is os.SEEK_END in CPython


for field_name, field_values in files.items():
file_name = field_values[0]
file_data = field_values[1]
Copy link
Owner

@FoamyGuy FoamyGuy Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the name file_data makes me think that it's going to contain the actual data from the file rather than the FileIO stream object thing (not certain of the proper name) that allows you to read() the data, but isn't the actual data itself.

In the original code I think I was using file_handle for this concept which fits closer in my mind to what the variable is, but there may be something better than that as well.

In any case, I'm in favor of something other than _data suffix for this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, it's either a string or a handle... I hate naming. If everything else is good, I'll thing through this as I add tests.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's true. I was thinking only about the file case, not the "simple value" case with just a string.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only other bit I had a question about was the return that was added to _send() I'm not opposed to it, but want to understand more about it. Everything else here is looking good to me.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll start adding some tests!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay CPython Requests, uses fh for file_handle. I'll set it to that

@justmobilize justmobilize requested a review from FoamyGuy April 22, 2024 17:11
@justmobilize justmobilize marked this pull request as ready for review April 22, 2024 17:11
@justmobilize
Copy link
Author

@FoamyGuy any other changes you would like to see?

@FoamyGuy FoamyGuy mentioned this pull request Apr 27, 2024
@FoamyGuy FoamyGuy merged commit a2b43ac into FoamyGuy:files_arg Apr 27, 2024
b"s\x00\x00\x00\x01sRGB\x00\xae\xce\x1c\xe9\x00\x00\x00\x04gAMA\x00\x00\xb1\x8f\x0b\xfca\x05\x00\x00"
),
mock.call(
b"\x00\tpHYs\x00\x00\x0e\xc3\x00\x00\x0e\xc3\x01\xc7o\xa8d\x00\x00\x00\x10IDAT\x18Wc\xf8\xcf"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand what these call(b"big string of bytes") are doing or what the bytes are and I'm a little hesitant about merging tests that pass weird bytestrings to call() methods without having a better understanding of what they are or what it does functionally.

I guess maybe the PNG data based on that appearing in the first one, but I'm not sure. If that is the case I'm in favor of maybe using PIL or something that is comparing image file to image file ideally with less need for the "magic" bytestrings hardcoded into the test.

You can submit these back as a PR to the main repo if you want.

@FoamyGuy
Copy link
Owner

Everything in the library and example were all looking good to me, I did very minimal but still successful testing.

I didn't understand some of the stuff in the new test and was hesitant about merging it so I made #2 as a branch from this that removed that test for now. You can re-submit that to the main repo if you'd like.

@justmobilize
Copy link
Author

@FoamyGuy sounds good. I will reopen once merged. What it does is validates that the data sent matches what would be sent from CPython. This makes sure that if someone makes a change (like adding a space or extra line feed, when updating logic) that it doesn't break anything.

@FoamyGuy
Copy link
Owner

@justmobilize okay, thank you. A few questions to follow up from that:

  • Since the test is going to be running in CPython and has access to CPython requests, instead of having a sample request script commented out in the test, could the test actually create a real CPYthon requests request, and then compare the contents that it generates against the contents that are generated by adafruit_requests rather than having the CPython one be a comment and the test looking for hardcoded strings that were created originally by the commented CPython script? If we ran that CPYthon version "live" it would mean we would be sure to notice any changes made to the format that CPython uses vs. ours, whereas if the content of the request is hardcoded we would only notice that if/when someone runs the commented out code and then notices any discrepency vs. the hardcoded strings.
  • If that is not possible (or I've completely misunderstood what it's doing) then my next question is if we do end up having to keep the hardcoded bytestrings to compare to, could we use a .txt file with plain text inside of it instead of a .png or any other binary formatted file? That way the hardcoded strings can be visible ASCII characters that look more like a "normal" string than the seemingly random / magical bytes that make up the image file. Since the file is being opened as rb they'll still be bytestrings instead of "normal" strings but the contents of them will be visible ASCII perhaps even some "This file is for testing purposes" type message that makes clear what it's for. If I am understanding correct I think using a text file would confirm the same thing as using an image file since it will still notice any difference between the CPython version and the CircuitPython version.

@justmobilize justmobilize deleted the files_arg_plus branch April 27, 2024 21:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants