Skip to content

Ability to specify blob data #22

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
unphased opened this issue Dec 9, 2015 · 19 comments
Closed

Ability to specify blob data #22

unphased opened this issue Dec 9, 2015 · 19 comments

Comments

@unphased
Copy link
Contributor

unphased commented Dec 9, 2015

I'm in the middle of reading the code and understanding it but it's taking some time.

But I've gotten far enough to have some questions. I want an easy way to use this wrapper to handle blobs. It seems challenging because is_sqlite_value isn't true for pointer types and furthermore I am not sure how it would work. I guess passing in a tuple of a pointer type and a byte count would suffice.

Does that make sense for the way to extend this so it can be done?

@aminroosta
Copy link
Collaborator

@unphased
I think there are a lot of ways we can handle it. using a tuple will be good way.
What i am thinking is something like this:

// use a boble function that returns a special type and then overload it
db << "insert something with value * " << sqlite::boble(void* , count) ;
db << "insert something with value *"  << sqlite::boble(vector<T>);

// or something like this 
// use a tag type and set flags that should affect processing of remaining parameters
struct boble_tag { } boble;  // tag type
db << "insert something with value * " << boble << (void*)ptr << count;

If you can think of a simpler syntax that would be fine too :)

@edrosten @Killili @KnairdA What do you think?

@Killili
Copy link
Collaborator

Killili commented Dec 9, 2015

Hi, so i get nervous whenever i see a void* and i think we should not sacrifice type safety here.
So we already have the type extension system to handle stuff like that in a nice and clean way. @unphased you just have to specify what you're data is and how to handle it.

Take a look at
https://github.com/aminroosta/sqlite_modern_cpp/blob/master/hdr/sqlite_modern_cpp/extensions/boost_json_spirit.h

Here we handle a json string as a blob and import and export it from the DB as an object of a specific type this way we have type safety and convenience in one.
I think this can be adapted to your needs too. Just add your data structure as an extension.

@edrosten
Copy link
Collaborator

edrosten commented Dec 9, 2015

I agree with not using a void*.

Besides, it's only valid to store, uh... is_trivially_copyable data (maybe? that's weaker than POD, but I think it's OK) in the database.

we probably want the code to be something very loosely like this:

struct blob_data
{
    void* data; //It's got to be void* eventually
    size_t length;
};

template<class C> sqlite::blob(const C* data, size_t length)
{
    static_assert(is_trivially_copyable_V<C>);
    return blob_data{data, length};
}

@edrosten
Copy link
Collaborator

edrosten commented Dec 9, 2015

Also is there any reason we'd want to wrap a std::vector in with a class? We could always assume that vector is going to be inserted as a blob (provided T is trivially copyable, that could be tested easily in operator<<).

@aminroosta
Copy link
Collaborator

@edrosten Sure, i think overloading << for vector is a better idea.

template<typename T>
struct blob {
    T* data;
    size_t length;

    blobl(T* _data = null, size_t _length = 0){
        this.data = _data;
        this.length = _length;
    }
};
// and we could extract data overloading >> operator
blob<int> blb;
db << "extract some data"  >> (blob<int> b)[&] { blb = move(b); };

// and as Ed suggested for vector<T>
vector<int> vec;
db << "extract some data" >> (vector<int> v)[&] { vec = move(v); };

So who has time to implement it ? i can do this after next 2 weeks :)

@Killili
Copy link
Collaborator

Killili commented Dec 9, 2015

I dont get it. We have support for any datatype provided the user provides a serialization and deserialization function. Why would we need that?
Its not that blobs are just something that exists without a type, its a Picture or a Text or something else.

So something like this is a lot more appealing:

Picture *pic;
db << "select pic from book" >> pic;

And it leaves the programmer with just one code point to check if he is doing right with his data handling, the extension method. And not the program logic part of things. Image code littered with stuff like this:

blob<byte> data;
db << "extract some data"  >> (blob<byte> b)[&] { data = move(b); };
Picture pic;
int i = 0;
for(auto pixel:data){ pic.set(i++,pixel);};

It might be that I have seen to many people abuse blobs for absurd things to see a valid reason for this. If so dont be mad at me.

@aminroosta
Copy link
Collaborator

@Killili we don't support blob data yet!
The library doesn't provide an easy way to serialize and deserialize blob data.
So we are proposing a way to handle blob data, I think we can encourage end users to serialize/deserialize their own types using vector<T> and blob<T> in the Readme file.

Picture class will be a good example.

@Killili
Copy link
Collaborator

Killili commented Dec 9, 2015

I know that we dont, i rewrote that type system ;) and at that time i had a build in extension for blobs and did not commit it. It used vectors btw. I could dig it out and add it again.

But i dont think its a good idea, sorry if this did not come over as a question of "should we have this" and more as a "what are we missing?".

I think as a library we should guide the user a bit to do right thing right out the box and not discover 2 month into the project that its not a good idea to do it like that. Image all the copy/paste-bug facepalmes we could avoid ;)

@unphased
Copy link
Contributor Author

unphased commented Dec 9, 2015

This all sounds very interesting, thanks for shedding a lot of light on this stuff.

For now in my application I have simplified my requirements and what I am really looking to store for now are strings at the end of the day so I am going to employ the TEXT type for now but the blobs are going to get used soon.

I am 100% in agreement with this:

I think as a library we should guide the user a bit to do right thing right out the box

Some kind of strong type awareness is something that I had not considered as possibility. Mostly because Sqlite itself is pretty lax about typing since it allows a form of dynamic typing, and besides everything goes back to void* inside this DB as @edrosten points out in his snippet's comment, so the DB provides seemingly no natural way to persist a true type of an object...

I like the line of logic that you are following a lot, @Killili, but I wonder if following it means that we would want to go beyond and implement some concept of persisting the type itself into the DB, and at this point the content in the DB needs to be augmented! And then it won't be pure anymore (so there will be metadata in there at this point that is specific to this wrapper library), so we'll definitely need to provide the user with a choice about this.

For instance I am talking about how it may be super cool to have it so that if you try to >> to a Picture that it will perform a type check against the item specified in the SQL about to get serialized into the Picture object. And then that type information would have had to be inserted by us during the INSERT somehow...

There are problems with this, maybe. As I understand it, implementations probably cannot be relied upon to produce matching mangled names, and we might expect that the same application built on different platforms with different toolchains will end up with different values so sharing a database might make the types irreconcilable. Perhaps a compromise is to just store and check demangled type names, so maybe we can just be testing against "Picture".

Again, i have no idea if these shenanigans fall within the scope of the project.

@Killili
Copy link
Collaborator

Killili commented Dec 9, 2015

RTTI is not a strong-suit of c++ so i dont think that that is possible at library level and at runtime.
But nothing stops you from having stuff like that in your serialization extension.

@aminroosta:
I thought about having generic blobs thou. Its youre project afterall. So what about doing it the c++ STL way and support OutputIterator and InputIterator?
Something like:

std::container<byte> c;
db << "select container from table" >> std::back_insert_iterator<std::container<byte>>(v);

This is in the spirit of c++11 and works for a whole lot of container types.

@aminroosta
Copy link
Collaborator

@Killili Great so we will be overloading << operator
for template< class Container > class back_insert_iterator (:+1:)

what about input iterators ?

I agree not to implement blob<T>, but i think we should still overload for vector<T>, not all users of the library are C++ experts. i have friends that are java developers and sometimes need a simple library in C++ to do a homework or something else 😆

@Killili
Copy link
Collaborator

Killili commented Dec 9, 2015

// Reading
std::vector<byte> c;
db << "select container from table" >> std::back_insert_iterator<std::vector<byte>>(c);

// writing
db << "insert into ..." << c; // the beauty of STL. std::vector is an InputIterator it has .begin() and .end()

@unphased
Copy link
Contributor Author

unphased commented Dec 9, 2015

vectors of bytes is very comfortable interface :)

@Killili yes that is good point, all the crap i was going on about can be shoved inside the serializing overloads, and done.

@Killili
Copy link
Collaborator

Killili commented Dec 9, 2015

I got curios at how to actually implement the thing i suggested so i guess i do it.

@aminroosta
Copy link
Collaborator

👍

@edrosten
Copy link
Collaborator

edrosten commented Dec 9, 2015

Is the interface with back_inserter for extracting multiple rows as an alternative to lambdas, or as a way of doing blobs?

@Killili
Copy link
Collaborator

Killili commented Dec 9, 2015

As a way of supporting different container types as output. so operator>> only needs to support that type of iterator and the actual object handles the assignment and stuff.

@Killili
Copy link
Collaborator

Killili commented Dec 9, 2015

So this might take a while, the extensions are all broken because of the liberal use of ampersands in the new core (are rvalues worth it?).

And adding this line just as a quick test:

namespace sqlite {
template<> void get_col_from_db(database_binder& db, int inx, std::vector<uint8_t>& iter) {}
}

actually crashes vc2015 ... in the function_traits.h file, who wrote that again?
Looks like it does not like templated types.

@aminroosta
Copy link
Collaborator

This is now merged to master with #32

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants