-
Notifications
You must be signed in to change notification settings - Fork 945
Untangle Datastore #2971
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Untangle Datastore #2971
Conversation
1128453
to
dbb548e
Compare
Binary Size ReportAffected SDKs
Test Logs
|
private connection: Connection, | ||
private credentials: CredentialsProvider, | ||
private serializer: JsonProtoSerializer | ||
) {} | ||
|
||
newPersistentWriteStream( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than changing the object models and completely rearranging the client (so that it doesn't match the other ports), a better strategy would be to adopt tree-shakeable APIs internally.
That is, make classes like Datastore
more like just dumb structs, and make the methods that used to be in the class plain functions that take Datastore
as an argument.
This will greatly reduce the amount of churn in the relative arrangement of components. Indeed, you could apply this to commit
and other methods on datastore to ensure that all of it tree-shakes out if unused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the downsides is that it makes every member public, and I kind of wanted to avoid this. Your argument that we should pull out all methods does win here though. I updated the PR, which also greatly reduces the diff (a large part are now just test cleanup).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I meant by dumb structs. Tree-shakeable APIs make this code look like old-school C.
Is it possible to adopt a convention of which members are actually public vs those which would be private, but are public by virtue of this kind of code structure?
In C programs a way to handle this was to forward declare a private member of your struct to hide details:
typedef struct DatastoreDetails DatastoreDetails;
typedef struct Datastore {
DatastoreDetails* details;
} Datastore;
Then you only define struct DatastoreDetails
in your .c
file.
A similar technique could be translated into typescript:
- The exported type contains a
private details: unknown
DatastoreDetails
is not exported- The constructor assigns an instance of DatastoreDetails to the
details
field. - There's a non-exported
details(datastore: Datastore) => DatastoreDetails
that casts
I'm fairly certain this isn't actually worth it, but this is how we implemented access control before it existed in the language.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a commit with a TypeScript version of this hack: 0295cd5
It should be pretty easy to understand, but I wanted to point out a downside: With this change, at least IntelliJ now consider all members of DatastoreImpl
unused, which would breaks its automated refactor should we use it on classes like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a lot better than my suggestion. If we can get IntelliJ happy, I think this is the way to go.
writeStream.start(); | ||
return streamListener.awaitCallback('open'); | ||
}).then(async () => { | ||
return withTestWriteStream(async (writeStream, streamListener) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite the other comment on thinking the newPersistentStream methods should stay in Datastore
, this change is a good one (though withTestWriteStream
should call newPersistentWriteStream(datastore)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
private connection: Connection, | ||
private credentials: CredentialsProvider, | ||
private serializer: JsonProtoSerializer | ||
) {} | ||
|
||
newPersistentWriteStream( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I meant by dumb structs. Tree-shakeable APIs make this code look like old-school C.
Is it possible to adopt a convention of which members are actually public vs those which would be private, but are public by virtue of this kind of code structure?
In C programs a way to handle this was to forward declare a private member of your struct to hide details:
typedef struct DatastoreDetails DatastoreDetails;
typedef struct Datastore {
DatastoreDetails* details;
} Datastore;
Then you only define struct DatastoreDetails
in your .c
file.
A similar technique could be translated into typescript:
- The exported type contains a
private details: unknown
DatastoreDetails
is not exported- The constructor assigns an instance of DatastoreDetails to the
details
field. - There's a non-exported
details(datastore: Datastore) => DatastoreDetails
that casts
I'm fairly certain this isn't actually worth it, but this is how we implemented access control before it existed in the language.
database: datastore.serializer.encodedDatabaseId, | ||
writes: mutations.map(m => datastore.serializer.toMutation(m)) | ||
}; | ||
return datastore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well make this async/await.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
writes: mutations.map(m => datastore.serializer.toMutation(m)) | ||
}; | ||
return datastore | ||
.invokeRPC<CommitRequest, api.CommitResponse>('Commit', params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stylistic mix of functions and methods seems arbitrary. It seems like the Datastore "class" could just completely convert.
Optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather keep this as is, but we can revisit if this pattern becomes a problem.
return datastore | ||
.invokeStreamingRPC< | ||
BatchGetDocumentsRequest, | ||
api.BatchGetDocumentsResponse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it BatchGetDocumentsRequest
but api.BatchGetDocumentsResponse
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BatchGetDocumentsRequest
is api. BatchGetDocumentsRequest
with the database
field set. I just added it in the Proto d.ts field since I/we have been editing this file manually lately.
return ds.lookup([k]); | ||
}) | ||
.then((docs: MaybeDocument[]) => { | ||
return invokeCommitRpc(ds, [mutation]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
async/await?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converted this file
* Cloud Datastore grpc API, which provides an interface that is more convenient | ||
* for the rest of the client SDK architecture to consume. | ||
*/ | ||
export class Datastore {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you made this an interface
this wouldn't even materialize in the generated code.
The downside is that empty interfaces match any object so we'd effectively lose type safety. Maybe that's why you chose class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, an empty interface matches all objects in TypeScript. It is even commonly used to do exactly that.
datastore: Datastore, | ||
mutations: Mutation[] | ||
): Promise<MutationResult[]> { | ||
debugAssert( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is IntelliJ happier if you made this something like this here:
const datastoreImpl = cast(datastore);
and also added a function like this:
function cast(datastore: Datastore): DatastoreImpl {
debugAssert(...);
return datastore as DatastoreImpl;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just used an explicit cast to a new type, which makes IntelliJ happy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A cast
function also has the benefit of reducing the boilerplate at the top of every function from 5 lines to 1. The specific text of which function needed DatastoreImpl isn't important to preserve--should we run afoul of this assertion the stack will show with function was invoked improperly.
If you want to avoid the extra function in the emitted code (if that's a problem) we could add a rewrite rule to inline the function call (to just a cast). If cast
is too ambiguous for the strategy, maybe name it castDatastoreImpl
, recoverDatastoreImpl
, 'toDatastoreImpl, or
asDatastoreImpl`? This way you could write dumb, unscoped global rules for this (again--if that's even worthwhile).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not 100% convinced that this effort was worth it (since all these extra lines are stripped in production), but I added a debugCast
function that not only taught me a lot about TypeScript and constructor signatures but also can be used everywhere else in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that this makes little to no difference in the production code (after stripping). Where it makes a difference is in the readability of this required transformation. Having the same 5-line incantation at the top of nearly every function would have made my eyes bleed :-).
private connection: Connection, | ||
private credentials: CredentialsProvider, | ||
private serializer: JsonProtoSerializer | ||
) {} | ||
|
||
newPersistentWriteStream( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a lot better than my suggestion. If we can get IntelliJ happy, I think this is the way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically LGTM, except I think we should find a way to cut down the 5 line prolog to every function definition in here now.
datastore: Datastore, | ||
mutations: Mutation[] | ||
): Promise<MutationResult[]> { | ||
debugAssert( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A cast
function also has the benefit of reducing the boilerplate at the top of every function from 5 lines to 1. The specific text of which function needed DatastoreImpl isn't important to preserve--should we run afoul of this assertion the stack will show with function was invoked improperly.
If you want to avoid the extra function in the emitted code (if that's a problem) we could add a rewrite rule to inline the function call (to just a cast). If cast
is too ambiguous for the strategy, maybe name it castDatastoreImpl
, recoverDatastoreImpl
, 'toDatastoreImpl, or
asDatastoreImpl`? This way you could write dumb, unscoped global rules for this (again--if that's even worthwhile).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
datastore: Datastore, | ||
mutations: Mutation[] | ||
): Promise<MutationResult[]> { | ||
debugAssert( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that this makes little to no difference in the production code (after stripping). Where it makes a difference is in the readability of this required transformation. Having the same 5-line incantation at the top of nearly every function would have made my eyes bleed :-).
This removes the Datastore dependencies on SortedMap, AsyncQueue and the persistent streams, all of which are not needed in the "Rest Wrapper" client.