Skip to content

Untangle Datastore #2971

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Apr 24, 2020
Merged

Untangle Datastore #2971

merged 13 commits into from
Apr 24, 2020

Conversation

schmidt-sebastian
Copy link
Contributor

@schmidt-sebastian schmidt-sebastian commented Apr 22, 2020

This removes the Datastore dependencies on SortedMap, AsyncQueue and the persistent streams, all of which are not needed in the "Rest Wrapper" client.

@google-oss-bot
Copy link
Contributor

google-oss-bot commented Apr 22, 2020

Binary Size Report

Affected SDKs

  • @firebase/firestore

    Type Base (d92d67b) Head (bc3cf36) Diff
    browser 247 kB 247 kB +302 B (+0.1%)
    esm2017 193 kB 193 kB +96 B (+0.0%)
    main 486 kB 487 kB +1.10 kB (+0.2%)
    module 245 kB 245 kB +248 B (+0.1%)
  • @firebase/firestore/memory

    Type Base (d92d67b) Head (bc3cf36) Diff
    browser 188 kB 188 kB +302 B (+0.2%)
    esm2017 148 kB 148 kB +96 B (+0.1%)
    main 363 kB 364 kB +1.10 kB (+0.3%)
    module 186 kB 186 kB +248 B (+0.1%)
  • firebase

    Type Base (d92d67b) Head (bc3cf36) Diff
    firebase-firestore.js 289 kB 289 kB +200 B (+0.1%)
    firebase-firestore.memory.js 231 kB 231 kB +200 B (+0.1%)
    firebase.js 822 kB 822 kB +205 B (+0.0%)

Test Logs

private connection: Connection,
private credentials: CredentialsProvider,
private serializer: JsonProtoSerializer
) {}

newPersistentWriteStream(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than changing the object models and completely rearranging the client (so that it doesn't match the other ports), a better strategy would be to adopt tree-shakeable APIs internally.

That is, make classes like Datastore more like just dumb structs, and make the methods that used to be in the class plain functions that take Datastore as an argument.

This will greatly reduce the amount of churn in the relative arrangement of components. Indeed, you could apply this to commit and other methods on datastore to ensure that all of it tree-shakes out if unused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the downsides is that it makes every member public, and I kind of wanted to avoid this. Your argument that we should pull out all methods does win here though. I updated the PR, which also greatly reduces the diff (a large part are now just test cleanup).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I meant by dumb structs. Tree-shakeable APIs make this code look like old-school C.

Is it possible to adopt a convention of which members are actually public vs those which would be private, but are public by virtue of this kind of code structure?

In C programs a way to handle this was to forward declare a private member of your struct to hide details:

typedef struct DatastoreDetails DatastoreDetails;

typedef struct Datastore {
  DatastoreDetails* details;
} Datastore;

Then you only define struct DatastoreDetails in your .c file.

A similar technique could be translated into typescript:

  • The exported type contains a private details: unknown
  • DatastoreDetails is not exported
  • The constructor assigns an instance of DatastoreDetails to the details field.
  • There's a non-exported details(datastore: Datastore) => DatastoreDetails that casts

I'm fairly certain this isn't actually worth it, but this is how we implemented access control before it existed in the language.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a commit with a TypeScript version of this hack: 0295cd5

It should be pretty easy to understand, but I wanted to point out a downside: With this change, at least IntelliJ now consider all members of DatastoreImpl unused, which would breaks its automated refactor should we use it on classes like this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a lot better than my suggestion. If we can get IntelliJ happy, I think this is the way to go.

writeStream.start();
return streamListener.awaitCallback('open');
}).then(async () => {
return withTestWriteStream(async (writeStream, streamListener) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite the other comment on thinking the newPersistentStream methods should stay in Datastore, this change is a good one (though withTestWriteStream should call newPersistentWriteStream(datastore).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@wilhuff wilhuff assigned schmidt-sebastian and unassigned wilhuff Apr 22, 2020
private connection: Connection,
private credentials: CredentialsProvider,
private serializer: JsonProtoSerializer
) {}

newPersistentWriteStream(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I meant by dumb structs. Tree-shakeable APIs make this code look like old-school C.

Is it possible to adopt a convention of which members are actually public vs those which would be private, but are public by virtue of this kind of code structure?

In C programs a way to handle this was to forward declare a private member of your struct to hide details:

typedef struct DatastoreDetails DatastoreDetails;

typedef struct Datastore {
  DatastoreDetails* details;
} Datastore;

Then you only define struct DatastoreDetails in your .c file.

A similar technique could be translated into typescript:

  • The exported type contains a private details: unknown
  • DatastoreDetails is not exported
  • The constructor assigns an instance of DatastoreDetails to the details field.
  • There's a non-exported details(datastore: Datastore) => DatastoreDetails that casts

I'm fairly certain this isn't actually worth it, but this is how we implemented access control before it existed in the language.

database: datastore.serializer.encodedDatabaseId,
writes: mutations.map(m => datastore.serializer.toMutation(m))
};
return datastore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well make this async/await.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

writes: mutations.map(m => datastore.serializer.toMutation(m))
};
return datastore
.invokeRPC<CommitRequest, api.CommitResponse>('Commit', params)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stylistic mix of functions and methods seems arbitrary. It seems like the Datastore "class" could just completely convert.

Optional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather keep this as is, but we can revisit if this pattern becomes a problem.

return datastore
.invokeStreamingRPC<
BatchGetDocumentsRequest,
api.BatchGetDocumentsResponse
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it BatchGetDocumentsRequest but api.BatchGetDocumentsResponse?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BatchGetDocumentsRequest is api. BatchGetDocumentsRequest with the database field set. I just added it in the Proto d.ts field since I/we have been editing this file manually lately.

return ds.lookup([k]);
})
.then((docs: MaybeDocument[]) => {
return invokeCommitRpc(ds, [mutation])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

async/await?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converted this file

@wilhuff wilhuff assigned schmidt-sebastian and unassigned wilhuff Apr 23, 2020
* Cloud Datastore grpc API, which provides an interface that is more convenient
* for the rest of the client SDK architecture to consume.
*/
export class Datastore {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you made this an interface this wouldn't even materialize in the generated code.

The downside is that empty interfaces match any object so we'd effectively lose type safety. Maybe that's why you chose class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, an empty interface matches all objects in TypeScript. It is even commonly used to do exactly that.

datastore: Datastore,
mutations: Mutation[]
): Promise<MutationResult[]> {
debugAssert(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is IntelliJ happier if you made this something like this here:

const datastoreImpl = cast(datastore);

and also added a function like this:

function cast(datastore: Datastore): DatastoreImpl {
  debugAssert(...);
  return datastore as DatastoreImpl;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just used an explicit cast to a new type, which makes IntelliJ happy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A cast function also has the benefit of reducing the boilerplate at the top of every function from 5 lines to 1. The specific text of which function needed DatastoreImpl isn't important to preserve--should we run afoul of this assertion the stack will show with function was invoked improperly.

If you want to avoid the extra function in the emitted code (if that's a problem) we could add a rewrite rule to inline the function call (to just a cast). If cast is too ambiguous for the strategy, maybe name it castDatastoreImpl, recoverDatastoreImpl, 'toDatastoreImpl, or asDatastoreImpl`? This way you could write dumb, unscoped global rules for this (again--if that's even worthwhile).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not 100% convinced that this effort was worth it (since all these extra lines are stripped in production), but I added a debugCast function that not only taught me a lot about TypeScript and constructor signatures but also can be used everywhere else in the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that this makes little to no difference in the production code (after stripping). Where it makes a difference is in the readability of this required transformation. Having the same 5-line incantation at the top of nearly every function would have made my eyes bleed :-).

private connection: Connection,
private credentials: CredentialsProvider,
private serializer: JsonProtoSerializer
) {}

newPersistentWriteStream(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a lot better than my suggestion. If we can get IntelliJ happy, I think this is the way to go.

@wilhuff wilhuff assigned schmidt-sebastian and unassigned wilhuff Apr 24, 2020
Copy link
Contributor

@wilhuff wilhuff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically LGTM, except I think we should find a way to cut down the 5 line prolog to every function definition in here now.

datastore: Datastore,
mutations: Mutation[]
): Promise<MutationResult[]> {
debugAssert(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A cast function also has the benefit of reducing the boilerplate at the top of every function from 5 lines to 1. The specific text of which function needed DatastoreImpl isn't important to preserve--should we run afoul of this assertion the stack will show with function was invoked improperly.

If you want to avoid the extra function in the emitted code (if that's a problem) we could add a rewrite rule to inline the function call (to just a cast). If cast is too ambiguous for the strategy, maybe name it castDatastoreImpl, recoverDatastoreImpl, 'toDatastoreImpl, or asDatastoreImpl`? This way you could write dumb, unscoped global rules for this (again--if that's even worthwhile).

Copy link
Contributor

@wilhuff wilhuff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

datastore: Datastore,
mutations: Mutation[]
): Promise<MutationResult[]> {
debugAssert(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that this makes little to no difference in the production code (after stripping). Where it makes a difference is in the readability of this required transformation. Having the same 5-line incantation at the top of nearly every function would have made my eyes bleed :-).

@schmidt-sebastian schmidt-sebastian merged commit 8c36958 into master Apr 24, 2020
scottcrossen pushed a commit that referenced this pull request May 4, 2020
@firebase firebase locked and limited conversation to collaborators May 25, 2020
@schmidt-sebastian schmidt-sebastian deleted the mrschmidt/reads branch July 9, 2020 17:45
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants