Added doc update for dataset builder #3539
Co-authored-by: Eric Zou <[email protected]>
* Bug fixed - as_of, event_range, join, default behavior and duplicates and tests
  Bugs:
  1. as_of was not working properly on deleted events
  2. Same issue with event_time_range
  3. Join was not working when including feature names
  4. Default SQL was returning only the most recent record, whereas it should return all records excluding duplicates
  5. Include duplicates was not returning all non-deleted data
  6. instanceof(dataframe) case was also applied to non-df cases during join
  7. Include column was returning unnecessary columns
* Fix on pylint error
* Fix on include_duplicated_records for pandas DataFrames
* Fix format issue for black
* Bug fixed related to line break
* Bug fix related to dataframe and include_deleted_record and include_duplicated_record
* Addressed comments and code refactored
* Changed to_csv to to_csv_file and added error messages for query limit and recent record limit
* Revert a change which was not intended
* Resolved the leak of feature group deletion in integration test
/bot run all
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository
/bot run unit-tests
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3539      +/-   ##
==========================================
- Coverage   89.57%   88.78%   -0.80%
==========================================
  Files         960      226     -734
  Lines       88744    21945   -66799
==========================================
- Hits        79495    19483   -60012
+ Misses       9249     2462    -6787
* Add list_feature_groups API (aws#647)
* feat: Feature/get record api (aws#650)
  Co-authored-by: Eric Zou <[email protected]>
* Add delete_record API (aws#664)
* feat: Add DatasetBuilder class (aws#667)
  Co-authored-by: Eric Zou <[email protected]>
* feat: Add to_csv method in DatasetBuilder (aws#699)
* feat: Add pandas.Dataframe as base case (aws#708)
* feat: Add with_feature_group method in DatasetBuilder (aws#726)
* feat: Handle merge and timestamp filters (aws#727)
* feat: Add to_dataframe method in DatasetBuilder (aws#729)
* Address TODOs (aws#731)
* Unit test for DatasetBuilder (aws#734)
* fix: Fix list_feature_groups max_results (aws#744)
* Add integration tests for create_dataset (aws#743)
* feature: Aggregate commits
* fix: as_of, event_range, join, default behavior and duplicates… (aws#764)
* Bug fixed - as_of, event_range, join, default behavior and duplicates and tests
  Bugs:
  1. as_of was not working properly on deleted events
  2. Same issue with event_time_range
  3. Join was not working when including feature names
  4. Default SQL was returning only the most recent record, whereas it should return all records excluding duplicates
  5. Include duplicates was not returning all non-deleted data
  6. instanceof(dataframe) case was also applied to non-df cases during join
  7. Include column was returning unnecessary columns
* Fix on pylint error
* Fix on include_duplicated_records for pandas DataFrames
* Fix format issue for black
* Bug fixed related to line break
* Bug fix related to dataframe and include_deleted_record and include_duplicated_record
* Addressed comments and code refactored
* Changed to_csv to to_csv_file and added error messages for query limit and recent record limit
* Revert a change which was not intended
* Resolved the leak of feature group deletion in integration test
* Added doc update for dataset builder
* Fix the issue in doc
Co-authored-by: Yiming Zou <[email protected]>
Co-authored-by: Brandon Chatham <[email protected]>
Co-authored-by: Eric Zou <[email protected]>
Co-authored-by: jiapinw <[email protected]>
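The intended behavior described in bugs 1, 4 and 5 of the list above can be illustrated with a small pandas sketch. This is not the DatasetBuilder implementation: the function `select_records`, its parameters, and the column names (`record_id`, `event_time`, `value`, `is_deleted`) are hypothetical, and deletion is simplified to a per-row flag. It only shows the filtering rules the commit message states: by default return all records except duplicates and deleted ones, and apply any as-of cutoff consistently.

```python
import pandas as pd

# Hypothetical records; the column names are illustrative, not the SDK's schema.
records = pd.DataFrame(
    {
        "record_id": [1, 1, 1, 2, 2],
        "event_time": pd.to_datetime(
            ["2022-01-01", "2022-01-01", "2022-02-01", "2022-01-15", "2022-03-01"]
        ),
        "value": [10, 10, 11, 20, 21],
        "is_deleted": [False, False, False, False, True],
    }
)


def select_records(df, as_of=None, include_duplicates=False, include_deleted=False):
    """Apply the filtering rules described in the bug list (simplified sketch)."""
    out = df
    if as_of is not None:
        # Keep only events at or before the cutoff timestamp.
        out = out[out["event_time"] <= pd.Timestamp(as_of)]
    if not include_deleted:
        # Assumption: deletion is modeled as a boolean flag on each row.
        out = out[~out["is_deleted"]]
    if not include_duplicates:
        # Drop exact repeats, but keep every distinct version of a record
        # rather than only the most recent one.
        out = out.drop_duplicates()
    return out.reset_index(drop=True)


default_view = select_records(records)
with_duplicates = select_records(records, include_duplicates=True)
as_of_january = select_records(records, as_of="2022-01-31")
```

With this sample data, the default view keeps both distinct versions of record 1 and the non-deleted version of record 2, while `include_duplicates=True` additionally keeps the repeated row.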
Issue #, if available:
Added doc update for dataset builder
Description of changes:
Added doc update for dataset builder
Testing done:
Merge Checklist
Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.
General
Tests
unique_name_from_base to create resource names in integ tests (if appropriate)
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.