ADL Updates for Jan release #3363

Merged
merged 23 commits into from
Jan 12, 2017

begoldsm (Contributor) commented Jan 6, 2017

Description

This change set includes the following.

  1. Fix TraceLogging for unstructured requests and responses to truncate at 10 KB of data, so that very large payloads being sent or received do not consume excessive memory
  2. Add support for Top to Get-AdlJob
  3. Fix Get-AdlJob to, by default, sort by the newest jobs when displaying
  4. Re-enable serviceClientTracing for upload and download, now that TraceLogging does not use as much memory. The next step will be to include custom trace logging (to a file) instead of to debug output for these commands
  5. Add explicit support for "None" as an encryption option for New-AdlStore and add warnings that, by default, encryption will be enabled.
  6. Full help updates for all ADLA and ADLS cmdlets to be platyPS compliant and accurate.
  7. As a result of the help update, clean up incorrect output types/return types and enable some new passthru options.
  8. Add the ability to select a product tier in New-AdlStore, Set-AdlStore, New-AdlAnalyticsAccount, and Set-AdlAnalyticsAccount
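
Items 2 and 3 above (a Top limit and newest-first ordering for Get-AdlJob) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the cmdlet's actual implementation: `JobSummary`, its `SubmitTime` property, and `JobListing.NewestFirst` are hypothetical names introduced for the example, and the ordering is done client-side here, which may differ from how the real change talks to the service.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a job record; not a type from the Azure SDK.
public sealed class JobSummary
{
    public JobSummary(string name, DateTimeOffset submitTime)
    {
        Name = name;
        SubmitTime = submitTime;
    }

    public string Name { get; }
    public DateTimeOffset SubmitTime { get; }
}

public static class JobListing
{
    // Returns at most `top` jobs, newest submission first.
    // When `top` is null, every job is returned (still newest first).
    public static IReadOnlyList<JobSummary> NewestFirst(
        IEnumerable<JobSummary> jobs, int? top = null)
    {
        if (jobs == null) throw new ArgumentNullException(nameof(jobs));
        if (top.HasValue && top.Value <= 0)
            throw new ArgumentOutOfRangeException(nameof(top), "Top must be positive.");

        // Sort explicitly so the result does not depend on whatever
        // order the caller (or an underlying service) happened to use.
        IEnumerable<JobSummary> ordered = jobs.OrderByDescending(j => j.SubmitTime);
        if (top.HasValue) ordered = ordered.Take(top.Value);
        return ordered.ToList();
    }
}
```

Sorting explicitly, rather than trusting the incoming order, mirrors the stated goal of keeping the displayed order stable even if the service's default ordering changes.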

This checklist is used to make sure that common guidelines for a pull request are followed. You can find a more complete discussion of PowerShell cmdlet best practices here.

General Guidelines

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.
  • The pull request does not introduce breaking changes (unless a major version change occurs in the assembly and module).

Testing Guidelines

  • Pull request includes test coverage for the included changes.
  • PowerShell scripts used in tests should do any necessary setup as part of the test or suite setup, and should not use hard-coded values for locations or existing resources.

Cmdlet Signature Guidelines

  • New cmdlets that make changes or have side effects should implement ShouldProcess and have SupportShouldProcess=true specified in the cmdlet attribute. You can find more information on ShouldProcess here.
  • Cmdlet specifies OutputType attribute if any output is produced - if the cmdlet produces no output, it should implement a PassThrough parameter.

Cmdlet Parameter Guidelines

  • Parameter types should not expose types from the management library - complex parameter types should be defined in the module.
  • Complex parameter types are discouraged - a parameter type should be simple types as often as possible. If complex types are used, they should be shallow and easily creatable from a constructor or another cmdlet.
  • Cmdlet parameter sets should be mutually exclusive - each parameter set must have at least one mandatory parameter not in other parameter sets.

begoldsm added 14 commits December 13, 2016 13:55
1. Allow user to select Top number of jobs to return from a list
2. Explicitly order by newest job (keeps existing behavior, but ensures
it doesn't change if the underlying service changes ordering).
3. Test out removing the disable of service client tracing for
upload/download.
4. Remove unused resx string.

1. Fix the trace logging to reduce the amount of information logged for
streams to 10KB
2. Updates for encryption to support "none" explicitly.

Part of this upcoming PR helps reduce the amount of logging that happens
for stream responses and requests. However, more work needs to be done
to enable a more performant logging solution. Perhaps through
System.Diagnostics.Trace

Compared each cmdlet to the help and ensured that all attributes are
correct. Fixed the following:
1. Corrected output types to reflect what is actually output for
GetAccount, NewCred, NewSecret, SetCred, SetSecret and StopJob
2. Updated SetSecret and UpdateSecret method to properly return void,
since nothing is returned from the server (by design).
3. Added output types and descriptions to all of the help files.

This change preps Commitment Tier support for ADL account creation and
updates. It includes:
1. New parameter for new-adla account and set-adla account
2. New parameter for new-adls account and set-adls account
3. Tests for both (in alias as well as fully qualified command names)

Remaining work:
1. Remove private packages when real ones are published
2. Update help.

Refactored to be more explicit for enum use.
begoldsm added 5 commits January 9, 2017 13:48
From discussions with AzurePowerShell team, we will hold off on fixing
the outputtype attributes until the next breaking version of powershell
and add warnings that indicate there is a mismatch between what the
outputtype says and what is actually output.

1. Remove breaking enum
2. Add explicit -DisableEncryption flag
3. Add test
4. Fix tests to remove the use of .Properties (which is deprecated).

Note that I am still testing out the addition of the logger for ADL
import and I will need to add it for export as well. So there is at
least one more commit needed here.
@@ -318,7 +321,9 @@ public static string FormatString(string content)
             }
             else
             {
-                return content;
+                return content.Length <= GeneralUtilities.StreamCutOffSize ?

Member

This is a quick, elegant solution, I like it.
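
For readers without the full diff, the idea of cutting logged content off at a fixed size can be sketched as below. This is a hedged illustration, not the code merged in this PR: the class name `TraceTruncation`, the `Truncate` helper, the 10 KB constant, and the truncation marker text are all assumptions made for the example, and the real value of `GeneralUtilities.StreamCutOffSize` and the rest of the conditional expression are not shown in this excerpt.

```csharp
using System;

public static class TraceTruncation
{
    // Assumed cutoff: 10 * 1024 characters. The actual value of
    // GeneralUtilities.StreamCutOffSize is not visible in this excerpt.
    public const int CutOffSize = 10 * 1024;

    // Hypothetical marker appended when content is cut; the real output may differ.
    public const string Marker = "\r\n...truncated...";

    // Returns the content unchanged when it fits within the cutoff,
    // otherwise the first `cutOffSize` characters followed by the marker.
    public static string Truncate(string content, int cutOffSize = CutOffSize)
    {
        if (cutOffSize < 0) throw new ArgumentOutOfRangeException(nameof(cutOffSize));
        if (content == null) return string.Empty;

        return content.Length <= cutOffSize
            ? content
            : content.Substring(0, cutOffSize) + Marker;
    }
}
```

Note that this sketch counts characters, not bytes, so the amount logged is only approximately 10 KB for non-ASCII payloads.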

@cormacpayne cormacpayne left a comment

@begoldsm a few comments

Also, would you mind updating the change logs for both Analytics and Store with the changes you are making in this PR?

if (Uri != null && Uri.Port <= 0)
{
    WriteWarning(string.Format(Resources.NoPortSpecified, Uri));
}

var toUse = Uri ?? new Uri(string.Format("https://{0}:{1}", DatabaseHost, Port));

WriteObject(DataLakeAnalyticsClient.UpdateSecret(Account, DatabaseName, Secret.UserName,

Member

@begoldsm why was this change made? This will break some scripts that depend on the output of this cmdlet

Contributor Author

Good catch. Since the SDK now correctly returns null, I will update this to write out an empty object.

[Parameter(ParameterSetName = BaseParameterSetName, ValueFromPipelineByPropertyName = true,
Mandatory = false, HelpMessage = "An optional value which indicates the number of jobs to return. Default value is 500")]
[ValidateNotNullOrEmpty]

Member

@begoldsm please remove this space


if (PassThru)
{
    WriteObject(true);

Member

@begoldsm if the user doesn't want output from the cmdlet, shouldn't the bool represent if the removal was successful or not, not just a hard-coded value?

Contributor Author

It's a good question. In this case, failure will result in an exception, so false will never be returned. If they want an actual value returned from this cmdlet (which otherwise would not return anything), the result will always be either true or an exception. If there is a better way that we should be handling this, I am happy to do so, though.

Member

Discussed offline - this is the intended pattern

DataLakeStoreFileSystemClient.RemoveAclEntries(Path.TransformedPath, Account, aclSpec);
if (PassThru)
{
    WriteObject(true);

Member

@begoldsm same comment

Contributor Author

same reply :)

Member

Discussed offline

@markcowl markcowl left a comment

Those marked as 'Suggestion' are not necessary to change this time around, but good to consider for the future.

Resume,
ForceBinary,
cmdletRunningRequest: this);
if (DiagnosticLogLevel != LogLevel.None)

Member

This seems like a lot of repeated code in cmdlets. It would make sense to add this to a common class or encapsulate it in the methods of your listener.

Contributor Author

Agreed. I have moved everything into the constructor and dispose() methods of the logger for now. I think in the future (based on your suggestions) this can be refactored further.
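
The "set up in the constructor, tear down in Dispose()" pattern described in this reply might look roughly like the sketch below. It is an assumption-laden illustration, not the PR's `DataLakeStoreTraceLogger`: the class name `ScopedFileTraceLogger` is hypothetical, and the choice to attach a `TextWriterTraceListener` to `System.Diagnostics.Trace` is made only for the example.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical scoped logger; not the DataLakeStoreTraceLogger from this PR.
public sealed class ScopedFileTraceLogger : IDisposable
{
    private readonly TextWriterTraceListener _listener;
    private bool _disposed;

    // Registers a file-backed trace listener for the lifetime of this object.
    public ScopedFileTraceLogger(string logFilePath)
    {
        if (string.IsNullOrEmpty(logFilePath))
            throw new ArgumentException("A log file path is required.", nameof(logFilePath));

        // Make sure the target directory exists before the listener opens the file.
        string directory = Path.GetDirectoryName(Path.GetFullPath(logFilePath));
        if (!string.IsNullOrEmpty(directory)) Directory.CreateDirectory(directory);

        _listener = new TextWriterTraceListener(logFilePath);
        Trace.Listeners.Add(_listener);
    }

    // Flushes pending output, unregisters the listener, and closes the file.
    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;

        _listener.Flush();
        Trace.Listeners.Remove(_listener);
        _listener.Dispose();
    }
}
```

A cmdlet could then wrap the transfer in `using (new ScopedFileTraceLogger(path)) { ... }`, so the listener is removed even when the transfer throws, and no setup or cleanup code needs to be repeated per cmdlet.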

}

// Setting -Debug overwrites LogLevel to debug.
if (MyInvocation.BoundParameters.ContainsKey("Debug") && (SwitchParameter)MyInvocation.BoundParameters["Debug"])

Member

This is equivalent to $DebugPreference="Continue". If you really want to use this, you should abide by both the preference variable and the flag. Also, this does not detect -Debug:$false, which is also a possible use of the flag.

Member

For now, I would suggest using your own explicit log level, and thinking about how to use PowerShell settings for Verbose, Warning, Information, and Debug streams intelligently in a future release.

Contributor Author

Fair enough. I have removed anything involving the -Debug flag for now, and I agree that it would be nice to have some way to enable some kind of logging (to a file or otherwise) with the -Debug flag, or to be able to use some other kind of interceptor. We can discuss this more during PowerShell office hours, since I would like to do something that can benefit everyone and not just these two cmdlets (ideally!).

LogLevel = logLevel;
if (string.IsNullOrEmpty(LogFilePath))
{
    LogFilePath = string.Format(@"{0}\ADLDataTransfer\ADLDataTransfer_{1}.log",

Member

Do you really want to log if this is not specified? This would be surprising to the user, I think, unless this default location is well documented.

Member

Also, this is guaranteed to blow up when the cmdlets move to .NET Core (later this year). Instead, I would require a log location.

Contributor Author

Fair enough. I have added a Diagnostic parameter set, removed the default log location, and made the log path a required parameter in the Diagnostic parameter set. Seems reasonable to me.

Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData),
DateTime.Now.ToString("MM-dd-yyyy.HH.mm"));
}
else if(Directory.Exists(LogFilePath)) // the user passed in a directory instead of a file

Member

Suggestion: You should consider using IDataStore for file system operations. This allows you to mock the file system in tests. You would need to augment it to allow returning a stream to support this entire scenario.

Contributor Author

This is a good suggestion. I am open to doing this in the next release to add to testability. I will work with you to make sure I understand the pattern and can implement it nicely.

@@ -95,20 +116,59 @@ public override void ExecuteCmdlet()
throw new CloudException(string.Format(Resources.InvalidExportPathType, Path.TransformedPath));
}

if (type == FileType.FILE)
DataLakeStoreTraceLogger logger = null;

Member

Suggestion: For the future, you'll need to think about how this would need to be augmented to support multiple cmdlet executions running simultaneously in an AppDomain. In this case, you would want to differentiate which traces go to which log file.

Contributor Author

Agreed. This trace logging is a "first step" to enable some of our internal and external partners to get initial heuristics about upload and download operations (mostly to help us with performance tuning). As this matures we will definitely need to update this and make it more robust. Ideally, we are able to integrate this into core AzurePowerShell functionality so that everyone can take advantage of it.

HelpMessage =
"Indicates the resulting ACL should be returned.",
ParameterSetName = SpecificAceParameterSetName
)]

Contributor

@begoldsm this parameter does not need any parameter set names. Without one, the parameter will be part of the global parameter set.

Contributor Author

Ok great, I will remove it for passthru.

begoldsm added 2 commits January 12, 2017 10:37
1. Move all logging related logic into the logger constructor and
dispose() methods to reduce duplication
2. Add parameter sets for diagnostic logging and make the log path
required. Also update the default log level to be Error (the least
verbose logging that actually logs, since None does not log).
3. Updated help to reflect these changes.
4. Removed parameter sets from passthru.