Various string manipulation optimizations #1543

Merged: 7 commits, Jan 20, 2018

Conversation

bahusoid
Member

I started by refactoring 'StringHelper.Root' to avoid a double search, but somehow ended up
with a bunch of other string-related changes :)

  1. StringHelper.Root refactored to get rid of double search;
  2. string.IndexOf - use ordinal comparison where applicable;
  3. No need to convert full string to lowercase for StartsWith comparison;
  4. Avoid SqlString.ToString conversion;
  5. Optimized and generalized HqlVariablePrefix usage;

It's better to review per commit to better understand the scope of changes.
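
As an illustration of item 3, here is a minimal sketch (not the code from this PR). The class and method names are hypothetical, and a plain `string` with a `"select"` prefix is assumed purely for the example. It contrasts lowercasing the whole string with a case-insensitive ordinal prefix comparison:

```csharp
using System;

// Hypothetical helper, for illustration only.
static class StartsWithSketch
{
    // Allocates a lowercase copy of the entire string just to inspect its prefix.
    public static bool StartsWithSelectLowercased(string sql) =>
        sql.ToLowerInvariant().StartsWith("select", StringComparison.Ordinal);

    // Compares only the prefix, ignoring case, without creating a new string.
    public static bool StartsWithSelect(string sql) =>
        sql.StartsWith("select", StringComparison.OrdinalIgnoreCase);

    static void Main()
    {
        Console.WriteLine(StartsWithSelect("SELECT * FROM Cat"));  // True
        Console.WriteLine(StartsWithSelect("update Cat set x=1")); // False
    }
}
```

For ASCII keywords such as the one assumed here, both variants give the same answer; the second simply skips the intermediate allocation.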

@@ -168,12 +168,12 @@ private static SqlString ProcessFromFragment(SqlString frag, JoinSequence join)

private static bool HasDynamicFilterParam(SqlString sqlFragment)
{
return sqlFragment.IndexOfCaseInsensitive(ParserHelper.HqlVariablePrefix) < 0;
Member Author

Actually, I don't understand this code. Why is "has dynamic filter param" true when HqlVariablePrefix is missing?

sqlString.IndexOf("call") > 0;
callableDetail.IsCallable = sqlString.IndexOf('{') == 0 &&
sqlString.IndexOf('}') == (sqlString.Length - 1) &&
sqlString.IndexOf("call", StringComparison.Ordinal) > 0;
Member

@hazzik hazzik Jan 19, 2018

In the negative case (most of the time), it's more performant to check sqlString[0] == '{' && sqlString[sqlString.Length - 1] == '}'

Member

E.g., the full check should be:

sqlString.Length > 5 && // Guards the indexers below. The shortest string satisfying all the other conditions is "{call}" (6 characters), so check for that.
sqlString[0] == '{' &&
sqlString[sqlString.Length - 1] == '}' &&
sqlString.IndexOf("call", StringComparison.Ordinal) > 0
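
The suggested check can be tried in isolation with a small sketch like the following. It assumes a plain `string` in place of whatever type `sqlString` has in the PR, and the class and method names are hypothetical:

```csharp
using System;

// Hypothetical stand-alone version of the suggested check, on a plain string.
static class CallableCheckSketch
{
    public static bool LooksCallable(string sql) =>
        sql.Length > 5 &&                 // "{call}" is the shortest match; also guards the indexers
        sql[0] == '{' &&                  // cheap first-character test
        sql[sql.Length - 1] == '}' &&     // cheap last-character test
        sql.IndexOf("call", StringComparison.Ordinal) > 0;

    static void Main()
    {
        Console.WriteLine(LooksCallable("{call}"));           // True
        Console.WriteLine(LooksCallable("select 1 from t"));  // False
        Console.WriteLine(LooksCallable(""));                 // False, no exception
    }
}
```

One difference from the `IndexOf('}') == Length - 1` form is worth keeping in mind: that form also requires that no earlier `}` occurs in the string, whereas the indexer form only inspects the last character.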

sqlString.IndexOf('?') > 0 &&
sqlString.IndexOf('=') > 0 &&
sqlString.IndexOf('?') < sqlString.IndexOf("call", StringComparison.Ordinal) &&
sqlString.IndexOf('=') < sqlString.IndexOf("call", StringComparison.Ordinal);
Member

We are talking about performance in this PR, right? So sqlString.IndexOf("call", StringComparison.Ordinal) needs to be cached in a variable.

bool hasMainOutputParameter = sqlString.IndexOf("call", StringComparison.Ordinal) > 0 &&
sqlString.IndexOf('?') > 0 &&
sqlString.IndexOf('=') > 0 &&
sqlString.IndexOf('?') < sqlString.IndexOf("call", StringComparison.Ordinal) &&
Member

Same

hazzik
hazzik previously approved these changes Jan 20, 2018
Member

@hazzik hazzik left a comment

LGTM

Member

@fredericDelaporte fredericDelaporte left a comment

Good enough for me. Tell me if you want to do more about my comments; otherwise it will be merged.

sqlString.IndexOf('?') > 0 &&
sqlString.IndexOf('=') > 0 &&
sqlString.IndexOf('?') < indexOfCall &&
sqlString.IndexOf('=') < indexOfCall;
Member

Maybe even go as far as:

var indexOfQuestionMark = -1;
var indexOfEqual = -1;
callableDetail.HasReturn =
	indexOfCall > 0 &&
		(indexOfQuestionMark = sqlString.IndexOf('?')) > 0 &&
		(indexOfEqual = sqlString.IndexOf('=')) > 0 &&
		indexOfQuestionMark < indexOfCall &&
		indexOfEqual < indexOfCall;
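
The same idea, each index searched at most once, can also be written as a small helper with early exit. This is only a sketch on a plain `string`, with hypothetical names, not the code that was merged:

```csharp
using System;

// Hypothetical helper: every IndexOf result is computed once and reused.
static class HasReturnSketch
{
    public static bool HasReturn(string sql)
    {
        int indexOfCall = sql.IndexOf("call", StringComparison.Ordinal);
        if (indexOfCall <= 0)
            return false; // no "call" keyword after the first character

        int indexOfQuestionMark = sql.IndexOf('?');
        if (indexOfQuestionMark <= 0 || indexOfQuestionMark >= indexOfCall)
            return false; // no '?' placeholder before "call"

        int indexOfEqual = sql.IndexOf('=');
        return indexOfEqual > 0 && indexOfEqual < indexOfCall;
    }

    static void Main()
    {
        Console.WriteLine(HasReturn("{? = call f(?)}")); // True
        Console.WriteLine(HasReturn("{call f(?)}"));     // False
    }
}
```

The conditions are the same as in the inline version; the early returns just avoid assignments embedded in a boolean expression.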

Member Author

Ok. Will do

sqlString.IndexOf('?') > 0 &&
sqlString.IndexOf('=') > 0 &&
sqlString.IndexOf('?') < indexOfCall &&
sqlString.IndexOf('=') < indexOfCall;
Member

Same as previous.

@@ -404,6 +404,11 @@ public int IndexOfCaseInsensitive(string text)
return IndexOf(text, 0, _length, StringComparison.InvariantCultureIgnoreCase);
Member

It appears to me this one should be switched to StringComparison.OrdinalIgnoreCase. Invariant does some Unicode character equivalency logic, which is most likely to cause "false positives" from a query standpoint. Moreover, case-insensitive ordinal comparison is more performant than invariant. And this method is used either on ASCII characters or on query aliases.

But overall there are more than a hundred cases where invariant is used in the code base. It looks to me most of them if not all should be switched to ordinal. Maybe worth a dedicated PR.

(By the way, if this case is changed here, the comment on the text parameter looks obsolete; its last part should be removed.)
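
To make the difference concrete, here is a small sketch with a made-up alias containing a soft hyphen (U+00AD). The ordinal results are deterministic; the invariant results are printed without an expected value because they depend on the runtime's globalization behavior:

```csharp
using System;

static class ComparisonSketch
{
    static void Main()
    {
        // Hypothetical alias with a soft hyphen between "ani" and "mal".
        string alias = "ani\u00ADmal";

        // Ordinal comparisons look only at the UTF-16 code units.
        Console.WriteLine(alias.IndexOf("\u00AD", StringComparison.Ordinal));           // 3
        Console.WriteLine(alias.IndexOf("MAL", StringComparison.OrdinalIgnoreCase));    // 4
        Console.WriteLine(alias.IndexOf("animal", StringComparison.OrdinalIgnoreCase)); // -1

        // Linguistic (invariant) comparisons may treat the soft hyphen as ignorable,
        // so these calls can report matches that the ordinal calls above do not.
        Console.WriteLine(alias.IndexOf("\u00AD", StringComparison.InvariantCultureIgnoreCase));
        Console.WriteLine(alias.IndexOf("animal", StringComparison.InvariantCultureIgnoreCase));
    }
}
```

A match that exists only under the linguistic rules is the kind of "false positive" described above.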

Member Author

@bahusoid bahusoid Jan 20, 2018

> It appears to me this one should be switched to StringComparison.OrdinalIgnoreCase.

Yeah. But it's a public method. I'm afraid to change its behaviour :)

> It looks to me most of them if not all should be switched to ordinal.

Agreed (but not as part of this PR)

Member

> Yeah. But it's a public method. I'm afraid to change its behaviour :)

Fixing a behavior which is a bug is OK. I do not think the discrepancies between invariant and ordinal behavior can be anything other than bugs in this case, but well, maybe I am wrong. Something that is equal with invariant but not equal with ordinal looks very likely to me to be considered not equal by the database query parser as well, in which case we would have a bug. Still, this can be the subject of another PR, switching to ordinal all the invariant usages which should be. I also think changing most of them could be considered as fixing potential bugs.

@hazzik, what is your view on this subject? Do you think it is ok to switch from invariant to ordinal in a minor release, considering it is also a bug fix, or would you rather have such change in a major release?

@bahusoid bahusoid dismissed stale reviews from fredericDelaporte and hazzik via 85583b9 January 20, 2018 11:38
@fredericDelaporte fredericDelaporte self-assigned this Jan 20, 2018
@fredericDelaporte fredericDelaporte added this to the 5.1 milestone Jan 20, 2018
@fredericDelaporte fredericDelaporte merged commit 956bffe into nhibernate:master Jan 20, 2018