-
Notifications
You must be signed in to change notification settings - Fork 931
Deal with problematic characters in comments #96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -11,6 +11,7 @@ | |||
#![feature(rustc_private)] | |||
#![feature(collections)] | |||
#![feature(str_char)] | |||
#![feature(core)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this?
Thanks for doing this, looks good. With the minor comments addressed, I'd be happy to land this as is. However, I think the algorithm you're using could be better - it is O(n^2), and the constant multiplier is high too. I think this can be solved in O(n) by walking the string once and doing the matching yourself (this at the expense of some flexibility, because you probably want to just match against another string, rather than some So, I'm happy to land as is, but add a comment, or you can try to come up with an O(n) algorithm. |
I honestly didn't think it would really matter that much. I was very wrong. I ran on the following test string: // Ick, just needed to get a &'static to handle_result.
static HANDLE_RESULT: &'static Fn(HashMap<String, String>) = &handle_result;
pub fn idempotent_check(filename: String) -> Result<(), HashMap<String, String>> {
let args = vec!["rustfmt".to_owned(), filename];
let mut def_config_file = fs::File::open("default.toml").unwrap();
let mut def_config = String::new();
def_config_file.read_to_string(&mut def_config).unwrap();
// this thread is not used for concurrency, but rather to workaround the issue that the passed
// function handle needs to have static lifetime. Instead of using a global RefCell, we use
// panic to return a result in case of failure. This has the advantage of smoothing the road to
// multithreaded rustfmt
thread::catch_panic(move || {
run(args, WriteMode::Return(HANDLE_RESULT), &def_config);
}).map_err(|any|
*any.downcast().unwrap()
)
} These were the results:
The new implementation isn't as elegant, but you can't argue with those measurements. I'll push the code in a bit. |
Merge conflict resolved! |
let mut match_count: usize = 0; | ||
let mut possible_comment = false; | ||
|
||
for (i, b) in self.as_bytes().iter().cloned().enumerate() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need to clone here? Feels like it shouldn't be necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we can inspect a u8
instead of an &u8
. The alternative is to pattern match: for (i, &b) in ...
, which is arguably nicer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latter does seem nicer
The new approach looks great! I think the only thing to do is to iterate over chars rather than bytes. |
You have convinced me. It now iterates over |
Deal with problematic characters in comments
Resolves several issues with comments containing problematic characters.