Skip to content

Support Rust core::fmt::Argument in vsprintf #280

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 26, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions lib/vsprintf.c
Original file line number Diff line number Diff line change
Expand Up @@ -2090,6 +2090,10 @@ char *fwnode_string(char *buf, char *end, struct fwnode_handle *fwnode,
return widen_string(buf, buf - buf_start, end, spec);
}

#ifdef CONFIG_RUST
char *rust_fmt_argument(char* buf, char* end, void *ptr);
#endif

/* Disable pointer hashing if requested */
bool no_hash_pointers __ro_after_init;
EXPORT_SYMBOL_GPL(no_hash_pointers);
Expand Down Expand Up @@ -2236,6 +2240,10 @@ early_param("no_hash_pointers", no_hash_pointers_enable);
*
* Note: The default behaviour (unadorned %p) is to hash the address,
* rendering it useful as a unique identifier.
*
* There is also a '%pA' format specifier, but it is only intended to be used
* from Rust code to format core::fmt::Arguments. Do *not* use it from C.
* See rust/kernel/print.rs for details.
*/
static noinline_for_stack
char *pointer(const char *fmt, char *buf, char *end, void *ptr,
Expand Down Expand Up @@ -2306,6 +2314,10 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,
return device_node_string(buf, end, ptr, spec, fmt + 1);
case 'f':
return fwnode_string(buf, end, ptr, spec, fmt + 1);
#ifdef CONFIG_RUST
case 'A':
return rust_fmt_argument(buf, end, ptr);
#endif
case 'x':
return pointer_string(buf, end, ptr, spec);
case 'e':
Expand Down
2 changes: 1 addition & 1 deletion rust/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -99,7 +99,7 @@ $(objtree)/rust/bindings_generated.rs: $(srctree)/rust/kernel/bindings_helper.h
quiet_cmd_exports = EXPORTS $@
cmd_exports = \
$(NM) -p --defined-only $< \
| grep -E ' (T|R|D) ' | cut -d ' ' -f 3 | grep -E '^(__rust_|_R)' \
| grep -E ' (T|R|D) ' | cut -d ' ' -f 3 \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to remove the filtering? Which symbols are needed now?

(Also, please put the reason in the commit message.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to expose the no_mangle-ed symbol so functions can be called from C-side. I actually don't understand why there is the filtering in the first place, since nm is only showing defined symbols anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ones that would appear would be:

__rdl_oom
__rg_oom
__rg_alloc
__rg_alloc_zeroed
__rg_dealloc
__rg_realloc
rust_oom

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I don't quite get it, why shouldn't they appear?

Copy link
Member

@ojeda ojeda May 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was simply stating the ones for information. :)

I do not remember if there were other symbols being included when I wrote it originally. Perhaps I was trying to ensure we did not expose anything else, or trying to use the smaller set I could.

Should those 7 be exposed, i.e. do we expect rustc to emit calls to those? (even from e.g. generics?)

Copy link
Member

@ojeda ojeda May 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean emitting calls to those from leaf modules or external third-party modules; and by "from generics" I mean whether we could have a call in one of the generics in e.g. kernel and then a leaf module instantiating that generic, generating the call etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is not something that can be easily answered, or not guaranteed, or something like that, we can remove the filtering, it is not a big deal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the disassembly, and it seems they are never called directly, but always through __rust_alloc, __rust_alloc_error_handler and friends. Anyway even if we want some filter it should be a denylist rather than by symbol prefixes?

Copy link
Member

@ojeda ojeda May 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the prefixes approach was intended to pick the "Rust-y" symbols. If we go for a list, perhaps the opposite: only allowing the ones we really need (that way we cover future ones that may appear etc.). We can put the adding the list on an issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just leave it as is then. We can open an issue to investigate what's best approach for symbol exports.

| xargs -Isymbol \
echo 'EXPORT_SYMBOL_RUST_GPL(symbol);' > $@

Expand Down
192 changes: 60 additions & 132 deletions rust/kernel/print.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,52 @@ use core::cmp;
use core::fmt;

use crate::bindings;
use crate::c_types::c_int;
use crate::c_types::{c_char, c_void};

// Called from `vsprintf` with format specifier `%pA`.
#[no_mangle]
unsafe fn rust_fmt_argument(buf: *mut c_char, end: *mut c_char, ptr: *const c_void) -> *mut c_char {
use fmt::Write;

// Use `usize` to use `saturating_*` functions.
struct Writer {
buf: usize,
end: usize,
}

impl Write for Writer {
fn write_str(&mut self, s: &str) -> fmt::Result {
// `buf` value after writing `len` bytes. This does not have to be bounded
// by `end`, but we don't want it to wrap around to 0.
let buf_new = self.buf.saturating_add(s.len());

// Amount that we can copy. `saturating_sub` ensures we get 0 if
// `buf` goes past `end`.
let len_to_copy = cmp::min(buf_new, self.end).saturating_sub(self.buf);

// SAFETY: In any case, `buf` is non-null and properly aligned.
// If `len_to_copy` is non-zero, then we know `buf` has not past
// `end` yet and so is valid.
unsafe {
core::ptr::copy_nonoverlapping(
s.as_bytes().as_ptr(),
self.buf as *mut u8,
len_to_copy,
)
};

self.buf = buf_new;
Ok(())
}
}

let mut w = Writer {
buf: buf as _,
end: end as _,
};
let _ = w.write_fmt(*(ptr as *const fmt::Arguments<'_>));
w.buf as _
}

/// Format strings.
///
Expand All @@ -23,7 +68,7 @@ pub mod format_strings {
const LENGTH_PREFIX: usize = 2;

/// The length of the fixed format strings.
pub const LENGTH: usize = 11;
pub const LENGTH: usize = 10;

/// Generates a fixed format string for the kernel's [`printk`].
///
Expand All @@ -42,14 +87,14 @@ pub mod format_strings {
assert!(prefix[2] == b'\x00');

let suffix: &[u8; LENGTH - LENGTH_PREFIX] = if is_cont {
b"%.*s\0\0\0\0\0"
b"%pA\0\0\0\0\0"
} else {
b"%s: %.*s\0"
b"%s: %pA\0"
};

[
prefix[0], prefix[1], suffix[0], suffix[1], suffix[2], suffix[3], suffix[4], suffix[5],
suffix[6], suffix[7], suffix[8],
suffix[6], suffix[7],
]
}

Expand Down Expand Up @@ -84,14 +129,13 @@ pub mod format_strings {
pub unsafe fn call_printk(
format_string: &[u8; format_strings::LENGTH],
module_name: &[u8],
string: &[u8],
args: fmt::Arguments<'_>,
) {
// `printk` does not seem to fail in any path.
bindings::printk(
format_string.as_ptr() as _,
module_name.as_ptr(),
string.len() as c_int,
string.as_ptr(),
&args as *const _ as *const c_void,
);
}

Expand All @@ -101,112 +145,26 @@ pub unsafe fn call_printk(
///
/// [`printk`]: ../../../../include/linux/printk.h
#[doc(hidden)]
pub fn call_printk_cont(string: &[u8]) {
pub fn call_printk_cont(args: fmt::Arguments<'_>) {
// `printk` does not seem to fail in any path.
//
// SAFETY: The format string is fixed.
unsafe {
bindings::printk(
format_strings::CONT.as_ptr() as _,
string.len() as c_int,
string.as_ptr(),
&args as *const _ as *const c_void,
);
}
}

/// The maximum size of a log line in the kernel.
///
/// From `kernel/printk/printk.c`.
const LOG_LINE_MAX: usize = 1024 - 32;

/// The maximum size of a log line in our side.
///
/// FIXME: We should be smarter than this, but for the moment, to reduce stack
/// usage, we only allow this much which should work for most purposes.
const LOG_LINE_SIZE: usize = 300;
crate::static_assert!(LOG_LINE_SIZE <= LOG_LINE_MAX);

/// Public but hidden since it should only be used from public macros.
#[doc(hidden)]
pub struct LogLineWriter {
data: [u8; LOG_LINE_SIZE],
pos: usize,
}

impl LogLineWriter {
/// Creates a new [`LogLineWriter`].
pub fn new() -> LogLineWriter {
LogLineWriter {
data: [0u8; LOG_LINE_SIZE],
pos: 0,
}
}

/// Returns the internal buffer as a byte slice.
pub fn as_bytes(&self) -> &[u8] {
&self.data[..self.pos]
}
}

impl Default for LogLineWriter {
fn default() -> Self {
Self::new()
}
}

impl fmt::Write for LogLineWriter {
fn write_str(&mut self, s: &str) -> fmt::Result {
let copy_len = cmp::min(LOG_LINE_SIZE - self.pos, s.as_bytes().len());
self.data[self.pos..self.pos + copy_len].copy_from_slice(&s.as_bytes()[..copy_len]);
self.pos += copy_len;
Ok(())
}
}

/// Helper function for the [`print_macro!`] to reduce stack usage.
///
/// Public but hidden since it should only be used from public macros.
///
/// # Safety
///
/// The format string must be one of the ones in [`format_strings`], and
/// the module name must be null-terminated.
#[doc(hidden)]
pub unsafe fn format_and_call<const CONT: bool>(
format_string: &[u8; format_strings::LENGTH],
module_name: &[u8],
args: fmt::Arguments,
) {
// Careful: this object takes quite a bit of stack.
let mut writer = LogLineWriter::new();

match fmt::write(&mut writer, args) {
Ok(_) => {
if CONT {
call_printk_cont(writer.as_bytes());
} else {
call_printk(format_string, module_name, writer.as_bytes());
}
}

Err(_) => {
call_printk(
&format_strings::CRIT,
module_name,
b"Failure to format string.\n",
);
}
};
}

/// Performs formatting and forwards the string to [`call_printk`].
///
/// Public but hidden since it should only be used from public macros.
#[doc(hidden)]
#[macro_export]
macro_rules! print_macro (
// Without extra arguments: no need to format anything.
($format_string:path, false, $fmt:expr) => (
// The non-continuation cases (most of them, e.g. `INFO`).
($format_string:path, false, $($arg:tt)+) => (
// SAFETY: This hidden macro should only be called by the documented
// printing macros which ensure the format string is one of the fixed
// ones. All `__LOG_PREFIX`s are null-terminated as they are generated
Expand All @@ -216,47 +174,17 @@ macro_rules! print_macro (
$crate::print::call_printk(
&$format_string,
crate::__LOG_PREFIX,
$fmt.as_bytes(),
format_args!($($arg)+),
);
}
);

// Without extra arguments: no need to format anything (`CONT` case).
($format_string:path, true, $fmt:expr) => (
// The `CONT` case.
($format_string:path, true, $($arg:tt)+) => (
$crate::print::call_printk_cont(
$fmt.as_bytes(),
format_args!($($arg)+),
);
);

// With extra arguments: we need to perform formatting.
($format_string:path, $cont:literal, $fmt:expr, $($arg:tt)*) => (
// Forwarding the call to a function to perform the formatting
// is needed here to avoid stack overflows in non-optimized builds when
// invoking the printing macros a lot of times in the same function.
// Without it, the compiler reserves one `LogLineWriter` per macro
// invocation, which is a huge type.
//
// We could use an immediately-invoked closure for this, which
// seems to lower even more the stack usage at `opt-level=0` because
// `fmt::Arguments` objects do not pile up. However, that breaks
// the `?` operator if used in one of the arguments.
//
// At `opt-level=2`, the generated code is basically the same for
// all alternatives.
//
// SAFETY: This hidden macro should only be called by the documented
// printing macros which ensure the format string is one of the fixed
// ones. All `__LOG_PREFIX`s are null-terminated as they are generated
// by the `module!` proc macro or fixed values defined in a kernel
// crate.
unsafe {
$crate::print::format_and_call::<$cont>(
&$format_string,
crate::__LOG_PREFIX,
format_args!($fmt, $($arg)*),
);
}
);
);

// We could use a macro to generate these macros. However, doing so ends
Expand Down