Commit b0c7459

kaihuang authored and suryasaimadhu committed
x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()

EREMOVE takes a page and removes any association between that page and an enclave. It must be run on a page before it can be added into another enclave. Currently, EREMOVE is run as part of pages being freed into the SGX page allocator. It is not expected to fail, as it would indicate a use-after-free of EPC pages. Rather than add the page back to the pool of available EPC pages, the kernel intentionally leaks the page to avoid additional errors in the future.

However, KVM does not track how guest pages are used, which means that SGX virtualization use of EREMOVE might fail. Specifically, it is legitimate that EREMOVE returns SGX_CHILD_PRESENT for EPC assigned to a KVM guest, because KVM/kernel doesn't track SECS pages.

To allow SGX/KVM to introduce a more permissive EREMOVE helper and to let the SGX virtualization code use the allocator directly, break out the EREMOVE call from the SGX page allocator. Rename the original sgx_free_epc_page() to sgx_encl_free_epc_page(), indicating that it is used to free an EPC page assigned to a host enclave. Replace sgx_free_epc_page() with sgx_encl_free_epc_page() in all call sites so there's no functional change.

At the same time, improve the error message when EREMOVE fails, and add documentation to explain to the user what that failure means and what to do when this bug happens.

 [ bp: Massage commit message, fix typos and sanitize text, simplify. ]

Signed-off-by: Kai Huang <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
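
The split described in the commit message can be sketched as a small user-space model. This is only an illustration under stated assumptions: `struct model_page`, `model_eremove()`, `model_free_page()` and `model_encl_free_page()` are hypothetical names, not kernel APIs, and the wrapper returns an `int` purely so the outcome can be checked, whereas the kernel's sgx_encl_free_epc_page() returns void and warns instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_page {
	bool in_enclave;	/* page is still associated with an enclave */
	bool has_children;	/* models a SECS page that still has child pages */
	bool on_free_list;	/* page has been returned to the allocator */
};

/* Stand-in for EREMOVE: returns 0 on success, nonzero on failure. */
static int model_eremove(struct model_page *page)
{
	if (page->has_children)
		return 1;	/* illustrative error code, not a real SGX value */

	page->in_enclave = false;
	return 0;
}

/* Allocator-level free: the caller guarantees the page is already clean. */
static void model_free_page(struct model_page *page)
{
	page->on_free_list = true;
}

/* Enclave-level free: clean up first, and leak the page if cleanup fails. */
static int model_encl_free_page(struct model_page *page)
{
	int ret = model_eremove(page);

	if (ret)
		return ret;	/* page is intentionally not put back */

	model_free_page(page);
	return 0;
}
```

Keeping the cleanup step out of the allocator-level function lets a different caller (for example, a virtualization path with its own, more permissive cleanup) reuse `model_free_page()` without inheriting the enclave-specific failure policy.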
1 parent b8921dc commit b0c7459

6 files changed, 64 insertions(+), 17 deletions(-)
Documentation/x86/sgx.rst

Lines changed: 25 additions & 0 deletions
@@ -209,3 +209,28 @@ An application may be loaded into a container enclave which is specially
 configured with a library OS and run-time which permits the application to run.
 The enclave run-time and library OS work together to execute the application
 when a thread enters the enclave.
+
+Impact of Potential Kernel SGX Bugs
+===================================
+
+EPC leaks
+---------
+
+When EPC page leaks happen, a WARNING like this is shown in dmesg:
+
+"EREMOVE returned ... and an EPC page was leaked. SGX may become unusable..."
+
+This is effectively a kernel use-after-free of an EPC page, and due
+to the way SGX works, the bug is detected at freeing. Rather than
+adding the page back to the pool of available EPC pages, the kernel
+intentionally leaks the page to avoid additional errors in the future.
+
+When this happens, the kernel will likely soon leak more EPC pages, and
+SGX will likely become unusable because the memory available to SGX is
+limited. However, while this may be fatal to SGX, the rest of the kernel
+is unlikely to be impacted and should continue to work.
+
+As a result, when this happens, user should stop running any new
+SGX workloads, (or just any new workloads), and migrate all valuable
+workloads. Although a machine reboot can recover all EPC memory, the bug
+should be reported to Linux developers.
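
As a rough aid for the advice in the documentation hunk above, a monitoring tool could look for the warning text in kernel log lines. The helper below is a minimal sketch, assuming the log line contains the two fixed substrings of the message added by this commit; `is_epc_leak_warning()` is a hypothetical name, and how the lines are obtained (for example from `dmesg` output) is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if a kernel log line looks like the
 * EPC leak warning. It matches only the fixed parts of the message, since
 * the EREMOVE return code and any log prefix vary.
 */
static bool is_epc_leak_warning(const char *line)
{
	return strstr(line, "EREMOVE returned") != NULL &&
	       strstr(line, "an EPC page was leaked") != NULL;
}
```

Because the kernel emits this message with WARN_ONCE(), it appears at most once per boot, so a match means at least one page has been leaked rather than giving a count.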

arch/x86/kernel/cpu/sgx/encl.c

Lines changed: 26 additions & 5 deletions
@@ -78,7 +78,7 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page,
 
 	ret = __sgx_encl_eldu(encl_page, epc_page, secs_page);
 	if (ret) {
-		sgx_free_epc_page(epc_page);
+		sgx_encl_free_epc_page(epc_page);
 		return ERR_PTR(ret);
 	}
 
@@ -404,7 +404,7 @@ void sgx_encl_release(struct kref *ref)
 			if (sgx_unmark_page_reclaimable(entry->epc_page))
 				continue;
 
-			sgx_free_epc_page(entry->epc_page);
+			sgx_encl_free_epc_page(entry->epc_page);
 			encl->secs_child_cnt--;
 			entry->epc_page = NULL;
 		}
@@ -415,15 +415,15 @@ void sgx_encl_release(struct kref *ref)
 	xa_destroy(&encl->page_array);
 
 	if (!encl->secs_child_cnt && encl->secs.epc_page) {
-		sgx_free_epc_page(encl->secs.epc_page);
+		sgx_encl_free_epc_page(encl->secs.epc_page);
 		encl->secs.epc_page = NULL;
 	}
 
 	while (!list_empty(&encl->va_pages)) {
 		va_page = list_first_entry(&encl->va_pages, struct sgx_va_page,
 					   list);
 		list_del(&va_page->list);
-		sgx_free_epc_page(va_page->epc_page);
+		sgx_encl_free_epc_page(va_page->epc_page);
 		kfree(va_page);
 	}
 
@@ -686,7 +686,7 @@ struct sgx_epc_page *sgx_alloc_va_page(void)
 	ret = __epa(sgx_get_epc_virt_addr(epc_page));
 	if (ret) {
 		WARN_ONCE(1, "EPA returned %d (0x%x)", ret, ret);
-		sgx_free_epc_page(epc_page);
+		sgx_encl_free_epc_page(epc_page);
 		return ERR_PTR(-EFAULT);
 	}
 
@@ -735,3 +735,24 @@ bool sgx_va_page_full(struct sgx_va_page *va_page)
 
 	return slot == SGX_VA_SLOT_COUNT;
 }
+
+/**
+ * sgx_encl_free_epc_page - free an EPC page assigned to an enclave
+ * @page:	EPC page to be freed
+ *
+ * Free an EPC page assigned to an enclave. It does EREMOVE for the page, and
+ * only upon success, it puts the page back to free page list. Otherwise, it
+ * gives a WARNING to indicate page is leaked.
+ */
+void sgx_encl_free_epc_page(struct sgx_epc_page *page)
+{
+	int ret;
+
+	WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
+
+	ret = __eremove(sgx_get_epc_virt_addr(page));
+	if (WARN_ONCE(ret, EREMOVE_ERROR_MESSAGE, ret, ret))
+		return;
+
+	sgx_free_epc_page(page);
+}

arch/x86/kernel/cpu/sgx/encl.h

Lines changed: 1 addition & 0 deletions
@@ -115,5 +115,6 @@ struct sgx_epc_page *sgx_alloc_va_page(void);
 unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page);
 void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset);
 bool sgx_va_page_full(struct sgx_va_page *va_page);
+void sgx_encl_free_epc_page(struct sgx_epc_page *page);
 
 #endif /* _X86_ENCL_H */

arch/x86/kernel/cpu/sgx/ioctl.c

Lines changed: 3 additions & 3 deletions
@@ -47,7 +47,7 @@ static void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page)
 	encl->page_cnt--;
 
 	if (va_page) {
-		sgx_free_epc_page(va_page->epc_page);
+		sgx_encl_free_epc_page(va_page->epc_page);
 		list_del(&va_page->list);
 		kfree(va_page);
 	}
@@ -117,7 +117,7 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs)
 	return 0;
 
 err_out:
-	sgx_free_epc_page(encl->secs.epc_page);
+	sgx_encl_free_epc_page(encl->secs.epc_page);
 	encl->secs.epc_page = NULL;
 
 err_out_backing:
@@ -365,7 +365,7 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src,
 	mmap_read_unlock(current->mm);
 
 err_out_free:
-	sgx_free_epc_page(epc_page);
+	sgx_encl_free_epc_page(epc_page);
 	kfree(encl_page);
 
 	return ret;

arch/x86/kernel/cpu/sgx/main.c

Lines changed: 5 additions & 9 deletions
@@ -294,7 +294,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
 
 		sgx_encl_ewb(encl->secs.epc_page, &secs_backing);
 
-		sgx_free_epc_page(encl->secs.epc_page);
+		sgx_encl_free_epc_page(encl->secs.epc_page);
 		encl->secs.epc_page = NULL;
 
 		sgx_encl_put_backing(&secs_backing, true);
@@ -609,19 +609,15 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
  * sgx_free_epc_page() - Free an EPC page
  * @page:	an EPC page
  *
- * Call EREMOVE for an EPC page and insert it back to the list of free pages.
+ * Put the EPC page back to the list of free pages. It's the caller's
+ * responsibility to make sure that the page is in uninitialized state. In other
+ * words, do EREMOVE, EWB or whatever operation is necessary before calling
+ * this function.
  */
 void sgx_free_epc_page(struct sgx_epc_page *page)
 {
 	struct sgx_epc_section *section = &sgx_epc_sections[page->section];
 	struct sgx_numa_node *node = section->node;
-	int ret;
-
-	WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
-
-	ret = __eremove(sgx_get_epc_virt_addr(page));
-	if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret))
-		return;
 
 	spin_lock(&node->lock);
 

arch/x86/kernel/cpu/sgx/sgx.h

Lines changed: 4 additions & 0 deletions
@@ -13,6 +13,10 @@
 #undef pr_fmt
 #define pr_fmt(fmt) "sgx: " fmt
 
+#define EREMOVE_ERROR_MESSAGE \
+	"EREMOVE returned %d (0x%x) and an EPC page was leaked. SGX may become unusable. " \
+	"Refer to Documentation/x86/sgx.rst for more information."
+
 #define SGX_MAX_EPC_SECTIONS		8
 #define SGX_EEXTEND_BLOCK_SIZE		256
 #define SGX_NR_TO_SCAN			16
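
The EREMOVE_ERROR_MESSAGE macro relies on C's concatenation of adjacent string literals, so the two quoted pieces form a single format string with two conversions (`%d` and `%x`) that both receive the same return code. The user-space sketch below shows how that format expands; `format_eremove_error()` is a hypothetical helper that uses `snprintf()` in place of the kernel's WARN_ONCE(), and the value passed in is only an example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same format string as the macro added to sgx.h in this commit. */
#define EREMOVE_ERROR_MESSAGE \
	"EREMOVE returned %d (0x%x) and an EPC page was leaked. SGX may become unusable. " \
	"Refer to Documentation/x86/sgx.rst for more information."

/*
 * Hypothetical helper: formats the message into buf, printing the return
 * code once in decimal and once in hexadecimal.
 */
static int format_eremove_error(char *buf, size_t len, int ret)
{
	return snprintf(buf, len, EREMOVE_ERROR_MESSAGE, ret, (unsigned int)ret);
}
```

For instance, calling `format_eremove_error(buf, sizeof(buf), 13)` with a 256-byte buffer yields a string beginning with `EREMOVE returned 13 (0xd)`.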
