[Runtime] Improve performance and memory footprint of compatibility o… #78818

Merged
drexin merged 1 commit into swiftlang:main from wip-143401725
Jan 24, 2025

drexin (Contributor) commented Jan 22, 2025

…verrides

rdar://143401725

Replacing the (non-inlined) call to `swift_once` with a relaxed atomic significantly improves the generated code and reduces the memory footprint. The mechanism itself no longer causes a stack frame to be generated, and the expected case (no override) should be perfectly predicted and executed as straight-line code. The override case should also be well predicted, since it takes only two branches on the same value.
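
For readers who want the shape of the idea in source form, here is a minimal C++ sketch of a lazily resolved override slot that is read with a relaxed atomic load. It is not the code from this PR: every name in it (`overrideSlot`, `kNoOverride`, `lookUpOverride`, `defaultImpl`, `entryPoint`) is hypothetical. The sentinel value 1 for "no override" is only inferred from the `cmp x6, #0x1` in the "After" disassembly. The sketch also assumes the lookup is idempotent and returns a pointer to code that is already loaded, which is what would make relaxed ordering sufficient here.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the signature of an overridable runtime entry point.
using CastFn = int (*)(int);

// Assumed sentinel: "resolved, and no override is installed".
constexpr std::uintptr_t kNoOverride = 1;

// 0 = not resolved yet, kNoOverride = use the default, anything else = override pointer.
static std::atomic<std::uintptr_t> overrideSlot{0};

// Stand-in for the default implementation (tryCast in the disassembly).
static int defaultImpl(int x) { return x + 1; }

// Stand-in for the override lookup; assumed to return the same answer every time.
static CastFn lookUpOverride() { return nullptr; }

// Slow path: resolve once and cache. Racing threads would store the same value.
static std::uintptr_t resolveOverride() {
  CastFn fn = lookUpOverride();
  std::uintptr_t value =
      fn ? reinterpret_cast<std::uintptr_t>(fn) : kNoOverride;
  overrideSlot.store(value, std::memory_order_relaxed);
  return value;
}

int entryPoint(int x) {
  // A plain load with no call, so nothing has to be saved before the check.
  std::uintptr_t value = overrideSlot.load(std::memory_order_relaxed);
  if (value == kNoOverride)  // expected case: a single compare and branch
    return defaultImpl(x);
  if (value == 0)            // second branch on the same loaded value
    value = resolveOverride();
  if (value == kNoOverride)
    return defaultImpl(x);
  return reinterpret_cast<CastFn>(value)(x);
}
```

In this sketch the common path costs one load and one compare before falling into the default implementation, while the uninitialized and override cases share the not-equal branch and are separated by a second test of the same value.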

@drexin drexin requested a review from a team as a code owner January 22, 2025 17:48
@drexin drexin requested a review from mikeash January 22, 2025 17:48
drexin (Contributor, Author) commented Jan 22, 2025

@swift-ci smoke test

drexin (Contributor, Author) commented Jan 22, 2025

Example of non-override path for swift_dynamicCast:

Before

_swift_dynamicCast:
    sub        sp, sp, #0x50                               ; CODE XREF=_$ss12_ArrayBufferV18_typeCheckSlowPathyySiF+196, _$ss12_ArrayBufferV18_typeCheckSlowPathyySiF+640, _$ss12_ArrayBufferV19_getElementSlowPathyyXlSiF+296, _$ss12_ArrayBufferV19_getElementSlowPathyyXlSiF+392, _$ss15_arrayForceCastySayq_GSayxGr0_lFACSays9CodingKey_pGsAD_pRs_r0_lIetgo_Tp5+368, _$ss15_arrayForceCastySayq_GSayxGr0_lFSays9CodingKey_pGABsAD_pRszr0_lIetgo_Tp5s011_DictionarydE0V_Tg5+232, _$ss15_arrayForceCastySayq_GSayxGr0_lFSays9CodingKey_pGABsAD_pRszr0_lIetgo_Tp5+260, _$ss15_arrayForceCastySayq_GSayxGr0_lF+488, _$ss22_ContiguousArrayBufferV24storesOnlyElementsOfTypeySbqd__mlF+296, _$ss21_arrayConditionalCastySayq_GSgSayxGr0_lF+500, _$sSa12customMirrors0B0Vvg+400
    stp        x24, x23, [sp, #0x10]
    stp        x22, x21, [sp, #0x20]
    stp        x20, x19, [sp, #0x30]
    stp        fp, lr, [sp, #0x40]
    add        fp, sp, #0x40
    mov        x21, x4
    mov        x22, x3
    mov        x19, x2
    mov        x20, x1
    mov        x23, x0
    adrp       x0, #0x580000                               ; 0x580e40@PAGE
    add        x0, x0, #0xe40                              ; 0x580e40@PAGEOFF, argument #1 for method _swift_once, __ZZ17swift_dynamicCastE9Predicate
    adrp       x1, #0x349000                               ; 0x349db0@PAGE
    add        x1, x1, #0xdb0                              ; 0x349db0@PAGEOFF, argument #2 for method _swift_once, __ZZ17swift_dynamicCastEN3$_08__invokeEPv
    mov        x2, #0x0                                    ; argument #3 for method _swift_once
    bl         _swift_once                                 ; _swift_once
    adrp       x8, #0x580000
    ldr        x6, [x8, #0xe38]                            ; __ZZ17swift_dynamicCastE8Override
    cbz        x6, loc_349c68

    adrp       x5, #0x349000                               ; 0x349d00@PAGE
    add        x5, x5, #0xd00                              ; 0x349d00@PAGEOFF, __ZL21swift_dynamicCastImplPN5swift11OpaqueValueES1_PKNS_14TargetMetadataINS_9InProcessEEES6_NS_16DynamicCastFlagsE
    mov        x0, x23
    mov        x1, x20
    mov        x2, x19
    mov        x3, x22
    mov        x4, x21
    ldp        fp, lr, [sp, #0x40]
    ldp        x20, x19, [sp, #0x30]
    ldp        x22, x21, [sp, #0x20]
    ldp        x24, x23, [sp, #0x10]
    add        sp, sp, #0x50
    br         x6
   ; endp

loc_349c68:
    ubfx       x6, x21, #0x1, #0x1                         ; argument #7 for method __ZL7tryCastPN5swift11OpaqueValueEPKNS_14TargetMetadataINS_9InProcessEEES1_S6_RS6_S7_bb, CODE XREF=_swift_dynamicCast+76
    stp        x19, x22, [sp]
    add        x4, sp, #0x8                                ; argument #5 for method __ZL7tryCastPN5swift11OpaqueValueEPKNS_14TargetMetadataINS_9InProcessEEES1_S6_RS6_S7_bb
    mov        x5, sp                                      ; argument #6 for method __ZL7tryCastPN5swift11OpaqueValueEPKNS_14TargetMetadataINS_9InProcessEEES1_S6_RS6_S7_bb
    and        w7, w21, #0x1                               ; argument #8 for method __ZL7tryCastPN5swift11OpaqueValueEPKNS_14TargetMetadataINS_9InProcessEEES1_S6_RS6_S7_bb
    mov        x0, x23                                     ; argument #1 for method __ZL7tryCastPN5swift11OpaqueValueEPKNS_14TargetMetadataINS_9InProcessEEES1_S6_RS6_S7_bb
    mov        x1, x22                                     ; argument #2 for method __ZL7tryCastPN5swift11OpaqueValueEPKNS_14TargetMetadataINS_9InProcessEEES1_S6_RS6_S7_bb
    mov        x2, x20                                     ; argument #3 for method __ZL7tryCastPN5swift11OpaqueValueEPKNS_14TargetMetadataINS_9InProcessEEES1_S6_RS6_S7_bb
    mov        x3, x19                                     ; argument #4 for method __ZL7tryCastPN5swift11OpaqueValueEPKNS_14TargetMetadataINS_9InProcessEEES1_S6_RS6_S7_bb
    bl         __ZL7tryCastPN5swift11OpaqueValueEPKNS_14TargetMetadataINS_9InProcessEEES1_S6_RS6_S7_bb
[...]

After

_swift_dynamicCast:
    sub        sp, sp, #0x50
    stp        x22, x21, [sp, #0x20]
    stp        x20, x19, [sp, #0x30]
    stp        fp, lr, [sp, #0x40]
    add        fp, sp, #0x40
    mov        x20, x4
    mov        x19, x2
    mov        x21, x1
    adrp       x22, #0x580000
    ldr        x6, [x22, #0xe08]                           ; __ZZ17swift_dynamicCastE8Override.0
    cmp        x6, #0x1
    b.ne       loc_349b60

    ubfx       x6, x20, #0x1, #0x1
    stp        x19, x3, [sp, #0x10]
    add        x4, sp, #0x18
    add        x5, sp, #0x10
    and        w7, w20, #0x1
    mov        x1, x3
    mov        x2, x21
    mov        x3, x19
    bl         __ZL7tryCastPN5swift11OpaqueValueEPKNS_14TargetMetadataINS_9InProcessEEES1_S6_RS6_S7_bb
[...]

drexin (Contributor, Author) commented Jan 22, 2025

@swift-ci smoke test windows

drexin (Contributor, Author) commented Jan 22, 2025

@swift-ci smoke test

drexin (Contributor, Author) commented Jan 23, 2025

@swift-ci smoke test

drexin (Contributor, Author) commented Jan 23, 2025

@swift-ci smoke test

@drexin drexin force-pushed the wip-143401725 branch 2 times, most recently from e626c57 to 4ff0f88 Compare January 23, 2025 20:54
drexin (Contributor, Author) commented Jan 23, 2025

@swift-ci smoke test

glessard (Contributor) commented

@swift-ci please smoke test

drexin (Contributor, Author) commented Jan 24, 2025

@swift-ci smoke test

@drexin drexin merged commit 82d7260 into swiftlang:main Jan 24, 2025
4 of 5 checks passed
@drexin drexin deleted the wip-143401725 branch January 24, 2025 23:11