
Commit 914988e

John David Anglin authored and hdeller committed
parisc: Restore __ldcw_align for PA-RISC 2.0 processors
Back in 2005, Kyle McMartin removed the 16-byte alignment for ldcw
semaphores on PA 2.0 machines (CONFIG_PA20). This broke spinlocks on
pre-PA8800 processors. The main symptom was random faults in mmap'd
memory (e.g., during gcc compilations). Unfortunately, the errata for
this ldcw change is lost.

The issue is that the 16-byte alignment required for ldcw semaphore
instructions can only be reduced to natural alignment when the ldcw
operation can be handled coherently in cache. Only PA8800 and PA8900
processors actually support doing the operation in cache.

Aligning the spinlock dynamically adds two integer instructions to each
spinlock.

Tested on rp3440, c8000 and a500.

Signed-off-by: John David Anglin <[email protected]>
Link: https://lore.kernel.org/linux-parisc/[email protected]/T/#t
Link: https://lore.kernel.org/linux-parisc/[email protected]/
Cc: [email protected]
Signed-off-by: Helge Deller <[email protected]>
1 parent d3b3c63 commit 914988e

File tree

2 files changed: 20 additions, 22 deletions


arch/parisc/include/asm/ldcw.h

Lines changed: 20 additions & 17 deletions
@@ -2,39 +2,42 @@
 #ifndef __PARISC_LDCW_H
 #define __PARISC_LDCW_H
 
-#ifndef CONFIG_PA20
 /* Because kmalloc only guarantees 8-byte alignment for kmalloc'd data,
    and GCC only guarantees 8-byte alignment for stack locals, we can't
    be assured of 16-byte alignment for atomic lock data even if we
    specify "__attribute ((aligned(16)))" in the type declaration. So,
    we use a struct containing an array of four ints for the atomic lock
    type and dynamically select the 16-byte aligned int from the array
-   for the semaphore. */
+   for the semaphore. */
+
+/* From: "Jim Hull" <jim.hull of hp.com>
+   I've attached a summary of the change, but basically, for PA 2.0, as
+   long as the ",CO" (coherent operation) completer is implemented, then the
+   16-byte alignment requirement for ldcw and ldcd is relaxed, and instead
+   they only require "natural" alignment (4-byte for ldcw, 8-byte for
+   ldcd).
+
+   Although the cache control hint is accepted by all PA 2.0 processors,
+   it is only implemented on PA8800/PA8900 CPUs. Prior PA8X00 CPUs still
+   require 16-byte alignment. If the address is unaligned, the operation
+   of the instruction is undefined. The ldcw instruction does not generate
+   unaligned data reference traps so misaligned accesses are not detected.
+   This hid the problem for years. So, restore the 16-byte alignment dropped
+   by Kyle McMartin in "Remove __ldcw_align for PA-RISC 2.0 processors". */
 
 #define __PA_LDCW_ALIGNMENT	16
-#define __PA_LDCW_ALIGN_ORDER	4
 #define __ldcw_align(a) ({ \
 	unsigned long __ret = (unsigned long) &(a)->lock[0];	\
 	__ret = (__ret + __PA_LDCW_ALIGNMENT - 1) \
 		& ~(__PA_LDCW_ALIGNMENT - 1); \
 	(volatile unsigned int *) __ret; \
 })
-#define __LDCW "ldcw"
 
-#else /*CONFIG_PA20*/
-/* From: "Jim Hull" <jim.hull of hp.com>
-   I've attached a summary of the change, but basically, for PA 2.0, as
-   long as the ",CO" (coherent operation) completer is specified, then the
-   16-byte alignment requirement for ldcw and ldcd is relaxed, and instead
-   they only require "natural" alignment (4-byte for ldcw, 8-byte for
-   ldcd). */
-
-#define __PA_LDCW_ALIGNMENT	4
-#define __PA_LDCW_ALIGN_ORDER	2
-#define __ldcw_align(a) (&(a)->slock)
+#ifdef CONFIG_PA20
 #define __LDCW "ldcw,co"
-
-#endif /*!CONFIG_PA20*/
+#else
+#define __LDCW "ldcw"
+#endif
 
 /* LDCW, the only atomic read-write operation PA-RISC has. *sigh*.
    We don't explicitly expose that "*a" may be written as reload

arch/parisc/include/asm/spinlock_types.h

Lines changed: 0 additions & 5 deletions
@@ -9,15 +9,10 @@
 #ifndef __ASSEMBLY__
 
 typedef struct {
-#ifdef CONFIG_PA20
-	volatile unsigned int slock;
-# define __ARCH_SPIN_LOCK_UNLOCKED	{ __ARCH_SPIN_LOCK_UNLOCKED_VAL }
-#else
 	volatile unsigned int lock[4];
 # define __ARCH_SPIN_LOCK_UNLOCKED	\
 	{ { __ARCH_SPIN_LOCK_UNLOCKED_VAL, __ARCH_SPIN_LOCK_UNLOCKED_VAL, \
 	    __ARCH_SPIN_LOCK_UNLOCKED_VAL, __ARCH_SPIN_LOCK_UNLOCKED_VAL } }
-#endif
 } arch_spinlock_t;
 
 