Commit 244258e

Modify DataEncoder to be able to encode data in an object owned buffer.
DataEncoder was previously made to modify data within an existing buffer. As the code progressed, new clients started using DataEncoder to create binary data. In these cases the use of this class was possible, but only if you knew exactly how large your buffer would be ahead of time. This patch adds the ability for DataEncoder to own a buffer that can be dynamically resized as data is appended to the buffer.

Changes in this patch:
- Allow a DataEncoder object to be created that owns a DataBufferHeap object that can dynamically grow as data is appended
- Add new methods that start with "Append" to append data to the buffer and grow it as needed
- Adds full testing of the API to assure modifications don't regress any functionality
- Has two constructors: one that uses caller owned data and one that creates an object with object owned data
- "Append" methods only work if the object owns its own data
- Removes the ability to specify a shared memory buffer as no one was using this functionality. This allows us to switch to a case where the object owns its own data in a DataBufferHeap that can be resized as data is added

"Put" methods work on both caller and object owned data. "Append" methods work on only object owned data where we can grow the buffer. These methods will return false if called on a DataEncoder object that has caller owned data.

The main reason for these modifications is to be able to use the DataEncoder objects instead of llvm::gsym::FileWriter in https://reviews.llvm.org/D113789. That patch aims to add symbol table caching to LLDB, and its code needs to build binary caches and save them to disk.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D115073
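The difference between the "Put" and "Append" families can be sketched as follows. This is an illustration only: `SketchEncoder` is a hypothetical stand-in backed by a `std::vector`, not the LLDB class. Only the method names `PutU32` and `AppendU32` and the `UINT32_MAX` failure value are taken from the patch; everything else is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for illustration; NOT lldb_private::DataEncoder.
class SketchEncoder {
public:
  enum class Order { Little, Big };
  explicit SketchEncoder(Order order) : m_order(order) {}

  // "Put": overwrite bytes that already exist. Returns the offset that
  // follows the written value, or UINT32_MAX if the value does not fit.
  uint32_t PutU32(uint32_t offset, uint32_t value) {
    if (offset > m_bytes.size() || m_bytes.size() - offset < 4)
      return UINT32_MAX;
    for (uint32_t i = 0; i < 4; ++i)
      m_bytes[offset + i] = ByteAt(value, i);
    return offset + 4;
  }

  // "Append": grow the owned buffer by the size of the value.
  void AppendU32(uint32_t value) {
    for (uint32_t i = 0; i < 4; ++i)
      m_bytes.push_back(ByteAt(value, i));
  }

  const std::vector<uint8_t> &GetData() const { return m_bytes; }

private:
  // Select the i-th byte to emit according to the configured byte order.
  uint8_t ByteAt(uint32_t value, uint32_t i) const {
    const uint32_t shift = (m_order == Order::Little) ? 8 * i : 8 * (3 - i);
    return static_cast<uint8_t>(value >> shift);
  }

  Order m_order;
  std::vector<uint8_t> m_bytes;
};

// Usage: appending grows the buffer, after which Put can patch it in place.
inline void SketchUsage() {
  SketchEncoder enc(SketchEncoder::Order::Big);
  assert(enc.PutU32(0, 1) == UINT32_MAX); // nothing to overwrite yet
  enc.AppendU32(0x11223344);              // buffer is now 4 bytes long
  assert(enc.PutU32(0, 0xAABBCCDD) == 4); // in-place fix-up succeeds
}
```

In this sketch a "Put" never changes the buffer size, which is why a client that builds data from scratch needs the "Append" methods.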
1 parent 5034e17 commit 244258e

File tree

8 files changed

+761
-202
lines changed

lldb/include/lldb/Utility/DataEncoder.h

Lines changed: 131 additions & 91 deletions
@@ -16,6 +16,9 @@
1616
#include "lldb/lldb-forward.h"
1717
#include "lldb/lldb-types.h"
1818

19+
#include "llvm/ADT/ArrayRef.h"
20+
#include "llvm/ADT/StringRef.h"
21+
1922
#include <cstddef>
2023
#include <cstdint>
2124

@@ -26,21 +29,34 @@ namespace lldb_private {
2629
/// A binary data encoding class.
2730
///
2831
/// DataEncoder is a class that can encode binary data (swapping if needed) to
29-
/// a data buffer. The data buffer can be caller owned, or can be shared data
30-
/// that can be shared between multiple DataEncoder or DataEncoder instances.
32+
/// a data buffer. The DataEncoder can be constructed with data that will be
33+
/// copied into the internally owned buffer. This allows data to be modified
34+
/// in the internal buffer. The DataEncoder object can also be constructed with
35+
/// just a byte order and address size and data can be appended to the
36+
/// internally owned buffer.
37+
///
38+
/// Clients can get a shared pointer to the data buffer when done modifying or
39+
/// creating the data to keep the data around after the lifetime of a
40+
/// DataEncoder object. \see GetDataBuffer
3141
///
32-
/// \see DataBuffer
42+
/// Clients can get a reference to the object owned data as an array by calling
43+
/// the GetData method. \see GetData
3344
class DataEncoder {
3445
public:
3546
/// Default constructor.
3647
///
37-
/// Initialize all members to a default empty state.
48+
/// Initialize all members to a default empty state and create an empty memory
49+
/// buffer that can be appended to. The ByteOrder and address size will be set
50+
/// to match the current host system.
3851
DataEncoder();
3952

40-
/// Construct with a buffer that is owned by the caller.
53+
/// Construct an encoder that copies the specified data into the object owned
54+
/// data buffer.
4155
///
42-
/// This constructor allows us to use data that is owned by the caller. The
43-
/// data must stay around as long as this object is valid.
56+
/// This constructor is designed to be used when you have a data buffer and
57+
/// want to modify values within the buffer. A copy of the data will be made
58+
/// in the internally owned buffer and that data can be fixed up and appended
59+
/// to.
4460
///
4561
/// \param[in] data
4662
/// A pointer to caller owned data.
@@ -49,54 +65,37 @@ class DataEncoder {
4965
/// The length in bytes of \a data.
5066
///
5167
/// \param[in] byte_order
52-
/// A byte order of the data that we are extracting from.
68+
/// A byte order for the data that will be encoded.
5369
///
5470
/// \param[in] addr_size
55-
/// A new address byte size value.
56-
DataEncoder(void *data, uint32_t data_length, lldb::ByteOrder byte_order,
57-
uint8_t addr_size);
71+
/// A size of an address in bytes. \see PutAddress, AppendAddress
72+
DataEncoder(const void *data, uint32_t data_length,
73+
lldb::ByteOrder byte_order, uint8_t addr_size);
5874

59-
/// Construct with shared data.
75+
/// Construct an encoder that owns a heap based memory buffer.
6076
///
61-
/// Copies the data shared pointer which adds a reference to the contained
62-
/// in \a data_sp. The shared data reference is reference counted to ensure
63-
/// the data lives as long as anyone still has a valid shared pointer to the
64-
/// data in \a data_sp.
65-
///
66-
/// \param[in] data_sp
67-
/// A shared pointer to data.
77+
/// This allows clients to create binary data from scratch by appending values
78+
/// with the methods that start with "Append".
6879
///
6980
/// \param[in] byte_order
70-
/// A byte order of the data that we are extracting from.
81+
/// A byte order for the data that will be encoded.
7182
///
7283
/// \param[in] addr_size
73-
/// A new address byte size value.
74-
DataEncoder(const lldb::DataBufferSP &data_sp, lldb::ByteOrder byte_order,
75-
uint8_t addr_size);
84+
/// A size of an address in bytes. \see PutAddress, AppendAddress
85+
DataEncoder(lldb::ByteOrder byte_order, uint8_t addr_size);
7686

77-
/// Destructor
78-
///
79-
/// If this object contains a valid shared data reference, the reference
80-
/// count on the data will be decremented, and if zero, the data will be
81-
/// freed.
8287
~DataEncoder();
8388

84-
/// Clears the object state.
85-
///
86-
/// Clears the object contents back to a default invalid state, and release
87-
/// any references to shared data that this object may contain.
88-
void Clear();
89-
9089
/// Encode an unsigned integer of size \a byte_size to \a offset.
9190
///
9291
/// Encode a single integer value at \a offset and return the offset that
9392
/// follows the newly encoded integer when the data is successfully encoded
94-
/// into the existing data. There must be enough room in the data, else
95-
/// UINT32_MAX will be returned to indicate that encoding failed.
93+
/// into the existing data. There must be enough room in the existing data,
94+
/// else UINT32_MAX will be returned to indicate that encoding failed.
9695
///
9796
/// \param[in] offset
98-
/// The offset within the contained data at which to put the
99-
/// encoded integer.
97+
/// The offset within the contained data at which to put the encoded
98+
/// integer.
10099
///
101100
/// \param[in] byte_size
102101
/// The size in byte of the integer to encode.
@@ -111,6 +110,64 @@ class DataEncoder {
111110
/// was successfully encoded, UINT32_MAX if the encoding failed.
112111
uint32_t PutUnsigned(uint32_t offset, uint32_t byte_size, uint64_t value);
113112

113+
/// Encode an unsigned integer at offset \a offset.
114+
///
115+
/// Encode a single unsigned integer value at \a offset and return the offset
116+
/// that follows the newly encoded integer when the data is successfully
117+
/// encoded into the existing data. There must be enough room in the data,
118+
/// else UINT32_MAX will be returned to indicate that encoding failed.
119+
///
120+
/// \param[in] offset
121+
/// The offset within the contained data at which to put the encoded
122+
/// integer.
123+
///
124+
/// \param[in] value
125+
/// The integer value to write.
126+
///
127+
/// \return
128+
/// The next offset in the bytes of this data if the integer was
129+
/// successfully encoded, UINT32_MAX if the encoding failed.
130+
uint32_t PutU8(uint32_t offset, uint8_t value);
131+
uint32_t PutU16(uint32_t offset, uint16_t value);
132+
uint32_t PutU32(uint32_t offset, uint32_t value);
133+
uint32_t PutU64(uint32_t offset, uint64_t value);
134+
135+
/// Append an unsigned integer to the end of the owned data.
136+
///
137+
/// \param value
138+
/// An unsigned integer value to append.
139+
void AppendU8(uint8_t value);
140+
void AppendU16(uint16_t value);
141+
void AppendU32(uint32_t value);
142+
void AppendU64(uint64_t value);
143+
144+
/// Append an address sized integer to the end of the owned data.
145+
///
146+
/// \param addr
147+
/// An unsigned integer address value to append. The size of the address
148+
/// will be determined by the address size specified in the constructor.
149+
void AppendAddress(lldb::addr_t addr);
150+
151+
/// Append bytes to the end of the owned data.
152+
///
153+
/// Append the bytes contained in the string reference. This function will
154+
/// not append a NULL termination character for a C string. Use the
155+
/// AppendCString function for this purpose.
156+
///
157+
/// \param data
158+
/// A string reference that contains bytes to append.
159+
void AppendData(llvm::StringRef data);
160+
161+
/// Append a C string to the end of the owned data.
162+
///
163+
/// Append the bytes contained in the string reference along with an extra
164+
/// NULL termination character if the StringRef bytes doesn't include one as
165+
/// the last byte.
166+
///
167+
/// \param data
168+
/// A string reference that contains bytes to append.
169+
void AppendCString(llvm::StringRef data);
170+
114171
/// Encode an arbitrary number of bytes.
115172
///
116173
/// \param[in] offset
@@ -131,11 +188,10 @@ class DataEncoder {
131188
/// Encode an address in the existing buffer at \a offset bytes into the
132189
/// buffer.
133190
///
134-
/// Encode a single address (honoring the m_addr_size member) to the data
135-
/// and return the next offset where subsequent data would go. pointed to by
136-
/// \a offset_ptr. The size of the extracted address comes from the \a
137-
/// m_addr_size member variable and should be set correctly prior to
138-
/// extracting any address values.
191+
/// Encode a single address to the data and return the next offset where
192+
/// subsequent data would go. The size of the address comes from the \a
193+
/// m_addr_size member variable and should be set correctly prior to encoding
194+
/// any address values.
139195
///
140196
/// \param[in] offset
141197
/// The offset where to encode the address.
@@ -150,7 +206,10 @@ class DataEncoder {
150206

151207
/// Put a C string to \a offset.
152208
///
153-
/// Encodes a C string into the existing data including the terminating
209+
/// Encodes a C string into the existing data including the terminating NULL. If
210+
/// there is not enough room in the buffer to fit the entire C string and the
211+
/// NULL terminator in the existing buffer bounds, then this function will
212+
/// fail.
154213
///
155214
/// \param[in] offset
156215
/// The offset where to encode the string.
@@ -159,18 +218,32 @@ class DataEncoder {
159218
/// The string to encode.
160219
///
161220
/// \return
162-
/// A pointer to the C string value in the data. If the offset
163-
/// pointed to by \a offset_ptr is out of bounds, or if the
164-
/// offset plus the length of the C string is out of bounds,
165-
/// NULL will be returned.
221+
/// The next valid offset within data if the put operation was successful,
222+
/// else UINT32_MAX to indicate the put failed.
166223
uint32_t PutCString(uint32_t offset, const char *cstr);
167224

168-
private:
169-
uint32_t PutU8(uint32_t offset, uint8_t value);
170-
uint32_t PutU16(uint32_t offset, uint16_t value);
171-
uint32_t PutU32(uint32_t offset, uint32_t value);
172-
uint32_t PutU64(uint32_t offset, uint64_t value);
225+
/// Get a shared copy of the heap based memory buffer owned by this object.
226+
///
227+
/// This allows a data encoder to be used to create a data buffer that can
228+
/// be extracted and used elsewhere after this object is destroyed.
229+
///
230+
/// \return
231+
/// A shared pointer to the DataBufferHeap that contains the data that was
232+
/// encoded into this object.
233+
std::shared_ptr<lldb_private::DataBufferHeap> GetDataBuffer() {
234+
return m_data_sp;
235+
}
173236

237+
/// Get access to the bytes that this object references.
238+
///
239+
/// This value will always return the data that this object references even if
240+
/// the object was constructed with caller owned data.
241+
///
242+
/// \return
243+
/// An array reference to the data that this object references.
244+
llvm::ArrayRef<uint8_t> GetData() const;
245+
246+
private:
174247
uint32_t BytesLeft(uint32_t offset) const {
175248
const uint32_t size = GetByteSize();
176249
if (size > offset)
@@ -187,31 +260,6 @@ class DataEncoder {
187260
return length <= BytesLeft(offset);
188261
}
189262

190-
/// Adopt a subset of shared data in \a data_sp.
191-
///
192-
/// Copies the data shared pointer which adds a reference to the contained
193-
/// in \a data_sp. The shared data reference is reference counted to ensure
194-
/// the data lives as long as anyone still has a valid shared pointer to the
195-
/// data in \a data_sp. The byte order and address byte size settings remain
196-
/// the same. If \a offset is not a valid offset in \a data_sp, then no
197-
/// reference to the shared data will be added. If there are not \a length
198-
/// bytes available in \a data starting at \a offset, the length will be
199-
/// truncated to contains as many bytes as possible.
200-
///
201-
/// \param[in] data_sp
202-
/// A shared pointer to data.
203-
///
204-
/// \param[in] offset
205-
/// The offset into \a data_sp at which the subset starts.
206-
///
207-
/// \param[in] length
208-
/// The length in bytes of the subset of \a data_sp.
209-
///
210-
/// \return
211-
/// The number of bytes that this object now contains.
212-
uint32_t SetData(const lldb::DataBufferSP &data_sp, uint32_t offset = 0,
213-
uint32_t length = UINT32_MAX);
214-
215263
/// Test the validity of \a offset.
216264
///
217265
/// \return
@@ -223,25 +271,17 @@ class DataEncoder {
223271
///
224272
/// \return
225273
/// The total number of bytes of data this object refers to.
226-
size_t GetByteSize() const { return m_end - m_start; }
227-
228-
/// A pointer to the first byte of data.
229-
uint8_t *m_start = nullptr;
274+
size_t GetByteSize() const;
230275

231-
/// A pointer to the byte that is past the end of the data.
232-
uint8_t *m_end = nullptr;
276+
/// The shared pointer to data that can grow as data is added
277+
std::shared_ptr<lldb_private::DataBufferHeap> m_data_sp;
233278

234-
/// The byte order of the data we are extracting from.
279+
/// The byte order of the data we are encoding to.
235280
lldb::ByteOrder m_byte_order;
236281

237-
/// The address size to use when extracting pointers or
238-
/// addresses
282+
/// The address size to use when encoding pointers or addresses.
239283
uint8_t m_addr_size;
240284

241-
/// The shared pointer to data that can
242-
/// be shared among multiple instances
243-
mutable lldb::DataBufferSP m_data_sp;
244-
245285
DataEncoder(const DataEncoder &) = delete;
246286
const DataEncoder &operator=(const DataEncoder &) = delete;
247287
};
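
The header comments above distinguish `AppendData` (raw bytes, no terminator) from `AppendCString` (adds a NULL terminator unless the last byte already is one). A minimal sketch of that documented behavior follows. It assumes free functions over a `std::vector<uint8_t>` and uses `std::string` in place of `llvm::StringRef` so that it compiles on its own; the actual implementation in DataEncoder.cpp is not shown in this part of the diff and may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helpers that mirror the documented semantics only.

// Append the bytes exactly as given; no terminator is added.
void AppendData(std::vector<uint8_t> &out, const std::string &data) {
  out.insert(out.end(), data.begin(), data.end());
}

// Append the bytes, then add a single NULL byte unless the input already
// ends with one.
void AppendCString(std::vector<uint8_t> &out, const std::string &data) {
  AppendData(out, data);
  if (data.empty() || data.back() != '\0')
    out.push_back(0);
}

// Usage: "ab" adds 2 bytes, "cd" as a C string adds 3 (with terminator).
inline void AppendUsage() {
  std::vector<uint8_t> buf;
  AppendData(buf, "ab");
  AppendCString(buf, "cd");
  assert(buf.size() == 5);
  assert(buf.back() == 0);
}
```

Passing a string that already ends in a NULL byte, such as `std::string("e\0", 2)`, adds no second terminator in this sketch.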

lldb/include/lldb/lldb-forward.h

Lines changed: 1 addition & 0 deletions
@@ -65,6 +65,7 @@ class DWARFCallFrameInfo;
6565
class DWARFDataExtractor;
6666
class DWARFExpression;
6767
class DataBuffer;
68+
class DataBufferHeap;
6869
class DataEncoder;
6970
class DataExtractor;
7071
class Debugger;
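
The new forward declaration is presumably what lets DataEncoder.h name `std::shared_ptr<lldb_private::DataBufferHeap>` in a member and in the inline `GetDataBuffer()` without including the full class definition. The sketch below shows the general C++ rule with hypothetical names (`HeapBuffer`, `Owner`); it is not LLDB code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

class HeapBuffer; // forward declaration only; the type is incomplete here

// Holding, copying, and returning a shared_ptr to an incomplete type is
// allowed, so this class compiles before HeapBuffer is defined.
class Owner {
public:
  explicit Owner(std::shared_ptr<HeapBuffer> sp) : m_sp(std::move(sp)) {}
  std::shared_ptr<HeapBuffer> Get() const { return m_sp; }

private:
  std::shared_ptr<HeapBuffer> m_sp;
};

// The full definition is only needed where the object is created or used.
class HeapBuffer {
public:
  explicit HeapBuffer(std::size_t size) : m_size(size) {}
  std::size_t size() const { return m_size; }

private:
  std::size_t m_size;
};

// Usage: the owner shares the buffer with whoever asks for it.
inline void ForwardDeclUsage() {
  auto buf = std::make_shared<HeapBuffer>(16);
  Owner owner(buf);
  assert(owner.Get()->size() == 16);
}
```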

lldb/source/Expression/DWARFExpression.cpp

Lines changed: 11 additions & 19 deletions
@@ -460,22 +460,19 @@ bool DWARFExpression::Update_DW_OP_addr(lldb::addr_t file_addr) {
460460
// first, then modify it, and if all goes well, we then replace the data
461461
// for this expression
462462

463-
// So first we copy the data into a heap buffer
464-
std::unique_ptr<DataBufferHeap> head_data_up(
465-
new DataBufferHeap(m_data.GetDataStart(), m_data.GetByteSize()));
466-
467-
// Make en encoder so we can write the address into the buffer using the
468-
// correct byte order (endianness)
469-
DataEncoder encoder(head_data_up->GetBytes(), head_data_up->GetByteSize(),
463+
// Make an encoder that contains a copy of the location expression data
464+
// so we can write the address into the buffer using the correct byte
465+
// order.
466+
DataEncoder encoder(m_data.GetDataStart(), m_data.GetByteSize(),
470467
m_data.GetByteOrder(), addr_byte_size);
471468

472469
// Replace the address in the new buffer
473-
if (encoder.PutUnsigned(offset, addr_byte_size, file_addr) == UINT32_MAX)
470+
if (encoder.PutAddress(offset, file_addr) == UINT32_MAX)
474471
return false;
475472

476473
// All went well, so now we can reset the data using a shared pointer to
477474
// the heap data so "m_data" will now correctly manage the heap data.
478-
m_data.SetData(DataBufferSP(head_data_up.release()));
475+
m_data.SetData(encoder.GetDataBuffer());
479476
return true;
480477
} else {
481478
const offset_t op_arg_size = GetOpcodeDataSize(m_data, offset, op);
@@ -521,15 +518,11 @@ bool DWARFExpression::LinkThreadLocalStorage(
521518
// We have to make a copy of the data as we don't know if this data is from a
522519
// read only memory mapped buffer, so we duplicate all of the data first,
523520
// then modify it, and if all goes well, we then replace the data for this
524-
// expression
525-
526-
// So first we copy the data into a heap buffer
527-
std::shared_ptr<DataBufferHeap> heap_data_sp(
528-
new DataBufferHeap(m_data.GetDataStart(), m_data.GetByteSize()));
521+
// expression.
529522

530-
// Make en encoder so we can write the address into the buffer using the
531-
// correct byte order (endianness)
532-
DataEncoder encoder(heap_data_sp->GetBytes(), heap_data_sp->GetByteSize(),
523+
// Make an encoder that contains a copy of the location expression data so we
524+
// can write the address into the buffer using the correct byte order.
525+
DataEncoder encoder(m_data.GetDataStart(), m_data.GetByteSize(),
533526
m_data.GetByteOrder(), addr_byte_size);
534527

535528
lldb::offset_t offset = 0;
@@ -603,7 +596,7 @@ bool DWARFExpression::LinkThreadLocalStorage(
603596
// and read the
604597
// TLS data
605598
m_module_wp = new_module_sp;
606-
m_data.SetData(heap_data_sp);
599+
m_data.SetData(encoder.GetDataBuffer());
607600
return true;
608601
}
609602

@@ -2817,4 +2810,3 @@ bool DWARFExpression::MatchesOperand(StackFrame &frame,
28172810
return MatchRegOp(*reg)(operand);
28182811
}
28192812
}
2820-
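
Both DWARFExpression changes follow the same copy, patch, then swap pattern: the encoder copies the possibly read-only expression bytes, the address is written into the copy, and the copy replaces the old data only if the write succeeded. A self-contained sketch of that pattern is below. `PatchAddress` and the `Buffer` alias are hypothetical names, the byte order is fixed to little-endian for brevity, and none of this is the LLDB implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

using Buffer = std::vector<uint8_t>;

// Hypothetical helper: copy `data`, write `addr` (little-endian, `addr_size`
// bytes) at `offset` in the copy, and return the copy. Returns nullptr if
// the address does not fit. The caller's bytes are never modified.
std::shared_ptr<Buffer> PatchAddress(const uint8_t *data, uint32_t length,
                                     uint32_t offset, uint64_t addr,
                                     uint8_t addr_size) {
  if (addr_size > 8 || offset > length || length - offset < addr_size)
    return nullptr;
  auto copy = std::make_shared<Buffer>(data, data + length);
  for (uint8_t i = 0; i < addr_size; ++i)
    (*copy)[offset + i] = static_cast<uint8_t>(addr >> (8 * i));
  return copy;
}

// Usage: only adopt the new buffer when the patch succeeded.
inline bool PatchUsage() {
  const uint8_t original[] = {0x03, 0, 0, 0, 0}; // one byte, then 4 address bytes
  std::shared_ptr<Buffer> patched = PatchAddress(original, 5, 1, 0x1000, 4);
  if (!patched)
    return false; // keep using the original data
  assert(original[2] == 0); // source bytes untouched
  return (*patched)[2] == 0x10;
}
```

Returning a shared pointer to the patched copy plays the role that `encoder.GetDataBuffer()` plays in the diff: the buffer outlives the object that produced it.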
