-
-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enh : Create physical string descriptor to be used with allocatable strings #4918
Conversation
06b154f
to
fba28db
Compare
Can you please help me understand the below ASR changes? (I've put in comments in ASR itself for what my question is) On your branch, for the below program: program main
implicit none
character(len=:), allocatable :: str1
character(len=:), allocatable :: str2(:)
end program main
below is the ASR generated using (TranslationUnit
(SymbolTable
1
{
main:
(Program
(SymbolTable
2
{
str1:
(Variable
2
str1
[]
Local
()
()
Default
(Allocatable
(Character 1 -2 () DescriptorString) ;; this seems alright to me
)
()
Source
Public
Required
.false.
),
str2:
(Variable
2
str2
[]
Local
()
()
Default
(Allocatable
(Array
(Character 1 -2 () PointerString) ;; is this expected to be DescriptorArray or PointerString?
[(()
())]
DescriptorArray
)
)
()
Source
Public
Required
.false.
)
})
main
[]
[(ImplicitDeallocate
[(Var 2 str1)
(Var 2 str2)]
)]
)
})
[]
) |
src/libasr/asr_utils.h
Outdated
@@ -196,6 +196,12 @@ static inline ASR::ttype_t* type_get_past_array_pointer_allocatable( | |||
type_get_past_pointer(f))); | |||
} | |||
|
|||
static inline bool is_allocatable_string(ASR::ttype_t* t){ | |||
return ASR::is_a<ASR::Character_t>(*ASRUtils::type_get_past_array_pointer_allocatable(t)) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also maybe put an assert that it's an Allocatable
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's much safer. I'll do that.
Do you think We should add such an assertion in ASR verify? to make sure any character type with physicalType = DescriptorString is an allocatable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think We should add such an assertion in ASR verify? to make sure any character type with physicalType = DescriptorString is an allocatable
Yes, that would be nice. If my understanding is right, that should be the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes definitely put this into verify(). All "assumptions" should be checked by verify().
It's a design detail. but I didn't see a benefit for |
@assem2002 let me know when this is ready for review from me. |
Go ahead and resolve the conflicts. |
Ok, so there is one fundamental problem here that we need to fix. Consider, let's say, this test: https://github.com/assem2002/lfortran/edit/descriptor_string_FEEDBACK/tests/reference/asr-string_04-2ebc0e6.stdout?pr=%2Flfortran%2Flfortran%2Fpull%2F4918 In here, we have the
And then we assign to it like this:
The types here do not match. The LHS of the assignment is of type So there are two issues to fix:
|
@certik so the key goal here is to make clean ASR, right? as using assignment statement with incompatible sides aren't an issue in the backend we just use runtime function |
Clean ASR and clean backend. When converting descriptor to pointer, you just grab the pointer, but you do it in the visit_cast in the backend (when casting from pointer to descriptor, you need to create a descriptor, in the visit_cast). So in assignment you only assign either two pointers or two descriptors, which is straightforward. |
Yeah I think explicit ASR casting is better way to to assignments and argument passing. This may even can get rid of the
|
Yes, exactly.
Yes. |
efd1f1f
to
929c571
Compare
ASRUtils::type_get_past_array_pointer_allocatable(t))->m_physical_type == ASR::string_physical_typeType::DescriptorString; | ||
} | ||
|
||
static inline ASR::expr_t* cast_string_pointer_to_descriptor(Allocator& al, ASR::expr_t* string){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add documentation:
This functions casts if necessary.
Let's document in the String type in doc/... when each physical type is used. Currently:
|
3457583
to
4123d10
Compare
The PR is ready for review. I'm pushing documentation, It'd be helpful to read it before reviewing this. |
I'm sorry I forgot about this today, I'll take a look tomorrow morning. |
d23cf94
to
63e376f
Compare
|
||
- **DescriptorString :** | ||
- It's an LLVM struct to hold information about data, size and capacity. | ||
- It's represented in LLVM IR as : `{char*, int64, int64}`. The benefit from having `size` and `capacity` is that it gives us the ability to have dynamic string that's not computationally expensive, just like `std::vector` in c++. So the key points are: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It's represented in LLVM IR as : `{char*, int64, int64}`. The benefit from having `size` and `capacity` is that it gives us the ability to have dynamic string that's not computationally expensive, just like `std::vector` in c++. So the key points are: | |
- It's represented in LLVM IR as : `{char*, int64, int64}`. The benefit from having `size` and `capacity` is that it gives us the ability to have dynamic string that's not computationally expensive, just like `std::string` (and `std::vector`) in c++. So the key points are: |
- It's represented in LLVM IR as : `{char*, int64, int64}`. The benefit from having `size` and `capacity` is that it gives us the ability to have dynamic string that's not computationally expensive, just like `std::vector` in c++. So the key points are: | ||
- `size` avoids calling `strlen` when we want to know the size of a string. | ||
- `capacity` gives us the flexibility of extending the string without losing performance by doubling the memory location every time we run out of allocated memory space. | ||
- DescriptorString must always be an allocatable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add here when in ASR the DescriptorString is (or should be) used, and when PointerString should be used. See #4918 (comment).
Currently:
- if it is allocatable, it is DescriptorString
- otherwise it is PointerString
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, a fortran example would be nice as well, just the way we've it for StringPhysicalCast above (that was quite easy to understand)
This is ready for review. This PR does one change: for allocatable strings it uses DescriptorString physical type now. @gxyd, @Pranavchiku can you please carefully review this PR? At high level it looks good to me, we just need to check all details. |
…apacity with the runtime functions
…hysical descriptor string
…g Types + minor change in string_section_assign
…not value (function return by value not pointer to avoid dangling pointer)
7b2a524
to
5e64d08
Compare
ready to review. |
@gxyd, @Pranavchiku can you please review this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR is good. We can improve upon it later as needed.
Sorry, I didn't get the time to review it. |
No worries, we reviewed it at the phone call yesterday. |
Towards #2257
In separate PR I'll utilize
size
andcapacity
inlfortran_alloc()
,lfortran_strcpy()
,lfortran_string_write()