Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copybook.parseSimple not dropping the fillers in output ast #324

Closed
ghost opened this issue Sep 17, 2020 · 19 comments
Closed

Copybook.parseSimple not dropping the fillers in output ast #324

ghost opened this issue Sep 17, 2020 · 19 comments
Labels
accepted Accepted for implementation bug Something isn't working

Comments

@ghost
Copy link

ghost commented Sep 17, 2020

String copybookContents = "01 RECORD.
05 FILLER PIC X(1).
05 COMPANY_PREFIX PIC X(3).
05 FILLER PIC X(1).
05 FILLER PIC X(1).
05 COMPANY_NAME PIC X(9)."

Group grp = CopybookParser.parseSimple(copybookContents, true,true,commentPolicy.apply(false,6,72)).ast();

in the above scenario as drop_value_fillers is true then the output ast also should not contain the FILLERS. But output ast is providing each column details.

@ghost ghost added the bug Something isn't working label Sep 17, 2020
@yruslan
Copy link
Collaborator

yruslan commented Sep 22, 2020

Thanks for the bug report. Dropping of fillers is a feature of 'spark-cobol'. Fillers should be deopped in Spark schema while still can be present in the AST.
But your expectation makes sense. We can drop fillers from AST as well.

@yruslan yruslan added the accepted Accepted for implementation label Sep 22, 2020
@yruslan
Copy link
Collaborator

yruslan commented Feb 2, 2022

Hi, I was revisiting this issue.
Each AST element has isFiller flag, so you can always ignore all fields if the flag is true.

Is it necessary for your use case to actually remove fillers from the AST if you have this flag?

@ghost
Copy link
Author

ghost commented Feb 2, 2022

That's how we are handling in our code while parsing the ast. It's a good to have feature to drive via parsesimple

@yruslan
Copy link
Collaborator

yruslan commented Feb 2, 2022

Could you please elaborate on what are you trying to achieve?

Every use case that I can think of can be done by just ignoring fields (AST statements) for which isFiller == true.

@ghost
Copy link
Author

ghost commented Feb 2, 2022

We are currently parsing the copybook using ast and converting to json. We are using parsesimple to get the ast. if based on the user options if we can drop or retain the filler in ast then for us its not required to handle while converting back to json.

@yruslan
Copy link
Collaborator

yruslan commented Feb 2, 2022

Sure, but using node.isFiller can be used to achieve the same, isn't it?

What's the code snippet that you are using?

@yruslan
Copy link
Collaborator

yruslan commented Feb 2, 2022

I see why you are asking for the feature. There are parameters that say 'dropGroupFiller' and 'dropValueFillers' but nothing actually dropped, just marked. It makes sense. We can implement the dropping as well

@ghost
Copy link
Author

ghost commented Feb 2, 2022

You are correct. Currently we are parsing and dropping the fillers by utilizing the isFiller. currently the parameter we are sending for dropping filler to parseSimple is not functional.

@yruslan
Copy link
Collaborator

yruslan commented Feb 3, 2022

This is implemented.

Please, check the latest master. Note that the signature of the method (parseSimple) has changed. It now reflects what is actually being done.

Spark schemas doe not support having 2 column names having tha same name so FILLERs need to be renamed in order to be retained. So the method allow controlling if fillers are going to be renamed, and a separate flag that controls if non-renamed fields should be dropped.

@ghost
Copy link
Author

ghost commented Feb 3, 2022

@yruslan If signature is changed then it will break our existing code if we update the version, we need to make it backward compatible. Can we have it overloaded.

@yruslan
Copy link
Collaborator

yruslan commented Feb 3, 2022

Okay, fair point. Will make it compatible

@ghost
Copy link
Author

ghost commented Feb 3, 2022

@yruslan thanks a lot. you can mark the existing parsesimple signature to @deprecated

@yruslan
Copy link
Collaborator

yruslan commented Feb 3, 2022

The changes are in master - you can check. To retain the compatibility the behavior is the same by default.
Use 'dropFillersFromAst = true' to actually drop fillers from the AST.

@ghost
Copy link
Author

ghost commented Feb 4, 2022

So as in parseSimple is still have a new parameter dropFillersFromAst in the signature so once we upgrade we have to provide the value in our calling code. In scala it won't ask as the default value is given for it as false but in Java we have to explicitly specify to false while calling.

@yruslan
Copy link
Collaborator

yruslan commented Feb 4, 2022

Yeah, but this is the only way to preserve backward compatible behavior. Since the method has default values it cannot be overloaded.

@yruslan
Copy link
Collaborator

yruslan commented Feb 4, 2022

2.4.8 is released and it has the fix.

Btw, Cobrix has converters project that currently just provides examples on how to convert mainframe files to JSON without Spark. Consider contributing your converter to the project :)

@yruslan yruslan closed this as completed Feb 4, 2022
@ghost
Copy link
Author

ghost commented Feb 4, 2022

Sure I would be happy to contribute.
Can I know the overall requirement and detail plan for it. Is there any high level design document or still it is in process. I can contribute their also. Any plans to have the code base in Java or it will be completely in scala :)

@yruslan
Copy link
Collaborator

yruslan commented Feb 7, 2022

We don't have an exact plan at the moment. Just ideas. The one that we might likely to implement is a command-line tool + library that allows converting mainframe files to JSON.

Yes, the project will continue to be in Scala.

@ghost
Copy link
Author

ghost commented Feb 7, 2022

Sure then I think of 2 requirements

  1. Converting copybook ast to corresponding json
  2. Converting copybook to a map with key as parent hierarchy and value as column name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted Accepted for implementation bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant