Simplify processing ASCII text files that have line ending character #295

Closed
3 tasks done
yruslan opened this issue May 25, 2020 · 0 comments · Fixed by #296

yruslan commented May 25, 2020

Background

Currently, parsing a copybook for ASCII data requires every parameter to be specified explicitly, as in this code snippet:

    val parsedCopybook = CopybookParser.parseTree(ASCII(), copybook,
      dropGroupFillers = false, segmentRedefines = Seq(),
      stringTrimmingPolicy = StringTrimmingPolicy.TrimNone,
      ebcdicCodePage = CodePage.getCodePageByName("common"), nonTerminals = Seq())
    val cobolSchema = new CobolSchema(parsedCopybook, SchemaRetentionPolicy.CollapseRoot, false)
    val sparkSchema = cobolSchema.getSparkSchema

Feature

  • Simplify loading ASCII files by creating a method that has default options for all parameters.
  • Add direct support for this kind of file.
  • Update the documentation.
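
As a rough illustration of the first item, the sketch below shows how Scala default arguments let a caller pass only the copybook text. Everything in it is hypothetical: `TrimmingPolicy`, `ParseOptions`, `ParseRequest` and `AsciiCopybookDefaults.parseAscii` are stand-ins invented for this example and are not the real Cobrix classes or the method that was eventually added. Only the default values mirror the snippet above.

```scala
// Hypothetical stand-ins for the library types; NOT the real Cobrix API.
sealed trait TrimmingPolicy
case object TrimNone extends TrimmingPolicy
case object TrimBoth extends TrimmingPolicy

// Bundles the options that the snippet above had to spell out one by one.
final case class ParseOptions(
  dropGroupFillers: Boolean,
  segmentRedefines: Seq[String],
  trimming: TrimmingPolicy,
  codePageName: String,
  nonTerminals: Seq[String]
)

final case class ParseRequest(copybook: String, options: ParseOptions)

object AsciiCopybookDefaults {
  // Hypothetical convenience method: every option has a default,
  // so the common ASCII case needs only the copybook contents.
  def parseAscii(
    copybook: String,
    dropGroupFillers: Boolean = false,
    segmentRedefines: Seq[String] = Seq.empty,
    trimming: TrimmingPolicy = TrimNone,
    codePageName: String = "common",
    nonTerminals: Seq[String] = Seq.empty
  ): ParseRequest =
    ParseRequest(
      copybook,
      ParseOptions(dropGroupFillers, segmentRedefines, trimming, codePageName, nonTerminals)
    )
}
```

With such a method, the common case shrinks to `AsciiCopybookDefaults.parseAscii(copybook)`, while a caller who needs different behaviour overrides a single option by name, for example `parseAscii(copybook, trimming = TrimBoth)`.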
yruslan added the enhancement (New feature or request) label May 25, 2020
yruslan self-assigned this May 25, 2020
yruslan changed the title from "Simplify copybook parsing for ASCII encoded data" to "Simplify processing ASCII text files that have line ending character" May 25, 2020
yruslan added 4 commits that referenced this issue (two on May 27, 2020 and two on May 28, 2020), each with the same message:
Spark's support for text files is very limited for our use case since it does not support
encodings and custom line endings. We need to re-implement this feature using a
variable-length reader.
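
To make the idea in the commit message concrete, here is a minimal, self-contained sketch of splitting raw bytes into text records on line endings while decoding with an explicit charset. It is not the variable-length reader from the referenced commits; `LineRecords.split` is a hypothetical helper written only for illustration, and it assumes a single-byte, ASCII-compatible encoding in which bytes 0x0A and 0x0D always mean LF and CR.

```scala
import java.nio.charset.{Charset, StandardCharsets}

object LineRecords {
  private val LF: Byte = 0x0A
  private val CR: Byte = 0x0D

  // Splits raw bytes into records terminated by LF, CR, or CRLF and
  // decodes each record with the given charset (US-ASCII by default).
  def split(data: Array[Byte], charset: Charset = StandardCharsets.US_ASCII): Seq[String] = {
    val records = Seq.newBuilder[String]
    var start = 0
    var i = 0
    while (i < data.length) {
      val b = data(i)
      if (b == LF || b == CR) {
        records += new String(data, start, i - start, charset)
        // Treat CR immediately followed by LF as a single terminator.
        if (b == CR && i + 1 < data.length && data(i + 1) == LF) i += 1
        start = i + 1
      }
      i += 1
    }
    // Emit a trailing record that has no terminator.
    if (start < data.length) records += new String(data, start, data.length - start, charset)
    records.result()
  }
}
```

Because each record's length is only known once its terminator is found, text files of this kind behave like variable-length records rather than fixed-length ones.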
yruslan reopened this May 28, 2020
yruslan closed this as completed Jun 12, 2020