Patrick/move out las reader writer #4

Merged
merged 39 commits into from
Oct 18, 2021

Conversation

patrick-reinhard
Owner

@patrick-reinhard patrick-reinhard commented Sep 27, 2021

Note: First merge #3 before merging this.

  • Tests are passing
  • Moving out LAS reader and writer functionality from Curve and Well class to the las.py module.
  • Tried to leave interfaces intact. The only interface change is that the Project.header attribute is now a pd.DataFrame instead of a Header class. Adapted functions and methods to accommodate Project.header being a pd.DataFrame.
  • Redirect from_lasio() to from_las() and add deprecation warning to from_lasio().
  • Mapping a LAS file (version <= 2.0) 1 to 1 to a pair of pd.DataFrames by splitting the header and the data section of the LAS file into two dataframes:
  datasets = {'Curves': (data, header)}

  data = pd.DataFrame({
      'DEPT': [100.0, 101.0, 102.0],
      'GR': [80.0, 85.0, 82.0],
      'DEN': [2.10, 2.15, 2.20]
  })

  header = pd.DataFrame({
      'original_mnemonic': ['VERS', 'WRAP', 'STRT', 'STOP', 'STEP', 'DEPT', 'GR', 'DEN', ''],
      'mnemonic': ['VERS', 'WRAP', 'STRT', 'STOP', 'STEP', 'DEPT', 'GR', 'DEN', ''],
      'unit': ['', '', 'M', 'M', 'M', 'M', 'GAPI', 'g/cm3', ''],
      'value': [2.0, 'NO', 100.0, 102.0, 1.0, '', '', '', ''],
      'descr': ['Version 2.0', 'One line per depth step', '', '', '', 'DEPTH', 'Gamma Ray', 'Density', 'Comment'],
      'section': ['Version', 'Version', 'Well', 'Well', 'Well', 'Curves', 'Curves', 'Curves', 'Other']
  })
  • The datasets design already accommodates LAS 3.0 (2D and 3D data), but the lasio work on this is not yet finished. For LAS <= 2.0 you can map a LAS file 1 to 1 to a dataframe. For LAS 3.0 you can have multiple sections and runs that each map 1 to 1 to a dataframe:
  datasets = {
      'Curves':   (data, header),  # for LAS 1.2 & LAS 2.0
      'ASCII':    (data, header),  # for LAS 3.0
      'Drilling': (data, header),  # for LAS 3.0
      'Core[1]':  (data, header),  # for LAS 3.0 - Run 1
      'Core[2]':  (data, header)   # for LAS 3.0 - Run 2
  }
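
The redirect from from_lasio() to from_las() mentioned in the list above could look roughly like the sketch below. It is only an illustration of the usual deprecation pattern with the standard `warnings` module: `ToyWell`, its `source` attribute, and both method signatures are placeholders, not the actual welly code.

```python
import warnings


class ToyWell:
    """Hypothetical stand-in for the real class; only the redirect is shown."""

    def __init__(self, source):
        self.source = source

    @classmethod
    def from_las(cls, source):
        # Placeholder: the real method would parse the LAS input here.
        return cls(source)

    @classmethod
    def from_lasio(cls, source):
        # Warn the caller, then delegate to the new entry point.
        warnings.warn(
            "from_lasio() is deprecated; use from_las() instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return cls.from_las(source)
```

Existing callers keep working through the old name, while anyone running with warnings enabled is pointed at the new one.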

Collaborator

@fkiraly-shell fkiraly-shell left a comment

Nice piece of work!

I've looked only at curve, well, las and the tests.

Major comments:

  • this PR seems to contain changes to plotting that are unrelated to the claimed content (las writer)? This seems problematic, especially since the other PR is still being worked on. What is the branch dependency and merge sequence here?
  • I'd put the las module not in the root but in a data_io folder or similar.
  • there are also changes to file level fixture paths. Why is that? Would this not be "safer" in a separate PR, to avoid doing too much in one place? Especially since it is also touching tests and methods unrelated to the LAS IO concern? Or is there a good reason to do this here?
  • read/write tests can act up due to paths being interpreted differently on different operating systems. Are we sure that the writing works correctly for every developer setup?
  • there is still a lot of read/write/formatting logic in the well class, in from_las and to_las and add_curves_from_lasio. Do we really need that in there? It feels like it should be a loading/IO concern, and calling a method that creates curves from the output of the loaders in the las module.
  • similar in curve.from_lasio_curve, do we really want all that stuff in there?

@patrick-reinhard
Owner Author

patrick-reinhard commented Oct 6, 2021

  • this PR seems to contain changes to plotting that are unrelated to the claimed content (las writer)? This seems problematic, especially since the other PR is still being worked on. What is the branch dependency and merge sequence here?

The merge sequence is #3, then this PR, then #6. #3 is already merged into this one and there are no conflicts.

  • I'd put the las module not in the root but in a data_io folder or similar.

I'd like to stick to the current structure without subfolders.

  • there are also changes to file level fixture paths. Why is that? Would this not be "safer" in a separate PR, to avoid doing too much in one place? Especially since it is also touching tests and methods unrelated to the LAS IO concern? Or is there a good reason to do this here?

I think it is safe, as we are merging the PRs sequentially, one by one. I needed to adjust a few tests to accommodate the new header object type (pd.DataFrame instead of Header).
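
As an illustration of that kind of test adjustment, a header stored as a pd.DataFrame can be queried by filtering rows on the mnemonic column. The `header_item` helper below is hypothetical (it is not part of welly), and the column names follow the example header in the PR description.

```python
import pandas as pd

header = pd.DataFrame({
    'original_mnemonic': ['VERS', 'STRT', 'STOP', 'GR'],
    'mnemonic': ['VERS', 'STRT', 'STOP', 'GR'],
    'unit': ['', 'M', 'M', 'GAPI'],
    'value': [2.0, 100.0, 102.0, ''],
    'descr': ['Version 2.0', '', '', 'Gamma Ray'],
    'section': ['Version', 'Well', 'Well', 'Curves'],
})


def header_item(header, mnemonic, column='value'):
    """Hypothetical helper: return one cell for a mnemonic, or None if absent."""
    rows = header.loc[header['mnemonic'] == mnemonic, column]
    return rows.iloc[0] if not rows.empty else None


# Test-style checks against the DataFrame header.
assert header_item(header, 'STRT') == 100.0
assert header_item(header, 'GR', column='unit') == 'GAPI'
assert header_item(header, 'MISSING') is None
```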

  • read/write tests can act up due to paths being interpreted differently on different operating systems. Are we sure that the writing works correctly for every developer setup?

Added utils.to_filename(path) and lasio is also taking care of that here
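
For reference, one OS-independent way to handle paths in a read/write test is to build them with `pathlib` inside a temporary directory. This is a minimal sketch, not the actual utils.to_filename implementation: `as_filename` and `roundtrip` are hypothetical names used only for illustration.

```python
import tempfile
from pathlib import Path


def as_filename(path):
    """Hypothetical path helper: accept a str or Path and return a str."""
    return str(Path(path))


def roundtrip(text, directory):
    # Join path parts with pathlib so separators are right on every OS.
    target = Path(directory) / "out" / "example.las"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    # Read back through the string form, as a file-based reader would.
    return Path(as_filename(target)).read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    result = roundtrip("~Version\n", tmp)
```

Writing into a temporary directory also keeps the test independent of the developer's working directory.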

  • there is still a lot of read/write/formatting logic in the well class, in from_las and to_las and add_curves_from_lasio. Do we really need that in there? It feels like it should be a loading/IO concern, and calling a method that creates curves from the output of the loaders in the las module.

Yes, I agree. We could move that out. I can also address that in the next PR, #6. This PR is really only about moving out the reader/writer logic (from and to disk) while trying to keep interfaces intact.

  • similar in curve.from_lasio_curve, do we really want all that stuff in there?

This I will address in #6 where I change the curve object. Didn't want to touch the curve construction in this PR.

@patrick-reinhard patrick-reinhard marked this pull request as ready for review October 6, 2021 15:53
@patrick-reinhard patrick-reinhard linked an issue Oct 7, 2021 that may be closed by this pull request
8 tasks
Collaborator

@justyna-przybysz justyna-przybysz left a comment

Hi Patrick,

this looks really good. I had a few comments related to the docs. Do you have a notebook or .py file for testing that you could share?

@patrick-reinhard
Owner Author

@fkiraly-shell
I've pushed changes to address your feedback. There are 2 things I would like to do in #6 instead, because they will be influenced by that PR, so making the changes now would create extra rework:

  • Implementing well.to_datasets()
  • Moving out construction logic from well.from_datasets().

I've added TODO tags to keep track of those.
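
To make the two deferred items concrete, a dataset round trip could look roughly like the sketch below. It assumes the `{name: (data, header)}` layout from the PR description. `MiniWell`, its attributes, and the `name` and `index` parameters are hypothetical and do not reflect the final welly interface.

```python
import pandas as pd


class MiniWell:
    """Hypothetical, stripped-down stand-in for a well object."""

    def __init__(self, curves, header, index_name):
        self.curves = curves          # dict: mnemonic -> pd.Series indexed by depth
        self.header = header          # pd.DataFrame of header rows
        self.index_name = index_name  # name of the depth column

    @classmethod
    def from_datasets(cls, datasets, name='Curves', index='DEPT'):
        # Construction only: no file reading happens here.
        data, header = datasets[name]
        indexed = data.set_index(index)
        curves = {col: indexed[col] for col in indexed.columns}
        return cls(curves, header, index)

    def to_datasets(self, name='Curves'):
        # Inverse of from_datasets: rebuild the (data, header) pair.
        data = pd.DataFrame(self.curves).rename_axis(self.index_name).reset_index()
        return {name: (data, self.header)}


data = pd.DataFrame({
    'DEPT': [100.0, 101.0, 102.0],
    'GR': [80.0, 85.0, 82.0],
    'DEN': [2.10, 2.15, 2.20],
})
header = pd.DataFrame({
    'mnemonic': ['DEPT', 'GR', 'DEN'],
    'unit': ['M', 'GAPI', 'g/cm3'],
    'section': ['Curves', 'Curves', 'Curves'],
})

well = MiniWell.from_datasets({'Curves': (data, header)})
out_data, out_header = well.to_datasets()['Curves']
```

Keeping both directions free of file access would leave all reading and writing in the las module.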

Requesting your review and approval.

@patrick-reinhard patrick-reinhard merged commit 086deb3 into develop Oct 18, 2021
Development

Successfully merging this pull request may close these issues.

Welly refactor plan
4 participants