Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Export WordEmbeddingsTransform to ONNX #1249

Merged
merged 44 commits into from
Oct 25, 2018

Conversation

vaeksare
Copy link
Member

@vaeksare vaeksare commented Oct 12, 2018

Implements the ability to export the WordEmbeddingsTransform by converting it to an ONNX model, as well as expanding the functionality of some existing structures to allow for more efficient conversion implementation. The detailed conversion strategy can be found in comments inline. Fixes #1248

Testing was done through running the model on the same input using ML.NET and Lotus runtime directly using python bindings, producing the same results. The verified ONNX model (run in Lotus to check results) saved in Json format is used as a baseline for the formal tests.

The resulting ONNX model looks as follows:
wordembeddings

@vaeksare vaeksare self-assigned this Oct 12, 2018
@dnfclas
Copy link

dnfclas commented Oct 12, 2018

CLA assistant check
All CLA requirements met.

@vaeksare vaeksare changed the title Export WordEmbeddingsTransform to ONNX WIP: Export WordEmbeddingsTransform to ONNX Oct 15, 2018
@vaeksare vaeksare changed the title WIP: Export WordEmbeddingsTransform to ONNX Export WordEmbeddingsTransform to ONNX Oct 17, 2018
// | | |
// J[j] K[j] L[j]
// | | |
// --------------------Concat (axis = 1) ---------------------------
Copy link
Member

@wschin wschin Oct 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

``suggestion
// '-------------------Concat (axis = 1) ---------------------------'

Copy link
Member

@wschin wschin Oct 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not fixed yet? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed now.


In reply to: 226109842 [](ancestors = 226109842)

Copy link
Contributor

@justinormont justinormont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@vaeksare : Would it be worth writing a unit test which checks output equality between the WordEmbedding transform using ML.NET & its ONNX export? This will ensure the ONNX export stays in sync w/ the behavior of the transform in the future.

Currently, I think you are testing that the ONNX output is the expected value, but I'm concerned that when we do checkins which change the test values, we simply blindly update the expected test outputs without checking for equality between the ML.NET model & its ONNX conversion.

Possible cases (same type you manually checked): #1249 (comment)

// / / | | \ \
// | Cast (to = int64) | Cast (to = float) Not |
// | | | | | |
// '------------ Add -----------' | Scale (scale = 2.0) Cast (to = int32) |
Copy link
Member

@wschin wschin Oct 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing variable names between operators! Operators are never connected directly. There must be some variables between them. Please also make sure those names are consistent to your comments for your code. Thanks. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, should be good now!


In reply to: 227994103 [](ancestors = 227994103)

@vaeksare
Copy link
Member Author

@justinormont This was something I considered and talked to other people about initially. The issue that occurs is that our OnnxTransform (the only real way we have to run ONNX models directly) currently only works on Windows. As such, these tests would have to be made to only runs on Windows builds, which kind of defeats the purpose and means we still need the original ones. When the ability to run ONNX models is added to all platforms, this is likely something we should look into. But at the moment, I don't believe investing time into this is worth it.

wschin and others added 3 commits October 24, 2018 16:53
Co-Authored-By: vaeksare <vaeksare@microsoft.com>
Co-Authored-By: vaeksare <vaeksare@microsoft.com>
Co-Authored-By: vaeksare <vaeksare@microsoft.com>
Copy link
Member

@wschin wschin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. It looks great. :shipit:

@vaeksare vaeksare merged commit 0cdde0f into dotnet:master Oct 25, 2018
@vaeksare vaeksare deleted the savewordembeddings branch October 25, 2018 17:23
LittleLittleCloud added a commit to LittleLittleCloud/machinelearning that referenced this pull request Jan 26, 2022
* add sweepable api example

* fix tests

* add grid search tuner

* use grid search
LittleLittleCloud added a commit that referenced this pull request Jan 28, 2022
* add sweepable api example

* fix tests

* add grid search tuner

* use grid search
@ghost ghost locked as resolved and limited conversation to collaborators Mar 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants