Issues when trying to update ND4J dependency to 1.0.0-M2.1 #24

Open
wolfig opened this issue May 24, 2023 · 1 comment
wolfig commented May 24, 2023

Hi Enzosos,

I am trying to use your library in a project for tuning a particle ion source (similar to what is shown here: Ion Source Optimization Using Bi-Objective Genetic and Matrix-Profile Algorithm). In the paper I used a Python implementation of matrix profile, but now I want to move the logic to Java. One goal is to update the nd4j dependency to version 1.0.0-M2.1, which unfortunately has breaking changes (introduced with nd4j 1.0.0-beta4).

One issue I am facing when running the unit tests with the updated dependency is that all calls of the type

INDArray.get(INDArrayIndex... indexes)

need to be two-dimensional now. This is not a problem in general, but when I change the code in MatrixProfileCalculator::MPRunnable::run() from

    @Override
    public void run() {
        INDArray distanceProfile      = distProfile.getDistanceProfile(timeSeriesA, timeSeriesB, index, window);
        INDArray distanceProfileIndex = distProfile.getDistanceProfileIndex(tsBLength, index, window);

        if (trivialMatch) {
            INDArrayIndex[] indices = new INDArrayIndex[] { NDArrayIndex.interval(
                            Math.max(0, index - window / 2),
                            Math.min(index + window / 2 + 1, tsBLength)) };
            distanceProfile.put(indices, Double.POSITIVE_INFINITY);
        }

        updateProfile(distanceProfile, distanceProfileIndex);
    }

to the following (with NDArrayIndex.all() added to be compatible with nd4j 1.0.0-M2.1)

    @Override
    public void run() {
        INDArray distanceProfile      = distProfile.getDistanceProfile(timeSeriesA, timeSeriesB, index, window);
        INDArray distanceProfileIndex = distProfile.getDistanceProfileIndex(tsBLength, index, window);

        if (trivialMatch) {
            INDArrayIndex[] indices = new INDArrayIndex[] { NDArrayIndex.all(), NDArrayIndex.interval(
                            Math.max(0, index - window / 2),
                            Math.min(index + window / 2 + 1, tsBLength)) };
            distanceProfile.put(indices, Double.POSITIVE_INFINITY);
        }

        updateProfile(distanceProfile, distanceProfileIndex);
    }

I get errors like

  java.lang.IllegalStateException: Indices are out of range: Cannot get interval index Interval(b=0,e=5,s=1) on array with size(1)=4. Array shape: [1, 4], indices: [all(), Interval(b=0,e=5,s=1)]

for all "testMatrixProfileSelfJoin*" test cases of the matrix profile test. The error occurs because distanceProfile.put(...) (which calls INDArray.get()) fails: in the get(...) call, the interval index in indices extends beyond the end of the distanceProfile array. This in turn happens because the interval index is created using the length of tsB, which is larger than the length of distanceProfile.
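The mismatch can be reproduced with plain integer arithmetic, without nd4j. The sketch below assumes that the distance profile has one entry per subsequence (length n - window + 1); the values tsBLength = 7, window = 4 and index = 2 are hypothetical and were chosen only because they yield the same numbers as the error message above (array size 4, interval end 5). The class and method names are illustrative and not part of the library.

```java
public class IntervalBoundsSketch {

    // Number of subsequences of length `window` in a series of length `seriesLength`.
    // Assumption: the distance profile has exactly this many entries.
    static int profileLength(int seriesLength, int window) {
        return seriesLength - window + 1;
    }

    // Exclusive end of the trivial-match interval, clamped to `limit`,
    // mirroring Math.min(index + window / 2 + 1, limit) in run().
    static int intervalEnd(int index, int window, int limit) {
        return Math.min(index + window / 2 + 1, limit);
    }

    public static void main(String[] args) {
        int tsBLength = 7; // hypothetical series length
        int window = 4;    // hypothetical window size
        int index = 2;     // hypothetical subsequence index

        int profileLen = profileLength(tsBLength, window);
        int end = intervalEnd(index, window, tsBLength);

        System.out.println("distance profile length: " + profileLen);
        System.out.println("interval end (clamped to tsBLength): " + end);
        System.out.println("interval exceeds profile: " + (end > profileLen));
    }
}
```

With these assumed values the interval end is clamped only to the series length, so it can still exceed the shorter distance profile.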

One way to fix this is to change the code in MatrixProfileCalculator to the following (the interval end is now clamped to distanceProfile.length() instead of tsBLength):

    @Override
    public void run() {
        INDArray distanceProfile      = distProfile.getDistanceProfile(timeSeriesA, timeSeriesB, index, window);
        INDArray distanceProfileIndex = distProfile.getDistanceProfileIndex(tsBLength, index, window);

        if (trivialMatch) {
            INDArrayIndex[] indices = new INDArrayIndex[] { NDArrayIndex.all(), NDArrayIndex.interval(
                            Math.max(0, index - window / 2),
                            Math.min(index + window / 2 + 1, distanceProfile.length())) };
            distanceProfile.put(indices, Double.POSITIVE_INFINITY);
        }

        updateProfile(distanceProfile, distanceProfileIndex);
    }

I am not sure if this is the right approach, as it changes the logic of calculating the matrix profile. With this change, the tests no longer throw errors, but I get assertion failures in the tests Windows8, 2SawTeeth, 2Humps, ...; the tests Windows4, Windows5, StraightLine and Plateau turn green.

Could you have a look into this?

wolfig commented May 27, 2023

I ran some experiments. As a reference implementation, I use the Python stumpy library. Furthermore, I assume that your stmp code is equivalent to stumpy's "stump" method. As test data I used the "targetSeriesWithPattern":

[0.6, 0.5, 2.00, 1.0, -1.01, -0.5, 1.0, 2.3, 4.0, 5.9, 4.2, 3.1, 3.2, 3.4, 2.9, 3.5, 1.05, -1.0, -0.50, 1.01, 2.41, 3.99, 6.01, 4.7, 3.2, 2.6, 4.1, 4.3, 1.1, 1.7, 3.1, 1.9, -0.5, 2.1, 1.9, 2.01, -0.02, 0.48, 2.03, 3.31, 5.1, 7.1, 5.1, 3.2, 2.3, 1.8, 2.1, 1.7, 1.1, -0.1, 2.1, 2.01, 3.9, 3.1, 1.05, -1.0, -0.5, 1.01, 2.41, 3.99, 6.01, 4.7, 4.5, 3.9, 2.1, 3.3, 3.1, 2.7, 1.9]

and calculated the stumpy.stump profile with window size 8, to be equivalent to your "Windows8" test case:

mp = stumpy.stump(data, m=8)

stumpy.stump produces (this is my reference):

1.978220936260314 0.83707925066267 0.4546809779055928 0.08726639430199337 0.17008405364304846 0.4098865841250271 0.6994474556577408 1.4518582387451167 1.3319005154718184 0.9451213501212021 1.8153057348935029 1.9904369580815713 1.1875958951401007 1.081292397096952 0.5911694103586415 0.16532701476470923 0.0 0.17008405364304846 0.4098865841250271 0.6994474556577408 1.4518582387451167 1.2224427563106743 1.459718781282131 1.9023070773167448 2.288602610987441 1.2224427563106743 1.6558969321997734 1.9023070773167448 2.1472026281338645 1.9278741619147393 1.9850484024035508 2.472148082913442 1.978220936260314 0.83707925066267 0.48804702922152227 0.08726639430199337 0.39649268281634986 0.7430980370551044 1.1167212838495193 1.354934919189077 1.3319005154718184 1.1392798574009926 0.9451213501212021 1.697424487450637 1.9327652893783562 1.33003115179361 1.5756094407644912 2.081034560837073 1.8508390280468399 1.1167212838495193 1.1875958951401007 1.081292397096952 0.5911694103586415 0.16532701476470923 0.0 0.5018020789530718 0.6484523316327658 1.3794821611001897 1.7919132200300565 1.4694559779059273 1.1392798574009926 1.3864057061299813

You provide "expected" values in your test case "Windows8". The difference array mp_values - test_case_expected looks like this (I set all values below 0.001 to zero):

0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0022, 0.0, 0.0, 0.0, 0.0, -0.8473572436893255, 0.0, -0.03699292268325527, 0.0, -0.5861572436893256, -0.04150306780022661, -0.13219292268325522, 0.0, -0.016525838085260647, 0.0, -0.08145191708655775, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0022, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
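For reproducibility, a difference array of this kind could be computed along the following lines. This is a minimal sketch: the class and method names are hypothetical, and it assumes that "below 0.001" refers to the absolute value of the difference (which is consistent with the -0.0022 entries surviving the cut).

```java
import java.util.Arrays;

public class ProfileDiffSketch {

    // Element-wise reference[i] - candidate[i]; differences whose absolute
    // value is below `tolerance` are reported as exactly 0.0.
    static double[] thresholdedDiff(double[] reference, double[] candidate, double tolerance) {
        if (reference.length != candidate.length) {
            throw new IllegalArgumentException(
                    "length mismatch: " + reference.length + " vs " + candidate.length);
        }
        double[] diff = new double[reference.length];
        for (int i = 0; i < reference.length; i++) {
            double d = reference[i] - candidate[i];
            diff[i] = Math.abs(d) < tolerance ? 0.0 : d;
        }
        return diff;
    }

    public static void main(String[] args) {
        // Made-up sample values, not taken from the test case.
        double[] reference = {1.0, 2.0, 3.0};
        double[] candidate = {1.0004, 2.5, 3.0};
        System.out.println(Arrays.toString(thresholdedDiff(reference, candidate, 0.001)));
    }
}
```

The length check is deliberate: if the two profiles differ in length, the comparison is refused rather than silently aligning values at the wrong positions.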

The corresponding difference array mp_values - values_of_my_correction looks like this:

0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.8473572436893255, 0.0, -0.03699292268325527, 0.0, -0.5861572436893256, -0.04150306780022661, -0.13219292268325522, 0.0, -0.016525838085260647, 0.0, -0.08145191708655775, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0

The question is what this means.
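One way to cross-check both sets of values independently of nd4j and stumpy would be a brute-force self-join in plain Java. The sketch below is not the library's algorithm and not stumpy's; it assumes z-normalized Euclidean distance between subsequences and treats the exclusion zone as an explicit parameter, since implementations may choose different widths (the run() method above uses window / 2). The treatment of constant subsequences (standard deviation 0) is also an assumption, and all names are hypothetical.

```java
import java.util.Arrays;

public class BruteForceProfileSketch {

    // Z-normalizes the subsequence ts[start .. start + m) using the population
    // standard deviation. Assumption: a constant subsequence maps to all zeros.
    static double[] zNormalize(double[] ts, int start, int m) {
        double mean = 0.0;
        for (int k = 0; k < m; k++) {
            mean += ts[start + k];
        }
        mean /= m;
        double var = 0.0;
        for (int k = 0; k < m; k++) {
            double c = ts[start + k] - mean;
            var += c * c;
        }
        double std = Math.sqrt(var / m);
        double[] out = new double[m];
        for (int k = 0; k < m; k++) {
            out[k] = std == 0.0 ? 0.0 : (ts[start + k] - mean) / std;
        }
        return out;
    }

    // For every subsequence i, the smallest z-normalized Euclidean distance to
    // any subsequence j with |i - j| > exclusionZone. O(n^2 * m), reference only.
    static double[] selfJoinProfile(double[] ts, int m, int exclusionZone) {
        int n = ts.length - m + 1;
        double[][] norm = new double[n][];
        for (int i = 0; i < n; i++) {
            norm[i] = zNormalize(ts, i, m);
        }
        double[] profile = new double[n];
        Arrays.fill(profile, Double.POSITIVE_INFINITY);
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (Math.abs(i - j) <= exclusionZone) {
                    continue; // skip trivial matches around i
                }
                double sum = 0.0;
                for (int k = 0; k < m; k++) {
                    double d = norm[i][k] - norm[j][k];
                    sum += d * d;
                }
                profile[i] = Math.min(profile[i], Math.sqrt(sum));
            }
        }
        return profile;
    }

    public static void main(String[] args) {
        // Made-up series whose first four values repeat at the end.
        double[] ts = {0, 1, 3, 2, 5, 9, 7, 0, 1, 3, 2};
        System.out.println(Arrays.toString(selfJoinProfile(ts, 4, 2)));
    }
}
```

Running selfJoinProfile on the "targetSeriesWithPattern" data with m = 8 and different exclusion-zone widths would show whether the remaining differences depend on that parameter.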
