Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SeqMap.keys is key iteration order only #10766

Open
wants to merge 1 commit into
base: 2.13.x
Choose a base branch
from

Conversation

som-snytt
Copy link
Contributor

@som-snytt som-snytt commented Apr 29, 2024

Fixes scala/bug#12991

Ensure that keys of a SeqMap is a Seq in expected order.

That satisfies the unwritten expectation that keys is ordered and mutually comparable.

@scala-jenkins scala-jenkins added this to the 2.13.15 milestone Apr 29, 2024
@som-snytt
Copy link
Contributor Author

@lrytz deprecated overriding of SeqMap#keys may have been hasty. IF it is useful (a big IF), it is usefully specialized. WDYT?

#10544

@SethTisue SethTisue requested a review from a team April 29, 2024 23:57
@SethTisue SethTisue added the library:collections PRs involving changes to the standard collection library label Apr 29, 2024
@som-snytt som-snytt marked this pull request as ready for review April 30, 2024 05:46
@lrytz
Copy link
Member

lrytz commented Apr 30, 2024

I think either keys is an alais of keySet, or we guarantee additional properties. I assume these would be "strict copy" and "same iteration order for ordered / sorted maps"? Maybe we can add overrides that reutrn Seq after SIP-51, but even without that the properties can be expressed in scaladoc and tests, and it's clear whether something is a bug or not.

Doing something in between can probably avoid some surprises but not really prevent bugs.

@som-snytt
Copy link
Contributor Author

Not sure if lrytz agreed to undeprecate, so I pushed it as a separate commit, unsquashed. Sorry, Jenkins.

To be clear, this PR was motivated by Ichoran's endorsement on the ticket, that even if keys is not (yet) Seq, go ahead and change it to return one anyway, as nobody could possibly be relying on the current behavior.

with nobody being the wiser for it

I can report that I do not feel wiser.

@lrytz
Copy link
Member

lrytz commented May 8, 2024

I see the point that returning a Seq doesn't hurt and can prevent surprises.

But if we don't actually guarantee anything then the value is very limited, surprises when using keys are just delayed.

I think either keys should be a final / deprecatedOverriding alias, or it should have documented properties that clients can rely on, anything in between is useful only by accident. But changing keys to always return a strict Seq copy is a (performance) breaking change, so it's not clear whether we can do that.

@som-snytt
Copy link
Contributor Author

som-snytt commented May 8, 2024

This is only for SeqMaps that don't yet override. It could be restricted further to SeqMap1..4, to be more conservative.

Then we can sort whether it's useful API etc under SIP-51.

Otherwise, I'll have to at-notify Ichoran.

@som-snytt som-snytt changed the title SeqMap.keys is Seq SeqMap.keys is Seq [ci: last-only] May 8, 2024
@som-snytt som-snytt force-pushed the issue/12991-seqmap-keys branch from 67027de to bff3d35 Compare May 8, 2024 20:19
@som-snytt
Copy link
Contributor Author

it's not clear what the API intends:

      assertEquals(/*"Expect the same keys to appear in the join taken either way around.",*/
        ourChanges.mergeByKey(theirChangesRedux).keySet,
        theirChangesRedux.mergeByKey(ourChanges).keys,
      )

@SethTisue
Copy link
Member

SethTisue commented Jul 18, 2024

@som-snytt where does this stand, in your opinion?

also interested in @scala/collections opinions

@som-snytt
Copy link
Contributor Author

Doing something in between can probably avoid some surprises but not really prevent bugs.

This standard is too high for me.

@NthPortal
Copy link
Contributor

I'm pretty sure I've mentioned this before somewhere, but I'd personally prefer that keys be of the as-yet-nonexistent type SeqSet.

I would also like keys in general to be an alias of keySet (maybe eventually the primary and later only method!)

@som-snytt
Copy link
Contributor Author

I don't remember why I reopened, but probably because Seth said

if anyone else cares enough

which is nerd-sniping as moral imperative.

I'll refresh my memory and the branch.

@som-snytt
Copy link
Contributor Author

I see the trouble began with this spurious override in VectorMap:

override def keys: Vector[K] = keysIterator.toVector

The override would make sense on grounds of efficiency, but per the docs, anyone who wants this behavior should write it out.

Lukas's comment is that the doc for keys must specify its guarantee.

If keys are ordered,

c.keys.iterator.sameElements(c.keySet.iterator)

Addtionally, keySet is ordered if the collection is sorted.

It is a non-goal of this PR to guarantee that keys returns a Set. That is what keySet is for. If you want a set that preserves iteration order, then the doc could show how to build that, too.

As an example, ListMap yields a list of keys. If you want an unordered set, use keys.toSet, or keySet for ordered keys.

The only reason to prefer keys.iterator to keySet.iterator is when it's more efficient for unsorted keys. (Speculation.)

I see the ticket is about checking equality. That is an especially weak expectation. Rex suggests the solution in this PR, just make keys return a Seq. But if the purpose of keys is "give me cheap iteration order", then producing a Seq is not desirable. That is Lukas's point about a performance regression, which would be severe for large maps. (The ticket is about tiny maps.)

When I touch the code, perhaps my perspective will change, and I'll note that here.

@som-snytt som-snytt force-pushed the issue/12991-seqmap-keys branch from bff3d35 to f580d8d Compare November 27, 2024 18:16
@som-snytt
Copy link
Contributor Author

The Ichoranism on scala/bug#12808

promise less, but make it work

@som-snytt
Copy link
Contributor Author

som-snytt commented Nov 27, 2024

That is just java.io.IOException: No space left on device

This PR will be squashable: it reverts the deprecated overriding, adds clarifying doc, and also slips in the Ichoranist "make it work" because why not. That calls for a mima exception for the private object, which I did not drill into during this expedition.

@som-snytt som-snytt changed the title SeqMap.keys is Seq [ci: last-only] SeqMap.keys is key iteration order only [ci: last-only] Nov 29, 2024
@som-snytt som-snytt force-pushed the issue/12991-seqmap-keys branch from f580d8d to 87815f5 Compare November 29, 2024 18:04
@som-snytt som-snytt force-pushed the issue/12991-seqmap-keys branch from 87815f5 to 2b71cd0 Compare December 3, 2024 13:29
@som-snytt
Copy link
Contributor Author

Note to self: I ran sbt:scala2> library/mimaReportBinaryIssues locally but was somehow careless and thought it said I was OK without an exclusion (because I had changed the signature?). Now I see the correct report, so I don't know what my mistake was.

@som-snytt som-snytt changed the title SeqMap.keys is key iteration order only [ci: last-only] SeqMap.keys is key iteration order only Dec 3, 2024
@som-snytt som-snytt requested a review from lrytz December 3, 2024 22:22
Copy link
Member

@lrytz lrytz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're keeping all 3 methods around I'd like the docs to say when to use which - I don't know at this point.

Is there a big risk that existing custom maps would violate the new iteration order guarantee?

@@ -130,6 +132,7 @@ object SeqMap extends MapFactory[SeqMap] {
f(key1, value1)
f(key2, value2)
}
override def keys: Iterable[K] = Vector(key1, key2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iiuc these overrides wouldn't be needed to support the new iteration order guarantee?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is only for the unreasonable expectation that a.keys == b.keys for a couple of SeqMap.

Undeprecate overriding Map keys
@som-snytt som-snytt force-pushed the issue/12991-seqmap-keys branch from 2b71cd0 to 6511259 Compare December 22, 2024 21:58
@som-snytt
Copy link
Contributor Author

I gave it some thought again, because what else are the winter holidays for, and I wavered a moment about whether to keep the deprecation of keys. Also, I noticed that in Java, Dictionary has keys (an Enumeration) but Map has keySet.

There is no justification for keys in the standard library, assuming ListMap is always small. The important optimization via retronym is to override keysIterator and avoid Map#iterator which produces (k,v).

The clever pattern is that keySet delegates to keysIterator, which could be protected, if you guarantee keySet.iterator preserves order. Maybe the thinking was that unsorted set does not let you make that guarantee?

It occurs to me that SeqMap could be a SortedMap with an Ordering that reports arrival order, which would solve this API question. That would be for a future version.

assertSameElements(keys.iterator, sm.keys.iterator)
assertSameElements(vm.keys, sm.keys)
assertEquals(vm.keys, sm.keys)
assertEquals(lm.keys, sm.keys)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final edit added the unreasonable expectation back in. Previously, keys changed from Set to Seq after four elements. "But that's a change of only one letter!"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
library:collections PRs involving changes to the standard collection library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SeqMap.keys != VectorMap.keys (with identical keys) for sizes 0 to 4
7 participants