Skip to content

Commit

Permalink
initial
Browse files Browse the repository at this point in the history
  • Loading branch information
tonsky committed Feb 19, 2019
0 parents commit 0683019
Show file tree
Hide file tree
Showing 23 changed files with 3,422 additions and 0 deletions.
6 changes: 6 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
.nrepl-port
.lein-*
target
.DS_Store
.idea
pom.xml
21 changes: 21 additions & 0 deletions LICENSE
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2019 Nikita Prokopov

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
131 changes: 131 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
A B-tree based persistent sorted set for Clojure/Script.

PersistentSortedSet supports:

- transients,
- custom comparators,
- fast iteration,
- efficient slices (iterator over a part of the set)
- efficient `rseq` on slices.

Almost a drop-in replacement for `clojure.core/sorted-set`, the only difference being this one can’t store `nil`.

Implementations are provided for Clojure and ClojureScript.

## Support us

<a href="https://www.patreon.com/bePatron?u=4230547"><img src="./extras/become_a_patron_button@2x.png" alt="Become a Patron!" width="217" height="51"></a>

## Usage

Dependency:

```clj
[persistent-sorted-set "0.1.0"]
```

Code:

```clj
(require '[me.tonsky.persistent-sorted-set :as set])

(set/sorted-set 3 2 1)
;=> #{1 2 3}

(-> (set/sorted-set 1 2 3 4)
(conj 2.5))
;=> #{1 2 2.5 3 4}

(-> (set/sorted-set 1 2 3 4)
(disj 3))
;=> #{1 2 4}

(-> (set/sorted-set 1 2 3 4)
(contains? 3))
;=> true

(-> (apply set/sorted-set (range 10000))
(set/slice 5000 5010))
;=> (5000 5001 5002 5003 5004 5005 5006 5007 5008 5009 5010)

(-> (apply set/sorted-set (range 10000))
(set/rslice 5010 5000))
;=> (5010 5009 5008 5007 5006 5005 5004 5003 5002 5001 5000)

(set/sorted-set-by > 1 2 3)
;=> #{3 2 1}
```

## Performance

To reproduce:

1. Install `[com.datomic/datomic-free "0.9.5703"]` locally.
2. Run `lein bench`.

`PersistentTreeSet` is Clojure’s Red-black tree based sorted-set.
`BTSet` is Datomic’s B-tree based sorted set (no transients, no disjoins).
`PersistentSortedSet` is this implementation.

Numbers I get on my 3.2 GHz i7-8700B:

### Conj 100k randomly sorted Integers

```
PersistentTreeSet 143..165ms
BTSet 125..141ms
PersistentSortedSet 105..121ms
PersistentSortedSet (transient) 50..54ms
```

### Call contains? 100k times with random Integer on a 100k Integers set

```
PersistentTreeSet 51..54ms
BTSet 45..47ms
PersistentSortedSet 46..47ms
```

### Iterate with java.util.Iterator over a set of 1M Integers

```
PersistentTreeSet 70..77ms
PersistentSortedSet 10..11ms
```

### Iterate with ISeq.first/ISeq.next over a set of 1M Integers

```
PersistentTreeSet 116..124ms
BTSet 92..105ms
PersistentSortedSet 56..68ms
```

### Iterate over a part of a set from 1 to 999999 in a set of 1M Integers

For `PersistentTreeSet` we use ISeq produced by `(take-while #(<= % 999999) (.seqFrom set 1 true))`.

For `PersistentSortedSet` we use `(.slice set 1 999999)`.

```
PersistentTreeSet 238..256ms
PersistentSortedSet 70..91ms
```

### Disj 100k elements in randomized order from a set of 100k Integers

```
PersistentTreeSet 151..155ms
PersistentSortedSet 91..98ms
PersistentSortedSet (transient) 47..50ms
```

## Projects using PersistentSortedSet

- [Datascript](https://github.com/tonsky/datascript), persistent in-memory database

## License

Copyright © 2019 Nikita Prokopov

Licensed under MIT (see [LICENSE](LICENSE)).
42 changes: 42 additions & 0 deletions bench-java/clojure/lang/ClojureSet.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
package clojure.lang;

import java.util.*;
import clojure.java.api.Clojure;

@SuppressWarnings("unchecked")
public class ClojureSet extends PersistentTreeSet implements me.tonsky.persistent_sorted_set.IPersistentSortedSet {
static public final ClojureSet EMPTY = new ClojureSet(null, PersistentTreeMap.EMPTY);

static IFn takeWhile;
static {
takeWhile = (IFn) Clojure.var("clojure.core", "take-while");
}

ClojureSet(IPersistentMap meta, IPersistentMap impl) {
super(meta, impl);
}

public IPersistentSet cons(Object o) {
if(contains(o))
return this;
return new ClojureSet(meta(), impl.assoc(o,o));
}

public ISeq slice(Object from, Object to, Comparator cmp) {
IFn pred = new AFn() {
public Object invoke(Object arg) {
return cmp.compare(arg, to) <= 0;
}
};
return (ISeq) takeWhile.invoke(pred, seqFrom(from, true));
}

public ISeq rslice(Object from, Object to, Comparator cmp) {
IFn pred = new AFn() {
public Object invoke(Object arg) {
return cmp.compare(arg, to) >= 0;
}
};
return (ISeq) takeWhile.invoke(pred, seqFrom(from, false));
}
}
Loading

0 comments on commit 0683019

Please sign in to comment.