Option to use Nats server as PUB/SUB broker #273

Merged
merged 12 commits into from
May 28, 2020

Conversation

FZambia
Member

@FZambia FZambia commented Mar 20, 2019

This is a proof of concept of using Nats server instead of Redis for the PUB/SUB part of Centrifugo. I am still considering whether it is worth adding; any feedback is welcome.

@Raerten

Raerten commented Mar 20, 2019

What are the pros and cons of using NATS?
Is NATS a dedicated service? Redis is widely used and likely already present in many projects, so adding NATS can complicate infrastructure.

@FZambia
Member Author

FZambia commented Mar 20, 2019

@Raerten I forgot to mention that using Nats as a broker will be optional, of course. The Redis Engine will keep working the same way as now, without changes.

I think it's useful because Nats is simple to configure in cluster mode and it's very performant. For applications that only need fast PUB/SUB this can be a nice option.

@FZambia FZambia changed the title from "Nats server as PUB/SUB broker" to "Option to use Nats server as PUB/SUB broker" Mar 20, 2019
@maurodelazeri

I'm just using pub and sub ... wondering about the performance of Redis vs Nats.
I could be wrong, but I believe Redis will still be faster, even if you're using Redis Sentinel.

@FZambia
Member Author

FZambia commented Mar 20, 2019

@maurodelazeri hi, Nats is insanely fast in terms of publish speed:

BenchmarkRedisPublishParallel-4         300000          4193 ns/op
BenchmarkNatsPublishParallel-4          2000000         782 ns/op

And a bit slower in terms of subscription performance:

BenchmarkRedisSubscribeParallel-4        500000         3004 ns/op
BenchmarkNatsSubscribeParallel-4         100000         11200 ns/op

I was comparing a single instance of Redis with a single instance of Nats. This is a bit unfair, as Nats utilizes many CPU cores on the machine while Redis runs on a single CPU core. With a sharded Redis, one instance on each of my 4 cores, I could get better results with Redis, of course.
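To put the ns/op figures above into throughput terms, here is a small sketch that converts them to operations per second. The numbers are copied from the benchmark output in this comment; the helper name `opsPerSecond` is only illustrative. By this arithmetic the Nats publish path is roughly 5.4 times faster than Redis in this run, and the Nats subscribe path roughly 3.7 times slower.

```go
package main

import "fmt"

// opsPerSecond converts a Go benchmark ns/op figure into operations per second.
func opsPerSecond(nsPerOp float64) float64 {
	return 1e9 / nsPerOp
}

func main() {
	// ns/op values copied from the benchmark output quoted in this thread.
	results := []struct {
		name    string
		nsPerOp float64
	}{
		{"RedisPublishParallel", 4193},
		{"NatsPublishParallel", 782},
		{"RedisSubscribeParallel", 3004},
		{"NatsSubscribeParallel", 11200},
	}
	for _, r := range results {
		fmt.Printf("%-24s %10.0f ops/s\n", r.name, opsPerSecond(r.nsPerOp))
	}
}
```

These are single-machine microbenchmark numbers, so the ratios say little about a sharded Redis setup or a clustered Nats deployment.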

Though I suppose performance is not what we should look at here, actually. There are other benefits, like easy clustering and a cluster-aware client. Another consideration is that adding more options for scalability can help with better adoption of Centrifugo in general. It looks like Centrifugo users mostly use only the unreliable PUB/SUB features. Nats is pretty popular in the Go community; maybe I am a bit biased because of this. So I want to collect feedback on this before moving further.

@maurodelazeri

@FZambia that sounds amazing, I def will spend some time testing this...
I'm using a fork of redis https://github.com/JohnSully/KeyDB that supports multi-threading

@FZambia
Member Author

FZambia commented Mar 20, 2019

@maurodelazeri hm interesting - never heard of that Redis fork before! Are you using it as Centrifugo engine or just as part of your other applications?

@maurodelazeri

maurodelazeri commented Mar 20, 2019

@FZambia I'm using it as the engine... there is an extensive discussion here https://news.ycombinator.com/item?id=19368955

I have implemented some low-latency systems with nng, which is definitely going to be much faster than Nats. Not sure if you know this project... it is an evolution of http://zeromq.org/. An integration would be awesome.

https://nanomsg.github.io/nng/
https://github.com/nanomsg/mangos

@FZambia
Member Author

FZambia commented Mar 20, 2019

Thanks for the link! Yeah, actually in its early days Centrifugo (when it was written in Python) used ZeroMQ for the PUB/SUB layer, but then it evolved toward Redis (I don't remember the exact reasons, but it definitely improved system observability and understanding and opened a road to the recovery and presence features). Now, with recent changes in the Centrifuge library, it's pretty simple to experiment with any PUB/SUB technology. I've heard about Mangos, but one thing that bothers me is that in Mangos PUB/SUB messages are filtered on the receiver side. That means that if we have several nodes, published messages will be sent to each of them even if there are no subscribers. Maybe this is not a big problem for many use cases though. If I have time I can try to prototype this, though not in the near future I suppose.
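The bandwidth concern about receiver-side filtering can be illustrated with a toy simulation. This is a minimal sketch that uses neither Mangos nor Nats: the `node` type and both delivery functions are hypothetical, and it only counts how many messages cross the wire under each strategy.

```go
package main

import "fmt"

// node models one server node and the channels its clients are subscribed to.
type node struct {
	name     string
	channels map[string]bool
}

// deliverReceiverSide sends every published message to every node; each node
// then drops messages for channels it has no subscribers on. It returns how
// many messages were sent and how many were actually useful to a node.
func deliverReceiverSide(nodes []node, published []string) (sent, used int) {
	for _, ch := range published {
		for _, n := range nodes {
			sent++
			if n.channels[ch] {
				used++
			}
		}
	}
	return sent, used
}

// deliverBrokerSide sends a message only to nodes subscribed to its channel.
func deliverBrokerSide(nodes []node, published []string) (sent, used int) {
	for _, ch := range published {
		for _, n := range nodes {
			if n.channels[ch] {
				sent++
				used++
			}
		}
	}
	return sent, used
}

func main() {
	nodes := []node{
		{"node-1", map[string]bool{"news": true}},
		{"node-2", map[string]bool{"chat": true}},
		{"node-3", map[string]bool{}},
	}
	published := []string{"news", "news", "chat", "metrics"}

	s, u := deliverReceiverSide(nodes, published)
	fmt.Printf("receiver-side filtering: sent=%d used=%d\n", s, u)
	s, u = deliverBrokerSide(nodes, published)
	fmt.Printf("broker-side filtering:   sent=%d used=%d\n", s, u)
}
```

With three nodes and four published messages, receiver-side filtering sends 12 messages of which 3 are used, while broker-side filtering sends only those 3.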

@maurodelazeri

@FZambia you are right, the project still does not support server-side filtering (nanomsg/nng#587), but like he said, that's not a "problem" apart from the bandwidth...
Btw, the protobuf support you added also improved things a lot.

@ghost

ghost commented Apr 6, 2019

I use NATS and would be interested in seeing the proof of concept.

@ghost

ghost commented Apr 6, 2019

Btw, is it NATS or NATS Streaming?

If it's NATS Streaming then there is a gRPC integration.
Google for "Proximo grpc NATS".
It allows you to do all the NATS things using a gRPC API.

So every service maps to a NATS topic. Why is this cool?
Because you get easy RPC for many languages (gRPC) with NATS durability.
With Centrifugo underneath it's a pretty reusable platform.

@ghost

ghost commented Apr 6, 2019

I just checked the code.

Wondering if this impacts the client code written in other languages? For example, I use Dart to talk to Centrifugo from Flutter in one of my prototypes.

Also, where is the durability? It's using only NATS and not NATS Streaming. I am confused about where messages are stored when you have a cluster.

@FZambia
Member Author

FZambia commented Apr 7, 2019

@gedw99 this is Nats, not Nats Streaming. Centrifugo semantics do not fit Nats Streaming well. Of course my first attempt was to add a Nats Streaming backend, but it seems that Nats Streaming and Centrifugo were invented for slightly different purposes, so I did not find a good way to combine them without many changes in how Centrifugo works.

This PR introduces Nats for scalable but unreliable messaging. This can be OK for apps which do not need any message delivery guarantees, i.e. something like real-time stock info, real-time charts, games.

Though if we combine this with Redis Engine possibilities we get the same message delivery guarantees as we have with plain Redis Engine (history is kept in Redis while PUB/SUB works over Nats). This is possible due to recent changes in Centrifuge library (btw you also took part in that issue and pointed me to Event Sourcing there - centrifugal/centrifuge#12 - it's now even more like event sourcing). I quickly sketched a small note on Medium for Centrifuge library users about pluggable components.
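The split described above (history kept in one component while PUB/SUB runs over another) can be sketched with two small interfaces. This is an illustration only: `Broker`, `HistoryStore`, `engine` and the in-memory implementations are hypothetical names, not the Centrifuge library's actual API, and the in-memory types merely stand in for Nats and Redis.

```go
package main

import "fmt"

// Broker delivers messages to current subscribers only (hypothetical interface).
type Broker interface {
	Publish(channel, data string)
	Subscribe(channel string, handler func(data string))
}

// HistoryStore keeps published messages per channel (hypothetical interface).
type HistoryStore interface {
	Add(channel, data string)
	Get(channel string) []string
}

// memBroker stands in for a fire-and-forget broker: it stores nothing.
type memBroker struct {
	subs map[string][]func(string)
}

func newMemBroker() *memBroker {
	return &memBroker{subs: map[string][]func(string){}}
}

func (b *memBroker) Publish(channel, data string) {
	for _, handler := range b.subs[channel] {
		handler(data)
	}
}

func (b *memBroker) Subscribe(channel string, handler func(data string)) {
	b.subs[channel] = append(b.subs[channel], handler)
}

// memHistory stands in for a history store backed by something durable.
type memHistory struct {
	items map[string][]string
}

func newMemHistory() *memHistory {
	return &memHistory{items: map[string][]string{}}
}

func (h *memHistory) Add(channel, data string) {
	h.items[channel] = append(h.items[channel], data)
}

func (h *memHistory) Get(channel string) []string {
	return h.items[channel]
}

// engine combines the two components: save to history, then fan out.
type engine struct {
	broker  Broker
	history HistoryStore
}

func (e *engine) publish(channel, data string) {
	e.history.Add(channel, data)
	e.broker.Publish(channel, data)
}

func main() {
	e := &engine{broker: newMemBroker(), history: newMemHistory()}
	e.publish("news", "first") // nobody is subscribed yet

	var live []string
	e.broker.Subscribe("news", func(data string) { live = append(live, data) })
	e.publish("news", "second")

	fmt.Println("received live:", live)
	fmt.Println("history:", e.history.Get("news"))
}
```

The subscriber only receives "second" live, because the broker keeps nothing, while the history store holds both messages and could serve the missed one.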

At the moment the Nats team is working on Nats v2. I'd like to see what they end up with before moving forward with Nats in Centrifugo. But you can already check out this branch and test it if you want; maybe you will have feedback on this.

@FZambia
Member Author

FZambia commented Apr 7, 2019

> Wondering if this impacts the client code written in other languages ? For example I use dart in order to talk to centrifugal from Flutter in one of my prototypes.

All clients of Centrifugo and the Centrifuge library will work with this the same way they currently work with the Memory or Redis engines, without any changes. It's a bit costly for us to change the client API, as one change must be reflected in many client libraries: Javascript, Java, Swift, Dart, Go. But in this case no changes are needed; we just replace one of the server components of Centrifugo.

@ghost

ghost commented Apr 7, 2019

Got it. Thanks Alex.

Very interesting.
So Redis is the backing store for NATS. Nice and simple compared to NATS Streaming.
Reminds me a lot of Liftbridge. Maybe you know it? Might be worth taking a look, as it has the same goals as this project and uses NATS. Not saying you should change things, but it's always good to see others on the same path.

Yeah I saw the JetStream proposal for NATS V2.
It's very smart how you have managed to decouple but still employ NATS - many others have tried but failed including me :)

https://github.com/liftbridge-io/liftbridge
The guy was on the NATS team and is building a gRPC-based NATS that is simpler, like Kafka.
It's also inspired by JetStream (NATS v2).

My only risk is Redis. Single CPU. It's 2019, and a 48-core server is normal and a 24-core ARM server is normal (Packet and Scaleway data centers, for example). It's a huge SPOP (single point of performance) bottleneck. There is a Redis clone written in Go that is multicore.

  1. All cores
  2. Easy deployment to chip ISA.
  3. Easier to debug that's for sure.

Maybe you saw it ?

https://github.com/tidwall/redcon

That Dev is reliable and running production stuff on top of redcon. It's API compatible.
I have not looked at the linearizability guarantees yet.

@FZambia
Member Author

FZambia commented Apr 7, 2019

> My only risk is redis. Single CPU. It's 2019 and a 48 core server is normal and a 24 core ARM server is normal ( packet & scaleway data centers for example ). It's a huge SPOP ( single point of performance ) bottleneck. There is a redis clone written in golang that is multicore.

I don't agree with this concern, as I know how fast Redis can be: we serve 450k connections with a single Redis core. As Centrifugo has client-side sharding, there is no bottleneck. It's very simple to utilize all cores by running multiple Redis instances; at work we have 10 shards and a lot of room to grow.
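As a rough illustration of client-side sharding, the sketch below maps a channel name to one of several Redis instances by hashing it. This is an assumption-laden example: it uses plain FNV hashing with modulo, the shard addresses are made up, and Centrifugo's real sharding function may well differ.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor picks a shard index for a channel by hashing its name.
// Assumption: simple modulo hashing, chosen here only for illustration.
func shardFor(channel string, numShards int) int {
	h := fnv.New32a()
	h.Write([]byte(channel)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(numShards))
}

func main() {
	// Hypothetical shard addresses, one single-threaded Redis process per core.
	shards := []string{"redis-0:6379", "redis-1:6379", "redis-2:6379", "redis-3:6379"}

	for _, ch := range []string{"news", "chat:42", "stocks:AAPL"} {
		idx := shardFor(ch, len(shards))
		fmt.Printf("channel %q -> shard %d (%s)\n", ch, idx, shards[idx])
	}
}
```

Because every node computes the same shard for a given channel, publishers and subscribers of that channel meet on the same Redis instance without any coordination between instances. Note that plain modulo hashing remaps most channels when the number of shards changes.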

Yeah, I've heard about Redcon, though I suppose it's not feature-rich and, as far as I remember, it uses epoll in a single thread. I recently had a report from a Centrifugo user about successful use of KeyDB (see recent discussion on HN here); it's multithreaded and full-featured.

@ghost

ghost commented Apr 15, 2019

Thanks for the feedback.
I now understand where this change fits.

I support gnats / NATS with redis backing.

Also, I like the idea of being able to use all the NATS functionality on top. Security rules, for example, are extensive for NATS.
It will be interesting to see where those sorts of things fit in.

I have not had time yet to look at this code ! Hoping to have a play next week.

@FZambia
Member Author

FZambia commented Jun 12, 2019

Still considering adding this feature. If someone wants it to be merged, please write your thoughts on how this can be useful for you. I am a bit skeptical here because adding Nats as a custom broker brings a new dependency we need to support and update, and I don't really want to add it without a real use case.

@joeblew99

Hey @FZambia

We spoke on Telegram.
I want to try using --broker=nats

My use case is to run on Raspberry Pi servers on premise, and Scaleway in the cloud. Both are ARM.

Does the NATS broker mean that I don't need Redis, though? I doubt I can get it running on ARM.

@FZambia
Member Author

FZambia commented Jul 22, 2019

@joeblew99 https://redis.io/topics/ARM - looks like Redis works just fine on ARM btw

@joeblew99

@FZambia ARM !! Wow you made my day. thanks for checking this out !

So we can use Centrifugo today without NATS on Rasp PI then. It will make the project much easier.

I will still try using NATS to see how it feels. Will Redis still be needed if we use NATS ? I am not sure.

@FZambia
Member Author

FZambia commented Jul 23, 2019

> I will still try using NATS to see how it feels

Yeah, waiting for feedback

> Will Redis still be needed if we use NATS ?

For presence and history/recovery features, yes. For unreliable PUB/SUB only, no.

@BatuhanK

BatuhanK commented Feb 25, 2020

Hi @FZambia, are there any updates on this? As you mentioned, the Nats option will be nice when history recovery is not necessary.
Currently we are using Centrifugo this way, and the performance boost from Nats would be really nice. The ease of Nats clustering would also be really valuable for us (even if you don't cluster it, due to its multithreaded nature a single container with a multi-core CPU would simplify many deployments).

By the way, thanks for this amazing project 🙏

@FZambia
Member Author

FZambia commented Feb 26, 2020

@BatuhanK I slowed down on this a bit due to a poor understanding of how this will help Centrifugo users. Nats is quite popular in the Go community, but I am not sure it's widespread enough outside Go to be generally useful. Do you personally already have Nats in your infrastructure? What kind of setup are you using? I need this information to understand how a real Nats production system is configured now, and which options Centrifugo should support from the start to fit user needs.

I am not rushing with this, as adding one more dependency on an external system will cause more maintenance overhead for me, so the decision must be weighed.

@ReDev1L

ReDev1L commented Mar 27, 2020

@FZambia Please consider interop with NATS Streaming (STAN) if possible. It would allow switching out Redis completely for history/recovery. But as I understand it, the Centrifuge API doesn't allow that?
I'm considering using https://github.com/resgateio/resgate or Centrifugo for realtime frontend updates. We are using NATS and STAN in our backend and we love it. It's simple to use, has the lowest TCO to support, and is very fast. I don't even look at Redis/Kafka/RabbitMQ/whatever for a messaging/event system.

@FZambia
Member Author

FZambia commented Mar 28, 2020

@ReDev1L hi, I'd really like to interop, and I danced around this several times, but it's a bit different model from what Centrifugo does. Centrifugo multiplexes multiple client subscriptions to the same channel into one subscription to the broker. With STAN/Liftbridge/Kafka/Pulsar, every streaming subscription (i.e. one with message recovery) must be a separate object for each client subscription. It will work, but it does not fit the current internal Centrifugo interfaces and logic. I will continue to think about whether it's possible to fit that model, maybe with more radical changes inside Centrifugo. BTW, the current Redis approach seems to fit better for most Centrifugo use cases (it scales much better, especially when many subscribers use the same channel with recovery on). But extending the current model to fit streaming brokers is really interesting to me anyway.
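The multiplexing idea can be sketched as a small hub that opens a broker subscription only for the first client on a channel and closes it when the last client leaves. This is a simplified illustration with hypothetical names (`hub`, `subscribe`, `unsubscribe`), not Centrifugo's internal code; the broker itself is represented only by a counter.

```go
package main

import "fmt"

// hub multiplexes many client subscriptions onto at most one broker
// subscription per channel.
type hub struct {
	clients    map[string]map[string]bool // channel -> set of client IDs
	brokerSubs int                        // broker subscriptions currently open
}

func newHub() *hub {
	return &hub{clients: map[string]map[string]bool{}}
}

// subscribe registers a client on a channel. It reports whether a new
// broker subscription had to be opened (first client on that channel).
func (h *hub) subscribe(channel, clientID string) bool {
	set, ok := h.clients[channel]
	if !ok {
		set = map[string]bool{}
		h.clients[channel] = set
	}
	if set[clientID] {
		return false // already subscribed, nothing to do
	}
	set[clientID] = true
	if len(set) == 1 {
		h.brokerSubs++
		return true
	}
	return false
}

// unsubscribe removes a client. It reports whether the broker subscription
// was closed (last client left the channel).
func (h *hub) unsubscribe(channel, clientID string) bool {
	set := h.clients[channel]
	if !set[clientID] {
		return false
	}
	delete(set, clientID)
	if len(set) == 0 {
		delete(h.clients, channel)
		h.brokerSubs--
		return true
	}
	return false
}

func main() {
	h := newHub()
	for _, id := range []string{"alice", "bob", "carol"} {
		opened := h.subscribe("news", id)
		fmt.Printf("%s subscribed, opened broker subscription: %v\n", id, opened)
	}
	fmt.Println("broker subscriptions:", h.brokerSubs)
}
```

Three clients on the same channel cost one broker subscription here. A streaming broker that tracks a position per subscriber would instead need one subscription object per client, which is the mismatch described above.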

@FZambia FZambia merged commit 6f123b0 into master May 28, 2020
@FZambia FZambia deleted the nats_broker branch May 28, 2020 23:13
@subhendukundu

subhendukundu commented Jul 12, 2020

If I understood correctly from the commit, I can use Nats with Centrifugo now?
I have more questions, though:

  1. Is it using NATS wss as well (I see we are using v1.10.0, not sure if it has wss)? If not, will we start using it?
  2. Did we notice a performance improvement due to this?
  3. With NATS, are we planning to ditch Redis?
  4. Why not use NATS WebSockets directly (super curious)?
  5. Any documents regarding this? I can contribute if there are none, because I was planning on using this with NATS WebSockets.

@FZambia
Member Author

FZambia commented Jul 13, 2020

@subhendukundu I think you are expecting something a bit different from this integration. The Nats integration in Centrifugo only uses Nats as a broker; there is nothing related to the WebSocket protocol recently added to Nats. It's just an option to scale Centrifugo to many nodes with at-most-once message delivery guarantees.

> Did we notice a performance improvement due to this?

In publishing performance, most probably; in other aspects, no. As I said, this is just an extra option to scale Centrifugo PUB/SUB without Redis.

> With NATS are we planning to ditch Redis?

No, my main focus is on Redis.

> Why not use NATS WebSockets directly (Super curious)?

I don't understand the question. If someone wants to use NATS WebSockets, that's possible without Centrifugo involved.

> Any documents regarding this? I can contribute if not there cause I was planning on using this with nats WebSockets.

Centrifugo with the Nats integration is not released yet, but docs for the upcoming version already exist: https://centrifugal.github.io/centrifugo/server/engines/#nats-broker

@subhendukundu

This makes it so much clearer (sorry for the confusion).
Just one more question: does message recovery on reconnect work in NATS WebSockets? Or do we have to write the logic for that? Any doc or sample?
Thanks again.

@FZambia
Member Author

FZambia commented Jul 13, 2020

> Just one more question, message recovery on reconnecting works in nats WebSockets? Or we have to write the logic for that, any doc or sample?

In the Centrifugo integration with Nats, recovery won't work. Recovery is only available with the Memory and Redis engines.
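To show why recovery needs a history component, here is a minimal sketch of offset-based recovery over a bounded history. It is not Centrifugo's implementation: the `stream` type, its methods and the retention limit are all hypothetical. A broker that stores nothing has no equivalent of `recoverSince`, so a reconnecting client cannot get missed messages from it.

```go
package main

import "fmt"

// message is a published payload tagged with a monotonically growing offset.
type message struct {
	offset uint64
	data   string
}

// stream keeps the last `limit` messages of one channel.
type stream struct {
	next  uint64 // offset of the most recently published message
	limit int    // maximum number of messages retained
	items []message
}

// publish appends a message, trims the history, and returns its offset.
func (s *stream) publish(data string) uint64 {
	s.next++
	s.items = append(s.items, message{offset: s.next, data: data})
	if len(s.items) > s.limit {
		s.items = s.items[len(s.items)-s.limit:]
	}
	return s.next
}

// recoverSince returns messages with an offset greater than `since`, and
// whether the result is complete (nothing the client missed was trimmed).
func (s *stream) recoverSince(since uint64) ([]message, bool) {
	if since == s.next {
		return nil, true // client is already up to date
	}
	var missed []message
	for _, m := range s.items {
		if m.offset > since {
			missed = append(missed, m)
		}
	}
	complete := len(s.items) > 0 && s.items[0].offset <= since+1
	return missed, complete
}

func main() {
	s := &stream{limit: 3}
	for _, d := range []string{"m1", "m2", "m3", "m4", "m5"} {
		s.publish(d)
	}

	// A client that last saw offset 2 reconnects.
	missed, ok := s.recoverSince(2)
	fmt.Println("missed:", len(missed), "complete:", ok)

	// A client that last saw offset 1 lost message 2 to trimming.
	missed, ok = s.recoverSince(1)
	fmt.Println("missed:", len(missed), "complete:", ok)
}
```

When recovery is incomplete, the client has to fall back to loading state from the application backend.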

@subhendukundu

Is there a way we can achieve it?

@FZambia
Member Author

FZambia commented Jul 13, 2020

> Is there a way we can achieve it?

No, not at the moment. Maybe in the future with https://github.com/nats-io/jetstream, though this needs research. Could you provide more info about your use case and requirements? Do you want to replace Redis? Why?

@subhendukundu

The use case I am working on is a realtime chat app without a database (like Snapchat, but with no data storage). The two best options so far, in my view, are:

  1. centrifugo
  2. nats.ws (the reason for Nats is that it has huge potential for the future and new features, including the crazy number of connections with just 1gb cpu)

I have a working prototype/POC of the app up and running both ways.

Now with nats.ws, suppose the app is closed, there is no network, or user X hasn't opened the app, so user X has no connection to the server/socket while user Y sends 10 messages to user X. I will add push notifications to let user X know he has 10 new messages, but how do I show the messages, since Nats will lose the data? I did some research but was not able to find a solution that is already available with Nats (unless I figure out how to use JetStream, I guess).

Now with Centrifugo, I haven't tried this scenario. Which do you feel is the better option for this case?
I am using Flutter to make the app.


7 participants