
Upgrading from v2.3 through v3.0 and v3.1 to v3.2 results in panic #9480

Closed
@jlhawn

Description

Bug reporting

The docs recommend upgrading from v2.3 to v3.2 by first upgrading to each minor version along the way. However, there seems to be an issue if you perform this transition too quickly: if there are no writes to the v3 backend, or no snapshots are produced while running v3.0 or v3.1, then v3.2 panics on startup.

To reproduce this, start with an etcd v2.3 server that already has a snapshot (the bug does not occur if no snapshot has been taken yet). Stop the server and replace it with a v3.0 server; everything seems fine. Next, stop the server and replace it with a v3.1 server; again, everything is fine. Finally, stop the server and replace it with a v3.2 server, and observe this panic when the server starts up:

2018-03-22 18:14:32.879716 I | etcdserver: recovered store from snapshot at index 52
2018-03-22 18:14:32.882938 C | etcdserver: recovering backend from snapshot error: database snapshot file path error: snap: snapshot file doesn't exist
panic: recovering backend from snapshot error: database snapshot file path error: snap: snapshot file doesn't exist
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0xb7ab8c]

goroutine 1 [running]:
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.NewServer.func1(0xc4201ac5f8, 0xc4201ac3d0)
	/usr/local/google/home/jpbetz/Projects/etcd/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/server.go:284 +0x3c
panic(0xdaf1c0, 0xc42025f950)
	/usr/local/google/home/jpbetz/.gvm/gos/go1.8.7/src/runtime/panic.go:489 +0x2cf
github.com/coreos/etcd/cmd/vendor/github.com/coreos/pkg/capnslog.(*PackageLogger).Panicf(0xc420170820, 0xf95ff9, 0x2a, 0xc4201ac440, 0x1, 0x1)
	/usr/local/google/home/jpbetz/Projects/etcd/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/pkg/capnslog/pkg_logger.go:75 +0x15c
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.NewServer(0xc42026c000, 0x0, 0x14b2580, 0xc42025f8e0)
	/usr/local/google/home/jpbetz/Projects/etcd/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/server.go:379 +0x2e4d
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/embed.StartEtcd(0xc420182a80, 0xc420264000, 0x0, 0x0)
	/usr/local/google/home/jpbetz/Projects/etcd/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/embed/etcd.go:157 +0x782
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain.startEtcd(0xc420182a80, 0x6, 0xf71713, 0x6, 0x1)
	/usr/local/google/home/jpbetz/Projects/etcd/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain/etcd.go:186 +0x58
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain.startEtcdOrProxyV2()
	/usr/local/google/home/jpbetz/Projects/etcd/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain/etcd.go:103 +0x15ba
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain.Main()
	/usr/local/google/home/jpbetz/Projects/etcd/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain/main.go:39 +0x61
main.main()
	/usr/local/google/home/jpbetz/Projects/etcd/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/etcd/main.go:28 +0x20

The bug seems to have been introduced in this patch from last year.

In this case, the new db backend (which has not yet been used and has been in its initial state since the v3.0 server was deployed) reports an index (0) which is less than the latest snapshot index. The server assumes this means there is a *.snap.db file that can be renamed to db to catch up to the *.snap file, but no such *.snap.db file exists.
