[MongoDB] Adding and Removing Nodes in a Sharded Cluster

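MongoDB's auto-sharding provides:

1 Automatic rebalancing when load and data distribution across the shards become uneven

2 Simple addition and removal of nodes

3 Automatic failover

4 Scaling out to more than a thousand nodes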

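Adding a shard node was already covered when the sharded cluster was first configured. When a new node is added to a sharded cluster, MongoDB migrates data chunks from the other nodes onto the new one,
so that the data ends up evenly spread, which is in effect a form of load balancing.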
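As a rough illustration of what "evenly spread" means here, the sketch below computes the ideal per-shard chunk count once a new shard joins. `evenShare` is a hypothetical helper, not a MongoDB API, and the numbers are the test.momo chunk counts from the status output that follows. The balancer only migrates chunks until the shards are close enough, so the real counts will not match this ideal exactly.

```javascript
// Hypothetical helper (not part of MongoDB): ideal chunks per shard
// if the existing chunks were spread perfectly evenly.
function evenShare(chunkCounts, addedShards) {
    var total = 0;
    var shards = addedShards; // start with the shards being added
    for (var id in chunkCounts) {
        if (chunkCounts.hasOwnProperty(id)) {
            total += chunkCounts[id];
            shards += 1;
        }
    }
    return { total: total, shards: shards, perShard: total / shards };
}

// test.momo chunk counts before the new shard is added.
var before = { shard0000: 30, shard0001: 26, shard0002: 24 };
var ideal = evenShare(before, 1);
// ideal.perShard is 20: 80 chunks over 4 shards
```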

Before adding:

mongos> db.printShardingStatus()
--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
      { "_id" : "shard0000", "host" : "10.250.7.225:27018" }
      { "_id" : "shard0001", "host" : "10.250.7.249:27019" }
      { "_id" : "shard0002", "host" : "10.250.7.241:27020" }
  databases:
      { "_id" : "admin", "partitioned" : false, "primary" : "config" }
      { "_id" : "test", "partitioned" : true, "primary" : "shard0000" }
          test.momo chunks:
              shard0000 30
              shard0001 26
              shard0002 24
          too many chunks to print, use verbose if you want to force print
... (remainder omitted) ...

Note: when there are too many chunks and the output stops at "too many chunks to print, use verbose if you want to force print", the full list can be displayed as follows:

printShardingStatus(db.getSisterDB("config"),1);
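The per-shard totals can also be derived from the chunk metadata itself. The sketch below assumes chunk documents carrying `ns` and `shard` fields, as they appear in the Balancer log lines later in this post; in the shell the array could presumably be obtained from the `chunks` collection of the config database. `countChunksByShard` and the sample documents are made up for illustration.

```javascript
// Hypothetical helper: count the chunks of one namespace per shard.
function countChunksByShard(chunks, ns) {
    var counts = {};
    for (var i = 0; i < chunks.length; i++) {
        var chunk = chunks[i];
        if (chunk.ns !== ns) { continue; } // skip other collections
        counts[chunk.shard] = (counts[chunk.shard] || 0) + 1;
    }
    return counts;
}

// Made-up sample documents, not taken from a real cluster.
var sampleChunks = [
    { ns: "test.momo", shard: "shard0000" },
    { ns: "test.momo", shard: "shard0001" },
    { ns: "test.momo", shard: "shard0000" },
    { ns: "test.yql",  shard: "shard0002" }
];
var momoCounts = countChunksByShard(sampleChunks, "test.momo");
```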

Run the following in the admin database:

mongos> use admin
switched to db admin
mongos> db.runCommand({addshard:"10.250.7.225:27019"})
{ "shardAdded" : "shard0003", "ok" : 1 }

The command returns almost immediately, but migrating the data chunks from the other shard nodes to the new node takes some time in the background.

mongos> db.runCommand({ listShards : 1});
{
    "shards" : [
        {
            "_id" : "shard0000",
            "host" : "10.250.7.225:27018"
        },
        {
            "_id" : "shard0001",
            "host" : "10.250.7.249:27019"
        },
        {
            "_id" : "shard0002",
            "host" : "10.250.7.241:27020"
        },
        {
            "_id" : "shard0003",
            "host" : "10.250.7.225:27019"
        }
    ],
    "ok" : 1
}
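To confirm from a script that the new node was registered, the listShards reply can be searched by host. This is only a sketch: `findShardByHost` is a hypothetical helper, and the reply object is copied from the output above rather than fetched from a live cluster.

```javascript
// Hypothetical helper: return the shard _id registered for a host, or null.
function findShardByHost(listShardsResult, host) {
    var shards = listShardsResult.shards || [];
    for (var i = 0; i < shards.length; i++) {
        if (shards[i].host === host) { return shards[i]._id; }
    }
    return null;
}

// Reply copied from the listShards output above.
var listShardsReply = {
    shards: [
        { _id: "shard0000", host: "10.250.7.225:27018" },
        { _id: "shard0001", host: "10.250.7.249:27019" },
        { _id: "shard0002", host: "10.250.7.241:27020" },
        { _id: "shard0003", host: "10.250.7.225:27019" }
    ],
    ok: 1
};
var addedShardId = findShardByHost(listShardsReply, "10.250.7.225:27019");
```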

Checking again after a while: the data has been spread more or less evenly.

mongos> printShardingStatus(db.getSisterDB("config"),1);
--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
      { "_id" : "shard0000", "host" : "10.250.7.225:27018" }
      { "_id" : "shard0001", "host" : "10.250.7.249:27019" }
      { "_id" : "shard0002", "host" : "10.250.7.241:27020" }
      { "_id" : "shard0003", "host" : "10.250.7.225:27019" }
  databases:
      { "_id" : "admin", "partitioned" : false, "primary" : "config" }
      { "_id" : "test", "partitioned" : true, "primary" : "shard0000" }
          test.momo chunks:
              shard0003 16
              shard0001 21
              shard0000 21
              shard0002 23
          { "id" : { $minKey : 1 } } -->> { "id" : 0 } on : shard0003 { "t" : 28000, "i" : 0 }
          { "id" : 0 } -->> { "id" : 5236 } on : shard0003 { "t" : 33000, "i" : 0 }
          { "id" : 5236 } -->> { "id" : 11595 } on : shard0003 { "t" : 35000, "i" : 0 }
          { "id" : 11595 } -->> { "id" : 17346 } on : shard0003 { "t" : 37000, "i" : 0 }
          { "id" : 17346 } -->> { "id" : 23191 } on : shard0003 { "t" : 40000, "i" : 0 }
          { "id" : 23191 } -->> { "id" : 31929 } on : shard0003 { "t" : 43000, "i" : 0 }
          ... (partially omitted) ...
          { "id" : 930108 } -->> { "id" : 948575 } on : shard0002 { "t" : 21000, "i" : 7 }
          { "id" : 948575 } -->> { "id" : 957995 } on : shard0002 { "t" : 27000, "i" : 42 }
          { "id" : 957995 } -->> { "id" : 969212 } on : shard0002 { "t" : 27000, "i" : 43 }
          { "id" : 969212 } -->> { "id" : 983794 } on : shard0002 { "t" : 25000, "i" : 6 }
          { "id" : 983794 } -->> { "id" : 999997 } on : shard0002 { "t" : 25000, "i" : 7 }
          { "id" : 999997 } -->> { "id" : { $maxKey : 1 } } on : shard0002 { "t" : 11000, "i" : 3 }
          test.yql chunks:
              shard0003 1
              shard0000 1
              shard0002 1
              shard0001 1
          { "_id" : { $minKey : 1 } } -->> { "_id" : ObjectId("4eb298b3adbd9673afee95e3") } on : shard0003 { "t" : 5000, "i" : 0 }
          { "_id" : ObjectId("4eb298b3adbd9673afee95e3") } -->> { "_id" : ObjectId("4eb2a64640643e5bb60072f7") } on : shard0000 { "t" : 4000, "i" : 1 }
          { "_id" : ObjectId("4eb2a64640643e5bb60072f7") } -->> { "_id" : ObjectId("4eb2a65340643e5bb600e084") } on : shard0002 { "t" : 3000, "i" : 1 }
          { "_id" : ObjectId("4eb2a65340643e5bb600e084") } -->> { "_id" : { $maxKey : 1 } } on : shard0001 { "t" : 5000, "i" : 1 }
      { "_id" : "mongos", "partitioned" : false, "primary" : "shard0000" }
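One simple way to put a number on "evenly" is the gap between the fullest and the emptiest shard. The sketch below compares the test.momo counts right after addshard with the counts shown above; `chunkSpread` is a hypothetical helper, and the starting count of 0 for shard0003 is an assumption (the new shard had received nothing yet).

```javascript
// Hypothetical helper: difference between the largest and smallest chunk count.
function chunkSpread(counts) {
    var min = Infinity;
    var max = -Infinity;
    for (var id in counts) {
        if (counts.hasOwnProperty(id)) {
            if (counts[id] < min) { min = counts[id]; }
            if (counts[id] > max) { max = counts[id]; }
        }
    }
    return max - min;
}

// Right after addshard: the new shard is assumed to hold no chunks yet.
var justAdded = { shard0000: 30, shard0001: 26, shard0002: 24, shard0003: 0 };
// Some time later, from the status output above.
var afterBalancing = { shard0003: 16, shard0001: 21, shard0000: 21, shard0002: 23 };

var spreadBefore = chunkSpread(justAdded);      // 30
var spreadAfter = chunkSpread(afterBalancing);  // 7
```

The total also grew from 80 to 81 chunks between the two snapshots, which suggests that a chunk was split in the meantime.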

The log of the new node is attached below:

## Startup messages

Sat Nov 5 17:41:23 [initandlisten] MongoDB starting : pid=11807 port=27019 dbpath=/opt/mongodata/r2 64-bit host=rac1

Sat Nov 5 17:41:23 [initandlisten] db version v2.0.1, pdfile version 4.5

Sat Nov 5 17:41:23 [initandlisten] git version: 3a5cf0e2134a830d38d2d1aae7e88cac31bdd684

Sat Nov 5 17:41:23 [initandlisten] build info: Linux bs-linux64.10gen.cc 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41

Sat Nov 5 17:41:23 [initandlisten] options: { dbpath: "/opt/mongodata/r2", logappend: true, logpath: "/opt/mongodata/r1/27019.log", port: 27019, shardsvr: true }

Sat Nov 5 17:41:23 [initandlisten] journal dir=/opt/mongodata/r2/journal

Sat Nov 5 17:41:23 [initandlisten] recover : no journal files present, no recovery needed

Sat Nov 5 17:41:23 [initandlisten] waiting for connections on port 27019

Sat Nov 5 17:41:23 [websvr] admin web console waiting for connections on port 28019

### Connecting to the other nodes and copying data

Sat Nov 5 17:41:53 [initandlisten] connection accepted from 10.250.7.220:46807 #1

Sat Nov 5 17:42:03 [initandlisten] connection accepted from 10.250.7.225:57578 #2

Sat Nov 5 17:42:03 [FileAllocator] allocating new datafile /opt/mongodata/r2/test.ns, filling with zeroes…

Sat Nov 5 17:42:03 [FileAllocator] creating directory /opt/mongodata/r2/_tmp

Sat Nov 5 17:42:03 [FileAllocator] done allocating datafile /opt/mongodata/r2/test.ns, size: 16MB, took 0.1 secs

Sat Nov 5 17:42:03 [FileAllocator] allocating new datafile /opt/mongodata/r2/test.0, filling with zeroes…

Sat Nov 5 17:42:06 [FileAllocator] done allocating datafile /opt/mongodata/r2/test.0, size: 64MB, took 3.143 secs

Sat Nov 5 17:42:06 [migrateThread] build index test.momo { _id: 1 }

Sat Nov 5 17:42:06 [migrateThread] build index done 0 records 0 secs

Sat Nov 5 17:42:06 [migrateThread] info: creating collection test.momo on add index

Sat Nov 5 17:42:06 [migrateThread] build index test.momo { id: 1.0 }

Sat Nov 5 17:42:06 [migrateThread] build index done 0 records 0 secs

Sat Nov 5 17:42:06 [FileAllocator] allocating new datafile /opt/mongodata/r2/test.1, filling with zeroes…

Sat Nov 5 17:42:07 [migrateThread] migrate commit succeeded flushing to secondaries for 'test.momo' { id: MinKey } -> { id: 0.0 }

Sat Nov 5 17:42:07 [migrateThread] migrate commit flushed to journal for 'test.momo' { id: MinKey } -> { id: 0.0 }

Sat Nov 5 17:42:07 [migrateThread] migrate commit succeeded flushing to secondaries for 'test.momo' { id: MinKey } -> { id: 0.0 }

Sat Nov 5 17:42:07 [migrateThread] migrate commit flushed to journal for 'test.momo' { id: MinKey } -> { id: 0.0 }

Sat Nov 5 17:42:07 [migrateThread] about to log metadata event: { _id: "rac1-2011-11-05T09:42:07-0", server: "rac1", clientAddr: "", time: new Date(1320486127651), what: "moveChunk.to", ns: "test.momo", details: { min: { id: MinKey }, max: { id: 0.0 }, step1: 3271, step2: 217, step3: 0, step4: 0, step5: 520 } }

Sat Nov 5 17:42:07 [migrateThread] SyncClusterConnection connecting to [rac1:28001]

Sat Nov 5 17:42:07 [migrateThread] SyncClusterConnection connecting to [rac2:28002]

Sat Nov 5 17:42:07 [migrateThread] SyncClusterConnection connecting to [rac3:28003]

Sat Nov 5 17:42:07 [FileAllocator] done allocating datafile /opt/mongodata/r2/test.1, size: 128MB, took 1.011 secs

Sat Nov 5 17:42:13 [initandlisten] connection accepted from 10.250.7.249:40392 #3

Sat Nov 5 17:42:13 [migrateThread] build index test.yql { _id: 1 }

Sat Nov 5 17:42:13 [migrateThread] build index done 0 records 0.001 secs

Sat Nov 5 17:42:13 [migrateThread] info: creating collection test.yql on add index

Sat Nov 5 17:42:13 [migrateThread] migrate commit succeeded flushing to secondaries for 'test.yql' { _id: MinKey } -> { _id: ObjectId('4eb298b3adbd9673afee95e3') }

Sat Nov 5 17:42:13 [migrateThread] migrate commit flushed to journal for 'test.yql' { _id: MinKey } -> { _id: ObjectId('4eb298b3adbd9673afee95e3') }

Sat Nov 5 17:42:14 [migrateThread] migrate commit succeeded flushing to secondaries for 'test.yql' { _id: MinKey } -> { _id: ObjectId('4eb298b3adbd9673afee95e3') }

Sat Nov 5 17:42:14 [migrateThread] migrate commit flushed to journal for 'test.yql' { _id: MinKey } -> { _id: ObjectId('4eb298b3adbd9673afee95e3') }

Sat Nov 5 17:42:14 [migrateThread] about to log metadata event: { _id: "rac1-2011-11-05T09:42:14-1", server: "rac1", clientAddr: "", time: new Date(1320486134775), what: "moveChunk.to", ns: "test.yql", details: { min: { _id: MinKey }, max: { _id: ObjectId('4eb298b3adbd9673afee95e3') }, step1: 5, step2: 0, step3: 0, step4: 0, step5: 1006 } }

Sat Nov 5 17:42:16 [migrateThread] migrate commit succeeded flushing to secondaries for 'test.momo' { id: 102100 } -> { id: 120602 }

Sat Nov 5 17:42:16 [migrateThread] migrate commit flushed to journal for 'test.momo' { id: 102100 } -> { id: 120602 }

Sat Nov 5 17:42:17 [migrateThread] migrate commit succeeded flushing to secondaries for 'test.momo' { id: 102100 } -> { id: 120602 }

Sat Nov 5 17:42:17 [migrateThread] migrate commit flushed to journal for 'test.momo' { id: 102100 } -> { id: 120602 }

Sat Nov 5 17:42:17 [migrateThread] about to log metadata event: { _id: "rac1-2011-11-05T09:42:17-2", server: "rac1", clientAddr: "", time: new Date(1320486137351), what: "moveChunk.to", ns: "test.momo", details: { min: { id: 102100 }, max: { id: 120602 }, step1: 0, step2: 0, step3: 1573, step4: 0, step5: 479 } }

Sat Nov 5 17:42:20 [conn2] end connection 10.250.7.225:57578

Sat Nov 5 17:42:21 [initandlisten] connection accepted from 10.250.7.220:46814 #4

Sat Nov 5 17:42:21 [conn4] warning: bad serverID set in setShardVersion and none in info: EOO

Sat Nov 5 18:06:47 [initandlisten] connection accepted from 10.250.7.225:13612 #6

Sat Nov 5 18:06:47 [migrateThread] Socket say send() errno:32 Broken pipe 10.250.7.225:27018

Sat Nov 5 18:06:47 [migrateThread] about to log metadata event: { _id: "rac1-2011-11-05T10:06:47-3", server: "rac1", clientAddr: "", time: new Date(1320487607530), what: "moveChunk.to", ns: "test.momo", details: { min: { id: 120602 }, max: { id: 132858 }, note: "aborted" } }

Sat Nov 5 18:06:47 [migrateThread] not logging config change: rac1-2011-11-05T10:06:47-3 SyncClusterConnection::insert prepare failed: 9001 socket exception [2] server [127.0.0.1:28001] rac1:28001:{}

Sat Nov 5 18:07:00 [migrateThread] migrate commit succeeded flushing to secondaries for 'test.momo' { id: 120602 } -> { id: 132858 }

Sat Nov 5 18:07:00 [migrateThread] migrate commit flushed to journal for 'test.momo' { id: 120602 } -> { id: 132858 }

Sat Nov 5 18:07:01 [migrateThread] migrate commit succeeded flushing to secondaries for 'test.momo' { id: 120602 } -> { id: 132858 }

Sat Nov 5 18:07:01 [migrateThread] migrate commit flushed to journal for 'test.momo' { id: 120602 } -> { id: 132858 }

Sat Nov 5 18:07:01 [migrateThread] about to log metadata event: { _id: "rac1-2011-11-05T10:07:01-4", server: "rac1", clientAddr: "", time: new Date(1320487621150), what: "moveChunk.to", ns: "test.momo", details: { min: { id: 120602 }, max: { id: 132858 }, step1: 0, step2: 0, step3: 1121, step4: 0, step5: 886 } }

Sat Nov 5 18:07:01 [migrateThread] SyncClusterConnection connecting to [rac1:28001]

Sat Nov 5 18:07:01 [migrateThread] SyncClusterConnection connecting to [rac2:28002]

Sat Nov 5 18:07:01 [migrateThread] SyncClusterConnection connecting to [rac3:28003]

Sat Nov 5 18:07:17 [migrateThread] migrate commit flushed to journal for 'test.momo' { id: 142178 } -> { id: 154425 }

Sat Nov 5 18:07:18 [migrateThread] migrate commit succeeded flushing to secondaries for 'test.momo' { id: 142178 } -> { id: 154425 }

Sat Nov 5 18:07:18 [migrateThread] migrate commit flushed to journal for 'test.momo' { id: 142178 } -> { id: 154425 }

Sat Nov 5 18:07:18 [migrateThread] about to log metadata event: { _id: "rac1-2011-11-05T10:07:18-6", server: "rac1", clientAddr: "", time: new Date(1320487638676), what: "moveChunk.to", ns: "test.momo", details: { min: { id: 142178 }, max: { id: 154425 }, step1: 0, step2: 0, step3: 1108, step4: 0, step5: 940 } }

... (partially omitted) ...

Sat Nov 5 18:09:23 [clientcursormon] mem (MB) res:55 virt:413 mapped:80

Sat Nov 5 18:12:21 [conn1] command admin.$cmd command: { writebacklisten: ObjectId('4eb4e43618ed672581e26201') } ntoreturn:1 reslen:44 300012ms

Sat Nov 5 18:14:24 [clientcursormon] mem (MB) res:55 virt:413 mapped:80

Sat Nov 5 18:17:21 [conn1] command admin.$cmd command: { writebacklisten: ObjectId('4eb4e43618ed672581e26201') } ntoreturn:1 reslen:44 300012ms

Sat Nov 5 18:19:24 [clientcursormon] mem (MB) res:55 virt:413 mapped:80
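Each "moveChunk.to" metadata event in the log above carries step1 to step5 values, which appear to be per-phase durations in milliseconds. The sketch below pulls them out of one such line and adds them up. `stepTimings` is a hypothetical helper, and the sample string is the tail of the first event for test.momo.

```javascript
// Hypothetical helper: extract stepN values from a moveChunk.to log line.
function stepTimings(logLine) {
    var pattern = /step(\d+): (\d+)/g;
    var steps = {};
    var total = 0;
    var match;
    while ((match = pattern.exec(logLine)) !== null) {
        var value = parseInt(match[2], 10);
        steps["step" + match[1]] = value;
        total += value;
    }
    return { steps: steps, total: total };
}

// Tail of the first metadata event in the log above.
var eventLine = 'what: "moveChunk.to", ns: "test.momo", details: { min: { id: MinKey }, ' +
    'max: { id: 0.0 }, step1: 3271, step2: 217, step3: 0, step4: 0, step5: 520 } }';
var timing = stepTimings(eventLine);
// timing.total is 4008 (3271 + 217 + 0 + 0 + 520)
```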

2. Removing a node

When a node is removed, the cluster likewise migrates the data on that node to the other nodes.

db.runCommand({ listShards : 1});

mongos> db.runCommand({removeshard:"10.250.7.225:27018"})
{
    "msg" : "draining started successfully",
    "state" : "started",
    "shard" : "shard0000",
    "ok" : 1
}
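The reply above only says that draining has started. According to the MongoDB documentation, issuing removeshard again reports the progress, with the state moving to "ongoing" and finally "completed"; those two values do not appear in this post and are taken from that documentation. A minimal polling sketch, assuming the caller supplies a function that sends the command (`pollRemoveShard` and the replayed replies are hypothetical):

```javascript
// Hypothetical helper: repeat removeshard until the reply says "completed",
// an error comes back, or maxAttempts is reached.
function pollRemoveShard(runRemoveShard, maxAttempts, pause) {
    var reply = null;
    for (var attempt = 1; attempt <= maxAttempts; attempt++) {
        reply = runRemoveShard();
        if (reply.ok !== 1 || reply.state === "completed") { break; }
        if (pause) { pause(); } // e.g. wait a few seconds between attempts
    }
    return reply;
}

// Stand-in that replays a plausible sequence of replies instead of
// talking to a real cluster.
var fakeReplies = [
    { state: "started", ok: 1 },
    { state: "ongoing", ok: 1 },
    { state: "completed", ok: 1 }
];
var removeCalls = 0;
var finalReply = pollRemoveShard(function () {
    return fakeReplies[removeCalls++];
}, 10);
```

In the mongo shell the first argument would presumably wrap `db.runCommand({removeshard:"10.250.7.225:27018"})`, and the optional third argument could call the shell's `sleep`.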

mongos> db.runCommand({ listShards : 1});
{
    "shards" : [
        {
            "_id" : "shard0001",
            "host" : "10.250.7.249:27019"
        },
        {
            "_id" : "shard0002",
            "host" : "10.250.7.241:27020"
        },
        {
            "_id" : "shard0003",
            "host" : "10.250.7.225:27019"
        },
        {
            "_id" : "shard0000",
            "draining" : true,    -- data is being migrated
            "host" : "10.250.7.225:27018"
        }
    ],
    "ok" : 1
}
mongos>
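The "draining" flag in the listShards reply is what marks a shard as being emptied. A small sketch that picks such shards out of a reply; `drainingShards` is a hypothetical helper and the object is copied from the output above.

```javascript
// Hypothetical helper: list the _id of every shard flagged as draining.
function drainingShards(listShardsResult) {
    var ids = [];
    var shards = listShardsResult.shards || [];
    for (var i = 0; i < shards.length; i++) {
        if (shards[i].draining === true) { ids.push(shards[i]._id); }
    }
    return ids;
}

// Reply copied from the listShards output above.
var replyWhileDraining = {
    shards: [
        { _id: "shard0001", host: "10.250.7.249:27019" },
        { _id: "shard0002", host: "10.250.7.241:27020" },
        { _id: "shard0003", host: "10.250.7.225:27019" },
        { _id: "shard0000", draining: true, host: "10.250.7.225:27018" }
    ],
    ok: 1
};
var draining = drainingShards(replyWhileDraining);
```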

After removeshard has been issued (shard0000 no longer holds any chunks, but is still listed with "draining" : true):

mongos> db.printShardingStatus()
--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
      { "_id" : "shard0000", "draining" : true, "host" : "10.250.7.225:27018" }
      { "_id" : "shard0001", "host" : "10.250.7.249:27019" }
      { "_id" : "shard0002", "host" : "10.250.7.241:27020" }
      { "_id" : "shard0003", "host" : "10.250.7.225:27019" }
  databases:
      { "_id" : "admin", "partitioned" : false, "primary" : "config" }
      { "_id" : "test", "partitioned" : true, "primary" : "shard0000" }
          test.momo chunks:
              shard0003 27
              shard0001 28
              shard0002 27
          too many chunks to print, use verbose if you want to force print
          test.yql chunks:
              shard0003 1
              shard0001 2
              shard0002 1
          { "_id" : { $minKey : 1 } } -->> { "_id" : ObjectId("4eb298b3adbd9673afee95e3") } on : shard0003 { "t" : 5000, "i" : 0 }
          { "_id" : ObjectId("4eb298b3adbd9673afee95e3") } -->> { "_id" : ObjectId("4eb2a64640643e5bb60072f7") } on : shard0001 { "t" : 6000, "i" : 0 }
          { "_id" : ObjectId("4eb2a64640643e5bb60072f7") } -->> { "_id" : ObjectId("4eb2a65340643e5bb600e084") } on : shard0002 { "t" : 3000, "i" : 1 }
          { "_id" : ObjectId("4eb2a65340643e5bb600e084") } -->> { "_id" : { $maxKey : 1 } } on : shard0001 { "t" : 5000, "i" : 1 }
      { "_id" : "mongos", "partitioned" : false, "primary" : "shard0000" }
mongos>
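In the status above, shard0000 is still the primary for the test and mongos databases. As far as I know, such databases have to be moved to another shard with the movePrimary command before the draining shard can be removed completely. The sketch below lists them; `databasesWithPrimary` is a hypothetical helper and the documents are copied from the databases section of the output.

```javascript
// Hypothetical helper: names of the databases whose primary is the given shard.
function databasesWithPrimary(databases, shardId) {
    var names = [];
    for (var i = 0; i < databases.length; i++) {
        if (databases[i].primary === shardId) { names.push(databases[i]._id); }
    }
    return names;
}

// Copied from the "databases:" section of the status output above.
var databaseDocs = [
    { _id: "admin",  partitioned: false, primary: "config" },
    { _id: "test",   partitioned: true,  primary: "shard0000" },
    { _id: "mongos", partitioned: false, primary: "shard0000" }
];
var stillOnShard0000 = databasesWithPrimary(databaseDocs, "shard0000");
```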

The relevant log is attached below:

## The Balancer copies the data on the node being removed to the other nodes.

Sat Nov 5 19:09:29 [Balancer] chose [shard0000] to [shard0001] { _id: "test.yql-_id_ObjectId('4eb298b3adbd9673afee95e3')", lastmod: Timestamp 4000|1, ns: "test.yql", min: { _id: ObjectId('4eb298b3adbd9673afee95e3') }, max: { _id: ObjectId('4eb2a64640643e5bb60072f7') }, shard: "shard0000" }

Sat Nov 5 19:09:29 [Balancer] chose [shard0000] to [shard0003] { _id: "test.momo-id_212402", lastmod: Timestamp 42000|1, ns: "test.momo", min: { id: 212402 }, max: { id: 236820 }, shard: "shard0000" }

Sat Nov 5 19:09:29 [Balancer] moving chunk ns: test.yql moving ( ns:test.yql at: shard0000:10.250.7.225:27018 lastmod: 4|1 min: { _id: ObjectId('4eb298b3adbd9673afee95e3') } max: { _id: ObjectId('4eb2a64640643e5bb60072f7') }) shard0000:10.250.7.225:27018 -> shard0001:10.250.7.249:27019

Sat Nov 5 19:09:33 [Balancer] created new distributed lock for test.yql on rac1:28001,rac2:28002,rac3:28003 ( lock timeout : 900000, ping interval : 30000, process : 0 )

Sat Nov 5 19:09:33 [Balancer] ChunkManager: time to load chunks for test.yql: 0ms sequenceNumber: 114 version: 6|0

Sat Nov 5 19:09:33 [Balancer] moving chunk ns: test.momo moving ( ns:test.momo at: shard0000:10.250.7.225:27018 lastmod: 42|1 min: { id: 212402 } max: { id: 236820 }) shard0000:10.250.7.225:27018 -> shard0003:10.250.7.225:27019

Sat Nov 5 19:09:34 [Balancer] moveChunk result: { chunkTooBig: true, estimatedChunkSize: 1462920, errmsg: "chunk too big to move", ok: 0.0 }

Sat Nov 5 19:09:34 [Balancer] balancer move failed: { chunkTooBig: true, estimatedChunkSize: 1462920, errmsg: "chunk too big to move", ok: 0.0 } from: shard0000 to: shard0003 chunk: { _id: "test.momo-id_212402", lastmod: Timestamp 42000|1, ns: "test.momo", min: { id: 212402 }, max: { id: 236820 }, shard: "shard0000" }

Sat Nov 5 19:09:34 [Balancer] forcing a split because migrate failed for size reasons

Sat Nov 5 19:09:34 [Balancer] created new distributed lock for test.momo on rac1:28001,rac2:28002,rac3:28003 ( lock timeout : 900000, ping interval : 30000, process : 0 )

Sat Nov 5 19:09:34 [Balancer] ChunkManager: time to load chunks for test.momo: 1ms sequenceNumber: 115 version: 43|5

Sat Nov 5 19:09:34 [Balancer] forced split results: { ok: 1.0 }

Sat Nov 5 19:09:34 [Balancer] distributed lock 'balancer/rac4:27017:1320477786:1804289383' unlocked.

Sat Nov 5 19:09:39 [Balancer] distributed lock 'balancer/rac4:27017:1320477786:1804289383' acquired, ts : 4eb5197318ed672581e267a7

Sat Nov 5 19:09:39 [Balancer] chose [shard0002] to [shard0003] { _id: "test.momo-id_682899", lastmod: Timestamp 43000|2, ns: "test.momo", min: { id: 682899 }, max: { id: 697740 }, shard: "shard0002" }

Sat Nov 5 19:09:39 [Balancer] moving chunk ns: test.momo moving ( ns:test.momo at: shard0002:10.250.7.241:27020 lastmod: 43|2 min: { id: 682899 } max: { id: 697740 }) shard0002:10.250.7.241:27020 -> shard0003:10.250.7.225:27019

Sat Nov 5 19:09:43 [Balancer] created new distributed lock for test.momo on rac1:28001,rac2:28002,rac3:28003 ( lock timeout : 900000, ping interval : 30000, process : 0 )

Sat Nov 5 19:09:43 [Balancer] ChunkManager: time to load chunks for test.momo: 1ms sequenceNumber: 116 version: 44|1

Sat Nov 5 19:09:43 [Balancer] distributed lock 'balancer/rac4:27017:1320477786:1804289383' unlocked.

Sat Nov 5 19:09:48 [Balancer] distributed lock 'balancer/rac4:27017:1320477786:1804289383' acquired, ts : 4eb5197c18ed672581e267a8

Sat Nov 5 19:09:48 [Balancer] chose [shard0000] to [shard0003] { _id: "test.momo-id_212402", lastmod: Timestamp 43000|4, ns: "test.momo", min: { id: 212402 }, max: { id: 224692 }, shard: "shard0000" }

Sat Nov 5 19:09:48 [Balancer] moving chunk ns: test.momo moving ( ns:test.momo at: shard0000:10.250.7.225:27018 lastmod: 43|4 min: { id: 212402 } max: { id: 224692 }) shard0000:10.250.7.225:27018 -> shard0003:10.250.7.225:27019
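To get an overview of where the Balancer intends to send chunks, the "chose [from] to [to]" lines can be tallied. The sketch below does that for the four such lines in the log above, abbreviated to their prefixes; `parseBalancerMoves` is a hypothetical helper. It counts proposed moves only: as the log shows, the first shard0000 to shard0003 attempt failed with "chunk too big to move" and was retried after a forced split.

```javascript
// Hypothetical helper: count "chose [from] to [to]" decisions per shard pair.
function parseBalancerMoves(lines) {
    var pattern = /\[Balancer\] chose \[(\w+)\] to \[(\w+)\]/;
    var moves = {};
    for (var i = 0; i < lines.length; i++) {
        var match = pattern.exec(lines[i]);
        if (match === null) { continue; } // not a "chose" line
        var key = match[1] + "->" + match[2];
        moves[key] = (moves[key] || 0) + 1;
    }
    return moves;
}

// Prefixes of the Balancer lines above; the chunk documents are cut off.
var balancerLines = [
    "Sat Nov 5 19:09:29 [Balancer] chose [shard0000] to [shard0001]",
    "Sat Nov 5 19:09:29 [Balancer] chose [shard0000] to [shard0003]",
    "Sat Nov 5 19:09:34 [Balancer] forcing a split because migrate failed for size reasons",
    "Sat Nov 5 19:09:39 [Balancer] chose [shard0002] to [shard0003]",
    "Sat Nov 5 19:09:48 [Balancer] chose [shard0000] to [shard0003]"
];
var proposedMoves = parseBalancerMoves(balancerLines);
```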

Published by: 全栈程序员-用户IM. When reposting, please credit the source: https://javaforall.cn/109072.html Original link: https://javaforall.cn
