Glusterfs Replace Brick in Dispersed
I have a dispersed GlusterFS volume made up of 3 bricks on 3 servers. Recently one of the servers suffered a hard drive failure and dropped out of the cluster. I am trying to replace this brick in the cluster, but I can't get it to work.
First up here is the version info:
$ glusterfsd --version
glusterfs 3.13.2
Repository revision: git://git.gluster.org/glusterfs.git
Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.
It is running on Ubuntu 18.04.
Here is the existing volume info:
Volume Name: vol01
Type: Disperse
Volume ID: 061cac4d-1165-4afe-87e0-27b213ea19dc
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: srv02:/srv/glusterfs/vol01/brick <-- This is the brick that died
Brick2: srv03:/srv/glusterfs/vol01/brick
Brick3: srv04:/srv/glusterfs/vol01/brick
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
I wish to replace the srv02 brick with a brick from srv05 using the following:
gluster volume replace-brick vol01 srv02:/srv/glusterfs/vol01/brick srv05:/srv/glusterfs/vol01/brick commit force
However when I run this command (as root) I get this error:
volume replace-brick: failed: Pre Validation failed on srv05. brick: srv02:/srv/glusterfs/vol01/brick does not exist in volume: vol01
As far as I know it should work; srv05 is connected:
# gluster peer status
Number of Peers: 3
Hostname: srv04
Uuid: 5bbd6c69-e0a7-491c-b605-d70cb83ebc72
State: Peer in Cluster (Connected)
Hostname: srv02
Uuid: e4e856ba-61df-45eb-83bb-e2d2e799fc8d
State: Peer Rejected (Disconnected)
Hostname: srv05
Uuid: e7d098c1-7bbd-44e1-931f-034da645c6c6
State: Peer in Cluster (Connected)
As you can see, srv05 is connected and in the cluster, while srv02 is rejected and disconnected.
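For a quicker read of that output, a small filter can list only the peers that are not healthy. This is just a sketch: `unhealthy_peers` is a hypothetical helper name, and it assumes the `Hostname:` / `State:` line layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `gluster peer status` output on stdin and print
# every peer whose state is not "Peer in Cluster (Connected)".
unhealthy_peers() {
    awk '
        /^Hostname:/ { host = $2 }            # remember the current peer
        /^State:/ {
            sub(/^State:[ \t]*/, "")          # keep only the state text
            if ($0 != "Peer in Cluster (Connected)")
                print host ": " $0
        }
    '
}
```

It would be used as `gluster peer status | unhealthy_peers`.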
All the bricks are the same size, each on an XFS partition. The brick on srv05 is empty.
What am I doing wrong? I would prefer not to have to dump the whole FS and rebuild it if possible...
EDIT 2019-01-01:
I followed this tutorial to replace the dead server's brick (srv02) with the new one: https://support.rackspace.com/how-to/recover-from-a-failed-server-in-a-glusterfs-array/
The server and brick are recognized by the cluster:
# gluster volume status
Status of volume: vol01
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick srv02:/srv/glusterfs/vol01/brick N/A N/A N N/A
Brick srv03:/srv/glusterfs/vol01/brick 49152 0 Y 21984
Brick srv04:/srv/glusterfs/vol01/brick 49152 0 Y 16681
Self-heal Daemon on localhost N/A N/A Y 2582
Self-heal Daemon on srv04 N/A N/A Y 16703
Self-heal Daemon on srv03 N/A N/A Y 22006
However, the brick on the replacement srv02 is not coming online!
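To pick the offline bricks out of that status table without reading it by eye, something like the following could work. It is only a sketch: `offline_bricks` is a hypothetical helper name, and it assumes each brick sits on a single line with the columns in the order shown above (very long brick names may wrap in real output, which this does not handle).

```shell
#!/bin/sh
# Hypothetical helper: read `gluster volume status` output on stdin and print
# the bricks whose Online column is "N".
# Expected fields: Brick <host:path> <TCP port> <RDMA port> <Online> <Pid>
offline_bricks() {
    awk '$1 == "Brick" && $5 == "N" { print $2 }'
}
```

It would be used as `gluster volume status vol01 | offline_bricks`.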
After much searching I found this in the brick log on the new srv02:
[2019-01-01 05:50:05.727791] E [MSGID: 138001] [index.c:2349:init] 0-vol01-index: Failed to find parent dir (/srv/glusterfs/vol01/brick/.glusterfs) of index basepath /srv/glusterfs/vol01/brick/.glusterfs/indices. [No such file or directory]
I am not at all sure how to fix this one, as it's a blank brick that I am looking to bring online and heal!
distributed-computing glusterfs
Okay, so I followed this tutorial here: support.rackspace.com/how-to/… Now I have the server re-mapped into the cluster; however, the brick on srv02 won't start and glusterfsd is not running on that server... Where do I check the logs for this?
– Zexelon
Jan 1 at 1:57
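On the log question: on a default install, each brick normally writes its own log under /var/log/glusterfs/bricks/, named after the brick path with the slashes turned into hyphens. A sketch for deriving that file name follows; `brick_log_path` is a hypothetical helper, and the log directory is an assumption that may differ if glusterd was built or configured with another location.

```shell
#!/bin/sh
# Hypothetical helper: print the expected brick log file for a brick path,
# assuming the default log directory /var/log/glusterfs/bricks.
brick_log_path() {
    # Drop the leading slash, then turn the remaining slashes into hyphens.
    name=$(printf '%s' "$1" | sed -e 's|^/||' -e 's|/|-|g')
    printf '/var/log/glusterfs/bricks/%s.log\n' "$name"
}
```

For example, `tail -n 50 "$(brick_log_path /srv/glusterfs/vol01/brick)"` run on srv02 would show the latest brick messages.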
asked Jan 1 at 1:20 by Zexelon, edited Jan 1 at 15:48
1 Answer
In the end I got the brick to come online by running the following in the brick directory:
# mkdir .glusterfs
# chmod 600 .glusterfs
# cd .glusterfs
# mkdir indices
# chmod 600 indices
# systemctl restart glusterd
The brick came online, and I started the heal process with:
# gluster volume heal vol01 full
So far it seems to be functioning just fine.
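The manual steps above could also be wrapped in a small function so they are repeatable. This is a sketch rather than an official procedure: `prepare_brick` is a hypothetical helper name, and it only recreates the `.glusterfs/indices` skeleton with the same mode used above.

```shell
#!/bin/sh
# Hypothetical helper: recreate the .glusterfs/indices skeleton that the
# brick log complained about. Meant to be run as root on the server that
# holds the empty brick.
prepare_brick() {
    brick="$1"
    # The brick directory itself must already exist; do not create it here.
    if [ ! -d "$brick" ]; then
        echo "brick directory not found: $brick" >&2
        return 1
    fi
    mkdir -p "$brick/.glusterfs/indices" || return 1
    # Same mode as in the manual steps; child directory first, then parent.
    chmod 600 "$brick/.glusterfs/indices" "$brick/.glusterfs"
}
```

Usage would look like `prepare_brick /srv/glusterfs/vol01/brick && systemctl restart glusterd`, followed by the heal command above. Heal progress can then be checked with `gluster volume heal vol01 info`.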
answered Jan 1 at 15:43 by Zexelon