How to implement Continuous Control of a quadruped robot with Deep Reinforcement Learning in PyBullet and...

I have designed this robot in URDF format and built its environment in PyBullet. Each leg joint has a minimum and maximum range of movement.
Which reinforcement learning algorithm would be best for learning a walking policy in a simple environment, where a positive reward is given for walking in the positive X-axis direction?
The expected output of the policy is an array in the range (-1, 1), one value per joint. The inputs to the policy are the position of each joint, the center of mass of the body, the height difference between the floor and the body (to detect whether it has fallen), and the movement along the X-axis. A minimal Gym-style sketch of such an environment follows the joint-limit list below.
Joint limits:
- left_front_joint => lower="-0.4" upper="2.5" id=0
- left_front_leg_joint => lower="-0.6" upper="0.7" id=2
- right_front_joint => lower="-2.5" upper="0.4" id=3
- right_front_leg_joint => lower="-0.6" upper="0.7" id=5
- left_back_joint => lower="-2.5" upper="0.4" id=6
- left_back_leg_joint => lower="-0.6" upper="0.7" id=8
- right_back_joint => lower="-0.4" upper="2.5" id=9
- right_back_leg_joint => lower="-0.6" upper="0.7" id=11
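To make the setup concrete, here is a minimal Gym-style environment sketch built from the joint ids and limits above and the observation/action description earlier, using the classic Gym API. The class name SpiderEnv, the reward shaping, and the fall threshold are illustrative assumptions, not code from the repository:

import gym
import numpy as np
import pybullet as p
import pybullet_data

class SpiderEnv(gym.Env):
    # Joint ids and limits copied from the list above.
    JOINT_IDS = [0, 2, 3, 5, 6, 8, 9, 11]
    LOWER = np.array([-0.4, -0.6, -2.5, -0.6, -2.5, -0.6, -0.4, -0.6])
    UPPER = np.array([ 2.5,  0.7,  0.4,  0.7,  0.4,  0.7,  2.5,  0.7])

    def __init__(self):
        # One action in (-1, 1) per joint; observation = 8 joint positions
        # + base position (x, y, z) + height above the plane + x displacement.
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(8,), dtype=np.float32)
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(13,), dtype=np.float32)
        p.connect(p.DIRECT)  # headless is typical for training
        p.setAdditionalSearchPath(pybullet_data.getDataPath())

    def reset(self):
        p.resetSimulation()
        p.setGravity(0, 0, -10)
        p.loadURDF("plane.urdf")
        self.robot = p.loadURDF("src/spider.xml", [0, 0, 0.05])
        self.prev_x = 0.0
        return self._observe()

    def step(self, action):
        # Map each action from (-1, 1) into that joint's (lower, upper) range.
        target = self.LOWER + (np.asarray(action) + 1.0) * 0.5 * (self.UPPER - self.LOWER)
        for jid, pos in zip(self.JOINT_IDS, target):
            p.setJointMotorControl2(self.robot, jid, p.POSITION_CONTROL,
                                    targetPosition=pos, force=1.5)
        p.stepSimulation()
        obs = self._observe()
        (x, _, z), _ = p.getBasePositionAndOrientation(self.robot)
        reward = x - self.prev_x          # positive reward for +X progress
        self.prev_x = x
        done = z < 0.02                   # "fallen" threshold is a guess
        return obs, reward, done, {}

    def _observe(self):
        joints = [p.getJointState(self.robot, j)[0] for j in self.JOINT_IDS]
        (x, y, z), _ = p.getBasePositionAndOrientation(self.robot)
        # The plane sits at z=0, so the base height doubles as the floor gap.
        return np.array(joints + [x, y, z, z, x - self.prev_x], dtype=np.float32)

Scaling each action from (-1, 1) into the joint's own (lower, upper) range keeps the policy output normalized while still respecting the URDF limits.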
The code below is just a test of the environment, with a manually hard-coded set of movements, to see how the robot might walk later. The simulation is set to real time, but I assume it needs to advance frame by frame (stepped simulation) during policy training.
A video of it can be seen at:
https://youtu.be/j9sysG-EIkQ
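On the real-time point: for training, real-time mode is usually switched off so the physics only advances when stepped explicitly, which makes rollouts deterministic and faster than wall-clock. A minimal sketch (1/240 s is PyBullet's default fixed step; the loop body is illustrative):

import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                    # headless is typical for training
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -10)
p.loadURDF("plane.urdf")
p.setRealTimeSimulation(0)             # turn off real-time mode
p.setTimeStep(1.0 / 240.0)             # fixed physics time step
for _ in range(240):                   # advance exactly one simulated second
    # the policy's joint commands would be applied here, then:
    p.stepSimulation()
p.disconnect()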
import pybullet as p
import time
import pybullet_data


def moveLeg(robot=None, id=0, position=0, force=1.5):
    """Drive a single joint to a target position using position control."""
    if robot is None:
        return
    p.setJointMotorControl2(
        robot,
        id,
        p.POSITION_CONTROL,
        targetPosition=position,
        force=force,
        # maxVelocity=5
    )


# Camera parameters (only used if getCameraImage is re-enabled below)
pixelWidth = 1000
pixelHeight = 1000
camTargetPos = [0, 0, 0]
camDistance = 0.5
pitch = -10.0
roll = 0
upAxisIndex = 2
yaw = 0

physicsClient = p.connect(p.GUI)  # or p.DIRECT for the non-graphical version
p.setAdditionalSearchPath(pybullet_data.getDataPath())  # optional
p.setGravity(0, 0, -10)
viewMatrix = p.computeViewMatrixFromYawPitchRoll(camTargetPos, camDistance, yaw, pitch, roll, upAxisIndex)

planeId = p.loadURDF("plane.urdf")
cubeStartPos = [0, 0, 0.05]
cubeStartOrientation = p.getQuaternionFromEuler([0, 0, 0])
# boxId = p.loadURDF("r2d2.urdf", cubeStartPos, cubeStartOrientation)
boxId = p.loadURDF("src/spider.xml", cubeStartPos, cubeStartOrientation)
# boxId = p.loadURDF("spider_simple.urdf", cubeStartPos, cubeStartOrientation)

toggle = 1
p.setRealTimeSimulation(1)

for i in range(10000):
    # p.stepSimulation()
    moveLeg(robot=boxId, id=0,  position=toggle * -2)  # LEFT_FRONT
    moveLeg(robot=boxId, id=2,  position=toggle * -2)  # LEFT_FRONT
    moveLeg(robot=boxId, id=3,  position=toggle * -2)  # RIGHT_FRONT
    moveLeg(robot=boxId, id=5,  position=toggle * 2)   # RIGHT_FRONT
    moveLeg(robot=boxId, id=6,  position=toggle * 2)   # LEFT_BACK
    moveLeg(robot=boxId, id=8,  position=toggle * -2)  # LEFT_BACK
    moveLeg(robot=boxId, id=9,  position=toggle * 2)   # RIGHT_BACK
    moveLeg(robot=boxId, id=11, position=toggle * 2)   # RIGHT_BACK
    # time.sleep(1. / 140.)
    # time.sleep(0.01)
    time.sleep(1)
    toggle = toggle * -1
    # viewMatrix = p.computeViewMatrixFromYawPitchRoll(camTargetPos, camDistance, yaw, pitch, roll, upAxisIndex)
    # projectionMatrix = [1.0825318098068237, 0.0, 0.0, 0.0, 0.0, 1.732050895690918, 0.0, 0.0, 0.0, 0.0, -1.0002000331878662, -1.0, 0.0, 0.0, -0.020002000033855438, 0.0]
    # img_arr = p.getCameraImage(pixelWidth, pixelHeight, viewMatrix=viewMatrix, projectionMatrix=projectionMatrix, shadow=1, lightDirection=[1, 1, 1])

cubePos, cubeOrn = p.getBasePositionAndOrientation(boxId)
print(cubePos, cubeOrn)
p.disconnect()
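As a sanity check, the joint limits listed earlier can also be read directly from the loaded URDF rather than hard-coded; a small sketch, assuming it runs inside the script above before p.disconnect():

# Read each joint's name and limits from the loaded model so the policy's
# action scaling stays in sync with the URDF.
for j in range(p.getNumJoints(boxId)):
    info = p.getJointInfo(boxId, j)
    name, lower, upper = info[1].decode(), info[8], info[9]
    print(j, name, lower, upper)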
The complete code can be seen here:
https://github.com/rubencg195/WalkingSpider_OpenAI_PyBullet_ROS
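For the algorithm question itself: PPO, SAC, TD3, and DDPG all handle continuous Box action spaces like this one, with PPO and SAC being common first choices for locomotion. A hypothetical training sketch with Stable-Baselines3 (PyTorch), reusing the SpiderEnv wrapper sketched earlier (not code from the linked repository):

from stable_baselines3 import PPO

env = SpiderEnv()                        # the Gym-style wrapper sketched above
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=1_000_000)   # budget is a guess; tune as needed
model.save("spider_ppo")

# Roll out the learned policy:
obs = env.reset()
for _ in range(1000):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, done, _ = env.step(action)
    if done:
        obs = env.reset()

The one-million-timestep budget is a rough guess; quadruped gaits in PyBullet often need several million steps and some extra reward shaping (for example, penalizing energy use or body tilt) before they look natural.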
Tags: tensorflow, deep-learning, pytorch, reinforcement-learning, openai-gym
Please check "Which site?" for general issues.
– Prune
Dec 27 '18 at 17:47
For this site, we generally look for a demonstrable problem with code or coding tools, narrowed down by your prior research and experimentation or debugging. What reinforcement algorithms have you tried, and how did they work out for you?
– Prune
Dec 27 '18 at 17:49
Please check "Which site?" for general issues.
– Prune
Dec 27 '18 at 17:47
For this site, we generally look for a demonstrable problem with code or coding tools, narrowed down by your prior research and experimentation or debugging. What reinforcement algorithms have you tried, and how did they work out for you?
– Prune
Dec 27 '18 at 17:49
add a comment |
Please check "Which site?" for general issues.
– Prune
Dec 27 '18 at 17:47
For this site, we generally look for a demonstrable problem with code or coding tools, narrowed down by your prior research and experimentation or debugging. What reinforcement algorithms have you tried, and how did they work out for you?
– Prune
Dec 27 '18 at 17:49
Please check "Which site?" for general issues.
– Prune
Dec 27 '18 at 17:47
Please check "Which site?" for general issues.
– Prune
Dec 27 '18 at 17:47
For this site, we generally look for a demonstrable problem with code or coding tools, narrowed down by your prior research and experimentation or debugging. What reinforcement algorithms have you tried, and how did they work out for you?
– Prune
Dec 27 '18 at 17:49
For this site, we generally look for a demonstrable problem with code or coding tools, narrowed down by your prior research and experimentation or debugging. What reinforcement algorithms have you tried, and how did they work out for you?
– Prune
Dec 27 '18 at 17:49
add a comment |
active
oldest
votes
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53948176%2fhow-to-implement-a-continuous-control-of-a-quadruped-robot-with-deep-reinforceme%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53948176%2fhow-to-implement-a-continuous-control-of-a-quadruped-robot-with-deep-reinforceme%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
CcsXAlNXw,SsHFwsqr PxuOdf5wR8FR UOnFC9tPhG2HBXU9Ox,jaz1teO7 H3mtgNg kvca4YezIfcPwg,zqgERZhLX SoGa,pSP
Please check "Which site?" for general issues.
– Prune
Dec 27 '18 at 17:47
For this site, we generally look for a demonstrable problem with code or coding tools, narrowed down by your prior research and experimentation or debugging. What reinforcement algorithms have you tried, and how did they work out for you?
– Prune
Dec 27 '18 at 17:49