How to implement a Continuous Control of a quadruped robot with Deep Reinforcement Learning in PyBullet and...

I have designed this robot in URDF format and its environment in PyBullet. Each leg joint has a minimum and maximum range of movement.



Which reinforcement learning algorithm would be best for learning a walking policy in a simple environment where a positive reward is given for walking in the positive X-axis direction?



The expected output from the policy is an array with one value in the range (-1, 1) for each joint. The input to the policy is the position of each joint, the center of mass of the body, the height of the body above the floor (to detect whether it has fallen), and the displacement along the X-axis.
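
To make that concrete, this is roughly the interface I have in mind, as a minimal Gym-style sketch (the SpiderEnv name, the count of 8 controllable joints, and the exact observation layout are assumptions on my part, not finished code):

import numpy as np
import gym
from gym import spaces

NUM_JOINTS = 8  # assumption: the 8 controllable joints listed under "Limitations"

class SpiderEnv(gym.Env):
    # Skeleton only, to pin down the action and observation spaces.
    def __init__(self):
        # Action: one target value in (-1, 1) per joint.
        self.action_space = spaces.Box(
            low=-1.0, high=1.0, shape=(NUM_JOINTS,), dtype=np.float32)
        # Observation: joint positions + center of mass (x, y, z)
        # + body height above the floor + displacement along X.
        obs_dim = NUM_JOINTS + 3 + 1 + 1
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf, shape=(obs_dim,), dtype=np.float32)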



Limitations




  • left_front_joint => lower="-0.4" upper="2.5" id=0

  • left_front_leg_joint => lower="-0.6" upper="0.7" id=2

  • right_front_joint => lower="-2.5" upper="0.4" id=3

  • right_front_leg_joint => lower="-0.6" upper="0.7" id=5

  • left_back_joint => lower="-2.5" upper="0.4" id=6

  • left_back_leg_joint => lower="-0.6" upper="0.7" id=8

  • right_back_joint => lower="-0.4" upper="2.5" id=9

  • right_back_leg_joint => lower="-0.6" upper="0.7" id=11
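
Since the policy output is in (-1, 1) per joint, I plan to rescale each value linearly into the corresponding joint range before sending it to the motors. A sketch of that mapping, using the ids and limits listed above (the scale_action helper is my own name, not existing code):

# (joint id, lower, upper) copied from the URDF limits above
JOINT_LIMITS = [
    (0,  -0.4, 2.5),   # left_front_joint
    (2,  -0.6, 0.7),   # left_front_leg_joint
    (3,  -2.5, 0.4),   # right_front_joint
    (5,  -0.6, 0.7),   # right_front_leg_joint
    (6,  -2.5, 0.4),   # left_back_joint
    (8,  -0.6, 0.7),   # left_back_leg_joint
    (9,  -0.4, 2.5),   # right_back_joint
    (11, -0.6, 0.7),   # right_back_leg_joint
]

def scale_action(action):
    # Map each policy output a in (-1, 1) so that -1 -> lower and +1 -> upper.
    targets = []
    for a, (joint_id, lower, upper) in zip(action, JOINT_LIMITS):
        targets.append((joint_id, lower + (a + 1.0) * 0.5 * (upper - lower)))
    return targets

Each (joint_id, target) pair would then be fed to the same kind of p.setJointMotorControl2 call used in the test code below.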


The code below is just a test of the environment, with a manual set of movements hard-coded into the robot, to see how it could walk later. The environment is set to real time, but I assume it needs to be stepped frame by frame during policy training (a sketch of that follows the code below).



A video of it can be seen here:



https://youtu.be/j9sysG-EIkQ



import pybullet as p
import time
import pybullet_data

def moveLeg(robot=None, id=0, position=0, force=1.5):
    # Drive a single joint towards a target angle with position control.
    if robot is None:
        return
    p.setJointMotorControl2(
        robot,
        id,
        p.POSITION_CONTROL,
        targetPosition=position,
        force=force,
        # maxVelocity=5
    )

# Camera parameters (only used if getCameraImage is re-enabled below).
pixelWidth = 1000
pixelHeight = 1000
camTargetPos = [0, 0, 0]
camDistance = 0.5
pitch = -10.0
roll = 0
upAxisIndex = 2
yaw = 0

physicsClient = p.connect(p.GUI)  # or p.DIRECT for the non-graphical version
p.setAdditionalSearchPath(pybullet_data.getDataPath())  # optional
p.setGravity(0, 0, -10)
viewMatrix = p.computeViewMatrixFromYawPitchRoll(camTargetPos, camDistance, yaw, pitch, roll, upAxisIndex)
planeId = p.loadURDF("plane.urdf")
cubeStartPos = [0, 0, 0.05]
cubeStartOrientation = p.getQuaternionFromEuler([0, 0, 0])
# boxId = p.loadURDF("r2d2.urdf", cubeStartPos, cubeStartOrientation)
boxId = p.loadURDF("src/spider.xml", cubeStartPos, cubeStartOrientation)
# boxId = p.loadURDF("spider_simple.urdf", cubeStartPos, cubeStartOrientation)

toggle = 1

p.setRealTimeSimulation(1)

for i in range(10000):
    # p.stepSimulation()

    # Alternate between the two poses of a hard-coded trot-like gait.
    moveLeg(robot=boxId, id=0,  position=toggle * -2)  # LEFT_FRONT
    moveLeg(robot=boxId, id=2,  position=toggle * -2)  # LEFT_FRONT

    moveLeg(robot=boxId, id=3,  position=toggle * -2)  # RIGHT_FRONT
    moveLeg(robot=boxId, id=5,  position=toggle *  2)  # RIGHT_FRONT

    moveLeg(robot=boxId, id=6,  position=toggle *  2)  # LEFT_BACK
    moveLeg(robot=boxId, id=8,  position=toggle * -2)  # LEFT_BACK

    moveLeg(robot=boxId, id=9,  position=toggle *  2)  # RIGHT_BACK
    moveLeg(robot=boxId, id=11, position=toggle *  2)  # RIGHT_BACK

    # time.sleep(1./140.)
    # time.sleep(0.01)
    time.sleep(1)

    toggle = toggle * -1

    # viewMatrix = p.computeViewMatrixFromYawPitchRoll(camTargetPos, camDistance, yaw, pitch, roll, upAxisIndex)
    # projectionMatrix = [1.0825318098068237, 0.0, 0.0, 0.0, 0.0, 1.732050895690918, 0.0, 0.0, 0.0, 0.0, -1.0002000331878662, -1.0, 0.0, 0.0, -0.020002000033855438, 0.0]
    # img_arr = p.getCameraImage(pixelWidth, pixelHeight, viewMatrix=viewMatrix, projectionMatrix=projectionMatrix, shadow=1, lightDirection=[1,1,1])

cubePos, cubeOrn = p.getBasePositionAndOrientation(boxId)
print(cubePos, cubeOrn)
p.disconnect()
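
During training, I assume the loop above would change to something like this: disable real-time simulation, step the physics manually at a fixed timestep, and compute the reward from the displacement of the base along X. This is only a sketch under those assumptions; a random action stands in for the learned policy, and the action-to-joint mapping is left out:

import pybullet as p
import pybullet_data
import numpy as np

p.connect(p.DIRECT)                      # headless is faster for training
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -10)
p.setRealTimeSimulation(0)               # step the simulation manually instead
p.setTimeStep(1.0 / 240.0)               # PyBullet's default fixed timestep

planeId = p.loadURDF("plane.urdf")
boxId = p.loadURDF("src/spider.xml", [0, 0, 0.05])

prev_x = p.getBasePositionAndOrientation(boxId)[0][0]
for step in range(1000):
    action = np.random.uniform(-1, 1, size=8)   # placeholder for the policy
    # ... map action into joint targets and call p.setJointMotorControl2 here ...
    p.stepSimulation()                   # advance physics by exactly one step
    pos, orn = p.getBasePositionAndOrientation(boxId)
    reward = pos[0] - prev_x             # positive reward for moving along +X
    prev_x = pos[0]

p.disconnect()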


The complete code can be seen here:
https://github.com/rubencg195/WalkingSpider_OpenAI_PyBullet_ROS










tensorflow deep-learning pytorch reinforcement-learning openai-gym

asked Dec 27 '18 at 16:39 by Ruben A. Chevez
edited Dec 27 '18 at 17:46 by Prune
  • Please check "Which site?" for general issues.
    – Prune, Dec 27 '18 at 17:47

  • For this site, we generally look for a demonstrable problem with code or coding tools, narrowed down by your prior research and experimentation or debugging. What reinforcement algorithms have you tried, and how did they work out for you?
    – Prune, Dec 27 '18 at 17:49















