
Trajectory features estimation for legged robots

R. Omar Chavez-Garcia, Jerome Guzzi, Alessandro Giusti, Luca Maria Gambardella

Fig. 1: Left: Simulated ANYmal platform on a challenging terrain. Circle regions represent the sampling area for the target pose input. The square area indicates the elevation patch input. Right: A sample (K, q2, f(γ)) is the outcome of a local motion from q1 to q2.

I. INTRODUCTION

Robot navigation relies on correctly estimating (through perception or proprioception) which terrains are negotiable, i.e. where the robot can safely traverse. Data-driven approaches for local traversability estimation (seen as a local planner) have been previously studied for selecting appropriate locomotion commands to negotiate challenging trajectories [1], [2].

We explore an approach to the problem of locally estimating trajectory features by applying self-supervised machine learning to learn the mapping between the inputs of a local planner and its outputs. In our approach, inputs consist of a relative target pose and the elevation information surrounding the ANYmal legged robot, while outputs indicate whether a target pose can be successfully reached and the cost of doing so in terms of time. The mapping is learned from instances of local motions (trajectories) and their outcomes, obtained via simulation.

II. PROPOSED APPROACH

The robot moves in an environment of which it has partial knowledge K. A local motion e = (q1, q2) ∈ C2 is defined by two nearby states in the robot's configuration space C. When the robot tries to move from q1 to q2 using a local planner and a controller, it follows a trajectory γ : [0, T] → C, with γ(0) = q1 and γ(T) = q2 (if it successfully reaches q2 at time T).

We introduce f(γ) = (f1(γ), . . . , fn(γ)) ∈ F, an n-dimensional description of trajectory features; f should contain any information relevant to evaluate the trajectory. For example, it may denote whether the robot has reached a target pose, the duration, the energy consumed, or a failure type.

Authors are with the Dalle Molle Institute for Artificial Intelligence (IDSIA), USI-SUPSI, Lugano, Switzerland. [email protected]

This work has been conducted as part of ANYmal Research, a community to advance legged robotics, and was supported by the National Centre of Competence in Research Robotics.

We apply a self-supervised machine learning approach to model the mapping (e, K) ↦ y ≈ E[f(e, K)]. Thus, we collect many instances of local trajectories γ by sampling local motions in different environments and recording their features f(γ).

We assume that the relevant features depend only on the relative position of the target q2 with respect to the source q1 and the local knowledge around it. Hence, we use a local coordinate frame, centred at q1, to represent q2 and K. The resulting dataset is composed of samples of the form (K, q2, f1, f2, . . . , fn) (see Figure 1-right).
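As an illustration only (not the authors' code), such a dataset sample could be stored as a simple record; the field names here are hypothetical:

```python
from collections import namedtuple

# One dataset entry: local elevation knowledge K (a heightmap patch),
# the relative target pose q2, and the measured trajectory features.
Sample = namedtuple("Sample", ["K", "q2", "features"])

s = Sample(K=[[0.0] * 100] * 100,               # 100 x 100 elevation patch
           q2=(0.3, 0.1),                       # relative target (x, y) in metres
           features={"success": 1, "T": 9.2})   # f1, ..., fn
```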

III. EXPERIMENTAL RESULTS

ANYmal is a state-of-the-art quadruped robot developed for autonomous operation in challenging environments by ETHZ [3], with the ability to walk on rough terrain. We use a simulation of the ANYmal in Gazebo (see Figure 1-left) to learn how well the robot copes with different obstacles.

ANYmal moves on a terrain whose information K is encoded as a heightmap (with a 2 cm per pixel resolution), i.e. a regular image with an elevation value associated with each pixel; in this work we ignore any property that is not geometrical (such as the surface material). ANYmal's footprint is 80 cm × 60 cm × 70 cm. Local motion knowledge (K) is given by 100 px × 100 px patches centred at the robot's location and oriented along its heading.
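Extracting such an oriented patch can be sketched with plain numpy nearest-neighbour sampling; this is our illustration under stated assumptions (function name and border-clamping behaviour are ours, not from the paper):

```python
import numpy as np

def oriented_patch(heightmap, cx, cy, heading, size=100):
    """Sample a size x size patch centred at pixel (cx, cy), rotated so
    the patch rows align with the robot's heading (nearest-neighbour
    lookup; pixels falling outside the map clamp to the border)."""
    offsets = np.arange(size) - size // 2
    u, v = np.meshgrid(offsets, offsets)      # patch coordinates in the robot frame
    c, s = np.cos(heading), np.sin(heading)
    # rotate robot-frame offsets into map-frame pixel coordinates
    xs = np.clip(np.round(cx + c * u - s * v).astype(int), 0, heightmap.shape[1] - 1)
    ys = np.clip(np.round(cy + s * u + c * v).astype(int), 0, heightmap.shape[0] - 1)
    return heightmap[ys, xs]
```

With heading = 0 the patch is an axis-aligned crop; any other heading rotates the sampling grid around the robot's position.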

The simulated ANYmal uses a simple local planner and a complex closed-loop feedback controller to follow the local trajectory while compensating for terrain irregularities [4]. The local planner takes as its only input the relative target pose q2 = (x, y, θ). For simplicity, we limit our analysis to local motions that maintain the robot's orientation. For any sampled trajectory, f ∈ {0, 1} × R+ is composed of the success of reaching q2 and the duration T of the trajectory.

We collected a training dataset of around 70k samples (74% with success = 1) of the form (K, x, y, success, T), by randomly spawning the robot on stable poses on different generated maps (similarly to our previous work [5]) and randomly sampling a relative target point at a distance between 15 cm and 50 cm in any direction; on average, the robot needs 8 steps and 10 seconds per sample (see Figure 1-right). We employed 12 simulated maps (10 m × 10 m each) for data gathering; these contain challenging obstacles such as slopes, bumps, holes, rails and steps.
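The target-sampling step above can be sketched as follows (our illustration; it draws uniformly over distance and direction, which is one plausible reading of "at a distance between 15 cm and 50 cm in any direction"):

```python
import math
import random

def sample_relative_target(r_min=0.15, r_max=0.50):
    """Draw a relative target point (x, y) in metres, at a distance
    between r_min and r_max in a uniformly random direction.
    (Uniform over distance and angle, not over the annulus area.)"""
    r = random.uniform(r_min, r_max)
    a = random.uniform(-math.pi, math.pi)
    return r * math.cos(a), r * math.sin(a)
```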

Figure 2 shows a schematic of the proposed CNN. Thismodel is expected to learn relevant features related to robot’slocomotion. CNN’s architecture is divided in two stages: thefirst one processes the visual input provided by the patch

Page 2: Trajectory features estimation for legged robots€¦ · 11/5/2019  · (32) (32) dropout (64) input target (x;y) concatenation (64) (32) softmax output success (64) dense (1) output

inputmap

patch K conv

(16)

conv

(16)

conv

(32)

conv

(32)

drop

out

flatte

n/

dens

e(6

4)

inputtarget(x, y)

conc

aten

atio

n

dens

e(6

4)

dens

e(3

2) softmaxoutputsuccess

dens

e(6

4)

dense (1)output T

Fig. 2: ANYmal Convolutional Neural Network architecture:inputs comprise the heightmap patch K and the relative targetgoal (x, y); outputs consist of a soft classification of successfor reaching the target pose, and an estimation of the time toreach such target pose. The model is supervised via two lossfunctions: cross-entropy for the classifier and mean-squarederror for the regressor. The minimized loss value of the modelis the weighted sum of both individual losses.

TABLE I: ANYmal’s model evaluation for success predictionand duration estimation in two datasets. Metrics include AreaUnder the ROC (AUC) for the classification of success andMean Squared Error (MSE) for duration estimation.

dataset (success/failures samples) AUC MSE

evaluation (7300 / 1398) 0.892 1.89rough map (18453 / 185) 0.867 2.35

sample; an additional input represented by the relative targetpoint (x, y) is then added to a second stage. The output ofthe CNN consists: of a success score ysuccess from a softmaxlayer and a duration estimation (in seconds) yT from a denseconnected layer.
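The combined objective described in the Fig. 2 caption can be illustrated with a minimal numerical sketch (ours, not the authors' code; the loss weight `w` is a hypothetical parameter):

```python
import numpy as np

def combined_loss(p_success, success, t_pred, t_true, w=1.0):
    """Weighted sum of the two supervision losses: binary cross-entropy
    for the success classifier and mean-squared error for the
    duration regressor."""
    p = np.clip(p_success, 1e-7, 1 - 1e-7)  # numerical safety for log
    ce = -np.mean(success * np.log(p) + (1 - success) * np.log(1 - p))
    mse = np.mean((t_pred - t_true) ** 2)
    return ce + w * mse
```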

Table I shows quantitative results of the model's evaluation. We evaluated the trained model with two datasets: an evaluation dataset containing samples from 8 different heightmaps (similar to those in the training dataset); and a dataset from a particularly rough terrain (10 m × 10 m) where we executed several paths while recording success and failure samples. Examples of estimated success and duration values are depicted in Figure 3. Success estimations are coherent with the elevation obstacles around the robot; in both cases it can be seen that lower success values appear around the prominent obstacles, i.e. the walls and the hole edges. Duration estimation is also in agreement with the distance to a target and the obstacles the robot may face towards such a target.

IV. CONCLUSIONS AND PERSPECTIVES

We introduced a data-driven approach to estimate features of short trajectories using local knowledge. Results show the model's ability to predict the success of a trajectory and its duration. Additionally, we are exploring the estimation of energy consumption, computed as the time integral of the leg motors' power.
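The time-integral computation mentioned above can be approximated from sampled power readings, e.g. with the trapezoidal rule (a sketch of ours, assuming power samples taken at a fixed interval dt):

```python
def energy_consumed(power, dt):
    """Energy (J) as the time integral of total leg-motor power (W),
    approximated with the trapezoidal rule over samples spaced dt seconds."""
    return sum((a + b) * dt / 2.0 for a, b in zip(power, power[1:]))
```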

We are currently working on a general approach for robot path planning: first we employ the proposed local estimator to predict the outcome of short trajectories from local information, then we use this model to discriminate between potential edges while building a dense random tree that connects source and target poses, accounting for traversability risks and the costs derived from the estimated duration and energy.

Fig. 3: Examples of estimated values on two patches (100 px × 100 px = 2 m × 2 m) from the rough map dataset. Left: estimated success when the target pose is sampled from a circular region with radius between 15 cm and 50 cm and fixed heading; green indicates a high success score, red otherwise. Right: duration estimation for trajectories with success score > 0.6; green colouring indicates short durations, pink otherwise. ANYmal's footprint appears as a blue arrow.

REFERENCES

[1] W. Gao, D. Hsu, W. S. Lee, and S. S. K. Subramanian, "Intention-net: Integrating planning and deep learning for goal-directed autonomous navigation," in Conference on Robot Learning, vol. 78 of Proc. Machine Learning Research, 2017, pp. 185–194.

[2] L. Wellhausen, A. Dosovitskiy, R. Ranftl, K. Walas, C. Cadena Lerma, and M. Hutter, "Where should I walk? Predicting terrain properties from images via self-supervised learning," IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1509–1516, 2019.

[3] M. Hutter, C. Gehring, D. Jud, A. Lauber, C. D. Bellicoso, V. Tsounis, J. Hwangbo, K. Bodie, P. Fankhauser, M. Bloesch, R. Diethelm, S. Bachmann, A. Melzer, and M. Hoepflinger, "ANYmal - a highly mobile and dynamic quadrupedal robot," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016, pp. 38–44.

[4] P. Fankhauser, C. D. Bellicoso, C. Gehring, R. Dube, A. Gawel, and M. Hutter, "Free Gait – An Architecture for the Versatile Control of Legged Robots," in IEEE-RAS International Conference on Humanoid Robots, 2016.

[5] R. O. Chavez-Garcia, J. Guzzi, L. M. Gambardella, and A. Giusti, "Learning ground traversability from simulations," IEEE Robotics and Automation Letters, vol. 3, no. 3, pp. 1695–1702, July 2018.