openai_ros.task_envs.hopper package


openai_ros.task_envs.hopper.hopper_stay_up module

class openai_ros.task_envs.hopper.hopper_stay_up.HopperStayUpEnv[source]

Bases: openai_ros.robot_envs.hopper_env.HopperEnv


Makes the Hopper learn how to stay up indefinitely.

__module__ = 'openai_ros.task_envs.hopper.hopper_stay_up'
_compute_reward(observations, done)[source]

We base the reward on whether the episode is done, and on the joint positions, efforts, contact force, orientation and distance from the desired point. :return:


Here we define which sensor data make up our robot's observations. To know which variables we have access to, read the HopperEnv API docs. Returns the state of the robot needed for the OpenAI Q-Learn algorithm. The state is defined by an array of:
1) Distance from the desired point, in metres
2) Pitch orientation, in radians
3) Roll orientation, in radians
4) Yaw orientation, in radians
5) Force in the contact sensor, in newtons
6-7-8) State of the 3 joints, in radians
9) Height of the base

observation = [distance_from_desired_point,
base_roll, base_pitch, base_yaw, force_magnitude, joint_states_haa, joint_states_hfe, joint_states_kfe, height_base]

Initialises the variables that need to be reset each time an episode starts. :return:


We consider the episode done if: 1) The monoped's height is lower than a threshold. 2) The orientation is outside a threshold.
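A minimal sketch of the termination test described above. The threshold values and the function name are illustrative assumptions, not the package's actual defaults:

```python
def episode_done(height_base, roll, pitch,
                 min_height=0.5, max_incline=0.7):
    """Episode ends when the monoped's base drops below a minimum
    height (metres) or its roll/pitch leave the allowed band (radians)."""
    too_low = height_base < min_height
    flipped = abs(roll) > max_incline or abs(pitch) > max_incline
    return too_low or flipped

episode_done(0.9, 0.0, 0.1)   # upright and tall enough -> False
episode_done(0.3, 0.0, 0.0)   # below the height threshold -> True
```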


Sets the joints of the monoped based on the action integer given. :param action: The action integer that sets which movement to perform next.


Sets the robot to its initial linear and angular speeds and lands it, preparing it to be reset in the world.

calculate_reward_contact_force(force_magnitude, weight=1.0)[source]

We calculate the reward based on the contact force: the nearer to the desired contact force, the better. We use an exponential to magnify big departures from the desired force. The default desired force (7.08 N) was taken from readings of the robot touching the ground after falling from a negligible height of 5 cm. :return:
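One way to realise the exponential shaping described above; the exact formula in openai_ros may differ, so treat this as a sketch of the idea, with the default taken from the 7.08 N reading quoted in the docs:

```python
import math

def calculate_reward_contact_force(force_magnitude,
                                   desired_force=7.08, weight=1.0):
    """Reward shrinks exponentially as the measured contact force
    departs from the desired force, so large departures are
    penalised disproportionately.  Illustrative shaping only."""
    return weight * math.exp(-abs(force_magnitude - desired_force))

calculate_reward_contact_force(7.08)  # at the desired force -> 1.0
```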

calculate_reward_distance_from_des_point(current_position, weight=1.0)[source]

We calculate the distance from the desired point; the closer, the better. :param weight: :return: reward
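A sketch of a "closer is better" distance reward, assuming positions are 3-tuples of coordinates. The `1/(1+d)` shaping is an illustrative choice, not necessarily the package's exact formula:

```python
import math

def calculate_reward_distance(current_position, desired_position, weight=1.0):
    """Reward decays with the Euclidean distance to the desired point;
    at the point itself the reward equals `weight`."""
    dist = math.sqrt(sum((c - d) ** 2
                         for c, d in zip(current_position, desired_position)))
    return weight * (1.0 / (1.0 + dist))
```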

calculate_reward_joint_position(joints_state_array, weight=1.0)[source]

We calculate the reward based on the joint configuration: the nearer to 0, the better. :return:
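A minimal sketch of the "nearer to 0 is better" idea, penalising the total deviation of the joints from the neutral pose. The linear penalty is an illustrative assumption, not the package's exact coefficients:

```python
def calculate_reward_joint_position(joints_state_array, weight=1.0):
    """Penalise joint configurations far from the all-zero pose:
    the negated sum of absolute joint angles, scaled by `weight`."""
    return -weight * sum(abs(q) for q in joints_state_array)
```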

calculate_reward_orientation(rpy_array, weight=1.0)[source]

We calculate the reward based on the orientation: the closer to 0, the better, because it means the robot is upright. desired_yaw is the yaw we want the robot to have; to reward a certain orientation, this is where to set it. :param rpy_array: An array with roll, pitch and yaw at indices 0, 1 and 2 respectively. :return:
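The orientation shaping above can be sketched as a penalty on roll, pitch, and the yaw error against `desired_yaw`. The linear penalty is an illustrative assumption:

```python
def calculate_reward_orientation(rpy_array, desired_yaw=0.0, weight=1.0):
    """rpy_array holds roll, pitch and yaw at indices 0, 1 and 2.
    Upright (roll and pitch near 0) and facing desired_yaw scores
    best; the penalty grows with the deviation."""
    roll, pitch, yaw = rpy_array
    return -weight * (abs(roll) + abs(pitch) + abs(yaw - desired_yaw))
```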


/lowerleg_contactsensor_state/states[0]/contact_positions ==> point of contact in the world /lowerleg_contactsensor_state/states[0]/contact_normals ==> contact normal in the world

/lowerleg_contactsensor_state/states[0]/wrenches[] ==> array of all the contact forces /lowerleg_contactsensor_state/states[0]/total_wrench ==> the total force

Both are relative to the contact link referred to in the sensor. :return:


You will see that, because the X axis is the one pointing downwards, it has the highest value when touching the floor. For a robot with a total mass of 0.55 kg and a gravity of 9.81 m/s**2, Weight = 0.55 * 9.81 = 5.39 N. Falling from around 5 centimetres (a negligible height), we register peaks around Fx = 7.08 N. :return:
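The static-weight arithmetic above can be checked in a couple of lines (the quoted 5.39 N is the product 5.3955 N truncated):

```python
mass = 0.55        # kg, total robot mass
g = 9.81           # m/s**2, gravitational acceleration
weight = mass * g  # static weight in newtons, ~5.40 N
```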


Calculates the distance from the current position to the desired point. :param start_point: :return:

get_distance_from_point(pstart, p_end)[source]

Given a Vector3 object, gets the distance from the current position. :param p_end: :return:
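A sketch of the Euclidean-distance helper, assuming the points expose x, y, z attributes like a geometry_msgs Vector3 (the `Point` namedtuple here is just a stand-in for testing):

```python
import math
from collections import namedtuple

Point = namedtuple("Point", "x y z")  # stand-in for geometry_msgs Vector3

def get_distance_from_point(pstart, p_end):
    """Euclidean distance between two 3-D points given as objects
    with x, y and z attributes."""
    return math.sqrt((p_end.x - pstart.x) ** 2 +
                     (p_end.y - pstart.y) ** 2 +
                     (p_end.z - pstart.z) ** 2)

get_distance_from_point(Point(0, 0, 0), Point(3, 4, 0))  # -> 5.0
```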

is_in_desired_position(current_position, epsilon=0.05)[source]

Returns True if the current position is close to the desired position.
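A minimal sketch of the epsilon check, assuming positions are coordinate tuples. The per-axis box test is an illustrative assumption; the package may compare the full Euclidean distance instead:

```python
def is_in_desired_position(current_position, desired_position, epsilon=0.05):
    """True when every coordinate of the current position lies
    within epsilon of the corresponding desired coordinate."""
    return all(abs(c - d) <= epsilon
               for c, d in zip(current_position, desired_position))
```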


Checks whether the monoped is inside the defined workspace.


Based on the given orientation RPY, states whether the monoped has flipped.


Detects whether there is something too close to the front of the monoped.

Module contents