Robo Quest

Exploring modern robotics theory and practice

Isaac Sim provides a set of APIs tailored for robotics applications. They abstract away the complexities of dealing with Pixar's USD Python API.

Core API

BaseSample – A boilerplate class that sets up the basics for a robotic extension application.

An extension application, in Isaac Sim parlance, is a provider of UI elements.

BaseSample performs:

  • Loading the world and its assets.
  • Clearing the world when a new stage is added.
  • Resetting the world to its default state.
  • Handling hot reloading.

BaseSample holds a World instance. World lets you interact with the simulator in an easy and modular way. It handles time-related events (including physics steps), resetting the scene, adding tasks, and so on. World is a singleton, and a reference to it can be obtained anywhere as World.instance() (after BaseSample has initialized it in its constructor).

A World contains an instance of a Scene. Scene manages simulation assets in the USD Stage. Scene provides an API to manipulate different USD assets in the stage.

When creating a BaseSample, one can use setup_scene to initialize the scene for first-time use.

NOTE: setup_scene is not called during hot reloading but only during loading the world from an EMPTY stage.

Objects can be added to the scene using scene.add().

setup_post_load is called when the load button is pressed, after setup_scene has run and one physics time step has executed.

When the play button is pressed to start the simulation, we can have a registered callback triggered using World's add_physics_callback, which takes the callback function as a parameter. This function is called just before each physics time step is executed.
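Putting the pieces above together, a minimal BaseSample subclass might look like the following sketch. It is loosely modeled on Isaac Sim's Hello World example; the class paths and argument names come from the omni.isaac API, may differ between Isaac Sim versions, and the code only runs inside Isaac Sim itself.

```python
# Sketch of a minimal extension application, assuming the Isaac Sim
# (omni.isaac) Python API; runs only inside Isaac Sim.
import numpy as np
from omni.isaac.examples.base_sample import BaseSample
from omni.isaac.core.objects import DynamicCuboid

class HelloWorld(BaseSample):
    def setup_scene(self):
        # Called only when loading from an EMPTY stage, not on hot reload.
        world = self.get_world()
        world.scene.add_default_ground_plane()
        world.scene.add(DynamicCuboid(prim_path="/World/cube",
                                      name="fancy_cube",
                                      position=np.array([0.0, 0.0, 1.0]),
                                      scale=np.array([0.5, 0.5, 0.5])))

    async def setup_post_load(self):
        # Called after the load button: setup_scene has run, plus one physics step.
        self._world = self.get_world()
        self._cube = self._world.scene.get_object("fancy_cube")
        # The registered callback fires just before each physics time step.
        self._world.add_physics_callback("sim_step", callback_fn=self._on_physics_step)

    def _on_physics_step(self, step_size):
        position, orientation = self._cube.get_world_pose()
        print(position)
```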

Standalone application

Instead of an extension application, one can also create a standalone application, where the application is started from Python and one controls when to step physics and rendering.

After creating a SimulationApp one can add the world, scene and objects in the scene as usual. Then using world.step(render=True) one can execute a single physics time step.
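A standalone script, sketched under the same assumptions (the omni.isaac API, launched with Isaac Sim's bundled Python interpreter rather than a plain one), might look like this:

```python
# Sketch of a standalone application; SimulationApp must be created
# before any other omni.isaac imports.
from omni.isaac.kit import SimulationApp
simulation_app = SimulationApp({"headless": True})

from omni.isaac.core import World

world = World()
world.scene.add_default_ground_plane()
world.reset()  # initialize physics before stepping

for _ in range(100):
    world.step(render=True)  # execute one physics time step (and render)

simulation_app.close()
```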

Adding a robot to the stage

Isaac Sim provides a number of helper classes which make it easier to deal with USD assets. After adding the robot to the stage as a USD reference using add_reference_to_stage, one can wrap the USD asset in a Robot and add the Robot to the scene.

Robot is an Articulation, which according to the comments in the source code "provides high level functions to deal with an articulation prim and its attributes/properties."

Every Articulation has an ArticulationController, which according to its comments is a "PD Controller of all degrees of freedom of an articulation, can apply position targets, velocity targets and efforts."

During a physics step (which we can hook into using callbacks, as stated above) we can apply an ArticulationAction to the ArticulationController to specify joint positions, velocities, or efforts.

Instead of using the Robot class, one can also use a more specialized version called WheeledRobot, which derives from the Robot class. This class provides a more specialized apply_wheel_action in place of the more general apply_action of the Robot class.

Adding a custom controller

Controllers inherit from BaseController and must implement a forward method, which receives a command (of whatever shape the controller defines) and returns an ArticulationAction.

Once a custom controller has been created, it can be assigned as the controller of the robot. The robot can then apply_action on the ArticulationAction returned by the forward method.

Before implementing custom controllers it is always wise to check the ready-made controllers, such as DifferentialController or WheelBasePoseController.
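As an illustration of what such a controller computes, here is the differential-drive kinematics at the heart of what a custom forward method for a wheeled robot would do. This is a plain-Python sketch: the wheel radius and wheel base values are hypothetical, and a plain tuple stands in for the ArticulationAction (which would require the Isaac Sim runtime).

```python
# Differential-drive kinematics: the core computation of a custom
# BaseController.forward for a two-wheeled robot. Wheel geometry values
# are hypothetical defaults; a tuple stands in for ArticulationAction.
def diff_drive_forward(linear_v, angular_w, wheel_radius=0.03, wheel_base=0.1125):
    """Map a (linear, angular) velocity command to (left, right) wheel angular velocities."""
    left = (linear_v - angular_w * wheel_base / 2.0) / wheel_radius
    right = (linear_v + angular_w * wheel_base / 2.0) / wheel_radius
    return left, right
```

Driving straight gives equal wheel speeds; a pure rotation command gives equal and opposite ones.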

Manipulator robots have their own domain-specific controllers, such as PickAndPlaceController. The forward method of this controller, for example, takes a picking position, a placing position, and the current joint positions, calculates a step forward, and returns an ArticulationAction which the robot can apply.


For creating complex scenes one can use Tasks, which provide a way to modularize scene creation. Custom tasks derive from BaseTask and implement get_observations, which returns the current observations from the objects needed for the behavioral layer. A task can also implement setup_scene, pre_step, post_reset, etc. as usual.

A Task can be added to the World, which will then call the task's setup_scene when initializing the scene.

Tasks help encapsulate the logic and hence make it easier to write larger, more complex scenes. Just as with custom controllers, Isaac Sim provides common tasks which should be used if relevant, for example the PickPlace task for Franka.

Tasks can have other tasks in them. In this case the subtask's setup_scene can be called by the parent task.

Personally, I feel Task is not aptly named. It should be called SubScene perhaps.

Multi robot scene

It is just a matter of maintaining a state machine and orchestrating the tasks as needed in the physics time step. Each robot has its own controller and receives an ArticulationAction as needed to make the whole scene and the handoff work.
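A minimal sketch of such a state machine, with hypothetical state and event names; in Isaac Sim each state would map to an ArticulationAction per robot, which is omitted here:

```python
# A minimal state machine of the kind used to orchestrate a multi-robot
# handoff inside the physics callback. States and events are hypothetical.
class HandoffStateMachine:
    def __init__(self):
        self.state = "PICK"

    def step(self, event):
        """Advance on an event such as 'pick_done'; called once per physics step."""
        transitions = {
            ("PICK", "pick_done"): "MOVE",
            ("MOVE", "at_handoff"): "HANDOFF",
            ("HANDOFF", "handoff_done"): "PLACE",
            ("PLACE", "place_done"): "DONE",
        }
        # Unknown events leave the state unchanged.
        self.state = transitions.get((self.state, event), self.state)
        return self.state
```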

Till now we have assumed we can observe the full state, that is, we have full state feedback $y = C \cdot x$ where $y$ is the output, $x$ is the state, and $C = I$. Assuming full state feedback, there is an optimal feedback matrix $K$ such that $u = -K \cdot x$ is the optimal control, as derived using the Linear Quadratic Regulator (LQR). In Matlab, K = lqr(A,B,Q,R), where Q is the penalty for errors in tracking and R is the penalty on control effort.

In a real system we don't have access to the full state. Sometimes we don't have the sensors, and sometimes measuring the actual state is too challenging. If we assume that $x \in \mathbb{R}^n, u \in \mathbb{R}^q, y \in \mathbb{R}^p$, then typically $p \ll n$.


The observability test obsv(A,C) answers the question of whether it is possible to estimate any state $x$ given a series of measurements $y(t)$. If the system is observable then we can design a full state estimator which uses the control signal $u$ and the measured output $y$ to estimate $\tilde x$. The best estimated state $\tilde x$ is then fed into the LQR controller to generate the appropriate optimal control.

The observability matrix is defined as

\begin{equation} \mathcal{O} = [C \hspace{0.1cm} CA \hspace{0.1cm} CA^2 \ldots CA^{n-1}]^T \end{equation}

which looks (and behaves) like the controllability matrix $\mathcal{C}$. The system is observable if $rank(obsv(A,C)) == n$. Also, if we take $[U, \Sigma, V] = svd(\mathcal{O})$, then the columns of $V$ represent the directions in state space where the system is most observable.
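The rank test can be sketched in a few lines of numpy, mirroring Matlab's obsv. The two-state example system here is hypothetical, chosen only to exercise the test:

```python
import numpy as np

def obsv(A, C):
    """Observability matrix: C, CA, ..., CA^{n-1} stacked row-wise."""
    n = A.shape[0]
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])

# Hypothetical 2-state system, measured through its first state only.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
C = np.array([[1.0, 0.0]])

O = obsv(A, C)
observable = np.linalg.matrix_rank(O) == A.shape[0]

# SVD of O: right singular vectors (columns of V), ordered by singular
# value, indicate the state directions that are most observable.
U, S, Vt = np.linalg.svd(O)
```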

Full State Estimation and Kalman Filter

The estimator itself is a dynamical system which can be represented as

\begin{align} \dot{\tilde x} &= A \tilde x + B u + \overbrace{K_f(y - \tilde y)}^\text{update based on new data 'y'} \nonumber \\ \tilde y &= C \tilde x \end{align}

The equations can be combined into one (by plugging in $\tilde y = C \tilde x$) to get

\begin{equation} \dot{\tilde x} = (A - K_f C) \cdot \tilde x + [B \hspace{0.1cm} K_f] \cdot \left[ \begin{array}{c}u \\ y \end{array} \right] \end{equation}

If $(A, C)$ is observable then we can place the eigenvalues of $(A - K_f C)$ arbitrarily using a suitable gain matrix $K_f$; a particular choice of $K_f$ will be optimal.
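Pole placement for the estimator can be sketched with scipy's place_poles, using duality: the eigenvalues of $(A - K_f C)$ equal those of $(A^T - C^T K_f^T)$. The system and the desired pole locations below are arbitrary illustrative choices, not the Kalman-optimal ones:

```python
import numpy as np
from scipy.signal import place_poles

# Hypothetical observable system.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
C = np.array([[1.0, 0.0]])

# By duality, place the poles of (A^T, C^T) to get the observer gain K_f.
desired = [-5.0, -6.0]
Kf = place_poles(A.T, C.T, desired).gain_matrix.T

# The estimator error dynamics (A - K_f C) now have the chosen eigenvalues.
eigs = np.linalg.eigvals(A - Kf @ C)
```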

Kalman Filter

The Kalman filter is the estimation analog of LQR, that is, an optimal estimator. A real system has noise and uncertainty. If we denote the disturbance by $w_d$ and the sensor measurement noise by $w_n$, then we can express the dynamical system as

\begin{align} \dot x &= A \cdot x + B \cdot u + w_d \nonumber \\ y &= C \cdot x + w_n \end{align}

where $w_d$ and $w_n$ are both white Gaussian noise with covariances $V_d$ and $V_n$ respectively. If we believe the disturbance dominates the measurement noise then we should trust the measurements more, and the other way around. In some sense the ratio of the covariances is a measure of how we should weight the model against the measurements.
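The trade-off is easiest to see in a scalar case. Below is a sketch of a one-dimensional Kalman filter estimating a constant state; the $V_d$ and $V_n$ values are hypothetical. The gain $K_f = P/(P + V_n)$ makes the weighting explicit: large $V_d$ pushes $K_f$ toward 1 (trust the measurements), large $V_n$ pushes it toward 0 (trust the model).

```python
import numpy as np

rng = np.random.default_rng(1)

# Scalar Kalman filter for a constant state x, measured with noise of
# covariance Vn; Vd models the (small) process disturbance. Values are
# hypothetical, chosen for illustration.
Vd, Vn = 1e-5, 0.5**2
x_true = 1.0
x_est, P = 0.0, 1.0  # initial estimate and its covariance

for _ in range(200):
    y = x_true + rng.normal(0.0, 0.5)   # noisy measurement
    P = P + Vd                          # predict: uncertainty grows by Vd
    Kf = P / (P + Vn)                   # Kalman gain
    x_est = x_est + Kf * (y - x_est)    # update on the innovation (y - y_est)
    P = (1 - Kf) * P                    # covariance shrinks after the update
```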

Linear Quadratic Gaussian (LQG)

LQG combines LQR and LQE to generate an optimal controller for a linear system that is controllable and observable. It is interesting, and not immediately obvious, that an optimal estimator and an optimal controller combined are together optimal.

Sometimes LQG does not work well and has robustness issues; this leads to robust control, which is studied later. We have (as derived previously)

\begin{align} \epsilon &= x - \tilde x \nonumber \\ \dot x &= Ax - B K_r \tilde x + w_d \nonumber \\ \dot \epsilon &= (A - K_f C)\epsilon + w_d - K_f w_n \end{align}

Then combining the equations we get

\begin{equation} \frac{d}{dt} \left[ \begin{array}{c} x \\ \epsilon \end{array} \right] = \left[ \begin{array} {cc} (A - BK_r) & BK_r \\ 0 & (A - K_f C) \end{array} \right] \left[ \begin{array} {c} x \\ \epsilon \end{array} \right] + \left[ \begin{array} {cc} I & 0 \\ I & -K_f \end{array} \right] \left[ \begin{array} {c} w_d \\ w_n \end{array} \right] \end{equation}

It clearly combines the dynamics of LQR and LQE. This is an example of the Separation Principle.
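Because the combined matrix is block triangular, its eigenvalues are just the union of the controller eigenvalues $(A - BK_r)$ and the estimator eigenvalues $(A - K_f C)$, so the two designs do not interfere. A small numerical sketch (hypothetical system and pole locations) makes this concrete:

```python
import numpy as np
from scipy.signal import place_poles

# Hypothetical controllable and observable 2-state system.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])

# Controller and estimator gains designed independently.
Kr = place_poles(A, B, [-4.0, -5.0]).gain_matrix
Kf = place_poles(A.T, C.T, [-8.0, -9.0]).gain_matrix.T

# Combined (x, eps) dynamics: block upper-triangular, so the spectrum
# is the union of the controller and estimator eigenvalues.
top = np.hstack([A - B @ Kr, B @ Kr])
bottom = np.hstack([np.zeros((2, 2)), A - Kf @ C])
combined = np.vstack([top, bottom])

eigs = np.sort(np.linalg.eigvals(combined).real)
```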


For a linear dynamical system $\dot x = Ax$, the system is stable or unstable depending on the eigenvalues of the matrix $A$. If we now add a control variable to the system, the dynamics can be written as \ref{eq:dynsys}

\begin{equation} \label{eq:dynsys} \dot x = Ax + Bu \end{equation} where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^q$ and $B \in \mathbb{R}^{n \times q}$

We can now shape the closed-loop eigenvalues using the control input, even for systems which are otherwise inherently unstable (for example, an inverted pendulum).

If we now assume a fully observable system, where the output $y = Cx$ with $C = I$ and $y \in \mathbb{R}^p$, then the optimal control law (if the system is controllable) is given as \begin{equation} \label{eq:optcontrol} u = -Kx \end{equation}

If we substitute \ref{eq:optcontrol} in \ref{eq:dynsys} then the resulting dynamics are given as

\begin{align} \dot x &= Ax - BKx \nonumber \\ \dot x &= (A - BK)x \end{align}

The eigenvalues of $(A-BK)$ govern the stability of the system and can be placed arbitrarily if the system is controllable. In a physical system $A$ (the dynamics of the system) and $B$ (the control surface) are fixed. Hence, depending on $A$ and $B$, the system is controllable (or not).

A system is controllable if the matrix

\begin{equation} \mathcal{C} = [B, AB, A^2B, \ldots, A^{n-1}B] \end{equation}

has rank $n$. In Matlab it is as simple as computing ctrb(A,B). The test rank(ctrb(A,B)) == n tells us whether the system is controllable, not how controllable it is.
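The same test in numpy, mirroring Matlab's ctrb; the two-state example system is hypothetical:

```python
import numpy as np

def ctrb(A, B):
    """Controllability matrix: [B, AB, A^2 B, ..., A^{n-1} B]."""
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])

# Hypothetical single-input 2-state system.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0],
              [1.0]])

Cm = ctrb(A, B)
controllable = np.linalg.matrix_rank(Cm) == A.shape[0]
```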


Vectors in the reachability set $R_t$ are those that can be reached with a corresponding input $u(t)$. Full reachability means $R_t = \mathbb{R}^n$. A controllable system implies full reachability (and vice versa).

Degree of controllability

If we take [U, S, V] = svd(C, 'econ') then the singular vectors obtained (ordered by singular value) represent the directions in which the system is most easily controlled (this is related to the controllability Gramian).

There is also the PBH test: the system is controllable if $rank[A - \lambda I \hspace{0.2cm} B] = n$ for every eigenvalue $\lambda$ of $A$. That is, $B$ needs to have some component in each eigenvector direction for the system to be controllable. If $B$ is a random vector then with high probability $(A,B)$ is controllable.


If all unstable (and lightly damped) eigenvectors of $A$ are in the controllable subspace (spanned by the singular vectors above) then the system is stabilizable.

Inverted Pendulum on a Cart

The state of the system is given as $[x, \dot{x}, \theta, \dot{\theta}]^T$, which comprises the position of the cart, the angle of the pendulum, and their velocities. The dynamical equations can be derived using, for example, Lagrange's equations. The equations are nonlinear, so we need to linearize them around the fixed points. The natural fixed points are $[0, 0, 0, 0]^T$ and $[0, 0, \pi, 0]^T$.

The equations are nicely derived in this video – Equations of Motion for the Inverted Pendulum (2DOF) Using Lagrange's Equations

The final equations are

\begin{align} (M+m)\ddot{x} - m l \ddot{\theta} + m l \dot{\theta}^2 \sin(\theta) &= u(t) \nonumber \\ l \ddot{\theta} - \ddot{x}\cos(\theta) - g \sin(\theta) &= 0 \end{align}

If we linearize around zero, we can assume $\sin(\theta) \approx \theta$ and $\cos(\theta) \approx 1$, which simplifies the equations to

\begin{align} (M+m)\ddot{x} - m l \ddot{\theta} &= u(t) \nonumber \\ l \ddot{\theta} - \ddot{x} - g \theta &= 0 \label{eq:lininvcart} \end{align}

Note that since $\theta \approx 0$, $\dot{\theta}^2$ is vanishingly small, so we also ignore that term.

We can now write the state space form of the equations \ref{eq:lininvcart}, which gives us the $A$ and $B$ matrices. If we check eig(A) we can see that one eigenvalue is unstable, so given the control input we should check whether this system is controllable: if rank(ctrb(A,B)) == 4 then the system is controllable.

If the system is controllable then we find $K$ such that the eigenvalues of $(A - BK)$ are stable. We can use K = place(A,B,eigs), where eigs is a suitable vector of eigenvalues (in the left half plane). We can choose any eigs, but the choice is eventually governed by the nonlinear dynamics (which we ignored): far from the fixed point the linearization is not accurate, or we may not be able to generate the control effort needed to realize those eigenvalues. There is, however, an optimal controller which gives the optimal gain $K$, which is the best solution.
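Solving the linearized equations for $\ddot{x}$ and $\ddot{\theta}$ gives $\ddot{x} = (mg\theta + u)/M$ and $\ddot{\theta} = ((M+m)g\theta + u)/(Ml)$, which yields the state space matrices below. A numpy/scipy sketch, with hypothetical cart/pendulum parameters and scipy's place_poles standing in for Matlab's place:

```python
import numpy as np
from scipy.signal import place_poles

# Hypothetical parameters: cart mass M, pendulum mass m, length l.
M, m, l, g = 1.0, 0.2, 0.5, 9.81

# State [x, x_dot, theta, theta_dot], linearized about the upright fixed point.
A = np.array([[0, 1, 0, 0],
              [0, 0, m * g / M, 0],
              [0, 0, 0, 1],
              [0, 0, (M + m) * g / (M * l), 0]], dtype=float)
B = np.array([[0], [1 / M], [0], [1 / (M * l)]])

# One eigenvalue of A has positive real part: the upright pendulum is unstable.
unstable = np.any(np.linalg.eigvals(A).real > 0)

# rank(ctrb(A,B)) == 4, so the closed-loop eigenvalues can be placed freely.
ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(4)])
controllable = np.linalg.matrix_rank(ctrb) == 4

K = place_poles(A, B, [-1.0, -1.5, -2.0, -2.5]).gain_matrix
closed_eigs = np.linalg.eigvals(A - B @ K)
```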

Linear Quadratic Regulator (LQR) Control

Identify a cost function which captures the performance we want from the controller. The cost function can be defined as

\begin{equation} J = \int_{0}^{\infty} (x^T Q x + u^T R u) dt \end{equation}

where $Q$ is a diagonal matrix which represents the penalty for not tracking the set point and $R$ represents the effort spent on control, which we want to minimize. In Matlab one can do K = lqr(A,B,Q,R), which gives the optimal gain matrix for the given system.
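The same gain can be sketched in scipy, which has no lqr function but does expose the continuous algebraic Riccati equation solver; the optimal gain is then $K = R^{-1} B^T P$. The double-integrator system and the $Q$, $R$ weights below are hypothetical:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical double integrator.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.diag([10.0, 1.0])  # penalty on state tracking error
R = np.array([[1.0]])     # penalty on control effort

# LQR gain: K = R^{-1} B^T P, with P solving the Riccati equation.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

closed_eigs = np.linalg.eigvals(A - B @ K)  # stable by construction
```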

Solving the minimization is of order $O(n^3)$, which makes it intractable for high-dimensional state spaces.

In the following post we study modern optimal control theory: what is easy and what is challenging.

We already have dynamical systems represented as systems of Ordinary Differential Equations (ODEs) which model physical systems. This approach is already successful in making predictions about natural systems.

In control theory, we want to manipulate and control the systems studied above in a desired manner.

Passive control

No energy is expended, and when it works it is the best option. Often it is not enough, though, and we need to do more for actual control.

Active Control

Energy is actively pumped in to control the system and keep it stable.

Open loop control

We have a known system, which is also called a plant. A system has inputs ($u$) and outputs ($y$). Open loop control understands the plant well enough that we can give the exact control input to get the required output, for example a sinusoidal input for balancing a vertical pole on a finger. It always needs energy.



Autonomous robots need to move. To move, they need to know where they are and where they are going. This is the job of localization. That is, given a map (probably constructed by humans), can I (the robot) determine where I am? If I know where I am and I know where to go (a destination on the map), then I can plan a path to get there. Path planning is a topic in itself; in this post I will write up my notes on localization, collected from different sources (duly noted), along with experiments in software and hardware to cement my understanding of the topic.

The challenge of localization


If perception were perfect and the map accurate, the problem of localization would be trivial. For example, if GPS were fully accurate to the resolution demanded by the robot (say 1 mm for table robots and perhaps 10 cm for an AGV) and we had a detailed map of the environment, we would know where we are and where to go. However, sensors are not accurate and there is inherent noise in the sensor readings. That is, $x = h + \mathcal{N}(\mu, \sigma)$, where $h$ is the actual observation and $\mathcal{N}(\mu, \sigma)$ is noise, characterized here as Gaussian white noise. Sometimes, however, the noise is not white, and it is then called colored noise.

"In particular, if each sample has a normal distribution with zero mean, the signal is said to be additive white Gaussian noise." (Wikipedia)

"With a white noise audio signal, the range of frequencies between 40 Hz and 60 Hz contains the same amount of sound power as the range between 400 Hz and 420 Hz, since both intervals are 20 Hz wide. The frequency spectrum of pink noise is linear in logarithmic scale; it has equal power in bands that are proportionally wide. This means that pink noise would have equal power in the frequency range from 40 to 60 Hz as in the band from 4000 to 6000 Hz." (Wikipedia)

In addition to sensor noise, actuators actuate imperfectly, which is another form of noise. Wheels slip and friction coefficients are unknown. Odometry calculations using dead reckoning are known to diverge from the true state over time, no matter how accurate the sensors are.
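The divergence of dead reckoning can be seen in a small Monte Carlo sketch: integrating a noisy velocity turns the position error into a random walk whose spread grows like $\sqrt{t}$. The noise level and step counts below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1D dead reckoning: integrate noisy velocity measurements over many
# trials. Each step adds dt * N(0, sigma_v) to the position error, so
# the error spread grows like sqrt(number of steps).
dt, steps, trials = 0.1, 1000, 2000
sigma_v = 0.05  # hypothetical odometry noise standard deviation

err_increments = dt * rng.normal(0.0, sigma_v, size=(trials, steps))
err = np.cumsum(err_increments, axis=1)  # position error over time, per trial

std_early = err[:, 99].std()   # error spread after 100 steps
std_late = err[:, 999].std()   # error spread after 1000 steps
```

The ratio std_late / std_early comes out close to $\sqrt{10}$, the square root of the ratio of elapsed steps.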


Finally, at the end of July, my Aalto account was activated (restored, actually, since I already had an account from registering for some public online courses). Excitedly, I jumped to the course planner service, which is called SISU.

Fun fact: SISU is an actual word in Finnish which loosely translates to tenacity.

Unfortunately, around the same time, a migration from the previous course planner system to SISU was in progress, so SISU was not available until 9th August.

That was a hard 2 weeks of waiting!

As soon as SISU became available I made my study plan. I want to raise my understanding of current robotics research to an advanced level, so I will focus all my time on campus with that goal in mind.

My Study Plan

The master's program in Mechanical Engineering at Aalto requires completing 120 credits. 30 of those credits are reserved for the graduate thesis.

Following is the initial plan that I have, broken down according to the different sections in SISU.


I am finally going to graduate school.

When I finished my bachelor's of engineering I was certain I would never ever go on to study a master's in engineering. Four years were enough. Even though I had a good time at college, it was not the studies that were the defining part; it was meeting people and the experience of living on my own. In fact, my friendships from that period are some of my closest friendships even now.

Instead of engineering, I wanted to get into management back then. I tried halfheartedly to get into IIM, but half efforts lead to no results, so that was the end of it.

That's 16 years now – 16 years spent in software, even though I studied electronics engineering.

And now that I am finally going back to study engineering why would I go and do what I liked the least – Mechanical Engineering?

