Series:
- Autonomous Mobile Robot #1: Data collection to a trained model
- Autonomous Mobile Robot #2: Inference as a ROS service
- Autonomous Mobile Robot #3: Pairing with a PS3 Controller for teleop
- Autonomous Mobile Robot #4: Integrating with GCP
Note
These notes document progress on the AMR (autonomous mobile robot) project found in this repo. At the time of writing, the base robot is assumed to be a Donkey, the RC-car-based autonomous race car popularized by DIY Robocars in the US. Any advice on system design, coding, or anything else is much appreciated.
The objective is to create an autonomous mobile robot that can function in a diverse range of environments.
The assumed setup is a Raspberry Pi connected to a disassembled RC car as specified in the Donkey documentation. [1] In contrast to the Donkey software, we will primarily be using ROS (Kinetic) running on Ubuntu MATE 16.04.1.
The following commands will bring up the environment to allow for teleoperation of the mobile robot, and initiate the data collection and storage process.
$ git clone https://github.com/surfertas/amr_core.git
$ cd amr_core/amr_worker/amr_bringup/launch/
$ roslaunch amr_teleop_bringup.launch
Make sure that the external packages joy and videostream_opencv are installed and placed in the amr_worker directory.
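As a rough picture of what the teleop side does: a node maps joystick input to drive commands. Below is a minimal sketch, not the repo's actual node; the amr_msgs import path, the topic names, and the axis indices are all assumptions, and I assume Command2D carries a header (which the time sync described below requires) plus throttle in x and steering in y.

import rospy
from sensor_msgs.msg import Joy
from amr_msgs.msg import Command2D  # assumed package for the custom message

class TeleopNode(object):
    """ Maps sensor_msgs/Joy input to Command2D drive commands. """
    def __init__(self):
        self._pub = rospy.Publisher('/amr_command', Command2D, queue_size=1)
        rospy.Subscriber('/joy', Joy, self._joy_callback)

    def _joy_callback(self, joy):
        cmd = Command2D()
        cmd.header.stamp = rospy.Time.now()  # stamp needed for approximate-time sync
        cmd.x = joy.axes[1]  # throttle (axis index is a placeholder)
        cmd.y = joy.axes[0]  # steering (axis index is a placeholder)
        self._pub.publish(cmd)

if __name__ == '__main__':
    rospy.init_node('amr_teleop')
    TeleopNode()
    rospy.spin()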
Thanks to the following projects, as some of the code in this repo has been influenced by them. (Licenses have been respected to the best of my knowledge.)
Collecting and storing the data
The project uses ROS middleware. In order to train a deep learning model, we first need to collect data. ROS uses a pub/sub protocol, which makes for an environment quite conducive to data collection.
Sensor data is published over ROS topics, and any node interested in the data can simply subscribe to the respective topic and obtain the data as it is published. (Not ideal from a safety perspective, though this is addressed in ROS 2.)
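As a concrete example, a minimal subscriber to a compressed camera topic might look like the sketch below; the topic name is an assumption, so check what the camera node actually publishes.

import rospy
from sensor_msgs.msg import CompressedImage

def on_image(msg):
    # Each frame arrives as a sensor_msgs/CompressedImage message.
    rospy.loginfo("Received frame with %d bytes", len(msg.data))

rospy.init_node('image_listener')
# '/camera/image/compressed' is a hypothetical topic name.
rospy.Subscriber('/camera/image/compressed', CompressedImage, on_image)
rospy.spin()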
For this project, the data collection process is handled by the amr_data_processor package. The script amr_data_storage_node.py has the specific implementation. The required configuration needs to be set in data_storage.yaml, found in the amr_data_processor/config directory, before use.
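The attribute names used by the storage node (_data_dir, _capacity, _save_frequency) suggest roughly what the configuration holds. The parameter keys and defaults below are assumptions, sketching how the node might read data_storage.yaml off the parameter server after the launch file loads it.

import rospy

rospy.init_node('amr_data_storage_node')
# Hypothetical keys; data_storage.yaml would be loaded onto the
# parameter server by the launch file via <rosparam>.
data_dir = rospy.get_param('~data_dir', '/media/ssd/amr_data')   # where images are written
capacity = rospy.get_param('~capacity', 10000)                   # max samples to store
save_frequency = rospy.get_param('~save_frequency', 100)         # save pickle every N samples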
We throw away non-synced data and only collect synced data, as the publishing frequency for each sensor is likely to be different. We simplify the problem by only storing data that is synchronized, using the ApproximateTimeSynchronizer() [2] method, which allows for the synchronization of different topics. The snippet below shows an example usage of this method; I have included the initialization of the synchronizer as well as the associated callback function. The full source can be found here. Details of the sync algorithm can be found on the ROS wiki. [3]
def _init_subscribers(self):
    """ Set up subscribers and sync. """
    # Initialize subscribers.
    img_sub = message_filters.Subscriber(self._img_topic, CompressedImage)
    cmd_sub = message_filters.Subscriber(self._cmd_topic, Command2D)
    subs = [img_sub, cmd_sub]

    # Sync subscribers.
    self._sync = message_filters.ApproximateTimeSynchronizer(
        subs,
        queue_size=10,
        slop=0.2
    )
    self._sync.registerCallback(self._sync_sub_callback)
    rospy.loginfo("Synced subscribers initialized...")
def _sync_sub_callback(self, img, cmd):
    """ Callback for synchronized image and command subscribers.

    Args:
        img - image message of type CompressedImage
        cmd - command message of type Command2D
    """
    if len(self._img_path_array) < self._capacity:
        cv_img = cv2.imdecode(np.fromstring(img.data, np.uint8), 1)
        path = os.path.join(self._data_dir, '{}.png'.format(rospy.get_rostime()))
        cv2.imwrite(path, cv_img)
        self._img_path_array.append(path)
        self._cmd_array.append([cmd.x, cmd.y])

        if len(self._cmd_array) % self._save_frequency == 0:
            self._save_data_info()
In this case, each time a CompressedImage message published on the topic specified in self._img_topic syncs with a Command2D message published on the topic specified in self._cmd_topic, the callback function self._sync_sub_callback() is called with the synced messages passed as parameters. The callback converts the ROS image to an OpenCV matrix format and stores the data at the specified path.
Note that we are storing the data to an external SSD, as the image files can quickly consume a material amount of memory. We store the image paths, which will serve as the features fed into the learning algorithm, and the commands, which serve as the targets or labels, as arrays. The data is saved to disk periodically by calling self._save_data_info(), presented below.
def _save_data_info(self):
    """ Called periodically to save the inputs (image paths) and labels to be
        used for training models.
    """
    data = {
        "images": np.array(self._img_path_array),
        "control_commands": np.array(self._cmd_array)
    }
    # Binary mode is required for pickle on Python 3 and safe on Python 2.
    with open(os.path.join(self._data_dir, "predictions.pickle"), 'wb') as f:
        pickle.dump(data, f)
    rospy.loginfo("Predictions saved to {}...".format(self._data_dir))
Transforming data into consumable form
Once we are satisfied with the amount of data collected, we can turn to loading and formatting, transforming the raw data into something consumable by a learning model for training.
The PyTorch Dataset API is tremendously handy here, as PyTorch integrates almost seamlessly with pandas, making the reformatting of the data a breeze. The Dataset class is shown below. The repo page is found here.
We can simply load the pickle file and convert the dict to a pandas DataFrame object. Note that the Dataset API requires the definition of __getitem__ and __len__. These can be customized for the task at hand; we can easily change the number of input features or target variables used, as well as generate sequences if necessary.
import os
import pickle

import pandas as pd
from skimage import io
from torch.utils.data import Dataset


class AMRControllerDataset(Dataset):
    """
    Custom dataset to handle the amr controller.

    Input is an image taken from a monocular camera, with the controller
    mapping the image to steering and throttle commands.
    """

    def __init__(self, pickle_file, root_dir, transform=None):
        self._pickle_file = pickle_file
        self._root_dir = root_dir
        self._transform = transform
        self._frames = self._get_frames()

    def __len__(self):
        return len(self._frames)

    def __getitem__(self, idx):
        path = self._frames['images'].iloc[idx]
        # Get image name.
        img_name = path.rsplit('/', 1)[-1]
        # Create path to image.
        img_path = os.path.join(self._root_dir, img_name)
        # Get actual image.
        img = io.imread(img_path)

        if self._transform is not None:
            img = self._transform(img)

        return {
            'image': img,
            'commands': self._frames[['throttle', 'steer']].iloc[idx].values
        }

    def _get_frames(self):
        pickle_path = os.path.join(self._root_dir, self._pickle_file)
        with open(pickle_path, 'rb') as f:
            pdict = pickle.load(f)
        img_df = pd.DataFrame(pdict['images'], columns=['images'])
        controls_df = pd.DataFrame(pdict['control_commands'], columns=['throttle', 'steer'])
        df = pd.concat([img_df, controls_df], axis=1)
        return df
That's it. We can simply instantiate AMRControllerDataset() from our training implementation and the data will be ready to go.
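For instance, wrapping the dataset in a DataLoader might look like the sketch below; the paths and batch size are placeholders, not repo defaults.

from torch.utils.data import DataLoader

# Placeholder paths; root_dir should point at the directory holding the
# images and predictions.pickle produced by the storage node.
dataset = AMRControllerDataset(pickle_file='predictions.pickle',
                               root_dir='/media/ssd/amr_data')
loader = DataLoader(dataset, batch_size=32, shuffle=True)

for batch in loader:
    imgs, cmds = batch['image'], batch['commands']
    # imgs: batch of camera frames; cmds: [throttle, steer] targets.
    break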
Training a model based on the collected data
Now that we have our raw data converted to a consumable form, we can move on to the training step.
The workflow can be summarized as below and is found in the amr_models directory.
- Specify the dataset, done in data_loader.py. (Walked through above.)
- Specify data transformations, done in transforms.py. (Note: if using a model pre-trained on ImageNet, take care to apply the appropriate transformations when running inference; see the sketch after this list.)
- Specify the learning architecture, done in model.py.
- Lay out the training and validation process in train.py.
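To illustrate the transformations note above, here is a minimal sketch using torchvision; the exact pipeline in transforms.py may differ, the input size is a placeholder, and the mean/std values are the standard ImageNet statistics.

from torchvision import transforms

# Standard ImageNet statistics; needed when the backbone was pre-trained on ImageNet.
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

train_transform = transforms.Compose([
    transforms.ToPILImage(),        # the dataset yields numpy arrays from skimage
    transforms.Resize((224, 224)),  # placeholder size; match the model input
    transforms.ToTensor(),          # HWC uint8 -> CHW float in [0, 1]
    transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
])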
To split the training data into train and validation sets, I found SubsetRandomSampler() [4] extremely helpful. Training saves a model every time the metric (MSE) improves on the validation step.
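A minimal sketch of that split and checkpointing logic; the split ratio, batch size, and checkpoint path are placeholders rather than values from train.py.

import numpy as np
import torch
from torch.utils.data import DataLoader
from torch.utils.data.sampler import SubsetRandomSampler

def make_loaders(dataset, batch_size=32, val_ratio=0.2, seed=0):
    """ Split one dataset into train/val loaders via SubsetRandomSampler. """
    indices = np.random.RandomState(seed).permutation(len(dataset))
    split = int(len(dataset) * (1.0 - val_ratio))
    train_loader = DataLoader(dataset, batch_size=batch_size,
                              sampler=SubsetRandomSampler(indices[:split].tolist()))
    val_loader = DataLoader(dataset, batch_size=batch_size,
                            sampler=SubsetRandomSampler(indices[split:].tolist()))
    return train_loader, val_loader

def maybe_checkpoint(model, val_mse, best_val_mse, path='best_model.pth'):
    """ Save the model whenever validation MSE improves; returns the new best. """
    if val_mse < best_val_mse:
        torch.save(model.state_dict(), path)
        return val_mse
    return best_val_mse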
Once we are satisfied, training is complete and we have a saved model that can be loaded and used for inference.
Summary
By simply calling roslaunch amr_teleop_bringup.launch, we kick off a system that records and stores synced sensor data, which can then be processed by a training system to train a learning model. In the next update we will walk through how to use the trained model.
References:
1. http://docs.donkeycar.com/guide/build_hardware/#parts-needed
2. https://docs.ros.org/api/message_filters/html/python/
3. http://wiki.ros.org/message_filters/ApproximateTime
4. http://pytorch.org/docs/master/data.html