Series:
- Autonomous Mobile Robot #1: Data collection to a trained model
- Autonomous Mobile Robot #2: Inference as a ROS service
- Autonomous Mobile Robot #3: Pairing with a PS3 Controller for teleop
- Autonomous Mobile Robot #4: Using GCP Storage
I have fallen back to PyTorch (once again). The training environment is still integrated with Google Cloud Platform (GCP).
Getting the collected samples into the training environment is still a largely manual process, as the walkthrough below shows. I plan to automate as much of it as possible in later iterations.
A rough outline of the system in place
- The raw images from a monocular camera and the associated controls are stored on an external SSD connected to the Raspi via a USB cable. At a user-specified frequency, a pickle file containing the samples (path to image, controls) is saved to the SSD.
- Once the data collection process is complete, the data needs to be transferred to GCP storage. Note that a GCP bucket is required. The following command transfers the files from the SSD to the specified bucket.
$ gsutil -m cp -r [image_dir] gs://[bucket_name]
It is worth highlighting that the -m option speeds up the transfer considerably. gsutil help describes the flag as follows:
Causes supported operations (acl ch, acl set, cp, mv, rm, rsync, and setmeta) to run in parallel. This can significantly improve performance if you are performing operations on a large number of files over a reasonably fast network connection.
- Next we need to move the training data from GCP storage to the VM instance used for GPU training. We can use:
$ gsutil -m cp -r gs://[bucket_name] [data_dir_on_vm_instance]
- Once the data has been transferred, we can move to the training environment and generate the gcs.csv file. Go to /data/gcs and run the following command (>> appends, so remove any existing gcs.csv first when regenerating the listing).
$ ls [data_dir_on_vm_instance] >> gcs.csv
Once that completes, run the script generate_csv_with_url.py to generate path_to_data.csv.
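The first step of the outline above, saving a pickle file of (path to image, controls) samples at a user-specified frequency, could look roughly like the sketch below. The collection code itself is not shown in this post, so the class name SampleRecorder, the samples_NNNNN.pkl file naming, and the choice to express the frequency as a number of samples (flush_every) are all assumptions made for illustration.

```python
import pickle
import tempfile
from pathlib import Path


class SampleRecorder:
    """Buffer (image_path, controls) samples and pickle them periodically."""

    def __init__(self, out_dir, flush_every=100):
        # flush_every stands in for the "user-specified frequency",
        # expressed here as a number of samples (an assumption).
        self.out_dir = Path(out_dir)
        self.out_dir.mkdir(parents=True, exist_ok=True)
        self.flush_every = flush_every
        self.buffer = []
        self.files_written = 0

    def add(self, image_path, controls):
        """Record one sample; write a pickle file once the buffer is full."""
        self.buffer.append((str(image_path), tuple(controls)))
        if len(self.buffer) >= self.flush_every:
            self.flush()

    def flush(self):
        """Write buffered samples to a new pickle file and clear the buffer."""
        if not self.buffer:
            return None
        out_path = self.out_dir / f"samples_{self.files_written:05d}.pkl"
        with open(out_path, "wb") as f:
            pickle.dump(self.buffer, f)
        self.buffer = []
        self.files_written += 1
        return out_path


if __name__ == "__main__":
    # A temporary directory stands in for the SSD mount point.
    with tempfile.TemporaryDirectory() as tmp:
        recorder = SampleRecorder(tmp, flush_every=2)
        recorder.add("img_0001.jpg", (0.10, 0.50))
        recorder.add("img_0002.jpg", (0.00, 0.55))
        recorder.add("img_0003.jpg", (-0.20, 0.50))
        recorder.flush()  # write the leftover sample at shutdown
        print(sorted(p.name for p in Path(tmp).glob("*.pkl")))
```

Writing many small pickle files rather than one large file means a crash or power loss on the robot only costs the samples still in the buffer. Calling flush() at shutdown picks up whatever is left.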
Start the training
- Set the training configuration in the config/default.py file. Run python train.py and the training process should kick off.
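How train.py reads the data is not shown in this post. As a rough sketch only, the pickled (path to image, controls) samples described earlier could be exposed to PyTorch through a small map-style dataset. The class name PickledSamples, the *.pkl file pattern, and the load_image hook are assumptions for illustration. The class deliberately avoids importing torch so that it runs anywhere, since a map-style dataset only needs __len__ and __getitem__.

```python
import pickle
import tempfile
from pathlib import Path


class PickledSamples:
    """Map-style dataset over pickled (image_path, controls) samples."""

    def __init__(self, pickle_dir, load_image=None):
        # load_image is a hypothetical hook, e.g. a function that opens the
        # image and converts it to a tensor. Without it, the path is returned.
        self.load_image = load_image
        self.samples = []
        for pkl in sorted(Path(pickle_dir).glob("*.pkl")):
            with open(pkl, "rb") as f:
                self.samples.extend(pickle.load(f))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        image_path, controls = self.samples[index]
        image = self.load_image(image_path) if self.load_image else image_path
        return image, controls


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Write one small pickle file so the example is self-contained.
        demo = [("img_0001.jpg", (0.10, 0.50)), ("img_0002.jpg", (0.00, 0.55))]
        with open(Path(tmp) / "samples_00000.pkl", "wb") as f:
            pickle.dump(demo, f)

        dataset = PickledSamples(tmp)
        print(len(dataset), dataset[0])
```

In an actual training script one would more likely subclass torch.utils.data.Dataset and hand the object to torch.utils.data.DataLoader, with the batch size and similar settings taken from the configuration file.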