# We will run `docker run -it --rm -v ./dataset_dir:/dataset YOUR_IMAGE` to start the download container.
# The provider's `dataset_dir` is mounted at `/dataset` inside the container,
# so save your dataset under `/dataset` so that our training container can access it.

FROM ubuntu:22.04

# This is our example script.
# First, create the `/dataset` directory, which serves as the mount point for the provider's host directory.
# Then, install `wget` with apt-get so that we can download the dataset from the target URL.
RUN mkdir /dataset && apt-get update && apt-get install -y wget

# When the container starts, download the dataset archive from the target URL into /dataset,
# then extract it so that the training container can read the files directly.
WORKDIR /dataset
CMD pwd && wget https://www.cs.toronto.edu/\~kriz/cifar-10-python.tar.gz && tar zxvf cifar-10-python.tar.gz