Docker Deploy (Deprecated)
These instructions are deprecated. Use the Binder portfolio deployment instructions instead. These are archived here for students interested in exploring Docker and learning more about container technology.
Individual Assignment 3.1: Polish and Publish Big Ideas #1 & #2
![](https://images-na.ssl-images-amazon.com/images/I/71Sy7ibwXvL._AC_UL320_SR202,320_.jpg)
Figure 46: Don’t forget the importance of plating!
This week you are going to compile your source material into a publicly facing website, which you will publish using the free github.io GitHub pages service. Follow the command line tutorial below and read the documentation that follows for more information.
Walkthroughs
I’ll upload a screencast of using RStudio Server on Saturday, but the command line tutorial will help you get most of the way there. Note that it isn’t a movie; you can actually cut and paste text from the tutorial into your local shell.
Note: To change the playback speed, adjust the "speed=3" argument in this URL.
Figure 47: Shellcast!
Or if you prefer to use the RStudio Server GUI, follow the screencast below. Either method, command line or RStudio GUI, will allow you to complete the assignment successfully.
Figure 48: Screencast!
Step 1: Expect errors
We will need to use a bit of command line programming to get this done, and the first (second, third, …) time you run the build commands something will go wrong. This is, of course, why you learned to make an issue on GitHub.com! The best way to troubleshoot is to try for 15 minutes, and if you can’t resolve the bug quickly, make an issue describing the problem, with a copy of the error message, on your own github.com/w201rdada/portfolio-username repository page. For bookdown build errors using the _build.sh script below, first read the appropriate section of the documentation relevant to the syntax you are trying to debug before posting your issue.
Step 2: Install Docker
Navigate to docker.com, where you should find a docker installer for your system. Docker by itself is very lightweight so go ahead and install it. After you’ve installed and started docker, open a command line shell terminal and enter the following command:
docker pull w201rdada/portfolio
If you see a bunch of layers downloading and extracting, it worked! This may take a while so leave it running while you skip to the next step. If you’re curious about what docker is and why we’re using it, read on.
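If you want to confirm that the install and the pull both worked, a small check like the one below may help. It is only a sketch: the helper names have_cmd and image_present are made up for illustration and are not part of the portfolio repository, and it assumes a POSIX shell and a docker client that supports docker image inspect.

```shell
# Hypothetical helpers, not part of the portfolio repository.

# Succeeds if the named program is on your PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Succeeds if the image is already in your local image cache.
image_present() {
  docker image inspect "$1" >/dev/null 2>&1
}

if ! have_cmd docker; then
  echo "docker is not installed or not on your PATH"
elif image_present w201rdada/portfolio; then
  echo "image is ready"
else
  echo "image not found yet (still pulling, or the docker daemon is not running)"
fi
```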
Docker on Windows
Windows users may need to take a few additional steps to configure Docker before the containers will work.
Why Docker?
You will be building your website using a docker container. The practical reason to use docker is to save you from needing to install software manually and to help ensure that we are in a common computing environment. What git and GitHub let us do for sharing code, docker and Docker Hub let us do for sharing code execution. Whether you are a Linux, Windows, or macOS user, if you can install and run docker then you’ll be in virtually the same computing environment as the rest of the class. That means that what works for one of us should work for all of us.
If you’re familiar with virtual machines, docker has a similar goal, which is to allow your personal computer to serve as a host for another computing environment altogether. It is like having a different computer inside your computer. So for instance if you’re on Windows or on a Mac, docker can let you easily play around with Linux inside a container. Neat! Now you have no excuse to avoid diving into command line programming.
One caveat: while docker is a lightweight solution compared to most VMs, there is no avoiding some drain on computing, memory, and storage resources between the host and the container. I advise you to have at least 10-15GB of empty hard disk space before proceeding. The docker image you will download for W201 will take under 5GB of space, and it is important to have a lot of empty space left over so your computer doesn’t get bogged down with everything else you ask it to do. The silver lining here is that we will treat the Docker containers as ephemeral, meaning you can throw the image away to get your resources back without losing any of your work!
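To see whether you have that much room, something like the following sketch could be run from the directory where you plan to work. The helper name free_gb is hypothetical, and the 15 GB threshold is simply the upper end of the range suggested above; it assumes the POSIX df and awk utilities.

```shell
# Hypothetical helper: prints the free space, in whole gigabytes, on the
# filesystem that holds directory $1 (default: current directory).
free_gb() {
  # df -P gives one line per filesystem; -k reports 1024-byte blocks.
  # Column 4 of the second line is the available space.
  df -Pk "${1:-.}" | awk 'NR==2 { printf "%d\n", $4 / 1024 / 1024 }'
}

need=15
have=$(free_gb .)
if [ "$have" -lt "$need" ]; then
  echo "only ${have}GB free: consider clearing some space first"
else
  echo "${have}GB free: plenty of room"
fi
```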
Figure 49: Read the dockumentation to learn more!
Docker and shell programming are very interesting topics, but our goal here is not to study them. Our first priority is going to be to use these tools to make a website for our big ideas. The gist, however, is as follows. In the nifty figure above, your laptop’s or desktop’s local operating system serves as the blue client and host. After downloading an image from a registry like Docker Hub, you run a container defined by the image. Indeed you could launch a ton of containers and they’d all start out from the same spot while being completely isolated from each other. Work in any of them is invisible to the others. This is a very nifty trick for keeping software stacks simple, organized, and reproducible.
“Ephemeral” means that when the container is terminated all work done within it after the container is launched will be lost. This could be scary if your work were to become trapped inside the container! However, by using volume mapping, we can ensure that important work is safely saved outside of the container, such that losing the container is no big deal. Similarly, by using port mapping, we can interact with processes running inside the container, for instance by using a web browser. Each of these mappings creates a little puncture in our “containment”, but we can intentionally set how much our host resources may be exposed to what happens within the container.
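As a rough illustration of the two kinds of mapping, the sketch below only prints a docker run command rather than running it. The helper name mapping_cmd is hypothetical. The container-side ports (8787 and 2015) are taken from the sample output later on this page, and /home/oski as the container-side folder is an assumption based on the build output path shown there; the course’s own _rundock.sh script may well do things differently.

```shell
# Hypothetical helper: prints (does not run) a docker run command that
# combines one volume mapping with two port mappings.
#   $1 host directory to expose inside the container
#   $2 host port for RStudio Server (container port 8787)
#   $3 host port for the web preview (container port 2015)
# Both -v and -p take the host side first, then the container side.
mapping_cmd() {
  printf 'docker run --rm -d -v "%s":/home/oski -p %s:8787 -p %s:2015 w201rdada/portfolio\n' "$1" "$2" "$3"
}

mapping_cmd "$PWD" 1873 2014
```

Anything written under the mapped folder survives the container, and only the two listed ports are reachable from the host.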
Step 3: Update Your Portfolio
Working directly with GitHub.com is fine for editing source documents, but it’s finally time to execute the code so we’ll need your local machine to do that. First, to help avoid the old cross platform line ending conundrum, configure git on your local machine to intelligently convert CRLF to LF or vice versa:
Configure Git on OS X or Linux to properly handle line endings:
git config --global core.autocrlf input
Configure Git on Windows to properly handle line endings:
git config --global core.autocrlf true
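If you are unsure which line endings a file on disk actually has, a quick check along these lines may help. This is a sketch with a made-up helper name (has_crlf), assuming a POSIX shell with grep.

```shell
# Hypothetical helper: succeeds if the file contains a carriage return,
# the tell-tale sign of Windows (CRLF) line endings.
has_crlf() {
  cr=$(printf '\r')
  grep -q "$cr" "$1"
}

# Report on every shell script in the current directory.
for f in ./*.sh; do
  [ -f "$f" ] || continue
  if has_crlf "$f"; then echo "$f: CRLF"; else echo "$f: LF"; fi
done
```

To see which setting git is currently using, git config --global --get core.autocrlf prints it.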
If you’ve never cloned your portfolio repository locally, open a new shell and change your working directory to a location where you would like to download it. Then enter the command:
git clone https://github.com/w201rdada/portfolio-username
Replace username with your GitHub username. What, you ask, if I don’t have git installed on my local machine?
DOCKERTIME!
Well you could install git but that sounds impossible. Why not use our nifty docker image which already has git installed! To use git from the container to download your repository to your current working directory, try:
docker run --rm -v "$(pwd)":/home/root/ -w /home/root/ -e UID=$UID -it w201rdada/portfolio bash
To break this command down, docker run starts the container using the image w201rdada/portfolio that we pulled earlier (it will pull it from Docker Hub automatically if you hadn’t before). Everything in between are options to docker run: --rm is a flag saying to toss the container when you’re done with it (otherwise they can stick around and pile up); -v "$(pwd)":/home/root/ maps your current working directory to root’s home folder inside the container; -w /home/root/ sets the container’s working directory to the same; -e UID=$UID is optional and aligns the user ID inside and outside the container to help avoid file permission problems when using volume mapping; and -it gives you an interactive/teletype session so it acts like a regular command line prompt. Finally bash is the command to run immediately after starting the container, which will start a new shell session for you. If it worked, you should see a command line prompt that looks something like:
root@f1f630304be8:
If you do, congrats, you’re in the container! You’re logged in as root, so you can do anything! Normally that’s scary, but if you mess up you can just toss this container and start fresh. For now all you have to do is enter git clone https://github.com/w201rdada/portfolio-username and give it your username and password.
Here you can leave the container open and continue to work. Any remaining git commands can be executed from the container as above, or from your local machine.[11]
When you’re finished with the container, type exit or control-d to trash it. The repo will have cloned to your working directory. The next time you want to use a container for a command line session you can repeat the docker run command above. Note that while using this approach any files written outside of the /home/root/ directory will be lost when you type exit, since no other directory is mapped to the host.
Pull Portfolio Updates
If you’ve already been working locally with your git repository, be sure to commit or stash your changes before proceeding.
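One way to check for leftover changes is sketched below. The helper name worktree_clean is hypothetical, and it assumes a git recent enough to support the -C option.

```shell
# Hypothetical helper: succeeds only if directory $1 (default: current
# directory) is a git repository with no uncommitted or untracked changes.
worktree_clean() {
  dir=${1:-.}
  # Bail out with status 2 if this is not a git repository at all.
  git -C "$dir" rev-parse --git-dir >/dev/null 2>&1 || return 2
  # --porcelain prints one line per changed or untracked file.
  [ -z "$(git -C "$dir" status --porcelain)" ]
}

if worktree_clean .; then
  echo "clean: safe to pull"
else
  echo "uncommitted changes (or not a git repository): commit or stash first"
fi
```

If it reports changes, either commit them or set them aside with git stash and bring them back afterwards with git stash pop.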
From the host or from the container, you should change your working directory to your cloned git repository, e.g. with cd portfolio-username. Since you forked this repository I have made some updates to the shell scripts. To incorporate those changes enter:
git pull https://github.com/w201rdada/portfolio
This will combine the files you haven’t touched with any new changes from the original repository. It will not, however, replace your work in 01.Rmd or 02.Rmd: git keeps your commits, and if we both edited the same lines it reports a merge conflict rather than overwriting them. That’s exactly what we want: your changes plus my changes! If you did happen to fuss around with the scripts or other resources, leading to a merge conflict, just go to https://github.com/w201rdada/portfolio and manually copy the scripts you’ll need.
Another option here is to use the --rebase flag. As always, breakfast can help us understand the difference. The portfolio repository you cloned a few months ago is like toast. Since then you’ve been adding butter to the toast. But after you forked the repo I added my own jelly to the original repo and you don’t have those changes. git pull puts the jelly on top of your butter: it adds my commits on top of your commits. git pull --rebase picks the butter up, slips the jelly under it, and puts your butter back on top, so your changes are the most recent commits. Mmmm toast.
P.S.: If you want to be a real Bay Area hipster make the expensive toast pilgrimage to Trouble Coffee. An adventure for your next Immersion :)
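If you would like to watch the toast analogy play out without risking your real portfolio, the sketch below builds two throwaway repositories in a temporary directory. Everything in it is hypothetical scaffolding (the toast_demo name, the file names, the placeholder identity); it assumes git and mktemp are available.

```shell
# Builds two throwaway repositories, lets them diverge, pulls with the
# given mode, and prints the resulting commit subjects, newest first.
#   $1 is --no-rebase (plain merge) or --rebase
toast_demo() {
  mode=$1
  work=$(mktemp -d)
  # Placeholder identity so commits work even on an unconfigured machine.
  g() { git -c user.name=Oski -c user.email=oski@example.com "$@"; }
  {
    g init "$work/original"
    ( cd "$work/original" && echo toast > toast.txt &&
      g add toast.txt && g commit -m toast )        # the toast
    g clone "$work/original" "$work/mine"
    ( cd "$work/mine" && echo butter > butter.txt &&
      g add butter.txt && g commit -m butter )      # your butter
    ( cd "$work/original" && echo jelly > jelly.txt &&
      g add jelly.txt && g commit -m jelly )        # my jelly
    # GIT_MERGE_AUTOEDIT=no accepts the default merge message, so no
    # editor opens during the plain merge.
    ( cd "$work/mine" && export GIT_MERGE_AUTOEDIT=no && g pull "$mode" )
  } >/dev/null 2>&1
  ( cd "$work/mine" && git --no-pager log --format=%s )
  rm -rf "$work"
}

# Usage:
#   toast_demo --no-rebase
#   toast_demo --rebase
```

With --rebase the history is a straight line of three commits with butter as the newest, jelly beneath it, and toast at the bottom. With --no-rebase you get a fourth commit, the merge itself, sitting on top of the other three.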
The merge will automatically generate a commit and ask you to describe it. If you see some text with a bunch of ~’s along the left edge, you’re in the vim text editor and you should enter :wq to write the message and quit. This saves you the step of manually committing the changes. To push your changes to GitHub enter:
git push origin master
Step 4: Launch RStudio Server Container
The w201rdada/portfolio image is a reasonably complete data science image based on Debian jessie and containing python and R, as well as caddy, which is a helpful web server. We won’t need python but it’s there if you’re interested. Instead, we’ll be using bookdown, which is an R package that extends the popular rmarkdown package. We’ll also use the very helpful RStudio Server IDE. This is the exciting part folks, so hold on to your hats!
The docker run command we’ll use now is fairly complicated, so I’ve simplified it by creating a shell script called _rundock.sh. To learn the details open the script and read the comments. This script is the only one inside the portfolio repository that is intended to be run outside of the container, so depending on the health of your docker install YMMV. If the container starts everything else should work as expected. Apologies in advance for asking you to be beta testers, but it’s exciting tech that we can debug together where it runs off the rails!
From your host command line, cd into your portfolio-username folder. Then simply type:
bash _rundock.sh
If bash isn’t your shell then use the appropriate command, such as sh _rundock.sh. Windows users may want to run
bash <(dos2unix < _rundock.sh)
to avoid line ending conflicts between Unix and Windows.
This script runs a docker container with a more complicated configuration, albeit from the exact same image we were using before. After it runs you should see:
rstudio server port not set, setting to 1873, the year South Hall was built :)
web preview port not set, setting to 2014, the year MIDS launched :P
pulling container image...
Using default tag: latest
latest: Pulling from w201rdada/portfolio
Digest: sha256:5767a0c8db2f94956279610f1e1011b07210a0043fbcbbc614e6099eeffdcca7
Status: Image is up to date for w201rdada/portfolio:latest
portfolio
running portfolio container...
dddab2e21ead3a38b54ea02b88b44c4a4b5407eb0d64ab752739b7147b538717
ID dddab2e21ead
NAME portfolio
IMAGE w201rdada/portfolio
PORTS 80/tcp, 443/tcp, 0.0.0.0:2014->2015/tcp, 0.0.0.0:1873->8787/tcp
COMMAND "/start.sh"
CREATED 2017-07-19 22:04:26 -0700 PDT
STATUS Up Less than a second
SIZE 665B (virtual 2.44GB)
...success!
rstudio server listening on port 1873
user: oski
pass: goldenbears
web preview listening on port 2014
Windows users: you may need to activate localhost before proceeding.
Long story short, the container launches RStudio Server listening on port 1873 and a local web server listening on port 2014. To access RStudio, open a web browser at localhost:1873. You should be prompted to log in. Type oski for the user and goldenbears for the password, and it should load RStudio. This RStudio is being served from the container using the version of R installed there, and it is totally unaware of any other R installation on your host.
Open a second browser window at localhost:2014 and you should see a simple 404 Not Found error. That means it’s working! The web server is running but the directory is empty. So now let’s give it a website to serve!
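If you prefer to check the ports from a host shell instead of a browser, a sketch like the following could work. The helper names probe and explain are made up for illustration, and it assumes curl is installed on the host.

```shell
# Hypothetical helpers for checking the mapped ports from the host.

# Prints the HTTP status code for a URL; curl reports 000 when nothing answers.
probe() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Turns a status code into a hint for this step of the walkthrough.
explain() {
  case $1 in
    000)     echo "nothing is listening: is the container running?" ;;
    404)     echo "web server is up, but there is no site to serve yet" ;;
    2??|3??) echo "something is being served" ;;
    *)       echo "unexpected status: $1" ;;
  esac
}

# Usage:
#   explain "$(probe http://localhost:2014)"
#   explain "$(probe http://localhost:1873)"
```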
Step 5: Build HTML from Rmarkdown
RStudio gives you full access to the R statistical programming language, and it has some fun features that make working with Rmarkdown more enjoyable. We’ll draw some of those details out later. For now let’s keep our eyes on the prize.
Open the _build.sh file from the Files pane in RStudio. It will recognize it as a shell script. Press the Run Script button in the top right corner of the _build.sh tab. This simply runs the bookdown command that will compile your source documents into HTML. If this step fails, it is likely because there is a syntax error somewhere in your document. Try to read the console output to get a sense of where the error is, or enter traceback() from the console to get more clues about the origins of errors.
This step will likely be a stumbling block for many of us, so we need to lean back on our skills in making issues on GitHub.com. Add an issue with your error messages to your own portfolio repository, then assign the issue to your instructor and any other collaborators that you want to ask for help. After you’ve made your issue, post a link to it on Slack, which will be a great place to help each other out as well.
If it did work, you will see lines like these in the console:
Output created: _book/index.html
[1] "/home/oski/_book/index.html"
This is the file your web server is looking for! Reload localhost:2014 and you should see your source rendered as a nice HTML “gitbook”. Savor the feeling! At this point you may be inspired to go back and edit your source to work on the composition itself. That’s a great motivation; seeing your work in publication format tends to stir the creative juices!
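Because the repository folder is mapped to your host, you can also confirm the build from a host shell inside your portfolio folder. The helper name book_built below is hypothetical; the sketch simply looks for the file named in the console output above.

```shell
# Hypothetical helper: succeeds if a non-empty _book/index.html exists
# under directory $1 (default: current directory).
book_built() {
  [ -s "${1:-.}/_book/index.html" ]
}

if book_built .; then
  echo "build found: reload the preview"
else
  echo "no _book/index.html here: check the console output for errors"
fi
```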
Step 6: Deploy to w201rdada.github.io
The web preview is being hosted locally for your eyes only. That’s the place to get your draft finalized, do your debugging, and all round polish off your work. When you’re ready to show it to the world (well your classmates, at least), then it’s time to use the _deploy.sh script.
Go ahead and open the script. This is the list of commands that need to run to deploy the HTML so it will render using the GitHub Pages service. To push to your repository replace these lines:
# git config --global user.email "calnet@berkeley.edu"
# git config --global user.name "Oski Bear"
With your own information, being sure to delete the comments at the start of the line. So for me I would replace the lines with:
git config --global user.email "notmyreal@email.edu"
git config --global user.name "Brooks Ambrose"
The email is what GitHub will use to identify you with your account, and prompt you for authorization. After you put in your own information, save the file.
The _deploy.sh script is meant to be used immediately after the _build.sh script, so run that first if you haven’t already. Running _build.sh multiple times won’t cause any problems, other than making you wait, so always press it to be sure you’ve actually built your latest changes before deploying them. The preview build will be located in the _book directory, while the deployment build will be in the docs directory. This means you can keep working without interfering with a previously published draft. Running the _deploy.sh script however will overwrite the docs build with the _book build, so it’s a one-way process! Ultimately it doesn’t really matter if something bad happens to the html, because if your source is intact you can easily rebuild it.
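The overwrite step can be pictured with the sketch below. It is not the contents of _deploy.sh, which you should open and read for the real commands; the publish_sketch name is made up, and the sketch only shows one plausible way that a docs folder could be replaced wholesale by the latest _book build.

```shell
# Hypothetical illustration of a one-way overwrite; not the real _deploy.sh.
#   $1 is the repository folder (default: current directory)
publish_sketch() {
  src=${1:-.}/_book
  dest=${1:-.}/docs
  # Refuse to wipe docs if there is no fresh build to replace it with.
  [ -f "$src/index.html" ] || { echo "no build found in $src" >&2; return 1; }
  rm -rf "$dest"        # the previously published draft is discarded
  cp -R "$src" "$dest"  # docs becomes a copy of the preview build
}
```

Nothing from the old docs folder survives the copy, which is why the text calls it a one-way process; the _book folder itself is left in place.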
OK enough beating around the bush! RStudio will show the same Run Script button at the top of the _deploy.sh script, but don’t push it! Unfortunately the interactive prompts required by git don’t play well with this RStudio implementation. If you do push it, it’s no big loss, but it can confuse the commit process.
Fortunately, there is an interactive shell available under the Tools > Shell... menu. Open that shell, and a command line prompt should take over your window. Type bash _deploy.sh there and hit enter, and it will eventually ask you for your GitHub authentication. Enter it to push your website to the remote. Mazel tov!
_build.sh and _deploy.sh can be run from the interactive shell within the container as described above, if you don’t want to use RStudio Server or the preview page. This could be a nice way to handle small updates. Because all your changes are saved to your local disk, you can preview the html outside of the container using any method you prefer.
Windows users may want to use bash <(dos2unix < _build.sh) or bash <(dos2unix < _deploy.sh) from the RStudio Tools > Shell command line prompt if you are experiencing EOL conflicts.
For convenience, running bash _build+deploy.sh from the RStudio Tools > Shell will run both scripts while also converting Windows CRLF line endings to LF. If you’re experiencing strange errors try this approach.
Step 7: Activate GitHub Pages
Even after you’ve pushed your website to GitHub, you need to activate the GitHub Pages service from your repository’s settings menu. Navigate to https://github.com/w201rdada/portfolio-username/settings and scroll down until you see Source under the GitHub Pages heading. Use the drop down menu to select master branch /docs folder, then save. It will then show you the URL of your website, which should be at https://w201rdada.github.io/portfolio-username/. The first time you do this it will take several minutes to propagate across GitHub’s servers, so be patient. If you get a 404 error, check back after a few minutes.
Step 8: Kick back and bask in your glory!
You did it! Or maybe you didn’t. Either way let us know how it turned out by filling us in on Slack!
When you’re done for the day, trash the container to get your host’s system resources back. Be sure to save all your open files in RStudio first!
docker stop portfolio
docker rm portfolio
Or more succinctly:
docker rm -f portfolio
When you’re ready for another session, fire up _rundock.sh again and try making a few other helpful changes to your website. Don’t forget to rebuild and redeploy to publish them:
- Open _output.yml from within RStudio and change MYGITHUBUSER in the edit: field. This will let authorized collaborators (like your instructor!) easily make copy edits to your source from GitHub.com using the pencil icon at the top of the web page.
- Also from _output.yml make other edits to your sidebar. You can remove, for instance, the LinkedIn line or any of the others. Look around your site and replace “Oski Bear” and other boilerplate with your own info.
- Edit index.Rmd with your info. Write your own welcome message, fill out abstracts, record and link your own welcome video, or delete anything you don’t want.
- Delete 03.Rmd to prevent it from being built.
- bookdown has lots more configuration options. Give the documentation a whirl. It’s not the sexiest website in the world, but it plays great with R and is a serviceable, free solution for design work.
- Of course any website you want to build can be hosted from github.io.

Figure 50: Docker FTW!
Submission Instructions
Your submission will be due Monday, 03/19/2018 at 11:59 PM PDT as a deployment to w201rdada.github.io. If your page isn’t live I will expect an issue posted to your portfolio repository explaining the roadblock.
[11] Your instructor will be happy to help debug git or any other problems occurring within the container, but we’re afraid we can’t offer support for your locally installed software!