20 January 2022

How to parse console arguments in your Rust application with Clap

In a previous article I explained how to use the argparse module to read console arguments in your Python applications. Rust has its own crates to parse console arguments too. In this article I'm going to explain how to use the clap crate for that.

One wonderful thing about Rust is that inside its hard rustacean shell it often has a pythonista heart. Using clap you'll find many of argparse's features. Actually, the concepts are quite similar: what argparse calls subparsers, clap calls subcommands, and argparse arguments are simply called args in clap. So if you're used to argparse you'll likely feel at home using clap. I'm going to assume you've read my argparse article, so I don't repeat myself explaining the same concepts.

Like any other Rust crate you need to include clap in your Cargo.toml file:
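For instance (I'm assuming the 2.x series here, since that is the API the calls described in this article match; check crates.io for the latest version):

[dependencies]
clap = "2.33"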

After that you can use clap in your source code. To illustrate the explanations, I'm going to use as an example the command parsing I do in my project cifra-rust.

As you can see there, you can use clap directly in your main() function, but I like to abstract it into a generic function that returns my own Configuration struct type. That way, if I ever switch from clap to another parser crate, the change will be smoother for my app (it reduces coupling). So, as I do in Python, I define a parse_arguments() function that returns a Configuration type.
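A minimal sketch of that approach could look like this (names and fields here are illustrative, not the actual cifra-rust code):

/// Our own configuration type: the rest of the application only
/// depends on this struct, never on clap types.
struct Configuration {
    dictionary_name: String,
    initial_words_file: Option<String>,
}

/// Every clap-specific call stays inside this function.
fn parse_arguments(argv: &[String]) -> Configuration {
    // Parser definition and matching go here, as shown piece by
    // piece in the rest of this article.
    unimplemented!()
}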

There you can see that the root parser is defined in clap using App::new(). Like many other Rust crates, clap makes heavy use of the builder pattern to configure its parsing. That way you can configure the command version, author or long description ("long_about"), among other options, as you can see from lines 277 to 280:
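Inside parse_arguments(), the root parser definition looks roughly like this (a clap 2.x sketch with placeholder strings, not the literal cifra-rust values):

use clap::App;

// Root parser with its metadata, configured through the builder pattern.
let app = App::new("cifra")
    .version("1.0.0")
    .author("dante-signal31")
    .long_about("Console command to crypt and decrypt texts using classic methods.");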


Clap's behaviour can be customized with the setting() call. A typical parameter for that call is AppSettings::ArgRequiredElseHelp, to show help if the command is called with no arguments:
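Appended to the builder chain, that call looks like this:

use clap::{App, AppSettings};

// Show the help text if the user provides no argument at all.
let app = App::new("cifra")
    .setting(AppSettings::ArgRequiredElseHelp);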


A subparser is created by calling subcommand() and passing it a new App instance:
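For instance (a sketch; the dictionary subcommand and its description are illustrative, and note the about() call explained next):

use clap::App;

let app = App::new("cifra")
    .subcommand(App::new("dictionary")
        // Short description shown when --help is called.
        .about("Manage dictionaries to attack ciphered texts"));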


With the about() call you can define a short description of the command or argument that will appear when --help is called.

Usually the parser (and its subparsers) will need arguments. Arguments are defined by calling arg() on the parser they belong to. Here you have an example:
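The snippet below is a sketch of the create subparser discussed next, using the clap 2.x API (the help strings are mine, not the original ones):

use clap::{App, Arg};

let create = App::new("create")
    .about("Create a new dictionary")
    // Positional, mandatory argument.
    .arg(Arg::with_name("dictionary_name")
        .index(1)
        .required(true)
        .takes_value(true)
        .value_name("NEW_DICTIONARY_NAME")
        .help("Name for the dictionary to create"))
    // Optional flag argument, with long and short forms.
    .arg(Arg::with_name("initial_words_file")
        .long("initial_words_file")
        .short("i")
        .takes_value(true)
        .value_name("PATH_TO_FILE_WITH_WORDS")
        .help("Optional file with initial words for the dictionary"));

// This App would then be nested under its parent, e.g.:
// App::new("dictionary").subcommand(create)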


In the last example you can see that the subparser create (line 285) has two arguments: dictionary_name (line 287) and initial_words_file (line 292). Note that every arg() call takes an Arg instance as its parameter. Argument configuration is done using the builder pattern over Arg instances.

The argument dictionary_name is required because it is configured with required(true) at line 288. Be aware that although you can use a flag here, you are discouraged from doing so. As a rule, all required arguments should be positional (i.e. require no flag); the only time a flag makes sense for a required argument is when the requested operation is destructive and the user must prove they know what they are doing by providing that extra flag. When a positional argument is used, you may call index() to specify the position of this argument relative to other positional arguments. Actually, I found out later that you can leave index() off, in which case indexes are assigned in order of evaluation. What index() allows is setting indexes out of order.

When you call takes_value(true) on an argument, the value provided by the user is stored under a key named after the argument. For instance, the takes_value(true) at line 290 makes the provided value be stored under a key called dictionary_name. If you are using an optional flag with no value (i.e. a boolean flag), you can call takes_value(false) or just omit the call.

The call to value_name() is equivalent to the metavar parameter in Python's argparse. It lets you define the string used to represent this parameter when help is printed with the --help argument.

Oddly, arguments don't use about() to define their help strings but help() instead.

You can find an optional argument definition from lines 292 to 298. There, the long version of the flag is defined with long() and the short one with the short() call. In this example, the argument can be passed both as "--initial_words_file <PATH_TO_FILE_WITH_WORDS>" and as "-i <PATH_TO_FILE_WITH_WORDS>".

It can be useful to call validator(), as it lets you define a function to be run over the provided argument to assert it is what you expect to receive. That function should receive a String parameter with the provided argument and return a Result<(), String>: if your checks find the argument correct, the function should return Ok(()), and an Err("Here you write your error message") otherwise.
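A sketch of such a validator, checking that a provided path actually exists (the function name is made up):

use std::path::Path;

fn file_exists(path: String) -> Result<(), String> {
    if Path::new(&path).exists() {
        Ok(())
    } else {
        Err(format!("File {} does not exist", path))
    }
}

// Attached to the argument definition:
// .validator(file_exists)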

Chaining methods on nested arguments and subcommands, you can define an entire command tree. Once you finish, you make one final call to get_matches_from() on the root parser. You pass that method a vector of strings with every individual command argument:
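Reusing our sketched parser (note that the first element plays the role of the program name, just as in a real command line):

let matches = app.get_matches_from(
    vec!["cifra", "dictionary", "create", "my_dictionary"]);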



Usually you'll build that vector from env::args(), which yields every command argument given by the user when the application was called from the console.
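With the standard library that usually looks like this:

use std::env;

// Collect the real console arguments and hand them to the parser.
let argv: Vec<String> = env::args().collect();
let matches = app.get_matches_from(argv);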



Note that my main() function is almost empty. That way I can call _main() (note the underscore) from my integration tests, passing in my own argument vector to simulate a user calling the application from the console.
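The pattern is roughly this (a simplified sketch, not the actual cifra-rust source):

use std::env;
use std::process;

fn main() {
    // main() only forwards the real console arguments to _main().
    process::exit(_main(env::args().collect()));
}

fn _main(argv: Vec<String>) -> i32 {
    // Integration tests call this function directly with a fabricated
    // argument vector, simulating a console invocation.
    let _configuration = parse_arguments(&argv);
    // ... run the application using that configuration ...
    0
}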

What get_matches_from() returns is an ArgMatches type. That type returns every argument's value when you search for it by its key name. From lines 149 to 245 we implement a method to create a Configuration type from an ArgMatches instance's contents.

Be aware that when you have a single parser you can get values directly using the value_of() method. But if you have subcommands, you first have to get the specific ArgMatches for that subcommand branch with a call to subcommand_matches():
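Following our sketched example, descending to the create branch would look like this:

if let Some(dictionary_matches) = matches.subcommand_matches("dictionary") {
    if let Some(create_matches) = dictionary_matches.subcommand_matches("create") {
        // Required argument: safe to unwrap once we are in this branch.
        let dictionary_name = create_matches.value_of("dictionary_name").unwrap();
        // Optional argument: check it was provided before reading it.
        if create_matches.is_present("initial_words_file") {
            let words_file = create_matches.value_of("initial_words_file").unwrap();
        }
    }
}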


In the last example you can see the usual workflow:
  • You go deeper, getting the ArgMatches of the branch you are interested in (lines 160-161).
  • Once you have the ArgMatches you want, you use value_of() to get a specific argument value (line 164).
  • For optional parameters, you can call is_present() to check whether they were provided (line 165).
Using those methods, you can retrieve every provided value and build up your configuration to run your application.

As you can see, you can build a really powerful command parser with clap, getting every feature you are used to from argparse, but in a rustacean environment.

19 December 2021

How to create your own custom Docker images


In a previous article we covered the basics of docker image usage. But there we used images built by others. That's great if you find the image you're looking for, but what happens if none fits your needs?

In this article we are going to explain how to create your own images and upload them to Docker Hub, so they can be easily downloaded into your projects' environments and shared with other developers.


To cook you need a recipe

Provided that you followed the tutorial linked above, you'll have docker already installed on your Linux box. Once docker is installed, you need a recipe to tell it how to "cook" your image. That recipe is a file called Dockerfile that you create in a folder of your choice, along with any files you want to include in your image.

What do you cook with that recipe? An image. But what is a docker image? A docker image is a ready-to-use virtual Linux operating system with a specific configuration and set of dependencies installed. How you use that image is up to you. Some images are provided "as-is", as a base point where you are supposed to install your application and run it inside. Other images are more specific and contain an app that is executed as soon as you run them. Just keep in mind that a docker image is like what an OOP language calls a "class", while a container is an instance of that class (an OOP language would call it an "object"). You may have many containers running at once, all started from the same image.

To understand how a Dockerfile works, we are going to review some examples.

Let's start with this Dockerfile from the vdist project. The image you build with that Dockerfile is intended to compile a Python distribution and run an application called fpm over a folder that is created at runtime when a container of the image is used. So, this Dockerfile is supposed to install every dependency needed to compile Python and run fpm.

As a first step, every Dockerfile starts by defining which image you are going to use as a starting point. Following the OOP metaphor, our image being a class, we need to define which class ours inherits from.
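In the vdist Dockerfile that first instruction is:

FROM ubuntu:latest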

In this case, our image derives from the ubuntu:latest image, but you can use any image available at Docker Hub. Just take care to check that image's Docker Hub page to find out which tag to use. Once you upload your image to Docker Hub, others may use it as a base point for their own images.

Every art piece must be signed. Your image is no different, so you should define some metadata for it:
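That is done with LABEL commands; for instance (the description value here is illustrative):

LABEL maintainer="dante-signal31 (dante.signal31@gmail.com)"
LABEL description="Image to compile Python distributions and run fpm over them."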

The real work comes with RUN commands. Those are what you use to configure your image.

Some people misunderstand what RUN commands are for the first time they try to build a docker image. RUN commands are not executed when a container is created from an image with the "docker run" command. Instead, they are run only once, by "docker build", to create an image from a Dockerfile.

The exact set of RUN commands will vary depending on your project's needs. In this example, RUN commands check that the apt sources list is OK, update the apt database and install a bunch of dependencies using both "apt-get" and "gem install".

However, you'd better start your series of RUN commands with a "RUN set -e"; this will make your entire image building process fail if any RUN command returns an error. It may seem an extreme measure, but that way you are sure you are not uploading an image with unnoticed errors.

Besides, when you review Dockerfiles from other projects, you will find that many of them include several shell commands inside the same RUN command, as our example does in lines 14-16. Docker people recommend including in the same RUN command a bunch of shell commands when they are related (i.e. if two commands make no sense separated, executed without each other, they should be run inside the same RUN command). That's because of how images are built, using a layer structure where every RUN command is a separate layer. Following Docker people's advice, your images will be quicker to rebuild when you make any change to the Dockerfile. To include several shell commands inside the same RUN command, remember to write one command per line and append "&& \" to every line to chain them (except the last line, as the example shows).
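Putting both tips together, a chained RUN command would look something like this (the exact packages are illustrative, not the literal vdist list):

RUN set -e
RUN apt-get update && \
    apt-get install -y ruby ruby-dev build-essential && \
    gem install fpm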

Apart from RUN commands, there are others you should know. Let's review the Dockerfile from my project markdown2man. That image is intended to run a Python script that uses Pandoc to perform a file conversion using arguments passed by the user when a container is started from the image. So, in that Dockerfile you can find some now-familiar commands:


But from there, things get interesting with some new commands.

With ENV commands you can create environment variables to be used in subsequent building commands, so you don't have to repeat the same string over and over again and modifications become simpler:
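For instance, markdown2man defines where its script files will live:

ENV SCRIPT_PATH /script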


Nevertheless, be aware that environment variables created with ENV commands outlast the building phase and persist when a container is created from the image. That can provoke collisions with other environment variables created later. If you only need the variable during the building phase and want it gone when a container is created from the image, use ARG commands instead of ENV ones.
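The syntax is almost the same (this variable name is hypothetical):

ARG BUILD_TEMP_PATH=/tmp/build_leftovers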

To include your application files inside the image, use COPY commands:
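From the markdown2man Dockerfile:

COPY requirements.txt $SCRIPT_PATH/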


Those commands copy a source file, relative to the Dockerfile location, from the host where you are building the image to a path inside that image. In this example, we are copying requirements.txt, which is located alongside the Dockerfile, to a folder called /script (as the SCRIPT_PATH environment variable defines) inside the image.

Last, but not least, we find another new command: ENTRYPOINT.
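In markdown2man's case it is:

ENTRYPOINT ["markdown2man"]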


An ENTRYPOINT defines which command to run when a container is started from this image, so arguments passed to "docker run" are actually passed to this command. The container will stay alive until the command defined at the ENTRYPOINT returns.

ENTRYPOINTs are great for using docker containers to run commands without polluting your user system with the packages needed to run those commands.
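For instance, once this image is built, everything placed after the image name is handed to the ENTRYPOINT command (the actual arguments the markdown2man script expects are beyond the scope of this article, so take this invocation as purely illustrative):

docker run dantesignal31/markdown2man:latest <markdown2man arguments>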

Time to cook

Once your recipe is ready, you must cook something with it.

When your Dockerfile is ready, use the docker build command to create an image from it. Provided you are in the same folder as your Dockerfile:

dante@Camelot:~/Projects/markdown2man$ docker build -t dantesignal31/markdown2man:latest .
Sending build context to Docker daemon  20.42MB
Step 1/14 : FROM python:3.8
 ---> 67ec76d9f73b
Step 2/14 : LABEL maintainer="dante-signal31 (dante.signal31@gmail.com)"
 ---> Using cache
 ---> ca94c01e56af
Step 3/14 : LABEL description="Image to run markdown2man GitHub Action."
 ---> Using cache
 ---> b749bd5d4bab
Step 4/14 : LABEL homepage="https://github.com/dante-signal31/markdown2man"
 ---> Using cache
 ---> 0869d30775e0
Step 5/14 : RUN set -e
 ---> Using cache
 ---> 381750ae4a4f
Step 6/14 : RUN apt-get update     && apt-get install pandoc -y
 ---> Using cache
 ---> 8538fe6f0c06
Step 7/14 : ENV SCRIPT_PATH /script
 ---> Using cache
 ---> 25b4b27451c6
Step 8/14 : COPY requirements.txt $SCRIPT_PATH/
 ---> Using cache
 ---> 03c97cc6fce4
Step 9/14 : RUN pip install --no-cache-dir -r $SCRIPT_PATH/requirements.txt
 ---> Using cache
 ---> ccb0ee22664d
Step 10/14 : COPY src/lib/* $SCRIPT_PATH/lib/
 ---> d447ceaa00db
Step 11/14 : COPY src/markdown2man.py $SCRIPT_PATH/
 ---> 923dd9c2c1d0
Step 12/14 : RUN chmod 755 $SCRIPT_PATH/markdown2man.py
 ---> Running in 30d8cf7e0586
Removing intermediate container 30d8cf7e0586
 ---> f8386844eab5
Step 13/14 : RUN ln -s $SCRIPT_PATH/markdown2man.py /usr/bin/markdown2man
 ---> Running in aa612bf91a2a
Removing intermediate container aa612bf91a2a
 ---> 40da567a99b9
Step 14/14 : ENTRYPOINT ["markdown2man"]
 ---> Running in aaa4774f9a1a
Removing intermediate container aaa4774f9a1a
 ---> 16baba45e7aa
Successfully built 16baba45e7aa
Successfully tagged dantesignal31/markdown2man:latest

dante@Camelot:~$

If you weren't in the same folder as the Dockerfile, you should replace that ".", at the end of the command, with a path to the Dockerfile's folder.

The "-t" parameter is used to give a proper name (a.k.a. tag) to your image. If you want to upload your image to Docker Hub try to follow its naming conventions. For an image to be uploaded to Docker Hub its name should be composed like: <docker-hub-user>/<repository>:<version>. You can see in the last console example that docker-hub-user parameter was dantesignal31 while repository was markdown2man and version was latest.

Upload your image to Docker Hub

If the building process ended right, you should be able to find your image registered in your system:

dante@Camelot:~/Projects/markdown2man$ docker images
REPOSITORY                   TAG                 IMAGE ID       CREATED          SIZE
dantesignal31/markdown2man   latest              16baba45e7aa   15 minutes ago   1.11GB


dante@Camelot:~$

But an image that is only available locally is of limited use. To make it globally available you should upload it to Docker Hub. To do that you first need an account at Docker Hub. The process to sign up for a new account is similar to any other online service.

Once the sign-up process is done, log in to Docker Hub with your new account. Once logged in, create a new repository. Remember that whatever name you give to your repository, it will be prefixed with your username to form the repo's full name.


In this example, the full name of the repository would be mobythewhale/my-private-repo. Unless you are using a paid account, you'll probably create a "Public" repository.

Remember to tag your image with a repository name matching the one you created at Docker Hub.
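If you built your image under a different name, the docker tag command lets you add the proper one (image names here follow this article's example):

dante@Camelot:~$ docker tag markdown2man:latest dantesignal31/markdown2man:latest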

Before you can push your image, you have to log in to Docker Hub from your console with "docker login":

dante@Camelot:~/Projects/markdown2man$ docker login
Authenticating with existing credentials...
WARNING! Your password will be stored unencrypted in /home/dante/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
dante@Camelot:~$

The first time you log in you will be asked for your username and password.

Once logged in, you can upload your image with "docker push":

dante@Camelot:~/Projects/markdown2man$ docker push dantesignal31/markdown2man:latest
The push refers to repository [docker.io/dantesignal31/markdown2man]
05271d7777b6: Pushed 
67b7e520b6c7: Pushed 
90ccec97ca8c: Pushed 
f8ffd19ea9bf: Pushed 
d637246b9604: Pushed 
16c591e22029: Pushed 
1a4ca4e12765: Pushed 
e9df9d3bdd45: Mounted from library/python 
1271cc224a6b: Mounted from library/python 
740ef99eafe1: Mounted from library/python 
b7b662b31e70: Mounted from library/python 
6f5234c0aacd: Mounted from library/python 
8a5844586fdb: Mounted from library/python 
a4aba4e59b40: Mounted from library/python 
5499f2905579: Mounted from library/python 
a36ba9e322f7: Mounted from library/debian 
latest: digest: sha256:3e2a65f043306cc33e4504a5acb2728f675713c2c24de0b1a4bc970ae79b9ec8 size: 3680

dante@Camelot:~$

Your image is now available at Docker Hub, ready to be used by anyone, like any other public image.

Conclusion

We have reviewed the very basics of how to build a docker image. From here, the only way forward is practice: creating increasingly complex images, starting from the simpler ones. Fortunately, Docker has great documentation, so it should be easy to solve any blocker you find on your way.

Hopefully, all this process will be greatly simplified when Docker Desktop finally becomes available for Linux, as it already is for Windows and Mac.