Showing posts with label programming. Show all posts
Showing posts with label programming. Show all posts

17 December 2021

How to create your own custom Actions for GitHub Actions

In my article about GitHub Actions we reviewed how you can build entire workflows just using premade bricks, called Actions, that you can find at GitHub Marketplace

That marketplace has many ready to use Actions but chances are that sooner than later you'll need to do something that has no action at markeplace. Sure, you can still use your own scripts. In that article I used a custom script (./ci_scripts/ at section "Sharing data between steps and jobs". Problem with that approach is that if you want to do the same task in another project you need to copy your scripts between project repositories and adapt them. You'd better transform your scripts to Actions to be easily reusable not only in your projects but publicly with other people projects.

Every Action you can find at marketplace is made in one of the ways I'm going to explain here. Actually if you enter to any Action marketplace page you will find a link, at right hand side, to that Action repository so you can asses it and learn how it works.

There are 3 main methods to build your own GitHub Actions:

  • Composite Actions: They are the simpler, and quicker, but a condition to use this way is that your Action should be based in a self-sufficient script that needs no additional dependency to be installed. It should run only with an standard linux distribution offers.
  • Docker Actions: If you need any dependency to make your script work then you'll need t follow this way.
  • Javascript Actions: Well... you can write your own Actions with javascript, but I don't like that language so I'm not going to include it in this article.
The problem with your Actions dependencies is that they can pollute the workflow environment where your Action is going to be used. Your action dependencies can even collide with those of the app being built. That's why we need to encapsulate our Action and its dependencies to be independent of the environment of the app being built. Of course, this problem does not apply if your Action is intended to setup the workflow environment installing something. There are Actions to, for example, installing and setup Pandoc to be used by the workflow. Problem arises when your Action is intended to do one specific task not related to installing something (for example copying files) and it does install something under the table, as that con compromise the workflow environment. So, best option if you need to install anything to make your Action work is installing it in a docker container and make your Action script run from inside that container, entirely independent of workflow environment. 

Composite Actions

If your Action just needs a bunch of bash commands or a python script exclusively using its built-in standard library then composite Actions is your way to go.

As an example of a composite action, we are going to review how my Action rust-app-version works. That Action looks for a rust Cargo.toml configuration file and read which version is set there for the rust app. That version string is offered as the Action output and you can use that output in your workflow, for instance, to tag a new release at GitHub. This action only uses modules available at standard python distribution. It does not need to install anything at user runner. It's true that there is a requirements.txt at rust-app-version repository but those are only dependencies for unit testing.

To have your own composite Action you first need a GitHub repository to host it. There you can place the few files really needed for your action.

At the very root of your repository you need a file called "action.yml". This file is really important as it models your Action. Your users should be able to think about your action as a black box. Something where you enter some inputs and you receive any output. Those inputs and outputs are defined in action.yml.

If we read the action.yml file at rust-app-version we can see that this action only needs an input called "cargo_toml_folder" and actually that input is optional as it can receive a value of "." if it is omitted when this action is called:

Outputs are somewhat different as the must refer to the output of an specific step in your action:

In last section we specify that this action is going to have just one output called "app_version" and that output is going to be the output called "version" of an step with an id value of "get-version".

Those inputs and outputs define what your action consumes and offers, i.e. what your action does. How your action does it is defined under "runs:" tag. There you set that your Action is a composite one and you call a sequence of steps. This particular example only has one step but you can have as many steps as you need:

Take note of line 22 where that steps receives a name: "get-version". That name is important to refer to this step from outputs configuration.

Line 24 is where your command is run. I only executed one command. If you needed multiple commands to be executed inside the same step, then you should use a bar after run: "run: |". With that bar you mark that next few lines (indented under "run:" tag) are lines separated commands to be executed sequentially.

Command at line 24 is interesting because of 3 points:
  • It calls an script located at our Action repository. To refer to our Action repository root use the github.action_path environment variable. The great thing is that although our script is hosted at its repository, GitHub runs it so that it can view the repository of the workflow from where it is called. Our script will see the workflow repository files as it was run from its root.
  • At the end of the line you may see how inputs are used through inputs context.
  • The weirdest thing of that line is how you setup that step output. You set a bash step output doing an echo "::set-output name=<ouput_name>::<output_value>". In this case name is version and its value is what prints to console. Be aware that output_name is used to retrieve that output after step ends through ${{ steps.<id>.outputs.<output_name> }}, in this case ${{ steps.get_version.outputs.version }}
Apart from that, you only need to setup your Action metadata. That is done in the first few lines:

Be aware that "name:" is the name your action will have at GitHub Marketplace. The another parameter, "description:", its the short explanation that will be shown along name in the search results at Markeplace. And "branding:" is only the icon (from Feather icon suite) and color that will represent your action at Markeplace.

With those 24 lines at action.yml and your script at its respective path (here at rust_app_version/ subfolder), you can use your action. You just need to push the button that will appear in your repository to publish your action at Marketplace. Nevertheless, you'd better read this article to the end because I have some recommendations that may be helpful for you.

Once published, it becomes visible for other GitHub users and a Marketplace page is created for your action. To use an Action like this you only need to include in your workflow a configuration like this:

Docker actions

If your Action needs to install any dependency then you should package that Action inside a docker container. That way your Action dependencies won't mess with your user workflow dependencies.

As an example of a docker action, we are going to review how my Action markdown2man works. That action takes a file and converts it to a man page. Using it you don't have to keep two sources to document your console application usage. Instead of that you may document your app usage only with and convert that file to a man page.

To do that conversion markdown2man needs Pandoc package installed. But Pandoc has its respective dependencies, so installing them at user runner may break his workflow. Instead of it, we are going to install those dependencies in a docker image and run our script from that image. Remember that docker lets you execute scripts from container interacting with host files.

As with composite Actions, we need to create an action.yml at Action repository root. There we set our metadata, input and outputs like we do with composite actions. The difference here is that this specific markdown2man Action does not emit any output, so that section is omitted. Section for "runs:" is different too:

In that section we specify this Action is a docker one (at "using:"). There are two ways use a docker image in your action: 
  • Generate an specific image for that action and store it at GitHub docker registry. In that case you use the "image: Dockerfile" tag.
  • Use a prebuilt image from DockerHub registry. To do that you use the "image: <dockerhub_user>:<docker-image-tag>" tag.
If the image you are going to build is exclusively intended to be used at GitHub Action I would follow Dockerfile option. Here, with markdown2man we follow the Dockerfile approach so a docker image is build any time Action is run after a Dockerfile update. Generated image is cached at GitHub registry to be offered quicker to further Actions. Remember a Dockerfile is a kind of a recipe to "cook" an image, so commands that file contains are only executed when the image is built ("cooked"). Once build, the only command that is run is the one you set at entrypoint tag, passing in arguments set at "docker run".                                                                                                                                                                          The "args:" tag has every parameter to be passed to our script at the container. You will probably use your input here to be passed to our script. Be aware that as it happened in composite action, here user repository files are visible to our container.

As you may suspect by now, docker actions are more involved than composite Actions because of the added complexity of creating the Dockerfile. The Dockerfile for markdown2man is pretty simple. As markdown2man script is a python one, we make our image derive from the official docker image for version 3.8:

Afterwards, we set image metadata:

To configure your image, for example installing things, you use RUN commands.

ENV command generates environment variables to be used in your Dockerfile commands:

You use COPY command to copy your requirements.txt from your repository and include it in your generated image. Your scripts are copied fro your Action repository to container following the same approach:

After script files are copied, I like to make then executable and link them from /usr/bin/ folder to include it at the system path:

After that, you set your script as the image entrypoint so this script is run once image is started and that script is provided with arguments you set at the "args:" tag at action.yml file.

You can try that image at your computer building that image from the Dockerfile and running that image as a container:

dante@Camelot:~/$ docker run -ti -v ~/your_project/:/work/ dantesignal31/markdown2man:latest /work/ mancifra


For local testing you need to mount your project folder as volume (-v flag) if your scripts to process any file form that repository. Last two argument in the example (work/ and mancifra) are the arguments that must be passed to entrypoint.

And that's all. Once you have tested everything you can publish your Action and use it in your workflows:

With a call like that a man file called cifra.2.gz should be created at man folder. If manpage_folder does not exist then markdown2man creates it for you.

Your Actions are first class code

Although your Action will likely be small sized, you should take of them as you would with your full-blown apps. Be aware that many people will find and your Actions through Marketplace in their workflows. An error in your Action can brreak many workflows so be diligent and test your Action as you would with any other app.

So, with my Actions I follow the same approach as in other projects and I set up a GitHub workflow to run tests against any pushes in a staging branch. Only once those tests succeed I merge staging with main branch and generate a new release for the Action.

Lets use the markdown2man workflow as example. There you can read that we have two test types:

  • Unit tests: They check the python script markdown2man is based on.

  • Integration tests: They check markdown2man behaviour as a GitHub Action. Although your Action was not published yet you can install it from a workflow in the same repository (lines 42-48). So, what I do is calling the Action from the very same staging branch we are testing and I use that Action with a markdown I have ready at test folder. If a proper man page file is generated then integration test is passed (line 53). Having the chance to test an Action against its own repository is great as it lets you test your Action as people would use it without needing to publish it.

In addition to testing it, you should write a for your action in order to explain in detail how to use your Action. In that document you should include at least this information:

  • A description of what the action does.
  • Required input and output arguments.
  • Optional input and output arguments.
  • Any secret your action needs.
  • Any environment variable your action uses.
  • An example of how to use your action in a workflow.

And you should add too a LICENSE file explaining the usage terms for your Action.


The strong point of GitHub Action is the high degree of reusability and sharing it promotes. Every time you find yourself repeating the same bunch of commands you are encouraged to make and Action with those commands and share it through Marketplace. Doing that way you get a piece of functionality easier to use throughout your workflows than copy-pasting commands and you contribute to improve the Marketplace so that others can benefit too from that Action. 

Thanks to this philosophy GitHub Marketplace has grown to a huge amount of Actions, ready to use and to help you to save you from implementing that functionality by your own.

07 November 2021

How to use GitHub Actions for continuous integration and deployment

In a previous article I explained how to use Travis CI to get continuous integration in your project. Problem is that Travis has changed its usage terms in the last year and now its not so comfortable for open source projects. The keep saying they are still free for open source project but actually you have to beg for free credits every time you expend them and they make you to prove you still comply with what they define as an open source project (I've heard about cases where developers where discarded for free credits just because they had GitHub sponsors).

So, although I've kept my vdist project at Travis (at the moment) I've been searching alternatives for my projects. As I use GitHub for my open source repositories its natural to try its continuous integration and deployment framework: GitHub Actions.

Conceptually, using GitHub Actions is pretty much the same as Travis CI, so I'm not going to repeat myself. If you want to learn why to use a continuous integration and deployment framework you can read the article I linked at this one start.

So, lets focus at the point of how to use GitHub Actions.

As Travis CI what you do with GitHub Actions is centered in yaml files you place at .github/workflows folder in your repository. GitHub will look at that folder to execute CI/CD workflows. In that folder you can have many workflows to be executed depending on different GitHub events (pushes over branches, pull requests, issues filling and a long etcetera). 

But I find GitHub Actions way better than Travis CI. GitHub promotes massive reusing. Every single step can be encapsulated and shared with your other workflows and with other people at GitHub. With so many people developing and sharing at GitHub you'll find yourself reusing others tasks (a.k.a actions) than implementing by yourself. Unless your are automatizing something really weird, chances are that some other has implemented it and shared. In this article we'll use others actions and implement our own custom steps. Besides you can reuse your own workflows (or share with others) so if you have a working workflow for your project you can reuse it in a similar project so you don't need to reimplement its workflow from scratch.

For this article we'll focus in a typical "test -> package -> deploy" workflow which we're going to call test_and_deploy.yaml (if you feel creative you can call it as you like, but try to be expressive). If you want this article full code you can find it in this commit of my Cifra-rust project at GitHub.

To create that file you have two options: create it in your IDE and push like any other file or write it using built-in GitHub web editor. For the first time my advice is to use web editor as it guides you better to get your first yaml up and working. So go to your GitHub repo page and click over Actions tab:

There, your are prompted to create your first workflow. When you choose to create a workflow you're offered to use a predefined template (GitHub has many for different tasks and languages) or set up a workflow yourself. For the sake of this article choose "set up a workflow yourself". Then you will enter to web editor. An initial example will be already loaded to let you have and initial scaffolding.

Now lets check Cifra-rust yaml file to learn what we can do with GitHub actions.


At very first few lines (from line 1 to 12) you can see we name this workflow ("name" tag). Use an expressive name, to identify quickly what this workflow does. You will use this name to reuse this workflow from other repositories.

The "on" tag defines which events trigger this workflow. In my case, this workflow is triggered by pushes and pull requests over staging branch. There're many more events you can use.

The "workflow_dispatch" tag allows you to trigger this workflow manually from GitHub web interface. I use to set it, it doesn't harm to have that option.


Next is the "jobs" tag (line 15) and there is where the "nuts and guts" of workflow begins. A workflow is composed of jobs. Every job is runned in a separate virtual machine (the runner) so each job has its dependencies encapsulated. That encapsulation is good to avoid jobs messing dependencies and filesystems of others jobs. Try to focus every job in just one task. 

Jobs are run in parallel by default unless you explicitly set dependencies between them. If you need a job B be run after a job A is completed successfully you need to use tag "needs" to set B needs A to be completed to start. In Cifra-rust example jobs "merge_staging_and_master", "deploy_crates_io" and "generate_deb_package" need "tests" job to be successfully finished to start. You can see an example of "needs" tag usage at line 53:

As "deploy_debian_package" respectively needs "generate_deb_package" to be finished before, you end with an execution tree like this:


Every job is composed of one or multiple steps. One step is a sequence of shell commands You can run native shell commands or scripts you had included in your repository. From line 112 to 115 we have one of those steps:

There we are calling a script stored in a folder called ci_scripts from my repository. Note the pipe (" | ") next to "run" tag. Without that pipe you can include just one command in run tag but that pipe allows you to include multiple command to executed separated in different lines (like step in lines 44 to 46).

If you find yourself repeating the same commands across multiple workflows them you have a good candidate to encapsulate those commands in an action. As you can guess Actions are the key stones of GitHub Actions. An action is a set of commands packaged to be shared and reused in many workflows (yours or of others users). An action has inputs and outputs and what happens inside is not your problem while it works as intended. You can develop your own actions and share with others but it deserves an article on its own. In next article I will convert that man page generation step in a reusable action.

At right hand side, GitHub Actions web editor has a searcher to find actions suitable for the task you want to perform. Guess you want to install a Rust toolchain, you can do this search:

Although GitHub web editor is really complete and useful to find mistakes in your yaml files, its searcher lacks a way to filter or reorder its results. Nevertheless that searcher is your best option to find shared actions for your workflows.

Once you click in any searcher result you are shown a summary about which text to include in your yaml file to use that action. You can get more detailed instructions clicking in "View full Marketplace listing" link.

As any other step, and action uses a "name" tag to document which task is intended to do and an "id" if that step must be referenced from other workflow places (see an example at line 42). What makes different an action from a command step is the "uses" tag. That tag links to the action we want to use. the text to use in that tag differs for every action but you can find what to write there at searcher results instructions. In those instructions the use to describe which inputs the action accepts. Those inputs are included in the "with" tag. For instance, in lines 23 to 27 I used an action to install the Rust building framework:

As you can see, in "uses" tag you can include which version of given action to use. There are wildcards to use latest version but you'd better set an specific version to avoid your workflows get broken by actions updates.

You make a job chaining actions as steps. Every step in a job is executed sequentially. If any of then fails the entire workflow fails. 

Sharing data between steps and jobs

Although the steps of a workflow are executed in the same virtual machine they cannot share bash environment variables because every step spawns a different bash process. If you set an environment variable in a step that needs to be used in another step of the same job you have two options:

  • Quick and dirty: Append that environment variable to $GITHUB_ENV so that variable can be accessed later using env context. For instance, at line 141 we create DEB_PACKAGE environment variable:

That environment variable is accessed at line 146, in the next step:

  • Set step outputs: Problem with last method is that although you can share data between steps of the same job you cannot do it across different jobs. Setting steps outputs is a bit slower but leaves your step ready to share data not only with other steps of the same job but with any step of the workflow. To set a variable environment as an step output you need to echo that variable to ::set-output and give a name to that variable followed by its value after a double colon ("::"). You have an example at line 46:

Note that step must be identified with an "id" tags to further retrieve the shared variable. As that step is identified as "version_tag" the created "package_tag" variable can be later retrieved from another step of the same job using: 

${{ steps.version_tag.outputs.package_tag }}

Actually that method is used at line 48 to prepare that variable to be recovered from another job. Remember that method so far helps you to pass data to steps in the same job. To export data from a job to be used from another job you have to declare it first as a job output (lines 47-48):

Note that in last screenshot indentation level should be at the same level as "steps" tag to properly set package_tag as a job output.

To retrieve that output from another job, the catching job must declare giving job as needed in its "needs" tag. After that, the shared value can be retrieved using next format:

${{ needs.<needed_job>.outputs.<output_varieble_name> }} 

In our example, "deploy_debian_package" needs the value exported at line 48 so it declares its job (tests) as needed in line 131:

After that, it can get and use that value at line 157:

Passing files between jobs

Sometimes passing a variable is not enough because you need to produce files in a job to be consumed in another job.

You can share files between steps in the same job because those steps share the same virtual machine. But between jobs you need to transfer files from a virtual machine to another.

When you generate a file (an artifact) in a job, you can upload it to a temporal shared storage to allow other jobs in the same workflow get that artifact. To upload and download artifact to that temporal storage you have two predefined actions: upload-artifact and download-artifact.

In our example, "generate_deb_package" job generates a debian package that is needed by "deploy_debian_package". So, in lines 122 to 126 "generated_deb_package" uploads that package:

On the other side "deploy_debian_package" downloads saved artifact in lines 132 to 136:

Using your repository source code

By default you start every job with a clean virtual machine. To have your source code downloaded to that virtual machine you use an action called checkout. In our example it is used as first step of "tests" job (line 21) to allow code to be built and tested:

You can set any branch to be downloaded, but if you don't set anyone the one related to triggering event is used. In our example, staging branch is the used one.

Running your workflow

You have two options to trigger your workflow to try it: you can do the triggering event (in our example pushing code to staging branch) or you can launch workflow manually from GitHub web interface (assuming you set "workflow_dispatch" tag in your yml file as I advised).

To do a manual triggering, go to repository Actions tab and select the workflow you want to launch. Then you'll see "Run workflow button" at right hand side:

Once pushed that button, workflow will start and that tab list will show that workflow as active. Selecting that workflow in the list shows an execution tree like the one I showed earlier. Clicking in any of the boxes shows running logs of every step of that job. It is extremely easy read through generated logs.


I've found GitHub Actions really enjoyable. Its focus in reusability and sharing makes really easy and fast creating complex workflow. Unless you're working in a really weird workflow, chances are that most of your workflow components (if not all) are already implemented and shared as actions, so designing a complex workflow becomes an easy task of joining already available pieces. 

GitHub Actions documentation is great, and it popularity makes easy find answers online for any problem you meet.

Besides that, I've felt yml files structure coherent and logic so its easy to grasp concepts an gain a good level really quickly. 

Being free and unlimited for open source repositories I guess I'm going to migrate all my CI/CD workflows to GitHub Actions from Travis CI.

23 October 2021

How to use PackageCloud to distribute your packages

Recently I wrote an article explaining how to setup a JFrog Artifactory account to host Debian and RPM repositories. 

Since then, my interest in Artifactory has weakened because they have a policy of continuous activity to keep your account alive. If you have periods with no packages uploads/downloads they suspend your account and you have to reactivate it manually. That is extremely unpleasant and uncomfortable and happens frequently (I've been receiving suspensions every few weeks) when you're like me, a hobbyist developer in his spare time who can't keep a continuous advance in his projects. That and the extremely complex setup to run a package repository made me look for another option.

Searching through the web I've found PackageCloud and so far it happens to be an appealing alternative to Artifactory. PackageCloud has a free tier with 2GB storage and 10GB of monthly transfer. For a hobbyist like I think it is enough. 

Creating a repository

Once your register at PackageCloud you access to an extremely clean dashboard. Compared to Artifactory everything is simple and easy. At up-right corner of "Home" page you have a big button to "Create a repository". You can use it to create a repository for every application you have. 


A wonderful feature of PackageCloud is that a single repository can host many packages types simultaneously. So you don't have to create a repository for every package type you want to host for your application, instead you have a single repository for your application and you upload there your deb, rpm, gem, etc packages you build. 

Uploading packages

While you use PackageCloud your realize it's developers have made a great effort to guide at every step. Once you create a new repository you are given immediate guidance about how to upload packages through console (although you have a nice blue button too to upload them through web interface if you like):

As a first approach we will first try web interface to upload a package and after we will test console approach.

If you select your recently created repository at Home page you will enter to the same screen I posted before. To use web interface to upload packages push "Upload a package" button. That button will trigger next pop-up window:

As you can see in last screenshot, I've pushed "Select a package" button and selected vdist_2.1.0:amd64.deb. After you have to select your target distribution in given combo box. I'd suggest selecting a general compatible distribution to widen you audience. For example, I develop in a Linux Mint box and although Linux Mint is present at combo box I prefer to select Ubuntu Focal as a wider equivalent (being aware that my Linux Mint Uma is based in Ubuntu Focal). After selecting package and target distribution, "upload" button will be enabled. Upload will start when you push that button. You will informed that upload has ended with a green boxed "Upload Successful!".

If you prefer console you can use PackageCloud console application. That package is developed in Ruby so you have to be sure you have it installed:

dante@Camelot:~/$ sudo apt install ruby ruby-dev g++
[sudo] password for dante:


Ruby package gives you "gem" command to install PackageCloud application:

dante@Camelot:~/$ sudo gem install package_cloud
Building native extensions. This could take a while...
Successfully installed unf_ext-0.0.8
Successfully installed unf-0.1.4
Successfully installed domain_name-0.5.20190701
Successfully installed http-cookie-1.0.4
Successfully installed mime-types-data-3.2021.0901
Successfully installed mime-types-3.3.1
Successfully installed netrc-0.11.0
Successfully installed rest-client-2.1.0
Successfully installed json_pure-1.8.1
Building native extensions. This could take a while...
Successfully installed rainbow-2.2.2
Successfully installed package_cloud-0.3.08
Parsing documentation for unf_ext-0.0.8
Installing ri documentation for unf_ext-0.0.8
Parsing documentation for unf-0.1.4
Installing ri documentation for unf-0.1.4
Parsing documentation for domain_name-0.5.20190701
Installing ri documentation for domain_name-0.5.20190701
Parsing documentation for http-cookie-1.0.4
Installing ri documentation for http-cookie-1.0.4
Parsing documentation for mime-types-data-3.2021.0901
Installing ri documentation for mime-types-data-3.2021.0901
Parsing documentation for mime-types-3.3.1
Installing ri documentation for mime-types-3.3.1
Parsing documentation for netrc-0.11.0
Installing ri documentation for netrc-0.11.0
Parsing documentation for rest-client-2.1.0
Installing ri documentation for rest-client-2.1.0
Parsing documentation for json_pure-1.8.1
Installing ri documentation for json_pure-1.8.1
Parsing documentation for rainbow-2.2.2
Installing ri documentation for rainbow-2.2.2
Parsing documentation for package_cloud-0.3.08
Installing ri documentation for package_cloud-0.3.08
Done installing documentation for unf_ext, unf, domain_name, http-cookie, mime-types-data, mime-types, netrc, rest-client, json_pure, rainbow, package_cloud after 8 seconds
11 gems installed


That package_cloud command lets you do many things, even create repositories from console, but let focus on package uploading. To upload a package to an existing repository just use "push" verb:

dante@Camelot:~/$ package_cloud push dante-signal31/vdist/ubuntu/focal vdist_2.2.0post1_amd64.deb 

/var/lib/gems/2.7.0/gems/json_pure-1.8.1/lib/json/common.rb:155: warning: Using the last argument as keyword parameters is deprecated
Got your token. Writing a config file to /home/dante/.packagecloud... success!
/var/lib/gems/2.7.0/gems/json_pure-1.8.1/lib/json/common.rb:155: warning: Using the last argument as keyword parameters is deprecated
Looking for repository at dante-signal31/vdist... /var/lib/gems/2.7.0/gems/json_pure-1.8.1/lib/json/common.rb:155: warning: Using the last argument as keyword parameters is deprecated
Pushing vdist_2.2.0post1_amd64.deb... success!


URL you pass to "package_cloud push" is always of the form <username>/<repository>/<distribution>/<distribution version>.

After that you can see the new package registered at web interface:

Installing packages

Ok, so far you know everything you need from the publisher side, but know you need to learn what your users have to do to install your packages.

The thing cannot be simplest. Every repository has a button in its packages section called "Quick install instructions for:". Push its button and you will get a pop-up window like this:

Just copy given command text (or use copy button) and make your user paste that command in their consoles (i.e. include that command at your installation instructions documentation):

dante@Camelot:~/$ curl -s | sudo bash
[sudo] password for dante:
Detected operating system as LinuxMint/uma.
Checking for curl...
Detected curl...
Checking for gpg...
Detected gpg...
Running apt-get update... done.
Installing apt-transport-https... done.
Installing /etc/apt/sources.list.d/dante-signal31_vdist.list...done.
Importing packagecloud gpg key... done.
Running apt-get update... done.

The repository is setup! You can now install packages.


That command registered your PackageCloud repository as one of the system authorised packaged sources. Theoretically now your user could do a "sudo apt update" and get your package listed but here comes the only gotcha of this process. Recall when I told that I develop in Linux Mint but I set repository to Ubuntu/focal? the point is that last command detected my system and set my source as it was for Linux Mint:

dante@Camelot:~/$ cat /etc/apt/sources.list.d/dante-signal31_vdist.list 
# this file was generated by for
# the repository at

deb uma main
deb-src uma main


I've marked as red the incorrect path. If you insist to update apt in this moment you get next error:

dante@Camelot:~/$ sudo apt update
Hit:1 focal InRelease
Hit:2 focal InRelease
Hit:3 focal InRelease
Get:4 focal-updates InRelease [114 kB]
Ign:5 uma InRelease
Get:6 focal-security InRelease [114 kB]
Get:7 focal-backports InRelease [101 kB]
Hit:8 uma Release
Ign:11 uma InRelease
Hit:10 focal InRelease
Err:12 uma Release
404 Not Found [IP: 443]
Reading package lists... Done
E: The repository ' uma Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.


So, in my case I have to correct it manually to let it this way:

dante@Camelot:~/$ cat /etc/apt/sources.list.d/dante-signal31_vdist.list 
# this file was generated by for
# the repository at

deb focal main
deb-src focal main


Now "apt update" will work and you'll find our package:

dante@Camelot:~/$ sudo apt update
Hit:1 focal InRelease
Hit:2 focal InRelease
Hit:3 focal InRelease
Get:4 focal-updates InRelease [114 kB]
Ign:5 uma InRelease
Get:6 focal-security InRelease [114 kB]
Get:7 focal-backports InRelease [101 kB]
Hit:8 uma Release
Hit:10 focal InRelease
Get:11 focal InRelease [24,4 kB]
Get:12 focal/main amd64 Packages [963 B]
Fetched 353 kB in 3s (109 kB/s)
Reading package lists... Done
Building dependency tree
Reading state information... Done
All packages are up to date.
dante@Camelot:~/Downloads$ sudo apt show vdist
Package: vdist
Version: 2.2.0post1
Priority: extra
Section: net
Installed-Size: 214 MB
Depends: libssl1.0.0, docker-ce
Download-Size: 59,9 MB
APT-Sources: focal/main amd64 Packages
Description: vdist (Virtualenv Distribute) is a tool that lets you build OS packages from your Python applications, while aiming to build an isolated environment for your Python project by utilizing virtualenv. This means that your application will not depend on OS provided packages of Python modules, including their versions.

N: There is 1 additional record. Please use the '-a' switch to see it


Obviously, if your user system matches your repository target there would be nothing to fix, but chances are that your user have derivatives boxes so they'll need to apply this fix, so make sure to include it in your documentation.

At this point, your users will be able to install your package as any other:

dante@Camelot:~/$ sudo apt install vdist



PackageCloud makes extremely easy deploying your packages. Comparing with Bintray or Artifactory, its setup is a charm. I have to check how things go on the long term but at first glance it seems a promising service.

How to package Rust applications - DEB packages

Rust language itself is harsh. Your first contact with compiler and borrow checker uses to be traumatic until you realize they are actually to protect you against yourself. Once you understand that you begin to love that language.

But everything else apart of the language is kind, really comfortable I'd say. With cargo, compiling, testing, documenting, profiling and even publishing to (the Pypi of Rust) is a charm. Packaging is no exception of that as it is integrated with cargo assuming some configuration we're going to explain here.

To package my Rust applications into debian packages, I use cargo-deb. To install it just type:

dante@Camelot:~~/Projects/cifra-rust/$ cargo install cargo-deb
Updating index
Downloaded cargo-deb v1.32.0
Downloaded 1 crate (63.2 KB) in 0.36s
Installing cargo-deb v1.32.0
Downloaded crc v1.8.1
Downloaded build_const v0.2.2
Compiling crossbeam-deque v0.8.1
Compiling xz2 v0.1.6
Compiling toml v0.5.8
Compiling cargo_toml v0.10.1
Finished release [optimized] target(s) in 53.21s
Installing /home/dante/.cargo/bin/cargo-deb
Installed package `cargo-deb v1.32.0` (executable `cargo-deb`)


Done that, you could start packaging simple applications. By default cargo deb obtains basic information from your Cargo.toml file. That way it loads next fields:

  • name
  • version
  • license
  • license-file
  • description
  • readme
  • homepage
  • repository

But seldom happens your application has no dependencies at all. To configure more advanced use cases, create a [package.metadata.deb] section in your Cargo.toml. In that section you can configure next fields:

  • maintainer
  • copyright
  • changelog
  • depends
  • recommends
  • enhances
  • conflicts
  • breaks
  • replaces
  • provides
  • extended-description
  • extended-description-file
  • section
  • priority
  • assets
  • maintainer-scripts

As a working example of this you can read this Cargo.toml version of my application Cifra.

There you can read general section from where cargo deb loads its basic information:



Cargo has a great documentation where you can find every section and tag explained.

Be aware that every file path you include in Cargo.toml is relative to Cargo.toml file.

Specific section for cargo-deb must not be long to have a working package:


Tags for this section are documented at cargo-deb homepage.

Section and priority tags are used to classify your application in Debian hierarchy. Although I've set them I think they are rather useless because official Debian repositories have higher requirement than cargo-deb can meet at the moment, so any debian package produced with cargo-deb will end in a personal repository were debian hierarchy for applications is not present.

Actually, most important tag is assets. That tag lets you set which files should be included in package and where they should be placed at installation. Format of that tag contents is straightforward. It is a list of tuples of three elements:

  • Relative path to file to be included in package: That path is relative to Cargo.toml location.
  • Absolute path to place that file in user computer.
  • Permissions for that file at user computer.

I should have included a "depends" tag to add my package dependencies. Cifra depends on SQLite3 and that is not a Rust crate but a system package, so it is a dependency of Cifra debian package. If you want to use "depends" tag you must use debian dependency format, but actually is not necessary because cargo-deb can calculate your dependencies automatically if you don't use "depends" tag. It does it using ldd against your compiled artifact and searching with dpkg  which system packages provides libraries detected by ldd.

Once you have your cargo-deb configuration in your Cargo.toml, building your debian package is as simple as:

dante@Camelot:~/Projects/cifra-rust/$ cargo deb
Finished release [optimized] target(s) in 0.17s


As you can see in the output, you will find your generated package in a new folder inside your project's called target/debian/.

Cargo deb is a wonderful tool which only downside is not being capable to meet Debian packaging policy to build packages suitable to be included in official repositories.

17 October 2021

How to write manpages with Markdown and Pandoc

No decent console application is released without a man page documenting how to use it. However, man page inners are rather arcane, but being a 1971 file format it has held up quite well.

Nowadays you have two standard ways to learn how to use a console command (google apart ;-) ): typing application command followed by "--help" to get a quick glance of application usage, or typing "man" followed by application name to get a detailed information about application usage. 

To implement "--help" approach in your application you can manually include "--help" parsing and output or you'd better use an argument parsing library like Python's ArgParse.

The "man" approach needs you to write a man page for your application.

Standard way to produce man pages is using troff formating commands. Linux has it's own troff implementation called groff. You have a feel about how is standard man pages format you can type next source inside a file called, for instance, corrupt.1:

.TH CORRUPT 1 .SH NAME corrupt \- modify files by randomly changing bits .SH SYNOPSIS .B corrupt [\fB\-n\fR \fIBITS\fR] [\fB\-\-bits\fR \fIBITS\fR] .IR file ... .SH DESCRIPTION .B corrupt modifies files by toggling a randomly chosen bit. .SH OPTIONS .TP .BR \-n ", " \-\-bits =\fIBITS\fR Set the number of bits to modify. Default is one bit.

Once saved, that file can be displayed by man. Assuming you are in the same folder than corrupt.1 you can type:

dante@Camelot:~/$ man -l corrupt.1

Output is: 

CORRUPT(1)                                                 General Commands Manual 

corrupt - modify files by randomly changing bits

corrupt [-n BITS] [--bits BITS] file...

corrupt modifies files by toggling a randomly chosen bit.

-n, --bits=BITS
Set the number of bits to modify. Default is one bit.



You can opt to write specific man pages for your application following for instance this small cheat sheet

Nevertheless, troff/groff format is rather cumbersome and I feel keeping two sources of usage documentation (your and your man page) is prone to errors and an effort waste. So, I follow a different approach: I write and keep updated my and afterwards I convert it to a man page. Sure you need to keep a quite standard format for your README, and it won't be the fanciest one, but at least you will avoid to type the same things twice, in two different files, with different formatting tags.

The key stone to make that conversion is a tool called Pandoc. That tool is the Swiss army knife of document conversion. You can use it to convert between many documents format like word (docx), openoffice (odt), epub... or markdown (md) and groff man. Pandoc uses to be available at your distribution standard package manager repositories, so at an ubuntu distribution you just need:

dante@Camelot:~/$ sudo apt install pandoc

Once Pandoc is installed, you have to observe some conventions with your to make further conversion easier. To illustrate this article explanations, we will follow as an example the file for my project Cifra.

As you can see, linked contains GitHub badges at the very beginning but those are going to be removed in conversion process we are going to explain later in this article. Everything else is structured to comply with contents a man page is supposed to have.

Maybe, most suspecting line is the first one. It's not usual to find a line like this in GitHub README:

Actually, that line is metadata for man page reader. After "%" contains document title (usually application name), manual section, a version, a "|" separator and finally a header. Manual section is 1 for user commands, 2 for system calls and 3 for C library functions. Your applications will fit in section 1 99% of times. I've not included a version for Cifra in that line, but you could have done. Besides, header indicates what set of documentation this manual page belongs to.

After that line, every section use to be included in man pages, but the only ones that should be included at least are:

  • Name: The name of the command.
  • Synopsis: A one-liner summarizing command line arguments and options.
  • Description: Describes in detail how to use your application command.

Other sections you can add are:

  • Options: Command line options.
  • Examples: About command usage.
  • Files: Useful if your application includes configuration files.
  • Environment: Here you explain if your application uses any environment variable.
  • Bugs: Detected bugs should be reported? There you can link your GitHub issues page.
  • Authors: Who is this masterpiece of code author?
  • See also: References to other man pages.
  • Copyright | License: A good place to include your application license text.

Once you have decided which sections include in manpage, you should write them following a format easily convertible by pandoc in a manpage with an standard structure. That's why main sections are always marked at level one of indentation ("#"). A problem with indentation is that while you can get further subheaders in markdown using "#" character  ("#" for main titles, "##" for subheaders, "###" for sections, and so on), those subheaders are only recognized by Pandoc up to sublevel 2. To get more sublevels I've opted for "pandoc lists", a format you can use in your markdown that is recognized afterwards by pandoc:


In lines 42 and 48, you have what pandoc call are "line blocks". Those are lines beginning with a vertical bar ( | ) followed by a space. Those spaces between vertical bar and command will be keep in man page converted text by pandoc.

Everything else in that is good classic markdown format.

Let guess we have written our entire and we want to perform conversion. To do that you can use an script like this from a temporal folder:


Lines 3 and 4 clean any file used in a previous conversion, while line 5 actually copies from source folder.

Line 6 removes every GitHub badge you could have in your 

Line 7 is where we call pandoc to perform conversion.

In that point you could have a file that can be opened with man:

dante@Camelot:~/$ man -l man/cifra.1

But man pages use to be compressed in gzip format, as you can see with your system man pages:

dante@Camelot:~/$ ls /usr/share/man/man1

That's why script compress at line 8 your generated man page. And while generated man page is now a compressed gzip file, it is still readable by man:

dante@Camelot:~/$ man -l man/cifra.1.gz

Although those are some steps, you can see they are easily scriptable both as a local script, as an step of your continuous integration workflow.

As you can see, with these easy steps you only need to keep updated your file because man page can be generated from it.

18 September 2021

How to parse console arguments in your Python application with ArgParse

There is one common pattern for every console application: it has to manage user arguments. Few console applications runs with no user arguments, instead most applications needs user provided arguments to run properly. Think of ping, it needs at least one argument: IP address or URL to be pinged:

dante@Camelot:~/$ ping
PING ( 56(84) bytes of data.
64 bytes from icmp_seq=1 ttl=113 time=3.73 ms
64 bytes from icmp_seq=2 ttl=113 time=3.83 ms
64 bytes from icmp_seq=3 ttl=113 time=3.92 ms
--- ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2005ms
rtt min/avg/max/mdev = 3.733/3.828/3.920/0.076 ms


When you run a python application you get in sys.argv list every argument user entered when calling your application from console. Console command is split using whitespaces as separator a each resulting token is a sys.argv list element. First element of that list is your application name. So, if we implemented our own python version of ping sys.argv 0 index element would be "ping" and 1 index element would be "".

You could do your own argument parsing just assessing sys.argv content by yourself. Most developers do it that way... at the begining. If you go that way you'll soon realize that it is not so easy to perform a clean management of user argument and that you have to repeat a lot of boilerplate in your applications. so, there are some libraries to easy argument parsing for you. One I like a lot is argparse.

Argparse is a built-in module of standard python distribution so you don't have to download it from Pypi. Once you understand its concepts it's easy, very flexible and it manages for you many edge use cases. It one of those modules you really miss when developing in other languages.

First concept you have to understand to use argparse is Argument Parser concept. For this module, an Argument Parser is a fixed word that is followed by positional or optional user provided arguments. Default Argument Parser is the precisely the application name. In our ping example, "ping" command is an Argument Parser. Every Argument Parser can be followed by Arguments, those can be of two types:

  • Positional arguments: They cannot be avoided. User need to enter them or command is assumed as wrong. They should be entered in an specific order. In our ping example "" is a positional argument. We can meet many other examples, "cp" command for instance needs to positional arguments source file to copy and destination for copied file.
  • Optional arguments: They can be entered or not. They are marked by tags. Abbreviated tags use an hyphen and a char, long tags use double hyphens and a word. Some optional arguments admit a value and others not (they are boolean, true if they are used or false if not).

Here you can see some of the arguments cat command accepts:

dante@Camelot:~/$ cat --help
Usage: cat [OPTION]... [FILE]...
Concatenate FILE(s) to standard output.

With no FILE, or when FILE is -, read standard input.

-A, --show-all equivalent to -vET
-b, --number-nonblank number nonempty output lines, overrides -n
-e equivalent to -vE
-E, --show-ends display $ at end of each line
-n, --number number all output lines
-s, --squeeze-blank suppress repeated empty output lines
-t equivalent to -vT
-T, --show-tabs display TAB characters as ^I
-u (ignored)
-v, --show-nonprinting use ^ and M- notation, except for LFD and TAB
--help display this help and exit
--version output version information and exit

cat f - g Output f's contents, then standard input, then g's contents.
cat Copy standard input to standard output.

GNU coreutils online help: <>
Full documentation at: <>
or available locally via: info '(coreutils) cat invocation'


There are more complex commands that include what is called Subparsers. Subparsers appear when your command have other "fixed" words, like verbs. For instance, "docker" command have many subparsers: "docker build", "docker run", "docker ps", etc. Every subparser can accept its own set of arguments. A subparser can accept other subparsers too, so you can get pretty complex command trees.

To show how to use argparse module works, I'm going to use my own configuration for my application Cifra. Take a look to it's file at GitHub.

In its __main__ section you can see how arguments are processed at high level of abstraction:

You can see we are using sys.argv to get arguments provided by user but discarding first one because is the root command itself: "cifra".

I like using sys.argv as a default parameter of main() call because that way I can call main() from my functional tests using an specific list of arguments.

Arguments are passed to parse_arguments, that is a my function to do all argparse magic. There you can find the root configuration for argparse:

There you configure root parser, the one linked to your base command, defining a description of you command and a final note (epilog) for your command. Those texts will appear when your user call your command with --help argument.

My application Cifra has some subparsers, like docker command has. To allow a parser to have subparsers you must do:

Once you have allowed your parser to have subparser you can start to create those subparsers. Cifra has 4 subparsers at this level:



We'll see that argparse returns a dict-like object after its parsing. If user called "cifra dictionary" then "mode" key from that dict-like object will have a "dictionary" value.

Every subparser can have its own subparser, actually "dictionary" subparser has more subparser adding more branches to the command tree. So you can call "cifra dictionary create", "cifra dictionary update", "cifra dictionary delete", etc.

Once you feel a parser does not need more subparsers you can add its respective arguments. For "cipher" subparser we have these ones:

In these arguments, "algorithm" and "key" are positional so they are compulsory. Arguments placed at that position will populate values for "algorithm" and "key" keys in dict-like object returned by argparse.

Metavar parameter is the string we want to be used to represent this parameter when help is called with --help argument. Help parameter as you can guess is the tooltip used to explain this parameter when --helps argument is used.

Type parameter is used to preprocess argument given by user. By default arguments are interpreted like strings, but if you use type=int argument will be converted to an int (or throw an error if it can't be done). Actually str or int are functions, you can use your own ones. For file_to_cipher parameter I used _check_is_file function:

As you can see, this function check if provided arguments points to a valid path name and if that happens returns provided string or raises an argparse error otherwise.

Remember parser optional parameters are always those preceded by hyphens. If those parameters are followed by an argument provided by users, that argument will be the value for dict-like object returned by arparser key called like optional parameter long name. In this example line 175 means that when user types "--ciphered_file myfile.txt", argparser will return a key called "ciphered_file" with value "myfile.txt".

Sometimes, your optional parameters won't need values. For example, they could be a boolean parameter to do verbose output:

parser.add_argument("--verbose", help="increase output verbosity", action="store_true", default=False)

With action="store_true" the dict-like object will have a "verbose" key with a boolean value: true if --verbose was called or false otherwise.

Before returning from parse_arguments I like to filter returned dict-like object content:

In this section parsed_arguments is the dict-like object that argparse returns after parsing provided user arguments with arg_parser.parse_args(args). This object includes those uncalled parameters with None value. I could leave it that way and check afterwards if values are None before using them, but a I fell somewhat cleaner remove those None'd values keys and just check if those keys are present or not. That's why I filter dict-like object at line 235.

Back to main() function, after parsing provided arguments you can base your further logic depending on parsed arguments dict contents. For example, for cifra:

This way, argparser lets you deal with provided user arguments in a clean way giving you for free help messages and error messages, depending on whether user called --help or entered a wrong or incomplete command. Actually generated help messages are so clean that I use them for my repository file and as a base point for my man pages (what that is an story for another article).