23 October 2021

How to package Rust applications - DEB packages


Rust language itself is harsh. Your first contact with compiler and borrow checker uses to be traumatic until you realize they are actually to protect you against yourself. Once you understand that you begin to love that language.

But everything else apart of the language is kind, really comfortable I'd say. With cargo, compiling, testing, documenting, profiling and even publishing to crates.io (the Pypi of Rust) is a charm. Packaging is no exception of that as it is integrated with cargo assuming some configuration we're going to explain here.

To package my Rust applications into debian packages, I use cargo-deb. To install it just type:

dante@Camelot:~~/Projects/cifra-rust/$ cargo install cargo-deb
Updating crates.io index
Downloaded cargo-deb v1.32.0
Downloaded 1 crate (63.2 KB) in 0.36s
Installing cargo-deb v1.32.0
Downloaded crc v1.8.1
Downloaded build_const v0.2.2
[...]
Compiling crossbeam-deque v0.8.1
Compiling xz2 v0.1.6
Compiling toml v0.5.8
Compiling cargo_toml v0.10.1
Finished release [optimized] target(s) in 53.21s
Installing /home/dante/.cargo/bin/cargo-deb
Installed package `cargo-deb v1.32.0` (executable `cargo-deb`)


dante@Camelot:~$

Done that, you could start packaging simple applications. By default cargo deb obtains basic information from your Cargo.toml file. That way it loads next fields:

  • name
  • version
  • license
  • license-file
  • description
  • readme
  • homepage
  • repository

But seldom happens your application has no dependencies at all. To configure more advanced use cases, create a [package.metadata.deb] section in your Cargo.toml. In that section you can configure next fields:

  • maintainer
  • copyright
  • changelog
  • depends
  • recommends
  • enhances
  • conflicts
  • breaks
  • replaces
  • provides
  • extended-description
  • extended-description-file
  • section
  • priority
  • assets
  • maintainer-scripts

As a working example of this you can read this Cargo.toml version of my application Cifra.

There you can read general section from where cargo deb loads its basic information:

 

 

Cargo has a great documentation where you can find every section and tag explained.

Be aware that every file path you include in Cargo.toml is relative to Cargo.toml file.

Specific section for cargo-deb must not be long to have a working package:


 

Tags for this section are documented at cargo-deb homepage.

Section and priority tags are used to classify your application in Debian hierarchy. Although I've set them I think they are rather useless because official Debian repositories have higher requirement than cargo-deb can meet at the moment, so any debian package produced with cargo-deb will end in a personal repository were debian hierarchy for applications is not present.

Actually, most important tag is assets. That tag lets you set which files should be included in package and where they should be placed at installation. Format of that tag contents is straightforward. It is a list of tuples of three elements:

  • Relative path to file to be included in package: That path is relative to Cargo.toml location.
  • Absolute path to place that file in user computer.
  • Permissions for that file at user computer.

I should have included a "depends" tag to add my package dependencies. Cifra depends on SQLite3 and that is not a Rust crate but a system package, so it is a dependency of Cifra debian package. If you want to use "depends" tag you must use debian dependency format, but actually is not necessary because cargo-deb can calculate your dependencies automatically if you don't use "depends" tag. It does it using ldd against your compiled artifact and searching with dpkg  which system packages provides libraries detected by ldd.

Once you have your cargo-deb configuration in your Cargo.toml, building your debian package is as simple as:

dante@Camelot:~/Projects/cifra-rust/$ cargo deb
[...]
Finished release [optimized] target(s) in 0.17s
/home/dante/Projects/cifra-rust/target/debian/cifra_0.9.1_amd64.deb

dante@Camelot:~$

As you can see in the output, you will find your generated package in a new folder inside your project's called target/debian/.

Cargo deb is a wonderful tool which only downside is not being capable to meet Debian packaging policy to build packages suitable to be included in official repositories.

17 October 2021

How to write manpages with Markdown and Pandoc


No decent console application is released without a man page documenting how to use it. However, man page inners are rather arcane, but being a 1971 file format it has held up quite well.

Nowadays you have two standard ways to learn how to use a console command (google apart ;-) ): typing application command followed by "--help" to get a quick glance of application usage, or typing "man" followed by application name to get a detailed information about application usage. 

To implement "--help" approach in your application you can manually include "--help" parsing and output or you'd better use an argument parsing library like Python's ArgParse.

The "man" approach needs you to write a man page for your application.

Standard way to produce man pages is using troff formating commands. Linux has it's own troff implementation called groff. You have a feel about how is standard man pages format you can type next source inside a file called, for instance, corrupt.1:

.TH CORRUPT 1 .SH NAME corrupt \- modify files by randomly changing bits .SH SYNOPSIS .B corrupt [\fB\-n\fR \fIBITS\fR] [\fB\-\-bits\fR \fIBITS\fR] .IR file ... .SH DESCRIPTION .B corrupt modifies files by toggling a randomly chosen bit. .SH OPTIONS .TP .BR \-n ", " \-\-bits =\fIBITS\fR Set the number of bits to modify. Default is one bit.

Once saved, that file can be displayed by man. Assuming you are in the same folder than corrupt.1 you can type:

dante@Camelot:~/$ man -l corrupt.1

Output is: 

CORRUPT(1)                                                 General Commands Manual 

NAME
corrupt - modify files by randomly changing bits

SYNOPSIS
corrupt [-n BITS] [--bits BITS] file...

DESCRIPTION
corrupt modifies files by toggling a randomly chosen bit.

OPTIONS
-n, --bits=BITS
Set the number of bits to modify. Default is one bit.

CORRUPT(1)


 

You can opt to write specific man pages for your application following for instance this small cheat sheet

Nevertheless, troff/groff format is rather cumbersome and I feel keeping two sources of usage documentation (your README.md and your man page) is prone to errors and an effort waste. So, I follow a different approach: I write and keep updated my README.md and afterwards I convert it to a man page. Sure you need to keep a quite standard format for your README, and it won't be the fanciest one, but at least you will avoid to type the same things twice, in two different files, with different formatting tags.

The key stone to make that conversion is a tool called Pandoc. That tool is the Swiss army knife of document conversion. You can use it to convert between many documents format like word (docx), openoffice (odt), epub... or markdown (md) and groff man. Pandoc uses to be available at your distribution standard package manager repositories, so at an ubuntu distribution you just need:

dante@Camelot:~/$ sudo apt install pandoc

Once Pandoc is installed, you have to observe some conventions with your README.md to make further conversion easier. To illustrate this article explanations, we will follow as an example the file README.md for my project Cifra.

As you can see, linked README.md contains GitHub badges at the very beginning but those are going to be removed in conversion process we are going to explain later in this article. Everything else is structured to comply with contents a man page is supposed to have.

Maybe, most suspecting line is the first one. It's not usual to find a line like this in GitHub README:

Actually, that line is metadata for man page reader. After "%" contains document title (usually application name), manual section, a version, a "|" separator and finally a header. Manual section is 1 for user commands, 2 for system calls and 3 for C library functions. Your applications will fit in section 1 99% of times. I've not included a version for Cifra in that line, but you could have done. Besides, header indicates what set of documentation this manual page belongs to.

After that line, every section use to be included in man pages, but the only ones that should be included at least are:

  • Name: The name of the command.
  • Synopsis: A one-liner summarizing command line arguments and options.
  • Description: Describes in detail how to use your application command.

Other sections you can add are:

  • Options: Command line options.
  • Examples: About command usage.
  • Files: Useful if your application includes configuration files.
  • Environment: Here you explain if your application uses any environment variable.
  • Bugs: Detected bugs should be reported? There you can link your GitHub issues page.
  • Authors: Who is this masterpiece of code author?
  • See also: References to other man pages.
  • Copyright | License: A good place to include your application license text.

Once you have decided which sections include in manpage, you should write them following a format easily convertible by pandoc in a manpage with an standard structure. That's why main sections are always marked at level one of indentation ("#"). A problem with indentation is that while you can get further subheaders in markdown using "#" character  ("#" for main titles, "##" for subheaders, "###" for sections, and so on), those subheaders are only recognized by Pandoc up to sublevel 2. To get more sublevels I've opted for "pandoc lists", a format you can use in your markdown that is recognized afterwards by pandoc:

 

In lines 42 and 48, you have what pandoc call are "line blocks". Those are lines beginning with a vertical bar ( | ) followed by a space. Those spaces between vertical bar and command will be keep in man page converted text by pandoc.

Everything else in that README.md is good classic markdown format.

Let guess we have written our entire README.md and we want to perform conversion. To do that you can use an script like this from a temporal folder:

 

Lines 3 and 4 clean any file used in a previous conversion, while line 5 actually copies README.md from source folder.

Line 6 removes every GitHub badge you could have in your README.md. 

Line 7 is where we call pandoc to perform conversion.

In that point you could have a file that can be opened with man:

dante@Camelot:~/$ man -l man/cifra.1

But man pages use to be compressed in gzip format, as you can see with your system man pages:

dante@Camelot:~/$ ls /usr/share/man/man1

That's why script compress at line 8 your generated man page. And while generated man page is now a compressed gzip file, it is still readable by man:

dante@Camelot:~/$ man -l man/cifra.1.gz

Although those are some steps, you can see they are easily scriptable both as a local script, as an step of your continuous integration workflow.

As you can see, with these easy steps you only need to keep updated your README.md file because man page can be generated from it.