25 April is Penguin Day, an unusual holiday designed to draw attention to the importance of these birds. It turns out that penguins are, if somewhat indirectly, a fixture of IT administration as well. Many people will recognise Tux the penguin, the ‘mascot’ of the Linux family of systems. The term Linux refers to the kernel itself, the most important and basic component of any operating system. Its author, Linus Torvalds, published the full source code of his solution, and this openness has given rise to many different distributions. Some of them (e.g. Debian, Arch Linux, Red Hat Linux) grew very popular, became de facto standards, and went on to serve as base distributions for other systems. Specialised distributions for very specific applications (e.g. IoT), known to a rather narrow audience, have also emerged.

Penguin’s role in IT – professional uses of Linux
At SOFTIQ, we mainly implement and maintain Debian, Ubuntu and Red Hat Enterprise Linux (and derivatives such as AlmaLinux or Rocky Linux). These systems are currently the most common choice for server applications, and there is no indication that this trend will change in the coming years. The main advantages of Linux include the availability of typical server software, lower hardware requirements than Windows Server systems, and lower implementation costs for the customer – most Linux distributions are free. Where commercial vendor support is required, paid RHEL subscriptions are the usual choice.
In this article, we will present some specific applications for the Linux systems we manage.
HTTP servers
The process of a user accessing any web service is a complex issue. However, it is important to be aware that in network communication, one of the basic protocols is HTTP, which in a general sense is responsible for communication between a client (e.g. a web browser) and a web server. The best-known HTTP servers are NGINX and Apache.
The capabilities of HTTP servers go well beyond what might first be apparent. For simple static pages, the server’s job is merely to read a file from a path defined in the configuration and send its contents to the client. For more complex sites, such as online shops, news portals or even blogs, dedicated applications are needed to manage the site’s content. These applications can be written in a variety of programming languages. HTTP servers typically support PHP scripting (e.g. via mod_php in Apache or PHP-FPM with NGINX), which is commonly used in hosting services. However, more and more applications are being developed in other technologies. Phusion Passenger can be used to host them, adding support for Ruby, Node.js and Python applications. However, this is not always sufficient or the right approach.
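Returning to the simplest case: a minimal NGINX server block serving a static site might, for illustration, look like the following sketch (the domain and paths are placeholders):

    server {
        listen 80;
        server_name example.com;     # placeholder domain
        root /var/www/example;       # directory holding the static files
        index index.html;

        location / {
            # serve the requested file, or return 404 if it does not exist
            try_files $uri $uri/ =404;
        }
    }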
Because such embedded runtimes are not always enough, HTTP servers are very often used in the role of a reverse proxy. This involves forwarding traffic to services running locally on the server, or on other servers on the internal network, that cannot be accessed directly from the Internet.
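A reverse proxy can be sketched in NGINX configuration as below, assuming a backend application listening locally on port 8080 (the domain and port are example values):

    server {
        listen 80;
        server_name app.example.com;     # placeholder domain

        location / {
            # forward all requests to the application on localhost
            proxy_pass http://127.0.0.1:8080;
            # pass the original host and client address to the backend
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }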

Application servers
Some web applications require a suitable application server in order to run. This applies primarily to applications written in Java (e.g. Apache Tomcat, WildFly) and on the .NET platform.
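By way of example, deploying a Java web application packaged as a WAR to Apache Tomcat on a Debian-based server might look roughly like this (package and path names vary between releases):

    # install Tomcat from the distribution repositories (version may differ)
    sudo apt install tomcat10

    # Tomcat auto-deploys WAR files dropped into its webapps/ directory
    sudo cp myapp.war /var/lib/tomcat10/webapps/

    # restart the service and verify it is running
    sudo systemctl restart tomcat10
    sudo systemctl status tomcat10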
Databases
Examples of database systems that can be deployed on Linux servers include MySQL, PostgreSQL, MS SQL, MongoDB, and Oracle Database.
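As a simple illustration, installing PostgreSQL and preparing a database for an application on Debian/Ubuntu takes only a few commands (the role and database names are examples):

    # install the PostgreSQL server from the distribution repositories
    sudo apt install postgresql

    # create a dedicated role and database for the application
    sudo -u postgres createuser --pwprompt appuser
    sudo -u postgres createdb --owner=appuser appdb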
Storage
A less noticeable but still crucial aspect of any IT system is data storage. For small projects involving a single machine, a dedicated storage server is usually not used – application data is simply stored in an appropriate directory. It is good practice, however, to dedicate a separate disk (or even several disks combined with LVM) solely as the mount point for that data directory.
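A sketch of such a setup with LVM, assuming a spare disk visible as /dev/sdb and an application keeping its data in /srv/appdata (both are example names):

    # initialise the disk for LVM, then create a volume group and logical volume
    sudo pvcreate /dev/sdb
    sudo vgcreate vg_data /dev/sdb
    sudo lvcreate -n lv_appdata -l 100%FREE vg_data

    # create a filesystem and mount it at the data directory
    sudo mkfs.ext4 /dev/vg_data/lv_appdata
    sudo mkdir -p /srv/appdata
    sudo mount /dev/vg_data/lv_appdata /srv/appdata
    # add a matching entry to /etc/fstab so the mount survives reboots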
The issue of storage in high-availability environments looks completely different. First of all, not all applications require file-level synchronisation: data (including graphics, for which the BLOB type is used) can be stored in the database, and application parameters (database credentials, configuration settings) do not necessarily need to live in configuration files. In such cases there is simply nothing on disk to synchronise.
However, when such a need does arise, the easiest way to synchronise data between multiple application servers is NFS (Network File System). It is very easy to set up and does not take much time. Unfortunately, NFS provides no redundancy or failover (switching to a working server when unavailability is detected) – a failure of the NFS server means the application loses access to the resource, which is unacceptable in a high-availability approach.
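A minimal NFS setup really is short; assuming a server exporting /srv/appdata to an internal subnet (addresses and paths are examples), it comes down to:

    # on the NFS server (requires the nfs-kernel-server package on Debian/Ubuntu):
    # export the directory via an entry in /etc/exports
    /srv/appdata 10.0.0.0/24(rw,sync,no_subtree_check)

    # reload the export table
    sudo exportfs -ra

    # on each application server: mount the shared directory
    # (10.0.0.5 is a placeholder server address)
    sudo mount -t nfs 10.0.0.5:/srv/appdata /srv/appdata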
For this reason, GlusterFS, a distributed file system that provides data replication between nodes and automatic failover, is used. This solution is only available for Linux systems.
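Creating a replicated GlusterFS volume can be sketched as follows for two nodes (host names and brick paths are placeholders):

    # on node1: add the second node to the trusted pool
    sudo gluster peer probe node2

    # create a volume replicated across both nodes and start it
    # (in production a third node or an arbiter is recommended to avoid split-brain)
    sudo gluster volume create appdata replica 2 \
        node1:/bricks/appdata node2:/bricks/appdata
    sudo gluster volume start appdata

    # on each application server: mount the volume
    sudo mount -t glusterfs node1:/appdata /srv/appdata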
In the context of storage, it is also worth mentioning backup servers. Bareos, for example, a free tool for backing up designated directories from remote clients, can be used here with success. The Bareos server component runs only on Linux systems.
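In Bareos, what gets backed up is defined in the director configuration; a minimal FileSet resource might be sketched as follows (the path is an example):

    FileSet {
      Name = "AppData"
      Include {
        Options {
          Signature = MD5          # checksum each file
          Compression = GZIP       # compress data on the client
        }
        File = /srv/appdata        # example directory to back up
      }
    }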

Load balancing
Yet another application for machines running Linux is load balancing, a technique for distributing load across a group of servers. Various traffic-balancing algorithms are used, from basic round-robin to slightly more complex ones such as IP hash or least connections. The load balancer continuously verifies the availability of the servers in the pool and directs network traffic to the appropriate hosts.
A single load balancer may not be sufficient, as in such a configuration it constitutes a SPOF (single point of failure) – its unavailability makes the service ‘hidden’ behind it unusable. This can be addressed by clustering the load balancers with the Corosync and Pacemaker tools. As a load balancer at SOFTIQ, we mostly use the proven HAProxy solution.
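A minimal HAProxy configuration with round-robin balancing and health checks might be sketched as follows (the addresses and the health-check endpoint are assumptions):

    defaults
        mode http
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    frontend fe_web
        bind *:80
        default_backend be_app

    backend be_app
        balance roundrobin              # rotate requests across the pool
        option httpchk GET /health      # assumed health-check endpoint
        server app1 10.0.0.11:8080 check
        server app2 10.0.0.12:8080 check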
Docker
At SOFTIQ, we use Docker in the vast majority of projects to carry out software deployments. This not only enables faster delivery of subsequent releases, but also ensures a high degree of portability – the same image is used regardless of the environment. Containerisation also saves time when migrating environments between machines, as the problem of installing differing dependencies (dependency hell) disappears.
Linux can confidently be described as the standard platform for containerisation. Docker runs very stably on Linux and requires no licence fees, even in commercial applications. Nor should the base systems of the images themselves be forgotten – with a few exceptions, they are built on Linux distributions, mainly Debian and Alpine.
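As a small example of that portability, a minimal Dockerfile based on Debian might look like this (the application script is a placeholder); the resulting image is built once and run unchanged in every environment:

    FROM debian:bookworm-slim

    # install only what the application needs
    RUN apt-get update && apt-get install -y --no-install-recommends python3 \
        && rm -rf /var/lib/apt/lists/*

    COPY app.py /opt/app/app.py          # placeholder application
    CMD ["python3", "/opt/app/app.py"]

    # build and run (the same image in every environment):
    #   docker build -t myapp:1.0 .
    #   docker run -d myapp:1.0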

Elastic Stack
Elastic Stack refers to the combination of Elasticsearch, Logstash and Kibana, which together provide the ability to aggregate and browse logs collected from different services. Logs can be shipped using, for example, Filebeat (ready-made modules are available for popular log formats; otherwise you usually have to write your own parsing rules in Logstash). The data is presented in Kibana, where you can add prepared views (e.g. a log view per environment and project) and set access rights for individual users.
ELK is ideal for complex environments where, for example, several application servers are running. This means that there are logs on each machine, and often the log file for a given day is several tens of gigabytes in size. Searching for specific logs by connecting to each server and manually executing the grep command is not efficient. A deployed Elastic Stack speeds this process up considerably, as it allows logs to be filtered from a single location.
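For instance, shipping NGINX logs into the stack with Filebeat’s ready-made module takes only a few commands, assuming Filebeat is already installed and pointed at the Elasticsearch and Kibana instances:

    # enable the ready-made NGINX module
    sudo filebeat modules enable nginx

    # load index templates and the bundled Kibana dashboards
    sudo filebeat setup

    # start shipping logs
    sudo systemctl start filebeat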
Proxmox
The Debian distribution is the base system for the popular Proxmox hypervisor. It is a powerful virtualisation software – virtual machines running virtually any operating system are run on physical servers (usually connected in a cluster) with robust hardware resources. Virtualisation provides, among other things, the possibility to increase resources ‘on the fly’ (e.g. disk capacity, number of vCPUs or RAM), control their use and easily clone the environment.