Excavation: ERPNext Bench Start
Let's find out what happens when we run the command 'bench start'

Published: Aug 09, 2019

Introduction: How does ‘bench start’ work?

As an ERPNext enthusiast, one of the first commands you learn is ‘bench start’. You enter this command in your terminal, and a bunch of cryptic text scrolls up the screen. Then you can log in to your ERPNext website (the default URL is http://localhost:8000).

Great. But what does ‘bench start’ actually do?

This article will reveal the secrets, by tracing the Frappe code. When you’re done reading, you should understand more about what happens when you ‘bench start’. You will also learn more about the building blocks of ERPNext.

What is ‘bench’ anyway?

Bench is a Python application with a Command Line Interface (CLI). Whenever you type ‘bench start’ in a terminal:

$ bench start
  1. You are launching a Python app named ‘bench’
  2. You are passing it a command-line argument: ‘start’

Of course, there are many other Bench commands besides ‘start’. You can find many of them here: Bench Commands Cheatsheet.

Okay, but how is Bench a command line interface? What makes it behave like that?

Well, Bench is using an external Python package to help. It’s called Click, which is short for Command Line Interface Creation Kit. The Click library helps Python developers easily create CLI programs. It solves problems like parsing arguments and handling optional parameters.

Cool! Knowing this new information, we can locate the Python code for ‘bench start’, and learn more about what it does.

The power of ‘grep’

Whenever I want to search unfamiliar code, I turn to grep. It’s an awesome command-line tool for searching. First, I need to navigate to my Bench directory/folder:

$ cd ~/erpnext/development/bench-repo/bench

This ‘bench’ subdirectory contains all Bench source code. Somewhere in there should be the code for handling ‘bench start’. I’ll use grep to search for any Python files that contain the word ‘start’:

$ grep -rnw . --include='*.py' -e 'start'

There are several results. But not too many:

./tests/test_setup_production.py:25:		# test after start of both benches
./tests/test_setup_production.py:139:			print ("Waiting for all processes to start...")
./commands/setup.py:143:	"Setup Procfile for bench start"
./commands/__init__.py:48:from bench.commands.utils import (start, restart, set_nginx_port, set_ssl_certificate, set_ssl_certificate_key, set_url_root,
./commands/__init__.py:51:bench_command.add_command(start)
./commands/utils.py:5:@click.command('start')
./commands/utils.py:8:def start(no_dev, concurrency):
./commands/utils.py:10:	from bench.utils import start
./commands/utils.py:11:	start(no_dev=no_dev, concurrency=concurrency)
./commands/make.py:46:	"start a new app"
./utils.py:320:def start(no_dev=False, concurrency=None):
./utils.py:328:	command = [program, 'start']
./utils.py:417:	exec_cmd('sudo systemctl start -- $(systemctl show -p Requires {bench_name}.target | cut -d= -f2)'.format(bench_name=bench_name))
./config/lets_encrypt.py:62:		service('nginx', 'start')
./config/lets_encrypt.py:80:	service('nginx', 'start')
./config/lets_encrypt.py:84:	job_command = 'sudo service nginx stop && /opt/certbot-auto renew && sudo service nginx start'
./config/lets_encrypt.py:118:	service('nginx', 'start')

I think we can safely ignore the files under the ‘tests’ directory. The ‘config/lets_encrypt.py’ file contains code for TLS (SSL) certificates. We can ignore that too.

Ah, but the sixth result looks promising. This line introduces a Click command named ‘start’. That is exactly what we need! The code is in a file named ‘../bench-repo/bench/commands/utils.py’.
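If grep isn't available on your machine, roughly the same search can be sketched in Python. Treat this as an approximation of ‘grep -rnw’ with an ‘--include’ filter, not a replacement: the function name grep_word is my own invention, and the regular expression word boundary is close to, but not identical to, grep's whole-word rule.

```python
import os
import re


def grep_word(root, word, suffix='.py'):
    """Yield (path, line_number, line) for whole-word matches under root.

    Hypothetical helper: approximates `grep -rnw root --include='*.py' -e word`.
    """
    pattern = re.compile(r'\b{}\b'.format(re.escape(word)))
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Only look inside files with the requested extension.
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding='utf-8', errors='replace') as handle:
                for number, line in enumerate(handle, start=1):
                    if pattern.search(line):
                        yield path, number, line.rstrip('\n')
```

Calling list(grep_word('.', 'start')) from the ‘bench’ subdirectory should produce a list comparable to the grep output, with words such as ‘restart’ excluded.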

bench/commands/utils.py

Excellent. Our journey to understand ‘bench start’ is going well. Let’s examine this Click command code:

import click
import sys, os, copy


@click.command('start')
@click.option('--no-dev', is_flag=True, default=False)
@click.option('--concurrency', '-c', type=str)
def start(no_dev, concurrency):
	"Start Frappe development processes"
	from bench.utils import start
	start(no_dev=no_dev, concurrency=concurrency)

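So, the Click command for ‘start’ accepts 2 options: ‘--no-dev’ and ‘--concurrency’ (or ‘-c’). The string on the first line of the function (“Start Frappe development processes”) is a docstring. Click uses it as the help text for ‘bench start --help’; nothing is printed when the command runs. The function body then calls a function named “start” from bench.utils. Okay. To learn more, we must examine that file…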
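To see how Click wires this together, here is a minimal sketch you can run outside of Bench. The group named cli is a hypothetical stand-in for Bench's real top-level command, and the command body only echoes its options instead of starting anything. It assumes the Click package is installed.

```python
import click
from click.testing import CliRunner


@click.group()
def cli():
    """Hypothetical stand-in for the top-level 'bench' command group."""


@click.command('start')
@click.option('--no-dev', is_flag=True, default=False)
@click.option('--concurrency', '-c', type=str)
def start(no_dev, concurrency):
    "Start Frappe development processes"
    # Echo the parsed options rather than launching any processes.
    click.echo('no_dev={} concurrency={}'.format(no_dev, concurrency))


# Same registration call that appeared in the grep results.
cli.add_command(start)

if __name__ == '__main__':
    runner = CliRunner()
    # Simulates typing: cli start -c web=1
    print(runner.invoke(cli, ['start', '-c', 'web=1']).output)
    # Simulates typing: cli start --help (the docstring shows up here)
    print(runner.invoke(cli, ['start', '--help']).output)
```

CliRunner is Click's own testing helper. It lets us invoke the command in-process, which is handy for poking at a CLI without touching a real bench.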

bench/utils.py

By searching for the word “start”, we can find this function on line 320, which matches the grep results above.

(Note: This article was written for ERPNext version 12.0. Your line numbers may be slightly different)

def get_program(programs):
	program = None
	for p in programs:
		program = find_executable(p)
		if program:
			break
	return program

def get_process_manager():
	return get_program(['foreman', 'forego', 'honcho'])

def start(no_dev=False, concurrency=None):
	program = get_process_manager()
	if not program:
		raise Exception("No process manager found")
	os.environ['PYTHONUNBUFFERED'] = "true"
	if not no_dev:
		os.environ['DEV_SERVER'] = "true"

	command = [program, 'start']
	if concurrency:
		command.extend(['-c', concurrency])

	os.execv(program, command)

What is this code doing?

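  • First, it tries to find an executable program named “*foreman*”
    • If foreman cannot be found, it looks for “*forego*”
    • If forego cannot be found, it looks for “*honcho*”
    • If nothing is found, an exception is raised.
  • Next, it sets the environment variable PYTHONUNBUFFERED, and also DEV_SERVER unless ‘--no-dev’ was passed.
  • Finally, it executes the program that was found.
    • It passes the argument “start”, plus ‘-c’ and the concurrency value if that option was given.
    • Because os.execv is used, the process manager replaces the Bench process instead of running as a child of it.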
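The steps above can be sketched as a few small functions that are safe to experiment with, because they only build the command and environment and never call os.execv. This is my own rearrangement, not Bench's code: the function names are hypothetical, and I've swapped in shutil.which from the standard library as the lookup where Bench calls find_executable.

```python
import os
import shutil


def find_process_manager(candidates=('foreman', 'forego', 'honcho'),
                         which=shutil.which):
    """Return the path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


def build_start_command(program, concurrency=None):
    """Build the argument list that would be handed to os.execv."""
    command = [program, 'start']
    if concurrency:
        command.extend(['-c', concurrency])
    return command


def build_environment(no_dev=False, base=None):
    """Return a copy of the environment with the variables 'start' sets."""
    env = dict(os.environ if base is None else base)
    env['PYTHONUNBUFFERED'] = 'true'
    if not no_dev:
        env['DEV_SERVER'] = 'true'
    return env


if __name__ == '__main__':
    program = find_process_manager()
    if program:
        print(build_start_command(program))
    else:
        print('No process manager found')
```

The ‘which’ parameter exists only so the lookup can be replaced with a fake during experiments. On a machine like mine, the script prints a command list that begins with the path to honcho.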

On my server (Ubuntu 18.10 LTS), my Bench found and called the “honcho” program.

What is Honcho? It is a Python clone of Foreman.

What is Foreman? It is a program to “*manage Procfile-based applications.*”[^1]

Confused? I was too. But there is a simple explanation. Let’s talk about Procfile.

Procfile

A Procfile is just an ordinary text file. The file name is always “Procfile”, spelled without any extension. The format was designed for the Heroku cloud platform. The Procfile contains a list of processes we want to run, and the commands to run them.[^2]

<process type>: <command>

Here is a short example. Perhaps we are writing a program. Before our program begins, we need to start 2 different processes.

web: bundle exec thin start
job: bundle exec rake jobs:work
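To make the format concrete, here is a small sketch of how a program might read those lines into a mapping of process type to command. The function name parse_procfile and the exact regular expression are my own assumptions for illustration. Real process managers may be stricter or more lenient about what they accept.

```python
import re

# Assumed pattern: a name made of letters, digits, '_' or '-', then a colon,
# then the command. Lines that do not fit (blank lines, comments) are skipped.
PROCFILE_LINE = re.compile(r'^([A-Za-z0-9_-]+):\s*(.+)$')


def parse_procfile(text):
    """Return a dict mapping each process type to its command string."""
    processes = {}
    for line in text.splitlines():
        match = PROCFILE_LINE.match(line.strip())
        if match:
            processes[match.group(1)] = match.group(2)
    return processes


if __name__ == '__main__':
    example = "web: bundle exec thin start\njob: bundle exec rake jobs:work\n"
    for name, command in parse_procfile(example).items():
        print('{:<5} -> {}'.format(name, command))
```

Only the first colon separates the name from the command, so a command that itself contains a colon (like ‘jobs:work’) survives intact.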



So honcho reads 2 files from the current directory (the Procfile and, if it exists, a .env file of environment variables) and starts those processes. Where is the Procfile located in our ERPNext environments?

../frappe-bench

*(This is one reason your working directory must be “frappe-bench” when using Bench commands. Bench expects a certain directory and file structure, such as a file named ‘Procfile’ in the current working directory.)*

The Bench Procfile

Now that we understand more about Procfile, what’s inside the Frappe Procfile? Ten processes:

redis_cache: redis-server config/redis_cache.conf
redis_socketio: redis-server config/redis_socketio.conf
redis_queue: redis-server config/redis_queue.conf
web: bench serve --port 8000

socketio: /usr/local/bin/node apps/frappe/socketio.js

watch: bench watch

schedule: bench schedule
worker_short: bench worker --queue short --quiet
worker_long: bench worker --queue long --quiet
worker_default: bench worker --queue default --quiet

We’ll now address these one at a time, and figure out what’s happening in each:

1. redis_cache

redis_cache: redis-server config/redis_cache.conf

This is the redis cache database “redis_cache.rdb”, running on port 13000.

2. redis_socketio

redis_socketio: redis-server config/redis_socketio.conf

According to the documentation, it’s “used as a pub/sub between web and socketio processes for realtime communication.”

I suspect this drives the “chat” application inside ERPNext. It’s running on port 12000.

3. redis_queue

redis_queue: redis-server config/redis_queue.conf

This is the redis queue database “redis_queue.rdb”, on port 11000.
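A quick way to confirm that ‘bench start’ really launched these three Redis servers is to try connecting to their ports. The sketch below uses only the Python standard library. The port numbers are the ones mentioned above, and the helper name port_is_open is my own. A successful TCP connection only shows that something is listening, not that it is Redis.

```python
import socket

# Ports as described in this article; your bench may be configured differently.
REDIS_PORTS = {
    'redis_cache': 13000,
    'redis_socketio': 12000,
    'redis_queue': 11000,
}


def port_is_open(port, host='127.0.0.1', timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == '__main__':
    for name, port in REDIS_PORTS.items():
        state = 'listening' if port_is_open(port) else 'not reachable'
        print('{:<15} {:>5}  {}'.format(name, port, state))
```

Run it once before ‘bench start’ and once after, and the difference should be obvious.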

4. web

web: bench serve --port 8000

This calls an entirely new Bench function, which starts a web server. What happens there is out-of-scope for this article. But the result is a Werkzeug web server, accessible on TCP port 8000.

5. socketio

socketio: /usr/local/bin/node apps/frappe/socketio.js

This is a Node Express socket.io server. It’s possible the only purpose is for Real Time Chat.

This is running on port 9000, because of common_site_config.json. The code in apps/frappe/node_utils.js falls back to a default port of 3000, but the value specified in common_site_config.json overrides that default and changes the socket port to 9000.
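The "default, unless the config file says otherwise" pattern is easy to sketch. The real logic lives in JavaScript inside node_utils.js. What follows is only a Python illustration of the idea, and the key name ‘socketio_port’ is my assumption about what common_site_config.json contains.

```python
import json

DEFAULT_SOCKETIO_PORT = 3000  # fallback described in the text above


def resolve_socketio_port(config_text):
    """Return the configured port, or the default when the key is absent.

    Assumes the config is a JSON object with an optional 'socketio_port' key.
    """
    config = json.loads(config_text)
    return config.get('socketio_port', DEFAULT_SOCKETIO_PORT)


if __name__ == '__main__':
    print(resolve_socketio_port('{"socketio_port": 9000}'))
    print(resolve_socketio_port('{}'))
```

The first call returns the overridden value and the second falls back to the default, which mirrors why my socket.io server ended up on port 9000.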

6. watch

“Node server for bundling JS/CSS assets using Rollup. It will also rebuild files as they change.”

7. schedule

“Job Scheduler using Python RQ.”[^3]

8. workers

There are 3 worker processes, one for each queue: short, long, and default. “worker_long” has a timeout of 1500 seconds (25 minutes). The other two have 300-second timeouts (5 minutes).

Conclusion and Summary


Brian Pond
https://pondconsulting.net
ERPNext Consultant & Full Stack Developer