myshell 2.0.0
This project is a custom command-line interpreter that provides a user interface for executing commands. The shell is the primary way of interacting with the operating system, allowing users to execute commands and manage files and directories.
The documentation for the whole project is available here:
This also includes documentation for external utilities shipped with the shell. Feel free to read it, as it may be useful for understanding the project and its components.
The project is built using the C++20 standard.
Note
Tested on GCC 13.2.1 and Clang 17.0.3.
Besides, it heavily relies on the following libraries:
Note
For installation instructions, please refer to the official documentation.
Use CMakeLists.txt to compile the project:
Warning
The project contains multiple targets. The main target is myshell. Don't forget to also compile the external utilities shipped with the shell if you are using an IDE for compilation.
Currently, the following external utilities are supported:
mycat
myrls
The usage of the shell is similar to that of other shells, such as bash or zsh. If you are familiar with these shells, you should have no trouble using myshell.
The external commands are compiled separately and can be invoked directly from the shell, much like built-in commands. They are recognized automatically by the build system. All external commands are placed in the {CMAKE_BINARY_DIR}/msh/bin/external directory that is added to the PATH environment variable on startup. This ensures that myshell can always find them.
All processing is done by the ExternalPrograms.cmake script.
For exact instructions on how to add external commands, please refer to the README.md file in the external directory.
myshell supports command history. Its path is predefined by the build system and is set to {CMAKE_BINARY_DIR}/msh/.msh_history. This keeps the history file inside the build directory, so it does not clutter the examiner's computer, and it makes the history persistent between runs of the shell.
If you want to change the path to the history file, consider changing the MSH_HISTORY_PATH variable in the CMakeLists.txt file to your desired path.
All mistakes and bugs from myshell 1 were fixed. The shell is now fully functional and supports all features from the main task.
Besides, here is a detailed description of the new features that are implemented in myshell 2:
myshell supports the following redirections:
n>word - Open file word for writing on file descriptor n. If n is omitted, it defaults to 1.
n>>word - Open file word for appending on file descriptor n. If n is omitted, it defaults to 1.
n<word - Open file word for reading on file descriptor n. If n is omitted, it defaults to 0.
&>word - Redirect both standard output and standard error to file word. Equivalent to >word 2>&1 and >&word.
&>>word - Append both standard output and standard error to file word. Semantically equivalent to >>word 2>&1.
n<&word - The file descriptor n is made to be a copy of the descriptor specified by word. If word doesn't specify a descriptor, the redirection is ill-formed due to ambiguity. If n is not specified, the standard input (file descriptor 0) is used. If the descriptor specified by word is not valid, a redirection error occurs.
n>&word - Used for duplicating output file descriptors. If n is not specified, the standard output (file descriptor 1) is used. If word doesn't specify a descriptor, it is interpreted as a filename to open. If the file descriptor specified by word is not valid, a redirection error occurs. If n is omitted and word does not specify a file descriptor, the redirection is equivalent to &>word.
myshell supports the following connections:
Sequential execution:
The commands are executed sequentially, one after another. The shell waits for each command to finish before executing the next one.
The processing is done in a left-to-right manner, i.e., the following command:
is equivalent to:
and will print Hello, world to the standard output.
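As a separate illustration (not the example referred to above), here is a minimal sketch of sequential execution. It assumes that myshell uses ; as the separator the way bash does, and it is written in plain POSIX sh syntax so it can be tried in any shell. The variable name greeting is made up for the sketch.

```shell
# Two commands separated by ';' run one after another, left to right.
# The second command starts only after the first one has finished.
printf 'Hello, '; printf 'world\n'

# Capture the combined output to show that the ordering is preserved.
greeting=$(printf 'Hello, '; printf 'world\n')
echo "$greeting"
```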
Pipelines:
The commands are executed in a pipeline. The standard output of each command is connected via a pipe to the standard input of the next command. Connection is performed before any redirections specified by command.
The shell waits for all commands to finish. However, the commands are executed asynchronously, and the order of execution is undefined.
The exit code (msh_errno) after the pipeline is exactly the exit code of the last command in the pipeline.
Note
Piping both stdout and stderr:
To achieve this, you can use the |& control operator. It is equivalent to 2>&1 |.
For example, command1 |& command2 will pipe both stdout and stderr of command1 to command2. This implicit redirection is performed after any explicit redirections of command1.
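The pipeline rules above can be sketched with a few standard utilities. This is an illustration in POSIX sh syntax, assuming myshell behaves as described; the explicit 2>&1 | spelling is used instead of |& because the text states they are equivalent, and $? stands in for the exit code that myshell calls msh_errno. All variable names are hypothetical.

```shell
# stdout of each command feeds stdin of the next one.
# 'tr -d' strips the padding that some wc implementations add.
count=$(printf 'b\na\nb\n' | sort | uniq | wc -l | tr -d ' ')
echo "unique lines: $count"

# The status of a pipeline is the status of its last command.
false | true
last_ok=$?
true | false || last_fail=$?
echo "statuses: $last_ok $last_fail"

# Merge stderr into stdout, then pipe: both lines reach wc.
both=$( { echo out; echo err >&2; } 2>&1 | wc -l | tr -d ' ' )
echo "lines through the pipe: $both"
```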
Background execution:
The commands that are terminated with & are executed asynchronously in the background. The shell doesn't wait for them to finish and immediately returns control to the user.
The exit code (msh_errno) after the background execution is always 0.
You can see the list of background processes using the mjobs built-in command. Find more about job control in the Job Control section.
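A small sketch of the background behavior, written in POSIX sh syntax. Here $?, $! and wait are POSIX sh features used to make the statuses visible; whether myshell exposes them under these names is an assumption (the text above only names msh_errno and mjobs).

```shell
# Start a command that will fail, but run it in the background.
false &
launch_status=$?     # status right after launching: 0, whatever the job does later
job_pid=$!           # process ID of the background job

# Collect the job's own exit status once it has finished.
wait "$job_pid" || job_status=$?

echo "launch: $launch_status, job: $job_status"
```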
&& - The shell executes the command command2 if and only if the command command1 returns an exit status of zero.
|| - The shell executes the command command2 if and only if the command command1 returns a non-zero exit status.
myshell supports command substitution. It allows the output of a command to be used in place of the command itself:
The command substitution is performed by enclosing the command in backticks or $(). Execution of the command is performed in a subshell.
Any trailing newlines are removed from the output of the command substitution. The output of the command substitution is subject to word splitting and filename expansion. Embedded newlines are not deleted, but they may be removed during word splitting if IFS contains the newline character.
To prevent word splitting or filename expansion, enclose the command substitution in double quotation marks:
Command substitutions may be nested:
will print Hello, world! to the standard output.
Command substitution is not performed inside single quotation marks:
will print Hello, $(whoami)!.
Quotes may appear inside the command substitution, so the following commands are also supported:
and will print Mixed "Quotes" support in $SHELL and ) respectively.
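The rules in this section can be summarized in one sketch. It is written in POSIX sh syntax and assumes that myshell follows the behavior described above; set -- and $# are used only to count the resulting words, and all variable names are made up for the illustration.

```shell
# Trailing newlines are removed from the substituted output.
stripped=$(printf 'Hello\n\n\n')

# Unquoted: the result is split into words on the default IFS.
set -- $(printf 'one two\nthree')
unquoted_words=$#

# Quoted: the result stays a single word, embedded newline included.
set -- "$(printf 'one two\nthree')"
quoted_words=$#

# Nested substitution: the inner command runs first.
nested=$(echo "Hello, $(echo world)!")

# Inside single quotes no substitution is performed.
literal='Hello, $(whoami)!'

echo "$stripped | $unquoted_words | $quoted_words | $nested | $literal"
```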
myshell supports basic job control, implemented for illustrative purposes. It allows users to manage background processes.
When background processes are executed, the shell prints the following message:
where # is the job number and PROCESS_ID is the process ID of the background process.
When processes are terminated, the shell prints the following message on the next user input:
Also, to list all background processes that are currently running or just finished their execution, you can use the mjobs built-in command:
Output example:
Note
The other important role of the job control is to handle SIGCHLD signals. This helps to avoid zombie processes in the middle of the pipeline during execution. Without this, the shell would wait for each process only after the whole pipeline has been executed. This is not critical for the shell, but it is not the expected behavior either.
Job control is implemented in the msh_jobs.cpp file and is planned to be improved in the future to support more advanced features such as process groups and job control signals.
Tokens are the basic building blocks of myshell. All internal operations are performed on tokens. Tokens can have different flags, which determine how they are processed by the shell.
You can find the token and token types definitions in the msh_token.h file. Flags are defined in the msh_internal.cpp file.
The lexer/tokenizer is a crucial component of myshell. Basically, it is a simple state machine that tokenizes the user input.
It takes the raw input string from the user and breaks it down into individual tokens. These tokens can represent commands, words, variable definitions, shell metacharacters etc. By tokenizing the input, myshell can understand and act upon user commands effectively.
The lexer/tokenizer is implemented in the msh_parser.cpp file.
myshell pipeline is implemented using the following components:
Setup
When myshell starts, it initializes essential configurations:
Main Loop
myshell enters its main loop, continuously awaiting user input. Within this loop, the following operations are performed:
Cleanup
When the user exits the shell, it saves the command history to the history file and performs other necessary cleanup operations.
This section describes the process of building the command tree in more detail.
Consider the following command:
The command tree for this command is shown below:

The root node of the tree is always a command. command can either hold a pointer to a simple_command or a connection_command. connection_command represents a connection between two commands. It can be either a pipeline, a sequential execution, a background execution, or a logical operator. It holds a type of connection and objects of type command that are connected by this connection - rhs and lhs.
Note that leaf nodes of the tree are always simple_commands. simple_command is the execution unit of the shell. It holds the command arguments, redirections, and other information necessary for execution.
The command tree is built iteratively by the split_commands() function located in the msh_utils.cpp file. You can read more about it and other shell functions in the documentation provided above.
The execution starts from the root node of the command tree.
Each type of command has its own execution function that implements the corresponding logic. Feel free to read the documentation for msh_command.h to learn more about them.
As stated above, the elementary execution unit of the shell is a simple_command. They are executed by the msh_exec_simple() function located in the msh_exec.cpp file. The execution of each node is performed recursively in a post-order manner. For better understanding, here is an illustration of the execution order for the command tree shown above:

Before executing a simple command, if it's located within a connection command, the execution function of the latter is responsible for performing the necessary operations, such as setting up pipes, proper execution flags, etc.
Execution of each simple command is performed in the following steps:
These steps are performed by shell utilities located in the msh_utils.cpp file.
Note
The result of variable expansion and command substitution is subject to word splitting and filename expansion. To prevent this, enclose the variable in double quotation marks.
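The effect described in the note can be seen in a short sketch. It uses POSIX sh syntax and assumes that myshell splits and expands words the way the note describes; set -- and $# serve only to count the resulting words, and the variable names are hypothetical.

```shell
name='my file.txt'

# Unquoted expansion: the value is split on the space into two words.
set -- $name
unquoted_count=$#

# Quoted expansion: the value is passed on as a single word.
set -- "$name"
quoted_count=$#

# Quoting also keeps a wildcard character literal instead of matching files.
pattern='*'
set -- "$pattern"
quoted_pattern=$1

echo "$unquoted_count $quoted_count $quoted_pattern"
```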
myshell spawns a child process using the fork system call and then executes the command in the child process via execve/execvpe.
myshell can also execute script files, treating them as sequences of commands.
Note
When myshell is executed with an argument, it treats the argument as a script file. The shell terminates after executing the script file.
Wildcard Expansion:
Suggestion
is ignored. Wildcard expansion is performed on the entire path, not just the last element.
Variable Expansion:
If a variable is not defined, it is treated as an empty string. This behavior is consistent with other shells, although it is not specified in the task.
msource Built-in Command:
This command is a synonym for the . command and operates identically to it.
Tilde Expansion:
myshell supports tilde expansion. It expands ~ to the user's home directory.
Default Prompt:
The default prompt is changed from the \w \$ specified in the main task to a powerlevel10k-like prompt to demonstrate the flexibility of the prompt customization mechanism.
Variable Declarations:
Word splitting and filename expansion are not performed on the right-hand side of the declaration.
Assignment statements of the form key=value may also appear as arguments to the malias and mexport built-in commands. Other than that, variable declarations are treated as regular arguments.
Double Quotation Marks Handling: The shell supports the use of double quotation marks for processing file names and arguments with spaces.
The behavior of double quotation marks is similar to that of other shells, such as bash or zsh.
You can use double quotation marks inside double quotation marks by escaping them with a backslash.
Note
Wildcard substitution inside double quotes is disabled by default. To enable it, set the ENABLE_DOUBLE_QUOTE_WILDCARD_SUBSTITUTION flag to ON in the CMakeLists.txt file.
This decision was made because such behavior is unexpected in most shells: neither bash nor zsh performs wildcard substitution in double quotes.
Single Quotation Marks: Single quotation marks function similarly to double quotes, but no variable or wildcard substitution occurs inside them.
The one exception is that single quotation marks can't appear inside single quotation marks even if they are escaped.
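A brief sketch of the difference between the two kinds of quotes, in POSIX sh syntax and assuming myshell matches the behavior described above. The variable names are made up for the illustration.

```shell
user_dir='my dir'

# Double quotes: the variable is expanded and the space is preserved.
double="path: $user_dir"

# A double quote inside double quotes must be escaped with a backslash.
escaped="say \"hi\""

# Single quotes: nothing is expanded, the text is taken literally.
single='path: $user_dir'

echo "$double"
echo "$escaped"
echo "$single"
```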
Escape Sequences: We support escaping of the following characters: $, #, \, ", and '. Escaping these characters allows them to be inserted without their special meaning.
Also, other escape sequences are supported. For now, we support parsing of shell metacharacters, such as |, &, ;, (, ), <, >, but we don't support their functionality. Due to this fact, myshell won't be able to execute commands with these tokens. If you want to use them without their special meaning, you should escape them with a backslash.
\= is also supported.
# in Strings: The shell supports the use of # in strings. This means that # is treated as a regular character and doesn't start a comment.
Local Variables: Local variables are declared using the VAR=ABC syntax. These variables are visible only within the shell and aren't passed to child processes. To promote a local variable to an environment variable for child processes, use the mexport VAR command.
Note
Variable declarations can only appear before a simple command. Otherwise, they are treated as regular arguments.
Customizable Prompt: The shell prompt can be customized based on the PS1 environment variable. This provides users with the flexibility to include information such as the username or the current time in the prompt.
Currently, the following variables are supported:
\d - The current date in YYYY-MM-DD format.
\t - The current time in HH:MM:SS format.
\u - The current user.
\h - The current host.
\w - The current working directory.
\W - The current working directory's basename.
\n - A newline character.
\r - A carriage return character.
\s - The current shell.
\v - The current shell version.
\$ - The prompt character.
Alias Support: Our shell supports the creation and utilization of aliases, allowing users to define custom shortcuts for frequently used commands.
Aliases are defined using the alias command. For example, to create an alias named ll for the ls -l command, you can use the following command:
To remove an alias, use the unalias command:
For more information, please use the --help flag.
Note
Aliases can appear inside other alias definitions.
myshell incorporates a robust alias expansion mechanism that prevents infinite alias expansion loops.
mjobs built-in command.