Hush is a Unix shell scripting language inspired by the Lua programming language. Currently in development.
This is the technical specification for Hush, defining the complete syntax and semantics of the language.
Hush is motivated by some shortcomings in traditional shell languages, mainly the lack of appropriate primitive data structures, such as generic arrays and maps. Such shortcomings result in hacky workarounds for simple tasks, and make complicated tasks rather tricky to implement.
Hush borrows a great part of its design from Lua, a simple but elegant scripting language. Lua has proved to be a very expressive programming language, despite its succinct syntax and semantics, particularly its very simple type system.
As in Lua, Hush proposes a handful of built-in types, and no user-defined types. This makes the type system extremely simple, without compromising expressiveness. The types proposed by Hush are:
- `nil`: the unit type, usually for representing missing values.
- `bool`: the boolean type.
- `int`: a 64 bit integer type.
- `float`: a 64 bit floating point type.
- `char`: a C-like unsigned char type, 0-255.
- `string`: a char-array like string.
- `array`: a heterogeneous array, 0-indexed unlike in Lua.
- `dict`: a heterogeneous hash map.
- `function`: a callable function.
- `error`: a special error type, to ease distinction of errors from other values. This type can only be instantiated by the built-in `error` function. For more details, see the errors section.
All types except `array` and `dict` are immutable. This means that, for instance, when one increments an integer, a new value is created with the result. This is important because, as `dict` is a central type in the language, using values of every other type as keys will be very common. Dictionary keys should always be immutable, because mutating a key would change its hash.
As Hush is heavily inspired by Lua, it borrows most of the syntax from it, with only minor tweaks to support the additional command invocation syntax.
Identifiers in Hush are case sensitive, composed of alphanumeric or underscore characters, and must not start with a digit. The only exception is the keywords, which are not valid identifiers.
The following are keywords in Hush:
let, if, then, else, end, for, in, do, while, function, return, not, and, or, true, false, nil, break, self
Each type in the language, except for the `error` type, has corresponding syntax for literals:
- `nil`: its only value is `nil`.
- `bool`: `true` and `false`.
- `int`: 64 bit decimal integer literal, optionally prefixed by a minus sign.
- `float`: 64 bit floating point literal: a decimal followed by a period character, followed by another decimal literal, with an optional exponent.
- `char`: a single char enclosed in single quotes.
- `string`: string literal enclosed in double quotes.
- `array`: values enclosed in `[]`, separated by commas.
- `dict`: identifier-expression pairs enclosed in `@[]`, separated by a colon and delimited by commas.
- `function`: `function (<args>) <body> end`
Example:
let my_dict = @[ # dict
field0: nil,
field1: true,
field2: 42, # integer
field3: -12, # integer
field4: 3.14, # float
field5: 12E+99, # float
field6: [ # array
"a simple string",
"a string with \n escaped \" characters",
'c', # a char
'\n' # another char
],
field7: function (arg1, arg2)
return arg1 + arg2
end
]
Comments can be placed at any point of a line, starting with `#` and spanning until the end of the line.
Hush provides the following operators, in order of precedence:
- Unary:
  - Logical: `not` (prefix).
  - Arithmetic: `-` (prefix).
  - Field access: `.`, `[]` (postfix).
- Binary:
  - Arithmetic: `*`, `/`, `%`, `+`, `-`.
  - String: `++` (right associative).
  - Relational: `>`, `<`, `>=`, `<=`.
  - Equality: `==`, `!=`.
  - Logical: `and`, `or`.
  - Assignment: `=`.
Regarding semantics, check the Operators section for more details.
Command blocks¹ can be delimited by one of `{}`, `${}` or `&{}`, and inside them, only the following operators apply:
- `>`, `>>`, `<`, `<<`.
- `|`.
- `?`.
[1]: Check the Commands section for more details.
Hush adopts static scope, and variables must be declared with a `let` statement.
let x # Introduces the variable in the local scope
let y = 5 # Shortcut for assignment
Assignment is straightforward, but requires previous declaration.
x = 1
All variables are references, and therefore can refer to the same `dict` or `array`, for instance.
In Hush, conditional statements don’t coerce types to `bool`. This means that one cannot use `nil` or an empty array as a condition, as one can in Lua. All conditionals operate on a `bool`. If one supplies a condition that is not a boolean, a panic occurs.
The `if` statement can have two forms:
if expression then
# body
end
if expression then
# body
else
# body
end
The `if-else` form is a valid expression, and results in the value of the respective body. If the body ends with a statement that produces no value, then `nil` is produced.
The `while` loop allows looping over a boolean expression:
while expression do
# body
end
The `for` loop allows looping over an iterator function:
for identifier in expression do
# body
end
Here, a new variable is introduced (`identifier`), and `expression` must result in a function that can be called once for every iteration, receiving no arguments and returning an array with two elements. The first element must be a boolean. When `true`, the second element is assigned to the iteration variable, and the loop body is executed. When `false`, the iteration is finished.
Under the hood, the `for` loop translates to something like:
let iter = expression
let arr = iter()
while arr[0] do
let identifier = arr[1]
# body
arr = iter()
end
Both loop constructs support the `break` keyword, which exits the enclosing loop early.
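The iterator protocol described above can be modelled outside Hush. The following Python sketch is an illustration only, not Hush’s implementation: `make_iter` and `for_each` are hypothetical names, Python lists stand in for Hush arrays, and `None` stands in for `nil`.

```python
def make_iter(items):
    """Return a zero-argument function following the [bool, value] protocol."""
    index = 0

    def next_item():
        nonlocal index
        if index < len(items):
            value = items[index]
            index += 1
            return [True, value]   # more elements: loop body runs with `value`
        return [False, None]       # exhausted: iteration is finished

    return next_item


def for_each(iterator, body):
    """Desugared `for` loop: call the iterator until it reports False."""
    arr = iterator()
    while arr[0]:
        body(arr[1])
        arr = iterator()


collected = []
for_each(make_iter(["a", "b", "c"]), collected.append)
```

Any function obeying the same two-element contract could be passed to `for_each`, which is what makes the loop independent of the collection being traversed.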
- Field access: The index operator (`[]`) may only be applied to values of types `array` and `dict`, resulting in the respective associated value. Panics when out of bounds. The dot access operator may only be applied to values of type `dict`, and is a shortcut for the index operator: `a.b == a["b"]`.
- Logical: Logical operators may only be applied to values of type `bool`, and always result in a value of the same type. The `and` and `or` operators implement short circuit semantics.
- Arithmetic: Arithmetic operators may be applied to numeric values (`int` and `float`). Values of type `int` will be automatically converted to `float` when paired with a `float` on a binary operator. The integer modulo operator (`%`) is only available for `int` values. Integer division by zero will cause a panic.
- String: The string concatenation operator (`++`) may only be applied to strings, and will result in a new string. Note that strings in Hush are immutable.
- Relational: Relational operators may only be applied to values of type `int`, `float`, `char` or `string`, and always result in a value of type `bool`.
- Equality: Equality operators can be applied to values of arbitrary types, and always result in a value of type `bool`.
Providing invalid types for any operator will cause a panic.
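The typing rules for binary arithmetic can be summarised as a small checker. This is a Python sketch under stated assumptions, not part of Hush: `Panic`, `hush_type` and `arithmetic_result_type` are hypothetical names, and the `int` result for two `int` operands is inferred from the conversion rule above rather than stated explicitly.

```python
class Panic(Exception):
    """Stands in for a Hush panic (assumption: modelled as an exception)."""


def hush_type(value):
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    return type(value).__name__


def arithmetic_result_type(op, lhs, rhs):
    """Return the result type of `lhs op rhs`, or raise Panic."""
    kinds = (hush_type(lhs), hush_type(rhs))
    if any(kind not in ("int", "float") for kind in kinds):
        raise Panic(f"invalid operand types for {op}: {kinds}")
    if op == "%" and kinds != ("int", "int"):
        raise Panic("% is only available for int values")
    if kinds == ("int", "int"):
        if op in ("/", "%") and rhs == 0:
            raise Panic("integer division by zero")
        return "int"
    # At least one operand is a float, so the other is converted to float.
    return "float"
```

For instance, `arithmetic_result_type("*", 2, 1.5)` yields `"float"`, while `arithmetic_result_type("/", 1, 0)` raises `Panic`.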
In traditional shells, function arguments are always strings, and the return value is always an integer (status code). Hush proposes more generic semantics, which are typically adopted by general purpose programming languages. Functions should be able to accept parameters of arbitrary types, and also be able to return a value of an arbitrary type. On the other hand, commands are limited by the operating system to accept strings and return a status code. Therefore, when invoking external commands, Hush converts the given arguments to strings, and provides the status code as the return value.
In Hush, functions:
- Can have an arbitrary number of parameters, defined by up to two comma-separated lists of parameters, delimited by a semicolon. The first list, if any, denotes required parameters. The second list, if any, denotes optional parameters. If a function is called with missing required arguments, then a panic occurs. Optional arguments default to `nil`.
- Return only one value, in contrast to Lua.
- Are values, being first class citizens like every other type in the language.
- As they are values, they have no name. A function declared with a name is actually a variable declaration, referring to such function value. Therefore, such variable can be reassigned to a different value.
- Can also capture variables, i.e. they can be closures.
- Can be recursive. As functions are values, recursive functions are actually closures on themselves.
- Have access to a special variable, `self`, which is a reference to the function’s parent, if any. If a function is called directly as `my_function()`, then `self` is `nil`. Otherwise, if it’s called as a member of a `dict`, as in `my_obj.my_function()`, then `self` refers to the same value as `my_obj`.
Summarizing, here are some examples of functions in Hush:
# Simple function definition.
function sum(a, b, c)
return a + b + c
end
# Reassigns the sum variable, which was referring to the previous function.
sum = function (a, b, c; d) # Here, `d` is an optional argument.
if d != nil then
return a + b + c + d
else
return a + b + c
end
end
function sum(a)
return function(b) # Closure!
return a + b # Here, `a` is captured from the outer scope.
end
end
# Simple recursive function.
function factorial(n)
if n < 2 then
return 1
else
return n * factorial(n - 1)
end
end
# A member function.
let my_obj = @[
value: 5,
method: function()
if self != nil then
return self.value
else
return 0
end
end,
]
my_obj.method() # Returns 5
let fun = my_obj.method
fun() # Returns 0
In traditional shells, expressions produce two results that can be manipulated by the language: the standard output streams (stdout/stderr), and a status code. The output can be captured by the `$()` operator, and the status code is immediately available through the `$?` variable.
In Hush, command blocks are enclosed in `{}`. Individual commands must end with a semicolon, except for the last command in the block. This can be annoying for simple commands, but it allows one to split a command across multiple lines interspersed with comments, which is currently impossible in Bash, for instance.
{
docker create
--name $container
-i -a STDIN -a STDOUT -a STDERR # attach all stdio
-v $pwd:/my/project:ro # mount the source code as a read-only volume
my-image:latest;
rsync -av --delete --delete-excluded
# version control directories:
--exclude='.git/'
--exclude='.svn/'
# build directories:
--exclude='.stack-work/'
--exclude='.ccls-cache/'
--exclude='target/'
--exclude='bin/'
--exclude='obj/'
# don't backup series or torrents:
--exclude='series/'
--exclude='torrents/'
~/ /mnt/backup 2>1
| tee rsync.log;
list-musics
| xargs --null -- mediainfo --Output='Audio;%Duration%\n' # get duration in milliseconds
| awk NF # remove empty lines
| paste -s -d + # join lines with +
| bc # eval the resulting expression
}
The result of a command invocation and execution is `0` if the command exits with status zero, or an `error` otherwise. The resulting `error` will contain the `status` field in its context. In pipelines, the result is an array of the results of each individual command.
The result of a command block is an array of results, or a single result if there is a single command/pipeline.
By default, if a command or a pipeline produces an `error`, Hush will interrupt the execution of the current command block. This behavior is similar to Bash’s `set -e`. To prevent this, one can use the `?` operator after a command/pipeline, and Hush will proceed even if the result is an `error`.
Example:
let results = {
# (A)
cat /etc/shadow ?; # Should error with permission denied, but won't abort the command block.
# (B) The following pipeline will contain an error, but the command block won't be aborted.
echo Hello world!
| cat
| cat /etc/shadow # Should error with permission denied.
| cat ?;
# (C)
echo Hello world!; # Should succeed, resulting in 0.
# (D) Should error, aborting the command block.
echo Hello world!
| cat /etc/shadow # Should error with permission denied.
| cat;
# (E)
echo Foo Bar; # Won't be executed, because an error has caused the abortion of the command block.
}
let result
# (A): Permission denied.
result = results[0]
std.type(result) == "error"
result.status == 1 # Cat returns 1 when permission denied.
# (B): Array containing results of each command in the pipeline.
result = results[1]
std.type(result) == "array"
result[0] == 0 # Success.
result[1] == 0 # Success.
std.type(result[2]) == "error" # Permission denied.
result[3] == 0 # Success.
# (C): Success.
result = results[2]
result == 0
# (D): array containing results of each command in the pipeline.
result = results[3]
std.type(result) == "array"
result[0] == 0 # Success.
std.type(result[1]) == "error" # Permission denied.
result[2] == 0 # Success.
# (E): Due to the previous failure not guarded by the ? operator, the last command in the
# block didn't get to execute.
std.length(results) == 4
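The abort rule illustrated by the example can be modelled as a short loop. The Python sketch below is only a model of the semantics described in this section: `HushError`, `command_result` and `run_block` are hypothetical names, and each command is represented by its exit status instead of being executed.

```python
class HushError:
    """Stands in for a value of the Hush `error` type."""

    def __init__(self, status):
        self.status = status


def command_result(status):
    # A command results in 0 on success, or an error carrying the status.
    return 0 if status == 0 else HushError(status)


def run_block(entries):
    """entries: list of (statuses, guarded) pairs, one per command or pipeline.

    `statuses` holds one exit status per command in the pipeline, and
    `guarded` tells whether the `?` operator follows the pipeline.
    """
    results = []
    for statuses, guarded in entries:
        outcome = [command_result(status) for status in statuses]
        failed = any(isinstance(item, HushError) for item in outcome)
        # A lone command yields its own result, a pipeline yields an array.
        results.append(outcome[0] if len(outcome) == 1 else outcome)
        if failed and not guarded:
            break  # unguarded error: the rest of the block is skipped
    return results[0] if len(entries) == 1 else results


results = run_block([
    ([1], True),           # (A) fails, guarded by ?
    ([0, 0, 1, 0], True),  # (B) pipeline with one failure, guarded by ?
    ([0], False),          # (C) succeeds
    ([0, 1, 0], False),    # (D) pipeline fails unguarded: block aborts
    ([0], False),          # (E) never runs
])
```

With these inputs the list holds four results, matching the `std.length(results) == 4` check in the example.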
If the command name contains a path separator (`/`), Hush will attempt to execute the respective file, if any. Otherwise, Hush will look up the command in the following order:
- Aliases: command aliases defined by the user.
- Built-in commands: commands which are not external programs, but are implemented by Hush, like `cd` and `echo`.
- Executables in `$PATH`, respecting the list order.
If there is no such command, or the command cannot be executed, it results in an `error`, and Hush outputs the error description to stderr.
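The lookup order can be sketched as a chain of checks. This Python model is an assumption-laden illustration rather than Hush’s resolver: `resolve_command` and its tagged return values are hypothetical, while `shutil.which` is the standard library routine that searches a `PATH`-style string in order.

```python
import shutil


def resolve_command(name, aliases, builtins, search_path=None):
    """Return a (kind, target) pair following the lookup order above."""
    if "/" in name:
        return ("file", name)              # path separator: execute the file itself
    if name in aliases:
        return ("alias", aliases[name])    # 1. user-defined aliases
    if name in builtins:
        return ("builtin", name)           # 2. built-in commands
    found = shutil.which(name, path=search_path)
    if found is not None:
        return ("executable", found)       # 3. first match in PATH order
    return ("error", f"command not found: {name}")


aliases = {"ll": ["ls", "-lh"]}
builtins = {"cd", "alias"}
kind, target = resolve_command("ll", aliases, builtins)
```

Passing `search_path=None` makes `shutil.which` fall back to the `PATH` environment variable.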
Command arguments are separated by spaces. Backslash-escaped spaces are not considered separators, but argument text. Variables can be accessed by prefixing their identifier with `$`, or surrounding it with `${}`, and are expanded with the following rules:
- `nil`, `bool`, `char`, `int`, `float`, `string`: converted to string using `tostring()`, passed as a single argument, regardless of containing spaces, asterisks, and whatnot.
- `array`: each element will be converted to a single argument, using the first and third rules. If the array is empty, no argument is produced. This way, arrays can be used to programmatically build lists of command arguments.
- `dict`, `function`, `error`: won’t be converted, causing a panic instead.
Attempting to access an undeclared variable results in a panic.
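The expansion rules can be modelled as a function from a value to a list of arguments. The Python sketch below is illustrative only: `expand_variable`, `scalar_to_string` and `Panic` are hypothetical names, Python values stand in for Hush values, and panicking on a nested array is an assumption, since the rules above do not cover that case.

```python
class Panic(Exception):
    """Stands in for a Hush panic (assumption: modelled as an exception)."""


def scalar_to_string(value):
    """First rule: scalars become exactly one string."""
    if value is None:
        return "nil"
    if isinstance(value, bool):  # checked before int: bool subclasses int
        return "true" if value else "false"
    if isinstance(value, (int, float, str)):
        return str(value)
    # Third rule (and, by assumption, nested arrays): no conversion.
    raise Panic(f"cannot expand value of type {type(value).__name__}")


def expand_variable(value):
    """Return the list of command arguments produced by one variable."""
    if isinstance(value, list):
        # Second rule: one argument per element, none for an empty array.
        return [scalar_to_string(element) for element in value]
    return [scalar_to_string(value)]


flags = ["-l", "--width", 80]
argv = ["ls"] + expand_variable(flags) + expand_variable("my file *.txt")
```

Note that the string containing spaces and an asterisk stays a single argument, which is the point of the first rule.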
Single quotes delimit literals without interpolation, while double quotes allow interpolation. Inside double quotes, variables can be accessed with `$` or `${}`, the latter allowing the variable to be followed directly by word characters. As an example, all of the following produce a single argument to `echo`:
let file = "/etc/myconfig"
{
echo $file; # /etc/myconfig
echo '$file'; # $file
echo '/usr'$file'uration'; # /usr/etc/myconfiguration
echo "$file"; # /etc/myconfig
echo "${file}"; # /etc/myconfig
echo "/usr${file}uration"; # /usr/etc/myconfiguration
}
In Hush, there is no such thing as implicitly expanding or globbing the contents of a variable.
Hush performs three types of expansion for unquoted literal arguments.
- Tilde expansion: Any argument starting with `~/` will have such prefix expanded to `$HOME/`.
- Brace expansion: Arguments containing unescaped brace-enclosed lists will be expanded to an array of strings, regardless of existing file paths. The brace syntax allows two forms:
  - `{a,b,,'c'}`: two or more comma-separated strings, which can be empty or quoted. One argument will be generated for each string.
  - `{1..10}`: two integers separated by `..`, denoting a sequence. One argument will be generated for each element of the sequence.
  Examples:
  - `dir/file{,.jpg,'.png'}` -> `[ "dir/file", "dir/file.jpg", "dir/file.png" ]`
  - `dir/file-{3..1}.txt` -> `[ "dir/file-3.txt", "dir/file-2.txt", "dir/file-1.txt" ]`
- Filename expansion: Arguments containing any of the following patterns, when unescaped, will be expanded to an alphabetically sorted array of existing file paths, matched by the respective regular expression construct:
  - `*` -> `[^/]*`
  - `?` -> `[^/]`
  - `[…]` -> `[…]`
  Example: `some/*/path*/with/patterns/[1-9].???` will match paths with the following regex: `some/[^/]*/path[^/]*/with/patterns/[1-9]\.[^/][^/][^/]`
  Hidden files (whose name starts with a dot) are matched by default, as opposed to Bash. Directory references (`.`, `..`) are not matched. Relative paths are expanded with a `./` prefix, in order to prevent flag injection vulnerabilities. ¹
When the expansion results in an array, such array is converted to arguments according to the rules described in Commands.
While brace and filename expansion may not be used simultaneously in the same argument, tilde expansion can be used with both.
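Brace expansion, in both of its forms, can be sketched in a few lines. The following Python model is a simplification and not Hush’s algorithm: `expand_braces` is a hypothetical name, only the first brace group in an argument is handled, and quoted items containing commas or braces are not supported.

```python
import re

# Matches the sequence form, e.g. "3..1" or "-2..5".
SEQUENCE = re.compile(r"^(-?\d+)\.\.(-?\d+)$")


def expand_braces(argument):
    """Expand the first brace group of `argument` into a list of strings."""
    match = re.search(r"\{([^{}]*)\}", argument)
    if match is None:
        return [argument]
    prefix, suffix = argument[:match.start()], argument[match.end():]
    body = match.group(1)

    sequence = SEQUENCE.match(body)
    if sequence:
        start, end = int(sequence.group(1)), int(sequence.group(2))
        step = 1 if start <= end else -1  # sequences may count downwards
        items = [str(number) for number in range(start, end + step, step)]
    elif "," in body:
        # Items may be empty or single-quoted; quotes are dropped.
        items = [item.strip("'") for item in body.split(",")]
    else:
        return [argument]  # a lone {word} is not a brace list

    return [prefix + item + suffix for item in items]


images = expand_braces("dir/file{,.jpg,'.png'}")
numbered = expand_braces("dir/file-{3..1}.txt")
```

Both calls reproduce the arrays listed in the examples above, without consulting the file system.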
[1]: As in `chown my-user *`, when there is a file named `--reference=/home/other-user/`.
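The mapping from filename patterns to regular expression constructs can be written out directly. This Python sketch only translates the pattern; it is not Hush’s matcher, `glob_to_regex` is a hypothetical name, and escaping of literal characters (such as the dot) is an assumption about how the translation would need to work.

```python
import re


def glob_to_regex(pattern):
    """Translate a filename pattern into an equivalent regular expression."""
    pieces = []
    index = 0
    while index < len(pattern):
        char = pattern[index]
        if char == "*":
            pieces.append("[^/]*")           # any run of non-separator characters
        elif char == "?":
            pieces.append("[^/]")            # exactly one non-separator character
        elif char == "[" and "]" in pattern[index + 1:]:
            end = pattern.index("]", index + 1)
            pieces.append(pattern[index:end + 1])  # character class kept as-is
            index = end
        else:
            pieces.append(re.escape(char))   # everything else matches literally
        index += 1
    return "".join(pieces)


regex = glob_to_regex("some/*/path*/with/patterns/[1-9].???")
matches = re.fullmatch(regex, "some/a/path1/with/patterns/5.txt") is not None
```

Because `*` and `?` translate to classes that exclude `/`, a pattern never matches across directory boundaries.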
Traditional shells implement multiple operators for redirecting file descriptors. In Bash, for instance, there are at least 10 such operators, which implement quite specific behavior. To keep things simple, Hush proposes only four redirection operators:
- `command < filename`: opens stdin as a reference to the given filename.
- `command << string`: opens stdin as a pipe containing the given string.
- `command fd> fd2` or `command fd> filename`: opens `fd` as a reference to the same file as `fd2`, or as a reference to the given filename. `fd` defaults to `1` (stdout) when omitted. The target file is created if it doesn’t exist, or truncated otherwise.
- `command fd>> file`: opens `fd` as a reference to the given filename. `fd` defaults to `1` (stdout) when omitted. The target file is created if it doesn’t exist, or appended to otherwise.
Literal file descriptors are denoted by a single number, according to the following table:
File | Number |
---|---|
stdin | 0 |
stdout | 1 |
stderr | 2 |
If one desires to redirect to a file named “2”, quotes must be used:
{ command > "2" }
Filenames may be supplied through variables, but not file descriptors:
let var = 2
{ command > $var } # Redirects to a file named "2"
Contrary to traditional shells, redirection operators must be placed after all of the supplied arguments for a command. This aims to ensure that no redirection can go unnoticed when there are many arguments. The redirection operator has higher precedence than the pipe operator.
If any of the I/O operations regarding redirections fails, the target command is not executed, and an `error` is produced.
Commands can be chained into pipelines using the `|` operator, which connects the left hand side’s `stdout` to the right hand side’s `stdin` using a Unix pipe. While the `|` operator is left associative, all commands in a pipeline are executed concurrently. Hush waits for all processes to finish, producing an array with the results of all commands in the pipeline.
Here are some insightful examples of such behavior:
- The following pipeline:
  { ps aux | cat | cat | cat | grep 'cat' }
  May output something like:
  91632 0.0 0.0 5492 676 pts/3 S+ 19:03 0:00 cat
  91633 0.0 0.0 5492 680 pts/3 S+ 19:03 0:00 cat
  91634 0.0 0.0 5492 684 pts/3 S+ 19:03 0:00 cat
  91635 0.0 0.0 6396 2316 pts/3 S+ 19:03 0:00 grep cat
  Which indicates that all `cat` programs were already running when `ps` fetched the process list.
- The following command outputs an infinite stream of zeroes:
  { cat /dev/zero | tr '\0' '0' }
  But when piped to the `head` command, all involved programs terminate:
  { cat /dev/zero | tr '\0' '0' | head -c 20 }
  Because when `head` closes its side of the pipe, attempts to write from the other programs result in `SIGPIPE`.
If any of the I/O operations regarding the pipes fails, none of the target commands is executed, and an `error` is produced instead.
The capture operator (`${}` in Hush) adopts more flexible semantics than those of traditional shells. Instead of resulting in the command’s stdout, the result is a `dict` containing three fields: a `string` for stdout, a `string` for stderr, and the result status. This enables accessing both stdout and stderr separately, as well as the result status, all with value semantics. If one cares only about the stdout for instance, direct access can be used, without requiring any intermediate variables:
${date --iso-8601}.stdout
To pass the output as arguments to other commands, one needs intermediate variables, as opposed to traditional shells.
Bash:
tee $(date --iso-8601)
Hush:
let date = ${date --iso-8601}.stdout
{ tee $date }
If any of the I/O operations regarding capturing fails, the target command is not executed, and an `error` is produced.
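The shape of a capture result can be illustrated with Python’s `subprocess` module. This is a sketch of the idea, not of Hush’s capture operator: `capture` is a hypothetical helper, the field names `stderr` and `status` are assumptions (only `stdout` appears in the text above), and the child process is a Python one-liner chosen so the example does not depend on any external command.

```python
import subprocess
import sys


def capture(argv):
    """Run a command and return its outputs and status as one value."""
    completed = subprocess.run(argv, capture_output=True, text=True)
    return {
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "status": completed.returncode,
    }


# A child that writes to both streams and exits with a non-zero status.
child = "import sys; print('out'); print('err', file=sys.stderr); sys.exit(3)"
result = capture([sys.executable, "-c", child])
only_stdout = capture([sys.executable, "-c", "print('hello')"])["stdout"]
```

The last line mirrors the direct field access shown above: the whole result is a value, so a single field can be selected without an intermediate variable.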
Shells like Ksh, Zsh and Bash support asynchronous commands through the `coproc` keyword and the `&` operator, also providing the `wait` built-in for joining such co-processes. In such shells, the pid of an asynchronous command is immediately available through the `$!` variable.
Bash:
# Array variable to capture the pids of all spawned tasks
declare -a pids
one long running command &
pids+=($!)
another long running command &
pids+=($!)
yet another long running command &
pids+=($!)
# Give jobs some time to complete
sleep 2
status=0
for pid in ${pids[@]}; do
if ps -p $pid > /dev/null; then
# Job is still running, abort...
kill $pid
status=1
else
# Job finished, check if succeeded:
wait $pid || status=$? # after `if ! wait`, $? would be the negated status (0)
fi
done
exit $status
Hush proposes a different approach, allowing one to launch a command block asynchronously, and have immediate access to the operations regarding such job. When a command block is delimited with the `&{}` operator, the block is executed asynchronously, and the resulting value of the expression is a `dict` with a set of values and functions to operate on the job:
- `pid`: the job’s `pid`. You are unlikely to need this field in practice.
- `running()`: returns a `bool` indicating whether the job is still running.
- `abort()`: aborts the job, killing any child processes.
- `join()`: like Bash’s `wait`, blocks until the job is finished, and returns the command block result.
Hush:
# Array variable to hold all spawned jobs
let jobs = []
let job = &{ one long running command }
jobs.push(job)
job = &{ another long running command }
jobs.push(job)
job = &{ yet another long running command }
jobs.push(job)
# Give jobs some time to complete
std.sleep(2000)
let status = 0
for job in std.iter(jobs) do
if job.running() then
# Job is still running, abort...
job.abort()
status = 1
else
# Job finished, check if succeeded:
let job_result = job.join()
if std.type(job_result) == "error" then
std.print("Failed to execute job:")
std.print(job_result)
end
end
end
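The job value returned by `&{}` resembles a small record of closures over a running process. The Python sketch below models that shape with `subprocess.Popen`; it is not Hush’s implementation, `spawn` is a hypothetical name, `join` returns the raw exit status rather than a command block result, and `abort` here kills only the process itself, not its children.

```python
import subprocess
import sys


def spawn(argv):
    """Start a process and return a dict of operations on it."""
    process = subprocess.Popen(argv)
    return {
        "pid": process.pid,
        "running": lambda: process.poll() is None,  # True until the process exits
        "abort": process.kill,                      # forcibly terminate the process
        "join": process.wait,                       # block, then return exit status
    }


# A long-running child that gets aborted before it can finish.
job = spawn([sys.executable, "-c", "import time; time.sleep(30)"])
was_running = job["running"]()
if was_running:
    job["abort"]()
status = job["join"]()
```

Bundling the operations with the job avoids the separate pid bookkeeping that the Bash version needs.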
Functions in Hush can be called using the `()` operator. Like in the function declaration, the function call operator receives required and optional arguments, using the exact same syntax.
In Hush, there is currently no way of capturing, piping or redirecting the output of shell functions. This is due to the fact that pipes in particular have concurrent semantics, i.e., each component (command or function) in the pipeline runs concurrently. This would be problematic for Hush functions because they can reference outer variables through parameters and closures, and consequently mutate their values. Therefore, two functions in a pipeline could access the same variable concurrently, potentially causing a data race.
There are plans to include such features in the future, by the means of cloning all parameters and closures to piped and asynchronous functions, therefore inhibiting data races. But this has to be more carefully designed before we can settle for anything.
Hush provides two mechanisms for errors. The `error` type allows one to construct and manipulate recoverable errors, which can be detected and handled. Panics, on the other hand, are irrecoverable errors, which abort the current script execution.
Hush provides the `error` built-in function to construct values of the `error` type. This mechanism should be used for reporting and handling errors. Command blocks and built-in functions will report errors by returning values of such type, instead of the expected return value.
let result = std.cd("/non-existing/directory")
if std.type(result) == "error" then
std.print("Failed to change directory:")
std.print(result)
end
The `error` built-in will produce an `error` providing:
- A message, supplied by the caller.
- An optional context, supplied by the caller. Useful for attaching related data.
- An automatically generated backtrace.
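The three components of an error value, and the contrast with panics, can be modelled as follows. This Python sketch is illustrative only: `HushError`, `Panic` and `change_directory` are hypothetical names, and using Python’s `traceback` module for the backtrace is an assumption made for the example.

```python
import traceback


class Panic(Exception):
    """Irrecoverable error: propagates and aborts the script."""


class HushError:
    """Recoverable error: an ordinary value that callers can inspect."""

    def __init__(self, description, context=None):
        if not isinstance(description, str):
            raise Panic("description must be a string")
        if context is not None and not isinstance(context, dict):
            raise Panic("context must be nil or a dict")
        self.description = description
        self.context = context
        # Drop the last frame so the trace ends at the caller, not in here.
        self.backtrace = traceback.format_stack()[:-1]


def change_directory(path, existing):
    """Return an error value on failure instead of raising."""
    if path not in existing:
        return HushError("directory not found", {"path": path})
    return None


result = change_directory("/non-existing/directory", existing={"/tmp"})
if isinstance(result, HushError):
    message = f"Failed to change directory: {result.description}"
```

The caller decides what to do with `result`, whereas a `Panic` leaves no such choice.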
Examples of recoverable errors:
- file not found
- permission error
- invalid format
- command not found
- command returned non-zero exit status
Panics are irrecoverable errors, due to invalid program logic. When a panic occurs, Hush halts the current script execution, and prints an error description message along with a stack trace to stderr.
Examples of errors that cause a panic:
- syntax error
- integer division by zero
- index out of bounds
- attempt to call a value that is not a function
- missing mandatory arguments
Hush provides built-in functions for common tasks, and built-in commands for tasks that cannot be performed by external commands.
Hush provides a top-level `dict` named `std`, which contains all built-in functions:
- `cd(dir)`: If `dir` is a `string`, attempts to change the shell’s current working directory, returning an error on failure. Panics otherwise. This functionality is also available through the `cd` command.
- `error(description; context)`: Returns a value of type `error`, containing the given description, a backtrace, and the optional context. If `description` is not a `string`, or `context` is not `nil` or a `dict`, then a panic occurs.
- `exit(status)`: If `status` is an `int`, exits the shell, returning the given status to the operating system. Panics otherwise.
- `glob(value)`: If `value` is a `string`, performs path expansion, producing a possibly empty array of strings. Panics otherwise.
- `iter(value)`: If `value` is a `string`, `array` or `dict`, returns a function that iterates through its elements. Panics otherwise. See the Conditionals and loops section for more details on iterator functions.
- `length(value)`: If `value` is a `string`, `array` or `dict`, returns the number of elements. Panics otherwise.
- `print(; value)`: If `value` is not `nil`, converts it to string using `tostring`, then writes to `stdout`, followed by a line break. Prints an empty string otherwise.
- `sleep(milliseconds)`: If `milliseconds` is an `int`, sleeps for the given duration. Panics otherwise.
- `tostring(value)`: Converts `value` to string, using the following rules:
  - `nil`, `bool`, `char`, `int`, `float`, `string`: traditional representation, without quotes.
  - `function`: returns “<function>”.
  - `array`, `dict`: recursively dump the inner values, delimited with the respective literal syntax.
  - `error`: formats the error description message, along with the context if any.
- `type(value)`: Returns a string describing the type of `value`.
  let val = "this is a string"
  std.type(val) == "string" # true
Attempts to change the values of the `std` `dict` result in undefined behavior.
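The conversion rules of `tostring` lend themselves to a recursive sketch. The Python model below is a guess at one reasonable reading, not the specified output format: the name `tostring` is reused for convenience, quoting strings nested inside arrays and dicts is an assumption, the exact spacing is invented, and the `error` case is left out.

```python
def tostring(value, nested=False):
    """Convert a value to a string following the rules listed above."""
    if value is None:
        return "nil"
    if isinstance(value, bool):  # checked before int: bool subclasses int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # Top-level strings are unquoted; quoting nested ones is an assumption.
        return f'"{value}"' if nested else value
    if callable(value):
        return "<function>"
    if isinstance(value, list):
        inner = ", ".join(tostring(item, nested=True) for item in value)
        return f"[{inner}]"
    if isinstance(value, dict):
        inner = ", ".join(
            f"{key}: {tostring(item, nested=True)}" for key, item in value.items()
        )
        return f"@[{inner}]"
    raise TypeError(f"unsupported value: {value!r}")


dumped = tostring({"name": "hush", "tags": [1, None, True], "run": print})
```

Since arrays and dicts recurse through the same function, arbitrarily nested values are handled without extra cases.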
Hush provides only a handful of built-in commands, which provide functionality that cannot be implemented by external programs:
- `alias`: Creates an alias, to take part in command lookup. The first argument is the alias name, and the following arguments are the aliased command and arguments. The alias name cannot contain a path separator (`/`). Example: `{ alias ll ls --color=auto -lh --time-style long-iso --group-directories-first }`
- `cd`: The first and only argument is the directory to be accessed. If the directory does not exist, or cannot be accessed, `cd` prints an error description to stderr, and returns `1`. Example: `{ cd /home/my-username/ }`
Note that both the `alias` and `cd` built-ins perform side effects in the shell’s execution context, and therefore cannot be used in concurrent constructs, such as piping and asynchronous commands. They also can’t take part in redirection and capturing. Attempts to use built-in commands with any of these constructs will result in a panic.
Hush mainly focuses on functional programming, but also supports some sort of object oriented programming. While Lua proposes the metatable mechanism to add sophisticated dynamics to tables, Hush adopts simpler semantics, having dicts as plain key-value stores.
Functions can act as methods by using the `self` variable, as described previously. Objects can be defined as dicts with member functions, which can be created by a constructor function.
Hush:
function MyCounter(initial_value) # MyCounter is a function that represents a Class.
let increment = function()
self._value = self._value + 1
end
let get = function()
return self._value
end
return @[
_value: initial_value, # Public field.
# These methods could be implemented here as well.
# Remember, functions are nothing but values.
increment: increment, # Method
get: get, # Method
]
end
let counter = MyCounter(0)
counter.increment()
counter.increment()
counter.get() # Returns 2
function StepCounter(initial_value, step)
# This function captures the `step` variable, which acts as a private field.
let increment = function()
self._value = self._value + step
end
let print = function()
std.print(self.get())
end
let counter = MyCounter(initial_value) # Inheritance
counter.print = print # Additional method
counter.increment = increment # Method overriding
return counter
end
let counter = StepCounter(0, 2)
counter.increment()
counter.increment()
counter.print() # Prints 4