
On Habr you can find many publications covering both the theory of monads and their practical use. Most of these articles are, as you would expect, about Haskell. I will not retell the theory for the nth time. Today we'll talk about some Erlang problems, ways to solve them with monads, partial application of functions, and the syntactic sugar from erlando, a cool library from the RabbitMQ team.
Introduction
Erlang has immutability, but no monads*. However, thanks to parse_transform and its use in erlando, monads can still be used in Erlang.
It was no accident that I mentioned immutability at the very start. Immutability, almost everywhere and almost always, is one of Erlang's core ideas. Immutability and pure functions let you focus on developing one specific function without fear of side effects. But newcomers to Erlang, coming from Java or Python for example, find it quite hard to understand and accept Erlang's ideas, especially once the syntax comes into play. Anyone who has tried to start using Erlang has probably noticed how unusual and distinctive it is. In any case, I have collected a lot of feedback from newcomers, and the “strange” syntax tops the list of complaints.
Erlando
Erlando is a set of Erlang syntax extensions that gives us:
- Partial application / currying of functions with Scheme-like cuts
- Haskell-like do-notation
- import_as: syntactic sugar for importing functions from other modules under different names.
Note: I took the following code examples illustrating erlando's features from Matthew Sackman's presentation, supplementing them with my own code and explanations.
Cut abstraction
Let's get straight to the point. Consider several functions from a real project:
    info_all(VHostPath, Items) ->
        map(VHostPath, fun (Q) -> info(Q, Items) end).

    backing_queue_timeout(State = #q{backing_queue = BQ}) ->
        run_backing_queue(BQ, fun (M, BQS) -> M:timeout(BQS) end, State).

    reset_msg_expiry_fun(TTL) ->
        fun (MsgProps) ->
            MsgProps#message_properties{expiry = calculate_msg_expiry(TTL)}
        end.
All the funs in these functions exist only to substitute parameters into simple expressions. In effect this is partial application, since some parameters are not known until the call. The funs give us flexibility, but they also add noise to the code. A small change to the syntax, the introduction of cut, improves the situation.
The meaning of _
- Normally _ can be used only in patterns
- Cut allows _ to be used outside patterns
- Outside a pattern, _ becomes a parameter of the expression that contains it
- Using _ several times within one expression gives that expression several parameters
- Cut is not a replacement for closures (funs)
- Arguments are evaluated before the cut function is invoked.
Cut uses _ in expressions to indicate where the abstraction should be applied. Cut wraps only the nearest enclosing expression, but nested cuts are allowed.
For example,
    list_to_binary([1, 2, math:pow(2, _)]).
expands to
    list_to_binary([1, 2, fun (X) -> math:pow(2, X) end]).
but not to
    fun (X) -> list_to_binary([1, 2, math:pow(2, X)]) end.
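To make the "nearest enclosing expression" rule concrete, here is a minimal sketch in plain Erlang, without erlando, of what the cut in that example stands for. The module and function names are hypothetical and chosen only for illustration.

```erlang
-module(cut_expansion_sketch).
-export([pow2/0, wrapped/0]).

%% Hand-written equivalent of the cut `math:pow(2, _)`:
%% only the innermost call is wrapped in a fun.
pow2() ->
    fun (X) -> math:pow(2, X) end.

%% Hand-written equivalent of `[1, 2, math:pow(2, _)]`:
%% a list whose third element is a fun, not a fun that returns a list.
wrapped() ->
    [1, 2, pow2()].
```

This also shows why the list_to_binary/1 call in the example is only an illustration of scoping: the list it receives contains a fun, which is not valid iolist data.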
This sounds a bit confusing, so let's rewrite the examples above using cut:
    %% Before:
    info_all(VHostPath, Items) ->
        map(VHostPath, fun (Q) -> info(Q, Items) end).
    %% With cut:
    info_all(VHostPath, Items) ->
        map(VHostPath, info(_, Items)).

    %% Before:
    backing_queue_timeout(State = #q{backing_queue = BQ}) ->
        run_backing_queue(BQ, fun (M, BQS) -> M:timeout(BQS) end, State).
    %% With cut:
    backing_queue_timeout(State = #q{backing_queue = BQ}) ->
        run_backing_queue(BQ, _:timeout(_), State).

    %% Before:
    reset_msg_expiry_fun(TTL) ->
        fun (MsgProps) ->
            MsgProps#message_properties{expiry = calculate_msg_expiry(TTL)}
        end.
    %% With cut:
    reset_msg_expiry_fun(TTL) ->
        _#message_properties{expiry = calculate_msg_expiry(TTL)}.
Argument evaluation order
To illustrate the order in which the arguments are evaluated, consider the following example:
    f1(_, _) -> io:format("in f1~n").

    test() ->
        F = f1(io:format("test line 1~n"), _),
        F(io:format("test line 2~n")).
The argument passed to F is evaluated before F itself is invoked, while io:format("test line 1~n") sits inside the generated fun and is evaluated only when F runs. The following is therefore printed:
    test line 2
    test line 1
    in f1
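The same ordering can be reproduced without erlando by writing out by hand the fun that the cut stands for. The sketch below is an illustration only: the module name and the log/1 helper are hypothetical, and events are recorded in the process dictionary instead of being printed, so that the order can be inspected.

```erlang
-module(cut_order_sketch).
-export([test/0]).

%% Record an event in the process dictionary (newest first).
log(Event) ->
    put(events, [Event | events()]),
    ok.

events() ->
    case get(events) of
        undefined -> [];
        Events -> Events
    end.

f1(_, _) -> log(in_f1).

test() ->
    erase(events),
    %% Hand-written equivalent of the cut `f1(log(line_1), _)`.
    F = fun (X) -> f1(log(line_1), X) end,
    %% The argument of F is evaluated first, then the body of the fun.
    F(log(line_2)),
    lists:reverse(events()).
```

Calling cut_order_sketch:test() returns the events in the order they happened: line_2, then line_1, then in_f1.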
Cut abstraction for various data types and code constructs
- Tuples
    F = {_, 3},
    {a, 3} = F(a).
- Lists
    dbl_cons(List) -> [_, _ | List].

    test() ->
        F = dbl_cons([33]),
        [7, 8, 33] = F(7, 8).
- Records
    -record(vector, {x, y, z}).

    test() ->
        GetZ = _#vector.z,
        7 = GetZ(#vector{z = 7}),
        SetX = _#vector{x = _},
        V = #vector{x = 5, y = 4} = SetX(#vector{y = 4}, 5).
- Cases
    F = case _ of
            N when is_integer(N) -> N + N;
            N -> N
        end,
    10 = F(5),
    ok = F(ok).
- Maps
    test() ->
        GetZ = maps:get(z, _),
        7 = GetZ(#{z => 7}),
        SetX = _#{x => _},
        V = #{x := 5, y := 4} = SetX(#{y => 4}, 5).
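- List and binary comprehensions
    test_cut_comprehensions() ->
        %% Only the two generator sources become parameters, so F has arity 2;
        %% the first _ in `_ <- _` is an ordinary pattern.
        F = << <<(1 + (X*2))>> || _ <- _, X <- _ >>,
        <<"AAA">> = F([a, b, c], [32]).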
Pros
- There is less code, so it is easier to maintain.
- The code is simpler and tidier.
- The noise from funs is gone.
- For Erlang beginners, it is more convenient to write Get / Set functions.
Cons
- The entry threshold rises for experienced Erlang developers even as it falls for beginners: the whole team now has to understand cut and know one more piece of syntax.
Do notation
The programmable comma is a construct for binding computations together. Erlang does not use lazy evaluation. Let's imagine what would happen if Erlang were lazy, like Haskell:
    my_function() ->
        A = foo(),
        B = bar(A, dog),
        ok.
To guarantee the execution order, we would need to chain the computations explicitly by defining a comma function:
    my_function() ->
        A = foo(),
        comma(),
        B = bar(A, dog),
        comma(),
        ok.
Continuing the transformation:
    my_function() ->
        comma(foo(),
              fun (A) -> comma(bar(A, dog),
                               fun (B) -> ok end)
              end).
It follows that comma/2 is the idiomatic function >>=/2 (bind). A monad requires only three functions: >>=/2, return/1 and fail/1.
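As a minimal sketch of the idea, comma/2 can be written in plain Erlang for the simplest possible case, where binding just hands the value to the continuation. The module name, the bodies of foo/0 and bar/2, and the {ok, B} return value are hypothetical and exist only so that the example produces something observable.

```erlang
-module(comma_sketch).
-export([my_function/0]).

%% Identity-style bind: pass the computed value to the continuation.
comma(Value, Next) ->
    Next(Value).

%% Hypothetical computations standing in for foo/0 and bar/2.
foo() -> 21.
bar(A, dog) -> A * 2.

my_function() ->
    comma(foo(), fun (A) ->
    comma(bar(A, dog), fun (B) ->
        {ok, B}
    end) end).
```

Because each continuation receives the result of the previous step as its argument, the order of the steps is fixed by data dependency rather than by the evaluation strategy of the language.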
Everything would be fine, but the syntax is just awful. Let's apply the syntax transformers from erlando:
    do([Monad ||
        A <- foo(),
        B <- bar(A, dog),
        ok]).
Types of Monads
Since the do-block is parameterized, we can use monads of various types. Inside the do-block, calls to return/1 and fail/1 are expanded to Monad:return/1 and Monad:fail/1 respectively.
Identity monad.
The identity monad is the simplest monad: it does not change the type of values and takes no part in controlling the computation. It is typically used together with monad transformers. It simply binds expressions together, which makes it the programmable comma discussed above.
Maybe monad.
A monad for computations that may have no value. Binding a present value to a parameterized computation passes the value to that computation; binding a missing value to a parameterized computation yields a missing result.
Consider an example using maybe_m:
    if_safe_div_zero(X, Y, Fun) ->
        do([maybe_m ||
            Result <- case Y == 0 of
                          true -> fail("Cannot divide by zero");
                          false -> return(X / Y)
                      end,
            return(Fun(Result))]).
Evaluation of the do-block stops as soon as nothing is returned. Note that / always produces a float in Erlang, so the successful result is 6.0, not 6:
    {just, 6.0} = if_safe_div_zero(10, 5, _+4),  %% 10/5 = 2.0 -> 2.0+4 -> 6.0
    nothing = if_safe_div_zero(10, 0, _+4).
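For readers who want to see the mechanics without the parse transform, here is a hand-rolled sketch of the same computation in plain Erlang. It assumes the {just, Value} / nothing representation used in the example above; the module name and the local bind/2, return/1 and fail/1 helpers are hypothetical stand-ins, not erlando's own code.

```erlang
-module(maybe_sketch).
-export([safe_div_then/3]).

return(X) -> {just, X}.
fail(_Reason) -> nothing.

%% A present value is passed on; a missing value short-circuits.
bind({just, X}, Fun) -> Fun(X);
bind(nothing, _Fun) -> nothing.

safe_div_then(X, Y, Fun) ->
    Step = case Y == 0 of
               true -> fail("Cannot divide by zero");
               false -> return(X / Y)
           end,
    bind(Step, fun (Result) -> return(Fun(Result)) end).
```

With these definitions, safe_div_then(10, 5, fun (R) -> R + 4 end) gives {just, 6.0}, while a zero divisor gives nothing and the supplied fun is never called.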
Error monad.
Similar to maybe_m, but with error handling. Sometimes the "let it crash" principle does not apply and errors must be handled where they occur. In that case, staircases of nested case expressions often appear in the code, for example:
    write_file(Path, Data, Modes) ->
        Modes1 = [binary, write | (Modes -- [binary, write])],
        case make_binary(Data) of
            Bin when is_binary(Bin) ->
                case file:open(Path, Modes1) of
                    {ok, Hdl} ->
                        case file:write(Hdl, Bin) of
                            ok ->
                                case file:sync(Hdl) of
                                    ok ->
                                        file:close(Hdl);
                                    {error, _} = E ->
                                        file:close(Hdl),
                                        E
                                end;
                            {error, _} = E ->
                                file:close(Hdl),
                                E
                        end;
                    {error, _} = E -> E
                end;
            {error, _} = E -> E
        end.

    make_binary(Bin) when is_binary(Bin) ->
        Bin;
    make_binary(List) ->
        try
            iolist_to_binary(List)
        catch error:Reason ->
            {error, Reason}
        end.
This is unpleasant to read and looks like callback spaghetti in JS. error_m comes to the rescue:
    write_file(Path, Data, Modes) ->
        Modes1 = [binary, write | (Modes -- [binary, write])],
        do([error_m ||
            Bin <- make_binary(Data),
            Hdl <- file:open(Path, Modes1),
            Result <- return(do([error_m ||
                                 file:write(Hdl, Bin),
                                 file:sync(Hdl)])),
            file:close(Hdl),
            Result]).

    make_binary(Bin) when is_binary(Bin) ->
        error_m:return(Bin);
    make_binary(List) ->
        try
            error_m:return(iolist_to_binary(List))
        catch error:Reason ->
            error_m:fail(Reason)
        end.
List monad.
The values are lists, which can be interpreted as several possible results of a single computation. If one computation depends on another, the second computation is performed for each result of the first, and the results of the second computation are collected into a list.
Consider the classic example of Pythagorean triples. First we compute them without monads:
    P = [{X, Y, Z} || Z <- lists:seq(1, 20),
                      X <- lists:seq(1, Z),
                      Y <- lists:seq(X, Z),
                      math:pow(X, 2) + math:pow(Y, 2) == math:pow(Z, 2)].
The same thing using list_m:
    P = do([list_m ||
            Z <- lists:seq(1, 20),
            X <- lists:seq(1, Z),
            Y <- lists:seq(X, Z),
            monad_plus:guard(list_m,
                             math:pow(X, 2) + math:pow(Y, 2) == math:pow(Z, 2)),
            return({X, Y, Z})]).
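The "perform the second computation for each result of the first" rule can be sketched in plain Erlang as a flat-map. The code below is an illustration under that assumption, not erlando's implementation: the module name and the local bind/2, return/1 and guard/1 helpers are hypothetical, and integer arithmetic replaces math:pow/2 to avoid floating-point comparison.

```erlang
-module(list_sketch).
-export([triples/1]).

%% Run Fun for every element and concatenate the resulting lists.
bind(List, Fun) -> lists:flatmap(Fun, List).

return(X) -> [X].

%% A passing guard yields one dummy result; a failing guard yields none,
%% which prunes that branch of the computation.
guard(true) -> [ok];
guard(false) -> [].

triples(Max) ->
    bind(lists:seq(1, Max), fun (Z) ->
    bind(lists:seq(1, Z), fun (X) ->
    bind(lists:seq(X, Z), fun (Y) ->
    bind(guard(X*X + Y*Y =:= Z*Z), fun (_) ->
        return({X, Y, Z})
    end) end) end) end).
```

For Max = 20 this produces six triples, starting with {3,4,5} and ending with {12,16,20}.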
State monad.
A monad for stateful computations.
At the very beginning of the article we talked about the difficulties beginners have when working with state that changes over time. Often the code looks something like this:
    State1 = init(Dimensions),
    State2 = plant_seeds(SeedCount, State1),
    {DidFlood, State3} = pour_on_water(WaterVolume, State2),
    State4 = apply_sunlight(Time, State3),
    {DidFlood2, State5} = pour_on_water(WaterVolume, State4),
    {Crop, State6} = harvest(State5),
    ...
Using a transformer and cut notation, this code can be rewritten in a more compact and readable form:
    StateT = state_t:new(identity_m),
    SM = StateT:modify(_),
    SMR = StateT:modify_and_return(_),
    StateT:exec(
      do([StateT ||
          StateT:put(init(Dimensions)),
          SM(plant_seeds(SeedCount, _)),
          DidFlood <- SMR(pour_on_water(WaterVolume, _)),
          SM(apply_sunlight(Time, _)),
          DidFlood2 <- SMR(pour_on_water(WaterVolume, _)),
          Crop <- SMR(harvest(_)),
          ...
          ]), undefined).
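To show what threading the state "behind the scenes" means, here is a small sketch in plain Erlang where a stateful step is modelled as a fun from a state to a {Result, NewState} pair. Everything in it is an assumption made for illustration: the module name, the local bind/2, return/1, modify/1 and modify_and_return/1 helpers, and the toy plant_seeds/2 and pour_on_water/2 functions operating on a map. It is not how state_t is implemented in erlando.

```erlang
-module(state_sketch).
-export([run/0]).

%% Run Step, then feed its result to Next and run the step it returns
%% against the updated state.
bind(Step, Next) ->
    fun (S0) ->
        {Result, S1} = Step(S0),
        (Next(Result))(S1)
    end.

return(X) -> fun (S) -> {X, S} end.

%% F :: State -> NewState
modify(F) -> fun (S) -> {ok, F(S)} end.

%% F :: State -> {Result, NewState}
modify_and_return(F) -> fun (S) -> F(S) end.

%% Hypothetical domain functions for the garden example.
plant_seeds(N, Garden) -> Garden#{seeds => N}.
pour_on_water(V, Garden = #{seeds := N}) -> {V > N, Garden#{water => V}}.

run() ->
    Program =
        bind(modify(fun (G) -> plant_seeds(3, G) end), fun (_) ->
        bind(modify_and_return(fun (G) -> pour_on_water(5, G) end), fun (DidFlood) ->
            return(DidFlood)
        end) end),
    %% Start from an empty garden.
    Program(#{}).
```

The numbered State1 ... StateN variables disappear because bind/2 passes the updated state from one step to the next.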
Omega monad.
Similar to the list_m monad, except that the results are traversed diagonally.
Hidden error handling
This is probably one of my favorite features of the error_m monad. No matter where the error occurs, the monad always returns either {ok, Result} or {error, Reason}. An example illustrating this behavior:
    do([error_m ||
        Hdl <- file:open(Path, Modes),
        Data <- file:read(Hdl, BytesToRead),
        file:write(Hdl, DataToWrite),
        file:sync(Hdl),
        file:close(Hdl),
        file:rename(Path, Path2),
        file:delete(Path),
        return(Data)]).
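The short-circuiting can be sketched in plain Erlang with a hand-written bind. The sketch assumes, as the file examples above suggest, that ok and {ok, Value} count as success and {error, Reason} as failure; the module name, bind/2, and the parse/1 and positive/1 steps are hypothetical and serve only to make the example self-contained.

```erlang
-module(error_sketch).
-export([run/1]).

%% Success continues with the next step; the first error is returned as is.
bind({ok, Value}, Fun) -> Fun(Value);
bind(ok, Fun) -> Fun(ok);
bind({error, _} = Error, _Fun) -> Error.

%% Hypothetical step 1: parse a string that must be a whole integer.
parse(Str) ->
    case string:to_integer(Str) of
        {Int, []} -> {ok, Int};
        _ -> {error, {not_an_integer, Str}}
    end.

%% Hypothetical step 2: accept only positive numbers.
positive(N) when N > 0 -> {ok, N};
positive(N) -> {error, {not_positive, N}}.

run(Str) ->
    bind(parse(Str), fun (N) ->
    bind(positive(N), fun (P) ->
        {ok, P * 2}
    end) end).
```

Whichever step fails, the caller sees a single {error, Reason} tuple and none of the later steps run.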
Import_as
For dessert, we have the import_as syntactic sugar. The standard -import/2 attribute lets you import functions from other modules into the local module. However, it does not let you give an imported function an alternative name. import_as solves this problem:
    -import_as({my_mod, [{size/1, m_size}]}).
    -import_as({my_other_mod, [{size/1, o_size}]}).
These attributes are expanded into real local functions, respectively:
    m_size(A) -> my_mod:size(A).
    o_size(A) -> my_other_mod:size(A).
Conclusion
Of course, monads let you control the flow of computation in more expressive ways and save both code and maintenance time. On the other hand, they add extra complexity for team members who are not familiar with them.
* In fact, monads exist in Erlang even without erlando: the comma that separates expressions is itself a construct for sequencing and binding computations.
P.S. The authors recently marked the erlando library as archived. I wrote this article more than a year ago. Then, as now, there was no information about monads in Erlang on Habr. To remedy this, I am publishing the article, albeit belatedly.
To use erlando with Erlang >= 22, you need to fix the problem with the deprecated erlang:get_stacktrace/0. An example of the fix can be found in my fork: https://github.com/Vonmo/erlando/commit/52e23ecedd2b8c13707a11c7f0f14496b5a191c2
Thank you for your time!