Earlier today I found myself needing to join something in Erlang, and after spending a few minutes looking around for a join function in the lists module, realized that there wasn't one. Raising the question, how does one implement join?
The simplest way to implement join is to use a reduce/fold. Erlang provides these in lists:foldl and lists:foldr, which fold to the left and right respectively. I chose to fold from the left, and thus my function used for the fold looked like this:
% Char is assumed to exist.
JoinFunction = fun(Element, Acc) -> lists:append([Acc, Char, Element]) end.
Then we'll apply it to a list.
% List is assumed to exist.
InitialAcc = "",
Results = lists:foldl(JoinFunction, InitialAcc, List).
Now its pretty much working, but we still have a slight problem: it will be preceded by an extra copy of the joining list (for example joining ["one", "two", "three"] using ", " would look like ", one, two, three"). Fortunately, we are dealing with linked lists, so chopping off that extra copy is a O(c) operation (where c is the number of elements in the joining list).
Throw in some protection against an empty list as the list of elements to join, and we end up with a pretty simple solution.
simple_join(_, []) ->
"";
simple_join(Char, List) ->
P = fun(Elem, AccIn) -> lists:append([AccIn, Char, Elem]) end,
lists:nthtail(length(Char), lists:foldl(P, "", List)).
I'm pretty content with that solution, but it felt like it would be worth throwing together a solution without using fold.
There are three situations that need to be handled by the solution.
- Is this the first element? If so, then recurse with the accumulated list equal to list[0], and the list of elements to process equal to list[1:].
- Are there no remaining elements? If so, return the accumulated list.
- Else, recursively call the function on list[1:], and append the joining character and list[0] to the accumulated list.
Although not particularly concise by the number of lines metric, I find the Erlang solution to be quite pleasing.
simple_join(Char, List) ->
simple_join(Char, List, []).
simple_join(_, [], Acc) ->
Acc;
simple_join(Char, [First|Rest], []) ->
simple_join(Char, Rest, First);
simple_join(Char, [First|Rest], Acc) ->
simple_join(Char, Rest, lists:append([Acc, Char, First])).
My favorite part is that all of the logic is handled by pattern matching against function definitions, instead of handled by explicit case or if/else constructions. As far as readability goes, I think its worth comparing it to a roughly equivalent1 function in Scheme.
(define (simple-join char list)
(define (sj char list acc)
(cond ((null? acc) (sj char (cdr list) (car list)))
((null? (cdr list)) acc)
(else (sj char (cdr list) (append acc char (car list))))))
(sj char list (list)))
Looking at the two, I think the Erlang implementation suffers because most programmers (me being no exception) are more familiar with the if-else paradigm than with pattern matching. However, I find the Erlang solution less cluttered, and easier to parse. All in all, I prefer the Erlang solution now that I have become a bit familiar with Erlang and its concepts.
It does almost exactly the same thing, but what the value of doing so is questionable. Strings in Erlang are lists, while strings in Scheme are not (in Common Lisp they are implemented as vectors, and I imagine Scheme is the same, but I am uncertain). As it is, the Scheme version simply joins together lists, not strings.↩
On a side note, the Erlang syntax coloring makes me cry each time I see it, I'll have to see if Pygments 1.0 improves upon on, or I will goaded into action myself.
Hi,
Here is perhaps an alternative:
implode(Data, Seperator) when is_list(Data) andalso is_list(Seperator) -> lists:foldr(fun(X,[]) -> X; (X,Acc) -> X++Seperator++Acc end, "", Data).
Cheers
First, restoring some syntax coloring/formatting that my comments probably previously stripped (sorry!).
implode(Data, Seperator) when is_list(Data) andalso is_list(Seperator) -> lists:foldr(fun(X,[]) -> X; (X,Acc) -> X++Seperator++Acc end, "", Data).Second, that is a pretty impressive solution. Certainly more so than either of mine. Thanks.
One thing I will say about functional programming, is that I find it more difficult to glance at a piece of code and make sense of it quickly, I suppose because functional code tends to be more compact. I suppose it'll come with practice. ;)
I think this is slightly easier to read, no? implode([], _) -> []; implode([H|T], Seperator) when is_list(Seperator) -> lists:foldl(fun(X, Acc) -> Acc++Seperator++X end, H, T).But that was slower than the stdlib string implementation.
I wrote this, thinking it would be faster than the stdlib version:
join2([H|T], Sep) -> lists:concat([H|[P || X <- T, P <- [Sep, X]]]).It wasn't. But this is: It's about 20% faster on my machine, and 50% faster with hipe. Of course it's not tail recursive, so performance may suffer for extremely long lists.Is that not what the
joinfunction in 'string' does?Well, I might be laughing at myself ever so slightly. I might say that I never realized there was a string module, seeing as all the documents were so fevered in explaining how strings are merely lists and that one simply uses list functions upon them.
Or, in summary: yeah, I fucked that one up pretty hard.
Not really... your post still has value in itself, illustrating two points: how to do a join yourself, and how hard it is to navigate the library documentation to find things.
The next step is to compare your implementation to the stdlib one:
join([H|T], Sep) -> H ++ lists:concat([Sep ++ X || X <- T]).That's more concise, but probably not as readable. And an elegant solution to the problem with the initial element.
In general this strikes me as very elegant, and makes clear to me that I myself am still too far stuck in the wrong paradigm. I would have implemented it more like you did.
stdlib 1.15.2 has string:join/2 which does what you want I think.
I think the string:join/2 function was introduced in OTP in recent releases. There is a recipe in trapexit for exactly this problem: http://www.trapexit.org/String_join_with
Reply to this entry