Robustness
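Several things are wrong with the messenger example in A Larger Example. For example, if a node where a user is logged on goes down without doing a logoff, the user remains in the server's User_List, but the client disappears. This makes it impossible for the user to log on again, because the server thinks the user is already logged on.
Or what happens if the server goes down in the middle of sending a message, leaving the sending client hanging forever in the await_result function?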
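Time-outs
Before improving the messenger program, let us look at some general principles, using the ping pong program as an example. Recall that when "ping" finishes, it tells "pong" that it has done so by sending the atom finished as a message to "pong", so that "pong" can also finish. Another way to let "pong" finish is to make "pong" exit if it does not receive a message from ping within a certain time. This can be done by adding a time-out to pong, as shown in the following example: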
-module(tut19).

-export([start_ping/1, start_pong/0, ping/2, pong/0]).

ping(0, Pong_Node) ->
    io:format("ping finished~n", []);

ping(N, Pong_Node) ->
    {pong, Pong_Node} ! {ping, self()},
    receive
        pong ->
            io:format("Ping received pong~n", [])
    end,
    ping(N - 1, Pong_Node).

pong() ->
    receive
        {ping, Ping_PID} ->
            io:format("Pong received ping~n", []),
            Ping_PID ! pong,
            pong()
    after 5000 ->
            io:format("Pong timed out~n", [])
    end.

start_pong() ->
    register(pong, spawn(tut19, pong, [])).

start_ping(Pong_Node) ->
    spawn(tut19, ping, [3, Pong_Node]).
After this is compiled and the file tut19.beam is copied to the necessary directories, the following is seen on (pong@kosken):
(pong@kosken)1> tut19:start_pong().
true
Pong received ping
Pong received ping
Pong received ping
Pong timed out
And the following is seen on (ping@gollum):
(ping@gollum)1> tut19:start_ping(pong@kosken).
<0.36.0>
Ping received pong
Ping received pong
Ping received pong
ping finished
The time-out is set in:
pong() ->
    receive
        {ping, Ping_PID} ->
            io:format("Pong received ping~n", []),
            Ping_PID ! pong,
            pong()
    after 5000 ->
            io:format("Pong timed out~n", [])
    end.
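The time-out (after 5000) is started when receive is entered. The time-out is canceled if {ping,Ping_PID} is received. If {ping,Ping_PID} is not received, the actions following the time-out are done after 5000 milliseconds. after must be last in the receive, that is, preceded by all other message reception specifications in the receive. It is also possible to call a function that returns an integer for the time-out: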
after pong_timeout() ->
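A minimal sketch of a computed time-out follows. The module name timeout_sketch, the function wait_for_ping/0, and the 500 millisecond value are assumptions made for illustration and are not part of the tutorial's programs.

```erlang
-module(timeout_sketch).

-export([pong_timeout/0, wait_for_ping/0]).

%% Hypothetical helper: returns the time-out in milliseconds.
pong_timeout() ->
    500.

%% Waits for a single ping and answers it. Gives up if no ping
%% arrives within pong_timeout() milliseconds.
wait_for_ping() ->
    receive
        {ping, Ping_PID} ->
            Ping_PID ! pong,
            got_ping
    after pong_timeout() ->
            timed_out
    end.
```

The expression after the keyword after is evaluated when the receive is entered, so the value could just as well be read from configuration at that point.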
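In general, there are better ways than using time-outs to supervise parts of a distributed Erlang system. Time-outs are usually appropriate for supervising external events, for example, if you expect a message from some external system within a specified time. For example, a time-out can be used to log a user out of the messenger system if they have not accessed it for ten minutes.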
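Error Handling
Before going into the details of supervision and error handling in an Erlang system, let us look at how Erlang processes terminate, or in Erlang terminology, exit.
A process that executes exit(normal) or simply runs out of things to do has a normal exit.
A process that encounters a runtime error (for example, divide by zero, bad match, trying to call a function that does not exist, and so on) exits with an error, that is, it has an abnormal exit. A process that executes exit(Reason), where Reason is any Erlang term except the atom normal, also has an abnormal exit.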
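Erlang processes can set up links to other Erlang processes. If a process calls link(Other_Pid), it sets up a bidirectional link between itself and the process called Other_Pid. When a process terminates, it sends something called a signal to all the processes it has links to.
The signal carries information about the pid it was sent from and the exit reason.
The default behaviour of a process that receives a normal exit is to ignore the signal.
The default behaviour in the two other cases above (that is, abnormal exit) is to: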
- Bypass all messages to the receiving process.
- Kill the receiving process.
- Propagate the same error signal to the links of the killed process.
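In this way you can connect all processes in a transaction together using links. If one of the processes exits abnormally, all the processes in the transaction are killed. As it is often wanted to create a process and link to it at the same time, there is a special BIF, spawn_link, that does the same as spawn, but also creates a link to the spawned process.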
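The propagation of an abnormal exit over a link can be sketched on a single node as follows. The module name link_sketch and the exit reason child_failed are assumptions made for illustration. The sketch also uses the spawn_monitor BIF, which this section does not cover, only so that run/0 can observe why the parent terminated.

```erlang
-module(link_sketch).

-export([run/0, parent/0, child/0]).

%% The child waits briefly, then exits abnormally.
child() ->
    receive
    after 100 ->
            exit(child_failed)
    end.

%% The parent creates the child and links to it in one step,
%% then waits for a message that is never sent.
parent() ->
    spawn_link(link_sketch, child, []),
    receive
        never_sent ->
            ok
    end.

%% Starts the parent and returns the reason it terminated with.
%% spawn_monitor is used here only to observe the parent.
run() ->
    {Pid, Ref} = spawn_monitor(link_sketch, parent, []),
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            Reason
    after 2000 ->
            still_alive
    end.
```

Because the parent does not trap exits, the exit signal from the child kills it with the same reason, so run/0 is expected to return child_failed.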
Now an example of the ping pong program that uses links to terminate "pong":
-module(tut20).

-export([start/1, ping/2, pong/0]).

ping(N, Pong_Pid) ->
    link(Pong_Pid),
    ping1(N, Pong_Pid).

ping1(0, _) ->
    exit(ping);

ping1(N, Pong_Pid) ->
    Pong_Pid ! {ping, self()},
    receive
        pong ->
            io:format("Ping received pong~n", [])
    end,
    ping1(N - 1, Pong_Pid).

pong() ->
    receive
        {ping, Ping_PID} ->
            io:format("Pong received ping~n", []),
            Ping_PID ! pong,
            pong()
    end.

start(Ping_Node) ->
    PongPID = spawn(tut20, pong, []),
    spawn(Ping_Node, tut20, ping, [3, PongPID]).
(s1@bill)3> tut20:start(s2@kosken).
Pong received ping
<3820.41.0>
Ping received pong
Pong received ping
Ping received pong
Pong received ping
Ping received pong
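This is a slight modification of the ping pong program where both processes are spawned from the same start/1 function, and the "ping" process can be spawned on a separate node. Notice the use of the link BIF. "Ping" calls exit(ping) when it finishes, and this causes an exit signal to be sent to "pong", which also terminates.
It is possible to modify the default behaviour of a process so that it does not get killed when it receives abnormal exit signals. Instead, all signals are turned into normal messages of the format {'EXIT',FromPID,Reason} and added to the end of the receiving process's message queue. This behaviour is set by: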
process_flag(trap_exit, true)
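There are several other process flags, see erlang(3). Changing the default behaviour of a process in this way is usually not done in standard user programs, but is left to the supervisory programs in OTP. However, the ping pong program is modified here to illustrate exit trapping.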
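As a single-node illustration of exit trapping, the following sketch shows the message that a trapping process receives when a linked process exits. The module name trap_sketch and the tuples trapped and exit_from_child are assumptions made for illustration. The trapping is done in a separate process so that the flags of the calling process are left unchanged.

```erlang
-module(trap_sketch).

-export([run/1]).

%% Starts a trapping process and returns what it reports about
%% a linked child that exits with Reason.
run(Reason) ->
    Caller = self(),
    spawn(fun() -> trapper(Caller, Reason) end),
    receive
        {trapped, Msg} ->
            Msg
    after 2000 ->
            no_exit_message
    end.

%% Traps exits, links to a child that exits at once, and forwards
%% the reason carried by the resulting 'EXIT' message.
trapper(Caller, Reason) ->
    process_flag(trap_exit, true),
    Pid = spawn_link(fun() -> exit(Reason) end),
    receive
        {'EXIT', Pid, Why} ->
            Caller ! {trapped, {exit_from_child, Why}}
    end.
```

Calling trap_sketch:run(ping) is expected to return {exit_from_child,ping}. With trap_exit set, a normal exit is also delivered as a message, so trap_sketch:run(normal) is expected to return {exit_from_child,normal}.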
-module(tut21).

-export([start/1, ping/2, pong/0]).

ping(N, Pong_Pid) ->
    link(Pong_Pid),
    ping1(N, Pong_Pid).

ping1(0, _) ->
    exit(ping);

ping1(N, Pong_Pid) ->
    Pong_Pid ! {ping, self()},
    receive
        pong ->
            io:format("Ping received pong~n", [])
    end,
    ping1(N - 1, Pong_Pid).

pong() ->
    process_flag(trap_exit, true),
    pong1().

pong1() ->
    receive
        {ping, Ping_PID} ->
            io:format("Pong received ping~n", []),
            Ping_PID ! pong,
            pong1();
        {'EXIT', From, Reason} ->
            io:format("pong exiting, got ~p~n", [{'EXIT', From, Reason}])
    end.

start(Ping_Node) ->
    PongPID = spawn(tut21, pong, []),
    spawn(Ping_Node, tut21, ping, [3, PongPID]).
(s1@bill)1> tut21:start(s2@gollum).
<3820.39.0>
Pong received ping
Ping received pong
Pong received ping
Ping received pong
Pong received ping
Ping received pong
pong exiting, got {'EXIT',<3820.39.0>,ping}
The Larger Example with Robustness Added
Let us return to the messenger program and add changes to make it more robust:
%%% Message passing utility.
%%% User interface:
%%% login(Name)
%%% One user at a time can log in from each Erlang node in the
%%% system messenger: and choose a suitable Name. If the Name
%%% is already logged in at another node or if someone else is
%%% already logged in at the same node, login will be rejected
%%% with a suitable error message.
%%% logoff()
%%% Logs off anybody at that node
%%% message(ToName, Message)
%%% sends Message to ToName. Error messages if the user of this
%%% function is not logged on or if ToName is not logged on at
%%% any node.
%%%
%%% One node in the network of Erlang nodes runs a server which maintains
%%% data about the logged on users. The server is registered as "messenger"
%%% Each node where there is a user logged on runs a client process registered
%%% as "mess_client"
%%%
%%% Protocol between the client processes and the server
%%% ----------------------------------------------------
%%%
%%% To server: {ClientPid, logon, UserName}
%%% Reply {messenger, stop, user_exists_at_other_node} stops the client
%%% Reply {messenger, logged_on} logon was successful
%%%
%%% When the client terminates for some reason
%%% To server: {'EXIT', ClientPid, Reason}
%%%
%%% To server: {ClientPid, message_to, ToName, Message} send a message
%%% Reply: {messenger, stop, you_are_not_logged_on} stops the client
%%% Reply: {messenger, receiver_not_found} no user with this name logged on
%%% Reply: {messenger, sent} Message has been sent (but no guarantee)
%%%
%%% To client: {message_from, Name, Message},
%%%
%%% Protocol between the "commands" and the client
%%% ----------------------------------------------
%%%
%%% Started: messenger:client(Server_Node, Name)
%%% To client: logoff
%%% To client: {message_to, ToName, Message}
%%%
%%% Configuration: change the server_node() function to return the
%%% name of the node where the messenger server runs
-module(messenger).
-export([start_server/0, server/0,
         logon/1, logoff/0, message/2, client/2]).

%%% Change the function below to return the name of the node where the
%%% messenger server runs
server_node() ->
    messenger@super.

%%% This is the server process for the "messenger"
%%% the user list has the format [{ClientPid1, Name1},{ClientPid2, Name2},...]
server() ->
    process_flag(trap_exit, true),
    server([]).

server(User_List) ->
    receive
        {From, logon, Name} ->
            New_User_List = server_logon(From, Name, User_List),
            server(New_User_List);
        {'EXIT', From, _} ->
            New_User_List = server_logoff(From, User_List),
            server(New_User_List);
        {From, message_to, To, Message} ->
            server_transfer(From, To, Message, User_List),
            io:format("list is now: ~p~n", [User_List]),
            server(User_List)
    end.

%%% Start the server
start_server() ->
    register(messenger, spawn(messenger, server, [])).

%%% Server adds a new user to the user list
server_logon(From, Name, User_List) ->
    %% check if logged on anywhere else
    case lists:keymember(Name, 2, User_List) of
        true ->
            From ! {messenger, stop, user_exists_at_other_node},  %reject logon
            User_List;
        false ->
            From ! {messenger, logged_on},
            link(From),
            [{From, Name} | User_List]        %add user to the list
    end.

%%% Server deletes a user from the user list
server_logoff(From, User_List) ->
    lists:keydelete(From, 1, User_List).

%%% Server transfers a message between user
server_transfer(From, To, Message, User_List) ->
    %% check that the user is logged on and who he is
    case lists:keysearch(From, 1, User_List) of
        false ->
            From ! {messenger, stop, you_are_not_logged_on};
        {value, {_, Name}} ->
            server_transfer(From, Name, To, Message, User_List)
    end.

%%% If the user exists, send the message
server_transfer(From, Name, To, Message, User_List) ->
    %% Find the receiver and send the message
    case lists:keysearch(To, 2, User_List) of
        false ->
            From ! {messenger, receiver_not_found};
        {value, {ToPid, To}} ->
            ToPid ! {message_from, Name, Message},
            From ! {messenger, sent}
    end.

%%% User Commands
logon(Name) ->
    case whereis(mess_client) of
        undefined ->
            register(mess_client,
                     spawn(messenger, client, [server_node(), Name]));
        _ -> already_logged_on
    end.

logoff() ->
    mess_client ! logoff.

message(ToName, Message) ->
    case whereis(mess_client) of % Test if the client is running
        undefined ->
            not_logged_on;
        _ -> mess_client ! {message_to, ToName, Message},
             ok
    end.

%%% The client process which runs on each user node
client(Server_Node, Name) ->
    {messenger, Server_Node} ! {self(), logon, Name},
    await_result(),
    client(Server_Node).

client(Server_Node) ->
    receive
        logoff ->
            exit(normal);
        {message_to, ToName, Message} ->
            {messenger, Server_Node} ! {self(), message_to, ToName, Message},
            await_result();
        {message_from, FromName, Message} ->
            io:format("Message from ~p: ~p~n", [FromName, Message])
    end,
    client(Server_Node).

%%% wait for a response from the server
await_result() ->
    receive
        {messenger, stop, Why} -> % Stop the client
            io:format("~p~n", [Why]),
            exit(normal);
        {messenger, What} ->  % Normal response
            io:format("~p~n", [What])
    after 5000 ->
            io:format("No response from server~n", []),
            exit(timeout)
    end.
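The following changes are added:
The messenger server traps exits. If it receives an exit signal, {'EXIT',From,Reason}, this means that a client process has terminated or is unreachable for one of the following reasons: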
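- The user has logged off (the "logoff" message to the server has been removed).
- The network connection to the client is broken.
- The node on which the client process resides has gone down.
- The client process has done some illegal operation.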
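If an exit signal is received as above, the tuple {From,Name} is deleted from the server's User_List using the server_logoff function. If the node on which the server runs goes down, an exit signal (automatically generated by the system) is sent to all of the client processes: {'EXIT',MessengerPID,noconnection}, causing all the client processes to terminate.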
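Also, a time-out of five seconds has been introduced in the await_result function. That is, if the server does not reply within five seconds (5000 ms), the client terminates. This is only needed in the logon sequence before the client and the server are linked.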
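An interesting case is if the client terminates before the server links to it. This is taken care of because linking to a non-existent process causes an exit signal, {'EXIT',From,noproc}, to be automatically generated. This is as if the process terminated immediately after the link operation.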
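The noproc case can be sketched on a single node as follows. The module name noproc_sketch and the result tuple are assumptions made for illustration, and spawn_monitor is used only to make sure that the target process has terminated before link is called. The sketch relies on the linking process trapping exits, as the messenger server does.

```erlang
-module(noproc_sketch).

-export([run/0]).

%% Returns the exit reason a trapping process receives when it
%% links to a process that has already terminated.
run() ->
    Caller = self(),
    spawn(fun() -> linker(Caller) end),
    receive
        {result, Reason} ->
            Reason
    after 2000 ->
            no_result
    end.

linker(Caller) ->
    process_flag(trap_exit, true),
    %% Start a process that finishes at once and wait until it is gone.
    {Dead, Ref} = spawn_monitor(fun() -> ok end),
    receive
        {'DOWN', Ref, process, Dead, _} ->
            ok
    end,
    %% Linking to the terminated process generates an exit signal,
    %% which arrives as a message because exits are trapped.
    link(Dead),
    receive
        {'EXIT', Dead, Reason} ->
            Caller ! {result, Reason}
    after 1000 ->
            Caller ! {result, no_exit_signal}
    end.
```

Under these assumptions, noproc_sketch:run() is expected to return noproc, which is the same reason the messenger server would see in its {'EXIT',From,_} clause.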