Kuniaki Mukai
2014-09-09 13:44:03 UTC
Hi,
I have added a sed-like command for DCG phrases, based on
the regular expression compiler that I posted
in my previous message.
The following examples quickly show what the sed command in DCG does.
The first example removes all occurrences of alphabet letters.
% ?- phrase(sed(wl("[a-zA-Z]"), =([])), `1a2b3c4d56e`, V), string_codes(S, V).
%@ S = "123456"
The second one swaps adjacent pairs of letters.
% ?- phrase(sed((w(".", A), w(".", B)), append(B, A)), `abcdefg`, V), string_codes(S, V).
%@ S = "badcfeg" .
More generally, a sed expression sed(<DCG Phrase>, <pred/1>) replaces each part of the
input codes matched by <DCG Phrase> with the value returned by calling <pred/1>.
The following clause, expand_sed/6, is almost all that I have added to
my "PAC" library for the "sed".
expand_sed(Words, Func_in_pac, Mod, G_callable, List, List4):-
    pac:expand_phrase(Words, Mod, Phrase, List, List0),
    pac:expand_arg(Func_in_pac, Mod, Func_callable, List0, List1),
    pac:phrase_to_pred(Phrase, Mod, [P, P0]:- Expanded_phrase, List1, List2),
    pac:expand_core(
        rec(Sed_name, [], ( [L0, L1, P, Q] :- Expanded_phrase, !,
                                call(Func_callable, Val),
                                append(Val, L2, L0),
                                call(Sed_name, L2, L1, P0, Q))
                        & ( [[C|Z0], Z, [C|P], Q]:- call(Sed_name, Z0, Z, P, Q))
                        & [R, R, [], []] ),
        Mod, Sed_main, List2, List3),
    pac:expand_core(pred([U,V]:- call(Sed_main, V, [], U, [])), Mod, G_callable, List3, List4).
In this clause, expand_phrase, expand_arg, phrase_to_pred, and expand_core are
basic tools in my private library PAC, and rec(F, V, P) is a PAC
expression for writing recursive anonymous predicates, which recurse
in the same way as a named recursive predicate such as append/3.
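For readers without PAC, the three alternatives inside rec(...) can be read as an ordinary three-clause predicate. The following is only a minimal sketch in plain SWI-Prolog 7, not the PAC implementation: sed_sketch/4, letter//0 and any//1 are hypothetical names, phrase/3 stands in for the expanded phrase, and letter//0 and any//1 stand in for the regular expressions "[a-zA-Z]" and ".". The copy_term/2 call (to get fresh A and B at each match) and the empty-match guard are assumptions added here and do not appear in expand_sed/6.

```prolog
% sed_sketch(+Pattern, +Func, +In, -Out)
% Pattern is a DCG body, Func a closure called with one extra argument
% that returns the replacement codes (both as in sed(Pattern, Func)).

% Clause 1: Pattern matches at the current position. Commit, call Func,
% emit its value, and continue after the matched part.
sed_sketch(Pattern, Func, In, Out) :-
    copy_term(Pattern-Func, P-F),   % assumption: fresh variables per match
    phrase(P, In, Rest),
    In \== Rest,                    % assumption: skip empty matches
    !,
    call(F, Val),
    append(Val, Out0, Out),
    sed_sketch(Pattern, Func, Rest, Out0).
% Clause 2: no match here, so copy one code unchanged.
sed_sketch(Pattern, Func, [C|In], [C|Out]) :-
    sed_sketch(Pattern, Func, In, Out).
% Clause 3: end of input.
sed_sketch(_, _, [], []).

% Plain DCG stand-ins for the regular expressions "[a-zA-Z]" and ".".
letter --> [C], { C >= 0'a, C =< 0'z ; C >= 0'A, C =< 0'Z }.
any([C]) --> [C].

% ?- sed_sketch(letter, =([]), `1a2b3c4d56e`, V), string_codes(S, V).
% S = "123456"
% ?- sed_sketch((any(A), any(B)), append(B, A), `abcdefg`, V), string_codes(S, V).
% S = "badcfeg"
```

As in the clause above, the cut after a successful match means an earlier match is never undone in favour of copying a single code.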
The following DCG rule removes all occurrences of TeX control sequences from the codes of a TeX text.
sed_for_detex --> sed(wl( "\\\\[a-zA-Z]+"
        | "\\\\[\\!\\\"\\#\\$\\'\\(\\)\\=\\-\\~\\^\\\\\\|\\`\\@\\{\\[\\}\\]\\*\\:\\+\\;\\<\\>\\,\\.]"),
    pred([[]])).
Here w and wl select the "shortest match first" and "longest match first" search modes, respectively.
% ?- sed_for_detex(`\\Large Hello\\; World! `, X), string_codes(Y, X).
%@ Y = " Hello World! " .
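The difference between the two search modes can be illustrated in plain DCG by the order in which candidate matches are enumerated. This is only a sketch under that reading of "shortest/longest match first", not how the regular expression compiler is implemented; alpha_code//1, letters_longest//1 and letters_shortest//1 are hypothetical names standing in for "[a-zA-Z]+" under wl and w.

```prolog
% A single ASCII letter.
alpha_code(C) --> [C], { C >= 0'a, C =< 0'z ; C >= 0'A, C =< 0'Z }.

% One or more letters, trying the longest candidate first (like wl).
letters_longest([C|Cs]) -->
    alpha_code(C),
    ( letters_longest(Cs) ; { Cs = [] } ).

% One or more letters, trying the shortest candidate first (like w).
letters_shortest([C|Cs]) -->
    alpha_code(C),
    ( { Cs = [] } ; letters_shortest(Cs) ).

% ?- phrase(letters_longest(M), `Large x`, R), atom_codes(A, M).
% first answer: A = 'Large'
% ?- phrase(letters_shortest(M), `Large x`, R), atom_codes(A, M).
% first answer: A = 'L'
```

Under this reading, the longest mode is what lets the detex rule consume the whole of \Large rather than only \L.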
Thank you in advance for any pointers to related work.
Documentation is not available yet, sorry. I write documentation exceptionally slowly,
at a turtle's pace.
Regards,
Kuniaki