How to correctly write regular expression to match ASCII control chars
I would like to create a regular expression in Elisp (in the standard 'read' form) that matches extended ASCII characters the same way PCRE does:
^[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*$
I'm currently confused about \x7f-\xff. Is there a way to specify a range using something like \xhh?
regular-expressions
I think the answer depends on whether you're matching against unibyte or multibyte strings. Do you think À (which is undefined in ASCII, 0xC0 in latin-1 and Unicode, but encoded as 0xC380 in UTF-8) falls into the range 0x7F-0xFF?
– npostavs
Apr 14 at 15:34
I think so. At least PCRE matched À as a char in 0x7F-0xFF range. I need the same behavior for standard Elisp regular expression.
– serghei
Apr 14 at 16:13
And as far as I can see, À is defined in ASCII: ascii-code.com. 0xC0 is between 0x7F and 0xFF
– serghei
Apr 14 at 16:19
asked Apr 14 at 14:54, edited Apr 14 at 15:59
serghei
1 Answer
You can use DEL-ÿ, typed as the literal characters, instead of \x7f-\xff. The first character, DEL, which StackExchange prints as a space, has codepoint 127 (decimal), #o177 (octal), and #x7f (hexadecimal).
That is, you can just insert the characters themselves in the regexp pattern.
One way to input such characters is to use C-x 8 RET. To search for any char in the range \x7f through \xff, you would type this at the C-M-s prompt (without the spaces):
[ C-x 8 RET # x 7 f - C-x 8 RET # x f f ]
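For a regexp written as a Lisp string rather than typed at a prompt, the same range can probably be spelled with string escapes instead of literal characters. The following is only a sketch: the names my-label-regexp and my-label-p are made up for illustration, and it assumes you are matching against multibyte (character) text. The upper bound is written as \u00ff rather than \xff because, as far as I understand the Elisp reader, a string whose only non-ASCII content is \x80..\xff escapes is treated as unibyte, so \xff would mean a raw byte rather than the character ÿ.

```elisp
;; Sketch only: `my-label-regexp' and `my-label-p' are hypothetical names.
(defconst my-label-regexp
  ;; "\x7f" is read as the DEL character (codepoint 127).
  ;; "\u00ff" is read as the character U+00FF and makes the string
  ;; multibyte, so the range covers characters, not raw bytes.
  ;; \\` and \\' anchor at the start and end of the string.
  "\\`[a-zA-Z_\x7f-\u00ff][a-zA-Z0-9_\x7f-\u00ff]*\\'"
  "Regexp for labels that may contain characters U+007F..U+00FF.")

(defun my-label-p (string)
  "Return non-nil if STRING as a whole matches `my-label-regexp'."
  (let ((case-fold-search nil))
    (string-match-p my-label-regexp string)))

;; "\u00c0" is À (U+00C0), which lies inside the range.
(my-label-p "\u00c0bc_1")   ; non-nil
(my-label-p "1abc")         ; nil, a digit cannot start a label
```

Whether this matches what PCRE does depends on how the subject text is encoded on the PCRE side, as the comments on the question point out.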
answered Apr 14 at 17:46, edited Apr 14 at 17:52
Drew