Uwe C. Schroeder wrote:
Hi,
I think I just found an error in the new masked control.
My guess is that it's also in the just released new version.
If using the EMAIL autoformat the characters excluded are wrong.
It allows an underscore (which is fine in the user part of the email address),
but disallows a dash (-) in the domain part (which is legal there) and also
allows the underscore in the domain part, which is wrong.
AFAIK there are no domains with an underscore.
I am currently in the process of making some fixes and
enhancements to maskededit.py, and am trying to fix this issue
while I'm at it.
The regular expression I was using to do validation is incorrect
because I'm using \w, which includes _ but does not include -.
I was also being too restrictive in top-level domain choices,
so I looked up RFC 822 and found a Backus-Naur form for email
address syntax, but no regexp...
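To make the \w problem concrete, here is a minimal Python sketch. The two
label patterns are illustrative stand-ins, not the code in maskededit.py,
and the names OLD_LABEL and NEW_LABEL are made up for this example.

```python
import re

# Hypothetical stand-ins for one domain label; not taken from maskededit.py.
OLD_LABEL = re.compile(r"\w+$")                            # \w covers letters, digits and _
NEW_LABEL = re.compile(r"[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$")  # alphanumerics joined by single dashes

for label in ("x_y", "x-y"):
    # Show which pattern accepts each label.
    print(label, bool(OLD_LABEL.match(label)), bool(NEW_LABEL.match(label)))
```

The \w-based pattern accepts "x_y" and rejects "x-y"; the explicit character
class does the opposite, which is the behaviour wanted for domain names.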
Finding it difficult to believe that this wasn't a solved
problem, I did some Google research to find "standard" regular
expressions for validating email addresses. While this
turned up some examples, they all had their flaws (usually
being too permissive).
Cobbling together many of them, I've come up with the following
regexp that I think does the job, and also adds support for
user@[<ip addr>].
^\w+([\-\.]\w+)*@((([a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*\.)+)[a-zA-Z]{2,4}|\[(\d|\d\d|(1\d\d|2[0-4]\d|25[0-5]))(\.(\d|\d\d|(1\d\d|2[0-4]\d|25[0-5]))){3}\]) *$
Explanation of regexp (in chunks):
user part:
\w+([\-\.]\w+)*
# 1 or more alphanumeric or _
# followed by 0 or more sequences of:
#     either - or .
#     followed by 1 or more alphanumeric or _
@
domain part:
(([a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*\.)+)[a-zA-Z]{2,4}
# 1 or more sequences of:
#     1 or more alphanum
#     followed by 0 or more sequences of:
#         '-' followed by 1 or more alphanum
#     followed by '.'
# followed by 2-4 alphabetic characters
or
(\[(\d|\d\d|(1\d\d|2[0-4]\d|25[0-5]))(\.(\d|\d\d|(1\d\d|2[0-4]\d|25[0-5]))){3}\])
# [ followed by one of:
#     1 digit,
#     2 digits,
#     1 followed by 2 digits,
#     2 followed by a digit between 0 and 4 and another digit,
#     or 25 followed by a digit between 0 and 5
# followed by 3 sequences of:
#     . followed by the same pattern as above
# followed by ]
(All of this is followed by any number of spaces, to allow for padding on
the right side of the control value.)
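As a sketch of how the chunks above fit together, the following Python
assembles them back into the full expression. The names USER, HOST, OCTET,
IP_LITERAL, EMAIL and is_valid_email are made up for this example and are
not the names used in maskededit.py.

```python
import re

# Chunk names are hypothetical; the chunk bodies are the ones explained above.
USER = r"\w+([\-\.]\w+)*"
HOST = r"(([a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*\.)+)[a-zA-Z]{2,4}"
OCTET = r"(\d|\d\d|(1\d\d|2[0-4]\d|25[0-5]))"
IP_LITERAL = r"\[" + OCTET + r"(\." + OCTET + r"){3}\]"

# user @ (host name | bracketed IP address), then optional trailing spaces
EMAIL = re.compile("^" + USER + "@(" + HOST + "|" + IP_LITERAL + ") *$")


def is_valid_email(text):
    """Return True if text matches the assembled pattern."""
    return EMAIL.match(text) is not None
```

With this, is_valid_email("abc@x-y.com   ") is accepted despite the trailing
padding, while is_valid_email("abc@x_y.com") is rejected because of the
underscore in the domain part.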
Some of my test expressions:
expr                    pass    fail
====================================
a@x.o                           x
a@x.or                  x
Ab.Cd@x.org             x
.b@x.org                        x
b.@x.org                        x
_b@x.org                x
abc@x-y.com             x
abc@-y.com                      x
A_b@x-y.co.uk           x
A_b@x-y.co.u                    x
abc@.org                        x
abc@[1.1.1.1]           x
abc@[1.1.1]                     x
abc@[10.1.1.300]                x
abc@[10.1.1.256]                x
abc@[10.1.1.255]        x
abc@[10.1.1.196]        x
abc@[260.1.1.196]               x
etc.
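For anyone who wants to rerun these, here is a small Python sketch that
feeds the expressions listed above through the regexp and prints what it
does with each one. The names EMAIL, CASES and check are made up for this
example; the script only reports the pattern's behaviour.

```python
import re

EMAIL = re.compile(
    r"^\w+([\-\.]\w+)*@((([a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*\.)+)[a-zA-Z]{2,4}"
    r"|\[(\d|\d\d|(1\d\d|2[0-4]\d|25[0-5]))(\.(\d|\d\d|(1\d\d|2[0-4]\d|25[0-5]))){3}\]) *$"
)

# The test expressions from the list above.
CASES = [
    "a@x.o", "a@x.or", "Ab.Cd@x.org", ".b@x.org", "b.@x.org", "_b@x.org",
    "abc@x-y.com", "abc@-y.com", "A_b@x-y.co.uk", "A_b@x-y.co.u", "abc@.org",
    "abc@[1.1.1.1]", "abc@[1.1.1]", "abc@[10.1.1.300]", "abc@[10.1.1.256]",
    "abc@[10.1.1.255]", "abc@[10.1.1.196]", "abc@[260.1.1.196]",
]


def check(cases):
    """Map each expression to 'pass' or 'fail' under EMAIL."""
    return {expr: ("pass" if EMAIL.match(expr) else "fail") for expr in cases}


results = check(CASES)
for expr, verdict in results.items():
    print(f"{expr:<24}{verdict}")
```

Adding a new expression to CASES is enough to see how the pattern treats it,
which should make it easier to send back counterexamples.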
It seems to pass all of my tests for what I'm allowing,
but I welcome more feedback before I check it in.
Is this acceptable and/or sufficient?
If not, please let me know (preferably with a modified expression!)
Regards,
/Will Sadkin
Parlance Corporation
www.nameconnector.com