validation for float input

-Scott David Daniels wrote:

How about:
     import math
     def IsFloat(s):
         try:
             x = float(s)
             return x == 0.0 or 0.0 != math.frexp(x)[0]
         except ValueError:
             return False # ValueError either from the float or frexp

Excellent

Jean-Michel Fauth, Switzerland

-Scott David Daniels wrote:

How about:
     import math
     def IsFloat(s):
         try:
             x = float(s)
             return x == 0.0 or 0.0 != math.frexp(x)[0]
         except ValueError:
             return False # ValueError either from the float or frexp

Again, I think the behaviour of math.frexp is going to be system/compiler/mathlib dependent. With Python 2.3 on Linux:

tests = ["34.54", "23e12", "456", "23e999999"]

for s in tests:
     if IsFloat(s):
         print s, "Is a valid Float"
     else:
         print s, "Is NOT a valid Float"

34.54 Is a valid Float
23e12 Is a valid Float
456 Is a valid Float
Traceback (most recent call last):
   File "./junk.py", line 15, in ?
     if IsFloat(s):
   File "./junk.py", line 7, in IsFloat
     return x == 0.0 or 0.0 != math.frexp(x)[0]
OverflowError: math range error

Now, you could do:
     except (ValueError, OverflowError):
         return False # ValueError either from the float or frexp

Which seems to work, but still fails on a NaN:

34.54 Is a valid Float
23e12 Is a valid Float
456 Is a valid Float
34.56th Is NOT a valid Float
23e999999 Is NOT a valid Float
-23e93893873 Is NOT a valid Float
NaN Is a valid Float

I'm also not sure, without doing more math than I can handle at the moment, if math.frexp() can take any valid float.

I still think explicitly testing for +Inf, -Inf and NaN may be the only way, and that makes it explicit how you are handling those values, which may, in fact, be valid for a given application.

Of course, having Python understand the IEEE special values in a platform independent way would be even better.
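
The explicit handling argued for above can be sketched with predicates that later Python versions added to the standard library. This is only a minimal sketch, assuming a modern Python 3 where math.isnan and math.isinf exist (they were not available in Python 2.3); parse_float and its allow_special flag are hypothetical names used for illustration.

```python
import math

def parse_float(s, allow_special=False):
    """Return float(s), or None if s is not acceptable.

    Hypothetical helper: allow_special states explicitly whether
    NaN and +/-Inf count as valid input for the application.
    """
    try:
        x = float(s)
    except (ValueError, OverflowError):
        return None
    if not allow_special and (math.isnan(x) or math.isinf(x)):
        return None
    return x

for text in ["34.54", "23e12", "456", "34.56th", "23e999999", "NaN"]:
    print(text, "->", parse_float(text))
```

Keeping the policy in one flag means an application that does want the special values can accept them without changing the parsing code.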

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
                                         
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@noaa.gov

Chris Barker wrote:

-Scott David Daniels wrote:

How about:
     import math
     def IsFloat(s):
         try:
             x = float(s)
             return x == 0.0 or 0.0 != math.frexp(x)[0]
         except ValueError:
             return False # ValueError either from the float or frexp

Again, I think the behaviour of math.frexp is going to be system/compiler/mathlib dependent.

Actually, frexp / ldexp / modf were designed to allow access to
floating point in a way that allowed implementation of higher-
precision software.

My mistakes were in the comparison and in the float handling:

With Python 2.3 on Linux:

IsFloat("23e999999")
OverflowError: math range error

This OverflowError is from float(string).

Now, you could do:
    except (ValueError, OverflowError):

Which _is_ the right thing to do.

Which seems to work, but still fails on a NaN:
NaN Is a valid Float

(note: msc doesn't produce NaNs from the same string)

  return x == 0.0 or 0.0 != math.frexp(x)[0]

This was the problem: The NaN compares equal to 0.0, so no frexp used.

I'm also not sure, without doing more math than I can handle at the moment, if math.frexp() can take any valid float.

It can indeed. The exponent field of all IEEE floats is
significantly smaller than storage for an int. frexp will
easily be able to store the exponent, and the fraction should
simply be "normalized" to .5 <= fraction < 1.0 (a not-so-
difficult task). Numbers in that range are easy to represent
in pretty much all known floating point systems.
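
The normalization described above can be checked directly. This is a small sketch, assuming a modern Python 3 interpreter; the sample values are arbitrary and chosen only to cover large, small, negative, and subnormal floats.

```python
import math

samples = [34.54, 23e12, 456.0, -0.001, 5e-324, 1.7976931348623157e308]

for x in samples:
    fraction, exponent = math.frexp(x)
    # For any finite nonzero float the fraction is normalized.
    assert 0.5 <= abs(fraction) < 1.0
    # ldexp reverses frexp exactly: x == fraction * 2**exponent.
    assert math.ldexp(fraction, exponent) == x
    print(x, "->", fraction, exponent)
```

The exponent comes back as an ordinary int, which is why storing it is never a problem for finite inputs.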

This should do the trick:

     import math

     def IsFloat(s):
         try:
             vals = math.frexp(float(s))
         except (ValueError, OverflowError):
             return False # ValueError or OverflowError from float or frexp
         return vals == (0.0, 0) or 0.5 <= abs(vals[0]) < 1.0

frexp should return 0.0 for the fraction part of all non-numbers
(NaNs, Infs) as well as for 0.0 (and set an error), but in the NaN
cases, it should _not_ return a zero exponent (spec as remembered).
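
Since what frexp reports for NaNs and Infs is recalled from memory here, it may help to look at what the interpreter at hand actually returns. The following is a sketch only, assuming a modern Python 3; is_finite_float_string is a hypothetical name, and the check is written so that it does not depend on frexp returning a 0.0 fraction for the special values.

```python
import math

def is_finite_float_string(s):
    """frexp-based check: True only for strings naming a finite float."""
    try:
        fraction, exponent = math.frexp(float(s))
    except (ValueError, OverflowError):
        return False
    # Zero is the only finite value whose fraction is not normalized.
    if fraction == 0.0 and exponent == 0:
        return True
    # NaN fails every comparison, and neither Inf nor a 0.0 fraction
    # lies in [0.5, 1.0), so the special values are rejected.
    return 0.5 <= abs(fraction) < 1.0

for text in ["0", "34.54", "inf", "-inf", "nan"]:
    print(text, math.frexp(float(text)), is_finite_float_string(text))
```

Printing the raw frexp result next to the verdict shows whether the platform follows the remembered spec or not.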

I still think explicitly testing for +Inf, -Inf and NaN may be the only way, and that makes it explicit how you are handling those values, which may, in fact, be valid for a given application.

But the only way I know of testing for those values involves
either IEEE - conformant predicates (isNan, isInf, ...) which
are infrequently implemented correctly, or looking at frexp
results.

-Scott David Daniels

Scott D Daniels wrote:

This should do the trick:

    import math

    def IsFloat(s):
        try:
            vals = math.frexp(float(s))
        except (ValueError, OverflowError):
            return False # ValueError or OverflowError from float or frexp
        return vals == (0.0, 0) or 0.5 <= abs(vals[0]) < 1.0

This looks like a definitive version, if you don't want Inf, -Inf, or NaN to be considered valid.

I still think explicitly testing for +Inf, -Inf and NaN may be the only way, and that makes it explicit how you are handling those values, which may, in fact, be valid for a given application.

But the only way I know of testing for those values involves
either IEEE - conformant predicates (isNan, isInf, ...) which
are infrequently implemented correctly, or looking at frexp
results.

See PEP 754 (IEEE 754 Floating Point Special Values).

In particular:
"""
The reference implementation is provided in the module "fpconst" [1], which is written in pure Python by taking advantage of the "struct" standard module to directly set or test for the bit patterns that define IEEE 754 special values. Care has been taken to generate proper results on both big-endian and little-endian machines.
"""

[1] http://www.analytics.washington.edu/Zope/projects/fpconst/
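
To illustrate the bit-pattern idea in the quoted passage, here is a minimal sketch that uses struct to inspect a value directly. It is not fpconst's API: classify_double is a hypothetical name, and the code assumes a modern Python 3 whose float is an IEEE 754 double.

```python
import struct

def classify_double(x):
    """Classify a float by the bit pattern of its IEEE 754 double form."""
    # Pack big-endian so the result does not depend on host byte order.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    exponent = (bits >> 52) & 0x7FF      # 11 exponent bits
    mantissa = bits & ((1 << 52) - 1)    # 52 fraction bits
    if exponent != 0x7FF:
        return "finite"
    # An all-ones exponent marks the special values:
    # zero mantissa is an infinity, anything else is a NaN.
    return "inf" if mantissa == 0 else "nan"

for value in [0.0, 34.54, float("inf"), float("-inf"), float("nan")]:
    print(value, classify_double(value))
```

Fixing the byte order in the format string is one way to get the same answer on big-endian and little-endian machines.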

-Chris

Chris Barker wrote:

... PEP 754 (IEEE 754 Floating Point Special Values)
"... to directly set or test for the bit patterns that define IEEE 754 special values..."
[1] http://www.analytics.washington.edu/Zope/projects/fpconst/

I'm whacking away at a machine-independent version of access to
floating point data. My goal is to provide an abstraction for
the finite values available on many different floating points
(not simply IEEE754), but one that does work precisely on that
representation. If you are curious, you could look at:

     http://members.dsl-only.net/~daniels/bits.html

This properly describes the interface, but the code is still
flaky. I just recently got some pure-python code to do all
of the interface (albeit _very_ inefficiently), and I hope to
pass all of my tests in a week or two.

I'd love comments on the interface or its description, if you'd
care to make them.

When I release it, it will be under a variant of the MIT license.
The eventual goal is an MIT license (or Python's; we could make
these methods on int, long, and float).

-Scott David Daniels
Scott.Daniels@Acm.Org