This topic needs a title

Hello :)

I have a program – written in Python 2.7.3 and wxPython 2.9.4.0 – that uses Python’s urllib2.urlopen() function. The purpose of my program is to count the occurrences of a user-given word on a user-given website. So if I want to know how many times the word “library” occurs on “wxpython.org”, I type “http://www.wxpython.org” (without quotes) in the URL box and “library” (without quotes) in the word box.

My problem is that the user has to enter the URL exactly: “wxpython.org” is not enough; it has to be entered as “http://www.wxpython.org”, which is rather annoying for the user.

Is there a way for the URL to be entered only as “wxpython.org”, with urllib2.urlopen() filling in the missing “http://www.” part? Or is there another way, such as checking for the “http://www.” part and prepending it if it is not present? What is the best way to solve my issue, and how exactly should I implement the solution?

There is, and that's probably why it is your homework
assignment. You need to read up on string handling.

Karsten

On Mon, Oct 29, 2012 at 09:14:56PM +0100, Boštjan Mejak wrote:

I have a program – written in Python 2.7.3 and wxPython 2.9.4.0 – that uses
Python's urllib2.urlopen() function/method. The purpose of my program is to
count the occurrences of a user-given word that exists on a user-given
website. So if I want to know how many times the word "library" occurs on
"wxpython.org", I type in the URL box "http://www.wxpython.org" (without
quotes), and in the word box I type "library" (without quotes).

My problem is that the user needs to be exact at inputting the URL, because
"wxpython.org" is not enough; it has to be inputted exactly as
"http://www.wxpython.org", which is rather annoying for the user.

Is there a way the URL be inputted only as "wxpython.org" and
urllib2.urlopen's function/method filling in the missing "http://www."
part? Or is there any other way like checking for the "http://www." part
and prepend it if not present? What is the best way to solve my issue and
how exactly should I go and implement the solution?


So I can’t do anything else than use the ol’ string handling technique of prepending the missing “http://www.” if it’s missing in the URL?

The “http://” part is really required. The latest round of browsers
is making people lazy, because they fill that in if it’s not
provided. I still find myself typing it by hand. It’s not hard to
detect this, however. Remember that there are a lot of protocols
available for URLs. There’s no reason for urllib to assume that you
meant “http”. If your program knows that “http://” is the default,
then it’s up to you to provide it.

The “www” part is different. That’s not universal. You should be
able to fetch http://wxpython.org and have it work just fine. You
might get a “redirect” response telling you to fetch
www.wxpython.org instead, but that’s something you need to be
handling.
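
As a rough illustration of the detection described above, here is a minimal sketch that prepends a default scheme only when the input has none. The function name `normalize_url` and its `default_scheme` parameter are made up for this example; they are not part of urllib2 or wxPython. It deliberately leaves the “www” part alone.

```python
import re

# A scheme is a letter followed by letters, digits, "+", "." or "-",
# and is terminated by "://" (e.g. http://, https://, ftp://).
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")


def normalize_url(raw, default_scheme="http"):
    """Return the URL with default_scheme:// prepended if no scheme is given.

    Hypothetical helper for illustration; it does not add or remove "www.".
    """
    url = raw.strip()
    if not url:
        raise ValueError("empty URL")
    if not _SCHEME_RE.match(url):
        url = default_scheme + "://" + url
    return url
```

With Python 2.7 the result would then be passed on, e.g. `urllib2.urlopen(normalize_url(user_input))`. By default `urlopen()` follows HTTP redirects itself, and the returned object's `geturl()` method reports the URL that was finally fetched.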

Boštjan Mejak wrote:

My problem is that the user needs to be exact at inputting the URL,
because "wxpython.org" is not enough; it has to be inputted exactly as
"http://www.wxpython.org", which is rather annoying for the user.

-- Tim Roberts, Providenza & Boekelheide, Inc.

timr@probo.com

Is there any HTTP URL-validating function/method in Python/wxPython?

Is this helpful?

http://stackoverflow.com/questions/827557/how-do-you-validate-a-url-with-a-regular-expression-in-python

-Che
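
As an alternative to a regular expression, a structural check can be sketched with the standard library's `urlparse`. The helper name `looks_like_http_url` is made up for this example. It only checks that the string has an http or https scheme and a host part; it does not check that the host exists or that the page can be fetched.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2.7


def looks_like_http_url(url):
    """Return True if url has an http/https scheme and a non-empty host.

    Hypothetical helper for illustration; this is a shape check only.
    """
    try:
        parts = urlparse(url.strip())
    except ValueError:
        # urlparse can reject malformed input such as a broken IPv6 host.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

A bare “wxpython.org” fails this check because it has no scheme, so a check like this would typically run after any missing “http://” has been prepended.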

On Sun, Nov 4, 2012 at 10:28 AM, Boštjan Mejak mejak.bost@gmail.com wrote:

Is there any HTTP URL-validating function/method in Python/wxPython?