fjnu 1672 URLs

原创于 2008-04-11 23:49:00 发布 · 740 阅读

0 ·

本内容遵循CC 4.0 BY-SA版权协议

标签

#path #integer #url #string #input

☆ACM 解题报告☆ 专栏收录该内容

229 篇文章

订阅专栏

本文介绍了一种简化版URL语法，并提供了一个程序示例来解析URL，提取协议、主机、端口和路径等关键部分。该程序适用于处理不超过60字符的URL。

Description

In the early nineties, the World Wide Web (WWW) was invented. Nowadays, most people think that the WWW simply consists of all the pretty (or not so pretty) HTML-pages that you can read with your WWW browser. But back then, one of the main intentions behind the design of the WWW was to unify several existing communication protocols.

Then (and even now), information on the Internet was available via a multitude of channels: FTP, HTTP, E-Mail, News, Gopher, and many more. Thanks to the WWW, all these services can now be uniformly addressed via URLs (Uniform Resource Locators). The syntax of URLs is defined in the Internet standard RFC 1738. For our problem, we consider a simplified version of the syntax, which is as follows:

<protocol> "://" <host> [ ":" <port> ] [ "/" <path> ]

The square brackets [] mean that the enclosed string is optional and may or may not appear. Examples of URLs are the following:

http://www.informatik.uni-ulm.de/acm
ftp://acm.baylor.edu:1234/pub/staff/mr-p
gopher://veryold.edu

More specifically,

<protocol> is always one of http, ftp or gopher.

<host> is a string consisting of alphabetic (a-z, A-Z) or numeric (0-9) characters and points (.).

<port> is a positive integer, smaller than 65536.

<path> is a string that contains no spaces.

You are to write a program that parses an URL into its components.

Input

The input starts with a line containing a single integer n, the number of URLs in the input. The following n lines contain one URL each, in the format described above. The URLs will consist of at most 60 characters each.

Output

For each URL in the input first print the number of the URL, as shown in the sample output. Then print four lines, stating the protocol, host, port and path specified by the URL. If the port and/or path are not given in the URL, print the string <default> instead. Adhere to the format shown in the sample output.

Print a blank line after each test case.

Sample Input

3
ftp://acm.baylor.edu:1234/pub/staff/mr-p
http://www.informatik.uni-ulm.de/acm
gopher://veryold.edu

Sample Output

URL #1
Protocol = ftp
Host     = acm.baylor.edu
Port     = 1234
Path     = pub/staff/mr-p

URL #2
Protocol = http
Host     = www.informatik.uni-ulm.de
Port     = <default>
Path     = acm

URL #3
Protocol = gopher
Host     = veryold.edu
Port     = <default>
Path     = <default>

KEY:字符串的处理，我的方法比较繁琐啊……应该有更好的方法；