Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

Suppose I have the following URL as a String:

String urlSource = 'https://www.wikipedia.org/';

I want to extract the main site name from this URL string, i.e. 'wikipedia', stripping the https://, www, and .com or .org parts.

What is the best way to extract this? If a RegExp is the right tool, which regular expression should I use?



1 Answer

You do not need a RegExp in this case.

Dart has a built-in class for parsing URLs:

Uri

What you want to achieve is quite simple using that API:

final urlSource = 'https://www.wikipedia.org/';

final uri = Uri.parse(urlSource);
print(uri.host); // www.wikipedia.org

The Uri.host property will give you www.wikipedia.org. From there, you should easily be able to extract wikipedia.

Uri.host also leaves out everything that is not part of the host name: the scheme (https://), the port, and the whole path, i.e. anything from the / after the host onwards.
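To illustrate how Uri splits a URL into its components, here is a minimal sketch. The longer URL and the hostOf helper are made up for this example; only Uri.parse and the scheme, host, path, query and fragment getters come from dart:core.

```dart
// Hypothetical helper for this example: returns only the host of a URL string.
String hostOf(String url) => Uri.parse(url).host;

void main() {
  // Example URL invented for illustration.
  final uri = Uri.parse('https://www.wikipedia.org/wiki/Dart?action=view#top');

  print(uri.scheme);   // https
  print(uri.host);     // www.wikipedia.org
  print(uri.path);     // /wiki/Dart
  print(uri.query);    // action=view
  print(uri.fragment); // top

  print(hostOf('https://www.wikipedia.org/')); // www.wikipedia.org
}
```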

Extracting the second-level domain

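If you want to get the second-level domain, i.e. wikipedia from the host, you could split the host on dots and take the second-to-last part: final parts = uri.host.split('.'); and then parts[parts.length - 2].

However, note that this is not fail-safe. A host with only one part (e.g. localhost) has no second-to-last part, so the index is out of range and the lookup throws. Subdomains may or may not be present (e.g. www), and the top-level domain might itself be made up of multiple parts. For example, for www.example.co.uk the expression returns co, because co is the second-level domain under uk, not the site name example.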
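One possible way to handle the cases above is sketched here. The function name registrableName and the multiPartSuffixes set are assumptions made for this example and are not part of Dart; the set lists only three suffixes for illustration, and a real application would need a complete list such as the Public Suffix List. The sketch assumes a null-safe Dart version (2.12 or later) and does not treat IP-address hosts specially.

```dart
// Hypothetical, deliberately incomplete set of multi-part suffixes.
const multiPartSuffixes = {'co.uk', 'com.au', 'co.jp'};

/// Hypothetical helper: returns the label directly before the suffix,
/// or null when the host has no such label.
String? registrableName(String url) {
  final host = Uri.parse(url).host.toLowerCase();
  final labels = host.split('.');

  // Hosts such as 'localhost' (or an empty host) have nothing to extract.
  if (labels.length < 2) return null;

  // Treat the last two labels as the suffix if they are in the known set,
  // otherwise assume a single-label suffix such as 'org' or 'com'.
  final lastTwo = labels.sublist(labels.length - 2).join('.');
  final suffixLength = multiPartSuffixes.contains(lastTwo) ? 2 : 1;

  final index = labels.length - suffixLength - 1;
  return index >= 0 ? labels[index] : null;
}

void main() {
  print(registrableName('https://www.wikipedia.org/'));  // wikipedia
  print(registrableName('https://en.wikipedia.org/'));   // wikipedia
  print(registrableName('https://www.example.co.uk/'));  // example
  print(registrableName('http://localhost:8080/'));      // null
}
```

Subdomains need no special handling here because the function counts labels from the right, so www, en, or no subdomain at all give the same result.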

