java - Download list of pages from some domain with URL constraint
I need to download a list of pages on a domain that have a specific URL ending.
For example, take the webpage http://brnensky.denik.cz/, a Czech news site. Every article has a URL ending with the post date, e.g. http://brnensky.denik.cz/zpravy_region/ruzova-kola-usnadni-presun-po-brne-20140418.html.
So I want to find the list of URLs that begin with http://brnensky.denik.cz/, followed by anything, and end with, for example, -20140418.html. Is this possible to achieve?
I'm trying to solve this in Java, but any other approach would help too.
The regex would be:
^http://brnensky\.denik\.cz.*[0-9]{8}\.html
Logic:
The URL begins with the site address and ends with date.html, where the date is an 8-digit string.
You may have to escape '/' depending on the tool or language used to implement the expression.
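Since the question mentions Java, here is a minimal sketch of how the expression above might be applied there. It only covers filtering URLs you have already collected, not the crawling or downloading itself. The class name ArticleUrlFilter, both method names, and the example.com URL are made up for illustration. In a Java string literal each backslash has to be doubled, and '/' needs no escaping. Matcher.matches() requires the whole string to match, so the URL has to end with the 8 digits and .html even though the pattern has no trailing $.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, named here only for illustration.
public class ArticleUrlFilter {

    // The regex from the answer; backslashes are doubled for the Java string literal.
    private static final Pattern ARTICLE_URL =
            Pattern.compile("^http://brnensky\\.denik\\.cz.*[0-9]{8}\\.html");

    // matches() checks the entire string, so the URL must end with <8 digits>.html.
    static boolean isArticleUrl(String url) {
        return ARTICLE_URL.matcher(url).matches();
    }

    // Keeps only the URLs that look like dated articles on the site.
    static List<String> filterArticleUrls(List<String> urls) {
        return urls.stream()
                .filter(ArticleUrlFilter::isArticleUrl)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample input; the third URL is an invented non-matching example.
        List<String> candidates = Arrays.asList(
                "http://brnensky.denik.cz/",
                "http://brnensky.denik.cz/zpravy_region/ruzova-kola-usnadni-presun-po-brne-20140418.html",
                "http://example.com/some-page-20140418.html");

        // Print each URL that passes the filter.
        filterArticleUrls(candidates).forEach(System.out::println);
    }
}
```

The list of candidate URLs would still have to come from somewhere, for example from links extracted out of the site's pages. That part is outside what the regex solves.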