android - Capture image from web page using regex


I am writing a simple program to capture image resources from a web page. The image items in the HTML look like:

Case 1: <img src="http://www.aaa.com/bbb.jpg" alt="title bbb" width="350" height="385"/>

or

Case 2: <img alt="title ccc" src="http://www.ddd.com/bbb.jpg" width="123" height="456"/>

I know how to handle either case separately. Taking the first one as an example:

    String capture = "<img(?:.*)src=\"http://(.*)\\.jpg\"(?:.*)alt=\"(.*?)\"(?:.*)/>";
    DefaultHttpClient client = new DefaultHttpClient();
    BasicHttpContext context = new BasicHttpContext();
    Scanner scanner = new Scanner(client
            .execute(new HttpGet(uri), context)
            .getEntity().getContent());
    Pattern pattern = Pattern.compile(capture);
    while (scanner.findWithinHorizon(pattern, 0) != null) {
        MatchResult r = scanner.match();
        String imageUrl = "http://" + r.group(1) + ".jpg";
        String imageTitle = r.group(2);
        // do something with the image
    }

The question is how to write a correct pattern when the web page source contains image items of both case 1 and case 2. I want to scan the page only once.

Use jsoup:

    import java.io.IOException;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;
    ...
    Document doc;
    String userAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0";
    try {
        // the URL needs the http protocol
        doc = Jsoup.connect("http://domain.tld/images.html").userAgent(userAgent).get();

        // select all images
        Elements images = doc.select("img");
        for (Element image : images) {
            // read the values of the img attributes (src and alt)
            System.out.println("\nimage: " + image.attr("src"));
            System.out.println("alt : " + image.attr("alt"));
        }
    } catch (IOException e) {
        e.printStackTrace();
    }

jsoup is an HTML parser. Its "jQuery-like" and "regex" selector syntax is easy to use and flexible enough to get whatever you want.
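If adding a dependency is not an option, the attribute-order problem from the question can also be sidestepped with plain `java.util.regex` by matching in two steps: first the whole `<img ...>` tag, then each `name="value"` pair inside it. The following is only a minimal sketch, not a replacement for a real HTML parser. The class name `ImgTagScanner` and the method `findImages` are hypothetical, and the sketch assumes double-quoted attribute values that contain no `>` character.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this illustration.
public class ImgTagScanner {
    // Matches one whole <img ...> tag; group 1 is the raw attribute text.
    private static final Pattern IMG_TAG =
            Pattern.compile("<img\\b([^>]*)>", Pattern.CASE_INSENSITIVE);

    // Matches one name="value" pair (double-quoted values only, by assumption).
    private static final Pattern ATTRIBUTE =
            Pattern.compile("([\\w-]+)\\s*=\\s*\"([^\"]*)\"");

    /** Returns one attribute map per img tag, in document order. */
    public static List<Map<String, String>> findImages(String html) {
        List<Map<String, String>> images = new ArrayList<>();
        Matcher tag = IMG_TAG.matcher(html);
        while (tag.find()) {
            Map<String, String> attrs = new LinkedHashMap<>();
            Matcher attr = ATTRIBUTE.matcher(tag.group(1));
            while (attr.find()) {
                // Attribute names are lowercased so SRC and src are treated alike.
                attrs.put(attr.group(1).toLowerCase(Locale.ROOT), attr.group(2));
            }
            images.add(attrs);
        }
        return images;
    }

    public static void main(String[] args) {
        String html =
                "<img src=\"http://www.aaa.com/bbb.jpg\" alt=\"title bbb\" width=\"350\" height=\"385\"/>"
                + "<img alt=\"title ccc\" src=\"http://www.ddd.com/bbb.jpg\"  width=\"123\" height=\"456\"/>";
        for (Map<String, String> image : findImages(html)) {
            System.out.println(image.get("src") + " -> " + image.get("alt"));
        }
    }
}
```

Because the attributes are collected into a map, it no longer matters whether `src` comes before or after `alt`, and the page text is still scanned for tags only once. Single-quoted or unquoted attribute values would need a more permissive `ATTRIBUTE` pattern, which is one reason the parser-based approach above is usually the safer choice.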

