Thursday, 15 May 2014

java - Parse Company Info -




I'm wondering if anyone knows how to parse the company name "Alcoa Inc." shown at the URL below. It would be much easier to show an image, but I do not have enough reputation. Any help is appreciated.

http://www.google.com/finance?q=nyse%3aaa&ei=ldwvuyc7fp_ylgpbiae

This is what I have tried so far, using jsoup to parse the div class:

    <div class="appbar-snippet-primary">
      <span>alcoa inc.</span>
    </div>

    public Elements htmlParser(String url, String element, String elementType, String returnElement) {
        try {
            Document doc = Jsoup.connect(url).get();
            Document parse = Jsoup.parse(doc.html());
            if (returnElement == null) {
                // Selector of the form "span.pr"
                return parse.select(elementType + "." + element);
            } else {
                // Selector of the form "div.appbar-snippet-primary span"
                return parse.select(elementType + "." + element + " " + returnElement);
            }
        } catch (IOException e) {
            // The posted snippet is cut off before its catch block;
            // returning an empty result here is an assumption.
            return new Elements();
        }
    }

    public String htmlParseGoogleStocks(String url) {
        String pr = "pr";
        String appbar_center = "appbar-snippet-primary";
        String val = "val";
        String span = "span";
        String div = "div";
        String td = "td";
        Elements price_data;
        Elements title_data;
        Elements more_data;

        price_data = htmlParser(url, pr, span, null);
        title_data = htmlParser(url, appbar_center, div, span);
        //more_data = htmlParser(url, val, td, null);

        //String stockPrice = price_data.text().toString();
        String title = title_data.text().toString();
        //System.out.println(more_data.text());
        return title;
    }

Myself, I'd analyze the source HTML of the page of interest and use jsoup to extract the information. For instance, using a small jsoup program like so:

    import java.io.IOException;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.select.Elements;

    public class GoogleFinance {
        public static final String PAGE = "https://www.google.com/finance?q=nasdaq:xone";

        public static void main(String[] args) throws IOException {
            // Fetch the page and print the text of its <title> element
            Document doc = Jsoup.connect(PAGE).get();
            Elements title = doc.select("title");
            System.out.println(title.text());
        }
    }

You get in return:

    ExOne Co: NASDAQ:XONE quotes & news - Google Finance

It doesn't get much easier than that.
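
Since the question asks for the company name alone, the title text returned above can be trimmed with plain string handling. The following is only a minimal sketch: it assumes the title keeps the "company: exchange:ticker ..." shape shown in the output and that the live title is capitalized as in the sample string. The class name `CompanyNameFromTitle` and the method `companyName` are hypothetical and not part of jsoup.

```java
public class CompanyNameFromTitle {

    // Hypothetical helper: returns the text before the first ':' in a page
    // title, or the whole trimmed title when no ':' is present.
    public static String companyName(String title) {
        int colon = title.indexOf(':');
        String name = (colon >= 0) ? title.substring(0, colon) : title;
        return name.trim();
    }

    public static void main(String[] args) {
        // Sample title, assumed to match the shape printed by the jsoup program.
        String title = "ExOne Co: NASDAQ:XONE quotes & news - Google Finance";
        System.out.println(companyName(title));
    }
}
```

In the jsoup program, the same helper could be applied to `title.text()`. A company name that itself contains a colon would be cut short, so selecting the dedicated element from the page source is likely the more robust route when it is available.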

java parsing web-scraping html-parsing finance
