Tuesday, 15 May 2012

C# - How to fix the Script Task code that downloads ticker price files from Yahoo and inserts into database?




The SSIS Script Task given below takes in a ticker symbol and a date range and returns a CSV-formatted download that can be used to extract the price history, but it does not work and I have no idea why. The complete information about this SSIS concept can be found at the link below.

SSIS / ETL Example – Yahoo Equity & Mutual Fund Price History

You can download the sample SSIS package from the link below.

Sample package on SkyDrive

The following SSIS Script Task has no errors, but it does not download the file:

I have started to understand the code. The link appears to be broken down into the components shown here, and with the right values it should work, so I don't understand why it is not retrieving the file.

ichart.finance.yahoo.com/table.csv?s={symbol}&a={startMM}&b={startDD}&c={startYYYY}&d={endMM}&e={endDD}&f={endYYYY}&g={res}&ignore=.csv

Script Task code that I am using:

    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Runtime;
    using System.Windows.Forms;
    using System.Configuration;
    using System.Collections.Generic;
    using System.Data.Sql;
    using System.Data.SqlClient;
    using System.Net;
    using System.Collections.Specialized;
    using System.Linq;
    using Hash = System.Collections.Generic.Dictionary<string, string>;

    namespace ST_361aad0e48354b30b8152952caab8b2b.csproj
    {
        [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
        public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
        {
            #region VSTA generated code
            enum ScriptResults
            {
                Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
                Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
            };
            #endregion

            static string dir;
            static DateTime end;
            const string CSV_FORMAT = "Id,Cusip,Date,Open,High,Low,Close,Volume,Adj Close";

            public void Main()
            {
                // end date is today minus 1 day (end of day)
                end = DateTime.Now;

                // output directory is stored in an SSIS variable
                // which can be set at runtime
                dir = System.IO.Path.Combine(Dts.Variables["OutputCSV"].Value.ToString(), end.ToString("yyyyMMdd"));
                if (!System.IO.Directory.Exists(dir))
                    System.IO.Directory.CreateDirectory(dir);

                // connection string to our database
                var connectionString = Dts.Variables["ConnectionString"].Value.ToString();

                // the SQL command to execute
                var sql = Dts.Variables["PriceHistorySqlCommand"].Value.ToString();

                var list = new List<Hash>();
                using (var cnn = new SqlConnection(connectionString))
                {
                    cnn.Open();
                    using (var cmd = new SqlCommand(sql, cnn))
                    {
                        cmd.CommandTimeout = 0;
                        var dr = cmd.ExecuteReader();
                        while (dr.Read())
                        {
                            // store the result in a temporary hash
                            var h = new Hash();
                            h["cusip"] = dr["cusip"].ToString();
                            h["symbol"] = dr["symbol"].ToString();
                            h["product_id"] = dr["product_id"].ToString();
                            h["last_price_dt_id"] = dr["last_price_dt_id"].ToString();
                            list.Add(h);

                            // process in batches of 100 at a time
                            // (this requires System.Threading.dll (CTP of Parallel Extensions) to be installed in the GAC)
                            if (list.Count >= 100)
                            {
                                System.Threading.Tasks.Parallel.ForEach(list, item =>
                                {
                                    var dt = item["last_price_dt_id"].TryGetDateFromDateDimensionId(end.AddYears(-100));
                                    DownloadPriceHistory(item["product_id"], item["cusip"], item["symbol"], dt);
                                });
                                list.Clear();
                            }
                        }
                    }
                }

                // TODO: Add your code here
                Dts.TaskResult = (int)ScriptResults.Success;
            }

            static void DownloadPriceHistory(string id, string cusip, string symbol, DateTime begin)
            {
                // get the write path
                var path = System.IO.Path.Combine(dir, cusip + ".csv");
                var url = string.Format("http://ichart.finance.yahoo.com/table.csv?s={0}&d={1}&e={2}&f={3}&g=d&a={4}&b={5}&c={6}&ignore=.csv",
                    symbol.ToUpper(),
                    (end.Month - 1).ToString("00"), end.Day.ToString("00"), end.Year,
                    (begin.Month - 1).ToString("00"), begin.Day.ToString("00"), begin.Year);
                string csv;
                using (WebClient web = new WebClient())
                {
                    try
                    {
                        var text = web.DownloadString(url);
                        var lines = text.Split('\n');
                        System.Text.StringBuilder sb = new System.Text.StringBuilder();
                        int i = 0;
                        foreach (var line in lines)
                        {
                            // skip the first line, it is the header
                            if (i == 0)
                                sb.AppendLine(CSV_FORMAT);
                            // ensure the line being added is not null
                            else if (false == string.IsNullOrEmpty(line) && false == string.IsNullOrEmpty(line.Trim()))
                                sb.AppendLine(id + "," + cusip + "," + line);
                            i++;
                        }
                        // add header and body
                        csv = sb.ToString();
                    }
                    catch (System.Net.WebException)
                    {
                        // 404 error
                        csv = CSV_FORMAT;
                    }
                }
                System.IO.File.WriteAllText(path, csv);
            }
        }

        /// <summary>
        /// Simple extension methods.
        /// </summary>
        public static class ExtensionMethods
        {
            /// <summary>
            /// Gets a DateTime object from a dimension id string; for example, '20090130' would be translated to
            /// a proper DateTime of '01-30-2009 00:00:00'. If the string is empty, the default passed
            /// in <paramref name="defaultIfNull"/> is returned.
            /// </summary>
            /// <param name="str">The string</param>
            /// <param name="defaultIfNull">The default if null.</param>
            /// <returns>Returns a DateTime.</returns>
            public static DateTime TryGetDateFromDateDimensionId(this string str, DateTime defaultIfNull)
            {
                if (string.IsNullOrEmpty(str)) return defaultIfNull;
                return DateTime.Parse(str.Substring(4, 2) + "/" + str.Substring(6, 2) + "/" + str.Substring(0, 4));
            }
        }
    }
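Judging only from the code as posted, one possible reason that no file is written is the batching: rows are handed to Parallel.ForEach only once the list reaches 100 entries, so a query that returns fewer than 100 rows (or the final partial batch of a larger result) never reaches DownloadPriceHistory. The following is a minimal sketch of batching with a final flush. ProcessInBatches is a hypothetical helper introduced here for illustration and is not part of the original Script Task.

```csharp
using System;
using System.Collections.Generic;

public static class BatchDemo
{
    // Hypothetical helper (not in the original Script Task): groups items into
    // batches of batchSize and hands each batch to processBatch, including the
    // final batch that holds fewer than batchSize items.
    // Returns the number of batches processed.
    public static int ProcessInBatches<T>(IEnumerable<T> source, int batchSize, Action<List<T>> processBatch)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

        var batch = new List<T>();
        int batches = 0;
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count >= batchSize)
            {
                processBatch(batch);
                batches++;
                batch = new List<T>();
            }
        }

        // Flush whatever is left; the posted code has no equivalent of this step.
        if (batch.Count > 0)
        {
            processBatch(batch);
            batches++;
        }
        return batches;
    }

    public static void Main()
    {
        var symbols = new[] { "AAPL", "MSFT", "GOOG" };
        int count = ProcessInBatches(symbols, 100,
            b => Console.WriteLine("Processing " + b.Count + " symbols"));
        Console.WriteLine(count + " batch(es) processed");
    }
}
```

In the Script Task itself, the same effect could probably be had by repeating the Parallel.ForEach call once more after the reader loop whenever the list is not empty.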

Importing ticker price history from the Yahoo Finance chart website using SSIS:

There is a way to import ticker symbol price history from the Yahoo chart website into a database using SSIS. Here is a sample package written using SSIS 2008 R2 with a database in SQL Server 2008 R2.

Create an SSIS package named (say) SO_14797886.dtsx using Business Intelligence Development Studio (BIDS) and create an OLE DB connection manager/data source that connects to your database. This sample uses the data source OLEDB_Sora.ds, which connects to the database Sora on my local machine running the instance KIWI\SQLSERVER2008R2. KIWI is the machine name and SQLSERVER2008R2 is the instance name.

Execute the script given below in the database to create two tables.

Table dbo.TickerSymbols will hold the list of ticker symbols and the start and end dates for which you would like to import the price files, along with the resolution of the import. Resolution can contain values like d for daily; w for weekly; m for monthly; and y for yearly.

Table dbo.TickerPriceHistory will hold the price history data of the symbols downloaded from the Yahoo Finance chart website.

The insert script adds four records for the ticker symbols AAPL (Apple); MSFT (Microsoft); GOOG (Google); and YHOO (Yahoo). Each record is set with a different date range and resolution.

Script to create the tables and insert a few ticker symbols:

    CREATE TABLE dbo.TickerSymbols
    (
            Id int IDENTITY(1,1) NOT NULL
        ,   Symbol varchar(10) NOT NULL
        ,   StartDate datetime NOT NULL
        ,   EndDate datetime NOT NULL
        ,   Resolution char(1) NOT NULL
        ,   CONSTRAINT [PK_TickerSymbols] PRIMARY KEY CLUSTERED ([Id] ASC)
    );
    GO

    CREATE TABLE dbo.TickerPriceHistory
    (
            Id int IDENTITY(1,1) NOT NULL
        ,   Symbol varchar(10) NOT NULL
        ,   PriceDate datetime NOT NULL
        ,   PriceOpen numeric(18,2) NULL
        ,   PriceHigh numeric(18,2) NULL
        ,   PriceLow numeric(18,2) NULL
        ,   PriceClose numeric(18,2) NULL
        ,   Volume bigint NULL
        ,   AdjustmentClose numeric(18,2) NULL
        ,   CONSTRAINT [PK_TickerPriceHistory] PRIMARY KEY CLUSTERED ([Id] ASC)
    );
    GO

    INSERT INTO dbo.TickerSymbols (Symbol, StartDate, EndDate, Resolution) VALUES
            ('AAPL', '2012-02-01', '2012-02-04', 'd')
        ,   ('GOOG', '2013-01-01', '2013-01-31', 'w')
        ,   ('MSFT', '2012-09-01', '2012-11-30', 'm')
        ,   ('YHOO', '2012-01-01', '2012-12-31', 'y')
        ;
    GO

On the SSIS package, create the following variables.

EndDate: The package will use this variable of data type DateTime to hold the end date of the symbol being looped through in the record set list.

FileExtension: This variable of data type String will hold the file extension to use for the downloaded files. This is optional.

FileName: This variable of data type String will hold the name of the file for a given symbol. The name is generated from a timestamp to avoid overwriting previously downloaded files. Click the variable and press F4 to view its properties. Change the property EvaluateAsExpression to True. Click the ellipsis button against Expression to open the Expression Builder. Set the Expression to the following value. This expression will evaluate to a value like MSFT_20130210_092519.csv, where MSFT is the symbol, the rest is the package start time in the format yyyyMMdd_hhmmss, and .csv is the file extension.

    @[User::Symbol] + "_" +
    (DT_WSTR, 4) YEAR(@[System::StartTime]) +
    RIGHT("00" + (DT_WSTR, 2) MONTH(@[System::StartTime]), 2) +
    RIGHT("00" + (DT_WSTR, 2) DAY(@[System::StartTime]), 2) + "_" +
    RIGHT("00" + (DT_WSTR, 2) DATEPART("hh", @[System::StartTime]), 2) +
    RIGHT("00" + (DT_WSTR, 2) DATEPART("mi", @[System::StartTime]), 2) +
    RIGHT("00" + (DT_WSTR, 2) DATEPART("ss", @[System::StartTime]), 2) +
    @[User::FileExtension]
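For readers more at home in C# than in the SSIS expression language, the expression above can be approximated as follows. This is only an illustrative sketch: BuildFileName is a hypothetical helper, and the package itself evaluates the SSIS expression, not this code.

```csharp
using System;
using System.Globalization;

public static class FileNameDemo
{
    // Builds <symbol>_<yyyyMMdd>_<HHmmss><extension>, the same shape the
    // SSIS expression produces. DATEPART("hh", ...) yields the hour on a
    // 24-hour clock, hence "HH" rather than "hh" in the .NET format string.
    public static string BuildFileName(string symbol, DateTime startTime, string fileExtension)
    {
        return symbol + "_"
            + startTime.ToString("yyyyMMdd_HHmmss", CultureInfo.InvariantCulture)
            + fileExtension;
    }

    public static void Main()
    {
        var start = new DateTime(2013, 2, 10, 9, 25, 19);
        Console.WriteLine(BuildFileName("MSFT", start, ".csv")); // MSFT_20130210_092519.csv
    }
}
```

Because the timestamp comes from the package start time, every file downloaded in one run carries the same timestamp and only the symbol prefix differs.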

FilePath: This variable of data type String will hold the complete path of the downloaded file for a given symbol. Click the variable and press F4 to view its properties. Change the property EvaluateAsExpression to True. Click the ellipsis button against Expression to open the Expression Builder. Set the Expression to the value @[User::RootFolder] + "\\" + @[User::FileName]. We will use the express

Resolution: The package will use this variable of data type String to hold the resolution of the symbol being looped through in the record set list.

RootFolder: This variable of data type String will hold the root folder that the files should be downloaded to.

SQL_GetSymbols: This variable of data type String will contain the T-SQL query that fetches the ticker symbols from the database. Set the value to SELECT Symbol, StartDate, EndDate, Resolution FROM dbo.TickerSymbols

StartDate: The package will use this variable of data type DateTime to hold the start date of the symbol being looped through in the record set list.

Symbol: The package will use this variable of data type String to hold the ticker symbol as it loops through each record in the record set list.

SymbolsList: The package will use this variable of data type Object to hold the result set of ticker symbols stored in the database.

URLYahooChart: This variable of data type String will hold the URL of the Yahoo Finance chart website, with placeholders for the values to fill into the query string. Set the value to http://ichart.finance.yahoo.com/table.csv?s={0}&a={1}&b={2}&c={3}&d={4}&e={5}&f={6}&g={7}&ignore=.csv
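One detail worth checking when filling in these placeholders: the code in the question subtracts one from the month before formatting it into the URL, which suggests that the endpoint expects zero-based months (January = 0). This post does not confirm that, so treat it as an assumption. Under that assumption, a URL builder might look like the sketch below; BuildUrl is a hypothetical helper, and escaping the symbol is an extra precaution not present in the original code.

```csharp
using System;

public static class YahooUrlDemo
{
    // Same template as the URLYahooChart variable.
    const string Template =
        "http://ichart.finance.yahoo.com/table.csv?s={0}&a={1}&b={2}&c={3}&d={4}&e={5}&f={6}&g={7}&ignore=.csv";

    // Assumption: months in the query string are zero-based (January = 0),
    // as the (Month - 1) arithmetic in the question's code implies.
    public static string BuildUrl(string symbol, DateTime start, DateTime end, string resolution)
    {
        return string.Format(Template,
            Uri.EscapeDataString(symbol),
            start.Month - 1, start.Day, start.Year,
            end.Month - 1, end.Day, end.Year,
            resolution);
    }

    public static void Main()
    {
        Console.WriteLine(BuildUrl("MSFT", new DateTime(2012, 9, 1), new DateTime(2012, 11, 30), "m"));
    }
}
```

If the months are in fact one-based, dropping the two "- 1" terms gives the calendar-month behaviour instead.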

On the package, right-click the Connection Managers tab and click Flat File Connection...

On the General page of the Flat File Connection Manager Editor, perform the following actions:

Set Name to FILE_TickerPriceHistory

Set Description to Read the ticker symbol price history.

If you already have a sample file, point to the file location. SSIS will infer the settings from the data in the file. In this case, I downloaded the file by navigating to the URL http://ichart.finance.yahoo.com/table.csv?s=msft&a=9&b=1&c=2012&d=11&e=30&f=2012&g=m&ignore=.csv and saved it under the name c:\siva\stackoverflow\files\14797886\data\\msft_20130210_092519.csv

Make sure Format is set to Delimited.

Make sure Header row delimiter is set to {CR}{LF}

Check the box Column names in the first data row

Click the Columns page

On the Columns page of the Flat File Connection Manager Editor, make sure that Row delimiter is set to {LF} and Column delimiter is set to Comma {,}. Click the Advanced page.

On the Advanced page of the Flat File Connection Manager Editor, the columns will be created based on the file information. Change the values as shown below so that the column names match the names in the database. This makes the column mapping easier. All columns except the last should have ColumnDelimiter set to Comma {,}. The last column should have ColumnDelimiter set to {LF}.

    Column            Data type                              DataPrecision  DataScale
    ----------------  -------------------------------------  -------------  ---------
    PriceDate         date [DT_DATE]
    PriceOpen         numeric [DT_NUMERIC]                   18             2
    PriceHigh         numeric [DT_NUMERIC]                   18             2
    PriceLow          numeric [DT_NUMERIC]                   18             2
    PriceClose        numeric [DT_NUMERIC]                   18             2
    Volume            eight-byte unsigned integer [DT_UI8]
    AdjustmentClose   numeric [DT_NUMERIC]                   18             2
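To make the column typing concrete, the sketch below parses one data line of the downloaded CSV into the same seven typed fields. The column order (Date, Open, High, Low, Close, Volume, Adj Close) follows the header string used in the question's code; the yyyy-MM-dd date format, the PriceRow type, and the sample values in Main are assumptions made up for illustration. In the package, the Flat File Source does this work.

```csharp
using System;
using System.Globalization;

// Hypothetical type mirroring the flat file columns configured in the package.
public class PriceRow
{
    public DateTime PriceDate;
    public decimal PriceOpen, PriceHigh, PriceLow, PriceClose, AdjustmentClose;
    public ulong Volume;

    // Parses one data line in the order Date,Open,High,Low,Close,Volume,Adj Close.
    // Assumes dates are formatted as yyyy-MM-dd and numbers use '.' as the decimal point.
    public static PriceRow Parse(string line)
    {
        var parts = line.Trim().Split(',');
        if (parts.Length != 7)
            throw new FormatException("Expected 7 columns but found " + parts.Length);

        var inv = CultureInfo.InvariantCulture;
        return new PriceRow
        {
            PriceDate = DateTime.ParseExact(parts[0], "yyyy-MM-dd", inv),
            PriceOpen = decimal.Parse(parts[1], inv),
            PriceHigh = decimal.Parse(parts[2], inv),
            PriceLow = decimal.Parse(parts[3], inv),
            PriceClose = decimal.Parse(parts[4], inv),
            Volume = ulong.Parse(parts[5], inv),
            AdjustmentClose = decimal.Parse(parts[6], inv)
        };
    }
}

public static class PriceRowDemo
{
    public static void Main()
    {
        // Made-up sample values, not real quotes.
        var row = PriceRow.Parse("2012-11-01,10.00,12.50,9.75,11.25,1000000,11.00");
        Console.WriteLine(row.PriceDate.ToString("yyyy-MM-dd") + " volume " + row.Volume);
    }
}
```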

You should see both connection managers at the bottom of the package.

Drag and drop an Execute SQL Task onto the Control Flow tab and perform the following actions on the General tab.

Set Name to Get symbols from database

Set Description to Fetch the list of symbols and their download settings from the database.

Set ResultSet to Full result set because the query will return a record set.

Set ConnectionType to OLE DB

Set Connection to OLEDB_Sora

Select Variable from SQLSourceType

Select User::SQL_GetSymbols from SourceVariable

Click the Result Set page.

On the Result Set page of the Execute SQL Task, click Add and set Result Name to 0, indicating the index of the result set. Select User::SymbolsList as the Variable Name to store the result set in the object variable.

Drag and drop a Foreach Loop Container and place it after the Execute SQL Task. Connect the Execute SQL Task's green arrow to the Foreach Loop Container. Double-click the Foreach Loop Container to view the Foreach Loop Editor. Configure the Foreach Loop Editor as shown below.

On the Variable Mappings page of the Foreach Loop Editor, configure it as shown below:

Drag and drop a Script Task inside the Foreach Loop Container. Double-click the Script Task to open the Script Task Editor. On the Script page of the Script Task Editor, click the ellipsis button against ReadOnlyVariables and select the variables listed below. We need to use these inside the Script Task code.

    User::EndDate
    User::FileExtension
    User::FileName
    User::FilePath
    User::Resolution
    User::RootFolder
    User::StartDate
    User::Symbol
    User::URLYahooChart

Click the Edit Script... button on the Script Task Editor and type the code below. After typing the code, close the Script Task Editor.

Script Task code in C#:

    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Runtime;
    using System.Windows.Forms;
    using System.Net;

    namespace ST_5fa66fe26d20480e8e3258a8fbd16683.csproj
    {
        [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
        public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
        {
            #region VSTA generated code
            enum ScriptResults
            {
                Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
                Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
            };
            #endregion

            public void Main()
            {
                try
                {
                    // Read the values supplied by the Foreach Loop for the current symbol.
                    string symbol = Dts.Variables["User::Symbol"].Value.ToString();
                    DateTime startDate = Convert.ToDateTime(Dts.Variables["User::StartDate"].Value);
                    DateTime endDate = Convert.ToDateTime(Dts.Variables["User::EndDate"].Value);
                    string resolution = Dts.Variables["User::Resolution"].Value.ToString();
                    string urlYahooChart = Dts.Variables["User::URLYahooChart"].Value.ToString();

                    string rootFolder = Dts.Variables["User::RootFolder"].Value.ToString();
                    string fileExtension = Dts.Variables["User::FileExtension"].Value.ToString();
                    string fileName = Dts.Variables["User::FileName"].Value.ToString();
                    string downloadPath = Dts.Variables["User::FilePath"].Value.ToString();

                    if (!System.IO.Directory.Exists(rootFolder))
                        System.IO.Directory.CreateDirectory(rootFolder);

                    // Fill the URL placeholders {0}..{7} with the symbol, dates and resolution.
                    urlYahooChart = string.Format(urlYahooChart
                        , symbol
                        , startDate.Month
                        , startDate.Day
                        , startDate.Year
                        , endDate.Month
                        , endDate.Day
                        , endDate.Year
                        , resolution);

                    // Log the URL so that it shows up in the package execution output.
                    bool refire = false;
                    Dts.Events.FireInformation(0, string.Format("Download URL of {0}", symbol), urlYahooChart, string.Empty, 0, ref refire);

                    WebClient webClient = new WebClient();
                    webClient.DownloadFile(urlYahooChart, downloadPath);

                    Dts.TaskResult = (int)ScriptResults.Success;
                }
                catch (Exception ex)
                {
                    Dts.Events.FireError(0, "Download error", ex.ToString(), string.Empty, 0);
                }
            }
        }
    }

Drag and drop a Data Flow Task inside the Foreach Loop Container, after the Script Task. Connect the green arrow from the Script Task to the Data Flow Task. The Control Flow tab should look as shown below.

On the Data Flow Task, drag and drop a Flat File Source and configure it as shown below to read the price history CSV files.

Drag and drop a Derived Column Transformation and create a new column named Symbol with the expression (DT_STR,10,1252)@[User::Symbol] to add the symbol to the data pipeline.

Drag and drop an OLE DB Destination and configure it as shown below to insert the data into the database.

Your Data Flow tab should look as shown below:

Before running the package, we need to make a couple of changes to prevent warnings or errors in the design-time view caused by the absence of files in the folder.

Click the Flat File Connection Manager FILE_TickerPriceHistory and press F4 to view its properties. Change the property DelayValidation to True. This makes sure that validation of the file's existence happens at runtime. Click the ellipsis button against Expression and set the ConnectionString property to the value @[User::FilePath]. This changes the file path as each file is downloaded from the website.

Click the Data Flow Task and press F4 to view its properties. Change the property DelayValidation to True. This makes sure that validation of the file's existence happens at runtime.

Navigate to the Data Flow tab, click the Flat File Source and press F4 to view its properties. Change the property ValidateExternalMetadata to False. This makes sure that validation of the flat file's existence happens at runtime.

Let us navigate to the folder c:\siva\stackoverflow\files\14797886, where the downloaded files will be saved. It is empty. The folder does not have to be empty; this is only to make the execution easy to check.

Run the following SQL statements against the database to verify the data in the tables. The second table should be empty.

    SELECT * FROM dbo.TickerSymbols;
    SELECT * FROM dbo.TickerPriceHistory;

Execute the package. If everything is set up correctly, the package should run successfully and download a file for each symbol listed in the table dbo.TickerSymbols.

The files should be saved to the folder c:\siva\stackoverflow\files\14797886. Notice that each file is named according to the expressions provided in the package.

Run the following SQL statement against the database to verify the data in the table. The table dbo.TickerPriceHistory should now contain the data from the price files downloaded from the website.

    SELECT * FROM dbo.TickerPriceHistory;

The sample package above illustrates how to download price files from the Yahoo Finance chart website for a given list of ticker symbols and load them into a database.

c# csv ssis yahoo
