Intraday Data Loading Using Python

I find myself using Python for most of my scripting needs, although sometimes I revert to Bash.

A recent trading idea required intraday data. I found it difficult to find historical intraday data for free; all the news sites offer only historical daily close data. Google and MSN offer real-time quotes, but not historical, creating a page scrapper and running it for a few days seemed like overkill.
I found a small shareware application called "Historical Quotes Downloader" (it uses data from prophet.net) and used it to get 5 days of minute-by-minute intraday data for the S&P 500. That left me with 500 files that needed to be loaded in MySQL...so I wrote a little Python script to do just that. After loading the data, I plan to use quantmod for R to do the research.

Here is the quick Python script for loading files from Historical Quotes Downloader/prophet.net into MySQL, each tick is in a separate table.


# import all of the intra-day data in the table
# create a table for each of the files, using the name before the first period
# and load all of the intraday data
#
# this script could be implemented using fileinput, should try that next time

import sys, datetime, time
import os, fnmatch
import MySQLdb

try:
conn = MySQLdb.connect(host = "localhost",
user = "username",
passwd = "password",
db = "intraday")
except MySQLdb.Error, e:
print "Error %d: %s" % (e.args[0], e.args[1])
sys.exit(1)

if len(sys.argv) < 2:
print "Proper Command: "+sys.argv[0]+" ..."
sys.exit(0)

# expand the regular expression to all the matching files
fileList = []
for fileName in os.listdir(os.path.dirname(sys.argv[1])):
if fnmatch.fnmatch(fileName, os.path.basename(sys.argv[1])):
fileList.append(fileName)

# loop though each of the files passed in
for tickFileName in fileList:

tickName = os.path.basename(tickFileName)
tableName = tickName[0:tickName.index(".")]
print "creating table:%s:\n" %(tableName)

cursor = None
# create the table
try:
cursor = conn.cursor()
cursor.execute("""
CREATE TABLE `"""+tableName+"""` (
`datetime` DATETIME NOT NULL DEFAULT 0,
`open` DOUBLE NOT NULL DEFAULT 0,
`high` DOUBLE NOT NULL DEFAULT 0,
`low` DOUBLE NOT NULL DEFAULT 0,
`close` DOUBLE NOT NULL DEFAULT 0,
`volume` INTEGER UNSIGNED NOT NULL DEFAULT 0,
PRIMARY KEY(`datetime`)
)ENGINE = MYISAM;""")
cursor.close()
conn.commit()
except MySQLdb.Error, e:
print "Error %d: %s" %(e.args[0], e.args[1])
try:
conn.rollback ()
except:
continue

#open the file and load
tickFile = None
try:
tickFile = open(os.path.join(os.path.dirname(sys.argv[1]), tickFileName), "r")
cursor = conn.cursor()

# Symbol (optional), Date (optional), Time (optional), Open, High, Low, Close, Volume
tickFile.readline() # remove the header
tickLines = tickFile.readlines()
for line in tickLines:
# detect the end of the ticks in the file
if line.find("barsofar") != -1:
break
tokens = line.split(",")
try:
sqlDate = time.strptime(tokens[0], "%m/%d/%Y %H:%M:%S")
except (ValueError, IndexError):
print "Unexpected format " + str(sys.exc_info()[0])
break #the file might be empty so proceed with next one

try:
cursor.execute("""INSERT INTO `"""+tableName+"""` (datetime, open, high, low, close, volume) VALUES(%s, %s, %s, %s, %s, %s)""",
(time.strftime("%Y-%m-%d %H:%M:%S", sqlDate),tokens[1],tokens[2],tokens[3],tokens[4],tokens[5]))
except MySQLdb.Error, e:
print "Error %d: %s" %(e.args[0], e.args[1])
break

cursor.close()
conn.commit()
except MySQLdb.Error, e:
print "Error %d: %s" %(e.args[0], e.args[1])
try:
conn.rollback ()
except:
pass
finally:
if tickFile != None:
tickFile.close()

# end of for loop over files

conn.close()



The files I created are in the format MMM.intraday and I call the script "python intraday_db_load.py *.intrday".

If the experiments are a success and a trading pattern is discovered, the plan is to use the Interactive Brokers API for real-time data. I am hoping that opentick.com comes back and I can make use of their API too.

No comments: