From 020701bcba397d590d284962f3ce5df3134aaa08 Mon Sep 17 00:00:00 2001
From: Evgeny Fadeev
Date: Tue, 9 Mar 2010 22:05:39 -0500
Subject: SE loader seems to work, details are in stackexchange/README

---
 stackexchange/README | 34 +++++++++++++++++++++++++++++-----
 1 file changed, 29 insertions(+), 5 deletions(-)

(limited to 'stackexchange/README')

diff --git a/stackexchange/README b/stackexchange/README
index b2a39e1c..bad11c9f 100644
--- a/stackexchange/README
+++ b/stackexchange/README
@@ -1,11 +1,12 @@
 this app's function will be to:
 
-* install it's own tables <--- done
-* read SE xml dump into DjangoDB <--- done
-* populate osqa database <-- user accounts and Q&A revisions loaded
-* remove SE tables
+* install it's own tables (#todo: not yet automated)
+* read SE xml dump into DjangoDB (automated)
+* populate osqa database (automated)
+* remove SE tables (#todo: not done yet)
 
-Current process to load SE data into OSQA:
+Current process to load SE data into OSQA is:
+==============================================
 
 1) backup database
 
@@ -36,3 +37,26 @@ Current process to load SE data into OSQA:
 
 if anything doesn't go right - run 'python manage.py flush' and repeat
 steps 6 and 7
+
+NOTES:
+============
+
+Here is the load script that I used for the testing
+it assumes that SE dump has been unzipped inside the tmp directory
+
+    #!/bin/sh$
+    python manage.py flush
+    #delete all data
+    mysql -u osqa -p osqa < sql_scripts/badges.sql
+    python manage.py load_stackexchange tmp
+
+Untested parts are tagged with comments starting with
+#todo:
+
+The test set did not have all the usage cases of StackExchange represented so
+it may break with other sets.
+
+The job takes some time to run, especially
+content revisions and votes - may be optimized
+
+Some of the fringe cases are described in file stackexchange/ANOMALIES
-- 
cgit v1.2.3-1-g7c22