author | Evgeny Fadeev <evgeny.fadeev@gmail.com> | 2010-04-25 17:15:26 -0400 |
---|---|---|
committer | Evgeny Fadeev <evgeny.fadeev@gmail.com> | 2010-04-25 17:15:26 -0400 |
commit | cc8337da9046bff5243672e20f1dea9c18b00da6 (patch) | |
tree | 77ade69869105f1838df5e8616bb994844507cec /forum/importers/stackexchange/README | |
parent | 3122fb8a2599944e623c8e21f285a9e4dd9e132a (diff) | |
parent | 02510a462392dd2e9e46e945d51efb374e0dc06f (diff) | |
merged newer ui branch to master
Diffstat (limited to 'forum/importers/stackexchange/README')
-rw-r--r-- | forum/importers/stackexchange/README | 62 |
1 files changed, 62 insertions, 0 deletions
diff --git a/forum/importers/stackexchange/README b/forum/importers/stackexchange/README
new file mode 100644
index 00000000..598a8555
--- /dev/null
+++ b/forum/importers/stackexchange/README
@@ -0,0 +1,62 @@
+this app's function will be to:
+
+* install its own tables (#todo: not yet automated)
+* read SE xml dump into DjangoDB (automated)
+* populate askbot database (automated)
+* remove SE tables (#todo: not done yet)
+
+Current process to load SE data into Askbot is:
+==============================================
+
+1) backup database
+
+2) unzip SE dump into dump_dir (any directory name)
+   you may want to make sure that your dump directory is in the .gitignore file
+   so that you don't publish it by mistake
+
+3) enable 'stackexchange' in the list of installed apps (probably already in settings.py)
+
+4) (optional - create models.py for SE, which is included anyway) run:
+
+    #a) run in-place removal of xml namespace prefix to make parsing easier
+    perl -pi -w -e 's/xs://g' $SE_DUMP_PATH/xsd/*.xsd
+    cd stackexchange
+    python parse_models.py $SE_DUMP_PATH/xsd/*.xsd > models.py
+
+5) Install stackexchange models (as well as any other missing models)
+   python manage.py syncdb
+
+6) make sure that badges are installed
+   if not, run (example for mysql):
+
+   mysql -u user -p dbname < sql_scripts/badges.sql
+
+7) load SE data:
+
+   python manage.py load_stackexchange dump_dir
+
+   if anything doesn't go right - run 'python manage.py flush' and repeat
+   steps 6 and 7
+
+NOTES:
+============
+
+Here is the load script that I used for the testing
+it assumes that SE dump has been unzipped inside the tmp directory
+
+   #!/bin/sh
+   python manage.py flush
+   #delete all data
+   mysql -u askbot -p askbot < sql_scripts/badges.sql
+   python manage.py load_stackexchange tmp
+
+Untested parts are tagged with comments starting with
+#todo:
+
+The test set did not have all the usage cases of StackExchange represented so
+it may break with other sets.
+
+The job takes some time to run, especially
+content revisions and votes - may be optimized
+
+Some of the fringe cases are described in file stackexchange/ANOMALIES
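
Step 4a of the README strips the `xs:` namespace prefix from the XSD files with a perl one-liner. For anyone who wants to run that step without perl, a rough Python equivalent might look like the sketch below. It is not part of the askbot importer: the function names `strip_xs_prefix` and `strip_xs_prefix_in_dir` are hypothetical, and the directory argument is assumed to be the `xsd` folder of the unzipped dump (`$SE_DUMP_PATH/xsd` in the README). The files are handled as bytes so that no text encoding has to be assumed.

```python
from pathlib import Path


def strip_xs_prefix(data: bytes) -> bytes:
    """Remove every literal 'xs:' from the data, like perl's s/xs://g."""
    return data.replace(b"xs:", b"")


def strip_xs_prefix_in_dir(xsd_dir) -> int:
    """Rewrite each *.xsd file in xsd_dir in place (hypothetical helper).

    Returns the number of files whose content actually changed.
    """
    changed = 0
    for path in sorted(Path(xsd_dir).glob("*.xsd")):
        original = path.read_bytes()
        stripped = strip_xs_prefix(original)
        if stripped != original:
            # Overwrite in place, mirroring the -i flag of the perl command.
            path.write_bytes(stripped)
            changed += 1
    return changed
```

As with the perl command, the rewrite is destructive, so it is worth running only on a copy of the dump or after the backup from step 1. A call such as `strip_xs_prefix_in_dir("dump_dir/xsd")` would then be followed by the `parse_models.py` invocation from step 4.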