Command Line:
python db_dump.py [-svdh] [--settings] dump [applist]
If applist is ignored,then it means that all app will be dumped. applist can be one or more app name.
Description of options:
-s Output will displayed in console, default is writing into file
-v Display execution infomation, default is does not display
-d Directory of output, default is datadir in current directory. If the path is not existed, it'll be created automatically.
-h Display help information.
--settings settings model, default is settings.py in current directory.
It can only support Python format for now. It'll create a standard python source file, for example:
dump = {'table': 'tablename', 'records': [[...]], 'fields': [...]}
table' is table name in database, records is all records of the table, it's a list of list, that is each record is a list. fields` is the fields name of the table. Load(Restore) ¶
Command Line:
python db_dump.py [-svdrh] [--settings] load [applist]
You can refer to the above description for same option. Others is:
-r Does not empty the table as loading the data, default is empty the table first then load the data
Using this tool, you can not only restore the database, but also can deal with the simple changes of database. It can select the suitable field from the backup data file according to the changed Model automatically, and it can also deal with the default value define in Model, such as default parameter and auto_now and auto_now_add parameter for Date-like field. And you can even edit the backup data file manually, and add a default key for specify the default value for some fields, the basic format is:
'default':{'fieldname':('type', 'value')}
default is a dict, the key will be the field name of the table, the value will be a two element tuple, and the first element of this tuple is type field, the second element is its value. Below is a description of type field:
type value description
'value' real value using the value field directly 'reference' referred field name the value of this filed will use the value of referred field. It'll be used when the field name is changed 'date' 'now'|'yyyy-mm-dd' It's a date date type, if the value field is 'now', then the value be current time. Otherwise, it'll be a string, it's format is 'yyyy-mm-dd' 'datetime' 'now'|'yyyy-mm-dd hh:mm:ss' The same as above 'time' 'now'|'hh:mm:ss' The same as above
The strategy of selection of default value of a field is: first, create a default value dict according the Model, then update it according the default key of backup data file. So you can see if there is a same definition of a field in both Model and backup data file, it'll use the one in backup data file.
According the process of default value, this tool will suport these changes, such as: change of field name, add or remove field name, etc. So you can use this tool to finish some simple update work of database.
But I don't give it too much test, and my situation is in sqlite3. So download and test are welcome, and I hope you can give me some improve advices.
# Author: limodou (limodou@gmail.com)
#
# This tool is used for dump and reload data from and into database
# You can see the help info through:
#
# python db_dump.py -h
#
# For now, it only support .py format, so the output result will
# be saved as python source code, and you can import it.
#
# Version 2.2 2007-11-01
# * improve postgresql sequence field process, thanks oyvind.saltvik@gmail.com and Matthew Wensing
# * add errorquit option in command line
#
# Version 2.1 2007-09-18
# * add Time type support
#
# Version 2.0 2007-09-11
# * refact, and add aoto reset postgres sequence, thanks Eric SIMORRE
#
# Version 1.9 2007-09-02 (Merge from RichardH)
# * Adds try-except to catch the changes in db.backend refactoring in
# svn version. So db_dump.py can support old version except trunk.
#
# Version 1.8 2007-08-30
# * Fix backend.quote_name to backend.DatabaseOperations().quote_name
# Thanks to richardh
#
# Version 1.7 2007-05-28
# * keep up with the change of GenericRel, so you can use db_dump.py
# in trunk and version before 0.97
#
# Version 1.6 2007-04-09
# * Add float support
#
# Version 1.5 2007-02-08
# * If the filename is not exists, then skip it
#
# Version 1.4 2007-01-21
# * support mysql
#
# Version 1.3 2007-01-20
# * change the output format of data file, and improve the process
# effective of dumpping and loading
#
# Version 1.2 2007-01-20
# * change dumpdb to use model info but not cursor.description,
# because some database backend does not support cursor.description
#
# Version 1.1 2007-01-19
# * if no arguments after db_dump.py, then it'll show help infomation
#
# Version 1.0 2007-01-18
#
import os, sys
import datetime
import decimal
from optparse import OptionParser
quote_flag = None
def _get_table_order(app_labels):
from django.db.models import get_app, get_apps, get_models
from django.db.models import ForeignKey, OneToOneField
if not app_labels:
app_list = get_apps()
else:
app_list = [get_app(app_label) for app_label in app_labels]
models = {}
for app in app_list:
for model in get_models(app):
models[model._meta.db_table] = model
s = []
rules = []
def order(s, rule):
a, b = rule
try:
i = s.index(a)
try:
j = s.index(b)
if j<i:
del s[i]
s.insert(j, a)
except:
s.append(b)
except:
s.append(a)
try:
j = s.index(b)
del s[j]
s.append(b)
except:
s.append(b)
for i, table in enumerate(models.keys()[:]):
for field in models[table]._meta.fields:
if isinstance(field, (ForeignKey, OneToOneField)):
tname = field.rel.to._meta.db_table
if not models.has_key(tname) or tname == table:
continue
rules.append((tname, table))
order(s, (tname, table))
n = []
for k, v in models.items():
if s.count(k) == 0:
n.append(k)
return [models[k] for k in s+n]
def _find_key(d, key):
if not d:
return None
for k, v in d.items()[:]:
if k == key:
return d
else:
result = _find_key(v, key)
if result is not None:
return result
def loaddb(app_labels, format, options):
from django.db import connection, transaction
if options.verbose:
print "Begin to load data for %s format...\n" % format
models = _get_table_order(app_labels)
cursor = connection.cursor()
errornum = 0
if not options.remain and not options.stdout:
m = models[:]
m.reverse()
for model in m:
cursor.execute('DELETE FROM %s WHERE 1=1;' % quote_name(model._meta.db_table))
for table, fields in get_model_many2many_stru(model):
cursor.execute('DELETE FROM %s WHERE 1=1;' % quote_name(table))
success = True
for model in models:
try:
load_model(cursor, model, format, options)
setSequence(cursor, model)
for table, fields in get_model_many2many_stru(model):
load_model(cursor, (table, fields), format, options)
setSequence(cursor, model)
except Exception, e:
import traceback
traceback.print_exc()
sys.stderr.write("Problem loading %s format '%s' : %s\n" % (format, model, str(e)))
success = False
errornum += 1
if options.errorquit:
transaction.rollback_unless_managed()
print "Error found! The database has been rollbacked!"
raise
if success:
transaction.commit_unless_managed()
else:
transaction.rollback_unless_managed()
if errornum:
print "There are %d errors found! The database has been rollbacked!" % errornum
else:
print "Successful!"
def load_model(cursor, model, format, options):
datadir, verbose, stdout = options.datadir, options.verbose, options.stdout
sql = 'INSERT INTO %s (%s) VALUES (%s);'
if isinstance(model, (tuple, list)):
filename = os.path.join(datadir, model[0] + '.%s' % format)
fields, default = model[1], {}
else:
opts = model._meta
filename = os.path.join(datadir, opts.db_table + '.%s' % format)
fields, default = get_model_stru(model)
if verbose:
print '..Dealing %s for %s format...\n' % (filename, format)
if not os.path.exists(filename):
if verbose:
print '..%s does not exists, so Skip it..\n' % filename
return
try:
objs = {}
if format == 'py':
s = []
f = file(filename, 'rb')
for line in f:
varname = line.split('=')[0]
if varname.strip() != 'records':
s.append(line)
else:
d = {}
exec ''.join(s) in d
objs['table'] = d.get('table', '')
objs['fields'] = d.get('fields', [])
objs['default'] = d.get('default', {})
objs['records'] = f
break
else:
# f = file(filename, 'rb')
# objs = f.read()
# records = objs['records']
# f.close()
raise 'Not support this format %s' % format
fs = objs['fields']
table = objs['table']
default.update(objs.get('default', {}))
count = 0
for row in objs["records"]:
if row.strip() == ']':
break
row = eval(row)
d = dict(zip(fs, row))
sql_fields = []
sql_values = []
for fd in fields:
v = None
if d.has_key(fd):
v = d[fd]
else:
if default.get(fd, None) is not None:
kind, value = default[fd]
if not kind or kind == 'value':
v = value
elif kind == 'reference':
try:
v = d[value]
except KeyError:
sys.stderr.write("Referenced field [%s] does not exist\n" % value)
raise
elif kind == 'date':
if not value or value == 'now':
v = datetime.date.today().strftime('%Y-%m-%d')
else:
v = value
#add time support
elif kind == 'time':
if not value or value == 'now':
v = datetime.datetime.now().strftime('%H:%M:%S')
else:
v = value
elif kind == 'datetime':
if not value or value == 'now':
v = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
else:
v = value
else:
raise Exception, "Cann't support this default type [%s]\n" % kind
if v is not None:
sql_fields.append(fd)
sql_values.append(v)
e_sql = sql % (quote_name(table),
','.join(map(quote_name, sql_fields)), ','.join(['%s'] * len(sql_fields)))
if stdout:
print e_sql, sql_values, '\n'
else:
try:
cursor.execute(e_sql, sql_values)
count += 1
except:
sys.stderr.write("Error sql: %s %s\n" % (e_sql, sql_values))
raise
if verbose:
print '(Total %d records)\n' % count
except Exception, e:
import traceback
traceback.print_exc()
sys.stderr.write("Problem loading %s format '%s' : %s\n" %
(format, filename, str(e)))
raise
def get_model_stru(model):
from django.db.models.fields import DateField, DateTimeField, TimeField, IntegerField
fields = []
default = {}
for f in model._meta.fields:
fields.append(f.column)
v = f.get_default()
if v is not None:
default[f.column] = ('value', v)
if isinstance(f, (DateTimeField, DateField, TimeField)):
if f.auto_now or f.auto_now_add:
v = datetime.datetime.now()
default[f.column] = ('value', f.get_db_prep_save(v))
# Need to fix sqlite defaulting None values to ''
if isinstance(f, IntegerField):
default[f.column] = ('value', None)
return fields, default
def get_model_many2many_stru(model):
try:
from django.db.models import GenericRel
except:
from django.contrib.contenttypes.generic import GenericRel
opts = model._meta
for f in opts.many_to_many:
fields = []
if not isinstance(f.rel, GenericRel):
fields.append('id')
fields.append(f.m2m_column_name())
fields.append(f.m2m_reverse_name())
yield f.m2m_db_table(), fields
def dumpdb(app_labels, format, options):
from django.db.models import get_app, get_apps, get_models
datadir, verbose, stdout = options.datadir, options.verbose, options.stdout
if verbose:
print "Begin to dump data for %s format...\n" % format
if len(app_labels) == 0:
app_list = get_apps()
else:
app_list = [get_app(app_label) for app_label in app_labels]
if not os.path.exists(datadir):
os.makedirs(datadir)
errornum = 0
for app in app_list:
for model in get_models(app):
try:
write_result(dump_model(model), format, options)
for result in dump_many2many(model):
write_result(result, format, options)
except Exception, e:
import traceback
traceback.print_exc()
sys.stderr.write("Unable to dump database: %s\n" % e)
errornum += 1
if options.errorquit:
raise
if errornum:
print "There are %d errors found!" % errornum
else:
print "Successful!"
def dump_model(model):
from django.db import connection
opts = model._meta
cursor = connection.cursor()
fields, default = get_model_stru(model)
cursor.execute('select %s from %s' %
(','.join(map(quote_name, fields)), quote_name(opts.db_table)))
return call_cursor(opts.db_table, fields, cursor)
def call_cursor(table, fields, cursor):
yield table
yield fields
while 1:
rows = cursor.fetchmany(100)
if rows:
for row in rows:
yield _pre_data(row)
else:
break
def _pre_data(row):
row = list(row)
for i, fd in enumerate(row):
if isinstance(fd, datetime.datetime):
row[i] = row[i].strftime('%Y-%m-%d %H:%M:%S') # + '.' + str(row[i].microsecond).rstrip('0')
elif isinstance(fd, datetime.date):
row[i] = row[i].strftime('%Y-%m-%d')
elif isinstance(fd, datetime.time):
row[i] = row[i].strftime('%H:%M:%S')
elif isinstance(fd, decimal.Decimal):
row[i] = row[i].__float__()
return row
def dump_many2many(model):
from django.db import connection
cursor = connection.cursor()
for table, fields in get_model_many2many_stru(model):
cursor.execute('select %s from %s' %
(','.join(map(quote_name, fields)), quote_name(table)))
yield call_cursor(table, fields, cursor)
def write_result(result, format, options):
table = result.next()
fields = result.next()
filename = os.path.join(options.datadir, table + '.%s' % format)
if options.verbose:
print '..Dumping %s ...\n' % filename
if not options.stdout:
f = file(filename, 'wb')
else:
f = sys.stdout
print >>f, 'table = %r' % table
print >>f, 'fields = %r' % fields
print >>f, '#default item format: "fieldname":("type", "value")'
print >>f, 'default = {}'
print >>f, 'records = ['
i = 0
for t in result:
print >>f, repr(t)
i += 1
print >>f, ']'
if options.verbose:
print '(Total %d records)\n' % i
if not options.stdout:
f.close()
def quote_name(s):
from django.db import backend
if quote_flag == 'old':
return backend.quote_name(s)
else:
return backend.DatabaseOperations().quote_name(s)
#thanks for Matthew Wensin
def setSequence(cursor, model):
from django.conf import settings
from django.db.models import AutoField
# postgresql: reset sequence
if settings.DATABASE_ENGINE in ('postgresql_psycopg2', 'postgresql'):
autofields = [field for field in model._meta.fields if isinstance(field, AutoField)]
for f in autofields:
seq = quote_name('%s_%s_seq' % (model._meta.db_table, f.name))
cursor.execute("SELECT nextval('%s');" % seq)
nb = cursor.fetchall()[0][0]
if nb:
cursor.execute('ALTER SEQUENCE %s RESTART WITH %d;' % (seq, nb))
def get_usage():
usage = """
%prog [options] action [applist]:
action: dump load
"""
return usage
def execute_from_command_line(argv=None):
# Use sys.argv if we've not passed in a custom argv
if argv is None:
argv = sys.argv
# Parse the command-line arguments. optparse handles the dirty work.
parser = OptionParser(usage=get_usage())
parser.add_option('--settings',
help='Python path to settings module, e.g. "myproject.settings.main". If this isn\'t provided, the DJANGO_SETTINGS_MODULE environment variable will be used.')
parser.add_option('-d', '--dir', help='Output/Input directory.', default="datadir", dest="datadir")
# parser.add_option('-f', '--format', help='Data format(json, xml, python).', type="choice",
# choices=['json', 'xml', 'python'], default='json')
parser.add_option('-v', '--verbose', help='Verbose mode', action='store_true')
parser.add_option('-s', '--stdout', help='Output the data to stdout', action='store_true')
parser.add_option('-r', '--remain', help='Remain the records of the tables, default will delete all the records. Only used for loading.', action='store_true')
parser.add_option('-e', '--errorquit', help='If there are errors occured, then exit the program.', action='store_true')
options, args = parser.parse_args(argv[1:])
if len(args) == 0:
parser.print_help()
sys.exit(0)
action = args[0]
apps = args[1:]
if options.settings:
os.environ['DJANGO_SETTINGS_MODULE'] = options.settings
else:
from django.core.management import setup_environ
try:
import settings
except ImportError:
print "You don't appear to have a settings file in this directory!"
print "Please run this from inside a project directory"
sys.exit()
setup_environ(settings)
global quote_flag
import django.db
try:
# Earlier Django versions.
django.db.backend.quote_name
quote_flag = 'old'
except AttributeError:
# Django after backend refactoring.
quote_flag = 'new'
if action == 'dump':
dumpdb(apps, 'py', options)
elif action == 'load':
loaddb(apps, 'py', options)
else:
parser.print_help()
if __name__ == '__main__':
execute_from_command_line()